Different results when using split file option vs cross-tab

Different results when using split file option vs cross-tab

oriana69
Hi,
I've come across some strange behaviour in SPSS.
I have a database with two study waves that I want to compare. Each wave is weighted separately. I get different results when using:
1)  File split.
sort cases by wave.
split file by wave.
fre A1 .

2) Cross-tab.
cro A1 by wave / cells = column .

The differences are not very big, about 0.1% for 300 respondents, but they still matter.

I did a small investigation, and it looks like CROSSTABS rounds the weighted number of respondents to an integer and only then computes the percentages.

Does anyone know how to make SPSS calculate crosstabs correctly?





Re: Different results when using split file option vs cross-tab

Jon K Peck
You can specify the rounding behavior you want in the CROSSTABS command.

From the Command Syntax Reference (CSR):
COUNT subcommand
The COUNT subcommand controls how case weights are handled.
ASIS. The case weights are used as is. However, when Exact Statistics are requested, the accumulated
weights in the cells are either truncated or rounded before computing the Exact test statistics.
CASE. The case weights are either rounded or truncated before use.
CELL. The case weights are used as is but the accumulated weights in the cells are either truncated or rounded
before computing any statistics.
ROUND. Performs Rounding operation.
TRUNCATE. Performs Truncation operation.

Also, you might have some missing data that is treated differently when SPLIT FILE is in effect than when CROSSTABS is run on the whole file.

MISSING subcommand
By default, CROSSTABS deletes cases with missing values on a table-by-table basis. Cases with missing
values for any variable specified for a table are not used in the table or in the calculation of statistics. Use
MISSING to specify alternative missing-value treatments.
- The only specification is a single keyword.
- The number of missing cases is always displayed in the Case Processing Summary table.
- If the missing values are not included in the range specified on VARIABLES, they are excluded from the table regardless of the keyword you specify on MISSING.
TABLE. Delete cases with missing values on a table-by-table basis. When multiple table lists are specified,
missing values are handled separately for each list. This is the default.
INCLUDE. Include user-missing values.
REPORT. Report missing values in the tables. This option includes missing values in tables but not in the
calculation of percentages or statistics. The missing status is indicated on the categorical label. REPORT is
available only in integer mode.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        oriana69 <[hidden email]>
To:        [hidden email],
Date:        06/29/2014 07:15 AM
Subject:        [SPSSX-L] Different results when using split file option vs cross-tab
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Different-results-when-using-split-file-option-vs-cross-tab-tp5726609.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Re: Different results when using split file option vs cross-tab

oriana69
@Jon thank you for the answer.
It is not a matter of missing values; it is the rounding/truncating. Do you know how to make SPSS calculate column percentages in crosstabs based on the actual weighted data rather than on rounded counts?
Is there an option in SPSS that changes this for the whole application?


On 29 June 2014 15:29, Jon K Peck <[hidden email]> wrote:
You can specify the rounding behavior you want in the CROSSTABS command.

Re: Different results when using split file option vs cross-tab

Jon K Peck
That is what the ASIS keyword on the COUNT subcommand should do.
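
For illustration, a minimal sketch of the original CROSSTABS command with a COUNT subcommand added. It assumes the variable names A1 and wave from the first post and that the weight variable has already been applied with WEIGHT BY. The SPLIT FILE OFF line is also an assumption, added only so the table covers both waves at once. When COUNT is omitted, the documented default appears to be ROUND CELL, which would be consistent with the rounding described in the first post.

```spss
* Sketch only: assumes A1 and wave exist and a WEIGHT BY command is in effect.
* Turn off any split left over from the FREQUENCIES run.
SPLIT FILE OFF.

* ASIS uses the case weights as they are, so cell counts are not rounded.
CROSSTABS
  /TABLES=A1 BY wave
  /CELLS=COLUMN
  /COUNT=ASIS.
```

With ASIS the column percentages should be based on the fractional weighted counts, so they can be compared against the FREQUENCIES output produced under SPLIT FILE.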






From:        Oriana69 <[hidden email]>
To:        [hidden email],
Date:        06/30/2014 06:32 AM
Subject:        Re: [SPSSX-L] Different results when using split file option vs cross-tab
Sent by:        "SPSSX(r) Discussion" <[hidden email]>



