Casestovars question--sysmis values

Casestovars question--sysmis values

Maguin, Eugene
I have some data like this (but this is not it).

Id rec x   y
11   1 1   2
11   2 .   3
11   4 .   1
12   1 .   4
12   3 .   4
12   4 1   2
12   5 .   3
13   1 .   2
14   1 .   3

I run it through casestovars like this

Casestovars id=id/index=rec.

What I expect back is

Id  x1 x2 x3 x4 x5 y1 y2 y3 y4 y5
11   1  .  .  .  .  2  3  .  1  .
12   .  .  .  1  .  4  .  4  2  3
13   .  .  .  .  .  2  .  .  .  .
14   .  .  .  .  .  3  .  .  .  .

What I have gotten back is like this.

Id   x y1 y2 y3 y4 y5
11   1  2  3  .  1  .
12   1  4  .  4  2  3
13   .  2  .  .  .  .
14   .  3  .  .  .  .

So, that's odd. If I recode x=sysmis to 0 and run it through casestovars,
things work as expected. The casestovars documentation doesn't mention
missing (sysmis) values as far as I can determine from a quick read through.
I did this originally on 16 but 17 is the same. Comments, please.

Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Casestovars question--sysmis values

Fry, Jonathan B.
Add "/autofix=no" to the command.  You'll get a warning that x does not vary within id groups, but you'll get the result you want.

Jonathan Fry
SPSS Inc.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gene Maguin
Sent: Monday, October 12, 2009 2:06 PM
To: [hidden email]
Subject: Casestovars question--sysmis values

Re: Casestovars question--sysmis values

Maguin, Eugene
Jonathan,

I just checked and I agree. Adding 'autofix=no' does give me x.1 to x.5.
But, let me say that as I read the description of autofix, there is nothing
that would lead me to use autofix=no. I tried out some other things and I
conclude that the issue is how sysmis is handled in casestovars (and I'm not
sure I could describe the rule used, either). I say that because giving the
sysmis values a numeric value and then declaring that value to be (user)
missing still yields x.1 to x.5--without specifying autofix=no.

I think the documentation needs to be fixed up a bit on this point.


Gene Maguin

Re: Casestovars question--sysmis values

Fry, Jonathan B.
The issue AUTOFIX is intended to address is variables whose values depend on the ID value.  You don't want to spread those.  The command identifies empirically the variables that vary within ID groups, ignoring SYSMIS values, and infers the rest to depend on the ID.  I agree that the treatment of SYSMIS needs to be documented.
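To make the rule described above concrete, here is a minimal Python sketch of the classification step. It is only a model of the behavior as stated in this thread, not SPSS's actual implementation; the function name `classify` and the use of `None` to stand in for SYSMIS are assumptions made for illustration.

```python
from collections import defaultdict

# Sample data from the original post; None stands in for SYSMIS.
ROWS = [
    {"id": 11, "rec": 1, "x": 1, "y": 2},
    {"id": 11, "rec": 2, "x": None, "y": 3},
    {"id": 11, "rec": 4, "x": None, "y": 1},
    {"id": 12, "rec": 1, "x": None, "y": 4},
    {"id": 12, "rec": 3, "x": None, "y": 4},
    {"id": 12, "rec": 4, "x": 1, "y": 2},
    {"id": 12, "rec": 5, "x": None, "y": 3},
    {"id": 13, "rec": 1, "x": None, "y": 2},
    {"id": 14, "rec": 1, "x": None, "y": 3},
]


def classify(rows, id_key="id", index_key="rec"):
    """Label each variable 'fixed' or 'group' (hypothetical helper).

    A variable counts as varying only if some ID has two or more
    distinct non-missing values; missing values are ignored.
    """
    variables = [k for k in rows[0] if k not in (id_key, index_key)]
    result = {}
    for var in variables:
        valid = defaultdict(set)  # id -> distinct non-missing values
        for row in rows:
            if row[var] is not None:
                valid[row[id_key]].add(row[var])
        varies = any(len(values) > 1 for values in valid.values())
        result[var] = "group" if varies else "fixed"
    return result


print(classify(ROWS))  # {'x': 'fixed', 'y': 'group'}
```

Under this model x is classified as fixed because no ID ever shows two different non-missing values, which matches the single x column Gene got back.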

This use case seems to argue that the command should not be ignoring SYSMIS values here.

Question for the list: is the current treatment of SYSMIS values ever right?

Jonathan Fry

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gene Maguin
Sent: Tuesday, October 13, 2009 10:59 AM
To: [hidden email]
Subject: Re: Casestovars question--sysmis values

Re: Casestovars question--sysmis values

Matheson, David
Hi Gene,
I have not seen any responses to Jonathan's question to the list as to whether the CASESTOVARS handling of system-missing values is correct. A defect has been filed with the SPSS Documentation group to request that the treatment of SYSMIS in CASESTOVARS be documented.

The Command Syntax Reference (available from the Help menu) provides the following notes for the /AUTOFIX subcommand. When AUTOFIX equals "YES" (the default):

"An original variable that does not vary within any row group is classified as a fixed variable and is copied into a single variable in the new data file.
An original variable that does vary within the row group is classified as the source of a variable group. It becomes a variable group in the new data file."

A row group is a set of cases with the same ID variable. The cases in a row group will become a single case in the restructured file.

The role of a system missing (sysmis) value in these rules is not explained. As of Statistics 18.0, if each row group (cases with same ID value) has a single valid value (not necessarily the same value across row groups) plus the system missing value (sysmis), the variable is treated as not varying within row groups. Such a variable will be copied into a single variable in the restructured file.

In your example, ID 11 had a single valid value, 1, plus system missing values observed for X. ID 12 had the single valid value 1, plus system missing, observed for X. The remaining cases had only system missing values observed for X. By the above rules, X is treated as a constant within ID groups, i.e. a fixed variable, and X becomes a single variable in the restructured file. When you recoded system missing to 0 before the CASESTOVARS command, you introduced within-ID variation in valid values for X (0 and 1 for ID 11; 0 and 1 for ID 12), so that X became a variable group in the restructured file.
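
The per-ID reasoning above can be checked with a small Python sketch. This is a hedged illustration of the rule as described, not SPSS code; `None` stands in for sysmis, and the helper names `distinct_valid` and `varies_within_any_id` are made up for this example.

```python
# X values per ID from the example; None stands in for sysmis.
X_BY_ID = {
    11: [1, None, None],
    12: [None, None, 1, None],
    13: [None],
    14: [None],
}


def distinct_valid(values):
    """Distinct non-missing values in one row group."""
    return {v for v in values if v is not None}


def varies_within_any_id(x_by_id):
    """True if any row group holds two or more distinct valid values."""
    return any(len(distinct_valid(vals)) > 1 for vals in x_by_id.values())


# Same data after recoding sysmis to 0, as in the original post.
recoded = {i: [0 if v is None else v for v in vals] for i, vals in X_BY_ID.items()}

print(varies_within_any_id(X_BY_ID))  # False: X is treated as fixed
print(varies_within_any_id(recoded))  # True: X becomes a variable group
```

Before the recode, no ID has more than one valid value for X; after it, IDs 11 and 12 each have both 0 and 1.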

As Jonathan noted, you can add the subcommand "/autofix=no" to the CASESTOVARS command, as in

Casestovars id=id
  /index=rec
  /autofix = no .

You'll get a warning that X does not vary within id groups, but X will be restructured into multiple variables as you wished.
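
For readers who want to see the target layout, the following Python sketch spreads every variable by the index value, which is the shape Gene expected. It is a rough model under stated assumptions, not a reimplementation of CASESTOVARS: the function name `cases_to_vars` is hypothetical, `None` stands in for sysmis, and the "." separator follows the x.1 to x.5 names reported earlier in the thread.

```python
# (id, rec, x, y) tuples from the original post; None stands in for sysmis.
ROWS = [
    (11, 1, 1, 2), (11, 2, None, 3), (11, 4, None, 1),
    (12, 1, None, 4), (12, 3, None, 4), (12, 4, 1, 2), (12, 5, None, 3),
    (13, 1, None, 2),
    (14, 1, None, 3),
]


def cases_to_vars(rows, sep="."):
    """Spread x and y into one column per index value (illustrative only)."""
    indexes = sorted({rec for _, rec, _, _ in rows})
    wide = {}
    for id_, rec, x, y in rows:
        # Start each new ID with every spread column set to missing.
        case = wide.setdefault(
            id_,
            {f"{name}{sep}{i}": None for name in ("x", "y") for i in indexes},
        )
        case[f"x{sep}{rec}"] = x
        case[f"y{sep}{rec}"] = y
    return wide


wide = cases_to_vars(ROWS)
print(list(wide[11]))  # column names: x.1 ... x.5, then y.1 ... y.5
print(wide[12])
```

In this sketch ID 12 ends up with x.4 = 1 and y.1, y.3, y.4, y.5 = 4, 4, 2, 3, in line with the table Gene said he expected.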


David Matheson
Statistical Support
SPSS, an IBM company

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Fry, Jonathan B.
Sent: Tuesday, October 13, 2009 1:08 PM
To: [hidden email]
Subject: Re: Casestovars question--sysmis values
