A co-worker of mine has a really weird issue with one of her datasets that has me completely stumped. Hopefully one of you has heard of it. In this dataset, when I run a fairly simple IF command that creates a new variable, for example:

    if goreg=1& fulltime=1 blah=1.

one would expect the only possible value for the new variable blah to be 1. Instead, when I do a frequency count of the new variable I get a number of different values, ranging from 0 to 12. This anomaly occurs when combining a number of different variables in the dataset in a number of different ways. When I open the dataset in SPSS 11 this does not happen.

What is even more bizarre is that the error does not seem to be stable. If I rerun the above syntax several times with different names for the variable I create, I don't always get the same answer. For example, after opening the dataset I run

    if goreg=1&fulltime=1 blah=1.
    freq blah.

and get

    blah
                     Frequency  Percent  Valid Percent  Cumulative Percent
    Valid       .00       4696     61.4           75.4                75.4
               1.00       1256     16.4           20.2                95.5
               2.00         19       .2             .3                95.8
               3.00         17       .2             .3                96.1
               4.00         24       .3             .4                96.5
               5.00         18       .2             .3                96.8
               6.00         22       .3             .4                97.1
               7.00         21       .3             .3                97.4
               8.00         46       .6             .7                98.2
               9.00          4       .1             .1                98.3
              10.00         11       .1             .2                98.4
              11.00         13       .2             .2                98.6
              12.00         85      1.1            1.4               100.0
              Total       6232     81.5          100.0
    Missing  System       1412     18.5
    Total                 7644    100.0

but then if I immediately run

    if goreg=1&fulltime=1 acccc=1.
    freq acccc.

I get

    acccc
                     Frequency  Percent  Valid Percent  Cumulative Percent
    Valid       .00       2587     33.8           39.2                39.2
               1.00       1286     16.8           19.5                58.8
               2.00        220      2.9            3.3                62.1
               3.00        276      3.6            4.2                66.3
               4.00        291      3.8            4.4                70.7
               5.00        384      5.0            5.8                76.5
               6.00        305      4.0            4.6                81.1
               7.00        330      4.3            5.0                86.1
               8.00        648      8.5            9.8                96.0
               9.00         41       .5             .6                96.6
              10.00         83      1.1            1.3                97.9
              11.00         41       .5             .6                98.5
              12.00        100      1.3            1.5               100.0
              Total       6592     86.2          100.0
    Missing  System       1052     13.8
    Total                 7644    100.0

As you can see, the counts for acccc and blah are different even though the syntax to create them is the same. If I keep running the syntax with different variable names, the weird values eventually "go away": after about 5 runs you only get "1"s, which is of course all you should be getting in the first place.

This problem only happens on this one dataset in SPSS 14. Does anyone have any idea what could be causing this?

-Graham
Does this

    Comp blah = goreg =1 & fulltime = 1.

produce the same results?

Could blah already exist in the dataset, so that you are getting values not picked up by the IF command?

-- jim
The only possible reasons I can fathom for these results are:

1. The outcome variable BLAH already existed, and the extraneous values were already there. They remained untouched as long as they were not affected by the condition set in the IF command.

2. The COMPUTE command lacks parentheses. I know that recent versions of SPSS can dispense with them, but to be on the safe side I would enclose the condition in parentheses:

    Compute blah = (goreg =1 & fulltime = 1).

3. In the example in Graham Wright's original posting there are no spaces between some parts of the condition, which may confuse SPSS. He wrote the statement as:

    if goreg=1& fulltime=1 blah=1.

   Here the "&" sign is contiguous to "GOREG=1", and this may distort or invalidate the condition, hindering its application. If the BLAH variable existed beforehand, its original values would have survived even where the intended condition was met.

4. Notice also that COMPUTE BLAH=(goreg =1 & fulltime = 1) produces a zero in all "false" cases and a 1 in all "true" cases, whereas the IF version used in the original posting produces (if correctly written) a 1 if the condition is fulfilled and nothing in any other case.
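The difference between IF and COMPUTE described in point 4 can be modelled outside SPSS. The following is only a minimal sketch in plain Python, not SPSS syntax: `None` stands in for system-missing, the five sample cases are made up, and the helper names (`eq`, `and3`, `run_if`, `run_compute`) are hypothetical. It assumes the usual SPSS rules that a new numeric variable starts out system-missing and that a false operand outweighs a missing one in a logical AND.

```python
# Plain-Python model of the SPSS behaviour discussed above (not SPSS syntax).
# None stands in for system-missing; all function names are hypothetical.

def eq(x, target):
    """x = target; a missing input gives a missing result."""
    return None if x is None else x == target

def and3(a, b):
    """Three-valued AND: false beats missing, missing beats true."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def run_if(existing, cond, value):
    """IF cond target=value: assign only where cond is true, else keep the old value."""
    return [value if c is True else old for old, c in zip(existing, cond)]

def run_compute(cond):
    """COMPUTE target=(cond): 1 where true, 0 where false, missing where missing."""
    return [None if c is None else int(c) for c in cond]

# Five made-up cases; None marks a missing input value.
goreg = [1, 1, 2, None, None]
fulltime = [1, 2, 1, 1, 2]
cond = [and3(eq(g, 1), eq(f, 1)) for g, f in zip(goreg, fulltime)]

new_var = [None] * len(cond)      # a brand-new variable starts as system-missing
old_var = [7, 8, 9, 10, 11]       # a pre-existing variable with leftover values

if_on_new = run_if(new_var, cond, 1)
if_on_old = run_if(old_var, cond, 1)
computed = run_compute(cond)

print(if_on_new)   # [1, None, None, None, None]
print(if_on_old)   # [1, 8, 9, 10, 11]
print(computed)    # [1, 0, 0, None, 0]
```

In this model, values other than 1 can only appear after IF when the target variable held them beforehand, which is why the replies keep asking whether BLAH already existed.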
In reply to this post by Marks, Jim
When I run your syntax on a freshly opened dataset that does not already have a variable called BLAH in it, I get the following:

    blah
                     Frequency  Percent  Valid Percent  Cumulative Percent
    Valid       .00       5311     69.5           81.1                81.1
               1.00       1237     16.2           18.9               100.0
              Total       6548     85.7          100.0
    Missing  System       1096     14.3
    Total                 7644    100.0

This is, I think, correct, because the COMPUTE statement returns a zero if the condition is not met, instead of system-missing. So it looks like your syntax works fine, although I'm not sure what that means.

Also, in regard to Hector's suggestions: the variables mentioned (BLAH etc.) did not already exist in the dataset, and I did try the compute statement with parentheses and the space before the &, and I get the same results.

-g
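For what it's worth, the tables quoted in this thread can be compared with simple arithmetic. The sketch below is plain Python, not SPSS, and rests on assumptions that the thread does not confirm: that the COMPUTE run and the two IF runs used the same 7644 cases and the same condition, and that every case where the condition is true did receive a 1 from IF. The variable names are only illustrative.

```python
# Counts copied from the frequency tables quoted in this thread.
TOTAL_CASES = 7644
COMPUTE_ONES = 1237          # COMPUTE result: cases where the condition is true

if_runs = {
    # name: (count of value 1.00, count of system-missing) after the IF command
    "blah": (1256, 1412),
    "acccc": (1286, 1052),
}

# If IF had worked on a brand-new variable, every other case would be system-missing.
expected_missing = TOTAL_CASES - COMPUTE_ONES

report = {}
for name, (ones, missing) in if_runs.items():
    stray_ones = ones - COMPUTE_ONES            # 1s beyond the cases where the condition holds
    stray_values = expected_missing - missing   # cases holding any value they should not have
    report[name] = (stray_ones, stray_values)
    print(f"{name}: {stray_ones} stray 1s, {stray_values} cases with an unexpected value")
```

Under those assumptions, 6407 cases should have been system-missing in each IF run, and the two runs differ in how many of them ended up holding a value instead.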
If Blah already exists, the Compute command will overwrite all existing values of Blah, and the new Blah has only three possible values: 0, 1, and system-missing. The If command will only overwrite the existing values of Blah if the condition is met.
    input program.
    loop #i=1 to 1000.
    compute var1=trunc(rv.uniform(1,3)).
    compute var2=trunc(rv.uniform(1,3)).
    compute var3a=trunc(rv.uniform(1,12)).
    compute var3b=var3a.
    end case.
    end loop.
    end file.
    end input program.

    * Note that both var3a and var3b already exist and contain values of 1-11.
    compute var3a=(var1=1 & var2=1).
    if var1=1 & var2=1 var3b=1.
    frequencies variables=var3a var3b.