|
Dear SPSS-L,
I am working on a project in which the data I receive is available only in Excel format and contains an unequal number of lines per case, but lists the ID number in only the first line of the case. I would like to know how I might accomplish the following with syntax:

1. Add the ID number to each line of the unequal number of lines per case.

ID  X1
111 333
    334
    333
112 333
    333
113 333
    333
    333
    333

2. Create an additional variable (X2) within the dataset in which the value computed is the sum of lines within a case possessing the value 334 for (X1). For example, for case 111 in the data above, the value of X2 would be 2, while the value for case 113 would be 4.

3. Create a separate dataset with a single case per variable, in which the line chosen as the case for the new dataset is determined by the value in (X1). For example, in the data above, (X2)=1 if (X1)=334 in ANY lines of the case and (X2)=0 if (X1)=333 in ALL lines of the case.

Thanks in advance for your help with this,

Jim

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
|
Shalom
Here is syntax for your first two requests; your third one is not clear.

title 'Unequal Number of Lines Per Case with ID listed only in First Line' .
DATA LIST / id x1 (2f4) .
BEGIN DATA
111 333
    334
    333
112 333
    333
113 333
    333
    333
    333
END DATA.
* x2 carries the current ID forward; sum333 and sum334 are running counts within a case.
numeric x2 sum333 sum334 (f4) .
leave x2 sum333 sum334 .
* A non-missing id marks the first line of a case: store the ID and reset the counters.
do if sysmis(id) eq 0 .
compute x2=id.
compute sum333=0.
compute sum334=0.
else .
* Continuation line: copy the stored ID into id.
compute id=x2.
end if.
if x1 eq 333 sum333=sum(sum333,1).
if x1 eq 334 sum334=sum(sum334,1).
execute .

Hillel Vardi
BGU |
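The syntax in this reply keeps running counts, so the total for a case appears only on that case's last line. If the total is wanted on every line of the case (request 2), one possible follow-up is AGGREGATE with MODE=ADDVARIABLES. The following is a minimal, untested sketch under that assumption; the names is334 and n334 are illustrative and do not come from the thread, and the ID is filled down with a one-line LAG-based IF purely for brevity.

```spss
* Sketch: fill the ID down, then attach the per-case count of 334 lines to every line.
* is334 and n334 are illustrative names, not taken from the thread.
DATA LIST / id x1 (2f4).
BEGIN DATA
111 333
    334
    333
112 333
    333
113 333
    333
    333
    333
END DATA.
* Copy the ID from the previous line wherever it is blank.
IF (SYSMIS(id)) id = LAG(id).
* Flag the lines holding 334 (1 = yes, 0 = no).
COMPUTE is334 = (x1 EQ 334).
* Sum the flag within each ID and write the total back to every line.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id
  /n334 = SUM(is334).
LIST.
```

On the sample data only one line holds 334 (in case 111), so n334 should come out as 1 for ID 111 and 0 for IDs 112 and 113.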
|
In reply to this post by Whanger, J. Mr. CTR
James,
>>I am working on a project in which the data I receive is available only in Excel format and contains an unequal number of lines per case, but lists the ID number in only the first line of the case.

I'm going to assume that you have read, or can read, the data into SPSS from the Excel file. From this point on, unless stated otherwise, I assume the data are in an SPSS data file.

>>1. Add the ID number to each line of the unequal number of lines per case.

ID  X1
111 333
    334
    333
112 333
    333
113 333
    333
    333
    333

Simple enough.

If (sysmis(id)) id=lag(id).

>>2. Create an additional variable (X2) within the dataset in which the value computed is the sum of lines within a case possessing the value 334 for (X1). For example, for case 111 in the data above, the value of X2 would be 2, while the value for case 113 would be 4.

This doesn't make sense to me. If case 111 is the only example, a better definition of X2 would be that X2 is the sum of the line numbers within a case on which the value of X1 is 334. For case 111, X1=334 appears on line 2 and only line 2. Thus, X2=2. X1=334 does not appear in either case 112 or case 113. Please explain.

>>3. Create a separate dataset with a single case per variable, in which the line chosen as the case for the new dataset is determined by the value in (X1). For example, in the data above, (X2)=1 if (X1)=334 in ANY lines of the case and (X2)=0 if (X1)=333 in ALL lines of the case.

X2 is already used in Question 2. Let us say the new variable is X3 and defined as described above.

Having multiple records or lines per case is a real pain. I understand that you can't do anything about the structure of the incoming data file. However, for question 3, I'd restructure your dataset from 'long' to 'wide' and then work with variables rather than records. I'll assume that x1 may have other values besides 333 and 334.

One other comment. I haven't tested this code and I'm a bit skeptical that SPSS can handle a variable name of x1 in a vector structure. Thus it may be necessary to rename x1 to, for instance, y. I do so here. Also, I assume that the maximum number of lines or records per case is 5. You will need to adjust that number based on the value of recs from the frequencies command.

Rename variables (x1=y).
Casestovars /id=id /count=recs.
Frequencies recs.
Compute x3a=0.
Compute x3b=0.
* Casestovars names the restructured variables y.1, y.2, ... by default.
Vector y=y.1 to y.5. /* y.5 assumes a max of 5 lines per case. May be too small.
Loop #i=1 to recs.
+ if (y(#i) eq 334) x3a=x3a+1.
+ if (y(#i) eq 333) x3b=x3b+1.
End loop.
Compute x3=9.
If (x3a ge 1) x3=1.
If (x3b eq recs) x3=0.
Save outfile='<file name string>' /keep=id x3.

Gene Maguin |
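If restructuring to wide format is not otherwise needed, the any/all rule in question 3 might also be computed directly from the long file with AGGREGATE. This is a minimal, untested sketch: the names has334 and percase are illustrative, and it assumes X1 takes only the values 333 and 334. Unlike the restructuring code in this reply, it has no separate code (9) for cases that contain other values.

```spss
* Sketch: one record per ID, x3 = 1 if any line has 334, otherwise 0.
* has334 and percase are illustrative names, not taken from the thread.
DATA LIST / id x1 (2f4).
BEGIN DATA
111 333
    334
    333
112 333
    333
113 333
    333
    333
    333
END DATA.
* Copy the ID from the previous line wherever it is blank.
IF (SYSMIS(id)) id = LAG(id).
* Flag the lines holding 334 (1 = yes, 0 = no).
COMPUTE has334 = (x1 EQ 334).
* The maximum of the flag within an ID is 1 as soon as any line has 334.
DATASET DECLARE percase.
AGGREGATE
  /OUTFILE=percase
  /BREAK=id
  /x3 = MAX(has334).
DATASET ACTIVATE percase.
LIST.
```

The new dataset has one row per ID, which sidesteps the need to know the maximum number of lines per case in advance.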
