|
Dear SPSSers,
I wonder if anyone could suggest a solution to this data processing / manipulation problem?

*Background and description of data*
I am working on a longitudinal data set that consists of psychiatric symptoms, coded on a 5-point (0-4) scale. The data set is 'long', with a separate case for each observation. Participants were interviewed annually and the number of observations per participant is variable. Each participant has a unique ID code (combination of letters and digits) and a date is recorded for each interview.

*Question*
I would like to know how many participants have had a score of 2 or more (this being the accepted criterion for the symptom being present) for each symptom at _any_ of the interviews during the follow-up period.

So, I guess I want to create a new variable (e.g. 'ever_panic') that is coded '1' if a given symptom (e.g. 'panic') is scored 2 or higher at _any_ of the interviews with an individual participant. Or, alternatively, to generate an output that lists the ID codes and whether or not the symptom was present at any time?

I can only think of tortuously long-winded ways to do this, which is not ideal because the process will need to be repeated for 30+ items! Can anyone suggest a solution? I am using SPSS 14.

I would be very grateful for any suggestions! Thanks for reading.

Best wishes,
Jennifer

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD |
|
Jennifer -
I'm not directly answering your question (sorry), but I do have a resource to point you toward that may be useful. I recently attended a workshop at APS by Don Hedeker in which he discussed longitudinal analysis in SPSS. Here is his webpage: http://tigger.uic.edu/~hedeker/

All of the handouts, datasets and syntax files are available for download. He also gives SAS code if you would prefer to use SAS for your analyses.

Hope this helps.

Sara

Sara M. House, M.A.
Adjunct Faculty
Loyola University Chicago, Psychology Department
Email: [hidden email]
Teaching: Research Methods, Psychology & Law
AND
Data Analyst
Chicago Public Schools, Department of Program Evaluation
Email: [hidden email] |
|
In reply to this post by Jennifer Thompson
Jennifer:

Try the COUNT transform command.

COUNT NEWVAR_SYMA=syma1 syma2 syma3 (2 THRU 5).
COUNT NEWVAR_SYMB=SYMb1 SYMb2 SYMb3 (2).
EXECUTE.

For the above, symptoms are represented by letters, the annual variables are represented by numbers.

For each symptom, do the above, with the listing of the repeated variable/year after the equal sign. In parentheses are the values (2 through 5, in your case) for which you wish a "running count" to be created, for each symptom, for each subject. Therefore, if a given subject has ten years of data, and eight times s/he had a value of 2 or more, then NEWVAR = 8, for that subject.

Then do a FREQUENCIES for each NEWVAR_SYMx.

See if that works for you.

Joe Burleson

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jennifer Thompson
Sent: Thursday, May 29, 2008 12:46 PM
To: [hidden email]
Subject: Longitudinal data / data processing |
|
Hi Joe,
Many thanks for your response. Unless I've misunderstood you, I think the method you suggest would work perfectly if my data were 'wide', with all the observations for each individual in one row/case, but each observation is actually a separate row/case. I probably did not explain this clearly! See (fabricated) example below.

Best wishes,
Jennifer

id    date      syma  symb  symc  ----etc
jt76  01.05.98  0     1     0
jt76  01.05.99  1     2     0
jt76  01.05.00  2     2     3
jt76  01.05.01  0     0     0
lj75  01.06.98  2     3     2
lj75  01.05.99  2     4     2
lj75  02.05.00  2     2     3
lj75  01.05.01  1     2     3
ah71  01.05.97  1     0     0
ah71  01.05.98  0     0     0
ah71  01.05.99  0     0     0
ah71  01.05.00  1     1     0
sm68  01.05.00  1     2     0
sm68  01.05.01  0     0     0
sm68  01.05.02  1     1     1 |
|
In reply to this post by Jennifer Thompson
If you're trying to select individuals for further analysis, you can easily do this by aggregating on ID code and using the PGT function, which will provide the percentage greater than 2 for your symptom by ID. You end up with one case per ID or, with MODE=ADDVARIABLES, that value attached to each case. |
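A minimal sketch of the PGT approach suggested in this reply, not tested. The variable names id, panic, pct_panic and ever_panic are placeholders. One assumption to note: PGT counts values strictly greater than the cutoff, so for a "2 or more" criterion on an integer 0-4 scale the cutoff used here is 1 rather than 2.

```spss
* Sketch only: id, panic, pct_panic and ever_panic are placeholder names.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=id
  /pct_panic = PGT(panic, 1).
* ever_panic is 1 if any interview scored 2 or more, otherwise 0.
COMPUTE ever_panic = (pct_panic > 0).
EXECUTE.
```

If a participant has no valid panic scores at all, pct_panic should come out missing, and ever_panic with it, which is probably what is wanted.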
|
In reply to this post by Jennifer Thompson
Shalom
You can use the following code:

recode symptom1 to symptom10 (0 1=0) (2 3 4=1).
aggregate outfile=*
  /break=ID
  /symptom1 to symptom10 = sum(symptom1 to symptom10).
recode symptom1 to symptom10 (1 thru hi=1).

That will give you a file with a line for every participant and a value of 1 for each symptom if that symptom had a value of 2 or higher in any of the participant's observations. You can then match that line to the original file if you need it.

Hillel Vardi
BGU |
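The last step mentioned in this reply, matching the per-participant line back to the original file, might look roughly like the sketch below (not tested). It takes the maximum of the raw scores rather than a sum of recoded ones, so the original symptom scores stay untouched. The file name ever_flags.sav and the names ever1 to ever3 are hypothetical, and only three symptoms are shown.

```spss
* Sketch only: ever_flags.sav, ever1, ever2 and ever3 are hypothetical names.
SORT CASES BY id.
* Write one line per participant to a separate file.
AGGREGATE
  /OUTFILE='ever_flags.sav'
  /BREAK=id
  /ever1 ever2 ever3 = MAX(symptom1 symptom2 symptom3).
* Attach the per-participant values to every interview record.
MATCH FILES
  /FILE=*
  /TABLE='ever_flags.sav'
  /BY id.
* Turn the maximum score into a 0/1 flag.
RECODE ever1 ever2 ever3 (2 THRU HI=1) (LO THRU 1=0).
EXECUTE.
```

MATCH FILES expects both files to be in id order, which is why the sort comes first.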
|
In reply to this post by Jennifer Thompson
Hello,
Another idea is to restructure the dataset. As a result you have one row for every patient (=id).

You start with the data:

ID    DATE         SYMA  SYMB  SYMC
1,00  01-MAY-1998   ,00  1,00   ,00
1,00  01-MAY-1999  1,00  2,00   ,00
1,00  01-MAY-2000  2,00  2,00  3,00
2,00  02-MAY-1998   ,00   ,00   ,00
2,00  02-MAY-1999  2,00  3,00  2,00
3,00  03-MAY-1998  2,00  4,00  2,00
3,00  03-MAY-1999  2,00  2,00  3,00
3,00  03-MAY-2000  1,00  2,00  3,00
3,00  03-MAY-2001  1,00   ,00   ,00

And you restructure the data: Data -> Restructure ... which will generate the following syntax:

SORT CASES BY id .
CASESTOVARS
 /ID = id
 /GROUPBY = VARIABLE .

This transforms the data by turning cases into variables:

id    date.1       date.2       date.3       date.4       syma.1  syma.2  syma.3  syma.4  symb.1  symb.2  symb.3  symb.4  symc.1  symc.2  symc.3  symc.4
1,00  01-MAY-1998  01-MAY-1999  01-MAY-2000  .             ,00    1,00    2,00    .       1,00    2,00    2,00    .        ,00     ,00    3,00    .
2,00  02-MAY-1998  02-MAY-1999  .            .             ,00    2,00    .       .        ,00    3,00    .       .        ,00    2,00    .       .
3,00  03-MAY-1998  03-MAY-1999  03-MAY-2000  03-MAY-2001  2,00    2,00    1,00    1,00    4,00    2,00    2,00     ,00    2,00    3,00    3,00     ,00

Now it is possible to analyse the data in the way you are used to (count, etc.).

Carsten. |
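Once the file is wide, the COUNT idea from earlier in the thread can be applied per symptom. The following is a rough sketch, not tested, assuming the restructured variables syma.1 to syma.4 sit next to each other in the file (as they should with GROUPBY = VARIABLE). The names n_syma and ever_syma are placeholders.

```spss
* Sketch only: n_syma and ever_syma are placeholder names.
* Count the interviews at which syma was scored 2 or more.
COUNT n_syma = syma.1 TO syma.4 (2 THRU HI).
* ever_syma is 1 if that count is above zero, otherwise 0.
COMPUTE ever_syma = (n_syma > 0).
FREQUENCIES VARIABLES=ever_syma.
```

One thing to watch: COUNT treats missing interviews as "not 2 or more", so a participant with no valid scores would get ever_syma = 0 rather than missing.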
|
Many thanks to Joe, Carsten, Hillel & ViAnn for all the very helpful
responses to this query (I have included the original query and responses below). Restructuring the datafile from long/stacked to wide/flat would seem to be the best solution.

Best wishes,
Jennifer

----------
From: *Burleson,Joseph A.* <[hidden email]>
Date: 2008/5/29
To: Jennifer Thompson <[hidden email]>

Jennifer:

No problem: simply restructure the file in SPSS (best to use the Restructure Data "Wizard", which does it for you interactively; see the SPSS documentation). That will make a "flat" file out of your "stacked" file.

The wizard asks how many types (30+ in your case) or groups of variables you want to restructure. It asks what the maximum number of "stacked" rows per case is. It also asks which variables are constant (e.g., gender), since it only has to ask for them once. You can keep as many or as few variables in your new flat file as you wish. ID, of course, has to be the "link" variable, so take note of how this is done in the Wizard procedures.

Then do the COUNT on the flat file.

Joe |
|
Hi,
Does anybody know a way to convert all the string variables in my data editor to a width of 255 (syntax, script, Python)? I remember I used a script, but I can't find it. Thanks. |
|
If you have SPSS 16, you can use the ALTER TYPE command.

ALTER TYPE ALL (A=A255).

That means any variables with A format (any length) become A255.

HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Auberth Hurtado
Sent: Friday, May 30, 2008 11:18 AM
To: [hidden email]
Subject: [SPSSX-L] Convert all string var to 255 character |
|
In reply to this post by Jennifer Thompson
At 12:46 PM 5/29/2008, Jennifer Thompson wrote:
>I am working on a longitudinal data set that consists of psychiatric
>symptoms, coded on a 5-point (0-4) scale. The data set is 'long'
>with a separate case for each observation. The number of
>observations per participant is variable. Each participant has a
>unique ID code (combination of letters and digits) and a date is
>recorded for each interview.
>
>I want to create a new variable (e.g. 'ever_panic') that is coded
>'1' if a given symptom (e.g. 'panic') is scored 2 or higher for
>_any_ of the interviews with an individual participant.

ViAnn Beadle is right: AGGREGATE is the prime tool for summarizing in a 'long' dataset. I'd use MAX rather than PGT. Like this (not tested):

AGGREGATE /OUTFILE=* MODE=ADDVARIABLES
   /BREAK=Participant_ID
   /ever_panic = MAX(panic).

Now, 'ever_panic' is the worst panic score observed for the patient, which may do you just fine. Or, you can change to 'yes/no' with RECODE:

RECODE ever_panic
   (MISSING   = 9)
   (2 THRU HI = 1)
   (ELSE      = 2).

(Note the special clause for missing values; omit it, and you'll get misleading results when there's any patient without data.)

Now, you write,

>I can only think of tortuously long-winded ways to do this, which is
>not ideal because the process will need to be repeated for 30+ items!

I'm afraid that this won't be as compact as one would like. That is, you'll need a line in the AGGREGATE for every symptom:

AGGREGATE /OUTFILE=* MODE=ADDVARIABLES
   /BREAK=Participant_ID
   /ever_panic   = MAX(panic)
   /ever_depress = MAX(depress)
   /ever_angry   = MAX(angry)
   ...

At least the RECODE stays simple:

RECODE ever_panic TO ever_angry
   (MISSING   = 9)
   (2 THRU HI = 1)
   (ELSE      = 2).

Does this get you nearer where you want to go?

-Best of luck to you,
 Richard |
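To get the count of participants asked for in the original question, one option is to aggregate to one case per participant instead of attaching the values to every interview. A rough sketch along those lines, not tested, using the same placeholder variable names as the reply above:

```spss
* Sketch only: Participant_ID, panic, depress and angry are placeholder names.
* Without MODE=ADDVARIABLES the working file is replaced by one case per participant.
AGGREGATE
  /OUTFILE=*
  /BREAK=Participant_ID
  /ever_panic = MAX(panic)
  /ever_depress = MAX(depress)
  /ever_angry = MAX(angry).
* Code 1 = symptom ever present, 2 = never present, 9 = no data.
RECODE ever_panic ever_depress ever_angry
  (MISSING = 9) (2 THRU HI = 1) (ELSE = 2).
FREQUENCIES VARIABLES=ever_panic ever_depress ever_angry.
```

Because this replaces the working file, it is worth saving the interview-level data first.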
