|
I am working on an analysis of the caseload of an employment assistance program
our provincial government is running. Among other things, I need to know the length of the "spells" during which clients were receiving assistance.

I have modified the original data set so that each month in which the program was delivered during a certain period (129 months) has been classified as an "entry month", an "exit month" or a "no change month"; i.e., I have a series of month(i) variables, month1 to month129, storing entry and exit codes for about 20,000 clients. How can I count how many months elapse between the "entry" and "exit" months for each client? The majority of the clients entered and exited the program multiple times in the course of the 129 months in question.

I would hugely appreciate any advice that would point me in the right direction.

Thanks!
Oksana

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
|
|
I have an initial suggestion, but I don't think it is a complete solution. Consider:

It is hard to tell exactly what variables and values you are using, but I assume that you have a group of variables month1 to month129 representing the 129-month period (by the way, because months are not a constant unit, it might have been better to use weeks). A case's data might look like the following:

IDE0001 0000111111111111100000000000111111111....

IDE0001 is a unique case identifier, and each subsequent 0 or 1 represents a month and whether the person was in the assistance program for that month. So the first 0 represents month1, when the case is NOT in the assistance program. The fifth digit is a "1", indicating that the person entered the assistance program in month 5. The case stays in the assistance program for 13 months (hence the sequence of 13 ones). The case then exits in month 17. In essence, you want the distance between the first one and the last one in this sequence.

Something like the following might work (it is untested and the logic might not work):

vector months=month1 to month129.  /* Define a vector of 129 variables.
loop #i=1 to 129                   /* Define a loop for 129 values.
do if months(#i) eq 1.             /* Begin do if for first occurrence of a "1".
compute start001=#i.               /* Make "start001" equal to the loop number containing "1".
loop #j=#1 to 129.                 /* New loop for 129 values.
do if months(#j) eq 0.             /* Test for occurrence of "0".
compute end001=(#j - 1).           /* Make end001 equal to the loop number for #j-1; last "1".
end if.                            /* End first if.
end loop.                          /* End first loop.
compute end001=129.                /* If all remaining months contain "1" then last month is 129.
end if.                            /* End second if.
end loop.                          /* End second loop.
* start001 gives the # of the month a case first entered the assistance program.
* end001 gives the # of the month a case first exited the assistance program.
* The difference between end001 and start001 gives the # of months in the program.
compute period001=end001 - start001.

The problem now is how to locate new sequences of "1s". If the above works, one can use end001 as the starting point for a copy of the above commands, so that the loops begin at the end of the first entry into the assistance program, and so on for subsequent sequences of ones.

Again, I am not sure of the logic of the above program or whether all of the syntax is necessary, but it should give you one idea of how to approach the problem. I would hope that others will point out any problems or errors in the above code. It is also possible that someone will come up with simpler code.

-Mike Palij
New York University
[hidden email]
|
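For readers who want to check the first-spell idea outside SPSS, here is a minimal Python sketch of the logic described in the post above: scan for the first 1, then for the next 0. It is not the poster's code, and `first_spell` is a hypothetical helper name. The sketch counts both the entry month and the last month in the program, so months 5 to 17 give 13, which agrees with the "13 ones" in the example; a plain end minus start would give 12.

```python
def first_spell(months):
    """Return (start, end, length) of the first run of 1s, or None.

    `months` is a sequence of 0/1 flags; month numbers are 1-based.
    """
    start = None
    for i, flag in enumerate(months, start=1):
        if flag == 1 and start is None:
            start = i                       # first month in the program
        elif flag == 0 and start is not None:
            end = i - 1                     # last month before the first exit
            return start, end, end - start + 1
    if start is None:
        return None                         # never in the program
    end = len(months)                       # spell still running in the final month
    return start, end, end - start + 1


# Same shape as the example record: 4 zeros, 13 ones, 11 zeros, 9 ones.
example = [0] * 4 + [1] * 13 + [0] * 11 + [1] * 9
print(first_spell(example))
```

Later spells could be found by calling the same function on the slice that follows the first spell, which mirrors the "copy the commands and restart at end001" idea.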
|
In reply to this post by Oksana Starchenko
Shalom
If your data are

IDE0001 0000111111111111100000000000111111111....

then

count summonths = month1 to month129 (1).

will give you the number of months the client was in the program.

But if your data are

IDE 00010000-10000001000000-1000001000....

where 0 is no change, 1 is start and -1 is end (coded as 2 in the example below), then something like this syntax will do the job:

data list fixed / id month1 to month31 (a4,31f1).
begin data
IDE 00010000200000010000002000001000
end data.
vector month=month1 to month31.
loop ii=1 to 31.
compute ttmonth=month(ii).
if month(ii) eq 1 workstat=1.
if month(ii) eq 2 workstat=2.
if (workstat eq 1) and (month(ii) eq 0 or month(ii) eq 1) summonths=sum(summonths,1).
print / ii workstat summonths ttmonth.
end loop.
execute.

Without the loop, this code will work on a long file as well. It is not clear whether the exit month should be added to summonths; if so, change the counting line to

if ((workstat eq 1) and (month(ii) eq 0 or month(ii) eq 1)) or month(ii) eq 2 summonths=sum(summonths,1).

Hillel Vardi
BGU
|
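The entry/exit-coded layout discussed in the post above can be sketched in Python as a small state machine. This is only an illustration under stated assumptions: the codes 1 (entry), 2 (exit) and 0 (no change) follow the example in that post, `months_in_program` is a hypothetical name, and whether the exit month counts as a month in the program is left as a switch because the thread does not settle it.

```python
def months_in_program(codes, entry=1, exit_code=2, count_exit_month=False):
    """Total months in the program for one client, from per-month event codes."""
    in_program = False
    total = 0
    for code in codes:
        if code == entry:
            in_program = True               # spell starts this month
        elif code == exit_code:
            in_program = False              # spell ends this month
            if count_exit_month:
                total += 1
        if in_program:
            total += 1                      # entry month or a "no change" month inside a spell
    return total


# Assumed toy record: entry in month 2, exit in month 5, re-entry in month 7.
codes = [0, 1, 0, 0, 2, 0, 1, 0]
print(months_in_program(codes))
print(months_in_program(codes, count_exit_month=True))
```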
|
In reply to this post by Mike
Oksana,
I am pretty, actually, absolutely certain that something like this has been done before. I can't recall when or under what topic, which makes searching impossible. Anyway, you really have two problems to solve. The first is determining the number of 'spells'; the second is their lengths. You can side-step the first by assuming that the maximum possible number is some large number and then actually counting the spells after you've determined the length of each one. The downside is that if you mis-guess the maximum number of possible spells, you get an error. Big deal. Increase the vector specification and run it again.

Mike, I've used your code because you have much of the structure in place, and added to it to incorporate the counting for multiple spells. My additions are in CAPS. Deletions are just that--deletions. I've also changed the formatting a bit to emphasize the structure.

I'll assume the data correspond to Mike's example, which I don't think is quite true given Oksana's description, but that shouldn't matter. That is,

data list / tag month1 to month40(a7,4x,40f1.0).
begin data
IDE0001    0000111111111111100001100000111111111000
IDE0002    1100111011000011100001100110111100011011
IDE0003    0110110000000101011001100000110011110101
end data.
frequencies tag.

vector months=month1 to month40.       /* Define a vector of 40 variables (129 in the real data).
VECTOR SPELLS(10,F2.0).
COMPUTE #I=0.
COMPUTE #K=0.
loop IF (#i LT 40).                    /* Define a loop for 40 values (129 in the real data).
+  COMPUTE #I=#I+1.
+  do if months(#i) eq 1.              /* Begin Do if for first occurrence of a "1".
+    COMPUTE #K=#K+1.
+    loop #j=#i to 40.                 /* New loop. << TYPO HERE IN THE ORIGINAL. '1' INSTEAD OF 'I'.
+      do if months(#j) eq 0.          /* test for occurrence of "0".
+        COMPUTE SPELLS(#K)=(#J-#I).   /* LENGTH OF K'TH SPELL.
+        COMPUTE #I=#J.                /* RESET THE OUTER LOOP INDEX.
+        BREAK.                        /* JUMP OUT OF LOOP WHEN END OF SPELL FOUND.
+      ELSE IF (#J EQ 40).             /* TEST FOR END OF STRING EFFECTS. N=40 FOR NOW.
+        COMPUTE SPELLS(#K)=(#J-#I+1). /* LENGTH OF LAST SPELL.
+        COMPUTE #I=#J.                /* RESET THE OUTER LOOP INDEX.
+        BREAK.                        /* JUMP OUT OF LOOP WHEN END OF SPELL FOUND.
+      end if.                         /* end first if.
+    end loop.                         /* end first loop.
+  end if.                             /* end second if.
end loop.                              /* end second loop.
execute.
LIST SPELLS1 TO SPELLS10.

SPELLS1 SPELLS2 SPELLS3 SPELLS4 SPELLS5 SPELLS6 SPELLS7 SPELLS8 SPELLS9 SPELLS10
     13       2       9       .       .       .       .       .       .        .
      2       3       2       3       2       2       4       2       2        .
      2       2       1       1       2       2       2       4       1        .
|
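As a cross-check on the spell lengths for the three test records used in the thread, here is a short Python sketch of the same run-length idea using `itertools.groupby` from the standard library. It is an independent illustration rather than a translation of the SPSS syntax above, and `spell_lengths` is a hypothetical helper name. Because it returns a list, there is no need to guess the maximum number of spells in advance.

```python
from itertools import groupby


def spell_lengths(months):
    """Lengths of every run of consecutive 1s, in order of occurrence."""
    return [sum(1 for _ in run) for flag, run in groupby(months) if flag == 1]


test_data = {
    "IDE0001": "0000111111111111100001100000111111111000",
    "IDE0002": "1100111011000011100001100110111100011011",
    "IDE0003": "0110110000000101011001100000110011110101",
}

for tag, digits in test_data.items():
    lengths = spell_lengths([int(c) for c in digits])
    print(tag, len(lengths), lengths)
```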
|
Administrator
|
I too thought that something similar was discussed fairly recently. I would have also guessed that Gene was one of the respondents who provided a good solution that time! ;-)
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
|
In reply to this post by Oksana Starchenko
At 03:52 PM 3/9/2010, Oksana Starchenko wrote:
> I am working on analysis of a caseload of an employment assistance program our provincial government is running. Among other things I need to know length of "spells" when clients were receiving assistance.

You're not clear what you want: do you want the difference between the earliest entry date and the latest exit date for each client, giving one number per client; or do you want the difference between entry and exit date for each spell, giving multiple values per client? Respondents so far have assumed the latter. I'll do the same, noting that the former (one number per client) is simpler.

There are two reasonable solution paths. One is VECTOR/LOOP logic, as all respondents so far have proposed. The other is 'unrolling' the data to one record per month per client, rather than one record per client. That is more 'idiomatic' SPSS, but may be slower.

It doesn't look like a fully satisfactory solution has been posted, so here are both:

List
|-----------------------------|---------------------------|
|Output Created               |11-MAR-2010 13:23:17       |
|-----------------------------|---------------------------|
|(Has been hand-edited. See code, for the logic that      |
| produced this listing.)                                 |
|-----------------------------|---------------------------|
[TestData]

The variables are listed in the following order:

LINE 1: tag,[month1 TO month20]
LINE 2: [month21 TO month40]

      tag: IDE0001 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
[month21]:         0 1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0

      tag: IDE0002 1 1 0 0 1 1 1 0 1 1 0 0 0 0 1 1 1 0 0 0
[month21]:         0 1 1 0 0 1 1 0 1 1 1 1 0 0 0 1 1 0 1 1

      tag: IDE0003 0 1 1 0 1 1 0 0 0 0 0 0 0 1 0 1 0 1 1 0
[month21]:         0 1 1 0 0 0 0 0 1 1 0 0 1 1 1 1 0 1 0 1

Number of cases read: 3    Number of cases listed: 3

* ........... Solution using VECTOR/LOOP ............... .
DATASET ACTIVATE TestData WINDOW=FRONT.
DATASET COPY LOOPing.
DATASET ACTIVATE LOOPing WINDOW=FRONT.

vector months=month1 to month40.
VECTOR SpLen (10,F2) SpStrt(10,F2).
NUMERIC #SpellNum(F2) /* Index # of spells within client */
        #InSpell (F2) /* Flag: a 'spell' is running */ .
COMPUTE #SpellNum = 0.
COMPUTE #InSpell = 0.

LOOP #MonthNum = 1 TO 40.
.  DO IF #InSpell /* In a running spell: */ .
.     DO IF Months(#MonthNum) EQ 1 /* Spell continues */ .
*        .. Nothing needs doing .. ** ** .
.     ELSE /* Spell has ended */ .
.        COMPUTE SpLen(#SpellNum) = #MonthNum - SpStrt(#SpellNum).
.        COMPUTE #InSpell = 0.
.     END IF.
.  ELSE /* With no active spell: */ .
.     DO IF Months(#MonthNum) EQ 1 /* Spell begins */ .
.        COMPUTE #InSpell = 1.
.        COMPUTE #SpellNum = #SpellNum + 1.
.        COMPUTE SpStrt(#Spellnum) = #MonthNum.
.     ELSE /* Not in a spell */ .
*        .. Nothing needs doing .. ** ** .
.     END IF.
.  END IF.
END LOOP.

* Special case: The last month is within a spell: .. .
DO IF #InSpell.
.  COMPUTE SpLen(#SpellNum) = 41 - SpStrt(#SpellNum).
END IF.

LIST tag, SpLen1 TO SpStrt10.

List
|-----------------------------|---------------------------|
|Output Created               |11-MAR-2010 13:23:19       |
|-----------------------------|---------------------------|
[LOOPing]

          SpLen1 to SpLen10                  SpStrt1 to SpStrt10
tag        1  2  3  4  5  6  7  8  9 10      1  2  3  4  5  6  7  8  9 10

IDE0001   13  2  9  .  .  .  .  .  .  .      5 22 29  .  .  .  .  .  .  .
IDE0002    2  3  2  3  2  2  4  2  2  .      1  5  9 15 22 26 29 36 39  .
IDE0003    2  2  1  1  2  2  2  4  1  1      2  5 14 16 18 22 29 33 38 40

Number of cases read: 3    Number of cases listed: 3

* ........... Solution by unrolling ............... .
DATASET ACTIVATE TestData WINDOW=FRONT.
DATASET COPY Unroll.
DATASET ACTIVATE Unroll WINDOW=FRONT.

VARSTOCASES
   /MAKE InProgram FROM month1 TO month40
   /INDEX = MonthNum(40)
   /KEEP = tag
   /NULL = KEEP.
Variables to Cases
|-----------------------------|---------------------------|
|Output Created               |11-MAR-2010 13:23:20       |
|-----------------------------|---------------------------|
[Unroll]

Generated Variables
|---------|------|
|Name     |Label |
|---------|------|
|MonthNum |<none>|
|InProgram|<none>|
|---------|------|

|-------------|--|
|Variables In |41|
|Variables Out|3 |
|-------------|--|

NUMERIC SpellNum (F2).
LEAVE SpellNum.

DO IF InProgram.
.  DO IF $CASENUM EQ 1.
.     COMPUTE SpellNum = 1.
.  ELSE IF MonthNum EQ 1.
.     COMPUTE SpellNum = 1.
.  ELSE IF LAG(InProgram) EQ 0.
.     COMPUTE SpellNum = SpellNum + 1.
.  END IF.
END IF.
EXECUTE /* required, for the following SELECT IF */.
SELECT IF InProgram.

AGGREGATE OUTFILE=*
   /BREAK= tag SpellNum
   /SpStrt 'First month of spell' = MIN(MonthNum)
   /SpEnd 'Last month of spell' = MAX(MonthNum).

COMPUTE SpLen = SpEnd - SpStrt + 1.
FORMATS SpLen (F2).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |11-MAR-2010 13:23:23       |
|-----------------------------|---------------------------|
|(Blank lines added by hand)                              |
|-----------------------------|---------------------------|

tag      SpellNum SpStrt SpEnd SpLen

IDE0001         1      5    17    13
IDE0001         2     22    23     2
IDE0001         3     29    37     9

IDE0002         1      1     2     2
IDE0002         2      5     7     3
IDE0002         3      9    10     2
IDE0002         4     15    17     3
IDE0002         5     22    23     2
IDE0002         6     26    27     2
IDE0002         7     29    32     4
IDE0002         8     36    37     2
IDE0002         9     39    40     2

IDE0003        10      2     3     2
IDE0003        11      5     6     2
IDE0003        12     14    14     1
IDE0003        13     16    16     1
IDE0003        14     18    19     2
IDE0003        15     22    23     2
IDE0003        16     29    30     2
IDE0003        17     33    36     4
IDE0003        18     38    38     1
IDE0003        19     40    40     1

Number of cases read: 22    Number of cases listed: 22

........................................................

At 09:35 AM 3/11/2010, Gene Maguin (or Mike Palij) wrote:

> I am pretty, actually, absolutely certain that something like this has been done before.
I don't find a solution to count multiple 'runs' per case, but a similar problem, to find the longest 'run', is solved in posting

Date: Fri, 9 Jan 2004 20:45:59 -0500
From: Richard Ristow <[hidden email]>
Subject: Re: Count consecutive zeros across variables
To: [hidden email]

=============================
APPENDIX: Test data, and code
=============================
* C:\Documents and Settings\Richard\My Documents .
* \Technical\spssx-l\Z 2010a\ .
* 2010-03-09 Starchenko - Counting number of months between entry and exit.SPS.

* In response to posting .
* Date: Tue, 9 Mar 2010 15:52:10 -0500 .
* From: Oksana Starchenko <[hidden email]> .
* Subject: Counting number of month between entry into and exit from a .
* program .
* To: [hidden email] .

* ........... Test data, from earlier responses ............... .
data list / tag month1 to month40(a7,4x,40f1.0).
begin data
IDE0001    0000111111111111100001100000111111111000
IDE0002    1100111011000011100001100110111100011011
IDE0003    0110110000000101011001100000110011110101
end data.
DATASET NAME TestData.
DATASET ACTIVATE TestData WINDOW=FRONT.

TEMPORARY.
STRING SPACE(A15) TagSp(A07).
LIST tag, Month1 TO Month20, SPACE, TagSp, Month21 TO Month40.
|
|
And, I'm afraid, fixing a bug in the 'unroll' logic, that assigned spell numbers incorrectly for patients, other than the first, who weren't in the program in month 1 -- IDE0003, in this test data:
LINE 1: tag,[month1 TO month20]
LINE 2: [month21 TO month40]

      tag: IDE0001 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
[month21]:         0 1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0

      tag: IDE0002 1 1 0 0 1 1 1 0 1 1 0 0 0 0 1 1 1 0 0 0
[month21]:         0 1 1 0 0 1 1 0 1 1 1 1 0 0 0 1 1 0 1 1

      tag: IDE0003 0 1 1 0 1 1 0 0 0 0 0 0 0 1 0 1 0 1 1 0
[month21]:         0 1 1 0 0 0 0 0 1 1 0 0 1 1 1 1 0 1 0 1

Number of cases read: 3    Number of cases listed: 3
....................................................

* ........... Solution by unrolling ............... .
VARSTOCASES
   /MAKE InProgram FROM month1 TO month40
   /INDEX = MonthNum(40)
   /KEEP = tag
   /NULL = KEEP.

Variables to Cases
|-----------------------------|---------------------------|
|Output Created               |11-MAR-2010 22:46:37       |
|-----------------------------|---------------------------|
[Unroll]

Generated Variables
|-------------|------|
|Name         |Label |
|-------------|------|
|MonthNum     |<none>|
|InProgram    |<none>|
|-------------|------|

|-------------|------|
|Variables In |  41  |
|Variables Out|   3  |
|-------------|------|

NUMERIC SpellNum (F2).
LEAVE SpellNum.

DO IF MonthNum EQ 1.
.  COMPUTE SpellNum = InProgram.
ELSE.
*  The following is 'cute': It increments the spell number when .. .
*  the current month is in the program, and the previous month .. .
*  was not. But it won't be easy to read at first glance! .. .
.  COMPUTE SpellNum = SpellNum + (InProgram>LAG(InProgram)).
END IF.

* Remove the month numbers from records for months where the .. .
* patient isn't in the program. The alternative is removing the .. .
* records, using SELECT IF, and that requires an EXECUTE. .. .
IF NOT InProgram MonthNum = $SYSMIS.

AGGREGATE OUTFILE=*
   /BREAK= tag SpellNum
   /SpStrt 'First month of spell' = MIN(MonthNum)
   /SpEnd 'Last month of spell' = MAX(MonthNum).

* Remove "spell 0", the months preceding the first admission .. .
SELECT IF SpellNum GT 0.

COMPUTE SpLen = SpEnd - SpStrt + 1.
FORMATS SpLen (F2).
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |11-MAR-2010 22:46:37       |
|-----------------------------|---------------------------|
|(blank lines added by hand)                              |
|-----------------------------|---------------------------|

tag      SpellNum SpStrt SpEnd SpLen

IDE0001         1      5    17    13
IDE0001         2     22    23     2
IDE0001         3     29    37     9

IDE0002         1      1     2     2
IDE0002         2      5     7     3
IDE0002         3      9    10     2
IDE0002         4     15    17     3
IDE0002         5     22    23     2
IDE0002         6     26    27     2
IDE0002         7     29    32     4
IDE0002         8     36    37     2
IDE0002         9     39    40     2

IDE0003         1      2     3     2
IDE0003         2      5     6     2
IDE0003         3     14    14     1
IDE0003         4     16    16     1
IDE0003         5     18    19     2
IDE0003         6     22    23     2
IDE0003         7     29    30     2
IDE0003         8     33    36     4
IDE0003         9     38    38     1
IDE0003        10     40    40     1

Number of cases read: 22    Number of cases listed: 22

=====================================
APPENDIX: Test data, and code
(With code for VECTOR/LOOP solution.)
=====================================
* C:\Documents and Settings\Richard\My Documents .
* \Technical\spssx-l\Z 2010a\ .
* 2010-03-09 Starchenko - Counting number of months between entry and exit.SPS.

* In response to posting .
* Date: Tue, 9 Mar 2010 15:52:10 -0500 .
* From: Oksana Starchenko <[hidden email]> .
* Subject: Counting number of month between entry into and exit from a .
* program .
* To: [hidden email] .

* Version 2: Clarify logic in VECTOR/LOOP version ............... .
* Version 3: Fix bug in Unroll version ............... .

* ........... Test data, from earlier responses ............... .
data list / tag month1 to month40(a7,4x,40f1.0).
begin data
IDE0001    0000111111111111100001100000111111111000
IDE0002    1100111011000011100001100110111100011011
IDE0003    0110110000000101011001100000110011110101
end data.
DATASET NAME TestData.
DATASET ACTIVATE TestData WINDOW=FRONT.

TEMPORARY.
STRING SPACE(A15) TagSp(A07).
LIST tag, Month1 TO Month20, SPACE, TagSp, Month21 TO Month40.

* ........... Solution using VECTOR/LOOP ............... .
DATASET ACTIVATE TestData WINDOW=FRONT.
DATASET COPY LOOPing.
DATASET ACTIVATE LOOPing WINDOW=FRONT.

vector months=month1 to month40.
VECTOR SpLen (10,F2) SpStrt(10,F2).
NUMERIC #SpellNum(F2) /* Index # of spells within client */
        #InSpell (F2) /* Flag: a 'spell' is running */ .
COMPUTE #SpellNum = 0.
COMPUTE #InSpell = 0.

LOOP #MonthNum = 1 TO 40            /* For this month,               */ .
.  DO IF Months(#MonthNum) EQ 1     /* if client's in the program:   */ .
.     DO IF #InSpell                /* Within a running spell,       */ .
*        (no action)                ** nothing needs doing           ** .
.     ELSE                          /* If no spell is running,       */ .
.        COMPUTE #InSpell = 1.      /* start a spell,                */ .
.        COMPUTE #SpellNum          /* increment the spell number,   */
                 = #SpellNum + 1.
.        COMPUTE SpStrt(#Spellnum)  /* and record the first month;   */
                 = #Monthnum.
.     END IF.
.  ELSE                             /* if client's not in program:   */ .
.     DO IF #InSpell                /* If a spell WAS running,       */ .
.        COMPUTE #InSpell = 0.      /* discontinue it                */ .
.        COMPUTE SpLen(#SpellNum)   /* and record its length         */
                 = #MonthNum - SpStrt(#SpellNum).
.     ELSE                          /* If no spell is running,       */ .
*        (no action)                ** nothing needs doing           ** .
.     END IF.
.  END IF.
END LOOP.

* Special case: The last month is within a spell: .. .
DO IF #InSpell                      /* after leaving the loop,       */.
.  COMPUTE SpLen(#SpellNum) = 41 - SpStrt(#SpellNum).
END IF.

LIST tag, SpLen1 TO SpStrt10.
|
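The spell-numbering rule in the corrected 'unroll' logic above (reset at month 1, then add 1 whenever the current month is in the program and the previous month was not) can be sketched in plain Python on long-format rows. This is an illustration under stated assumptions, not a port of the SPSS job: `unroll` and `spells_from_long` are hypothetical names, rows are assumed sorted by client and then by month, and the toy records are made up. Client "B" is the kind of case the bug report describes: not the first client, and not in the program in month 1.

```python
def unroll(wide):
    """Wide {tag: [0/1 flags]} -> long rows (tag, month_num, in_program)."""
    return [(tag, m, flag)
            for tag, flags in wide.items()
            for m, flag in enumerate(flags, start=1)]


def spells_from_long(rows):
    """Return (tag, spell_num, first_month, last_month, length) per spell."""
    spells = {}                 # (tag, spell_num) -> [first_month, last_month]
    spell_num = 0
    prev_flag = 0
    for tag, month_num, flag in rows:
        if month_num == 1:
            spell_num = flag    # restart numbering for every client
        else:
            # A 0 -> 1 transition opens a new spell; anything else adds 0.
            spell_num += int(flag > prev_flag)
        prev_flag = flag
        if flag:                # only in-program months define a spell's range
            first_last = spells.setdefault((tag, spell_num), [month_num, month_num])
            first_last[1] = month_num
    return [(tag, num, first, last, last - first + 1)
            for (tag, num), (first, last) in spells.items()]


wide = {"A": [0, 1, 1, 0, 1], "B": [0, 0, 1, 1, 0]}
for row in spells_from_long(unroll(wide)):
    print(row)
```

Skipping months outside the program when recording ranges plays the same role as blanking MonthNum before AGGREGATE, so no "spell 0" rows are produced.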
|
As an SPSS newbie, I have been looking around for variants of
the rep() and count() functions in R (or similar in Minitab). I haven't found
anything yet, but this should be a standard operation, shouldn't it?
Using loops and such must be possible, but I would prefer a more direct route.
Have I missed something obvious?
Robert
Robert Lundqvist
|
|
Administrator
|
Robert, please explain what rep() and count() do for those of us who are not R users.
--
Bruce Weaver |
|
In reply to this post by Robert L
Robert
If you are a total newbie to SPSS, there are entry level tutorials on:

They're all syntax-based and fully worked with screenshots at each step, but also include comparisons with drop-down menus.

Not quite sure what count is in R, but check out the tutorials for the SPSS count command on:

John Hall
|
|
In reply to this post by Oksana Starchenko
Shalom
I guess that in my first reply I misunderstood the "spells". Here is the syntax to calculate each spell's start, end, and length.

dataset close all .
data list / id month1 to month40 (a7,4x,40f1.0) .
begin data
IDE0001    0000111111111111100001100000111111111000
IDE0002    1100111011000011100001100110111100011011
IDE0003    0110110000000101011001100000110011110101
end data.
* in(n) = first month of spell n, out(n) = first month after spell n .
vector month=month1 to month40 / in(10,f3) / out(10,f3) / length(10,f3) .
* A case already in the program in month 1 starts spell 1 there .
if ( month1 eq 1 ) in1=1 .
if ( month1 eq 1 ) n=1 .
loop ii=2 to 40 .
* A 0 followed by 1 opens a new spell, a 1 followed by 0 closes it .
if ( month(ii-1) eq 0 ) and ( month(ii) eq 1 ) n=sum(n,1) .
if ( month(ii-1) eq 0 ) and ( month(ii) eq 1 ) in(n)=ii .
if ( month(ii-1) eq 1 ) and ( month(ii) eq 0 ) out(n)=ii .
if ( month(ii-1) eq 1 ) and ( month(ii) eq 0 ) length(n)=out(n)-in(n) .
end loop .
list id in1 to in5 out1 to out5 length1 to length5 .

Hillel Vardi
BGU
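One case the syntax above leaves open is a spell that is still running in the last month: it gets a start but no end or length. The following is a minimal sketch of one way to close such spells, not part of the original post. It assumes 0/1 coding, shortens the series to ten months, treats an open spell as ending right after the last month, and uses made-up vector names (sbeg, send, slen).

```spss
* Sketch only: 10 months, at most 5 spells, 0/1 coding assumed.
* sbeg, send and slen are hypothetical names for start, end and length.
data list / id month1 to month10 (a7,1x,10f1.0) .
begin data
IDE0001 0011100111
IDE0002 1100011000
end data.
vector month=month1 to month10 .
vector sbeg(5,f3) .
vector send(5,f3) .
vector slen(5,f3) .
if ( month1 eq 1 ) sbeg1=1 .
if ( month1 eq 1 ) n=1 .
loop #i=2 to 10 .
do if ( month(#i-1) eq 0 and month(#i) eq 1 ) .
compute n=sum(n,1) .
compute sbeg(n)=#i .
else if ( month(#i-1) eq 1 and month(#i) eq 0 ) .
compute send(n)=#i .
compute slen(n)=send(n)-sbeg(n) .
end if .
end loop .
* Assumption: a spell still open in month 10 is closed at month 11 .
do if ( month10 eq 1 ) .
compute send(n)=11 .
compute slen(n)=send(n)-sbeg(n) .
end if .
list id n sbeg1 to sbeg3 send1 to send3 slen1 to slen3 .
```

Whether an open spell should be counted this way or flagged as censored depends on the analysis, so the closing step is a design choice rather than a rule.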
|
In reply to this post by Bruce Weaver
I managed to forget to explain what rep and count really do, sorry. It's basically quite simple: these are straightforward ways to generate regular sequences.

rep(2,5) produces the vector 2, 2, 2, 2, 2.
rep(v1,2) produces a new vector with the vector v1 replicated twice.

As for count(), I forgot that this is probably a Minitab function (I'm not sure, since at present I only have access to SPSS, not Minitab or R), so that question should be left aside. However, there is another useful R function called seq:

seq(1,100) generates the vector 1, 2, 3, ..., 100. This could also be achieved with the even simpler command 1:100, or assigned to a variable: "v <- c(1:100)".
seq(length=10, from=5, by=0.1) generates the vector 5, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9.

They can also be used together: rep(seq(1,100,10),2) and such.

What is the easiest way to produce such regular sequences in SPSS?

Robert
Robert Lundqvist
|
|
These functions don't apply in the same way in SPSS as they might in R, because SPSS variables always fit in a rectangular table structure (other than simple scratch variables). So there are two scenarios.

If you already have an active dataset, which automatically defines the length of a column, you would do a simple rep just with COMPUTE:

COMPUTE repvar = 2.

or more complicated logic for a multi-valued pattern. Table lookup values can be done with a TABLE type match in MATCH FILES or more conveniently with the SPSSINC TRANS extension command using the vlookup function in the extendedTransforms module (or write your own Python code).

If you don't already have a dataset, then an SPSS INPUT PROGRAM allows you to generate arbitrary case data following whatever pattern you like. See the Command Syntax Reference for all the details. As an example to start from, you can generate an input program for a dataset of random numbers using the Make New Dataset with Cases custom dialog available from SPSS Developer Central (www.spss.com/devcentral).

HTH,
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435
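To make the two scenarios concrete, here is a minimal sketch, not taken from the reply itself. The INPUT PROGRAM builds a 20-case dataset whose columns mimic rep(seq(1,100,10),2) and a repeated seq(from=5, by=0.1, length=10), and the COMPUTE commands afterwards fill columns of the dataset that now exists. The variable names pattern, steps and caseno are made up for illustration.

```spss
* Sketch only: pattern, steps and caseno are hypothetical names.
* pattern mimics rep(seq(1,100,10),2): 1, 11, ..., 91, twice.
* steps mimics seq(from=5, by=0.1, length=10), once per block.
INPUT PROGRAM.
LOOP #rep = 1 TO 2.
  LOOP #k = 1 TO 10.
    COMPUTE pattern = 1 + 10 * (#k - 1).
    COMPUTE steps = 5 + 0.1 * (#k - 1).
    END CASE.
  END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.

* With an active dataset, COMPUTE fills a whole column at once.
* repvar mimics rep(2, n) and caseno mimics seq(1, n).
COMPUTE repvar = 2.
COMPUTE caseno = $CASENUM.
LIST.
```

The scratch variables #rep and #k only drive the loops and are not saved, which is why the pattern values are copied into ordinary variables before END CASE writes each case.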
