I have a repeating ID table in which I need to identify when a null in
col2 appears either first within the ID repeating set or before the last repeating record-- note the last record may be preceded by a valid null. In the table below rowcnt 11, 13 & 19 are cases of an invalid null. I would create a new col as a 0,1 binary in which 1=invalid null. I've tried various lag syntaxes but not making progress. David. rowcnt, ID,col1, col2 1 1 197810 198320 2 1 198320 199330 3 1 199330 4 2 200210 200320 5 2 200320 200530 6 2 200530 200820 7 2 200530 8 2 200820 9 3 200010 200120 10 3 200520 200810 11 3 200520 12 3 200810 201110 13 3 200820 14 3 201110 201210 15 3 201210 201810 16 3 201810 17 4 196510 196820 18 4 196820 19 5 200010 20 5 200520 200520 21 5 200520 200610 22 5 200810 ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
3, 8, 16, 18, 22 are valid because they are the last record for the ID. Why is 7 a valid null? Does col1 always increase as rowcnt increases? Is rowcnt really a variable? *Put the last record on top. Sort cases by ID col1 (D). *alternatively. *Sort cases by ID rowcnt (D). *Identify the last record w/lag function. Compute lastrec=(Id ne lag(id). *identify invalid nulls if missing and not last record. Compute InvldNull=Missing(col2) and lastrec=0. *resort anytime. M -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of [hidden email] Sent: Tuesday, March 27, 2018 7:39 AM To: [hidden email] Subject: [SPSSX-L] identifying a null within a repeat ID table I have a repeating ID table in which I need to identify when a null in col2 appears either first within the ID repeating set or before the last repeating record-- note the last record may be preceded by a valid null. In the table below rowcnt 11, 13 & 19 are cases of an invalid null. I would create a new col as a 0,1 binary in which 1=invalid null. I've tried various lag syntaxes but not making progress. David. rowcnt,ID,col1,col2 11197810198320 21198320199330 31199330 42200210200320 52200320200530 62200530200820 72200530 82200820 93200010200120 103200520200810 113 200520 123200810201110 133200820 143201110201210 153201210201810 163201810 174196510196820 184196820 195200010 205200520200520 215200520200610 22 5 200810 ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ________________________________ This correspondence contains proprietary information some or all of which may be legally privileged; it is for the intended recipient only. If you are not the intended recipient you must not use, disclose, distribute, copy, print, or rely on this correspondence and completely dispose of the correspondence immediately. Please notify the sender if you have received this email in error. NOTE: Messages to or from the State of Connecticut domain may be subject to the Freedom of Information statutes and regulations. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
Assuming the ID values are in ascending order you don't need the SORT.
To use Solution ONE below the file must be in proper order. Solution TWO doesn't require any SORT as long as the collective IDs are adjacent. I also question the non flagged 7. Operator error? -- /*Solution ONE using MATCH with LAST function assumes sorted*/. MATCH FILES /FILE * / LAST=@FLAG@/BY ID. COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@. EXECUTE. DELETE VARIABLES @FLAG@. /*Solution TWO using SHIFT VALUES with LEAD function */. SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1. COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@. EXECUTE. DELETE VARIABLES @NextID@. Could also use CREATE with LEAD but why bother if we have SHIFT VALUES? Ives, Melissa L wrote > 3, 8, 16, 18, 22 are valid because they are the last record for the ID. > > Why is 7 a valid null? > > Does col1 always increase as rowcnt increases? > Is rowcnt really a variable? > > *Put the last record on top. > Sort cases by ID col1 (D). > *alternatively. > *Sort cases by ID rowcnt (D). > > *Identify the last record w/lag function. > Compute lastrec=(Id ne lag(id). > *identify invalid nulls if missing and not last record. > Compute InvldNull=Missing(col2) and lastrec=0. > > *resort anytime. > M > > -----Original Message----- > From: SPSSX(r) Discussion [mailto: > SPSSX-L@.UGA > ] On Behalf Of > coxspss@ > Sent: Tuesday, March 27, 2018 7:39 AM > To: > SPSSX-L@.UGA > Subject: [SPSSX-L] identifying a null within a repeat ID table > > I have a repeating ID table in which I need to identify when a null in > col2 appears either first within the ID repeating set or before the last > repeating record-- note the last record may be preceded by a valid null. > In the table below rowcnt 11, 13 & 19 are cases of an invalid null. I > would create a new col as a 0,1 binary in which 1=invalid null. I've > tried various lag syntaxes but not making progress. David. > > > rowcnt,ID,col1,col2 > 11197810198320 > 21198320199330 > 31199330 > 42200210200320 > 52200320200530 > 62200530200820 > 72200530 > 82200820 > 93200010200120 > 103200520200810 > 113 200520 > 123200810201110 > 133200820 > 143201110201210 > 153201210201810 > 163201810 > 174196510196820 > 184196820 > 195200010 > 205200520200520 > 215200520200610 > 22 5 200810 > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the command. To leave the > list, send the command SIGNOFF SPSSX-L For a list of commands to manage > subscriptions, send the command INFO REFCARD > > ________________________________ > > This correspondence contains proprietary information some or all of which > may be legally privileged; it is for the intended recipient only. If you > are not the intended recipient you must not use, disclose, distribute, > copy, print, or rely on this correspondence and completely dispose of the > correspondence immediately. Please notify the sender if you have received > this email in error. NOTE: Messages to or from the State of Connecticut > domain may be subject to the Freedom of Information statutes and > regulations. > > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by wsu_wright
Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls
but they must appear on the last adjacent rows, I can see where in my post it would lead someone to think it is invalid, tried to clarify that with the "-- note the last record may be preceded by a valid null" but upon reading it again I can see it just as confusing, sorry about that. This multiple valid nulls at the end of a repeating ID was also causing me endless frustrations in my own attempt to write the syntax. Should have also indicated that the file is sorted by ID & col1 in the file. On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote: > Assuming the ID values are in ascending order you don't need the SORT. > To use Solution ONE below the file must be in proper order. > Solution TWO doesn't require any SORT as long as the collective IDs > are > adjacent. > I also question the non flagged 7. > Operator error? > -- > > /*Solution ONE using MATCH with LAST function assumes sorted*/. > MATCH FILES /FILE * / LAST=@FLAG@/BY ID. > COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@. > EXECUTE. > DELETE VARIABLES @FLAG@. > > /*Solution TWO using SHIFT VALUES with LEAD function */. > SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1. > COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@. > EXECUTE. > DELETE VARIABLES @NextID@. > > Could also use CREATE with LEAD but why bother if we have SHIFT > VALUES? > > > > > > Ives, Melissa L wrote >> 3, 8, 16, 18, 22 are valid because they are the last record for the >> ID. >> >> Why is 7 a valid null? >> >> Does col1 always increase as rowcnt increases? >> Is rowcnt really a variable? >> >> *Put the last record on top. >> Sort cases by ID col1 (D). >> *alternatively. >> *Sort cases by ID rowcnt (D). >> >> *Identify the last record w/lag function. >> Compute lastrec=(Id ne lag(id). >> *identify invalid nulls if missing and not last record. >> Compute InvldNull=Missing(col2) and lastrec=0. >> >> *resort anytime. >> M >> >> -----Original Message----- >> From: SPSSX(r) Discussion [mailto: > >> SPSSX-L@.UGA > >> ] On Behalf Of > >> coxspss@ > >> Sent: Tuesday, March 27, 2018 7:39 AM >> To: > >> SPSSX-L@.UGA > >> Subject: [SPSSX-L] identifying a null within a repeat ID table >> >> I have a repeating ID table in which I need to identify when a null >> in >> col2 appears either first within the ID repeating set or before the >> last >> repeating record-- note the last record may be preceded by a valid >> null. >> In the table below rowcnt 11, 13 & 19 are cases of an invalid null. >> I >> would create a new col as a 0,1 binary in which 1=invalid null. I've >> tried various lag syntaxes but not making progress. David. >> >> >> rowcnt,ID,col1,col2 >> 11197810198320 >> 21198320199330 >> 31199330 >> 42200210200320 >> 52200320200530 >> 62200530200820 >> 72200530 >> 82200820 >> 93200010200120 >> 103200520200810 >> 113 200520 >> 123200810201110 >> 133200820 >> 143201110201210 >> 153201210201810 >> 163201810 >> 174196510196820 >> 184196820 >> 195200010 >> 205200520200520 >> 215200520200610 >> 22 5 200810 >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to > >> LISTSERV@.UGA > >> (not to SPSSX-L), with no body text except the command. To leave the >> list, send the command SIGNOFF SPSSX-L For a list of commands to >> manage >> subscriptions, send the command INFO REFCARD >> >> ________________________________ >> >> This correspondence contains proprietary information some or all of >> which >> may be legally privileged; it is for the intended recipient only. If >> you >> are not the intended recipient you must not use, disclose, >> distribute, >> copy, print, or rely on this correspondence and completely dispose of >> the >> correspondence immediately. Please notify the sender if you have >> received >> this email in error. NOTE: Messages to or from the State of >> Connecticut >> domain may be subject to the Freedom of Information statutes and >> regulations. >> >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to > >> LISTSERV@.UGA > >> (not to SPSSX-L), with no body text except the >> command. To leave the list, send the command >> SIGNOFF SPSSX-L >> For a list of commands to manage subscriptions, send the command >> INFO REFCARD > > > > > > ----- > Please reply to the list and not to my personal email. > Those desiring my consulting or training services please feel free to > email me. > --- > "Nolite dare sanctum canibus neque mittatis margaritas vestras ante > porcos ne forte conculcent eas pedibus suis." > Cum es damnatorum possederunt porcos iens ut salire off sanguinum > cliff in abyssum?" > -- > Sent from: http://spssx-discussion.1045642.n5.nabble.com/ > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except > the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
If brute force is fine because efficiency does not matter, the first solution I saw gives the hint: Sort into reverse order; use the usual LAG( ) process to number the lines for each ID.
Then the (revised) first and second lines are the ones for which NULL is okay.
-- Rich Ulrich From: SPSSX(r) Discussion <[hidden email]> on behalf of [hidden email] <[hidden email]>
Sent: Tuesday, March 27, 2018 1:32 PM To: [hidden email] Subject: Re: identifying a null within a repeat ID table Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls
but they must appear on the last adjacent rows, I can see where in my post it would lead someone to think it is invalid, tried to clarify that with the "-- note the last record may be preceded by a valid null" but upon reading it again I can see it just as confusing, sorry about that. This multiple valid nulls at the end of a repeating ID was also causing me endless frustrations in my own attempt to write the syntax. Should have also indicated that the file is sorted by ID & col1 in the file. ... |
In reply to this post by wsu_wright
Have you considered reorganizing your data to a wide format instead of LAG?
I think that would make it easier to identify invalid nulls. If you replaced the invalid nulls with a dummy code (9999999) you could compute invalid null after restructuring your data back to a long format. Jim Marks Sr. Market Research Manager National Market Research Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc. -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of [hidden email] Sent: Tuesday, March 27, 2018 1:33 PM To: [hidden email] Subject: Re: identifying a null within a repeat ID table Caution: This email came from outside Kaiser Permanente. Do not open attachments or click on links if you do not recognize the sender. ______________________________________________________________________ Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls but they must appear on the last adjacent rows, I can see where in my post it would lead someone to think it is invalid, tried to clarify that with the "-- note the last record may be preceded by a valid null" but upon reading it again I can see it just as confusing, sorry about that. This multiple valid nulls at the end of a repeating ID was also causing me endless frustrations in my own attempt to write the syntax. Should have also indicated that the file is sorted by ID & col1 in the file. On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote: > Assuming the ID values are in ascending order you don't need the SORT. > To use Solution ONE below the file must be in proper order. > Solution TWO doesn't require any SORT as long as the collective IDs > are adjacent. > I also question the non flagged 7. > Operator error? > -- > > /*Solution ONE using MATCH with LAST function assumes sorted*/. > MATCH FILES /FILE * / LAST=@FLAG@/BY ID. > COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@. > EXECUTE. > DELETE VARIABLES @FLAG@. > > /*Solution TWO using SHIFT VALUES with LEAD function */. > SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1. > COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@. > EXECUTE. > DELETE VARIABLES @NextID@. > > Could also use CREATE with LEAD but why bother if we have SHIFT > VALUES? > > > > > > Ives, Melissa L wrote >> 3, 8, 16, 18, 22 are valid because they are the last record for the >> ID. >> >> Why is 7 a valid null? >> >> Does col1 always increase as rowcnt increases? >> Is rowcnt really a variable? >> >> *Put the last record on top. >> Sort cases by ID col1 (D). >> *alternatively. >> *Sort cases by ID rowcnt (D). >> >> *Identify the last record w/lag function. >> Compute lastrec=(Id ne lag(id). >> *identify invalid nulls if missing and not last record. >> Compute InvldNull=Missing(col2) and lastrec=0. >> >> *resort anytime. >> M >> >> -----Original Message----- >> From: SPSSX(r) Discussion [mailto: > >> SPSSX-L@.UGA > >> ] On Behalf Of > >> coxspss@ > >> Sent: Tuesday, March 27, 2018 7:39 AM >> To: > >> SPSSX-L@.UGA > >> Subject: [SPSSX-L] identifying a null within a repeat ID table >> >> I have a repeating ID table in which I need to identify when a null >> in >> col2 appears either first within the ID repeating set or before the >> last repeating record-- note the last record may be preceded by a >> valid null. >> In the table below rowcnt 11, 13 & 19 are cases of an invalid null. >> I >> would create a new col as a 0,1 binary in which 1=invalid null. I've >> tried various lag syntaxes but not making progress. David. >> >> >> rowcnt,ID,col1,col2 >> 11197810198320 >> 21198320199330 >> 31199330 >> 42200210200320 >> 52200320200530 >> 62200530200820 >> 72200530 >> 82200820 >> 93200010200120 >> 103200520200810 >> 113 200520 >> 123200810201110 >> 133200820 >> 143201110201210 >> 153201210201810 >> 163201810 >> 174196510196820 >> 184196820 >> 195200010 >> 205200520200520 >> 215200520200610 >> 22 5 200810 >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to > >> LISTSERV@.UGA > >> (not to SPSSX-L), with no body text except the command. To leave the >> list, send the command SIGNOFF SPSSX-L For a list of commands to >> manage subscriptions, send the command INFO REFCARD >> >> ________________________________ >> >> This correspondence contains proprietary information some or all of >> which may be legally privileged; it is for the intended recipient >> only. If you are not the intended recipient you must not use, >> disclose, distribute, copy, print, or rely on this correspondence and >> completely dispose of the correspondence immediately. Please notify >> the sender if you have received this email in error. NOTE: Messages >> to or from the State of Connecticut domain may be subject to the >> Freedom of Information statutes and regulations. >> >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to > >> LISTSERV@.UGA > >> (not to SPSSX-L), with no body text except the command. To leave the >> list, send the command SIGNOFF SPSSX-L For a list of commands to >> manage subscriptions, send the command INFO REFCARD > > > > > > ----- > Please reply to the list and not to my personal email. > Those desiring my consulting or training services please feel free to > email me. > --- > "Nolite dare sanctum canibus neque mittatis margaritas vestras ante > porcos ne forte conculcent eas pedibus suis." > Cum es damnatorum possederunt porcos iens ut salire off sanguinum > cliff in abyssum?" > -- > Sent from: http://spssx-discussion.1045642.n5.nabble.com/ > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except > the command. To leave the list, send the command SIGNOFF SPSSX-L For a > list of commands to manage subscriptions, send the command INFO > REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD NOTICE TO RECIPIENT: If you are not the intended recipient of this e-mail, you are prohibited from sharing, copying, or otherwise using or disclosing its contents. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and permanently delete this e-mail and any attachments without reading, forwarding or saving them. Thank you. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
As long as you don't have a bajillion records the following should suffice.
/*-------------------------------------*/. SET MXLOOPS=10000000. MATRIX. GET DATA /VARIABLES rowcnt ID col1 col2 /MISSING=-999. COMPUTE Flags=MAKE(NROW(Data),1,0). LOOP #=NROW(DATA) TO 2 BY -1. COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999 AND Data(#-1,4) EQ -999)). END LOOP. SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags. END MATRIX. RECODE col2(-999=SYSMIS). /*-------------------------------------*/. Jim Marks wrote > Have you considered reorganizing your data to a wide format instead of > LAG? > > I think that would make it easier to identify invalid nulls. If you > replaced the invalid nulls with a dummy code (9999999) you could compute > invalid null after restructuring your data back to a long format. > > Jim Marks > Sr. Market Research Manager > National Market Research > Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc. > > -----Original Message----- > From: SPSSX(r) Discussion [mailto: > SPSSX-L@.UGA > ] On Behalf Of > coxspss@ > Sent: Tuesday, March 27, 2018 1:33 PM > To: > SPSSX-L@.UGA > Subject: Re: identifying a null within a repeat ID table > > Caution: This email came from outside Kaiser Permanente. Do not open > attachments or click on links if you do not recognize the sender. > > ______________________________________________________________________ > Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls but > they must appear on the last adjacent rows, I can see where in my post it > would lead someone to think it is invalid, tried to clarify that with the > "-- note the last record may be preceded by a valid null" but upon reading > it again I can see it just as confusing, sorry about that. > This multiple valid nulls at the end of a repeating ID was also causing me > endless frustrations in my own attempt to write the syntax. Should have > also indicated that the file is sorted by ID & col1 in the file. > > > On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote: > >> Assuming the ID values are in ascending order you don't need the SORT. >> To use Solution ONE below the file must be in proper order. >> Solution TWO doesn't require any SORT as long as the collective IDs >> are adjacent. >> I also question the non flagged 7. >> Operator error? >> -- >> >> /*Solution ONE using MATCH with LAST function assumes sorted*/. >> MATCH FILES /FILE * / LAST=@FLAG@/BY ID. >> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@. >> EXECUTE. >> DELETE VARIABLES @FLAG@. >> >> /*Solution TWO using SHIFT VALUES with LEAD function */. >> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1. >> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@. >> EXECUTE. >> DELETE VARIABLES @NextID@. >> >> Could also use CREATE with LEAD but why bother if we have SHIFT >> VALUES? >> >> >> >> >> >> Ives, Melissa L wrote >>> 3, 8, 16, 18, 22 are valid because they are the last record for the >>> ID. >>> >>> Why is 7 a valid null? >>> >>> Does col1 always increase as rowcnt increases? >>> Is rowcnt really a variable? >>> >>> *Put the last record on top. >>> Sort cases by ID col1 (D). >>> *alternatively. >>> *Sort cases by ID rowcnt (D). >>> >>> *Identify the last record w/lag function. >>> Compute lastrec=(Id ne lag(id). >>> *identify invalid nulls if missing and not last record. >>> Compute InvldNull=Missing(col2) and lastrec=0. >>> >>> *resort anytime. >>> M >>> >>> -----Original Message----- >>> From: SPSSX(r) Discussion [mailto: >> >>> SPSSX-L@.UGA >> >>> ] On Behalf Of >> >>> coxspss@ >> >>> Sent: Tuesday, March 27, 2018 7:39 AM >>> To: >> >>> SPSSX-L@.UGA >> >>> Subject: [SPSSX-L] identifying a null within a repeat ID table >>> >>> I have a repeating ID table in which I need to identify when a null >>> in >>> col2 appears either first within the ID repeating set or before the >>> last repeating record-- note the last record may be preceded by a >>> valid null. >>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null. >>> I >>> would create a new col as a 0,1 binary in which 1=invalid null. I've >>> tried various lag syntaxes but not making progress. David. >>> >>> >>> rowcnt,ID,col1,col2 >>> 11197810198320 >>> 21198320199330 >>> 31199330 >>> 42200210200320 >>> 52200320200530 >>> 62200530200820 >>> 72200530 >>> 82200820 >>> 93200010200120 >>> 103200520200810 >>> 113 200520 >>> 123200810201110 >>> 133200820 >>> 143201110201210 >>> 153201210201810 >>> 163201810 >>> 174196510196820 >>> 184196820 >>> 195200010 >>> 205200520200520 >>> 215200520200610 >>> 22 5 200810 >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >> >>> LISTSERV@.UGA >> >>> (not to SPSSX-L), with no body text except the command. To leave the >>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>> manage subscriptions, send the command INFO REFCARD >>> >>> ________________________________ >>> >>> This correspondence contains proprietary information some or all of >>> which may be legally privileged; it is for the intended recipient >>> only. If you are not the intended recipient you must not use, >>> disclose, distribute, copy, print, or rely on this correspondence and >>> completely dispose of the correspondence immediately. Please notify >>> the sender if you have received this email in error. NOTE: Messages >>> to or from the State of Connecticut domain may be subject to the >>> Freedom of Information statutes and regulations. >>> >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >> >>> LISTSERV@.UGA >> >>> (not to SPSSX-L), with no body text except the command. To leave the >>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>> manage subscriptions, send the command INFO REFCARD >> >> >> >> >> >> ----- >> Please reply to the list and not to my personal email. >> Those desiring my consulting or training services please feel free to >> email me. >> --- >> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante >> porcos ne forte conculcent eas pedibus suis." >> Cum es damnatorum possederunt porcos iens ut salire off sanguinum >> cliff in abyssum?" >> -- >> Sent from: http://spssx-discussion.1045642.n5.nabble.com/ >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to >> > LISTSERV@.UGA > (not to SPSSX-L), with no body text except >> the command. To leave the list, send the command SIGNOFF SPSSX-L For a >> list of commands to manage subscriptions, send the command INFO >> REFCARD > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the command. To leave the > list, send the command SIGNOFF SPSSX-L For a list of commands to manage > subscriptions, send the command INFO REFCARD > > NOTICE TO RECIPIENT: If you are not the intended recipient of this > e-mail, you are prohibited from sharing, copying, or otherwise using or > disclosing its contents. If you have received this e-mail in error, > please notify the sender immediately by reply e-mail and permanently > delete this e-mail and any attachments without reading, forwarding or > saving them. Thank you. > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
In reply to this post by Jim Marks
As long as you don't have a bajillion records the following should suffice.
/*-------------------------------------*/. SET MXLOOPS=10000000. MATRIX. GET DATA /VARIABLES rowcnt ID col1 col2 /MISSING=-999. COMPUTE Flags=MAKE(NROW(Data),1,0). LOOP #=NROW(DATA) TO 2 BY -1. COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999 AND Data(#-1,4) EQ -999)). END LOOP. SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags. END MATRIX. RECODE col2(-999=SYSMIS). /*-------------------------------------*/. In a situation where you do have a HUGE number of records try the following: DATA LIST LIST /rowcnt ID col1 col2 . BEGIN DATA 1 1 197810 198320 2 1 198320 199330 3 1 199330 4 2 200210 200320 5 2 200320 200530 6 2 200530 200820 7 2 200530 8 2 200820 9 3 200010 200120 10 3 200520 200810 11 3 200520 12 3 200810 201110 13 3 200820 14 3 201110 201210 15 3 201210 201810 16 3 201810 17 4 196510 196820 18 4 196820 19 5 200010 20 5 200520 200520 21 5 200520 200610 22 5 200810 END DATA. SPLIT FILE BY ID. INCLUDE FILE="C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps". RECODE col2(-999=SYSMIS). /* Contents of "C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps" */. SET MXLOOPS=10000000. MATRIX. COMPUTE Col2=4. GET DATA /VARIABLES rowcnt ID col1 col2 /MISSING=-999. COMPUTE N=NROW(Data). COMPUTE Flags=MAKE(N,1,0). LOOP #=N TO 2 BY -1. COMPUTE Flags(#-1)= (Data(#,Col2) NE -999 AND Data(#-1,Col2) EQ -999). END LOOP. SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags. END MATRIX. Jim Marks wrote > Have you considered reorganizing your data to a wide format instead of > LAG? > > I think that would make it easier to identify invalid nulls. If you > replaced the invalid nulls with a dummy code (9999999) you could compute > invalid null after restructuring your data back to a long format. > > Jim Marks > Sr. Market Research Manager > National Market Research > Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc. > > -----Original Message----- > From: SPSSX(r) Discussion [mailto: > SPSSX-L@.UGA > ] On Behalf Of > coxspss@ > Sent: Tuesday, March 27, 2018 1:33 PM > To: > SPSSX-L@.UGA > Subject: Re: identifying a null within a repeat ID table > > Caution: This email came from outside Kaiser Permanente. Do not open > attachments or click on links if you do not recognize the sender. > > ______________________________________________________________________ > Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls but > they must appear on the last adjacent rows, I can see where in my post it > would lead someone to think it is invalid, tried to clarify that with the > "-- note the last record may be preceded by a valid null" but upon reading > it again I can see it just as confusing, sorry about that. > This multiple valid nulls at the end of a repeating ID was also causing me > endless frustrations in my own attempt to write the syntax. Should have > also indicated that the file is sorted by ID & col1 in the file. > > > On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote: > >> Assuming the ID values are in ascending order you don't need the SORT. >> To use Solution ONE below the file must be in proper order. >> Solution TWO doesn't require any SORT as long as the collective IDs >> are adjacent. >> I also question the non flagged 7. >> Operator error? >> -- >> >> /*Solution ONE using MATCH with LAST function assumes sorted*/. >> MATCH FILES /FILE * / LAST=@FLAG@/BY ID. >> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@. >> EXECUTE. >> DELETE VARIABLES @FLAG@. >> >> /*Solution TWO using SHIFT VALUES with LEAD function */. >> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1. >> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@. >> EXECUTE. >> DELETE VARIABLES @NextID@. >> >> Could also use CREATE with LEAD but why bother if we have SHIFT >> VALUES? >> >> >> >> >> >> Ives, Melissa L wrote >>> 3, 8, 16, 18, 22 are valid because they are the last record for the >>> ID. >>> >>> Why is 7 a valid null? >>> >>> Does col1 always increase as rowcnt increases? >>> Is rowcnt really a variable? >>> >>> *Put the last record on top. >>> Sort cases by ID col1 (D). >>> *alternatively. >>> *Sort cases by ID rowcnt (D). >>> >>> *Identify the last record w/lag function. >>> Compute lastrec=(Id ne lag(id). >>> *identify invalid nulls if missing and not last record. >>> Compute InvldNull=Missing(col2) and lastrec=0. >>> >>> *resort anytime. >>> M >>> >>> -----Original Message----- >>> From: SPSSX(r) Discussion [mailto: >> >>> SPSSX-L@.UGA >> >>> ] On Behalf Of >> >>> coxspss@ >> >>> Sent: Tuesday, March 27, 2018 7:39 AM >>> To: >> >>> SPSSX-L@.UGA >> >>> Subject: [SPSSX-L] identifying a null within a repeat ID table >>> >>> I have a repeating ID table in which I need to identify when a null >>> in >>> col2 appears either first within the ID repeating set or before the >>> last repeating record-- note the last record may be preceded by a >>> valid null. >>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null. >>> I >>> would create a new col as a 0,1 binary in which 1=invalid null. I've >>> tried various lag syntaxes but not making progress. David. >>> >>> >>> rowcnt,ID,col1,col2 >>> 11197810198320 >>> 21198320199330 >>> 31199330 >>> 42200210200320 >>> 52200320200530 >>> 62200530200820 >>> 72200530 >>> 82200820 >>> 93200010200120 >>> 103200520200810 >>> 113 200520 >>> 123200810201110 >>> 133200820 >>> 143201110201210 >>> 153201210201810 >>> 163201810 >>> 174196510196820 >>> 184196820 >>> 195200010 >>> 205200520200520 >>> 215200520200610 >>> 22 5 200810 >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >> >>> LISTSERV@.UGA >> >>> (not to SPSSX-L), with no body text except the command. To leave the >>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>> manage subscriptions, send the command INFO REFCARD >>> >>> ________________________________ >>> >>> This correspondence contains proprietary information some or all of >>> which may be legally privileged; it is for the intended recipient >>> only. If you are not the intended recipient you must not use, >>> disclose, distribute, copy, print, or rely on this correspondence and >>> completely dispose of the correspondence immediately. Please notify >>> the sender if you have received this email in error. NOTE: Messages >>> to or from the State of Connecticut domain may be subject to the >>> Freedom of Information statutes and regulations. >>> >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >> >>> LISTSERV@.UGA >> >>> (not to SPSSX-L), with no body text except the command. To leave the >>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>> manage subscriptions, send the command INFO REFCARD >> >> >> >> >> >> ----- >> Please reply to the list and not to my personal email. >> Those desiring my consulting or training services please feel free to >> email me. >> --- >> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante >> porcos ne forte conculcent eas pedibus suis." >> Cum es damnatorum possederunt porcos iens ut salire off sanguinum >> cliff in abyssum?" >> -- >> Sent from: http://spssx-discussion.1045642.n5.nabble.com/ >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to >> > LISTSERV@.UGA > (not to SPSSX-L), with no body text except >> the command. To leave the list, send the command SIGNOFF SPSSX-L For a >> list of commands to manage subscriptions, send the command INFO >> REFCARD > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the command. To leave the > list, send the command SIGNOFF SPSSX-L For a list of commands to manage > subscriptions, send the command INFO REFCARD > > NOTICE TO RECIPIENT: If you are not the intended recipient of this > e-mail, you are prohibited from sharing, copying, or otherwise using or > disclosing its contents. If you have received this e-mail in error, > please notify the sender immediately by reply e-mail and permanently > delete this e-mail and any attachments without reading, forwarding or > saving them. Thank you. > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by wsu_wright
Thanks for everyone responding so quickly. I tried Melissa's & David's
approach. Melissa's worked the fastest but had some false positives, this was connected to her questions about the multiple nulls being valid so she may have altered her syntax had I responded to her in time. David's approach took longer to process but did correctly identify the nulls, speed not really an issues, I am processing approximately 2 million records so I didn't except speed, so went with David's. Thanks again and to all who keep this listserv the best on spss. David Wright On Tue, Mar 27, 2018 at 1:46 PM, David Marso wrote: > As long as you don't have a bajillion records the following should > suffice. > > /*-------------------------------------*/. > SET MXLOOPS=10000000. > MATRIX. > GET DATA /VARIABLES rowcnt ID col1 col2 /MISSING=-999. > COMPUTE Flags=MAKE(NROW(Data),1,0). > LOOP #=NROW(DATA) TO 2 BY -1. > COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999 > AND > Data(#-1,4) EQ -999)). > END LOOP. > SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags. > END MATRIX. > RECODE col2(-999=SYSMIS). > /*-------------------------------------*/. > > > In a situation where you do have a HUGE number of records try the > following: > > DATA LIST LIST /rowcnt ID col1 col2 . > BEGIN DATA > 1 1 197810 198320 2 1 198320 199330 3 1 199330 4 2 200210 200320 5 2 > 200320 200530 6 2 200530 200820 7 2 200530 8 2 200820 9 3 200010 > 200120 10 3 200520 200810 11 3 200520 12 3 200810 201110 13 3 200820 > 14 3 201110 201210 15 3 201210 201810 16 3 201810 17 4 196510 196820 > 18 4 196820 19 5 200010 20 5 200520 200520 21 5 200520 200610 22 5 > 200810 END DATA. > SPLIT FILE BY ID. > INCLUDE FILE="C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps". > RECODE col2(-999=SYSMIS). > > > /* Contents of "C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps" */. > SET MXLOOPS=10000000. > MATRIX. > COMPUTE Col2=4. > GET DATA /VARIABLES rowcnt ID col1 col2 /MISSING=-999. > COMPUTE N=NROW(Data). > COMPUTE Flags=MAKE(N,1,0). > LOOP #=N TO 2 BY -1. > COMPUTE Flags(#-1)= (Data(#,Col2) NE -999 AND Data(#-1,Col2) EQ -999). > END LOOP. > SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags. > END MATRIX. > > > Jim Marks wrote >> Have you considered reorganizing your data to a wide format instead >> of >> LAG? >> >> I think that would make it easier to identify invalid nulls. If you >> replaced the invalid nulls with a dummy code (9999999) you could >> compute >> invalid null after restructuring your data back to a long format. >> >> Jim Marks >> Sr. Market Research Manager >> National Market Research >> Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc. >> >> -----Original Message----- >> From: SPSSX(r) Discussion [mailto: > >> SPSSX-L@.UGA > >> ] On Behalf Of > >> coxspss@ > >> Sent: Tuesday, March 27, 2018 1:33 PM >> To: > >> SPSSX-L@.UGA > >> Subject: Re: identifying a null within a repeat ID table >> >> Caution: This email came from outside Kaiser Permanente. Do not open >> attachments or click on links if you do not recognize the sender. >> >> >> ______________________________________________________________________ >> Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple >> nulls but >> they must appear on the last adjacent rows, I can see where in my >> post it >> would lead someone to think it is invalid, tried to clarify that with >> the >> "-- note the last record may be preceded by a valid null" but upon >> reading >> it again I can see it just as confusing, sorry about that. >> This multiple valid nulls at the end of a repeating ID was also >> causing me >> endless frustrations in my own attempt to write the syntax. Should >> have >> also indicated that the file is sorted by ID & col1 in the file. >> >> >> On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote: >> >>> Assuming the ID values are in ascending order you don't need the >>> SORT. >>> To use Solution ONE below the file must be in proper order. >>> Solution TWO doesn't require any SORT as long as the collective IDs >>> are adjacent. >>> I also question the non flagged 7. >>> Operator error? >>> -- >>> >>> /*Solution ONE using MATCH with LAST function assumes sorted*/. >>> MATCH FILES /FILE * / LAST=@FLAG@/BY ID. >>> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@. >>> EXECUTE. >>> DELETE VARIABLES @FLAG@. >>> >>> /*Solution TWO using SHIFT VALUES with LEAD function */. >>> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1. >>> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@. >>> EXECUTE. >>> DELETE VARIABLES @NextID@. >>> >>> Could also use CREATE with LEAD but why bother if we have SHIFT >>> VALUES? >>> >>> >>> >>> >>> >>> Ives, Melissa L wrote >>>> 3, 8, 16, 18, 22 are valid because they are the last record for the >>>> ID. >>>> >>>> Why is 7 a valid null? >>>> >>>> Does col1 always increase as rowcnt increases? >>>> Is rowcnt really a variable? >>>> >>>> *Put the last record on top. >>>> Sort cases by ID col1 (D). >>>> *alternatively. >>>> *Sort cases by ID rowcnt (D). >>>> >>>> *Identify the last record w/lag function. >>>> Compute lastrec=(Id ne lag(id). >>>> *identify invalid nulls if missing and not last record. >>>> Compute InvldNull=Missing(col2) and lastrec=0. >>>> >>>> *resort anytime. >>>> M >>>> >>>> -----Original Message----- >>>> From: SPSSX(r) Discussion [mailto: >>> >>>> SPSSX-L@.UGA >>> >>>> ] On Behalf Of >>> >>>> coxspss@ >>> >>>> Sent: Tuesday, March 27, 2018 7:39 AM >>>> To: >>> >>>> SPSSX-L@.UGA >>> >>>> Subject: [SPSSX-L] identifying a null within a repeat ID table >>>> >>>> I have a repeating ID table in which I need to identify when a null >>>> in >>>> col2 appears either first within the ID repeating set or before the >>>> last repeating record-- note the last record may be preceded by a >>>> valid null. >>>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null. >>>> I >>>> would create a new col as a 0,1 binary in which 1=invalid null. >>>> I've >>>> tried various lag syntaxes but not making progress. David. >>>> >>>> >>>> rowcnt,ID,col1,col2 >>>> 11197810198320 >>>> 21198320199330 >>>> 31199330 >>>> 42200210200320 >>>> 52200320200530 >>>> 62200530200820 >>>> 72200530 >>>> 82200820 >>>> 93200010200120 >>>> 103200520200810 >>>> 113 200520 >>>> 123200810201110 >>>> 133200820 >>>> 143201110201210 >>>> 153201210201810 >>>> 163201810 >>>> 174196510196820 >>>> 184196820 >>>> 195200010 >>>> 205200520200520 >>>> 215200520200610 >>>> 22 5 200810 >>>> >>>> ===================== >>>> To manage your subscription to SPSSX-L, send a message to >>> >>>> LISTSERV@.UGA >>> >>>> (not to SPSSX-L), with no body text except the command. To leave >>>> the >>>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>>> manage subscriptions, send the command INFO REFCARD >>>> >>>> ________________________________ >>>> >>>> This correspondence contains proprietary information some or all of >>>> which may be legally privileged; it is for the intended recipient >>>> only. If you are not the intended recipient you must not use, >>>> disclose, distribute, copy, print, or rely on this correspondence >>>> and >>>> completely dispose of the correspondence immediately. Please notify >>>> the sender if you have received this email in error. NOTE: Messages >>>> to or from the State of Connecticut domain may be subject to the >>>> Freedom of Information statutes and regulations. >>>> >>>> >>>> ===================== >>>> To manage your subscription to SPSSX-L, send a message to >>> >>>> LISTSERV@.UGA >>> >>>> (not to SPSSX-L), with no body text except the command. To leave >>>> the >>>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>>> manage subscriptions, send the command INFO REFCARD >>> >>> >>> >>> >>> >>> ----- >>> Please reply to the list and not to my personal email. >>> Those desiring my consulting or training services please feel free >>> to >>> email me. >>> --- >>> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante >>> porcos ne forte conculcent eas pedibus suis." >>> Cum es damnatorum possederunt porcos iens ut salire off sanguinum >>> cliff in abyssum?" >>> -- >>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/ >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >>> > >> LISTSERV@.UGA > >> (not to SPSSX-L), with no body text except >>> the command. To leave the list, send the command SIGNOFF SPSSX-L For >>> a >>> list of commands to manage subscriptions, send the command INFO >>> REFCARD >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to > >> LISTSERV@.UGA > >> (not to SPSSX-L), with no body text except the command. To leave the >> list, send the command SIGNOFF SPSSX-L For a list of commands to >> manage >> subscriptions, send the command INFO REFCARD >> >> NOTICE TO RECIPIENT: If you are not the intended recipient of this >> e-mail, you are prohibited from sharing, copying, or otherwise using >> or >> disclosing its contents. If you have received this e-mail in error, >> please notify the sender immediately by reply e-mail and permanently >> delete this e-mail and any attachments without reading, forwarding or >> saving them. Thank you. >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to > >> LISTSERV@.UGA > >> (not to SPSSX-L), with no body text except the >> command. To leave the list, send the command >> SIGNOFF SPSSX-L >> For a list of commands to manage subscriptions, send the command >> INFO REFCARD > > > > > > ----- > Please reply to the list and not to my personal email. > Those desiring my consulting or training services please feel free to > email me. > --- > "Nolite dare sanctum canibus neque mittatis margaritas vestras ante > porcos ne forte conculcent eas pedibus suis." > Cum es damnatorum possederunt porcos iens ut salire off sanguinum > cliff in abyssum?" > -- > Sent from: http://spssx-discussion.1045642.n5.nabble.com/ > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except > the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
Here is a modification of my original Solution TWO which bypasses the
problem/underspecified cases which were initially flagged in 'error'. /*Modified Solution TWO using SHIFT VALUES with LEAD function */. SHIFT VALUES VARIABLE=ID RESULT=nextID LEAD=1 /VARIABLE=Col2 RESULT=nextCol2 LEAD=1. COMPUTE InvalidNull= ID EQ NextID AND SYSMIS(Col2) AND NOT SYSMIS(NextCol2). /* Perhaps do something interesting proceduraly rather than EXECUTE? */. EXECUTE. DELETE VARIABLES NextID NextCol2. coxspss wrote > Thanks for everyone responding so quickly. I tried Melissa's & David's > approach. Melissa's worked the fastest but had some false positives, > this was connected to her questions about the multiple nulls being valid > so she may have altered her syntax had I responded to her in time. > David's approach took longer to process but did correctly identify the > nulls, speed not really an issues, I am processing approximately 2 > million records so I didn't except speed, so went with David's. Thanks > again and to all who keep this listserv the best on spss. > > David Wright > > > > On Tue, Mar 27, 2018 at 1:46 PM, David Marso wrote: > >> As long as you don't have a bajillion records the following should >> suffice. >> >> /*-------------------------------------*/. >> SET MXLOOPS=10000000. >> MATRIX. >> GET DATA /VARIABLES rowcnt ID col1 col2 /MISSING=-999. >> COMPUTE Flags=MAKE(NROW(Data),1,0). >> LOOP #=NROW(DATA) TO 2 BY -1. >> COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999 >> AND >> Data(#-1,4) EQ -999)). >> END LOOP. >> SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags. >> END MATRIX. >> RECODE col2(-999=SYSMIS). >> /*-------------------------------------*/. >> >> >> In a situation where you do have a HUGE number of records try the >> following: >> >> DATA LIST LIST /rowcnt ID col1 col2 . >> BEGIN DATA >> 1 1 197810 198320 2 1 198320 199330 3 1 199330 4 2 200210 200320 5 2 >> 200320 200530 6 2 200530 200820 7 2 200530 8 2 200820 9 3 200010 >> 200120 10 3 200520 200810 11 3 200520 12 3 200810 201110 13 3 200820 >> 14 3 201110 201210 15 3 201210 201810 16 3 201810 17 4 196510 196820 >> 18 4 196820 19 5 200010 20 5 200520 200520 21 5 200520 200610 22 5 >> 200810 END DATA. >> SPLIT FILE BY ID. >> INCLUDE FILE="C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps". >> RECODE col2(-999=SYSMIS). >> >> >> /* Contents of "C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps" */. >> SET MXLOOPS=10000000. >> MATRIX. >> COMPUTE Col2=4. >> GET DATA /VARIABLES rowcnt ID col1 col2 /MISSING=-999. >> COMPUTE N=NROW(Data). >> COMPUTE Flags=MAKE(N,1,0). >> LOOP #=N TO 2 BY -1. >> COMPUTE Flags(#-1)= (Data(#,Col2) NE -999 AND Data(#-1,Col2) EQ -999). >> END LOOP. >> SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags. >> END MATRIX. >> >> >> Jim Marks wrote >>> Have you considered reorganizing your data to a wide format instead >>> of >>> LAG? >>> >>> I think that would make it easier to identify invalid nulls. If you >>> replaced the invalid nulls with a dummy code (9999999) you could >>> compute >>> invalid null after restructuring your data back to a long format. >>> >>> Jim Marks >>> Sr. Market Research Manager >>> National Market Research >>> Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc. >>> >>> -----Original Message----- >>> From: SPSSX(r) Discussion [mailto: >> >>> SPSSX-L@.UGA >> >>> ] On Behalf Of >> >>> coxspss@ >> >>> Sent: Tuesday, March 27, 2018 1:33 PM >>> To: >> >>> SPSSX-L@.UGA >> >>> Subject: Re: identifying a null within a repeat ID table >>> >>> Caution: This email came from outside Kaiser Permanente. Do not open >>> attachments or click on links if you do not recognize the sender. >>> >>> >>> ______________________________________________________________________ >>> Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple >>> nulls but >>> they must appear on the last adjacent rows, I can see where in my >>> post it >>> would lead someone to think it is invalid, tried to clarify that with >>> the >>> "-- note the last record may be preceded by a valid null" but upon >>> reading >>> it again I can see it just as confusing, sorry about that. >>> This multiple valid nulls at the end of a repeating ID was also >>> causing me >>> endless frustrations in my own attempt to write the syntax. Should >>> have >>> also indicated that the file is sorted by ID & col1 in the file. >>> >>> >>> On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote: >>> >>>> Assuming the ID values are in ascending order you don't need the >>>> SORT. >>>> To use Solution ONE below the file must be in proper order. >>>> Solution TWO doesn't require any SORT as long as the collective IDs >>>> are adjacent. >>>> I also question the non flagged 7. >>>> Operator error? >>>> -- >>>> >>>> /*Solution ONE using MATCH with LAST function assumes sorted*/. >>>> MATCH FILES /FILE * / LAST=@FLAG@/BY ID. >>>> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@. >>>> EXECUTE. >>>> DELETE VARIABLES @FLAG@. >>>> >>>> /*Solution TWO using SHIFT VALUES with LEAD function */. >>>> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1. >>>> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@. >>>> EXECUTE. >>>> DELETE VARIABLES @NextID@. >>>> >>>> Could also use CREATE with LEAD but why bother if we have SHIFT >>>> VALUES? >>>> >>>> >>>> >>>> >>>> >>>> Ives, Melissa L wrote >>>>> 3, 8, 16, 18, 22 are valid because they are the last record for the >>>>> ID. >>>>> >>>>> Why is 7 a valid null? >>>>> >>>>> Does col1 always increase as rowcnt increases? >>>>> Is rowcnt really a variable? >>>>> >>>>> *Put the last record on top. >>>>> Sort cases by ID col1 (D). >>>>> *alternatively. >>>>> *Sort cases by ID rowcnt (D). >>>>> >>>>> *Identify the last record w/lag function. >>>>> Compute lastrec=(Id ne lag(id). >>>>> *identify invalid nulls if missing and not last record. >>>>> Compute InvldNull=Missing(col2) and lastrec=0. >>>>> >>>>> *resort anytime. >>>>> M >>>>> >>>>> -----Original Message----- >>>>> From: SPSSX(r) Discussion [mailto: >>>> >>>>> SPSSX-L@.UGA >>>> >>>>> ] On Behalf Of >>>> >>>>> coxspss@ >>>> >>>>> Sent: Tuesday, March 27, 2018 7:39 AM >>>>> To: >>>> >>>>> SPSSX-L@.UGA >>>> >>>>> Subject: [SPSSX-L] identifying a null within a repeat ID table >>>>> >>>>> I have a repeating ID table in which I need to identify when a null >>>>> in >>>>> col2 appears either first within the ID repeating set or before the >>>>> last repeating record-- note the last record may be preceded by a >>>>> valid null. >>>>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null. >>>>> I >>>>> would create a new col as a 0,1 binary in which 1=invalid null. >>>>> I've >>>>> tried various lag syntaxes but not making progress. David. >>>>> >>>>> >>>>> rowcnt,ID,col1,col2 >>>>> 11197810198320 >>>>> 21198320199330 >>>>> 31199330 >>>>> 42200210200320 >>>>> 52200320200530 >>>>> 62200530200820 >>>>> 72200530 >>>>> 82200820 >>>>> 93200010200120 >>>>> 103200520200810 >>>>> 113 200520 >>>>> 123200810201110 >>>>> 133200820 >>>>> 143201110201210 >>>>> 153201210201810 >>>>> 163201810 >>>>> 174196510196820 >>>>> 184196820 >>>>> 195200010 >>>>> 205200520200520 >>>>> 215200520200610 >>>>> 22 5 200810 >>>>> >>>>> ===================== >>>>> To manage your subscription to SPSSX-L, send a message to >>>> >>>>> LISTSERV@.UGA >>>> >>>>> (not to SPSSX-L), with no body text except the command. To leave >>>>> the >>>>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>>>> manage subscriptions, send the command INFO REFCARD >>>>> >>>>> ________________________________ >>>>> >>>>> This correspondence contains proprietary information some or all of >>>>> which may be legally privileged; it is for the intended recipient >>>>> only. If you are not the intended recipient you must not use, >>>>> disclose, distribute, copy, print, or rely on this correspondence >>>>> and >>>>> completely dispose of the correspondence immediately. Please notify >>>>> the sender if you have received this email in error. NOTE: Messages >>>>> to or from the State of Connecticut domain may be subject to the >>>>> Freedom of Information statutes and regulations. >>>>> >>>>> >>>>> ===================== >>>>> To manage your subscription to SPSSX-L, send a message to >>>> >>>>> LISTSERV@.UGA >>>> >>>>> (not to SPSSX-L), with no body text except the command. To leave >>>>> the >>>>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>>>> manage subscriptions, send the command INFO REFCARD >>>> >>>> >>>> >>>> >>>> >>>> ----- >>>> Please reply to the list and not to my personal email. >>>> Those desiring my consulting or training services please feel free >>>> to >>>> email me. >>>> --- >>>> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante >>>> porcos ne forte conculcent eas pedibus suis." >>>> Cum es damnatorum possederunt porcos iens ut salire off sanguinum >>>> cliff in abyssum?" >>>> -- >>>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/ >>>> >>>> ===================== >>>> To manage your subscription to SPSSX-L, send a message to >>>> >> >>> LISTSERV@.UGA >> >>> (not to SPSSX-L), with no body text except >>>> the command. To leave the list, send the command SIGNOFF SPSSX-L For >>>> a >>>> list of commands to manage subscriptions, send the command INFO >>>> REFCARD >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >> >>> LISTSERV@.UGA >> >>> (not to SPSSX-L), with no body text except the command. To leave the >>> list, send the command SIGNOFF SPSSX-L For a list of commands to >>> manage >>> subscriptions, send the command INFO REFCARD >>> >>> NOTICE TO RECIPIENT: If you are not the intended recipient of this >>> e-mail, you are prohibited from sharing, copying, or otherwise using >>> or >>> disclosing its contents. If you have received this e-mail in error, >>> please notify the sender immediately by reply e-mail and permanently >>> delete this e-mail and any attachments without reading, forwarding or >>> saving them. Thank you. >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >> >>> LISTSERV@.UGA >> >>> (not to SPSSX-L), with no body text except the >>> command. To leave the list, send the command >>> SIGNOFF SPSSX-L >>> For a list of commands to manage subscriptions, send the command >>> INFO REFCARD >> >> >> >> >> >> ----- >> Please reply to the list and not to my personal email. >> Those desiring my consulting or training services please feel free to >> email me. >> --- >> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante >> porcos ne forte conculcent eas pedibus suis." >> Cum es damnatorum possederunt porcos iens ut salire off sanguinum >> cliff in abyssum?" >> -- >> Sent from: http://spssx-discussion.1045642.n5.nabble.com/ >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to >> > LISTSERV@.UGA > (not to SPSSX-L), with no body text except >> the >> command. To leave the list, send the command >> SIGNOFF SPSSX-L >> For a list of commands to manage subscriptions, send the command >> INFO REFCARD > > ===================== > To manage your subscription to SPSSX-L, send a message to > LISTSERV@.UGA > (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Free forum by Nabble | Edit this page |