identifying a null within a repeat ID table

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

identifying a null within a repeat ID table

wsu_wright
I have a repeating ID table in which I need to identify when a null in
col2 appears either first within the ID repeating set or before the last
repeating record-- note the last record may be preceded by a valid null.
In the table below rowcnt 11, 13 & 19 are cases of an invalid null.  I
would create a new col as a 0,1 binary in which 1=invalid null.  I've
tried various lag syntaxes but not making progress.  David.


rowcnt, ID,col1, col2
1 1 197810 198320
2 1 198320 199330
3 1 199330
4 2 200210 200320
5 2 200320 200530
6 2 200530 200820
7 2 200530
8 2 200820
9 3 200010 200120
10 3 200520 200810
11 3 200520
12 3 200810 201110
13 3 200820
14 3 201110 201210
15 3 201210 201810
16 3 201810
17 4 196510 196820
18 4 196820
19 5 200010
20 5 200520 200520
21 5 200520 200610
22 5 200810

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

MLIves

3, 8, 16, 18, 22 are valid because they are the last record for the ID.

Why is 7 a valid null?

Does col1 always increase as rowcnt increases?
Is rowcnt really a variable?

*Put the last record on top.
Sort cases by ID col1 (D).
*alternatively.
*Sort cases by ID rowcnt (D).

*Identify the last record w/lag function.
Compute lastrec=(Id ne lag(id).
*identify invalid nulls if missing and not last record.
Compute InvldNull=Missing(col2) and lastrec=0.

*resort anytime.
M

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of [hidden email]
Sent: Tuesday, March 27, 2018 7:39 AM
To: [hidden email]
Subject: [SPSSX-L] identifying a null within a repeat ID table

I have a repeating ID table in which I need to identify when a null in
col2 appears either first within the ID repeating set or before the last repeating record-- note the last record may be preceded by a valid null.
In the table below rowcnt 11, 13 & 19 are cases of an invalid null.  I would create a new col as a 0,1 binary in which 1=invalid null.  I've tried various lag syntaxes but not making progress.  David.


rowcnt,ID,col1,col2
11197810198320
21198320199330
31199330
42200210200320
52200320200530
62200530200820
72200530
82200820
93200010200120
103200520200810
113 200520
123200810201110
133200820
143201110201210
153201210201810
163201810
174196510196820
184196820
195200010
205200520200520
215200520200610
22 5 200810

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

________________________________

This correspondence contains proprietary information some or all of which may be legally privileged; it is for the intended recipient only. If you are not the intended recipient you must not use, disclose, distribute, copy, print, or rely on this correspondence and completely dispose of the correspondence immediately. Please notify the sender if you have received this email in error. NOTE: Messages to or from the State of Connecticut domain may be subject to the Freedom of Information statutes and regulations.


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

David Marso
Administrator
Assuming the ID values are in ascending order you don't need the SORT.
To use Solution ONE below the file must be in proper order.
Solution TWO doesn't require any SORT as long as the collective IDs are
adjacent.
I also question the non flagged 7.
Operator error?
--

/*Solution ONE using MATCH with LAST function assumes sorted*/.
MATCH FILES /FILE * / LAST=@FLAG@/BY ID.
COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@.
EXECUTE.
DELETE VARIABLES @FLAG@.

/*Solution TWO using SHIFT VALUES with LEAD function */.
SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1.
COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@.
EXECUTE.
DELETE VARIABLES @NextID@.

Could also use CREATE with LEAD but why bother if we have SHIFT VALUES?





Ives, Melissa L wrote

> 3, 8, 16, 18, 22 are valid because they are the last record for the ID.
>
> Why is 7 a valid null?
>
> Does col1 always increase as rowcnt increases?
> Is rowcnt really a variable?
>
> *Put the last record on top.
> Sort cases by ID col1 (D).
> *alternatively.
> *Sort cases by ID rowcnt (D).
>
> *Identify the last record w/lag function.
> Compute lastrec=(Id ne lag(id).
> *identify invalid nulls if missing and not last record.
> Compute InvldNull=Missing(col2) and lastrec=0.
>
> *resort anytime.
> M
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:

> SPSSX-L@.UGA

> ] On Behalf Of

> coxspss@

> Sent: Tuesday, March 27, 2018 7:39 AM
> To:

> SPSSX-L@.UGA

> Subject: [SPSSX-L] identifying a null within a repeat ID table
>
> I have a repeating ID table in which I need to identify when a null in
> col2 appears either first within the ID repeating set or before the last
> repeating record-- note the last record may be preceded by a valid null.
> In the table below rowcnt 11, 13 & 19 are cases of an invalid null.  I
> would create a new col as a 0,1 binary in which 1=invalid null.  I've
> tried various lag syntaxes but not making progress.  David.
>
>
> rowcnt,ID,col1,col2
> 11197810198320
> 21198320199330
> 31199330
> 42200210200320
> 52200320200530
> 62200530200820
> 72200530
> 82200820
> 93200010200120
> 103200520200810
> 113 200520
> 123200810201110
> 133200820
> 143201110201210
> 153201210201810
> 163201810
> 174196510196820
> 184196820
> 195200010
> 205200520200520
> 215200520200610
> 22 5 200810
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSX-L For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
> ________________________________
>
> This correspondence contains proprietary information some or all of which
> may be legally privileged; it is for the intended recipient only. If you
> are not the intended recipient you must not use, disclose, distribute,
> copy, print, or rely on this correspondence and completely dispose of the
> correspondence immediately. Please notify the sender if you have received
> this email in error. NOTE: Messages to or from the State of Connecticut
> domain may be subject to the Freedom of Information statutes and
> regulations.
>
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

wsu_wright
In reply to this post by wsu_wright
Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls
but they must appear on the last adjacent rows, I can see where in my
post it would lead someone to think it is invalid, tried to clarify that
with the "-- note the last record may be preceded by a valid null" but
upon reading it again I can see it just as confusing, sorry about that.
This multiple valid nulls at the end of a repeating ID was also causing
me endless frustrations in my own attempt to write the syntax.  Should
have also indicated that the file is sorted by ID & col1 in the file.


On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote:

> Assuming the ID values are in ascending order you don't need the SORT.
> To use Solution ONE below the file must be in proper order.
> Solution TWO doesn't require any SORT as long as the collective IDs
> are
> adjacent.
> I also question the non flagged 7.
> Operator error?
> --
>
> /*Solution ONE using MATCH with LAST function assumes sorted*/.
> MATCH FILES /FILE * / LAST=@FLAG@/BY ID.
> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@.
> EXECUTE.
> DELETE VARIABLES @FLAG@.
>
> /*Solution TWO using SHIFT VALUES with LEAD function */.
> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1.
> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@.
> EXECUTE.
> DELETE VARIABLES @NextID@.
>
> Could also use CREATE with LEAD but why bother if we have SHIFT
> VALUES?
>
>
>
>
>
> Ives, Melissa L wrote
>> 3, 8, 16, 18, 22 are valid because they are the last record for the
>> ID.
>>
>> Why is 7 a valid null?
>>
>> Does col1 always increase as rowcnt increases?
>> Is rowcnt really a variable?
>>
>> *Put the last record on top.
>> Sort cases by ID col1 (D).
>> *alternatively.
>> *Sort cases by ID rowcnt (D).
>>
>> *Identify the last record w/lag function.
>> Compute lastrec=(Id ne lag(id).
>> *identify invalid nulls if missing and not last record.
>> Compute InvldNull=Missing(col2) and lastrec=0.
>>
>> *resort anytime.
>> M
>>
>> -----Original Message-----
>> From: SPSSX(r) Discussion [mailto:
>
>> SPSSX-L@.UGA
>
>> ] On Behalf Of
>
>> coxspss@
>
>> Sent: Tuesday, March 27, 2018 7:39 AM
>> To:
>
>> SPSSX-L@.UGA
>
>> Subject: [SPSSX-L] identifying a null within a repeat ID table
>>
>> I have a repeating ID table in which I need to identify when a null
>> in
>> col2 appears either first within the ID repeating set or before the
>> last
>> repeating record-- note the last record may be preceded by a valid
>> null.
>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null.
>> I
>> would create a new col as a 0,1 binary in which 1=invalid null.  I've
>> tried various lag syntaxes but not making progress.  David.
>>
>>
>> rowcnt,ID,col1,col2
>> 11197810198320
>> 21198320199330
>> 31199330
>> 42200210200320
>> 52200320200530
>> 62200530200820
>> 72200530
>> 82200820
>> 93200010200120
>> 103200520200810
>> 113 200520
>> 123200810201110
>> 133200820
>> 143201110201210
>> 153201210201810
>> 163201810
>> 174196510196820
>> 184196820
>> 195200010
>> 205200520200520
>> 215200520200610
>> 22 5 200810
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the command. To leave the
>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>> manage
>> subscriptions, send the command INFO REFCARD
>>
>> ________________________________
>>
>> This correspondence contains proprietary information some or all of
>> which
>> may be legally privileged; it is for the intended recipient only. If
>> you
>> are not the intended recipient you must not use, disclose,
>> distribute,
>> copy, print, or rely on this correspondence and completely dispose of
>> the
>> correspondence immediately. Please notify the sender if you have
>> received
>> this email in error. NOTE: Messages to or from the State of
>> Connecticut
>> domain may be subject to the Freedom of Information statutes and
>> regulations.
>>
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>
>
>
>
>
> -----
> Please reply to the list and not to my personal email.
> Those desiring my consulting or training services please feel free to
> email me.
> ---
> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
> porcos ne forte conculcent eas pedibus suis."
> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
> cliff in abyssum?"
> --
> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except
> the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

Rich Ulrich

If brute force is fine because efficiency does not matter, the first solution I saw

gives the hint:

Sort into reverse order; use the usual LAG( ) process to number the lines for each ID.

Then the (revised) first and second lines are the ones for which NULL is okay.


--
Rich Ulrich


From: SPSSX(r) Discussion <[hidden email]> on behalf of [hidden email] <[hidden email]>
Sent: Tuesday, March 27, 2018 1:32 PM
To: [hidden email]
Subject: Re: identifying a null within a repeat ID table
 
Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls
but they must appear on the last adjacent rows, I can see where in my
post it would lead someone to think it is invalid, tried to clarify that
with the "-- note the last record may be preceded by a valid null" but
upon reading it again I can see it just as confusing, sorry about that.
This multiple valid nulls at the end of a repeating ID was also causing
me endless frustrations in my own attempt to write the syntax.  Should
have also indicated that the file is sorted by ID & col1 in the file.

...
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

Jim Marks
In reply to this post by wsu_wright
Have you considered reorganizing your data to a wide format  instead of LAG?

I think that would make it easier to identify invalid nulls. If you replaced the invalid nulls with a dummy code (9999999) you could compute invalid null after restructuring your data back to a long format.

Jim Marks
Sr. Market Research Manager
National Market Research
Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of [hidden email]
Sent: Tuesday, March 27, 2018 1:33 PM
To: [hidden email]
Subject: Re: identifying a null within a repeat ID table

Caution: This email came from outside Kaiser Permanente. Do not open attachments or click on links if you do not recognize the sender.

______________________________________________________________________
Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls but they must appear on the last adjacent rows, I can see where in my post it would lead someone to think it is invalid, tried to clarify that with the "-- note the last record may be preceded by a valid null" but upon reading it again I can see it just as confusing, sorry about that.
This multiple valid nulls at the end of a repeating ID was also causing me endless frustrations in my own attempt to write the syntax.  Should have also indicated that the file is sorted by ID & col1 in the file.


On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote:

> Assuming the ID values are in ascending order you don't need the SORT.
> To use Solution ONE below the file must be in proper order.
> Solution TWO doesn't require any SORT as long as the collective IDs
> are adjacent.
> I also question the non flagged 7.
> Operator error?
> --
>
> /*Solution ONE using MATCH with LAST function assumes sorted*/.
> MATCH FILES /FILE * / LAST=@FLAG@/BY ID.
> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@.
> EXECUTE.
> DELETE VARIABLES @FLAG@.
>
> /*Solution TWO using SHIFT VALUES with LEAD function */.
> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1.
> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@.
> EXECUTE.
> DELETE VARIABLES @NextID@.
>
> Could also use CREATE with LEAD but why bother if we have SHIFT
> VALUES?
>
>
>
>
>
> Ives, Melissa L wrote
>> 3, 8, 16, 18, 22 are valid because they are the last record for the
>> ID.
>>
>> Why is 7 a valid null?
>>
>> Does col1 always increase as rowcnt increases?
>> Is rowcnt really a variable?
>>
>> *Put the last record on top.
>> Sort cases by ID col1 (D).
>> *alternatively.
>> *Sort cases by ID rowcnt (D).
>>
>> *Identify the last record w/lag function.
>> Compute lastrec=(Id ne lag(id).
>> *identify invalid nulls if missing and not last record.
>> Compute InvldNull=Missing(col2) and lastrec=0.
>>
>> *resort anytime.
>> M
>>
>> -----Original Message-----
>> From: SPSSX(r) Discussion [mailto:
>
>> SPSSX-L@.UGA
>
>> ] On Behalf Of
>
>> coxspss@
>
>> Sent: Tuesday, March 27, 2018 7:39 AM
>> To:
>
>> SPSSX-L@.UGA
>
>> Subject: [SPSSX-L] identifying a null within a repeat ID table
>>
>> I have a repeating ID table in which I need to identify when a null
>> in
>> col2 appears either first within the ID repeating set or before the
>> last repeating record-- note the last record may be preceded by a
>> valid null.
>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null.
>> I
>> would create a new col as a 0,1 binary in which 1=invalid null.  I've
>> tried various lag syntaxes but not making progress.  David.
>>
>>
>> rowcnt,ID,col1,col2
>> 11197810198320
>> 21198320199330
>> 31199330
>> 42200210200320
>> 52200320200530
>> 62200530200820
>> 72200530
>> 82200820
>> 93200010200120
>> 103200520200810
>> 113 200520
>> 123200810201110
>> 133200820
>> 143201110201210
>> 153201210201810
>> 163201810
>> 174196510196820
>> 184196820
>> 195200010
>> 205200520200520
>> 215200520200610
>> 22 5 200810
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the command. To leave the
>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>> manage subscriptions, send the command INFO REFCARD
>>
>> ________________________________
>>
>> This correspondence contains proprietary information some or all of
>> which may be legally privileged; it is for the intended recipient
>> only. If you are not the intended recipient you must not use,
>> disclose, distribute, copy, print, or rely on this correspondence and
>> completely dispose of the correspondence immediately. Please notify
>> the sender if you have received this email in error. NOTE: Messages
>> to or from the State of Connecticut domain may be subject to the
>> Freedom of Information statutes and regulations.
>>
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the command. To leave the
>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>> manage subscriptions, send the command INFO REFCARD
>
>
>
>
>
> -----
> Please reply to the list and not to my personal email.
> Those desiring my consulting or training services please feel free to
> email me.
> ---
> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
> porcos ne forte conculcent eas pedibus suis."
> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
> cliff in abyssum?"
> --
> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except
> the command. To leave the list, send the command SIGNOFF SPSSX-L For a
> list of commands to manage subscriptions, send the command INFO
> REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

NOTICE TO RECIPIENT:  If you are not the intended recipient of this e-mail, you are prohibited from sharing, copying, or otherwise using or disclosing its contents.  If you have received this e-mail in error, please notify the sender immediately by reply e-mail and permanently delete this e-mail and any attachments without reading, forwarding or saving them.  Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

David Marso
Administrator
As long as you don't have a bajillion records the following should suffice.

/*-------------------------------------*/.
SET MXLOOPS=10000000.
MATRIX.
GET DATA /VARIABLES rowcnt ID col1 col2  /MISSING=-999.
COMPUTE Flags=MAKE(NROW(Data),1,0).
LOOP #=NROW(DATA) TO 2 BY -1.
COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999 AND
Data(#-1,4) EQ -999)).
END LOOP.
SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags.
END MATRIX.
RECODE col2(-999=SYSMIS).
/*-------------------------------------*/.

Jim Marks wrote

> Have you considered reorganizing your data to a wide format  instead of
> LAG?
>
> I think that would make it easier to identify invalid nulls. If you
> replaced the invalid nulls with a dummy code (9999999) you could compute
> invalid null after restructuring your data back to a long format.
>
> Jim Marks
> Sr. Market Research Manager
> National Market Research
> Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc.
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:

> SPSSX-L@.UGA

> ] On Behalf Of

> coxspss@

> Sent: Tuesday, March 27, 2018 1:33 PM
> To:

> SPSSX-L@.UGA

> Subject: Re: identifying a null within a repeat ID table
>
> Caution: This email came from outside Kaiser Permanente. Do not open
> attachments or click on links if you do not recognize the sender.
>
> ______________________________________________________________________
> Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls but
> they must appear on the last adjacent rows, I can see where in my post it
> would lead someone to think it is invalid, tried to clarify that with the
> "-- note the last record may be preceded by a valid null" but upon reading
> it again I can see it just as confusing, sorry about that.
> This multiple valid nulls at the end of a repeating ID was also causing me
> endless frustrations in my own attempt to write the syntax.  Should have
> also indicated that the file is sorted by ID & col1 in the file.
>
>
> On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote:
>
>> Assuming the ID values are in ascending order you don't need the SORT.
>> To use Solution ONE below the file must be in proper order.
>> Solution TWO doesn't require any SORT as long as the collective IDs
>> are adjacent.
>> I also question the non flagged 7.
>> Operator error?
>> --
>>
>> /*Solution ONE using MATCH with LAST function assumes sorted*/.
>> MATCH FILES /FILE * / LAST=@FLAG@/BY ID.
>> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@.
>> EXECUTE.
>> DELETE VARIABLES @FLAG@.
>>
>> /*Solution TWO using SHIFT VALUES with LEAD function */.
>> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1.
>> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@.
>> EXECUTE.
>> DELETE VARIABLES @NextID@.
>>
>> Could also use CREATE with LEAD but why bother if we have SHIFT
>> VALUES?
>>
>>
>>
>>
>>
>> Ives, Melissa L wrote
>>> 3, 8, 16, 18, 22 are valid because they are the last record for the
>>> ID.
>>>
>>> Why is 7 a valid null?
>>>
>>> Does col1 always increase as rowcnt increases?
>>> Is rowcnt really a variable?
>>>
>>> *Put the last record on top.
>>> Sort cases by ID col1 (D).
>>> *alternatively.
>>> *Sort cases by ID rowcnt (D).
>>>
>>> *Identify the last record w/lag function.
>>> Compute lastrec=(Id ne lag(id).
>>> *identify invalid nulls if missing and not last record.
>>> Compute InvldNull=Missing(col2) and lastrec=0.
>>>
>>> *resort anytime.
>>> M
>>>
>>> -----Original Message-----
>>> From: SPSSX(r) Discussion [mailto:
>>
>>> SPSSX-L@.UGA
>>
>>> ] On Behalf Of
>>
>>> coxspss@
>>
>>> Sent: Tuesday, March 27, 2018 7:39 AM
>>> To:
>>
>>> SPSSX-L@.UGA
>>
>>> Subject: [SPSSX-L] identifying a null within a repeat ID table
>>>
>>> I have a repeating ID table in which I need to identify when a null
>>> in
>>> col2 appears either first within the ID repeating set or before the
>>> last repeating record-- note the last record may be preceded by a
>>> valid null.
>>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null.
>>> I
>>> would create a new col as a 0,1 binary in which 1=invalid null.  I've
>>> tried various lag syntaxes but not making progress.  David.
>>>
>>>
>>> rowcnt,ID,col1,col2
>>> 11197810198320
>>> 21198320199330
>>> 31199330
>>> 42200210200320
>>> 52200320200530
>>> 62200530200820
>>> 72200530
>>> 82200820
>>> 93200010200120
>>> 103200520200810
>>> 113 200520
>>> 123200810201110
>>> 133200820
>>> 143201110201210
>>> 153201210201810
>>> 163201810
>>> 174196510196820
>>> 184196820
>>> 195200010
>>> 205200520200520
>>> 215200520200610
>>> 22 5 200810
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>
>>> LISTSERV@.UGA
>>
>>>  (not to SPSSX-L), with no body text except the command. To leave the
>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>> manage subscriptions, send the command INFO REFCARD
>>>
>>> ________________________________
>>>
>>> This correspondence contains proprietary information some or all of
>>> which may be legally privileged; it is for the intended recipient
>>> only. If you are not the intended recipient you must not use,
>>> disclose, distribute, copy, print, or rely on this correspondence and
>>> completely dispose of the correspondence immediately. Please notify
>>> the sender if you have received this email in error. NOTE: Messages
>>> to or from the State of Connecticut domain may be subject to the
>>> Freedom of Information statutes and regulations.
>>>
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>
>>> LISTSERV@.UGA
>>
>>>  (not to SPSSX-L), with no body text except the command. To leave the
>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>> manage subscriptions, send the command INFO REFCARD
>>
>>
>>
>>
>>
>> -----
>> Please reply to the list and not to my personal email.
>> Those desiring my consulting or training services please feel free to
>> email me.
>> ---
>> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
>> porcos ne forte conculcent eas pedibus suis."
>> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
>> cliff in abyssum?"
>> --
>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>>

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except
>> the command. To leave the list, send the command SIGNOFF SPSSX-L For a
>> list of commands to manage subscriptions, send the command INFO
>> REFCARD
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSX-L For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
> e-mail, you are prohibited from sharing, copying, or otherwise using or
> disclosing its contents.  If you have received this e-mail in error,
> please notify the sender immediately by reply e-mail and permanently
> delete this e-mail and any attachments without reading, forwarding or
> saving them.  Thank you.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

David Marso
Administrator
In reply to this post by Jim Marks
As long as you don't have a bajillion records the following should suffice.

/*-------------------------------------*/.
SET MXLOOPS=10000000.
MATRIX.
GET DATA /VARIABLES rowcnt ID col1 col2  /MISSING=-999.
COMPUTE Flags=MAKE(NROW(Data),1,0).
LOOP #=NROW(DATA) TO 2 BY -1.
COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999 AND
Data(#-1,4) EQ -999)).
END LOOP.
SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags.
END MATRIX.
RECODE col2(-999=SYSMIS).
/*-------------------------------------*/.


In a situation where you do have a HUGE number of records try the following:

DATA LIST LIST /rowcnt ID col1 col2 .
BEGIN DATA
1 1 197810 198320
2 1 198320 199330
3 1 199330
4 2 200210 200320
5 2 200320 200530
6 2 200530 200820
7 2 200530
8 2 200820
9 3 200010 200120
10 3 200520 200810
11 3 200520
12 3 200810 201110
13 3 200820
14 3 201110 201210
15 3 201210 201810
16 3 201810
17 4 196510 196820
18 4 196820
19 5 200010
20 5 200520 200520
21 5 200520 200610
22 5 200810
END DATA.
SPLIT FILE BY ID.
INCLUDE FILE="C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps".
RECODE col2(-999=SYSMIS).


/* Contents of "C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps" */.
SET MXLOOPS=10000000.
MATRIX.
COMPUTE Col2=4.
GET DATA /VARIABLES rowcnt ID col1 col2  /MISSING=-999.
COMPUTE N=NROW(Data).
COMPUTE Flags=MAKE(N,1,0).
LOOP #=N TO 2 BY -1.
COMPUTE Flags(#-1)= (Data(#,Col2) NE -999 AND Data(#-1,Col2) EQ -999).
END LOOP.
SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags.
END MATRIX.


Jim Marks wrote

> Have you considered reorganizing your data to a wide format  instead of
> LAG?
>
> I think that would make it easier to identify invalid nulls. If you
> replaced the invalid nulls with a dummy code (9999999) you could compute
> invalid null after restructuring your data back to a long format.
>
> Jim Marks
> Sr. Market Research Manager
> National Market Research
> Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc.
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:

> SPSSX-L@.UGA

> ] On Behalf Of

> coxspss@

> Sent: Tuesday, March 27, 2018 1:33 PM
> To:

> SPSSX-L@.UGA

> Subject: Re: identifying a null within a repeat ID table
>
> Caution: This email came from outside Kaiser Permanente. Do not open
> attachments or click on links if you do not recognize the sender.
>
> ______________________________________________________________________
> Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple nulls but
> they must appear on the last adjacent rows, I can see where in my post it
> would lead someone to think it is invalid, tried to clarify that with the
> "-- note the last record may be preceded by a valid null" but upon reading
> it again I can see it just as confusing, sorry about that.
> This multiple valid nulls at the end of a repeating ID was also causing me
> endless frustrations in my own attempt to write the syntax.  Should have
> also indicated that the file is sorted by ID & col1 in the file.
>
>
> On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote:
>
>> Assuming the ID values are in ascending order you don't need the SORT.
>> To use Solution ONE below the file must be in proper order.
>> Solution TWO doesn't require any SORT as long as the collective IDs
>> are adjacent.
>> I also question the non flagged 7.
>> Operator error?
>> --
>>
>> /*Solution ONE using MATCH with LAST function assumes sorted*/.
>> MATCH FILES /FILE * / LAST=@FLAG@/BY ID.
>> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@.
>> EXECUTE.
>> DELETE VARIABLES @FLAG@.
>>
>> /*Solution TWO using SHIFT VALUES with LEAD function */.
>> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1.
>> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@.
>> EXECUTE.
>> DELETE VARIABLES @NextID@.
>>
>> Could also use CREATE with LEAD but why bother if we have SHIFT
>> VALUES?
>>
>>
>>
>>
>>
>> Ives, Melissa L wrote
>>> 3, 8, 16, 18, 22 are valid because they are the last record for the
>>> ID.
>>>
>>> Why is 7 a valid null?
>>>
>>> Does col1 always increase as rowcnt increases?
>>> Is rowcnt really a variable?
>>>
>>> *Put the last record on top.
>>> Sort cases by ID col1 (D).
>>> *alternatively.
>>> *Sort cases by ID rowcnt (D).
>>>
>>> *Identify the last record w/lag function.
>>> Compute lastrec=(Id ne lag(id).
>>> *identify invalid nulls if missing and not last record.
>>> Compute InvldNull=Missing(col2) and lastrec=0.
>>>
>>> *resort anytime.
>>> M
>>>
>>> -----Original Message-----
>>> From: SPSSX(r) Discussion [mailto:
>>
>>> SPSSX-L@.UGA
>>
>>> ] On Behalf Of
>>
>>> coxspss@
>>
>>> Sent: Tuesday, March 27, 2018 7:39 AM
>>> To:
>>
>>> SPSSX-L@.UGA
>>
>>> Subject: [SPSSX-L] identifying a null within a repeat ID table
>>>
>>> I have a repeating ID table in which I need to identify when a null
>>> in
>>> col2 appears either first within the ID repeating set or before the
>>> last repeating record-- note the last record may be preceded by a
>>> valid null.
>>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null.
>>> I
>>> would create a new col as a 0,1 binary in which 1=invalid null.  I've
>>> tried various lag syntaxes but not making progress.  David.
>>>
>>>
>>> rowcnt,ID,col1,col2
>>> 11197810198320
>>> 21198320199330
>>> 31199330
>>> 42200210200320
>>> 52200320200530
>>> 62200530200820
>>> 72200530
>>> 82200820
>>> 93200010200120
>>> 103200520200810
>>> 113 200520
>>> 123200810201110
>>> 133200820
>>> 143201110201210
>>> 153201210201810
>>> 163201810
>>> 174196510196820
>>> 184196820
>>> 195200010
>>> 205200520200520
>>> 215200520200610
>>> 22 5 200810
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>
>>> LISTSERV@.UGA
>>
>>>  (not to SPSSX-L), with no body text except the command. To leave the
>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>> manage subscriptions, send the command INFO REFCARD
>>>
>>> ________________________________
>>>
>>> This correspondence contains proprietary information some or all of
>>> which may be legally privileged; it is for the intended recipient
>>> only. If you are not the intended recipient you must not use,
>>> disclose, distribute, copy, print, or rely on this correspondence and
>>> completely dispose of the correspondence immediately. Please notify
>>> the sender if you have received this email in error. NOTE: Messages
>>> to or from the State of Connecticut domain may be subject to the
>>> Freedom of Information statutes and regulations.
>>>
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>
>>> LISTSERV@.UGA
>>
>>>  (not to SPSSX-L), with no body text except the command. To leave the
>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>> manage subscriptions, send the command INFO REFCARD
>>
>>
>>
>>
>>
>> -----
>> Please reply to the list and not to my personal email.
>> Those desiring my consulting or training services please feel free to
>> email me.
>> ---
>> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
>> porcos ne forte conculcent eas pedibus suis."
>> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
>> cliff in abyssum?"
>> --
>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>>

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except
>> the command. To leave the list, send the command SIGNOFF SPSSX-L For a
>> list of commands to manage subscriptions, send the command INFO
>> REFCARD
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the command. To leave the
> list, send the command SIGNOFF SPSSX-L For a list of commands to manage
> subscriptions, send the command INFO REFCARD
>
> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
> e-mail, you are prohibited from sharing, copying, or otherwise using or
> disclosing its contents.  If you have received this e-mail in error,
> please notify the sender immediately by reply e-mail and permanently
> delete this e-mail and any attachments without reading, forwarding or
> saving them.  Thank you.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

wsu_wright
In reply to this post by wsu_wright
Thanks for everyone responding so quickly.  I tried Melissa's & David's
approach.  Melissa's worked the fastest but had some false positives,
this was connected to her questions about the multiple nulls being valid
so she may have altered her syntax had I responded to her in time.
David's approach took longer to process but did correctly identify the
nulls, speed not really an issues, I am processing approximately 2
million records so I didn't except speed, so went with David's.  Thanks
again and to all who keep this listserv the best on spss.

David Wright



On Tue, Mar 27, 2018 at 1:46 PM, David Marso wrote:

> As long as you don't have a bajillion records the following should
> suffice.
>
> /*-------------------------------------*/.
> SET MXLOOPS=10000000.
> MATRIX.
> GET DATA /VARIABLES rowcnt ID col1 col2  /MISSING=-999.
> COMPUTE Flags=MAKE(NROW(Data),1,0).
> LOOP #=NROW(DATA) TO 2 BY -1.
> COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999
> AND
> Data(#-1,4) EQ -999)).
> END LOOP.
> SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags.
> END MATRIX.
> RECODE col2(-999=SYSMIS).
> /*-------------------------------------*/.
>
>
> In a situation where you do have a HUGE number of records try the
> following:
>
> DATA LIST LIST /rowcnt ID col1 col2 .
> BEGIN DATA
> 1 1 197810 198320 2 1 198320 199330 3 1 199330 4 2 200210 200320 5 2
> 200320 200530 6 2 200530 200820 7 2 200530 8 2 200820 9 3 200010
> 200120 10 3 200520 200810 11 3 200520 12 3 200810 201110 13 3 200820
> 14 3 201110 201210 15 3 201210 201810 16 3 201810 17 4 196510 196820
> 18 4 196820 19 5 200010 20 5 200520 200520 21 5 200520 200610 22 5
> 200810 END DATA.
> SPLIT FILE BY ID.
> INCLUDE FILE="C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps".
> RECODE col2(-999=SYSMIS).
>
>
> /* Contents of "C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps" */.
> SET MXLOOPS=10000000.
> MATRIX.
> COMPUTE Col2=4.
> GET DATA /VARIABLES rowcnt ID col1 col2  /MISSING=-999.
> COMPUTE N=NROW(Data).
> COMPUTE Flags=MAKE(N,1,0).
> LOOP #=N TO 2 BY -1.
> COMPUTE Flags(#-1)= (Data(#,Col2) NE -999 AND Data(#-1,Col2) EQ -999).
> END LOOP.
> SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags.
> END MATRIX.
>
>
> Jim Marks wrote
>> Have you considered reorganizing your data to a wide format  instead
>> of
>> LAG?
>>
>> I think that would make it easier to identify invalid nulls. If you
>> replaced the invalid nulls with a dummy code (9999999) you could
>> compute
>> invalid null after restructuring your data back to a long format.
>>
>> Jim Marks
>> Sr. Market Research Manager
>> National Market Research
>> Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc.
>>
>> -----Original Message-----
>> From: SPSSX(r) Discussion [mailto:
>
>> SPSSX-L@.UGA
>
>> ] On Behalf Of
>
>> coxspss@
>
>> Sent: Tuesday, March 27, 2018 1:33 PM
>> To:
>
>> SPSSX-L@.UGA
>
>> Subject: Re: identifying a null within a repeat ID table
>>
>> Caution: This email came from outside Kaiser Permanente. Do not open
>> attachments or click on links if you do not recognize the sender.
>>
>>
>> ______________________________________________________________________
>> Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple
>> nulls but
>> they must appear on the last adjacent rows, I can see where in my
>> post it
>> would lead someone to think it is invalid, tried to clarify that with
>> the
>> "-- note the last record may be preceded by a valid null" but upon
>> reading
>> it again I can see it just as confusing, sorry about that.
>> This multiple valid nulls at the end of a repeating ID was also
>> causing me
>> endless frustrations in my own attempt to write the syntax.  Should
>> have
>> also indicated that the file is sorted by ID & col1 in the file.
>>
>>
>> On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote:
>>
>>> Assuming the ID values are in ascending order you don't need the
>>> SORT.
>>> To use Solution ONE below the file must be in proper order.
>>> Solution TWO doesn't require any SORT as long as the collective IDs
>>> are adjacent.
>>> I also question the non flagged 7.
>>> Operator error?
>>> --
>>>
>>> /*Solution ONE using MATCH with LAST function assumes sorted*/.
>>> MATCH FILES /FILE * / LAST=@FLAG@/BY ID.
>>> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@.
>>> EXECUTE.
>>> DELETE VARIABLES @FLAG@.
>>>
>>> /*Solution TWO using SHIFT VALUES with LEAD function */.
>>> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1.
>>> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@.
>>> EXECUTE.
>>> DELETE VARIABLES @NextID@.
>>>
>>> Could also use CREATE with LEAD but why bother if we have SHIFT
>>> VALUES?
>>>
>>>
>>>
>>>
>>>
>>> Ives, Melissa L wrote
>>>> 3, 8, 16, 18, 22 are valid because they are the last record for the
>>>> ID.
>>>>
>>>> Why is 7 a valid null?
>>>>
>>>> Does col1 always increase as rowcnt increases?
>>>> Is rowcnt really a variable?
>>>>
>>>> *Put the last record on top.
>>>> Sort cases by ID col1 (D).
>>>> *alternatively.
>>>> *Sort cases by ID rowcnt (D).
>>>>
>>>> *Identify the last record w/lag function.
>>>> Compute lastrec=(Id ne lag(id).
>>>> *identify invalid nulls if missing and not last record.
>>>> Compute InvldNull=Missing(col2) and lastrec=0.
>>>>
>>>> *resort anytime.
>>>> M
>>>>
>>>> -----Original Message-----
>>>> From: SPSSX(r) Discussion [mailto:
>>>
>>>> SPSSX-L@.UGA
>>>
>>>> ] On Behalf Of
>>>
>>>> coxspss@
>>>
>>>> Sent: Tuesday, March 27, 2018 7:39 AM
>>>> To:
>>>
>>>> SPSSX-L@.UGA
>>>
>>>> Subject: [SPSSX-L] identifying a null within a repeat ID table
>>>>
>>>> I have a repeating ID table in which I need to identify when a null
>>>> in
>>>> col2 appears either first within the ID repeating set or before the
>>>> last repeating record-- note the last record may be preceded by a
>>>> valid null.
>>>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null.
>>>> I
>>>> would create a new col as a 0,1 binary in which 1=invalid null.
>>>> I've
>>>> tried various lag syntaxes but not making progress.  David.
>>>>
>>>>
>>>> rowcnt,ID,col1,col2
>>>> 11197810198320
>>>> 21198320199330
>>>> 31199330
>>>> 42200210200320
>>>> 52200320200530
>>>> 62200530200820
>>>> 72200530
>>>> 82200820
>>>> 93200010200120
>>>> 103200520200810
>>>> 113 200520
>>>> 123200810201110
>>>> 133200820
>>>> 143201110201210
>>>> 153201210201810
>>>> 163201810
>>>> 174196510196820
>>>> 184196820
>>>> 195200010
>>>> 205200520200520
>>>> 215200520200610
>>>> 22 5 200810
>>>>
>>>> =====================
>>>> To manage your subscription to SPSSX-L, send a message to
>>>
>>>> LISTSERV@.UGA
>>>
>>>>  (not to SPSSX-L), with no body text except the command. To leave
>>>> the
>>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>>> manage subscriptions, send the command INFO REFCARD
>>>>
>>>> ________________________________
>>>>
>>>> This correspondence contains proprietary information some or all of
>>>> which may be legally privileged; it is for the intended recipient
>>>> only. If you are not the intended recipient you must not use,
>>>> disclose, distribute, copy, print, or rely on this correspondence
>>>> and
>>>> completely dispose of the correspondence immediately. Please notify
>>>> the sender if you have received this email in error. NOTE: Messages
>>>> to or from the State of Connecticut domain may be subject to the
>>>> Freedom of Information statutes and regulations.
>>>>
>>>>
>>>> =====================
>>>> To manage your subscription to SPSSX-L, send a message to
>>>
>>>> LISTSERV@.UGA
>>>
>>>>  (not to SPSSX-L), with no body text except the command. To leave
>>>> the
>>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>>> manage subscriptions, send the command INFO REFCARD
>>>
>>>
>>>
>>>
>>>
>>> -----
>>> Please reply to the list and not to my personal email.
>>> Those desiring my consulting or training services please feel free
>>> to
>>> email me.
>>> ---
>>> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
>>> porcos ne forte conculcent eas pedibus suis."
>>> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
>>> cliff in abyssum?"
>>> --
>>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>>
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except
>>> the command. To leave the list, send the command SIGNOFF SPSSX-L For
>>> a
>>> list of commands to manage subscriptions, send the command INFO
>>> REFCARD
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the command. To leave the
>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>> manage
>> subscriptions, send the command INFO REFCARD
>>
>> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
>> e-mail, you are prohibited from sharing, copying, or otherwise using
>> or
>> disclosing its contents.  If you have received this e-mail in error,
>> please notify the sender immediately by reply e-mail and permanently
>> delete this e-mail and any attachments without reading, forwarding or
>> saving them.  Thank you.
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>
>> LISTSERV@.UGA
>
>>  (not to SPSSX-L), with no body text except the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>
>
>
>
>
> -----
> Please reply to the list and not to my personal email.
> Those desiring my consulting or training services please feel free to
> email me.
> ---
> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
> porcos ne forte conculcent eas pedibus suis."
> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
> cliff in abyssum?"
> --
> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except
> the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: identifying a null within a repeat ID table

David Marso
Administrator
Here is a modification of my original Solution TWO which bypasses the
problem/underspecified cases which were initially flagged in 'error'.
/*Modified Solution TWO using SHIFT VALUES with LEAD function */.

SHIFT VALUES
   VARIABLE=ID    RESULT=nextID    LEAD=1
  /VARIABLE=Col2 RESULT=nextCol2 LEAD=1.

COMPUTE InvalidNull= ID EQ NextID AND SYSMIS(Col2) AND NOT SYSMIS(NextCol2).
/* Perhaps do something interesting proceduraly rather than EXECUTE? */.
EXECUTE.

DELETE VARIABLES NextID NextCol2.


coxspss wrote

> Thanks for everyone responding so quickly.  I tried Melissa's & David's
> approach.  Melissa's worked the fastest but had some false positives,
> this was connected to her questions about the multiple nulls being valid
> so she may have altered her syntax had I responded to her in time.
> David's approach took longer to process but did correctly identify the
> nulls, speed not really an issues, I am processing approximately 2
> million records so I didn't except speed, so went with David's.  Thanks
> again and to all who keep this listserv the best on spss.
>
> David Wright
>
>
>
> On Tue, Mar 27, 2018 at 1:46 PM, David Marso wrote:
>
>> As long as you don't have a bajillion records the following should
>> suffice.
>>
>> /*-------------------------------------*/.
>> SET MXLOOPS=10000000.
>> MATRIX.
>> GET DATA /VARIABLES rowcnt ID col1 col2  /MISSING=-999.
>> COMPUTE Flags=MAKE(NROW(Data),1,0).
>> LOOP #=NROW(DATA) TO 2 BY -1.
>> COMPUTE Flags(#-1)= (Data(#,2) EQ Data(#-1,2) AND (Data(#,4) NE -999
>> AND
>> Data(#-1,4) EQ -999)).
>> END LOOP.
>> SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags.
>> END MATRIX.
>> RECODE col2(-999=SYSMIS).
>> /*-------------------------------------*/.
>>
>>
>> In a situation where you do have a HUGE number of records try the
>> following:
>>
>> DATA LIST LIST /rowcnt ID col1 col2 .
>> BEGIN DATA
>> 1 1 197810 198320 2 1 198320 199330 3 1 199330 4 2 200210 200320 5 2
>> 200320 200530 6 2 200530 200820 7 2 200530 8 2 200820 9 3 200010
>> 200120 10 3 200520 200810 11 3 200520 12 3 200810 201110 13 3 200820
>> 14 3 201110 201210 15 3 201210 201810 16 3 201810 17 4 196510 196820
>> 18 4 196820 19 5 200010 20 5 200520 200520 21 5 200520 200610 22 5
>> 200810 END DATA.
>> SPLIT FILE BY ID.
>> INCLUDE FILE="C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps".
>> RECODE col2(-999=SYSMIS).
>>
>>
>> /* Contents of "C:\Users\Owner\Desktop\SPSSCODE\SplitMatrix.sps" */.
>> SET MXLOOPS=10000000.
>> MATRIX.
>> COMPUTE Col2=4.
>> GET DATA /VARIABLES rowcnt ID col1 col2  /MISSING=-999.
>> COMPUTE N=NROW(Data).
>> COMPUTE Flags=MAKE(N,1,0).
>> LOOP #=N TO 2 BY -1.
>> COMPUTE Flags(#-1)= (Data(#,Col2) NE -999 AND Data(#-1,Col2) EQ -999).
>> END LOOP.
>> SAVE {Data,Flags} /OUTFILE * / VARIABLES rowcnt ID col1 col2 Flags.
>> END MATRIX.
>>
>>
>> Jim Marks wrote
>>> Have you considered reorganizing your data to a wide format  instead
>>> of
>>> LAG?
>>>
>>> I think that would make it easier to identify invalid nulls. If you
>>> replaced the invalid nulls with a dummy code (9999999) you could
>>> compute
>>> invalid null after restructuring your data back to a long format.
>>>
>>> Jim Marks
>>> Sr. Market Research Manager
>>> National Market Research
>>> Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc.
>>>
>>> -----Original Message-----
>>> From: SPSSX(r) Discussion [mailto:
>>
>>> SPSSX-L@.UGA
>>
>>> ] On Behalf Of
>>
>>> coxspss@
>>
>>> Sent: Tuesday, March 27, 2018 1:33 PM
>>> To:
>>
>>> SPSSX-L@.UGA
>>
>>> Subject: Re: identifying a null within a repeat ID table
>>>
>>> Caution: This email came from outside Kaiser Permanente. Do not open
>>> attachments or click on links if you do not recognize the sender.
>>>
>>>
>>> ______________________________________________________________________
>>> Yes, row 7 & 8 are valid nulls for ID 2, an ID can have multiple
>>> nulls but
>>> they must appear on the last adjacent rows, I can see where in my
>>> post it
>>> would lead someone to think it is invalid, tried to clarify that with
>>> the
>>> "-- note the last record may be preceded by a valid null" but upon
>>> reading
>>> it again I can see it just as confusing, sorry about that.
>>> This multiple valid nulls at the end of a repeating ID was also
>>> causing me
>>> endless frustrations in my own attempt to write the syntax.  Should
>>> have
>>> also indicated that the file is sorted by ID & col1 in the file.
>>>
>>>
>>> On Tue, Mar 27, 2018 at 11:15 AM, David Marso wrote:
>>>
>>>> Assuming the ID values are in ascending order you don't need the
>>>> SORT.
>>>> To use Solution ONE below the file must be in proper order.
>>>> Solution TWO doesn't require any SORT as long as the collective IDs
>>>> are adjacent.
>>>> I also question the non flagged 7.
>>>> Operator error?
>>>> --
>>>>
>>>> /*Solution ONE using MATCH with LAST function assumes sorted*/.
>>>> MATCH FILES /FILE * / LAST=@FLAG@/BY ID.
>>>> COMPUTE InvalidNull= SYSMIS(Col2) AND NOT @FLAG@.
>>>> EXECUTE.
>>>> DELETE VARIABLES @FLAG@.
>>>>
>>>> /*Solution TWO using SHIFT VALUES with LEAD function */.
>>>> SHIFT VALUES VARIABLE=ID RESULT=@nextID@ LEAD=1.
>>>> COMPUTE InvalidNull= SYSMIS(Col2) AND ID EQ @NextID@.
>>>> EXECUTE.
>>>> DELETE VARIABLES @NextID@.
>>>>
>>>> Could also use CREATE with LEAD but why bother if we have SHIFT
>>>> VALUES?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Ives, Melissa L wrote
>>>>> 3, 8, 16, 18, 22 are valid because they are the last record for the
>>>>> ID.
>>>>>
>>>>> Why is 7 a valid null?
>>>>>
>>>>> Does col1 always increase as rowcnt increases?
>>>>> Is rowcnt really a variable?
>>>>>
>>>>> *Put the last record on top.
>>>>> Sort cases by ID col1 (D).
>>>>> *alternatively.
>>>>> *Sort cases by ID rowcnt (D).
>>>>>
>>>>> *Identify the last record w/lag function.
>>>>> Compute lastrec=(Id ne lag(id).
>>>>> *identify invalid nulls if missing and not last record.
>>>>> Compute InvldNull=Missing(col2) and lastrec=0.
>>>>>
>>>>> *resort anytime.
>>>>> M
>>>>>
>>>>> -----Original Message-----
>>>>> From: SPSSX(r) Discussion [mailto:
>>>>
>>>>> SPSSX-L@.UGA
>>>>
>>>>> ] On Behalf Of
>>>>
>>>>> coxspss@
>>>>
>>>>> Sent: Tuesday, March 27, 2018 7:39 AM
>>>>> To:
>>>>
>>>>> SPSSX-L@.UGA
>>>>
>>>>> Subject: [SPSSX-L] identifying a null within a repeat ID table
>>>>>
>>>>> I have a repeating ID table in which I need to identify when a null
>>>>> in
>>>>> col2 appears either first within the ID repeating set or before the
>>>>> last repeating record-- note the last record may be preceded by a
>>>>> valid null.
>>>>> In the table below rowcnt 11, 13 & 19 are cases of an invalid null.
>>>>> I
>>>>> would create a new col as a 0,1 binary in which 1=invalid null.
>>>>> I've
>>>>> tried various lag syntaxes but not making progress.  David.
>>>>>
>>>>>
>>>>> rowcnt,ID,col1,col2
>>>>> 11197810198320
>>>>> 21198320199330
>>>>> 31199330
>>>>> 42200210200320
>>>>> 52200320200530
>>>>> 62200530200820
>>>>> 72200530
>>>>> 82200820
>>>>> 93200010200120
>>>>> 103200520200810
>>>>> 113 200520
>>>>> 123200810201110
>>>>> 133200820
>>>>> 143201110201210
>>>>> 153201210201810
>>>>> 163201810
>>>>> 174196510196820
>>>>> 184196820
>>>>> 195200010
>>>>> 205200520200520
>>>>> 215200520200610
>>>>> 22 5 200810
>>>>>
>>>>> =====================
>>>>> To manage your subscription to SPSSX-L, send a message to
>>>>
>>>>> LISTSERV@.UGA
>>>>
>>>>>  (not to SPSSX-L), with no body text except the command. To leave
>>>>> the
>>>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>>>> manage subscriptions, send the command INFO REFCARD
>>>>>
>>>>> ________________________________
>>>>>
>>>>> This correspondence contains proprietary information some or all of
>>>>> which may be legally privileged; it is for the intended recipient
>>>>> only. If you are not the intended recipient you must not use,
>>>>> disclose, distribute, copy, print, or rely on this correspondence
>>>>> and
>>>>> completely dispose of the correspondence immediately. Please notify
>>>>> the sender if you have received this email in error. NOTE: Messages
>>>>> to or from the State of Connecticut domain may be subject to the
>>>>> Freedom of Information statutes and regulations.
>>>>>
>>>>>
>>>>> =====================
>>>>> To manage your subscription to SPSSX-L, send a message to
>>>>
>>>>> LISTSERV@.UGA
>>>>
>>>>>  (not to SPSSX-L), with no body text except the command. To leave
>>>>> the
>>>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>>>> manage subscriptions, send the command INFO REFCARD
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -----
>>>> Please reply to the list and not to my personal email.
>>>> Those desiring my consulting or training services please feel free
>>>> to
>>>> email me.
>>>> ---
>>>> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
>>>> porcos ne forte conculcent eas pedibus suis."
>>>> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
>>>> cliff in abyssum?"
>>>> --
>>>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>>>>
>>>> =====================
>>>> To manage your subscription to SPSSX-L, send a message to
>>>>
>>
>>> LISTSERV@.UGA
>>
>>>  (not to SPSSX-L), with no body text except
>>>> the command. To leave the list, send the command SIGNOFF SPSSX-L For
>>>> a
>>>> list of commands to manage subscriptions, send the command INFO
>>>> REFCARD
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>
>>> LISTSERV@.UGA
>>
>>>  (not to SPSSX-L), with no body text except the command. To leave the
>>> list, send the command SIGNOFF SPSSX-L For a list of commands to
>>> manage
>>> subscriptions, send the command INFO REFCARD
>>>
>>> NOTICE TO RECIPIENT:  If you are not the intended recipient of this
>>> e-mail, you are prohibited from sharing, copying, or otherwise using
>>> or
>>> disclosing its contents.  If you have received this e-mail in error,
>>> please notify the sender immediately by reply e-mail and permanently
>>> delete this e-mail and any attachments without reading, forwarding or
>>> saving them.  Thank you.
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>
>>> LISTSERV@.UGA
>>
>>>  (not to SPSSX-L), with no body text except the
>>> command. To leave the list, send the command
>>> SIGNOFF SPSSX-L
>>> For a list of commands to manage subscriptions, send the command
>>> INFO REFCARD
>>
>>
>>
>>
>>
>> -----
>> Please reply to the list and not to my personal email.
>> Those desiring my consulting or training services please feel free to
>> email me.
>> ---
>> "Nolite dare sanctum canibus neque mittatis margaritas vestras ante
>> porcos ne forte conculcent eas pedibus suis."
>> Cum es damnatorum possederunt porcos iens ut salire off sanguinum
>> cliff in abyssum?"
>> --
>> Sent from: http://spssx-discussion.1045642.n5.nabble.com/
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>>

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except
>> the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"