Macro or Python code to determine if there is 3 missing values in a row for observation

Macro or Python code to determine if there is 3 missing values in a row for observation

dmracek23
Hello,

I have been searching through the listserv and other resources to try to figure out a way to determine when a case "drops out" (i.e., has missing values on 3 or more adjacent variables).

Does anyone have any ideas or could point me to how this could be done? I explored how to do this but couldn't come up with a good approach.

data list free
  / v1 to v6.
begin data.
1 2 3 4 5 6
1 2 3 4 5 . 
1 2 3 4 . .  
1 2 3 . . . 
1 2 . . . . 
1 . . . . .
. . . . . . 
end data.

* This is somewhat conceptually similar to the following.

count z1=v1 to v3 (MISSING).
count z2=v2 to v4 (MISSING).
count z3=v3 to v5 (MISSING).
count z4=v4 to v6 (MISSING).
execute.

* It seems like creating a looping variable with a count function would be on the right track but it doesn't seem to do what I want it to.

vector v = v1 to v6.
loop # = 1 to 6.
count z[#] = v[#] to v[#+2] (MISSING).
end loop.
EXECUTE.

* Then perhaps do something along these lines to determine where in the vector or array (which element/variable) the individual drops out.

COMPUTE MaxValue=MAX(z1 TO z4).
COMPUTE MaxCount=0.

VECTOR VectorVar=z1 TO z4.
LOOP #cnt=4 to 1 BY -1.
DO IF MaxValue=VectorVar(#cnt).
COMPUTE MaxVar=#cnt.
COMPUTE MaxCount=MaxCount+1.
END IF.
END LOOP.
EXECUTE.

* Perhaps I would be better off in Python -- for example, integrating the spss.SetMacroValue function with the following program and going from there.

BEGIN PROGRAM.
import spss
beg = 'v1'
end = 'v6'

# Collect the variable names in file order.
MyVars = []
for i in range(spss.GetVariableCount()):
  x = spss.GetVariableName(i)
  MyVars.append(x)

# Number of variables from beg through end, inclusive.
n_vars = MyVars.index(end) - MyVars.index(beg) + 1
print(n_vars)
END PROGRAM.

Any ideas or help would be greatly appreciated.

Thanks,

Derek Mracek

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Re: Macro or Python code to determine if there is 3 missing values in a row for observation

Rich Ulrich
I look at what you have listed after "begin data" and I see that
you want to name the location just before 3 consecutive dots.

So, convert your set of variables to a string (format 6F1), and get
the index (minus 1) of the first run of 3 periods.  The case is presumed
not to show a drop-out if the "..." is not found.
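
For anyone who wants to try the string idea outside SPSS, here is a minimal Python sketch. It assumes each case is a list with None standing in for a missing value; the name first_missing_run and the one-character-per-variable encoding are made up for illustration and are not from the post above.

```python
def first_missing_run(row, run_length=3):
    """Return the 1-based position where the first run of `run_length`
    missing values starts, or 0 if the row has no such run."""
    # Encode each value as one character: "." for missing, "x" for observed.
    pattern = "".join("." if value is None else "x" for value in row)
    # str.find returns -1 when the substring is absent, so adding 1 yields
    # a 1-based position, with 0 meaning "not found".
    return pattern.find("." * run_length) + 1


rows = [
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, None],
    [1, 2, 3, 4, None, None],
    [1, 2, 3, None, None, None],
    [1, 2, None, None, None, None],
    [1, None, None, None, None, None],
    [None, None, None, None, None, None],
]
for row in rows:
    print(row, first_missing_run(row))
```

Encoding every value as a single character sidesteps the question of how wide each number prints, which is the job the F1 format does in the SPSS version.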

--
Rich Ulrich

Date: Wed, 21 Oct 2015 13:56:52 -0400
From: [hidden email]
Subject: Macro or Python code to determine if there is 3 missing values in a row for observation
To: [hidden email]

Re: Macro or Python code to determine if there is 3 missing values in a row for observation

dmracek23
Hi Rich,

Thanks for your thoughts. I was hoping to program at what point (which variable) each case dropped off (operationalized as three missing in a row) and less so whether they dropped out or not.

Derek

On Wed, Oct 21, 2015 at 2:25 PM, Rich Ulrich <[hidden email]> wrote:
I look at what you have listed after "begin data" and I see that
you want to name the location just before 3 consecutive dots.

So, convert your set of variables to a string (format 6F1), and get
the index (minus 1) of the first run of 3 periods.  The case is presumed
not to show a drop-out if the "..." is not found.

--
Rich Ulrich

Re: Macro or Python code to determine if there is 3 missing values in a row for observation

Bruce Weaver
Administrator
In reply to this post by dmracek23
Hi Derek.  I don't immediately see why your COUNT command is not working.  But you could use COMPUTE with the NMISS function instead--that seems to work okay, as shown below.

* Following Derek's DATA LIST command, try this.
NUMERIC z1 to z4(F1).
VECTOR v = v1 to v6 / z = z1 to z4.
LOOP # = 1 to 4. /* NOTE: 1 to 4, not 1 to 6.
*count z(#) = v(#) to v(#+2) (MISSING). /* Does not work for some reason.
COMPUTE z(#) = NMISS(v(#),v(#+1),v(#+2)).
END LOOP.
COMPUTE DropCase = ANY(3,z1 to z4).
FORMATS DropCase (F1).
LIST.

OUTPUT:
      v1       v2       v3       v4       v5       v6 z1 z2 z3 z4 DropCase
 
    1.00     2.00     3.00     4.00     5.00     6.00  0  0  0  0     0
    1.00     2.00     3.00     4.00     5.00      .    0  0  0  1     0
    1.00     2.00     3.00     4.00      .        .    0  0  1  2     0
    1.00     2.00     3.00      .        .        .    0  1  2  3     1
    1.00     2.00      .        .        .        .    1  2  3  3     1
    1.00      .        .        .        .        .    2  3  3  3     1
     .        .        .        .        .        .    3  3  3  3     1
 
Number of cases read:  7    Number of cases listed:  7

HTH.
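
The sliding-window count can also be sketched in plain Python, which may make the logic easier to follow. This is only an illustration under the assumption that missing values are represented by None; window_missing_counts and dropped_out are hypothetical helper names, not SPSS functions.

```python
def window_missing_counts(row, width=3):
    """Count missing values in every window of `width` adjacent values.

    For six variables and width 3 this gives four counts, matching
    z1 to z4 in the syntax above.
    """
    return [
        sum(value is None for value in row[start:start + width])
        for start in range(len(row) - width + 1)
    ]


def dropped_out(row, width=3):
    """True if any window consists entirely of missing values."""
    return width in window_missing_counts(row, width)


row = [1, 2, 3, None, None, None]
print(window_missing_counts(row), dropped_out(row))
```

The range stops at len(row) - width + 1 for the same reason the LOOP runs from 1 to 4 rather than 1 to 6: a window starting any later would run past the last variable.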


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: Macro or Python code to determine if there is 3 missing values in a row for observation

Bruce Weaver
Administrator
Again, without the z1 to z4 variables:

VECTOR v = v1 to v6.
LOOP # = 1 to 4.
- COMPUTE DropCase = NMISS(v(#),v(#+1),v(#+2)) EQ 3.
END LOOP if DropCase.
FORMATS v1 to v6 DropCase (F1).
LIST.

OUTPUT:

v1 v2 v3 v4 v5 v6 DropCase
 
 1  2  3  4  5  6     0
 1  2  3  4  5  .     0
 1  2  3  4  .  .     0
 1  2  3  .  .  .     1
 1  2  .  .  .  .     1
 1  .  .  .  .  .     1
 .  .  .  .  .  .     1
 
Number of cases read:  7    Number of cases listed:  7

Re: Macro or Python code to determine if there is 3 missing values in a row for observation

David Marso
Administrator
In reply to this post by dmracek23

data list free  / v1 to v6.
begin data.
1 2 3 4 5 6
1 2 3 4 5 .
1 2 3 4 . .  
1 2 3 . . .
1 2 . . . .
1 . . . . .
. . . . . .
end data.
VECTOR v=v1 TO v6.
LOOP #=1 TO 4.
COMPUTE Miss3=NMISS(v(#),v(#+1),v(#+2)) EQ 3.
IF Miss3 MISS3=#.
END LOOP IF Miss3 GT 0.
LIST.
      v1       v2       v3       v4       v5       v6    Miss3
 
    1.00     2.00     3.00     4.00     5.00     6.00      .00
    1.00     2.00     3.00     4.00     5.00      .        .00
    1.00     2.00     3.00     4.00      .        .        .00
    1.00     2.00     3.00      .        .        .       4.00
    1.00     2.00      .        .        .        .       3.00
    1.00      .        .        .        .        .       2.00
     .        .        .        .        .        .       1.00
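
A rough Python equivalent of the loop above, for readers who want to check the logic away from SPSS. It assumes missing values are None, and the function name miss3_start is hypothetical. Like Miss3, it returns the 1-based index of the first variable in the first run of three missing values, or 0 when there is no such run.

```python
def miss3_start(row, width=3):
    """Return the 1-based index where the first all-missing window starts,
    or 0 if no window of `width` adjacent values is entirely missing."""
    for start in range(len(row) - width + 1):
        window = row[start:start + width]
        if all(value is None for value in window):
            # Stop at the first hit, like END LOOP IF Miss3 GT 0.
            return start + 1
    return 0


rows = [
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, None],
    [1, 2, 3, 4, None, None],
    [1, 2, 3, None, None, None],
    [1, 2, None, None, None, None],
    [1, None, None, None, None, None],
    [None, None, None, None, None, None],
]
print([miss3_start(row) for row in rows])
```

Returning on the first match means a case with a later second gap is still reported at its first run of three.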



Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Macro or Python code to determine if there is 3 missing values in a row for observation

Rich Ulrich
In reply to this post by dmracek23
Derek,

The index function tells you where the sequence starts. "Minus one"
was my terse reference to the fact that where the periods start is
the first of three Missings, so you just need to subtract one from that number.
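
To make the "minus one" step concrete, here is a small Python sketch that turns the start of the run into the name of the last observed variable. It assumes None marks a missing value; last_observed_variable and its (dropped, name) return shape are invented for this example.

```python
def last_observed_variable(names, row, run_length=3):
    """Return (dropped, name): whether the case dropped out, and the name
    of the variable just before the first run of missing values."""
    # "." marks a missing value, "x" an observed one.
    pattern = "".join("." if value is None else "x" for value in row)
    # 1-based start of the first run; 0 when the run is not found.
    start = pattern.find("." * run_length) + 1
    if start == 0:
        return (False, None)   # no drop-out
    if start == 1:
        return (True, None)    # dropped out before the first variable
    # start - 1 is the 1-based position just before the run,
    # so start - 2 is its 0-based list index.
    return (True, names[start - 2])


names = ["v1", "v2", "v3", "v4", "v5", "v6"]
print(last_observed_variable(names, [1, 2, 3, None, None, None]))
```

The one case that needs care is a run starting at the first variable: subtracting one gives 0, which points at no variable at all, so it is reported separately from "no drop-out".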

--
Rich Ulrich


Date: Wed, 21 Oct 2015 14:40:52 -0400
From: [hidden email]
Subject: Re: Macro or Python code to determine if there is 3 missing values in a row for observation
To: [hidden email]

Hi Rich,

Thanks for your thoughts. I was hoping to program at what point (which variable) each case dropped off (operationalized as three missing in a row) and less so whether they dropped out or not.

Derek

Re: Macro or Python code to determine if there is 3 missing values in a row for observation

dmracek23
Thanks for the help everyone.

Derek

On Wed, Oct 21, 2015 at 9:36 PM, Rich Ulrich <[hidden email]> wrote:
Derek,

The index function tells you where the sequence starts. "Minus one"
was my terse reference to the fact that where the periods start is
the first of three Missings, so you just need to subtract one from that number.

--
Rich Ulrich

