SPSSX Discussion

different values in a row

17 messages

drfg2008

different values in a row

Is there a command (or syntax, or a Python program) that counts the different (numerical) values in a row?

Example:

variables: V1 V2 V3 V4
values row1: 1 2 3 4
values row2: 1 1 1 1

row1 has 4 different values: 1-4
row 2 has only one different value: 1

I tried to develop a Python program but failed. I also couldn't find a solution on Raynald's.

Thanks

Dr. Frank Gaeth

Art Kendall-2

Re: different values in a row

Try something like this untested syntax.

count n1 = v1 to v4 (1).
count n2 = v1 to v4 (2).
count n3 = v1 to v4 (3).
count n4 = v1 to v4 (4).
missing values n1 to n4 (0).
compute nvalues = nvalid(n1 to n4).
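For readers who think in Python, the COUNT idea above can be sketched roughly as follows. This is only an illustration, assuming the full list of possible values is known in advance; the function name `count_present` and the `candidates` argument are hypothetical and not part of SPSS.

```python
def count_present(row, candidates):
    """Count how many of the candidate values occur at least once in row.

    Mirrors the COUNT / NVALID idea: one counter per candidate value,
    then count the counters that are non-zero.
    """
    counts = {c: sum(1 for v in row if v == c) for c in candidates}
    return sum(1 for n in counts.values() if n > 0)


print(count_present([1, 2, 3, 4], [1, 2, 3, 4]))  # row1 from the question
print(count_present([1, 1, 1, 1], [1, 2, 3, 4]))  # row2 from the question
```

Values that are not in the candidate list are silently ignored, so this approach only suits a small, known set of values.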

Art Kendall
Social Research Consultants

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Bruce Weaver

Re: different values in a row

Administrator

In reply to this post by drfg2008

I don't have SPSS on this machine, so the following is untested, but I think it might work. You may need an EXECUTE after the loop.

numeric i1 to i4 (f1.0). /* 4 indicator variables.
recode i1 to i4 (else=0). /* initialize to 0.
vector v = v1 to v4 / i = i1 to i4.
loop # = 1 to 4.
- compute i(v(#)) = 1. /* value stored in v(#) flagged as present.
end loop.
compute unique_values = sum(i1 to i4).

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

drfg2008

Re: different values in a row

I tried your syntax, but couldn't get it to run:

*---------------- first build a file ----------------.

input program.
loop a =1 to 100 by 1.
end case.
end loop.
end file.
end input program.
exe.

comp v1 =RV.BINOM(5,0.5).
comp v2 =RV.BINOM(5,0.5).
comp v3 =RV.BINOM(5,0.5).
comp v4 =RV.BINOM(5,0.5).

EXECUTE .

*-----------your syntax with exe.------------------------.

numeric i1 to i4 (f1.0). /* 4 indicator variables.
recode i1 to i4 (else=0). /* initialize to 0.
vector v = v1 to v4 / i = i1 to i4.
loop # = 1 to 4.
- compute i(v(#)) = 1. /* value stored in v(#) flagged as present.
end loop.
EXECUTE .
compute unique_values = sum(i1 to i4).

------ This is an excerpt of the error messages:

>Warning No. 525
>An attempt was made to store a value into an element of a vector the subscript
>of which was missing or otherwise invalid. The subscript must be a positive
>integer and must not be greater than the length of the vector. No store can
>occur.
>Command line: 207 Current case: 6 Current splitfile group: 1
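
The warning can be reproduced outside SPSS. RV.BINOM(5,0.5) returns whole numbers from 0 to 5, but the indicator vector only has elements 1 to 4, so a value of 0 or 5 is not a usable subscript. The Python sketch below imitates that logic under those assumptions; the function name `unique_by_indicator` and the `rejected` list are made up for illustration and are not SPSS features.

```python
def unique_by_indicator(row, size=4):
    """Flag each value by using it as a 1-based index into an indicator list.

    Returns (count, rejected). rejected collects the values that cannot be
    used as a subscript, which is the situation warning 525 reports.
    Assumption of this sketch: only whole numbers 1..size are usable.
    """
    indicators = [0] * size
    rejected = []
    for value in row:
        usable = float(value).is_integer() and 1 <= value <= size
        if usable:
            indicators[int(value) - 1] = 1  # value flagged as present
        else:
            rejected.append(value)  # no store can occur
    return sum(indicators), rejected


print(unique_by_indicator([1, 2, 3, 4]))
print(unique_by_indicator([0, 5, 2, 2]))
```

With the second row, only the value 2 is flagged, so the count understates the number of different values.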

Dr. Frank Gaeth

drfg2008

Re: different values in a row

In reply to this post by drfg2008

Thanks, Art, for your message. However, COUNT seems to be a problem if you have numbers like 1,23456 ; 1,234567
(real numbers) or a wide range, let's say from 1 to a few billion.

Frank

Dr. Frank Gaeth

Jon K Peck

Re: different values in a row

In reply to this post by Bruce Weaver

Clever, but it only works with values in the 1-4 range. Try, for example,
1,2,99,12

Here's a Python solution. I used the SPSSINC TRANS extension command, but you could pass the data explicitly instead.

* define a counting function.
begin program.
import spss
def countthem(*args):
    # A set keeps one copy of each value, so its length is the number of distinct values.
    return len(set(args))
end program.

* Run it over the data, storing the result in variable uniquecount.
spssinc trans result=uniquecount
/formula "countthem(v1,v2,v3,v4)".

Note that any sysmis values will contribute to the count. The countthem function could be modified to ignore those.
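
A minimal sketch of such a modification, assuming system-missing values reach the function as `None` (that is how the SPSS Python plug-in generally represents them, but check this in your version). The name `countthem_valid` is hypothetical.

```python
def countthem_valid(*args):
    """Count distinct values among args, skipping missing (None) entries."""
    return len({value for value in args if value is not None})


print(countthem_valid(1, 2, 99, 12))
print(countthem_valid(1, None, 1, None))
```

If that assumption holds, the formula string would become "countthem_valid(v1,v2,v3,v4)" once the function is defined inside a BEGIN PROGRAM block like the one above.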

HTH,

Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435

From: Bruce Weaver <[hidden email]>
To: [hidden email]
Date: 03/06/2011 07:21 AM
Subject: Re: [SPSSX-L] different values in a row
Sent by: "SPSSX(r) Discussion" <[hidden email]>

I don't have SPSS on this machine, so the following is untested, but I think it might work. You may need an EXECUTE after the loop.

numeric i1 to i4 (f1.0). /* 4 indicator variables.
recode i1 to i4 (else=0). /* initialize to 0.
vector v = v1 to v4 / i = i1 to i4.
loop # = 1 to 4.
- compute i(v(#)) = 1. /* value stored in v(#) flagged as present.
end loop.
compute unique_values = sum(i1 to i4).

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/different-values-in-a-row-tp3411280p3411438.html

Bruce Weaver

Re: different values in a row

Administrator

Something about that solution was troubling me, and Jon put his finger on it. Thanks Jon. ;-)

Bruce Weaver

Re: different values in a row

Administrator

In reply to this post by drfg2008

As Jon pointed out, that syntax works only for whole-number values 1-4 (the values shown in the original example).

Bruce Weaver

Re: different values in a row

Administrator

Here is a non-Python (untested) solution [1] that should work quite well, provided the number of variables is not too large. It assumes that any missing values are SYSMIS.

* UVC = unique value count .
recode v1 to v4 (sysmis=9999). /* or some other user-defined value .
compute uvc = 1.
compute uvc = uvc + (v2 NE v1) .
compute uvc = uvc + ((v3 NE v1) and (v3 NE v2)).
compute uvc = uvc + ((v4 NE v1) and (v4 NE v2) and (v4 NE v3)).
missing values v1 to v4 (9999).

This will count SYSMIS as one of the possible unique values. If you don't want to include SYSMIS, then correct UVC by subtraction. E.g.,

compute uvc = uvc - (nmiss(v1 to v4) GT 0).

One could possibly stick this basic idea into a macro that would make it more feasible for a large number of variables.
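
To show how the same pairwise idea scales to any number of variables, here is a small Python sketch (not SPSS syntax, and not a macro). It adds 1 for every value that differs from all values to its left. The function name `unique_value_count` is hypothetical; a missing value is represented as `None` and, like the 9999 recode above, counts as one value.

```python
def unique_value_count(row):
    """Add 1 for every value that differs from all values to its left."""
    uvc = 0
    for k, value in enumerate(row):
        # The first value has nothing to its left, so it always counts.
        if all(value != earlier for earlier in row[:k]):
            uvc += 1
    return uvc


print(unique_value_count([1, 2, 3, 4]))
print(unique_value_count([1, 2, 3, 3]))
```

The number of comparisons grows with the square of the number of variables, which is why the approach gets unwieldy when written out by hand.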

[1] As I often point out, many of the academic SPSS users I know will probably *never* install and use Python; therefore, I always try to offer solutions that will run in native SPSS code. This should not be taken as any slight against Python. It's just being realistic, IMO.

David Marso

Re: different values in a row

Administrator

In reply to this post by drfg2008

Simple!
Restructure the data from wide to long, retaining an identifier for the row (RowID) and pushing all relevant columns into a single column (Var).
AGGREGATE breaking on RowID and Var, using the N function.
AGGREGATE again, breaking on RowID only, and use N.
MATCH to the original file.
Done.
Say the file has variables V1 TO V100.
COMPUTE RowID=$CASENUM.
SAVE OUTFILE "origData.sav".

VECTOR V=V1 TO V100.
LOOP #=1 TO 100.
COMPUTE Var=V(#).
XSAVE OUTFILE "temp.sav" / KEEP RowID Var.
END LOOP.
EXE.

GET FILE "temp.sav".
AGGREGATE OUTFILE * / BREAK RowID Var /N=N.
AGGREGATE OUTFILE * / BREAK RowID /Unique=N.
MATCH FILES / FILE "origData.sav" / FILE * / BY RowID.
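
The restructure-then-aggregate logic can also be traced in a few lines of plain Python. This is only a sketch of the idea, assuming each case is held as a list of values; the function name `unique_counts_long` is hypothetical, and `collections.Counter` stands in for the two AGGREGATE steps.

```python
from collections import Counter


def unique_counts_long(rows):
    """rows: one list of values per case. Returns {RowID: unique value count}."""
    # Step 1: wide to long, one (RowID, Var) record per cell.
    long_records = [
        (row_id, value)
        for row_id, row in enumerate(rows, start=1)
        for value in row
    ]
    # Step 2: first AGGREGATE, breaking on RowID and Var (N = cell frequency).
    pair_counts = Counter(long_records)
    # Step 3: second AGGREGATE, breaking on RowID (N = number of distinct pairs).
    unique_per_row = Counter(row_id for row_id, _ in pair_counts)
    return dict(unique_per_row)


print(unique_counts_long([[1, 2, 3, 4], [1, 1, 1, 1]]))
```

Because the values themselves serve as break keys, real numbers and very large values need no special handling.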

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

David Marso

Re: different values in a row

Administrator

In reply to this post by Bruce Weaver

Looks unwieldy with even a few variables.
See my post using Restructure and AGGREGATE.
"I always try to offer solutions that will run in native SPSS code. "
I concur. I have an OLD (11.5) version of SPSS and can't afford to upgrade, so you will find my solutions will work in ANY version of SPSS (except PC+) back to version 4 on a mainframe/Unix box/Ancient Mac -any version which supports VECTOR/LOOP XSAVE etc.
I even tend to use old skool restructure rather than CASESTOVARS or VARSTOCASES.
I am typing this on my Mac and my SPSS is on the Windows partition. I can't remember the VARSTOCASES syntax off the top of my head but the VECTOR/LOOP/XSAVE is a permanent part of my neural wiring at this point ;-)

martus

Re: different values in a row

In reply to this post by drfg2008

I only have a crude concept, not fully worked out:
For one case you may use the FLIP command to get the values
in one column, named variable. Then you sort and write the command
if (lag(variable) = variable) indicator = 1.
You delete all cases with indicator = 1; the number of remaining cases equals
the number of different values.
This solution works for an arbitrary number of values per row, but only for
one case in the original file. Maybe it's possible to extend it to an arbitrary number of
cases in the original file by including a loop on the variables in
the flipped file.
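
For a single row, the sort-and-lag idea might look like this in Python. It is only a sketch, assuming the row has no missing values (they would not sort alongside numbers); the function name `unique_by_sort_and_lag` is hypothetical.

```python
def unique_by_sort_and_lag(row):
    """Sort the values, then drop every value equal to its predecessor."""
    ordered = sorted(row)
    kept = [
        value
        for position, value in enumerate(ordered)
        # Keep the first value and any value that differs from the one before it.
        if position == 0 or value != ordered[position - 1]
    ]
    return len(kept)


print(unique_by_sort_and_lag([3, 1, 3, 2]))
```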
Peter

David Marso

Re: different values in a row

Administrator

See my previously posted solution. The first piece is essentially a FLIP BY ID.
AGGREGATE takes care of the rest.
drfg2008

Re: different values in a row

In reply to this post by David Marso

Thanks everybody!

I understood the syntax David provided best.

-> AGGREGATE. I should have known.

Thank you.

Here is the complete syntax (with 4 randomly generated variables) in case someone needs it.

input program.
loop a =1 to 100 by 1.
end case.
end loop.
end file.
end input program.
exe.

comp v1 =RV.BINOM(5,0.5).
comp v2 =RV.BINOM(5,0.5).
comp v3 =RV.BINOM(5,0.5).
comp v4 =RV.BINOM(5,0.5).

EXECUTE .

*say the file has variables V1 TO V4.
COMPUTE RowID=$CASENUM.
SAVE OUTFILE "C:\<path>\differentValues.sav".

VECTOR V=V1 TO V4.
LOOP #=1 TO 4.
COMPUTE Var=V(#).
XSAVE OUTFILE "temp.sav" / KEEP RowID Var.
END LOOP.
EXE.

GET FILE "temp.sav".
AGGREGATE OUTFILE * / BREAK RowID Var /N=N.
AGGREGATE OUTFILE * / BREAK RowID /Unique=N.
MATCH FILES / FILE "C:\<path>\differentValues.sav" / FILE * / BY RowID.
EXECUTE .

Dr. Frank Gaeth

drfg2008

Re: different values in a row

By the way, here is my Python version of David Marso's fantastic (simple and effective) solution. The advantage of Python is that it determines the number and names of the variables automatically.

*testsample: --------.

input program.
loop a =1 to 100 by 1.
end case.
end loop.
end file.
end input program.
exe.

comp v1 =RV.BINOM(5,0.5).
comp v2 =RV.BINOM(5,0.5).
comp v3 =RV.BINOM(5,0.5).
comp v4 =RV.BINOM(5,0.5).
comp v5 =RV.BINOM(5,0.5).
comp v6 =RV.BINOM(5,0.5).
comp v7 =RV.BINOM(5,0.5).
comp v8 =RV.BINOM(5,0.5).

EXECUTE .
DELETE VARIABLES a.

* so here comes the python version: ------------.

COMPUTE RowID=$CASENUM.
SAVE OUTFILE "C:\differentValues.sav".

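begin program.
import spss
# RowID is the last variable in the file, so leave it out of the count.
FileN=spss.GetVariableCount()-1
# Names of the first and last data variables (indexes are 0-based).
varName_1 = spss.GetVariableName(0)
varName_n = spss.GetVariableName(spss.GetVariableCount()-2)

# Build the VECTOR/LOOP/XSAVE block from those names.
spss.Submit("VECTOR V=" + varName_1 +" to " + varName_n + ".")
spss.Submit("LOOP #=1 TO "+ str(FileN) + ".")
spss.Submit(r"""COMPUTE Var=V(#).
XSAVE OUTFILE "temp.sav" / KEEP RowID Var.
END LOOP.
EXE.""")
end program.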

GET FILE "temp.sav".
AGGREGATE OUTFILE * / BREAK RowID Var /N=N.
AGGREGATE OUTFILE * / BREAK RowID /Unique=N.
MATCH FILES / FILE "C:\differentValues.sav" / FILE * / BY RowID.
EXECUTE .
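
The string-building part of the program above can be tried without SPSS installed. The sketch below only assembles the syntax text from a list of variable names; the function name `build_restructure_syntax` and its parameters are hypothetical, and it assumes the variables are adjacent in the file, as `VECTOR ... TO` requires. Inside a BEGIN PROGRAM block, the returned string would be passed to spss.Submit.

```python
def build_restructure_syntax(var_names, id_var="RowID", out_file="temp.sav"):
    """Return VECTOR/LOOP/XSAVE syntax for a contiguous list of variables."""
    if not var_names:
        raise ValueError("need at least one variable name")
    lines = [
        "VECTOR V=%s TO %s." % (var_names[0], var_names[-1]),
        "LOOP #=1 TO %d." % len(var_names),
        "COMPUTE Var=V(#).",
        'XSAVE OUTFILE "%s" / KEEP %s Var.' % (out_file, id_var),
        "END LOOP.",
        "EXE.",
    ]
    return "\n".join(lines)


print(build_restructure_syntax(["v%d" % i for i in range(1, 9)]))
```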

Dr. Frank Gaeth

Bruce Weaver

Re: different values in a row

Administrator

In reply to this post by David Marso

Yes, that's more like it. (If I post something unwieldy enough, it always provokes David to post a much better solution.) ;-)

David Marso

Re: different values in a row

Administrator

ROFL !!! I sometimes provoke myself as well ;-)
Case in point:
Use VARSTOCASES to go from wide to long, eliminating the external file (for which I omitted the ERASE command in my previous solution). ;-)
*I should maybe call all my temp files for posted code "C:\dmmtemp" and every once in a while post random code which has ERASE "C:\dmmtemp" somewhere in the mix ;-)))*
I.e., substitute an appropriate V2C for the VECTOR/LOOP/XSAVE biz.
Use MODE=ADDVARIABLES in the AGGREGATE(s) to remove the need for the MATCH.
Leaving this as an exercise for those who wish to pursue it.
