Count unique values

Count unique values

Riya
Hey Everyone,

Is there a way to calculate the number of unique values in a variable? I have 100 variables in a data set, and for each of them I want to know the number of unique values. For example, a variable named VAR0001 has 2500 records with values ranging from 1 to 5, so 5 is the highest scale point. I want this automated, without running frequencies for each variable, and the results saved in a data set.

I know this can be accomplished using the AGGREGATE command. However, I am curious whether it can be done using a COMPUTE statement.

Thanks in advance!

Thanks and Regards
Riya Mehra
Re: Count unique values

David Marso
Administrator
Well, COMPUTE does not operate ACROSS cases so...
Since this is a small data set:
No actual code (outline)... I am up to my *$$ in alligators at the moment!
VARSTOCASES...(build varval from all variables and index as the variable name) .
AGGREGATE  (BREAK index varval /N as function).
AGGREGATE BREAK index /N=N....

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Count unique values

Bruce Weaver
Administrator
In reply to this post by Riya
I haven't thought about it for very long, but I think you'd have to run AGGREGATE 100 times, with each of the 100 variables as the break variable. So that may not be a desirable solution, even if you wanted to use AGGREGATE.

I don't understand why you're ruling out the use of FREQUENCIES in conjunction with OMS. I think that's how I'd tackle it.

Another possible approach would be to FLIP the file (or the variables of interest), and then use COUNT on the flipped file.  I haven't investigated this one far enough to know if it has legs.

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: Count unique values

Maguin, Eugene
In reply to this post by David Marso
I wonder if this might be a good place to use FLIP. David and others are particularly proficient programmers and maybe they can do it with COMPUTE, but I'd bet there isn't a way. The problem is that you need to maintain a 'list', aka a vector, of already identified values against which you check the value of the current case. A new value gets added to the list. When you've cycled through all the cases for a specific variable, you can count the number of unique values in the vector for that variable. If FLIP didn't work out, then I'd look at MATRIX functions, because I suspect MATRIX will allow you to build that vector, count its elements, and store that count in another vector for printing after you've gone through the variable list.
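One way around keeping a vector of already-seen values, offered here only as a sketch and not as something anyone in the thread proposed: sort the file on the variable first, so that a value is new exactly when it differs from the previous case. LAG supplies the previous case's value inside COMPUTE, and LEAVE carries a running count from case to case. The toy data and the names newval and nuniq are made up for illustration, and the sketch assumes a single numeric variable with no missing values.

```spss
* Toy data for one variable on a 1-5 scale, values invented for illustration.
DATA LIST FREE / var0001.
BEGIN DATA
3 1 5 3 2 1 5 5
END DATA.

* Sorting puts equal values next to each other.
SORT CASES BY var0001.

* Flag a case when it is the first case or its value differs from the previous case.
COMPUTE newval = ($CASENUM = 1 OR var0001 NE LAG(var0001)).

* Running count of the flags, where LEAVE starts nuniq at 0 and keeps it across cases.
COMPUTE nuniq = nuniq + newval.
LEAVE nuniq.
EXECUTE.
```

After this runs, nuniq on the last case should hold the number of distinct values (4 for the toy data, which contains 1, 2, 3 and 5). Each variable needs its own sort, so for 100 variables this is far more work than restructuring the file once.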

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Wednesday, March 13, 2013 3:03 PM
To: [hidden email]
Subject: Re: Count unique values

Re: Count unique values

David Marso
Administrator
;-))  Too many notes (Bruce and Gene).
OK, here is actual code tested using GSS 93.
If you have a mix of numeric and string you will need to adapt accordingly!
No real programming here, simply butcher the beast and pound away ;-)
This would likely work with larger datasets as well where a FLIP would be a feline-butt-trophy.
--
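* Work on a copy so the original file stays in wide format.
DATASET COPY datacpy.
DATASET ACTIVATE datacpy.
* Stack every variable so that VarVal holds the value and VarIndex the source variable name.
VARSTOCASES  /ID=id  /MAKE VarVal FROM ALL  /INDEX=VarIndex(VarVal)  /NULL=KEEP.
DATASET DECLARE agg.
DATASET DECLARE agg2.
* First pass gives one row per distinct variable-value pair.
AGGREGATE OUTFILE agg / BREAK VarIndex VarVal / NVal=N.
* Second pass must read agg, so that N counts the distinct values per variable.
DATASET ACTIVATE agg.
AGGREGATE OUTFILE agg2 / BREAK VarIndex / NVal=N.
DATASET ACTIVATE agg2.
LIST.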




VarIndex    NVal

adults         7
age           73
agecat4        4
aged           6
agewed        38
attsprts       4
babies         5
bigband        7
birthmo       13
blues          7
blues3         4
blugrass       7
cappun         4
carsfam        8
carsgen        8
chemfam        8
chemgen        8
childs        10
chldidel      11
classic3       4
classicl       7
cohort        73
country        7
degree         7
degree2        4
drink          3
dwelown        5
educ          20
fework         5
filter_$       3
folk           7
grass          5
grntest1       7
grntest2       7
grntest3       7
grntest4       7
gunlaw         5
hvymetal       7
hvymetl3       4
id          1500
income4        4
income91      24
jazz           7
jazz3          4
letdie1        5
life           6
madeg          8
marital        6
married        3
musicals       7
natcity        6
natcrime       6
natdrug        6
nateduc        6
natenvir       6
natheal        6
news           7
opera          7
opera3         4
padeg          8
partners      10
partyid        9
pillok         7
politics       6
polviews       9
preteen        6
race           3
rap            7
rap3           4
region        10
region4        5
relig          7
rincom91      25
sathobby      10
scitest4       7
sei          229
sex            2
sexeduc        5
sexfreq        9
sexfreq5       6
sibs          23
size          76
spanking       7
srcbelt        7
teens          4
tvhours       17
tvnews         7
tvpbs          7
tvshows        7
visitart       4
vote92         6
waterfam       8
watergen       8
wrkstat        8
xnorcsiz      11
zodiac        13


Number of cases read:  96    Number of cases listed:  96
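
To check the logic on something small, the same restructure-then-aggregate-twice idea can be run on a toy file. This is only a sketch: the data values and the names v1 to v3, pairs, uniq, NCases and NUnique are invented for illustration, and all variables are assumed to be numeric.

```spss
* Toy data with three numeric variables and five cases, values invented for illustration.
DATA LIST FREE / v1 v2 v3.
BEGIN DATA
1 1 10
2 1 20
2 1 30
3 2 30
3 2 30
END DATA.

* Long format with one row per case and variable, VarIndex holding the variable name.
VARSTOCASES /MAKE VarVal FROM v1 v2 v3 /INDEX=VarIndex(VarVal) /NULL=KEEP.

* First pass gives one row per distinct pair of variable and value.
DATASET DECLARE pairs.
AGGREGATE /OUTFILE='pairs' /BREAK=VarIndex VarVal /NCases=N.

* Second pass counts those rows per variable, which is the number of unique values.
DATASET ACTIVATE pairs.
DATASET DECLARE uniq.
AGGREGATE /OUTFILE='uniq' /BREAK=VarIndex /NUnique=N.
DATASET ACTIVATE uniq.
LIST.
```

For these data the listing should show NUnique = 3 for v1, 2 for v2 and 3 for v3. Because /NULL=KEEP retains missing values, a variable with system-missing cases would get one extra row counted for the missing category.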
Re: Count unique values

Riya
Thank you David for this donated code :) It worked like a charm. Thank you Bruce and Gene as well for showing me the way.