Re: 10 most frequent occurring values of a multiple response set (REVISITED )


Edward Boadi
Thanks Art,
Your syntax assumes that we know what the top 2 values are (1,4).
Now suppose we do not know beforehand what the top 2 values are.
I want to be able to get the same results without knowing beforehand what the top 2 (or top 4, 8, 25, etc.) are.

For a large data set, running an intermediate step and copying the top 10 or 25 values (whatever the case may be)
into "recode z1 to z4 (1,4=copy) (else=0) into newz1 to newz4" could be a lot of pain.
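
To make the request concrete, here is a minimal sketch in Python (not SPSS syntax) of the logic being asked for: pool the values of z1 to z4, find the n most frequent, and blank out everything else. The function names top_values and keep_only are hypothetical, None stands in for sysmis, and breaking ties by the smaller value is an assumption the thread does not specify.

```python
from collections import Counter


def top_values(rows, n):
    """Return the n most frequent values pooled across all columns of rows."""
    counts = Counter(v for row in rows for v in row if v is not None)
    # Assumption: ties are broken by the smaller value so the result is stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [value for value, _ in ranked[:n]]


def keep_only(rows, keep):
    """Replace every value not in keep with None (standing in for sysmis)."""
    keep = set(keep)
    return [[v if v in keep else None for v in row] for row in rows]


# z1..z4 for the 17 sample cases posted in this thread.
z_rows = [
    [1, 3, 3, 5], [4, 1, 4, 3], [4, 5, 9, 1], [5, 2, 5, 4],
    [5, 1, 3, 1], [2, 2, 2, 5], [1, 2, 1, 1], [9, 4, 1, 1],
    [2, 4, 5, 1], [1, 5, 1, 1], [2, 4, 4, 4], [2, 9, 4, 4],
    [5, 1, 2, 3], [1, 1, 9, 2], [5, 1, 1, 3], [5, 1, 2, 1],
    [1, 2, 4, 4],
]

top = top_values(z_rows, 2)       # no values typed in by hand
recoded = keep_only(z_rows, top)  # same shape as z_rows, others set to None
```

For the sample data this picks 1 and 4, the same two values named in the thread, and changing the 2 to 10 or 25 needs no other edits.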

Thanks for your help; I look forward to hearing from you.

Edward.


-----Original Message-----
From: Art Kendall [mailto:[hidden email]]
Sent: Friday, August 04, 2006 7:47 AM
To: Edward Boadi
Cc: [hidden email]
Subject: Re: 10 most frequent occurring values of a multiple response
set (REVISITED )


Is this what you need?

DATA LIST FREE/x y  z1 z2 z3 z4.
BEGIN DATA
2       1       1       3       3       5
1       2       4       1       4       3
1       3       4       5       9       1
1       4       5       2       5       4
2       2       5       1       3       1
1       3       2       2       2       5
1       2       1       2       1       1
1       1       9       4       1       1
1       1       2       4       5       1
1       3       1       5       1       1
1       1       2       4       4       4
1       2       2       9       4       4
2       4       5       1       2       3
1       1       1       1       9       2
1       2       5       1       1       3
1       4       5       1       2       1
1       3       1       2       4       4
END DATA.

SAVE OUTFILE  ='C:\Temp\originaldata.sav'.

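* Frequencies of each value pooled across z1 to z4.
MULT RESPONSE
  GROUPS=$z 'the 4 z variables' (z1 z2 z3 z4 (1,9))
  /FREQUENCIES=$z  .

* Keep the top 2 values (1 and 4) and set everything else to 0.
recode z1 to z4 (1,4=copy) (else=0) into newz1 to newz4.

* Tabulate again. The 0s fall outside the counted range (1,9).
MULT RESPONSE
  GROUPS=$newz 'the 4 z variables' (newz1 newz2 newz3 newz4 (1,9))
  /FREQUENCIES=$newz  .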
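
The only hand-typed part of the syntax above is the value list (1,4) in the recode command. As a rough sketch, assuming the top values have already been found by some other step, the command text could be generated rather than typed. The helper build_recode and its parameter names are hypothetical and written in Python, not SPSS syntax.

```python
def build_recode(values, first="z1", last="z4",
                 new_first="newz1", new_last="newz4"):
    """Build a recode command that copies the given values and zeroes the rest."""
    # Join the values the way the command expects them, e.g. "1,4".
    value_list = ",".join(str(v) for v in values)
    return (f"recode {first} to {last} ({value_list}=copy) (else=0) "
            f"into {new_first} to {new_last}.")


# Reproduces the command used above from a list of top values.
command = build_recode([1, 4])
print(command)
```

The generated line could then be pasted into a syntax window, or written to a file and run from there, so a top-10 or top-25 list would not have to be copied by hand.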


Art

Edward Boadi wrote:

>OK, Beadle and List.
>Let's consider this data file:
>
>DATA LIST FREE/x y  z1 z2 z3 z4.
>BEGIN DATA
>2       1       1       3       3       5
>1       2       4       1       4       3
>1       3       4       5       9       1
>1       4       5       2       5       4
>2       2       5       1       3       1
>1       3       2       2       2       5
>1       2       1       2       1       1
>1       1       9       4       1       1
>1       1       2       4       5       1
>1       3       1       5       1       1
>1       1       2       4       4       4
>1       2       2       9       4       4
>2       4       5       1       2       3
>1       1       1       1       9       2
>1       2       5       1       1       3
>1       4       5       1       2       1
>1       3       1       2       4       4
>END DATA.
>
>SAVE OUTFILE  ='C:\Temp\originaldata.sav'.
>
>In the above data file:
>
>1. The value "1" has the highest occurrence across z1, z2, z3 and z4.
>2. The next is the value "4" across z1, z2, z3 and z4.
>
>I want to create a new data file 'C:\Temp\Newdata.sav' with the same variables (x,y,z1,z2,z3,z4) but with
>the values of z1, z2, z3 and z4 set to sysmis EXCEPT where they equal 1 or 4.
>
>Thus I want to keep the top 2 values (1 and 4) across z1, z2, z3 and z4.
>
>Regards.
>
>
>-----Original Message-----
>From: Beadle, ViAnn [mailto:[hidden email]]
>Sent: Thursday, August 03, 2006 3:52 PM
>To: Edward Boadi; [hidden email]
>Subject: RE: Re: 10 most frequent occurring values of a multiple
>response set ( REVISITED )
>
>
>OK, let's try this from a different tack because I don't think anybody understands what you mean by most frequently occurring categories. Do you want to count occurrences of values across all four variables so that if z1 and z2 each have the value 14, that counts for two occurrences of 14?
>
>Perhaps if you would tell us why you want to do this, we would better understand your question. Or if you could give us a small set of data for the 4 variables and tell us what you think are the top 2 values (so you don't have to provide so much data that we can't read it), we could provide more help here.
>
>-----Original Message-----
>From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Edward Boadi
>Sent: Thursday, August 03, 2006 2:29 PM
>To: [hidden email]
>Subject: Re: 10 most frequent occurring values of a multiple response set ( REVISITED )
>
>I wish to express my sincere thanks to the following people:
>Hillel Vardi, Beadle ViAnn, and Richard Ristow for your contributions (advice and syntax) on the above subject.
>
>Sorry to say that I have not been able to achieve my desired objective:
>
>Below is a restatement of what I want to do.
>
>Given a data file c:\Temp\OriginalData.sav with variables x (1-4), y (1-3), z1 (1-15), z2 (1-15), z3 (1-15), and z4 (1-15),
> z1, z2, z3 and z4 have identical categories (15).
>
>I want to do the following:
>
>1. Identify the 10 most frequently occurring categories of Z (where Z is a combination of z1, z2, z3 and z4).
>2. Set z1, z2, z3 and z4 to missing for values not in the 10 most frequently occurring categories.
> Thus if the categories (2,4,7,9,12) of z1, z2, z3 and z4 are not in the 10 most frequently occurring categories, set
> them to sysmis.
>3. Save the new file as c:\Temp\NewData.sav with variables x, y, z1, z2, z3 and z4.
>
>
>Any help on this task will be very much appreciated.
>
>Warm regards to all.
>
>
>
>