SPSSX Discussion

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

Edward Boadi

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

I wish to express my sincere thanks to Hillel Vardi, ViAnn Beadle, and Richard Ristow for their contributions (advice and syntax) on the above subject.

Sorry to say, I have not yet been able to achieve my objective.

Below is a restatement of what I want to do.

Given a data file c:\Temp\OriginalData.sav with variables x (1-4), y (1-3), z1 (1-15), z2 (1-15), z3 (1-15), and z4 (1-15), where
z1, z2, z3 and z4 share the same 15 categories.

I want to do the following:

1. Identify the 10 most frequently occurring categories of Z (where Z is the combination of z1, z2, z3 and z4).
2. Set z1, z2, z3 and z4 to missing wherever their value is not among the 10 most frequent categories.
   Thus, if categories 2, 4, 7, 9 and 12 are not among the 10 most frequent, set those values to sysmis.
3. Save the new file as c:\Temp\NewData.sav with variables x, y, z1, z2, z3 and z4.
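
The three steps above can be pinned down outside SPSS. What follows is a minimal Python sketch of the intended logic, not SPSS syntax: the function name `keep_top_categories`, the one-dict-per-case layout, and the use of `None` to stand in for system-missing are all assumptions made for illustration, and the three cases are invented rather than taken from the real file.

```python
from collections import Counter

def keep_top_categories(cases, z_names, k):
    """Set to None every z value that is not among the k most frequent
    categories, counting occurrences across all z variables together."""
    # Step 1: pool z1..z4 and count each category once per occurrence.
    counts = Counter(
        case[name] for case in cases for name in z_names
        if case[name] is not None
    )
    # most_common() resolves ties by first-seen order; a real job may
    # need an explicit tie rule.
    top = {value for value, _ in counts.most_common(k)}
    # Step 2: keep values in the top set, replace the rest with None.
    return [
        {name: (value if name not in z_names or value in top else None)
         for name, value in case.items()}
        for case in cases
    ]

# Hypothetical three-case file (not the poster's data).
cases = [
    {"x": 1, "y": 2, "z1": 3, "z2": 3, "z3": 7, "z4": 1},
    {"x": 2, "y": 1, "z1": 3, "z2": 7, "z3": 7, "z4": 2},
    {"x": 1, "y": 3, "z1": 7, "z2": 3, "z3": 3, "z4": 7},
]
result = keep_top_categories(cases, ["z1", "z2", "z3", "z4"], k=2)
print(result[0])
```

Step 3 (saving the file) is left out because it depends on the tool used. Note that x and y pass through untouched even when they happen to hold a value that is blanked in the z variables.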

Any help on this task will be very much appreciated.

Warm regards to all.

Beadle, ViAnn

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

OK, let's try a different tack, because I don't think anybody understands what you mean by most frequently occurring categories. Do you want to count occurrences of values across all four variables, so that if z1 and z2 each have the value 14, that counts as two occurrences of 14?

Perhaps if you would tell us why you want to do this, we would better understand your question. Or if you could give us a small set of data for the 4 variables and tell us what you think the top 2 values are (so you don't have to provide so much data that we can't read it), we could provide more help here.

Edward Boadi

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

In reply to this post by Edward Boadi

OK, Beadle and List.
Let's consider this data file:

DATA LIST FREE/x y z1 z2 z3 z4.
BEGIN DATA
2 1 1 3 3 5
1 2 4 1 4 3
1 3 4 5 9 1
1 4 5 2 5 4
2 2 5 1 3 1
1 3 2 2 2 5
1 2 1 2 1 1
1 1 9 4 1 1
1 1 2 4 5 1
1 3 1 5 1 1
1 1 2 4 4 4
1 2 2 9 4 4
2 4 5 1 2 3
1 1 1 1 9 2
1 2 5 1 1 3
1 4 5 1 2 1
1 3 1 2 4 4
END DATA.

SAVE OUTFILE ='C:\Temp\originaldata.sav'.

In the above data file:

1. The value 1 occurs most often across z1, z2, z3 and z4.
2. The value 4 is the next most frequent across z1, z2, z3 and z4.

I want to create a new data file 'C:\Temp\Newdata.sav' with the same variables (x, y, z1, z2, z3, z4), but with
the values of z1, z2, z3 and z4 set to sysmis except where the value is 1 or 4.

Thus I want to keep the top 2 values (1 and 4) across z1, z2, z3 and z4.
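
The claim about the sample data can be cross-checked by pooling the four z columns and tallying them. This is a small Python sketch, not SPSS syntax; the variable names `SAMPLE`, `pooled` and `top_two` are illustrative only.

```python
from collections import Counter

# Sample rows as posted; columns are x y z1 z2 z3 z4.
SAMPLE = """\
2 1 1 3 3 5
1 2 4 1 4 3
1 3 4 5 9 1
1 4 5 2 5 4
2 2 5 1 3 1
1 3 2 2 2 5
1 2 1 2 1 1
1 1 9 4 1 1
1 1 2 4 5 1
1 3 1 5 1 1
1 1 2 4 4 4
1 2 2 9 4 4
2 4 5 1 2 3
1 1 1 1 9 2
1 2 5 1 1 3
1 4 5 1 2 1
1 3 1 2 4 4
"""

# Drop x and y (the first two columns) and pool z1..z4 from every row.
pooled = [
    int(token)
    for line in SAMPLE.splitlines()
    for token in line.split()[2:]
]
counts = Counter(pooled)
top_two = sorted(value for value, _ in counts.most_common(2))
print(counts.most_common())
print(top_two)
```

On these 17 rows the tally agrees with the post: 1 occurs 22 times and 4 occurs 13 times. The value 2 follows closely with 12 occurrences, so the second place is a narrow one.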

Regards.

Beadle, ViAnn

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

In reply to this post by Edward Boadi

Use the MULT RESPONSE procedure to tabulate your 4 z variables together as a multiple response set.

MULT RESPONSE GROUPS=z (z1 z2 z3 z4 (1,15))
/FREQUENCIES=z.

Look at the table and find the top ten. Then use RECODE to set all other values to sysmis.

Edward Boadi

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

In reply to this post by Edward Boadi

Thanks Beadle,
Is there a way to automate the whole process, without having to look at the
table created by:

MULT RESPONSE GROUPS=z (z1 z2 z3 z4 (1,15))
/FREQUENCIES=z.

Thanks.

Simon Phillip Freidin

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

In reply to this post by Edward Boadi

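* Stack z1-z4 into a single variable z, one case per response.
varstocases make z from z1 to z4.
* Transpose so that the stacked responses become variables on one case.
flip.
sel if case_lbl='z'.
compute end=$sysmis.
* Count how often each category 1-15 occurs; f1-f15 hold the frequencies.
do repeat f=f1 to f15 /n=1 to 15.
count f = var001 to end (n).
end repeat.
match files file=*/keep=f1 to f15.
* Find the most frequent category (if several tie, the highest-numbered one is kept).
do repeat f=f1 to f15 /n=1 to 15.
if max(f1 to f15) =f mostfreq1=n.
end repeat.
* Blank out the winner's count, then find the runner-up the same way.
do repeat f=f1 to f15 /n=1 to 15.
if n=mostfreq1 f=$sysmis.
end repeat.
do repeat f=f1 to f15 /n=1 to 15.
if max(f1 to f15) =f mostfreq2=n.
end repeat.
exe.
* Write a RECODE command that keeps the two winners, then run it on the original data.
write outfile='c:\temp\recodes.sps'
/"recode z1 z2 z3 z4 ( " mostfreq1 " = " mostfreq1 " ) ( " mostfreq2 " = "
mostfreq2 " ) (else = sysmis )".

get file = 'c:\temp\originaldata.sav'.
include file = 'c:\temp\recodes.sps'.
exe.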
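
The core of the syntax above is a "take the maximum, blank it out, take the maximum again" pass over the frequency counts. The Python sketch below mirrors that logic as I read it, including the detail that when several categories tie, the later one overwrites the earlier one. It is an illustration, not a translation: the function name `top_two_by_repeated_max` and the example counts are hypothetical.

```python
def top_two_by_repeated_max(freq):
    """freq maps a category to its count, playing the role of f1..f15."""
    remaining = dict(freq)  # work on a copy so the caller's dict is untouched
    winners = []
    for _ in range(2):
        best = max(remaining.values())
        # Scan categories in ascending order and let later matches overwrite
        # earlier ones, so the last tied category wins.
        pick = None
        for category in sorted(remaining):
            if remaining[category] == best:
                pick = category
        winners.append(pick)
        # Counterpart of setting the winning count to missing before the
        # second pass.
        del remaining[pick]
    return winners

# Hypothetical counts with a tie between categories 2 and 3.
example_counts = {1: 5, 2: 9, 3: 9, 4: 2}
print(top_two_by_repeated_max(example_counts))
```

With the tied counts shown, category 3 is picked first and category 2 second. If tie order matters for the analysis, it is worth deciding on an explicit rule rather than relying on scan order.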

Research Database Manager and Analyst
Melbourne Institute of Applied Economic and Social Research
The University of Melbourne
Melbourne VIC 3010 Australia
New Tel: (03) 8344 2085 New Fax: (03) 8344 2111
http://www.melbourneinstitute.com/hilda/

Art Kendall

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

In reply to this post by Edward Boadi

Is this what you need?

DATA LIST FREE/x y z1 z2 z3 z4.
BEGIN DATA
2 1 1 3 3 5
1 2 4 1 4 3
1 3 4 5 9 1
1 4 5 2 5 4
2 2 5 1 3 1
1 3 2 2 2 5
1 2 1 2 1 1
1 1 9 4 1 1
1 1 2 4 5 1
1 3 1 5 1 1
1 1 2 4 4 4
1 2 2 9 4 4
2 4 5 1 2 3
1 1 1 1 9 2
1 2 5 1 1 3
1 4 5 1 2 1
1 3 1 2 4 4
END DATA.

SAVE OUTFILE ='C:\Temp\originaldata.sav'.

MULT RESPONSE
GROUPS=$z 'the 4 z variables' (z1 z2 z3 z4 (1,9))
/FREQUENCIES=$z .

recode z1 to z4 (1,4=copy) (else=0) into newz1 to newz4.

MULT RESPONSE
GROUPS=$newz 'the 4 z variables' (newz1 newz2 newz3 newz4 (1,9))
/FREQUENCIES=$newz .

Art

Art Kendall
Social Research Consultants

Edward Boadi

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

In reply to this post by Edward Boadi

Thanks a million, Simon, for your amazing syntax.
It does exactly what I need.

Suppose I want to extend the syntax to the top 3, 10, 25, etc. values. What changes to the syntax are required?

Simon Phillip Freidin

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

Change all the 15s to the maximum allowed value, and for each
additional value insert a block like this:

do repeat f=f1 to f15 /n=1 to 15.
if n=mostfreq1 f=$sysmis.
end repeat.
do repeat f=f1 to f15 /n=1 to 15.
if max(f1 to f15) =f mostfreq2=n.
end repeat.

Change the mostfreq variable names as you go. For the 3rd value, change
mostfreq1 -> mostfreq2 and mostfreq2 -> mostfreq3.

Then add the additional value to the WRITE command:

write outfile='c:\temp\recodes.sps'
/"recode z1 z2 z3 z4 ( "
mostfreq1 " = " mostfreq1
" ) ( "
mostfreq2 " = " mostfreq2
" ) ( "
mostfreq3 " = " mostfreq3
" ) (else = sysmis )".
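
Repeating the block by hand for every extra value gets long for a top 10 or top 25. The same idea can be written as a loop, as in this Python sketch. It is not SPSS syntax and not the method posted above: `top_k_categories` and `build_recode` are hypothetical helper names, the counts are invented, and ties are resolved in favour of the highest-numbered category to match the scan order of the DO REPEAT blocks.

```python
def top_k_categories(freq, k):
    """Return the k most frequent categories by repeatedly taking the
    maximum count and removing it from further consideration."""
    remaining = dict(freq)
    winners = []
    for _ in range(min(k, len(remaining))):
        best = max(remaining.values())
        # Among tied categories, take the highest-numbered one.
        pick = max(c for c, n in remaining.items() if n == best)
        winners.append(pick)
        del remaining[pick]
    return winners

def build_recode(variables, keep):
    """Build the text of a RECODE command that keeps the listed values
    and sends everything else to system-missing."""
    pairs = " ".join(f"({v} = {v})" for v in keep)
    return f"recode {' '.join(variables)} {pairs} (else = sysmis)."

# Hypothetical frequency counts, with categories 3 and 5 tied.
example_freq = {1: 40, 2: 7, 3: 19, 4: 33, 5: 19, 6: 2}
top3 = top_k_categories(example_freq, 3)
command = build_recode(["z1", "z2", "z3", "z4"], top3)
print(command)
```

Changing the 3 to 10 or 25 is then the only edit needed. The generated text has the same shape as the command the WRITE step produces, so it could be saved to a file and run in the same way.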

Edward Boadi

Re: 10 most frequent occurring values of a multiple response set ( REVISITED )

In reply to this post by Edward Boadi

Thanks Simon, you are a star.
Your first syntax and the suggested modification work the way I want.

Thanks again to everyone who contributed to this topic.

Warm regards to all.

Edward
