A filter question


lts1
Hi SPSSers,
 
I have two files (Original and Subset). Original contains data on individuals from about 65 countries, while subset contains aggregated data from about 30 of those countries. I need to look at the original data for the subset countries. My question is how can I filter Original to include only the countries in Subset (without a very long if statement)?
 
Thanks in advance.
 
    Best,
        Lisa
 
 

Re: A filter question

John F Hall
Assuming you don't want to merge the two into a hierarchical set, and that there is a unique code value for each country, you need something like:
 
get file '<name of old file>' .
count set = country (<enter individual code values for country here>) .
select if (set > 0) .
save out '<name of new file>' .
 
The new file will contain only those cases in Subset countries.  Make sure you keep a read-only copy of Original: if you save over it after the select if, all cases not in Subset will be permanently lost!
 
The count statement should look something like:
 
count set = country ( 3, 10, 22, 40, ....... 65 ) .
 
If you want to keep all cases and keep the new variable set as a flag (leave out the select if above):
 
recode set ( 2 thru hi =1) (0 = 2) .
var lab set 'Subset type' .
val lab set 1 'Subset' 2 'Other' .
save out '<newfile>' .
 
. . . then use
 
select if (set = 1) .
<statistical procedures> .
 
. . .in your analyses.
 
Be careful with select if or you risk losing the non-Subset cases if you save the working file.  Use temporary for a single procedure.
 
temporary .
select if (set = 1) .
<statistical procedure> .
 
. . .and data set immediately reverts to Original.
 
The GUI has a menu for temporary selection you can switch on and off, but I prefer to use syntax, so I've never used it.
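 
For reference, the syntax counterpart of a selection you can switch on and off is a filter. A minimal sketch, assuming the flag variable set has already been recoded as above (1 = Subset, 2 = Other); insub is a hypothetical helper name, and frequencies merely stands in for whatever procedure you run:

```spss
* Build a 0/1 flag, because FILTER treats any nonzero value as selected.
compute insub = (set = 1) .
filter by insub .
frequencies variables = country .
* Switch the filter off to return to all cases.
filter off .
```

Unlike select if, a filter only hides the unselected cases, so nothing is dropped from the working file.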
 
Hope this helps.
 
----- Original Message -----
Sent: Wednesday, August 18, 2010 3:41 AM
Subject: A filter question

 

Re: A filter question

Jon K Peck
In reply to this post by lts1

Use the file matching facilities to do this.  No need to itemize the country names.

With both files sorted and the subset file as the active dataset,
use MATCH FILES with /TABLE set to the larger dataset and the country identifier as the key variable.

Add whatever variables are needed from the original dataset.  Where a variable exists in both files, you would need to either drop it from the subset or rename it in the MATCH in order to get the larger-file values.

This will produce a dataset containing the subset cases and any variables selected from the original one.  
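
A hedged sketch of one way to arrange the match so that the individual-level cases are the ones kept. Here Original is the case file and Subset is the keyed table, which is the reverse of the arrangement described above and relies on Subset having exactly one row per country. The file names, the key variable country, and the flag insubset are all assumptions for illustration:

```spss
* Reduce Subset to a sorted one-row-per-country lookup table.
get file = 'subset.sav' .
sort cases by country .
save outfile = 'subset_keys.sav' /keep = country .
* Match against Original. The flag insubset is 1 where the country occurs in Subset.
get file = 'original.sav' .
sort cases by country .
match files /file = * /table = 'subset_keys.sav' /in = insubset /by country .
select if (insubset = 1) .
save outfile = 'original_subset_countries.sav' .
```

Saving under a new file name leaves Original untouched.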

HTH,
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: "Lisa T. Stickney" <[hidden email]>
To: [hidden email]
Date: 08/17/2010 07:44 PM
Subject: [SPSSX-L] A filter question
Sent by: "SPSSX(r) Discussion" <[hidden email]>




