Remove or filter out rare categories in a variable


Researchit
Hi SPSS people

I wonder if there is a way to remove all categories in a variable that have been chosen by fewer than a specified minimum number of people. For example, the variable is 'JOBFUNCTION', and it was collected in open-ended text format.

Let's set aside the merging of categories and suppose I simply want a new variable ('JOBFUNCTION_NEW') that includes only the categories chosen by 10 or more respondents.

Is there an 'easy' way to do that?

Appreciate any advice or suggestions

M
Re: Remove or filter out rare categories in a variable

Bruce Weaver
Administrator
To make things more concrete, suppose JOBFUNCTION has 5 categories (numbered 1 to 5 with value labels), and suppose category 3 has a low number of respondents (e.g., < 10).  What comes to mind is something like this:

* Copy the original variable, then flag the sparse category (3) as user-missing.
COMPUTE JOBFUNCTION_NEW = JOBFUNCTION.
FORMATS JOBFUNCTION_NEW (F2.0).
MISSING VALUES JOBFUNCTION_NEW (3).

Would that give you what you're looking for?

One downside to this approach is that it requires the user to inspect the frequencies and decide by hand which values of the new variable should be treated as missing.  If this is a one-time job where the input data are not going to change, that's probably sufficient.  But if it is something that will be done more than once and with input data that change, then it is not a very good solution.
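
For the repeated-use case, one possible way to avoid picking the sparse values by hand is to let AGGREGATE count the respondents in each category and then copy only the well-populated ones. This is a rough sketch rather than a tested solution: it assumes JOBFUNCTION is numeric, uses the cutoff of 10 from the original question, and N_IN_CAT is just an illustrative name for a helper variable.

```spss
* Attach to every case the number of respondents sharing its JOBFUNCTION value.
* N_IN_CAT is a hypothetical helper variable name.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=JOBFUNCTION
  /N_IN_CAT=N.

* Copy only categories with 10 or more respondents.
* Cases in rarer categories are left system-missing on the new variable.
IF (N_IN_CAT GE 10) JOBFUNCTION_NEW = JOBFUNCTION.
FORMATS JOBFUNCTION_NEW (F2.0).
EXECUTE.
```

Note that the rare categories end up system-missing here rather than user-missing, and the value labels are not carried over to the new variable. If JOBFUNCTION is still a string variable, as in the original open-ended data, JOBFUNCTION_NEW would need to be declared with a STRING command of matching width before the IF, and the FORMATS line dropped.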

Cheers,
Bruce

PS- Please note that this forum is VERY inactive these days.  See this thread for suggestions about other forums you could try:

http://spssx-discussion.165.s1.nabble.com/NOTE-This-forum-is-no-longer-connected-to-the-SPSSX-L-mailing-list-td5740719.html


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).