SPSSX Discussion

Handling variables with large numbers of value labels


Mark Lenel

Handling variables with large numbers of value labels

Hello,

I have a question in a survey where a respondent can list up to 5 computer games they have played recently. After coding this generates 5 NUMERIC variables (Game_1 TO Game_5). The list of VALUE LABELS associated with these 5 variables has nearly 1000 distinct codes + labels to account for the large variety of game titles out there!

I would like to be able to use this data in a series of tables that shows the top 20 games played by different subgroups of the sample.

The current method I have is to convert the 5 variables into nearly 1000 (!) dichotomous variables, create an MRSET from those, then use that MRSET in my tables. This obviously works, but is pretty laborious and processor-intensive while dealing with all the new variables.

I was wondering if anyone can offer anything more efficient, perhaps a way of creating tables direct from the original 5 variables?

Many thanks,

Mark

David Marso

Re: Handling variables with large numbers of value labels

Administrator

YIKES!!!
Normalize the data structure using VARSTOCASES or a VECTOR/LOOP/XSAVE approach.
Please search the archives for many many examples of this.
Basically turn your 5 variables into up to 5 records per respondent, with the appropriate case-level info retained on each record.
Much more manageable than dealing with some hideous 1000-item MD (multiple dichotomy) set!!! It will run fast and is scalable!
If you are looking for the 20 most common, you can AGGREGATE by game code (within subgroup), SORT by descending count, and select the first 20.
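
To make the restructuring idea concrete, here is a minimal sketch of the logic in Python rather than SPSS syntax (in SPSS itself the reshaping step would be VARSTOCASES, followed by AGGREGATE and a sort). The respondent records, the "group" field, the game codes, and the helper names to_long and top_games are all made up for illustration. Only the shape of the logic matters: one record per answered game, then count within subgroup and keep the most common.

```python
from collections import Counter, defaultdict

# Hypothetical wide-format data: one record per respondent, with up to
# five coded game titles (None means the slot was left empty).
respondents = [
    {"id": 1, "group": "A", "games": [101, 205, None, None, None]},
    {"id": 2, "group": "A", "games": [101, 330, 205, None, None]},
    {"id": 3, "group": "B", "games": [330, None, None, None, None]},
    {"id": 4, "group": "B", "games": [330, 101, None, None, None]},
]


def to_long(rows):
    """Yield one (id, group, game_code) record per non-missing answer."""
    for row in rows:
        for code in row["games"]:
            if code is not None:
                yield row["id"], row["group"], code


def top_games(rows, n=20):
    """Count game codes within each subgroup and keep the n most common."""
    counts = defaultdict(Counter)
    for _id, group, code in to_long(rows):
        counts[group][code] += 1
    return {group: c.most_common(n) for group, c in counts.items()}


long_records = list(to_long(respondents))  # the normalized structure
top = top_games(respondents, n=2)          # n=2 only to keep the demo small
```

The long structure never needs more than one game variable, however many distinct titles exist, which is why it scales better than roughly 1000 dichotomies.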

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
"Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Jon K Peck

Re: Handling variables with large numbers of value labels

In reply to this post by Mark Lenel

You can treat these five variables as a multiple category (MC) set rather than a multiple dichotomy (MD) set, which eliminates the need to make dichotomies.
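
As a rough illustration of what a multiple category set tabulates, here is a small Python sketch that counts respondents per game code straight from the five coded variables, with no dichotomies created. It is not SPSS syntax. The data, the codes, and the function name category_set_counts are hypothetical, and it assumes a code repeated within one respondent should count only once.

```python
from collections import Counter

# Hypothetical data: respondent id -> values of Game_1 .. Game_5
# (None means no answer in that slot).
answers = {
    1: [101, 205, None, None, None],
    2: [101, 330, 101, None, None],  # 101 repeated; counted once here
    3: [330, None, None, None, None],
}


def category_set_counts(answers):
    """Count, for each game code, how many respondents mentioned it."""
    counts = Counter()
    for codes in answers.values():
        # A set removes missing slots and within-respondent duplicates.
        counts.update({c for c in codes if c is not None})
    return counts


counts = category_set_counts(answers)

# Percentages based on respondents who gave at least one answer.
n_resp = sum(1 for codes in answers.values()
             if any(c is not None for c in codes))
pct = {code: 100.0 * k / n_resp for code, k in counts.items()}
```

Because each respondent can appear under several codes, these percentages can sum to more than 100.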

Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435

From: Mark Lenel <[hidden email]>
To: [hidden email]
Date: 03/22/2011 06:07 AM
Subject: [SPSSX-L] Handling variables with large numbers of value labels
Sent by: "SPSSX(r) Discussion" <[hidden email]>
