How to split categorical variable into multiple variables?

How to split categorical variable into multiple variables?

Patrick Burns
Is there a way in SPSS to split a categorical variable into multiple
variables?  I have a categorical variable that looks like this:

Person Frt_CD
1 Apple
2 Apple
2 Orange
3 Banana
4 Kiwi
4 Apple
5 Kiwi
5 Banana
5 Orange
6 Orange
7 Orange
...

I don't want to restructure the entire data file, but instead just add four
new 'flag' variables, each one specific to a category in my original
variable "Frt_CD" above.  The outcome would look like this:

Person Frt_CD Apples Bananas Kiwis Oranges
1 Apple 1 . . .
2 Apple 1 . . .
2 Orange . . . 1
3 Banana . 1 . .
4 Kiwi . . 1 .
4 Apple 1 . . .
5 Kiwi . . 1 .
5 Banana . 1 . .
5 Orange . . . 1
6 Orange . . . 1
7 Orange . . . 1
...

(I need to aggregate my dataset after this step, but want to hold onto the
information in it in cases where the same person consumes multiple types of
fruit.)

I know how to write syntax to accomplish the creation of these new variables
(such as "IF (Frt_CD='Apple') Apples=1.", etc., etc.), but it takes time to
change the syntax for each new variable I want to split up.  So I'm
wondering if there is an *existing* command or menu-driven wizard in SPSS
that does this, almost like auto-recode?  (BTW, I use SPSS v16.)  Thanks in
advance if you can recommend any time-saving approach.

PATRICK
Los Angeles

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Patrick Burns, Senior Researcher Economic Roundtable 315 W. 9th Street, Suite 502 Los Angeles, CA, 90015-4200 http://www.economicrt.org

Re: How to split categorical variable into multiple variables?

Maguin, Eugene
Here's one way (not tested).

* Each flag is set to 1 for its category and left system-missing otherwise.
Do repeat x='Apple' 'Orange' 'Banana' 'Kiwi'/y=Apples Oranges Bananas Kiwis.
If (Frt_CD eq x) y=1.
End repeat.


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Patrick Burns
Sent: Friday, December 06, 2013 3:26 PM
To: [hidden email]
Subject: How to split categorical variable into multiple variables?


Re: How to split categorical variable into multiple variables?

Rick Oliver-3
In reply to this post by Patrick Burns
* Each flag is 1 where Frt_CD equals the category and 0 otherwise.
do repeat x='Apple' 'Banana' 'Kiwi' 'Orange'
        /y=Apples Bananas Kiwis Oranges.
compute y=(Frt_CD=x).
end repeat.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]
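
For the aggregation step mentioned in the original question, a minimal sketch (not tested against the poster's data) could collapse the flags to one row per person with AGGREGATE. It assumes the four flag variables already exist under the names Apples, Bananas, Kiwis and Oranges; the output names AnyApple, AnyBanana, AnyKiwi and AnyOrange are hypothetical.

```
* Replace the active dataset with one row per Person.
* MAX gives 1 if any of that person's rows has the flag set.
AGGREGATE
  /OUTFILE=*
  /BREAK=Person
  /AnyApple AnyBanana AnyKiwi AnyOrange = MAX(Apples Bananas Kiwis Oranges).
```

With the IF version of the flags (1 or system-missing), a person who never had a given fruit ends up system-missing on that column; with the COMPUTE version (1 or 0), the same person gets 0.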





Re: How to split categorical variable into multiple variables?

Jon K Peck
That works, but it requires you to know the exhaustive list of possibilities in advance.  The SPSSINC CREATE DUMMIES extension command figures that out for you (and generates appropriate labels and a macro definition), but it would require a newer version of Statistics.


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
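
If the category names are not known in advance, one workaround that should not need the extension command (a sketch, not tested on v16) is to AUTORECODE the string into consecutive integers and use those as an index into a VECTOR. Frt_num and the flag root name are hypothetical, and the vector length (4 here) is an assumption that must match the number of distinct categories in the data.

```
* Assign consecutive integer codes in alphabetical order of the string values.
* The value labels of Frt_num keep the original strings.
AUTORECODE VARIABLES=Frt_CD /INTO Frt_num.
* Create flag1 to flag4, all initially system-missing.
VECTOR flag(4).
* Set only the flag whose position matches the code of the current case.
COMPUTE flag(Frt_num) = 1.
EXECUTE.
```

For the four fruits shown in the question, flag1 to flag4 would correspond to Apple, Banana, Kiwi and Orange, so the new variables would still need to be renamed or labeled by hand.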




From:        Rick Oliver/Chicago/IBM@IBMUS
To:        [hidden email],
Date:        12/06/2013 01:40 PM
Subject:        Re: [SPSSX-L] How to split categorical variable into multiple              variables?
Sent by:        "SPSSX(r) Discussion" <[hidden email]>



