Convert Categorical to binary

Jignesh Sutar
Is it possible to convert, for example, a 5-category variable to 5 binary
variables?

I know a simple IF statement can do the job, but I have many
categorical vars with 10+ categories. Is there an automated/quick function
in SPSS that can do this conversion?

Thanks in advance

Jignesh

Re: Convert Categorical to binary

Peck, Jon
Here is a Python programmability solution to this problem.  Explanation below.


begin program.
import spss, spssaux2
spssaux2.CreateBasisVariables(varindex=3,root="edlevels", macroname="edlevels")
end program.

Everything is boilerplate except the CreateBasisVariables line.  That line calls a function in the spssaux2 module from SPSS Developer Central (www.spss.com/devcentral).  It specifies that dummy variables be created for all values of the variable with index 3 (counting from zero), with names like edlevels_1, edlevels_2, ... .  (In the employee data.sav file, educ is the variable with index 3.)  Note that it is not necessary to specify the values of the input variable.
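For readers new to dummy coding, the transformation described above can be sketched in plain Python. This only illustrates the idea and is not the spssaux2 implementation; the helper name make_dummies and the sequential root_1, root_2, ... naming are assumptions based on the names mentioned in the post.

```python
def make_dummies(values, root):
    """Return {dummy_name: [0/1, ...]} with one dummy per distinct value.

    Hypothetical helper: it illustrates dummy coding only and is not
    part of spssaux2.
    """
    # The categories are discovered from the data, so they need not be listed.
    categories = sorted(set(values))
    dummies = {}
    for n, category in enumerate(categories, start=1):
        name = "%s_%d" % (root, n)  # assumed naming: root_1, root_2, ...
        dummies[name] = [1 if v == category else 0 for v in values]
    return dummies


educ = [12, 15, 12, 8, 15]
for name, column in sorted(make_dummies(educ, "edlevels").items()):
    print("%s %s" % (name, column))
```

Each case gets a 1 in exactly one of the new columns and 0 in the others.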

The function also creates a macro, here named edlevels, that expands to the names of all the dummy variables except the first one, so you could then do something like this:

REGRESSION  /DEPENDENT salary
  /METHOD=ENTER jobtime  edlevels.
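The first dummy is left out because entering every dummy alongside the intercept would make the predictors perfectly collinear; the omitted category becomes the reference. A minimal sketch of that idea follows. The helper name regression_syntax is hypothetical, it builds the command text directly rather than defining a macro, and it does not reflect how spssaux2 works internally.

```python
def regression_syntax(dependent, predictors, dummy_names):
    """Build REGRESSION syntax that enters all dummies except the first.

    Hypothetical helper for illustration only.
    """
    # Drop one dummy (here the first) so it serves as the reference category.
    entered = list(predictors) + list(dummy_names[1:])
    return "REGRESSION /DEPENDENT %s\n  /METHOD=ENTER %s." % (
        dependent, " ".join(entered))


names = ["edlevels_1", "edlevels_2", "edlevels_3"]
print(regression_syntax("salary", ["jobtime"], names))
```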

The dummy variables are labeled with their category value.  This function has some other useful features that you can read about in the documentation for spssaux2.

To use this function, you need the following.
SPSS 14.0.1 or later
The corresponding Python programmability Plug-in
Python 2.4 (not 2.5)
These modules: spssaux, spssaux2, spssdata, namedtuple

Once you have SPSS, the rest is free.
Python 2.4 can be downloaded from www.python.org.
The plug-in and other modules can be downloaded from Developer Central.


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of J Sutar
Sent: Thursday, April 05, 2007 5:06 AM
To: [hidden email]
Subject: [SPSSX-L] Convert Categorical to binary


Re: Convert Categorical to binary

Maguin, Eugene
In reply to this post by Jignesh Sutar
Jignesh,

Combine a DO REPEAT structure with your existing series of IF statements.

Key assumption: all variables have the same number of categories. Trouble
otherwise.
Also, bear in mind that there is probably a limit to the number of
variables that can be referenced on a DO REPEAT, even though the v10
documentation does not say anything about this. However, it does give an
example referencing 400 variables.

For example, with ten variables of four categories each:

Do repeat x=v1 to v10/
   Y1=ya1 to ya10/Y2=yb1 to yb10/Y3=yc1 to yc10/Y4=yd1 to yd10.
+  compute y1=0.
+  if (x eq 1) y1=1.
+  compute y2=0.
+  if (x eq 2) y2=1.
+  compute y3=0.
+  if (x eq 3) y3=1.
+  compute y4=0.
+  if (x eq 4) y4=1.
End repeat.
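If the variables do not all have the same number of categories, one workaround is to generate the COMPUTE/IF pairs with a small script instead of DO REPEAT. The following is a sketch under stated assumptions: the helper name dummy_syntax and the var_code naming of the new variables are hypothetical, and category codes are assumed to be non-negative integers so that the generated names are valid.

```python
def dummy_syntax(categories_by_var):
    """Return SPSS syntax creating one 0/1 variable per category.

    categories_by_var maps a variable name to its list of category codes.
    Hypothetical helper; new variables are named <var>_<code>.
    """
    lines = []
    for var in sorted(categories_by_var):
        for code in categories_by_var[var]:
            # Same logic as the DO REPEAT body: start at 0, set to 1 on a match.
            lines.append("COMPUTE %s_%s = 0." % (var, code))
            lines.append("IF (%s EQ %s) %s_%s = 1." % (var, code, var, code))
    lines.append("EXECUTE.")
    return "\n".join(lines)


print(dummy_syntax({"v1": [1, 2, 3], "v2": [1, 2, 3, 4, 5]}))
```

The resulting text can be pasted into a syntax window, or passed to spss.Submit inside a BEGIN PROGRAM block if the Python plug-in is installed.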

Gene Maguin

Re: Convert Categorical to binary

Kooij, A.J. van der
In reply to this post by Jignesh Sutar
5 dummy variables for variable v1 with 5 categories (coded 1 to 5):

VECTOR v1dum_(5F8.0).
LOOP #i = 1 TO 5.
COMPUTE v1dum_(#i) = (v1 = (#i)).
END LOOP.
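One practical difference between this logical-expression form and the COMPUTE-then-IF form used elsewhere in the thread is the treatment of missing values. As far as I know, a logical expression evaluates to missing when v1 is missing, whereas COMPUTE y=0 followed by IF leaves the dummy at 0. The sketch below mimics both behaviours in plain Python, using None as a stand-in for system-missing; both function names are hypothetical.

```python
def dummy_logical(value, category):
    """Mimic COMPUTE d = (v = category): missing in, missing out."""
    if value is None:
        return None
    return 1 if value == category else 0


def dummy_compute_if(value, category):
    """Mimic COMPUTE d = 0 followed by IF (v EQ category) d = 1."""
    result = 0
    # The IF is skipped when the value is missing, so the 0 stays in place.
    if value is not None and value == category:
        result = 1
    return result


v1 = [1, 3, None, 5]
print([dummy_logical(v, 3) for v in v1])
print([dummy_compute_if(v, 3) for v in v1])
```

Which behaviour is preferable depends on whether cases with a missing category should drop out of later analyses or count as "not in this category".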

Anita van der Kooij
Data Theory Group
Leiden University


Re: Convert Categorical to binary

Jignesh Sutar
In reply to this post by Peck, Jon
How do I adjust the program below to do this for all variables at the
Nominal level?

Currently my setup is as below, but I'm sure Python can select variables
by measurement level. I've only recently started using Python, so I am in
need of a helping hand.

Also, instead of having to enter varindex, is it possible to enter the
variable name? One last thing is that my root will always be the
original variable name, so can this all be automated in some way?

GET FILE "c:\program files\spss\employee data.sav".
VARIABLE LEVEL ALL (SCALE).
* My real data will include this step so that I don't dummy-code
  variables that are already defined as nominal, only those that I
  set to nominal in the next command.

VARIABLE LEVEL gender educ jobcat (NOMINAL).
BEGIN PROGRAM.
import spss, spssaux2
spssaux2.CreateBasisVariables(varindex=1,root="gender",usevaluelabels="true")
spssaux2.CreateBasisVariables(varindex=3,root="educ",usevaluelabels="true")
spssaux2.CreateBasisVariables(varindex=4,root="jobcat",usevaluelabels="true")
END PROGRAM.



=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Convert Categorical to binary

Peck, Jon
See below.

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of J Sutar
Sent: Thursday, September 04, 2008 11:15 AM
To: Peck, Jon
Cc: [hidden email]
Subject: Re: [SPSSX-L] Convert Categorical to binary

BEGIN PROGRAM.
import spss, spssaux2
# Create dummy variables for every variable whose measurement level is nominal.
for i in range(spss.GetVariableCount()):
    if spss.GetVariableMeasurementLevel(i) == "nominal":
        spssaux2.CreateBasisVariables(varindex=i, root=spss.GetVariableName(i), usevaluelabels="true")
END PROGRAM.

You don't need the variable name for this logic, but if you did, you would need to look it up in a variable dictionary and get the index to use in CreateBasisVariables since that function only accepts an index value.
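The name-to-index lookup mentioned above could be sketched as follows. The helper build_index is hypothetical; it takes the variable count and a name-getter as arguments so that it can be tried outside SPSS. The name list here is only a stand-in for the active dataset.

```python
def build_index(count, get_name):
    """Map lower-cased variable names to their zero-based index.

    Hypothetical helper; get_name(i) must return the name of variable i.
    """
    # SPSS variable names are case-insensitive, so normalise to lower case.
    return dict((get_name(i).lower(), i) for i in range(count))


# Stand-in for the active dataset so the sketch runs outside SPSS.
names = ["id", "gender", "bdate", "educ", "jobcat"]
index = build_index(len(names), lambda i: names[i])
print(index["educ"])
```

Inside a BEGIN PROGRAM block the same helper could be called as build_index(spss.GetVariableCount(), spss.GetVariableName), and the resulting index passed as varindex.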

HTH,
Jon Peck
