Representing Nominal data in CHAID

Representing Nominal data in CHAID

Mark Webb-3
I have nominal data in various formats.

Could list members please comment on the suitability of the various formats
if I want to use the data as independent variables in CHAID/Classification
Trees?



Using Gender as an example [1=M 2=F]



Format 1

One variable/column with either a 1 or a 2.

Format 2

Two columns: one for males with a 1 as the indicator,
             one for females with a 1 as the indicator

Format 3

Two columns: one for males with a 1 as the indicator,
             one for females with a 2 as the indicator



Am I correct in saying that with format 1 data the results would indicate
the extent to which Gender is a descriptor, and with formats 2 & 3 whether
male vs female is a descriptor?

Are formats 2 & 3 equivalent?
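
On the equivalence question, a minimal Python sketch may help. It assumes that non-members are coded 0 in the indicator columns (the post does not say what the other value is) and uses made-up sample data; `partition` is a hypothetical helper, not part of any SPSS procedure. For a nominal variable only the grouping of cases matters, not the code values, so the sketch compares which rows each column groups together.

```python
# Hypothetical sample: 1 = male, 2 = female (format 1).
gender = [1, 2, 2, 1, 2]

# Format 2: two indicator columns, 1 marks membership, 0 otherwise (assumed).
male_f2 = [1 if g == 1 else 0 for g in gender]
female_f2 = [1 if g == 2 else 0 for g in gender]

# Format 3: same columns, but females are flagged with 2 instead of 1.
male_f3 = male_f2[:]
female_f3 = [2 if g == 2 else 0 for g in gender]


def partition(column):
    """Group row indices by category value, ignoring the labels themselves."""
    groups = {}
    for row, value in enumerate(column):
        groups.setdefault(value, []).append(row)
    return sorted(groups.values())


# The female column splits the cases identically in formats 2 and 3.
same_partition = partition(female_f2) == partition(female_f3)
print(same_partition)
```

Under these assumptions the two codings define the same grouping of cases when the column is treated as nominal. With no missing values, the male and female columns are also complements of each other, so each one repeats the information in the single format 1 column.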



Any comments on the preferred format?

Regards

Mark

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Representing Nominal data in CHAID

Anthony Babinec
To use Gender as a predictor in classification trees, use
format 1. This is the same format you would use if Gender
were a row or column variable in a multi-way table.
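
As an illustration only, here is a small Python sketch of collapsing the two indicator columns of format 2 or 3 into a single format 1 column and then using it as the row variable of a two-way table. The names `to_format1`, `crosstab` and the outcome variable `bought` are hypothetical, the data are made up, and treating rows flagged in both columns or in neither as missing is an assumption. It is not SPSS syntax.

```python
from collections import Counter


def to_format1(male, female):
    """Collapse two indicator columns into one code: 1 = male, 2 = female.

    Any nonzero value counts as "flagged", so this accepts format 2 or 3.
    Rows flagged in both columns or in neither become None (missing).
    """
    out = []
    for m, f in zip(male, female):
        if m and not f:
            out.append(1)
        elif f and not m:
            out.append(2)
        else:
            out.append(None)
    return out


def crosstab(rows, cols):
    """Count co-occurrences of (row category, column category)."""
    return Counter(zip(rows, cols))


# Made-up format 3 data and a hypothetical categorical outcome.
male = [1, 0, 0, 1, 0]
female = [0, 2, 2, 0, 2]
bought = ["yes", "no", "yes", "yes", "no"]

gender = to_format1(male, female)
table = crosstab(gender, bought)
print(gender)
print(table[(1, "yes")], table[(2, "no")])
```

With one column per nominal variable, each cell of the table is a count for one Gender category against one outcome category, which is the kind of cross-tabulation CHAID evaluates when it considers a predictor for a split.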

Anthony Babinec
[hidden email]

"Be the change you want to see in the world."
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Mark Webb
Sent: Sunday, October 28, 2007 3:09 AM
To: [hidden email]
Subject: Representing Nominal data in CHAID
