Preparing for decision tree

Peter Spangler
I am preparing for a decision tree and would like some advice on preparation. I have cases by numeric ID and ten continuous variables that are each a category. I would like to prepare a decision tree that predicts an ID's category by that variable's continuous value. My thought is to bin the continuous values then build a decision tree predictor from a new variable. Any thoughts?

Peter 

Re: Preparing for decision tree

Rick Oliver-3
You can build decision trees for continuous targets and continuous predictors. Binning your continuous predictors may yield a more readable tree result but will be less precise than leaving them continuous since you lose information by binning.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




From:        Peter Spangler <[hidden email]>
To:        [hidden email],
Date:        02/27/2013 11:17 AM
Subject:        Preparing for decision tree
Sent by:        "SPSSX(r) Discussion" <[hidden email]>





Re: Preparing for decision tree

Rick Oliver-3
In reply to this post by Peter Spangler
Also, the TREE procedure essentially bins your continuous predictors based on what makes sense to the algorithm.
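
To make the two replies above concrete, here is a minimal sketch in Python (not SPSS, and not what the TREE procedure does internally). It assumes a CRT-style binary split chosen by least squares. The function name `best_split`, the data, and the bin width of 25 are all made up for illustration. It searches every cut point on a continuous predictor, then repeats the search after the predictor has been pre-binned.

```python
def best_split(x, y):
    """Return (threshold, sse) for the binary split x <= threshold that
    minimises the summed squared error of y around each side's mean."""
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    candidates = sorted(set(x))
    best = (None, sse(list(y)))  # baseline: no split at all
    # Try a cut halfway between each pair of adjacent distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical data: the target changes level between x = 35 and x = 40.
x = [10, 20, 30, 35, 40, 45, 50, 60]
y = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]

# Pre-binning into fixed bins of width 25 hides that change inside one bin.
x_binned = [xi // 25 for xi in x]

raw_threshold, raw_sse = best_split(x, y)
bin_threshold, bin_sse = best_split(x_binned, y)
print("raw predictor:   ", raw_threshold, raw_sse)
print("binned predictor:", bin_threshold, bin_sse)
```

In this toy data the raw predictor allows a cut between 35 and 40 that separates the two target levels exactly, while the pre-binned version can only cut at bin edges and leaves error behind. That is the information loss from binning mentioned earlier.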





From:        Rick Oliver/Chicago/IBM
To:        Peter Spangler <[hidden email]>,
Cc:        [hidden email]
Date:        02/27/2013 11:46 AM
Subject:        Re: Preparing for decision tree




Re: Preparing for decision tree

Peter Spangler
In reply to this post by Rick Oliver-3
Because the categories I would like to predict are columns, with row values for each case, I cannot use them as the DV in the GUI. Would I be better off writing syntax?


Re: Preparing for decision tree

Rick Oliver-3
I guess I don't really understand the problem. The target (the thing you want to predict) can be categorical or continuous. If it's categorical, the UI really wants the categories to have value labels, but this is not absolutely required unless you want to specify a particular category of interest, and you don't need value labels if you use syntax. The Decision Trees case studies (Help menu > Case Studies) might provide some useful information.





From:        Peter Spangler <[hidden email]>
To:        Rick Oliver/Chicago/IBM@IBMUS,
Cc:        [hidden email]
Date:        02/27/2013 12:30 PM
Subject:        Re: Preparing for decision tree





Re: Preparing for decision tree

Melissa Ives

Do you need to run CASESTOVARS first to make the file horizontal/wide instead of vertical/long?








PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND
CONFIDENTIAL information and is intended only for the use of the
addressee. If you are not the designated recipient, or an employee
or agent authorized to deliver such transmittals to the designated
recipient, you are hereby notified that any dissemination,
copying or publication of this transmittal is strictly prohibited. If
you have received this transmittal in error, please notify us
immediately by replying to the sender and delete this copy from your
system. You may also call us at (309) 827-6026 for assistance.

Re: Preparing for decision tree

Peter Spangler
This would switch the variables to cases, yes? I could then use the cases to predict the variables themselves?


Re: Preparing for decision tree

Peter Spangler
In reply to this post by Rick Oliver-3
More specifically, each of my 10 variables is a sales category with values per case. I would like to predict the 10 variables based on the sales values per case.
For example: for IDs with XX sales in VAR_1, what would the tree algorithm tell me about the bins their sales in VAR_2 through VAR_10 fall into?
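
One possible reading of this question is "one tree per sales category, each using VAR_1 as the only predictor". Under that assumption, a short Python sketch can write out the repeated TREE commands so they do not have to be clicked through the GUI nine times. The helper name `tree_commands` and the choice of the CRT growing method are assumptions made for illustration. The variable names VAR_1 to VAR_10 come from the post, and `[s]` is the TREE notation for a scale (continuous) variable. Check the generated text against the Command Syntax Reference before running it.

```python
def tree_commands(predictor, targets, method="CRT"):
    """Build one TREE command string per target variable."""
    commands = []
    for target in targets:
        # [s] declares the variable as scale (continuous) for this run.
        commands.append(
            f"TREE {target} [s] BY {predictor} [s]\n"
            f"  /METHOD TYPE={method}."
        )
    return commands


# VAR_2 .. VAR_10 are the targets; VAR_1 is the single predictor.
targets = [f"VAR_{i}" for i in range(2, 11)]
commands = tree_commands("VAR_1", targets)
print("\n\n".join(commands))
```

The printed text can be pasted into a syntax window. Each command grows a separate tree, so the split points reported for VAR_1 will generally differ from one target to the next.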


Re: Preparing for decision tree

Melissa Ives
In reply to this post by Peter Spangler

I am not sure what your dataset actually looks like, but

CASESTOVARS
  /ID = ID
  /INDEX = grp
  /SEPARATOR = "_"
  /GROUPBY = VARIABLE.

changes this:

ID  grp  v1  v2
1   A    2   3
1   B    4   5
2   A    2   2
2   B    3   3

to this:

ID  v1_A  v1_B  v2_A  v2_B
1   2     4     3     5
2   2     3     2     3

VARSTOCASES would be used to go the other way (from horizontal/wide to vertical/long).

M
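
For anyone who wants to check the restructuring by hand, here is a rough Python sketch of the same long-to-wide step applied to the four-row example above. It is only an illustration of the idea, not how CASESTOVARS is implemented. The function name `cases_to_vars` and its arguments are hypothetical.

```python
def cases_to_vars(rows, id_key, index_key, separator="_"):
    """Collapse long-format rows into one dict per ID (wide format)."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[id_key], {id_key: row[id_key]})
        for name, value in row.items():
            if name not in (id_key, index_key):
                # New column name: original variable + separator + index value.
                record[f"{name}{separator}{row[index_key]}"] = value
    return list(wide.values())


long_rows = [
    {"ID": 1, "grp": "A", "v1": 2, "v2": 3},
    {"ID": 1, "grp": "B", "v1": 4, "v2": 5},
    {"ID": 2, "grp": "A", "v1": 2, "v2": 2},
    {"ID": 2, "grp": "B", "v1": 3, "v2": 3},
]

wide_rows = cases_to_vars(long_rows, "ID", "grp")
for row in wide_rows:
    print(row)
```

Each ID ends up as a single record with one column per variable and group combination. The order of the keys in each dict follows the input rows, so it will not necessarily match the column order that GROUPBY produces in SPSS.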

 

From: Peter Spangler [mailto:[hidden email]]
Sent: Wednesday, February 27, 2013 12:53 PM
To: Melissa Ives
Cc: [hidden email]
Subject: Re: Preparing for decision tree

 
