I am preparing for a decision tree and would like some advice on preparation. I have cases by numeric ID and ten continuous variables that are each a category. I would like to prepare a decision tree that predicts an ID's category by that variable's continuous value. My thought is to bin the continuous values then build a decision tree predictor from a new variable. Any thoughts?
Peter
|
You can build decision trees for continuous
targets and continuous predictors. Binning your continuous predictors may
yield a more readable tree result but will be less precise than leaving
them continuous since you lose information by binning.
Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]

From: Peter Spangler <[hidden email]>
Date: 02/27/2013 11:17 AM
Subject: Preparing for decision tree
Sent by: "SPSSX(r) Discussion" <[hidden email]>
|
In reply to this post by Peter Spangler
Also, the TREE procedure essentially bins
your continuous predictors based on what makes sense to the algorithm.
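
As a rough illustration of that point, here is a minimal syntax sketch assuming a CHAID tree, where scale predictors are grouped into intervals before the algorithm looks for splits. The variable names sales_target, sales_a, and sales_b are hypothetical, and the INTERVALS value shown is, as far as I know, the default rather than a recommendation.

```spss
* Hypothetical variable names; [s] declares a variable as scale (continuous).
TREE sales_target [s] BY sales_a [s] sales_b [s]
  /METHOD TYPE=CHAID
  /CHAID INTERVALS=10.
```

With TYPE=CRT or TYPE=QUEST the procedure should instead choose cut points on the continuous values directly, so pre-binning by hand is not needed in either case.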
Rick Oliver
|
In reply to this post by Rick Oliver-3
Because the categories I would like to predict are columns with row values for each case I cannot use them as the DV in the GUI. Would I be better off creating syntax?
|
I guess I don't really understand the problem.
The target (the thing you want to predict) can be categorical or continuous.
If it's categorical, the UI really wants the categories to have value labels,
but this is not absolutely required unless you want to specify a particular
category of interest, and you don't need value labels if you use syntax.
The Decision Trees case studies (Help menu > Case Studies)
might provide some useful information.
Rick Oliver
|
Do you need to run CASESTOVARS first to make the file horizontal/wide instead of vertical/long?
|
This would switch the variables to cases, yes? I could then use the cases to predict the variables themselves?
On Wed, Feb 27, 2013 at 10:47 AM, Melissa Ives <[hidden email]> wrote:
|
In reply to this post by Rick Oliver-3
More specifically, each of my 10 variables is a sales category with a value per case. I would like to predict the 10 variables based on the sales values per case.
For example: for IDs with XX sales in VAR_1, what bins would the tree algorithm give me for their sales in VAR_2 through VAR_10?
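
One possible reading of this question, sketched in syntax: treat VAR_2 as a scale target and VAR_1 as a scale predictor, and let the tree choose the cut points on VAR_1. The names VAR_1 and VAR_2 come from the post; the choice of CRT and the SAVE subcommand are assumptions made for illustration, not something the thread settles on.

```spss
* Sketch only, using the variable names from the post.
* One tree per target, so the command is repeated for VAR_3 to VAR_10.
TREE VAR_2 [s] BY VAR_1 [s]
  /METHOD TYPE=CRT
  /SAVE NODEID PREDVAL.
```

The saved terminal node number then acts as the "bin" of VAR_1 for each ID, and the saved predicted value is the node's mean VAR_2. On a small file the default minimum node sizes may prevent any split; the GROWTHLIMIT subcommand adjusts them.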
|
In reply to this post by Peter Spangler
I am not sure what your dataset actually looks like, but

    CASESTOVARS
      /ID = ID
      /INDEX = grp
      /SEPARATOR = "_"
      /GROUPBY = VARIABLE.

changes this:

    ID  grp  v1  v2
    1   A    2   3
    1   B    4   5
    2   A    2   2
    2   B    3   3

to this:

    ID  v1_A  v1_B  v2_A  v2_B
    1   2     4     3     5
    2   2     3     2     3

VARSTOCASES would be used to go the other way (from horizontal/wide to vertical/long).

M
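
For the reverse direction, a minimal sketch assuming the wide file has the variables v1_A, v1_B, v2_A, and v2_B from the CASESTOVARS example. The index here is numeric, which is a simplification chosen for this sketch.

```spss
* Sketch only: restructures the wide example file back to one row per ID and group.
* The new index variable grp is numeric here, with 1 for A and 2 for B.
VARSTOCASES
  /MAKE v1 FROM v1_A v1_B
  /MAKE v2 FROM v2_A v2_B
  /INDEX=grp(2)
  /KEEP=ID.
```

Each MAKE subcommand stacks one set of wide variables into a single long variable, so the result has the columns ID, grp, v1, and v2 again.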