Hello,
I'm starting to work with Clementine for modeling and I've got an issue with the CHAID model.

My population is not balanced on the variable to predict (Yes: 12% / No: 88%).

When I run the CHAID model (interactive) on a balanced sample (50% / 50%), I can't directly use the gain calculated by Clementine. How can I calculate the gain on the complete, unbalanced population?

When I run the CHAID model (interactive) on an unbalanced sample (12% / 88%), I can directly use the gain calculated by Clementine, but when I generate the rules, I only get rules that predict "No", so I can't validate the model (with the Evaluation graph or the Analysis output).

Can somebody explain how to manage an unbalanced population in order to build the model, calculate the gain, validate the model, and generate the rules?

Thanks,
Stephanie
Hi Stephanie,
I will keep this reply brief, because I recommend that you sign up for CLUG (Clementine User's Group). Only some of the members of this listserv will be familiar with Clementine. Simply google CLUG and you will be able to sign up. Reply to this list or to me if you have trouble doing so.

It sounds like you are using a balance node. If so, that is a good start. You can simply remove the balance node when you evaluate the model, and then you will be back to the original, unbalanced version of your data. It is true that you would then have the gains only for your train sample and test sample separately, but that is how I would do it. If you wanted the whole population, you could remove the partition node too, but I am not sure why you would want to.

The easiest way is to use an analysis node. It "knows" to run the train and test separately.

Good luck,
Keith
www.keithmccormick.com
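
To make the suggestion above concrete, here is a minimal sketch in plain Python (not Clementine code) of what "evaluating on the unbalanced data" amounts to: each record of the original data is assigned to a terminal node of the tree, and the gains are tabulated from the actual outcomes. The function name `gains_table`, the node labels, and the counts are all hypothetical and only illustrate the arithmetic.

```python
from collections import defaultdict

def gains_table(records):
    """Tabulate gains from (node_id, actual) pairs scored on the
    original, unbalanced data. `actual` is True for a "Yes" record."""
    counts = defaultdict(lambda: [0, 0])  # node_id -> [n_records, n_yes]
    for node_id, actual in records:
        counts[node_id][0] += 1
        counts[node_id][1] += int(actual)

    total_n = sum(n for n, _ in counts.values())
    total_yes = sum(yes for _, yes in counts.values())
    base_rate = total_yes / total_n  # overall "Yes" rate, e.g. 0.12

    rows = []
    for node_id, (n, yes) in counts.items():
        response = yes / n
        rows.append({
            "node": node_id,
            "n": n,
            "yes": yes,
            "response_pct": 100 * response,              # "Yes" rate in the node
            "gain_pct": 100 * yes / total_yes,           # share of all "Yes" captured
            "index_pct": 100 * response / base_rate,     # lift relative to base rate
        })
    # Best nodes first, as in a gains chart.
    rows.sort(key=lambda r: r["response_pct"], reverse=True)
    return rows

# Hypothetical unbalanced data: 1000 records, 12% "Yes", three terminal nodes.
records = (
    [("A", True)] * 40 + [("A", False)] * 60
    + [("B", True)] * 50 + [("B", False)] * 250
    + [("C", True)] * 30 + [("C", False)] * 570
)

for row in gains_table(records):
    print(row)
```

In this made-up data every node has a "Yes" rate below 50%, so a generated rule set would label all of them "No", yet the index still ranks node A well above the base rate. That is why the gains remain usable even when the predicted category is always "No".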
In reply to this post by Stephanie Baroux
You need to use the 'weight' option. I'm not sure whether Clementine has it, but it should. AnswerTree does have it.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Stephanie Baroux
Sent: Sunday, September 10, 2006 3:32 PM
To: [hidden email]
Subject: Clementine - validate CHAID model
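
As an illustration of the weighting idea in this reply, the sketch below shows the general arithmetic for mapping a "Yes" proportion observed in a 50% / 50% balanced sample back to the 12% / 88% population. It is plain Python and does not reproduce how Clementine or AnswerTree implement weights; the function names are hypothetical, and it assumes the balanced sample was drawn by resampling each class at random, independently of the predictors.

```python
def class_weights(population_rate, sample_rate):
    """Per-record weights that restore the population class mix.

    population_rate: share of "Yes" in the full population (e.g. 0.12)
    sample_rate:     share of "Yes" in the balanced sample (e.g. 0.50)
    """
    w_yes = population_rate / sample_rate
    w_no = (1 - population_rate) / (1 - sample_rate)
    return w_yes, w_no

def adjust_probability(p_sample, population_rate=0.12, sample_rate=0.50):
    """Convert a node's "Yes" proportion in the balanced sample into
    the proportion expected in the unbalanced population."""
    w_yes, w_no = class_weights(population_rate, sample_rate)
    weighted_yes = p_sample * w_yes
    weighted_no = (1 - p_sample) * w_no
    return weighted_yes / (weighted_yes + weighted_no)

# With 12% / 88% and a 50% / 50% sample, the weights are 0.24 and 1.76.
print(class_weights(0.12, 0.50))

# A node that is 80% "Yes" in the balanced sample corresponds to
# roughly 35% "Yes" in the population.
print(round(adjust_probability(0.80), 4))
```

A node at exactly 50% "Yes" in the balanced sample maps back to the 12% base rate, which is a quick sanity check on the weights.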