Automatic Linear Modeling: Identifying Predictor Variables?

Vik Rubenfeld
I've been running Automatic Linear Modeling with Objective: Model stability (bagging). The model viewer reports that it created 10 Models, ranging in accuracy from 0% to 71.5%.  But the Model Viewer appears to provide no way of seeing what the predictor variables are in the most accurate model.  It lists a Predictor Frequency, but that doesn't necessarily appear to be the same thing.

Is there a way to identify the predictor variables used in the most accurate model generated by ALM?

Thanks in advance to all for any info.

Best,


-Vik

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Automatic Linear Modeling: Identifying Predictor Variables?

Jon K Peck
You have to really want this information to get this :-(  My impression is that most users of ALM, for better or worse, don't really want such details.  But you can get this information if you are persistent.

There are two ways to get all the details.  One is to use the option on the Model Options tab to save the PMML for the model(s), which produces a zip file with details for each model.  The other is to go to Component Model Details, pick out the highest-accuracy model (you can sort by accuracy using the column headings), right-click, and export just the PMML for that model.

Either way, you wind up with an XML file or files that have all the model details for scoring purposes.  In there, you will find text like this:

<GeneralRegressionModel algorithmName="ALM" functionName="regression" modelName="CSGLM" modelType="generalLinear" targetVariableName="salary">
    <Extension extender="spss.com" name="modelID" value="8"/>
    <MiningSchema>
        <MiningField name="educ"/>
        <MiningField name="jobtime"/>
        <MiningField name="salary" usageType="predicted"/>
    </MiningSchema>

If there are categorical variables, you will also find the mapping for those.
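
If you would rather not read the XML by eye, the field names can be pulled out with a few lines of Python. This is only a minimal sketch using the standard library: the fragment is modeled on the excerpt above and closed so that it parses on its own, and the helper names (local_name, model_fields) are made up for illustration. A real exported file wraps the model in a PMML root element and normally declares an XML namespace, which is why the tag prefix is stripped before comparing.

```python
import xml.etree.ElementTree as ET

# Fragment modeled on the excerpt above, with the closing tag added so it
# parses standalone. A real export has more content around and inside it.
PMML_FRAGMENT = """
<GeneralRegressionModel algorithmName="ALM" functionName="regression"
    modelName="CSGLM" modelType="generalLinear" targetVariableName="salary">
  <Extension extender="spss.com" name="modelID" value="8"/>
  <MiningSchema>
    <MiningField name="educ"/>
    <MiningField name="jobtime"/>
    <MiningField name="salary" usageType="predicted"/>
  </MiningSchema>
</GeneralRegressionModel>
"""


def local_name(tag):
    """Drop a '{namespace}' prefix, if present, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def model_fields(xml_text):
    """Collect the target and predictor names from all MiningField elements.

    Assumes the text holds a single model, as in a per-model export.
    """
    root = ET.fromstring(xml_text)
    target, predictors = None, []
    for elem in root.iter():
        if local_name(elem.tag) != "MiningField":
            continue
        # PMML treats a missing usageType as "active" (an input field).
        usage = elem.get("usageType", "active")
        if usage in ("predicted", "target"):
            target = elem.get("name")
        elif usage == "active":
            predictors.append(elem.get("name"))
    return target, predictors


target, predictors = model_fields(PMML_FRAGMENT)
print("target:", target)
print("predictors:", predictors)
```

For the zip file produced by the Model Options tab, the same function could be applied to each XML member in turn; the layout of that archive is not described here, so check what it actually contains first.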


At that point, the best bet is to rerun that particular model with the regular regression procedure (or similar) in order to get readable output.  The PMML file follows the PMML standard so that it can be used for scoring; it is not meant to be read by humans, and the format is quite complex.
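
As a rough illustration of that rerun step, the sketch below turns a target and a list of predictor names into REGRESSION syntax that enters all predictors at once. The function name regression_syntax is hypothetical, and the variable names are the ones from the XML excerpt. It assumes every predictor is a scale variable; categorical predictors would call for a different procedure or dummy coding.

```python
def regression_syntax(target, predictors):
    """Build SPSS REGRESSION syntax that enters all predictors in one block."""
    if not predictors:
        raise ValueError("at least one predictor is required")
    return (
        "REGRESSION\n"
        f"  /DEPENDENT {target}\n"
        f"  /METHOD=ENTER {' '.join(predictors)}."
    )


# Variable names taken from the PMML excerpt in this thread.
print(regression_syntax("salary", ["educ", "jobtime"]))
```

Keep in mind that each bagged component model is fit to a bootstrap resample, so a rerun on the full data shows the same predictors in readable form but will not necessarily reproduce that component's coefficients.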

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Vik Rubenfeld <[hidden email]>
To:        [hidden email],
Date:        09/11/2013 01:35 PM
Subject:        [SPSSX-L] Automatic Linear Modeling: Identifying Predictor Variables?
Sent by:        "SPSSX(r) Discussion" <[hidden email]>



