Re: Automatic Linear Modeling: Identifying Predictor Variables?
Posted by Jon K Peck on Sep 11, 2013; 8:14pm
URL: http://spssx-discussion.165.s1.nabble.com/Automatic-Linear-Modeling-Identifying-Predictor-Variables-tp5721919p5721921.html
You have to really want this information to get it :-( My impression is
that most users of ALM, for better or worse, don't want such details. But
you can get them if you are persistent.
There are two ways to get all the details. One is to use the option on
the Model Options tab to save the PMML for the model(s), which produces a
zip file with details for each model. The other is to go to the Component
Model Details, pick out the highest-accuracy model (you can sort by
accuracy using the column headings), right-click it, and export the PMML
for just that model.
Either way, you wind up with an XML
file or files that have all the model details for scoring purposes. In
there, you will find text like this:
<GeneralRegressionModel algorithmName="ALM" functionName="regression"
    modelName="CSGLM" modelType="generalLinear" targetVariableName="salary">
  <Extension extender="spss.com" name="modelID" value="8"/>
  <MiningSchema>
    <MiningField name="educ"/>
    <MiningField name="jobtime"/>
    <MiningField name="salary" usageType="predicted"/>
  </MiningSchema>
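Here the MiningField entries without usageType="predicted" (educ and
jobtime) are the predictors, and salary is the target. If there are
categorical variables, you will also find the mapping for those.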
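Reading the predictors off by eye gets tedious with ten bagged models, so here is a minimal Python sketch of how the MiningSchema could be pulled out of the exported PMML. It uses only the standard library. The function names, the enclosing PMML root and closing tags in the sample, and the guess that the files inside the zip end in ".xml" are my assumptions, not something ALM documents. Tag names are compared without any namespace prefix because a real PMML file may declare a default namespace.

```python
import xml.etree.ElementTree as ET
import zipfile


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def model_fields(pmml):
    """Return (target, predictors) for the first GeneralRegressionModel.

    `pmml` is the PMML document as str or bytes.
    """
    root = ET.fromstring(pmml)
    for elem in root.iter():
        if local_name(elem.tag) != "GeneralRegressionModel":
            continue
        target = elem.get("targetVariableName")
        predictors = []
        for child in elem.iter():
            if local_name(child.tag) != "MiningField":
                continue
            # A MiningField with no usageType defaults to "active" in PMML;
            # anything else (e.g. "predicted") is not a predictor.
            if child.get("usageType", "active") == "active":
                predictors.append(child.get("name"))
        return target, predictors
    return None, []


def fields_per_file(zip_source):
    """Map each XML member of the saved zip to its (target, predictors).

    Assumption: the per-model PMML files in the zip have an .xml extension.
    """
    results = {}
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            if member.lower().endswith(".xml"):
                results[member] = model_fields(archive.read(member))
    return results


# The excerpt from above, with a root element and closing tags added
# (hypothetical) so that it parses as a complete document.
SAMPLE_PMML = """
<PMML>
  <GeneralRegressionModel algorithmName="ALM" functionName="regression"
      modelName="CSGLM" modelType="generalLinear" targetVariableName="salary">
    <Extension extender="spss.com" name="modelID" value="8"/>
    <MiningSchema>
      <MiningField name="educ"/>
      <MiningField name="jobtime"/>
      <MiningField name="salary" usageType="predicted"/>
    </MiningSchema>
  </GeneralRegressionModel>
</PMML>
"""

print(model_fields(SAMPLE_PMML))  # ('salary', ['educ', 'jobtime'])
```

For the zip produced from the Model Options tab, fields_per_file("models.zip") (file name hypothetical) would give one entry per component model, which you can then match against the model with the highest accuracy in the viewer.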
At that point, the best bet is to rerun that particular model with the
regular regression procedure or similar in order to get readable output.
The PMML file is not meant to be read by humans: it follows the PMML
standard so that it can be used for scoring, and that format is quite
complex.
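As a rough illustration of the rerun step, the sketch below builds a REGRESSION command from a target and a list of predictors, using the salary, educ and jobtime names from the PMML excerpt. The helper name is hypothetical, and it assumes every predictor is continuous; a model with categorical predictors would need a different procedure. Because ALM may prepare the data and fits each bagged component model to its own resample, the coefficients from a plain rerun need not match the component model exactly.

```python
def regression_syntax(target, predictors):
    """Build a minimal SPSS REGRESSION command as a string.

    Hypothetical helper: assumes all predictors are continuous and that
    default REGRESSION settings are acceptable.
    """
    if not predictors:
        raise ValueError("at least one predictor is required")
    return (
        "REGRESSION\n"
        f"  /DEPENDENT {target}\n"
        f"  /METHOD=ENTER {' '.join(predictors)}."
    )


print(regression_syntax("salary", ["educ", "jobtime"]))
```

The printed text can be pasted into a syntax window and run against the same dataset.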
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621
From: Vik Rubenfeld <[hidden email]>
To: [hidden email]
Date: 09/11/2013 01:35 PM
Subject: [SPSSX-L] Automatic Linear Modeling: Identifying Predictor Variables?
Sent by: "SPSSX(r) Discussion" <[hidden email]>
I've been running Automatic Linear Modeling with Objective:
Model stability (bagging). The Model Viewer reports that it created 10
models, ranging in accuracy from 0% to 71.5%. But the Model Viewer
appears to provide no way of seeing what the predictor variables are in
the most accurate model. It lists a Predictor Frequency, but that
doesn't appear to be the same thing.
Is there a way to identify the predictor variables used in the most accurate
model generated by ALM?
Thanks in advance to all for any info.
Best,
-Vik