exporting regression equation

exporting regression equation

Adam Troy
Hi all,

Does anyone know how to export a regression equation to the output?  I'm looking to cross-validate a regression model from one dataset on another and score the cases in that separate dataset.  For example, below is the (logistic) regression equation that I had to build by hand, copying and pasting each coefficient cell from the regression table in the output into the equation.  Is there any SPSS function that will do this automatically?

COMPUTE probscore = EXP(-1.8837110889221 + -0.0610812604199899 *
  educationlevel + 0.38059306773648 * military + -0.262865101225782 * cross +
  0.36407564076912 * southsa + -0.622317432758787 * Canada +
 -0.342604201465917 * doctorate + 0.195794807986137 * counseling +
 -0.932893661626645 * distanceeducation + -1.1392796442106 * healthservices +
  0.434347821351045 * humanservices + -1.89066122627975 * technologymanagement
  + -0.498233972265128 * Business + -1.07565245282589 * MKCode +
  0.152397590137721 * WinNT6OS + -0.00727179535309702 * dayofweek2 +
 -1.98753437730861 * LiveCareer + 0.326307767213455 * DanRosenfeld +
 -1.88329168706132 * snagajob + -1.12664042058232 * FindTuition +
 -0.227924359312289 * CPAdeal + 0.966410333846172 * emailideal) .
EXECUTE .


Thanks,

Adam
Re: exporting regression equation

Bruce Weaver
Administrator
If you are trying to save the predicted probabilities for a cross-validation data set, do the following:

1. Merge (via ADD FILES) the original data set with the new (cross-validation) data set.  Use the /IN sub-command to create a flag variable telling you which data set each case belongs to.  E.g.,

* Assuming the original dataset is open & active .

ADD FILES
 file = * /
 file = 'cvdata' / in = crossval .
EXE.

2. If the outcome variable exists in the new data set, compute a copy of the outcome variable, but only for the original data set.  E.g.,

if NOT crossval DVcopy = dv.

This step is not necessary if the DV does not exist in the cross-validation data set (or is missing for all cases in the cross-validation data).

3. Run your model using the copy of the outcome variable, and save the predicted probabilities from the model.  By using the copy of the outcome variable, you ensure that only the original data set is used for building the model; but predicted probabilities will be saved for all cases in the file.
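Steps 1 and 2 can be sketched outside SPSS as follows. This is only an illustration in plain Python, not SPSS syntax: the two tiny datasets, the predictor name x, and the helper stack_with_flag are hypothetical, and the model fitting in step 3 is left to whatever procedure you use.

```python
# Hypothetical miniature datasets: each case is a dict and "dv" is the outcome.
original = [{"x": 1.0, "dv": 0}, {"x": 2.0, "dv": 1}]
cvdata = [{"x": 1.5, "dv": 1}, {"x": 3.0, "dv": 0}]


def stack_with_flag(original_cases, crossval_cases):
    """Mimic ADD FILES with /IN=crossval: stack both files and flag the source."""
    stacked = []
    for flag, cases in ((0, original_cases), (1, crossval_cases)):
        for case in cases:
            row = dict(case)        # copy so the input data are not modified
            row["crossval"] = flag  # 0 = original data, 1 = cross-validation data
            # Mimic "IF NOT crossval DVcopy = dv": the copy stays missing (None)
            # for cross-validation cases, so they cannot influence the fit.
            row["DVcopy"] = row.get("dv") if flag == 0 else None
            stacked.append(row)
    return stacked


merged = stack_with_flag(original, cvdata)

# Only cases with a non-missing DVcopy would be used to estimate the model;
# every case in `merged` can still receive a predicted value afterwards.
fit_cases = [row for row in merged if row["DVcopy"] is not None]
```

The point of the design is that the estimation sample is defined by the missing outcome copy rather than by a filter, so predictions are still produced for the flagged cross-validation cases.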

If you want fitted log-odds instead of (or in addition to) the predicted probabilities, they are easy enough to compute.  

compute log_odds = ln(predprob / (1 - predprob)).

One reason you might want the log-odds is that if you plot the data, things that are linear in the model will look linear in the plot.  With predicted probabilities, that will not be the case.
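A small numeric sketch of that point, in plain Python rather than SPSS syntax. The one-predictor model and its coefficients b0 and b1 are made up for illustration; equal steps in x give equal steps in log-odds but unequal steps in probability.

```python
import math


def inv_logit(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def logit(prob):
    """Convert a probability to log-odds, as in ln(p / (1 - p))."""
    return math.log(prob / (1.0 - prob))


# Hypothetical one-predictor model: log-odds = b0 + b1 * x
b0, b1 = -1.0, 0.5
xs = [0, 1, 2, 3, 4]
log_odds = [b0 + b1 * x for x in xs]
probs = [inv_logit(z) for z in log_odds]

# Change per unit step in x on each scale.
log_odds_steps = [round(b - a, 10) for a, b in zip(log_odds, log_odds[1:])]
prob_steps = [round(b - a, 10) for a, b in zip(probs, probs[1:])]

print(log_odds_steps)  # constant steps: linear in x
print(prob_steps)      # steps vary: not linear in x
```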

And don't forget, you've got the CROSSVAL flag variable (1=cross-validation data, 0 = original data) you can use to separate the two data sets.

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Re: exporting regression equation

Adam Troy
Thanks Bruce.  I have used this trick in the past.  The problem is that I'm working with very large datasets, and merging them and rerunning the model often locks up the computer for a while.  I was hoping for a shortcut, and we also want to include the equation in the report.

Thanks,

Adam

On Fri, May 28, 2010 at 11:42 AM, Bruce Weaver <[hidden email]> wrote:
=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: exporting regression equation

Bruce Weaver
Administrator
Oh, I see.  How about using OMS to send the table of coefficients to a data set when you run the original model?  That data set will contain all of the coefficients & variable names you need.  With a bit of data management & some string functions, you should be able to cobble together the terms you need for computing your equation, and write them out to a text file with WRITE OUTFILE.  Then use INCLUDE (or INSERT) FILE to run that syntax on your cross-validation dataset.
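The "cobbling" step could look roughly like the sketch below. It is written in plain Python rather than with SPSS string functions, and it assumes the coefficient table has already been reduced to (term, B) pairs; that layout, the label "Constant", the helper build_scoring_syntax, and the variable name logit_score are all assumptions for illustration. Note that EXP of the linear predictor gives the odds, so the generated syntax converts to a probability with 1 / (1 + EXP(-logit)).

```python
# Hypothetical coefficient table as (term, B) pairs. The first three values are
# taken from the equation in the original question; the layout is assumed.
coefficients = [
    ("Constant", -1.8837110889221),
    ("educationlevel", -0.0610812604199899),
    ("military", 0.38059306773648),
]


def build_scoring_syntax(rows, target="probscore", constant_label="Constant"):
    """Return SPSS COMPUTE syntax (as text) for a logistic scoring equation."""
    intercept = 0.0
    terms = []
    for name, b in rows:
        if name == constant_label:
            intercept = b
        else:
            # Fixed-point formatting avoids exponent notation in the syntax.
            terms.append(f"({b:.15f} * {name})")
    linear = " +\n    ".join([f"{intercept:.15f}"] + terms)
    return (
        f"COMPUTE logit_score = {linear}.\n"
        f"COMPUTE {target} = 1 / (1 + EXP(-logit_score)).\n"
        "EXECUTE.\n"
    )


syntax = build_scoring_syntax(coefficients)
print(syntax)
```

The resulting text could be saved to a file and run against the cross-validation dataset with INSERT FILE, and the same text can be pasted into the report.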


Re: exporting regression equation

Jon K Peck
In reply to this post by Adam Troy
Or combine OMS with a little Python and get away from writing out text files and "cobbling".
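As a rough idea of what the Python side might look like, here is a sketch that scores cases directly from a coefficient mapping, with no intermediate text file. It uses plain Python only and does not call the SPSS Python integration; the coefficient values, the key "Constant", and the helper score_case are hypothetical.

```python
import math

# Hypothetical coefficients keyed by predictor name; "Constant" is the intercept.
coefs = {"Constant": -1.88, "military": 0.38, "doctorate": -0.34}


def score_case(case, coefs, constant_key="Constant"):
    """Return the predicted probability for one case (a dict of predictors)."""
    z = coefs[constant_key]
    for name, b in coefs.items():
        if name != constant_key:
            z += b * case[name]         # accumulate the linear predictor
    return 1.0 / (1.0 + math.exp(-z))   # log-odds -> probability


# Two made-up cross-validation cases.
cases = [{"military": 1, "doctorate": 0}, {"military": 0, "doctorate": 1}]
probs = [score_case(c, coefs) for c in cases]
```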

Regards,
Jon Peck
(From Florence)
-----------------
Sent from my BlackBerry Handheld.


----- Original Message -----
From: Bruce Weaver [[hidden email]]
Sent: 05/28/2010 11:47 AM MST
To: [hidden email]
Subject: Re: [SPSSX-L] exporting regression equation



Re: exporting regression equation

Art Kendall
In reply to this post by Adam Troy
If you are using very large data sets, three things that might work are:
1) If you have the server version of SPSS, see the option to save the XML for scoring the other set.
2) Create a data set with only the variables involved in any of the computations, MATCH just those data, and use the /IN option to create a flag for subsetting the data.  It still may use less of your time than using options you are not used to.
3) The equation itself is there in the output, so hardcode the equation in syntax.
compute yhat = constant + (b1*x1) + (b2*x2) ... .
compute yresid = y-yhat.
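A worked instance of those two COMPUTE lines, sketched in plain Python with made-up numbers; the names constant, slopes, and predict are hypothetical. For a logistic model like the one in the original question, this sum is on the log-odds scale and would still need converting to a probability.

```python
def predict(constant, slopes, case):
    """yhat = constant + b1*x1 + b2*x2 + ... for one case."""
    return constant + sum(b * case[name] for name, b in slopes.items())


# Hypothetical coefficients and one case with an observed outcome y.
constant = 2.0
slopes = {"x1": 0.5, "x2": -1.0}
case = {"x1": 4.0, "x2": 1.0, "y": 3.5}

yhat = predict(constant, slopes, case)  # 2.0 + 0.5*4.0 - 1.0*1.0
yresid = case["y"] - yhat               # observed minus predicted
```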

Art Kendall
Social Research Consultants

On 5/28/2010 12:19 PM, Adam B. Troy wrote: