predicting from a known sample


predicting from a known sample

Maguin, Eugene
I've analyzed a small dataset to develop a multiple regression type equation for a DV. From that equation, I can get a coefficient table, an R squared, and an error of measurement. Typical things. What I'd now like to do is to test the regression equation in a predictive sense, using records drawn from a population having a multivariate distribution whose parameters are defined by the statistics of the original small dataset. What do I hope to gain from this? I want to get an idea of the expected distribution of the predicted values of the DV. I guess that this might be a standard sort of problem, in which one develops a regression equation for a collected sample and then applies that regression equation to new cases to make a prediction.

The question is how to do this. This is an area that I've never encountered before, so I don't know what it might be called. I'm open to suggestions of readings to do, examples, where I should be asking this question, etc. And, if this is an example of crazy, incomplete thinking, well, I'd like to know, but tell me off-list.

Thanks, Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: predicting from a known sample

David Marso
Administrator
If you have your coefficients (coef.sav) and a set of simulated sample data (sample.sav), something like this should do it.
OFF THE TOP OF MY HEAD !!!
---
* coef: k x 1 column vector. data: n x k, with a column of 1s if coef includes the constant.
MATRIX.
GET coef /FILE='coef.sav' /VARIABLES=ALL.
GET data /FILE='sample.sav' /VARIABLES=ALL.
COMPUTE pred=data*coef.
SAVE {data,pred} /OUTFILE=*.
END MATRIX.
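
For anyone who wants to see what that matrix product amounts to, here is a small Python sketch (not SPSS syntax) of the same calculation. The coefficient values and the two cases are made up for illustration, and the helper name predict is hypothetical. It assumes the first coefficient is the constant, so each case carries a leading 1.

```python
def predict(rows, coef):
    """Return one predicted value per case: the dot product of the case with coef."""
    preds = []
    for row in rows:
        if len(row) != len(coef):
            raise ValueError("each case needs one value per coefficient")
        preds.append(sum(x * b for x, b in zip(row, coef)))
    return preds


# Hypothetical coefficients: constant first, then slopes for two predictors.
coef = [2.0, 0.5, -1.0]

# Hypothetical cases; the leading 1.0 multiplies the constant.
rows = [
    [1.0, 4.0, 1.0],
    [1.0, 10.0, 3.0],
]

preds = predict(rows, coef)
print(preds)
```

The length check mirrors the conformability requirement of the matrix product: the number of columns in the data has to match the number of coefficients.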

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: predicting from a known sample

Jon K Peck
This sounds like a perfect fit for the simulation feature in V21.  It will generate data for the predictors based on the in-sample data or from other specified distributions; it will use the PMML file saved from Regression or the other supported procedures and do Monte Carlo replications of the simulation.  You get lots of useful charts and statistics for the results.

If you are still on an older version and want to generate the new distributions by hand, you can still use the Applymodel function in Compute to generate the predicted values.  Applymodel used to be limited to Statistics Server, but it is now available in the Client.
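
As a rough idea of what generating the new distributions by hand could look like, here is a minimal Monte Carlo sketch in Python (not SPSS syntax, and not a description of how the V21 simulation feature works internally). It assumes the predictors are multivariate normal with the sample means and covariance matrix, and it treats the coefficients as fixed, so it ignores the uncertainty in estimates that come from a small sample. All numbers and the names cholesky and simulate_predictions are hypothetical.

```python
import math
import random
import statistics


def cholesky(cov):
    """Lower-triangular L with L * L^T == cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def simulate_predictions(means, cov, intercept, slopes, resid_sd, n, seed=0):
    """Draw n cases of correlated normal predictors and apply the equation.

    With resid_sd = 0 the result is the distribution of predicted values;
    with resid_sd > 0 residual noise is added to mimic new observed DV values.
    """
    rng = random.Random(seed)
    L = cholesky(cov)
    k = len(means)
    out = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Correlate the independent normals: x = means + L * z.
        x = [means[i] + sum(L[i][j] * z[j] for j in range(i + 1)) for i in range(k)]
        y_hat = intercept + sum(b * xi for b, xi in zip(slopes, x))
        out.append(y_hat + rng.gauss(0.0, resid_sd))
    return out


# Hypothetical sample statistics and regression results.
means = [50.0, 10.0]
cov = [[100.0, 12.0], [12.0, 4.0]]
draws = simulate_predictions(means, cov, intercept=2.0, slopes=[0.5, -1.0],
                             resid_sd=3.0, n=20000)

print("mean:", round(statistics.fmean(draws), 2))
print("sd:  ", round(statistics.stdev(draws), 2))
```

Sorting the draws and reading off percentiles gives an approximate interval for the simulated values. Whether multivariate normality is a reasonable assumption depends on the predictors.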


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621




From:        David Marso <[hidden email]>
To:        [hidden email],
Date:        11/08/2012 02:54 PM
Subject:        Re: [SPSSX-L] predicting from a known sample
Sent by:        "SPSSX(r) Discussion" <[hidden email]>



