Multiple Curve Fit & Parameter Capture

Matt Freeman
I'm trying to conduct a trend analysis on 3,000 variables, one variable at
a time.  For each variable, I'm fitting an exponential curve and wish to
capture the coefficient b1. Below is my Syntax:

* Macro to create group of independent variables
define Policy_Vars ()
X12419 to X28510
!enddefine.

OMS
   / SELECT TABLES
   / IF COMMANDS = ['Curve Fit']
         SUBTYPES = ['Model Summary and Parameter Estimates']
   /DESTINATION FORMAT = SAV
   OUTFILE = 'C:\Documents and Settings\mrfreeman\Desktop\OMS_trial_3.sav'.
TSET NEWVAR=NONE.
CURVEFIT
  /VARIABLES=Policy_Vars WITH Year
  /MODEL=EXPONENTIAL
  /PLOT = NONE.
OMSEND.

A couple of questions have arisen:
  (1) I can only fit about 900 curves in a single execution.  What, then,
is the source of this limitation and can it be increased?
  (2) Is there a way to restrict the data written in the outfile command
to only b1 and not the rest of the Model Summary and Parameter Estimates
table?

Also, if you can recommend a better way to perform this task, please do.

Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: Multiple Curve Fit & Parameter Capture

Jon K Peck


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621




From:        Matt Freeman <[hidden email]>
To:        [hidden email],
Date:        01/03/2013 06:25 AM
Subject:        [SPSSX-L] Multiple Curve Fit & Parameter Capture
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




I'm trying to conduct a trend analysis on 3,000 variables, one variable at
a time.  For each variable, I'm fitting an exponential curve and wish to
capture the coefficient b1. Below is my Syntax:

* Macro to create group of independent variables
define Policy_Vars ()
X12419 to X28510
!enddefine.

>>>The macro isn't doing much for you here,  but maybe you need to use it elsewhere, too.

OMS
  / SELECT TABLES
  / IF COMMANDS = ['Curve Fit']
        SUBTYPES = ['Model Summary and Parameter Estimates']
  /DESTINATION FORMAT = SAV
  OUTFILE = 'C:\Documents and Settings\mrfreeman\Desktop\OMS_trial_3.sav'.
TSET NEWVAR=NONE.
CURVEFIT
 /VARIABLES=Policy_Vars WITH Year
 /MODEL=EXPONENTIAL
 /PLOT = NONE.
OMSEND.

A couple of questions have arisen:
 (1) I can only fit about 900 curves in a single execution.  What, then,
is the source of this limitation and can it be increased?

>>>What do you see when this limit is exceeded?  If you are running out of memory, it could be due to OMS accumulating results or to CURVEFIT itself.  Try it without OMS to see if you get further.  Be aware that CURVEFIT uses listwise deletion of missing values, so if you have many of those, that could be a problem with so many variables.

 (2) Is there a way to restrict the data written in the outfile command
to only b1 and not the rest of the Model Summary and Parameter Estimates
table?

>>>OMS works only on entire objects.  You can prune by working on the dataset, but that would have to come afterwards.

You can, of course, construct several datasets with subsets of the variables and then merge them.

Also, if you can recommend a better way to perform this task, please do.

>>>You could short circuit the computations if you just want the coefficients by rolling a matrix procedure, but memory might still be an issue.

Thank you.

Re: Multiple Curve Fit & Parameter Capture

David Marso
Administrator
"Also, if you can recommend a better way to perform this task, please do.
>>>You could short circuit the computations if you just want the coefficients by rolling a matrix procedure, but memory might still be an issue."

I second this!  
Since Matt didn't bother to indicate the sample size or whether there are missing values, the use of MATRIX might be either utterly trivial or relatively difficult.
There are ways!
Since X is a single unchanging vector (Year), the model becomes, in linear form:

Ln(Y) = B1*Year

B1 = (1/(Year'Year)) * Year'Ln(Y).
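
For readers who want the step behind that formula, here is a short derivation sketch. The symbols t_i (Year for case i), y_i (the variable's value for case i) and S (the sum of squared residuals) are notation introduced here, not part of the original post. It assumes a fit through the origin on the log scale, exactly as the linear form above is written; CURVEFIT's EXPONENTIAL model, Y = b0*exp(b1*t), also estimates a constant, so the two slopes will generally differ.

```latex
\begin{aligned}
S(b_1) &= \sum_i \bigl(\ln y_i - b_1 t_i\bigr)^2 \\
\frac{dS}{db_1} &= -2 \sum_i t_i \bigl(\ln y_i - b_1 t_i\bigr) = 0 \\
b_1 &= \frac{\sum_i t_i \ln y_i}{\sum_i t_i^2}
\end{aligned}
```

The numerator and denominator are plain sums over cases, which is why a single pass over the data is enough for every variable at once.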

-----
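In fact, this could be readily done with AGGREGATE!
* Squared predictor, summed later for the denominator.
COMPUTE Year2=Year*Year.
* Overwrite each variable with Ln(V)*Year, the numerator term.
DO REPEAT V=Policy_Vars.
COMPUTE V=LN(V)*Year.
END REPEAT.
* Constant break variable so AGGREGATE collapses the file to one case.
COMPUTE NoBreak=1.
AGGREGATE OUTFILE=* /BREAK=NoBreak /Year2 Policy_Vars=SUM(Year2 Policy_Vars).
* Each variable now holds SUM(Ln(V)*Year) divided by SUM(Year*Year).
DO REPEAT V=Policy_Vars.
COMPUTE V=V/Year2.
END REPEAT.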
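
As a cross-check outside SPSS, the arithmetic of the AGGREGATE approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `slope_through_origin` and the series names `X_a` and `X_b` are hypothetical, and the data are made up. Unlike the AGGREGATE code, which sums Year2 over all cases, this sketch skips missing or non-positive values per variable in both the numerator and the denominator.

```python
import math

def slope_through_origin(years, values):
    """Return sum(t * ln(y)) / sum(t * t), or None if no usable cases.

    Cases where y is missing (None) or not positive are skipped, because
    ln(y) is undefined for them.
    """
    numerator = 0.0
    denominator = 0.0
    for t, y in zip(years, values):
        if y is None or y <= 0:
            continue
        numerator += t * math.log(y)
        denominator += t * t
    if denominator == 0:
        return None
    return numerator / denominator

# Hypothetical example data: two series that follow exp(b1 * t) exactly.
years = [1, 2, 3, 4, 5]
series = {
    "X_a": [math.exp(0.3 * t) for t in years],
    "X_b": [math.exp(-0.1 * t) for t in years],
}
slopes = {name: slope_through_origin(years, ys) for name, ys in series.items()}
```

Because the example series have no constant term, the through-origin slope recovers the growth rate used to generate them; with a constant other than 1 it would not.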







Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Multiple Curve Fit & Parameter Capture

Matt Freeman
Thanks guys!  I'm really struggling with this problem.  David - I tried your
Aggregate code and it didn't work.  However, it did force me back into the
nitty-gritty of parameter estimates and I believe I have it figured out.  The
issue I still have, then, is capturing the results from each aggregation.
Since I'm looping through the variables with the DO REPEAT command, all that's
left at the end is the result from the last aggregation.  So my question is
this: How can I capture the results from each aggregation (at each iteration
of the DO REPEAT) so I can calculate the parameter estimate?

I've been RTFM and I've discovered the INPUT PROGRAM command and it seems like
I might be able to "create cases from groups of cases" in an input file.  Any
thoughts?

Thanks. mf

Re: Multiple Curve Fit & Parameter Capture

David Marso
Administrator
Please define 'didn't work'.
There is a single aggregation and it hammers out the required summary stats to do all gazillion estimates at once.  So, your remaining questions/issues make absolutely no sense.
INPUT PROGRAM has no relevance to the current discussion so I have no thoughts to share on whatever it seems you think you have discovered.
Re: Multiple Curve Fit & Parameter Capture

Matt Freeman
On Sat, 5 Jan 2013 07:49:10 -0800, David Marso <[hidden email]>
wrote:

>Please define 'didn't work'.
>There is a single aggregation and it hammers out the required summary stats to do all gazillion estimates at once.  So, your remaining questions/issues make absolutely no sense.
>INPUT PROGRAM has no relevance to the current discussion so I have no thoughts to share on whatever it seems you think you have discovered.

David,

Thanks for your prompt reply.  You are certainly correct about the aggregation - I tried a small example from home and it works exactly as you described. I obviously messed something up at work.

When I said "it didn't work", I meant that I wasn't getting the parameter estimate as a direct output of your algorithm.  However, I believe that your algorithm only provides certain pieces of the parameter-estimate puzzle and I will have to fill in the rest.

I do appreciate your time and effort.  Thank you.

Matt

Re: Multiple Curve Fit & Parameter Capture

David Marso
Administrator
Ah, something about an intercept ;-)
You can either adapt the AGGREGATE to also get Sum(Year), Sum(Ln(Y)) and N and cobble together the OLS estimates for simple regression, or do it as follows.

** Simulate some data **.
INPUT PROGRAM.
LOOP T=1 TO 100.
DO REPEAT Y=Y1 TO Y10 / B=.1 .2 .3 .4 .5 .6 .7 .8 .9 1 /C=1 2 3 4 5 6 7 8 9 10.
COMPUTE Y=C*EXP(B*T)+ NORMAL(.2).
END REPEAT.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXE.
** To verify results from Regression **.
CURVEFIT /VARIABLES=y1 TO y10 WITH t  /MODEL=EXPONENTIAL .

** Reshape data from Wide to Long **.
VARSTOCASES /ID = id /MAKE Y FROM y1 TO y10 /INDEX = Index(10) /KEEP = t.
SORT CASES BY INDEX T .
SPLIT FILE BY INDEX.

COMPUTE LY=LN(Y).
REGRESSION / DEP LY / ENTER T / OUTFILE COVB ("C:\TEMP\COVB.sav").
GET FILE  "C:\TEMP\COVB.sav".
SELECT IF ROWTYPE_="EST".
COMPUTE Const_2=EXP(const_).
EXE.
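
The other option mentioned above, cobbling together the OLS estimates from sums, can be sketched as follows. This is a minimal Python illustration rather than SPSS syntax; the function name `fit_exponential` and the example data are hypothetical. It assumes the model y = b0*exp(b1*t), fitted by ordinary least squares on Ln(y), and needs only N, Sum(t), Sum(t*t), Sum(Ln(y)) and Sum(t*Ln(y)) for each variable.

```python
import math

def fit_exponential(t_values, y_values):
    """Fit y = b0 * exp(b1 * t) by OLS on ln(y), using only running sums.

    Returns (b0, b1), or None when fewer than two usable cases exist or
    t does not vary.
    """
    n = 0
    sum_t = sum_tt = sum_ly = sum_tly = 0.0
    for t, y in zip(t_values, y_values):
        if y is None or y <= 0:
            continue  # ln(y) is undefined; drop the case for this variable only
        ly = math.log(y)
        n += 1
        sum_t += t
        sum_tt += t * t
        sum_ly += ly
        sum_tly += t * ly
    denom = n * sum_tt - sum_t ** 2
    if n < 2 or denom == 0:
        return None
    b1 = (n * sum_tly - sum_t * sum_ly) / denom   # slope on the log scale
    ln_b0 = (sum_ly - b1 * sum_t) / n             # intercept on the log scale
    return math.exp(ln_b0), b1

# Hypothetical noise-free example: y = 2 * exp(0.5 * t) for t = 1..10.
t = list(range(1, 11))
y = [2.0 * math.exp(0.5 * ti) for ti in t]
b0, b1 = fit_exponential(t, y)
```

Back-transforming the intercept with exp() mirrors the Const_2=EXP(const_) step in the syntax above.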