I'm trying to conduct a trend analysis on 3,000 variables, one variable at a time. For each variable, I'm fitting an exponential curve and wish to capture the coefficient b1. Below is my syntax:

* Macro to create group of independent variables.
define Policy_Vars ()
  X12419 to X28510
!enddefine.

OMS
  /SELECT TABLES
  /IF COMMANDS = ['Curve Fit']
      SUBTYPES = ['Model Summary and Parameter Estimates']
  /DESTINATION FORMAT = SAV
      OUTFILE = 'C:\Documents and Settings\mrfreeman\Desktop\OMS_trial_3.sav'.

TSET NEWVAR=NONE.
CURVEFIT
  /VARIABLES=Policy_Vars WITH Year
  /MODEL=EXPONENTIAL
  /PLOT = NONE.
OMSEND.

A couple of questions have arisen:

(1) I can only fit about 900 curves in a single execution. What is the source of this limitation, and can it be increased?

(2) Is there a way to restrict the data written in the outfile command to only b1 and not the rest of the Model Summary and Parameter Estimates table?

Also, if you can recommend a better way to perform this task, please do.

Thank you.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: Matt Freeman <[hidden email]>
To: [hidden email]
Date: 01/03/2013 06:25 AM
Subject: [SPSSX-L] Multiple Curve Fit & Parameter Capture
Sent by: "SPSSX(r) Discussion" <[hidden email]>

I'm trying to conduct a trend analysis on 3,000 variables, one variable at a time. For each variable, I'm fitting an exponential curve and wish to capture the coefficient b1. Below is my Syntax:

* Macro to create group of independent variables.
define Policy_Vars ()
  X12419 to X28510
!enddefine.

>>>The macro isn't doing much for you here, but maybe you need to use it elsewhere, too.

OMS
  /SELECT TABLES
  /IF COMMANDS = ['Curve Fit']
      SUBTYPES = ['Model Summary and Parameter Estimates']
  /DESTINATION FORMAT = SAV
      OUTFILE = 'C:\Documents and Settings\mrfreeman\Desktop\OMS_trial_3.sav'.
TSET NEWVAR=NONE.
CURVEFIT
  /VARIABLES=Policy_Vars WITH Year
  /MODEL=EXPONENTIAL
  /PLOT = NONE.
OMSEND.

A couple of questions have arisen:

(1) I can only fit about 900 curves in a single execution. What, then, is the source of this limitation and can it be increased?

>>>What do you see when this limit is exceeded? If you are running out of memory, it could be due to OMS accumulating results or to CURVEFIT itself. Try it without OMS to see if you get further. Be aware that CURVEFIT uses listwise deletion of missing values, so if you have many of those, that could be a problem with so many variables.

(2) Is there a way to restrict the data written in the outfile command to only b1 and not the rest of the Model Summary and Parameter Estimates table?

>>>OMS works only on entire objects. You can prune by working on the dataset, but that would have to come afterwards. You can, of course, construct several datasets with subsets of the variables and then merge them.

Also, if you can recommend a better way to perform this task, please do.

>>>You could short circuit the computations if you just want the coefficients by rolling a matrix procedure, but memory might still be an issue.

Thank you.
Administrator
"Also, if you can recommend a better way to perform this task, please do.
>>>You could short circuit the computations if you just want the coefficients by rolling a matrix procedure, but memory might still be an issue."

I second this! Since Matt didn't bother to indicate the sample size and whether there are missing values, the use of MATRIX might be either utterly trivial or relatively difficult. There are ways! Since X is a single unchanging vector (Year), the model becomes linear once you take logs (written here with no intercept term):

Ln(Y) = B1*Year
B1 = (Year'Year)^-1 * Year'Ln(Y)

-----
In fact this could be readily done with AGGREGATE!

COMPUTE Year2=Year*Year.
DO REPEAT V=Policy_Vars.
COMPUTE V=Ln(V)*Year.
END REPEAT.
COMPUTE NoBreak=1.
AGGREGATE OUTFILE=*
  /BREAK=NoBreak
  /Year2 Policy_Vars=SUM(Year2 Policy_Vars).
DO REPEAT V=Policy_Vars.
COMPUTE V=V/Year2.
END REPEAT.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
"Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
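For readers who want to check the arithmetic behind the AGGREGATE approach outside SPSS, here is a minimal sketch in Python. It is not part of the original post: the function name `slope_through_origin` and the toy data are hypothetical, and it assumes the no-intercept form Ln(Y) = B1*Year given above. It also shows that this form recovers b1 only when the multiplicative constant of the exponential is 1.

```python
import math

def slope_through_origin(year, y):
    """No-intercept least-squares slope: sum(Year * ln(Y)) / sum(Year**2)."""
    sum_xy = sum(t * math.log(v) for t, v in zip(year, y))
    sum_xx = sum(t * t for t in year)
    return sum_xy / sum_xx

# Hypothetical toy data, chosen only to illustrate the formula.
year = [1, 2, 3, 4, 5]

# Case 1: Y = 1 * exp(0.3 * Year). The constant is 1, so ln(Y) passes through the origin.
y_unit = [math.exp(0.3 * t) for t in year]
b1_unit = slope_through_origin(year, y_unit)

# Case 2: Y = 2 * exp(0.3 * Year). ln(Y) now has intercept ln(2), which the
# no-intercept formula absorbs into the slope.
y_scaled = [2.0 * math.exp(0.3 * t) for t in year]
b1_scaled = slope_through_origin(year, y_scaled)

print(round(b1_unit, 6), round(b1_scaled, 6))
```

In the second case the estimate is inflated by ln(2)*sum(Year)/sum(Year**2), so a fit that should match CURVEFIT's exponential model would also need an intercept term.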
Thanks guys! I'm really struggling with this problem. David - I tried your Aggregate code and it didn't work. However, it did force me back into the nitty-gritty of parameter estimates and I believe I have it figured out. The issue I still have, then, is capturing the results from each aggregation. Since I'm looping through the variables with the DO REPEAT command, all that's left at the end is the result from the last aggregation. So my question is this: How can I capture the results from each aggregation (at each iteration of the DO REPEAT) so I can calculate the parameter estimate?

I've been RTFM and I've discovered the INPUT PROGRAM command and it seems like I might be able to "create cases from groups of cases" in an input file. Any thoughts?

Thanks. mf
Administrator
Please define 'didn't work'.
There is a single aggregation and it hammers out the required summary stats to do all gazillion estimates at once. So, your remaining questions/issues make absolutely no sense.
INPUT PROGRAM has no relevance to the current discussion so I have no thoughts to share on whatever it seems you think you have discovered.
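To make the "single aggregation" point concrete, here is a hedged sketch in Python rather than SPSS. It is an illustration, not the poster's code: `aggregate_slopes` and the column names `X1` to `X3` are hypothetical. One pass over the cases accumulates sum(Year**2) and one sum(Year*ln(Y)) per variable, so the result is a single row holding one slope per variable instead of one result per loop iteration.

```python
import math

def aggregate_slopes(year, columns):
    """One pass over the cases; returns {name: sum(Year*ln(Y)) / sum(Year**2)}."""
    sum_year2 = 0.0
    sum_xy = {name: 0.0 for name in columns}
    for i, t in enumerate(year):
        sum_year2 += t * t
        for name, values in columns.items():
            # Each variable keeps its own running total, as SUM() does in AGGREGATE.
            sum_xy[name] += t * math.log(values[i])
    return {name: total / sum_year2 for name, total in sum_xy.items()}

# Hypothetical toy data: three variables, each exactly exp(b * Year).
year = [1, 2, 3, 4]
columns = {
    "X1": [math.exp(0.1 * t) for t in year],
    "X2": [math.exp(0.2 * t) for t in year],
    "X3": [math.exp(0.5 * t) for t in year],
}

slopes = aggregate_slopes(year, columns)
print({name: round(b, 6) for name, b in slopes.items()})
```

Nothing is overwritten between variables: each slope sits in its own column of the aggregated row, which is why no per-iteration capture is needed.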
On Sat, 5 Jan 2013 07:49:10 -0800, David Marso <[hidden email]> wrote:

>Please define 'didn't work'.
>There is a single aggregation and it hammers out the required summary stats to do all gazillion estimates at once. So, your remaining questions/issues make absolutely no sense.
>INPUT PROGRAM has no relevance to the current discussion so I have no thoughts to share on whatever it seems you think you have discovered.

David,

Thanks for your prompt reply. You are certainly correct about the aggregation - I tried a small example from home and it works exactly as you described. I obviously messed something up at work.

When I said "it didn't work", I meant that I wasn't getting the parameter estimate as a direct output of your algorithm. However, I believe that your algorithm only provides certain pieces of the parameter-estimate puzzle and I will have to fill in the rest.

I do appreciate your time and effort. Thank you.

Matt
Administrator
Ah, something about an intercept ;-)
You can either adapt the aggregate to also get Sum(Ln(Y)) and N and cobble together the OLS for simple regression, or do it as follows.

** Simulate some data **.
INPUT PROGRAM.
LOOP T=1 TO 100.
DO REPEAT Y=Y1 TO Y10
  /B=.1 .2 .3 .4 .5 .6 .7 .8 .9 1
  /C=1 2 3 4 5 6 7 8 9 10.
COMPUTE Y=C*EXP(B*T)+ NORMAL(.2).
END REPEAT.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXE.

** To verify results from Regression **.
CURVEFIT /VARIABLES=y1 TO y10 WITH t /MODEL=EXPONENTIAL.

** Reshape data from Wide to Long **.
VARSTOCASES
  /ID = id
  /MAKE Y FROM y1 TO y10
  /INDEX = Index(10)
  /KEEP = t.
SORT CASES BY INDEX T.
SPLIT FILE BY INDEX.
COMPUTE LY=LN(Y).
** One regression of LY on T per split group, with estimates saved to a file **.
REGRESSION
  / DEP LY
  / ENTER T
  / OUTFILE COVB ("C:\TEMP\COVB.sav").
GET FILE "C:\TEMP\COVB.sav".
SELECT IF ROWTYPE_="EST".
** Back-transform the log-scale intercept to the multiplicative constant **.
COMPUTE Const_2=EXP(const_).
EXE.
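The first option mentioned above (extending the aggregation and assembling the simple-regression OLS by hand) can be sketched as follows. This is an illustrative Python version under stated assumptions, not the poster's syntax: the function name `exp_fit_from_sums` and the toy data are hypothetical. It assumes the model Y = b0*exp(b1*Year) fitted by OLS on Ln(Y), and it uses five aggregated quantities: N, Sum(Year), Sum(Year**2), Sum(Ln(Y)) and Sum(Year*Ln(Y)).

```python
import math

def exp_fit_from_sums(n, sum_x, sum_xx, sum_ly, sum_xly):
    """Simple OLS of ln(Y) on Year computed from aggregated sums.

    Returns (b0, b1) for the model Y = b0 * exp(b1 * Year).
    """
    b1 = (n * sum_xly - sum_x * sum_ly) / (n * sum_xx - sum_x * sum_x)
    intercept = (sum_ly - b1 * sum_x) / n   # intercept on the log scale
    return math.exp(intercept), b1

# Hypothetical toy data: Y = 2 * exp(0.3 * Year), with no noise.
year = [1, 2, 3, 4, 5]
log_y = [math.log(2.0 * math.exp(0.3 * t)) for t in year]

# The five quantities an extended aggregation would need to deliver.
n = len(year)
sum_x = sum(year)
sum_xx = sum(t * t for t in year)
sum_ly = sum(log_y)
sum_xly = sum(t * ly for t, ly in zip(year, log_y))

b0, b1 = exp_fit_from_sums(n, sum_x, sum_xx, sum_ly, sum_xly)
print(round(b0, 6), round(b1, 6))
```

Because Year is the same for every variable, N, Sum(Year) and Sum(Year**2) are shared when there are no missing values, and only the two sums involving Ln(Y) have to be computed per variable. With missing values, the shared sums would have to be computed per variable over its valid cases.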