|
I am working with a very detailed dataset that is creating various permutations of up to four combinations of up to 72 variables. Lots of LOOP statements are combined into this giant run. Many thanks to Jan for her brilliant code!
Unfortunately things fall apart after about 12 to 15 variables. More precisely, nothing actually fails, but the SPSS environment gets exceedingly slow, to the point where I have no way of knowing whether the run is going to finish or not.
I have tried a few variations to see what happens. I think I was able to get 30 variables to work after about one hour. I tried running it in SPSS 12 instead of SPSS 17, since 17 sometimes gets a little clunky in weird ways. My main hope with Version 12 was to raise MXMEMORY to something much larger. I recall doing that many years ago for similarly complex jobs, and it helped greatly when working with exceedingly large data files.
Unfortunately SPSS V 12 says the command is outdated. I also remember a conversation with SPSS Technical Support many years ago in which they said that the newer versions of SPSS manage such memory issues automatically.
Is there an alternative to SET MXMEMORY? Does anyone have any suggestions on how to truly speed up this very basic code that creates an enormous amount of data? It is simple but it is a lot.
I have posted below the code that I am trying to make work. [Again, thanks Jan!] You do not need a data file to run it.
I look forward to your responses. Thank you very much.
Zachary
* Code that is simpler but it takes the limits that I need.
* SET MXMEMORY = 56000.
* EXECUTE.
SET MPRINT=no.
define permut3 (outfile = !charend('|') / afterby = !charend('|') / statvars = !charend('|') / byvars = !cmdend ).
* count byvars and create quoted variables.
!let !countstr = ""
!let !kv = ""
!do !i !in (!byvars)
!let !countstr = !concat(!countstr, "x")
!let !kv = !concat(!kv ," ", !quote(!i))
!doend
!let !cnt = !length(!countstr) .
* create a SPSS data file with the permutations in rows.
input program.
- vector r (!cnt, f1).
- loop #m = 1 TO !cnt.
-   loop #n = 1 to 2**(!cnt - 1).
-     loop #i = 1 to !cnt .
-       do if (#i = #m).
-         compute r(#i) = 2.
-       else.
-         compute r(#i) = rnd((-1)**(trunc((#n-1)/(2**(#i-1-(#i > #m)) ))) + 1) / 2.
-       end if.
-     end loop.
-     end case.
-   end loop.
- end loop.
- end file.
end input program.
!let !rmax = !concat("r",!cnt )
compute #s = sum(r1 to !rmax ).
select if #s > 2 and #s <= 2 + !afterby.
* because the 0000 and 1111 permutations do not give sense here.
exe.
* create the needed commands.
vector r = r1 to !rmax .
string command (a250).
compute command = concat("PARTIAL CORR /VARIABLES = ", !quote(!statvars)).
do repe i = 1 to !cnt / v = !kv .
- if r(i) = 2 command = concat(rtrim(command), " ", v).
end repe.
compute command = concat(rtrim(command), " BY").
do repe i = 1 to !cnt / v = !kv .
- if r(i) = 1 command = concat(rtrim(command), " ", v).
end repe.
compute command = concat(rtrim(command), ".").
WRITE OUTFILE = !quote(!outfile) /command .
exe.
!enddefine.
permut3 outfile = C:/temp2/partcorr.sps | afterby = 2 | statvars = cmpi7 | byvars = cta_1 cta_2 cta_3 cta_4 cta_5 cta_6 cta_7 cta_8 cta_9 cta_10 cta_11 cta_12 cta_13 cta_14 cta_15 cta_16 cta_17 cta_18 cta_19 cta_20.
|
Since you have this in a macro, I assume that the overall task is something you want to do many times. In that case you need to generate the file of 24 million permutations only once.
I am not sure what you mean by your first sentence. Are you saying that you want to generate all of the approximately 1 million combinations of 72 variables taken 4 at a time (order is not important), and then try all 24 permutations of those 4 (order is important)? If you want about 24 million "instances", why not just go with the approximately 24 million permutations of 72 variables taken 4 at a time?
Please explain the context in which this arises. It appears that you are trying 24 million partial correlation procedures! What is the goal of the analysis? What is the nature of the variables? Are you sure that the order of the variables is important? Are you trying to do some kind of "poor man's factor analysis" or stepwise regression?
Also, are you using as input a 73 * 73 correlation matrix, or are you passing the raw data for every run?
Art Kendall
Social Research Consultants
On 10/14/2010 11:34 AM, Zachary Feinstein wrote:
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
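The counts quoted in the reply above can be checked with a few lines of Python. This is only an arithmetic sketch; the variable names are illustrative and not part of anyone's SPSS code.

```python
from math import comb, factorial, perm

n_vars, k = 72, 4

combos = comb(n_vars, k)     # unordered selections of 4 variables out of 72
orderings = factorial(k)     # ways to order each selection of 4
perms = perm(n_vars, k)      # ordered selections of 4 variables out of 72

print(f"{combos:,} combinations x {orderings} orderings = {perms:,} permutations")
# prints: 1,028,790 combinations x 24 orderings = 24,690,960 permutations
```

So "about 1 million" combinations and "about 24 million" permutations are the same set counted without and with order.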
|
Administrator
|
In reply to this post by Zachary Feinstein
Zachary,
Your assumption that it is due to a lack of memory is quite premature. You need to take a good look at what is happening here internal to the macro and not just blindly run it! What is likely happening is that after the gigantic data set is generated and SPSS slavishly begins to fire off one PARTIAL CORR after another, your Output document becomes so huge that SPSS moans in pain, screams WTF and gives you the ghost in the machine.
------------------------------------
- loop #m = 1 TO !cnt.
-   loop #n = 1 to 2**(!cnt - 1).
-     loop #i = 1 to !cnt .
-       do if (#i = #m).
-         compute r(#i) = 2.
-       else.
-         compute r(#i) = rnd((-1)**(trunc((#n-1)/(2**(#i-1-(#i > #m)) ))) + 1) / 2.
-       end if.
-     end loop.
-     end case.
-   end loop.
- end loop.
- end file.
20 * 2**19 = 10,485,760 cases, each with 20 variables = 209,715,200 data elements.
Of course it's going to take a VERY VERY long time, and MEMORY is NOT the issue at all (you have at most 20 variables being used). Trivial amount of memory involved!
I believe you REALLY need to rethink your strategy. If I were to attempt to implement something like this, I certainly would not be building it with RAW data (i.e. I see no MATRIX=IN(file) ) and would NOT be running it as an individual procedure for each combination. IF I were to try this (big IF here) I would approach it using the MATRIX language and use the SWEEP operator.
SWEEP: Perform sweep transformation
See:
http://tinyurl.com/2dzg74d
http://lib.stat.cmu.edu/apstat/178
OTOH, I can't think of why, or what, you are trying to achieve here. Certainly a brute force approach is really out of the question. What are you going to do with all of this after all is said and done and you fill up your hard disk and end up completely bald? What is your research question? If you can answer this then maybe you can find a more appropriate solution.
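To put numbers on the loop analysis above, here is a rough Python sketch (not SPSS, and not anyone's posted solution). It assumes the macro is read correctly as: one focal variable flagged 2, every 0/1 pattern over the remaining variables, then a SELECT IF that keeps only rows with 1 to afterby flags set. The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def rows_generated(cnt):
    # loop #m runs cnt times, loop #n runs 2**(cnt-1) times, one case per pair
    return cnt * 2 ** (cnt - 1)


def rows_kept(cnt, afterby):
    # a case survives SELECT IF when 1..afterby of the other cnt-1 flags equal 1
    return cnt * sum(comb(cnt - 1, j) for j in range(1, afterby + 1))


def direct_commands(statvar, byvars, afterby):
    """Yield only the surviving command strings, skipping the full grid."""
    for focal in byvars:
        others = [v for v in byvars if v != focal]
        for size in range(1, afterby + 1):
            for controls in combinations(others, size):
                yield (f"PARTIAL CORR /VARIABLES = {statvar} {focal} "
                       f"BY {' '.join(controls)}.")


byvars = [f"cta_{i}" for i in range(1, 21)]
print(rows_generated(20), rows_kept(20, 2))
print(next(direct_commands("cmpi7", byvars, 2)))
```

Under that reading, the 20-variable call builds 10,485,760 cases to keep 3,800 of them, and the generated count doubles with every added variable while the kept count grows only polynomially. Enumerating the wanted subsets directly, as `direct_commands` does, avoids the exponential step, although it does nothing about the cost of running each command afterwards.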
In a nutshell: Dealing with HUGE combinatorial problems using brute-force naive approaches is doomed to an utterly epic failure in the ugliest fashion imaginable.
HTH, David
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
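As a rough illustration of the sweep idea mentioned in the post above, here is a minimal pure-Python sketch. It is not the SPSS MATRIX SWEEP function itself, only the same transformation written out by hand; the function names and the correlation values are made up for illustration. Sweeping a correlation matrix on the control variables leaves the residual (co)variances in the unswept rows and columns, so a partial correlation can be read off without touching the raw data.

```python
import math


def sweep(a, k):
    """Return a copy of square matrix `a` swept on pivot index k."""
    d = a[k][k]
    if abs(d) < 1e-12:
        raise ValueError("pivot is numerically zero (collinear variables?)")
    n = len(a)
    b = [row[:] for row in a]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                b[i][j] = -1.0 / d
            elif i == k or j == k:
                b[i][j] = a[i][j] / d
            else:
                # residual covariance after removing variable k
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return b


def partial_corr(r, x, y, controls):
    """Partial correlation of variables x and y given the control indices."""
    s = r
    for k in controls:
        s = sweep(s, k)
    return s[x][y] / math.sqrt(s[x][x] * s[y][y])


# Hypothetical 4 x 4 correlation matrix, for illustration only.
r4 = [[1.0, 0.5, 0.3, 0.2],
      [0.5, 1.0, 0.4, 0.1],
      [0.3, 0.4, 1.0, 0.6],
      [0.2, 0.1, 0.6, 1.0]]
print(partial_corr(r4, 0, 1, [2]))      # controlling for variable 2
print(partial_corr(r4, 0, 1, [2, 3]))   # controlling for variables 2 and 3
```

The appeal for a problem like this one is that the correlation matrix is computed once, and each subset of control variables then costs only a few small sweeps instead of a full pass over the data.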
|
In reply to this post by David Marso
I have found with v19 that when I come back to a file that I have left open all night on my PC, I cannot get SPSS to come back up. The headings will display but it will not show me the data; I have to exit v19 and start it again. This is a PITA because I sometimes have many temporary files open that I just don't feel like saving. This happens every time. I found another bug with search and replace that I will post later.
|
I do leave SPSS 19 running overnight, or even open in a hibernating notebook (which means saving the desktop and turning the notebook off), with several data files open, and it resurrects without a glitch. Sometimes it takes its time to show the data or the syntax, but it eventually gets there.
Hector
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On behalf of Lombardo, Barbara
Sent: Friday, October 15, 2010 11:21 AM
To: [hidden email]
Subject: SPSS 19
