A few people have e-mailed me about this and I thought I would provide some insight into what I am trying to accomplish. I am basically trying to compute Kruskal's averaging-over-orderings statistic. In a very basic sense it is the average, over all permutations, of the partial correlations between multiple sets of variables and a single dependent variable. Once the partial correlations are squared and averaged, this metric gives an indication of how much variability each attribute accounts for.
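As a rough illustration of the statistic described above, here is a minimal Python sketch (Python rather than SPSS syntax, purely for compactness). It assumes the importance of an attribute is the mean of its squared partial correlations with the dependent variable, taken over every control set of up to `max_controls` other attributes. The correlation matrix `R`, the function names, and that exact averaging rule are illustrative assumptions, not the procedure from the post.

```python
from itertools import combinations
from math import sqrt


def pcorr(r, i, j, controls):
    """Partial correlation of variables i and j given `controls` (recursive formula)."""
    if not controls:
        return r[i][j]
    rest, z = controls[:-1], controls[-1]
    rij = pcorr(r, i, j, rest)
    riz = pcorr(r, i, z, rest)
    rjz = pcorr(r, j, z, rest)
    return (rij - riz * rjz) / sqrt((1 - riz ** 2) * (1 - rjz ** 2))


def mean_squared_partial(r, dep, attrs, max_controls):
    """Average squared partial correlation of each attribute with `dep`."""
    scores = {}
    for a in attrs:
        others = [v for v in attrs if v != a]
        squared = [
            pcorr(r, dep, a, c) ** 2
            for k in range(1, max_controls + 1)
            for c in combinations(others, k)
        ]
        scores[a] = sum(squared) / len(squared)
    return scores


# Made-up correlation matrix: index 0 is the dependent variable, 1..3 are attributes.
R = [
    [1.0, 0.5, 0.4, 0.3],
    [0.5, 1.0, 0.3, 0.2],
    [0.4, 0.3, 1.0, 0.1],
    [0.3, 0.2, 0.1, 1.0],
]

scores = mean_squared_partial(R, dep=0, attrs=[1, 2, 3], max_controls=2)
for attr, value in scores.items():
    print(attr, round(value, 4))
```

Only a correlation matrix is needed as input, so the raw data would be read once rather than once per PARTIAL CORR command.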
Now I know it has been done tons of times in market research literature for limited sets of variables. And it has provided very good results of "statistical drivers" or "statistical importances." I have never seen this tried on so many variables as what I am attempting now.
Also note that the code does not actually do all possible combinations. The intention is for it to try all possible combinations of the correlation part of a Partial Correlation but the stuff that is covarying in the models is limited to either 2 or 3 variables at a time. I would be crazy to try any more than that. Maybe I am a little crazy already with this endeavor. But there is a lot of research showing how the 2nd level and certainly the 3rd level converges to what the all permutation model provides. Perhaps someone has a suggestion on how to modify the code so it does not literally spend time on all of the possible permutations behind the scenes and really does focus on the 2-way combinations or the 3-way combinations.
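On the question of skipping the full enumeration: one option is to generate only the control sets of the wanted size directly, instead of building every 0/1 pattern and filtering afterwards. A minimal Python sketch under that assumption follows. `partial_corr_commands` and `command_count` are hypothetical helper names, and the variable names are taken from the macro call in this thread.

```python
from itertools import combinations
from math import comb


def partial_corr_commands(statvar, byvars, max_controls):
    """Yield one PARTIAL CORR command per (target, control set) pair."""
    for target in byvars:
        others = [v for v in byvars if v != target]
        for k in range(1, max_controls + 1):
            for controls in combinations(others, k):
                yield (
                    f"PARTIAL CORR /VARIABLES = {statvar} {target} "
                    f"BY {' '.join(controls)}."
                )


def command_count(n_vars, max_controls):
    """Number of commands produced for n_vars attributes."""
    return n_vars * sum(comb(n_vars - 1, k) for k in range(1, max_controls + 1))


byvars = [f"cta_{i}" for i in range(1, 21)]
commands = list(partial_corr_commands("cmpi7", byvars, max_controls=2))

print(len(commands), command_count(20, 2))
print(commands[0])
```

The work here grows with the number of commands actually wanted (polynomial in the number of variables for a fixed control-set size), not with 2 to the power of the variable count. The resulting lines could be written to a syntax file such as the `partcorr.sps` used in the post.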
One person suggested using the SWEEP algorithm using Matrix Algebra. I will be honest in that it has been about 20 years since I played around with SWEEP algorithms in GAUSS to greatly speed up Newman Keuls parameter estimations but I have no idea how I might apply it here.
Lastly, I have heard stories of how the procedure I am trying in SPSS may be done with VBA. But it has been many years since I have regularly used VBA too.
Thank you for the suggestions and questions. Any and all help is greatly valued and appreciated. Note that ultimately I want to try this for 45 or even 72 variables. I can live with just 45 though.
Zachary
----- Forwarded Message ----
From: Zachary Feinstein <[hidden email]>
To: [hidden email]
Sent: Thu, October 14, 2010 10:34:19 AM
Subject: Alternative to SET MXMEMORY =

I am working with a very detailed dataset that is creating various permutations of up to four combinations of up to 72 variables. Lots of LOOP statements are combined into this giant run. Many thanks to Jan for her brilliant code!
Unfortunately things fall apart after about 12 - 15 variables. They actually do not fall apart but the SPSS environment gets exceedingly slow and it gets to a point where I have no way of knowing if it is going to work or not.
I have tried some variations to see what would happen. I think I was able to get 30 variables to work after about one hour. I tried running it in SPSS 12 instead of SPSS 17, since 17 sometimes gets a little clunky in weird ways. My main hope with Version 12 was to raise MXMEMORY to something much larger. I recall doing that many years ago for such complex issues, and it greatly helped when working with exceedingly large data files.
Unfortunately SPSS V 12 says the command is outdated. And I remember a conversation with SPSS Technical Support many years ago where they said the newer versions of SPSS automatically control for such memory issues.
Is there an alternative to SET MXMEMORY? Does anyone have any suggestions on how to truly speed up this very basic code that creates an enormous amount of data? It is simple but it is a lot.
And I have posted below the code that I am trying to make work. [Again- thanks Jan!] You do not need a data file to make this work.
I look forward to your responses. Thank you very much.
Zachary
* Code that is simpler but it takes the limits that I need.
* SET MXMEMORY = 56000.
* EXECUTE.
SET MPRINT=no.

define permut3 (outfile = !charend('|')
              / afterby = !charend('|')
              / statvars = !charend('|')
              / byvars = !cmdend ).

* count byvars and create quoted variables.
!let !countstr = ""
!let !kv = ""
!do !i !in (!byvars)
!let !countstr = !concat(!countstr, "x")
!let !kv = !concat(!kv ," ", !quote(!i))
!doend
!let !cnt = !length(!countstr) .

* create a SPSS data file with the permutations in rows.
input program.
- vector r (!cnt, f1).
- loop #m = 1 TO !cnt.
-   loop #n = 1 to 2**(!cnt - 1).
-     loop #i = 1 to !cnt .
-       do if (#i = #m).
-         compute r(#i) = 2.
-       else.
-         compute r(#i) = rnd((-1)**(trunc((#n-1)/(2**(#i-1-(#i > #m)) ))) + 1) / 2.
-       end if.
-     end loop.
-     end case.
-   end loop.
- end loop.
- end file.
end input program.
!let !rmax = !concat("r",!cnt )
compute #s = sum(r1 to !rmax ).
select if #s > 2 and #s <= 2 + !afterby.
* because the 0000 and 1111 permutations do not give sense here.
exe.

* create the needed commands.
vector r = r1 to !rmax .
string command (a250).
compute command = concat("PARTIAL CORR /VARIABLES = ", !quote(!statvars)).
do repe i = 1 to !cnt / v = !kv .
- if r(i) = 2 command = concat(rtrim(command), " ", v).
end repe.
compute command = concat(rtrim(command), " BY").
do repe i = 1 to !cnt / v = !kv .
- if r(i) = 1 command = concat(rtrim(command), " ", v).
end repe.
compute command = concat(rtrim(command), ".").
WRITE OUTFILE = !quote(!outfile) /command .
exe.
!enddefine.

permut3 outfile = C:/temp2/partcorr.sps
      | afterby = 2
      | statvars = cmpi7
      | byvars = cta_1 cta_2 cta_3 cta_4 cta_5 cta_6 cta_7 cta_8 cta_9 cta_10
                 cta_11 cta_12 cta_13 cta_14 cta_15 cta_16 cta_17 cta_18 cta_19 cta_20.
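To see why the run slows down so sharply, it helps to count what the INPUT PROGRAM in the macro produces. It issues one END CASE per (#m, #n) pair, so it builds cnt * 2**(cnt - 1) cases, and only afterwards does SELECT IF keep the rows with between 1 and afterby control variables. The Python sketch below just does that arithmetic; the function names are made up for illustration.

```python
from math import comb


def cases_generated(cnt):
    """Cases built by the nested #m / #n loops before any filtering."""
    return cnt * 2 ** (cnt - 1)


def cases_kept(cnt, afterby):
    """Cases that survive SELECT IF: 1..afterby controls per target variable."""
    return cnt * sum(comb(cnt - 1, k) for k in range(1, afterby + 1))


for cnt in (12, 15, 20, 30, 45, 72):
    print(cnt, cases_generated(cnt), cases_kept(cnt, 2))
```

For the 20 variables in the macro call, that is 10,485,760 cases generated to keep 3,800 of them. The generated count roughly doubles with each added variable, which fits the slowdown described in the post and suggests the bottleneck is the enumeration itself rather than a memory setting.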
Administrator
Zachary,
SWEEP: "...but I have no idea how I might apply it here."
Dust off your cranium and read the links I posted!
---
X'X | X'Y | X'Z                   (X'X) | Bxy      | Bxz
----------------                  ---------------------------
Y'X | Y'Y | Y'Z   ->SWEEP(X) ->   Bxy'  | Res(y|x) | ?Y'Z?
----------------                  ---------------------------
Z'X | Z'Y | Z'Z                   Bxz'  | ?Z'Y?    | Res(z|x)

I don't really believe it is necessary to elaborate considering I posted 2 links previously and SPSS Matrix language supplies an implementation. Review the math behind multiple regression and partial correlation. Consider the above in standardized form???
---
Still not clear on what is being controlled for and what is being addressed as a combinatorial issue.

VBA? TOO SLOW!!! You would really want a REAL programming language with matrix operations. And besides, Excel stat routines SUCK !!! So dream on with that idea.

This could certainly be implemented but you need to sit down and go at it with tweezers. Not a hammer!

"Now I know it has been done tons of times in market research literature for limited sets of variables."
ROLMFAO: Some of the most ridiculous shit in the universe rolls out of drunken minds of marketing research drones ;-) FOLLOW THE MONEY! Sorry, I couldn't resist!!! Like somehow APSS Regression is a good idea?? If anyone is offended then ... You know the drill...
-------------
HTH, David

On Thu, 14 Oct 2010 16:35:34 -0700, Zachary Feinstein <[hidden email]> wrote:
>A few people have e-mailed me about this and I thought I would provide some insight into what I am trying to accomplish. I am basically trying to perform a Kruskal Wallis (sp?) Average Over Ordering statistic. In a very basic sense it is the average of all permutations of partial correlations between multiple sets of variables versus a singular dependent variable. This metric then provides an indication of the variability for each of the attributes when they are squared and averaged.
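The SWEEP(X) diagram in the reply can be made concrete with a small sketch. Sweeping a correlation matrix on the control variables leaves residual (co)variances in the unswept block, and rescaling that block gives the partial correlations of the dependent variable with every remaining variable at once. The Python below is a plain-list illustration under those assumptions. It is not the SPSS MATRIX implementation mentioned in the reply, and the matrix `R` and the function names are invented for the example.

```python
from math import sqrt


def sweep(a, k):
    """Return a copy of the square matrix `a` swept on pivot row/column k."""
    n = len(a)
    d = a[k][k]
    if abs(d) < 1e-12:
        raise ValueError("pivot is (near) zero; variables may be collinear")
    # Outside row/column k: residuals after regressing on variable k.
    b = [[a[i][j] - a[i][k] * a[k][j] / d for j in range(n)] for i in range(n)]
    # Pivot row/column: sign conventions differ between texts; unused below.
    for j in range(n):
        b[k][j] = a[k][j] / d
        b[j][k] = a[j][k] / d
    b[k][k] = -1.0 / d
    return b


def partial_corrs_with(r, dep, controls):
    """Partial correlation of `dep` with each remaining variable, given `controls`."""
    s = r
    for k in controls:
        s = sweep(s, k)
    rest = [v for v in range(len(r)) if v != dep and v not in controls]
    return {v: s[dep][v] / sqrt(s[dep][dep] * s[v][v]) for v in rest}


# Made-up correlation matrix: index 0 is the dependent variable.
R = [
    [1.0, 0.5, 0.4, 0.3],
    [0.5, 1.0, 0.3, 0.2],
    [0.4, 0.3, 1.0, 0.1],
    [0.3, 0.2, 0.1, 1.0],
]

print(partial_corrs_with(R, dep=0, controls=[2]))
print(partial_corrs_with(R, dep=0, controls=[2, 3]))
```

One design consequence: each control set needs to be swept only once, and the result serves every target variable outside that set, so the number of sweeps follows the number of control sets rather than the number of (target, control set) pairs.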
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
