This is my first attempt at writing some Python code to work with SPSS. Big picture, I'm trying to set up a system to select cases meeting certain changing criteria (please note this is not for statistical purposes). All variables are binary. I'd like to sum a variable range, select the case with the largest sum, delete all variables set to 1 in the selected case, re-sum, take the highest number again, and so on.
I'm trying to break this down into baby steps I can handle. Here is the first piece:
1) Supply a text variable name (starting point)
2) Identify the index of that variable name
3) Select the variable AFTER that index (start of the binary variables)
4) Select the last variable in the dataset (end of the binary variables)
I'm going to be playing around with this, but if anyone has insight into the steps I'd be interested in knowing how you'd handle it.
Thanks!
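The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the variable names are held in an ordinary list (inside SPSS they could be collected with spss.GetVariableName(i) for i in range(spss.GetVariableCount())), and the function name binary_range and the sample names are hypothetical.

```python
def binary_range(names, anchor):
    """Return (anchor index, first binary variable, last binary variable).

    `names` is the dataset's variable names in file order; `anchor` is the
    variable that sits immediately before the block of binary variables.
    Matching ignores case, since SPSS variable names are case-insensitive.
    """
    lowered = [n.lower() for n in names]
    idx = lowered.index(anchor.lower())  # raises ValueError if not found
    if idx + 1 >= len(names):
        raise ValueError("no variables follow %r" % anchor)
    # Step 3: the variable after the anchor; step 4: the last variable.
    return idx, names[idx + 1], names[-1]


# Hypothetical dataset layout, for illustration only.
names = ["ID", "Region", "V001", "V002", "V003"]
index, first_var, last_var = binary_range(names, "region")
print(index, first_var, last_var)
```

The two names returned could then be joined as "V001 TO V003" for use in ordinary SPSS syntax.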
I suggest that you study the things that
the Dataset class can do. You might also want to read some of the
Python material in the Programming and Data Management book available from
the SPSS Community site.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621
From: Craig J <[hidden email]>
To: [hidden email]
Date: 11/16/2012 06:23 PM
Subject: [SPSSX-L] Python Question
Sent by: "SPSSX(r) Discussion" <[hidden email]>
Administrator
In reply to this post by Craig Johnson
Please reread your description and realize this is terribly vague.
What does "sum a variable range" mean?
What does "select the largest value" mean?
What does "delete out all variables set to 1 in the selected case" mean?
When do you decide to stop?
What is this supposed to achieve, i.e. what output?
Why are you presuming Python is the appropriate solution?
<edited: flipped the following two lines>
Have you looked at the SPSS MATRIX language?
See CSUM, RSUM, the : indexing operator, and LOOP / END LOOP control.
<edited: upped the ante with the home-brew ;-)>
I'll bet a home-brew that MATRIX will rip any Python solution to pieces WRT processing efficiency!
<edited: ADDED>
Have you considered RANK?
--
Realize that my ESPss and InterneTelepathy gifts are legendary; however, the signal is weak.
---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Please reread your description and realize this is terribly vague.
* Purposely vague.
What does "sum a variable range" mean?
* Compute Tot=Sum(VarA to VarZ).
What does "select the largest value" mean?
* Sort Cases by Tot (D). Select If $casenum=1.
What does "delete out all variables set to 1 in the selected case" mean?
* For the case at $casenum=1, delete every variable that is set to 1 for that case.
When do you decide to stop?
* When the range is null, i.e. no variables are left.
What is this supposed to achieve ie What output?
* A set of cases in which each binary variable equals 1 for at least one case. This is not a statistical operation.
Why are you presuming Python is the appropriate solution?
* It's possible it could be done with SPSS syntax. However, "appropriate solutions" are usually in the eye of the beholder. In this instance I'd like to use Python in order to start using the language.
See CSUM, RSUM, : indexing operator, LOOP END LOOP control .
* Familiar with all of these.
Have you looked at the SPSS MATRIX language?
* It's not a matrix.
I'll bet a MATRIX solution will rip any python solution to pieces WRT processing efficiency!
* I'm using a dual quad-core PC on roughly 50k to 500k cases. I'm not exactly worried about sucking up processing power from a mainframe. If it takes longer to run, that's fine, especially since it will only be run once.
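Read this way, the procedure is a greedy selection: pick the case with the largest remaining sum, drop the variables it covers, and repeat until nothing is left to cover. A minimal pure-Python sketch follows, assuming the data have already been read into a list of case IDs and a list of 0/1 rows; the name greedy_select and the sample data are hypothetical, and ties are broken by taking the first case in file order.

```python
def greedy_select(ids, rows):
    """Repeatedly pick the case with the largest remaining row sum, then
    drop every variable (column) that the picked case has set to 1.

    Returns a list of (case id, sum at the time it was picked).
    """
    remaining = set(range(len(rows[0]))) if rows else set()
    picked = []
    while remaining:
        # Sum each case over the columns that are still in play.
        sums = [sum(row[c] for c in remaining) for row in rows]
        # max() returns the first index among ties.
        best = max(range(len(rows)), key=lambda i: sums[i])
        if sums[best] == 0:
            break  # the leftover variables are 0 in every case
        picked.append((ids[best], sums[best]))
        remaining -= {c for c in remaining if rows[best][c] == 1}
    return picked


# Hypothetical data: four cases, five binary variables.
ids = [101, 102, 103, 104]
rows = [
    [1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0],
]
print(greedy_select(ids, rows))  # [(102, 3), (101, 1), (103, 1)]
```

Each pass recomputes every row sum, so the cost grows with the number of cases times the number of remaining variables per pass; for a one-off run on data of the size described that may well be acceptable, but it has not been timed here.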
Administrator
FWIW:
---
INPUT PROGRAM.
LOOP ID=1 TO 50000.
DO REPEAT V=V001 TO V100.
COMPUTE V=TRUNC(UNIFORM(2)).
END REPEAT.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXE.
SET WORKSPACE=500000.
SET MXLOOPS=1000000.
MATRIX .
GET ID /FILE */ VAR ID .
GET V /FILE */ VAR V001 TO V100.
COMPUTE #N=NROW(ID).
COMPUTE #P=NCOL(V).
COMPUTE EMPTY=MAKE(#N,1,0).
COMPUTE IDS={-9}.
COMPUTE MAXSUMS={-9}.
LOOP.
+ COMPUTE SumV=RSUM(V).
** I suspect the following line will bust python's caps **.
+ COMPUTE MAX=CMAX(SUMV).
+ COMPUTE #found=0.
+ DO IF (MAX GT 0).
+   LOOP #=1 TO #N.
+     DO IF (SumV(#) EQ MAX) AND NOT (#found).
+       COMPUTE #found=1.
+       COMPUTE IDS={IDS,ID(#)}.
+       COMPUTE MAXSUMS={MAXSUMS,MAX}.
+       LOOP ##=1 TO #P.
+         DO IF (V(#,##) EQ 1).
+           COMPUTE V(:,##)=EMPTY.
+         END IF.
+       END LOOP.
+     END IF.
+   END LOOP IF #found.
+ END IF.
END LOOP IF MAX=0.
COMPUTE IDS=IDS(2:NCOL(IDS)).
COMPUTE MAXSUMS=MAXSUMS(2:NCOL(MAXSUMS)).
PRINT IDS.
PRINT MAXSUMS.
END MATRIX.