DUMMY Variables...
--------------------
*G4 exists in the data file and has integer values between 1 and 4.
**COMMENTS??**.
RECODE G4 (1=1) INTO DX1
   / G4 (2=1) INTO DX2
   / G4 (3=1) INTO DX3
   / DX1 DX2 DX3 (MISSING=0).

OTOH: The following is more concise with large number of groups ;-)

NUMERIC DX1 TO DX4 (F1).
RECODE DX1 TO DX4 (ELSE=0).
VECTOR DX=DX1 TO DX4.
COMPUTE DX(G4)=1.

The following IMNSHO is abysmal.
DO IF G4=1.
+  COMPUTE DX1=1.
ELSE IF G4=2.
+  COMPUTE DX2=1.
ELSE IF G4=3.
+  COMPUTE DX3=1.
END IF.
RECODE DX1 TO DX4 (MISSING=0)(ELSE=COPY).
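For anyone who wants to check the logic of the VECTOR version outside SPSS, here is a minimal sketch in plain Python. It is an illustration only, not SPSS syntax: the function name is made up, and it assumes every group value is an integer from 1 to the number of levels.

```python
# Illustration only: plain Python, not SPSS syntax. Names are hypothetical.

def dummies_by_index(groups, n_levels):
    """Zero-fill one row per case, then set the single slot named by the group code."""
    rows = []
    for g in groups:
        row = [0] * n_levels   # plays the role of RECODE DX1 TO DX4 (ELSE=0)
        row[g - 1] = 1         # plays the role of COMPUTE DX(G4)=1 (1-based subscript)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in dummies_by_index([1, 2, 3, 4], 4):
        print(row)
```

The group code itself picks the dummy to switch on, so no comparisons are needed.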
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Or use the SPSSINC CREATE DUMMIES extension
command. It can do multiple variables in one command, doesn't require
specification of the categories, and it makes appropriate variable labels,
too. It can even do 2- and 3-way interaction terms.
Example:
SPSSINC CREATE DUMMIES VARIABLE=y ROOTNAME = ydummies
  /OPTIONS ORDER=A USEVALUELABELS=YES MACRONAME="!ydummies" OMITFIRST=YES.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: David Marso
Date: 02/01/2012 01:39 PM
Subject: [SPSSX-L] Interesting bit of code ;-)

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Interesting-bit-of-code-tp5448698p5448698.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
In reply to this post by David Marso
Also interesting-- DO REPEAT + TO keyword instead of multiple DO IF
statements:

new file.
data list free /g10 (f8.0).
begin data
1 2 3 4 5 6 7 8 9 10
End data.

Dataset name g10 window = front.
list.

do repeat x = dx1 to dx10 /y = 1 TO 10.
COMPUTE x = 0.
if g10 = y x = 1.
end repeat.
list.

Jim Marks
Director, Market Research
x1616
In reply to this post by Jon K Peck
I had a feeling ;-)
--- OTOH: I have been using SPSS for some 25+ years and had never thought to use a single RECODE to map a single variable onto 3 new ones and to then RECODE these new variables which are defined on the same recode to resolve missing values. Hats off to Tex Hull, Jon Fry, Bill Hoskins and I'm sure others who designed the elegant guts of this crazy program ;-))!
David Marso
In reply to this post by Marks, Jim
Or:

do repeat x = dx1 to dx10 /y = 1 TO 10.
COMPUTE x = (g10 EQ y).
end repeat.
list.
David Marso
In reply to this post by David Marso
I like DO-REPEAT for this task. It's very transparent, I think, and not significantly more likely to cause RSI than the other methods you show. ;-)

DATA LIST FREE / g4 (F1).
BEGIN DATA
1 2 3 4
END DATA.

NUMERIC dx1 TO dx4 (F1).
DO REPEAT dx = dx1 to dx4 / # = 1 to 4 .
- COMPUTE dx = g4 EQ #.
END REPEAT.
LIST.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
OTOH It is less efficient than the VECTOR approach.
N computes rather than 1. I'm not sure about the RECODE WRT processing efficiency.
My point was the interesting way that RECODE allows multiple vars to be created from one variable and the ability to subsequently recode these new variables in the single recode statement.
--
Remember DO REPEAT cycles through the entire list of stand in 'variables'.
--
NUMERIC dx1 TO dx4 (F1).
DO REPEAT dx = dx1 to dx4 / # = 1 to 4 .
- COMPUTE dx = g4 EQ #.
END REPEAT PRINT.

18 0 +COMPUTE DX1 = G4 EQ 1
19 0 +COMPUTE DX2 = G4 EQ 2
20 0 +COMPUTE DX3 = G4 EQ 3
21 0 +COMPUTE DX4 = G4 EQ 4

LIST.
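To make the "N computes rather than 1" point concrete, here is a rough sketch in plain Python that builds the same dummies both ways and counts the COMPUTE-style assignments. It is an illustration only: the function names are hypothetical, the count ignores the zero-fill step, and it says nothing about actual SPSS run times.

```python
# Illustration only: plain Python, not SPSS syntax. Names are hypothetical.

def dummies_by_comparison(groups, n_levels):
    """Like the unrolled DO REPEAT: one comparison and assignment per dummy, per case."""
    rows, assignments = [], 0
    for g in groups:
        row = []
        for level in range(1, n_levels + 1):
            row.append(int(g == level))   # COMPUTE dx = g4 EQ #
            assignments += 1
        rows.append(row)
    return rows, assignments


def dummies_by_index(groups, n_levels):
    """Like the VECTOR version: one indexed assignment per case."""
    rows, assignments = [], 0
    for g in groups:
        row = [0] * n_levels              # zero-fill, not counted here
        row[g - 1] = 1                    # COMPUTE DX(G4)=1
        assignments += 1
        rows.append(row)
    return rows, assignments


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    by_cmp, n_cmp = dummies_by_comparison(data, 4)
    by_idx, n_idx = dummies_by_index(data, 4)
    print(by_cmp == by_idx, n_cmp, n_idx)
```

Both routes give identical dummies; only the number of assignments per case differs.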
David Marso
There would not be any significant difference
between DO REPEAT and unrolling the computation into specific computes.
DO REPEAT saves a little parsing time, but that would be trivial
compared to the calculation time in most cases.
RECODE is generally much more efficient than separate COMPUTEs or a DO IF approach.

Jon Peck
In reply to this post by David Marso
You're right about efficiency. But unless one is working with a HUGE data file, the difference will likely be imperceptible (to the human eye, at least). And in that case, transparency should trump efficiency.
I feel like I'm stealing material from Art Kendall here. ;-)

p.s. - I'd never noticed that one can use PRINT like that on END REPEAT. Thanks for educating me (once again).
--
Bruce Weaver
Yeah, BUT I find the following to be as transparent as glass ;-).
ONE compute per case rather than 50 and NO logical comparison required.

NUMERIC state_01 TO state_50 (F1).
RECODE state_01 TO state_50 (ELSE=0).
VECTOR state_dummy=state_01 TO state_50.
COMPUTE state_dummy(state)=1.

"p.s. - I'd never noticed that one can use PRINT like that on END REPEAT. Thanks for educating me (once again)."

Glad to be of service!
--
** You can also use: END REPEAT NOPRINT (but WTF? there for the sake of completeness???).
--
David Marso
Yes, it becomes more transparent the longer one looks at it. I was just thinking about how to extend it to the case of a factorial design (only for folks with old versions that won't support the Python-based dummy variable generator, of course). Something like this, I suppose.
* Generate A and B variables for a 3x4 factorial design.
data list free / a b (2f1).
begin data
1 1 1 2 1 3 1 4
2 1 2 2 2 3 2 4
3 1 3 2 3 3 3 4
end data.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /maxB 'Max value of B'=MAX(B).
FORMATS a b maxB (f1.0).

* Now generate indicator variables for A, B, and A*B .
NUMERIC A1 TO A3 B1 TO B4
  A1B1 A1B2 A1B3 A1B4
  A2B1 A2B2 A2B3 A2B4
  A3B1 A3B2 A3B3 A3B4 (F1).
RECODE A1 TO A3B4 (ELSE=0). /* Initialize all indicators to 0.
VECTOR AV = A1 TO A3 / BV = B1 TO B4 / ABV = A1B1 TO A3B4 .
COMPUTE AV(A) = 1.
COMPUTE BV(B) = 1.
COMPUTE ABV((A-1)*maxB+B) = 1. /* Note the use of maxB here .
LIST A B A1 to A3B4.
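The subscript (A-1)*maxB+B is the part that takes a second look. A small sketch in plain Python (illustration only, not SPSS syntax; the function name is hypothetical) shows that it maps each (A, B) cell to the matching position in the list A1B1 ... A3B4:

```python
# Illustration only: plain Python, not SPSS syntax. Names are hypothetical.

def interaction_index(a, b, max_b):
    """1-based position of cell (a, b) when cells are ordered A1B1, A1B2, ..., A3B4."""
    return (a - 1) * max_b + b


max_a, max_b = 3, 4
names = [f"A{a}B{b}" for a in range(1, max_a + 1) for b in range(1, max_b + 1)]

if __name__ == "__main__":
    for a in range(1, max_a + 1):
        for b in range(1, max_b + 1):
            pos = interaction_index(a, b, max_b)
            print(a, b, pos, names[pos - 1])
```

For example, A=2 and B=3 give (2-1)*4+3 = 7, and the 7th name in the list is A2B3. The mapping relies on the interaction variables being declared in exactly that order.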
--
Bruce Weaver
Here's a small improvement to the NUMERIC command used to generate the indicator variables:
NUMERIC A1 TO A3 B1 TO B4 A1B1 TO A1B4 A2B1 TO A2B4 A3B1 TO A3B4 (F1).
--
Bruce Weaver
In reply to this post by Bruce Weaver
And for a real mind-blower consider this MATRIX code ;-))
----
data list free / a b (2f1).
begin data
1 1 1 2 1 3 1 4
2 1 2 2 2 3 2 4
3 1 3 2 3 3 3 4
end data.

MATRIX.
GET X /VAR a b.
COMPUTE DESIGNX={X,DESIGN(X),KRONEKER(IDENT(CMAX(X(:,1))),IDENT(CMAX(X(:,2))))}.
SAVE DESIGNX / OUTFILE *.
END MATRIX.
-----
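One thing worth checking about the MATRIX version: the Kronecker product of two identity matrices is itself an identity matrix (3 by 3 with 4 by 4 gives 12 by 12). As far as I can tell, that means its rows line up with the A*B cells here only because the demo file has exactly one case per cell, sorted by a and then b. A quick check of the arithmetic in plain Python (illustration only, not SPSS MATRIX code; helper names are hypothetical):

```python
# Illustration only: plain Python, not SPSS MATRIX code. Names are hypothetical.

def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[int(i == j) for j in range(n)] for i in range(n)]


def kron(p, q):
    """Kronecker product of two matrices given as lists of rows."""
    return [[pv * qv for pv in prow for qv in qrow]
            for prow in p for qrow in q]


if __name__ == "__main__":
    k = kron(identity(3), identity(4))
    print(len(k), len(k[0]), k == identity(12))
```

With several cases per cell, or unsorted data, the general-purpose route would be the VECTOR subscript shown earlier in the thread.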
David Marso
In reply to this post by Bruce Weaver
Bruce Weaver wrote
> You're right about efficiency. But unless one is working with a HUGE
> data file, the difference will likely be imperceptible (to the human eye,
> at least). And in that case, transparency should trump efficiency.

I agree. In fact transparency should **ALWAYS** trump efficiency except in extreme cases.

On my first course in commercial programming, the instructor began by saying "Your objective as a professional programmer should be to write clear and simple code." It's advice I have no hesitation in passing on.

The time spent in condensing code to an irreducible and indecipherable minimum is usually time wasted, and rarely necessary in an SPSS context. Writing transparent code that can be easily understood and maintained (by oneself and others) is far more important.

Geeks who delight in producing obscure compressed code don't usually last very long in a professional programming team. Their colleagues soon get fed up with trying to decipher what they've done and working out where the (inevitable) error is.

Garry