Dear Listserv,

I have the variable "age" in my data set; it is continuous. I want to create 3 dummy variables comparing the following age breakdowns: 60-66 vs. 67-77, 60-66 vs. 78-88, and 60-66 vs. 89-99.

Can someone please advise on the type of syntax I would use to break down age and create these 3 separate dummy variables?

All suggestions are welcome,

Stace
Administrator
|
Why do you want to categorize age? It is often (maybe even usually) preferable to treat age as a continuous variable. There are many articles on this--see for example David Streiner's article "Breaking Up is Hard to Do" (http://ww1.cpa-apc.org/publications/archives/cjp/2002/april/researchMethodsDichotomizingData.asp).
On the (dodgy) assumption that there is a good reason for categorizing... You've not said whether the values of Age can be fractional (vs. whole numbers). Allowing for fractional ages, and assuming you want to round up at .5 in the usual way:

DO REPEAT a = AgeCat1 to AgeCat4 / min = 59.5 66.5 77.5 88.5 / max = 66.5 77.5 88.5 99.5 .
- COMPUTE a = (Age GE min) and (Age LT max).
END REPEAT.
VARIABLE LABELS AgeCat1 "Age: 60-66" AgeCat2 "Age: 67-77" AgeCat3 "Age: 78-88" AgeCat4 "Age: 89-99".
FORMATS AgeCat1 to AgeCat4 (F1).

Then filter out any records outside the age range 60-99 (if there are any), and include any 3 of those 4 indicator variables in your model. I gather you want to include the last 3, with the first as the reference category. HTH.
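The filtering and modelling step described above might look roughly like the following. This is only a sketch under stated assumptions: Outcome is a hypothetical 0/1 dependent variable and InRange is a hypothetical helper variable; neither name appears in the original question. AgeCat1 is left out of the model so that 60-66 serves as the reference category.

```spss
* Sketch only: Outcome and InRange are hypothetical names.
* Flag cases inside the 60-99 range, using the same rounding cut-points.
COMPUTE InRange = (Age GE 59.5) AND (Age LT 99.5).
FILTER BY InRange.

* Enter the last three indicators; AgeCat1 (60-66) is the reference.
LOGISTIC REGRESSION VARIABLES Outcome
  /METHOD=ENTER AgeCat2 AgeCat3 AgeCat4.

FILTER OFF.
```

Cases with a missing Age get a missing InRange and are therefore filtered out along with the out-of-range cases.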
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Administrator
|
In reply to this post by stace swayne
You are misusing the term dummy variable.
It sounds like you are after what most people refer to as a set of contrasts. What is your actual goal in doing this? Please consider Bruce's advice regarding categorizing, loss of information, etc. Why these specific cut-points? The word arbitrary comes to mind.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Bruce Weaver
I am interested in running a logistic regression, and my collaborators want these age breakdowns. Yes, they are arbitrary, and I understand the issue with loss of information. Nonetheless, I need to break the age variable down, and I wanted to know if anyone could advise on syntax. I need whole numbers; would the syntax you suggested need to be altered?

thanks, Stace

On Friday, November 8, 2013 10:25 AM, Bruce Weaver <[hidden email]> wrote:
> Why do you want to categorize age? [...]

-- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Question-about-restructuring-a-variable-tp5722934p5722939.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Administrator
|
An obvious way to achieve this would be to RECODE age into groups and use the grouped variable in LOGISTIC REGRESSION.
See the CONTRAST subcommand and ponder whether it fits your requirement.
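A minimal sketch of that RECODE-plus-CONTRAST route, assuming whole-number ages. Outcome (a 0/1 dependent variable) and AgeGroup are hypothetical names chosen for illustration. INDICATOR(1) asks for indicator contrasts with the first category (60-66) as the reference, which corresponds to the three comparisons in the original question.

```spss
* Sketch only: Outcome and AgeGroup are hypothetical names.
* Ages outside 60-99 become system-missing and drop out of the model.
RECODE Age (60 THRU 66=1) (67 THRU 77=2) (78 THRU 88=3) (89 THRU 99=4)
  (ELSE=SYSMIS) INTO AgeGroup.
VALUE LABELS AgeGroup 1 "60-66" 2 "67-77" 3 "78-88" 4 "89-99".

* Indicator contrasts against the first category (60-66).
LOGISTIC REGRESSION VARIABLES Outcome
  /METHOD=ENTER AgeGroup
  /CATEGORICAL=AgeGroup
  /CONTRAST (AgeGroup)=INDICATOR(1).
```

With this approach the procedure builds the contrasts itself, so no separate dummy variables need to be created or kept in the data set.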
In reply to this post by stace swayne
Stace

I’m not sure if this helps, and I’m not a statistician, but if you want to create some dummy binary variables for the age groups in question, try something like this (tested on British Social Attitudes data):

recode age (60 thru 66 = 1) (67 thru 77 = 2) (78 thru 88 = 3) (89 thru 99 = 4) (else = 0) into agecat.
freq agecat.
* Each AgeCatN is 1 when agecat equals N, otherwise 0.
DO REPEAT a = AgeCat1 to AgeCat4 / v = 1 to 4.
compute a = (agecat = v).
END REPEAT.
freq agecat1 to agecat4.
For some reason SPSS didn’t like your value labels, but dinner is served.

John F Hall (Mr) [Retired academic survey researcher] Email: [hidden email] Website: www.surveyresearch.weebly.com SPSS start page: www.surveyresearch.weebly.com/spss-without-tears.html