Question about restructuring a variable


Question about restructuring a variable

stace swayne
Dear Listserv,

I have the variable "age" in my data-set; it is continuous. I want to create 3 dummy variables comparing the following age breakdowns: 60-66 vs. 67-77, 60-66 vs. 78-88, and 60-66 vs. 89-99.

Can someone please advise on the type of syntax I would use to break down age and create these 3 separate dummy variables?

All suggestions are welcomed,

Stace

Re: Question about restructuring a variable

Bruce Weaver
Administrator
Why do you want to categorize age?  It is often (maybe even usually) preferable to treat age as a continuous variable.  There are many articles on this--see for example David Streiner's article "Breaking Up is Hard to Do" (http://ww1.cpa-apc.org/publications/archives/cjp/2002/april/researchMethodsDichotomizingData.asp).

On the (dodgy) assumption that there is a good reason for categorizing...

You've not said whether the values of Age can be fractional (vs. whole numbers).  Allowing for fractional ages, and assuming you want to round up at .5 in the usual way:

DO REPEAT
   a = AgeCat1 to AgeCat4 /
   min = 59.5 66.5 77.5 88.5 /
   max = 66.5 77.5 88.5 99.5 .
- COMPUTE a = (Age GE min) and (Age LT max).
END REPEAT.
VARIABLE LABELS
 AgeCat1 "Age: 60-66"
 /AgeCat2 "Age: 67-77"
 /AgeCat3 "Age: 78-88"
 /AgeCat4 "Age: 89-99".
FORMATS AgeCat1 to AgeCat4 (F1).

Then filter out any records outside the age range 60-99 (if there are any), and include any 3 of those 4 indicator variables in your model.  I gather you want to include the last 3, with the first as the reference category.  
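
As a rough sketch of that last step: the syntax below flags records inside the 60-99 range, filters on that flag, and enters the last three indicators in a logistic regression. The names InRange and outcome are hypothetical (outcome stands in for whatever 0/1 dependent variable is being modelled); neither appears in the thread.

```spss
* Sketch only: InRange and outcome are hypothetical names.
* Flag ages that fall in one of the four categories (59.5 up to, but not including, 99.5).
COMPUTE InRange = (Age GE 59.5) AND (Age LT 99.5).
FILTER BY InRange.
* AgeCat1 (60-66) is left out of the model, so it serves as the reference category.
LOGISTIC REGRESSION VARIABLES outcome
  /METHOD = ENTER AgeCat2 AgeCat3 AgeCat4.
FILTER OFF.
```

FILTER leaves the excluded cases in the file; SELECT IF would drop them permanently, which may or may not be what is wanted.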

HTH.


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Question about restructuring a variable

David Marso
Administrator
In reply to this post by stace swayne
You are misusing the term dummy variable.  
It sounds like you are after what most people refer to as a set of contrasts.
What is your actual goal in doing this?
Please consider Bruce's advice re categorizing/loss of information etc.
Why these specific cut-points?  The word arbitrary comes to mind.

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Question about restructuring a variable

stace swayne
In reply to this post by Bruce Weaver
I am interested in running a logistic regression, and my collaborators want these age breakdowns. Yes, they are arbitrary, and I understand the issue with loss of information. Nonetheless, I need to break the age variable down, and I wanted to know if anyone could advise on syntax.  I need whole numbers; would the syntax you suggested need to be altered?

thanks,

Stace
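
On the whole-number question: if Age only takes integer values, the .5 cut-points in the DO REPEAT posted earlier should classify every age the same way integer bounds would (60 is GE 59.5, 66 is LT 66.5, and so on), so that syntax should run unaltered. A variant with integer bounds, assuming whole-number ages and using the RANGE function (which is inclusive at both ends), might look like the sketch below; lo and hi are just stand-in names.

```spss
* Sketch only, assuming Age holds whole numbers; lo and hi are stand-in names.
DO REPEAT
   a  = AgeCat1 TO AgeCat4 /
   lo = 60 67 78 89 /
   hi = 66 77 88 99 .
* RANGE returns 1 when lo <= Age <= hi, and 0 otherwise.
- COMPUTE a = RANGE(Age, lo, hi).
END REPEAT.
FORMATS AgeCat1 TO AgeCat4 (F1).
```

With fractional ages this version would leave gaps (66.4, for example, falls in no category), which is why the .5 cut-points are the safer choice when the data type is uncertain.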


On Friday, November 8, 2013 10:25 AM, Bruce Weaver <[hidden email]> wrote:
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Question-about-restructuring-a-variable-tp5722934p5722939.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD




Re: Question about restructuring a variable

David Marso
Administrator
An obvious way to achieve this would be to RECODE age into a single age-group variable and use that in LOGISTIC REGRESSION.
See the CONTRAST subcommand and ponder whether it fits your requirement.
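
A minimal sketch of that route, assuming whole-number ages. The names agegrp and outcome are hypothetical (outcome stands in for the 0/1 dependent variable), and INDICATOR(1) is one possible contrast choice, making the first category (60-66) the reference.

```spss
* Sketch only: agegrp and outcome are hypothetical names.
* Ages outside 60-99 become system-missing and drop out of the model.
RECODE age
   (60 THRU 66 = 1) (67 THRU 77 = 2)
   (78 THRU 88 = 3) (89 THRU 99 = 4) (ELSE = SYSMIS)
   INTO agegrp.
VALUE LABELS agegrp 1 "60-66" 2 "67-77" 3 "78-88" 4 "89-99".
* Indicator contrasts with the first category as reference give the three comparisons asked for.
LOGISTIC REGRESSION VARIABLES outcome
  /METHOD = ENTER agegrp
  /CONTRAST (agegrp) = INDICATOR(1).
```

This way the procedure builds the three comparisons itself, so no separate 0/1 variables are needed.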


Re: Question about restructuring a variable

John F Hall
In reply to this post by stace swayne

Stace

I’m not sure if this helps, and I’m not a statistician, but if you want to create some dummy binary variables for the age groups in question, try something like this (tested on British Social Attitudes data):

* Code the four age groups 1 to 4; all other ages become 0.
recode age
   (60 thru 66 = 1) (67 thru 77 = 2)
   (78 thru 88 = 3) (89 thru 99 = 4) (else = 0)
   into agecat.
freq agecat.

* Copy agecat into AgeCat1 to AgeCat4.
DO REPEAT
   a = AgeCat1 to AgeCat4.
compute a = agecat.
END REPEAT.
* Collapse codes 2 thru 4 to 1 in each copy (0 and 1 are unchanged).
recode agecat1 to agecat4 (2 thru 4 = 1).
freq agecat1 to agecat4.

 

 

AgeCat1
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0            2232      73.8            73.8                 73.8
        1             792      26.2            26.2                100.0
        Total        3024     100.0           100.0

AgeCat2
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0            2232      73.8            73.8                 73.8
        1             792      26.2            26.2                100.0
        Total        3024     100.0           100.0

AgeCat3
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0            2232      73.8            73.8                 73.8
        1             792      26.2            26.2                100.0
        Total        3024     100.0           100.0

AgeCat4
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0            2232      73.8            73.8                 73.8
        1             792      26.2            26.2                100.0
        Total        3024     100.0           100.0

 

 

For some reason SPSS didn’t like your value labels, but dinner is served.
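
Note that the four frequency tables are identical: each copy of agecat ends up as 1 for anyone aged 60-99 and 0 otherwise, so the variables do not separate the age groups. A possible variant that gives one 0/1 indicator per group is sketched below. It assumes agecat has been created by the RECODE in this post; AgeGrp1 to AgeGrp4 and k are hypothetical names chosen to avoid overwriting AgeCat1 to AgeCat4.

```spss
* Sketch only: AgeGrp1 to AgeGrp4 and k are hypothetical names.
* Each indicator is 1 when agecat equals its group code, and 0 otherwise.
DO REPEAT
   a = AgeGrp1 TO AgeGrp4 /
   k = 1 TO 4 .
- COMPUTE a = (agecat EQ k).
END REPEAT.
FORMATS AgeGrp1 TO AgeGrp4 (F1).
FREQUENCIES VARIABLES = AgeGrp1 TO AgeGrp4.
```

Because the RECODE sends ages outside 60-99 to 0, those cases would score 0 on all four indicators and so sit with the reference group unless they are filtered out first.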

 

John F Hall (Mr)
[Retired academic survey researcher]

Email:   [hidden email]
Website: www.surveyresearch.weebly.com
SPSS start page:  www.surveyresearch.weebly.com/spss-without-tears.html

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of stace swayne
Sent: 08 November 2013 18:20
To: [hidden email]
Subject: Re: Question about restructuring a variable

 
