SPSSX Discussion

Follow-up to piecewise regression question

parisec

Follow-up to piecewise regression question

Hi all,

I posted a question last week about extending the information from these articles:

http://www.ats.ucla.edu/stat/spss/faq/piecewise.htm

http://www.spsstools.net/Syntax/RegressionRepeatedMeasure/PiecewiseRegression.txt

...to accommodate having the coefficient represent the increase in odds of an event for every 1-year increase in age within an age group.

The examples in these articles demonstrate how to compute this when you want to split a group into above or below a single value, such as <14 and 14+. I think that to have multiple groups, I need to constrain the age group so that the lower limit of the age group is 0 and each year of age within the age group increases by 1. The end result is that the number of cases in the new age variable matches the number of cases in the 38-50 age group.

With this in mind, I computed below what I think is the correct new variable to enter in a piecewise regression for a 38-50 age group.

However, I cannot find an example that validates or invalidates this idea.

Thanks for any references or information you may have.

Carol

age	piecewise age 38-50
27	.
28	.
29	.
30	.
31	.
32	.
33	.
34	.
35	.
36	.
37	.
38	0
39	1
40	2
41	3
42	4
43	5
44	6
45	7
46	8
47	9
48	10
49	11
50	12
51	.
52	.
53	.
54	.
55	.
56	.
57	.
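
For reference, the coding shown in the table above can be reproduced with a few lines of code. This is only a sketch in Python rather than SPSS syntax, and the function name `piecewise_age_38_50` is hypothetical. It mirrors the table (0 at age 38, rising by 1 per year, missing outside 38-50) and does not by itself settle whether this coding is the right one for the model.

```python
def piecewise_age_38_50(age):
    """Return age - 38 for ages 38 through 50, and None (missing) otherwise."""
    if 38 <= age <= 50:
        return age - 38
    return None


# Reproduce the table for ages 27 through 57, printing "." for missing.
for age in range(27, 58):
    value = piecewise_age_38_50(age)
    print(age, "." if value is None else value)
```

Note that every case outside 38-50 is missing on this variable, so any procedure that drops cases with missing predictors would use only the 38-50 group.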

parisec

Re: Follow-up to piecewise regression question

Hi Doug,

The age effect is linear, but its slope changes by age group. My study is regarding ultra distance running times, taking age into account in addition to some other variables. So I'm not actually using logistic regression as previously stated regarding the coefficient; I'm using linear regression.

What happens is that up to age 38 or so, increasing age can lead to faster times. After age 40 or so, the opposite is true, and after 51 or so, people get even slower with increasing age. I could stratify age into smaller age categories and enter them as dummy variables, but I lose some sensitivity.

My thought on using piecewise regression here is that I want to show the effect of age on finish times. However, if I enter it as a single variable, the coefficient will not accurately reflect the association of age with time, since it will likely show an x decrease in time (people get faster) over the entire range of ages. With piecewise regression, I can enter age as 3 variables: 1) <38; 2) 38-50; and 3) 51+, and get 3 coefficients that reflect the change in time for every 1-year increase in age within the age group.

The question I'm trying to get answered is whether or not the variable in my original post is the correct way to compute the 3 categories that I have. The articles I refer to only show examples for 2 knots: above and below a specific value. I'm just not sure how to handle a 3rd knot.

Thanks!
Carol
________________________________
From: Doug [[hidden email]]
Sent: Sunday, April 29, 2012 5:00 AM
To: Parise, Carol A.
Subject: Re: Follow-up to piecewise regression question

If age is not linear, I don't understand why you would run linear models. Sorry I can't help further, other than to suggest a linearizing transformation.

On Thu, Apr 26, 2012 at 11:41 PM, Parise, Carol A. <[hidden email]> wrote:
Hi Doug,

Thanks for the response. I am running linear mixed models. The issue is that I know that age is not linear for all ranges of age, which is why I want to use piecewise regression.

What I am trying to do is have a term in the model that reflects an increase in the log odds of the event for every 1-year increase in age within that age group.

Do you know if the example I included in my original post is correct for this?
________________________________
From: Doug [[hidden email]]
Sent: Thursday, April 26, 2012 5:07 PM
To: Parise, Carol A.
Subject: Re: Follow-up to piecewise regression question

I missed your first post, but depending on what your data look like, it might be useful to think about using a logistic model if there's a fairly smooth transition from 38-50 with an asymptote at older ages. I did something similar to estimate "growth rate" for a multispecies community to identify an inflection point.

Ryan

Re: Follow-up to piecewise regression question

In reply to this post by parisec

Carol,

It seems to me that a simple approach to allow for varying slopes would be to create an indicator variable of the age groups of interest (e.g., 0 thru <{a} = 1, {a} thru <{b} = 2, >= {b} = 3), and then to parameterize the model as follows:

MIXED y BY group WITH age
/FIXED=group group*age | NOINT SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION.

The model above assumes that age has a linear relationship with the dependent variable that varies depending on the age group. The estimated group-specific slopes (group*age interaction effects) are provided in the "Estimates of Fixed Effects" Table. If you wanted to test whether the group-specific slopes were significantly different from each other, you could add the following TEST statements (before the period that ends the command):

/TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0
/TEST = "diff in slopes between grp 1 and grp 3" group*age 1 0 -1
/TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1

The code provided above is untested, but I'm fairly certain it will do as I suggest.
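
To make the grouping step concrete, here is a minimal sketch in Python (not SPSS syntax) of what the parameterization estimates. The cutpoints 38 and 51 come from the thread; the function names and the finish times are made up for illustration. It relies on one fact about the fixed-effects-only model as written: with a separate intercept and a separate slope for each group, the point estimates coincide with fitting an ordinary least-squares line within each group. Standard errors are a different matter, since the model pools the residual variance.

```python
def age_group(age, a=38, b=51):
    """Code age into 1 (< a), 2 (a up to < b), or 3 (>= b)."""
    if age < a:
        return 1
    if age < b:
        return 2
    return 3


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def group_lines(ages, ys):
    """Fit one line per age group; returns {group: (intercept, slope)}."""
    by_group = {}
    for age, y in zip(ages, ys):
        xs_g, ys_g = by_group.setdefault(age_group(age), ([], []))
        xs_g.append(age)
        ys_g.append(y)
    return {g: fit_line(xs_g, ys_g) for g, (xs_g, ys_g) in sorted(by_group.items())}


# Made-up, noise-free finish times (hours): faster with age before 38,
# slower after 38, and slower still from 51 on.
ages = list(range(27, 58))
times = [30 - 0.1 * (a - 27) if a < 38
         else 28.9 + 0.2 * (a - 38) if a < 51
         else 31.5 + 0.4 * (a - 51)
         for a in ages]

for group, (intercept, slope) in group_lines(ages, times).items():
    print(group, round(intercept, 3), round(slope, 3))
```

Because each group gets its own intercept, nothing forces the fitted lines to meet at the cutpoints.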

Ryan

Jon K Peck

Re: Follow-up to piecewise regression question

In reply to this post by parisec

If you just want a piecewise linear age effect, why not just do this? Enter three terms:

age
age3850 = (age >= 38) * (age - 38)
age51plus = (age >= 51) * (age - 51)

where (age >= 38), for example, is zero until age 38 and then 1.

Then the effect (slope) below 38 is b(age)
38-50: b(age) + b(age3850)
51+: b(age) + b(age3850) + b(age51plus)
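
The slope arithmetic above can be checked numerically. The following is a sketch in Python (not SPSS syntax); the function names and the coefficient values are made up for illustration, and only the knots 38 and 51 and the `(age >= k) * (age - k)` coding come from the post.

```python
def hinge_terms(age, knots=(38, 51)):
    """Return (age, age3850, age51plus) using the (age >= k) * (age - k) coding."""
    first, second = knots
    age3850 = (age >= first) * (age - first)
    age51plus = (age >= second) * (age - second)
    return age, age3850, age51plus


def predict(age, b0, b_age, b_3850, b_51plus):
    """Fitted value from the three-term piecewise linear model."""
    x_age, x_3850, x_51plus = hinge_terms(age)
    return b0 + b_age * x_age + b_3850 * x_3850 + b_51plus * x_51plus


# Made-up coefficients, chosen only to illustrate the arithmetic.
coefs = dict(b0=40.0, b_age=-0.1, b_3850=0.3, b_51plus=0.2)

# One-year differences inside each segment recover the summed slopes.
for start in (30, 40, 55):
    print(start, round(predict(start + 1, **coefs) - predict(start, **coefs), 3))
```

Two properties of this coding are worth noting. The terms are zero below their knot rather than missing, so no cases are lost, and age3850 keeps increasing past age 50 (at age 55 it equals 17). The fitted line is also continuous at 38 and 51, because each added term is exactly zero at its own knot.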

Regards,

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

Ryan

Re: Follow-up to piecewise regression question

In reply to this post by Ryan

For those interested, I decided to apply the approach I suggested below to the data provided in one of the websites Carol sent us the link for:

http://www.ats.ucla.edu/stat/spss/faq/piecewise.htm

I found that the slopes were identical. Moreover, after centering age at 14, the intercepts fell in line as well. As I think about it, the parameterization I proposed is essentially identical to the piecewise regression model reported on that website.

Ryan

parisec

Re: Follow-up to piecewise regression question

Ryan,

Thanks for your thoughtful response.

This is interesting and makes me wonder if I'm making this harder than it needs to be. My original plan was indicator coding with 5 smaller age groups, i.e., quintiles of age.

I was thinking that by using indicator coding with the highest quintile of age as the reference category, the coefficient would represent the change in finish time for anyone in, say, the lowest quintile compared with anyone in the highest quintile.

My goal is to have the coefficient represent the change in finish time for every 1-year increase in age within the specified age groups, which is why I thought I needed piecewise. When I started working on piecewise with my 5 groups, I quickly discovered that there wasn't much variation within an age group that spanned only 5 years or so. Therefore, I came up with 3 cutpoints that I think make sense based on the graphs and correlations of the data.

Based on your experiment with the data on the site, I think I can achieve what I want with my original plan, which makes me wonder: when WOULD there be a reason to use piecewise regression rather than indicator coding?

Carol

Bruce Weaver

Re: Follow-up to piecewise regression question

Administrator

It sounds like you're now describing a model that has age categories, but not age as a continuous variable. If I followed, however, Ryan's model (see syntax below) included age as both a categorical variable (called Group) and a continuous variable (age). The interaction of those two variables (group*age) is what allows the slope for continuous age to vary by age group. That's more or less the same thing you're trying to accomplish by using piece-wise regression, right?

MIXED y BY group WITH age
/FIXED=group group*age | NOINT SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION.

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Ryan

Re: Follow-up to piecewise regression question

In reply to this post by parisec

Carol,

I'm not sure I fully understand your follow-up question. Here's the bottom line. The model I presented is essentially identical to the piece-wise regression on that website. What's very convenient about using the approach I suggested via the MIXED procedure is that you need not worry about manipulating "age" whatsoever. All you need to do is create the "age" grouping variable (a.k.a. "group" in my code), and then parameterize the model as I demonstrated previously. Furthermore, I see no reason the approach I recommended cannot be used for situations in which you categorize "age" into more than two groups.

Correctly coded TEST statements will answer most, if not all of your questions. That is, you can use TEST statements to estimate group-specific slopes as well as group-specific intercepts at particular values of "age" (e.g., setting "age" at a cut point). You can also test for differences between "age" groups with respect to their slopes as well as differences between "age" groups with respect to their intercepts at particular values of "age".

Ryan

Ryan

Re: Follow-up to piecewise regression question

Carol,

It looks like you set up the model correctly, and that your interpretation of the slopes is correct. However, I don't see why you centered age at the grand mean. In addition to assessing for shifts in slopes from one age group to the next, isn't the purpose of piecewise regression to see if there is a shift in intercepts at the cutpoints? With that in mind, I would suggest that you NOT center age at any value before running the analysis. I repeat...I think you should enter age into the model in its original form. Then you can easily estimate and compare the intercepts at the appropriate age cutpoint for adjacent age groups using TEST statements. Concretely...

According to your post, your cutpoints are 38 and 51. Therefore, I think you would want to estimate the intercepts at age=38 for age groups 1 and 2, and test whether they are significantly different from each other. How do you do this? Simple! Add the following TEST statements:

/TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
/TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
/TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0 group*age 38 -38 0

If you want to do the same for age groups 2 and 3, then you'd write the following TEST statements:

/TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
/TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
/TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1 group*age 0 51 -51

The full MIXED code, including the above intercept TEST statements **AND** slope TEST statements would look like this:

MIXED y BY group WITH age
/FIXED=group group*age | NOINT SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION
/TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
/TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
/TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0 group*age 38 -38 0
/TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
/TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
/TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1 group*age 0 51 -51
/TEST = "grp 1 slope" group*age 1 0 0
/TEST = "grp 2 slope" group*age 0 1 0
/TEST = "grp 3 slope" group*age 0 0 1
/TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0
/TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1.

A few points:

(1) The group-specific slopes estimated from the TEST statements should equal the group*age interaction coefficients reported in the "Estimates of Fixed Effects" Table.
(2) The code above is UNTESTED. I'm too busy right now to test the code above.
(3) I am no expert in piecewise regression. I'm simply extrapolating from the two-category example provided on that website.
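
For readers following the TEST coefficients, it may help to write out what each contrast estimates. This is a sketch under the assumption that the model is the separate-lines parameterization above, with one intercept and one slope per group; the symbols \alpha_g and \beta_g are notation introduced here, not taken from the thread.

```latex
% Separate-lines model implied by /FIXED=group group*age | NOINT
y_i = \alpha_{g(i)} + \beta_{g(i)}\,\mathrm{age}_i + \varepsilon_i, \qquad g \in \{1, 2, 3\}

% "int for grp 1 at age 38": fitted value of the group-1 line at age 38
\hat{y}_1(38) = \alpha_1 + 38\,\beta_1

% "diff in ints between grps 1 and 2 at age 38": jump between the two lines at the cutpoint
\hat{y}_1(38) - \hat{y}_2(38) = (\alpha_1 - \alpha_2) + 38\,(\beta_1 - \beta_2)
```

A difference near zero would suggest that the two fitted lines meet at the cutpoint; the same reading applies at age 51 for groups 2 and 3.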

HTH,

Ryan
On Tue, May 1, 2012 at 7:48 PM, Parise, Carol A. <[hidden email]> wrote:

Ryan,

This nailed it. When Bruce stated....

****************************************

If I followed, however, Ryan's model (see syntax below) included age as *both* a categorical variable (called Group) and a continuous variable (age). The interaction of those two variables (group*age) is what allows the slope for continuous age to vary by age group. That's more or less the same thing you're trying to accomplish by using piece-wise regression, right?

MIXED y BY group WITH age

/FIXED=group group*age | NOINT SSTYPE(3)

/METHOD=REML

/PRINT=SOLUTION.

***************************************************

The lightbulb went on and I figured out why this made sense.

I went back and reran my analysis, and I think what I pasted below gives me exactly what I need. I'm including the interpretation because, in case I'm wrong, I am hoping someone can point it out. If it's correct, I suspect someone may find it useful.

Estimates of Fixed Effects^a

Parameter                  Estimate
[age3=1.00]                25.723006
[age3=2.00]                26.830893
[age3=3.00]                27.558274
[age3=1.00] * AllCenAge    .029302
[age3=2.00] * AllCenAge    .198141
[age3=3.00] * AllCenAge    .079184

a. Dependent Variable: timehrs.

age3 = age group (1 = <38, 2 = 38-50, 3 = 51+)

AllCenAge = centered continuous age

My interpretation:

The mean finish time for people:

<38 = 25.72 hrs

38-50 = 26.83 hrs

51+ = 27.56 hrs

For every 1-year increase in age up to age 38, finish time increases by .03 hours ([age3=1.00] * AllCenAge); for people 38-50 years old, finish time increases by .20 hours for every 1-year increase in age ([age3=2.00] * AllCenAge); and there is only a .08-hour increase per year in finish time for people age 51+ ([age3=3.00] * AllCenAge).
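
One caution on the intercepts, offered as a sketch rather than a correction of the output: with age centered, each group's intercept is the fitted time at AllCenAge = 0 (the centering value), which need not equal that group's mean finish time. The symbols below are notation assumed here, with \bar{c}_g denoting the mean of AllCenAge within group g.

```latex
% Fitted line for group g, with c the centered age
\hat{y} = \alpha_g + \beta_g\, c, \qquad c = \mathrm{age} - \overline{\mathrm{age}}

% The intercept is the fitted value at the centering point
\hat{y}\,\big|_{c = 0} = \alpha_g

% The group's mean finish time lies on the line at the group's own mean centered age
\bar{y}_g = \alpha_g + \beta_g\, \bar{c}_g
```

If the centering value falls outside a group's age range, that group's intercept is an extrapolation of its line rather than a value observed within the group.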

Thanks Ryan, Bruce, and Jon for your time.

Carol.

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of R B
Sent: Monday, April 30, 2012 5:59 PM

To: [hidden email]
Subject: Re: Follow-up to piecewise regression question

Carol,

I'm not sure I fully understand your follow-up question. Here's the bottom line. The model I presented is essentially identical to the piece-wise regression on that website. What's very convenient about using the approach I suggested via the MIXED procedure is that you need not worry about manipulating "age" whatsoever. All you need to do is create the "age" grouping variable (a.k.a. "group" in my code), and then parameterize the model as I demonstrated previously. Furthermore, I see no reason the approach I recommended cannot be used for situations in which you categorize "age" into more than two groups.

Correctly coded TEST statements will answer most, if not all of your questions. That is, you can use TEST statements to estimate group-specific slopes as well as group-specific intercepts at particular values of "age" (e.g., setting "age" at a cut point). You can also test for differences between "age" groups with respect to their slopes as well as differences between "age" groups with respect to their intercepts at particular values of "age".

Ryan

On Mon, Apr 30, 2012 at 4:34 PM, Parise, Carol A. <[hidden email]> wrote:

Ryan,

Thanks for your thoughtful response.

This is interesting and makes me wonder if I'm making this harder than it needs to be. My original plan was indicator coding with 5 smaller age groups, i.e., quintiles of age.

I was thinking that by using indicator coding with the highest quintile of age as the reference category, the coefficient would represent the change in finish time for anyone in, say, the lowest quintile compared with anyone in the highest quintile.

My goal is to have the coefficient represent the change in finish time for every 1-year increase in age within the specified age groups, which is why I thought I needed piecewise regression. When I started working on piecewise regression with my 5 groups, I quickly discovered that there wasn't much variation within an age group that spanned only 5 years or so. Therefore, I came up with 3 cutpoints that I think make sense based on the graphs and correlations of the data.

Based on your experiment with the data on the site, it makes me think I can achieve what I want with my original plan, which makes me wonder: when WOULD there be a reason to use piecewise regression versus indicator coding?

Carol

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of R B
Sent: Monday, April 30, 2012 11:20 AM
To: [hidden email]

Subject: Re: Follow-up to piecewise regression question

For those interested, I decided to apply the approach I suggested below to the data provided in one of the websites Carol sent us the link for:

http://www.ats.ucla.edu/stat/spss/faq/piecewise.htm

I found that the slopes were identical. Moreover, after centering age at 14, the intercepts fell in line as well. As I think about it, the parameterization I proposed is essentially identical to the piecewise regression model reported on that website.

Ryan

On Sun, Apr 29, 2012 at 9:13 PM, R B <[hidden email]> wrote:

Carol,

It seems to me that a simple approach to allow for varying slopes would be to create an indicator variable of the age groups of interest (e.g., 0 thru <{a} = 1, {a} thru <{b} = 2, >= {b} = 3), and then to parameterize the model as follows:

MIXED y BY group WITH age

/FIXED=group group*age | NOINT SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION.

The model above assumes that age has a linear relationship with the dependent variable that varies depending on the age group. The estimated group-specific slopes (group*age interaction effects) are provided in the "Estimates of Fixed Effects" Table. If you wanted to test whether the group-specific slopes were significantly different from each other, you could add the following TEST statements:

/TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0

/TEST = "diff in slopes between grp 1 and grp 3" group*age 1 0 -1

/TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1

The code provided above is untested, but I'm fairly certain it will do as I suggest.
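
The grouping variable described above could be built along these lines. This is a minimal, untested sketch: it assumes a numeric variable named `age`, reuses the name `group` from the MIXED syntax, and takes the cutpoints 38 and 51 from later in the thread (groups <38, 38-50, 51+).

```spss
* Build a three-level age group from continuous age.
* Cutpoints 38 and 51 are assumptions taken from later posts in the thread.
COMPUTE group = 1.
IF (age >= 38) group = 2.
IF (age >= 51) group = 3.
* Cases with missing age would otherwise stay in group 1.
IF (MISSING(age)) group = $SYSMIS.
EXECUTE.
VALUE LABELS group 1 '<38' 2 '38-50' 3 '51+'.
* Check the group sizes before fitting the model.
FREQUENCIES VARIABLES=group.
```

Because `age` itself is left untouched, the same variable can then be entered after WITH in the MIXED command.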

Ryan

On Thu, Apr 26, 2012 at 8:00 PM, Parise, Carol A. <[hidden email]> wrote:

Hi all,

I posted a question last week about extending the information from these articles:

http://www.ats.ucla.edu/stat/spss/faq/piecewise.htm

http://www.spsstools.net/Syntax/RegressionRepeatedMeasure/PiecewiseRegression.txt

.....to accommodate having the coefficient represent the increase in odds of an event for every 1-year increase in age within an age group.

The examples in these articles demonstrate how to compute this when you want to split a group into above or below a single value such as <14 and 14+. I think that to have multiple groups, I need to constrain the age group so that the lower limit of the age group is 0 and each year in age within the age group increases by 1. The end result is that the number of cases in the new age variable matches the number of cases in the 38-50 age group.

With this in mind, I computed below what I think is the correct new variable to enter in a piecewise regression for a 38-50 age group.

However, I cannot find an example that validates or invalidates this idea.

Thanks for any references or information you may have.

Carol

age   piecewise age 38-50
27    .
28    .
29    .
30    .
31    .
32    .
33    .
34    .
35    .
36    .
37    .
38    0
39    1
40    2
41    3
42    4
43    5
44    6
45    7
46    8
47    9
48    10
49    11
50    12
51    .
52    .
53    .
54    .
55    .
56    .
57    .

Bruce Weaver

Re: Follow-up to piecewise regression question

Administrator

Let me begin by echoing Ryan's disclaimer: I have no particular expertise in piece-wise regression. Having said that, it looks to me as if Ryan's model allows for discontinuities at the cut-points between age groups. Does that make sense in the context of your problem, Carol? Or do you want the function to be continuous at the cut-points? (I've not taken time to look at the website the example came from, so I don't know which way those folks specified their model.)

Re interpretation of the coefficients, I always find it helpful to make a plot of fitted values as a function of the main explanatory variables. In your case, this will show graphically the slopes (and intercepts if you extrapolate) for Age within the various age groups, and help you map back to the coefficients.
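
One way to produce such a plot, sketched here without having been run: save the fixed-effects predictions from the model and plot them against age by group. The variable name `fit_y` is a hypothetical name chosen for this example; `y`, `group`, and `age` follow Ryan's syntax.

```spss
* Refit the separate-lines model and save the fitted values.
* fit_y is a hypothetical name for the saved predictions.
MIXED y BY group WITH age
  /FIXED=group group*age | NOINT SSTYPE(3)
  /METHOD=REML
  /PRINT=SOLUTION
  /SAVE=FIXPRED(fit_y).
* Plot fitted values against age, with markers set by age group.
GRAPH
  /SCATTERPLOT(BIVAR)=age WITH fit_y BY group.
```

Any vertical gap between adjacent segments at ages 38 and 51 in the plot corresponds to the discontinuities mentioned above.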

HTH.

R B wrote

Carol,

It looks like you set up the model correctly, and that your interpretation
of the slopes is correct. However, I don't see why you centered age at the
grand mean. In addition to assessing for shifts in slopes from one age
group to the next, isn't the purpose of piecewise regression to see if
there is a shift in intercepts at the cutpoints? With that in mind, I would
suggest that you NOT center age at any value before running the analysis. I
repeat...I think you should enter age into the model in its original form.
Then you can easily estimate and compare the intercepts at the appropriate
age cutpoint for adjacent age groups using TEST statements. Concretely...

According to your post, your cutpoints are 38 and 51. Therefore, I think
you would want to estimate the intercepts at age=38 for age groups 1 and 2,
and test whether they are significantly different from each other. How do
you do this? Simple! Add the following TEST statements:

/TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
/TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
/TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0
group*age 38 -38 0

If you want to do the same for age groups 2 and 3, then you'd write the
following TEST statements:

/TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
/TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
/TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1
group*age 0 51 -51

The full MIXED code, including the above intercept TEST statements **AND**
slope TEST statements would look like this:

MIXED y BY group WITH age
/FIXED=group group*age | NOINT SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION
/TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
/TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
/TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0
group*age 38 -38 0
/TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
/TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
/TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1
group*age 0 51 -51
/TEST = "grp 1 slope" group*age 1 0 0
/TEST = "grp 2 slope" group*age 0 1 0
/TEST = "grp 3 slope" group*age 0 0 1
/TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0
/TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1.

A few points:

(1) The group-specific slopes estimated from the TEST statements should
equal the group*age interaction coefficients reported in the "Estimates of
Fixed Effects" Table.
(2) The code above is UNTESTED. I'm too busy right now to test the code
above.
(3) I am no expert in piecewise regression. I'm simply extrapolating from
the two-category example provided on that website.

HTH,

Ryan
On Tue, May 1, 2012 at 7:48 PM, Parise, Carol A.
<[hidden email]>wrote:

> **
> Ryan,
>
> This nailed it. When Bruce stated....
>
> ****************************************
> If I followed, however, Ryan's model (see syntax below) included age as
> *both* a categorical variable (called Group) and a continuous variable
> (age). The interaction of those two variables (group*age) is what allows
> the slope for continuous age to vary by age group. That's more or less the
> same thing you're trying to accomplish by using piece-wise regression,
> right?
>
> MIXED y BY group WITH age
>
> /FIXED=group group*age | NOINT SSTYPE(3)
>
> /METHOD=REML
>
> /PRINT=SOLUTION.
> ***************************************************
> The lightbulb went on and i figured out why this made sense.

--- snip ---

Ryan

Re: Follow-up to piecewise regression question

In reply to this post by Ryan

One point of clarification--when I wrote "shift in slopes," I probably should have written "change in slopes in terms of magnitude and/or direction from one group to the next" so as to avoid any confusion with "change in intercepts" which would imply a jump (up or down) at the cutpoint from one group to the next.

As a side point, the shape of the relationship could also change between adjacent groups, say from linear to curvilinear, but that's another ball of wax, and it is not consistent with the OP's situation as far as I'm aware.

Ryan

On May 1, 2012, at 9:26 PM, R B <[hidden email]> wrote:

Carol,
It looks like you set up the model correctly, and that your interpretation of the slopes is correct. However, I don't see why you centered age at the grand mean. In addition to assessing for shifts in slopes from one age group to the next, isn't the purpose of piecewise regression to see if there is a shift in intercepts at the cutpoints? With that in mind, I would suggest that you NOT center age at any value before running the analysis. I repeat...I think you should enter age into the model in its original form. Then you can easily estimate and compare the intercepts at the appropriate age cutpoint for adjacent age groups using TEST statements. Concretely...

According to your post, your cutpoints are 38 and 51. Therefore, I think you would want to estimate the intercepts at age=38 for age groups 1 and 2, and test whether they are significantly different from each other. How do you do this? Simple! Add the following TEST statements:

/TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
/TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
/TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0 group*age 38 -38 0

If you want to do the same for age groups 2 and 3, then you'd write the following TEST statements:
/TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
/TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
/TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1 group*age 0 51 -51
The full MIXED code, including the above intercept TEST statements **AND** slope TEST statements would look like this:

MIXED y BY group WITH age
/FIXED=group group*age | NOINT SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION
/TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
/TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
/TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0 group*age 38 -38 0
/TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
/TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
/TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1 group*age 0 51 -51
/TEST = "grp 1 slope" group*age 1 0 0
/TEST = "grp 2 slope" group*age 0 1 0
/TEST = "grp 3 slope" group*age 0 0 1
/TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0
/TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1.
A few points:
(1) The group-specific slopes estimated from the TEST statements should equal the group*age interaction coefficients reported in the "Estimates of Fixed Effects" Table.
(2) The code above is UNTESTED. I'm too busy right now to test the code above.
(3) I am no expert in piecewise regression. I'm simply extrapolating from the two-category example provided on that website.
HTH,
Ryan
On Tue, May 1, 2012 at 7:48 PM, Parise, Carol A. <[hidden email]> wrote:

Ryan,

This nailed it. When Bruce stated....

****************************************

If I followed, however, Ryan's model (see syntax below) included age as *both* a categorical variable (called Group) and a continuous variable (age). The interaction of those two variables (group*age) is what allows the slope for continuous age to vary by age group. That's more or less the same thing you're trying to accomplish by using piece-wise regression, right?

MIXED y BY group WITH age

/FIXED=group group*age | NOINT SSTYPE(3)

/METHOD=REML

/PRINT=SOLUTION.

***************************************************

The lightbulb went on and I figured out why this made sense.

I went back and reran my analysis, and I think what I pasted below gives me exactly what I need. I'm including the interpretation because, in case I'm wrong, I am hoping someone can point it out. If it's correct, I suspect someone may find it useful.

Estimates of Fixed Effects^a

Parameter                  Estimate
[age3=1.00]                25.723006
[age3=2.00]                26.830893
[age3=3.00]                27.558274
[age3=1.00] * AllCenAge    .029302
[age3=2.00] * AllCenAge    .198141
[age3=3.00] * AllCenAge    .079184

a. Dependent Variable: timehrs.

age3 = age group (1 = <38, 2 = 38-50, 3 = 51+)

AllCenAge = centered continuous age

My interpretation:

The mean finish time for people:

<38 = 25.72 hrs

38-50 = 26.83 hrs

51+ = 27.56 hrs

For every 1-year increase in age up to age 38, finish time increases by .03 hours ([age3=1.00] * AllCenAge); for people 38-50 years old, finish time increases by .20 hours for every 1-year increase in age ([age3=2.00] * AllCenAge); and there is only a .08-hour increase per year in finish time for people age 51+ ([age3=3.00] * AllCenAge).

Thanks Ryan, Bruce, and Jon for your time.

Carol.

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of R B
Sent: Monday, April 30, 2012 5:59 PM

To: [hidden email]
Subject: Re: Follow-up to piecewise regression question

Carol,

I'm not sure I fully understand your follow-up question. Here's the bottom line. The model I presented is essentially identical to the piece-wise regression on that website. What's very convenient about using the approach I suggested via the MIXED procedure is that you need not worry about manipulating "age" whatsoever. All you need to do is create the "age" grouping variable (a.k.a. "group" in my code), and then parameterize the model as I demonstrated previously. Furthermore, I see no reason the approach I recommended cannot be used for situations in which you categorize "age" into more than two groups.

Correctly coded TEST statements will answer most, if not all of your questions. That is, you can use TEST statements to estimate group-specific slopes as well as group-specific intercepts at particular values of "age" (e.g., setting "age" at a cut point). You can also test for differences between "age" groups with respect to their slopes as well as differences between "age" groups with respect to their intercepts at particular values of "age".

Ryan

On Mon, Apr 30, 2012 at 4:34 PM, Parise, Carol A. <[hidden email]> wrote:

Ryan,

Thanks for your thoughtful response.

This is interesting and makes me wonder if I'm making this harder than it needs to be. My original plan was indicator coding with 5 smaller age groups, i.e., quintiles of age.

I was thinking that by using indicator coding with the highest quintile of age as the reference category, the coefficient would represent the change in finish time for anyone in, say, the lowest quintile compared with anyone in the highest quintile.

My goal is to have the coefficient represent the change in finish time for every 1-year increase in age within the specified age groups, which is why I thought I needed piecewise regression. When I started working on piecewise regression with my 5 groups, I quickly discovered that there wasn't much variation within an age group that spanned only 5 years or so. Therefore, I came up with 3 cutpoints that I think make sense based on the graphs and correlations of the data.

Based on your experiment with the data on the site, it makes me think I can achieve what I want with my original plan, which makes me wonder: when WOULD there be a reason to use piecewise regression versus indicator coding?

Carol

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of R B
Sent: Monday, April 30, 2012 11:20 AM
To: [hidden email]

Subject: Re: Follow-up to piecewise regression question

For those interested, I decided to apply the approach I suggested below to the data provided in one of the websites Carol sent us the link for:

http://www.ats.ucla.edu/stat/spss/faq/piecewise.htm

I found that the slopes were identical. Moreover, after centering age at 14, the intercepts fell in line as well. As I think about it, the parameterization I proposed is essentially identical to the piecewise regression model reported on that website.

Ryan

On Sun, Apr 29, 2012 at 9:13 PM, R B <[hidden email]> wrote:

Carol,

It seems to me that a simple approach to allow for varying slopes would be to create an indicator variable of the age groups of interest (e.g., 0 thru <{a} = 1, {a} thru <{b} = 2, >= {b} = 3), and then to parameterize the model as follows:

MIXED y BY group WITH age

/FIXED=group group*age | NOINT SSTYPE(3)
/METHOD=REML
/PRINT=SOLUTION.

The model above assumes that age has a linear relationship with the dependent variable that varies depending on the age group. The estimated group-specific slopes (group*age interaction effects) are provided in the "Estimates of Fixed Effects" Table. If you wanted to test whether the group-specific slopes were significantly different from each other, you could add the following TEST statements:

/TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0

/TEST = "diff in slopes between grp 1 and grp 3" group*age 1 0 -1

/TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1

The code provided above is untested, but I'm fairly certain it will do as I suggest.

Ryan

On Thu, Apr 26, 2012 at 8:00 PM, Parise, Carol A. <[hidden email]> wrote:

Hi all,

I posted a question last week about extending the information from these articles:

http://www.ats.ucla.edu/stat/spss/faq/piecewise.htm

http://www.spsstools.net/Syntax/RegressionRepeatedMeasure/PiecewiseRegression.txt

.....to accommodate having the coefficient represent the increase in odds of an event for every 1-year increase in age within an age group.

The examples in these articles demonstrate how to compute this when you want to split a group into above or below a single value such as <14 and 14+. I think that to have multiple groups, I need to constrain the age group so that the lower limit of the age group is 0 and each year in age within the age group increases by 1. The end result is that the number of cases in the new age variable matches the number of cases in the 38-50 age group.

With this in mind, I computed below what I think is the correct new variable to enter in a piecewise regression for a 38-50 age group.

However, I cannot find an example that validates or invalidates this idea.

Thanks for any references or information you may have.

Carol

age   piecewise age 38-50
27    .
28    .
29    .
30    .
31    .
32    .
33    .
34    .
35    .
36    .
37    .
38    0
39    1
40    2
41    3
42    4
43    5
44    6
45    7
46    8
47    9
48    10
49    11
50    12
51    .
52    .
53    .
54    .
55    .
56    .
57    .

Ryan

Re: Follow-up to piecewise regression question

In reply to this post by Bruce Weaver

Bruce,

They specified it the way I did. See "try 3":

http://www.ats.ucla.edu/stat/spss/faq/piecewise.htm

Ryan

On May 2, 2012, at 7:54 AM, Bruce Weaver <[hidden email]> wrote:

> Let me begin by echoing Ryan's disclaimer: I have no particular expertise in
> piece-wise regression. Having said that, it looks to me as if Ryan's model
> allows for discontinuities at the cut-points between age groups. Does that
> make sense in the context of your problem, Carol? Or do you want the
> function to be continuous at the cut-points? (I've not taken time to look
> at the website the example came from, so I don't know which way those folks
> specified their model.)
>
> Re interpretation of the coefficients, I always find it helpful to make a
> plot of fitted values as a function of the main explanatory variables. In
> your case, this will show graphically the slopes (and intercepts if you
> extrapolate) for Age within the various age groups, and help you map back
> to the coefficients.
>
> HTH.
>
>
> R B wrote
>>
>> Carol,
>>
>> It looks like you set up the model correctly, and that your interpretation
>> of the slopes is correct. However, I don't see why you centered age at the
>> grand mean. In addition to assessing for shifts in slopes from one age
>> group to the next, isn't the purpose of piecewise regression to see if
>> there is a shift in intercepts at the cutpoints? With that in mind, I
>> would
>> suggest that you NOT center age at any value before running the analysis.
>> I
>> repeat...I think you should enter age into the model in its original form.
>> Then you can easily estimate and compare the intercepts at the appropriate
>> age cutpoint for adjacent age groups using TEST statements. Concretely...
>>
>> According to your post, your cutpoints are 38 and 51. Therefore, I think
>> you would want to estimate the intercepts at age=38 for age groups 1 and
>> 2,
>> and test whether they are significantly different from each other. How do
>> you do this? Simple! Add the following TEST statements:
>>
>> /TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
>> /TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
>> /TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0
>> group*age 38 -38 0
>>
>> If you want to do the same for age groups 2 and 3, then you'd write the
>> following TEST statements:
>>
>> /TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
>> /TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
>> /TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1
>> group*age 0 51 -51
>>
>> The full MIXED code, including the above intercept TEST statements **AND**
>> slope TEST statements would look like this:
>>
>> MIXED y BY group WITH age
>> /FIXED=group group*age | NOINT SSTYPE(3)
>> /METHOD=REML
>> /PRINT=SOLUTION
>> /TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
>> /TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
>> /TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0
>> group*age 38 -38 0
>> /TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
>> /TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
>> /TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1
>> group*age 0 51 -51
>> /TEST = "grp 1 slope" group*age 1 0 0
>> /TEST = "grp 2 slope" group*age 0 1 0
>> /TEST = "grp 3 slope" group*age 0 0 1
>> /TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0
>> /TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1.
>>
>> A few points:
>>
>> (1) The group-specific slopes estimated from the TEST statements should
>> equal the group*age interaction coefficients reported in the "Estimates of
>> Fixed Effects" Table.
>> (2) The code above is UNTESTED. I'm too busy right now to test the code
>> above.
>> (3) I am no expert in piecewise regression. I'm simply extrapolating from
>> the two-category example provided on that website.
>>
>> HTH,
>>
>> Ryan
>> On Tue, May 1, 2012 at 7:48 PM, Parise, Carol A.
>> <PariseC@>wrote:
>>
>>> **
>>> Ryan,
>>>
>>> This nailed it. When Bruce stated....
>>>
>>> ****************************************
>>> If I followed, however, Ryan's model (see syntax below) included age as
>>> *both* a categorical variable (called Group) and a continuous variable
>>> (age). The interaction of those two variables (group*age) is what allows
>>> the slope for continuous age to vary by age group. That's more or less
>>> the
>>> same thing you're trying to accomplish by using piece-wise regression,
>>> right?
>>>
>>> MIXED y BY group WITH age
>>>
>>> /FIXED=group group*age | NOINT SSTYPE(3)
>>>
>>> /METHOD=REML
>>>
>>> /PRINT=SOLUTION.
>>> ***************************************************
>>> The lightbulb went on and i figured out why this made sense.
>>
>> --- snip ---
>>
>
>
> -----
> --
> Bruce Weaver
> [hidden email]
> http://sites.google.com/a/lakeheadu.ca/bweaver/
>
> "When all else fails, RTFM."
>
> NOTE: My Hotmail account is not monitored regularly.
> To send me an e-mail, please use the address shown above.
>
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Follow-up-to-piecewise-regression-question-tp5668949p5680294.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD

parisec

Re: Follow-up to piecewise regression question

In reply to this post by Bruce Weaver

Bruce,

The correlation between age and time is negative up to around age 38; from age 38-50 it is slightly positive; then at age 51+ it is highly positive. I think this makes the function continuous. Would you agree?

thanks
Carol
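
On the continuity question, the signs of the within-group correlations do not by themselves make the fitted function continuous; the separate-lines model can still jump at 38 and 51. A minimal, untested sketch of a version that forces the segments to join, assuming uncentered `age`, the dependent variable `timehrs` from the output posted earlier, and hypothetical helper variables `age38` and `age51`:

```spss
* Hinge terms: zero below the knot, years past the knot above it.
* age38 and age51 are hypothetical helper variables for this sketch.
COMPUTE age38 = MAX(age - 38, 0).
COMPUTE age51 = MAX(age - 51, 0).
EXECUTE.
* A single intercept, so the fitted line is continuous at both knots.
REGRESSION
  /DEPENDENT timehrs
  /METHOD=ENTER age age38 age51.
```

In this parameterization the coefficient for `age` is the slope below 38, the coefficient for `age38` is the change in slope at 38, and the coefficient for `age51` is the change in slope at 51, so the slope for 51+ is the sum of all three.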

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Bruce Weaver
Sent: Wednesday, May 02, 2012 4:54 AM
To: [hidden email]
Subject: Re: Follow-up to piecewise regression question

Let me begin by echoing Ryan's disclaimer: I have no particular expertise in piece-wise regression. Having said that, it looks to me as if Ryan's model
allows for discontinuities at the cut-points between age groups. Does that
make sense in the context of your problem, Carol? Or do you want the function to be continuous at the cut-points? (I've not taken time to look at the website the example came from, so I don't know which way those folks specified their model.)

Re interpretation of the coefficients, I always find it helpful to make a plot of fitted values as a function of the main explanatory variables. In your case, this will show graphically the slopes (and intercepts if you extrapolate) for Age within the various age groups, and help you map back to the coefficients.

HTH.

R B wrote

>
> Carol,
>
> It looks like you set up the model correctly, and that your
> interpretation of the slopes is correct. However, I don't see why you
> centered age at the grand mean. In addition to assessing for shifts in
> slopes from one age group to the next, isn't the purpose of piecewise
> regression to see if there is a shift in intercepts at the cutpoints?
> With that in mind, I would suggest that you NOT center age at any
> value before running the analysis. I repeat...I think you should enter
> age into the model in its original form. Then you can easily estimate
> and compare the intercepts at the appropriate age cutpoint for adjacent
> age groups using TEST statements. Concretely...
>
> According to your post, your cutpoints are 38 and 51. Therefore, I
> think you would want to estimate the intercepts at age=38 for age
> groups 1 and 2, and test whether they are significantly different from
> each other. How do you do this? Simple! Add the following TEST
> statements:
>
> /TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
> /TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
> /TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0 group*age 38 -38 0
>
> If you want to do the same for age groups 2 and 3, then you'd write
> the following TEST statements:
>
> /TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
> /TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
> /TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1 group*age 0 51 -51
>
> The full MIXED code, including the above intercept TEST statements
> **AND** slope TEST statements would look like this:
>
> MIXED y BY group WITH age
>  /FIXED=group group*age | NOINT SSTYPE(3)
>  /METHOD=REML
>  /PRINT=SOLUTION
>  /TEST = "int for grp 1 at age 38" group 1 0 0 group*age 38 0 0
>  /TEST = "int for grp 2 at age 38" group 0 1 0 group*age 0 38 0
>  /TEST = "diff in ints between grps 1 and 2 at age 38" group 1 -1 0 group*age 38 -38 0
>  /TEST = "int for grp 2 at age 51" group 0 1 0 group*age 0 51 0
>  /TEST = "int for grp 3 at age 51" group 0 0 1 group*age 0 0 51
>  /TEST = "diff in ints between grps 2 and 3 at age 51" group 0 1 -1 group*age 0 51 -51
>  /TEST = "grp 1 slope" group*age 1 0 0
>  /TEST = "grp 2 slope" group*age 0 1 0
>  /TEST = "grp 3 slope" group*age 0 0 1
>  /TEST = "diff in slopes between grp 1 and grp 2" group*age 1 -1 0
>  /TEST = "diff in slopes between grp 2 and grp 3" group*age 0 1 -1.
>
> A few points:
>
> (1) The group-specific slopes estimated from the TEST statements
> should equal the group*age interaction coefficients reported in the
> "Estimates of Fixed Effects" Table.
> (2) The code above is UNTESTED. I'm too busy right now to test the
> code above.
> (3) I am no expert in piecewise regression. I'm simply extrapolating
> from the two-category example provided on that website.
>
> HTH,
>
> Ryan
> On Tue, May 1, 2012 at 7:48 PM, Parise, Carol A.
> <PariseC@>wrote:
>
>> **
>> Ryan,
>>
>> This nailed it. When Bruce stated....
>>
>> ****************************************
>> If I followed, however, Ryan's model (see syntax below) included age
>> as *both* a categorical variable (called Group) and a continuous
>> variable (age). The interaction of those two variables (group*age) is
>> what allows the slope for continuous age to vary by age group. That's
>> more or less the same thing you're trying to accomplish by using
>> piece-wise regression, right?
>>
>> MIXED y BY group WITH age
>>
>> /FIXED=group group*age | NOINT SSTYPE(3)
>>
>> /METHOD=REML
>>
>> /PRINT=SOLUTION.
>> ***************************************************
>> The lightbulb went on and i figured out why this made sense.
>
> --- snip ---
>

-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

parisec

Re: Follow-up to piecewise regression question

In reply to this post by Ryan

Here is the result of using age in its original form and the /TEST statements.

What just dawned on me as I was looking at this is that since there is a significant interaction for age3*age, the main effect of age3 is no longer interpretable on its own - just like in any other ANOVA.

The effect of age on time depends on which part of the age continuum one is on. So, there is no significant effect of age on finish time under age 38, but between 38 and 50 you get slower by about .20 hrs for every year. If you are 51 or older, you still slow down but not as much (about .08 hrs per year).

Estimates of Fixed Effects^a
Parameter	Estimate	Std. Error	df	t	Sig.	95% CI Lower Bound	95% CI Upper Bound
[age3=1.00]	.171610	1.293038	13,805	.133	.894	-2.362920	2.706140
[age3=2.00]	-6.124152	1.282973	13,805	-4.773	.000	-8.638953	-3.609351
[age3=3.00]	0^b	0	.	.	.	.	.
[age3=1.00] * NewAge	.036716	.022293	13,805.000	1.647	.100	-.006981	.080413
[age3=2.00] * NewAge	.203924	.016284	13,805	12.523	.000	.172004	.235843
[age3=3.00] * NewAge	.082119	.019039	13,805	4.313	.000	.044800	.119438

/TEST = "diff in slopes between <38 and 38-50" age3*newAge 1 -1 0

The estimate is simply 0.037-0.203 and it's significant which means that the slopes change from one age group to another. I believe that this provides the rationale for selecting this cutpoint in the data.

Contrast Estimates^a,b
Contrast	Estimate	Std. Error	df	Test Value	t	Sig.	95% CI Lower Bound	95% CI Upper Bound
L1	-.167208	.027493	13,805	0	-6.082	.000	-.221098	-.113318

a. diff in slopes between <38 and 38-50

/TEST = "diff in slopes between <38 and 51+" age3*newAge 1 0 -1

The estimate is 0.037-0.082 and it's not significant, which means that the slopes aren't really that different.

Contrast	Estimate	Std. Error	df	Test Value	t	Sig.	95% CI Lower Bound	95% CI Upper Bound
L1	-.045403	.029255	13,805	0	-1.552	.121	-.102747	.011940

a. diff in slopes between <38 and 51+

/TEST = "diff in slopes between 38-50 and 51+" age3*newAge 0 1 -1.

The difference in slopes between 38-50 and 51+ is significant, which means that the cut point between those two age groups is justifiable.

Contrast Estimates^a,b
Contrast	Estimate	Std. Error	df	Test Value	t	Sig.	95% CI Lower Bound	95% CI Upper Bound
L1	.121804	.024961	13,805	0	4.880	.000	.072877	.170731

a. diff in slopes between 38-50 and 51+
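
For anyone following along, the three contrast estimates above are just weighted sums of the age3*NewAge slopes, using the weights written on each /TEST line. Here is a minimal sketch in Python (not SPSS) that redoes that arithmetic from the rounded coefficients in the table. The names `slopes`, `contrasts` and `contrast_estimate` are made up for illustration. It reproduces only the point estimates; the standard errors need the coefficient covariance matrix, which was not posted.

```python
# Slope estimates (hours per year of age) copied from the
# "Estimates of Fixed Effects" table: age3*NewAge for <38, 38-50, 51+.
slopes = [0.036716, 0.203924, 0.082119]

# Contrast weights exactly as written on the three /TEST subcommands.
contrasts = {
    "<38 vs 38-50": [1, -1, 0],
    "<38 vs 51+": [1, 0, -1],
    "38-50 vs 51+": [0, 1, -1],
}


def contrast_estimate(weights, coefs):
    """Return the weighted sum of coefficients (the contrast estimate)."""
    return sum(w * b for w, b in zip(weights, coefs))


estimates = {
    label: contrast_estimate(weights, slopes)
    for label, weights in contrasts.items()
}

for label, est in estimates.items():
    print(f"{label}: {est:+.6f}")
```

The results agree with the L1 rows above to within rounding of the posted coefficients, and they confirm that the "<38 vs 51+" contrast is the <38 slope minus the 51+ slope.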

In the end, people under 38 and over 51 slow down by just around the same amount of time...but you slow down more in those middle ages.

Thanks again for taking the time to post. This was actually really fun to work through and see how it affected my results.

Carol

Bruce Weaver

Re: Follow-up to piecewise regression question

In reply to this post by parisec

Hi Carol. I think it's more a matter of whether it makes sense for there to (possibly) be a big change in the fitted value as you move from one age group to the next. Given that people on either side of the cut-point can differ by as little as one day in age, I would argue it usually doesn't make a lot of sense, and that you would ordinarily want a function that is continuous at the cut-points.

HTH.
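
To make the continuity question concrete, here is a rough sketch in Python (not SPSS) of the size of the jump in fitted finish time at each cut-point. It uses the no-intercept estimates Carol posts later in this thread (the run with age3 listed first on /FIXED). Assumptions: NewAge is uncentered age in years, and the num_sex and racenum10 terms are additive, so they cancel when two age groups are compared at the same age. The function names `fitted` and `jump_at` are illustrative only.

```python
# Group-specific intercepts and slopes from the no-intercept (NOINT) run
# with age3 first on /FIXED. Groups: 1 = <38, 2 = 38-50, 3 = 51+.
intercepts = {1: 24.620714, 2: 18.324952, 3: 24.449104}
slopes = {1: 0.036716, 2: 0.203924, 3: 0.082119}


def fitted(group, age):
    """Fitted finish time (hours) on a group's line, other factors at reference."""
    return intercepts[group] + slopes[group] * age


def jump_at(cutpoint, lower_group, upper_group):
    """Difference between the two adjacent groups' lines at the cut-point."""
    return fitted(upper_group, cutpoint) - fitted(lower_group, cutpoint)


jump_38 = jump_at(38, 1, 2)
jump_51 = jump_at(51, 2, 3)

print(f"jump at 38: {jump_38:+.3f} hrs")
print(f"jump at 51: {jump_51:+.3f} hrs")
```

On these numbers the lines nearly meet at both cut-points (the jumps are under a tenth of an hour), but that is a by-product of the data rather than something the model enforces. Whether the jumps differ from zero is what Ryan's "diff in ints" TEST statements would assess.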

-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

Ryan

Re: Follow-up to piecewise regression question

In reply to this post by parisec

Carol,

Something's wrong with your analysis. Your "Estimates of Fixed Effects" should include three main effect terms (representing intercepts) and three interaction terms (representing slopes). Instead, it looks like you have a reference category for age. Did you run the analysis a couple different ways?

Ryan

Ryan

Re: Follow-up to piecewise regression question

Carol,

The following "Estimates of Fixed Effects" Table was taken from the MIXED model analysis I ran initially on the talk.sav dataset from that website. Notice that the grand intercept has been removed and there are no reference categories. The main effects represent group-specific intercepts and the interaction effects represent group-specific slopes. I don't have much more time to dedicate to this topic, but I urge you to review the code I posted and to make sure that you are parameterizing your model the same way.

Ryan

Estimates of Fixed Effects^a
Parameter	Estimate	Std. Error	df	t	Sig.	95% CI Lower Bound	95% CI Upper Bound
[group=.00]	8.076798	4.563993	196	1.770	.078	-.924042	17.077637
[group=1.00]	-24.972667	5.430149	196	-4.599	.000	-35.681687	-14.263646
[group=.00] * age	.681919	.437809	196	1.558	.121	-.181501	1.545340
[group=1.00] * age	3.629046	.286749	196.000	12.656	.000	3.063537	4.194554
a. Dependent Variable: talking on the phone.

parisec

Re: Follow-up to piecewise regression question

This is weird. I included 2 additional fixed factors in the model (num_sex and racenum10) but only posted the output for age.

When I run your code with the additional variables after the age variables on the /FIXED line:

MIXED timehrs BY num_sex racenum10 age3 WITH newage
/FIXED=age3 age3*newage num_sex racenum10 | NOINT SSTYPE(3)

...I get what you posted: no reference category for age3, which now has 3 parameters, and the age intercepts make logical sense for the data. It also correctly leaves sex=1 as the refcat.

Estimates of Fixed Effects^a
Parameter	Estimate	Std. Error	df	t	Sig.	95% CI Lower Bound	95% CI Upper Bound
[age3=1.00]	24.620714	.730377	13,805	33.710	.000	23.189075	26.052353
[age3=2.00]	18.324952	.720691	13,805	25.427	.000	16.912300	19.737603
[age3=3.00]	24.449104	1.072402	13,805	22.798	.000	22.347050	26.551158
[age3=1.00] * NewAge	.036716	.022293	13,805	1.647	.100	-.006981	.080413
[age3=2.00] * NewAge	.203924	.016284	13,805	12.523	.000	.172004	.235843
[age3=3.00] * NewAge	.082119	.019039	13,805	4.313	.000	.044800	.119438
[num_sex=0]	1.107044	.123436	13,805	8.969	.000	.865092	1.348996
[num_sex=1]	0	0	.	.	.	.	.
a. Dependent Variable: timehrs.

Model Dimension^a
		Number of Levels	Number of Parameters
Fixed Effects	age3	3	3
	age3 * NewAge	3	3
	num_sex	2	1
a. Dependent Variable: timehrs.

When I run it with the 2 additional variables *first* on the /FIXED line:

MIXED
timehrs by num_sex racenum10 age3 with newAge
/FIXED = num_sex racenum10 age3 age3*newage |NOINT SSTYPE(3)

...I get what I posted, where age3 has only 2 parameters and sex is included without a reference category.

Estimates of Fixed Effects^a
Parameter	Estimate	Std. Error	df	t	Sig.	95% CI Lower Bound	95% CI Upper Bound
[num_sex=0]	25.556148	1.073820	13,805	23.799	.000	23.451315	27.660981
[num_sex=1]	24.449104	1.072402	13,805	22.798	.000	22.347050	26.551158
[age3=1.00]	.171610	1.293038	13,805	.133	.894	-2.362920	2.706140
[age3=2.00]	-6.124152	1.282973	13,805	-4.773	.000	-8.638953	-3.609351
[age3=3.00]	0	0	.	.	.	.	.
[age3=1.00] * NewAge	.036716	.022293	13,805.000	1.647	.100	-.006981	.080413
[age3=2.00] * NewAge	.203924	.016284	13,805	12.523	.000	.172004	.235843
[age3=3.00] * NewAge	.082119	.019039	13,805	4.313	.000	.044800	.119438
a. Dependent Variable: timehrs.

Model Dimension
		Number of Levels	Number of Parameters
Fixed Effects	num_sex	2	2
	age3	3	2
	age3 * NewAge	3	3

I then ran

MIXED timehrs BY num_sex racenum10 age3 WITH newage
/FIXED=racenum10 age3 age3*newage num_sex | NOINT SSTYPE(3)

...and guess what - racenum10 no longer had a refcat but both sex and age did.

I never thought it mattered what order the fixed factors were placed on the /FIXED line, but this completely changed which factor gets the full set of parameters.
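
A quick check suggests the two orderings describe the same fitted model. With NOINT, the first factor listed appears to absorb the intercept (one parameter per level) and the later factors each lose a level as redundant. The sketch below, in Python rather than SPSS, adds up the intercept-type estimates from the two posted runs for every age3 by num_sex cell. The racenum10 coefficients were not posted, so it is left out, which amounts to assuming its reference level in both runs. The names `run_a`, `run_b` and `cell_intercept` are illustrative only.

```python
# Intercept-type estimates copied from the two runs posted above.
# Run A: age3 listed first on /FIXED; run B: num_sex listed first.
run_a = {
    "age3": {1: 24.620714, 2: 18.324952, 3: 24.449104},
    "num_sex": {0: 1.107044, 1: 0.0},
}
run_b = {
    "age3": {1: 0.171610, 2: -6.124152, 3: 0.0},
    "num_sex": {0: 25.556148, 1: 24.449104},
}


def cell_intercept(run, age_group, sex):
    """Intercept implied for one age3 x num_sex cell (racenum10 at reference)."""
    return run["age3"][age_group] + run["num_sex"][sex]


# Largest disagreement between the two runs over all six cells.
max_gap = max(
    abs(cell_intercept(run_a, g, s) - cell_intercept(run_b, g, s))
    for g in (1, 2, 3)
    for s in (0, 1)
)

print(f"largest difference between runs: {max_gap:.6f}")
```

The cell intercepts match to the precision of the posted output, and the age3*NewAge slopes are identical in both tables. So the order on /FIXED changes how the estimates are labelled, not the fit. To read the age3 main effects directly as group intercepts, as in Ryan's setup, age3 has to come first.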

Carol
