Interaction terms in regression

Interaction terms in regression

Maria Sapouna
Hello,

I would like to include an interaction term between a categorical and a
continuous variable in a multiple regression.

I would appreciate it if someone could help me with this.

Thank you in advance.

Best wishes,
Maria

Re: Interaction terms in regression

statisticsdoc
Maria,

You can accomplish this as follows:

1.) Dummy code the categorical variable (k-1 dummy codes for k levels of the
categorical variable; the level left out is the reference category)

2.) Compute the product of the continuous variable and each one of the dummy
codes (k-1 cross products)

3.) Enter the continuous variable and the k-1 dummy codes in the regression
model (i.e., the main effects)

4.) Then, for the interaction, enter all of the k-1 cross-products from Step
#2 as one block.  The increment in R-squared in this step is due to the
addition of the interaction between the continuous and categorical variable.
The unstandardized weight for each cross-product is the difference between
the slope of the continuous variable at that level of the categorical
variable and its slope in the reference category; the weight for the
continuous variable itself is then its slope in the reference category.
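
The four steps above can be sketched in code. This is only an illustration in plain Python rather than SPSS, and everything in it is an assumption made for the example: the made-up data, the three-level factor `group` with reference level "a", the continuous predictor `x`, and the small hand-rolled least-squares helper `ols`.

```python
import random

def ols(columns, y):
    """Least-squares fit of y on the given predictor columns plus an intercept.
    Returns (coefficients, R-squared); coefficients[0] is the intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - rss / tss

# Hypothetical data: each group has its own (intercept, slope) plus small noise.
rng = random.Random(0)
true_line = {"a": (1.0, 0.5), "b": (3.0, 1.5), "c": (0.0, 0.1)}
group, x, y = [], [], []
for g, (b0, b1) in true_line.items():
    for xi in range(10):
        group.append(g)
        x.append(float(xi))
        y.append(b0 + b1 * xi + rng.uniform(-0.3, 0.3))

# Step 1: k-1 = 2 dummy codes; "a" is the reference category.
d_b = [1.0 if g == "b" else 0.0 for g in group]
d_c = [1.0 if g == "c" else 0.0 for g in group]
# Step 2: k-1 cross-products of the continuous variable with each dummy.
x_b = [xi * d for xi, d in zip(x, d_b)]
x_c = [xi * d for xi, d in zip(x, d_c)]
# Step 3: main effects only.
_, r2_main = ols([x, d_b, d_c], y)
# Step 4: add the cross-products as one block.
beta_full, r2_full = ols([x, d_b, d_c, x_b, x_c], y)

print(f"R-squared, main effects:           {r2_main:.3f}")
print(f"R-squared, with interaction block: {r2_full:.3f}")
print(f"Increment due to the interaction:  {r2_full - r2_main:.3f}")
print(f"Slope of x in reference group a:   {beta_full[1]:.2f}")
print(f"Slope difference, b vs a:          {beta_full[4]:.2f}")
print(f"Slope difference, c vs a:          {beta_full[5]:.2f}")
```

With these made-up slopes (0.5, 1.5 and 0.1), the two cross-product weights should come out near 1.0 and -0.4, i.e. each is a difference from the reference slope rather than a slope in its own right.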

If you wish to refer to a text on this matter, you could look at West and
Aiken, Cohen and Cohen, or Pedhazur, for example.

Best,

Stephen Brand

For personalized and professional consultation in statistics and research
design, visit
www.statisticsdoc.com



Re: Interaction terms in regression

Muir Houston
Depending on your continuous variable, you may also want to include squared terms. In a recent investigation I used interaction terms for age and gender, where age is a continuous variable and gender is coded 1=female and 0=male.

However, previous research suggested that age was better modelled as a quadratic than as a linear term, so the square of age was also included. This gave five candidate terms: age, age², female, age × female and age² × female.

The Mallows (1973) approach can then be used to choose among them. It involves: (i) estimating a model for each of the 32 possible combinations of these five terms (ranging from including none of them through to including all five); (ii) calculating a summary measure called Mallows' Cp for each model, where 'p' is the number of variables in the model; and (iii) selecting as the 'best subset' the collection with Mallows' Cp closest to p+1.

Mallows CL (1973): Some comments on Cp. Technometrics 15, pp 661-676.
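
The subset search described above can be sketched as follows. This is a rough illustration on made-up data, not the analysis from the study: the helper `rss`, the simulated outcome, and the choice to centre age at 40 and express it in decades (done here only to keep the arithmetic well conditioned) are all assumptions. Cp is computed as RSS_p / s² - n + 2(p+1), with s² taken from the model containing all five terms.

```python
import itertools
import random

def rss(columns, y):
    """Residual sum of squares of y regressed on the columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

# Hypothetical data: ages 20..60 for each gender, outcome quadratic in age.
rng = random.Random(1)
terms = {"age": [], "age2": [], "female": [], "age*female": [], "age2*female": []}
y = []
for female in (0.0, 1.0):
    for age in range(20, 61):
        a = (age - 40) / 10.0  # centred at 40, in decades (assumption)
        terms["age"].append(a)
        terms["age2"].append(a * a)
        terms["female"].append(female)
        terms["age*female"].append(a * female)
        terms["age2*female"].append(a * a * female)
        y.append(10 + 3 * a - 1.5 * a * a + 2 * female + a * female
                 + rng.gauss(0, 1))

n, k = len(y), len(terms)
sigma2 = rss(list(terms.values()), y) / (n - k - 1)  # error variance, full model

results = []
for size in range(k + 1):
    for subset in itertools.combinations(terms, size):
        p = len(subset)
        cp = rss([terms[t] for t in subset], y) / sigma2 - n + 2 * (p + 1)
        results.append((abs(cp - (p + 1)), p, cp, subset))

results.sort()  # smallest |Cp - (p+1)| first
for gap, p, cp, subset in results[:5]:
    print(f"p={p}  Cp={cp:6.2f}  |Cp-(p+1)|={gap:5.2f}  {subset}")
```

One thing to keep in mind when reading the output: the model with all five terms has Cp = p+1 by construction, because s² is estimated from it. The informative comparison is therefore among the smaller subsets whose Cp is also close to p+1.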

 
Muir Houston
Research Fellow
CRLL
Institute of Education
University of Stirling
FK9 4LA
01786-46-7615


Re: Interaction terms in regression

Richard Ristow
Some comments.

At 02:50 AM 2/22/2007, Maria Sapouna wrote:

>I would like to include an interaction term between a categorical and
>a continuous variable in a multiple regression.

At 05:32 AM 2/22/2007, Statisticsdoc wrote, giving correct advice:

>2.) Compute the product of the continuous variable and [the dummy code
>for all but one category of the categorical];
>
>3.) Enter [first] the continuous variable and the dummy-coded
>categorical variable in the regression model (i.e., the main effects)
>
>4.) Then, enter all of the k-1 cross-products from Step #2 as one
>block.

a. In the SPSS REGRESSION command, /METHOD=TEST is good for this.

b. You've just added k-1 independent variables. Check (carefully) that
your sample size is still adequate. (Overall warning: Testing
interaction effects often raises needed sample size drastically.)
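
For reference, the usual F test for the R-squared increment from an added block can be written out in a few lines. This is a sketch with made-up numbers, not SPSS output; the function name `f_change` and all the values passed to it are hypothetical.

```python
def f_change(r2_reduced, r2_full, n, p_full, q):
    """F statistic for the R-squared increment when q predictors are added.

    n      : sample size
    p_full : number of predictors in the larger model (intercept not counted)
    q      : number of predictors in the added block (k-1 cross-products here)
    Returns (F, df1, df2).
    """
    df1 = q
    df2 = n - p_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2

# Hypothetical case: 3-level factor, so 2 cross-products join x and 2 dummies.
f, df1, df2 = f_change(r2_reduced=0.40, r2_full=0.46, n=100, p_full=5, q=2)
print(f"F({df1}, {df2}) = {f:.2f}")  # prints "F(2, 94) = 5.22"
```

The denominator degrees of freedom, n - p_full - 1, shrink by one for every term added, which is one way to see why interaction blocks put pressure on sample size.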

At 05:43 AM 2/22/2007, Muir Houston wrote:

>You may want to include squared terms as well - in a recent
>investigation I used interaction terms for age and gender, where age
>is continuous and gender is coded 1=female and 0=male. [In my study,]
>the square of age was also included - a series of interaction terms
>which included all combinations of age*female and age2*female [where
>'age2'=age**2] and [the dummy variable for] female.

c. This may be useful, but it 'eats' sample size even more quickly:
it's 2*(k-1)+1 new independent variables, instead of k-1. (The extra
'+1' is the main-effect squared term.) If the categorical is gender,
i.e. k=2, this isn't so severe. But even three more independent
variables isn't trivial; check sample size requirements very carefully.
(Did I say this before?)

d. Don't use age**2. For many populations (e.g., any population
consisting of adults, say age>=20), age and age**2 are highly
correlated, enough to impair precision of estimation badly. Use
something like (age-40)**2, if 40 is reasonably near the mean age in
your population.
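
A quick check of point d, as a sketch in plain Python. The sample is hypothetical: one person at each age from 20 to 60, so the mean age is exactly 40. The helper `pearson` is written out here only to keep the example self-contained.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

ages = list(range(20, 61))  # hypothetical adult sample, mean age 40
raw = pearson(ages, [a ** 2 for a in ages])
centered = pearson(ages, [(a - 40) ** 2 for a in ages])
print(f"corr(age, age**2)      = {raw:.3f}")
print(f"corr(age, (age-40)**2) = {centered:.3f}")
```

In this symmetric sample the raw correlation is roughly 0.99, while the centred squared term is uncorrelated with age. In real data the centred correlation will not be exactly zero, but it is usually far smaller than the raw one.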