Dummy Coding and Interpreting Regression Analysis

Justin Meyer-3
SPSS Listers:

 

I am working to determine if a subjective rating of schools'
implementation is a predictor of posttest score for students in those
schools. In previous analyses, the significant predictors of posttest
score were pretest score (scaled score from about 300 to 600), gender of
student (male or female) and economic status of student (Free lunch or
not). For the subjective rating of schools' implementation, schools are
rated as tier 1, 2, or 3, with 1 being the best implementation and 3
being the worst. I am using a regression analysis, entering all of the
variables at the same time. Because the tier status consists of three
possible responses, I dummy coded it into two variables. The first
variable is 1 for "Tier 2", 0 for "not Tier 2". The second variable is 1
for "Tier 3", and 0 for "not Tier 3". Is this the correct way to code
this variable for a regression analysis?
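Editor's sketch: the coding described above can be written out as follows. This is only a minimal illustration in Python, not the SPSS syntax actually used; the function name `dummy_code_tier` and the sample tier values are made up for the example. Tier 1 receives 0 on both indicators, which makes it the reference category.

```python
def dummy_code_tier(tier):
    """Return (tier2, tier3) indicators for a school tier of 1, 2, or 3.

    Tier 1 is coded 0 on both indicators, so it acts as the
    reference category in the regression.
    """
    if tier not in (1, 2, 3):
        raise ValueError(f"unexpected tier: {tier!r}")
    tier2 = 1 if tier == 2 else 0  # "School is Tier 2"
    tier3 = 1 if tier == 3 else 0  # "School is Tier 3"
    return (tier2, tier3)


# Hypothetical tier values, for illustration only.
example_tiers = [1, 2, 3, 2, 1]
coded = [dummy_code_tier(t) for t in example_tiers]
for tier, (tier2, tier3) in zip(example_tiers, coded):
    print(f"tier={tier}  tier2={tier2}  tier3={tier3}")
```

A variable with k categories needs k - 1 indicators; the category left without its own indicator is the one every coefficient is compared against.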

 

Also, I found a b (unstandardized coefficient) of -10.269 for Tier 2 and
21.171 for Tier 3. Does this mean that, when all other variables are
equal, students in Tier 2 score an average of 10 points less on the
posttest and students in Tier 3 score an average of 21 points more on
the posttest when compared to Tier 1? Or are the unstandardized
coefficients comparing Tier 2 with both Tiers 1 and 3 and Tier 3 with
both Tiers 1 and 2, respectively? This seems like a simple question, but
I can't find much information about how dummy coding works.
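Editor's sketch: one way to see what the two coefficients compare is to plug the reported B values (from the Coefficients output pasted further down) into the regression equation for three otherwise identical students. The student profile below (pretest 426, gender code 1, economic-status code 0) is hypothetical and chosen only for illustration, and the short key names are this sketch's own labels.

```python
# Unstandardized coefficients copied from the Coefficients output in this post.
COEF = {
    "const": 178.839,
    "pretest": 0.769,   # scaled_score1
    "gender": 4.282,    # gender_Recoded
    "econ": -13.272,    # economic_status_recodedRecoded
    "tier2": -10.269,   # School is Tier 2
    "tier3": 21.171,    # School is Tier 3
}


def predict(pretest, gender, econ, tier):
    """Predicted posttest score from the fitted equation."""
    tier2 = 1 if tier == 2 else 0
    tier3 = 1 if tier == 3 else 0
    return (COEF["const"]
            + COEF["pretest"] * pretest
            + COEF["gender"] * gender
            + COEF["econ"] * econ
            + COEF["tier2"] * tier2
            + COEF["tier3"] * tier3)


# Hypothetical student profile; only the tier changes between predictions.
profile = {"pretest": 426.0, "gender": 1, "econ": 0}
p1 = predict(tier=1, **profile)
p2 = predict(tier=2, **profile)
p3 = predict(tier=3, **profile)

print("Tier 2 minus Tier 1:", round(p2 - p1, 3))
print("Tier 3 minus Tier 1:", round(p3 - p1, 3))
print("Tier 3 minus Tier 2:", round(p3 - p2, 3))
```

Whatever profile is chosen, the Tier 2 and Tier 3 predictions differ from the Tier 1 prediction by exactly the two dummy coefficients, because a Tier 1 student has 0 on both indicators. The Tier 3 versus Tier 2 difference is the difference between the two coefficients.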

 

Thank you for any help you can provide. Let me know if I need to explain
more.

 

The output from the regression, except for charts, is pasted below:

 

Descriptive Statistics

                                 Mean     Std. Deviation   N
sscaled_score1                   503.15   51.110           1970
scaled_score1                    426.24   44.629           1970
gender_Recoded                   .51      .500             1970
economic_status_recodedRecoded   .2766    .44746           1970
School is Tier 2                 .25      .434             1970
School is Tier 3                 .02      .144             1970

 

 
Correlations

Column key: (1) sscaled_score1, (2) scaled_score1, (3) gender_Recoded,
(4) economic_status_recodedRecoded, (5) School is Tier 2,
(6) School is Tier 3

Pearson Correlation                   (1)     (2)     (3)     (4)     (5)     (6)
(1) sscaled_score1                   1.000    .710    .103   -.264   -.117    .075
(2) scaled_score1                     .710   1.000    .090   -.239   -.052    .027
(3) gender_Recoded                    .103    .090   1.000   -.012    .009    .006
(4) economic_status_recodedRecoded   -.264   -.239   -.012   1.000   -.088    .097
(5) School is Tier 2                 -.117   -.052    .009   -.088   1.000   -.086
(6) School is Tier 3                  .075    .027    .006    .097   -.086   1.000

Sig. (1-tailed)                       (1)     (2)     (3)     (4)     (5)     (6)
(1) sscaled_score1                    .       .000    .000    .000    .000    .000
(2) scaled_score1                     .000    .       .000    .000    .011    .112
(3) gender_Recoded                    .000    .000    .       .298    .340    .403
(4) economic_status_recodedRecoded    .000    .000    .298    .       .000    .000
(5) School is Tier 2                  .000    .011    .340    .000    .       .000
(6) School is Tier 3                  .000    .112    .403    .000    .000    .

N = 1970 for every pair of variables.

 

Variables Entered/Removed(b)

Model   Variables Entered                    Variables Removed   Method
1       School is Tier 3, gender_Recoded,    .                   Enter
        School is Tier 2, scaled_score1,
        economic_status_recodedRecoded(a)

a  All requested variables entered.

b  Dependent Variable: sscaled_score1

 

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .726(a)   .526       .525                35.220

a  Predictors: (Constant), School is Tier 3, gender_Recoded, School is
Tier 2, scaled_score1, economic_status_recodedRecoded

b  Dependent Variable: sscaled_score1

 

 
ANOVA(b)

Model 1      Sum of Squares   df     Mean Square   F         Sig.
Regression   2707279.702      5      541455.940    436.512   .000(a)
Residual     2436172.778      1964   1240.414
Total        5143452.479      1969

a  Predictors: (Constant), School is Tier 3, gender_Recoded, School is
Tier 2, scaled_score1, economic_status_recodedRecoded

b  Dependent Variable: sscaled_score1
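Editor's sketch: the Model Summary figures can be recovered from the ANOVA sums of squares, which is a quick way to confirm the pasted output is internally consistent. The Python below is only an arithmetic check using the numbers reported above; the variable names are this sketch's own.

```python
import math

# Values copied from the ANOVA table above.
ss_regression = 2707279.702
ss_residual = 2436172.778
df_regression = 5
df_residual = 1964

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1          # number of cases

r_squared = ss_regression / ss_total          # share of variance explained
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual
std_error_of_estimate = math.sqrt(ms_residual)

print(f"N = {n}")
print(f"R Square = {r_squared:.3f}")
print(f"Adjusted R Square = {adjusted_r_squared:.3f}")
print(f"F = {f_statistic:.3f}")
print(f"Std. Error of the Estimate = {std_error_of_estimate:.3f}")
```

Note that the "Std. Error of the Estimate" is the residual standard deviation for the whole model. It is a different quantity from the per-coefficient "Std. Error" values in the Coefficients table.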

 

 
Coefficients(a)

Model 1: unstandardized and standardized coefficients

                                 B         Std. Error   Beta    t        Sig.
(Constant)                       178.839   8.063                22.181   .000
scaled_score1                    .769      .018         .672    41.675   .000
gender_Recoded                   4.282     1.594        .042    2.687    .007
economic_status_recodedRecoded   -13.272   1.846        -.116   -7.191   .000
School is Tier 2                 -10.269   1.845        -.087   -5.567   .000
School is Tier 3                 21.171    5.542        .060    3.820    .000

Model 1: correlations and collinearity statistics

                                 Zero-order   Partial   Part    Tolerance   VIF
scaled_score1                    .710         .685      .647    .928        1.078
gender_Recoded                   .103         .061      .042    .992        1.009
economic_status_recodedRecoded   -.264        -.160     -.112   .924        1.083
School is Tier 2                 -.117        -.125     -.086   .981        1.019
School is Tier 3                 .075         .086      .059    .982        1.018

a  Dependent Variable: sscaled_score1
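Editor's sketch: the B column is in posttest points per unit of the predictor, while the Beta column is in standard-deviation units. For an ordinary least-squares fit the two are related by Beta = B * SD(predictor) / SD(dependent variable). The check below uses the standard deviations from the Descriptive Statistics output; the names in the code are this sketch's own, and agreement is only to about three decimals because the pasted values are rounded.

```python
# Standard deviation of the dependent variable (sscaled_score1).
SD_POSTTEST = 51.110

# name: (B, SD of predictor, Beta as reported), all copied from the output.
PREDICTORS = {
    "scaled_score1": (0.769, 44.629, 0.672),
    "gender_Recoded": (4.282, 0.500, 0.042),
    "economic_status_recodedRecoded": (-13.272, 0.44746, -0.116),
    "School is Tier 2": (-10.269, 0.434, -0.087),
    "School is Tier 3": (21.171, 0.144, 0.060),
}


def standardized_beta(b, sd_x, sd_y=SD_POSTTEST):
    """Convert an unstandardized coefficient to a standardized one."""
    return b * sd_x / sd_y


betas = {name: standardized_beta(b, sd_x)
         for name, (b, sd_x, _reported) in PREDICTORS.items()}

for name, (_b, _sd_x, reported) in PREDICTORS.items():
    print(f"{name}: computed {betas[name]:.3f}, reported {reported:.3f}")
```

For the question asked in this thread, the unstandardized B is the more natural number to report, since a one-unit change in a dummy variable is simply the move from Tier 1 to the tier in question.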

 

 
Coefficient Correlations(a)

Model 1. Column key: (1) School is Tier 3, (2) gender_Recoded,
(3) School is Tier 2, (4) scaled_score1,
(5) economic_status_recodedRecoded

Correlations                          (1)      (2)      (3)      (4)      (5)
(1) School is Tier 3                  1.000    -.003     .074    -.046    -.099
(2) gender_Recoded                    -.003    1.000    -.015    -.090    -.011
(3) School is Tier 2                   .074    -.015    1.000     .073     .095
(4) scaled_score1                     -.046    -.090     .073    1.000     .248
(5) economic_status_recodedRecoded    -.099    -.011     .095     .248    1.000

Covariances                           (1)      (2)      (3)      (4)      (5)
(1) School is Tier 3                 30.719    -.028     .760    -.005   -1.014
(2) gender_Recoded                    -.028    2.540    -.045    -.003    -.032
(3) School is Tier 2                   .760    -.045    3.402     .002     .323
(4) scaled_score1                     -.005    -.003     .002     .000     .008
(5) economic_status_recodedRecoded   -1.014    -.032     .323     .008    3.407

a  Dependent Variable: sscaled_score1

 

 
Collinearity Diagnostics(a)

Model 1. Variance Proportions column key: (C) (Constant),
(1) scaled_score1, (2) gender_Recoded,
(3) economic_status_recodedRecoded, (4) School is Tier 2,
(5) School is Tier 3

                                           Variance Proportions
Dimension   Eigenvalue   Condition Index   (C)   (1)   (2)   (3)   (4)   (5)
1           3.277        1.000             .00   .00   .03   .02   .02   .00
2           1.025        1.788             .00   .00   .00   .04   .10   .74
3           .746         2.096             .00   .00   .00   .38   .38   .25
4           .604         2.328             .00   .00   .23   .38   .39   .00
5           .342         3.095             .01   .01   .73   .10   .09   .00
6           .005         25.692            .99   .99   .00   .08   .01   .00

a  Dependent Variable: sscaled_score1

 

 
Residuals Statistics(a)

                                    Minimum    Maximum   Mean     Std. Deviation   N
Predicted Value                     385.35     631.69    503.15   37.080           1970
Std. Predicted Value                -3.177     3.467     .000     1.000            1970
Standard Error of Predicted Value   1.324      6.064     1.826    .666             1970
Adjusted Predicted Value            385.36     632.21    503.15   37.087           1970
Residual                            -119.496   155.510   .000     35.175           1970
Std. Residual                       -3.393     4.415     .000     .999             1970
Stud. Residual                      -3.405     4.421     .000     1.000            1970
Deleted Residual                    -120.366   155.897   -.001    35.286           1970
Stud. Deleted Residual              -3.414     4.442     .000     1.001            1970
Mahal. Distance                     1.785      57.368    4.997    6.982            1970
Cook's Distance                     .000       .026      .001     .001             1970
Centered Leverage Value             .001       .029      .003     .004             1970

a  Dependent Variable: sscaled_score1

 

 

____________________________________

Justin Meyer

Researcher

Rowland Reading Foundation

1 South Pinckney Street, Suite 324

Madison, WI  53703

phone: 866-370-7323  fax: 608-204-3846

www.rowlandreading.org <http://www.rowlandreading.org/>

____________________________________

 

====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: Dummy Coding and Interpreting Regression Analysis

Richard Ristow
At 03:36 PM 1/18/2008, Justin Meyer wrote:

>I am working to determine if a subjective rating of schools'
>implementation is a predictor of posttest score for students in
>those schools. For the subjective rating of schools' implementation,
>schools are rated as tier 1, 2, or 3, with 1 being the best
>implementation and 3 being the worst. I am using a regression
>analysis, entering all of the variables at the same time. Because
>the tier status consists of three possible responses, I dummy coded
>it into two variables. The first variable is 1 for "Tier 2", 0 for
>"not Tier 2". The second variable is 1 for "Tier 3", and 0 for "not
>Tier 3". Is this the correct way to code this variable for a
>regression analysis?

It is certainly correct. There are some variations that are also correct.

>I found a b (unstandardized coefficient) of -10.269 for Tier 2 and
>21.171 for Tier 3. Does this mean that, when all other variables are
>equal, students in Tier 2 score an average of 10 points less on the
>posttest and students in Tier 3 score an average of 21 points more
>on the posttest when compared to Tier 1?

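It means exactly that: each coefficient compares its tier with Tier 1,
the category coded 0 on both dummy variables, not with the other two
tiers combined. However, don't forget that these estimates should be
expressed as confidence intervals: "95% confidence interval is...",
and give the range. The 95% confidence interval is approximately the
estimated value, +/- twice the standard error of that coefficient (the
"Std. Error" column of the Coefficients table, not the "Std. Error of
the Estimate" in the Model Summary).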
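Editor's sketch: applied to the two tier coefficients, the interval works out as below. This uses each coefficient's own standard error (the "Std. Error" column of the Coefficients output, 1.845 and 5.542) and the large-sample multiplier 1.96 in place of "twice"; with 1964 residual degrees of freedom the exact t multiplier is very close to that. The function name is this sketch's own.

```python
def confidence_interval(b, std_error, multiplier=1.96):
    """Approximate 95% confidence interval for a regression coefficient."""
    half_width = multiplier * std_error
    return (b - half_width, b + half_width)


# B and Std. Error copied from the Coefficients output in the original post.
tier2_ci = confidence_interval(-10.269, 1.845)
tier3_ci = confidence_interval(21.171, 5.542)

print(f"Tier 2 vs Tier 1: {tier2_ci[0]:.2f} to {tier2_ci[1]:.2f}")
print(f"Tier 3 vs Tier 1: {tier3_ci[0]:.2f} to {tier3_ci[1]:.2f}")
```

The Tier 3 interval is much wider than the Tier 2 interval because only about 2% of the 1970 students are in Tier 3 schools, so that difference is estimated from far fewer cases.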

>The output from the regression, except for charts, is pasted below:

Thank you for that. However, as you may see, it came through with so
many line breaks added that one can't do much with it.

So, does this get you farther?

-Best of luck,
  Richard Ristow
