Regression and effect size question

Regression and effect size question

Marsha and Mike SZYMCZUK
After developing a regression model with several categorical variables such as gender or white vs. non-white, a colleague would like to estimate the effect size of each categorical variable.
 
I know that you can use something like
 
(group 2 mean - group 1 mean)/ pooled standard deviation.
 
My question is....
 
How does one compute the pooled standard deviation?

When computing the pooled STD, do you adjust the weighting by including the number of variables in the model, i.e., (n1 - k - 1)?
 
I am limited to SPSS 11.5 (poor school district).

Re: Regression and effect size question

Zdaniuk, Bozena-2
Why not do stepwise regression, entering the categorical variable at the last step? SPSS then gives you the value and significance of the increase in R-squared.
bozena


Re: Regression and effect size question

Art Kendall
That would be a stepped or hierarchical approach and not the nefarious stepwise approach.

If you are using regression procedures to do an ANOVA, why not also run the same data through an ANOVA or GLM procedure, which gives you all of the applicable effects?

Art Kendall
Social Research Consultants


Re: Regression and effect size question

statisticsdoc
In reply to this post by Zdaniuk, Bozena-2
As Art points out, what is needed is hierarchical regression, in which you deliberately control the order in which variables enter the model.
Check out the option in SPSS regression that specifies Method = Enter.
Entering variables in successive blocks lets you specify the order of entry, so you can determine how much R-squared increases when the categorical variable is entered last. If the categorical variable has more than two levels, make sure that you dummy code the categories and enter them as a block (e.g., if you have three levels of a categorical variable, use two dummy codes and enter both in the same step of the regression). The increase in R-squared shows the unique contribution of the categorical variable, controlling for all of the other variables in the model.
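
The blockwise entry described above might be sketched in syntax roughly as follows. This is only an illustration under assumptions: the variable names (score, pretest, ses, gender, region, region2, region3) are hypothetical and not from the original question, and region is assumed to be coded 1, 2, 3.

```spss
* Hypothetical names: score is the outcome, pretest and ses are covariates.
* gender is assumed to be a 0/1 variable and region to have three levels.
COMPUTE region2 = (region = 2).
COMPUTE region3 = (region = 3).
EXECUTE.
* Each METHOD subcommand is one block of entry.
* CHANGE requests the R-squared change and its F test for each block.
REGRESSION
  /STATISTICS COEFF R ANOVA CHANGE
  /DEPENDENT score
  /METHOD=ENTER pretest ses gender
  /METHOD=ENTER region2 region3.
```

The R-squared change reported for the second block would then be the unique contribution of region over and above the variables in the first block. To get the same figure for gender, move gender into the last block instead.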

As Art notes, you can also go at this problem with ANOVA or GLM. In ANOVA, stick with the default Unique Sums of Squares option.

Best,

Steve

www.StatisticsDoc.com




--
For personalized and experienced consulting in statistics and research design, visit www.statisticsdoc.com

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Regression and effect size question

Zdaniuk, Bozena-2
I am really sorry for calling the venerable hierarchical regression "stepwise". It was early in the morning... I will never do it again. :)
Bozena

Bozena Zdaniuk, Ph.D.
University of Pittsburgh
UCSUR, 6th Fl.
121 University Place
Pittsburgh, PA 15260
Ph.: 412-624-5736
Fax: 412-624-4810
Email: [hidden email]


Re: Regression and effect size question

statisticsdoc
Bozena,

Those who rise early to post on the SPSS list are to be commended :)

Best Wishes,

Steve


Re: Regression and effect size question

Swank, Paul R
In reply to this post by Marsha and Mike SZYMCZUK

If you are interested in the effect size of the group differences, you could use the change in R-squared or Cohen's d. How you define the standard deviation is debatable, however. You could use the square root of the mean square error from the model, but that shrinks the standard deviation by removing variance attributable to the other factors in the model. The resulting effect size tends to be too large compared with one based on the raw pooled standard deviation, which is estimated from a model that includes only the groups. For purposes of estimating power, the former is better. For reporting effect sizes in support of your statistical analysis, the latter is typically preferred.
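
For reference, the two choices of standard deviation can be written out as below. This is a sketch using the standard textbook definitions rather than anything specific to SPSS; the symbols are assumptions introduced here: n_1, n_2 and s_1, s_2 are the group sizes and group standard deviations, N is the total sample size, k is the number of predictors in the regression, and b_group is the coefficient of the 0/1 group variable.

```latex
% Raw pooled standard deviation and Cohen's d (groups only, no covariates)
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad
d = \frac{\bar{x}_2 - \bar{x}_1}{s_p}

% Model-based alternative: adjusted group difference over the root mean square error
d_{\text{model}} = \frac{b_{\text{group}}}{\sqrt{MS_{\text{error}}}},
\qquad
MS_{\text{error}} = \frac{SS_{\text{error}}}{N - k - 1}
```

On this reading, the number of predictors k enters only through the residual degrees of freedom of the model-based version. The raw pooled standard deviation uses n_1 + n_2 - 2 and is not adjusted for k.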

 

Dr. Paul R. Swank,
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center-Houston

 


Re: Regression and effect size question

Henrik Lolle
In reply to this post by Marsha and Mike SZYMCZUK
Perhaps it is easier and/or better to use either eta squared in GLM,
beta coefficients from Multiple Classification Analysis in UNIANOVA
(only syntax), or beta coefficients in CATREG.
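
As a rough sketch of the eta squared route, assuming hypothetical variable names (score as the outcome, gender and ethnic as factors, pretest as a covariate), the GLM univariate syntax might look like this:

```spss
* Hypothetical names: score (outcome), gender and ethnic (factors), pretest (covariate).
* SSTYPE(3) requests the unique (Type III) sums of squares.
* ETASQ adds an eta squared column to the table of between-subjects effects.
UNIANOVA score BY gender ethnic WITH pretest
  /METHOD = SSTYPE(3)
  /PRINT = ETASQ
  /DESIGN = pretest gender ethnic.
```

The value printed is a partial eta squared, i.e. the effect's sum of squares divided by the sum of the effect and error sums of squares. The column label may differ between SPSS versions.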

best,
Henrik

************************************************************
Henrik Lolle
Department of Economics, Politics and Public Administration
Aalborg University
Fibigerstraede 1
9200 Aalborg
Phone: (+45) 99 40 81 84
************************************************************


LPAD question

parisec
In reply to this post by statisticsdoc
Hi all,

I think this will be a quick question for someone.

I have string data:

oldVAR

5678
13056

I need to put 4 leading 0s on these so that they look like this:

00005678
000013056

I thought the following would do the trick:

string newVAR(A9).
compute newVAR =LPAD(oldVAR,4,'0').
execute.

But it's just returning the same numbers. What have I screwed up?

thanks a bunch.
Carol


Re: LPAD question

Anthony Babinec
Use this COMPUTE statement: COMPUTE newvar=CONCAT('0000',oldvar).

Tony Babinec
[hidden email]


Re: LPAD question

Oliver, Richard
In reply to this post by parisec
try this instead:

string newvar (a9).
compute newvar=concat('0000', rtrim(oldvar)).
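
A likely reason the original LPAD call seemed to do nothing: the second argument of LPAD is the total length of the result, not the number of characters to add, and a string variable's value already carries trailing blanks out to its declared width, so a target length of 4 leaves it unchanged. The sketch below contrasts the two approaches. It assumes oldVAR is a left-justified string; zeroVAR and padVAR are hypothetical names chosen so as not to clash with newVAR.

```spss
* Hypothetical variable names zeroVAR and padVAR.
STRING zeroVAR padVAR (A9).
* Prepend exactly four zeros to the trimmed value.
COMPUTE zeroVAR = CONCAT('0000', RTRIM(oldVAR)).
* Alternatively, pad with zeros to a fixed total width of 9 characters.
COMPUTE padVAR = LPAD(RTRIM(oldVAR), 9, '0').
EXECUTE.
```

The CONCAT version matches the output asked for (always four extra zeros). The LPAD version gives every value the same width, which is often what is wanted for ID fields but is not the same result for shorter values.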


Re: LPAD question

parisec
and we have a winner!

thanks
cp


