Z scores

Z scores

Kathryn Gardner
Dear List,
I am running a binary logistic regression and I wanted to standardise the predictors beforehand by converting to z scores so that the regression coefficients in the output are standardised. However, some of these predictors are already standardised with a mean of 100 and SD of 15. Will SPSS convert these standard scores to z scores properly?
Kathryn



Re: Z scores

Swank, Paul R

I wouldn’t do that if I were you. Standardized coefficients are overrated and mostly misinterpreted. For example, unless conditions are perfect, they do not tell you the relative importance of the predictors as many people think they do. What are the conditions, you ask? Uncorrelated, perfectly reliable predictors without outliers.

 

Dr. Paul R. Swank,

Children's Learning Institute

Professor, Department of Pediatrics, Medical School

Adjunct Professor, School of Public Health

University of Texas Health Science Center-Houston

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Kathryn Gardner
Sent: Tuesday, December 06, 2011 11:52 AM
To: [hidden email]
Subject: Z scores

 



Re: Z scores

Hector Maletta
In reply to this post by Kathryn Gardner

Of course. Standardizing means subtracting the mean and dividing by the standard deviation. The SPSS DESCRIPTIVES procedure (with the keyword SAVE) produces new standardized variables (named with a Z prefix, e.g. Zoldname) based on whatever variables you provide.

Remember, however, that your standardized logistic regression coefficients will give you the increased odds of an event PER ADDITIONAL STANDARD DEVIATION, not per additional physical unit (not per year of additional age, but per additional SD of age), and the SD would vary from one sample to another, which casts doubt on the actual meaning of your coefficients.
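Hector's two points can be sketched outside SPSS. The snippet below (numpy only; the variable and the coefficients a, b are made up for illustration) standardizes a (100, 15)-scaled predictor the same way DESCRIPTIVES /SAVE does, and shows algebraically why the coefficient in z units becomes an effect per SD rather than per raw unit:

```python
import numpy as np

# Hypothetical predictor already scaled to mean 100, SD 15 (an IQ-type score).
rng = np.random.default_rng(0)
x = 100 + 15 * rng.standard_normal(1000)

# What DESCRIPTIVES ... /SAVE does: z = (x - sample mean) / sample SD.
z = (x - x.mean()) / x.std(ddof=1)

# Suppose the raw-scale logistic model is logit(p) = a + b*x (made-up values).
a, b = -3.0, 0.04
logit_raw = a + b * x

# In z units the same model is logit(p) = (a + b*mean) + (b*SD)*z,
# so the "standardized" slope is b*SD -- an effect per SD of x.
a_std = a + b * x.mean()
b_std = b * x.std(ddof=1)
logit_std = a_std + b_std * z

print(np.allclose(logit_raw, logit_std))   # True: same model, different units
```

Since the sample SD enters b_std directly, a different sample with a different SD yields a different standardized coefficient for the identical raw-scale model, which is exactly the caveat above.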

 

Hector

 




Re: Z scores

Art Kendall
In reply to this post by Kathryn Gardner
Did you standardize those variables to 100 and 15 on the same set of cases that might go into the analysis?  Or are the scores standardized on some other group of cases?

They may or may not be close.

You can check with something like
DESCRIPTIVES ...
DO REPEAT NEWZ = ...../ OLDSCORE = OLDSCORE1 TO OLDSCORE... / NEWSCORE = NEWSCORE1 TO NEWSCORE.... /DIFF = DIFF1 TO DIFF....  .
COMPUTE NEWSCORE = 100 + 15*NEWZ.
COMPUTE DIFF = OLDSCORE - NEWSCORE.
END REPEAT.
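A rough Python equivalent of the check above, with hypothetical variable names (oldscore1 standing in for OLDSCORE1, and so on), assuming the published norms came from a different group than the analysis sample:

```python
import numpy as np

# Hypothetical data: oldscore1 was normed to mean 100, SD 15 on some
# reference population, not necessarily on the cases in this analysis.
rng = np.random.default_rng(1)
oldscore1 = 100 + 15 * rng.standard_normal(50) + 5   # this sample runs high

# Re-standardize on the analysis sample itself (what DESCRIPTIVES /SAVE does)...
newz1 = (oldscore1 - oldscore1.mean()) / oldscore1.std(ddof=1)
# ...then put it back on the 100/15 scale, as in the COMPUTE step above.
newscore1 = 100 + 15 * newz1
diff1 = oldscore1 - newscore1

# If the norm group and the analysis sample differ, the diffs are not all zero.
print(round(diff1.mean(), 2), round(diff1.std(ddof=1), 2))
```

Nonzero differences are the signal Art describes: the (100, 15) scores and sample-based z scores will not simply be linear re-labels of one another.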





--
Art Kendall
Social Research Consultants

Re: Z scores

Bruce Weaver
Administrator
In reply to this post by Kathryn Gardner
Following up on what Paul and Hector have said, here are some notes I made for myself after reading John Fox's comments on standardized regression equations.

--- Start of notes ---

In his book "Applied Regression Analysis and Generalized Linear Models" (2008, Sage), John Fox is very cautious about the use of standardized regression coefficients.  He gives this interesting example.  When two variables are measured on the same scale (e.g., years of education and years of employment), the relative impact of the two can be compared directly.  But suppose those two variables differ substantially in the amount of spread.  In that case, comparison of the standardized regression coefficients would likely yield a very different story than comparison of the raw regression coefficients.  Fox then says:

"If expressing coefficients relative to a measure of spread potentially distorts their comparison when two explanatory variables are commensurable [i.e., measured on the same scale], then why should the procedure magically allow us to compare coefficients [for variables] that are measured in different units?" (p. 95)

Good question!

A page later, Fox adds the following:

"A common misuse of standardized coefficients is to employ them to make comparisons of the effects of the same explanatory variable in two or more samples drawn from different populations.  If the explanatory variable in question has different spreads in these samples, then spurious differences between coefficients may result, even when _unstandardized_ coefficients are similar; on the other hand, differences in unstandardized coefficients can be masked by compensating differences in dispersion." (p. 96)

And finally, this comment on whether or not Y has to be standardized:

"The usual practice standardizes the response variable as well, but this is an inessential element of the computation of standardized coefficients, because the _relative_ size of the slope coefficients does not change when Y is rescaled." (p. 95)

--- End of notes ---

HTH.
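Fox's last point (rescaling Y leaves the relative sizes of the slopes unchanged) is easy to verify numerically. The sketch below uses made-up data and ordinary least squares via numpy, fitting the same model to Y and to 10*Y:

```python
import numpy as np

# Toy illustration of Fox's p. 95 remark: rescaling Y multiplies every slope
# by the same constant, so the *relative* sizes of the coefficients survive.
rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal(n), rng.standard_normal(n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.standard_normal(n)

b_y = np.linalg.lstsq(X, y, rcond=None)[0]         # slopes for Y
b_10y = np.linalg.lstsq(X, 10 * y, rcond=None)[0]  # slopes for 10*Y

# The ratio of the two slope coefficients is identical in both fits.
print(np.isclose(b_y[1] / b_y[2], b_10y[1] / b_10y[2]))   # True
```

By linearity of least squares, every coefficient for 10*Y is exactly ten times its counterpart for Y, so standardizing the response is indeed inessential to the comparison.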


--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."


Re: Z scores

Rich Ulrich
In reply to this post by Kathryn Gardner

As several other regular contributors have mentioned, standardizing unrelated variables is seldom a good idea.  I like to standardize arbitrary composite scores to M=50, SD=10, so that subgroup deviations are easily noticed and described without using decimal places.  Of course, SPSS can "properly" create z-scores by subtracting the mean and dividing by the SD.

Presumably, you want to get some sense of the "relative contributions" of the variables.  That is always problematic for what are properly called "partial regression coefficients".  Either you get the same ordering from standardized coefficients as from the simple p-values, or the results frankly warn that the predictor variables are confounding each other, so that the notion of "unique contribution" is a misnomer.  So you can either use the p-values, while recognizing their weaknesses, or write your story from knowledge of how the variables have to interact.

OLS regression has always provided standardized regression coefficients (criterion standardized, too), and the main use I have ever found for them is as a warning of suppressor relationships.  Suppressor relations, where the difference of related variables is more important than their sum, happen almost as readily in logistic regression as in ordinary least squares (and they introduce the topic of artifacts).


--
Rich Ulrich
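The suppressor pattern described above can be sketched numerically (made-up data): here Y depends on the difference of two highly correlated predictors, so the standardized coefficient dwarfs the zero-order correlation, which is the warning sign those coefficients can provide:

```python
import numpy as np

# Suppressor sketch: Y tracks the *difference* of two correlated predictors,
# so zero-order correlations are small while the partial (and standardized)
# regression coefficients are large.
rng = np.random.default_rng(3)
n = 5000
common = rng.standard_normal(n)
x1 = common + 0.3 * rng.standard_normal(n)
x2 = common + 0.3 * rng.standard_normal(n)
y = (x1 - x2) + 0.3 * rng.standard_normal(n)

r1 = np.corrcoef(x1, y)[0, 1]                    # modest zero-order correlation
X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
beta1 = b[1] * x1.std(ddof=1) / y.std(ddof=1)    # standardized coefficient

print(round(r1, 2), round(beta1, 2))   # beta1 far exceeds r1
```

A standardized coefficient far larger than the corresponding zero-order correlation (here beta1 even exceeds 1) is a classic signature that the difference of the predictors, not either one alone, is doing the work.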





Re: Z scores

Kathryn Gardner
Thanks to everyone who responded to my query about converting predictors to z scores. The consensus seems to be that SPSS will convert them to z scores properly, but that standardising isn't the best approach when predictors are correlated, imperfectly reliable, or affected by outliers. OLS routinely produces standardised coefficients, yet I don't recall this issue being covered in the main stats books I have.
 
Bruce - some interesting quotes there. Thanks for those.
 
Kathryn

 

Stephen Brand also replied:

Kathryn,
Yes - converting the scores to z-score form will simply change the mean of the distribution to 0 instead of 100 and the standard deviation to 1 instead of 15. The position of each case in the distribution of scores is the same.
Best,
Stephen Brand, Ph.D.

www.StatisticsDoc.com
 

 
