Dear List,
I am running a binary logistic regression and I wanted to standardise the predictors beforehand by converting to z scores so that the regression coefficients in the output are standardised. However, some of these predictors are already standardised with a mean of 100 and SD of 15. Will SPSS convert these standard scores to z scores properly?

Kathryn
I wouldn’t do that if I were you. Standardized coefficients are overrated and mostly misinterpreted. For example, unless conditions are perfect, they do not tell you the relative importance of the predictors, as many people think they do. What are the conditions, you ask? Uncorrelated, perfectly reliable predictors without outliers.

Dr. Paul R. Swank, Children's Learning Institute
Professor, Department of Pediatrics, Medical School
Adjunct Professor, School of Public Health
University of Texas Health Science Center-Houston
In reply to this post by Kathryn Gardner
Of course. Standardizing means subtracting the mean and dividing by the standard deviation. The SPSS DESCRIPTIVES procedure (with the keyword SAVE) produces new standardized variables (z_oldname) based on whatever variables you provide. Remember, however, that your standardized log reg coefficients will give you the increased odds of an event PER ADDITIONAL STANDARD DEVIATION, not per additional physical unit (not per year of additional age, but per additional SD of age), and the SD would vary from one sample to another, thus introducing a question mark over the actual meaning of your coefficients.

Hector
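Hector's recipe can be sketched in a few lines of plain Python (an illustration only: the function name and the IQ-style sample values are invented, and `statistics.stdev` uses the sample SD, as DESCRIPTIVES does):

```python
from statistics import mean, stdev

def to_z(values):
    """Standardize scores to mean 0, SD 1 using THIS sample's statistics."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Scores already normed elsewhere to M=100, SD=15; re-standardizing uses
# this sample's mean and SD, which need not be 100 and 15.
iq = [85, 100, 100, 115, 130]
z = to_z(iq)  # mean(z) == 0, stdev(z) == 1 by construction
```

A logistic coefficient fitted on `z` then describes the change in log-odds per one sample SD of the predictor, which is exactly the per-SD interpretation Hector warns about.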
Did you standardize those variables to 100 and 15 on
the same set of cases that might go into the analysis? Or are the
scores standardized on some other group of cases?
They may or may not be close. You can check with something like:

DESCRIPTIVES ... .
DO REPEAT NEWZ = ..... /
  OLDSCORE = OLDSCORE1 TO OLDSCORE... /
  NEWSCORE = NEWSCORE1 TO NEWSCORE.... /
  DIFF = DIFF1 TO DIFF.... .
COMPUTE NEWSCORE = 100 + 15*NEWZ.
COMPUTE DIFF = OLDSCORE - NEWSCORE.
END REPEAT.
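For readers without SPSS at hand, a plain-Python analogue of that check might look like this (the scores are hypothetical stand-ins for a single OLDSCORE variable):

```python
from statistics import mean, stdev

# Scores normed on some OTHER group to M=100, SD=15.
old_scores = [85.0, 100.0, 100.0, 115.0, 130.0]

# Re-standardize on THIS sample, then map back to the 100/15 metric.
m, s = mean(old_scores), stdev(old_scores)
new_scores = [100 + 15 * (x - m) / s for x in old_scores]
diffs = [o - n for o, n in zip(old_scores, new_scores)]

# Near-zero diffs mean this sample's mean and SD happen to match the
# norming group's; large diffs mean re-standardizing changes the scores.
```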
--
Art Kendall
Social Research Consultants
Following up on what Paul and Hector have said, here are some notes I made for myself after reading John Fox's comments on standardized regression equations.
--- Start of notes ---

In his book "Applied Regression Analysis and Generalized Linear Models" (2008, Sage), John Fox is very cautious about the use of standardized regression coefficients. He gives this interesting example. When two variables are measured on the same scale (e.g., years of education and years of employment), the relative impact of the two can be compared directly. But suppose those two variables differ substantially in the amount of spread. In that case, comparison of the standardized regression coefficients would likely yield a very different story than comparison of the raw regression coefficients. Fox then says:

"If expressing coefficients relative to a measure of spread potentially distorts their comparison when two explanatory variables are commensurable [i.e., measured on the same scale], then why should the procedure magically allow us to compare coefficients [for variables] that are measured in different units?" (p. 95)

Good question! A page later, Fox adds the following:

"A common misuse of standardized coefficients is to employ them to make comparisons of the effects of the same explanatory variable in two or more samples drawn from different populations. If the explanatory variable in question has different spreads in these samples, then spurious differences between coefficients may result, even when _unstandardized_ coefficients are similar; on the other hand, differences in unstandardized coefficients can be masked by compensating differences in dispersion." (p. 96)

And finally, this comment on whether or not Y has to be standardized:

"The usual practice standardizes the response variable as well, but this is an inessential element of the computation of standardized coefficients, because the _relative_ size of the slope coefficients does not change when Y is rescaled." (p. 95)

--- End of notes ---

HTH.
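Fox's first point is easy to check numerically. For OLS, each standardized slope is just the raw slope rescaled, b* = b × SD(x) / SD(y); the figures below are invented to mirror his education/employment example:

```python
def standardized_coef(b_raw, sd_x, sd_y):
    """Rescale a raw OLS slope to a standardized one: b* = b * SD(x) / SD(y)."""
    return b_raw * sd_x / sd_y

sd_y = 10.0           # spread of the response (assumed)
b_education = 2.0     # raw effect per year of education
b_employment = 2.0    # raw effect per year of employment -- identical impact
sd_education = 3.0    # years of education vary little in this sample
sd_employment = 12.0  # years of employment vary a lot

beta_edu = standardized_coef(b_education, sd_education, sd_y)   # 0.6
beta_emp = standardized_coef(b_employment, sd_employment, sd_y) # 2.4
# The raw coefficients say the two effects are equal; the standardized ones
# say employment "matters" four times as much -- the distortion Fox describes.
```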
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
As several other regular contributors have mentioned, standardizing unrelated variables is seldom a good idea.

- I like to standardize arbitrary composite scores to M=50, SD=10, so that subgroup deviations are easily noticed and described without using decimal places.
- Of course, SPSS can "properly" create z-scores by subtracting the mean and dividing by the SD.

Presumably, you want to get some sense of the "relative contributions" of the variables. That is always problematic for what are properly called "partial regression coefficients". Either you get the same ordering from the standardized coefficients as from the simple p-values, or the results frankly warn that the predictor variables are confounding each other, so that the notion of "unique contribution" is a misnomer. So you can either use the p-values, while recognizing their weaknesses, or write your story from knowledge of how the variables have to interact.

OLS regression has always provided standardized regression coefficients (criterion standardized, too), and the main use that I have ever found for them is as a warning of suppressor relationships. Suppressor relations, where the difference of related variables is more important than their sum, happen almost as readily in logistic regression as in ordinary least squares (and they introduce the topic of artifacts).

--
Rich Ulrich

Date: Tue, 6 Dec 2011 17:52:19 +0000
From: [hidden email]
Subject: Z scores
To: [hidden email]

Dear List, I am running a binary logistic regression and I wanted to standardise the predictors beforehand by converting to z scores so that the regression coefficients in the output are standardised. However, some of these predictors are already standardised with a mean of 100 and SD of 15. Will SPSS convert these standard scores to z scores properly? Kathryn
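Rich's point about suppressors can be illustrated with the textbook formulas for standardized coefficients in a two-predictor regression (the correlations below are invented for the illustration; `two_predictor_betas` is not an SPSS function):

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized partial coefficients for y ~ x1 + x2, from correlations."""
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

# x2 is uncorrelated with the criterion but shares variance with x1:
b1, b2 = two_predictor_betas(r_y1=0.5, r_y2=0.0, r_12=0.6)
# b1 = 0.78125 exceeds its zero-order correlation of 0.5, and b2 = -0.46875
# is negative despite r_y2 = 0 -- the sign pattern that flags a suppressor.
```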
Thanks to everyone who responded to my query (below). It seems the consensus is that SPSS will convert them properly to Z scores, but that standardising isn't the best approach when predictors are correlated, less than perfectly reliable, or affected by outliers. OLS routinely produces standardised coefficients, but I don't recall this issue being covered in the main stats books I have. Bruce - some interesting quotes there. Thanks for those.

Kathryn