I would like to transform a random variable that is not normally distributed into a normally distributed random variable (if possible). I have several candidate transformation functions, mostly based on the logarithm of the variable. Depending on the type of distribution, there are supposed to be so-called 'optimal transformations' (e.g., the Box-Cox transformation?). Does anyone know of suitable transformations, or has anyone heard of a 'Box-Cox' transformation?
Thanks
Dr. Frank Gaeth
|
Hi Frank,
you might want to try http://epm.sagepub.com/content/55/4/625.abstract (it has helped me on various occasions).
HTH,
Matthias
---------- Forwarded message ----------
From: drfg2008 <[hidden email]>
Date: Sun, May 1, 2011 at 9:31 PM
Subject: transformation of variable into a normally distributed variable
To: [hidden email]
-----
Dr. Frank Gaeth
FU Berlin
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/transformation-of-variable-into-a-normally-distributed-variable-tp4363370p4363370.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
|
In reply to this post by drfg2008
The Box-Cox transformation is available in the Data Preparation option. If you have that, look for Transform > Prepare Data for Modeling > Interactive. You will find it on the Rescale tab under Continuous Targets, or in the ADP command.
HTH,
Jon Peck
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621
|
In reply to this post by drfg2008
A Google search will turn up many hits for "Box-Cox transformation", as will a search of scholar.google.com.
There is a Wikipedia entry titled "Power transform" that gives brief coverage of Box-Cox transformations; see:
http://en.wikipedia.org/wiki/Power_transform
A somewhat more informative presentation is given by Garson in his notes on the assumptions underlying linear model analysis. He discusses several other types of transformation in addition to Box-Cox; see:
http://faculty.chass.ncsu.edu/garson/PA765/assumpt.htm
A more technical review of Box-Cox transformations is provided by the SpringerLink website for its Encyclopedia of Mathematics; see:
http://eom.springer.de/B/b110790
If you have the SPSS manual/PDF for Data Preparation, there is a section on how to get the Box-Cox transformation in the "Rescale Fields" part of the Automated Data Preparation (ADP) procedure (for v19, see page 26; for a syntax example, see pp. 86-97). I presume there are additional sources on how SPSS does Box-Cox as well as on how to set up your own equations.
There appears to be a sizable literature on the Box-Cox transformation, but you should decide whether it is relevant to your needs.
-Mike Palij
New York University
[hidden email]
|
In reply to this post by Jon K Peck
Hi Frank,
The Box-Cox transformation estimates the power to which the DV must be raised to minimize the mean square error in a regression. In my way of doing it (probably ages out of date by now!), you need to specify the DV and the predictors. ADP requires you to specify a DV (target), but I'm not sure how it knows what your IVs are. If you haven't got ADP, I can let you have a macro.
Garry Gelade
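As a rough illustration of the estimation described in this reply, here is a minimal sketch in Python (not SPSS syntax, and not the ADP algorithm or the macro mentioned above). It assumes a strictly positive DV and a single predictor. The transformed DV is rescaled by the geometric mean of the DV so that residual sums of squares are comparable across powers, and the power with the smallest residual sum of squares is picked from a grid. The function names, the grid, and the data are all made up for the example.

```python
import math


def boxcox_normalized(y, lam):
    """Box-Cox transform of strictly positive values, scaled by the
    geometric mean so that results are comparable across lambdas."""
    gm = math.exp(sum(math.log(v) for v in y) / len(y))  # geometric mean
    if lam == 0:
        return [gm * math.log(v) for v in y]  # limiting case: the log
    return [(v ** lam - 1.0) / (lam * gm ** (lam - 1.0)) for v in y]


def residual_ss(x, z):
    """Residual sum of squares from a one-predictor least-squares fit of z on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))


def best_lambda(x, y, grid):
    """Grid value of lambda with the smallest residual sum of squares."""
    return min(grid, key=lambda lam: residual_ss(x, boxcox_normalized(y, lam)))


# Made-up data: log(y) is linear in x apart from a small alternating disturbance.
x = list(range(1, 21))
y = [math.exp(0.5 + 0.1 * xi + 0.02 * (-1) ** xi) for xi in x]
grid = [k * 0.25 for k in range(-8, 9)]  # -2.0, -1.75, ..., 2.0

print(best_lambda(x, y, grid))
```

Because the example data are log-linear by construction, the search should settle on a power of 0, i.e. the log transformation. With several predictors, the one-predictor fit would have to be replaced by a multiple regression.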
|
Administrator
|
In reply to this post by drfg2008
What kind of model do you want to use, and what role does this non-normal variable play in it (i.e., explanatory variable or outcome)? What does the distribution look like for the non-normal variable?
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by drfg2008
What is suitable? The best guide to that, in my own experience, is information about how the numbers are generated. The textbooks by John Tukey (et al.) include unusually good discussions. Counts often deserve a square root, distances or latencies may deserve reciprocals, and so on.
A further guide to what is appropriate is the question: what transformation yields "equal intervals" within the context of what you are modeling? For example, if "twice as much" is a natural way to describe equal differences, then the log is apt to be the appropriate transformation.
Box-Cox transformations, as Tukey discusses, are applied to scales that have a natural zero (real "quantities," for instance). You may want to re-score a test on which subjects score near the maximum of 100 as the number of errors (near zero). However, a scale with both a minimum and a maximum (and scores near both extremes) may deserve a "folded" transformation, such as the logit.
--
Rich Ulrich
|
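The rules of thumb in Rich Ulrich's reply can be written down as a handful of one-line functions. This is only a sketch in Python (not SPSS syntax); the function names and the example values are invented for illustration, and which transformation is appropriate still depends on how the numbers were generated.

```python
import math


def sqrt_for_counts(count):
    """Square root, often suggested for counts."""
    return math.sqrt(count)


def reciprocal_for_latency(latency):
    """Reciprocal, sometimes suggested for distances or latencies (must be nonzero)."""
    return 1.0 / latency


def log_for_ratios(quantity):
    """Log, apt when 'twice as much' describes equal differences (quantity > 0)."""
    return math.log(quantity)


def errors_from_score(score, max_score=100):
    """Re-score a test with scores near the maximum as a count of errors near zero."""
    return max_score - score


def logit_for_bounded(score, low, high):
    """'Folded' transformation for a scale with both a minimum and a maximum.
    The score must lie strictly between low and high."""
    p = (score - low) / (high - low)
    return math.log(p / (1.0 - p))


# On the log scale, every doubling is the same step, whatever the starting value.
step_small = log_for_ratios(4.0) - log_for_ratios(2.0)
step_large = log_for_ratios(400.0) - log_for_ratios(200.0)
print(step_small, step_large)

# The logit is symmetric around the midpoint of the scale.
print(logit_for_bounded(50, 0, 100), logit_for_bounded(90, 0, 100), logit_for_bounded(10, 0, 100))
```

Scores exactly at the minimum or maximum have no finite logit, so they would need separate handling before this transformation is applied.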
Thanks everyone!
I didn't know that a Box-Cox transformation is available from SPSS 19 onwards. My version is 17. Thank you also for the literature, which I'm checking right now, and thank you for the hint concerning the external VB program (I try to keep the programming completely within the syntax).
Just an additional question: if I compute normal scores from the ranks with the Blom method (over a skewed distribution) like this:
RANK VARIABLES=zv1 (A)
  /NORMAL
  /PRINT=NO
  /TIES=MEAN
  /FRACTION=BLOM.
I get an approximately normally distributed variable with µ≈0 and s≈1. Now let's assume I want to do a test of location. Since I do not have V19: would it be acceptable to transform the DV with Blom first and then compute a GLM (for example) on the transformed DV? (Remark: Spearman is nothing more than Pearson over the ranks, and the U test gives results very similar to a t test over the ranks.)
Frank
Dr. Frank Gaeth
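For readers without SPSS, the Blom normal scores requested by the RANK command above can be sketched in Python. This is an illustration, not SPSS's own code: it assumes the usual Blom proportion (r - 3/8) / (n + 1/4) with mean ranks for ties, and the function names and data are made up.

```python
from statistics import NormalDist


def mean_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def blom_scores(values):
    """Normal scores from Blom's proportion estimate (r - 3/8) / (n + 1/4)."""
    n = len(values)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [standard_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in mean_ranks(values)]


# A strongly right-skewed toy sample.
sample = [1.0, 2.0, 4.0, 8.0, 100.0]
scores = blom_scores(sample)
print(scores)
```

The scores depend only on the ranks, so the size of the extreme value 100.0 plays no role. With no ties the scores are symmetric around 0, but their sample standard deviation is somewhat below 1 in small samples, and heavy ties make the result less normal.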
|
Administrator
|
This partially answers the question I asked earlier, i.e., I think you want to run an OLS linear model with the non-normal variable as the dependent variable. If so, the first thing to do is run the model using the raw variable and then examine the residuals. It is the errors (which are estimated by the residuals*) that are assumed to be normal, not the variable itself. If the residuals are too non-normal for comfort (or if they are too heteroscedastic), then start looking for a transformation.
* http://en.wikipedia.org/wiki/Errors_and_residuals_in_statistics HTH.
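A small made-up example of the point about residuals, sketched in Python (not SPSS syntax). It assumes a single predictor and uses the moment-based sample skewness as a crude summary; the data and function names are hypothetical. The DV is clearly skewed because the predictor is skewed, yet the residuals from the fitted line are close to symmetric.

```python
def ols_residuals(x, y):
    """Residuals from a one-predictor least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def skewness(values):
    """Moment-based sample skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Skewed predictor, symmetric disturbance of +/- 0.5 around a straight line.
x = [1, 1, 1, 2, 2, 3, 5, 8, 13, 21]
disturbance = [0.5 if i % 2 == 0 else -0.5 for i in range(len(x))]
y = [2.0 + 3.0 * xi + di for xi, di in zip(x, disturbance)]

residuals = ols_residuals(x, y)
print(skewness(y))          # skewness of the raw DV
print(skewness(residuals))  # skewness of the residuals
```

In practice one would also look at a histogram or normal Q-Q plot of the residuals and at a plot of residuals against fitted values, rather than relying on a single number.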
--
Bruce Weaver
|
Thanks Bruce,
I am thinking about using the Blom and/or Box-Cox transformation together with the CHAID algorithm. Since CHAID relies on ANOVA (if the DV is metric) or chi-square tests (if the DV is discrete), the transformation could make a difference to the performance of CHAID, especially with skewed distributions, where ANOVA is not robust.
Dr. Frank Gaeth
|
In reply to this post by Garry Gelade
Hi Jon Peck (Senior Software Engineer, IBM) and Mike Palij (New York University),
I hope that you are still monitoring this old thread. I noted your responses on the Box-Cox procedure in SPSS.
1. How do we find out what lambda was used in the transformation? I am using the version 20 Premium edition of SPSS.
2. I also have the version 19 Base package of SPSS on another computer. How do I write a script/macro to do the transformation?
3. Does anyone know how to do transformations other than Box-Cox in SPSS, e.g., Atkinson's (1973) score test for transforming responses and the Box and Tidwell (1962) method for transforming predictors?
Ref:
Atkinson, A. C. (1973) Testing transformations to normality. J. R. Statist. Soc. B, 35, 473–479.
Box, G.E.P. and Tidwell, P.W. (1962) Transformations of the independent variables. Technometrics, 4, 531-550.
Thanks!
Ben
|