data transformation for negative values to get normality

ro
Hi all,

I am working with a variable that takes both negative and positive values (range: -8.00 to 10.00). Unfortunately, the data are not normal according to the Kolmogorov-Smirnov test...

I am trying to transform with log10, but the typical transformation log10(x+1) is not possible because my lowest negative value is -8.00. Could I use the transformation log10(x+9) to avoid negative values?

Regards
Thanks in advance
Rodrigo
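
For readers following the arithmetic of the question: log10(x + c) is defined only when x + c > 0, so with a minimum of -8.00 any constant c > 8 works numerically, and c = 9 maps the minimum to log10(1) = 0. The sketch below is in Python rather than SPSS syntax; the sample scores and the helper name `shifted_log10` are made up for illustration. It shows only that the shift is computable, not that it is a statistically sensible choice.

```python
import math

def shifted_log10(values, shift):
    """Return log10(x + shift) for each x (hypothetical helper for illustration)."""
    if min(values) + shift <= 0:
        raise ValueError("shift must make every value strictly positive")
    return [math.log10(x + shift) for x in values]

# Made-up scores spanning the range described in the post (-8.00 to 10.00).
scores = [-8.0, -4.5, -1.0, 0.0, 3.2, 10.0]
transformed = shifted_log10(scores, shift=9)
```

A shift of exactly 8 would be rejected here, because the lowest score would then map to log10(0), which is undefined.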
Re: data transformation for negative values to get normality

Ryan
Who says your data must be normally distributed? Please elaborate on what you plan on doing.

Ryan

On Dec 29, 2012, at 1:41 PM, ro <[hidden email]> wrote:

> Hi all,
>
> I am working with a variable that takes both negative and positive values
> (range: -8.00 to 10.00). Unfortunately, the data are not normal according
> to the Kolmogorov-Smirnov test...
>
> I am trying to transform with log10, but the typical transformation
> log10(x+1) is not possible because my lowest negative value is -8.00. Could I
> use the transformation log10(x+9) to avoid negative values?
>
> Regards
> Thanks in advance
> Rodrigo
>
>
>
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/data-transformation-for-negative-values-to-get-normality-tp5717171.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD

Re: data transformation for negative values to get normality

ro
I want to run a two-way ANOVA (GLM), but when I run a Kolmogorov-Smirnov test I have problems with some levels of one fixed effect...
Re: data transformation for negative values to get normality

Rich Ulrich
In reply to this post by ro
If you don't have a "natural zero" then you should not be
considering the log as an early choice for transformation.
(If you do have a natural zero, *why*  do your scores run
from minus 8 to 10?)  The best guide to the appropriate
transformation, in my experience, is to consider how the
data have been generated. 

If you do have a large N, then there is a fair chance that
you don't have any need for a transformation, anyway.
What makes "testing for normality" an optional sort of
procedure is that it *matters most* when the N is small,
and that is when the power of the test for normality is (also)
small.  When the N is large enough, a test for normality will
usually "reject" because there are few distributions that are
truly normal.  Since you characterize your data as having
integer limits, we might assume that it is not Normal for
either of two reasons -- (a) Normal distributions have infinite
range, and (b) Normal distributions are continuous (not integers).

Finally, what matters for test purposes is that the *residuals*
or errors are sufficiently Normal so that the test statistic is
accurate.  What throws off the test statistic, usually, is either
too many outliers or too many zeroes.
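
To make the point about residuals concrete, here is a minimal sketch in Python (not SPSS), under stated assumptions: the data rows, factor names, and both helper functions are invented for illustration. For a two-way model that includes the interaction, each residual is the observation minus its cell mean, and it is these residuals, not the raw scores, that would be compared with a normal curve. The usual Kolmogorov-Smirnov critical values do not apply directly when the mean and SD are estimated from the same sample (the Lilliefors correction addresses this), so the sketch reports only the distance and no p-value.

```python
from collections import defaultdict
from statistics import NormalDist, fmean, stdev

def cell_mean_residuals(rows):
    """rows: (factor_a, factor_b, y) tuples. Residual = y minus its cell mean,
    i.e. the residuals of a two-way model that includes the interaction."""
    cells = defaultdict(list)
    for a, b, y in rows:
        cells[(a, b)].append(y)
    means = {key: fmean(ys) for key, ys in cells.items()}
    return [y - means[(a, b)] for a, b, y in rows]

def ks_distance(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted
    with the sample's own mean and standard deviation."""
    dist = NormalDist(fmean(sample), stdev(sample))
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - dist.cdf(x), dist.cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

# Made-up data: two factors with two levels each, three children per cell.
rows = [
    ("boy", "urban", -1.2), ("boy", "urban", 0.4), ("boy", "urban", 0.8),
    ("boy", "rural", -3.1), ("boy", "rural", -2.0), ("boy", "rural", -2.7),
    ("girl", "urban", 1.5), ("girl", "urban", 2.9), ("girl", "urban", 2.2),
    ("girl", "rural", -0.6), ("girl", "rural", 0.1), ("girl", "rural", -0.4),
]
residuals = cell_mean_residuals(rows)
d_stat = ks_distance(residuals)
```

With only three observations per cell this is far too little data to judge normality; the sketch is meant to show which quantity is examined, not to suggest a sample size.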

How were your data generated?  If adding a constant does
not shift the scores to reflect a natural zero, then there's not
much to recommend it.  Statisticians, as opposed to stat-pack
amateurs, are more apt to *replace* a zero with something
appropriate than to use an add-on.  (I'm thinking, in particular,
of cases where measured drug or enzyme levels are reported
as zero owing to measurement insufficiency... and the proper
replacement score might be one-half the lowest detected level.)
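
The replacement idea in the parenthesis above can be sketched as follows. This is an illustration in Python, not SPSS; the enzyme levels are made up and `replace_zeros_with_half_min` is a hypothetical helper name. It assumes that a reported zero means "below the detection limit" and that half the lowest detected level is an acceptable stand-in, which is a judgment that depends on the measurement.

```python
import math

def replace_zeros_with_half_min(levels):
    """Replace reported zeros with half the lowest detected (positive) level."""
    detected = [v for v in levels if v > 0]
    if not detected:
        raise ValueError("no detected (positive) levels to base the replacement on")
    floor = min(detected) / 2
    return [floor if v == 0 else v for v in levels]

# Made-up enzyme levels; zeros stand for "below the detection limit".
levels = [0.0, 0.4, 1.6, 0.0, 6.4]
cleaned = replace_zeros_with_half_min(levels)  # zeros become half of 0.4
logged = [math.log10(v) for v in cleaned]      # the log is now defined everywhere
```

Unlike adding a constant to every score, this leaves the detected values untouched and changes only the entries that carried no real measurement.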

--
Rich Ulrich


> Date: Sat, 29 Dec 2012 10:41:26 -0800

> From: [hidden email]
> Subject: data transformation for negative values to get normality
> To: [hidden email]
>
> Hi all,
>
> I am working with a variable that takes both negative and positive values
> (range: -8.00 to 10.00). Unfortunately, the data are not normal according
> to the Kolmogorov-Smirnov test...
>
> I am trying to transform with log10, but the typical transformation
> log10(x+1) is not possible because my lowest negative value is -8.00. Could I
> use the transformation log10(x+9) to avoid negative values?
>
...

Re: data transformation for negative values to get normality

ro
Thanks for your reply...

I am working with nutritional status scores (BMI) for children. The data are continuous values and represent different nutritional statuses: -8.0 to -5.0 = Underweight, -4.99 to -2.00 = Normal weight, -1.99 to 1.00 = Pre-obesity, etc...
Re: data transformation for negative values to get normality

Rich Ulrich
Okay.  Well, BMI does not originate as negatives.  But I know that
there is the problem of age-specific standards for children's BMIs.
Here is an example --
  http://www.bcm.edu/cnrc/bodycomp/bmiz2.html

I think I may be safe in assuming that these numbers represent
the child's status relative to the 50th percentile for their age.
Or, maybe it is relative to a higher percentile, since (-5, -2] is called
Normal Weight in your statement below.  If you wanted to add on
a value that would put numbers back to their natural range, or
close to it, you might use the base number of 20 or so --
whatever you used for your mid-age child.

I once worked with child-BMIs, and the data came to me as
age-specific percentiles.  I notice that at the web-page I cited,
the 90% range is smaller for children under 12 or 10.  If your
data do include a wide range of ages, then you might have better
numbers if you start with the percentiles instead of the adjusted
BMI itself. 

Your "non-normal" test result could be pointing to a couple of
other things that you should consider, since BMI is not something
that shouts out for a different measurement basis.
 - Are there errors?
 - Is there "lumping"  in the data, representing a mixed source of
data for your sample?  - I can imagine a mixed sample with one group
of modern, overweight children who the media are often concerned
with, and (since you have some very low scores) another group of
starving, war-emigres.    - In this case, you don't want to ignore that
distinction in the design of your analysis, but what would matter is
the set of residuals, as mentioned before.

--
Rich Ulrich



> Date: Sat, 29 Dec 2012 15:39:10 -0800

> From: [hidden email]
> Subject: Re: data transformation for negative values to get normality
> To: [hidden email]
>
> Thanks for your reply...
>
> I am working with nutritional status scores (BMI) for children. The data
> are continuous values and represent different nutritional statuses: -8.0 to
> -5.0 = Underweight, -4.99 to -2.00 = Normal weight, -1.99 to 1.00 = Pre-obesity,
> etc...
>
...
Re: data transformation for negative values to get normality

ro
In reply to this post by ro
I got the values from the Anthro software from WHO, http://www.who.int/growthref/tools/who_anthroplus_manual.pdf.

The data are z-scores for BMI-for-age (<-3 SD, <-2 SD, <-1 SD, >+1 SD, >+2 SD and >+3 SD), with the 3rd,
15th, 50th, 85th and 97th percentiles, and are corrected for age. Sorry for my mistake...

Thanks for your interest and reply.
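
Since the thread mentions both z-scores and percentiles, the following sketch shows how the two scales correspond under a standard normal curve. It is a Python illustration only: the helper names are invented, and it assumes the z-scores can be read as standard normal deviates. The software derives its z-scores from its own reference data, and that calculation is not reproduced here.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

def z_to_percentile(z):
    """Percentile (0-100) of a z-score under the standard normal curve."""
    return 100 * standard_normal.cdf(z)

def percentile_to_z(p):
    """z-score whose standard normal percentile is p (0 < p < 100)."""
    return standard_normal.inv_cdf(p / 100)

# The SD cut-offs and reference percentiles listed in the post above.
cutoff_percentiles = {z: z_to_percentile(z) for z in (-3, -2, -1, 1, 2, 3)}
reference_z = {p: percentile_to_z(p) for p in (3, 15, 50, 85, 97)}
```

Under this assumption a z-score of -2 sits at roughly the 2.3rd percentile, and the 3rd and 97th percentiles fall near z = -1.88 and z = +1.88, so the two sets of cut-offs are close to each other but not identical.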