(no subject)

(no subject)

lorenzo.avanzi
Hi,
I am a PhD in Organizational Psychology. I study and work at the Faculty of Psychology of the University of Bologna (Italy).
I don't speak English very well, but I'll try to explain my problem.
We assume multivariate normality of the data in order to carry out our analysis. However, I have seen that often (or always...) my variables have a non-normal distribution.
I check the skewness and kurtosis indices. I delete the univariate outliers from my data base. I transform (by logarithm, square root, etc.) the variables that have a value larger than 1 or smaller than -1 (in skewness and kurtosis). I calculate the Mahalanobis distance and delete the multivariate outliers, and finally I calculate the Mardia index... but my value of this index is often (or always) larger than the critical value (p (p*q)). Often I cannot assume multivariate normality in my data.
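
For reference, the last two steps of that procedure can be sketched as follows, in Python rather than SPSS: the squared Mahalanobis distance of each case, and Mardia's multivariate kurtosis coefficient (presumably the index meant above). Under multivariate normality that coefficient is expected to be close to p(p+2) for p variables. The function names and the simulated data are assumptions made for illustration only; this is a rough sketch, not what SPSS computes internally.

```python
import random

def solve(a, b):
    # Solve a @ x = b by Gaussian elimination with partial pivoting.
    p = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, p):
            f = m[r][c] / m[c][c]
            for k in range(c, p + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, p))
        x[i] = (m[i][p] - tail) / m[i][i]
    return x

def mardia_kurtosis(rows):
    """Return (b2p, expected, z, d2) for a list of equal-length numeric rows."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    p = len(means)
    # Covariance matrix with divisor n, as in Mardia's definition.
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - mu for x, mu in zip(r, means)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / n
    # Squared Mahalanobis distance of each case from the centroid.
    d2 = []
    for r in rows:
        d = [x - mu for x, mu in zip(r, means)]
        w = solve(cov, d)
        d2.append(sum(di * wi for di, wi in zip(d, w)))
    b2p = sum(v * v for v in d2) / n      # Mardia's kurtosis coefficient
    expected = p * (p + 2)                # its value under normality
    z = (b2p - expected) / (8 * p * (p + 2) / n) ** 0.5  # large-sample z
    return b2p, expected, z, d2

# Simulated data (an assumption): three independent variables per case.
random.seed(1)
n = 2000
normal_rows = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
skewed_rows = [[random.expovariate(1.0) for _ in range(3)] for _ in range(n)]

b2p_normal, expected, z_normal, d2_normal = mardia_kurtosis(normal_rows)
b2p_skewed, _, z_skewed, _ = mardia_kurtosis(skewed_rows)
print("normal data:", round(b2p_normal, 2), "expected", expected)
print("skewed data:", round(b2p_skewed, 2), "expected", expected)
```

With the simulated normal data the coefficient should land near its expected value, while the skewed data should give a clearly larger one, which is the pattern described above for real variables.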
I have two questions:
1. Are there other techniques to transform my data so that they follow a multivariate normal distribution?
2. Are there other techniques for working with data that do not have a multivariate normal distribution?
I work with SPSS, and I hope you can help me, including with how to implement the answers in SPSS.

Thank you

Lorenzo

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: [Transforming for multivariate normality]

Richard Ristow
At 06:47 AM 7/10/2009, [hidden email] asked (with blank subject head):

First, it is a good idea to give a descriptive subject line with your
postings. Not having a subject line may be why you haven't received
any responses.


>We assume multivariate normality of the data in order to carry out
>our analysis. However, I have seen that often (or always...) my
>variables have a non-normal distribution. I check the skewness
>and kurtosis indices. I delete the univariate outliers from my
>data base. I transform (by logarithm, square root, etc.) the
>variables that have a value larger than 1 or smaller than -1 (in
>skewness and kurtosis). I calculate the Mahalanobis distance and
>delete the multivariate outliers, and finally I calculate the
>Mardia index... but my value of this index is often (or always)
>larger than the critical value (p (p*q)). Often I cannot assume
>multivariate normality in my data.
>
>I have two questions:
>1. Are there other techniques to transform my data so that they
>follow a multivariate normal distribution?
>2. Are there other techniques for working with data that do not
>have a multivariate normal distribution?

I hand this off to others on the list, but I think a pretty uniform
response will be: Don't do it.

First, most analysis methods do not assume multivariate normality.
Many assume a normal distribution of the residuals, i.e. the
unexplained portion after fitting a model, and they're fairly robust
even when that assumption is violated.
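
To illustrate the distinction between the variables and the residuals, here is a minimal sketch in Python (not SPSS) using simulated data. The data-generating model, the sample size, and the helper name skewness are all assumptions chosen for the example. The predictor and the outcome are both strongly skewed, yet the residuals of a simple linear fit are close to symmetric.

```python
import random

def skewness(values):
    # Moment-based sample skewness: m3 / m2**1.5.
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Simulated data (an assumption): skewed predictor, normal error term.
random.seed(2)
n = 5000
x = [random.expovariate(1.0) for _ in range(n)]
y = [2.0 * xi + random.gauss(0, 1) for xi in x]

# Ordinary least squares for one predictor.
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

skew_y = skewness(y)
skew_resid = skewness(residuals)
print("skewness of y:        ", round(skew_y, 2))
print("skewness of residuals:", round(skew_resid, 2))
```

In a case like this, checking y for normality would suggest a problem, while the quantity the regression actually makes assumptions about, the residual, looks fine.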

Second, in trying to get normality, you're doing violence to your data.

* "I delete the univariate outliers from my data base."

That's rarely advised these days, except when the outliers can be
shown to be data errors. Delete the outliers, and you're throwing
away information; and the information near the ends of the variables'
distributions may be unusually important.

* "I transform (by logarithm, square root, etc) the variables that
have a value larger than 1 or smaller than -1 (in skewness and kurtosis)"

But if the variable was scale level in the first place, the
transformed scale is wrong. That is, in different parts of the range,
the same change in the value of the transformed variable means a
different change in the underlying variable. If your original model
was linear, the new one isn't.
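
A small numeric illustration of that point, sketched in Python with made-up values (the numbers are assumptions, chosen only because they are easy to read): equal steps on a base-10 log scale correspond to very unequal steps on the original scale.

```python
import math

# Hypothetical values of a variable on its original scale.
original = [1, 10, 100, 1000]

# The same values after a base-10 log transform (approximately 0, 1, 2, 3).
logged = [math.log10(v) for v in original]

# Step sizes between neighbouring values on each scale.
steps_logged = [b - a for a, b in zip(logged, logged[1:])]
steps_original = [b - a for a, b in zip(original, original[1:])]

print("steps on log scale:     ", [round(s, 6) for s in steps_logged])
print("steps on original scale:", steps_original)
```

Each step on the log scale has the same size, but the corresponding steps on the original scale are 9, 90, and 900, so a model that is linear in the transformed variable is not linear in the original one.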

Generally, you transform -- log transform is common -- when you've
reason to think that the transformed scale more nearly represents the
underlying measure than the original scale does.

(Credit: the late statistician John Wilder Tukey did much to develop
and disseminate arguments like this.)

-With best wishes to you,
  Richard
