transform variable

Ivitseva
Dear All,

I want to run a GLM with two factors (Land Use with 6 levels and
Environmental Zones with 8 levels) and the dependent variable is the trend
slope of the start of the growing season. The homogeneity of variance is
strongly violated in the levels of the independent variables. What kind of
transformation can I do considering that I have zeros (no trend) and
negative values (negative trend) in my dependent variable? Any help MOST
welcome! Thanks in advance, Eva.
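
For readers following the question, here is a minimal sketch of three transformations that are defined for zero and negative values and that keep the sign of the slope. The function names and the example slope values are hypothetical, and nothing in the thread establishes that any of these would actually equalize the cell variances in this data set.

```python
import math

def signed_log(y: float) -> float:
    """sign(y) * log(1 + |y|): defined for every real y, maps 0 to 0."""
    return math.copysign(math.log1p(abs(y)), y)

def cube_root(y: float) -> float:
    """Real cube root that keeps the sign of y."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

# Hypothetical trend slopes, including a zero and negative values.
slopes = [-250.0, -3.5, 0.0, 0.8, 1200.0]

for y in slopes:
    # math.asinh is the inverse hyperbolic sine, also defined for all reals.
    print(y, round(signed_log(y), 3), round(math.asinh(y), 3), round(cube_root(y), 3))
```

All three are odd functions, so a negative trend stays negative and a zero trend stays zero after transformation.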

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: transform variable

Bruce Weaver
Administrator
Hi Eva.  I have some questions.

1. On what basis are you saying that homogeneity of variance is "strongly violated"?
2. What are the minimum and maximum cell variances?
3. Are the n's the same in all cells?  If not, how much do they vary?
4. Is the shape of the distribution of Y similar in all cells?
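
Questions 2 and 3 can be answered with a per-cell summary. The sketch below assumes the data are available as (land use, zone, y) records; the function name `cell_summary`, the category labels, and the numbers are all made up for illustration.

```python
from collections import defaultdict
from statistics import variance

def cell_summary(rows):
    """rows: iterable of (land_use, zone, y) -> {cell: (n, sample variance or None)}."""
    cells = defaultdict(list)
    for land_use, zone, y in rows:
        cells[(land_use, zone)].append(y)
    # The sample variance needs at least two observations; report None otherwise.
    return {cell: (len(ys), variance(ys) if len(ys) > 1 else None)
            for cell, ys in cells.items()}

# Hypothetical records: (land use, environmental zone, trend slope).
rows = [
    ("forest", "alpine", -2.0), ("forest", "alpine", 0.0), ("forest", "alpine", 2.0),
    ("crop", "alpine", -20.0), ("crop", "alpine", 0.0), ("crop", "alpine", 20.0),
    ("crop", "boreal", 5.0),
]

summary = cell_summary(rows)
variances = [v for _, v in summary.values() if v is not None]
ratio = max(variances) / min(variances)

for cell, (n, v) in sorted(summary.items()):
    print(cell, "n =", n, "variance =", v)
print("max/min variance ratio:", ratio)
```

Cells with a single observation have no variance estimate, which is why they are reported separately rather than entering the ratio.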

Cheers,
Bruce
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: transform variable

Ivitseva
Dear Bruce,

Thanks for answering!!!

1: The Levene's statistic is significant.
F=17.961; df1=46; df2=16927; sig=0.000

2: cell variability (standard deviations are reported, not variances)
min:216.25
max:8299.15

3: No, unfortunately the sample size also varies.
The largest cell has n = 1295 and the smallest has n = 1 (I could leave that
one out, but I still have another cell with n = 3).

4: Y has a skewed distribution in almost all the cells.
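
Because the figures under point 2 are standard deviations, the corresponding variance ratio is the square of the SD ratio. A short worked calculation using only the two values reported above (variable names are illustrative):

```python
sd_min = 216.25   # smallest cell SD reported in this post
sd_max = 8299.15  # largest cell SD reported in this post

sd_ratio = sd_max / sd_min
var_ratio = sd_ratio ** 2  # variance = SD squared, so the ratio is squared as well

print(f"SD ratio: {sd_ratio:.1f}")
print(f"variance ratio: {var_ratio:.0f}")
```

The largest SD is roughly 38 times the smallest, so the largest variance is well over a thousand times the smallest.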

Can you suggest something?

I would like to show how much the trend in the start of the vegetation
growing season differs among the combined categories of land cover and
environmental zones. For this I wanted to use the GLM/two-way ANOVA, but I
guess I cannot if the assumption of homogeneity of variance is violated,
right?

Thanks in advance,
Eva

(Replying to Bruce Weaver's message of Tue, 26 Jan 2010 14:22:36 -0800.)

Re: transform variable

Maguin, Eugene
Eva,

I'm curious about your data and would like to ask a couple of questions so I
can learn something new. First, you talk about the 'trend' or 'slope'. How
are you computing trend/slope and trend/slope in what? Second, you say that
you have positive, negative and 0 values of trend at the start of the
vegetation growing season. How is it that you can get negative and 0 values
for trend?

You do have a huge SD value (8299) for a land cover-climate zone
combination. That sounds like you have huge growth rates for some plants and
low growth (negative??) rates for other plants. I'm having difficulty
understanding that.

Gene Maguin




Re: transform variable

Ivitseva
Hi Gene,

Well, I'm not sure I can tell you anything new but I'll do my best to
explain my study.

I work in remote sensing, and right now I use satellite images to extract
phenological indicators. I have used a set of images over Europe showing
vegetation vigour (the index is called NDVI) for 26 years, from 1983 to 2008,
and I have extracted the Start of Season parameter for each pixel in the image
for each year. Thus for each 1 km spatial resolution pixel over Europe I have
26 values for the start of the vegetation growing season, expressed in decades
(10-day periods; e.g. the value 8 for a given year means that for that pixel
in that year the growing season started in the second decade of March). I fit
a linear trend over these 26 observations for each pixel. I use the slope
(regression coefficient) of the linear regression model to assess the rate of
change in the dependent variable, which is the start of the season. Negative
slope values mean a negative trend and thus an earlier start of season;
similarly, positive slope values mean a positive trend and a later start of
season. If the slope is zero, there is no trend. I only assess these values
for pixels where the t-test of the regression was significant.
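
The per-pixel procedure described above could be sketched roughly as follows. This is an illustration, not the code actually used in the study: the function names, the made-up start-of-season series, and the choice of a two-sided 5% level for the t-test are all assumptions.

```python
import math

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def decade_to_month(decade):
    """Map a 10-day period number (1..36) to (month, decade within the month)."""
    return MONTHS[(decade - 1) // 3], (decade - 1) % 3 + 1

def slope_and_t(years, values):
    """Ordinary least-squares slope of values on years, plus its t statistic."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares; the slope's standard error uses n - 2 df.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t = slope / se_slope if se_slope > 0 else float("nan")
    return slope, t

# Two-sided 5% critical value of Student's t with 24 df (26 observations - 2).
T_CRIT_24 = 2.064

years = list(range(1983, 2009))  # 26 years, 1983..2008
# Hypothetical start-of-season values (in decades) for one pixel.
values = [12 - 0.1 * i + (0.5 if i % 2 else -0.5) for i in range(26)]

slope, t = slope_and_t(years, values)
significant = abs(t) > T_CRIT_24
print(decade_to_month(8), round(slope, 4), round(t, 2), significant)
```

A negative slope here means the season starts earlier over time, in decades per year; only pixels with a significant slope would be carried forward to the group comparison.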

Regarding the SD: yes, for some land cover-climate zone combinations I have
huge values, meaning that in those zones there are pixels where the start of
the season shifted much earlier, while in other pixels in the same combination
group the start of the season shifted later during the observed 26 years. But
that might also come from the fact that the resulting groups have very
different spatial sizes, some being large with over 100 pixels whereas other
groups contain only a few pixels.

I just do not know how to present my results and, above all, I do not know if
I can present the GLM/two-way ANOVA at all given the violation of the
homogeneity of variance assumption. You see, I'm in remote sensing, applying
self-taught statistics, and quite often I am completely lost.

Cheers,

Eva

(Replying to Gene Maguin's message of Wed, 27 Jan 2010 13:46:06 -0500.)