Repeated measures analysis of fractions summing to a constant

Repeated measures analysis of fractions summing to a constant

Kirill Orlov
Suppose you have a between-within design: several between-subject groups and several (3 or more) repeated-measures (= within-subject) trials. It's all very classic and typical. The nuance, however, is that the values for every subject sum across the repeated levels to a **constant**. This is because the data are complementary, i.e. percentages or fractions, so in this case they sum to 100 for every individual. For example, with 3 RM levels, one respondent's data might be 30%, 22%, 48% (sum=100); another's 25%, 33%, 42% (sum=100).

I know that I can analyze between-groups X repeated-measures count data via the Generalized Estimating Equations procedure. But I have doubts in this case because the values *sum to a constant*: they are complementary fractions, not counts of successes in repeated independent trials!

Can I analyze such data in SPSS, and if so, how? Thanks.


Re: Repeated measures analysis of fractions summing to a constant

Bruce Weaver
Administrator
Hello Kirill.  This is not a direct answer to your question.  I'm just pointing to a thread from a couple years ago that addressed the same question.  One of my posts in it gives a couple of references that may be of interest to you.  Both of them suggest that ANOVA generally works quite well with "ipsative" data (or "allocated observations").  You can see the relevant messages here:

   http://listserv.uga.edu/cgi-bin/wa?A2=ind1101&L=spssx-l&P=36237

HTH.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Repeated measures analysis of fractions summing to a constant

Ryan
I'd like to chip in to report that since that exchange, I came across real-world data where subjects were asked to rank order items. When faced with these data, I found an elegant solution to this problem:
 
 
I'm not suggesting that this article presents a solution to the OP's problem, but the article is relevant to the dependency issue. It's a good read for those who must contend with rank-ordered items.
 
Best,
 
Ryan
 


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Re: Repeated measures analysis of fractions summing to a constant

Rich Ulrich
In reply to this post by Kirill Orlov
There is a literature on "compositional data" which probably will be helpful. 
Years ago, I found Aitchison to be readable.

I have no idea whether it will work for your model, but I will mention
that you escape the absolute linear dependency if you represent each
fraction as its log-odds, like log(25/75)  in place of 25%.

--
Rich Ulrich
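
The log-odds idea above can be tried on the two example respondents from the opening post. This is only a minimal sketch in Python (standard library only), not an SPSS procedure; the function name `log_odds` is made up for illustration. It shows that after the transformation the rows no longer sum to a common constant, so the exact linear dependency is gone, although the values are of course still derived from constrained data.

```python
import math

def log_odds(percent):
    """Log-odds of one fraction given as a percentage strictly between 0 and 100."""
    if not 0 < percent < 100:
        raise ValueError("log-odds is undefined at 0% and 100%")
    return math.log(percent / (100.0 - percent))

# The two respondents from the opening post (each row sums to 100).
respondents = [(30, 22, 48), (25, 33, 42)]
transformed = [[log_odds(p) for p in row] for row in respondents]

for raw, logits in zip(respondents, transformed):
    # Raw rows share the sum 100; the transformed rows do not share a sum.
    print(raw, "sum =", sum(raw), "->",
          [round(v, 3) for v in logits], "sum =", round(sum(logits), 3))
```

Note that a reported 0% or 100% has no finite log-odds, so such responses would need separate handling before this transformation could be used.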




Re: Repeated measures analysis of fractions summing to a constant

Bruce Weaver
Administrator
In reply to this post by Rich Ulrich
Judging from what I see on the Wikipedia page (http://en.wikipedia.org/wiki/Compositional_data), "compositional data" is another name for what Shaffer called "allocated observations" and Greer & Dunlap called "ipsative data".  But it also looks like there are two sets of literature that do not overlap all that much.



Re: Repeated measures analysis of fractions summing to a constant

Rich Ulrich
The Wikipedia article on "ipsative" tells me that my own use of
that term falls under the third type that they mention, where
educators may standardize the scores for an individual based
only on that individual's previous scores.

It seems that you are apt to find several different uses under "ipsative"
in addition to the one that resembles "compositional".

--
Rich Ulrich

Re: Repeated measures analysis of fractions summing to a constant

Ryan
In reply to this post by Kirill Orlov
Would the OP mind explaining exactly what the DV is? It might help shed light on how to proceed, notwithstanding other interesting solutions. Speaking of which, the idea of converting the proportions to logits is intriguing. I am generally in favor of the logit scale because of its properties, but the fact that it addresses the linear dependence is an added bonus here.

Anyway, I would appreciate it if the OP would be willing to tell us more about the DV.

Ryan


Re: Repeated measures analysis of fractions summing to a constant

Kirill Orlov
Ryan,
For example, the DV might be "how do you spend your typical day?" question
Work __% of time
Meals__% of time
Stroll__% of time
Comp/TV/Reading__% of time
Else__% of time
[Please check that your answers sum to 100%]

Converting to logits might be an interesting idea, although not necessarily the most correct one. But I wonder whether SPSS (GEE or another procedure) has already foreseen this and provides tools (reference distribution + link function) specifically for a DV consisting of fractions summing to a constant, since such a DV isn't uncommon.



Re: Repeated measures analysis of fractions summing to a constant

Kornbrot, Diana
SPSS's generalized linear model procedures do indeed provide appropriate routines for analyzing count data.

Use either GEE (Generalized Estimating Equations) or generalized linear models.
Your model is for COUNTS; I recommend negative binomial with a log link. The percentages should be converted to number of hours as the response variable, participant is the subject variable, and category (work, meals, etc.) is the repeated measure. The model has category as a predictor, which should also be entered as a fixed effect.

Have fun

Diana
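
The conversion from percentages to hours suggested above could look like the following minimal Python sketch. The 24-hour total, the function name `percent_to_hours` and the respondent's values are all assumptions made for illustration; the thread gives the categories but no actual responses.

```python
HOURS_PER_DAY = 24  # assumption: the percentages describe a full 24-hour day

def percent_to_hours(percentages, total_hours=HOURS_PER_DAY):
    """Convert one respondent's percentages (summing to 100) into hours."""
    if abs(sum(percentages.values()) - 100) > 1e-9:
        raise ValueError("percentages must sum to 100")
    return {category: total_hours * share / 100
            for category, share in percentages.items()}

# Hypothetical respondent, using the categories from the question above.
respondent = {"Work": 35, "Meals": 10, "Stroll": 5,
              "Comp/TV/Reading": 20, "Else": 30}
hours = percent_to_hours(respondent)

print(hours)
print("total hours:", sum(hours.values()))
```

Two things are worth keeping in mind: the resulting hours are generally not integers, so they would need rounding before being treated as counts, and they still sum to a constant for every respondent, so the conversion by itself does not remove the dependency that the original question is about.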



Emeritus Professor Diana Kornbrot
email:  d.e.kornbrot@...    
 web:    http://dianakornbrot.wordpress.com/
Work
Department of Psychology
School of Life and Medical Sciences
University of Hertfordshire
College Lane, Hatfield, Hertfordshire AL10 9AB, UK
voice:   +44 (0) 170 728 4626
Home
19 Elmhurst Avenue
London N2 0LT, UK
voice:   +44 (0) 208  444 2081
mobile: +44 (0) 740 318 1612



Re: Repeated measures analysis of fractions summing to a constant

Andy W
Kirill,

This is a question that has come up on Cross Validated a few times; see http://stats.stackexchange.com/q/24187/1036 for an example. A frequent recommendation seems to be a Stata package by the name of dirifit (see http://maartenbuis.nl/software/dirifit.html) or an analogous R package, DirichletReg. I do not know whether the current GENLIN procedure can be wrangled to produce the same model.

A quick perusal of some of the materials floating around the web related to those packages suggests that a quick and dirty way is to fit separate beta regression models for each of the fractions - although that doesn't constrain the total to be 1. (Smithson & Verkuilen (2006), "A Better Lemon Squeezer", has supplementary material on how to fit beta regression models in SPSS.)

Count data models are not appropriate here because of the ceiling effect. You can look up ways around that (like censored Poisson regression or Tobit models) - but those ignore the compositional nature of the data. Another suggestion on the CV site recommends multinomial models - I see the relationship, but I don't quite understand how you turn this into discrete outcomes to feed into a multinomial logistic regression.

Looks like you will have some (hopefully fun) reading to do to sort through all these disparate recommendations!

Andy
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
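
For readers unfamiliar with the distribution behind Dirichlet regression, here is a minimal Python sketch of the Dirichlet log-density for one vector of fractions. It is not the code of dirifit or DirichletReg, and the `alpha` values are hypothetical; in an actual regression the parameters would depend on predictors such as group. The point is only that the fractions are modelled jointly, so the sum-to-one constraint is built in.

```python
import math

def dirichlet_log_density(fractions, alpha):
    """Log-density of Dirichlet(alpha) at positive fractions that sum to 1."""
    if len(fractions) != len(alpha):
        raise ValueError("fractions and alpha must have the same length")
    if min(fractions) <= 0 or abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must be positive and sum to 1")
    if min(alpha) <= 0:
        raise ValueError("alpha parameters must be positive")
    # Log of the normalizing constant: Gamma(sum(alpha)) / prod(Gamma(alpha_i)).
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(x)
                          for a, x in zip(alpha, fractions))

# One respondent from the opening post, rescaled from percentages to fractions.
fractions = [0.30, 0.22, 0.48]
alpha = [3.0, 2.0, 5.0]  # hypothetical parameters, chosen only for illustration
print(dirichlet_log_density(fractions, alpha))
```

With all parameters equal to 1 the density is flat over the simplex, which gives a quick sanity check of the function.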

Re: Repeated measures analysis of fractions summing to a constant

Art Kendall
In reply to this post by Kirill Orlov
Correspondence analysis is designed for compositional data.
Art Kendall
Social Research Consultants

Re: Repeated measures analysis of fractions summing to a constant

Kirill Orlov
In reply to this post by Kornbrot, Diana
Thank you for all the answers so far. I haven't read them carefully yet.

But here is what came to my own mind in the meantime, after a little meditation.
It is very simple: I think (PLEASE correct me if I'm mistaken!) that there is no problem at all. The constraint that the repeated measures sum to a constant within individuals *does not* rule out the common RM-ANOVA model. As long as the ANOVA distributional and sphericity assumptions hold, no need for GEE or other procedures arises at all.

Let's have some data: a between-subject grouping factor GROUP and a within-subject factor RM with 3 levels summing to a constant (100).

group  rm1  rm2  rm3  sum

    1   50   30   20  100
    1   24   42   34  100
    1   34   16   50  100
    1   61   28   11  100
    1   46   46    8  100
    1   23   18   59  100
    2   55   22   23  100
    2   27   39   34  100
    2   44   36   20  100
    2   28   40   32  100


Run the usual repeated-measures ANOVA:

* Mixed design: GROUP between subjects, RM (3 levels) within subjects.
GLM rm1 rm2 rm3 BY group
  /WSFACTOR= rm 3
  /METHOD= SSTYPE(3)
  /WSDESIGN= rm
  /DESIGN= group.


Summing to a constant just means that, upon collapsing the RM levels, all respondents appear to be the same: there exists no between-subject variation at all, or in other words, the "respondent ID" factor's effect is zero. Hence, in the table "Tests of Between-Subjects Effects" the Error term is zero. Also, the effect of the GROUP factor is zero too - of course, because the constant sum (100) in our data is the same for both groups 1 and 2.

Now, I'd ask you: are these results invalid in any way? Do we say that ANOVA is misused when an error variation - the part left unexplained - is zero? I would not say so, and so RM-ANOVA *is* an appropriate method for fractions (i.e., values summing to a constant). If I'm wrong, please explain to me why.
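
The statement that a constant row sum leaves no between-subject variation can be checked by hand on the ten rows above. The following is a minimal Python sketch of that arithmetic, not a reproduction of the GLM output: it computes each subject's mean over the three levels and the total between-subjects sum of squares (group effect plus subjects-within-group error), and then the level means, which are where all the variation lives.

```python
# Data from the post above: (group, rm1, rm2, rm3); each row sums to 100.
data = [
    (1, 50, 30, 20), (1, 24, 42, 34), (1, 34, 16, 50), (1, 61, 28, 11),
    (1, 46, 46, 8), (1, 23, 18, 59), (2, 55, 22, 23), (2, 27, 39, 34),
    (2, 44, 36, 20), (2, 28, 40, 32),
]
k = 3            # number of repeated-measures levels
n = len(data)    # number of subjects

# Every subject's mean is 100 / k, because every row sums to 100.
subject_means = [sum(row[1:]) / k for row in data]
grand_mean = sum(subject_means) / n

# Total between-subjects sum of squares: zero up to rounding error.
ss_between_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)

# Means of the three levels, pooled over both groups; these do differ.
level_means = [sum(row[j] for row in data) / n for j in range(1, k + 1)]

print("subject means:", subject_means)
print("SS between subjects:", ss_between_subjects)
print("level means:", level_means)
```

Because the between-subjects total is zero, both the GROUP effect and the between-subjects error are zero, which is what the post describes; the within-subject part of the analysis is unaffected by this particular calculation.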


Re: Repeated measures analysis of fractions summing to a constant

Rich Ulrich
No, you are a bit wrong in concluding that there is no problem.

If you think of the situation of dummy variables, you have provided
an "extra" dummy, like entering dichotomies for both Male and Female.
There is redundancy.  There is over-parameterization.  There is,
somewhere, the loss of one d.f.  for RM when you perform any analysis. 
A "fixed" zero-effect is not the same as a randomly occurring near-zero-effect.

You retain full information (in the statistical sense) if you set up your
model to leave out one of the categories, just as one would for any
dummy coding.  The others will be most "independent" if you omit the
category that has the greatest variance.  The drawback might lie in the
ease of interpreting your results.

--
Rich Ulrich
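
The suggestion above, to leave out one category and to prefer the one with the greatest variance, can be illustrated on the posted data. This is a minimal Python sketch under one assumption that the post does not settle: the variance is taken over all ten respondents, ignoring group. It also confirms that nothing is lost by the omission, since the dropped category is exactly 100 minus the others.

```python
from statistics import variance

# Columns of the posted data set (ten respondents, rows sum to 100).
columns = {
    "rm1": [50, 24, 34, 61, 46, 23, 55, 27, 44, 28],
    "rm2": [30, 42, 16, 28, 46, 18, 22, 39, 36, 40],
    "rm3": [20, 34, 50, 11, 8, 59, 23, 34, 20, 32],
}
TOTAL = 100

# Sample variance of each category, pooled over groups (an assumption).
variances = {name: variance(values) for name, values in columns.items()}
omitted = max(variances, key=variances.get)
kept = [name for name in columns if name != omitted]

# The omitted category is fully recoverable from the ones that are kept.
reconstructed = [TOTAL - sum(columns[name][i] for name in kept)
                 for i in range(len(columns[omitted]))]

print(variances)
print("omit:", omitted, "keep:", kept)
print("reconstruction exact:", reconstructed == columns[omitted])
```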




Re: Repeated measures analysis of fractions summing to a constant

Bruce Weaver
Administrator
The articles by Shaffer (1981) and Greer & Dunlap (1997) say there is no problem.  (I've sent both of them to Rich off-list.)  Meanwhile, here are some relevant excerpts from Greer & Dunlap.

"Periodically, researchers in the behavioral sciences analyze measures that are ipsative. Ipsative measures are those for which the mean for each level of one or more variables (usually the participants) equals the same constant. Data with these constraints are also referred to as allocated observations (Shaffer, 1981) and compositional data (when the scores are proportions; Aitchison, 1986)."  (p. 200)

"The general conclusion is clear: Repeated measures ANOVA with ipsative data works quite well.  Although it is known that techniques such as factor analysis are badly affected by ipsative scores, ANOVA is not, particularly if the epsilon correction for nonuniform variance-covariance matrices is used.  Fortunately, the epsilon correction for repeated measures ANOVA is readily obtainable from most major computer statistical packages.  Therefore, it is hoped that readers will no longer look with suspicion upon ANOVAs with ipsative data, even though the presence of sums of squares equal to zero is disconcerting." (p. 206)

Reference

Greer T, Dunlap WP. (1997). Analysis of variance with ipsative measures. Psychological Methods, 2(2), 200-207.

HTH.
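
To make the "epsilon correction" in the excerpt concrete, here is a minimal Python sketch of the Greenhouse-Geisser estimate for the data posted earlier in the thread. It uses the textbook formula on the pooled within-group covariance matrix of the three measures; the function names are made up for illustration, and the value may differ in detail from what SPSS prints. The sketch also shows that with sum-constrained data every row of the covariance matrix sums to zero.

```python
def pooled_covariance(rows_by_group, k):
    """Pooled within-group covariance matrix of the k repeated measures."""
    sscp = [[0.0] * k for _ in range(k)]
    df = 0
    for rows in rows_by_group.values():
        means = [sum(r[j] for r in rows) / len(rows) for j in range(k)]
        for r in rows:
            for i in range(k):
                for j in range(k):
                    sscp[i][j] += (r[i] - means[i]) * (r[j] - means[j])
        df += len(rows) - 1  # degrees of freedom contributed by this group
    return [[value / df for value in row] for row in sscp]

def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a symmetric covariance matrix."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    grand = sum(row_means) / k
    # Double-center the matrix (column means equal row means by symmetry).
    centered = [[cov[i][j] - row_means[i] - row_means[j] + grand
                 for j in range(k)] for i in range(k)]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(value * value for row in centered for value in row)
    return trace ** 2 / ((k - 1) * sum_sq)

# Data from earlier in the thread, split by group: (rm1, rm2, rm3).
rows_by_group = {
    1: [(50, 30, 20), (24, 42, 34), (34, 16, 50),
        (61, 28, 11), (46, 46, 8), (23, 18, 59)],
    2: [(55, 22, 23), (27, 39, 34), (44, 36, 20), (28, 40, 32)],
}
cov = pooled_covariance(rows_by_group, 3)
epsilon = greenhouse_geisser_epsilon(cov)

print("row sums of covariance matrix:", [round(sum(row), 9) for row in cov])
print("Greenhouse-Geisser epsilon:", epsilon)
```

Epsilon lies between 1/(k-1) and 1, and the within-subject degrees of freedom are multiplied by it, which is the reduction in d.f. that the correction relies on.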


Rich Ulrich-2 wrote
No, you are a bit wrong in concluding that there is no problem.

If you think of the situation of dummy variables, you have provided
an "extra" dummy, like entering dichotomies for both Male and Female.
There is redundancy.  There is over-parameterization.  There is,
somewhere, the loss of one d.f.  for RM when you perform any analysis.  
A "fixed" zero-effect is not the same as a randomly occurring near-zero-effect.

You retain full information (in the statistical sense) if you set up your
model to leave out one of the categories, just as one would for any
dummy coding.  The others will be most "independent" if you omit the
category that has the greatest variance.  The drawback might lie in the
ease of interpreting your results.

--
Rich Ulrich


Date: Fri, 5 Apr 2013 19:36:04 +0400
From: [hidden email]
Subject: Re: Repeated measures analysis of fractions summing to a constant
To: [hidden email]


 

   
 
 
    Thank you for all your answers that came so far. I haven't read them
    carefully yet.

   

    But here is what meanwhile came to my own mind after a little
    meditation.

    It is very simple: I just thought that (PLEASE correct me if I'm
    mistaken!) that there is no problem at all. The constraint that
    repeated-measures sum to a constant within individuals *does not*
    refute using common RM-ANOVA model. If only ANOVA distributional and
    spericity assumptions hold, no need for GEE or other procedures
    arise at all.

   

    Let's have some data: between-subject grouping factor  GROUP and
    within-subject factor RM with 3 levels summing up to a constant
    (100).

   

       group      rm1
      rm2      rm3      sum

       

             1       50       30       20      100

             1       24       42       34      100

             1       34       16       50      100

             1       61       28       11      100

             1       46       46        8      100

             1       23       18       59      100

             2       55       22       23      100

             2       27       39       34      100

             2       44       36       20      100

             2       28       40       32      100

   

    Run usual Repeated-measures ANOVA:

   

      GLM rm1 rm2 rm3 BY group

        /WSFACTOR= rm 3

        /METHOD= SSTYPE(3)

        /WSDESIGN= rm

        /DESIGN= group.

   

    Summing to a constant just means that, once the RM levels are
    collapsed, all respondents look the same: there is no
    between-subject variation at all, or in other words, the effect of
    the "respondent ID" factor is zero. Hence, in the "Tests of
    Between-Subjects Effects" table, the Error term is zero. The effect
    of the GROUP factor is zero too, of course, because the constant
    sum (100) in our data is the same for both groups 1 and 2.
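
The zero between-subjects terms can be reproduced by hand. This is a minimal sketch in Python (not SPSS output), using the rows from the table above; the variable names are illustrative assumptions, and the sums of squares follow the usual textbook decomposition for a mixed design rather than any particular SPSS routine.

```python
# (group, rm1, rm2, rm3) rows copied from the example data.
data = [
    (1, 50, 30, 20), (1, 24, 42, 34), (1, 34, 16, 50), (1, 61, 28, 11),
    (1, 46, 46, 8), (1, 23, 18, 59), (2, 55, 22, 23), (2, 27, 39, 34),
    (2, 44, 36, 20), (2, 28, 40, 32),
]
n_levels = 3

# Per-subject totals across the three RM levels (all equal to 100 here).
subject_totals = [rm1 + rm2 + rm3 for _, rm1, rm2, rm3 in data]
grand_mean = sum(subject_totals) / (len(data) * n_levels)

# Between-subjects SS: spread of subject means around the grand mean.
ss_between_subjects = n_levels * sum(
    (total / n_levels - grand_mean) ** 2 for total in subject_totals
)

# GROUP SS: spread of group means around the grand mean.
ss_group = 0.0
for g in sorted({row[0] for row in data}):
    totals = [t for row, t in zip(data, subject_totals) if row[0] == g]
    group_mean = sum(totals) / (len(totals) * n_levels)
    ss_group += len(totals) * n_levels * (group_mean - grand_mean) ** 2

# Between-subjects error: what remains after removing the GROUP effect.
ss_error_between = ss_between_subjects - ss_group

print(ss_between_subjects, ss_group, ss_error_between)
```

All three quantities come out as zero because every subject mean equals the grand mean, which is exactly what the constant-sum constraint forces.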

   

    Now, I'd ask you: are these results invalid in any way? Do we
    say that ANOVA is misused when an error variation (the part left
    unexplained) is zero? I would not say so, and so RM-ANOVA *is* an
    appropriate method for fractions (i.e. values summing to a
    constant). If I'm wrong, please explain to me why.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Repeated measures analysis of fractions summing to a constant

Rich Ulrich
Okay.  I pointed out that there was a loss of d.f.  The citation from
Greer & Dunlap (G&D) says the analysis is okay if you use the epsilon
correction for repeated measures.  Now, remember that the epsilon
correction works by reducing the d.f.

I haven't yet looked at what Bruce sent me, but I'm willing to accept
that the epsilon correction reduces the d.f. appropriately, either
100% of what is needed, or nearly 100%.  Epsilon corrections could
be large enough.

It has been a long time since I looked at the epsilon correction, but
I do remember that descriptions mentioned cells with zero or near-zero
for variances.  In my recollection, what was discussed were models
where zeroes were apt to be due to "basement" or "ceiling" scoring
effects.  The d.f. corrections were not always small, so I expect that
they could work here.  (If I were reporting the data, I would be careful
to report the epsilons as evidence that the model has accounted
properly for the d.f.)
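
For readers who want to see what such an epsilon looks like, here is a minimal sketch in Python (not SPSS) of the Greenhouse-Geisser estimate for the example data posted in this thread. It assumes the estimate is based on the pooled within-group covariance matrix of the repeated measures, and uses the standard double-centering form of the formula; the function names are hypothetical, and the result is not guaranteed to match SPSS output digit for digit.

```python
# (group, rm1, rm2, rm3) rows from the example data in this thread.
data = [
    (1, 50, 30, 20), (1, 24, 42, 34), (1, 34, 16, 50), (1, 61, 28, 11),
    (1, 46, 46, 8), (1, 23, 18, 59), (2, 55, 22, 23), (2, 27, 39, 34),
    (2, 44, 36, 20), (2, 28, 40, 32),
]
k = 3  # number of repeated-measures levels


def pooled_covariance(rows, k):
    """Pooled within-group covariance matrix of the k repeated measures."""
    groups = {}
    for g, *scores in rows:
        groups.setdefault(g, []).append(scores)
    sscp = [[0.0] * k for _ in range(k)]
    df = 0
    for members in groups.values():
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        for scores in members:
            dev = [x - m for x, m in zip(scores, means)]
            for i in range(k):
                for j in range(k):
                    sscp[i][j] += dev[i] * dev[j]
        df += n - 1
    return [[v / df for v in row] for row in sscp]


def greenhouse_geisser(cov):
    """Epsilon = trace(D)^2 / ((k-1) * sum of squared elements of D),
    where D is the double-centered covariance matrix."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    col_means = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    grand = sum(row_means) / k
    centered = [
        [cov[i][j] - row_means[i] - col_means[j] + grand for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centered for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)


cov = pooled_covariance(data, k)
epsilon = greenhouse_geisser(cov)
print(epsilon)
```

By construction the estimate lies between 1/(k-1) and 1, so with three levels it cannot go below 0.5. Because each row sums to 100, every row of the covariance matrix sums to zero, which is the singularity that the d.f. discussion is about.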

--
Rich Ulrich


> Date: Fri, 5 Apr 2013 12:36:24 -0700
> From: [hidden email]
> Subject: Re: Repeated measures analysis of fractions summing to a constant
> To: [hidden email]
>
> The articles by Shaffer (1981) and Greer & Dunlap (1997) say there is no
> problem. (I've sent both of them to Rich off-list.) Meanwhile, here are
> some relevant excerpts from Greer & Dunlap.
>
> "Periodically, researchers in the behavioral sciences analyze measures that
> are ipsative. Ipsative measures are those for which the mean for each level
> of one or more variables (usually the participants) equals the same
> constant. Data with these constraints are also referred to as allocated
> observations (Shaffer, 1981) and compositional data (when the scores are
> proportions; Aitchison, 1986)." (p. 200)
>
> "The general conclusion is clear: Repeated measures ANOVA with ipsative data
> works quite well. Although it is known that techniques such as factor
> analysis are badly affected by ipsative scores, ANOVA is not, particularly
> if the epsilon correction for nonuniform variance-covariance matrices is
> used. Fortunately, the epsilon correction for repeated measures ANOVA is
> readily obtainable from most major computer statistical packages.
> Therefore, it is hoped that readers will no longer look with suspicion upon
> ANOVAs with ipsative data, even though the presence of sums of squares equal
> to zero is disconcerting." (p. 206)
>
> Reference
>
> Greer T, Dunlap WP. (1997). Analysis of variance with ipsative measures.
> Psychological Methods, 2(2), 200-207.
>
> HTH.
>
>
>
> Rich Ulrich-2 wrote
> > No, you are a bit wrong in concluding that there is no problem.
> >
> > If you think of the situation of dummy variables, you have provided
> > an "extra" dummy, like entering dichotomies for both Male and Female.
> > There is redundancy. There is over-parameterization. There is,
> > somewhere, the loss of one d.f. for RM when you perform any analysis.
> > A "fixed" zero-effect is not the same as a randomly occurring
> > near-zero-effect.
> >
> > You retain full information (in the statistical sense) if you set up your
> > model to leave out one of the categories, just as one would for any
> > dummy coding. The others will be most "independent" if you omit the
> > category that has the greatest variance. The drawback might lie in the
> > ease of interpreting your results.
> >
> > --
> > Rich Ulrich

Re: Repeated measures analysis of fractions summing to a constant

Zuluaga, Juan
In reply to this post by Kirill Orlov
'For example, the DV might be "how do you spend your typical day?" question
Work __% of time
Meals__% of time
...
[Please check that your answers sum to 100%]'

Isn't this what Correspondence Analysis is for?
http://sru.soc.surrey.ac.uk/SRU7.html
A related approach:
http://igitur-archive.library.uu.nl/fss/2007-1004-201532/heijden_van_der_88_the_analysis.pdf

Ecologists have that kind of data very often, when they compare transects (what percentage of the plants sampled belong to which species, etc.). Check Legendre & Legendre's Numerical Ecology.
