SPSSX Discussion

comparison of means across strata

xenia

comparison of means across strata

Hi all, I have the following problem:
I have a dataset containing the age at screening of each individual. An individual can have many screenings, so each row has the ID of the individual, the screening number (1st screening, 2nd, etc.) and the age at that screening. The file includes both males and females. I also have a "no. of samples" variable giving how many screenings an individual had; e.g. for an individual with 5 screenings, the total number of samples is 5. Eventually I want to compare the mean age of males and females across all sample strata, using weighted sums across strata (as you will see below). Is the right test a 2-way ANOVA through the GLM procedure? Is there a simpler way to do this in SPSS?

So far I have calculated everything separately in SPSS:
Within each sample stratum (e.g. all individuals with 3 screenings), I have calculated agediff = mean age females - mean age males, and the respective standard error.
I have set x(i) = 1/se(i)**2 (the inverse of the squared se for sample stratum i), and
w(i) = x(i)/(sum of x(i) over all sample strata i).
Then I calculated the pooled agediff = sum over all i of (w(i)*agediff(i)) (A).
These calculations give me a file with one row per sample stratum, containing the age difference, the mean age in females, the mean age in males, and x and w for that stratum. Would it be correct to run an independent-samples t-test, so that I can examine the age difference between males and females pooled across all the sample strata?
However, I don't know how to incorporate the weights w described above into the ANOVA or t-test procedure. Do the SPSS procedures calculate the pooled age difference as the sum over all i of (w(i)*agediff(i)), or do I need to change something in the syntax or write a macro to get (A)?

I appreciate your help, thank you.
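
For reference, the calculation of (A) can be sketched outside SPSS. The following is only a minimal illustration in Python, not SPSS syntax; the function name `pooled_difference` and the example numbers are made up. It computes x(i), w(i) and the pooled agediff as defined above. It also adds something not stated in the post: under the usual fixed-effect (inverse-variance) assumption, the standard error of the pooled difference is sqrt(1/sum of x(i)), which gives an approximate z-test.

```python
import math


def pooled_difference(diffs, ses):
    """Inverse-variance pooling of per-stratum differences.

    diffs: agediff(i) for each stratum (mean age females - mean age males)
    ses:   standard error se(i) of each agediff(i)
    """
    x = [1.0 / se ** 2 for se in ses]        # x(i) = 1/se(i)**2
    total = sum(x)
    w = [xi / total for xi in x]             # w(i) = x(i) / sum of x(i)
    pooled = sum(wi * di for wi, di in zip(w, diffs))  # (A)
    # Assumption (not in the post): fixed-effect standard error and z-test.
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, z, p_two_sided


if __name__ == "__main__":
    # Hypothetical per-stratum results, for illustration only.
    example_diffs = [2.0, 4.0]
    example_ses = [1.0, 2.0]
    print(pooled_difference(example_diffs, example_ses))
```

The weights sum to 1, so strata with smaller standard errors dominate the pooled value. This treats the strata as independent and the true difference as common to all strata; if either is doubtful, a model-based approach is more appropriate.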

Maguin, Eugene

Re: comparison of means across strata

Others with more experience may correct me, but I think this is just a multilevel problem. You have screening age nested within person. There are no level 1 predictors and one level 2 predictor: gender. I think you can use the MIXED procedure with the EMMEANS subcommand.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of xenia
Sent: Thursday, November 08, 2012 9:58 AM
To: [hidden email]
Subject: comparison of means across strata

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/comparison-of-means-across-strata-tp5716113.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

xenia

Re: comparison of means across strata

Thank you,
I want to compare the mean age for females with the mean age for males, across all sample strata. Each stratum represents the total number of screenings someone had, so in stratum 1 I have 10 individuals, in stratum 2 I have 5 and so on. So, shouldn't the stratum variable be included as a predictor?

Rich Ulrich

Re: comparison of means across strata

In reply to this post by xenia

The question is, What is your hypothesis? Or, is there
more than one?

It seems reasonable, to me, that you would compare several
variables that describe aspects of the study -- using obvious names,
InitialAge; FinalAge; Duration; AverageAge.

Duration uses StartDate and EndDate.
Only AverageAge makes use of more than one Age. I don't see
any reason to use anything other than the average of the several
ages that have been recorded, though other aspects of the study
might suggest something else.

It looks like t-tests to me.
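
A rough sketch of that approach, in Python rather than SPSS and with hypothetical names (`person_summaries`, `welch_t`): collapse the long file to one row per person, then compare the two groups on each summary variable. Two assumptions are made here that the post does not state: Duration is approximated as final age minus initial age (the post derives it from StartDate and EndDate), and the comparison uses the unequal-variance (Welch) form of the t statistic.

```python
import math
from collections import defaultdict
from statistics import mean, variance


def person_summaries(rows):
    """Collapse (person_id, sex, age) rows to one summary per person."""
    ages = defaultdict(list)
    sex_of = {}
    for pid, sex, age in rows:
        ages[pid].append(age)
        sex_of[pid] = sex
    return {
        pid: {
            "sex": sex_of[pid],
            "initial": min(a),             # age at first screening
            "final": max(a),               # age at last screening
            "average": mean(a),            # AverageAge
            "duration": max(a) - min(a),   # assumption: ages stand in for dates
            "n_screenings": len(a),
        }
        for pid, a in ages.items()
    }


def welch_t(a, b):
    """Welch t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

For example, `welch_t` could be called once each on the "initial", "final" and "average" values of females versus males; each group needs at least two people for the variances to be defined.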

--
Rich Ulrich

> Date: Thu, 8 Nov 2012 14:34:51 -0800

> From: [hidden email]
> Subject: Re: comparison of means across strata
> To: [hidden email]
>
> ...

xenia

Re: comparison of means across strata

Thank you,
the hypothesis is: the mean age in females is equal to the mean age in males, pooled across the strata. These would be t-tests, but I obviously don't want to perform multiple t-tests, which is why I thought the right test would be ANOVA. I also think the strata variable should be included, because age could be related to the number of screenings a person has; the number of screenings is what the strata variable encodes, as I explained in my initial query.

Rich Ulrich

Re: comparison of means across strata

In reply to this post by xenia

Okay, that statement of "what the hypothesis is" would not
satisfy me if I were in your audience. What do you mean by
"pooled"? You are writing a narrative, and *that* answer will
not satisfy anyone about anything.

Why don't you want to perform multiple t-tests? - This is not
a sort of test that introduces a "problem of multiple testing"
because, I am pretty sure, the average age is not an *outcome* or
otherwise of vital interest. When you are trying to validate, to show
that your sample is matched, you owe it to your audience to be complete
by testing everything. The "cheating" occurs when you avoid the
separate tests.

Perhaps you do not want to *report* multiple t-tests because of
the clutter. In that case, you perform all the tests that might be of
interest, to test whether the samples are equivalent by age, and
then (for instance) report the most and least divergent and explain
what they are. Or just show that the most divergent is not at all
divergent.

If "number of strata" is important, you might look separately at
means for various counts of followups.

--
Rich Ulrich

> Date: Thu, 8 Nov 2012 21:40:48 -0800

> From: [hidden email]
> Subject: Re: comparison of means across strata
> To: [hidden email]
>
> ...

xenia

Re: comparison of means across strata

"Pooled" means: the overall test across the strata, not within.
What you describe can be done more efficiently and in less time using ANOVA. I don't see the point of doing many t-tests when I can do one ANOVA, especially if the number of tests is e.g. 1000. I think a general linear model is more efficient than the t-test. Can you state the reasons why you think doing t-tests is better than doing a GLM?

Rich Ulrich

Re: comparison of means across strata

[Apparently not posted on first try. Paragraph 2 added.]

Keep the distinction between tests of outcome and tests for study
validation. Finding a difference in age (I presume) is something that
undermines the study. It is not an "outcome" that encourages
publication as an important study. The proper attitude towards
validation tests is, "Bring 'em on! We'll test everything that any
critic may worry about." And, as I said, your write-up can use an
honest summary instead of detailing everything.

This is all basic description that you ought to want to know for your
own good confidence in going ahead, if age is any real concern. If you
just want the crudest comparison, why not the average for each person?

I'm a great advocate of an "overall test" when I have a sensible
"overall hypothesis" that it will answer. In this case, as described,
I would want to know if the subjects were recruited at the same
age; if they stayed in the same amount of time (and same count
of followups... how consistent are the gaps?); and if they are the
same age at the end. Those are separate tests. If they show big
differences, you have new concerns for your later tests.

This is validation for the study design, where those things are
(probably) assumed to be equal. And the separate assumptions
deserve separate tests. If I see only one fuzzy, overall test, I will be
led to imagine bad things, such as, the analyst is either careless or
hiding something.

--
Rich Ulrich

> Date: Thu, 8 Nov 2012 23:56:05 -0800
> From: [hidden email]
> Subject: Re: comparison of means across strata
> To: [hidden email]
>
>...

xenia

Re: comparison of means across strata

In reply to this post by Maguin, Eugene

Thank you for this.

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/comparison-of-means-across-strata-tp5716113p5716146.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

xenia

Re: comparison of means across strata

In reply to this post by Rich Ulrich

Your reply was posted to my email, and I have already replied:

I'm doing an exploratory analysis of some existing data; this is not a study that is recruiting and will be published. So maybe you're talking in terms of a clinical study, but I'm not. I have two groups and right now I'm interested in whether their mean ages differ, nothing else. This is not a survival or other time-to-event analysis, so whether they are the same age at the end or when they were recruited is not relevant. The things you mention are relevant to setting up a trial, which is not what I'm doing, i.e. I'm not designing a study. I asked something very specific: method-wise, how to compare the means of two groups across strata when I have age, sex and number of samples. If I were running a trial I would have decided a priori which variables to examine and which primary outcomes I'm interested in, but all of that is quite different from the point of my question. I did not mention anything about study set-up or publication in my initial query, so this conversation is not to the point. However, thank you for taking the time to write and offer your help.

xenia

Re: comparison of means across strata

In reply to this post by xenia

I consider this closed now. Thank you to everyone who took the time to write, and to E. Maguin for the suggestion on Mixed with emmeans.