SPSSX Discussion

(no subject)


Muyiwa Oladosu

(no subject)

Dear All:

I need help with references on the following questions.

1. Sample size determination
2. Minimum size of a control group that is sufficient for comparison. For example, for an experimental group of 450, what is the minimum control group size that is necessary?

Thanks,

Muyiwa.

Everyday is a Gift!

Muyiwa Oladosu, PhD
Principal Associate/Consultant, MiraMonitor Consulting, LLC
P.O. Box 3239, Frederick, MD 21705, USA
Office: 240.723.1527, Fax: 301-695-5386
emails: [hidden email], [hidden email]
www.m2cnig.com www.miramic.com


Maguin, Eugene

Sample size determination

Muyiwa,

You simply have to provide more information. And, with respect, I suggest
that you do some reading on statistical power. Please respond to the list, as
I may not have time to help you with what you want.

Tell us: 1) What statistical test are you talking about? T-test? Chi-square?
Correlation? 2) What is your alternative hypothesis? That is, if you were doing
a correlation, what size of correlation do you want to be confident of being
able to find? This value should be stated in effect size terms. 3) How
confident do you want to be of being able to find an effect of that size
(i.e. what power)? The usual value in social science research is 80%.

Gene Maguin
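
As a rough illustration of why those inputs matter, here is a minimal SPSS sketch, not a definitive method. It assumes a two-group comparison of means, a two-sided test and the normal approximation. The effect size d = 0.20, alpha = 0.05 and power = 0.80 are illustrative values, not figures from the original question. With n1 fixed at 450 it solves d / sqrt(1/n1 + 1/n2) = z(1 - alpha/2) + z(power) for n2.

```spss
* Illustrative inputs only: n1, d, alpha and power are assumptions.
DATA LIST FREE / n1 d alpha power.
BEGIN DATA
450 0.20 0.05 0.80
END DATA.

* Sum of the two standard normal quantiles in the power equation.
COMPUTE zsum = IDF.NORMAL(1 - alpha/2, 0, 1) + IDF.NORMAL(power, 0, 1).

* Rearranged power equation: 1/n2 = (d/zsum)**2 - 1/n1.
COMPUTE invn2 = (d / zsum)**2 - 1/n1.

* n2 stays missing when invn2 is not positive.
IF (invn2 > 0) n2 = 1 / invn2.
FORMATS n2 (F8.1).
LIST VARIABLES = n1 d alpha power n2.
```

Round any n2 upward. If n2 comes out missing, no control group, however large, reaches the requested power for that effect size with only 450 in the experimental group.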

zstatman

Re: Sample size determination

There are numerous sample size and power calculators available, including a
couple linked from my Web site (Services tab, then scroll to the bottom).

W


Will
Statistical Services

============
info.statman@earthlink.net
http://home.earthlink.net/~z_statman/
============

Angshu Bhowmik

effect of diseases on length of stay

Hello,

I apologise in advance for my ignorance - about to be exposed by my question.

I am a doctor - not a statistician. I have data on lengths of hospital stay (LOS, in days) in about 100 patients with Disease A (always present i.e. 1). There are various co-morbidities: Disease B, Disease C etc which may be present or absent. There are in total 18 co-morbidities, but they could be grouped into 5 groups if that makes it easier to perform a more sensible analysis.

LOS A B C D

12 1 0 1 1

13 1 1 0 1

4 1 1 1 0

etc

I want to find out if any of the co-morbidities are increasing (or decreasing) LOS e.g. Does disease C increase length of stay independently (and if it is possible to find out, by how much)?

I have used SPSS to do simple tests and I know how to run syntax and create it by using "paste" but have little knowledge of more complicated methods of using SPSS (nor of stats for that matter). Nonetheless, since I have SPSS at work, it is convenient for me to use.

I think I can't use logistic regression unless my LOS is categorised into "long" or "short". I don't think the data fulfils the criteria for multiple regression (but I am willing to be corrected).

Can anyone tell me what test I should perform (and if it is not something found in the "analysis" menu, then also how to do it)?

Many thanks to anyone who has the time to answer this.

Angshu Bhowmik

London, UK


Richard Ristow

Re: effect of diseases on length of stay

Well, I'll start, but others can add more than I can.

At 11:59 AM 1/16/2007, Angshu Bhowmik wrote:

>I have data on lengths of hospital stay (LOS, in days) in about 100
>patients with Disease A (always present i.e. 1). There are various
>co-morbidities: Disease B, Disease C etc which may be present or
>absent. There are in total 18 co-morbidities, but they could be
>grouped into 5 groups if that makes it easier to perform a more
>sensible analysis.

That grouping is the only way to do it. It gives you a mean of 20
patients per group, which is reasonable. Keep all 18 co-morbidities,
and you have a mean of 100/18, about 5.6, which isn't going to be enough.

Then, you're comparing means between groups: that's a one-way analysis
of variance. In SPSS, command MEANS is a good place to start. The
syntax is easy, and it allows a lot of descriptive statistics by cell.
For descriptives, I'd select COUNT, MEAN, STDDEV, MEDIAN, MIN and MAX,
with /STATISTICS=ANOVA. From the menus: Analyze > Compare Means > Means.
Don't do a test for linearity; it's not meaningful for you.

Moving on,

. If the F-test says the groups differ, you'll likely want to know
which groups have significantly higher or lower means than which
others. That's called multiple comparison analysis, and is available in
command ONEWAY (menu Analyze > Compare Means > One-Way ANOVA). Select
"Post Hoc", and pick a test - try BONFERRONI first, but that's something
others on the list will know more about than I do.

. I don't know the shape of your length-of-stay distributions: Do they
cluster around a value? Or are there a lot of short stays, and a small
proportion of much longer ones? When you run your ANOVA, look for cell
means substantially larger than the medians; that can point to the
latter. If it's the case, you may want to use a non-parametric analogue
of ANOVA (the Kruskal-Wallis test, for example).
Or, I'd seriously consider log-transforming the data. There may be
other opinions on that, though. A lot of people, including me, are very
cautious about transforming data to make it look 'nice'. That said, the
log transform still feels like something to try, if the distributions
show lengthy tails.

Now, is that enough to get you going?

-Best of luck,
Richard
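
The steps above could be sketched in syntax roughly as follows. This is only an outline under assumptions: LOS holds the stay in days, and Group is a hypothetical variable coded 1 to 5 that places each patient in exactly one co-morbidity group. The +1 inside the log is also an assumption, added so that stays of 0 days do not become missing.

```spss
* Hypothetical variable names: LOS (days) and Group (codes 1 to 5).
* Descriptives by cell plus the one-way ANOVA table.
MEANS TABLES = LOS BY Group
  /CELLS = COUNT MEAN STDDEV MEDIAN MIN MAX
  /STATISTICS = ANOVA.

* Same ANOVA with Bonferroni multiple comparisons.
ONEWAY LOS BY Group
  /POSTHOC = BONFERRONI ALPHA(0.05).

* Rank-based alternative if the stays are strongly right-skewed.
NPAR TESTS /K-W = LOS BY Group(1 5).

* Log transform, then repeat the ANOVA on the transformed stays.
COMPUTE LnLOS = LN(LOS + 1).
EXECUTE.
ONEWAY LnLOS BY Group
  /POSTHOC = BONFERRONI ALPHA(0.05).
```

Comparing the mean and median columns from MEANS is a quick way to judge whether the rank-based or log-transformed runs are worth the trouble.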

Brock-15

Re: effect of diseases on length of stay

In reply to this post by Angshu Bhowmik

It seems to me that this is a perfect opportunity to use Survival
Analysis. I cannot speak to the specifics of the analysis, as I am just
starting to read about this myself, but from your description of the
problem, it closely resembles a textbook example of SA.

HTH,

~ Brock

Björn Türoque

Re: effect of diseases on length of stay

In reply to this post by Richard Ristow

I think Brock is correct. A good resource on how discrete-time survival
analysis works is located here:
http://www.ats.ucla.edu/stat/mplus/seminars/DiscreteTimeSurvival/default.htm
Watch the movies; they are informative. They use Mplus instead of SPSS, but
the concepts and ideas are similar. Additionally, the references at the end of
the slideshow and lecture point you to some good books on the topic.

Don


Richard Ristow

Re: effect of diseases on length of stay

In reply to this post by Angshu Bhowmik

This struck me as I wrote with methodological questions about this
problem.

At 11:59 AM 1/16/2007, Angshu Bhowmik wrote:

>I have data on lengths of hospital stay (LOS, in days) in about 100 patients.
>There are various co-morbidities: Disease B, Disease C etc which may
>be present or absent. There are in total 18 co-morbidities, but they
>could be grouped into 5 groups if that makes it easier to perform a
>more sensible analysis.
>
>I want to find out if any of the co-morbidities are increasing (or
>decreasing) LOS e.g. Does disease C increase length of stay
>independently (and if it is possible to find out, by how much)?

I wrote suggesting ANOVA; others wrote suggesting (reasonably) survival
analysis.

One question that at least I missed: You write "There are various
co-morbidities: Disease B, Disease C etc which may be present or
absent." I wrote assuming that patients had exactly one, or at most
one, of these. Is that so?

If you see patients with more than one comorbidity, it changes the analysis
and its complexities. It also makes sample-size requirements more
stringent. I wrote that 100 patients is reasonable for analyzing
differences among 5 groups, though not 18. If you have combinations,
though, you may have many more 'groups', i.e. sets of patients with the
same sets of diseases. You may have to go to n-way ANOVA (I don't think
the Survival procedures are good at this), or dummy-variable
regression.
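
A dummy-variable regression of that kind might be sketched as below. It assumes the five grouped co-morbidities are stored as 0/1 indicators named CoMo1 to CoMo5 (names chosen for illustration) and that a main-effects-only model is acceptable; it is a starting point, not a recommendation of the final model.

```spss
* Assumed names: LOS in days, CoMo1 to CoMo5 coded 0 = absent, 1 = present.
* Each B coefficient estimates the extra days of stay associated with
* one co-morbidity group, holding the other four constant.
REGRESSION
  /STATISTICS = COEFF R ANOVA CI
  /DEPENDENT = LOS
  /METHOD = ENTER CoMo1 CoMo2 CoMo3 CoMo4 CoMo5.
```

Because the predictors are 0/1 indicators, the confidence interval for each coefficient speaks directly to the "by how much" part of the question.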

Richard Ristow

Re: effect of diseases on length of stay

In reply to this post by Angshu Bhowmik

Follow-up to original thread "effect of diseases on length of stay"

At 11:59 AM 1/16/2007, Angshu Bhowmik wrote:

>I have data on lengths of hospital stay (LOS, in days) in about 100 patients.
>There are various co-morbidities: Disease B, Disease C etc which may
>be present or absent. I want to find out if any of the co-morbidities
>are increasing (or decreasing) LOS e.g. Does disease C increase
>length of stay independently (and if it is possible to find out, by
>how much)?

I suggested ANOVA, with the reservation that it might be necessary to
transform the data, mainly because length-of-stay curves tend to be
long-tailed. (That is, a majority of patients clustered around short to
moderate lengths of stay, with a minority having much longer ones.)

A couple of other posters suggested survival analysis, which is
certainly reasonable: a datum is, after all, duration to a terminal
event. (Don't tell the patients I'm calling discharge 'terminal'.)

Question, then: Survival analysis is necessary if you have censored
data, i.e. subjects of whom you know only that the terminal event
occurred after a certain date. But this set has no censored data.

Is there an active reason, then, not to use ANOVA? I ask because ANOVA
procedures generally have more flexibility, apart from handling
censoring, than do survival-analysis procedures.

-Ignorant, but anxious to learn,
Richard
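
For anyone who wants to compare the two routes on the same data, the survival-analysis side could be sketched as a Cox regression. The sketch assumes LOS is the time variable, CoMo1 to CoMo5 are hypothetical 0/1 indicators for the grouped co-morbidities, and every patient was discharged, so the status flag is simply 1 for all cases.

```spss
* No censoring assumed: the event (discharge) was observed for everyone.
COMPUTE Discharged = 1.
EXECUTE.

* Cox regression of time to discharge on the five indicators.
COXREG LOS
  /STATUS = Discharged(1)
  /METHOD = ENTER CoMo1 CoMo2 CoMo3 CoMo4 CoMo5
  /PRINT = CI(95).
```

Note the direction of the estimates: the event is discharge, so a hazard ratio above 1 points to faster discharge, i.e. shorter stays, and a ratio below 1 to longer stays.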

Angshu Bhowmik

Re: effect of diseases on length of stay

In reply to this post by Richard Ristow

----- Original Message -----
From: "Richard Ristow"

> This struck me as I wrote with methodological questions about this
> problem.
>
> At 11:59 AM 1/16/2007, Angshu Bhowmik wrote:
>
>>I have data on lengths of hospital stay (LOS, in days) in about 100 patients. There
>>are various co-morbidities: Disease B, Disease C etc which may be present
>>or absent. There are in total 18 co-morbidities, but they could be grouped
>>into 5 groups if that makes it easier to perform a more sensible analysis.
>>
>>I want to find out if any of the co-morbidities are increasing (or
>>decreasing) LOS e.g. Does disease C increase length of stay independently
>>(and if it is possible to find out, by how much)?
>
> I wrote suggesting ANOVA; others wrote suggesting (reasonably) survival
> analysis.
>
> One question that at least I missed: You write "There are various
> co-morbidities: Disease B, Disease C etc which may be present or absent."
> I wrote assuming that patients had exactly one, or at most one, of these.
> Is that so?

I am sorry for not being more clear. The patients may have one or more of
the co-morbidities - not just one.

>
> If you see patients with more than one comorbidity, it changes the analysis
> and its complexities. It also makes sample-size requirements more
> stringent. I wrote that 100 patients is reasonable for analyzing
> differences among 5 groups, though not 18. If you have combinations,
> though, you may have many more 'groups', i.e. sets of patients with the
> same sets of diseases. You may have to go to n-way ANOVA (I don't think
> the Survival procedures are good at this), or dummy-variable regression.
>

I see (sort of). I was still struggling a bit with your original
suggestions, but I think I have just about managed to get a grasp of them.
This sounds quite complicated though, and I don't know how to set about it.

I suppose one way to simplify it might be to categorise the lengths of stay
into 3 or 4 groups e.g. very short (0) , short (1), long (2) and extra-long
(3) or something like that and then use ordinal multinomial logistic
regression (if I am right). Would this be an acceptable way to do it? At the
end of the day, my aim is just to find out if one or more of the
comorbidities affect length of stay -- the problem is that when there are so
many possible combinations, then it is difficult to isolate the effects of
just one of the comorbidities at a time.

Many thanks to all of you for your valuable suggestions. This is helping me
to get an understanding of the principles involved. I have not yet tried the
survival analysis, but will attempt this over the weekend.

Thanks again,

Angshu

Art Kendall

Re: effect of diseases on length of stay

It is rarely useful to coarsen your dependent variable.
It might be worth your while to see whether a count of co-morbidities, or a
transformation of that count, is related to length of stay.

Art Kendall
Social Research Consultants


John-342

Re: effect of diseases on length of stay

In reply to this post by Angshu Bhowmik

Angshu, one issue you may want to consider is that hospital length of stay
may be affected by events other than the underlying disease and comorbid
conditions. Many if not most hospitals have a rather significant problem
placing patients into long term care or other facilities such as hospice and
rehab. The LTC issue is very problematic and tends to back up discharge from
acute care as patients may wait days for a free LTC bed to become available.
So if you have a patient in your dataset with significant comorbidities, it
is more likely they will go to other facilities after discharge from acute
care so this may influence your overall interpretation of your findings.

John Welton
Medical University of South Carolina
College of Nursing
Charleston, SC


Richard Ristow

Re: effect of diseases on length of stay

In reply to this post by Angshu Bhowmik

At 04:27 PM 1/23/2007, Angshu Bhowmik wrote:

>>>I want to find out if any of the co-morbidities are increasing (or
>>>decreasing) LOS e.g. Does disease C increase length of stay
>>>independently (and if it is possible to find out, by how much)?
>>
>>I wrote suggesting ANOVA; others wrote suggesting (reasonably)
>>survival analysis.
>>
>>One question that at least I missed: You write "There are various
>>co-morbidities: Disease B, Disease C etc which may be present or
>>absent." I wrote assuming that patients had exactly one, or at most
>>one, of these. Is that so?
>
>I am sorry for not being more clear. The patients may have one or more
>of the co-morbidities - not just one.

OK, so that increases the complexity, as I've described.

Something useful, though it won't solve it:

First, group your co-morbidities into the five groups. There's no
chance, otherwise.

Second, run counts and cell statistics for each group that occurs.
(More on how to do that, later.) It'll tell you, and us, a lot about
what the analytic possibilities are. In particular, about how
drastically your model may have to be simplified.

>I suppose one way to simplify it might be to categorise the lengths of
>stay into 3 or 4 groups e.g. very short (0) , short (1), long (2) and
>extra-long (3) or something like that and then use ordinal multinomial
>logistic regression (if I am right). Would this be an acceptable way
>to do it?

No. That throws away information, and doesn't gain you anything. It
does NOT simplify the model for analysis, even though it means fewer
numbers written down.

>My aim is just to find out if one or more of the comorbidities affect
>length of stay -- the problem is that when there are so many possible
>combinations, then it is difficult to isolate the effects of just one
>of the comorbidities at a time.

Well, it may come down to,
. 5-way by 2-level ANOVA, the grouping variables being presence/absence
of the five co-morbidity categories, probably suppressing all
interaction terms.
. Non-parametric analog of the above
. Survival-analysis analog of the above, if there is one

Hey, methodologists! Help! Can you see that I'm getting near the edge
of what I can say confidently?
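
A main-effects-only version of the first option might look like this in syntax. It is a sketch that assumes the five grouped indicators are named CoMo1 to CoMo5 and coded 0/1; leaving the interaction terms off the DESIGN subcommand is what pools them into the error term.

```spss
* Five two-level factors, main effects only (no interaction terms).
UNIANOVA LOS BY CoMo1 CoMo2 CoMo3 CoMo4 CoMo5
  /METHOD = SSTYPE(3)
  /PRINT = DESCRIPTIVE PARAMETER
  /DESIGN = CoMo1 CoMo2 CoMo3 CoMo4 CoMo5.
```

A two-way interaction could be tried by adding a term such as CoMo2*CoMo3 to the DESIGN list, provided enough patients have both conditions.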

Angshu Bhowmik

Re: effect of diseases on length of stay

In reply to this post by John-342

John,
You are quite right.
In many cases we have decided on a "medical" date of discharge (i.e. the
date the doctors decided that medical treatment had ended) and separately
recorded the actual date of discharge. Social problems leading to prolonged
stay is one of the "co-morbidities" I have included as well, so I am not
ignoring the effect you mention.
Thank you,
Angshu

Art Kendall

Re: effect of diseases on length of stay

In reply to this post by Richard Ristow

A lot depends on the nature of the problem that you are considering. Do
patients with the condition routinely walk out of the hospital if they
do not have co-morbidities? Or are they still in bed outside the
hospital? etc.

First, do scatter plots of LOS by N_of_Comorbidities with different
colored markers for ambulatory/not ambulatory (or perhaps 4 levels:
ambulatory, home bed/assisted, skilled nursing facility, deceased?).

There are notes interspersed below.

Richard Ristow wrote:
> At 04:27 PM 1/23/2007, Angshu Bhowmik wrote:
>
<snip>
>> I suppose one way to simplify it might be to categorise the lengths of
>> stay into 3 or 4 groups e.g. very short (0) , short (1), long (2) and
>> extra-long (3) or something like that and then use ordinal multinomial
>> logistic regression (if I am right). Would this be an acceptable way
>> to do it?
>
> No. That throws away information, and doesn't gain you anything. It
> does NOT simplify the model for analysis, even though it means fewer
> numbers written down.
Very well put.

>
>> <snip>
>
> Well, it may come down to,
> . 5-way by 2-level ANOVA, the grouping variables being presence/absence
> of the five co-morbidity categories, probably suppressing all
> interaction terms.
> . Non-parametric analog of the above
> . Survival-analysis analog of the above, if there is one
>
> Hey, methodologists! Help! Can you see that I'm getting near the edge
> of what I can say confidently?
>
>

In the 5-way ANOVA it will be necessary to ignore (suppress) the 5-way,
likely the 4-way, and possibly the 3-way interactions. You might be able
to estimate the 2-way interactions.
By "ignore" I mean pool them into the error (residual) term.

An additional way to look at the data is to use correlations.
You could create 2 additional variables: N_of_comorbidities and
discharged ambulatory/not ambulatory.
1) Use CORRELATIONS to get the simple (aka zero-order) correlations of
LOS with each of the other variables, "ignoring" the other independent
variables.
2) Use PARTIAL CORR to get the partial correlations of LOS with each
of the other variables, "eliminating" the 2 additional variables.

Put the results in a table with a row for each independent variable, a
column for the name of the variable, and columns for the zero-order and
partial correlations. Of course, the first two rows, representing the 2
new variables, will have n/a for the partial correlation.

Take your results with a grain of salt because your number of cases is
so small.

Art Kendall
Social Research Consultants
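
The plot and the two correlation runs described above could be sketched as follows. The variable names are assumptions made for illustration: CoMo1 to CoMo5 are 0/1 indicators for the grouped co-morbidities, NComorb is built here as their count, and Ambul (1 = discharged ambulatory, 0 = not) is assumed to exist in the file already.

```spss
* Number of co-morbidity groups present for each patient.
COUNT NComorb = CoMo1 TO CoMo5 (1).
EXECUTE.

* LOS against the count, with markers set by discharge status.
GRAPH /SCATTERPLOT(BIVAR) = NComorb WITH LOS BY Ambul.

* Zero-order correlations of LOS with every other variable.
CORRELATIONS
  /VARIABLES = LOS WITH CoMo1 TO CoMo5 NComorb Ambul.

* Second-order partials: LOS with each indicator, controlling for
* both additional variables at once.
PARTIAL CORR
  /VARIABLES = LOS WITH CoMo1 TO CoMo5 BY NComorb Ambul (2).
```

The two output tables supply the zero-order and partial columns for the summary table described above.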


Richard Ristow

Re: effect of diseases on length of stay

In reply to this post by Richard Ristow

At 07:27 PM 1/23/2007, I wrote:

>First, group your co-morbidities into the five groups. There's no
>chance, otherwise.
>
>Second, run counts and cell statistics for each group that occurs.
>(More on how to do that, later.) It'll tell you, and us, a lot about
>what the analytic possibilities are. In particular, about how
>drastically your model may have to be simplified.

In other words, I promised to post how to do descriptive statistics on
your data. There are a good many ways; here's one, using AGGREGATE and
LIST, that at least gives pretty compact output.
. Code uses datasets (SPSS 14 and later), but can easily be recast to
use scratch saved files.
. Code uses the MEDIAN statistic in AGGREGATE, which is highly
desirable but added only recently (SPSS 14, I think)
. I'd love to do statistics for the single co-morbidities, but I can't
think of a definition I like, when multiple co-morbidities can occur.
One very useful question for the descriptives is how common multiple
co-morbidities are - and which.

This is SPSS 15 draft output, using synthetic generated data

* Assumes dataset Summary was declared beforehand (DATASET DECLARE Summary.).
* One output row per combination of co-morbidity groups that occurs.
AGGREGATE OUTFILE=Summary
   /BREAK    = CoMo1 TO CoMo5
   /Patients 'Number of patients'  = N
   /Mean     'Mean Length of stay' = MEAN(LOS)
   /StdDev   'Std Dev of LOS'      = SD(LOS)
   /Median   'Median LOS'          = MEDIAN(LOS)
   /Min      'Shortest stay'       = MIN(LOS)
   /Max      'Longest stay'        = MAX(LOS).

DATASET ACTIVATE Summary.
FORMATS Mean StdDev    (F5.2)
       /Median Min Max (F3).

* Temporary string flags make the listing easier to read.
TEMPORARY.
STRING Morb01 TO Morb05 (A5).
RECODE CoMo1 TO CoMo5
          (1    = 'Prsnt')
          (0    = ' --- ')
          (ELSE = ' ??? ')
   INTO Morb01 TO Morb05.
LIST /VARIABLES = Morb01 TO Morb05
                  Patients TO Max
.
List
|-----------------------------|---------------------------|
|Output Created |24-JAN-2007 13:15:29 |
|-----------------------------|---------------------------|
[Summary]

Morb01 Morb02 Morb03 Morb04 Morb05 Patients  Mean StdDev Median Min Max

 ---    ---    ---    ---    ---        13    .00    .00      0   0   0
 ---    ---    ---    ---   Prsnt        1  11.00    .       11  11  11
 ---    ---    ---   Prsnt   ---         2   4.00    .00      4   4   4
 ---    ---   Prsnt   ---    ---         7   5.71   1.80      6   3   8
 ---    ---   Prsnt   ---   Prsnt        3  12.33   1.53     12  11  14
 ---    ---   Prsnt  Prsnt   ---         2   9.00   4.24      9   6  12
 ---   Prsnt   ---    ---    ---        25   3.20   1.47      3   1   6
 ---   Prsnt   ---    ---   Prsnt        3  12.67   1.53     13  11  14
 ---   Prsnt   ---   Prsnt   ---         9   9.44   1.67     10   7  12
 ---   Prsnt  Prsnt   ---    ---         9   8.44   3.13     10   4  12
 ---   Prsnt  Prsnt   ---   Prsnt        2  20.50    .71     21  20  21
 ---   Prsnt  Prsnt  Prsnt   ---         3  13.67   1.15     13  13  15
Prsnt   ---    ---    ---    ---         2   2.50    .71      3   2   3
Prsnt   ---    ---   Prsnt   ---         1   8.00    .        8   8   8
Prsnt   ---   Prsnt   ---    ---         1   8.00    .        8   8   8
Prsnt   ---   Prsnt   ---   Prsnt        1  16.00    .       16  16  16
Prsnt  Prsnt   ---    ---    ---         6   6.67   1.51      7   5   8
Prsnt  Prsnt   ---    ---   Prsnt        1  15.00    .       15  15  15
Prsnt  Prsnt   ---   Prsnt   ---         2  14.00   1.41     14  13  15
Prsnt  Prsnt  Prsnt   ---    ---         5  11.00   2.35     12   8  13
Prsnt  Prsnt  Prsnt  Prsnt   ---         2  18.00   5.66     18  14  22

Number of cases read: 21 Number of cases listed: 21
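
The question of how common multiple co-morbidities are can also be
tabulated directly from the patient-level file. What follows is only a
sketch, not tested against real data: it assumes the LenStay test
dataset from the appendix, and NumCoMo is a hypothetical variable name
introduced here for illustration.

```spss
* Sketch only: NumCoMo is a hypothetical name, not part of the original syntax.
DATASET ACTIVATE LenStay.

* Count how many of the five co-morbidity flags are set for each patient.
COUNT NumCoMo = CoMo1 TO CoMo5 (1).
FORMATS NumCoMo (F2).
VARIABLE LABELS NumCoMo 'Number of co-morbidities present'.

* How common is each number of co-morbidities.
FREQUENCIES VARIABLES = NumCoMo.

* Length of stay summarized by number of co-morbidities.
MEANS TABLES = LOS BY NumCoMo
  /CELLS = COUNT MEAN STDDEV MEDIAN MIN MAX.
```

This collapses the 21 observed patterns into at most six groups (0 to 5
co-morbidities), which loses the "which" part of the question but gives
larger cell sizes than the pattern-by-pattern listing.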

===================
APPENDIX: Test data
===================
NEW FILE.
INPUT PROGRAM.
. COMPUTE #NCases = 100 /* Desired number of cases */ .

. NUMERIC CaseID (F3).
. NUMERIC CoMo1 TO CoMo5 (F2).
. VAL LABELS CoMo1 TO CoMo5 0 'Absent' 1 'Present'.
. NUMERIC LOS (F4).
. VAR LABELS LOS 'Length of stay, days'.

. LOOP CaseID = 1 TO #NCases.
. RECODE CoMo1 TO CoMo5 (ELSE = 0).
. COMPUTE LOS = 0.
. DO IF RV.UNIFORM(0,1) LE 0.25.
. COMPUTE CoMo1 = 1.
. COMPUTE LOS = LOS + 1 + RV.BINOM(5,0.4).
. END IF.
. DO IF RV.UNIFORM(0,1) LE 0.60.
. COMPUTE CoMo2 = 1.
. COMPUTE LOS = LOS + 1 + RV.BINOM(7,2/5).
. END IF.
. DO IF RV.UNIFORM(0,1) LE 0.40.
. COMPUTE CoMo3 = 1.
. COMPUTE LOS = LOS + 1 + RV.BINOM(10,2/5).
. END IF.
. DO IF RV.UNIFORM(0,1) LE 0.20.
. COMPUTE CoMo4 = 1.
. COMPUTE LOS = LOS + 1 + RV.BINOM(12,2/5).
. END IF.
. DO IF RV.UNIFORM(0,1) LE 0.10.
. COMPUTE CoMo5 = 1.
. COMPUTE LOS = LOS + 1 + RV.BINOM(18,2/5).
. END IF.
. END CASE.
. END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME LenStay.

Dataset Name
|----------------------------|---------------------------|
|Output Created |24-JAN-2007 13:15:27 |
|----------------------------|---------------------------|

LIST.
List
|----------------------------|---------------------------|
|Output Created |24-JAN-2007 13:15:27 |
|----------------------------|---------------------------|
[LenStay]

CaseID CoMo1 CoMo2 CoMo3 CoMo4 CoMo5 LOS

1 0 1 1 0 0 11
2 0 0 0 0 0 0
3 1 1 0 0 1 15
4 0 1 0 0 0 5
5 0 1 0 1 0 12
6 0 1 0 1 0 9
7 0 0 0 0 0 0
8 0 0 0 1 0 4
9 1 1 1 0 0 13
10 0 1 0 0 0 2
11 0 0 0 0 0 0
12 1 1 0 1 0 15
13 1 0 0 1 0 8
14 0 0 0 0 1 11
15 0 1 0 0 0 1
16 0 1 0 1 0 7
17 1 1 0 0 0 5
18 0 1 1 0 0 11
19 0 1 1 0 0 10
20 0 0 1 0 0 7
21 0 1 0 0 0 5
22 0 1 0 0 0 1
23 0 1 0 0 0 3
24 0 0 1 0 0 5
25 0 1 0 0 0 3
26 0 0 1 0 1 12
27 1 1 0 0 0 8
28 0 1 1 0 0 6
29 0 0 1 0 0 7
30 0 1 1 1 0 15
31 1 1 0 0 0 8
32 1 1 1 1 0 22
33 1 1 0 0 0 5
34 1 1 0 0 0 8
35 0 0 1 0 0 4
36 1 0 1 0 0 8
37 0 0 0 0 0 0
38 1 1 1 0 0 12
39 0 1 0 0 1 13
40 0 1 1 0 1 20
41 0 1 0 0 0 3
42 0 1 0 0 0 3
43 0 1 0 0 0 3
44 0 0 0 0 0 0
45 0 0 1 1 0 12
46 0 1 0 0 0 2
47 0 0 0 0 0 0
48 0 1 0 1 0 7
49 0 0 1 0 0 3
50 1 0 1 0 1 16
51 0 0 1 0 1 14
52 0 1 0 0 1 14
53 0 1 0 1 0 11
54 1 0 0 0 0 2
55 0 1 0 0 0 4
56 0 1 0 0 0 4
57 0 1 1 0 0 5
58 0 1 0 1 0 10
59 0 1 1 1 0 13
60 0 0 1 0 0 6
61 1 1 0 0 0 6
62 0 1 0 0 0 2
63 0 1 0 1 0 10
64 1 1 1 1 0 14
65 0 0 1 0 1 11
66 0 1 0 1 0 10
67 0 0 0 0 0 0
68 0 1 0 0 0 2
69 0 1 0 0 0 3
70 0 1 0 0 0 4
71 0 1 0 0 0 6
72 0 0 0 0 0 0
73 0 1 1 0 0 12
74 0 1 0 0 0 5
75 0 0 0 0 0 0
76 1 0 0 0 0 3
77 0 1 0 0 0 5
78 0 1 0 0 0 1
79 0 0 1 1 0 6
80 0 0 0 0 0 0
81 1 1 1 0 0 13
82 0 1 0 1 0 9
83 0 0 0 0 0 0
84 0 0 1 0 0 8
85 0 1 1 0 1 21
86 0 1 0 0 0 4
87 0 1 0 0 1 11
88 0 1 1 0 0 6
89 1 1 0 1 0 13
90 0 1 1 0 0 4
91 0 1 0 0 0 1
92 1 1 1 0 0 9
93 0 0 0 0 0 0
94 0 1 1 1 0 13
95 0 0 0 1 0 4
96 0 1 1 0 0 11
97 0 1 0 0 0 3
98 0 1 0 0 0 5
99 0 0 0 0 0 0
100 1 1 1 0 0 8

Angshu Bhowmik

Re: effect of diseases on length of stay

In reply to this post by Art Kendall

----- Original Message -----
From: "Art Kendall" <[hidden email]>
To: <[hidden email]>
Sent: Wednesday, January 24, 2007 1:40 PM
Subject: Re: effect of diseases on length of stay

>A lot depends on the nature of the problem that you are considering. Do
> patients with the condition routinely walk out of the hospital if they
> do not have co-morbidities? or Are they still in bed outside the
> hospital? etc.

Patients with the condition normally get discharged within about 3 - 5 days
if they do not have co-morbidities and may well walk out. They are not
usually bed bound once they have recovered from their acute illness. In
almost all cases, it is co-morbidities which seem to prolong length of stay.

I shall try your suggestions below.

Thank you all very much for your advice.
Angshu

>
> First, do scatter plots of LOS by N_of_Comorbidities with different
> colored markers for ambulatory/not ambulatory. (perhaps 4 levels
> ambulatory, home bed/assisted, skilled nursing facility, deceased.??)
>
> There are notes interspersed below.
>
> Richard Ristow wrote:
>> At 04:27 PM 1/23/2007, Angshu Bhowmik wrote:
>>
> <snip>
>>> I suppose one way to simplify it might be to categorise the lengths of
>>> stay into 3 or 4 groups e.g. very short (0) , short (1), long (2) and
>>> extra-long (3) or something like that and then use ordinal multinomial
>>> logistic regression (if I am right). Would this be an acceptable way
>>> to do it?
>>
>> No. That throws away information, and doesn't gain you anything. It
>> does NOT simplify the model for analysis, even though it means fewer
>> numbers written down.
> Very well put.
>>
>>> <snip>
>>
>> Well, it may come down to,
>> . 5-way by 2-level ANOVA, the grouping variables being presence/absence
>> of the five co-morbidity categories, probably suppressing all
>> interaction terms.
>> . Non-parametric analog of the above
>> . Survival-analysis analog of the above, if there is one
>>
>> Hey, methodologists! Help! Can you see that I'm getting near the edge
>> of what I can say confidently?
>>
>>
> In the 5 way ANOVA it will be necessary to ignore (suppress) 5 way,
> likely 4 way, and possibly 3 way interactions. You might be able to get
> the 2-way.
> By "ignore" I mean pool them into the error (residual) term.
>
> An additional way to look at the data is to use correlations.
> You could create 2 additional variables: N_of_comorbidities, and
> discharged ambulatory/not ambulatory.
> 1) Use CORRELATIONS to get the simple (aka zero-order) correlations of
> LOS with each of the other variables "ignoring" the other independent
> variables.
> 2) Use PARTIAL CORR to get the partial correlations of LOS with each
> of the other variables "eliminating" the 2 additional variables.
>
> Put the results in a table with a row for each independent variable, a
> column for the name of the variable, and columns for the zero-order and
> partial correlations. Of course the first two rows, representing the 2
> new variables, will have n/a for the partial correlation.
>
>
> Take your results with a grain of salt because your number of cases is
> so small.
>
> Art Kendall
> Social Research Consultants
>
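
For anyone following along, the correlation and main-effects ANOVA
suggestions quoted above might be set up roughly as follows. This is a
sketch under assumptions, not a tested analysis: it uses the variable
names from the synthetic LenStay test data posted earlier in the thread
(CoMo1 TO CoMo5, LOS), and N_of_Comorbidities is computed here with
COUNT. The test data has no discharge-status (ambulatory/not ambulatory)
variable, so only the count is partialled out.

```spss
* Sketch only. Assumes the LenStay test data posted earlier in the thread.
DATASET ACTIVATE LenStay.
COUNT N_of_Comorbidities = CoMo1 TO CoMo5 (1).

* 1) Zero-order correlations of LOS with each of the other variables.
CORRELATIONS
  /VARIABLES = LOS WITH CoMo1 TO CoMo5 N_of_Comorbidities
  /PRINT = TWOTAIL NOSIG.

* 2) Partial correlations, controlling for the number of co-morbidities.
PARTIAL CORR
  /VARIABLES = LOS WITH CoMo1 TO CoMo5 BY N_of_Comorbidities
  /SIGNIFICANCE = TWOTAIL.

* Main-effects-only ANOVA, with all interactions pooled into the residual.
UNIANOVA LOS BY CoMo1 CoMo2 CoMo3 CoMo4 CoMo5
  /METHOD = SSTYPE(3)
  /DESIGN = CoMo1 CoMo2 CoMo3 CoMo4 CoMo5.
```

Two-way terms could be added to /DESIGN (for example CoMo2*CoMo3) if the
cell counts support them; with real data a discharge-status variable
would be added after BY in the PARTIAL CORR command.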