Dear All:
I need help with references on the following questions:

1. Sample size determination.
2. Minimum size for a control group that is sufficient for comparison. For example, for an experimental group of 450, what is the minimum control group size that is necessary?

Thanks,
Muyiwa.

Everyday is a Gift!
Muyiwa Oladosu, PhD
Principal Associate/Consultant, MiraMonitor Consulting, LLC
P.O. Box 3239, Frederick, MD 21705, USA
Office: 240.723.1527, Fax: 301-695-5386
emails: [hidden email], [hidden email]
www.m2cnig.com www.miramic.com |
Muyiwa,
You simply have to provide more information. And, with respect, I suggest that you do some reading on statistical power. Please respond to the list, as I may not have time to help you with what you are wanting.

Tell us:
1) What statistical test are you talking about? T-test? Chi-square? Correlation?
2) What is your alternative hypothesis? That is, if you were doing a correlation, what is the size of the correlation that you want to be confident of being able to find? This value should be stated in effect size terms.
3) How confident do you want to be of being able to find an effect of that size (that is, what power do you want)? The usual value in social science research is 80%.

Gene Maguin |
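As a rough illustration of how the answers to those questions turn into a control-group size, here is a minimal sketch in SPSS syntax. It assumes a two-sided comparison of two group means under the normal approximation. The effect size d = 0.2 (in standard-deviation units), alpha = 0.05 and 80% power are illustrative assumptions only, not values from the original question, and all variable names are hypothetical. The relation used is 1/n1 + 1/n2 = (d / (z_alpha + z_power))**2, solved for n2 with n1 fixed at 450.

```spss
* Hypothetical inputs: experimental n, effect size d, alpha, power.
DATA LIST FREE / n1 d alpha pwr.
BEGIN DATA
450 0.2 0.05 0.80
END DATA.

* Sum of the two normal quantiles (two-sided alpha, desired power).
COMPUTE z_sum = IDF.NORMAL(1 - alpha/2, 0, 1) + IDF.NORMAL(pwr, 0, 1).
* Part of the required precision that the control group must supply.
COMPUTE inv_n2 = (d / z_sum)**2 - 1/n1.
* If inv_n2 is not positive, no control group is large enough and n2 stays missing.
DO IF (inv_n2 > 0).
COMPUTE n2 = TRUNC(1/inv_n2) + 1.
END IF.
FORMATS n2 (F8.0).
LIST.
```

With these illustrative inputs n2 comes out at about 348. Under the same assumptions, an effect size below roughly 0.13 cannot be detected with 80% power by 450 experimental cases however large the control group is. A t-test or a comparison of proportions would need its own formula or a dedicated power calculator.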
There are numerous sample size and power calculators available, including a couple linked from my Web site (Services tab, then scroll to the bottom).

Will
Statistical Services
============
info.statman@earthlink.net
http://home.earthlink.net/~z_statman/
============ |
Hello,

I apologise in advance for my ignorance - about to be exposed by my question. I am a doctor - not a statistician.

I have data on lengths of hospital stay (LOS, in days) in about 100 patients with Disease A (always present i.e. 1). There are various co-morbidities: Disease B, Disease C etc which may be present or absent. There are in total 18 co-morbidities, but they could be grouped into 5 groups if that makes it easier to perform a more sensible analysis.

LOS  A  B  C  D
 12  1  0  1  1
 13  1  1  0  1
  4  1  1  1  0
etc

I want to find out if any of the co-morbidities are increasing (or decreasing) LOS e.g. Does disease C increase length of stay independently (and if it is possible to find out, by how much)?

I have used SPSS to do simple tests and I know how to run syntax and create it by using "paste" but have little knowledge of more complicated methods of using SPSS (nor of stats for that matter). Nonetheless, since I have SPSS at work, it is convenient for me to use.

I think I can't use logistic regression unless my LOS is categorised into "long" or "short". I don't think the data fulfils the criteria for multiple regression (but I am willing to be corrected).

Can anyone tell me what test I should perform (and if it is not something found in the "analysis" menu, then also how to do it)?

Many thanks to anyone who has the time to answer this.

Angshu Bhowmik
London, UK |
Well, I'll start, but others can add more than I can.
At 11:59 AM 1/16/2007, Angshu Bhowmik wrote:

>I have data on lengths of hospital stay (LOS, in days) in about 100
>patients with Disease A (always present i.e. 1). There are various
>co-morbidities: Disease B, Disease C etc which may be present or
>absent. There are in total 18 co-morbidities, but they could be
>grouped into 5 groups if that makes it easier to perform a more
>sensible analysis.

That grouping is the only way to do it. It gives you a mean of 20 patients per group, which is reasonable. Keep all 18 co-morbidities, and you have a mean of 100/18, or about 5.6, which isn't going to be enough.

Then, you're comparing means between groups: that's a one-way analysis of variance. In SPSS, command MEANS is a good place to start. The syntax is easy, and it allows a lot of descriptive statistics by cell. For descriptives, I'd select COUNT, MEAN, STDDEV, MEDIAN, MIN and MAX, with /STATISTICS=ANOVA. From the menus: Analyze > Compare Means > Means. Don't do a test for linearity; it's not meaningful for you.

Moving on,

. If the F-test says the groups differ, you'll likely want to know which groups have significantly higher or lower means than which others. That's called multiple comparison analysis, and is available in command ONEWAY (menu Analyze > Compare Means > One-Way ANOVA). Select "Post Hoc", and pick a test - try BONFERRONI first, but that's something others on the list will know more about than I do.

. I don't know the shape of your length-of-stay distributions: Do they cluster around a value? Or are there a lot of short stays, and a small proportion of much longer ones? When you run your ANOVA, look for cell means substantially larger than the medians; that can point to the latter. If it's the case, you may want to use a non-parametric ANOVA. Or, I'd seriously consider log-transforming the data. There may be other opinions on that, though. A lot of people, including me, are very cautious about transforming data to make it look 'nice'. That said, the log transform still feels like something to try, if the distributions show lengthy tails.

Now, is that enough to get you going?

-Best of luck,
 Richard |
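The steps Richard describes can be sketched in syntax roughly as follows. This assumes each patient falls in exactly one of five groups, coded 1 to 5 in a variable called Group, with length of stay in LOS; both names are hypothetical. Kruskal-Wallis is used here as one common choice of non-parametric one-way test, which is an assumption about what "non-parametric ANOVA" would mean in practice.

```spss
* Assumes LOS (days) and Group (coded 1 to 5, one group per patient).
* Descriptives by group, with the one-way ANOVA F-test.
MEANS TABLES=LOS BY Group
  /CELLS=COUNT MEAN STDDEV MEDIAN MIN MAX
  /STATISTICS=ANOVA.

* One-way ANOVA with Bonferroni multiple comparisons.
ONEWAY LOS BY Group
  /STATISTICS=DESCRIPTIVES HOMOGENEITY
  /POSTHOC=BONFERRONI ALPHA(0.05).

* Non-parametric alternative: Kruskal-Wallis test across groups 1 to 5.
NPAR TESTS
  /K-W=LOS BY Group(1 5).

* Log transform for long-tailed stays, then repeat the ANOVA.
* Adding 1 is an assumption that keeps zero-day stays defined.
COMPUTE LogLOS = LN(LOS + 1).
ONEWAY LogLOS BY Group
  /POSTHOC=BONFERRONI ALPHA(0.05).
```

Comparing the MEAN and MEDIAN columns from MEANS gives the quick check for long tails mentioned above, before deciding between the raw and log-transformed analyses.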
In reply to this post by Angshu Bhowmik
It seems to me that this is a perfect opportunity to use survival analysis. I cannot speak to the specifics of the analysis, as I am just starting to read about this myself, but from your description of the problem, it closely resembles a textbook example of survival analysis (SA).

HTH,
~ Brock |
In reply to this post by Richard Ristow
I think Brock is correct. A good resource on how discrete-time survival analysis works is located here:

http://www.ats.ucla.edu/stat/mplus/seminars/DiscreteTimeSurvival/default.htm

Watch the movies; they are informative. They use Mplus instead of SPSS, but the concepts and ideas are similar. Additionally, the references at the end of the slideshow and lecture point you to some good books on the topic.

Don |
In reply to this post by Angshu Bhowmik
This struck me as I wrote with methodological questions about this problem.

At 11:59 AM 1/16/2007, Angshu Bhowmik wrote:

>I have data on lengths of hospital stay (LOS, in days) in about 100.
>There are various co-morbidities: Disease B, Disease C etc which may
>be present or absent. There are in total 18 co-morbidities, but they
>could be grouped into 5 groups if that makes it easier to perform a
>more sensible analysis.
>
>I want to find out if any of the co-morbidities are increasing (or
>decreasing) LOS e.g. Does disease C increase length of stay
>independently (and if it is possible to find out, by how much)?

I wrote suggesting ANOVA; others wrote suggesting (reasonably) survival analysis.

One question that at least I missed: You write "There are various co-morbidities: Disease B, Disease C etc which may be present or absent." I wrote assuming that patients had exactly one, or at most one, of these. Is that so?

If you see patients with more than one comorbidity, it changes the analysis and its complexities. It also makes sample-size requirements more stringent. I wrote that 100 patients is reasonable for analyzing differences among 5 groups, though not 18. If you have combinations, though, you may have many more 'groups', i.e. sets of patients with the same sets of diseases. You may have to go to n-way ANOVA (I don't think the Survival procedures are good at this), or dummy-variable regression. |
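A minimal sketch of the dummy-variable regression mentioned above, in SPSS syntax. It assumes the five grouped co-morbidities are stored as 0/1 indicators under the hypothetical names CoMo1 to CoMo5, and that LOS holds length of stay in days. With five indicators there can be up to 2**5 = 32 distinct combinations, far too many cells for about 100 patients, so the sketch estimates main effects only.

```spss
* Assumes LOS (days) and 0/1 indicators CoMo1 to CoMo5 (hypothetical names).
* Main effects only: no interaction terms are entered.
REGRESSION
  /STATISTICS=COEFF CI R ANOVA
  /DEPENDENT LOS
  /METHOD=ENTER CoMo1 CoMo2 CoMo3 CoMo4 CoMo5.
```

Under this model each B coefficient is the estimated difference in LOS, in days, between patients with and without that co-morbidity group, holding the other four fixed. That is one way of reading "independently, and by how much", subject to the usual regression assumptions about the residuals.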
In reply to this post by Angshu Bhowmik
Follow-up to original thread "effect of diseases on length of stay"

At 11:59 AM 1/16/2007, Angshu Bhowmik wrote:

>I have data on lengths of hospital stay (LOS, in days) in about 100.
>There are various co-morbidities: Disease B, Disease C etc which may
>be present or absent. I want to find out if any of the co-morbidities
>are increasing (or decreasing) LOS e.g. Does disease C increase
>length of stay independently (and if it is possible to find out, by
>how much)?

I suggested ANOVA, with the reservation that it might be necessary to transform the data, mainly because length-of-stay distributions tend to be long-tailed. (That is, a majority of patients clustered around short to moderate lengths of stay, with a minority having much longer ones.)

A couple of other posters suggested survival analysis, which is certainly reasonable: a datum is, after all, duration to a terminal event. (Don't tell the patients I'm calling discharge 'terminal'.)

Question, then: Survival analysis is necessary if you have censored data, i.e. subjects of whom you know only that the terminal event occurred after a certain date. But this set has no censored data. Is there an active reason, then, not to use ANOVA? I ask because ANOVA procedures generally have more flexibility, apart from handling censoring, than do survival-analysis procedures.

-Ignorant, but anxious to learn,
 Richard |
In reply to this post by Richard Ristow
----- Original Message -----
From: "Richard Ristow"

> One question that at least I missed: You write "There are various
> co-morbidities: Disease B, Disease C etc which may be present or absent."
> I wrote assuming that patients had exactly one, or at most one, of these.
> Is that so?

I am sorry for not being more clear. The patients may have one or more of the co-morbidities - not just one.

> If you see patients with more than one comorbidity, it changes the analysis
> and its complexities. It also makes sample-size requirements more
> stringent. I wrote that 100 patients is reasonable for analyzing
> differences among 5 groups, though not 18. If you have combinations,
> though, you may have many more 'groups', i.e. sets of patients with the
> same sets of diseases. You may have to go to n-way ANOVA (I don't think
> the Survival procedures are good at this), or dummy-variable regression.

I see (sort of). I was still struggling a bit with your original suggestions, but I think I have just about managed to get a grasp of them. This sounds quite complicated though, and I don't know how to set about it.

I suppose one way to simplify it might be to categorise the lengths of stay into 3 or 4 groups e.g. very short (0), short (1), long (2) and extra-long (3) or something like that and then use ordinal multinomial logistic regression (if I am right). Would this be an acceptable way to do it? At the end of the day, my aim is just to find out if one or more of the comorbidities affect length of stay -- the problem is that when there are so many possible combinations, then it is difficult to isolate the effects of just one of the comorbidities at a time.

Many thanks to all of you for your valuable suggestions. This is helping me to get an understanding of the principles involved. I have not yet tried the survival analysis, but will attempt this over the weekend.

Thanks again,

Angshu |
It is rarely useful to coarsen your dependent variable.

It might be worth your while to see whether a count of the co-morbidities, or a transformation of that count, is related to length of stay.

Art Kendall
Social Research Consultants |
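One way to try Art's suggestion, as a sketch only: count how many of the grouped co-morbidities each patient has, then look at how LOS varies with that count. The indicator names CoMo1 to CoMo5, the new variables N_CoMorb and SqrtCoMorb, and the choice of a square-root transformation are all assumptions made for illustration.

```spss
* Assumes LOS and five 0/1 indicators CoMo1 to CoMo5 (hypothetical names).
COUNT N_CoMorb = CoMo1 CoMo2 CoMo3 CoMo4 CoMo5 (1).
VARIABLE LABELS N_CoMorb 'Number of co-morbidity groups present'.

* Mean LOS at each count, with a test for linear trend across counts.
MEANS TABLES=LOS BY N_CoMorb
  /CELLS=COUNT MEAN STDDEV MEDIAN
  /STATISTICS=LINEARITY.

* A transformed count, here the square root, as one possible alternative.
COMPUTE SqrtCoMorb = SQRT(N_CoMorb).
CORRELATIONS
  /VARIABLES=LOS WITH N_CoMorb SqrtCoMorb.
```

This keeps LOS as a continuous outcome rather than cutting it into categories, which is the point of the advice against coarsening.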
In reply to this post by Angshu Bhowmik
Angshu, one issue you may want to consider is that hospital length of stay may be affected by events other than the underlying disease and comorbid conditions. Many if not most hospitals have a rather significant problem placing patients into long-term care (LTC) or other facilities such as hospice and rehab. The LTC issue is very problematic and tends to back up discharge from acute care, as patients may wait days for a free LTC bed to become available. So a patient in your dataset with significant comorbidities is more likely to go to another facility after discharge from acute care, and this may influence your overall interpretation of your findings.

John Welton
Medical University of South Carolina
College of Nursing
Charleston, SC |
In reply to this post by Angshu Bhowmik
At 04:27 PM 1/23/2007, Angshu Bhowmik wrote:

>>>I want to find out if any of the co-morbidities are increasing (or
>>>decreasing) LOS e.g. Does disease C increase length of stay
>>>independently (and if it is possible to find out, by how much)?
>>
>>I wrote suggesting ANOVA; others wrote suggesting (reasonably)
>>survival analysis.
>>
>>One question that at least I missed: You write "There are various
>>co-morbidities: Disease B, Disease C etc which may be present or
>>absent." I wrote assuming that patients had exactly one, or at most
>>one, of these. Is that so?
>
>I am sorry for not being more clear. The patients may have one or more
>of the co-morbidities - not just one.

OK, so that increases the complexity, as I've described. Something useful, though it won't solve it:

First, group your co-morbidities into the five groups. There's no chance, otherwise.

Second, run counts and cell statistics for each group that occurs. (More on how to do that, later.) It'll tell you, and us, a lot about what the analytic possibilities are. In particular, about how drastically your model may have to be simplified.

>I suppose one way to simplify it might be to categorise the lengths of
>stay into 3 or 4 groups e.g. very short (0) , short (1), long (2) and
>extra-long (3) or something like that and then use ordinal multinomial
>logistic regression (if I am right). Would this be an acceptable way
>to do it?

No. That throws away information, and doesn't gain you anything. It does NOT simplify the model for analysis, even though it means fewer numbers written down.

>My aim is just to find out if one or more of the comorbidities affect
>length of stay -- the problem is that when there are so many possible
>combinations, then it is difficult to isolate the effects of just one
>of the comorbidities at a time.

Well, it may come down to,
. 5-way by 2-level ANOVA, the grouping variables being presence/absence of the five co-morbidity categories, probably suppressing all interaction terms.
. Non-parametric analog of the above
. Survival-analysis analog of the above, if there is one

Hey, methodologists! Help! Can you see that I'm getting near the edge of what I can say confidently? |
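For concreteness, the first and third options in that list might look like the sketch below. It assumes LOS in days and five 0/1 indicators named CoMo1 to CoMo5. The status variable Discharged is hypothetical and is set to 1 for everyone because the data contain no censoring. Cox regression is offered as one plausible survival-analysis analog, not as the only one.

```spss
* Assumes LOS (days) and five 0/1 indicators CoMo1 to CoMo5.
* Main-effects-only ANOVA: every interaction is pooled into the residual.
UNIANOVA LOS BY CoMo1 CoMo2 CoMo3 CoMo4 CoMo5
  /METHOD=SSTYPE(3)
  /PRINT=PARAMETER
  /DESIGN=CoMo1 CoMo2 CoMo3 CoMo4 CoMo5.

* Survival-analysis analog: Cox regression with no censoring.
* Discharged is a hypothetical status flag, 1 for every patient.
COMPUTE Discharged = 1.
COXREG LOS
  /STATUS=Discharged(1)
  /METHOD=ENTER CoMo1 CoMo2 CoMo3 CoMo4 CoMo5
  /PRINT=CI(95).
```

The two outputs read in opposite directions. In the ANOVA a positive parameter means longer stays, whereas in the Cox model the event is discharge, so an Exp(B) above 1 means faster discharge and therefore shorter stays.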
In reply to this post by John-342
John,
You are quite right. In many cases we have decided on a "medical" date of discharge (i.e. the date the doctors decided that medical treatment had ended) and separately recorded the actual date of discharge. Social problems leading to prolonged stay are included as one of the "co-morbidities" as well, so I am not ignoring the effect you mention.

Thank you,
Angshu |
In reply to this post by Richard Ristow
A lot depends on the nature of the problem that you are considering. Do patients with the condition routinely walk out of the hospital if they do not have co-morbidities? Or are they still in bed outside the hospital? Etc.

First, do scatter plots of LOS by N_of_Comorbidities with different colored markers for ambulatory/not ambulatory (perhaps 4 levels: ambulatory, home bed/assisted, skilled nursing facility, deceased?).

There are notes interspersed below.

Richard Ristow wrote:
> At 04:27 PM 1/23/2007, Angshu Bhowmik wrote:
> <snip>
>> I suppose one way to simplify it might be to categorise the lengths of
>> stay into 3 or 4 groups e.g. very short (0) , short (1), long (2) and
>> extra-long (3) or something like that and then use ordinal multinomial
>> logistic regression (if I am right). Would this be an acceptable way
>> to do it?
>
> No. That throws away information, and doesn't gain you anything. It
> does NOT simplify the model for analysis, even though it means fewer
> numbers written down.

Very well put.

>> <snip>
>
> Well, it may come down to,
> . 5-way by 2-level ANOVA, the grouping variables being presence/absence
> of the five co-morbidity categories, probably suppressing all
> interaction terms.
> . Non-parametric analog of the above
> . Survival-analysis analog of the above, if there is one
>
> Hey, methodologists! Help! Can you see that I'm getting near the edge
> of what I can say confidently?

[You would have to ignore the 5-way,] likely 4-way, and possibly 3-way interactions. You might be able to get the 2-way. By "ignore" I mean pool them into the error (residual) term.

An additional way to look at the data is to use correlations. You could create 2 additional variables: N_of_comorbidities, and discharged ambulatory/not ambulatory.

1) Use CORRELATIONS to get the simple (aka zero-order) correlations of LOS with each of the other variables, "ignoring" the other independent variables.
2) Use PARTIAL CORR to get the partial correlations of LOS with each of the other variables, "eliminating" the 2 additional variables.

Put the results in a table with a row for each independent variable, a column for the name of the variable, and columns for the zero-order and partial correlations. Of course the first two rows, representing the 2 new variables, will have n/a for the partial correlation.

Take your results with a grain of salt because your number of cases is so small.

Art Kendall
Social Research Consultants |
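Art's scatter plot and the two correlation steps could be written along these lines. This is a sketch under assumptions: CoMo1 to CoMo5 are 0/1 indicators, N_CoMorb is their count, and Ambul is a hypothetical 0/1 flag for discharged ambulatory. None of these names comes from the original data.

```spss
* Assumes LOS, 0/1 indicators CoMo1 to CoMo5, and Ambul (1 = ambulatory).
* All variable names here are hypothetical.
COUNT N_CoMorb = CoMo1 CoMo2 CoMo3 CoMo4 CoMo5 (1).

* Scatter plot of LOS by number of co-morbidities, markers set by Ambul.
GRAPH
  /SCATTERPLOT(BIVAR)=N_CoMorb WITH LOS BY Ambul.

* 1) Zero-order correlations of LOS with every predictor.
CORRELATIONS
  /VARIABLES=LOS WITH CoMo1 CoMo2 CoMo3 CoMo4 CoMo5 N_CoMorb Ambul
  /PRINT=TWOTAIL NOSIG.

* 2) Partial correlations, controlling for the two additional variables.
PARTIAL CORR
  /VARIABLES=LOS WITH CoMo1 CoMo2 CoMo3 CoMo4 CoMo5 BY N_CoMorb Ambul
  /SIGNIFICANCE=TWOTAIL.
```

The two sets of coefficients can then be copied side by side into the summary table Art describes, one row per independent variable.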
In reply to this post by Richard Ristow
At 07:27 PM 1/23/2007, I wrote:

>First, group your co-morbidities into the five groups. There's no
>chance, otherwise.
>
>Second, run counts and cell statistics for each group that occurs.
>(More on how to do that, later.) It'll tell you, and us, a lot about
>what the analytic possibilities are. In particular, about how
>drastically your model may have to be simplified.

In other words, I promised to post how to do descriptive statistics on your data. There are a good many ways; here's one, using AGGREGATE and LIST, that at least gives pretty compact output.

. Code uses datasets (SPSS 14 and later), but can easily be recast to use scratch saved files.
. Code uses the MEDIAN statistic in AGGREGATE, which is highly desirable but added only recently (SPSS 14, I think)
. I'd love to do statistics for the single co-morbidities, but I can't think of a definition I like, when multiple co-morbidities can occur. One very useful question for the descriptives is how common multiple co-morbidities are - and which.

This is SPSS 15 draft output, using synthetic generated data

*  Summary must be declared before AGGREGATE can write to it as a dataset.
DATASET DECLARE  Summary.
AGGREGATE OUTFILE=Summary
   /BREAK = CoMo1 TO CoMo5
   /Patients 'Number of patients'  = N
   /Mean     'Mean Length of stay' = MEAN(LOS)
   /StdDev   'Std Dev of LOS'      = SD(LOS)
   /Median   'Median LOS'          = MEDIAN(LOS)
   /Min      'Shortest stay'       = MIN(LOS)
   /Max      'Longest stay'        = MAX(LOS).

DATASET ACTIVATE Summary.
FORMATS Mean StdDev    (F5.2)
       /Median Min Max (F3).

TEMPORARY.
STRING Morb01 TO Morb05 (A5).
RECODE CoMo1  TO CoMo5
          (1    = 'Prsnt')
          (0    = ' --- ')
          (ELSE = ' ??? ')
   INTO Morb01 TO Morb05.
LIST
   /VARIABLES = Morb01 TO Morb05 Patients TO Max .

List
|-----------------------------|---------------------------|
|Output Created               |24-JAN-2007 13:15:29       |
|-----------------------------|---------------------------|
[Summary]

Morb01 Morb02 Morb03 Morb04 Morb05 Patients  Mean StdDev Median Min Max

 ---    ---    ---    ---    ---         13   .00    .00      0   0   0
 ---    ---    ---    ---   Prsnt         1 11.00    .       11  11  11
 ---    ---    ---   Prsnt   ---          2  4.00    .00      4   4   4
 ---    ---   Prsnt   ---    ---          7  5.71   1.80      6   3   8
 ---    ---   Prsnt   ---   Prsnt         3 12.33   1.53     12  11  14
 ---    ---   Prsnt  Prsnt   ---          2  9.00   4.24      9   6  12
 ---   Prsnt   ---    ---    ---         25  3.20   1.47      3   1   6
 ---   Prsnt   ---    ---   Prsnt         3 12.67   1.53     13  11  14
 ---   Prsnt   ---   Prsnt   ---          9  9.44   1.67     10   7  12
 ---   Prsnt  Prsnt   ---    ---          9  8.44   3.13     10   4  12
 ---   Prsnt  Prsnt   ---   Prsnt         2 20.50    .71     21  20  21
 ---   Prsnt  Prsnt  Prsnt   ---          3 13.67   1.15     13  13  15
Prsnt   ---    ---    ---    ---          2  2.50    .71      3   2   3
Prsnt   ---    ---   Prsnt   ---          1  8.00    .        8   8   8
Prsnt   ---   Prsnt   ---    ---          1  8.00    .        8   8   8
Prsnt   ---   Prsnt   ---   Prsnt         1 16.00    .       16  16  16
Prsnt  Prsnt   ---    ---    ---          6  6.67   1.51      7   5   8
Prsnt  Prsnt   ---    ---   Prsnt         1 15.00    .       15  15  15
Prsnt  Prsnt   ---   Prsnt   ---          2 14.00   1.41     14  13  15
Prsnt  Prsnt  Prsnt   ---    ---          5 11.00   2.35     12   8  13
Prsnt  Prsnt  Prsnt  Prsnt   ---          2 18.00   5.66     18  14  22

Number of cases read:  21    Number of cases listed:  21

===================
APPENDIX: Test data
===================
NEW FILE.
INPUT PROGRAM.
.  COMPUTE #NCases = 100 /* Desired number of cases */ .
.  NUMERIC    CaseID         (F3).
.  NUMERIC    CoMo1 TO CoMo5 (F2).
.  VAL LABELS CoMo1 TO CoMo5 0 'Absent' 1 'Present'.
.  NUMERIC    LOS            (F4).
.  VAR LABELS LOS 'Length of stay, days'.
.  LOOP CaseID = 1 TO #NCases.
.     RECODE CoMo1 TO CoMo5 (ELSE = 0).
.     COMPUTE LOS = 0.
.     DO IF RV.UNIFORM(0,1) LE 0.25.
.        COMPUTE CoMo1 = 1.
.        COMPUTE LOS   = LOS + 1 + RV.BINOM(5,0.4).
.     END IF.
.     DO IF RV.UNIFORM(0,1) LE 0.60.
.        COMPUTE CoMo2 = 1.
.        COMPUTE LOS   = LOS + 1 + RV.BINOM(7,2/5).
.     END IF.
.     DO IF RV.UNIFORM(0,1) LE 0.40.
.        COMPUTE CoMo3 = 1.
.        COMPUTE LOS   = LOS + 1 + RV.BINOM(10,2/5).
.     END IF.
.     DO IF RV.UNIFORM(0,1) LE 0.20.
.        COMPUTE CoMo4 = 1.
.        COMPUTE LOS   = LOS + 1 + RV.BINOM(12,2/5).
.     END IF.
.     DO IF RV.UNIFORM(0,1) LE 0.10.
.        COMPUTE CoMo5 = 1.
.        COMPUTE LOS   = LOS + 1 + RV.BINOM(18,2/5).
.     END IF.
.     END CASE.
.  END LOOP.
END FILE.
END INPUT PROGRAM.
DATASET NAME LenStay.
Dataset Name
|----------------------------|---------------------------|
|Output Created              |24-JAN-2007 13:15:27       |
|----------------------------|---------------------------|

LIST.

List
|----------------------------|---------------------------|
|Output Created              |24-JAN-2007 13:15:27       |
|----------------------------|---------------------------|
[LenStay]

CaseID CoMo1 CoMo2 CoMo3 CoMo4 CoMo5  LOS

     1     0     1     1     0     0   11
     2     0     0     0     0     0    0
     3     1     1     0     0     1   15
     4     0     1     0     0     0    5
     5     0     1     0     1     0   12
     6     0     1     0     1     0    9
     7     0     0     0     0     0    0
     8     0     0     0     1     0    4
     9     1     1     1     0     0   13
    10     0     1     0     0     0    2
    11     0     0     0     0     0    0
    12     1     1     0     1     0   15
    13     1     0     0     1     0    8
    14     0     0     0     0     1   11
    15     0     1     0     0     0    1
    16     0     1     0     1     0    7
    17     1     1     0     0     0    5
    18     0     1     1     0     0   11
    19     0     1     1     0     0   10
    20     0     0     1     0     0    7
    21     0     1     0     0     0    5
    22     0     1     0     0     0    1
    23     0     1     0     0     0    3
    24     0     0     1     0     0    5
    25     0     1     0     0     0    3
    26     0     0     1     0     1   12
    27     1     1     0     0     0    8
    28     0     1     1     0     0    6
    29     0     0     1     0     0    7
    30     0     1     1     1     0   15
    31     1     1     0     0     0    8
    32     1     1     1     1     0   22
    33     1     1     0     0     0    5
    34     1     1     0     0     0    8
    35     0     0     1     0     0    4
    36     1     0     1     0     0    8
    37     0     0     0     0     0    0
    38     1     1     1     0     0   12
    39     0     1     0     0     1   13
    40     0     1     1     0     1   20
    41     0     1     0     0     0    3
    42     0     1     0     0     0    3
    43     0     1     0     0     0    3
    44     0     0     0     0     0    0
    45     0     0     1     1     0   12
    46     0     1     0     0     0    2
    47     0     0     0     0     0    0
    48     0     1     0     1     0    7
    49     0     0     1     0     0    3
    50     1     0     1     0     1   16
    51     0     0     1     0     1   14
    52     0     1     0     0     1   14
    53     0     1     0     1     0   11
    54     1     0     0     0     0    2
    55     0     1     0     0     0    4
    56     0     1     0     0     0    4
    57     0     1     1     0     0    5
    58     0     1     0     1     0   10
    59     0     1     1     1     0   13
    60     0     0     1     0     0    6
    61     1     1     0     0     0    6
    62     0     1     0     0     0    2
    63     0     1     0     1     0   10
    64     1     1     1     1     0   14
    65     0     0     1     0     1   11
    66     0     1     0     1     0   10
    67     0     0     0     0     0    0
    68     0     1     0     0     0    2
    69     0     1     0     0     0    3
    70     0     1     0     0     0    4
    71     0     1     0     0     0    6
    72     0     0     0     0     0    0
    73     0     1     1     0     0   12
    74     0     1     0     0     0    5
    75     0     0     0     0     0    0
    76     1     0     0     0     0    3
    77     0     1     0     0     0    5
    78     0     1     0     0     0    1
    79     0     0     1     1     0    6
    80     0     0     0     0     0    0
    81     1     1     1     0     0   13
    82     0     1     0     1     0    9
    83     0     0     0     0     0    0
    84     0     0     1     0     0    8
    85     0     1     1     0     1   21
    86     0     1     0     0     0    4
    87     0     1     0     0     1   11
    88     0     1     1     0     0    6
    89     1     1     0     1     0   13
    90     0     1     1     0     0    4
    91     0     1     0     0     0    1
    92     1     1     1     0     0    9
    93     0     0     0     0     0    0
    94     0     1     1     1     0   13
    95     0     0     0     1     0    4
    96     0     1     1     0     0   11
    97     0     1     0     0     0    3
    98     0     1     0     0     0    5
    99     0     0     0     0     0    0
   100     1     1     1     0     0    8
|
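The question raised in this post, how common multiple co-morbidities are, can also be answered directly from the case-level data. The following is a sketch only, not part of the original code: NCoMorb is a hypothetical variable name, and the dataset name LenStay is taken from the test-data program above.

```spss
* Sketch only. NCoMorb is a hypothetical variable name.
* Return to the case-level data (Summary is active after the AGGREGATE step).
DATASET ACTIVATE LenStay.

* Count how many of the five co-morbidity groups are present per patient.
COUNT NCoMorb = CoMo1 TO CoMo5 (1).
VARIABLE LABELS NCoMorb 'Number of co-morbidity groups present'.
FORMATS NCoMorb (F2).

* How common is each count.
FREQUENCIES VARIABLES = NCoMorb.

* Length of stay broken down by the count.
MEANS TABLES = LOS BY NCoMorb
   /CELLS = COUNT MEAN STDDEV MEDIAN MIN MAX.
```

For the synthetic data above, tallying the Patients column of the summary listing by number of groups present gives 13 patients with none, 37 with one, 34 with two, 14 with three and 2 with four, so 50 of the 100 patients have two or more.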
In reply to this post by Art Kendall
----- Original Message -----
From: "Art Kendall" <[hidden email]>
To: <[hidden email]>
Sent: Wednesday, January 24, 2007 1:40 PM
Subject: Re: effect of diseases on length of stay

>A lot depends on the nature of the problem that you are considering. Do
> patients with the condition routinely walk out of the hospital if they
> do not have co-morbidities? or Are they still in bed outside the
> hospital? etc.

Patients with the condition normally get discharged within about 3 - 5 days if they do not have co-morbidities and may well walk out. They are not usually bed bound once they have recovered from their acute illness. In almost all cases, it is co-morbidities which seem to prolong length of stay.

I shall try your suggestions below. Thank you all very much for your advice.

Angshu

>
> First, do scatter plots of LOS by N_of_Comorbidities with different
> colored markers for ambulatory/not ambulatory. (perhaps 4 levels
> ambulatory, home bed/assisted, skilled nursing facility, deceased.??)
>
> There are notes interspersed below.
>
> Richard Ristow wrote:
>> At 04:27 PM 1/23/2007, Angshu Bhowmik wrote:
>>
> <snip>
>>> I suppose one way to simplify it might be to categorise the lengths of
>>> stay into 3 or 4 groups e.g. very short (0) , short (1), long (2) and
>>> extra-long (3) or something like that and then use ordinal multinomial
>>> logistic regression (if I am right). Would this be an acceptable way
>>> to do it?
>>
>> No. That throws away information, and doesn't gain you anything. It
>> does NOT simplify the model for analysis, even though it means fewer
>> numbers written down.
> Very well put.
>>
>>> <snip>
>>
>> Well, it may come down to,
>> . 5-way by 2-level ANOVA, the grouping variables being presence/absence
>> of the five co-morbidity categories, probably suppressing all
>> interaction terms.
>> . Non-parametric analog of the above
>> . Survival-analysis analog of the above, if there is one
>>
>> Hey, methodologists! Help! Can you see that I'm getting near the edge
>> of what I can say confidently?
>>
>>
> In the 5 way ANOVA it will be necessary to ignore (suppress) 5 way,
> likely 4 way, and possibly 3 way interactions. You might be able to get
> the 2-way.
> By ignore I mean pool them into the error(residual) term.
>
> An additional way to look at the data is to use correlations.
> You could create 2 additional variables. N_of_comorbidities. And
> discharged ambulatory/not ambulatory.
> 1) Use CORRELATIONS to get the simple (aka zero-order) correlations of
> LOS with each of the other variables "ignoring" the other independent
> variables.
> 2) Use PARTIAL CORR to get the partials correlations of LOS with each
> of the other variables "eliminating" the 2 additional variables.
>
> put the results in a table with a row for each independent variable a
> column for the name of the variable and columns for the zero order and
> partial correlations. Of course the first two rows representing the 2
> new variables will have n/a for the partial correlation.
>
>
> Take your results with a grain of salt because your number of cases is
> so small.
>
> Art Kendall
> Social Research Consultants
> |
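For reference, the main-effects-only ANOVA discussed in the quoted text could be sketched in SPSS syntax roughly as follows. This is an assumption about how one might set it up, not code from any of the posters. It uses the UNIANOVA command, takes the variable names LOS and CoMo1 to CoMo5 from the test data posted earlier in the thread, and relies on the fact that any effect left off /DESIGN is pooled into the error (residual) term.

```spss
* Sketch only: five 2-level factors, main effects only.
* Interactions not named on DESIGN are pooled into the residual.
UNIANOVA LOS BY CoMo1 CoMo2 CoMo3 CoMo4 CoMo5
   /METHOD = SSTYPE(3)
   /PRINT  = PARAMETER
   /DESIGN = CoMo1 CoMo2 CoMo3 CoMo4 CoMo5.
```

If the data can support it, individual two-way terms can be added to the /DESIGN list in the form CoMo1*CoMo2. With so few cases in many of the cells, each added term costs residual degrees of freedom, so it is probably worth adding them sparingly.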