> From: [hidden email]
> Subject: Re: Binary logistic regression - poor models
> To: [hidden email]
>
> Dear all!
>
> Thank you for drawing attention to the unusual distribution. Among the 42
> drugs were a number of medications that were administered to only a few
> patients. I have counted the number of patients for each drug. Excluding
> the drugs that were given to fewer than 100 patients, the remaining drugs
> and their patient counts are as follows:
> dr1 851
> dr2 9234
> dr3 128
> dr4 827
> dr5 5439
> dr6 16846
> dr7 4502
> dr8 338
> dr9 803
> dr10 246
> dr11 11522
> dr12 3622
> dr13 296
> dr14 7972
> dr15 814
> dr16 4787
> dr17 212
> dr18 2688
> dr19 4607
> dr20 571
> dr21 816
> dr22 2243
> dr23 3012
> dr24 570
>
> The counts have a high variance. Excluding the rarely used drugs reduces
> the total population by only 8 patients.
>
> My goal is to analyze the impact of the drugs. I don't know whether logistic
> regression is the right method. I thought that if I calculated the OR for
> each drug, I could establish a ranking between them and characterize their
> effects with the ORs. But the resulting model has very low sensitivity,
> perhaps because of the few cases of heart failure and the large number of
> variables. The calculated models have an R-square of about 0.022. Could it
> be that the sample is too complex for logistic regression?
>
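As a starting point for the per-drug OR idea, here is a minimal sketch of the
unadjusted odds ratio for a single drug, computed from a 2x2 table with a Wald
95% confidence interval. The counts and the function name `odds_ratio_ci` are
hypothetical and are not taken from the data above. The interval matters for
any ranking, because drugs given to a few hundred patients will tend to have
much wider intervals than drugs given to thousands.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed patients with heart failure
    b: exposed patients without heart failure
    c: unexposed patients with heart failure
    d: unexposed patients without heart failure
    All four cells must be non-zero for this formula.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for one drug (not real data)
or_value, ci_low, ci_high = odds_ratio_ci(20, 831, 300, 40000)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

This ignores age, gender, and co-medication, so it is only a first look before
an adjusted model.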
> Prior knowledge:
> We have relatively sparse prior knowledge about the effects of the drugs to
> be analyzed. In the literature we have found detailed information about
> only 3 drugs. So far we have analyzed only one of them. In this case,
> according to the literature, we can establish a threshold cumulative dose.
> Below this cumulative dose the drug is associated with heart failure at a
> very low incidence, and above it the incidence of heart failure increases
> exponentially. For this analysis I used the chi-square test and evaluated
> how the p value changed. But in this case we had prior knowledge, and I
> made several runs of the chi-square test over a range that included the
> threshold value.
> In other cases, the effect of the drug may be independent of the dose. We
> don't know.
> The iterative calculation of the p values mentioned above took a lot of
> time, but naturally I can do it for all drugs if you suggest it.
> The problem is that the interval of the cumulative dose for each drug is
> very wide, and the distribution of the patients across the different doses
> is very variable.
>
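The repeated chi-square runs over candidate thresholds could be automated
roughly as follows. This is only a sketch, assuming one cumulative dose and one
heart-failure flag per patient; the data, the candidate thresholds, and the
function names `chi2_2x2` and `scan_thresholds` are hypothetical. It uses the
Pearson chi-square statistic for a 2x2 table without continuity correction.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p value (1 df) for a 2x2 table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the test is undefined here
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def scan_thresholds(doses, events, thresholds):
    """Split patients at each threshold and test dose group vs. outcome."""
    results = []
    for t in thresholds:
        a = sum(1 for x, e in zip(doses, events) if x >= t and e)
        b = sum(1 for x, e in zip(doses, events) if x >= t and not e)
        c = sum(1 for x, e in zip(doses, events) if x < t and e)
        d = sum(1 for x, e in zip(doses, events) if x < t and not e)
        stat, p_value = chi2_2x2(a, b, c, d)
        results.append((t, stat, p_value))
    return results


# Hypothetical cumulative doses and heart-failure flags (not real data)
doses = [100, 200, 300, 400, 500, 600, 700, 800]
events = [0, 0, 0, 0, 1, 0, 1, 1]
for t, stat, p_value in scan_thresholds(doses, events, [300, 500, 700]):
    print(f"threshold={t}: chi2={stat:.2f}, p={p_value:.4f}")
```

Because the same outcome is tested at many cut points, the smallest p value
found this way is optimistic and should not be read like a single pre-planned
test.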
> Furthermore, I thought that when I analyse the effect of drugs in this way,
> I cannot take the effect of other drugs and variables (age, gender, ...)
> into account. We know that age has a high impact on the outcome.
> I think creating strata for each age group in the case of one drug is not
> complicated, but I cannot create strata for all drugs, age, and gender
> together (and for all diseases), because then every stratum will contain
> only a few patients.
>
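For the single-drug case with age strata, one standard way to combine the
strata is the Mantel-Haenszel pooled odds ratio. The sketch below assumes one
2x2 table per age group; the counts and the function name `mantel_haenszel_or`
are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of 2x2 tables.

    Each stratum is a tuple (a, b, c, d):
    a: exposed with heart failure, b: exposed without,
    c: unexposed with heart failure, d: unexposed without.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator


# Hypothetical tables for two age groups (not real data)
age_strata = [
    (10, 90, 5, 95),   # younger patients
    (20, 80, 10, 90),  # older patients
]
print(f"Pooled OR = {mantel_haenszel_or(age_strata):.2f}")
```

This works for a handful of strata and one drug. It does not remove the sparse
strata problem when stratifying on all drugs, age, and gender at once, which is
the situation where a regression model is the usual choice.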
> So I want to consider (if possible) the effect of the other variables as
> well, and that is how I arrived at logistic regression. But it gives me
> very poor results, or at least I think they are very poor. It may also be
> that logistic regression is not the right solution. Therefore, I am asking
> for help.
>
> Agnes
>
>