Hello everybody,
I would be glad for advice concerning the following problem:

The sample size for my logistic regression with four predictor variables appears somewhat small (N = 160) to me. Regarding the dichotomous dependent variable, there are 20 cases for category (a) and 140 cases for category (b). I wonder whether this might cause any problems for the results. Is it OK to conduct a logistic regression with just 20 cases in one of the two categories of the dependent variable? Are there any references available regarding adequate sample sizes in logistic regression?

Many thanks!
Jan |
Administrator
|
Frank Harrell, author of the book "Regression Modeling Strategies", advocates a 20:1 rule, meaning 20 events per candidate predictor variable. See the section on overfitting here: http://biostat.mc.vanderbilt.edu/wiki/Main/ManuscriptChecklist Personally, I am comfortable relaxing that to 15:1, or even 10:1 at times, although 10:1 is really pushing it. For more on overfitting, see the nice article by Mike Babyak, available here: http://www.class.uidaho.edu/psy586/Course%20Readings/Babyak_04.pdf HTH.
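To make the rule of thumb concrete for the numbers in the original question (20 cases in the rarer category, 4 candidate predictors), here is a minimal Python sketch. The function names `events_per_variable` and `max_predictors` are only illustrative and do not come from SPSS or any library. It assumes that the rarer outcome category is the one that limits the model, and it uses the 20:1, 15:1, and 10:1 ratios mentioned above.

```python
def events_per_variable(n_events: int, n_nonevents: int, n_predictors: int) -> float:
    """Events per candidate predictor, based on the rarer outcome category."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    # Assumption: the smaller of the two outcome counts is the limiting one.
    limiting = min(n_events, n_nonevents)
    return limiting / n_predictors


def max_predictors(n_events: int, n_nonevents: int, ratio: int) -> int:
    """Largest number of candidate predictors allowed under a ratio:1 rule."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return min(n_events, n_nonevents) // ratio


if __name__ == "__main__":
    # Numbers from the original question: 20 vs. 140 cases, 4 predictors.
    epv = events_per_variable(20, 140, 4)
    print(f"Events per candidate predictor: {epv:.1f}")
    for ratio in (20, 15, 10):
        allowed = max_predictors(20, 140, ratio)
        print(f"{ratio}:1 rule allows at most {allowed} candidate predictor(s)")
```

With 20 events and 4 predictors this gives 5 events per candidate predictor, which is below even the relaxed 10:1 guideline. Under that guideline the data would support about 2 candidate predictors.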
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Mr Jan!
The biggest problem with small samples is testing. When the parameters are tested during modelling, the estimates often turn out to be badly over- or underestimated, and the model does not show any significance. You may then have trouble justifying your interpretation of the model estimates. In SPSS the minimum number of cases accepted for logistic regression is 60. It does not matter whether any single category of your dependent variable has fewer than 60 cases, but the total across all categories of the dependent variable (whether it is dichotomous, nominal, ordinal, or categorical) should not be less than 60. But in such a situation you will face the problems I mentioned in my first paragraph. Do not use more than two categories for the indicators; try to collapse them so that the cases are not spread too thinly. Or try probit or tobit models, etc., or another model from the literature on qualitative data or on dummy/dichotomous dependent variables. Best of luck |
In reply to this post by Bruce Weaver
I, too, have read these rules of thumb, and I generally adhere to them
when fitting a logistic regression model using maximum likelihood estimation. However, it is worth noting that there are exact methods that have been shown to perform well with small sample sizes/rare events. Having said that, as far as I'm aware, the latest version of SPSS does not offer a procedure to fit logistic regression using exact methods. Ryan
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by Ryan
Good point, Ryan. Here's the website for LogXact:
http://www.cytel.com/Software/LogXact.aspx I believe the demo is good for 30 days.
In reply to this post by Bruce Weaver
The SPSS Community website at www.ibm.com/developerworks/spssdevcentral
is the successor to the SPSS Developer Central website, which will be closed
down later in 2011. The new site includes downloads, articles, forums,
a blog, and other useful items for IBM SPSS Statistics and other IBM SPSS
products.
We have now made available on the Community site the IBM SPSS Statistics Essentials for programmability using Python and .NET for Statistics versions 18 and 19. The site also has the SDKs (Software Development Kit) underlying these items. The Essentials for the Statistics patch release, version 19.0.1, are only available on the new site. The R Essentials, however, are not yet available there.

Items that are available on the SPSS Community site are not available on the Developer Central site, which is no longer being updated. Older plugins and Essentials should still be obtained from Developer Central.

There is also a new IBM Modeler article, Mining Your Warranty Data – Finding Anomalies (Part 1) on the site in the articles section.

Regards,
Jon Peck
Senior Software Engineer, IBM
[hidden email] |