Hello, all. I sent out this message last Friday, but have not received any response. I am sending it again in the hope of getting some advice on this question. I am using binomial logistic regression to predict students' retention. The outcome variable is 1 = retained, 0 = not retained. One of the predictors is students' high school GPA, which is a continuous variable. 30% of students in the data set did not provide their high school GPA. I am using a lower version of SPSS, so the multiple imputation procedure is not available in the software. Is there a better way to impute these missing data? Thanks!
Joost Van Ginkel at the University of Leiden has a set of SPSS syntax files that impute data in a number of ways. He also has a set that performs multiple imputation. The syntax does not perform imputation via expectation maximization or full information maximum likelihood. The link is below.
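For readers unfamiliar with what multiple imputation does after the imputed data sets exist: the logistic regression is fitted once per imputed data set, and the results are combined with Rubin's rules. The sketch below shows only that pooling step in Python. It is not Van Ginkel's syntax, and the coefficients and squared standard errors are made-up numbers used purely for illustration.

```python
import math
import statistics

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = statistics.mean(estimates)          # pooled point estimate
    within = statistics.mean(variances)         # average within-imputation variance
    between = statistics.variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between      # Rubin's total variance
    return q_bar, math.sqrt(total)

# Hypothetical GPA coefficients (log-odds) and squared SEs from five imputed data sets.
coefs = [0.82, 0.79, 0.88, 0.85, 0.80]
ses_sq = [0.040, 0.042, 0.039, 0.041, 0.040]
estimate, se = pool_rubin(coefs, ses_sq)
print(f"pooled coefficient {estimate:.3f}, pooled SE {se:.3f}")
```

The pooled standard error is larger than any single-imputation standard error would suggest, because the between-imputation term carries the extra uncertainty due to the missing values.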
Now, I'm not sure how familiar you are with missing data, but two issues come quickly to mind. The first is the mechanism of missingness. There are three mechanisms: Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR). Data are MCAR when the missingness on the variable in question (e.g., GPA) is totally unrelated to the values of any variable, including GPA itself. Data are MAR when the missingness is related only to the observed values of other variables. Data are MNAR when the missingness is related to the values of the variable in question (GPA) itself; in other words, students who have particular GPA values are not submitting the data.
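A small simulation can make the three mechanisms (Missing Completely at Random, Missing at Random, Missing Not at Random) concrete. The Python sketch below is illustrative only: the covariate `test_score`, the GPA model, and the missingness probabilities are all invented assumptions, not anything taken from the poster's data.

```python
import random
import statistics

def simulate(mechanism, n=20000, seed=0):
    """Return (true_mean, observed_mean) of GPA under one missingness mechanism."""
    rng = random.Random(seed)
    gpa_all, gpa_obs = [], []
    for _ in range(n):
        test_score = rng.gauss(0, 1)   # hypothetical, fully observed covariate
        gpa = 3.0 + 0.4 * test_score + rng.gauss(0, 0.3)
        if mechanism == "MCAR":
            p_missing = 0.30                               # same for everyone
        elif mechanism == "MAR":
            p_missing = 0.50 if test_score < 0 else 0.10   # depends only on the observed covariate
        elif mechanism == "MNAR":
            p_missing = 0.50 if gpa < 3.0 else 0.10        # depends on GPA itself
        else:
            raise ValueError(mechanism)
        gpa_all.append(gpa)
        if rng.random() >= p_missing:
            gpa_obs.append(gpa)
    return statistics.mean(gpa_all), statistics.mean(gpa_obs)

results = {m: simulate(m) for m in ("MCAR", "MAR", "MNAR")}
for m, (true_mean, obs_mean) in results.items():
    print(f"{m}: true mean {true_mean:.3f}, observed mean {obs_mean:.3f}")
```

In this setup the observed mean is close to the true mean only under MCAR. Under MAR the gap can in principle be corrected by conditioning on `test_score`, because the covariate is observed for everyone; under MNAR the observed data alone do not contain the information needed to correct it.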
Imputation is appropriate with the first two types of missingness but not with Missing Not at Random. Based on your description, you may indeed be plagued with this problem. The second issue is the amount of missing data. 30% is quite a bit, and although there are no generally accepted thresholds, imputing almost one third of the values of a particular variable is questionable. There are other issues as well, such as whether the data are missing listwise or pairwise. In short, at first look imputation seems like a magical solution, but in actuality it is not a cure-all for missing data.
I'm sure others may join in on this with other considerations, but I can only caution you about your level and mechanism of missingness. It may render any results quite questionable.
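One quick check that bears on the caution above is to compare retention rates between students with and without a recorded GPA. The Python sketch below uses a plain two-proportion z-test; the function name and the ten-row data set are made up for illustration. A clear difference is evidence against Missing Completely at Random, but no test on the observed data can tell Missing at Random apart from Missing Not at Random.

```python
import math

def missingness_vs_outcome(retained, gpa):
    """Compare retention rates between students with and without a recorded GPA.

    retained: list of 0/1 outcomes; gpa: list of floats, None where GPA is missing.
    Returns (rate_if_missing, rate_if_observed, z) from a two-proportion z-test.
    """
    miss = [r for r, g in zip(retained, gpa) if g is None]
    obs = [r for r, g in zip(retained, gpa) if g is not None]
    if not miss or not obs:
        raise ValueError("need both missing and observed GPA cases")
    p1, p2 = sum(miss) / len(miss), sum(obs) / len(obs)
    pooled = (sum(miss) + sum(obs)) / (len(miss) + len(obs))
    se = math.sqrt(pooled * (1 - pooled) * (1 / len(miss) + 1 / len(obs)))
    z = (p1 - p2) / se if se > 0 else 0.0
    return p1, p2, z

# Tiny made-up data set: 30% of GPAs missing, as in the question.
retained = [1, 0, 0, 1, 1, 1, 0, 1, 1, 1]
gpa = [None, None, None, 3.1, 3.6, 2.9, 2.4, 3.8, 3.3, 3.0]
rate_missing, rate_observed, z = missingness_vs_outcome(retained, gpa)
print(f"retained if GPA missing: {rate_missing:.2f}, if observed: {rate_observed:.2f}, z = {z:.2f}")
```

With only ten rows the z statistic means little; the point is the shape of the comparison, which would be run on the full data set.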
The link is https://www.universiteitleiden.nl/en/staffmembers/joost-van-ginkel#tab-1
Brian Dates
From: SPSSX(r) Discussion <[hidden email]> on behalf of XIAOYING LIU <[hidden email]>
Sent: Monday, July 22, 2019 10:40:09 AM
To: [hidden email] <[hidden email]>
Subject: Imputing missing data for binomial logistic regression
In reply to this post by XIAOYING LIU
I am not commenting on which type of missing-value imputation is better in your case, since we don't know much about your data and aims.
Just to let you know, two procedures with options for so-called hot deck imputation are available on my web page: http://www.spsstools.net/en/KO-spssmacros
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
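For context, hot deck imputation replaces each missing value with an observed value borrowed from a similar "donor" case. The Python sketch below shows one generic variant, a random hot deck within cells. It is not the code of the macros linked above, and the grouping variable `school_type` and the data are hypothetical.

```python
import random
from collections import defaultdict

def hot_deck_impute(values, cells, seed=0):
    """Fill None entries with a randomly chosen observed value from the same cell."""
    rng = random.Random(seed)
    donors = defaultdict(list)
    for v, c in zip(values, cells):
        if v is not None:
            donors[c].append(v)
    all_donors = [v for v in values if v is not None]
    if not all_donors:
        raise ValueError("no observed values to donate")
    imputed = []
    for v, c in zip(values, cells):
        if v is None:
            pool = donors[c] or all_donors   # fall back to all donors if the cell has none
            v = rng.choice(pool)
        imputed.append(v)
    return imputed

# Hypothetical data: GPA with gaps, grouped by a made-up school type.
gpa = [3.2, None, 2.5, 3.9, None, 2.8, None]
school_type = ["A", "A", "B", "A", "B", "B", "C"]
filled = hot_deck_impute(gpa, school_type)
print(filled)
```

A single hot deck pass like this treats the borrowed values as if they had been observed, so standard errors from a model fitted to the filled-in data will tend to be too small.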
In reply to this post by XIAOYING LIU
Hey! You may have more success with "MedStats" on Google Groups.
I'm biased :( (I founded this Google Group about 14 years ago.) The group has over 1500 members. It hosts just about all of the most famous medical statisticians:
Martin Bland (who awarded me a distinction for my summer project in my Advanced MSc in Medical Statistics and Information Technology)
John Whittington (who was instrumental with me in getting MedStats off the ground)
Doug Altman (before he passed over; his contributions to MedStats are stored in the Group's Archives, a wonderful resource)
Frank Harrell Jnr (a fellow Manager of MedStats, as are John Whittington and myself)
Michael Campbell (a co-author of "Sample Size Tables for Clinical Studies" and a very prominent medical statistician)
Abhaya Indrayan (who is co-author with me of the "Concise Encyclopaedia of Biostatistics for Medical Professionals"; latest 587 reads)
Stephen Senn (famous for his quote re Columbus not knowing where he was going to end up)
Mark Schwartz
SR Millis (well known for his "Professor Mean" activities)
Need I say more? (Too many to mention.)
With MedStats, literally anyone can view the discussion. To post a message on the mailing list, however, you do need to join the Group.
Freelance Medical Statistician
"If you can't explain it simply, you don't understand it well enough." (Einstein)
Concise Encyclopedia of Biostatistics for Medical Professionals: https://www.crcpress.com/Concise-Encyclopedia-of-Biostatistics-for-Medical-Professionals/Indrayan-Holt/9781482243871
Martin P. Holt
LinkedIn: https://www.linkedin.com/in/martin-holt-3b800b48?trk=nav_responsive_tab_profile
Hello Martin. As you know, I too am a member of MedStats. But it is my impression that very few of the regulars there use SPSS--Diana K is one of the few I can think of. So while I do agree that there will be many members who know a lot about logistic regression and missing data, I doubt there will be an overabundance of SPSS-specific help on offer. That's my tuppence, FWIW. ;-)
PS- I believe Steve Simon was responsible for the "Professor Mean" material you mentioned. E.g., see the bottom of this page: http://www.pmean.com/contact.html
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).