|
Hello,
I am running a logistic regression model (using a Demographic and Health Surveys dataset) and noticed a drastic reduction in my sub-population size. I traced the problem to a variable with a lot of missing cases. As you can see from the table below, this variable records whether the respondent engaged in unprotected sexual intercourse. About a third of the cases (33.78%) are missing.

V761 -- Last intercourse used condom
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0 No  |       6012      56.16      84.81      84.81
        1 Yes |       1075      10.04      15.16      99.97
        9     |          2       0.02       0.03     100.00
        Total |       7089      66.22     100.00
Missing .     |       3617      33.78
Total         |      10706     100.00
-----------------------------------------------------------

According to the DHS (Demographic and Health Surveys):

A “missing value” is defined as a variable that should have a response, but because of interview errors the question was not asked. The general rule for the survey data processing is that under no circumstances an answer should be made up. Instead, a missing value is assigned in the data file (see: http://www.measuredhs.com/accesssurveys/Data_quality_use.cfm#1).

So the missing values result from interview errors, and the errors are not related to my DV. In fact, the DV had only 161 missing values. However, since the dependent variable in my model deals with HIV risk, I need to include sexual risk variables such as V761 in the model.

One option is to ignore the errors on that single IV, but that means I will have to accept the lower N (sample size) in my analysis and explain in my write-up that the change in sample size for the regression results from missing values on some of the covariates.

Does this sound like a reasonable option? What other options do I have?

Thanks in advance for your help.
regards,
Cy

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
|
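The arithmetic behind the table in the question can be checked with a short sketch. It only reuses the frequencies posted above; the variable names are made up for illustration, and it assumes that dropping cases with a missing V761 is the only source of case loss (missing values on the DV or other covariates would reduce N further).

```python
# Frequencies copied from the V761 table in the question.
valid_counts = {"No": 6012, "Yes": 1075, "9": 2}
missing_count = 3617

valid_total = sum(valid_counts.values())      # cases with any recorded code
grand_total = valid_total + missing_count     # all cases in the sub-population


def pct(part, whole):
    """Percentage rounded to two decimals, as in the frequency table."""
    return round(100.0 * part / whole, 2)


missing_pct = pct(missing_count, grand_total)
valid_pct = {label: pct(n, valid_total) for label, n in valid_counts.items()}

print(f"Cases left after dropping missing V761: {valid_total} of {grand_total}")
print(f"Percent missing: {missing_pct}")
print(f"Valid percent by category: {valid_pct}")
```

The two cases coded 9 are still counted as valid here, so a model that also excludes them would lose slightly more cases.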
|
Hi Cy,
I suppose the variable is an independent variable in your model - otherwise you probably have no better choice than to drop the missing cases.

First try to find out whether the missing values are really caused by the question not being asked. It looks more like non-response: the question was asked but respondents did not give an answer. If this suspicion is true, you can treat the variable as having 3 levels of response: yes - no - refused. Recode it into two dummy variables (Refused 1/0 and Yes 1/0) and use normal logistic regression. Otherwise you will probably need to drop either the variable or the missing cases, or use a technique other than classic logistic regression.

Good luck,
Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Chao Yawo
Sent: Wednesday, July 15, 2009 1:20 AM
To: [hidden email]
Subject: Treatment for Missing Values - What Options Do I Have ?

_____________
This message and any attached files are confidential and intended solely for the addressee(s). Any publication, transmission or other use of the information by a person or entity other than the intended addressee is prohibited. If you receive this in error please contact the sender and delete the message as well as all attached documents. The sender does not accept liability for any errors or omissions as a result of the transmission. Are you sure that you really need a print version of this message and/or its attachments? Think about nature.
|
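The recoding Jan describes (two dummy variables, Refused 1/0 and Yes 1/0, with "No" as the reference category) can be sketched as follows. This is only an illustration in Python, not SPSS syntax: the function name is hypothetical, and it assumes that 0 means No, 1 means Yes, and a missing answer is represented as None and treated as "refused". What to do with the two cases coded 9 is left to the analyst, so the sketch rejects that code instead of guessing.

```python
def recode_condom_use(value):
    """Return (yes_dummy, refused_dummy) for one response.

    Assumed coding (hypothetical, check the codebook): 0 = No, 1 = Yes,
    None = no answer recorded, treated here as "refused".
    "No" is the reference category, so it maps to (0, 0).
    """
    if value is None:
        return (0, 1)
    if value == 1:
        return (1, 0)
    if value == 0:
        return (0, 0)
    # Any other code (for example 9) needs an explicit decision first.
    raise ValueError(f"unexpected code: {value!r}")


responses = [0, 1, None, 0, None]
dummies = [recode_condom_use(v) for v in responses]
print(dummies)
```

With this coding no case is dropped because of the variable, and the coefficient on the refused dummy shows whether non-responders differ from the "No" group.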
|
Hi all,
I have a few questions on Fisher's Exact.
1) When do you use Fisher's Exact?
2) Where do you find it in SPSS?
3) I heard that there are supposed to be 2-tailed and 1-tailed versions of Fisher's Exact. When do you use the 2-tailed and when the 1-tailed, and under what hypotheses?
Thanks in advance for your help.
Regards
Dorraj
|
|
Hi,
1)
2) For 2x2 tables, request /STATISTICS=CHISQ in CROSSTABS. For bigger tables, you need a special module to conduct exact tests.
3) If you know in advance that independence can be violated in only one direction (e.g. a higher proportion than expected in one cell, and not a lower one), use the one-sided test. Otherwise use the two-sided test.
HTH
Jan
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of DorraJ Oet
Sent: Wednesday, July 15, 2009 11:22 AM
To: [hidden email]
Subject: Fisher's Exact questions
|
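To make the one-sided versus two-sided distinction in point 3 concrete, here is a minimal sketch of Fisher's exact test for a 2x2 table, written in plain Python. It is not SPSS output and not necessarily the exact rule SPSS applies: the function name is hypothetical, the one-sided p-value is for "top-left cell at least as large as observed", and the two-sided p-value uses one common definition (sum the probabilities of all tables that are no more probable than the observed one).

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Return (p_one_sided_greater, p_two_sided) for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)   # smallest possible top-left count
    hi = min(row1, col1)       # largest possible top-left count
    p_obs = prob(a)

    # One-sided: tables with a top-left count at least as large as observed.
    p_greater = sum(prob(k) for k in range(a, hi + 1))
    # Two-sided: every table that is no more probable than the observed one.
    p_two = sum(prob(k) for k in range(lo, hi + 1)
                if prob(k) <= p_obs * (1 + 1e-9))
    return p_greater, min(1.0, p_two)


p_one, p_two = fisher_exact_2x2(3, 1, 1, 3)
print(p_one, p_two)
```

For the table [[3, 1], [1, 3]] the one-sided p-value is 17/70 and the two-sided p-value is 34/70, which illustrates why the choice has to follow from a hypothesis stated before looking at the data.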
|
In addition to the info Jan has given, note that Fisher's exact test is quite conservative, and is probably not the best option. A recent simulation study by Campbell (2007, Statistics in Medicine) showed that the "N-1" chi-square performs much better. Here are my notes on that, including some SPSS syntax, and a link to Campbell's website.
http://sites.google.com/a/lakeheadu.ca/bweaver/Home/statistics/notes/chisqr_assumptions
Cheers,
Bruce
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
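The "N-1" chi-square Bruce mentions is the ordinary Pearson chi-square for a 2x2 table multiplied by (N-1)/N. The sketch below shows that calculation in plain Python; it is not the SPSS syntax from Bruce's notes, the function name is hypothetical, and the p-value uses the fact that for 1 degree of freedom the upper tail of the chi-square distribution equals erfc(sqrt(x/2)).

```python
from math import erfc, sqrt


def n_minus_1_chi_square(a, b, c, d):
    """Return (statistic, p_value) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column total is zero")
    # Ordinary Pearson chi-square for a 2x2 table (no continuity correction).
    pearson = n * (a * d - b * c) ** 2 / margins
    # "N-1" version: rescale by (N - 1) / N.
    stat = pearson * (n - 1) / n
    # Upper-tail probability of chi-square with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


stat, p_value = n_minus_1_chi_square(3, 1, 1, 3)
print(stat, p_value)
```

For the table [[3, 1], [1, 3]] the Pearson statistic is 2.0 and the "N-1" statistic is 1.75, so the adjustment matters most when N is small.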
|
In reply to this post by Chao Yawo
Cy ... This situation has important implications for modeling and inference. You might consider asking for guidance on SRMSNET, which is populated by specialists in survey methodology.
http://www.amstat.org/sections/srms/srms_net.html

Art

Art Burke
Northwest Regional Educational Laboratory
101 SW Main St, Suite 500
Portland, OR 97204-3213

-----Original Message-----
From: Chao Yawo [mailto:[hidden email]]
Sent: Tuesday, July 14, 2009 4:20 PM
To: [hidden email]
Subject: Treatment for Missing Values - What Options Do I Have ?
|
