Hi everyone,
How do you distinguish between data that are missing because a respondent refused to answer and data that are missing because the question didn't apply to that respondent in SPSS? What is considered a discrete missing value? And how do you set up the Missing column in Variable View? Thanks in advance!

Thanks,
Deepa

Deepa Bhat, MPH, MS
Monitoring & Evaluation Associate Technical Officer
Making Medical Injections Safer (MMIS)
John Snow, Inc.
1616 N. Fort Myer Drive
Arlington, VA 22209
Phone: 703-528-7474 x5180
Fax: 703-528-7480
[hidden email]
mmis.jsi.com
The only way to distinguish between values that are missing for different reasons is to assign them different codes. This is typically done at the time the data are entered, although you could establish rules that assign codes based on the values of other variables (e.g., males can't be pregnant).
You set up missing values in Variable View by clicking the button in the cell in the Missing column for the variable for which you want to identify user-missing value codes. There is a brief tutorial on this in the Help system: Help menu > Tutorial > Using the Data Editor > Defining Data > Handling Missing Data.
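In command syntax, a minimal sketch might look like the following (the variable name Q5 and the codes 98 and 99 are placeholders for illustration, not anything from Deepa's actual data):

* Hypothetical example: Q5 is a 1-5 rating, 98 = refused, 99 = not applicable.
VALUE LABELS Q5 98 'Refused' 99 'Not applicable'.
MISSING VALUES Q5 (98, 99).
* Both codes are now excluded from statistics but still shown in frequency tables.
FREQUENCIES VARIABLES=Q5.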
In reply to this post by Deepa Bhat
Greetings -
I have a question that may not have an answer, as I am not sure that the comparisons being requested are even possible. On the other hand, I also fear that I may be missing something very simple and basic! Here is the problem:

I have two sets of data. The first is a very large consumer test (CT) that asked a very large group of participants to evaluate six products after using them at home. Among many other things, a series of 9 descriptive questions about each product was included in this study, and a 10th relevant question is an "intent to purchase" query. Each participant assessed one product.

The second set of data is a small pilot study of 21 participants. None of the participants in this small pilot study were tested in the big CT (i.e., a completely different group of people). This is a repeated measures design in which the same six products are evaluated by each participant, and each product was assessed using three separate assessment tools (i.e., ways of viewing/interacting with the product). This test took place in a lab, with no in-home use. The same 9 descriptive questions and the 10th intent-to-purchase question were used.

My client's goal is to assess the degree of association between answers provided in the small study and those provided for the same products in the large study. They wish to see how predictive the three assessment tools are of actual answers after actual use of the product. They asked to start with simple tests of association (correlation) between each question for each product using the assessment tools in the small study and the answers to the same questions from the large CT. The client's ultimate goal is to conduct either multiple linear regression or logistic regression on the 9 descriptive questions (IVs) to determine their ability to predict the outcome of the intent-to-buy question (DV). Of course this would be quite possible within the big study. However, they are requesting that this be done with one of the "assessment tools" used in the small study, NOT with the data collected from the large CT. In other words, the IVs would come from the small study and the DV from the large study. I do not see how this is possible because the participants are not the same in each study (not to mention the extremely small sample size in the pilot study, but I think that is a secondary issue, since I do not see how I can run any tests of association between two unrelated samples).

My client is first interested in finding out how well the three assessment tools in the small study agree with answers from the large study. Although they want correlation, I could think of no way to provide that, so instead I have treated the large data set as "the population" and used AGGREGATE to create new variables holding the mean for each question and product. I used these as the population means (seeing that these really do represent whether or not people would purchase the product after using it), and have conducted a series of single-sample t-tests for each product and question against its corresponding "population" mean from the large CT. These do provide information regarding whether or not the means for each of the assessment tools differ significantly (or not) from the large CT mean.

This is where I am stuck. I cannot run any type of correlation test because the cases (subjects) in each group are completely different people, and my client is very adamant about wanting some test of association. Is there more that I can do with this design to provide the answers that my client is requesting? (Again, if I am missing something that is very simple and obvious, I apologize ahead of time!)

Thanks for your help in advance (since I know this is long, please feel welcome to reply off the list if you prefer).

Linda Case

Linda P. Case
AutumnGold Consulting
(217) 586-4864
www.autumngoldconsulting.com
[hidden email] or [hidden email]
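For reference, a rough syntax sketch of the aggregate/one-sample t-test approach described above might look like this. The variable names (product, tool, q1), the file name ct_means.sav, and the test value 6.42 are hypothetical placeholders, not Linda's actual data:

* In the large CT file: one mean per product for question q1.
AGGREGATE
  /OUTFILE='ct_means.sav'
  /BREAK=product
  /q1_mean=MEAN(q1).

* In the small pilot file: compare the pilot ratings of product 1, one
* assessment tool at a time, against the corresponding CT mean (say 6.42).
TEMPORARY.
SELECT IF (product = 1 AND tool = 1).
T-TEST
  /TESTVAL=6.42
  /VARIABLES=q1.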
In reply to this post by Deepa Bhat
Hi Deepa:
You could recode the values that are missing because the question didn't apply to the respondent into a distinct code labelled 'Not applicable' (or something similar). You can also drop these cases from an analysis by filtering on whichever variable tells you who should have answered the question being analysed. Genuine missing values (such as refusals) can be treated as system-missing, or handled with the Missing Value Analysis option available in SPSS, depending on the context.

HTH.
Samir

--
Samir Paul
Senior Manager - Media Research
SIRIUS Marketing & Social Research Ltd
House #64/!, Road #12A
New Dhanmondi Res Area, Dhaka 1209
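A minimal syntax sketch of this recode-and-filter idea, assuming a made-up scenario where smoker_q applies only to respondents with smokes = 1 (the variable names and the code 99 are invented for illustration):

* Give the not-applicable cases their own user-missing code.
IF (smokes = 0 AND SYSMIS(smoker_q)) smoker_q = 99.
VALUE LABELS smoker_q 99 'Not applicable'.
MISSING VALUES smoker_q (99).

* Or analyse only the respondents the question applied to.
COMPUTE applies = (smokes = 1).
FILTER BY applies.
FREQUENCIES VARIABLES=smoker_q.
FILTER OFF.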
In reply to this post by Linda Case
Hi Linda
Your impression is correct: you can't obtain any kind of association (correlation, regression) with those two datasets your client gave you. Besides, even if you could (say the 21 participants had been drawn from the larger consumer test dataset), with 21 cases you would not be able to fit multiple regression models; your sample size would need to be at least ten times bigger.

Regards,
Dr. Marta Garcia-Granero
Hi Marta -
Thanks for your response. I know it probably seemed self-evident to most on this list, but when one has a client insisting that certain tests be conducted, it gets a bit intimidating. I wanted to be sure I had covered all of the bases, and the collective expertise on this list was definitely the right place to ask! I also received some very helpful suggestions from another person on the list that will provide some answers that at least come close to what is needed.

Thank you again for reading the tome that I posted, and for your corroboration!

Linda

Linda P. Case
AutumnGold Consulting
(217) 586-4864
www.autumngoldconsulting.com
[hidden email] or [hidden email]
