MISSING VALUES IN SPSS

MISSING VALUES IN SPSS

Deepa Bhat
Hi everyone,

In SPSS, how do you distinguish between data that are missing because a
respondent refused to answer and data that are missing because the
question didn't apply to that respondent?
What is considered a discrete missing value? How do you set up the
Missing column in Variable View? Thanks in advance!

Thanks,
Deepa

Deepa Bhat, MPH, MS
Monitoring & Evaluation Associate Technical Officer
Making Medical Injections Safer  (MMIS)
John Snow, Inc.
1616 N. Fort Myer Drive
Arlington, VA 22209
Phone: 703-528-7474 x5180
Fax: 703-528-7480
[hidden email]
mmis.jsi.com

Re: MISSING VALUES IN SPSS

Oliver, Richard
The only way to distinguish between values missing for different reasons is to assign different codes to values missing for different reasons. This is typically done at the time the data are entered, although you could establish rules that assign codes based on the values of other variables (e.g., males can't be pregnant).

You set up missing values in Variable View by clicking the button in the cell in the Missing column for the Variable for which you want to identify user-missing value codes. There is a brief tutorial on this in the Help system: Help menu>Tutorial>Using the Data Editor>Defining Data>Handling Missing Data.
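
The logic of separate codes can be sketched in Python (not SPSS syntax; this only illustrates the idea). The codes 8 and 9, the variable names, and the records are all made up for illustration. In SPSS itself, such codes would be entered as discrete missing values in the Missing column, which accepts up to three individual codes per variable.

```python
from statistics import mean

# Hypothetical user-missing codes; any values outside the valid 1-5 range work.
NOT_APPLICABLE = 8
REFUSED = 9
USER_MISSING = {NOT_APPLICABLE, REFUSED}

# Made-up records: sex (1 = male, 2 = female) and a 1-5 rating asked of women only.
# None stands for a blank cell (system-missing in SPSS terms).
records = [
    {"sex": 1, "rating": None},
    {"sex": 2, "rating": 4},
    {"sex": 2, "rating": REFUSED},
    {"sex": 2, "rating": 2},
    {"sex": 1, "rating": None},
]

def apply_skip_rule(rows):
    """Rule-based coding: a blank rating for a man becomes 'not applicable'."""
    coded = []
    for row in rows:
        row = dict(row)  # copy so the input records are left untouched
        if row["sex"] == 1 and row["rating"] is None:
            row["rating"] = NOT_APPLICABLE
        coded.append(row)
    return coded

def valid_values(rows, name):
    """Drop blanks and user-missing codes before computing statistics."""
    return [r[name] for r in rows
            if r[name] is not None and r[name] not in USER_MISSING]

def count_code(rows, name, code):
    """Count how many cases carry a given missing-value code."""
    return sum(1 for r in rows if r[name] == code)

coded = apply_skip_rule(records)
print(count_code(coded, "rating", NOT_APPLICABLE))  # cases where the question did not apply
print(count_code(coded, "rating", REFUSED))         # cases where the respondent refused
print(mean(valid_values(coded, "rating")))          # mean of valid answers only
```

Because each reason has its own code, the two kinds of missing data can be counted separately while both stay out of the mean.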

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Deepa Bhat
Sent: Friday, July 20, 2007 9:03 AM
To: [hidden email]
Subject: MISSING VALUES IN SPSS

Selection of Appropriate Tests of Association (long)

Linda Case
In reply to this post by Deepa Bhat
Greetings -

I have a question that may not have an answer, as I am not sure that the
comparisons that are being requested are even possible.  On the other hand,
I also fear that I may be missing something very simple and basic!  Here is
the problem:

I have two sets of data. The first is a very large consumer test (CT) that
asked a very large group of participants to evaluate six products after
using them at home. Among many other things, a series of 9 descriptive
questions about each product were included in this study. A 10th relevant
question is an "intent to purchase" query.  Each participant assessed one
product.

The second set of data is a small pilot study of 21 participants. None of
the participants in this small pilot study were tested in the big CT (i.e.
completely different group of people).  This is a repeated measures design
in which the same six products are evaluated by each participant and each
product was assessed using three separate assessment tools (i.e. ways of
viewing/interacting with the product).  This test took place in a lab, with
no in-home use.  The same 9 descriptive questions and the 10th intent to
purchase question were used to evaluate the products.

My client's goal is to assess the degree of association between answers
provided in the small study with those provided for the same products in the
large study. They wish to see how predictive the three assessment tools are
of actual answers after actual use of the product.  They asked to start with
simple tests of association (correlation) between each question for each
product using the assessment tools in the small study and the answers to the
same questions from the large CT.  The ultimate goal (of the client) is to
conduct either multiple linear regression or logistic regression on the 9
descriptive questions (IV) to determine their ability to predict the outcome
of the intent to buy question (DV).  Of course this would be quite possible
within the big study. However, they are requesting that this be done with
one of the "Assessment tools" used in the small study, NOT with the data
collected from the large CT.  In other words, the IVs would come from the
small study and the DV from the large study.  I do not see how this is
possible because the participants are not the same in each study (not to
mention the extremely small sample size in the pilot study, but I think that
is a secondary issue, since I do not see how I can run any tests of
association between two unrelated samples).

My client is first interested in finding out how well the three assessment
tools in the small study agree with answers from the large study. Although
they want correlation, I could think of no way to provide that, so instead I
have treated the large data set as "the population" and used AGGREGATE to
create, for each question and product, a new variable holding the mean
response.  I used these as the population means (seeing that these really do
represent whether or not people would purchase the product after using it),
and have conducted a series of single-sample t-tests for each product and
question against its corresponding "population" mean from the large CT.
These do provide information regarding whether or not the means for each of
the assessment tools differ significantly (or not) from the large CT mean.
This is where I am stuck.  I cannot run any type of correlation tests
because the cases (subjects) in each group are completely different people.
My client is very adamant about wanting some test of association.  Is there
more that I can do with this design to provide the answers that my client is
requesting? (Again, if I am missing something that is very simple and
obvious, I apologize ahead of time!)
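
The single-sample comparison described above can be sketched as follows, in Python rather than SPSS for compactness. All ratings are invented: `ct_ratings` stands in for one question on one product in the large CT, and `pilot_ratings` for the same question under one assessment tool. The sketch stops at the t statistic and degrees of freedom and does not compute a p-value. It also treats the CT mean as a fixed constant, which ignores sampling error in the large study.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, test_value):
    """Return (t, df) for H0: the population mean of `sample` equals test_value."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - test_value) / standard_error, n - 1

# Hypothetical 1-5 ratings for one question on one product.
ct_ratings = [4, 5, 3, 4, 4, 5, 3, 4]  # large consumer test (in-home use)
pilot_ratings = [3, 4, 2, 3, 3]        # pilot study, one assessment tool

ct_mean = mean(ct_ratings)             # used as the "population" mean
t, df = one_sample_t(pilot_ratings, ct_mean)
print(f"CT mean = {ct_mean}, t = {t:.3f}, df = {df}")
```

A negative t would mean the pilot ratings sit below the CT mean for that question and product. Since each pilot participant rated all six products, tests run across products are not independent of one another.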

Thanks for your help in advance (since I know this is long, please feel
welcome to reply off the list if you prefer).

Linda Case

Linda P. Case
AutumnGold Consulting
(217) 586-4864
www.autumngoldconsulting.com
[hidden email] or [hidden email]

Re: MISSING VALUES IN SPSS

Samir Paul
In reply to this post by Deepa Bhat
Hi Deepa:

You should recode the missing values for questions that didn't apply to
the respondent to a code labelled 'Not Applicable' or 'Not to be
considered'. You can also exclude those cases by filtering on the variable
that tells you who should have answered the question being analysed.

You can treat real missing values as system-missing, or you can handle them
with the "MISSING VALUE ANALYSIS" options available in SPSS, depending on
the context.

HTH.
Samir


--
Samir Paul
Senior Manager - Media Research
SIRIUS Marketing & Social Research Ltd
House #64/!, Road #12A New
Dhanmondi Res Area, Dhaka 1209

Re: Selection of Appropriate Tests of Association (long)

Marta Garcia-Granero
In reply to this post by Linda Case
Hi Linda

Your impression is correct: you can't obtain any kind of association
(correlation, regression) with those two datasets your client gave you.
Besides, even if you could (that is, if the 21 participants had been drawn
from the larger consumer test dataset), with 21 cases you would not be able
to fit multiple regression models; your sample size should be at least ten
times bigger.

Regards,
Dr. Marta Garcia-Granero

Re: Selection of Appropriate Tests of Association (long)

Linda Case
Hi Marta -

Thanks for your response. I know it probably seemed self-evident to most on
this list, but when one has a client insisting that certain tests be
conducted, it gets a bit intimidating. I wanted to be sure I had covered all
of the bases, and the collective expertise on this list was definitely the
right place to ask! I also received some very helpful suggestions from
another person on the list that will provide some answers that at least come
close to what is needed.

Thank you again for reading the tome that I posted, and for your
corroboration!

Linda

Linda P. Case
AutumnGold Consulting
(217) 586-4864
www.autumngoldconsulting.com
[hidden email] or [hidden email]
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Marta Garcia-Granero
Sent: Saturday, July 21, 2007 4:10 AM
To: [hidden email]
Subject: Re: Selection of Appropriate Tests of Association (long)
