I often work with data gathered from "check all that apply" questions. The responses may be spread over 10 or more options and I need to reduce these responses to one variable.

My strategy has been as follows:

IF (Barrier1 = "Yes") Barr1_num = 1.
IF (Barrier2 = "Yes") Barr2_num = 10.
IF (Barrier3 = "Yes") Barr3_num = 100.
Etc.
COMPUTE barriers = 0.
Then
COMPUTE barriers = SUM(Barr1_num, Barr2_num, Barr3_num).

VALUE LABELS barriers
0 'No barriers'
1 'Barrier one only'
10 'Barrier two only'
100 'Barrier three only'
11 'Barriers one and two'
..... etc.

This allows a simple frequency command to summarize the data (with note that the frequencies can sum to more than 100%) BUT I wonder if there is a better way to do this.

Thanks in advance.
Bill

William N. Dudley, PhD
Professor - Public Health Education
The School of Health and Human Sciences
The University of North Carolina at Greensboro
437-L Coleman Building
Greensboro, NC 27402-6170
See my research on ResearchGate
VOICE 336.256 2475
Hi, Bill,

do you mean something like this?

COMPUTE barriers = (Barrier1 = "Yes") * 1 + (Barrier2 = "Yes") * 10 + (Barrier3 = "Yes") * 100 + ...

Mario Giesel
Munich, Germany
(In reply to William Dudley's post of Monday, 29 April 2019, 14:37:16 CEST.)
In reply to this post by William Dudley-2
This version of coding might be a way to store the data, but is it really the best way to analyze it? I’d go for the Multiple Response functions found in a couple of places in SPSS.

BTW, does anyone know of methods for hypothesis testing in such MR tables? So far, I have only used these tables for descriptive purposes. Testing might be tricky, but I reckon someone should have come up with a clever approach.

Robert

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Robert Lundqvist
|
Robert,

hypothesis testing can be done for multiple response questions with CTABLES. But if I understood Bill correctly, he wants to keep the information about which "yes" responses are given simultaneously. This information is lost with multiple response grouping.

Mario Giesel
Munich, Germany
(In reply to Robert Lundqvist's post of Monday, 29 April 2019, 15:17:50 CEST.)
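To make the point about lost information concrete, here is a minimal Python sketch. The two tiny samples are made up for illustration, and `item_counts` / `pattern_counts` are hypothetical helper names, not SPSS features. Per-item "Yes" counts (the kind of summary a multiple response table reports) can be identical for two samples whose co-occurrence patterns differ completely.

```python
from collections import Counter

# Two made-up samples of (Barrier1, Barrier2) answers; True means "Yes".
sample_a = [(True, True), (False, False)]   # both barriers checked together
sample_b = [(True, False), (False, True)]   # barriers never checked together

def item_counts(rows):
    """Per-item "Yes" counts: one total per barrier, ignoring co-occurrence."""
    return [sum(column) for column in zip(*rows)]

def pattern_counts(rows):
    """Counts of complete answer patterns, which keep co-occurrence."""
    return Counter(rows)

# The per-item totals agree, but the pattern tables do not.
print(item_counts(sample_a), item_counts(sample_b))
print(pattern_counts(sample_a) == pattern_counts(sample_b))
```

A single pattern-coded variable keeps the distinction that the per-item totals drop, which seems to be what Bill is after.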
|
Thanks all for the input.

With regard to the idea of summing the responses (e.g. how many barriers were checked), this is useful, but the loss of information makes it unappealing. For instance, the same score could be assigned to many different patterns.

Re the Multiple Response approach: my clients really want the ability to run simple frequency tables and/or to categorize patients into subgroups based on their patterns of barriers. This need can also be met with mixture modeling/cluster analysis, which we are also using, BUT my sample sizes are often pretty small (< 100), so we mostly stick to simple descriptive statistics.

Thanks again for the input
Bill

(In reply to Mario Giesel's post of Mon, Apr 29, 2019 at 9:32 AM.)
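A small Python sketch of the concern about summed scores, assuming three hypothetical Yes/No barrier items coded 1/0: grouping every possible pattern by its count shows how many distinct patterns share each score.

```python
from collections import defaultdict
from itertools import product

# Every possible pattern for three hypothetical barrier items (1 = checked).
patterns = list(product([0, 1], repeat=3))

# Group the patterns by the number of barriers checked (the sum score).
by_count = defaultdict(list)
for pattern in patterns:
    by_count[sum(pattern)].append(pattern)

for count in sorted(by_count):
    print(count, by_count[count])
```

With three items, a score of 1 or 2 each covers three different patterns; with 10 items, a score of 5 would cover 252.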
Administrator
|
Hi Bill. If you're commenting on Mario's suggested code, it does not compute the number of barriers that were checked. It generates a code several digits in length, with the ones column coding barrier 1 (1 = checked, 0 = not checked), the tens column coding barrier 2, the hundreds column coding barrier 3, etc. Here's an example using 3 barrier variables with all possible patterns of Yes and No responses:

DATA LIST LIST /Barrier1 to Barrier3 (3A3).
BEGIN DATA
No No No
Yes No No
No Yes No
No No Yes
Yes Yes No
Yes No Yes
No Yes Yes
Yes Yes Yes
END DATA.
LIST.
COMPUTE barriers = (Barrier1 = "Yes") * 1 + (Barrier2 = "Yes") * 10 + (Barrier3 = "Yes") * 100.
FORMATS barriers(N3).
FREQUENCIES barriers.
* Ones column codes barrier 1.
* Tens column codes barrier 2.
* Hundreds column codes barrier 3.

Apologies if you were talking about something else.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
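For readers who want to check the digit logic outside SPSS, the same coding can be sketched in Python. This is an illustration only: `encode` and `decode` are hypothetical helper names, and the rows repeat the eight patterns from the SPSS example above.

```python
def encode(answers):
    """Digit code: item i (0-based) contributes 10**i when answered "Yes"."""
    return sum(10 ** i for i, answer in enumerate(answers) if answer == "Yes")

def decode(code, n_items):
    """Recover the 1-based numbers of the checked items from a digit code."""
    return [i + 1 for i in range(n_items) if (code // 10 ** i) % 10 == 1]

rows = [
    ("No", "No", "No"), ("Yes", "No", "No"), ("No", "Yes", "No"), ("No", "No", "Yes"),
    ("Yes", "Yes", "No"), ("Yes", "No", "Yes"), ("No", "Yes", "Yes"), ("Yes", "Yes", "Yes"),
]
codes = [encode(row) for row in rows]

# Print each code zero-padded to three digits (like FORMATS barriers(N3))
# next to the barriers it stands for.
for code in codes:
    print(f"{code:03d}", decode(code, 3))
```

Because each item owns one decimal digit, the code can always be unpacked back into the original pattern, which a simple count cannot do.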
In reply to this post by spss.giesel@yahoo.de
At some number of optional items, the floating-point representation of an integer runs into problems.

For presentation, a lot depends on how many items there are in the set and how many co-occurrences there are.

In Multiple Response (and in CTABLES) it is possible to crosstab a set by itself a few times, by a variable, or by one of the items:
set1 by set1 by set1, etc.
set1 by set1 by gender, etc.
item1 by set1 by set1.

It is also possible to create a set that is a subset, e.g., each new set excluding a particular item.
Art Kendall
Social Research Consultants |
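A rough Python illustration of where the floating-point limit bites, assuming the pattern code is stored as a 64-bit double (Python's `float` is one; that SPSS numeric variables behave the same way is an assumption here). The helper names are hypothetical. A double holds every integer exactly only up to 2**53, so the sketch round-trips the "all items checked" code for 15, 16 and 17 items.

```python
def all_checked_code(n_items):
    """Exact integer code when all n_items are checked: n_items ones in a row."""
    return sum(10 ** i for i in range(n_items))

def survives_double(code):
    """True if the code is unchanged after a round trip through a 64-bit float."""
    return int(float(code)) == code

for n in (15, 16, 17):
    print(n, survives_double(all_checked_code(n)))
```

Under that assumption, one decimal digit per item is safe through 16 items; from 17 items on, some patterns can no longer be told apart.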
In reply to this post by William Dudley-2
I'm a big fan of summary scores and composite scores, so I will
advocate that approach, and give more suggestions along that line.
Look at the single-item frequencies - The top two or three or four
are the ones that a-priori deserve special attention. Are there any
others that deserve to be singled out because of their content?
Consider whether the more-trivial items (by meaning) which have
few occurrences might be grouped as Y/N or a count for "other".
Look at the correlations and the 2x2 tables. Skewness can give a
small r, even when the odds-ratio, say, may be large. Consider both
"unnecessary redundancy" (combine the items) and your aim of
finding "important clusters".
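A worked example of the point about skewness, with made-up counts (not data from this thread): one barrier is rare, the other common. The phi coefficient (Pearson r on two 0/1 items) comes out small while the odds ratio is large.

```python
import math

# Made-up 2x2 counts for two barrier items: item A is rare, item B is common.
a, b = 9, 1       # A = Yes: B = Yes, B = No
c, d = 491, 499   # A = No:  B = Yes, B = No

# Phi coefficient: Pearson r computed on two 0/1 variables.
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Odds ratio: odds of B = Yes when A = Yes, relative to when A = No.
odds_ratio = (a * d) / (b * c)

print(f"phi = {phi:.3f}, odds ratio = {odds_ratio:.2f}")
```

Here r is about 0.08 even though the odds of B are roughly nine times higher when A is checked, so a correlation matrix alone could hide the association.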
Despite the "loss of information" by taking a summary count,
the most important dimension of information /could/ be the
number of Barriers. And you don't necessarily need any statistical
analysis to look at the items and create clusters, for which you
can decide to take the "count of xxxx-kind of barrier". (I always
look at factor analyses of new scales for a sample, even when the
n is too small for reliable factors, and even when the items are
binary and have less information. The worst that can happen is
that I don't learn anything and waste 10 minutes looking.)
Counts might be recoded into "none, 1-2, 3-6, more" or whatnot,
if they are relatively few. On the other hand, you want the opposite
image if they are many, recoding into "10, 8-9, 4-7, 3 or less" to
get a measure of Absence of barriers.
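The first recoding could be sketched like this in Python. The cut points are the ones named above; `band` is a hypothetical helper and the counts are made up.

```python
def band(count):
    """Collapse a barrier count into the bands 'none', '1-2', '3-6', 'more'."""
    if count == 0:
        return "none"
    if count <= 2:
        return "1-2"
    if count <= 6:
        return "3-6"
    return "more"

counts = [0, 1, 2, 3, 6, 7, 10]  # made-up barrier counts
print([band(n) for n in counts])
```

The reversed "absence of barriers" version would use the same structure with the cut points counted down from the maximum.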
Hope that helps.
--
Rich Ulrich
(In reply to William Dudley's message of Monday, April 29, 2019 10:55 AM; subject: Re: SV: Best way to code "Check all that apply".)
In reply to this post by William Dudley-2
I would think that the powers-of-ten aggregate (1*x1 + 10*x2, etc.) would be pretty hard to make sense of in a frequency table. At the least, I would think you would want short value labels indicating the combination components. That process could be automated with a little Python code that would use some part of the individual variable labels for the constituent variables, if there is a short portion that is meaningful.

I suppose, also, that it would help to have the frequencies listed in descending count order. You are probably already doing this.

(In reply to William Dudley's post of Mon, Apr 29, 2019 at 8:55 AM.)
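One way the labelling idea might look, as a plain-Python sketch rather than anything tied to the SPSS Python plug-in. The short labels "Cost", "Transport" and "Time" and the observed codes are made up, and the helper names are hypothetical. It builds a label for every combination code, assembles VALUE LABELS syntax as text, and lists made-up frequencies in descending order.

```python
from collections import Counter
from itertools import product

# Hypothetical short labels for Barrier1..Barrier3 (not from the thread).
short = ["Cost", "Transport", "Time"]

def code_for(pattern):
    """Digit code for a 0/1 pattern: item i (0-based) contributes 10**i."""
    return sum(10 ** i for i, flag in enumerate(pattern) if flag)

def label_for(pattern):
    """Join the short labels of the checked items, or report no barriers."""
    parts = [name for name, flag in zip(short, pattern) if flag]
    return " + ".join(parts) if parts else "No barriers"

labels = {code_for(p): label_for(p) for p in product([0, 1], repeat=len(short))}

# VALUE LABELS syntax as text (labels containing apostrophes would need escaping).
syntax = "VALUE LABELS barriers\n" + "\n".join(
    f"  {code} '{text}'" for code, text in sorted(labels.items())
) + "."
print(syntax)

# Made-up observed codes, tabulated in descending count order.
observed = [0, 11, 1, 11, 100, 11, 1]
for code, n in Counter(observed).most_common():
    print(f"{labels[code]:<20} {n}")
```

With 10 items there are 1,024 possible combinations, so in practice it may be more sensible to generate labels only for the codes that actually occur in the data.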