Any offers? Replies to quantitative_methods_teaching @ ncrm.ac.uk

Hi -- I'm trying to get 1st year (geography) students to think a little bit more deeply about why a poorly defined variable is a big problem for (geographical/social science) research and why it's something that needs careful thought before you even try to write down a question.

So I am looking for good examples... better still would be subtle and memorable examples, though I'd settle for 'dumb things seen in the wild'. The geography link is optional. Examples could be from the real world or simply clever 'thought experiments'.

The tricky part is that in 'my' part of the course they will not yet be doing statistical analysis so it won't be very useful to show how the wrong data type gives you invalid test results... the focus should really just be on that "Ooooh, I didn't think of that when I saw that list of categories!" sort of moment. The idea is to get them excited about exploring a research question in several different ways and then lead them naturally towards more sophisticated analyses as issues come up that they can't explain/explore with simpler approaches.

Anyway, so I'm wondering if anyone on the list has the background knowledge (and the time) to suggest some examples for:

Nominal
Ordinal
Interval
Ratio
[String/Character -- for this one I've mainly focussed on comparability in analysis]

Alternatively, if you think that's a terrible way to organise a talk about variables and data types then I'm all ears, though I'd appreciate it if you phrased it constructively...

Suggestions *very* much appreciated!

Jon Reades
|
I have some good examples of poor nominal and ordinal variables. Will have time this evening to send them on.
|
In reply to this post by John F Hall
The big problem that I've seen related to geography is the failure to normalize the units to make sense for the problem. Sensible units may be "per capita" or "per square mile" or per what-have-you, and not the total pollution (in tons) or total income or total number of people/cars/households.

Decades ago, I read a study of "air pollution in 35 cities" where the units were crude tons. New York had twice the totals of Chicago, and the other 33 cities were all at the bottom, comparatively speaking. Analyses of various variables were thoroughly useless since the study did nothing to take out the overwhelming effect of "size" in terms of population.

The dimension of units that are measured globally does *not* define the natural units for some particular analysis.

- There exists, mainly among beginners, a simplistic and seriously deficient understanding of what an "equal-interval" measure consists of. They may even have been given a crude rule of thumb that misdirects them to the concrete measures, like tons or dollars or persons.
- What matters for the statistics is that the *errors* (of measurement, of prediction) are of equal size for the whole range.

One corrective is to point out that if the natural (and proper) way of talking about XXX is to say "twice as much" or "one tenth" ... then you probably want to take the log of XXX. Or, if you consider a change of score of "100 points" ... Would that be just as meaningful and important at the top, middle, or bottom of the scale? Does an error of 1 unit have the same importance, wherever it occurs? And then... What transformation *does* provide equal intervals?

You have predictors (independent variables) and outcomes (dependent variables) and they need to be "commensurate" -- measured in units that work together. But first, the measured units for the outcome need to be meaningful and close to being "equal interval"; and that depends on the latent factor that you are studying, and not on the units of raw measurement.
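A minimal Python sketch of the two points above, using city names and figures that are invented purely for illustration (they are not from the study mentioned): ranking by raw totals versus by a per-capita rate, and how a log scale turns "twice as much" into the same step anywhere on the scale. The helper names `per_capita` and `log_gap` are hypothetical.

```python
import math

# Hypothetical figures, invented for illustration only.
cities = {
    "Big City": {"pollution_tons": 200_000, "population": 8_000_000},
    "Mid City": {"pollution_tons": 100_000, "population": 3_000_000},
    "Small Town": {"pollution_tons": 5_000, "population": 100_000},
}


def per_capita(record):
    """Tons of pollution per resident."""
    return record["pollution_tons"] / record["population"]


# Ranking by crude totals is dominated by sheer size.
by_total = sorted(cities, key=lambda c: cities[c]["pollution_tons"], reverse=True)

# Ranking by rate takes "size" out of the comparison.
by_rate = sorted(cities, key=lambda c: per_capita(cities[c]), reverse=True)


def log_gap(a, b):
    """Difference on the log scale; it depends only on the ratio b / a."""
    return math.log(b) - math.log(a)


print(by_total)
print(by_rate)
print(log_gap(10_000, 20_000), log_gap(1_000_000, 2_000_000))
```

With these made-up numbers the two rankings come out in opposite orders, and doubling 10,000 gives the same log gap as doubling 1,000,000, which is the sense in which the log is closer to "equal interval" for quantities we naturally compare as ratios.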
"Dollars" might be equal interval for describing how much of YYY can be bought; the logarithm of "dollars" is a whole lot better for describing personal wealth, since "twice as much" is a much better way of comparing (at either end of the scale) than saying "a million dollars richer."

-- Rich Ulrich

Date: Tue, 22 Oct 2013 19:00:55 +0200
From: [hidden email]
Subject: FW: [Quantitative Methods Teaching] Good Examples of Poorly-Defined Variables
To: [hidden email]
|
In reply to this post by John F Hall
Don't bother trying to respond to that list. I added the address to my reply, and my post was mechanically rejected since I am not a member.

-- Rich Ulrich
|
In reply to this post by John F Hall
Hello Jonathan,

I’ve equated your need for poor variables to poor questions. Poor questions create poor variables. The examples may or may not be helpful for you. Probably more complex than you wanted. But anyway, here they are.

Number 1: Nominal, from a local council postal survey - I no longer have access to the data. But the uncleaned data showed a large percentage of parents answering “no” to Q10 and continuing on to mark childcare in Q11! Fortunately this had a clear pattern. These parents marked the “Informal” category. These parents clearly did not think that family, friends and neighbours counted as childcare. This is a fault in the design of the simple question about childcare use.

Q10 Do you use childcare?
Q11 If yes, what type of childcare to you use?
   1  Informal, e.g. Family / Friends / Neighbours
   2  Childminder, Self-employed carers based in their own homes.
   3  Home Based Child Carer, Employed by parents and based in parents home
   4  Nanny / Aupair, Employed by parents and based in parents home
   5  Crèche, Occasional care for parents to access work, training or one off events
   6  Pre-school Play Group, Sessions of play and/or education for children aged 2-5yrs
   7  Day Nursery, To provide care and education for children aged 6 weeks to 5 yrs
   8  Breakfast / Before School Club, Safe place to play before school starts for children aged 3-14yrs
   9  After School Club, Safe place to play after school finishes for children aged 3-14yrs
  10  Holiday Play Scheme / Club, Safe place to play during school holidays for children aged 3-14yrs
  11  Specialist care for ages 15, 16, 17 - children with a disability only
  12  Other, please state

Number 2: Nominal, from my 2011 AAPOR presentation based on ESRC SDMI Mixed Modes and Measurement Error data. Below respondents have been randomly assigned to two conditions.
When the 8 category version is collapsed into the same 3 categories as the 3 category version, you don’t get the same results. Cognitive interviewing suggested that ‘type of dwelling’ was a confusing concept when this would not have been expected:

● “What the [Blip] difference is there between a maisonette and a flat and a block of flats, a flat and a house?” (Male, postgraduate degree, employed, high income, White British)
● Regarding a maisonette. Household member: “I’ll call it a duplex, yeah.” Respondent: “Well, it’s what they call it in the South.” (Male, postgraduate degree, employed, high income, White British)
● R answered ‘flat,’ but the interviewer observed it as a semi-detached house. R said it had to be very large to be called a house (Female, higher education below degree level, employed, medium income, other ethnicity).

Number 3: Ordinal, from Housing Association Questionnaire

2. How happy are you with the area you live in?
   [ ] Very satisfied
   [ ] Fairly satisfied
   [ ] Neither satisfied nor dissatisfied
   [ ] Fairly unsatisfied
   [ ] Very unsatisfied
   [ ] Don’t know

・ Mismatch between stem and answer categories - Happiness vs. Satisfaction
・ Uses “dissatisfied” and “unsatisfied”
・ "Area" is ambiguous
・ "You" is ambiguous - Singular (meaning just the respondent) or plural (meaning the whole household)
・ A reference period would help
・ Too general to be useful
・ Etc, etc.

With very best wishes,
Pam

Dr. Pamela Campanelli
Survey Methods Consultant, Chartered Statistician and Chartered Scientist
|
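The Q10/Q11 pattern in Number 1 above (a "no" at the filter question followed by a tick at the follow-up) is the kind of thing a simple consistency check on the uncleaned data would surface. A minimal Python sketch, assuming invented responses (the original data are not available) and a hypothetical coding in which `q11` holds the set of Q11 codes ticked:

```python
from collections import Counter

# Hypothetical responses, invented for illustration only.
# q10: answer to "Do you use childcare?"
# q11: set of Q11 codes ticked (1 = Informal, e.g. family / friends / neighbours)
responses = [
    {"id": 1, "q10": "yes", "q11": {2}},
    {"id": 2, "q10": "no", "q11": set()},
    {"id": 3, "q10": "no", "q11": {1}},
    {"id": 4, "q10": "no", "q11": {1}},
    {"id": 5, "q10": "yes", "q11": {1, 9}},
]


def filter_violations(rows):
    """Respondents who said 'no' at Q10 but still ticked something at Q11."""
    return [r for r in rows if r["q10"] == "no" and r["q11"]]


violations = filter_violations(responses)

# Which Q11 codes do the inconsistent respondents tick?
codes_ticked = Counter(code for r in violations for code in r["q11"])

print([r["id"] for r in violations])
print(codes_ticked)
```

In this made-up data every inconsistent respondent ticks code 1 ("Informal"), which points at the wording of Q10 rather than at careless respondents.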
In reply to this post by John F Hall
Hope you have some good examples by now. I still remember the example of "What mode of transport do you use to work" in my first year Stats class as an example of a bad question. Options naturally included:

By car
By train
Walking
Cycling
Taxi
Bus

The problem is that most people use more than one method to get to work.

Some years ago, a friend who was taking a research methods sociology class invited me to make a presentation on types of questions in a survey.
I gave as an example of a multiple choice question: Marital Status. I did not bother asking the full question, just stated it as "we" usually do in typical surveys. One of the students could not see how this was a multiple choice rather than a multiple response question. He explained that he knew people who were currently divorced, widowed and married, as well as those who were currently widowed and single, etc. As we discussed his objection in class, along with the purpose of that question, we soon realised that in certain circumstances the marital history of a person is far more important than their current marital situation. Areas where this could be particularly important include ownership of property, fertility, and socio-economic status (especially for females). So we need to reflect on whether we are thinking of the effect of current marital status or of marital history.
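The difference between a multiple choice and a multiple response variable can be sketched in Python using the transport question above. This is only an illustration: the respondent is invented and the function names `single_choice` and `multiple_response` are hypothetical.

```python
MODES = ["car", "train", "walking", "cycling", "taxi", "bus"]


def single_choice(answer):
    """Multiple choice coding: exactly one category is recorded."""
    if answer not in MODES:
        raise ValueError(f"unknown mode: {answer}")
    return answer


def multiple_response(answers):
    """Multiple response coding: one 0/1 indicator per category,
    so combinations of modes survive into the data."""
    unknown = set(answers) - set(MODES)
    if unknown:
        raise ValueError(f"unknown modes: {sorted(unknown)}")
    return {mode: int(mode in answers) for mode in MODES}


# A hypothetical commuter who walks to the station, takes a train, then a bus.
commute = ["walking", "train", "bus"]
coded = multiple_response(commute)

print(coded)
print(single_choice("train"))  # what a single-response item would keep
```

The single-response version forces this commuter into one category and silently discards the rest; the same choice arises for marital status, where a set of indicators (or a history) may be closer to what the research question needs.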
For environmental science/geography, the same could apply to source of drinking water, source of energy for various end-uses, and amount of energy for household space heating (how to combine different units of measurement depending on type, such as electricity vs. fuelwood, etc.).
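As a rough Python sketch of the unit problem for space heating: electricity and fuelwood can only be added once both are expressed in a common unit. The kWh-to-MJ factor is exact; the energy content of fuelwood used here is an illustrative assumption (it varies with species and moisture), and the household figures are invented.

```python
MJ_PER_KWH = 3.6  # exact by definition: 1 kWh = 3.6 MJ

# Illustrative assumption only; real values depend on species and moisture content.
ASSUMED_MJ_PER_KG_FUELWOOD = 15.0


def heating_energy_mj(electricity_kwh=0.0, fuelwood_kg=0.0):
    """Total energy content of the fuels used, in megajoules."""
    return electricity_kwh * MJ_PER_KWH + fuelwood_kg * ASSUMED_MJ_PER_KG_FUELWOOD


# Hypothetical household: 100 kWh of electricity and 20 kg of fuelwood.
household = heating_energy_mj(electricity_kwh=100, fuelwood_kg=20)
print(household)
```

Even after conversion the variable still needs defining: this total is the energy content of the fuel, not the useful heat delivered, which would also depend on the efficiency of the stove or heater.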
Finally, in a recent international study, one of the important questions was access to electricity. The data were to be collected at both household and community level. The definition of "access" appeared very obvious to the subject experts. Only after piloting was conducted was the difficulty of using a concept such as "access" in a questionnaire realised.
Best wishes, and do not forget to share your final list with the group. regards Forcheh
Professor Ntonghanwah Forcheh Department of Statistics, University of Botswana Private Bag UB00705, Gaborone, Botswana. Office: +267 355 2696, Mobile: Orange +267 75 26 2963, Bmobile: 73181378: Mascom 754 21238 fax: +267 3185099; Alternative Email: [hidden email] *@Honesty is a Virtue, Freedom of the Mind is Power. Motto: Never be afraid to be honest, Never lie to yourself, Trust in the Truth and you will be forever free.* |
In reply to this post by Dr Pamela Campanelli
I strongly agree WRT poor questions. Unless the researcher is using some standard questions that have been thoroughly field-tested by others, at a minimum, a “prototype” survey instrument needs to be tried out in a pilot survey. No point in spending money, time, and effort to field a useless survey.
|
Poor questions?
The time to consult a statistician is *before* you even test the survey. A pilot survey helps you make sure you can get unambiguous answers to questions. A statistician, looking at the items at the start (and answers later on) helps you make sure that you can get the least ambiguous answers to *hypotheses.* You should have a pretty good idea before data collection of most of the sorts of statements that you will be making about outcome, even if you don't know the direction of the conclusions. The easiest "saves" are when you just need to ask a question a little bit differently, or just need to add an item or two for clarification.

-- Rich Ulrich

Date: Fri, 25 Oct 2013 14:29:43 -0600
From: [hidden email]
Subject: Re: [Quantitative Methods Teaching] Good Examples of Poorly-Defined Variables
To: [hidden email]
|