I’d appreciate some suggestions or advice on analyzing the following type of data. Somebody here has school-level data on the number of EMS calls made for kids during a semester. The element that I need help with is that the school district decided it needed to preserve something and so recoded the data so that schools with 1 through 5 calls in a semester were given a value of 3; otherwise the true value was recorded. So the distribution looks, for example, like 0, 3, 6, 8, 9, 12, etc. I have the enrollment at each school, so I can compute a rate, but because of the grouping it is not accurate.

What I am asking for is direct advice from anybody who has analyzed such data, references to articles about how such data can be or has been analyzed, what this type of data would be called as a search term (I know that work has been done with completely grouped counts), or where (other listservs, for instance) to look for advice.

Thanks, Gene Maguin
Hi Gene. Here's a partially baked suggestion: How about recoding 3 to -3 (or some other out of range value for counts), treating it as missing, and using multiple imputation?

p.s. - Sorry about the empty post that was sent a few minutes ago--clicked the wrong button by mistake!

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
That sounds a bit fishy. Is there any way to constrain the imputed values to be between 1 and 5? I am not that familiar with MI.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Good question, David. Yes, there is (see link below); I just forgot to mention it.

https://www.ibm.com/support/knowledgecenter/en/SSLVMB_25.0.0/statistics_reference_project_ddita/spss/mva/syn_multiple_imputation_constraints.html

See the MIN = NONE | num and MAX = NONE | num keywords of the CONSTRAINTS subcommand. You can also specify RND=1 to round imputed values to the nearest integer.

Bruce
In reply to this post by David Marso
Is this a school district where you can reach the person or persons who work with the data? Would they be willing to produce the rates you are interested in? What is the rationale for coarsening the data? If they released a new data set with the rate and the coarsened count, would it be possible to reverse engineer the original count?
Art Kendall
Social Research Consultants |
In reply to this post by David Marso
In SPSS you can limit the outcome to within a particular range of the entire data for multiple imputation, but that won't take into account the potential range given the already binned data (e.g., 3 can be from 1-5, 8 can be from 6-10, etc.). If you look on Google Scholar you can find some implementations that take that into account: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C44&q=interval+censored+multiple+imputation&btnG=

Sometimes this data is called *interval censored* or *binned* data.

How you might approach it depends on the type of analysis you want to do with the data. For simple description, you have bounds on the counts, and you can subsequently bound various summary statistics and simple tests of differences. If estimating it as a dependent variable, you may use some type of censored regression approach. If using it as an independent variable, that is where I have seen imputation used. See https://prod.sandia.gov/techlib-noauth/access-control.cgi/2007/070939.pdf (it has no references to the imputation stuff, though).

-----
Andy W
[hidden email]
http://andrewpwheeler.wordpress.com/
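To make the "bounds on summary statistics" idea concrete, here is a minimal Python sketch (not SPSS syntax). It assumes the recode described in the original question, where any count from 1 to 5 is reported as 3 and everything else is exact. The school records and the function names `count_bounds` and `rate_bounds` are made up for illustration.

```python
# Sketch: bounds on totals and rates when counts of 1-5 are reported as 3.
# The school records below are invented illustration data, not the real file.

RECODED = 3          # value the district substituted for any count from 1 to 5
LOW, HIGH = 1, 5     # true range hidden behind the recoded value


def count_bounds(reported):
    """Return (lowest, highest) true count consistent with a reported count."""
    if reported == RECODED:
        return LOW, HIGH
    return reported, reported


def rate_bounds(schools, per=1000):
    """Bound the overall call rate per `per` students across all schools."""
    lo = sum(count_bounds(calls)[0] for calls, _ in schools)
    hi = sum(count_bounds(calls)[1] for calls, _ in schools)
    enrolled = sum(n for _, n in schools)
    return per * lo / enrolled, per * hi / enrolled


# (reported calls, enrollment) for six hypothetical schools
schools = [(0, 400), (3, 500), (0, 300), (3, 800), (8, 1000), (0, 1000)]
low_rate, high_rate = rate_bounds(schools)
print(f"overall rate is between {low_rate:.2f} and {high_rate:.2f} per 1000")
```

The same idea applies to any statistic that increases with the counts: plug in 1 for every recoded school to get the lower bound and 5 to get the upper bound.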
In reply to this post by Bruce Weaver
Good to know Bruce! Thanks.
Fishy comment retracted...
To Art. There's an intermediary and a court case in between the school district and the person I am working with. My understanding is that the recode was done by the district, and the person does not think we can go back to the district to get the unrecoded counts. Who knows, maybe the recode was a carefully calculated maneuver to obscure things.

Andy. Thank you for the additional names for this and the links.

All. The data were provided by school, which is identified, and in addition to the EMS call count, we have school-level counts of student demographic categories. Perhaps it's obvious, but the call counts are the sum of calls made for each student. Overall, the number of calls to a school is low, no more than 20 in a semester, and 70% to 80% of schools have no calls. But, perhaps very conveniently, about 95% of schools that have at least one call are in that "1 to 5 = 3" category.

Before realizing that the data had been recoded, I had thought to analyze the data as a zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB). I had kind of thought of multiple imputation, but I thought the imputation would have to be done against either a ZIP or ZINB model rather than a linear regression model, and the values would have to be between 1 and 5. I've used imputation before but always for continuous data. I'm not sure, but I'd guess that because such a large percentage of the schools that made calls are in the 3 category, there is really no way to compute (guess) what the rate parameter would be.

Gene Maguin
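On the question of whether a rate can be estimated at all: under a count model, a recoded 3 still carries information, namely that the true count lies between 1 and 5. The Python sketch below illustrates this with a plain Poisson model whose mean is proportional to enrollment. That model is an assumption made here for brevity; it is not the ZIP or ZINB model mentioned above, which would need an extra zero-inflation parameter. It uses a simple EM loop (replace each recoded count with its expected value, then re-estimate the rate) and also draws imputed counts restricted to 1..5. All names and the example data are hypothetical.

```python
import math
import random

RECODED, LOW, HIGH = 3, 1, 5   # counts of 1..5 were reported as 3


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events when the mean is mu (> 0)."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def hidden_count_probs(mu):
    """P(true count = k | 1 <= count <= 5) for k = 1..5."""
    weights = [poisson_pmf(k, mu) for k in range(LOW, HIGH + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_count(reported, mu):
    """E-step: expected true count given what was reported."""
    if reported != RECODED:
        return float(reported)
    probs = hidden_count_probs(mu)
    return sum(k * p for k, p in zip(range(LOW, HIGH + 1), probs))


def em_rate(schools, tol=1e-12, max_iter=1000):
    """Estimate calls per student from (reported calls, enrollment) pairs."""
    enrolled = sum(n for _, n in schools)
    rate = sum(calls for calls, _ in schools) / enrolled  # naive start
    for _ in range(max_iter):
        filled = sum(expected_count(c, rate * n) for c, n in schools)
        new_rate = filled / enrolled                      # M-step
        converged = abs(new_rate - rate) < tol
        rate = new_rate
        if converged:
            break
    return rate


def impute(reported, mu, rng):
    """Draw one plausible true count; recoded values always land in 1..5."""
    if reported != RECODED:
        return reported
    values = list(range(LOW, HIGH + 1))
    return rng.choices(values, weights=hidden_count_probs(mu))[0]


# (reported calls, enrollment) for six hypothetical schools
schools = [(0, 400), (3, 500), (0, 300), (3, 800), (8, 1000), (0, 1000)]
rate = em_rate(schools)
rng = random.Random(2018)
completed = [impute(c, rate * n, rng) for c, n in schools]
print(f"estimated calls per 1000 students: {1000 * rate:.2f}")
print("one imputed data set:", completed)
```

A proper multiple imputation would also propagate the uncertainty in the estimated rate and repeat the draw several times. With so many of the nonzero schools in the recoded category, the estimate leans heavily on the assumed model, so it is worth comparing against the simple lower and upper bounds.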
Do you mean that the school district has raw data by student?
Is the data provided as part of discovery? I know you may not be able to disclose some information, but is this an individual-based case or part of a class action? Was any rationale given for coarsening the data?
Art Kendall
Social Research Consultants |
In reply to this post by Maguin, Eugene
Okay, very skewed. You say: Out of each 100 schools, there may be 80 with 0, 19 with "3", and 1 with 6+ calls. Given the skew overall, we expect a huge skew from 1 to 5, too; so most "3"s are 1 or 2. Unless the school sizes are hugely disproportionate, ignore "rates" and go with "events".
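As a rough check on the claim that most "3"s are really 1 or 2, here is a small Python sketch. It assumes a plain Poisson distribution whose mean is chosen so that 80% of schools have zero calls; that is a simplification introduced here, not something established in the thread.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events when the mean is mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


share_zero = 0.80              # roughly the share of schools with no calls
mu = -math.log(share_zero)     # Poisson mean that gives 80% zeros

# Probability of each true count, given only that it falls in 1..5
in_band = {k: poisson_pmf(k, mu) for k in range(1, 6)}
band_total = sum(in_band.values())
conditional = {k: p / band_total for k, p in in_band.items()}

for k, p in conditional.items():
    print(f"P(true count = {k} | reported 3) = {p:.4f}")

share_1_or_2 = conditional[1] + conditional[2]
print(f"share of recoded schools with 1 or 2 calls: {share_1_or_2:.3f}")
```

A plain Poisson with 80% zeros would put far fewer than 1 school in 100 at 6+ calls, so the real data are more spread out than this model. The true share of 1s and 2s is therefore probably lower than the sketch suggests, though the direction of the skew should be the same.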
How many schools are there? (How many with 6+ calls?)
I would consider (6+) to deserve a special discussion. These schools have "extra problems" ... I would expect some differences in faculty, students, operating rules, or external circumstances, compared to other schools. On the other hand, if "6+" is just one disruptive student, that might be hard to pick out. "People" are usually counted for this sort of thing, in addition to "events", but I think you don't have the luxury of knowing both.
Compare the three groups (0, "3", 6+) - Are they linear with regard to other characteristics?
Look for linear trends also within the "more" group alone. (Assume Poisson and use the square root, for a simple analysis of counts.)
Are trends within the 6+ group consistent with the 3-group trend?
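The square-root suggestion above rests on a standard property: for Poisson counts the variance equals the mean, but the variance of the square root of the count is close to 1/4 whatever the mean is. That is what makes an ordinary linear analysis of the transformed counts reasonable. The Python sketch below checks this numerically for means in the range seen in the 6+ group; the function name and the chosen means are illustrative.

```python
import math


def sqrt_count_variance(mu, upper=400):
    """Variance of sqrt(Y) for Y ~ Poisson(mu), summing the pmf up to `upper`."""
    p = math.exp(-mu)              # P(Y = 0); contributes 0 to E[sqrt(Y)]
    mean_root = 0.0
    for k in range(1, upper + 1):
        p *= mu / k                # P(Y = k) computed from P(Y = k - 1)
        mean_root += math.sqrt(k) * p
    # E[(sqrt Y)^2] = E[Y] = mu, so Var(sqrt Y) = mu - (E[sqrt Y])^2
    return mu - mean_root ** 2


for mu in (6, 12, 20):
    v = sqrt_count_variance(mu)
    print(f"mean {mu:>2}: Var(Y) = {mu:>2}, Var(sqrt Y) = {v:.3f}")
```

The approximation gets worse for very small means, so it suits the exact counts of 6 and above better than the recoded schools.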
-- Rich Ulrich
Yes, very skewed. My percentages from yesterday were wrong. The percentage of non-3 category calls ranges from 0.4% to 1.6%. Let the nominal N be 1500. So, at the most, 25 non-3 category cases. The number of 3-category cases ranges from (round numbers) 300 to 500. I agree that those non-3 category schools are special in some way (and we have some ideas about them).

I’m a bit confused about these comments:

>>Compare the three groups (1, "3", 6+) - Are they linear in regards to other characteristics?

I’m thinking of an ordinal model with a test for equality of slopes. Is this what you are thinking of?

>>Look for linear trends also within the "more" group alone. (Assume Poisson and use the square-root, for a simple analysis of counts.)

The “more” group is the 6+ group? I rarely use Poisson so I don’t understand the square root transform of counts and, second, are you saying that the 1 (actually 0) group and the 3 group are discarded?

>>Are trends within the 6+ group consistent with the 3-group trend?

I’m not understanding this, either. The 6+ group has a range of count values and I could look at correlations between covariates and the 6+ group; however, everybody in the 3 group has the same value, 3.

A number of people have commented. Thank you.

Gene Magin

From: Rich Ulrich <[hidden email]>
Okay, very skewed. You say: Out of each 100 schools, there may be 80 with 0, 19 with "3", and 1 with 6+ calls. Given the skew overall, we expect a huge skew from 1 to 5, too; so most "3"s are 1 or 2.

Unless the school sizes are hugely disproportionate, ignore "rates" and go with "events".
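The remark that most "3"s are really 1 or 2 can be illustrated with a rough sketch. It assumes, purely for illustration, that every school's count follows one Poisson distribution whose mean is chosen so that 80% of schools have zero calls. The function names are hypothetical, and the single-Poisson assumption is almost certainly too simple for these data (it predicts far fewer 6+ schools than the roughly 1% mentioned), so the exact share should not be taken literally.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def conditional_1_to_5(lam):
    """Distribution of the true count given that it was recoded to 3 (true value 1..5)."""
    probs = {k: poisson_pmf(k, lam) for k in range(1, 6)}
    total = sum(probs.values())
    return {k: p / total for k, p in probs.items()}

# Assumption: 80% of schools have zero calls, so exp(-lam) = 0.80.
p_zero = 0.80
lam = -math.log(p_zero)

cond = conditional_1_to_5(lam)
share_1_or_2 = cond[1] + cond[2]

for k, p in cond.items():
    print(f"P(true count = {k} | coded 3) = {p:.4f}")
print(f"Share of coded-3 schools with 1 or 2 calls: {share_1_or_2:.3f}")
```

A heavier-tailed count distribution would shift some of that mass toward 3, 4 and 5, but the qualitative point (the value 3 overstates the typical recoded school) is likely to hold whenever zeros dominate.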
How many schools are there? (How many with 6+ calls?)

I would consider (6+) to deserve a special discussion. These schools have "extra problems" ... I would expect some differences in faculty, students, operating rules, or external circumstances, compared to other schools. On the other hand, if "6+" is just one disruptive student, that might be hard to pick out. "People" are usually counted for this sort of thing, in addition to "events", but I think you don't have the luxury of knowing both.

Compare the three groups (1, "3", 6+) - Are they linear in regards to other characteristics?

Look for linear trends also within the "more" group alone. (Assume Poisson and use the square-root, for a simple analysis of counts.)

Are trends within the 6+ group consistent with the 3-group trend?

-- Rich Ulrich

From: SPSSX(r) Discussion <[hidden email]> on behalf of Maguin, Eugene <[hidden email]>

To Art. There's an intermediary and a court case in between the school district and the person I am working with. My understanding is that the recode was done by the district and the person does not think we can go back to the district to get the unrecoded counts. Who knows, maybe the recode was a carefully calculated maneuver to obscure things. |
I will insert comments in the text.

From: SPSSX(r) Discussion <[hidden email]> on behalf of Maguin, Eugene <[hidden email]>
Sent: Thursday, August 16, 2018 2:47 PM
To: [hidden email]
Subject: Re: partially grouped counts

gm>Yes, very skewed. My percentages from yesterday were wrong. The percentage of non-3 category calls ranges from 0.4% to 1.6%.
>Let the nominal N be 1500. So, at the most 25 non-3 category cases. The numbers of 3 category cases ranges from (round numbers) 300 to 500.
>I agree that those non-3 category schools are special in some way (and we have some ideas about them).
>I’m a bit confused about these comments:
me>>Compare the three groups (1, "3", 6+) - Are they linear in regards to other characteristics?
gm>I’m thinking of an ordinal model with a test for equality of slopes. Is this what you are thinking of?

No - what would be "equal" to what?

Three groups: 0 (sorry, I wrote 1), "3", and the rest (6 and more). < None, a few, many >. These groups are ordered, logically. However, in the real world, zeros often do not fall where you would expect. If the groups do /differ/, the question might be, Do they differ in a way that shows as a linear trend (on those other variables you are looking at)?
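One way to look at the "linear trend across three ordered groups" question is sketched below. It is only a descriptive check, not a significance test: it recodes the calls into none / a few / many, takes the mean of one covariate per group, and computes a linear contrast (many minus none) and a quadratic contrast (how far the middle group sits from the midpoint of the other two). The function names and the tiny data set are made up for illustration.

```python
from statistics import mean

def group_of(calls):
    """Map a recoded call count to an ordered group: 0 = none, 1 = a few (coded 3), 2 = many (6+)."""
    if calls == 0:
        return 0
    if calls == 3:
        return 1
    if calls >= 6:
        return 2
    raise ValueError(f"unexpected recoded count: {calls}")

def trend_contrasts(calls, covariate):
    """Return per-group covariate means plus linear and quadratic contrasts across the groups."""
    groups = {0: [], 1: [], 2: []}
    for c, x in zip(calls, covariate):
        groups[group_of(c)].append(x)
    m0, m1, m2 = (mean(groups[g]) for g in (0, 1, 2))
    linear = m2 - m0               # contrast weights (-1, 0, +1)
    quadratic = m0 - 2 * m1 + m2   # contrast weights (+1, -2, +1)
    return (m0, m1, m2), linear, quadratic

# Hypothetical data: recoded calls and some school-level covariate.
calls = [0, 0, 0, 3, 3, 3, 6, 9, 12]
covariate = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0, 30.0, 32.0, 34.0]

means, linear, quadratic = trend_contrasts(calls, covariate)
print("group means (none, a few, many):", means)
print("linear contrast:", linear)
print("quadratic contrast:", quadratic)
```

A quadratic contrast near zero relative to the linear one suggests the middle group falls roughly in line with the other two; a large one would be a sign that, for example, the zeros do not fall where the ordering suggests.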
me>>Look for linear trends also within the "more" group alone. (Assume Poisson and use the square-root, for a simple analysis of counts.)
gm>The “more” group is the 6+ group?
>I rarely use Poisson so I don’t understand the square root transform of counts and, second, are you saying that the 1 (actually 0) group and the 3 group are discarded?

Yes, "more" was what I was calling the 6+ group in my first draft, and I didn't clean up all the mentions; and, Yes, for a close analysis of the 6+ group, you might righteously examine the 25 cases alone.

The hypothesis would be, Is there some reason for 50 or 100 calls versus 6 or 10 calls? Actually, on my further imagining of factors, I think < 50 vs 100 > might be almost an "equal interval" to < 6 vs 12 >. Remember that we like equal intervals, because we like "homogeneous errors", which is what gives us valid tests. I'm guessing that the 25-case distribution will be longer-tailed than Poisson, and the natural log of the counts (rather than square root) (label it "Intensity" of calls) is a more suitable criterion than the raw counts.
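For readers unfamiliar with the square-root transform mentioned above, here is a small sketch of why it is used with Poisson counts. For a Poisson variable the variance equals the mean, so raw counts have unequal error variance across schools with different means; the square root has a variance close to 1/4 whatever the mean. The code computes both variances from the Poisson probabilities directly (no simulation); the function names and the chosen means are illustrative assumptions.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of k events, computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def mean_var_of_transform(lam, transform):
    """Mean and variance of transform(Y) for Y ~ Poisson(lam), by summing over the pmf."""
    upper = int(lam + 12 * math.sqrt(lam) + 30)  # far enough into the tail for these means
    ks = range(upper + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    m = sum(p * transform(k) for k, p in zip(ks, probs))
    v = sum(p * (transform(k) - m) ** 2 for k, p in zip(ks, probs))
    return m, v

for lam in (10, 50, 100):
    _, var_raw = mean_var_of_transform(lam, float)
    _, var_sqrt = mean_var_of_transform(lam, math.sqrt)
    print(f"mean={lam:>3}  var(raw)={var_raw:7.2f}  var(sqrt)={var_sqrt:.3f}")
```

By the same delta-method reasoning, if the spread of the 6+ counts grows roughly in proportion to the mean itself (longer-tailed than Poisson), the transform that evens out the variance is the log rather than the square root, which matches the suggestion above.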
gm>I’m not understanding this, either. The 6+ group has range of count values and I could look at correlations between covariates and the 6+ group; however, everybody in 3 group has the same value, 3.

I hope it is clearer now. The "3-groups trend" is the trend (if any) across 0/"3"/6+. Is that trend similar to the trend measured (in 6+) for Intensity?

-- Rich Ulrich |