partially grouped counts

Maguin, Eugene

I’d appreciate some suggestions or advice on analyzing the following type of data. Somebody here has school-level data on the number of EMS calls made for kids during a semester. The element that I need help with is that the school district decided it needed to preserve something and so recoded the data so that schools with 1 thru 5 calls in a semester were given a value of 3; otherwise the true value was recorded. So the distribution looks, for example, like 0, 3, 6, 8, 9, 12, etc. I have the enrollment at each school, so I can compute a rate, but because of the grouping it is not accurate. What I am asking for is direct advice from anybody who has analyzed such data, references to articles about how such data can be or has been analyzed, a search term that describes this type of data (I know that work has been done with completely grouped counts), or suggestions on where to look for advice (other listservs, for instance).

Thanks, Gene Maguin

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: partially grouped counts

Bruce Weaver
Administrator
Hi Gene.  Here's a partially baked suggestion:  How about recoding 3 to -3
(or some other out of range value for counts), treating it as missing, and
using multiple imputation?

p.s. - Sorry about the empty post that was sent a few minutes ago--clicked
the wrong button by mistake!



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: partially grouped counts

David Marso
Administrator
That sounds a bit fishy. Is there any way to constrain the imputed values to
be between 1 and 5?  I am not that familiar with MI.


Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: partially grouped counts

Bruce Weaver
Administrator
Good question, David.  Yes, there is (see the link below); I just forgot to
mention it.

https://www.ibm.com/support/knowledgecenter/en/SSLVMB_25.0.0/statistics_reference_project_ddita/spss/mva/syn_multiple_imputation_constraints.html

See MIN = NONE | num and MAX = NONE | num.  You can also specify RND=1 to
round imputed values to the nearest integer.

Bruce




Re: partially grouped counts

Art Kendall
In reply to this post by David Marso
Is this a school district where you can reach the person or persons who work
with the data?
Would they be willing to produce the rates you are interested in?

What is the rationale for coarsening the data?

If they released a new data set with the rate and the coarsened count, would
it be possible to reverse engineer the original count?



Art Kendall
Social Research Consultants

Re: partially grouped counts

Andy W
In reply to this post by David Marso
In SPSS, multiple imputation lets you limit the imputed values to a single
range for the whole variable, but that won't take into account the
case-specific range implied by the already binned data (e.g. 3 can be from
1-5, 8 can be from 6-10, etc.). If you look on Google Scholar you can find
some implementations that take that into account,
https://scholar.google.com/scholar?hl=en&as_sdt=0%2C44&q=interval+censored+multiple+imputation&btnG=.

Sometimes this type of data is called *interval censored* or *binned* data.
How you approach it depends on the type of analysis you want to do. For simple
description, you have bounds on the counts, so you can bound various summary
statistics and simple tests of differences. If the count is the dependent
variable, you might use some type of censored regression approach. If it is an
independent variable, that is where I have seen imputation used. See
https://prod.sandia.gov/techlib-noauth/access-control.cgi/2007/070939.pdf
(has no references to the imputation stuff though).
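
The bounding idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the counts and enrollments are made up, the function names (`count_bounds`, `bounded_rate_per_1000`) are hypothetical, and it assumes, as in the original post, that a recorded 3 always stands for a true count of 1 to 5 while every other value is exact.

```python
# Hypothetical recoded counts per school; 3 stands for "1 to 5 calls".
recoded_calls = [0, 3, 6, 0, 3, 12, 0, 8]
# Hypothetical enrollments for the same schools.
enrollment = [400, 520, 610, 300, 450, 900, 350, 700]


def count_bounds(recoded):
    """Return (low, high) for the true count behind one recoded value."""
    if recoded == 3:
        return (1, 5)          # grouped category: true count is 1..5
    return (recoded, recoded)  # all other values are recorded exactly


def bounded_rate_per_1000(recoded, enrolled):
    """Return (low, high) calls per 1000 students for one school."""
    low, high = count_bounds(recoded)
    return (1000 * low / enrolled, 1000 * high / enrolled)


# Bounds on the district-wide total number of calls.
total_low = sum(count_bounds(c)[0] for c in recoded_calls)
total_high = sum(count_bounds(c)[1] for c in recoded_calls)

# Bounds on the district-wide rate per 1000 students.
total_enrolled = sum(enrollment)
print("total calls between", total_low, "and", total_high)
print("overall rate per 1000 between",
      1000 * total_low / total_enrolled, "and",
      1000 * total_high / total_enrolled)
```

Any summary that is monotone in the counts (totals, means, rates) can be bounded the same way by plugging in all the lows and then all the highs.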



Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: partially grouped counts

David Marso
Administrator
In reply to this post by Bruce Weaver
Good to know Bruce! Thanks.
Fishy comment retracted...

Re: partially grouped counts

Maguin, Eugene
To Art. There's an intermediary and a court case in between the school district and person I am working with. My understanding is that the recode was done by the district and the person does not think we can go back to the district to get the unrecoded counts. Who knows, maybe the recode was a carefully calculated maneuver to obscure things.

Andy. Thank you for the additional names for this and the links.

All. The data were provided by school, which is identified, and in addition to the EMS call count, we have school level counts of student demographic categories. Perhaps it's obvious but the call counts are the sum of calls made for each student. Overall, the number of calls to a school is low, no more than 20 in a semester and 70% to 80% of schools have no calls. But, perhaps very conveniently, about 95% of schools who have at least one call are in that 1 to 5=3 category.

Before realizing that the data had been recoded, I had thought to analyze the data as a zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB).

I had kind of thought of multiple imputation but I thought the imputation would have to be done against either a ZIP or ZINB model rather than a linear regression model and the values have to be between 1 and 5. I've used imputation before but always for continuous data. I'm not sure but I'd guess that because such a large percentage of schools that made calls are in the 3 category, there is really no way to compute (guess) what the rate parameter would be.
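
One way to look at the rate-parameter question is that a grouped "3" can still enter a likelihood as the probability that the count falls between 1 and 5. The sketch below is a simplification made for illustration only: it fits a plain Poisson rate per student (no zero inflation, no overdispersion, no covariates), the data are invented, and the function names are hypothetical. It is not a ZIP or ZINB fit and says nothing about how well the rate is determined in the real data.

```python
import math


def log_poisson_pmf(k, mu):
    """Log of the Poisson probability of exactly k events with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def log_prob_interval(low, high, mu):
    """Log of P(low <= Y <= high) under Poisson(mu), via log-sum-exp."""
    logs = [log_poisson_pmf(k, mu) for k in range(low, high + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))


def log_likelihood(rate, recoded, enrollment):
    """Interval-censored Poisson log-likelihood; rate is calls per student."""
    total = 0.0
    for y, n in zip(recoded, enrollment):
        mu = rate * n
        if y == 3:
            total += log_prob_interval(1, 5, mu)  # grouped: true count 1..5
        else:
            total += log_poisson_pmf(y, mu)       # exact count
    return total


def fit_rate(recoded, enrollment, lo=1e-6, hi=1.0, iters=200):
    """Maximize the log-likelihood over log(rate) by golden-section search."""
    a, b = math.log(lo), math.log(hi)
    g = (math.sqrt(5) - 1) / 2

    def f(t):
        return log_likelihood(math.exp(t), recoded, enrollment)

    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return math.exp((a + b) / 2)


# Hypothetical data: mostly zeros, a few grouped 3s, one exact count.
recoded = [0, 0, 0, 3, 3, 6]
enrollment = [400, 500, 300, 450, 600, 700]
rate = fit_rate(recoded, enrollment)
print("estimated calls per 1000 students:", 1000 * rate)
```

The zeros and the exact counts of 6 or more are observed without grouping, so they contribute ordinary Poisson terms; only the "3" schools contribute the interval term.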

Gene Maguin


 





Re: partially grouped counts

Art Kendall
Do you mean that the school district has raw data by student?

Is the data provided as part of discovery?

I know you may not be able to disclose some information, but is this an
individual based case or part of a class action?

Was any rationale given for coarsening the data?




Re: partially grouped counts

Rich Ulrich
In reply to this post by Maguin, Eugene

Okay, very skewed.  You say:  Out of each 100 schools, there may be 80 with 0, 19 with "3", and 1 with 6+ calls. Given the skew overall, we expect a huge skew from 1 to 5, too; so most "3"s are 1 or 2.  Unless the school sizes are hugely disproportionate, ignore "rates" and go with "events".
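
The expectation that most "3"s are really 1 or 2 can be checked roughly under a stated assumption. The sketch below assumes a plain Poisson count whose mean is chosen so that 80% of schools have zero calls; that distribution is an assumption made here for illustration, not something established in this thread, and the real data (with some schools at 6+) look more dispersed than a single Poisson. The function names are hypothetical.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of exactly k events with mean mu."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def conditional_within_group(mu, low=1, high=5):
    """P(Y = k | low <= Y <= high) for each k in the grouped range."""
    probs = {k: poisson_pmf(k, mu) for k in range(low, high + 1)}
    total = sum(probs.values())
    return {k: p / total for k, p in probs.items()}


# Choose the mean so that P(Y = 0) = exp(-mu) = 0.80.
mu = -math.log(0.80)
cond = conditional_within_group(mu)
for k, p in cond.items():
    print(k, round(p, 3))
print("share of grouped schools with 1 or 2 calls:", round(cond[1] + cond[2], 3))
```

A more dispersed distribution would put more weight on 3 to 5 than this does, so the output is best read as one end of a plausible range.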


How many schools are there? (How many with 6+ calls?)


I would consider (6+) to deserve a special discussion. These schools have "extra problems" ... I would expect some differences in faculty, students, operating rules, or external circumstances, compared to other schools.  On the other hand, if "6+" is just one disruptive student, that might be hard to pick out.  "People" are usually counted for this sort of thing, in addition to "events", but I think you don't have the luxury of knowing both.


Compare the three groups (0, "3", 6+) - Are they linear with regard to other characteristics? Look for linear trends also within the "more" group alone. (Assume Poisson and use the square-root transformation for a simple analysis of counts.)


Are trends within the 6+ group consistent with the 3-group trend?

--

Rich Ulrich



From: SPSSX(r) Discussion <[hidden email]> on behalf of Maguin, Eugene <[hidden email]>
Sent: Tuesday, August 14, 2018 2:32:14 PM
To: [hidden email]
Subject: Re: partially grouped counts
 
To Art. There's an intermediary and a court case in between the school district and person I am working with. My understanding is that the recode was done by the district and the person does not think we can go back to the district to get the unrecoded counts. Who knows, maybe the recode was a carefully calculated maneuver to obscure things.

Andy. Thank you for the additional names for this and the links.

All. The data were provided by school, which is identified, and in addition to the EMS call count, we have school level counts of student demographic categories. Perhaps it's obvious but the call counts are the sum of calls made for each student. Overall, the number of calls to a school is low, no more than 20 in a semester and 70% to 80% of schools have no calls. But, perhaps very conveniently, about 95% of schools who have at least one call are in that 1 to 5=3 category.

Before realizing that the data had been recoded, I had thought to analyze the data as a zero inflated poisson (ZIP) or negative binominal (ZINB).

I had kind of thought of multiple imputation but I thought the imputation would have to be done against either a ZIP or ZINB model rather than a linear regression model and the values have to be between 1 and 5. I've used imputation before but always for continuous data. I'm not sure but I'd guess that because such a large percentage of schools that made calls are in the 3 category, there is really no way to compute (guess) what the rate parameter would be.

Gene Maguin


 




-----Original Message-----
From: SPSSX(r) Discussion <[hidden email]> On Behalf Of David Marso
Sent: Tuesday, August 14, 2018 12:23 PM
To: [hidden email]
Subject: Re: partially grouped counts

Good to know Bruce! Thanks.
Fishy comment retracted...
----

Bruce Weaver wrote
> Good question, David.  Yes, there is (see link below), I just forgot
> to mention it.
>
> https://www.ibm.com/support/knowledgecenter/en/SSLVMB_25.0.0/statistics_reference_project_ddita/spss/mva/syn_multiple_imputation_constraints.html
>
> See MIN = NONE | num and MAX = NONE | num.  You can also specify RND=1
> to round imputed values to the nearest integer.
>
> Bruce
>
>
>
> David Marso wrote
>> That sounds a bit fishy. Is there any way to constrain the imputed
>> values to be between 1-5 ?  I am not that familiar with MI.

Re: partially grouped counts

Maguin, Eugene

Yes, very skewed. My percentages from yesterday were wrong. The percentage of non-3 category calls ranges from 0.4% to 1.6%. Let the nominal N be 1500. So, at the most, 25 non-3 category cases. The number of 3 category cases ranges from (round numbers) 300 to 500.

I agree that those non-3 category schools are special in some way (and we have some ideas about them).

I’m a bit confused about these comments:

>>Compare the three groups (1, "3", 6+) - Are they linear in regards to other characteristics?

I’m thinking of an ordinal model with a test for equality of slopes. Is this what you are thinking of?

>>Look for linear trends also within the "more" group alone. (Assume Poisson and use the square-root, for a simple analysis of counts.)

The “more” group is the 6+ group?

I rarely use Poisson, so I don’t understand the square root transform of counts. Second, are you saying that the 1 (actually 0) group and the 3 group are discarded?

>>Are trends within the 6+ group consistent with the 3-group trend?

I’m not understanding this, either. The 6+ group has a range of count values, and I could look at correlations between covariates and the 6+ group; however, everybody in the 3 group has the same value, 3.

A number of people have commented. Thank you.

Gene Maguin

 

 

From: Rich Ulrich <[hidden email]>
Sent: Tuesday, August 14, 2018 4:48 PM
To: [hidden email]; Maguin, Eugene <[hidden email]>
Subject: Re: partially grouped counts

 

Okay, very skewed.  You say:  Out of each 100 schools, there may be 80 with 0, 19 with "3", and 1 with 6+ calls. Given the skew overall, we expect a huge skew from 1 to 5, too; so most "3"s are 1 or 2.  Unless the school sizes are hugely disproportionate, ignore "rates" and go with "events".
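
As a rough check on the "most 3s are 1 or 2" reasoning, the sketch below works out the distribution of the true count within the 1-to-5 bin under one simple assumption: calls per school are plain Poisson with 80% zeros. That model is an assumption made for illustration only. It predicts essentially no schools with 6+ calls, whereas the thread reports about 1%, so the real distribution is heavier-tailed and the true share of 1s and 2s is probably somewhat lower than this calculation suggests.

```python
import math

# Assumption: plain Poisson counts with 80% of schools at zero.
p_zero = 0.80
lam = -math.log(p_zero)  # P(Y = 0) = exp(-lam)

# Unconditional probabilities of the counts hidden inside the "3" bin.
pmf = {k: math.exp(-lam) * lam ** k / math.factorial(k) for k in range(1, 6)}
mass_1_to_5 = sum(pmf.values())

# Distribution of the true count, given that the school was recorded as 3.
conditional = {k: p / mass_1_to_5 for k, p in pmf.items()}

for k, share in conditional.items():
    print(f"true count {k}: {share:.3f} of the schools recorded as 3")
```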

 

How many schools are there? (How many with 6+ calls?)

 

I would consider (6+) to deserve a special discussion. These schools have "extra problems" ... I would expect some differences in faculty, students, operating rules, or external circumstances, compared to other schools.  On the other hand, if "6+" is just one disruptive student, that might be hard to pick out.  "People" are usually counted for this sort of thing, in addition to "events", but I think you don't have the luxury of knowing both.

 

Compare the three groups (1, "3", 6+) - Are they linear in regards to other characteristics? 

Look for linear trends also within the "more" group alone. (Assume Poisson and use the square-root, for a simple analysis of counts.)

 

Are trends within the 6+ group consistent with the 3-group trend?

--

Rich Ulrich

 



Re: partially grouped counts

Rich Ulrich

I will insert comments in the text.



From: SPSSX(r) Discussion <[hidden email]> on behalf of Maguin, Eugene <[hidden email]>
Sent: Thursday, August 16, 2018 2:47 PM
To: [hidden email]
Subject: Re: partially grouped counts

gm>Yes, very skewed. My percentages from yesterday were wrong. The percentage of non-3 category calls ranges from 0.4% to 1.6%. Let the nominal N be 1500. So, at the most 25 non-3 category cases. The numbers of 3 category cases ranges from (round numbers) 300 to 500.

>I agree that those non-3 category schools are special in some way (and we have some ideas about them).

>I’m a bit confused about these comments:


me>>Compare the three groups (1, "3", 6+) - Are they linear in regards to other characteristics? 


>I’m thinking of an ordinal model with a test for equality of slopes. Is this what you are thinking of?


No - what would be "equal" to what? 

Three groups: 0 (sorry, I wrote 1), "3", and the rest (6 and more). < None, a few, many >.


These groups are ordered, logically.  However, in the real world, zeros often do not fall where you would expect. If the groups do /differ/, the question might be, Do they differ in a way that shows as a linear trend (on those other variables you are looking at)?
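
One way to make the "linear trend across the three groups" question concrete is sketched below, on the assumption that the groups are scored as equally spaced (none = 0, a few = 1, many = 2). The covariate values are made-up numbers standing in for something like a school-level demographic percentage, and `trend_contrasts` is a hypothetical helper. It only computes descriptive contrasts and runs no significance test.

```python
from statistics import mean

def trend_contrasts(none, few, many):
    """Linear and curvature contrasts of a covariate across three ordered groups."""
    m_none, m_few, m_many = mean(none), mean(few), mean(many)
    linear = m_many - m_none                   # contrast weights -1, 0, +1
    curvature = m_none - 2 * m_few + m_many    # contrast weights +1, -2, +1
    return linear, curvature

# Hypothetical covariate values for schools with 0 calls, "3", and 6+ calls.
pct_none = [10.0, 12.0, 11.0, 9.0]
pct_few = [14.0, 16.0, 15.0]
pct_many = [21.0, 19.0]

linear, curvature = trend_contrasts(pct_none, pct_few, pct_many)
print(f"linear contrast: {linear:.2f}, curvature contrast: {curvature:.2f}")
```

A curvature contrast near zero relative to the linear one would be consistent with the group means rising in a straight line; a large one would suggest that the zero group, for example, does not fall where the ordering implies.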


me>>Look for linear trends also within the "more" group alone. (Assume Poisson and use the square-root, for a simple analysis of counts.)


gm>The “more” group is the 6+ group?

>I rarely use Poisson so I don’t understand the square root transform of counts and, second, are you saying that the 1 (actually 0) group and the 3 group are discarded?

 

Yes, "more" was what I was calling the 6+ group in my first draft, and I didn't clean up all the mentions; and, yes, for a close analysis of the 6+ group, you might righteously examine the 25 cases alone.


The hypothesis would be, Is there some reason for 50 or 100 calls versus 6 or 10 calls?  Actually, on my further imagining of factors, I think < 50 vs 100 > might be almost an "equal interval" to < 6 vs 12 >.


Remember that we like equal intervals, because we like "homogeneous errors", which is what gives us valid tests. I'm guessing that the 25-case distribution will be longer-tailed than Poisson, and the natural log of the counts (rather than the square root) (label it "Intensity" of calls) is a more suitable criterion than the raw counts.
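
For readers unfamiliar with the square-root suggestion, the sketch below illustrates the standard reason for it: for a Poisson count the variance equals the mean, while the variance of the square root of the count stays close to 1/4 whatever the mean, which is the "homogeneous errors" property. It assumes exactly Poisson counts and uses arbitrary example means; `variance_of` is a hypothetical helper that sums over the probability mass function rather than simulating.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with mean lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def variance_of(transform, lam, k_max=400):
    """Variance of transform(Y) for Y ~ Poisson(lam), summed over 0..k_max."""
    weights = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(w * transform(k) for k, w in enumerate(weights))
    return sum(w * (transform(k) - mean) ** 2 for k, w in enumerate(weights))

# Example means chosen for illustration only.
for lam in (6, 12, 25, 50):
    raw = variance_of(float, lam)          # grows with the mean
    rooted = variance_of(math.sqrt, lam)   # stays near 0.25
    print(f"mean {lam:>2}: var(count) = {raw:6.2f}, var(sqrt(count)) = {rooted:.3f}")
```

If the counts are longer-tailed than Poisson, with variance growing faster than the mean, the square root under-corrects, which is the case where a log transform becomes the more natural choice.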


me>>Are trends within the 6+ group consistent with the 3-group trend?

gm>I’m not understanding this, either. The 6+ group has range of count values and I could look at correlations between covariates and the 6+ group; however, everybody in 3 group has the same value, 3.


I hope it is clearer now.  The "3-groups trend" is the trend (if any) across 0/"3"/6+. Is that trend similar to the trend measured (in 6+) for Intensity?


--

Rich Ulrich
