value counts via aggregate

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

value counts via aggregate

wsu_wright
When aggregating I commonly need to get counts for specific values
(e.g., how many cases had value 3 on varx).  Since aggregate only
appears to provide the percent of cases with a said value via the
PIN(varx,3,3), I often create binaries before the aggregate & ask for
the sums for each binary as in

recode varx3 (3=1)((else=0) into varx3.
agg
   /outfile=*
   /break....
   /varx3=sum(varx3).

Since I may have several I'll often create the binaries via a do repeat.

But I'm curious, if aggregate can calculate percents why can it not
calculate counts (unless I'm missing something)?

DW

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value counts via aggregate

ViAnn Beadle

Is There some reason why you don't want to obtain counts for variables within a break group using CROSSTABS or CTABLES?

On Sun, Jan 31, 2010 at 3:28 PM, David Wright <[hidden email]> wrote:
When aggregating I commonly need to get counts for specific values
(e.g., how many cases had value 3 on varx).  Since aggregate only
appears to provide the percent of cases with a said value via the
PIN(varx,3,3), I often create binaries before the aggregate & ask for
the sums for each binary as in

recode varx3 (3=1)((else=0) into varx3.
agg
 /outfile=*
 /break....
 /varx3=sum(varx3).

Since I may have several I'll often create the binaries via a do repeat.

But I'm curious, if aggregate can calculate percents why can it not
calculate counts (unless I'm missing something)?

DW

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Reply | Threaded
Open this post in threaded view
|

Re: value counts via aggregate

wsu_wright
In reply to this post by wsu_wright
ViAnn,

In my case, my interest to obtain counts would not be for reporting
(crosstabs, ctable etc) but to transform databases which may later be
used for reporting.  For example, I may have a database that contains
student level data for each course section that includes student data
like student ID, student class, gender, race etc along with course level
data such as time & day of course, credit hours etc (which is not
normalized but repeated for every student in the class).  I will
aggregate this file by course to create a course-level database to push
out to our oracle tables so that users can create tables based on course
dynamics.  Along the way I may want to include for every course record
the number of male, female, freshmen, sophomores, etc as separate
columns.

So returning to your response, it actually rasies the same question,
after all, the % of a value could also be derived from crosstabs &
ctables but spss does provide a method in the aggregate syntax to get
the percent:

PIN(varx,value,value)

So why can it not provide a count (which would intuitively seem to be
part of the percent calculation any way).

CNT(varx,value,value)

Again, the counts can be obtained but requires extra data management
before the aggregation.  It would be more efficient to have the
CNT(varx,value,value) function similar to the PIN(varx,value,value)
function.  I have requested this as an enhancement when beta testing
both version 17 & 18 but have not yet seen it in production unless I'm
missing something.


David,

On Sun, Jan 31, 2010 at 9:30 PM, ViAnn Beadle wrote:

> Is There some reason why you don't want to obtain counts for variables
> within a break group using CROSSTABS or CTABLES?
>
> On Sun, Jan 31, 2010 at 3:28 PM, David Wright  wrote:
>
>> When aggregating I commonly need to get counts for specific values
>> (e.g., how many cases had value 3 on varx).  Since aggregate only
>> appears to provide the percent of cases with a said value via the
>> PIN(varx,3,3), I often create binaries before the aggregate & ask for
>> the sums for each binary as in
>>
>> recode varx3 (3=1)((else=0) into varx3.
>> agg
>>  /outfile=*
>>  /break....
>>  /varx3=sum(varx3).
>>
>> Since I may have several I'll often create the binaries via a do
>> repeat.
>>
>> But I'm curious, if aggregate can calculate percents why can it not
>> calculate counts (unless I'm missing something)?
>>
>> DW
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>> [hidden email] (not to SPSSX-L), with no body text except
>> the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value counts via aggregate

ViAnn Beadle
There are a number of ways of pushing the results of things like crosstabs out to flat files or SPSS data files (OMS and Export pivot table). But in the meantime, include the n for the break group and then multiple your percent by value n to get the count.

-----Original Message-----
From: [hidden email] [mailto:[hidden email]]
Sent: Monday, February 01, 2010 4:49 AM
To: ViAnn Beadle; [hidden email]
Cc: [hidden email]
Subject: Re: value counts via aggregate

ViAnn,

In my case, my interest to obtain counts would not be for reporting (crosstabs, ctable etc) but to transform databases which may later be used for reporting.  For example, I may have a database that contains student level data for each course section that includes student data like student ID, student class, gender, race etc along with course level data such as time & day of course, credit hours etc (which is not normalized but repeated for every student in the class).  I will aggregate this file by course to create a course-level database to push out to our oracle tables so that users can create tables based on course dynamics.  Along the way I may want to include for every course record the number of male, female, freshmen, sophomores, etc as separate columns.

So returning to your response, it actually rasies the same question, after all, the % of a value could also be derived from crosstabs & ctables but spss does provide a method in the aggregate syntax to get the percent:

PIN(varx,value,value)

So why can it not provide a count (which would intuitively seem to be part of the percent calculation any way).

CNT(varx,value,value)

Again, the counts can be obtained but requires extra data management before the aggregation.  It would be more efficient to have the
CNT(varx,value,value) function similar to the PIN(varx,value,value) function.  I have requested this as an enhancement when beta testing both version 17 & 18 but have not yet seen it in production unless I'm missing something.


David,

On Sun, Jan 31, 2010 at 9:30 PM, ViAnn Beadle wrote:

> Is There some reason why you don't want to obtain counts for variables
> within a break group using CROSSTABS or CTABLES?
>
> On Sun, Jan 31, 2010 at 3:28 PM, David Wright  wrote:
>
>> When aggregating I commonly need to get counts for specific values
>> (e.g., how many cases had value 3 on varx).  Since aggregate only
>> appears to provide the percent of cases with a said value via the
>> PIN(varx,3,3), I often create binaries before the aggregate & ask for
>> the sums for each binary as in
>>
>> recode varx3 (3=1)((else=0) into varx3.
>> agg
>>  /outfile=*
>>  /break....
>>  /varx3=sum(varx3).
>>
>> Since I may have several I'll often create the binaries via a do
>> repeat.
>>
>> But I'm curious, if aggregate can calculate percents why can it not
>> calculate counts (unless I'm missing something)?
>>
>> DW
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>> [hidden email] (not to SPSSX-L), with no body text except
>> the command. To leave the list, send the command SIGNOFF SPSSX-L For
>> a list of commands to manage subscriptions, send the command INFO
>> REFCARD
>>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value counts via aggregate

David Marso
Administrator
In reply to this post by wsu_wright
Hi David,
Simply use your variable as an additional BREAK and specify N for the
AGGREGATE function.  That will give you counts.  If you have a bunch of
variables to do consider first using VARSTOCASES keeping track of the
variable index and use the variable index along with the resulting column as
breaks.HTH, David Marso
On Mon, 1 Feb 2010 06:48:50 -0500, David Wright <[hidden email]> wrote:

>ViAnn,
>
>In my case, my interest to obtain counts would not be for reporting
>(crosstabs, ctable etc) but to transform databases which may later be
>used for reporting.  For example, I may have a database that contains
>student level data for each course section that includes student data
>like student ID, student class, gender, race etc along with course level
>data such as time & day of course, credit hours etc (which is not
>normalized but repeated for every student in the class).  I will
>aggregate this file by course to create a course-level database to push
>out to our oracle tables so that users can create tables based on course
>dynamics.  Along the way I may want to include for every course record
>the number of male, female, freshmen, sophomores, etc as separate
>columns.
>
>So returning to your response, it actually rasies the same question,
>after all, the % of a value could also be derived from crosstabs &
>ctables but spss does provide a method in the aggregate syntax to get
>the percent:
>
>PIN(varx,value,value)
>
>So why can it not provide a count (which would intuitively seem to be
>part of the percent calculation any way).
>
>CNT(varx,value,value)
>
>Again, the counts can be obtained but requires extra data management
>before the aggregation.  It would be more efficient to have the
>CNT(varx,value,value) function similar to the PIN(varx,value,value)
>function.  I have requested this as an enhancement when beta testing
>both version 17 & 18 but have not yet seen it in production unless I'm
>missing something.
>
>
>David,
>
>On Sun, Jan 31, 2010 at 9:30 PM, ViAnn Beadle wrote:
>
>> Is There some reason why you don't want to obtain counts for variables
>> within a break group using CROSSTABS or CTABLES?
>>
>> On Sun, Jan 31, 2010 at 3:28 PM, David Wright  wrote:
>>
>>> When aggregating I commonly need to get counts for specific values
>>> (e.g., how many cases had value 3 on varx).  Since aggregate only
>>> appears to provide the percent of cases with a said value via the
>>> PIN(varx,3,3), I often create binaries before the aggregate & ask for
>>> the sums for each binary as in
>>>
>>> recode varx3 (3=1)((else=0) into varx3.
>>> agg
>>>  /outfile=*
>>>  /break....
>>>  /varx3=sum(varx3).
>>>
>>> Since I may have several I'll often create the binaries via a do
>>> repeat.
>>>
>>> But I'm curious, if aggregate can calculate percents why can it not
>>> calculate counts (unless I'm missing something)?
>>>
>>> DW
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>> [hidden email] (not to SPSSX-L), with no body text except
>>> the
>>> command. To leave the list, send the command
>>> SIGNOFF SPSSX-L
>>> For a list of commands to manage subscriptions, send the command
>>> INFO REFCARD
>>>
>
>=====================
>To manage your subscription to SPSSX-L, send a message to
>[hidden email] (not to SPSSX-L), with no body text except the
>command. To leave the list, send the command
>SIGNOFF SPSSX-L
>For a list of commands to manage subscriptions, send the command
>INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: value counts via aggregate

Richard Ristow
At 04:31 PM 2/1/2010, David Marso wrote:

>Simply use your variable as an additional BREAK and specify N for
>the AGGREGATE function.  If you have a bunch of variables to do
>consider first using VARSTOCASES keeping track of the variable index
>and use the variable index along with the resulting column as breaks.

Yes; very nice indeed. And if you want counts for all values as
different variables in the same record (what you get if you create
indicator variables for each value and SUM in AGGREGATE), use
CASESTOVARS afterward.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD