SPSSX Discussion

summarize


Doris-18

summarize

Hi list,

I'm working on a database of studies about predictors, compiled for a
literature review. The data contain multiple records per study, so that a
given variable, e.g. stress, occurs several times within the same study.
For example:

Stud_ID predictor_ID predictor_name
1 1 stress
1 2 stress
.
.
.
25 37 age
25 38 age
25 39 age

This is because a given predictor can have several categories. But these
categories aren't the same across studies, so I focused on the variable
itself, which raises the following question: how can I summarise the
multiple records so that a given predictor such as age appears only once
per study, i.e. so that in the example above study 25 counts the predictor
"age" only once?
Thanks for your help, Doris

Albert-Jan Roskam

Re: summarize

Hi Doris,

How about:
* One output record per predictor name, pooled across all studies.
AGGREGATE
  /OUTFILE = *
  /BREAK = predictor_name
  /count = N.

Is this what you mean?

Cheers!
Albert-Jan


statisticsdoc

Re: summarize

In reply to this post by Doris-18

Doris,

A variation on Albert-Jan's suggestion would be:

* One output record per combination of study and predictor name.
AGGREGATE
  /OUTFILE = *
  /BREAK = Stud_ID predictor_name
  /count = N.

This would give you one record per predictor per study, with the new
variable count holding the number of original records in each group. I am
assuming that the file is already sorted on the break variables Stud_ID and
predictor_name.
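
To make the effect concrete, here is a minimal sketch using only the rows shown in the question (the elided rows are left out, so the data are illustrative, not the real file). The variable names n_records and n_studies are hypothetical and can be replaced with any valid names. The first AGGREGATE reduces the file to one record per study and predictor; the second, which is an assumption about what the count is ultimately needed for, tallies in how many studies each predictor appears.

```spss
* Illustrative sample data taken from the example rows in the question.
DATA LIST LIST / Stud_ID (F4.0) predictor_ID (F4.0) predictor_name (A20).
BEGIN DATA
1 1 stress
1 2 stress
25 37 age
25 38 age
25 39 age
END DATA.

* Step 1: one record per study and predictor.
* n_records is the number of original rows in each group.
AGGREGATE
  /OUTFILE = *
  /BREAK = Stud_ID predictor_name
  /n_records = N.
LIST.

* Step 2 (optional): number of studies in which each predictor occurs.
AGGREGATE
  /OUTFILE = *
  /BREAK = predictor_name
  /n_studies = N.
LIST.
```

After step 1 the sample holds two records: study 1 with stress (n_records = 2) and study 25 with age (n_records = 3). After step 2 each predictor has n_studies = 1, because every study is now counted once per predictor no matter how many categories it reported.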

HTH,

Stephen Brand

For personalized and professional consultation in statistics and research
design, visit
www.statisticsdoc.com

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]]On Behalf Of
Doris Gerstner
Sent: Tuesday, February 06, 2007 4:40 AM
To: [hidden email]
Subject: summarize


Richard Ristow

Re: summarize

At 09:54 AM 2/6/2007, Statisticsdoc wrote:

>A variation on Albert-Jan's suggestion would be:
>
>Aggregate
> / outfile = *
> / break = Stud_ID predictor_name
> / count = n.
>
>I am assuming that the file is already sorted on the break variables
>Stud_ID and predictor_name.

That last isn't necessary. It surprised the dickens out of me when I
learned this, but AGGREGATE now builds its tables in memory, unless you
use PRESORTED. (And you shouldn't do that unless there are a great many
break categories - hundreds of thousands.)
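
As a sketch of the PRESORTED variant described above (the sample rows and the name n_records are illustrative assumptions): with PRESORTED, AGGREGATE relies on the cases already being grouped by the break variables, so the file has to be sorted first. Without PRESORTED, the SORT CASES line can simply be dropped.

```spss
* Illustrative rows, deliberately out of order.
DATA LIST LIST / Stud_ID (F4.0) predictor_ID (F4.0) predictor_name (A20).
BEGIN DATA
25 37 age
1 1 stress
25 38 age
1 2 stress
25 39 age
END DATA.

* PRESORTED assumes the cases are already grouped by the break variables.
SORT CASES BY Stud_ID predictor_name.
AGGREGATE
  /OUTFILE = *
  /PRESORTED
  /BREAK = Stud_ID predictor_name
  /n_records = N.
LIST.
```

If PRESORTED is used on a file that is not actually sorted, AGGREGATE is expected to start a new output record each time the break values change, so the same study and predictor could appear more than once. For a file of this size the default, without PRESORTED and without a prior sort, is the simpler choice.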

statisticsdoc

Re: summarize

Richard,

Thanks for the information - that's not only a surprise, but a great relief!

Stephen Brand


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]]On Behalf Of
Richard Ristow
Sent: Tuesday, February 06, 2007 11:28 AM
To: [hidden email]
Subject: Re: summarize
