SPSSX Discussion

Aggr for reliability dataset

uri1616

Aggr for reliability dataset

Hi everyone,

I have a dataset, "oldfile.sav", that I generated using "update" to compare two other datasets, one entered by Uri and one by Yael. The first var, "SY", is the raw number from the source datasets.
In this dataset, each case that matches between Uri's dataset and Yael's dataset is displayed and marked by the value '1' for "flag".
Cases from the source datasets that weren't identical appear one after the other (e.g. sy=32): the first comes from Uri's file and the second comes from Yael's. One of them gets '1' for "flag" and the other gets '0'. Both have the same raw number ("SY").

This is what it looks like:

sy flag var1 var2 var3 var4 var5 var6
27 1 1 0 0 0 0 0
32 0 3 0 1 0 0 0
32 1 2 0 1 0 1 0
33 1 1 0 0 0 0 0
34 1 1 0 0 0 0 0
35 1 1 0 0 0 0 0
36 1 1 0 0 0 1 0
36 0 0 0 0 0 1 0
37 1 1 0 0 0 0 0

Now I want to get a new file, "sum.sav", in which each var (besides flag and sy) is aggregated, but I want three separate aggregations.
The first case of the new file should hold the aggregation of only the values that Uri and Yael entered identically.
The second should hold Uri's values that differ from Yael's.
The third should hold Yael's values that differ from Uri's.

Notice that in the non-matching cases, some of the values **are in fact identical** and therefore should be aggregated in
the first row (in the example, var3 was aggregated to '1' in the "match-agr" row, because both Uri and Yael entered '1' for it in case 32).
For example, in case 32, while the values of var1 are different, the values of var3 are identical.

Eventually, the new file "sum.sav" for the dataset above should look like this:

                 var1 var2 var3 var4 var5 var6
(match-agr)         5    0    1    0    1    0
(only-uri-agr)      4    0    0    0    0    0
(only-yael-agr)     2    0    0    0    1    0
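
One reading of the rules above that reproduces this table is sketched below in Python (not SPSS syntax). It rests on assumptions that the post does not state outright: "aggregated" means summed; rows are sorted by sy; in a non-matching pair the first row is Uri's and the second is Yael's; and a value both raters agree on is added to the match row once, not twice. The function name `three_way_sums` is made up for illustration.

```python
from itertools import groupby

# Rows of the merged file: (sy, flag, var1..var6), copied from the post.
ROWS = [
    (27, 1, 1, 0, 0, 0, 0, 0),
    (32, 0, 3, 0, 1, 0, 0, 0),
    (32, 1, 2, 0, 1, 0, 1, 0),
    (33, 1, 1, 0, 0, 0, 0, 0),
    (34, 1, 1, 0, 0, 0, 0, 0),
    (35, 1, 1, 0, 0, 0, 0, 0),
    (36, 1, 1, 0, 0, 0, 1, 0),
    (36, 0, 0, 0, 0, 0, 1, 0),
    (37, 1, 1, 0, 0, 0, 0, 0),
]


def three_way_sums(rows):
    """Return (match, only_uri, only_yael) sums, one entry per variable."""
    n_vars = len(rows[0]) - 2          # everything after sy and flag
    match = [0] * n_vars
    only_uri = [0] * n_vars
    only_yael = [0] * n_vars

    # Rows sharing the same sy are adjacent, so groupby collects each case.
    for _, group in groupby(rows, key=lambda r: r[0]):
        records = [r[2:] for r in group]
        if len(records) == 1:
            # A single row means both raters entered identical values.
            for i, value in enumerate(records[0]):
                match[i] += value
        else:
            # Assumption: first row is Uri's, second row is Yael's.
            uri, yael = records
            for i, (u, y) in enumerate(zip(uri, yael)):
                if u == y:
                    match[i] += u      # agreed value, counted once
                else:
                    only_uri[i] += u
                    only_yael[i] += y
    return match, only_uri, only_yael


match, only_uri, only_yael = three_way_sums(ROWS)
print("match    ", match)
print("only-uri ", only_uri)
print("only-yael", only_yael)
```

Under these assumptions, var1 in the match row is the sum over the five fully matching cases (27, 33, 34, 35, 37), while cases 32 and 36 contribute 3 + 1 to the only-uri row and 2 + 0 to the only-yael row.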

Thanks for your help,
Uri.

Richard Ristow

Re: Aggr for reliability dataset

At 05:21 PM 2/22/2014, uri1616 wrote:

>I have this dataset I generated using "update" to compare two other
>datasets formed by Uri and Yael. The first var "SY" is the raw
>number from the source datasets.

By "raw number", do you mean it's the record identifying key? That
appears to be how you're using it.

>In this dataset each matching case from Uri's dataset and Yael dataset is
>displayed and marked by value '1' for "flag" . Cases from the source
>datasets that weren't identical appear one after the other (ex. sy=32):

Data, reformatted to shorten lines, for readability:
sy flag var1 var2 var3 var4 var5 var6
27 1 1 0 0 0 0 0
32 0 3 0 1 0 0 0
32 1 2 0 1 0 1 0
33 1 1 0 0 0 0 0
34 1 1 0 0 0 0 0
35 1 1 0 0 0 0 0
36 1 1 0 0 0 1 0
36 0 0 0 0 0 1 0
37 1 1 0 0 0 0 0

>Now I want to get a new file "sum.sav" in which each var (besides
>flag and sy) is aggregated, But I want three separate aggregations.

In other words, you want a final aggregated file containing only three records?

This is the illustration you give:
            var1 var2 var3 var4 var5 var6
(match)        5    0    1    0    1    0
(only-uri)     4    0    0    0    0    0
(only-yael)    2    0    0    0    1    0

You write a lot about "aggregating" the values. I'm not getting your
meaning. To start with, can you tell us how those three different
values for var1 were calculated?

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

bdates

Re: Aggr for reliability dataset

Uri,

Are you trying to examine inter-rater reliability? If so, there is syntax out there for 'aggregating', or counting agreements and disagreements.
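
As a rough illustration of what counting agreements and disagreements could look like for the posted data, here is a minimal Python sketch. It is not the syntax referred to above, and it is not SPSS code. It assumes that a case appearing once (flag = 1 only) counts as an agreement on every variable, and that a case appearing twice holds one row per rater. The name `agreement_counts` is hypothetical.

```python
from itertools import groupby

# Rows of the merged file: (sy, flag, var1..var6), copied from the post.
ROWS = [
    (27, 1, 1, 0, 0, 0, 0, 0),
    (32, 0, 3, 0, 1, 0, 0, 0),
    (32, 1, 2, 0, 1, 0, 1, 0),
    (33, 1, 1, 0, 0, 0, 0, 0),
    (34, 1, 1, 0, 0, 0, 0, 0),
    (35, 1, 1, 0, 0, 0, 0, 0),
    (36, 1, 1, 0, 0, 0, 1, 0),
    (36, 0, 0, 0, 0, 0, 1, 0),
    (37, 1, 1, 0, 0, 0, 0, 0),
]


def agreement_counts(rows):
    """Return a list of (agreements, disagreements), one pair per variable."""
    n_vars = len(rows[0]) - 2          # everything after sy and flag
    agree = [0] * n_vars
    disagree = [0] * n_vars

    for _, group in groupby(rows, key=lambda r: r[0]):
        records = [r[2:] for r in group]
        if len(records) == 1:
            # Assumption: a single row means the raters agreed everywhere.
            for i in range(n_vars):
                agree[i] += 1
        else:
            first, second = records    # one row per rater
            for i, (a, b) in enumerate(zip(first, second)):
                if a == b:
                    agree[i] += 1
                else:
                    disagree[i] += 1
    return list(zip(agree, disagree))


for name, (a, d) in zip(
    ["var1", "var2", "var3", "var4", "var5", "var6"], agreement_counts(ROWS)
):
    print(f"{name}: {a} agreements, {d} disagreements")
```

Dividing the agreements by the number of cases gives simple percent agreement per variable; it does not correct for chance agreement.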

Brian

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Richard Ristow
Sent: Monday, February 24, 2014 2:56 PM
To: [hidden email]
Subject: Re: Aggr for reliability dataset

