SPSSX Discussion

Restructuring Cases into Variables


dadrivr

Restructuring Cases into Variables

Hi all. Really need help here. My data set is structured like this:

coder rating
1 ##
1 ##
1 ##
2 ##
2 ##
2 ##

I need the data file to be structured like this:

coder 1 coder 2
## ##
## ##
## ##

I have tried restructuring using Cases to Variables, but I get the following error:
"The INDEX values for case 2 have occurred before in the cases with the same ID values."

I assume the error occurs because some of the lines are duplicates (each coder gives many ratings to each of a large number of people). Here is the syntax I am using:

SORT CASES BY coder rating.
CASESTOVARS
/ID=rating
/INDEX=coder
/GROUPBY=INDEX.

I would really appreciate your help. Thanks so much!

Maguin, Eugene

Re: Restructuring Cases into Variables

Please resubmit a true representation of the relevant structure of your
data. The statement quoted below indicates that you have omitted variables
that are relevant to your question.

>>I assume the error is because some of the lines are duplicates (each coder gives many ratings to each of a large number of people).

coder rating newv
1 ## 1
1 ## 2
1 ## 3
2 ## 1
2 ## 2
2 ## 3

Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

dadrivr

Re: Restructuring Cases into Variables

For data in the following form:

ID coder rating
1 1 0
1 1 0
1 2 1
1 2 0
2 1 1
2 1 1
2 2 0
2 2 1

Here 'ID' is the ID number of the subject, 'coder' is the coder number (either 1 or 2), and 'rating' is the coder's rating of the subject (either 1 or 0). I would like to compute inter-coder reliability, so the coders' ratings need to be in separate columns. I want the data to be restructured like this:

ID rating.1 rating.2
1 0 1
1 0 0
2 1 0
2 1 1

Here 'rating.1' holds coder 1's ratings and 'rating.2' holds coder 2's ratings.

You can use the following syntax to create the data and variables from the original example:
NEW FILE.
DATA LIST FREE /ID coder rating.
FORMATS ID coder rating (F1.0).
BEGIN DATA
1 1 0
1 1 0
1 2 1
1 2 0
2 1 1
2 1 1
2 2 0
2 2 1
END DATA.
EXECUTE.

When I attempt to restructure with the following syntax:
SORT CASES BY ID coder rating.
CASESTOVARS
/ID=rating
/INDEX=coder
/GROUPBY=INDEX.

Because some of the lines are duplicates, I get the following error:
"The INDEX values for case 2 have occurred before in the cases with the same ID values."
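
One way to confirm that repeated identifier-by-index combinations are behind this message is to count the rows in each cell before restructuring. The sketch below assumes the break variables are ID and coder, which is an assumption about the intended identifier (the attempt above names rating on /ID); n_per_cell is a hypothetical variable name.

```spss
* Sketch: count rows per ID-by-coder cell (n_per_cell is a hypothetical name).
* A count above 1 means an INDEX value repeats within one ID.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=ID coder
  /n_per_cell=N.
FREQUENCIES VARIABLES=n_per_cell.
```

BREAK should list whatever variables are named on /ID and /INDEX in the CASESTOVARS command being checked.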

So I created a count variable to make each case unique:
compute idcnt=1.
if (coder=lag(coder)) idcnt=lag(idcnt)+1.
EXECUTE.
FORMATS idcnt (f1.0).
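
The counter above restarts only when coder changes. A slightly more defensive variant, sketched here under the assumption that the file is already sorted by ID and coder and that the ratings within each coder are in the order in which they should be paired, also restarts it when ID changes:

```spss
* Sketch: sequence number within each ID-by-coder group.
* Assumes the file is already sorted by ID and coder.
COMPUTE idcnt = 1.
IF (ID = LAG(ID) AND coder = LAG(coder)) idcnt = LAG(idcnt) + 1.
EXECUTE.
FORMATS idcnt (F1.0).
```

With the example data both versions give the same idcnt values. They differ only if one subject's last coder equals the next subject's first coder, for instance when a subject was rated by a single coder.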

Now I try to restructure including the count variable:
SORT CASES BY ID coder idcnt.
CASESTOVARS
/ID = ID idcnt
/INDEX = coder
/autofix=no
/GROUPBY = VARIABLE .

And I get this:
ID idcnt rating.1 rating.2
1 1 0 .
1 2 0 .
1 1 . 1
1 2 . 0
2 1 1 .
2 2 1 .
2 1 . 0
2 2 . 1

The coders' ratings are now in the correct columns (i.e., separate columns), but they are not in the same rows. How can I fix this? Any help would be greatly appreciated. Thanks so much!


Marks, Jim

Re: Restructuring Cases into Variables

I ran your syntax and got an error message about the file being out of
order.

I changed the sort so that the cases are ordered by the ID variables (ID,
then idcnt) before coder, and it gave the desired result:

NEW FILE.
DATA LIST FREE /ID coder rating.
* No FORMATS command here, so the variables keep the default F8.2 format.
BEGIN DATA
1 1 0
1 1 0
1 2 1
1 2 0
2 1 1
2 1 1
2 2 0
2 2 0
END DATA.
EXECUTE.

compute idcnt=1.
if (coder=lag(coder)) idcnt=lag(idcnt)+1.
EXECUTE.

* note change in sort order.
SORT CASES BY ID idcnt coder.
CASESTOVARS
/ID = ID idcnt
/INDEX = coder
/autofix=no
/GROUPBY = VARIABLE .

LIST.

Id idcnt rating.1.00 rating.2.00
1.00 1.00 .00 1.00
1.00 2.00 .00 .00
2.00 1.00 1.00 .00
2.00 2.00 1.00 .00

Jim Marks
Director, Market Research
x1616
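
Since the stated goal is inter-coder reliability, one possible next step once the two coders' ratings sit side by side is Cohen's kappa from CROSSTABS. This is only a sketch: it assumes the restructured variables are named rating.1 and rating.2 (with default F8.2 formats they would be rating.1.00 and rating.2.00, as in the listing above), and kappa is just one of several agreement measures that might suit these data.

```spss
* Sketch: agreement between the two coders on the restructured file.
* Assumes the variable names rating.1 and rating.2.
CROSSTABS
  /TABLES=rating.1 BY rating.2
  /STATISTICS=KAPPA.
```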

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
dadrivr
Sent: Friday, January 15, 2010 11:56 AM
To: [hidden email]
Subject: Re: Restructuring Cases into Variables

--
View this message in context: http://old.nabble.com/Restructuring-Cases-into-Variables-tp27167594p27180995.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
