Restructuring Cases into Variables


Restructuring Cases into Variables

dadrivr
Hi all.  Really need help here.  My data set is structured like this:

coder   rating
1           ##
1           ##
1           ##
2           ##
2           ##
2           ##

I need the data file to be structured like this:

coder 1   coder 2
##           ##
##           ##
##           ##

I have tried restructuring using Cases to Variables, but I get the following error:
"The INDEX values for case 2 have occurred before in the cases with the same ID values."

I assume the error occurs because some of the lines are duplicates (each coder gives many ratings to each of a large number of people).  Here is the syntax I am using:

SORT CASES BY coder rating.
CASESTOVARS
  /ID=rating
  /INDEX=coder
  /GROUPBY=INDEX.


I would really appreciate your help.  Thanks so much!

Re: Restructuring Cases into Variables

Maguin, Eugene
Please resubmit a true representation of the relevant structure of your
data. This statement indicates that you have omitted variables that are
relevant to your question.

>>I assume the error is because some of the lines are duplicates (both coder
gives many ratings to each of a large number of people).


coder   rating   newv
1           ##    1
1           ##    2
1           ##    3
2           ##    1
2           ##    2
2           ##    3


Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Restructuring Cases into Variables

dadrivr
For data in the following form:

ID  coder  rating
1    1          0
1    1          0
1    2          1
1    2          0
2    1          1
2    1          1
2    2          0
2    2          1

Here 'ID' is the subject's ID number, 'coder' is the coder number (either 1 or 2), and 'rating' is the coder's rating of the subject (either 1 or 0).  I would like to compute inter-coder reliability, so each coder's ratings need to be in a separate column.  I want the data to be restructured like this:

ID  rating.1  rating.2
1    0             1
1    0             0
2    1             0
2    1             1

Here 'rating.1' holds coder 1's ratings and 'rating.2' holds coder 2's ratings.

You can use the following syntax to create the data and variables from the original example:
NEW FILE.
DATA LIST FREE /ID coder rating.
FORMATS ID coder rating (F1.0).
BEGIN DATA
  1 1 0
  1 1 0
  1 2 1
  1 2 0
  2 1 1
  2 1 1
  2 2 0
  2 2 1
END DATA.
EXECUTE.


I first attempted to restructure with the following syntax:
SORT CASES BY ID coder rating.
CASESTOVARS
  /ID=rating
  /INDEX=coder
  /GROUPBY=INDEX.


Because some of the lines are duplicates, I get the following error:
"The INDEX values for case 2 have occurred before in the cases with the same ID values."

So I created a count variable to make each case unique:
compute idcnt=1.
if (coder=lag(coder)) idcnt=lag(idcnt)+1.
EXECUTE.
FORMATS idcnt (f1.0).


Now I try to restructure including the count variable:
SORT CASES BY ID coder idcnt.
CASESTOVARS
 /ID = ID idcnt
 /INDEX = coder
 /autofix=no
 /GROUPBY = VARIABLE .


And I get this:
ID  idcnt  rating.1  rating.2
1    1         0             .
1    2         0             .
1    1         .              1
1    2         .              0
2    1         1             .
2    2         1             .
2    1         .              0
2    2         .              1

The coders' ratings are now in the correct columns (i.e., separate columns), but they are not in the same rows.  How can I fix this?  Any help would be greatly appreciated.  Thanks so much!



Re: Restructuring Cases into Variables

Marks, Jim
I ran your syntax and got an error message about the file being out of order.

I changed the sort so that the file is ordered by the ID variables (ID, idcnt) before the index variable (coder), and it gave the desired result:

NEW FILE.
DATA LIST FREE /ID coder rating.
* FORMATS omitted, so the variables keep the default F8.2 format.
BEGIN DATA
  1 1 0
  1 1 0
  1 2 1
  1 2 0
  2 1 1
  2 1 1
  2 2 0
  2 2 0
END DATA.
EXECUTE.

compute idcnt=1.
if (coder=lag(coder)) idcnt=lag(idcnt)+1.
EXECUTE.

* note change in sort order.
SORT CASES BY ID idcnt coder.
CASESTOVARS
 /ID = ID idcnt
 /INDEX = coder
 /autofix=no
 /GROUPBY = VARIABLE .

LIST.

ID      idcnt   rating.1.00   rating.2.00
1.00    1.00     .00          1.00
1.00    2.00     .00           .00
2.00    1.00    1.00           .00
2.00    2.00    1.00           .00
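
A possible refinement, offered as a sketch rather than a tested solution: the counter above restarts only when coder changes, so in a larger file it could keep counting across subjects if two adjacent subjects happen to begin and end with the same coder. The version below also checks ID. It assumes the variables are named ID, coder and rating as in the example, and that coder has an F1.0 format so that CASESTOVARS names the new variables rating.1 and rating.2. The final CROSSTABS step is one way to get kappa as the reliability measure; the thread does not say which statistic is wanted, so that choice is an assumption.

```spss
* Sketch only, assuming variables ID, coder and rating as in the example data.
* Sort first so that the cases for one subject and one coder are contiguous.
SORT CASES BY ID coder.

* Number the ratings within each ID-by-coder group.
* Checking ID as well as coder restarts the count for every subject.
COMPUTE idcnt = 1.
IF (ID = LAG(ID) AND coder = LAG(coder)) idcnt = LAG(idcnt) + 1.
EXECUTE.

* Sort by the ID variables and then the index variable before restructuring.
SORT CASES BY ID idcnt coder.
CASESTOVARS
  /ID = ID idcnt
  /INDEX = coder
  /GROUPBY = VARIABLE.

* Kappa for the two coders, as one possible reliability statistic.
* With the default F8.2 format the names would be rating.1.00 and rating.2.00.
CROSSTABS
  /TABLES = rating.1 BY rating.2
  /STATISTICS = KAPPA.
```

If a coder gave a different number of ratings to a subject than the other coder did, the unmatched rows would have a missing value in one of the rating columns and would drop out of the crosstabulation.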



Jim Marks
Director, Market Research
x1616


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
dadrivr
Sent: Friday, January 15, 2010 11:56 AM
To: [hidden email]
Subject: Re: Restructuring Cases into Variables
