CASESTOVARS problem

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

CASESTOVARS problem

Bruce Weaver
Administrator
I received the following question off-list, but thought I would respond here so that others can join in, and perhaps even benefit.  ;-)

Here's the problem:

I am having such a hard time restructuring a fairly large data set.

What I have:
ID Service
1234 Tutoring
1234 Tutoring
1234 Tutoring
1234 Writing
1234 Coaching
5678 Tutoring
5678 Coaching
5678 Coaching
9012 Tutoring
9012 Writing
9012 Coaching

What I need:
ID Tutoring Writing Coaching
1234 3 1 1
5678 1 2
9012 1 1 1

--- End of problem statement ---

Here is one approach that works for the sample data at least.

NEW FILE.
DATASET CLOSE all.

DATA LIST list / ID (f4.0) Service(A10).
begin data
1234 Tutoring
1234 Tutoring
1234 Tutoring
1234 Writing
1234 Coaching
5678 Tutoring
5678 Coaching
5678 Coaching
9012 Tutoring
9012 Writing
9012 Coaching
end data.
DATASET NAME raw.

* Use AGGREGATE to get to one row per ID.

DATASET DECLARE Agg.
AGGREGATE
  /OUTFILE='Agg'
  /BREAK=ID Service
  /N_BREAK=N.
DATASET ACTIVATE Agg.

* Use Service as the index variable for CASESTOVARS.

SORT CASES BY ID Service .
CASESTOVARS
  /ID=ID
  /INDEX=Service
  /GROUPBY=VARIABLE.

RESULT:

  ID Coaching Tutoring Writing
 
1234        1        3       1
5678        2        1       .
9012        1        1       1
 
Number of cases read:  3    Number of cases listed:  3

If the order of the variables has to be as shown above (Tutoring Writing Coaching), use /KEEP with SAVE OUTFILE (or MATCH FILES or ADD FILES) to reorder the variables.

HTH.
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: CASESTOVARS problem

David Marso
Administrator
I can't really think of anything ultra clever to add aside from thinning the code a little.
1. AGGREGATE to the active file.
2. remove the SORT (it will already be in order following the AGGREGATE).
3. remove the unnecessary GROUPBY.
4. compress the code for the sake of the O2 emitting green guys.
In more complex/less obvious codes I will maintain multiline command format,
but this is obvious at a glance.
In fact I WOULD re-comment it as follows.
---
/ *AGGREGATE the file by ID and Service.  INDEX the wide file variables by Service  */.

AGGREGATE  /OUTFILE=* /BREAK=ID Service /N_BREAK=N.
CASESTOVARS  /ID=ID  /INDEX=Service .

Bruce Weaver wrote
I received the following question off-list, but thought I would respond here so that others can join in, and perhaps even benefit.  ;-)

Here's the problem:

I am having such a hard time restructuring a fairly large data set.

What I have:
ID Service
1234 Tutoring
1234 Tutoring
1234 Tutoring
1234 Writing
1234 Coaching
5678 Tutoring
5678 Coaching
5678 Coaching
9012 Tutoring
9012 Writing
9012 Coaching

What I need:
ID Tutoring Writing Coaching
1234 3 1 1
5678 1 2
9012 1 1 1

--- End of problem statement ---

Here is one approach that works for the sample data at least.

NEW FILE.
DATASET CLOSE all.

DATA LIST list / ID (f4.0) Service(A10).
begin data
1234 Tutoring
1234 Tutoring
1234 Tutoring
1234 Writing
1234 Coaching
5678 Tutoring
5678 Coaching
5678 Coaching
9012 Tutoring
9012 Writing
9012 Coaching
end data.
DATASET NAME raw.

* Use AGGREGATE to get to one row per ID.

DATASET DECLARE Agg.
AGGREGATE
  /OUTFILE='Agg'
  /BREAK=ID Service
  /N_BREAK=N.
DATASET ACTIVATE Agg.

* Use Service as the index variable for CASESTOVARS.

SORT CASES BY ID Service .
CASESTOVARS
  /ID=ID
  /INDEX=Service
  /GROUPBY=VARIABLE.

RESULT:

  ID Coaching Tutoring Writing
 
1234        1        3       1
5678        2        1       .
9012        1        1       1
 
Number of cases read:  3    Number of cases listed:  3

If the order of the variables has to be as shown above (Tutoring Writing Coaching), use /KEEP with SAVE OUTFILE (or MATCH FILES or ADD FILES) to reorder the variables.

HTH.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"