Multiple Rows Per Subject: How to Create New Data File Based on Highest Course Taken

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Multiple Rows Per Subject: How to Create New Data File Based on Highest Course Taken

elle lists

Hi all,

 

This newbie would appreciate your advice on how to proceed with this puzzle. I received a data file which has multiple rows (ranging from 2 to 5) per subject concerning technology class-taking.  The variables include the course level, year taken, course number, course title, course grade (see below):

 

ORIGINAL DATA LAYOUT:
Subject ID CourseLevel CourseYear Course CourseTitle CourseGrade
39#_3#EGQB0OGQ 11 2005 EXC1210 Excel I: Intermediate Concepts A
39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA Programming B
39#_3#EGQB0OGQ 9 2003 WD1010 Word: Intro to Macros A
39#_3#EGQB0OGQ 12 2003 ACC1320 Access: Programming B
#_9P8SGGHOOAA09 12 2003 ACC1320 Access: Programming D
#_H22HEJZ1P7ACM 11 2003 EXC1210 Excel I: Intermediate Concepts B
#_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA Programming C
#_H22HEJZ1P7ACM 10 2004 ACC1320 Access: Programming A
 
  
I’d like to restructure the data to create a new data set reflecting the highest course (determined by the course’s last 4-digit number) taken by each subject (see below).
DESIRED RESULT: 
Subject ID CourseLevel CourseYear Course CourseTitle CourseGrade
39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA Programming B
#_9P8SGGHOOAA09 12 2003 ACC1320 Access: Programming D
#_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA Programming C
 
 
Many thanks for your help,

 

elle

Reply | Threaded
Open this post in threaded view
|

Re: Multiple Rows Per Subject: How to Create New Data File Based on Highest Course Taken

David Marso
Administrator
See SUBSTR and AGGREGATE (MAX) in the FM!
elle lists wrote
Hi all,



This newbie would appreciate your advice on how to proceed with this
puzzle. I received a data file which has multiple rows (ranging from 2 to
5) per subject concerning technology class-taking.  The variables include
the course level, year taken, course number, course title, course grade
(see below):


    *ORIGINAL DATA LAYOUT:*    Subject ID CourseLevel CourseYear Course
CourseTitle CourseGrade  39#_3#EGQB0OGQ 11 2005 EXC1210 Excel I:
Intermediate Concepts A  39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA
Programming B  39#_3#EGQB0OGQ 9 2003 WD1010 Word: Intro to Macros A
39#_3#EGQB0OGQ 12 2003 ACC1320 Access: Programming B  #_9P8SGGHOOAA09 12
2003 ACC1320 Access: Programming D  #_H22HEJZ1P7ACM 11 2003 EXC1210 Excel
I: Intermediate Concepts B  #_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA
Programming C  #_H22HEJZ1P7ACM 10 2004 ACC1320 Access: Programming A


I’d like to restructure the data to create a new data set reflecting the
highest course (determined by the course’s last 4-digit number) taken by
each subject (see below).
    *DESIRED RESULT: *    Subject ID CourseLevel CourseYear Course
CourseTitle CourseGrade  39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA
Programming B  #_9P8SGGHOOAA09 12 2003 ACC1320 Access: Programming D
#_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA Programming C


Many thanks for your help,



elle
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Multiple Rows Per Subject: How to Create New Data File Based on Highest Course Taken

David Marso
Administrator
Alternatively:
COMPUTE CrsHigh=NUMBER(SUBSTR(LPAD(RTRIM(Course),8),5,4),F4).
SORT CASES BY SubjID CrsHigh(D).
COMPUTE FILT$=($CASENUM=1 OR SubjID <> LAG(SUBjID)).
FILTER BY FILT$.
EXE.

David Marso wrote
See SUBSTR and AGGREGATE (MAX) in the FM!
elle lists wrote
Hi all,



This newbie would appreciate your advice on how to proceed with this
puzzle. I received a data file which has multiple rows (ranging from 2 to
5) per subject concerning technology class-taking.  The variables include
the course level, year taken, course number, course title, course grade
(see below):


    *ORIGINAL DATA LAYOUT:*    Subject ID CourseLevel CourseYear Course
CourseTitle CourseGrade  39#_3#EGQB0OGQ 11 2005 EXC1210 Excel I:
Intermediate Concepts A  39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA
Programming B  39#_3#EGQB0OGQ 9 2003 WD1010 Word: Intro to Macros A
39#_3#EGQB0OGQ 12 2003 ACC1320 Access: Programming B  #_9P8SGGHOOAA09 12
2003 ACC1320 Access: Programming D  #_H22HEJZ1P7ACM 11 2003 EXC1210 Excel
I: Intermediate Concepts B  #_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA
Programming C  #_H22HEJZ1P7ACM 10 2004 ACC1320 Access: Programming A


I’d like to restructure the data to create a new data set reflecting the
highest course (determined by the course’s last 4-digit number) taken by
each subject (see below).
    *DESIRED RESULT: *    Subject ID CourseLevel CourseYear Course
CourseTitle CourseGrade  39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA
Programming B  #_9P8SGGHOOAA09 12 2003 ACC1320 Access: Programming D
#_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA Programming C


Many thanks for your help,



elle
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Multiple Rows Per Subject: How to Create New Data File Based on Highest Course Taken

elle lists
In reply to this post by elle lists
Many thanks, Melissa Ives and David Marso for the help. With David referencing the manual and Melissa's decription (below) I developed a clunky but working syntax
that : 1) extracts the 4-digits in the course number, 2) aggregates the cases to identify the course number with the highest value, and 3) saves the cases to a new data file. 
 
My awkward concoction is a far cry from David Marso's slick and elegant syntax in his second posting (thank you, David) but a tentative first step for this newbie.
 
Thanks again, Melissa and David.
 
elle
 
STRING Coursenum(A9).   /* This segment Extracts course number's 4-digits
COMPUTE Coursenum = SUBSTR(Course,4,4).  
 
SORT CASES by SubjectID. /*This segment collapses the cases by highest course number
AGGREGATE  
 / outfile=* MODE=ADDVARIABLES
 /PRESORTED
 /BREAK = SubjectID
 /Coursenum_max=MAX(coursenum).
 
DATASET COPY  temp. /*This segment copies cases with highest course to new data file
DATASET ACTIVATE  temp.
FILTER OFF.
USE ALL.
SELECT IF (Coursenum_max = Coursenum).
EXECUTE.
DATASET ACTIVATE  DataSet1.
 
On Fri, Apr 27, 2012 at 10:12 AM, Melissa Ives <[hidden email]> wrote:

Elle,

 

As David notes, use SUBSTR to keep the last 4 characters of the Course, then use AGGREGATE to identify the course number with the highest value.

 

Compute CrseN=substr(Course,4,4).  /*captures 4 characters beginning at position 4.

 

If you add max crseN to the current data file (mode=addvariables  overwrite=yes). Then you can use

SELECT IF MaxCrseN=CrseN. 

To keep only the maximum row for the student.

 

Melissa

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of elle lists
Sent: Thursday, April 26, 2012 6:00 PM
To: [hidden email]
Subject: [SPSSX-L] Multiple Rows Per Subject: How to Create New Data File Based on Highest Course Taken

 

Hi all,

 

This newbie would appreciate your advice on how to proceed with this puzzle. I received a data file which has multiple rows (ranging from 2 to 5) per subject concerning technology class-taking.  The variables include the course level, year taken, course number, course title, course grade (see below):

 

ORIGINAL DATA LAYOUT:

Subject ID

CourseLevel

CourseYear

Course

CourseTitle

CourseGrade

39#_3#EGQB0OGQ

11

2005

EXC1210

Excel I: Intermediate Concepts

A

39#_3#EGQB0OGQ

11

2005

EXC1220

Excel II: VBA Programming

B

39#_3#EGQB0OGQ

9

2003

WD1010

Word: Intro to Macros

A

39#_3#EGQB0OGQ

12

2003

ACC1320

Access: Programming

B

#_9P8SGGHOOAA09

12

2003

ACC1320

Access: Programming

D

#_H22HEJZ1P7ACM

11

2003

EXC1210

Excel I: Intermediate Concepts

B

#_H22HEJZ1P7ACM

11

2003

EXC1220

Excel II: VBA Programming

C

#_H22HEJZ1P7ACM

10

2004

ACC1320

Access: Programming

A

 

  

I’d like to restructure the data to create a new data set reflecting the highest course (determined by the course’s last 4-digit number) taken by each subject (see below).

DESIRED RESULT: 

Subject ID

CourseLevel

CourseYear

Course

CourseTitle

CourseGrade

39#_3#EGQB0OGQ

11

2005

EXC1220

Excel II: VBA Programming

B

#_9P8SGGHOOAA09

12

2003

ACC1320

Access: Programming

D

#_H22HEJZ1P7ACM

11

2003

EXC1220

Excel II: VBA Programming

C

 

 

Many thanks for your help,

 

elle



PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND
CONFIDENTIAL information and is intended only for the use of the
addressee. If you are not the designated recipient, or an employee
or agent authorized to deliver such transmittals to the designated
recipient, you are hereby notified that any dissemination,
copying or publication of this transmittal is strictly prohibited. If
you have received this transmittal in error, please notify us
immediately by replying to the sender and delete this copy from your
system. You may also call us at <a href="tel:%28309%29%20827-6026" target="_blank" value="+13098276026">(309) 827-6026 for assistance.

Reply | Threaded
Open this post in threaded view
|

Re: Multiple Rows Per Subject: How to Create New Data File Based on Highest Course Taken

elle lists
In reply to this post by David Marso

Thank you, David Marso!  The syntax worked like a charm. It's always amazing to see the elegance and fluidity of your writing and certainly sends me to the manual to deconstruct the components.
 
Many thanks,
 
elle
On Fri, Apr 27, 2012 at 4:28 PM, David Marso <[hidden email]> wrote:
Alternatively:
COMPUTE CrsHigh=NUMBER(SUBSTR(LPAD(RTRIM(Course),8),5,4),F4).
SORT CASES BY SubjID CrsHigh(D).
COMPUTE FILT$=($CASENUM=1 OR SubjID <> LAG(SUBjID)).
FILTER BY FILT$.
EXE.


David Marso wrote
>
> See SUBSTR and AGGREGATE (MAX) in the FM!
>
> elle lists wrote
>>
>> Hi all,
>>
>>
>>
>> This newbie would appreciate your advice on how to proceed with this
>> puzzle. I received a data file which has multiple rows (ranging from 2 to
>> 5) per subject concerning technology class-taking.  The variables include
>> the course level, year taken, course number, course title, course grade
>> (see below):
>>
>>
>>     *ORIGINAL DATA LAYOUT:*    Subject ID CourseLevel CourseYear Course
>> CourseTitle CourseGrade  39#_3#EGQB0OGQ 11 2005 EXC1210 Excel I:
>> Intermediate Concepts A  39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA
>> Programming B  39#_3#EGQB0OGQ 9 2003 WD1010 Word: Intro to Macros A
>> 39#_3#EGQB0OGQ 12 2003 ACC1320 Access: Programming B  #_9P8SGGHOOAA09 12
>> 2003 ACC1320 Access: Programming D  #_H22HEJZ1P7ACM 11 2003 EXC1210 Excel
>> I: Intermediate Concepts B  #_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA
>> Programming C  #_H22HEJZ1P7ACM 10 2004 ACC1320 Access: Programming A
>>
>>
>> I’d like to restructure the data to create a new data set reflecting the
>> highest course (determined by the course’s last 4-digit number) taken by
>> each subject (see below).
>>     *DESIRED RESULT: *    Subject ID CourseLevel CourseYear Course
>> CourseTitle CourseGrade  39#_3#EGQB0OGQ 11 2005 EXC1220 Excel II: VBA
>> Programming B  #_9P8SGGHOOAA09 12 2003 ACC1320 Access: Programming D
>> #_H22HEJZ1P7ACM 11 2003 EXC1220 Excel II: VBA Programming C
>>
>>
>> Many thanks for your help,
>>
>>
>>
>> elle
>>
>


--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Multiple-Rows-Per-Subject-How-to-Create-New-Data-File-Based-on-Highest-Course-Taken-tp5668877p5671420.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD