Data Management and Multiple Record Same ID Date Computation Problem


Björn Türoque
Dear List,

I have a student dataset that has one record for each individual
course enrollment. Each record contains the student's unique ID and the date
the course started (stored in date format). I would like to compute whether
or not someone has enrolled in a course prior to the one they are currently
taking. For example, if a student has taken a course at a previous date, I
would like a new variable that indicates that this is not their
first enrollment.

I have figured out how to label the first course an individual enrolls in,
but I run into a problem with students who enroll in multiple courses
simultaneously. When several courses share a student's earliest start date,
I would like all of them flagged as the first enrollment, rather than having
the later records treated as if a previous course existed.

I have included data below. The variable Ihave reflects what I have been
able to compute, and the variable Iwant is what I would ideally like to get.

Any help would be greatly appreciated.

Data List Free / StuID (F3) CourseDate (ADATE8) CourseNum (F3) Ihave Iwant.
Begin Data
  111 09/01/06   357 0 0
  111 09/01/06   426 1 0
  111 01/01/07   427 1 1
  111 01/01/07   595 1 1
  112 01/01/07   101 0 0
  112 03/04/07   204 1 1
  113 03/04/07   101 0 0
  113 03/04/07   101 1 0
  115 09/01/06   101 0 0
  115 03/04/07   357 1 1
End Data.

--
Björn Türoque

Some people are just born to rock!
Re: Data Management and Multiple Record Same ID Date Computation Problem

Hector Maletta
         Bjorn,
         You have three levels of analysis here: courses, dates of
enrolment, and students. At the moment you have complete disaggregation,
with one record for each combination of the three. Some of your questions
may be answered while remaining at that level of aggregation, whereas other
questions would require aggregating the records into groups (e.g. by
student, or by date of enrolment).
         First, there is the LAG function in COMPUTE. You can identify
records with a date (or a course code) equal to or different from that of
the preceding record. Using only that function, and without aggregating
anything, you can already build the flag you describe. For instance:

         SORT CASES BY STUID COURSEDATE.
         COMPUTE IWANT=0.
         IF ($CASENUM>1 AND STUID=LAG(STUID)
             AND COURSEDATE > LAG(COURSEDATE)) IWANT=1.
         EXECUTE.
         IF ($CASENUM>1 AND STUID=LAG(STUID)
             AND COURSEDATE = LAG(COURSEDATE) AND LAG(IWANT)=1) IWANT=1.
         EXECUTE.

         I have not tested the above, but I think it should produce your
IWANT variable.
         First we sort the data by student and, within student, by course
date, so that each student's enrolments appear in chronological order. Then
we assign the value 0 to the IWANT variable in all cases, as an initial step.
         Then the first IF command assigns the value 1 to any enrolment whose
date is later than that of the same student's preceding record (e.g. the
third record in your example). The EXECUTE command carries out these
commands.
         At that point your fourth case still has a zero, whereas according
to your rules it should have a one. The second IF gives a 1 to cases with
the same date as the preceding case if the preceding case has IWANT=1.

         Another approach, which I leave for later, is to aggregate
enrolments by date, and create variables for each group of enrolments (i.e.
for each date). This is done with AGGREGATE, using the MODE=ADDVARIABLES
option so that the new variables are added to the existing disaggregated
file.

         Hector
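
As a rough sketch of the aggregation idea mentioned above (not Hector's own code): AGGREGATE with MODE=ADDVARIABLES can attach group-level values to every enrolment record. Here the break is student plus start date, and NSameDate, a name made up for this illustration, counts how many enrolments a student has on that date. The sketch assumes CourseDate has been read as a date variable and that the SPSS release in use supports MODE=ADDVARIABLES.

```spss
* Sketch only, NSameDate is an illustrative name and not from the thread.
* Adds to every record the number of enrolments sharing StuID and CourseDate.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES
  /BREAK     = StuID CourseDate
  /NSameDate = N.
LIST.
```

A value of NSameDate above 1 marks simultaneous enrolments, which is the case that trips up a flag based only on the first record per student.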

Re: Data Management and Multiple Record Same ID Date Computation Problem

Gary Rosin
In reply to this post by Björn Türoque
I count myself as a rank neophyte in SPSS. So, FWIW,
I have a question.  Why not mark the first enrollment
period (identified by the date the period began)?  With
that approach, it would be easy to id the classes taken
during that enrollment period.

Gary


Re: Data Management and Multiple Record Same ID Date Computation Problem

hillel vardi
In reply to this post by Björn Türoque
Shalom

The AGGREGATE command will give you the first date.
Here is the code.

Data List list / StuID CourseDate CourseNum( f3 edate8 f3 ) .
Begin Data
  111 09/01/06   357
  111 09/01/06   426
  111 01/01/07   427
  111 01/01/07   595
  112 01/01/07   101
  112 03/04/07   204
  113 03/04/07   101
  113 03/04/07   101
  115 09/01/06   101
  115 03/04/07   357
end data .
AGGREGATE
  /OUTFILE=*
  MODE=ADDVARIABLES
  /BREAK=StuID
  /CourseDate_min = MIN(CourseDate).
if       CourseDate  gt  CourseDate_min  Iwant=1.
if       CourseDate eq   CourseDate_min  Iwant=0.
execute .


Hillel Vardi

Re: Data Management and Multiple Record Same ID Date Computation Problem

Ornelas, Fermin
I was just playing with this data set to familiarize myself with SPSS.
The MF version is 4.1 (please, no laughs) and I cannot get it to read dates formatted as mm/dd/yy. Any ideas?

DATA LIST LIST/ STUID (F3) CDATE (ADATE8) CNUM(F3).
BEGIN DATA
111 09/01/06     357
111 09/01/06     426
111 01/01/07     427
111 01/01/07     595
112 01/01/07     101
112 03/04/07     204
113 03/04/07     101
113 03/04/07     101
115 09/01/06     101
115 03/04/07     357
END DATA.

>Warning # 1102
>An invalid numeric field has been found.  The result has been set to the
>system-missing value.

COMMAND LINE:    76  CURRENT CASE:       2  CURRENT SPLITFILE GROUP:   1
FIELD CONTENTS: '09/01/06'
RECORD NUMBER:       2  STARTING COLUMN:     5  RECORD LENGTH:    72



Fermin Ornelas, Ph.D.
Management Analyst III, AZ DES
1789 W. Jefferson Street
Phoenix, AZ 85032
Tel: (602) 542-5639
E-mail: [hidden email]
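
One possible workaround for the question above, offered as an untested sketch: if the installed release will not accept ADATE8 on DATA LIST, the date can be read as an 8-character string and the date value built from its pieces. CDATESTR, #MM, #DD and #YY are names made up here. The code assumes every date has the form mm/dd/yy, that all years fall in 2000-2099, and that NUMBER, SUBSTR, DATE.MDY and scratch variables are available, which has not been checked for release 4.1.

```spss
* Read the date as a string, then convert it.
DATA LIST LIST / STUID (F3) CDATESTR (A8) CNUM (F3).
BEGIN DATA
111 09/01/06     357
111 09/01/06     426
112 03/04/07     204
END DATA.
* Scratch variables hold month, day and two-digit year.
COMPUTE #MM = NUMBER(SUBSTR(CDATESTR,1,2),F2).
COMPUTE #DD = NUMBER(SUBSTR(CDATESTR,4,2),F2).
COMPUTE #YY = NUMBER(SUBSTR(CDATESTR,7,2),F2).
* Assumption, every year is 20yy.
COMPUTE CDATE = DATE.MDY(#MM, #DD, 2000 + #YY).
FORMATS CDATE (ADATE10).
LIST.
```

If DATE.MDY is not available, the same month, day and year pieces could be passed to another date-building function the release does provide.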


Re: Data Management and Multiple Record Same ID

Richard Ristow
In reply to this post by Björn Türoque
At 05:19 PM 8/9/2007, Björn Türoque wrote:

>I have a student dataset that has one record for
>each individual course enrollment. I would like
>to compute whether or not someone has enrolled
>in a course prior to the one they are currently
>taking. If a student has taken a course at a
>previous date I would like to have a new
>variable set up that indicates if this is not their first enrollment.
>
>I run into a problem with students who enroll in
>multiple courses simultaniously. I would like
>the data to reflect both of the first courses
>with the same start date are the first course
>the student enrolls in, instead of as a previous course.
>
>I have included data below, the variables Ihave
>reflects what I have been able to compute, and
>the variable Iwant is what I would ideally like to get.

|-----------------------------|---------------------------|
|Output Created               |13-AUG-2007 11:22:26       |
|-----------------------------|---------------------------|
StuID CourseDate CourseNum Ihave Iwant

   111 09/01/2006     357      0     0
   111 09/01/2006     426      1     0
   111 01/01/2007     427      1     1
   111 01/01/2007     595      1     1
   112 01/01/2007     101      0     0
   112 03/04/2007     204      1     1
   113 03/04/2007     101      0     0
   113 03/04/2007     101      1     0
   115 09/01/2006     101      0     0
   115 03/04/2007     357      1     1

Number of cases read:  10    Number of cases listed:  10


*  Try this. SPSS 15 draft output (WRR:not saved separately).

NUMERIC NotFirst (F2).
AGGREGATE OUTFILE=* MODE=ADDVARIABLES
    /BREAK    = StuID
    /FrstDate = MIN(CourseDate).
COMPUTE NotFirst = (CourseDate GT FrstDate).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |13-AUG-2007 11:27:26       |
|-----------------------------|---------------------------|
StuID CourseDate CourseNum Ihave Iwant NotFirst   FrstDate

   111 09/01/2006     357      0     0      0    09/01/2006
   111 09/01/2006     426      1     0      0    09/01/2006
   111 01/01/2007     427      1     1      1    09/01/2006
   111 01/01/2007     595      1     1      1    09/01/2006
   112 01/01/2007     101      0     0      0    01/01/2007
   112 03/04/2007     204      1     1      1    01/01/2007
   113 03/04/2007     101      0     0      0    03/04/2007
   113 03/04/2007     101      1     0      0    03/04/2007
   115 09/01/2006     101      0     0      0    09/01/2006
   115 03/04/2007     357      1     1      1    09/01/2006

Number of cases read:  10    Number of cases listed:  10
===========================================
APPENDIX: Test data (from original posting)
===========================================
Data List LIST
/ StuID  * CourseDate(ADATE) CourseNum Ihave Iwant.
Begin Data
   111 09/01/06   357 0 0
   111 09/01/06   426 1 0
   111 01/01/07   427 1 1
   111 01/01/07   595 1 1
   112 01/01/07   101 0 0
   112 03/04/07   204 1 1
   113 03/04/07   101 0 0
   113 03/04/07   101 1 0
   115 09/01/06   101 0 0
   115 03/04/07   357 1 1
END DATA.
FORMATS StuID CourseNum (F4)
        /Ihave Iwant     (F2).
LIST.
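
A quick way to confirm that a computed flag reproduces the target, assuming the test data above has been loaded and NotFirst has been created by the AGGREGATE and COMPUTE steps: cross-tabulate it against Iwant, or count the disagreements. Mismatch is a name invented for this sketch.

```spss
* Every case should fall on the diagonal of this table.
CROSSTABS /TABLES = Iwant BY NotFirst.
* Mismatch is 1 where the computed flag differs from the target.
COMPUTE Mismatch = (NotFirst NE Iwant).
FREQUENCIES VARIABLES = Mismatch.
```

Because NotFirst compares each record with the student's earliest date rather than with the preceding record, simultaneous first enrolments need no special handling. In the listing above, NotFirst equals Iwant in all ten cases, so Mismatch should be 0 throughout.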