Missing school years

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Missing school years

Clive Downs
Hi,

I want to work out for a group of students who were in hospital, which
school years they missed while in hospital.

Data set (1) lists the students, date of birth, and  periods of
hospitalisation, e.g.

* DATA SET 1.

DATA LIST FREE/ id(A3) period(A2) start(DATE11) end(DATE11) dob(DATE11).
BEGIN DATA
001 P2   17-MAR-1997 17-MAY-1996  12-JUN-1984
001  P1    01-OCT-1985 23-OCT-1985   12-JUN-1984
002  P1   06-JAN-1995  24-APR-1995   13-MAY-1985
002  P2    01-OCT-1996 31-DEC-1997   13-MAY-1985
END DATA.

Data set (2) lists the mapping between birth years and the school years,
e.g.

* DATA SET 2.

DATA LIST FREE/ birthdayyear(A7)  Y1995_96 Y1996_97 Y1997_98.
BEGIN DATA
1983/84  7  8  9
1984/85  6  7  8
END DATA.

where e.g birthdayyear 1983/84 means if  they were born between 1/Sep/1983
and 31/Aug/1984 they should have been in school year 7 between 1/sep/1995
and 31/Aug/1996. (in reality the table covers 13 school years, and about 25
birth-years).

I would like to end up with a new data file that shows for each student
each school year, and the number of days in hospital they spent for that
year, eg (not to scale).

ID     year6   year7     year8    year9
001    0         61          0       0
002    0         334       121       0

I have worked out some syntax to calculate this, but it is very clunky - it
involves setting up vectors, macros etc. I am sure there must be a simpler
solution.

If any one can suggest a tidy solution I would be very grateful!

Many thanks,

Clive.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Missing school years

Maguin, Eugene
Clive,

I don't quite understand the significance of data set 2. It would seem that
data set 1 is all that is needed IF I can also assume 1) that a person born
between 1/Sep/1983 and 31/Aug/1984 would be in first grade in the school
year beginning 1 Sept 1990, i.e., to enroll in first grade a child has be at
least six years old on 1 Sept; 2) all kids entered first grade at their
expected time, i.e., no kids entered late; 3) no kids jumped ahead of their
expected grade for their age; and 4) no kids were held back. But even if I
can not assume those points, I don't see how data set 2, which seems to be
year specific, could help me.

I haven't really tried to figure out your question but I think that a cases
to vars operation would be useful to do. You are going to have to use
vectors, I'm pretty sure. Macros, almost certainly not.

Gene Maguin



>>I want to work out for a group of students who were in hospital, which
school years they missed while in hospital.

Data set (1) lists the students, date of birth, and  periods of
hospitalisation, e.g.

* DATA SET 1.

DATA LIST FREE/ id(A3) period(A2) start(DATE11) end(DATE11) dob(DATE11).
BEGIN DATA
001 P2   17-MAR-1997 17-MAY-1996  12-JUN-1984
001  P1    01-OCT-1985 23-OCT-1985   12-JUN-1984
002  P1   06-JAN-1995  24-APR-1995   13-MAY-1985
002  P2    01-OCT-1996 31-DEC-1997   13-MAY-1985
END DATA.

Data set (2) lists the mapping between birth years and the school years,
e.g.

* DATA SET 2.

DATA LIST FREE/ birthdayyear(A7)  Y1995_96 Y1996_97 Y1997_98.
BEGIN DATA
1983/84  7  8  9
1984/85  6  7  8
END DATA.

where e.g birthdayyear 1983/84 means if  they were born between 1/Sep/1983
and 31/Aug/1984 they should have been in school year 7 between 1/sep/1995
and 31/Aug/1996. (in reality the table covers 13 school years, and about 25
birth-years).

I would like to end up with a new data file that shows for each student
each school year, and the number of days in hospital they spent for that
year, eg (not to scale).

ID     year6   year7     year8    year9
001    0         61          0       0
002    0         334       121       0

I have worked out some syntax to calculate this, but it is very clunky - it
involves setting up vectors, macros etc. I am sure there must be a simpler
solution.

Many thanks,

Clive.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Missing school years

Clive Downs
In reply to this post by Clive Downs
Hi Gene,

Thank you for your reply. I think you are correct, there is no need for
dataset 2 per se, if you assume all students are in the relevant school
year only dependent on age (this is what we have assumed). Thank you for
pointing that out, it was actually a distraction.

I'm not sure about CASETOVARS, if it could be done with that, it might
allow quite a good solution.

Regards,

Clive.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Missing school years

Maguin, Eugene
Clive,

I apologize for not thinking of this before but in order to count hospital
days per school yeat you need to have a school start date. And, also, you
count hospital days during summer vacation?

Gene Maguin

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Missing school years

Clive Downs
In reply to this post by Clive Downs
Gene,

Thanks for the query (below). The school start date I am using is 1 Sept.
of each year. The school year is taken as ending 31 August.
To simplify the problem somewhat, I am, at this stage, ignoring vacation
days, and assuming that the maximum number of school days in any year is =
to the number of days in the calendar year.

Thanks,

Clive.

>Clive,
>
>I apologize for not thinking of this before but in order to count hospital
>days per school yeat you need to have a school start date. And, also, you
>count hospital days during summer vacation?
>
>Gene Maguin
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Missing school years

Richard Ristow
In reply to this post by Clive Downs
At 07:35 AM 10/28/2008, Clive Downs wrote:

>I want to work out for a group of students who were in hospital,
>which school years they missed while in hospital. Data set (1) lists
>the students, date of birth, and  periods of hospitalization, e.g.

Below, the ending date of 001 P2 corrected from 1996 to 1997:
|-----------------------------|---------------------------|
|Output Created               |30-OCT-2008 16:59:48       |
|-----------------------------|---------------------------|
[TestData]

id  period       start         end         dob

001 P2     17-MAR-1997 17-MAY-1997 12-JUN-1984
001 P1     01-OCT-1985 23-OCT-1985 12-JUN-1984
002 P1     06-JAN-1995 24-APR-1995 13-MAY-1985
002 P2     01-OCT-1996 31-DEC-1997 13-MAY-1985

Number of cases read:  4    Number of cases listed:  4


>Data set (2) lists the mapping between birth years and the school
>years, where e.g birthdayyear 1983/84 means if  they were born
>between 1/Sep/1983 and 31/Aug/1984 they should have been in school
>year 7 between 1/sep/1995 and 31/Aug/1996.

That dataset isn't needed. I'll define a 'term' as the period from 1
September of one year, through 31 August of the next year; and assign
to a 'term' the earlier of the two years it crosses. That is, your
'1983/84' year, I'll call term 1983.

Define the student's "birth term" as the term (in this sense) in
which they were born. Then, file 2 can be replaced by a calculation:
School year = term - birth term - 5

>I would like to end up with a file that shows for each student each
>school year, and the number of days in hospital they spent for that year, eg
>
>ID     year6   year7     year8    year9
>001    0         61          0       0
>002    0         334       121       0
>
>If any one can suggest a tidy solution I would be very grateful!

See if this counts as tidy. It gives *almost* the form above; it
omits variables for years in which no student lost days. One could
get around this with a dummy 'student' with records for all desired years.

It requires looping, as your solution with vectors must have; but it
loops writing output with XSAVE, rather than looping over a vector.
Now that I've worked it out, I could probably re-cast it into an
entirely 'wide' form, replacing the XSAVE by something that writes
values into the days-lost-by-year variables. You'd still have to
AGGREGATE to combine all hospitalizations.

*  .....  Write a record for every term in which the student lost    .
*         days to hospitalization:                                   .

*  Definition: A 'term' is 01 Sept of one year, through 31 August of .
*              the next. It is numbered by the earlier of the two    .
*              years: term 1983 begins 01 Sept 1983.                 .

NUMERIC    BrthTerm  Term     (F4)
            TermBegn  TermEnd  (DATE11)
            SchlYear           (F3)
            HospTerm           (F4)
            LostBegn  LostEnd  (DATE11)
            DaysLost           (F4).

STRING     TermName  (A7).

VAR LABEL  BrthTerm "Earlier year of 'term' in which student born"
            Term     "Earlier year of the 'term' this record applies to"
            TermBegn "Beginning date of the current term"
            TermEnd  "Ending    date of the current term"
            SchlYear "Student's school year (1-23) in the current term"
            HospTerm "'Term' in which the hospitalization BEGAN"
            TermName "Current term, in yyyy/yy form"
            LostBegn "Earliest date this term lost to hospitalization"
            LostEnd  "Latest   date this term lost to hospitalization"
            DaysLost "Days in this term lost to hospitalization".

COMPUTE    BrthTerm = XDATE.YEAR(dob)
                     - (dob   LE DATE.MDY(08,31,XDATE.YEAR(dob))).

COMPUTE    HospTerm = XDATE.YEAR(start)
                     - (start LE DATE.MDY(08,31,XDATE.YEAR(start))).

.  /**/    PRINT /                                              /*-*/
    /**/    ' '   /                          /* skip a line      /*-*/
    /**/    'Hospitalization ' id period /                       /*-*/
    /**/    ' ' 11  start    ' to ' end                          /*-*/
    /**/    '   term ' HospTerm.


LOOP       Term     = HospTerm TO 2010.

.  COMPUTE TermName = CONCAT(STRING(Term,F4),      '/',
                              STRING(MOD(Term+1,100),F2)).
.  COMPUTE SchlYear = Term - BrthTerm - 5.

.  COMPUTE TermBegn = DATE.MDY(09,01,Term).
.  COMPUTE TermEnd  = DATE.MDY(08,31,Term + 1).

.  /**/    PRINT /                                              /*-*/
    /**/    'Term ' Term                                         /*-*/
    /**/    ' ' 11  TermBegn ' to ' TermEnd                      /*-*/.


*  Leave the loop, if hospitalization ends before this term begins.. .
.  DO IF   end LT TermBegn.
.     BREAK.
.  END IF.

.  COMPUTE LostBegn = MAX(TermBegn,start).
.  COMPUTE LostEnd  = MIN(TermEnd, end  ).
*  Days lost is the interval, including both endpoints:         .... .
.  COMPUTE DaysLost = CTIME.DAYS(LostEnd-LostBegn) + 1.

.  /**/    PRINT /                                              /*-*/
    /**/    '     Lost'                                          /*-*/
    /**/    ' ' 11  LostBegn ' to ' LostEnd                      /*-*/
    /**/    '   ' DaysLost ' days'.

.  XSAVE OUTFILE=LostDays
      /KEEP = id Period
              TermName SchlYear Term DaysLost dob BrthTerm.
END LOOP.

EXECUTE /* to run a transformation program just to write XSAVE     */.
[trace output from PRINT commands deleted]

*  .....  Load the file of lost days by student, period, term:       .
GET         FILE=LostDays.
CACHE.
DATASET NAME     LostDays WINDOW=FRONT.
.  /**/  LIST /*-*/.

List
|-----------------------------|---------------------------|
|Output Created               |30-OCT-2008 16:59:51       |
|-----------------------------|---------------------------|
[LostDays] C:\Documents and Settings\Richard\My Documents
              \Temporary\SPSS
              \2008-10-28 Downs - Missing school years - LOSTDAYS.SAV

id  period TermName SchlYear Term DaysLost         dob BrthTerm

001 P2     1996/97       8   1996     62   12-JUN-1984   1983
001 P1     1985/86      -3   1985     23   12-JUN-1984   1983
002 P1     1994/95       5   1994    109   13-MAY-1985   1984
002 P2     1996/97       7   1996    335   13-MAY-1985   1984
002 P2     1997/98       8   1997    122   13-MAY-1985   1984

Number of cases read:  5    Number of cases listed:  5


*  .....  Drop days lost before the first school year, then          .
*         calculate total days lost per school year:                 .

SELECT IF SchlYear GE 1.

DATASET DECLARE  Summary.
AGGREGATE
        OUTFILE = Summary
       /BREAK   = id
                  SchlYear
       /dob     = FIRST(dob)
       /DaysLost 'Days lost to hospitalization' = SUM(DaysLost)
       /N_Hosp   'No. of hospitalizations'      = NU.

DATASET ACTIVATE Summary WINDOW=FRONT.
FORMATS DaysLost (F4).
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |30-OCT-2008 16:59:53       |
|-----------------------------|---------------------------|
[Summary]

id  SchlYear         dob DaysLost  N_Hosp

001      8   12-JUN-1984     62         1
002      5   13-MAY-1985    109         1
002      7   13-MAY-1985    335         1
002      8   13-MAY-1985    122         1

Number of cases read:  4    Number of cases listed:  4


*  .....  Finally, convert to the desired 'wide' form:         ..... .

DATASET COPY     WideForm.
DATASET ACTIVATE WideForm WINDOW=FRONT.
CASESTOVARS
      /ID      = id
      /INDEX   = SchlYear
      /RENAME    DaysLost=Year
      /DROP    = N_Hosp.

Cases to Variables
|-----------------------------|---------------------------|
|Output Created               |30-OCT-2008 16:59:54       |
|-----------------------------|---------------------------|
[WideForm]

Generated Variables
|---------------|---------------|----------------------|
|Original       |SchlYear       |Result                |
|Variable       |Student's      |------|---------------|
|               |school year    |Name  |Label          |
|               |(1-23) in the  |      |               |
|               |current term   |      |               |
|---------------|---------------|------|---------------|
|DaysLost  Days |5              |Year.5|Year.5: Days   |
|lost to        |               |      |lost to        |
|hospitalization|               |      |hospitalization|
|               |---------------|------|---------------|
|               |7              |Year.7|Year.7: Days   |
|               |               |      |lost to        |
|               |               |      |hospitalization|
|               |---------------|------|---------------|
|               |8              |Year.8|Year.8: Days   |
|               |               |      |lost to        |
|               |               |      |hospitalization|
|---------------|---------------|------|---------------|

Processing Statistics
|---------------|---|
|Cases In       |4  |
|Cases Out      |2  |
|---------------|---|
|Cases In/Out   |2.0|
|---------------|---|
|Variables In   |5  |
|Variables Out  |5  |
|---------------|---|
|Index Values   |3  |
|---------------|---|


LIST.

List
|-----------------------------|---------------------------|
|Output Created               |30-OCT-2008 16:59:54       |
|-----------------------------|---------------------------|
[WideForm]

id          dob Year.5 Year.7 Year.8

001 12-JUN-1984     .      .     62
002 13-MAY-1985   109    335    122

Number of cases read:  2    Number of cases listed:  2
==================================
APPENDIX: All code, with test data
==================================
*  C:\Documents and Settings\Richard\My Documents                    .
*    \Technical\spssx-l\Z-2008d                                      .
*    \2008-10-28 Downs - Missing school years.SPS                    .

*  In response to posting                                            .
*  Date:    Tue, 28 Oct 2008 07:35:40 -0400                          .
*  From:    Clive Downs <[hidden email]>                 .
*  Subject: Missing school years                                     .
*  To:      [hidden email]                                 .

*  "I want to work out for a group of students who were in           .
*  hospital, which school years they missed while in hospital. I     .
*  would like to end up with a new data file that shows for each     .
*  student each school year, and the number of days in hospital      .
*  they spent for that year."                                        .


*  ................................................................. .
*  .................   File handle, for XSAVE  ..................... .


FILE HANDLE LostDays
  /NAME='C:\Documents and Settings\Richard\My Documents'              +
          '\Temporary\SPSS\'                                          +
        '2008-10-28 Downs - Missing school years'                     +
        ' - '                                                         +
        'LOSTDAYS.SAV'.

*  .................   Test data,              ..................... .
*  .................   from original posting   ..................... .


*  Ending date of 001 P2 corrected from 1996 to 1997  .... .
DATA LIST FREE/
    id(A3) period(A2) start(DATE11) end(DATE11) dob(DATE11).
BEGIN DATA
    001    P2        17-MAR-1997    17-MAY-1997 12-JUN-1984
    001    P1        01-OCT-1985    23-OCT-1985 12-JUN-1984
    002    P1        06-JAN-1995    24-APR-1995 13-MAY-1985
    002    P2        01-OCT-1996    31-DEC-1997 13-MAY-1985
END DATA.
DATASET NAME     TestData WINDOW=FRONT.
LIST.

*  .................   Post after this point   ..................... .
*  ................................................................. .

*  .....  Write a record for every term in which the student lost    .
*         days to hospitalization:                                   .

*  Definition: A 'term' is 01 Sept of one year, through 31 August of .
*              the next. It is numbered by the earlier of the two    .
*              years: term 1983 begins 01 Sept 1983.                 .

NUMERIC    BrthTerm  Term     (F4)
            TermBegn  TermEnd  (DATE11)
            SchlYear           (F3)
            HospTerm           (F4)
            LostBegn  LostEnd  (DATE11)
            DaysLost           (F4).

STRING     TermName  (A7).

VAR LABEL  BrthTerm "Earlier year of 'term' in which student born"
            Term     "Earlier year of the 'term' this record applies to"
            TermBegn "Beginning date of the current term"
            TermEnd  "Ending    date of the current term"
            SchlYear "Student's school year (1-23) in the current term"
            HospTerm "'Term' in which the hospitalization BEGAN"
            TermName "Current term, in yyyy/yy form"
            LostBegn "Earliest date this term lost to hospitalization"
            LostEnd  "Latest   date this term lost to hospitalization"
            DaysLost "Days in this term lost to hospitalization".

COMPUTE    BrthTerm = XDATE.YEAR(dob)
                     - (dob   LE DATE.MDY(08,31,XDATE.YEAR(dob))).

COMPUTE    HospTerm = XDATE.YEAR(start)
                     - (start LE DATE.MDY(08,31,XDATE.YEAR(start))).

.  /**/    PRINT /                                              /*-*/
    /**/    ' '   /                          /* skip a line      /*-*/
    /**/    'Hospitalization ' id period /                       /*-*/
    /**/    ' ' 11  start    ' to ' end                          /*-*/
    /**/    '   term ' HospTerm.


LOOP       Term     = HospTerm TO 2010.

.  COMPUTE TermName = CONCAT(STRING(Term,F4),      '/',
                              STRING(MOD(Term+1,100),F2)).
.  COMPUTE SchlYear = Term - BrthTerm - 5.

.  COMPUTE TermBegn = DATE.MDY(09,01,Term).
.  COMPUTE TermEnd  = DATE.MDY(08,31,Term + 1).

.  /**/    PRINT /                                              /*-*/
    /**/    'Term ' Term                                         /*-*/
    /**/    ' ' 11  TermBegn ' to ' TermEnd                      /*-*/.


*  Leave the loop, if hospitalization ends before this term begins.. .
.  DO IF   end LT TermBegn.
.     BREAK.
.  END IF.

.  COMPUTE LostBegn = MAX(TermBegn,start).
.  COMPUTE LostEnd  = MIN(TermEnd, end  ).
*  Days lost is the interval, including both endpoints:         .... .
.  COMPUTE DaysLost = CTIME.DAYS(LostEnd-LostBegn) + 1.

.  /**/    PRINT /                                              /*-*/
    /**/    '     Lost'                                          /*-*/
    /**/    ' ' 11  LostBegn ' to ' LostEnd                      /*-*/
    /**/    '   ' DaysLost ' days'.

.  XSAVE OUTFILE=LostDays
      /KEEP = id Period
              TermName SchlYear Term DaysLost dob BrthTerm.
END LOOP.

EXECUTE /* to run a transformation program just to write XSAVE     */.


*  .....  Load the file of lost days by student, period, term:       .
GET         FILE=LostDays.
CACHE.
DATASET NAME     LostDays WINDOW=FRONT.
.  /**/  LIST /*-*/.


*  .....  Drop days lost before the first school year, then          .
*         calculate total days lost per school year:                 .

SELECT IF SchlYear GE 1.

DATASET DECLARE  Summary.
AGGREGATE
        OUTFILE = Summary
       /BREAK   = id
                  SchlYear
       /dob     = FIRST(dob)
       /DaysLost 'Days lost to hospitalization' = SUM(DaysLost)
       /N_Hosp   'No. of hospitalizations'      = NU.

DATASET ACTIVATE Summary WINDOW=FRONT.
FORMATS DaysLost (F4).
LIST.


*  .....  Finally, convert to the desired 'wide' form:         ..... .

DATASET COPY     WideForm.
DATASET ACTIVATE WideForm WINDOW=FRONT.

CASESTOVARS
      /ID      = id
      /INDEX   = SchlYear
      /RENAME    DaysLost=Year
      /DROP    = N_Hosp.

LIST.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Missing school years

Clive Downs
In reply to this post by Clive Downs
Hi Richard,

Thank you very much for your solution to this problem.
The XSAVE function was something I wasn't aware of, and using this makes
for a much better way of doing it.
Yes, I would say your solution is pretty tidy!

Regards,

Clive.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD