Aha! the VALUELABEL Function!: was Automate Add Value Labels using Python (or something)?

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Aha! the VALUELABEL Function!: was Automate Add Value Labels using Python (or something)?

King Douglas
Richard, my honored and respected friend,

Thanks so much for you brilliant and generous solution.  If I had been aware of the VALUELABEL function, my life would have been easier, so thanks also for that.

You may have seen that I posted my solution to the problem yesterday afternoon, before you offered your alternative, but similar, approach.

Now I regret not having sent sample data as I was asking for conceptual help rather than a step-by-step how-to...although I didn't make that clear.

Cheers,

King Douglas
American Airlines Customer Research


King Douglas <[hidden email]> wrote:


---------------------------------
From: "King Douglas" <[hidden email]>
Sent: Thursday, January 24, 2008 7:50 AM
To: "MEATKD" <[hidden email]>
Subject: FW: Automate Add Value Labels using Python (or something)?



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]]On Behalf Of
Richard Ristow
Sent: Wednesday, January 23, 2008 6:25 PM
To: [hidden email]
Subject: Re: Automate Add Value Labels using Python (or something)?


At 12:31 PM 1/23/2008, King Douglas wrote:

>I'm working on a way to automate an Add Values statement for an
>automatically generated numeric variable for month (1 to 24) that
>represents a selected time period across a number of files.
>
>I think I could use Python to extract the labels, writing the
>results to a syntax file that I can insert into my program.

You may find native SPSS code is simpler for this. It's certainly
handled "write and insert" logic well, for a long time.

>For instance, say I have quarterly and monthly tracking survey
>results going back a number of years.  The survey dates are
>represented in three numeric variables, Year, Month, Day.
>
>I can create the new variable values (1-24) with ordinary syntax,
>but now want to automatically add value labels ("MoYr") based on the
>24-month sequence.

Since it looks like you're labeling with calendar month-year, would
assigning date format "MOYR6" serve your turn? No, I think I see why
not: your 'new variable' is the sequential month within the survey,
running 1-24. (Right; that's what you responded to Richard Oliver.)

>The "Month" variable already contains value labels (Jan,
>Feb...Dec).  I think I could use Python to extract the labels and
>simply append the year, writing the results to a syntax file that I
>can insert into my program.
>
>Now that I have theoretically solved this problem, I'll get to work.  :>)
>If anyone has any handy code or other suggestions while I'm working
>at it, that would be much appreciated.

Below is a native-SPSS solution. OK, here's to me, for thinking
code-generating code could be 'simple'. Python has the advantage of
direct access to the data dictionary, including the value labels.

On the other hand, the Python solution may be somewhat complicated,
because the mapping of survey month to calendar month is an attribute
of the data, not the data dictionary.

Anyhow, see how you like this. Test data, and code without
interspersed output lines, follow the listing.


|-----------------------------|---------------------------|
|Output Created               |23-JAN-2008 19:08:09       |
|-----------------------------|---------------------------|
[TestData]

CaseID        Date Year Month  Day SvyMonth

   001  12-MAY-2003 2003     5   12      1
   002  21-JUN-2003 2003     6   21      2
   003  30-JUN-2003 2003     6   30      2
   004  16-JUL-2003 2003     7   16      3
   005  14-SEP-2003 2003     9   14      5
   006  29-SEP-2003 2003     9   29      5
   007  03-OCT-2003 2003    10    3      6
   008  08-OCT-2003 2003    10    8      6
   009  03-NOV-2003 2003    11    3      7
   010  10-NOV-2003 2003    11   10      7
   011  13-DEC-2003 2003    12   13      8
   012  17-DEC-2003 2003    12   17      8
   013  23-DEC-2003 2003    12   23      8
   014  23-JAN-2004 2004     1   23      9
   015  01-FEB-2004 2004     2    1     10
   016  01-FEB-2004 2004     2    1     10
   017  05-FEB-2004 2004     2    5     10
   018  11-FEB-2004 2004     2   11     10
   019  17-FEB-2004 2004     2   17     10
   020  23-FEB-2004 2004     2   23     10
   021  02-APR-2004 2004     4    2     12
   022  14-APR-2004 2004     4   14     12
   023  14-APR-2004 2004     4   14     12
   024  04-MAY-2004 2004     5    4     13
   025  21-MAY-2004 2004     5   21     13
   026  04-JUN-2004 2004     6    4     14
   027  22-JUN-2004 2004     6   22     14
   028  09-JUL-2004 2004     7    9     15
   029  05-OCT-2004 2004    10    5     18
   030  28-OCT-2004 2004    10   28     18
   031  31-OCT-2004 2004    10   31     18
   032  03-NOV-2004 2004    11    3     19
   033  09-NOV-2004 2004    11    9     19
   034  16-DEC-2004 2004    12   16     20
   035  11-JAN-2005 2005     1   11     21
   036  20-FEB-2005 2005     2   20     22
   037  18-APR-2005 2005     4   18     24
   038  29-APR-2005 2005     4   29     24
   039  07-MAY-2005 2005     5    7     25

Number of cases read:  39    Number of cases listed:  39


*  .................   Post after this point   ..................... .
*  ................................................................. .
DATASET ACTIVATE TestData WINDOW=FRONT  /* Helpful, during testing */.

*  I.     PREPARE A FILE from which to generate code    ............ .

*  I.A    Add a dummy record for every month in the survey, to make  .
*         sure all months are represented.                           .

NEW  FILE.
INPUT PROGRAM.
.   LOOP  SvyMonth = 1 TO 24.
.      END CASE.
.   END LOOP.
END FILE.
END INPUT PROGRAM.

ADD FILES
   /FILE=TestData
   /FILE=*.

*  I.B    Create a file with one record for every survey month.      .
*         (It retains one survey date and one calendar-month value,  .
*         so the value label can be built either using the existing  .
*         label for 'Month', or the SPSS 'MONTHn' format.)           .

DATASET   DECLARE CodeGen.
AGGREGATE OUTFILE=CodeGen
    /BREAK=SvyMonth
    /FrstDate 'To have one date value in the result'        = MIN(Date)
    /CalndrMn 'To have the old "Month" value in the result' = MIN(Month)
    /NResps   'Number of responses - I''m always curious'   = NU.

DATASET ACTIVATE CodeGen  WINDOW=FRONT.
*  (The next line has nothing to do with the purpose of this code)   .
.  COMPUTE NResps = NResps - 1  /* Discount the dummy record       */.

*  Check an error that baffled me for a while.                ...... .
*  For resolution, see I.C., below.                           ...... .
.  /*--  DISPLAY DICTIONARY /VARIABLES=CalndrMn  /*-*/.

.  /*--  TEMPORARY                               /*-*/.
.  /*--  STRING  OldLabel (A5)                   /*-*/.
.  /*--  COMPUTE OldLabel=VALUELABEL(CalndrMn)   /*-*/.
.  /*--  LIST                                    /*-*/.


*  I.C    Fill in "date" and calendar month, for survey months with  .
*         no observed responses.                                     .

NUMERIC NoBreak  (F2).
COMPUTE NoBreak =  1.


AGGREGATE OUTFILE=* MODE=ADDVARIABLES
    /BREAK=NOBREAK
    /Ur_Month 'First survey month with any response' = MIN(SvyMonth)
    /Ur_Date  'The very earliest response date'      = MIN(FrstDate).

NUMERIC #Ur_Day1 (DATE11) /* First day of first response month     */
         #SvyDay1 (DATE11) /* First day of first month of survey    */.

DO IF   MISSING(FrstDate).
.  COMPUTE #Ur_Day1 = DATE.MDY(XDATE.MONTH(Ur_Date),
                                1,
                                XDATE.YEAR (Ur_Date)).
*  Filled-in date is the first day of the survey month      ....... .
.  COMPUTE #SvyDay1 = DATESUM(#Ur_Day1,(1-Ur_Month),"Months").
.  COMPUTE FrstDate = DATESUM(#Ur_Day1,(SvyMonth-1),"Months").
.  COMPUTE CalndrMn = XDATE.MONTH(FrstDate).
END IF.

.  /**/  LIST  /*-*/.

List
|-----------------------------|---------------------------|
|Output Created               |23-JAN-2008 19:08:12       |
|-----------------------------|---------------------------|
[CodeGen]

SvyMonth    FrstDate CalndrMn  NResps NoBreak Ur_Month     Ur_Date

      1   12-MAY-2003      5         1     1        1   12-MAY-2003
      2   21-JUN-2003      6         2     1        1   12-MAY-2003
      3   16-JUL-2003      7         1     1        1   12-MAY-2003
      4   01-AUG-2003      8         0     1        1   12-MAY-2003
      5   14-SEP-2003      9         2     1        1   12-MAY-2003
      6   03-OCT-2003     10         2     1        1   12-MAY-2003
      7   03-NOV-2003     11         2     1        1   12-MAY-2003
      8   13-DEC-2003     12         3     1        1   12-MAY-2003
      9   23-JAN-2004      1         1     1        1   12-MAY-2003
     10   01-FEB-2004      2         6     1        1   12-MAY-2003
     11   01-MAR-2004      3         0     1        1   12-MAY-2003
     12   02-APR-2004      4         3     1        1   12-MAY-2003
     13   04-MAY-2004      5         2     1        1   12-MAY-2003
     14   04-JUN-2004      6         2     1        1   12-MAY-2003
     15   09-JUL-2004      7         1     1        1   12-MAY-2003
     16   01-AUG-2004      8         0     1        1   12-MAY-2003
     17   01-SEP-2004      9         0     1        1   12-MAY-2003
     18   05-OCT-2004     10         3     1        1   12-MAY-2003
     19   03-NOV-2004     11         2     1        1   12-MAY-2003
     20   16-DEC-2004     12         1     1        1   12-MAY-2003
     21   11-JAN-2005      1         1     1        1   12-MAY-2003
     22   20-FEB-2005      2         1     1        1   12-MAY-2003
     23   01-MAR-2005      3         0     1        1   12-MAY-2003
     24   18-APR-2005      4         2     1        1   12-MAY-2003
     25   07-MAY-2005      5         0     1        1   12-MAY-2003

Number of cases read:  25    Number of cases listed:  25


*  II.    GENERATE THE CODE, and write to the external file          .

ADD FILES
   /FILE = *
   /BY     NoBreak
   /FIRST= ItBegins
   /LAST = ItEnds.


STRING #Syntax (A60) /* To build the output line in */.

DO IF   ItBegins     /* The command header line:    */.
.  COMPUTE #Syntax = "ADD VALUE LABELS SvyMonth".
.  WRITE OUTFILE=Syntax
                /#Syntax.
END IF.

*  ....              /* The label   lines:          */.
** (In quest for clearer code, this uses more       **.
** variables and lines than strictly necessary.)    **.

STRING  #Label (A5)
         #LblMn (A3)
         #LblYr (A4).

*  Month name for label, using SPSS "MONTHn" format **.
*xxCOMPUTE #LblMn= STRING(FrstDate,MONTH3).

*  Month name for label, using existing value labels**.
+  COMPUTE #LblMn= VALUELABEL(CalndrMn).

*  Two-digit year, for label                        **.
COMPUTE #LblYr=    STRING(XDATE.YEAR(FrstDate),N4).
COMPUTE #LblYr=    SUBSTR(#LblYr,3).

*  The complete label text:                         **.
COMPUTE #Label=    CONCAT(#LblMn,#LblYr).

COMPUTE #Syntax
          = CONCAT("      "           ,
                   STRING(SvyMonth,F3),
                   "     '"           ,
                   #Label             ,
                   "'"                ).


DO IF   ItEnds       /* Add period to the last line:*/.
.  COMPUTE #Syntax = CONCAT(RTRIM(#Syntax),".").
END IF.

+  WRITE OUTFILE=Syntax / #Syntax.

EXECUTE   /* An instance where EXECUTE is needed                   */.



*  III.   TRACE AND TEST: List the generated code       ............ .

+  /**/   DATA LIST FIXED            /*-*/
    /**/        FILE=Syntax /         /*-*/
    /**/        CodeLine    01-60(A)  /*-*/.

Data List will read 1 records from C:\Documents and Settings\...hon -
SYNTAX.SAV

Variable          Rec   Start     End  Format

CodeLine            1       1      60  A60


+  /**/   LIST                       /*-*/.

List
|-----------------------------|---------------------------|
|Output Created               |23-JAN-2008 19:08:14       |
|-----------------------------|---------------------------|
CodeLine

ADD VALUE LABELS SvyMonth
         1     'May03'
         2     'Jun03'
         3     'Jul03'
         4     'Aug03'
         5     'Sep03'
         6     'Oct03'
         7     'Nov03'
         8     'Dec03'
         9     'Jan04'
        10     'Feb04'
        11     'Mar04'
        12     'Apr04'
        13     'May04'
        14     'Jun04'
        15     'Jul04'
        16     'Aug04'
        17     'Sep04'
        18     'Oct04'
        19     'Nov04'
        20     'Dec04'
        21     'Jan05'
        22     'Feb05'
        23     'Mar05'
        24     'Apr05'
        25     'May05'.


Number of cases read:  26    Number of cases listed:  26


*  IV.    INCLUDE AND EXECUTE the code                  ............ .

DATASET ACTIVATE TestData.
INSERT  File=Syntax.
  227  ADD VALUE LABELS SvyMonth
  228          1     'May03'
  229          2     'Jun03'
  230          3     'Jul03'
  231          4     'Aug03'
  232          5     'Sep03'
  233          6     'Oct03'
  234          7     'Nov03'
  235          8     'Dec03'
  236          9     'Jan04'
  237         10     'Feb04'
  238         11     'Mar04'
  239         12     'Apr04'
  240         13     'May04'
  241         14     'Jun04'
  242         15     'Jul04'
  243         16     'Aug04'
  244         17     'Sep04'
  245         18     'Oct04'
  246         19     'Nov04'
  247         20     'Dec04'
  248         21     'Jan05'
  249         22     'Feb05'
  250         23     'Mar05'
  251         24     'Apr05'
  252         25     'May05'.
  253
  254  * End of INSERT and INCLUDE nesting level 01.

DISPLAY DICTIONARY /VARIABLES=SvyMonth.

File Information
|-----------------------------|---------------------------|
|Output Created               |23-JAN-2008 19:08:14       |
|-----------------------------|---------------------------|
[TestData]

Variable Information
|--------|--------|------|------------|------|---------|------|------|
|Variable|Position|Label |Measurement |Column|Alignment|Print |Write |
|        |        |      |Level       |Width |         |Format|Format|
|--------|--------|------|------------|------|---------|------|------|
|SvyMonth|6       ||Scale       |10    |Right    |F3    |F3    |
|--------|--------|------|------------|------|---------|------|------|
Variables in the working file

Variable Values
|--------|--|-----|
|Value   |  |Label|
|--------|--|-----|
|SvyMonth|1 |May03|
|        |2 |Jun03|
|        |3 |Jul03|
|        |4 |Aug03|
|        |5 |Sep03|
|        |6 |Oct03|
|        |7 |Nov03|
|        |8 |Dec03|
|        |9 |Jan04|
|        |10|Feb04|
|        |11|Mar04|
|        |12|Apr04|
|        |13|May04|
|        |14|Jun04|
|        |15|Jul04|
|        |16|Aug04|
|        |17|Sep04|
|        |18|Oct04|
|        |19|Nov04|
|        |20|Dec04|
|        |21|Jan05|
|        |22|Feb05|
|        |23|Mar05|
|        |24|Apr05|
|        |25|May05|
|--------|--|-----|

===================================
APPENDIX: Code, including test data
===================================
*  ................................................................. .
*  .................   Test data and file handle   ................. .

SET RNG = MT       /* 'Mersenne twister' random number generator  */ .
SET MTINDEX
         = 01232008 /*  Date of posting, mmddyyyy                  */ .

FILE HANDLE Syntax
  /NAME='C:\Documents and Settings\Richard\My Documents'              +
          '\Temporary\SPSS\'                                          +
        '2008-01-23 Douglas - Automate Add Value Labels using Python' +
        ' - '                                                         +
        'SYNTAX.SAV'.
NEW  FILE.
INPUT PROGRAM.
.  NUMERIC CaseID  (N3).
.  NUMERIC Date    (DATE11)
            Year     Month    Day (F4).
.  VALUE LABELS     Month
    1 'Jan'   2 'Feb'   3 'Mar'   4 'Apr'   5 'May'   6 'Jun'
    7 'Jul'   8 'Aug'   9 'Sep'  10 'Oct'  11 'Nov'  12 'Dec'.
.  LEAVE   CaseID   Date.
.  COMPUTE Date   = DATE.MDY(05,12,2003).
.  LOOP CaseID    = 1 TO 500
         IF Date  LE DATE.MDY(05,12,2005).
.     COMPUTE Year  = XDATE.YEAR (Date).
.     COMPUTE Month = XDATE.MONTH(Date).
.     COMPUTE Day   = XDATE.MDAY (Date).
.     END CASE.
.     COMPUTE
            Date     = XDATE.DATE(Date+RV.EXP(1/TIME.DAYS(20))).
.  END LOOP.
END FILE.
END INPUT PROGRAM.

DATASET NAME     TestData WINDOW=FRONT.


*  Ah! The 'month' variable you want to label is not calendar date,  .
*  but month 01-24 in the survey period.                             .

NUMERIC  #SvyStrt (DATE11).
COMPUTE  #SvyStrt= DATE.MDY(05,01,2003).
NUMERIC  SvyMonth (F3).
COMPUTE  SvyMonth = 1 + DATEDIFF(Date,#SvyStrt,"Months").

.  /**/  LIST  /*-*/.

*  .................   Post after this point   ..................... .
*  ................................................................. .
DATASET ACTIVATE TestData WINDOW=FRONT  /* Helpful, during testing */.

*  I.     PREPARE A FILE from which to generate code    ............ .

*  I.A    Add a dummy record for every month in the survey, to make  .
*         sure all months are represented.                           .

NEW  FILE.
INPUT PROGRAM.
.   LOOP  SvyMonth = 1 TO 24.
.      END CASE.
.   END LOOP.
END FILE.
END INPUT PROGRAM.

ADD FILES
   /FILE=TestData
   /FILE=*.

*  I.B    Create a file with one record for every survey month.      .
*         (It retains one survey date and one calendar-month value,  .
*         so the value label can be built either using the existing  .
*         label for 'Month', or the SPSS 'MONTHn' format.)           .

DATASET   DECLARE CodeGen.
AGGREGATE OUTFILE=CodeGen
    /BREAK=SvyMonth
    /FrstDate 'To have one date value in the result'        = MIN(Date)
    /CalndrMn 'To have the old "Month" value in the result' = MIN(Month)
    /NResps   'Number of responses - I''m always curious'   = NU.

DATASET ACTIVATE CodeGen  WINDOW=FRONT.

*  (The next line has nothing to do with the purpose of this code)   .
.  COMPUTE NResps = NResps - 1  /* Discount the dummy record       */.

*  Check an error that baffled me for a while.                ...... .
*  For resolution, see I.C., below.                           ...... .
.  /*--  DISPLAY DICTIONARY /VARIABLES=CalndrMn  /*-*/.

.  /*--  TEMPORARY                               /*-*/.
.  /*--  STRING  OldLabel (A5)                   /*-*/.
.  /*--  COMPUTE OldLabel=VALUELABEL(CalndrMn)   /*-*/.
.  /*--  LIST                                    /*-*/.


*  I.C    Fill in "date" and calendar month, for survey months with  .
*         no observed responses.                                     .

NUMERIC NoBreak  (F2).
COMPUTE NoBreak =  1.


AGGREGATE OUTFILE=* MODE=ADDVARIABLES
    /BREAK=NOBREAK
    /Ur_Month 'First survey month with any response' = MIN(SvyMonth)
    /Ur_Date  'The very earliest response date'      = MIN(FrstDate).

NUMERIC #Ur_Day1 (DATE11) /* First day of first response month     */
         #SvyDay1 (DATE11) /* First day of first month of survey    */.

DO IF   MISSING(FrstDate).
.  COMPUTE #Ur_Day1 = DATE.MDY(XDATE.MONTH(Ur_Date),
                                1,
                                XDATE.YEAR (Ur_Date)).
*  Filled-in date is the first day of the survey month      ....... .
.  COMPUTE #SvyDay1 = DATESUM(#Ur_Day1,(1-Ur_Month),"Months").
.  COMPUTE FrstDate = DATESUM(#Ur_Day1,(SvyMonth-1),"Months").
.  COMPUTE CalndrMn = XDATE.MONTH(FrstDate).
END IF.

.  /**/  LIST  /*-*/.

*  II.    GENERATE THE CODE, and write to the external file          .

ADD FILES
   /FILE = *
   /BY     NoBreak
   /FIRST= ItBegins
   /LAST = ItEnds.


STRING #Syntax (A60) /* To build the output line in */.

DO IF   ItBegins     /* The command header line:    */.
.  COMPUTE #Syntax = "ADD VALUE LABELS SvyMonth".
.  WRITE OUTFILE=Syntax
                /#Syntax.
END IF.

*  ....              /* The label   lines:          */.
** (In quest for clearer code, this uses more       **.
** variables and lines than strictly necessary.)    **.

STRING  #Label (A5)
         #LblMn (A3)
         #LblYr (A4).

*  Month name for label, using SPSS "MONTHn" format **.
*xxCOMPUTE #LblMn= STRING(FrstDate,MONTH3).

*  Month name for label, using existing value labels**.
+  COMPUTE #LblMn= VALUELABEL(CalndrMn).

*  Two-digit year, for label                        **.
COMPUTE #LblYr=    STRING(XDATE.YEAR(FrstDate),N4).
COMPUTE #LblYr=    SUBSTR(#LblYr,3).

*  The complete label text:                         **.
COMPUTE #Label=    CONCAT(#LblMn,#LblYr).

COMPUTE #Syntax
          = CONCAT("      "           ,
                   STRING(SvyMonth,F3),
                   "     '"           ,
                   #Label             ,
                   "'"                ).


DO IF   ItEnds       /* Add period to the last line:*/.
.  COMPUTE #Syntax = CONCAT(RTRIM(#Syntax),".").
END IF.

+  WRITE OUTFILE=Syntax / #Syntax.

EXECUTE   /* An instance where EXECUTE is needed                   */.



*  III.   TRACE AND TEST: List the generated code       ............ .

+  /**/   DATA LIST FIXED            /*-*/
    /**/        FILE=Syntax /         /*-*/
    /**/        CodeLine    01-60(A)  /*-*/.

+  /**/   LIST                       /*-*/.


*  IV.    INCLUDE AND EXECUTE the code                  ............ .

DATASET ACTIVATE TestData.

INSERT  File=Syntax.

DISPLAY DICTIONARY /VARIABLES=SvyMonth.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD