value labels from string vars

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

value labels from string vars

Elizabeth Tighe
We'd like to read data stored in an Access database where the
descriptive labels on categorical variables are stored in separate
tables.  The ODBC import will pull the fields containing the labels in
as additional variables and we'd like to get them to be used as value
labels instead.

Is there a way to assign the values of one variable (a string variable)
as the value labels for another variable, i.e., the equivalent of the
labmask function available as part of the labutil package in Stata?

In Stata:

labmask targetvar, values(stringvar)

Or otherwise get SPSS to recognize the fields as value labels on import?


Any tips/tricks would be greatly appreciated!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Bruce Weaver
Administrator
If it is not important to retain the original numeric values of the numeric variable, then you could just use the string variable containing the labels in an AUTORECODE.  E.g.,

AUTORECODE VARIABLES= LabelVar /INTO=NewVar .

NewVar will be a numeric variable with value labels equal to the values of LabelVar.  

From the FM re the DESCENDING sub-command:

By default, values for the source variable are recoded in ascending order (from lowest to highest).
DESCENDING assigns the values to new variables in descending order (from highest to lowest).
The largest value is assigned 1, the second-largest, 2, and so on.

HTH.


Elizabeth Tighe wrote
We'd like to read data stored in an Access database where the
descriptive labels on categorical variables are stored in separate
tables.  The ODBC import will pull the fields containing the labels in
as additional variables and we'd like to get them to be used as value
labels instead.

Is there a way to assign the values of one variable (a string variable)
as the value labels for another variable, i.e., the equivalent of the
labmask function available as part of the labutil package in Stata?

In Stata:

labmask targetvar, values(stringvar)

Or otherwise get SPSS to recognize the fields as value labels on import?


Any tips/tricks would be greatly appreciated!

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Jon K Peck
Bruce's solution is very simple and efficient.  If you are going to draw the data repeatedly and the set of values actually in the data may vary AND you need to keep the autorecode output the same as far as preexisting values go, use the TEMPLATE feature of AUTORECODE to maintain consistent recoding.

If you need to keep the original values, a little more work is required.  If just a few variables are involved, you could do an aggregated query for each, producing a separate table of values and labels for each.  Then a little bit of Python code would do the job.  If there are a lot of variables, that would be clumsy, and the whole task could be better done in a somewhat more complicated         Python job.  Python, of course, would require that you install the (free) Python Essentials if you don't already have that.


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Bruce Weaver <[hidden email]>
To:        [hidden email],
Date:        01/06/2014 04:05 PM
Subject:        Re: [SPSSX-L] value labels from string vars
Sent by:        "SPSSX(r) Discussion" <[hidden email]>




If it is not important to retain the original numeric values of the numeric
variable, then you could just use the string variable containing the labels
in an AUTORECODE.  E.g.,

AUTORECODE VARIABLES= LabelVar /INTO=NewVar .

NewVar will be a numeric variable with value labels equal to the values of
LabelVar.

From the FM re the DESCENDING sub-command:

By default, values for the source variable are recoded in ascending order
(from lowest to highest).
DESCENDING assigns the values to new variables in descending order (from
highest to lowest).
The largest value is assigned 1, the second-largest, 2, and so on.

HTH.



Elizabeth Tighe wrote
> We'd like to read data stored in an Access database where the
> descriptive labels on categorical variables are stored in separate
> tables.  The ODBC import will pull the fields containing the labels in
> as additional variables and we'd like to get them to be used as value
> labels instead.
>
> Is there a way to assign the values of one variable (a string variable)
> as the value labels for another variable, i.e., the equivalent of the
> labmask function available as part of the labutil package in Stata?
>
> In Stata:
>
> labmask targetvar, values(stringvar)
>
> Or otherwise get SPSS to recognize the fields as value labels on import?
>
>
> Any tips/tricks would be greatly appreciated!
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/value-labels-from-string-vars-tp5723809p5723810.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Elizabeth Tighe
We have many variables and we do need to maintain the original numeric
values.  We'll just transfer to Stata and use the labmask function there
... was hoping to save this step and stick with SPSS if it was do-able
in SPSS directly.

many thanks,
Liz



On 1/6/2014 6:14 PM, Jon K Peck wrote:

> Bruce's solution is very simple and efficient.  If you are going to draw
> the data repeatedly and the set of values actually in the data may vary
> AND you need to keep the autorecode output the same as far as
> preexisting values go, use the TEMPLATE feature of AUTORECODE to
> maintain consistent recoding.
>
> If you need to keep the original values, a little more work is required.
>   If just a few variables are involved, you could do an aggregated query
> for each, producing a separate table of values and labels for each.
>   Then a little bit of Python code would do the job.  If there are a lot
> of variables, that would be clumsy, and the whole task could be better
> done in a somewhat more complicated         Python job.  Python, of
> course, would require that you install the (free) Python Essentials if
> you don't already have that.
>
>
> Jon Peck (no "h") aka Kim
> Senior Software Engineer, IBM
> [hidden email]
> phone: 720-342-5621
>
>
>
>
> From: Bruce Weaver <[hidden email]>
> To: [hidden email],
> Date: 01/06/2014 04:05 PM
> Subject: Re: [SPSSX-L] value labels from string vars
> Sent by: "SPSSX(r) Discussion" <[hidden email]>
> ------------------------------------------------------------------------
>
>
>
> If it is not important to retain the original numeric values of the numeric
> variable, then you could just use the string variable containing the labels
> in an AUTORECODE.  E.g.,
>
> AUTORECODE VARIABLES= LabelVar /INTO=NewVar .
>
> NewVar will be a numeric variable with value labels equal to the values of
> LabelVar.
>
>  From the FM re the DESCENDING sub-command:
>
> By default, values for the source variable are recoded in ascending order
> (from lowest to highest).
> DESCENDING assigns the values to new variables in descending order (from
> highest to lowest).
> The largest value is assigned 1, the second-largest, 2, and so on.
>
> HTH.
>
>
>
> Elizabeth Tighe wrote
>  > We'd like to read data stored in an Access database where the
>  > descriptive labels on categorical variables are stored in separate
>  > tables.  The ODBC import will pull the fields containing the labels in
>  > as additional variables and we'd like to get them to be used as value
>  > labels instead.
>  >
>  > Is there a way to assign the values of one variable (a string variable)
>  > as the value labels for another variable, i.e., the equivalent of the
>  > labmask function available as part of the labutil package in Stata?
>  >
>  > In Stata:
>  >
>  > labmask targetvar, values(stringvar)
>  >
>  > Or otherwise get SPSS to recognize the fields as value labels on import?
>  >
>  >
>  > Any tips/tricks would be greatly appreciated!
>  >
>  > =====================
>  > To manage your subscription to SPSSX-L, send a message to
>
>  > LISTSERV@.UGA
>
>  >  (not to SPSSX-L), with no body text except the
>  > command. To leave the list, send the command
>  > SIGNOFF SPSSX-L
>  > For a list of commands to manage subscriptions, send the command
>  > INFO REFCARD
>
>
>
>
>
> -----
> --
> Bruce Weaver
> [hidden email]
> http://sites.google.com/a/lakeheadu.ca/bweaver/
>
> "When all else fails, RTFM."
>
> NOTE: My Hotmail account is not monitored regularly.
> To send me an e-mail, please use the address shown above.
>
> --
> View this message in context:
> http://spssx-discussion.1045642.n5.nabble.com/value-labels-from-string-vars-tp5723809p5723810.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>
>

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Richard Ristow
In reply to this post by Elizabeth Tighe
At 05:46 PM 1/6/2014, Elizabeth Tighe wrote:

>We'd like to read data stored in an Access database where the
>descriptive labels on categorical variables are stored in separate
>tables.  The ODBC import will pull the fields containing the labels
>in as additional variables and we'd like to get them to be used as
>value labels instead.
>
>Is there a way to get SPSS to recognize the fields as value labels on import?

Here's another approach, maybe practicable, maybe not. It depends on
how much control you have over your ODBC import.

It sounds like the SQL for the ODBC joins the separate tables with
the descriptive labels to the data table, and then imports the
result; and then, in effect, you want to UNDO the join, and have a
lookup table of descriptive values again. (A set of value labels is,
in effect, such a table.)

I'd look into,

. Importing the data *without* joining to the tables with labels.
Then, you'll have the numeric values of the categorical variables;
you won't have the extra variables; and you won't, at that point,
have any labels for the categories.

. Separately, importing each of the tables with the descriptive
labels. I presume each of these has two fields: a numeric one with
the numeric value, and a character one with the corresponding
descriptive label. You'll get, then, for each a corresponding SPSS
file with the same variables: numeric value, and string with the
descriptive label.

. Converting the imported tables to VALUE LABELS statements. That can
be done a number of ways; my own preference would be to use native
SPSS code to create the VALUE LABELS statements, write those to an
external file, and INSERT them in a syntax program.

Now, you later write, "We have many variables and we do need to
maintain the original numeric values." The approach I'm outlining
maintains the numeric variables; but it does take some work for each
set of value labels. One question is how many *different* sets of
value labels you have; basically, you create the VALUE LABELS
statement once for each such set, and on it name all the variables to
which it applies.

You'll have to decide how this compares in difficulty to the SPSS ->
STATA -> SPSS route.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Richard Ristow
In reply to this post by Elizabeth Tighe
At 05:46 PM 1/6/2014, Elizabeth Tighe wrote:

>We'd like to read data [from] an Access database where the
>descriptive labels on categorical variables are stored in separate
>tables.  The ODBC import will pull the fields containing the labels
>in as additional variables and we'd like to get them to be used as
>value labels instead.

The thread and list have moved on, and presumably the original poster
either does, or doesn't, like any of the solutions proposed.

However, here is a solution in native SPSS that, I believe, does what
was originally requested. Admittedly, it took some coding.

To use:

1. Specify a file handle for file 'Code', usually a temporary location

2. On the VARSTOCASES statement,
.  Change "/KEEP  = CaseID" to name whatever case identifier is in
your data -- or, omit this clause altogether; the case identifier
isn't actually used.
.  On "/MAKE Value", name the variables to which you want to assign
value labels; on "/MAKE Label", name the corresponding variables
containing the labels.

3. If desired, after running drop the label variables.


|-----------------------------|---------------------------|
|Output Created               |16-JAN-2014 12:33:52       |
|-----------------------------|---------------------------|
  [TestData]
CaseID LatinVal LatinLbl GreekVal GreekLbl HebrwVal HebrwLbl

   1001      1   A             1   Alpha         1   Aleph
   1002      1   A             2   Beta          1   Aleph
   1003      3   C             2   Beta          1   Aleph
   1004      3   C             1   Alpha         1   Aleph
   1005      3   C             1   Alpha         2   Beth
   1006      1   A             1   Alpha         3   Gimel
   1007      3   C             1   Alpha         2   Beth
   1008      1   A             2   Beta          2   Beth
   1009      3   C             1   Alpha         1   Aleph
   1010      2   B             3   Gamma         3   Gimel

Number of cases read:  10    Number of cases listed:  10


*  I.      Unroll, putting each code-label pair in a separate   ...  .
*          record                                               ...  .

DATASET   ACTIVATE TestData WINDOW=FRONT.
DATASET   COPY     Summary.
DATASET   ACTIVATE Summary  WINDOW=Front.
*  ...........................................................  ...  .
*  To use this for any other dataset, you should only have to   ...  .
*  change the lists of variables on the two MAKE lists,         ...  .
*  and change (or delete) the KEEP clause.                      ...  .
*  ...........................................................  ...  .

VARSTOCASES
    /MAKE Value FROM LatinVal GreekVal HebrwVal
    /MAKE Label FROM LatinLbl GreekLbl HebrwLbl
    /INDEX = Variable(Value)
    /KEEP  = CaseID
    /NULL  = DROP.


Variables to Cases
|-----------------------------|---------------------------|
|Output Created               |16-JAN-2014 12:33:53       |
|-----------------------------|---------------------------|
|   (Notes omitted)                                       |
|-----------------------------|---------------------------|

   *  II.   Summarize, to a single record for each                 ...  .
*        variable-value-label triplet found in the data         ...  .

AGGREGATE  OUTFILE=*
    /BREAK =Variable Value Label
    /N#    'Instances of this pair' = NU.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |16-JAN-2014 12:33:53       |
|-----------------------------|---------------------------|
Variable Value Label       N#

GreekVal    1  Alpha        6
GreekVal    2  Beta         3
GreekVal    3  Gamma        1
HebrwVal    1  Aleph        5
HebrwVal    2  Beth         3
HebrwVal    3  Gimel        2
LatinVal    1  A            4
LatinVal    2  B            1
LatinVal    3  C            5

Number of cases read:  9    Number of cases listed:  9


*  III.    Generate VALUE LABEL statements                      ...  .

ADD FILES
   /FILE=*
   /BY    Variable
   /FIRST=VarStart LAST =VarEnd.


*  III.A.  Scratch variables: A literal quotation mark, and     ...  .
*          a buffer in which to build the labels                ...  .

STRING  #Qt    (A1)
         #Label (A50).

COMPUTE #Qt = '"'.


*  III.B.  Write the statement beginnings, naming the variables ...  .

DO IF   VarStart.
.  WRITE OUTFILE=Code
    /'VALUE LABELS ' Variable .
END IF.


*  III.C.  Write the label specifications                       ...  .

COMPUTE    #Label = CONCAT(#Qt,RTRIM(Label),#Qt).

IF Varend  #Label = CONCAT(RTRIM(#Label),'.').

.  WRITE OUTFILE=Code
    /'   ' Value (F3) '  ' #Label *.

*  III.D.  Force a data pass, to write out the code             ...  .

EXECUTE.


*  IV.     Apply the labels to the original file                ...   .

DATASET ACTIVATE TestData WINDOW=FRONT.
INSERT FILE=Code.
  162  VALUE LABELS GreekVal
  163       1  "Alpha"
  164       2  "Beta"
  165       3  "Gamma".
  166  VALUE LABELS HebrwVal
  167       1  "Aleph"
  168       2  "Beth"
  169       3  "Gimel".
  170  VALUE LABELS LatinVal
  171       1  "A"
  172       2  "B"
  173       3  "C".
  174
  175  * End of INSERT and INCLUDE nesting level 01.
=============================
APPENDIX: Test data, and code
=============================
*  C:\Documents and Settings\Richard\My Documents       .
*    \Technical\spssx-l\Z-2014\                         .
*    2014-01-06 Tighe-value labels from string vars.SPS .

*  In response to posting                               .
*  Date:     Mon, 6 Jan 2014 17:46:20 -0500             .
*  From:     Elizabeth Tighe <[hidden email]>       .
*  Subject:  value labels from string vars              .
*  To:       [hidden email]                   .

*  ................................................................. .
*  "We'd like to read data stored in an Access database where the    .
*  descriptive labels on categorical variables are stored in         .
*  separate tables.  The ODBC import will pull the fields containing .
*  the labels in as additional variables and we'd like to get them   .
*  to be used as value labels instead.                               .
*                                                                    .
*  "Is there a way to assign the values of one variable (a string    .
*  variable) as the value labels for another variable, i.e., the     .
*  equivalent of the labmask function available as part of the       .
*  labutil package in Stata?"                                        .
*  ................................................................. .


*  0.a     External file for generated code                     ...  .

FILE HANDLE Code
  /NAME='C:\Documents and Settings\Richard\My Documents'              +
          '\Temporary\SPSS\'                                          +
        '2014-01-06 Tighe-value labels from string vars' +
        ' - '                                                         +
        'CODE.SPS'.

*  0.b     Test data                                            ...  .


SET RNG    =MT.
SET MTINDEX=4833.

NEW FILE.
INPUT PROGRAM.
NUMERIC  CaseID   (F5).
NUMERIC  LatinVal (F3).
STRING   LatinLbl (A1).
NUMERIC  GreekVal (F3).
STRING   GreekLbl (A5).
NUMERIC  HebrwVal (F3).
STRING   HebrwLbl (A6).

LOOP CaseID = 1001 TO 1010.
.  DO REPEAT  Value = LatinVal GreekVal HebrwVal.
.     COMPUTE Value = RV.POISSON(1.0) + 1.
.     RECODE  Value (3 THRU HI = 3).
.  END REPEAT.
.  RECODE LatinVal
          (1 = 'A') (2 = 'B') (3 = 'C')
          INTO LatinLbl.
.  RECODE GreekVal
          (1 = 'Alpha')
          (2 = 'Beta')
          (3 = 'Gamma')
          INTO GreekLbl.
.  RECODE HebrwVal
          (1 = 'Aleph')
          (2 = 'Beth')
          (3 = 'Gimel')
          INTO HebrwLbl.
.  END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

DATASET NAME     TestData WINDOW=FRONT.

*  ...     Post, after this point                               ...  .

LIST.

*  I.      Unroll, putting each code-label pair in a separate   ...  .
*          record                                               ...  .

DATASET   ACTIVATE TestData WINDOW=FRONT.
DATASET   COPY     Summary.
DATASET   ACTIVATE Summary  WINDOW=Front.

*  ...........................................................  ...  .
*  To use this for any other dataset, you should only have to   ...  .
*  change the lists of variables on the two MAKE lists,         ...  .
*  and change (or delete) the KEEP clause.                      ...  .
*  ...........................................................  ...  .

VARSTOCASES
    /MAKE Value FROM LatinVal GreekVal HebrwVal
    /MAKE Label FROM LatinLbl GreekLbl HebrwLbl
    /INDEX = Variable(Value)
    /KEEP  = CaseID
    /NULL  = DROP.

LIST.

*  II.   Summarize, to a single record for each                 ...  .
*        variable-value-label triplet found in the data         ...  .

AGGREGATE  OUTFILE=*
    /BREAK =Variable Value Label
    /N#    'Instances of this pair' = NU.

LIST.

*  III.    Generate VALUE LABEL statements                      ...  .

ADD FILES
   /FILE=*
   /BY  Variable
   /FIRST=VarStart LAST =VarEnd.


*  III.A.  Scratch variables: A literal quotation mark, and     ...  .
*          a buffer in which to build the labels                ...  .

STRING  #Qt    (A1)
         #Label (A50).

COMPUTE #Qt = '"'.


*  III.B.  Write the statement beginnings, naming the variables ...  .

DO IF   VarStart.
.  WRITE OUTFILE=Code
    /'VALUE LABELS ' Variable .
END IF.


*  III.C.  Write the label specifications                       ...  .

COMPUTE    #Label = CONCAT(#Qt,RTRIM(Label),#Qt).

IF Varend  #Label = CONCAT(RTRIM(#Label),'.').

.  WRITE OUTFILE=Code
    /'   ' Value (F3) '  ' #Label *.

*  III.D.  Force a data pass, to write out the code             ...  .

EXECUTE.


*  IV.     Apply the labels to the original file                ...   .

DATASET ACTIVATE TestData WINDOW=FRONT.

INSERT FILE=Code.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Richard Ristow
In reply to this post by Jon K Peck
At 06:14 PM 1/6/2014, Jon K Peck wrote:

>If you need to keep the original values, a little more work is
>required.  If just a few variables are involved, you could do an
>aggregated query for each, producing a separate table of values and
>labels for each.  Then a little bit of Python code would do the
>job.  If there are a lot of variables, that would be clumsy, and the
>whole task could be better done in a somewhat more complicated Python job.

This is a case where it really seems Python might help -- notably,
SPSS.SUBMIT from Python is a neater way of submitting generated code
to SPSS than is writing to an external file and INSERTing.

However, in other respects, the solution I've posted in native SPSS,
based on VARSTOCASES and AGGREGATE, seems pretty straightforward.
What would be the most natural approach in Python? Would it be neater?

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Jon K Peck
Here's an example of what I am working on.
STATS VALLBLS FROMDATA VARIABLES=a b c to f LBLVARS="label.*" /OUTPUT SYNTAX="dolabels.sps".

It will check for conflicts (two different labels for the same value of a variable) and can report duplicate value labels for a variable (two different values with the same label).  It can optionally write a syntax file of VALUE LABEL statements and generate the labels.  In most cases it will do everything on one data pass.  Either variable list can be a regular variable list or a regular expression for selecting the variables.  If many variables share the same set of labels, LBLVARS can be just one variable that is applied to all the VARIABLES variables.  It will also have a dialog box interface.

Details still subject to change.


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Richard Ristow <[hidden email]>
To:        Jon K Peck/Chicago/IBM@IBMUS, [hidden email],
Date:        01/16/2014 11:19 AM
Subject:        Re: value labels from string vars




At 06:14 PM 1/6/2014, Jon K Peck wrote:

>If you need to keep the original values, a little more work is
>required.  If just a few variables are involved, you could do an
>aggregated query for each, producing a separate table of values and
>labels for each.  Then a little bit of Python code would do the
>job.  If there are a lot of variables, that would be clumsy, and the
>whole task could be better done in a somewhat more complicated Python job.

This is a case where it really seems Python might help -- notably,
SPSS.SUBMIT from Python is a neater way of submitting generated code
to SPSS than is writing to an external file and INSERTing.

However, in other respects, the solution I've posted in native SPSS,
based on VARSTOCASES and AGGREGATE, seems pretty straightforward.
What would be the most natural approach in Python? Would it be neater?


Reply | Threaded
Open this post in threaded view
|

Re: value labels from string vars

Richard Ristow
In reply to this post by Richard Ristow
At 03:27 PM 1/16/2014, Jon K Peck wrote, off-list:

>One problem with [Richard's native-SPSS] code is that it presumes
>that the variables being labelled are numeric.  String variables can
>also take labels.  It also doesn't deal with the case where the
>label text contains quotes

Both points taken.

The OP stated (in her second posting on the problem) that all her
data values were numeric, so I felt OK accepting that limitation. It
is a limitation that is difficult to surmount in native SPSS,
probably much easier in Python.

Some of the other features planned for the Python implementation, like

. handling label text containing embedded quotes and (from an earlier posting)
. check[ing] for conflicts (two different labels for the same value
of a variable)
. report[ing] duplicate value labels for a variable (two different
values with the same label),

could be added to the native-SPSS implementation: the latter two
trivially easily, the former with some work. And others, like

. It will have a dialog box interface
. Either variable list can be a regular variable list or a regular
expression for selecting the variables

are essentially impossible in native SPSS; Python programmability
really does buy some important new capabilities. One feature,

. If many variables share the same set of labels, LBLVARS can be just
one variable that is applied to all the VARIABLES variables

is probably of little use in, as stated. By stipulation, the values
of the label variables are the labels that should be applied to the
corresponding values of the data variables. If two data variable
share the same label variable, they necessarily have the same values
unless labels are duplicated.

On the other hand, if you can specify a set of data variables that
share the same value labels (though different label variables), you
could combine the checking of values against labels, and improve your
chances of finding the correct labels for values that happen not to
occur in one or more of the data variables.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD