I have a large dataset of administrative data that is currently in Excel, and all fields have both a number and the corresponding value. For example, one column would contain the following options:
(0) None (1) One (2) Two (3) Three or more There are about 300 variables like this (although what 0, 1, 2, etc mean differs for each of them), so I would like to avoid having to recode all of them. Is there a way to get SPSS to just read the numeric value from the parentheses? Is there another trick I could use for this? Thanks! |
Maybe I'm wrong but from your example, it looks like the numeric value is always in position 2 of the string value that is read in from excel. If so, then it's basically a) do repeat for v1 to v300, b) use Char.Substr to extract the character in position 2, c) use Number to convert the extracted character to a number.
Do repeat x=v1 to v300/y=y1 to y300. Compute y=number(char.substr(x,2,1),f1.0). End repeat. I hope it's understood that the conversion takes place after the data have been imported from the excel file. You may also have another problem that may not have occurred to you yet. The excel file data give both the numeric value and value label. Perhaps you will want value labels. Perhaps in python there is a way to stick the extracted labels into the value labels section of the data file. That is not possible using syntax, so far as I know. Gene Maguin -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of pbl Sent: Sunday, March 10, 2013 4:27 PM To: [hidden email] Subject: Reading numeric values from text when importing into SPSS? I have a large dataset of administrative data that is currently in Excel, and all fields have both a number and the corresponding value. For example, one column would contain the following options: (0) None (1) One (2) Two (3) Three or more There are about 300 variables like this (although what 0, 1, 2, etc mean differs for each of them), so I would like to avoid having to recode all of them. Is there a way to get SPSS to just read the numeric value from the parentheses? Is there another trick I could use for this? Thanks! -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Reading-numeric-values-from-text-when-importing-into-SPSS-tp5718529.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by pbl
Try the example
syntax below.
You most likely will want to do some autorecode. new file. data list list/ invar1 to invar3 (3a30). begin data "(0) None" "(13) orange" "(5) John" "(1) One" " (23) apple" "(7) Mary" "(2) Two" "(44) pineapple" "(111) Sue" "(3) Three or more" " (-1) n/a" "(22) Bill" end data. numeric numvar1 to numvar3 (f4). string strvar1 to strvar3(a20). string numstr(a6). do repeat inv = invar1 to invar3 /num = numvar1 to numvar3 /str = strvar1 to strvar3. compute EndNumPart = index(inv,")"). compute numstr = char.substr(inv,1,EndNumPart+1). compute numstr = replace(numstr,'(',''). compute numstr = replace(numstr,')',''). compute numstr = replace(numstr,' ',''). compute num = number(numstr,f6). compute str = char.substr(inv,(EndNumPart+2)). end repeat. list. delete vars numstr. alter type strvar1 to strvar3 (a=amin) . Art Kendall Social Research ConsultantsOn 3/10/2013 4:27 PM, pbl wrote: I have a large dataset of administrative data that is currently in Excel, and all fields have both a number and the corresponding value. For example, one column would contain the following options: (0) None (1) One (2) Two (3) Three or more There are about 300 variables like this (although what 0, 1, 2, etc mean differs for each of them), so I would like to avoid having to recode all of them. Is there a way to get SPSS to just read the numeric value from the parentheses? Is there another trick I could use for this? Thanks! -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Reading-numeric-values-from-text-when-importing-into-SPSS-tp5718529.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Art Kendall
Social Research Consultants |
In reply to this post by Maguin, Eugene
Hi,
Getting the values is easy, but as Gene said, you (me at least) would probably like to have the value labels. do repeat #in = v1 to v300 / #out = num1 to num300. +compute #out = number(substr(#in, index(#in, "(") + 1, index(#in, ")") -1), f5). end repeat. * getting the value labels: entirely untested. BEGIN PROGRAM. import re import spss prefix = "blah_" # assumes that target vars start with this vnames = spssaux.GetVariableNamesList() varPos = [v.index() for v in vnames if v.startswith(prefix)] for i in varPos: dataCursor = spss.Cursor([i]) col = dataCursor.fetchall() dataCursor.close() valueLabels = dict([re.findall("\((\d+)\) (\w+)", v[0], re.I)) for v in col]) syntax = ["value labels %s" % vnames[i]] syntax.extend([" %s '%s' " % value, label for value, label in valueLabels.items()]) spss.Submit("\n".join(syntax)) END PROGRAM. Regards, Albert-Jan ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ----- Original Message ----- > From: "Maguin, Eugene" <[hidden email]> > To: [hidden email] > Cc: > Sent: Sunday, March 10, 2013 9:41 PM > Subject: Re: [SPSSX-L] Reading numeric values from text when importing into SPSS? > > Maybe I'm wrong but from your example, it looks like the numeric value is > always in position 2 of the string value that is read in from excel. If so, then > it's basically a) do repeat for v1 to v300, b) use Char.Substr to extract > the character in position 2, c) use Number to convert the extracted character to > a number. > > Do repeat x=v1 to v300/y=y1 to y300. > Compute y=number(char.substr(x,2,1),f1.0). > End repeat. > > I hope it's understood that the conversion takes place after the data have > been imported from the excel file. > > You may also have another problem that may not have occurred to you yet. The > excel file data give both the numeric value and value label. Perhaps you will > want value labels. Perhaps in python there is a way to stick the extracted > labels into the value labels section of the data file. That is not possible > using syntax, so far as I know. > > Gene Maguin > > > > -----Original Message----- > From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of pbl > Sent: Sunday, March 10, 2013 4:27 PM > To: [hidden email] > Subject: Reading numeric values from text when importing into SPSS? > > I have a large dataset of administrative data that is currently in Excel, and > all fields have both a number and the corresponding value. For example, one > column would contain the following options: > (0) None > (1) One > (2) Two > (3) Three or more > There are about 300 variables like this (although what 0, 1, 2, etc mean differs > for each of them), so I would like to avoid having to recode all of them. Is > there a way to get SPSS to just read the numeric value from the parentheses? Is > there another trick I could use for this? > > Thanks! > > > > -- > View this message in context: > http://spssx-discussion.1045642.n5.nabble.com/Reading-numeric-values-from-text-when-importing-into-SPSS-tp5718529.html > Sent from the SPSSX Discussion mailing list archive at Nabble.com. > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except the > command. To leave the list, send the command SIGNOFF SPSSX-L For a list of > commands to manage subscriptions, send the command INFO REFCARD > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD > ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by pbl
The easiest way to get value labels might be to use AUTORECODE
on the original strings -- assuming that they were computer-generated and so there are not any variations in spelling, spacing, or capital letters. If *all* the values are observed as zero to <something>, no gaps, then you could recover the original scores by subtracting 1 in a DO REPEAT. -- Rich Ulrich > Date: Sun, 10 Mar 2013 13:27:00 -0700 > From: [hidden email] > Subject: Reading numeric values from text when importing into SPSS? > To: [hidden email] > > I have a large dataset of administrative data that is currently in Excel, and > all fields have both a number and the corresponding value. For example, one > column would contain the following options: > (0) None > (1) One > (2) Two > (3) Three or more > There are about 300 variables like this (although what 0, 1, 2, etc mean > differs for each of them), so I would like to avoid having to recode all of > them. Is there a way to get SPSS to just read the numeric value from the > parentheses? Is there another trick I could use for this? > |
Free forum by Nabble | Edit this page |