I wonder if there is a complete example of how to create custom variable attributes with Python code alone. I know how to do it with SPSS syntax.

Say I have a variable v1. I want to create
- an attribute "minval" that contains the minimum value
- an attribute "minlab" that contains the label of the minimum value in the value label list of v1, if that list is not empty.

How to proceed? Thanks for any advice.

Mario Giesel
Munich, Germany
The easiest way would be like this.

begin program python3.
import spssaux

spssaux.createAttribute("salary salbegin", "unit", "DM")
end program.

You can also work with a dictionary of attributes via the Dataset class with a bit more code. If you need an example to calculate the minimum value and retrieve its label, I can come up with that.

Note: Since V27 only supports Python 3 out of the box, I used begin program python3, but this same code would work with Python 2.
Pardon my ignorance, Jon, but what does "DM" stand for?
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Bruce, possibly an old currency of Germany? :-) Thanks, Jon, I'll give it a try. Thanks a lot, Mario
I have posted a complete solution for the minimum as an attribute below with an explanation of the code, but the side point of the first solution was that custom attributes make it easy to enrich the variable metadata, here by marking the currency unit of the variables. Python is not required for this: the VARIABLE ATTRIBUTE command could be used directly, but Python code would be needed if you want to use these values in the code. As far as I can see, custom variable attributes are woefully underused.

As for the example, I created a function that takes a string of variable names and constructs a custom attribute named min containing the minimum value for each variable. I split it into two begin program blocks. The first one defines a function named createMinAttrib, and the second invokes it. The blocks could be combined, but it is clearer this way.

The program makes one pass over the data for the specified variables and records the minimum values. Then for each variable it creates a custom attribute named min holding that value (or missing if all values are missing).

begin program python3.
import spssaux, spssdata

def createMinAttrib(thevars):
    """create a custom attribute of the minimum values for a set of variables
    thevars is a blank separated string of variable names"""

    varlist = thevars.split()
    curs = spssdata.Spssdata(thevars)
    mins = {}
    # one data pass: keep the smallest non-missing value seen per variable
    for case in curs:
        for i, v in enumerate(varlist):
            if case[i] is not None:
                mins[v] = min(mins.get(v, case[i]), case[i])
    curs.CClose()
    # write the attribute; None if a variable had no non-missing values
    for v in varlist:
        spssaux.createAttribute(v, "min", mins.get(v, None))
end program.

begin program python3.
createMinAttrib("salary salbegin")
end program.
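The bookkeeping inside that loop can be tried outside SPSS. The following is a minimal sketch in plain Python, assuming each case arrives as a tuple with None standing for a missing value, which is how the loop above treats it. The helper name column_minimums and the sample rows are made up for illustration and are not part of spssaux or spssdata.

```python
def column_minimums(varnames, cases):
    """Return {name: minimum non-missing value} for an iterable of cases.

    Hypothetical helper: each case is a sequence aligned with varnames,
    and None stands for a missing value.
    """
    mins = {}
    for case in cases:
        for i, name in enumerate(varnames):
            value = case[i]
            if value is not None:
                # The first non-missing value seeds the entry;
                # later values replace it only if they are smaller.
                mins[name] = min(mins.get(name, value), value)
    return mins


# Made-up sample data standing in for the cases a cursor would return.
rows = [(57000, 27000), (None, 18750), (40200, None)]
result = column_minimums(["salary", "salbegin"], rows)
print(result)
```

A variable whose values are all missing never gets an entry in the dictionary, which is why the original code falls back to mins.get(v, None) when it writes the attribute.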
In reply to this post by spss.giesel@yahoo.de
Ah yes ... very good! Thank you. ;-)
--
Bruce Weaver
In reply to this post by Jon Peck
Hi, Jon, I agree that custom attributes can be very useful. Thanks for your solution!

Is there documentation for the spssdata object? I'd like to understand it better but could not find it in the Python pdf, which I think covers Python 2 only. Or how can I get access to the object model of Python 3?

Meanwhile I came up with a solution for my variable description task. I'm posting it as it might be interesting for other users as well.

* Encoding: UTF-8.
* A. Variable information is written into custom attributes;
  - position: starting position of variable
  - hasLabels: Are value labels available?
  - numVal: Number of labelled values
  - minVal: Minimum labelled value
  - maxVal: Maximum labelled value
  - minLab: Label of minVal
  - maxLab: Label of maxVal
  .
BEGIN PROGRAM PYTHON.
import spssaux, sys
reload(sys)
sys.setdefaultencoding('1252')  # Windows Encoding of umlauts
sDict = spssaux.VariableDict()  # Retrieve Dictionary
p = 0  # Position initialization
for var in sDict:
    vname = str(var)  # Variable name as String
    spssaux.createAttribute(vname, "position", str(p).zfill(5))  # Starting position of variable
    valLabs = var.ValueLabels  # Dictionary of Value Labels
    valList = [int(key) for key, val in valLabs.iteritems()]  # Value list
    labList = [val for key, val in valLabs.iteritems()]  # Label list
    if len(valList) > 0:  # If there are Value Labels
        # Write value "yes" into attribute "hasLabels"
        spssaux.createAttribute(vname, "hasLabels", "yes")
        spssaux.createAttribute(vname, "numVal", str(len(valList)).zfill(3))
        spssaux.createAttribute(vname, "minVal", min(valList))
        spssaux.createAttribute(vname, "maxVal", max(valList))
        spssaux.createAttribute(vname, "minLab", labList[valList.index(min(valList))])
        spssaux.createAttribute(vname, "maxLab", labList[valList.index(max(valList))])
    else:
        spssaux.createAttribute(vname, "hasLabels", "no")
        spssaux.createAttribute(vname, "numVal", "000")
    p += 1
END PROGRAM.

* B. Sorting 2 times to cluster variables according to similarity.
SORT VARIABLES BY ATTRIBUTE minLab.
SORT VARIABLES BY ATTRIBUTE numVal.

* C. Retrieve original variable order.
SORT VARIABLES BY ATTRIBUTE position.

* D. Delete custom attributes if no longer needed.
VARIABLE ATTRIBUTE VARIABLES = ALL DELETE = position min numVal maxLab minLab maxVal minVal hasLabels.

Mario Giesel
Munich, Germany
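For readers on Python 3, the label lookup at the heart of the block above can be sketched in plain Python. This assumes the value labels arrive as a dictionary mapping values to label strings, as the code above treats var.ValueLabels. The helper name label_extremes and the sample labels are hypothetical, and keys that cannot be read as numbers are skipped rather than passed to int().

```python
def label_extremes(value_labels):
    """Return (min value, its label, max value, its label), or None.

    Hypothetical helper: value_labels maps values (numbers or numeric
    strings) to label text. Keys that cannot be read as numbers are ignored.
    """
    numeric = {}
    for key, label in value_labels.items():  # items() replaces Python 2's iteritems()
        try:
            numeric[float(key)] = label
        except (TypeError, ValueError):
            continue  # e.g. labels attached to string values
    if not numeric:
        return None
    lo, hi = min(numeric), max(numeric)
    return lo, numeric[lo], hi, numeric[hi]


# Made-up labels for a five-point scale.
labels = {"1": "strongly disagree", "3": "neutral", "5": "strongly agree"}
extremes = label_extremes(labels)
print(extremes)
```

Keeping values and labels in one dictionary avoids the two parallel lists and the index() lookups, at the cost of converting every key to a float.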
The spssdata module, which now exists in Python 2 and Python 3 versions, is documented via docstrings in the module itself. It and the other extra SPSS modules installed with Python are not documented in the Python pdf, unfortunately, nor are the extension commands. I wish they were. These modules, which have a lot of useful functions and classes, are spssaux.py, spssaux2.py, spssdata.py, and extendedTransforms.py.

The Python 2 and Python 3 versions are functionally identical, as the Python 3 versions are just the Python 2 versions updated for changes in the language. The same is true of the 50+ extension commands implemented in Python. I expect that any future enhancements will only be done for the Python 3 versions. Python 3 in Statistics, however, only supports Unicode. You can write code to handle code page text.

I am glad to see that you are using the attributes. Note that your code above only works for numeric variables, so you might want to add a filter on that.
In reply to this post by spss.giesel@yahoo.de
I have learned that custom attributes will be extremely valuable. Thank you for working this out.

Are the min value and max value computed over valid values only, as opposed to missing values?

I would like to hear suggestions of things to include as attributes and as attribute values. For an attribute "domain", I am thinking of these as attribute values:
- Whole numbers only
- Positive nonzero only

Currently, there is a built-in attribute "Measure". One thing I have done is copy and paste that into a new attribute "Measur2". Then I went through the variables that had two valid values and gave the attribute "Measur2" the value "dichotomy", since dichotomies can be considered not to have discrepancies in the size of intervals: there is only one interval.
Art Kendall
Social Research Consultants
The code I posted for min and max attributes excludes system missing values but includes user missing. To exclude user missing, the spssdata call would be changed to

curs = spssdata.Spssdata(thevars, omitmissing=True)

I think of custom variable attributes as mostly properties of the data, such as measurement units, source, validation, confidentiality, question text, interviewer instructions etc. The Data Validation procedure allows you to define some domain properties such as min, max, and integer values or values from a specific list. Not by coincidence, these properties are stored as special custom attributes (names start with @) and are used by the Data Validation procedure. The STATS GET TRIPLES extension command stores some properties not supported by built-in Statistics metadata as custom attributes.
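The difference between the two kinds of missing values can be illustrated in plain Python. This is only a sketch of the idea, not how spssdata implements omitmissing: it assumes system-missing values appear as None and that user-missing codes are supplied as a set. The helper name min_valid and the code -9 are invented for the example.

```python
def min_valid(values, user_missing=frozenset()):
    """Minimum of values, ignoring None and any declared user-missing codes.

    Hypothetical helper for illustration; returns None if nothing is valid.
    """
    valid = [v for v in values if v is not None and v not in user_missing]
    return min(valid) if valid else None


# Made-up data: None plays the role of system missing, -9 of a user-missing code.
ages = [34, None, -9, 27, 41]

with_user_missing = min_valid(ages)                        # only None is excluded
without_user_missing = min_valid(ages, user_missing={-9})  # -9 is excluded as well
print(with_user_missing, without_user_missing)
```

When user-missing codes are left in, a code such as -9 can end up stored as the minimum, which is usually not what a min attribute is meant to describe.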
Thanks.
Data Validation Procedure - something else to catch up with.
Art Kendall
Social Research Consultants
Now part of Base, along with Bootstrapping, in V27.