|
Dear all,
I'm trying to use simple Python stuff before checking the more difficult material in Programming-and-Data-Management-for-IBM-SPSS-Statistics (which is too tricky for me at the moment). The final aim is to use Python inside SPSS, of course. But for now I'm trying some very basic stuff like this:

BEGIN PROGRAM.
tupl = ('a', 'b', 'c', 'd', 'e')
tupl[0]
END PROGRAM.

I would like to see the result 'a' in SPSS and store it in an SPSS dataset, for instance as var1, and maybe the whole tuple as var2. Is that possible? Or, in general, how can I import into SPSS something created from scratch with Python code, e.g. a simulation, without defining any SPSS data beforehand? I checked things like spss.Dataset but I can't figure it out exactly. Thanks in advance |
|
I wrote a blog post about the subject, https://andrewpwheeler.wordpress.com/2014/09/19/turning-data-from-python-into-spss-data/.
For this example, to plop the whole ('a', 'b', 'c', 'd', 'e') tuple into one variable I turn it into a string.

**************************************************.
BEGIN PROGRAM Python.
#Export to SPSS dataset function
import spss
def SPSSData(data,vars,types,name=None):
  VarDict = zip(vars,types) #combining variables and formats into tuples
  spss.StartDataStep()
  datasetObj = spss.Dataset(name=name) #if you give a name, needs to be declared
  #appending variables to dataset
  for i in VarDict:
    datasetObj.varlist.append(i[0],i[1])
  #now the data
  for j in data:
    datasetObj.cases.append(list(j))
  spss.EndDataStep()
END PROGRAM.

*Example data.
BEGIN PROGRAM.
tupl = ('a', 'b', 'c', 'd', 'e')
tupl[0]
END PROGRAM.

*Now turning into an SPSS dataset.
BEGIN PROGRAM.
#Create data
CombList1 = (tupl[0],",".join(tupl))
CombList2 = ('f','f,g,h') #example for a second row
YourData = [CombList1,CombList2]
stL = [1,100]
varnames = ['Var1','Var2']
SPSSData(data=YourData,vars=varnames,types=stL)
END PROGRAM.
**************************************************. |
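The data-preparation half of the approach above can be tried in plain Python before any SPSS session is involved. The sketch below builds the same two rows and, as an assumption that is not part of the original post, derives the type list from the data instead of hard-coding [1,100]. It relies on the convention that a type of 0 means numeric and a positive integer means a string of that width. The helper name infer_types is hypothetical, and widths are counted in characters, so non-ASCII text may need more room.

```python
# Plain-Python sketch: prepare rows and a type list for a row-appending
# function such as SPSSData above. No spss module is needed here.

def infer_types(rows):
    """Hypothetical helper: one type per column, 0 for numeric columns,
    otherwise the width of the widest string in that column (at least 1)."""
    types = []
    for column in zip(*rows):
        if all(isinstance(value, (int, float)) for value in column):
            types.append(0)
        else:
            types.append(max(1, max(len(str(value)) for value in column)))
    return types

tupl = ('a', 'b', 'c', 'd', 'e')
row1 = (tupl[0], ",".join(tupl))   # first element, then the whole tuple as text
row2 = ('f', 'f,g,h')              # example for a second row
your_data = [row1, row2]

print(your_data)
print(infer_types(your_data))
```

The resulting lists could then be passed as the data and types arguments of a function like SPSSData.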
|
Thanks Andy, interesting.
Maybe I missed something, but it looks to me like the logic is "appending". In the real world I was thinking more of a situation where I can directly import a large dataset (e.g. 1000 rows and 10 columns) generated in Python. Is that possible? For instance, imagine several columns like this to import:

import random
def rollDice():
    roll = random.randint(1,100)
    return roll

# Now, just to test our dice, let's roll the dice 100 times.
x = 0
while x < 100:
    result = rollDice()
    print(result)
    x+=1 |
|
I'm not sure I understand what the problem is. Yes you have to append one row at a time to the created SPSS dataset.
If that is not appealing for whatever reason, you can dump the table to a CSV file and then load that into SPSS. Offhand I don't know of any scenarios where that would be quicker or easier, though; 1000 rows and 10 columns is certainly not one. |
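For readers who want to try the CSV route mentioned above, here is a minimal sketch, assuming Python 3 and only the standard library. It simulates a 1000 x 10 table of dice rolls and writes it out with a header line. The variable names (roll01 to roll10), the file name, and the helper functions are hypothetical.

```python
import csv
import os
import random
import tempfile

def roll_dice():
    """Return one roll of a 100-sided die, as in the thread's example."""
    return random.randint(1, 100)

def simulate_table(n_rows=1000, n_cols=10):
    """Build the whole table in memory as a list of rows."""
    return [[roll_dice() for _ in range(n_cols)] for _ in range(n_rows)]

def write_csv(path, rows, names):
    """Write a header line followed by one line per case."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(names)
        writer.writerows(rows)

names = ["roll%02d" % (i + 1) for i in range(10)]   # hypothetical variable names
table = simulate_table()
csv_path = os.path.join(tempfile.gettempdir(), "dice_rolls.csv")  # hypothetical file
write_csv(csv_path, table, names)
print(csv_path)
```

From there the file could be read with SPSS's text import (for example GET DATA /TYPE=TXT), at the cost of an extra step compared with appending rows directly.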
|
In reply to this post by raw
Andy's example shows the use of the Dataset class, which is the most versatile and general way of adding new variables and setting their values, but it might be a little intimidating. Here is a simpler example using the spssdata.Spssdata class. It first gets a cursor and defines a new string variable. Then it iterates over the cases (there must be five in this example) and assigns a value from the tuple to the new variable in each case.

data list free/x(f1.0).
begin data
1 2 3 4 5
end data.
begin program.
import spss, spssdata
tupl = ('a', 'b', 'c', 'd', 'e')
curs = spssdata.Spssdata(accessType="w")
newvar = spssdata.vdef("newstring", vtype=1)
curs.append(newvar)
curs.commitdict()
for i, case in enumerate(curs):
    curs.casevalues([tupl[i]])
curs.CClose()
end program.
list. |
|
Thanks Jon, there are some details that are still a little fuzzy for me, sorry. My fault of course.
Imagine that from scratch I do something like this:

begin program.
import random
def rollDice():
    roll = random.randint(1,100)
    return roll

# Now, just to test our dice, let's roll the dice 100 times.
x = 0
while x < 100:
    result = rollDice()
    print(result)
    x+=1
end program.

Can I import the results into a .sav file without appending each row? Or even better, imagine that I'm working on a normal SPSS dataset like the employee data, but I compute a new variable entirely with Python code instead of normal SPSS code, e.g.:

if expression1:
    statement(s)
elif expression2:
    statement(s)
elif expression3:
    statement(s)
else:
    statement(s)

etc. Is it possible to do this and import the created variable into the SPSS dataset? |
|
Jon's example does show appending a new variable to an already created SPSS dataset. Given your if-elif description, you will probably also want to check out the SPSSINC TRANS examples.
I still don't understand what problem you are having with append. In your dice roll example you can either append each roll (append to the dataset inside the while statement), or the ending value of x (append to the dataset after the while statement). |
|
In reply to this post by raw
You can create new casewise variables in the active dataset or update existing ones using Python as I illustrated above. You can also change existing variables. The easiest way to do this is to use the SPSSINC TRANS extension command, which Andy mentioned, because that handles all the boilerplate of defining the new variables and transmitting the values between Python and the active Statistics dataset. You would just write a function that does the computation for a single case, and TRANS handles the rest. Here is a simple example.

begin program.
def f(x, y, z):
    if x==y:
        return z
    else:
        return x
end program.
SPSSINC TRANS RESULT = w /formula "f(var1, var2, var3)".

TRANS has lots of options and can handle multiple outputs from the formula if needed. You can, of course, use Python libraries as needed in your function. Beyond that, the Dataset class provides the underlying functionality for all this. |
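The division of labor described above, where you write a function for a single case and something else loops over the cases, can be imitated in plain Python to test the function before handing it to SPSSINC TRANS. This is only an illustration of the idea, not of how TRANS works internally; apply_casewise and the sample cases are hypothetical.

```python
def f(x, y, z):
    """Per-case computation from the example above."""
    if x == y:
        return z
    else:
        return x

def apply_casewise(func, cases):
    """Hypothetical stand-in for the looping: call func once per case."""
    return [func(*case) for case in cases]

cases = [(1, 1, 9), (2, 3, 9), (5, 5, 0)]   # made-up (var1, var2, var3) values
w = apply_casewise(f, cases)
print(w)
```

Checking a function this way, outside SPSS, makes it easier to separate Python mistakes from problems in the TRANS command itself.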
|
Learning more Python is on my ever-expanding to-do list.
Two other ways to generate simulation or demonstration data are INPUT PROGRAM (if you search the archives for this list you will find many examples using it) and the built-in SPSS procedure for simulation. I do not have access to SPSS on this PC, but search for SIMULATE or SIMULATION in the SPSS help.
Art Kendall
Social Research Consultants |
|
Administrator
|
Let's not forget the MOST concise tool available in SPSS for performing such monkey business ;-)

MATRIX.
SAVE (TRUNC(UNIFORM(1000,3)*100)+1) /OUTFILE * /VARIABLES Random01 TO Random03.
END MATRIX.
---------------------------------------------------------------------
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
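For comparison with the Python snippets earlier in the thread, the MATRIX expression above can be mirrored in a few lines of standard-library Python. This is a sketch under the assumption that UNIFORM yields values in the unit interval, so that truncating 100 times the value and adding 1 gives integers from 1 to 100; the function name uniform_int_matrix is hypothetical.

```python
import random

def uniform_int_matrix(n_rows=1000, n_cols=3, upper=100):
    """Mirror TRUNC(UNIFORM(n_rows, n_cols) * upper) + 1: each cell is an
    integer from 1 to upper."""
    return [[int(random.random() * upper) + 1 for _ in range(n_cols)]
            for _ in range(n_rows)]

matrix = uniform_int_matrix()
names = ["Random%02d" % (i + 1) for i in range(3)]   # Random01 TO Random03
print(names)
print(matrix[:3])   # first three rows; values differ from run to run
```

Unlike the MATRIX version, this only builds the values in memory; getting them into the active dataset still needs one of the approaches discussed above.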
|
In reply to this post by Jon Peck
Ok Jon, I tried simple stuff on the employee data and now it's clear, for instance:

*just random trial, not business relevant.
begin program.
def f(x,y,z):
    for x in range(0,10):
        return z
    else:
        return y
end program.
SPSSINC TRANS RESULT = myvar1 /formula "f(educ,salary,salbegin)".

But I still cannot figure it out for something like this:

data list free / v1(f8) v2(f8).
begin data
10 15
2 2
end data.
begin program.
import math
def pythagoras(a,b):
    value = math.sqrt(a *a + b*b)
    print(value)
end program.
SPSSINC TRANS RESULT = myvar2 /formula "pythagoras(v1,v2)".

In this case myvar2 is missing in the dataset. Why? As I said, my Python level is very basic, but managing these things gives me a lot of motivation to check out more serious stuff! |
|
You are close, but your function needs to return a value. Add "return value" at the end of the definition. Since the function didn't return a value, TRANS set the result to sysmis. |
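The difference between printing and returning can be checked in plain Python. The sketch below places the corrected function next to the version from the question; the name pythagoras_no_return is only a label for this comparison.

```python
import math

def pythagoras(a, b):
    """Return the hypotenuse so that the caller receives a value."""
    value = math.sqrt(a * a + b * b)
    return value

def pythagoras_no_return(a, b):
    """The version from the question: it prints but returns None."""
    value = math.sqrt(a * a + b * b)
    print(value)

print(pythagoras(10, 15))
print(pythagoras_no_return(2, 2) is None)
```

A function without a return statement hands back None, and that missing result is what ends up as system-missing in the new variable.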
|
Hi, I did this, as proposed by another user:

data list free / v1(f8) v2(f8) v3(f8).
begin data
10 15 20
2 2 3
13 1 2
end data.
begin program.
def f(x,y,z):
    for x in range(0,10):
        return z
    else:
        return y
end program.
SPSSINC TRANS RESULT = myvar1 /formula "f(v1,v2,v3)".

But I see that in my case myvar1 is always equal to v3, while in the 3rd case it should be equal to v2. What's the problem? |
|
The line

for x in range(0,10):

says to loop over the values 0 through 9, but you are returning the first time through the loop. What you want is

if x in range(0,10): |
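The two variants can be compared directly on the three rows from the question. This is a plain-Python sketch; the names f_loop and f_test are only labels for the comparison.

```python
def f_loop(x, y, z):
    """Version from the question: the loop rebinds x to 0 and returns z at once."""
    for x in range(0, 10):
        return z
    else:
        return y

def f_test(x, y, z):
    """Suggested fix: test whether x is one of 0..9."""
    if x in range(0, 10):
        return z
    else:
        return y

cases = [(10, 15, 20), (2, 2, 3), (13, 1, 2)]   # the three rows from the question
print([f_loop(*case) for case in cases])
print([f_test(*case) for case in cases])
```

Note that range(0, 10) contains only whole numbers, so a value such as 2.5 would not match; if any value from 0 up to 10 should count, a comparison like 0 <= x < 10 may be closer to the intent.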
|
In reply to this post by progster
This isn't an SPSS question at this point (nor was your prior one); it is a question about Python code. Here you would replace "for" with "if" and your code runs as expected. "for" loops over 0 to 9. In this particular case, the function returns on the first pass through the loop, so the else branch is never reached; that is why it always returns z.
At this point, please just take some time to learn Python programming. If you had actually tested your Python code by itself you would have figured out the error. There is no point in continuing to ask these minor code questions. This is an SPSS forum, not a Python one. |
|
Administrator
|
Andy, unfortunately it seems that Python is becoming a dominant sub-theme in this group. These are classic cases of people attempting to do in Python what is utterly trivial in SPSS proper (3 lines of MATRIX code) without even bothering to study the language! My 2 cents. |
|
I think the point of this thread was to learn how to use Python functions with Statistics, and SPSSINC TRANS in particular. While this particular example could easily be done with native Statistics syntax, I don't think that is the point here.
|
|
In reply to this post by David Marso
Andy, David, I think that the documentation on the programmability extension is very tricky, and I was not going to spend six months studying Python only to maybe discover that I was not interested in learning it.
Thanks to some posts I discovered that it's worth it. My aim, of course, is not to reproduce with Python an analysis that can easily be done with normal SPSS syntax. My aim was simply to start with easy stuff and check IF I was interested. I cannot decide what is appropriate for this forum, but it's clear that IBM's strategy is integration. So if IBM, the owner of the SPSS software, decided this, I think it's appropriate. Or you can create a "Python-SPSS" room, as many forums do. Please don't be "Donald Trump" style; knowledge doesn't need walls. |
|
Administrator
|
Donald Trump Style?
Please elaborate on that. I'm actually more of an EPA/PETA sort. |
