|
Dear all, I have to split a string variable into components separated by commas.
I found a Python tutorial, "split-string-variable-into-components", but it did not work on my dataset (maybe because the example has only one variable in the dataset, whereas mine has more). Any tips? Thanks.
|
Administrator
|
http://spssx-discussion.1045642.n5.nabble.com/template/NamlServlet.jtp?macro=search_page&node=1068821&query=Parse&n=1068821
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
|
Thanks. I think the most flexible solution is the Python one, because the normal SPSS syntax creates several variables named v1 to v_n, without customized names. This solution does give customized names: http://www.spss-tutorials.com/split-string-variable-into-components/

*1. Create Test Data.

begin program.
import random,spss
random.seed(1)
data = ''
for case in range(10):
    val = '"'
    for novars in range(random.randrange(12)):
        for vallen in range(random.randrange(8)):
            val += chr(random.randrange(97,123))
        val += ';'
    val += '"'
    data += val + '\n'
spss.Submit('''data list list/s1(a%s).\nbegin data\n%send data.'''%(max(len(s) for s in data.split('"')),data))
end program.

*2. Define the function.

begin program.
def stringsplitter(variable,sep):
    import spss,spssaux
    lens = []
    # First pass: find the longest stripped value in each component position.
    curs = spss.Cursor([spssaux.VariableDict().VariableIndex(variable)],
                       accessType='w')
    for case in range(curs.GetCaseCount()):
        for cnt,val in enumerate(curs.fetchone()[0].split(sep)):
            if not len(lens)>cnt:
                lens.append(len(val.strip()))
            elif len(val.strip())>lens[cnt]:
                lens[cnt] = len(val.strip())
    curs.close()
    # Second pass: create the new string variables and fill them case by case.
    curs=spss.Cursor([spssaux.VariableDict().VariableIndex(variable)],
                     accessType='w')
    curs.SetVarNameAndType([variable + '_s' + str(cnt + 1) for cnt in range(len(lens))],
                           [1 if leng==0 else leng for leng in lens])
    curs.CommitDictionary()
    for case in range(curs.GetCaseCount()):
        for cnt,val in enumerate(curs.fetchone()[0].split(sep)):
            curs.SetValueChar(variable+'_s'+str(cnt + 1),val)
        curs.CommitCase()
    curs.close()
end program.

*3. Apply the function.

begin program.
stringsplitter('s1',';') #Please specify string variable and separator.
end program.

But I am not able to adapt the syntax to my data. Here is an example:

data list list / id * city (A50) zone (A1) product (A50).
begin data
1 "berlin" "a" "stock1, stock2, stock3"
2 "paris" "a" "stock1, stock2, stock3"
3 "amsterdam" "b" "stock1, stock2, stock3, stock4"
4 "london" "b" "stock1, stock2, stock3, stock5"
end data.

I guess that I have to use my variable name "product" instead of s1, but I am missing something else.
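One difference between the tutorial data and the example above is that the items in "product" are separated by a comma followed by a space, while the tutorial values use a bare ';'. The sketch below is plain Python (it does not need the spss module) and only mimics the width calculation in the first pass of stringsplitter. The helper name component_widths is hypothetical, and the padding to 50 characters is an assumption about how the A50 values would arrive.

```python
# Hypothetical helper, not part of the tutorial: plain Python, no spss module.
def component_widths(values, sep=","):
    """Return the longest stripped length found at each component position."""
    widths = []
    for value in values:
        for pos, piece in enumerate(value.split(sep)):
            length = len(piece.strip())
            if pos >= len(widths):
                widths.append(length)      # first time this position is seen
            elif length > widths[pos]:
                widths[pos] = length       # longer value found for this position
    return widths

# The "product" values from the example data, padded to 50 characters
# (assumption: string values are handed over blank-padded to their width).
products = [
    "stock1, stock2, stock3",
    "stock1, stock2, stock3",
    "stock1, stock2, stock3, stock4",
    "stock1, stock2, stock3, stock5",
]
padded = [p.ljust(50) for p in products]

widths = component_widths(padded)
print(widths)
```

Because the widths are computed from stripped pieces, a piece that is later written without stripping (for example " stock2" with its leading space) is one character longer than the width reserved for it, which may be part of what goes wrong when the separator is changed to ','.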
|
Here is a simpler Python solution.
begin program.
def splitter(thestring):
    return thestring.split(";")
end program.

spssinc trans result=x1 to x20 type=10
  /formula "splitter(s1)".

It takes s1 as the input and creates variables x1 to x20 with the split pieces. Each output string is 10 bytes. You can, of course, easily change those parameters. Unused slots are left blank. If there are too many split values, you will get an error message.

HTH,
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621

From: raw <[hidden email]>
To: [hidden email]
Date: 10/28/2015 03:36 PM
Subject: Re: [SPSSX-L] split-string-variable-into-components (may be Python)
Sent by: "SPSSX(r) Discussion" <[hidden email]>

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/split-string-variable-into-components-may-be-Python-tp5730890p5730894.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command.
To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
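For the "product" example earlier in the thread, the splitter function from the reply above would need a comma as separator, and the pieces would probably need stripping because of the space after each comma. A minimal plain-Python sketch of such a variant follows. The sep and slots parameters are hypothetical additions, and the explicit padding and error only imitate the behaviour described in the reply (blank unused slots, an error on too many values); they are not taken from the extension itself.

```python
def splitter(thestring, sep=",", slots=4):
    """Split on sep, strip blanks from each piece, pad with '' up to slots."""
    pieces = [piece.strip() for piece in thestring.split(sep)]
    if len(pieces) > slots:
        # Imitates the "too many split values" error mentioned in the reply.
        raise ValueError("%d pieces found, but only %d slots" % (len(pieces), slots))
    return pieces + [""] * (slots - len(pieces))

# Values as they might arrive from an A50 variable (blank padding is an assumption).
print(splitter("stock1, stock2, stock3".ljust(50)))
print(splitter("stock1, stock2, stock3, stock5".ljust(50)))
```

If this were used with spssinc trans, the formula would presumably become "splitter(product)" with four result variables of a suitable width; that combination has not been tested here.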
