Folks,
My survey research data file contains a large number of airport codes (LAX, DFW, BOS, MIA, etc.) in a variable named "AIRPORT". Each represents the airport from which the survey respondent departed. There are about 256 airport codes for about 25,000 respondents, and the airports change from survey to survey.

The 3-character AIRPORT variable would look something like this:

AIRPORT
ABQ
ABQ
DFW
DFW
DFW
ORD
ORD
MIA
MIA
MIA

I want to create a Python list of the unique airport codes included in the variable "AIRPORT." E.g., ["ABQ","DFW","ORD","MIA"].

There must be an easy way to do this, although I haven't found it yet in spssaux or any other SPSS Python module. Any ideas?

This inquiry is cross-posted on the Google SPSS group and SPSS Developer Central.

Thanks,

King Douglas
Senior Analyst
American Airlines
There are answers to this query posted in the Developer Central forums (www.spss.com/devcentral) for anyone interested.
-Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of King Douglas
Sent: Wednesday, October 11, 2006 10:50 AM
To: [hidden email]
Subject: [SPSSX-L] Create Python list from unique values of SPSS string variable
There is also an example in the documentation that is provided with the SPSS 15 programmability plug-in that may be relevant:

*fetchall with Variable Index.
*python_cursor_fetchall_index.sps.
DATA LIST FREE /var1 var2 var3.
BEGIN DATA
1 2 3
1 4 5
2 5 7
END DATA.
BEGIN PROGRAM.
import spss
i=[0]
dataCursor=spss.Cursor(i)
oneVar=dataCursor.fetchall()
uniqueCount=len(set(oneVar))
print oneVar
print spss.GetVariableName(0), " has ", uniqueCount, " unique values."
#extending the example to get the actual list of values.
uniqueList=(set(oneVar))
print uniqueList
dataCursor.close()
end program.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Peck, Jon
Sent: Wednesday, October 11, 2006 4:10 PM
To: [hidden email]
Subject: Re: Create Python list from unique values of SPSS string variable
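
A note on turning the cursor result into the list the original question asks for. As far as the plug-in documentation describes it, fetchall() returns one tuple per case, so set(oneVar) holds one-element tuples rather than bare values, and a set has no defined order. The following is a minimal sketch, not part of the documented example: the helper name unique_values and the sample rows are hypothetical, and the rows merely stand in for what a cursor would return, so it can be tried without SPSS. It also assumes that string values may arrive padded with trailing blanks.

```python
def unique_values(rows, column=0):
    """Return the distinct values of one column, in first-seen order.

    `rows` stands in for the sequence of per-case tuples that a data
    cursor returns; trailing blanks on string values are stripped.
    """
    seen = set()
    result = []
    for row in rows:
        value = row[column]
        if isinstance(value, str):
            value = value.rstrip()
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Hypothetical sample mimicking the AIRPORT column from the question.
sample_rows = (("ABQ",), ("ABQ",), ("DFW",), ("DFW",), ("DFW",),
               ("ORD",), ("ORD",), ("MIA",), ("MIA",), ("MIA",))
airports = unique_values(sample_rows)
print(airports)
```

Keeping first-seen order is a design choice here; calling sorted() on the result would give alphabetical order instead.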
In reply to this post by Peck, Jon
Jon, would you mind pasting the url that has the answers? I've searched within the forums and haven't found them.
-Tiffany
In reply to this post by 410678
Wouldn't AGGREGATE be a much simpler solution?
Or at least run the Python code on the aggregated file. What do you need to do with this list later?
---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
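
The AGGREGATE suggestion can be sketched as follows, assuming the variable is named AIRPORT as in the original question. Using the variable as the BREAK variable yields one case per distinct value, so any Python code run afterwards only has to read a few hundred cases instead of 25,000. The helper name aggregate_syntax and the count variable N_CASES are hypothetical; the function only builds the command text, which would then be handed to spss.Submit inside a BEGIN PROGRAM block. Note that /OUTFILE=* replaces the active dataset.

```python
def aggregate_syntax(break_var, count_var="N_CASES"):
    """Build an AGGREGATE command giving one output case per distinct value.

    The returned text is plain SPSS syntax; nothing is executed here.
    """
    return (
        "AGGREGATE\n"
        "  /OUTFILE=*\n"
        "  /BREAK=%(break_var)s\n"
        "  /%(count_var)s=N.\n"
    ) % {"break_var": break_var, "count_var": count_var}


syntax = aggregate_syntax("AIRPORT")
print(syntax)
```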
In reply to this post by tko
Go to the SPSS Python programmability link,
http://www.ibm.com/developerworks/forums/forum.jspa?forumID=2300,
and search for airport. Bear in mind that this topic stems from 2006.
Any references to the old SPSS Developer Central site are obsolete.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: tko <[hidden email]>
To: [hidden email]
Date: 04/17/2012 09:37 AM
Subject: Re: [SPSSX-L] Create Python list from unique values of SPSS string variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Create-Python-list-from-unique-values-of-SPSS-string-variable-tp1071458p5646847.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Given the answer from the forum, I now have the below code. However, I need to run this on dozens of files, and sometimes the index variable will not be in all caps. Is there a way to apply IGNORECASE?

BEGIN PROGRAM.
import spss, spssdata
s = set()
data = spssdata.Spssdata(indexes=['RATING'])
for case in data:
    s.add(case.RATING)
ratinglist = list(s)
ratinglist.sort()
del data
for a in ratinglist:
    if a>1:
        ratingCode = int(a)
        spss.Submit(r"""
        IF (RATING=%(ratingCode)s & (S105_%(ratingCode)s=3 | S105_%(ratingCode)s=4)) Fam_filter = 0.
        """ %locals())
print 'There are ',len(ratinglist)-1,'companies in this list.'
END PROGRAM.

-----Original Message-----
Re: Create Python list from unique values of SPSS string variable
Apr 17, 2012; 12:11pm, by Jon K Peck
The indexes field in the spssdata.Spssdata line can be the position in the file (counting from zero), e.g.,
spssdata.Spssdata(indexes="5")
to select the sixth variable. Values would then be accessed as case[0].

If the position also varies, post back, and this can be generalized further.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: tko <[hidden email]>
To: [hidden email]
Date: 04/17/2012 05:25 PM
Subject: Re: [SPSSX-L] Create Python list from unique values of SPSS string variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>
Yes, the position does vary, so further generalization would be much appreciated.
Thank you, Tiffany
On Tuesday, April 17, 2012, Jon K Peck wrote:
>The indexes field in the spssdata.Spssdata line can be the position in the file (counting from zero).
This will give you the index of TARGET, target, TaRgEt, etc.

import spss, spssaux
targetindex = [v.upper() for v in spssaux.getVariableNamesList()].index("TARGET")

Regards,

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: Tiffany Ko <[hidden email]>
To: Jon K Peck/Chicago/IBM@IBMUS
Cc: "[hidden email]" <[hidden email]>
Date: 04/17/2012 10:02 PM
Subject: Re: [SPSSX-L] Create Python list from unique values of SPSS string variable
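
One caveat with the one-liner when it is run over dozens of files: list.index raises ValueError if no name matches. A minimal sketch of the same case-insensitive lookup that returns None instead is shown here. The helper name find_variable_index and the sample names are hypothetical; in an SPSS session the list of names would come from the variable dictionary rather than being typed in.

```python
def find_variable_index(names, target):
    """Return the zero-based position of `target` in `names`, ignoring case.

    Returns None when no name matches, instead of raising ValueError.
    """
    wanted = target.upper()
    for position, name in enumerate(names):
        if name.upper() == wanted:
            return position
    return None


# Hypothetical variable names, as they might appear in one of the files.
names = ["ID", "Rating", "S105_1", "S105_2"]
rating_index = find_variable_index(names, "RATING")
print(rating_index)
```

The returned position could then be used wherever a zero-based variable index is expected, after checking that it is not None.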
In reply to this post by 410678
hi,
Isn't it easier to AGGREGATE first, then fetch all the cases and take the set?

Regards,
Albert-Jan

On Wed, Apr 18, 2012 3:20 PM CEST Jon K Peck wrote:
>This will give you the index of TARGET. target, TaRgEt etc.
;-))
Thanks all. Below is my final code. Hope it's useful for another global market researcher :)

Background: The purpose is to create a set of the unique values of a single variable, because these unique values appear in the names of other variables within the same dataset. For example, the variable "Rating" has values 1, 2, 3, 4, which means the file also has the corresponding variables named S105_1, S105_2, S105_3, S105_4. I have hundreds of files with thousands of different values, and different corresponding variables, which is why this code has been especially helpful. Though I could aggregate into a separate file, it seems that would be a less efficient way to get to the final calculations.

COMPUTE Fam_filter = 0.
VARIABLE LABELS Fam_filter "Familiarity filter check".
VALUE LABELS Fam_filter 0 "Not familiar" 1 "Familiar".

BEGIN PROGRAM.
import spss, spssaux, spssdata
s = set()
# find the rating variable regardless of how its name is capitalized
ratingindex = [v.upper() for v in spssaux.getVariableNamesList()].index('RATING')
data = spssdata.Spssdata(indexes=[ratingindex])
for case in data:
    s.add(case[0])
ratinglist = list(s)
ratinglist.sort()
del data
# one IF command per rating value greater than 1
for a in ratinglist:
    if a>1:
        ratingcode = int(a)
        spss.Submit(r"""
        IF (RATING=%(ratingcode)s & (S105_%(ratingcode)s=3 | S105_%(ratingcode)s=4)) Fam_filter = 1.
        """ %locals())
print 'There are ',len(ratinglist)-1,'companies in this list.'
END PROGRAM.

CTABLES
  /VLABELS VARIABLES=RATING Fam_filter DISPLAY=DEFAULT
  /TABLE RATING [C] BY Fam_filter [COUNT F40.0]
  /CATEGORIES VARIABLES=RATING ORDER=A KEY=LABEL EMPTY=EXCLUDE
  /CATEGORIES VARIABLES=Fam_filter ORDER=A KEY=VALUE EMPTY=EXCLUDE.

In Reply To: Re: Create Python list from unique values of SPSS string variable
Apr 18, 2012; 10:56am, by David Marso
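
The command-generation step in the program above can be tried outside SPSS. This is a minimal sketch under stated assumptions: the helper name familiarity_commands is hypothetical, numeric values are assumed to arrive as floats (hence the int() conversion, so the variable name comes out as S105_2 rather than S105_2.0), and missing values are assumed to arrive as None and are skipped. Each returned string is what would be passed to spss.Submit.

```python
def familiarity_commands(rating_values, minimum=1):
    """Build one IF command per distinct rating value above `minimum`.

    None entries (assumed to represent missing values) are ignored.
    """
    distinct = sorted(set(v for v in rating_values if v is not None))
    commands = []
    for value in distinct:
        if value > minimum:
            code = int(value)  # 2.0 -> 2, so the name is S105_2
            commands.append(
                "IF (RATING=%(code)s & (S105_%(code)s=3 | S105_%(code)s=4)) "
                "Fam_filter = 1." % {"code": code}
            )
    return commands


# Hypothetical rating values as they might be read from the data file.
commands = familiarity_commands([1.0, 2.0, 2.0, None, 3.0, 4.0])
for command in commands:
    print(command)
```

Passing an explicit dictionary to the % operator, rather than locals(), keeps the template's inputs visible at the point of use.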