Dear list,
How can I count the number of non-filtered cases in Python? Or is there a different approach that is more practical?

I am dealing with data that changes and has varying quality. I have many production scripts that run syntax files making lists of cases that must be dealt with. If there are no cases I get a warning: "No cases were input to this procedure". I would like to get rid of these warnings.

My plan is to make a small Python script that runs an SPSS command only if there are cases. My problem is that spss.GetCaseCount() counts all cases, regardless of whether they are filtered out. I want to count only the valid, non-filtered cases.

* First I identify duplicates and unique cases.
* Then unique cases are filtered away, so that only duplicates are shown.
* Then the Python script conditionally runs SUMMARIZE.

BEGIN PROGRAM.
import spss
numberofcases = spss.GetCaseCount()
print numberofcases
if numberofcases > 0:
    try:
        spss.Submit("""
ECHO "list of duplicates".
SUMMARIZE
  /TABLES=var1 var2 var3
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT .
""")
    except:
        print "Something went wrong."
else:
    print "There are no duplicates."
END PROGRAM.

Sincerely,
Eero Olli

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
There is no API to get the count of filtered
cases, because that is not known without a data pass. So you would
have to do a data pass to find out whether there are zero cases.
The quickest way to do that would be to run, say, AGGREGATE into a new dataset. But that will also trigger a warning in the log if there are no input cases. At least that would be separate from the procedure output, and you could even suppress it by using OMS with Viewer=no.

Alternatively, you could pass the data through your Python code, which will then see only the unfiltered cases.

begin program.
import spss, spssdata
curs = spssdata.Spssdata()
for case in curs:
    havecases = True
    break   # got at least one case
else:
    # the loop ended without a break, so the cursor returned no cases
    havecases = False
    print "There are no cases"
curs.CClose()
if havecases:
    ...
end program.

Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435

From: Eero Olli <[hidden email]>
To: [hidden email]
Date: 02/23/2011 08:07 AM
Subject: [SPSSX-L] run Summarize only if there are non-filtered cases - a python script
Sent by: "SPSSX(r) Discussion" <[hidden email]>
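The for/else check in the reply above can also be written as a small reusable helper. The following is only a sketch: the names has_any and run_if_any are hypothetical and are not part of the spss or spssdata modules, and plain Python iterables stand in for the data cursor so that the example runs outside SPSS.

```python
_MISSING = object()  # unique marker meaning "the iterator produced nothing"


def has_any(rows):
    """Return True if `rows` yields at least one item.

    Only the first item is requested, so the check stops after one case
    instead of reading the whole dataset.
    """
    return next(iter(rows), _MISSING) is not _MISSING


def run_if_any(rows, action):
    """Call `action()` only when `rows` is non-empty; return whether it ran."""
    if has_any(rows):
        action()
        return True
    print("There are no cases")
    return False


if __name__ == "__main__":
    # Stand-ins for a data cursor: one with duplicate cases, one empty.
    ran = []
    run_if_any(iter([("a", 1), ("a", 1)]), lambda: ran.append("SUMMARIZE"))
    run_if_any(iter([]), lambda: ran.append("SUMMARIZE"))
    print(ran)
```

Inside BEGIN PROGRAM, the same idea would presumably be applied by passing the spssdata.Spssdata() cursor as rows, calling curs.CClose() after the check, and using spss.Submit(...) as the action. That combination is an assumption and has not been tested against SPSS here. The sentinel object is used so that a first case that happens to be empty or falsy still counts as a case.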