SPSSX Discussion

run Summarize only if there are non-filtered cases - a python script

Eero Olli

run Summarize only if there are non-filtered cases - a python script

Dear list,

How can I count the number of non-filtered cases in Python? Or is there a different, more practical approach?

I am dealing with data that changes and has varying quality. I have many production scripts that run syntax files producing lists of cases that must be dealt with. If there are no cases, I get the warning "No cases were input to this procedure". I would like to get rid of these warnings.

My plan is to make a small Python script that runs an SPSS command only if there are cases. My problem is that spss.GetCaseCount() counts all cases, regardless of whether they are filtered out. I want to count only the valid, non-filtered cases.

* First I identify duplicates and unique cases.
* Then unique cases are filtered away, so that only duplicates are shown.
* Then the Python script conditionally runs SUMMARIZE.

BEGIN PROGRAM.
import spss
# Note: GetCaseCount() counts all cases, including filtered-out ones.
numberofcases = spss.GetCaseCount()
print(numberofcases)
if numberofcases > 0:
    try:
        spss.Submit("""
ECHO "list of duplicates".
SUMMARIZE
  /TABLES=var1 var2 var3
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT .
""")
    except Exception:
        print("Something went wrong.")
else:
    print("There are no duplicates.")
END PROGRAM.

Sincerely,

Eero Olli

Jon K Peck

Re: run Summarize only if there are non-filtered cases - a python script

There is no API to get the count of cases that remain after filtering, because that is not known without a data pass. So you would have to do a data pass to find out whether there are zero cases.

The quickest way to do that would be to run, say, AGGREGATE into a new dataset. But that will also trigger a warning in the log if there are no input cases. At least that would be separate from the procedure output. But you could even suppress that by using OMS with Viewer=no.

Alternatively, you could pass the data through your Python code, which will then see only the cases that pass the filter.

begin program.
import spss, spssdata
curs = spssdata.Spssdata()
for case in curs:
    havecases = True
    break  # got at least one case
else:
    # the loop body never ran, so no case passed the filter
    havecases = False
    print("There are no cases")
curs.CClose()  # close the cursor before submitting any syntax

if havecases:
    ...
end program.
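
If the same check is needed in many production jobs, the first-case test above could be wrapped in a small helper. What follows is only a sketch, not part of the spss or spssdata modules: the names has_any_case and run_if_cases are hypothetical, and the cursor factory and submit function are passed in as arguments so the logic can be tried outside SPSS. The cursor is closed before anything is submitted, as in the program above.

```python
def has_any_case(cursor):
    """Return True if iterating `cursor` yields at least one case."""
    for _case in cursor:
        return True  # stop at the first case; no full count is needed
    return False


def run_if_cases(syntax, cursor_factory, submit):
    """Submit `syntax` only if the cursor yields at least one case.

    `cursor_factory` and `submit` are hypothetical parameters for this
    sketch; inside SPSS they would be spssdata.Spssdata and spss.Submit.
    Returns True if the syntax was submitted, False otherwise.
    """
    cursor = cursor_factory()
    try:
        found = has_any_case(cursor)
    finally:
        # Close the cursor (if it has a CClose method) before submitting.
        close = getattr(cursor, "CClose", None)
        if close is not None:
            close()
    if found:
        submit(syntax)
    return found


# Intended use inside BEGIN PROGRAM (assumes spss and spssdata are imported):
# if not run_if_cases(summarize_syntax, spssdata.Spssdata, spss.Submit):
#     print("There are no duplicates.")
```

Because the helper returns as soon as one case is seen, the data pass stops at the first case that passes the filter.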

Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435

From: Eero Olli <[hidden email]>
To: [hidden email]
Date: 02/23/2011 08:07 AM
Subject: [SPSSX-L] run Summarize only if there are non-filtered cases - a python script
Sent by: "SPSSX(r) Discussion" <[hidden email]>
