run Summarize only if there are non-filtered cases - a python script


Eero Olli
Dear list,

How can I count the number of non-filtered cases in Python? Or is there a different approach that is more practical?

I am dealing with data that changes and has varying quality. I have many production scripts that run syntax files making lists of cases that must be dealt with. If there are no cases I get the warning "No cases were input to this procedure". I would like to get rid of these warnings.

My plan is to make a small Python script that runs an SPSS command only if there are cases. My problem is that spss.GetCaseCount() counts all cases, regardless of whether they have been filtered out. I want to count only the valid, non-filtered cases.

* First I identify duplicates and unique cases.
* Then unique cases are filtered away, so that only duplicates are shown.
* Then the Python script conditionally runs SUMMARIZE.

BEGIN PROGRAM.
import spss
# Note: GetCaseCount() counts all cases, including filtered-out ones.
numberofcases = spss.GetCaseCount()
print(numberofcases)
if numberofcases > 0:
    try:
        spss.Submit("""
ECHO "list of duplicates".
SUMMARIZE
  /TABLES=var1 var2 var3
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT .
""")
    except Exception:
        print("Something went wrong.")
else:
    print("There are no duplicates.")
END PROGRAM.


Sincerely,

Eero Olli

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: run Summarize only if there are non-filtered cases - a python script

Jon K Peck
There is no API that returns the number of unfiltered cases, because that number is not known without a data pass. So you would have to do a data pass to find out whether there are zero cases.

The quickest way to do that would be to run, say, AGGREGATE into a new dataset. That will also trigger a warning in the log if there are no input cases, but at least the warning would be separate from the procedure output. You could even suppress it by using OMS with VIEWER=NO.

Alternatively, you could pass the data through your Python code, which will then see only the cases that pass the filter.
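begin program.
import spss, spssdata

# The cursor passes only the cases that survive the current filter.
curs = spssdata.Spssdata()
for case in curs:
    havecases = True
    break   # got at least one case
else:
    # The loop finished without a break, so no case was read.
    havecases = False
    print("There are no cases")
curs.CClose()

if havecases:
 ...
end program.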
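
Putting the pieces together, a minimal sketch of the whole check might look like the code below, placed between BEGIN PROGRAM and END PROGRAM. The helper name has_cases and the constant SUMMARIZE_SYNTAX are made up for illustration; the SUMMARIZE syntax is the one from the original question. The sketch assumes, as described above, that iterating over spssdata.Spssdata() yields only the cases that pass the filter. The ImportError guard is there only so that the helper can be tried outside SPSS.

```python
# Sketch only: has_cases and SUMMARIZE_SYNTAX are illustrative names,
# not part of the spss or spssdata modules.
try:
    import spss, spssdata      # available only inside SPSS Statistics
except ImportError:
    spss = spssdata = None     # lets has_cases be tried elsewhere

SUMMARIZE_SYNTAX = """
ECHO "list of duplicates".
SUMMARIZE
  /TABLES=var1 var2 var3
  /TITLE='Case Summaries'
  /MISSING=VARIABLE
  /CELLS=COUNT .
"""

def has_cases(cases):
    """Return True as soon as the iterable yields one case."""
    for _ in cases:
        return True
    return False

if spss is not None:
    curs = spssdata.Spssdata()  # assumed to honor the current filter
    try:
        found = has_cases(curs)
    finally:
        curs.CClose()           # close the cursor before submitting syntax
    if found:
        spss.Submit(SUMMARIZE_SYNTAX)
    else:
        print("There are no duplicates.")
```

The cursor is closed before spss.Submit is called because syntax cannot be submitted while a data cursor is still open. Reading stops at the first case, so the check does not need a full data pass when cases are present.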


Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435




From:        Eero Olli <[hidden email]>
To:        [hidden email]
Date:        02/23/2011 08:07 AM
Subject:        [SPSSX-L] run Summarize only if there are non-filtered cases - a python script
Sent by:        "SPSSX(r) Discussion" <[hidden email]>



