further comparison of crosstabulation categories


Paul Mcgeoghan
I have a customer who is comparing a dependent variable (e.g.,
Absence/Presence) against an independent variable, SITE, which has 11 categories.

He wants to produce a separate crosstabulation for each possible pair of
SITE categories.

Is there any syntax that can do this?
Otherwise, he needs to use Data > Select Cases to select each possible
pair first.

Thanks,
Paul

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: further comparison of crosstabulation categories

Jon K Peck

Here is a small Python program that does this.

GET  FILE='C:\spss18\Samples\English\1991 U.S. General Social Survey.sav'.
DATASET NAME GSS WINDOW=FRONT.

begin program.
import spss, spssdata, itertools
# Example uses life and happy. Change as needed, preserving variable name case.
# Note: the values are pasted into SELECT IF unquoted, so colvar must be numeric.
rowvar = "life"
colvar = "happy"
# find the distinct non-missing values of colvar
cursor = spssdata.Spssdata(colvar, omitmissing=True)
allvalues = set([case[0] for case in cursor])
cursor.CClose()
# run one crosstab for each pair of values, in sorted order
for val1, val2 in itertools.combinations(sorted(allvalues), 2):
  spss.Submit("""TEMPORARY.
SELECT IF %(colvar)s = %(val1)s OR %(colvar)s = %(val2)s.
CROSSTABS %(rowvar)s BY %(colvar)s
  /CELLS=COUNT COLUMN .""" % locals())
end program.
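
For the original question (SITE with 11 categories), the pairing logic can be tried outside SPSS first. The following is a minimal sketch rather than part of the program above: `build_pairwise_syntax` and the variable names `presence` and `site` are hypothetical, and the category codes are assumed to be the numbers 1 to 11. It only builds the syntax strings; inside a BEGIN PROGRAM block each string would be passed to spss.Submit.

```python
import itertools

def build_pairwise_syntax(rowvar, colvar, values):
    """Return one TEMPORARY / SELECT IF / CROSSTABS block per unordered pair of values.

    Hypothetical helper: assumes numeric category codes, since the values
    are pasted into the syntax without quotes.
    """
    template = (
        "TEMPORARY.\n"
        "SELECT IF %(colvar)s = %(val1)s OR %(colvar)s = %(val2)s.\n"
        "CROSSTABS %(rowvar)s BY %(colvar)s\n"
        "  /CELLS=COUNT COLUMN."
    )
    blocks = []
    for val1, val2 in itertools.combinations(sorted(values), 2):
        blocks.append(template % {"rowvar": rowvar, "colvar": colvar,
                                  "val1": val1, "val2": val2})
    return blocks

# Assumed example: 11 sites coded 1..11 give 11 * 10 / 2 = 55 pairs.
commands = build_pairwise_syntax("presence", "site", range(1, 12))
print(len(commands))
print(commands[0])
```

With 11 categories this produces 55 crosstabs, so it may be worth checking the count before submitting everything to SPSS.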

Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435



From: Paul McGeoghan <[hidden email]>
To: [hidden email]
Date: 06/14/2010 10:07 AM
Subject: [SPSSX-L] further comparison of crosstabulation categories
Sent by: "SPSSX(r) Discussion" <[hidden email]>






Re: further comparison of crosstabulation categories

Roberts, Michael-2

Jon,

I am not the primary recipient of your code, but it would be useful to me, so I tried running it and ran into a problem. I get the error “AttributeError: 'module' object has no attribute 'combinations'”.

I suspected that this error has something to do with the version of Python I am using (2.5, with SPSS 17.0.3). I checked, and it appears that itertools.combinations is new in Python 2.6. So my question is: can I drop Python 2.5 and use 2.6 to get this new functionality, or is version 17 irrevocably tied to the older Python?

TIA

Mike

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jon K Peck
Sent: Monday, June 14, 2010 6:13 PM
To: [hidden email]
Subject: Re: further comparison of crosstabulation categories

 



Re: further comparison of crosstabulation categories

Jon K Peck




From: Jon K Peck/Chicago/IBM
To: "Roberts, Michael" <[hidden email]>
Cc: "[hidden email]" <[hidden email]>
Date: 06/16/2010 07:17 AM
Subject: RE: further comparison of crosstabulation categories




I didn't realize that the combinations function was new in 2.6. Unfortunately, with V17 you have to stay with Python 2.5. However, the 2.6 help gives the following code as equivalent to the combinations function. If you save it to a file and import it, you should be able to get the same effect. I'm copying the text below, but I will also send it as an attachment to Michael, since the listserv often mangles line indentation.


def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    # indices must be a mutable list because it is updated in place below
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        # find the rightmost index that can still be advanced
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        # reset every index to the right of i to its smallest allowed value
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
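
Since the crosstab program only needs pairs, a shorter fallback may be enough on Python 2.5. This is a sketch under that assumption: `pairs` and `combos2` are hypothetical names, not part of itertools, and only the r = 2 case is covered.

```python
import itertools

def pairs(iterable):
    """Yield each unordered pair once, in the order combinations(iterable, 2) uses."""
    pool = tuple(iterable)
    for i in range(len(pool)):
        for j in range(i + 1, len(pool)):
            yield (pool[i], pool[j])

# Use the library function where it exists (Python 2.6 and later),
# otherwise fall back to the helper above.
if hasattr(itertools, "combinations"):
    def combos2(values):
        return itertools.combinations(values, 2)
else:
    combos2 = pairs

result = list(combos2([1, 2, 3, 4]))
print(result)
```

In the crosstab program, the loop would then iterate over `combos2(allvalues)` instead of `itertools.combinations(allvalues, 2)`.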

 
Jon Peck
SPSS, an IBM Company
[hidden email]
312-651-3435




From: "Roberts, Michael" <[hidden email]>
To: Jon K Peck/Chicago/IBM@IBMUS, "[hidden email]" <[hidden email]>
Date: 06/16/2010 07:05 AM
Subject: RE: further comparison of crosstabulation categories




