SPSSX Discussion

Matching Cases on Basis of Propensity Scores


Daryl Schrock

Matching Cases on Basis of Propensity Scores

Hi List,

A few months ago, I posted a message about matching cases between an
experimental group and a group selected from an epidemiological study.
One of you suggested that I try Raynald Levesque's syntax titled
"Matching cases on basis of Propensity scores", available at
http://www.spsstools.net/. I decided to try this solution and
encountered a few problems that I'm hoping someone can figure out. For
the purposes of the match, I renamed all of the relevant variables in my
data to the names used in Ray's macro/syntax. Ray's sample syntax and
data work fine, but when I run the syntax with my data, I get around 75
or 80 different files named results_1 ... results_80, etc. in my C:\temp
folder, and SPSS hangs until I close the program using Windows Task
Manager.

The experimental group in my data has 126 participants and the control
group has 5709 participants. I changed only two things in the syntax
from Ray's site: I left off the first part, in which he creates a sample
data file, and I changed the number of treatment cases to 126. If I step
through the syntax piece by piece, the problem appears when execution
reaches the macro call:

!match nbtreat=126 .

Is this a computer memory issue? Does anybody know how I can solve this
problem? I'm running SPSS 15.0.1 on Windows Vista, with 2 gigs of RAM
and an Intel Core2 Duo processor.

My syntax is posted below; the original syntax that I downloaded from
Ray's site is below that. Thanks.

Daryl

My Syntax:

************************************************************************.
* Matching Syntax from Ray Levesque begins.
************************************************************************.

SORT CASES BY treatm(D) propen.
COMPUTE idx=$CASENUM.
SAVE OUTFILE='c:\temp\mydata.sav'.

* Erase the previous temporary result file, if any.
ERASE FILE='c:\temp\results.sav'.
COMPUTE key=1.
SELECT IF (1=0).
* Create an empty data file to receive results.
SAVE OUTFILE='c:\temp\results.sav'.

********************************************.
* Define a macro which will do the job.
********************************************.

SET MPRINT=no.
*////////////////////////////////.
DEFINE !match (nbtreat=!TOKENS(1))
!DO !cnt=1 !TO !nbtreat
GET FILE='c:\temp\mydata.sav'.
SELECT IF idx=!cnt OR treatm=0.
DO IF $CASENUM=1.
COMPUTE #target=propen.
ELSE.
COMPUTE delta=propen-#target.
END IF.
EXECUTE.
SELECT IF ~MISSING(delta).
IF (delta<0) delta=-delta.
SORT CASES BY delta.
SELECT IF $CASENUM=1.
COMPUTE key=!cnt.
ADD FILES FILE=*
  /FILE='c:\temp\results.sav'.
SAVE OUTFILE='c:\temp\results.sav'.
!DOEND
!ENDDEFINE.
*////////////////////////////////.
SET MPRINT=yes.

**************************.
* Call macro (we know that there are 126 treatment cases).
**************************.

!match nbtreat=126 .

* Sort results file to allow matching.
GET FILE='c:\temp\results.sav'.
SORT CASES BY key.
SAVE OUTFILE='c:\temp\results.sav'.

* Match each treatment case with the most similar non treatment case.
GET FILE='c:\temp\mydata.sav'.
MATCH FILES /FILE=*
  /FILE='C:\Temp\results.sav'
  /RENAME (idx = d0) caseid=caseid2 improv=improv2 propen=propen2
    treatm=treatm2 key=idx
  /BY idx
  /DROP= d0.
EXECUTE.

* That's it!.

Original Syntax from Ray Levesque's site:

* The solution assumes that the number of cases receiving the treatment
  is known.
* This restriction could be removed if necessary.

* Create a data file for illustration purposes.
INPUT PROGRAM.
SET SEED=2365847.
LOOP caseid=1 TO 20.
COMPUTE treatm=TRUNC(UNIFORM(1)+.5).
COMPUTE propen=UNIFORM(100).
COMPUTE improv=UNIFORM(100).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

SORT CASES BY treatm(D) propen.
COMPUTE idx=$CASENUM.
SAVE OUTFILE='c:\temp\mydata.sav'.

* Erase the previous temporary result file, if any.
ERASE FILE='c:\temp\results.sav'.
COMPUTE key=1.
SELECT IF (1=0).
* Create an empty data file to receive results.
SAVE OUTFILE='c:\temp\results.sav'.

********************************************.
* Define a macro which will do the job.
********************************************.

SET MPRINT=no.
*////////////////////////////////.
DEFINE !match (nbtreat=!TOKENS(1))
!DO !cnt=1 !TO !nbtreat
GET FILE='c:\temp\mydata.sav'.
SELECT IF idx=!cnt OR treatm=0.
DO IF $CASENUM=1.
COMPUTE #target=propen.
ELSE.
COMPUTE delta=propen-#target.
END IF.
EXECUTE.
SELECT IF ~MISSING(delta).
IF (delta<0) delta=-delta.
SORT CASES BY delta.
SELECT IF $CASENUM=1.
COMPUTE key=!cnt.
ADD FILES FILE=*
  /FILE='c:\temp\results.sav'.
SAVE OUTFILE='c:\temp\results.sav'.
!DOEND
!ENDDEFINE.
*////////////////////////////////.
SET MPRINT=yes.

**************************.
* Call macro (we know that there are 7 treatment cases).
**************************.

!match nbtreat=7.

* Sort results file to allow matching.
GET FILE='c:\temp\results.sav'.
SORT CASES BY key.
SAVE OUTFILE='c:\temp\results.sav'.

* Match each treatment case with the most similar non treatment case.
GET FILE='c:\temp\mydata.sav'.
MATCH FILES /FILE=*
  /FILE='C:\Temp\results.sav'
  /RENAME (idx = d0) caseid=caseid2 improv=improv2 propen=propen2
    treatm=treatm2 key=idx
  /BY idx
  /DROP= d0.
EXECUTE.

* That's it!.
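
For readers who want to check the matching logic outside SPSS, here is a
minimal Python sketch of what the macro above appears to do: for each
treated case, it picks the control case whose propensity score is closest
in absolute value. The variable names caseid, treatm and propen come from
the syntax; the function name match_nearest and the sample data are
illustrative assumptions, not part of Ray's syntax. As in the macro, the
full control pool is considered on every pass, so one control can be
matched to more than one treated case. Tie-breaking may differ from what
SORT CASES does in SPSS.

```python
from typing import Dict, List, Tuple


def match_nearest(cases: List[Dict]) -> List[Tuple[int, int, float]]:
    """Pair each treated case with the control closest in propensity score.

    Each case is a dict with 'caseid', 'treatm' (1 = treated, 0 = control)
    and 'propen'. Returns (treated caseid, control caseid, abs. difference).
    Controls are reused across treated cases (matching with replacement).
    """
    treated = [c for c in cases if c["treatm"] == 1]
    controls = [c for c in cases if c["treatm"] == 0]
    if not controls:
        return []  # nothing to match against

    pairs = []
    for t in treated:
        # Same role as "delta" in the macro: absolute score difference.
        best = min(controls, key=lambda c: abs(c["propen"] - t["propen"]))
        delta = abs(best["propen"] - t["propen"])
        pairs.append((t["caseid"], best["caseid"], delta))
    return pairs


# Made-up illustration data, not taken from the thread.
sample = [
    {"caseid": 1, "treatm": 1, "propen": 20.0},
    {"caseid": 2, "treatm": 1, "propen": 70.0},
    {"caseid": 3, "treatm": 0, "propen": 25.0},
    {"caseid": 4, "treatm": 0, "propen": 64.0},
    {"caseid": 5, "treatm": 0, "propen": 90.0},
]
pairs = match_nearest(sample)
for treated_id, control_id, delta in pairs:
    print(treated_id, control_id, delta)
```

Note that the sketch derives the number of treated cases from the data
instead of taking it as a parameter, which is one way the restriction
mentioned in the comments at the top of the original syntax could be
lifted.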

====================To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Daryl Schrock

Re: Matching cases on basis of Propensity Scores

Hi all,

OK, I was able to make Ray's syntax work by running ADD FILES on all of
the results_1 ... results_126 files to merge them into one file
(c:\temp\results) and then proceeding with the rest of the syntax from
Ray's site. Not an elegant solution, but it works. (Just a note: the
matching syntax that I modified actually produced 127 results files,
each containing one case, except for results_127, which was empty.)

But I'm left with the question of why the original syntax didn't merge
all of these cases into one results file when using my data, when it
worked as expected with Ray's sample data. Does anyone have any thoughts
on this?

Daryl
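
The manual merge described in this message can be sketched in Python as
follows. This is only an illustration under stated assumptions: each
results file is represented as an in-memory list of rows rather than read
from disk, and the name merge_results and the sample rows are
hypothetical. The sketch concatenates the per-pass results, lets empty
ones (like results_127) drop out, sorts by key, and checks that every
treated case from 1 to n_treated received exactly one match.

```python
from typing import Dict, List


def merge_results(batches: List[List[Dict]], n_treated: int) -> List[Dict]:
    """Combine per-pass result batches into one list sorted by 'key'.

    Empty batches contribute nothing. Raises ValueError unless the keys
    are exactly 1..n_treated, each appearing once.
    """
    merged = [row for batch in batches for row in batch]
    merged.sort(key=lambda row: row["key"])

    keys = [row["key"] for row in merged]
    if keys != list(range(1, n_treated + 1)):
        raise ValueError(f"expected keys 1..{n_treated}, got {keys}")
    return merged


# Made-up illustration data: two one-row batches and one empty batch.
batches = [
    [{"key": 2, "caseid": 40}],
    [{"key": 1, "caseid": 17}],
    [],
]
merged = merge_results(batches, n_treated=2)
print([row["caseid"] for row in merged])
```

The key check is there because a silently missing or duplicated match
would misalign the later MATCH FILES step, which joins on that key.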
