Hi List,
A few months ago, I posted a message about matching cases between an experimental group and a group selected from an epidemiological study. One of you suggested that I try Raynald Levesque's syntax titled "Matching cases on basis of Propensity scores", available at http://www.spsstools.net/. I decided to try this solution and ran into a few problems that I'm hoping someone can figure out.

For the purposes of the match, I renamed all of the relevant variables in my data to the names used in Ray's macro/syntax. Ray's sample syntax and data work fine, but when I run the syntax with my data, I get around 75 or 80 different files titled "results_1.....results_80, etc." in my C:\temp folder, and SPSS hangs until I close the program using Windows Task Manager.

The experimental group in my data has 126 participants and the control group has 5709 participants. I made only two changes to the syntax from Ray's site: I left off the first part, in which he creates a sample data file, and I changed the number of treatment cases to 126. If I step through the syntax piece by piece, the problem appears when the program reaches the line that reads !match nbtreat=126 .

Is this a computer memory issue? Does anybody know how I can solve this problem? I'm running SPSS 15.0.1 on Windows Vista, with 2 GB of RAM and an Intel Core 2 Duo processor. My syntax is posted below; the original syntax that I downloaded from Ray's site is below that.

Thanks.
Daryl

My Syntax:

************************************************************************
************************************************************************
Matching Syntax from Ray Levesque begins
************************************************************************
************************************************************************.

SORT CASES BY treatm(D) propen.
COMPUTE idx=$CASENUM.
SAVE OUTFILE='c:\temp\mydata.sav'.

* Erase the previous temporary result file, if any.
ERASE FILE='c:\temp\results.sav'.
COMPUTE key=1.
SELECT IF (1=0).
* Create an empty data file to receive results.
SAVE OUTFILE='c:\temp\results.sav'.

********************************************.
* Define a macro which will do the job.
********************************************.

SET MPRINT=no.
*////////////////////////////////.
DEFINE !match (nbtreat=!TOKENS(1))
!DO !cnt=1 !TO !nbtreat
GET FILE='c:\temp\mydata.sav'.
SELECT IF idx=!cnt OR treatm=0.
DO IF $CASENUM=1.
COMPUTE #target=propen.
ELSE.
COMPUTE delta=propen-#target.
END IF.
EXECUTE.
SELECT IF ~MISSING(delta).
IF (delta<0) delta=-delta.
SORT CASES BY delta.
SELECT IF $CASENUM=1.
COMPUTE key=!cnt.
ADD FILES FILE=* /FILE='c:\temp\results.sav'.
SAVE OUTFILE='c:\temp\results.sav'.
!DOEND
!ENDDEFINE.
*////////////////////////////////.
SET MPRINT=yes.

**************************.
* Call macro (we know that there are 126 treatment cases).
**************************.
!match nbtreat=126 .

* Sort results file to allow matching.
GET FILE='c:\temp\results.sav'.
SORT CASES BY key.
SAVE OUTFILE='c:\temp\results.sav'.

* Match each treatment cases with the most similar non treatment case.
GET FILE='c:\temp\mydata.sav'.
MATCH FILES /FILE=*
 /FILE='C:\Temp\results.sav'
 /RENAME (idx = d0) caseid=caseid2 improv=improv2 propen=propen2
 treatm=treatm2 key=idx
 /BY idx
 /DROP= d0.
EXECUTE.

* That's it!.
Original Syntax from Ray Levesque's site:

* The solution assumes that the number of cases receiving the treatment is known.
* This could restriction could be removed if necessary.

* Create a data file for illustration purposes.
INPUT PROGRAM.
SET SEED=2365847.
LOOP caseid=1 TO 20.
COMPUTE treatm=TRUNC(UNIFORM(1)+.5).
COMPUTE propen=UNIFORM(100).
COMPUTE improv=UNIFORM(100).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

SORT CASES BY treatm(D) propen.
COMPUTE idx=$CASENUM.
SAVE OUTFILE='c:\temp\mydata.sav'.

* Erase the previous temporary result file, if any.
ERASE FILE='c:\temp\results.sav'.
COMPUTE key=1.
SELECT IF (1=0).
* Create an empty data file to receive results.
SAVE OUTFILE='c:\temp\results.sav'.

********************************************.
* Define a macro which will do the job.
********************************************.

SET MPRINT=no.
*////////////////////////////////.
DEFINE !match (nbtreat=!TOKENS(1))
!DO !cnt=1 !TO !nbtreat
GET FILE='c:\temp\mydata.sav'.
SELECT IF idx=!cnt OR treatm=0.
DO IF $CASENUM=1.
COMPUTE #target=propen.
ELSE.
COMPUTE delta=propen-#target.
END IF.
EXECUTE.
SELECT IF ~MISSING(delta).
IF (delta<0) delta=-delta.
SORT CASES BY delta.
SELECT IF $CASENUM=1.
COMPUTE key=!cnt.
ADD FILES FILE=* /FILE='c:\temp\results.sav'.
SAVE OUTFILE='c:\temp\results.sav'.
!DOEND
!ENDDEFINE.
*////////////////////////////////.
SET MPRINT=yes.

**************************.
* Call macro (we know that there are 7 treatment cases).
**************************.
!match nbtreat=7.

* Sort results file to allow matching.
GET FILE='c:\temp\results.sav'.
SORT CASES BY key.
SAVE OUTFILE='c:\temp\results.sav'.

* Match each treatment cases with the most similar non treatment case.
GET FILE='c:\temp\mydata.sav'.
MATCH FILES /FILE=*
 /FILE='C:\Temp\results.sav'
 /RENAME (idx = d0) caseid=caseid2 improv=improv2 propen=propen2
 treatm=treatm2 key=idx
 /BY idx
 /DROP= d0.
EXECUTE.

* That's it!.
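
For readers who want to check the matching logic independently of SPSS, the loop inside the !match macro can be sketched in Python. This is only an illustration of what the syntax appears to do (for each treated case, pick the control with the smallest absolute difference in propen, reusing controls across treated cases); it is not Ray's code and says nothing about why the results files multiply. The function name match_nearest, the list-of-dicts layout, and the sample values are assumptions made for the example; only the variable names caseid, treatm and propen come from the syntax.

```python
from typing import Any, Dict, List, Tuple


def match_nearest(cases: List[Dict[str, Any]]) -> List[Tuple[int, int, float]]:
    """Pair each treated case with the control whose propen is closest.

    Controls are reused (matching with replacement), as in the macro,
    so one control can be returned for several treated cases.
    Returns (treated caseid, control caseid, delta) tuples.
    """
    # Like SORT CASES BY treatm(D) propen: treated cases in ascending propen order.
    treated = sorted(
        (c for c in cases if c["treatm"] == 1), key=lambda c: c["propen"]
    )
    controls = [c for c in cases if c["treatm"] == 0]
    if not controls:
        raise ValueError("no control cases (treatm == 0) to match against")

    matches = []
    for t in treated:
        # delta = |propen - target|; keep the control with the smallest delta.
        # Ties go to the first control in list order (an assumption of this
        # sketch; the SPSS sort may order tied cases differently).
        nearest = min(controls, key=lambda c: abs(c["propen"] - t["propen"]))
        delta = abs(nearest["propen"] - t["propen"])
        matches.append((t["caseid"], nearest["caseid"], delta))
    return matches


if __name__ == "__main__":
    # Hypothetical data, not taken from the study described above.
    sample = [
        {"caseid": 1, "treatm": 1, "propen": 10.0},
        {"caseid": 2, "treatm": 0, "propen": 12.0},
        {"caseid": 3, "treatm": 0, "propen": 45.0},
    ]
    for treated_id, control_id, delta in match_nearest(sample):
        print(treated_id, control_id, delta)
```

With 126 treated cases and 5709 controls this is a few hundred thousand comparisons, which suggests the matching step itself is small; the sketch works in memory, whereas the macro re-reads and re-saves files on every iteration.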
====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD.
Hi all,
OK, I was able to make Ray's syntax work by using ADD FILES on all of the results_1 ... results_126 files, merging them into one file (c:\temp\results), and then proceeding with the rest of the syntax from Ray's site. Not an elegant solution, but it works.

(Just a note: the matching syntax that I modified actually returned 127 results files, each containing one case, except for results_127, which was empty and contained no data.)

That leaves me with the question of why the original syntax didn't merge all of these cases into one results file with my data, when it worked as expected with Ray's sample data. Does anyone have any thoughts on this?

Daryl
