Case Control Matching Fuzzy

maartenmeerkamp
Dear SPSS-experts,

I am having difficulties matching and merging two datasets for a case control study.

The goal:
I have two patient databases, one with the intervention group and one with controls. I would like to match the intervention group with the control group. Therefore I have written this syntax:

get file="/Users/xxx/xxx/control.sav".
dataset name supplier.
get file="/Users/xxx/xxx/intervention.sav".
dataset name demander.
fuzzy demanderds=demander supplierds=supplier
by= Age Weight
fuzz= 5 5
supplierid = Casenr
newdemanderidvars=supplierId.

Question 1:
Is it correct that, for each intervention patient, the command searches for a control patient whose Age is within ±5 years and whose Weight is within ±5 kg? I ask because my output reports 260 fuzzy matches (out of a total of 500), even when I set the fuzz to 50 and 50. In that case every patient should match every individual from the other group, so I would expect the maximum number of fuzzy matches (500).

Question 2:
When I run the syntax in my demander database 2 new variables appear:
- matchgroup
- supplierid
I do not understand what these variables say.

Question 3
How can I merge the intervention with the matched control patients into one database?

Excuse me if the questions are really easy, but I can't seem to find the answer anywhere...

Thanks a lot!

Re: Case Control Matching Fuzzy

xenia
Hi,
Could you explain what the fuzz is?

Re: Case Control Matching Fuzzy

xenia
In reply to this post by maartenmeerkamp
Also, what do you mean by "is it correct that a control patient.... will be searched"? I assume you want to find controls for your cases; the matching variables are chosen by you. If you want the matching to be done on age and weight (I don't know what your dependent variable is), then that's fine.

Re: Case Control Matching Fuzzy

maartenmeerkamp
This is what the SPSS help function told me:
By default, a match is defined by identical values for all the BY variables.  A system-missing value prevents a case from being matched.  Fuzzy matching is also available for numeric variables. Specify FUZZ=list-of-matching tolerances.  There must be one fuzz value for each BY variable, listed in BY-value order.
A tolerance is the maximum difference in either direction that is allowed for a match.  Thus, values of 1 and 2 would match if tolerance is 1 or more, and a tolerance of zero means an exact match on that variable.  You must use 0 for any string variable.
By default, with fuzzy matching, an exact match is first tried, and then a fuzzy match is tried. There is no attempt to get the closest fuzzy match, just a match within the tolerance.
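
The rule in that help text can be pictured with a small Python sketch. This is not the FUZZY code itself, only an illustration of the description above; the function names and the toy records are made up, and `None` stands in for a system-missing value.

```python
def within_tolerance(demander, supplier, by, fuzz):
    """True if every BY variable differs by at most its fuzz value."""
    for var, tol in zip(by, fuzz):
        a, b = demander.get(var), supplier.get(var)
        if a is None or b is None:
            return False  # a missing value prevents a match
        # A tolerance of 0 means an exact match (the only option for strings).
        mismatch = (a != b) if tol == 0 else (abs(a - b) > tol)
        if mismatch:
            return False
    return True


def first_match(demander, suppliers, by, fuzz):
    """Try an exact match first, then take the first supplier within tolerance."""
    exact = [0] * len(by)
    for tolerances in (exact, fuzz):
        for supplier in suppliers:
            if within_tolerance(demander, supplier, by, tolerances):
                return supplier
    return None


case = {"Age": 60, "Weight": 80}
controls = [
    {"Casenr": 1, "Age": 64, "Weight": 84},
    {"Casenr": 2, "Age": 61, "Weight": 80},
    {"Casenr": 3, "Age": 60, "Weight": 80},
]
by, fuzz = ["Age", "Weight"], [5, 5]

with_exact = first_match(case, controls, by, fuzz)         # exact match is preferred
without_exact = first_match(case, controls[:2], by, fuzz)  # first one within tolerance
```

In the second call the sketch returns control 1 even though control 2 is closer, which mirrors the statement that there is no attempt to find the closest fuzzy match.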


Re: Case Control Matching Fuzzy

xenia
I'm not sure you can do case-control matching in SPSS; others may know of a way.

Which help file is this? That is, did you go to the Help Topics from within SPSS, and if so, which keyword did you search for?

Re: Case Control Matching Fuzzy

maartenmeerkamp
Entering the syntax FUZZY /HELP gave me the information.

I read that you can match in SPSS in this post:
http://www.ibm.com/developerworks/forums/thread.jspa?messageID=14523917

Re: Case Control Matching Fuzzy

maartenmeerkamp
I found out that supplierid holds the case number of the record in the other database that was matched to the case in that row. My main questions now are:

Q1
How do I merge the files? That is, how do I copy only the matched cases into one database rather than the whole file?
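
For Q1, the logic of keeping only matched pairs can be sketched in Python. This only illustrates the idea; in SPSS the actual work would be done with file-merging commands such as ADD FILES or MATCH FILES. `supplierid` and `Casenr` are the names used in this thread, while `stack_matched`, `group`, `pair`, and the toy records are hypothetical.

```python
def stack_matched(demanders, suppliers, link_var="supplierid", id_var="Casenr"):
    """Return each matched case followed by the control it points to."""
    controls_by_id = {s[id_var]: s for s in suppliers}
    rows = []
    for d in demanders:
        sid = d.get(link_var)
        if sid is None or sid not in controls_by_id:
            continue  # unmatched case: left out of the combined file
        # 'group' and 'pair' are illustrative helper columns, not FUZZY output.
        rows.append(dict(d, group="intervention", pair=sid))
        rows.append(dict(controls_by_id[sid], group="control", pair=sid))
    return rows


intervention = [
    {"Casenr": 101, "Age": 60, "supplierid": 7},
    {"Casenr": 102, "Age": 45, "supplierid": None},  # no control was found
]
control = [
    {"Casenr": 7, "Age": 62},
    {"Casenr": 8, "Age": 30},  # never used as a match
]

combined = stack_matched(intervention, control)
```

Only case 101 and control 7 end up in `combined`; the unmatched case and the unused control are dropped.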

Q2
More importantly, I found that increasing the FUZZ values hardly increases the number of matched cases. I would expect a greater fuzz to give a greater tolerance and therefore more matches, but that is not what I see, and sometimes the opposite happens. Can someone explain to me how this works?
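
One possible reading of Q2, offered as a guess rather than a diagnosis: if each control can be handed out only once, the number of matches is capped by the number of usable controls, and cases with a missing BY value never match, so widening the fuzz stops helping once that cap is reached. Whether controls are reused depends on the command's options, which are not shown in this thread. The Python sketch below is not the FUZZY implementation; `greedy_pairs` and the toy ages are made up for illustration.

```python
def greedy_pairs(demanders, suppliers, by, fuzz):
    """Pair each demander with the first unused supplier within tolerance."""
    used = set()  # indexes of suppliers that are already taken
    pairs = []
    for d in demanders:
        for i, s in enumerate(suppliers):
            if i in used:
                continue
            ok = all(
                d[v] is not None and s[v] is not None and abs(d[v] - s[v]) <= t
                for v, t in zip(by, fuzz)
            )
            if ok:
                used.add(i)
                pairs.append((d, s))
                break
    return pairs


cases = [{"Age": a} for a in (30, 40, 50, 60, None)]  # one case has a missing Age
controls = [{"Age": a} for a in (31, 45, 90)]

narrow = len(greedy_pairs(cases, controls, ["Age"], [5]))
wide = len(greedy_pairs(cases, controls, ["Age"], [50]))
```

With these toy data the narrow fuzz gives 2 pairs and the wide fuzz gives 3, which is every available control; no larger fuzz can raise the count further.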

Re: Case Control Matching Fuzzy

Maguin, Eugene
In reply to this post by maartenmeerkamp
Where does this command come from? Is this new to 21? A piece of python code? An undocumented command? A macro?

fuzzy demanderds=demander supplierds=supplier by= Age Weight fuzz= 5 5 supplierid = Casenr newdemanderidvars=supplierId.

Thanks, Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of maartenmeerkamp
Sent: Thursday, November 15, 2012 4:19 AM
To: [hidden email]
Subject: Case Control Matching Fuzzy

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Case-Control-Matching-Fuzzy-tp5716210.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Case Control Matching Fuzzy

David Marso
Administrator
IIRC it is one of JoNoh's extensions.
--
One could also quite easily roll one's own with an ADD FILES and some clever LAGs and LEADs (or SHIFT VALUES). I created something like this about 15 years ago for a client. The files had birth date, birth weight, names, and other data, and in many cases there were slight mismatches in one field, another, or all of them. It was quite a complicated project, but it worked beautifully in the end. If you wanted to get really clever, you could bring the sorted file into MATRIX and build some sort of DISTANCE function around neighboring cases.
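
A rough Python picture of that stack, sort, and compare-neighbours idea follows. It is an assumption about the general approach, not the code written for that client: it uses a single numeric key, and `neighbour_pairs` and the toy ages are hypothetical. Looking at the previous row plays the role that LAG would play in SPSS syntax.

```python
def neighbour_pairs(cases, controls, key, tol):
    """Stack both files, sort on the key, and pair adjacent rows from different files."""
    stacked = [("case", r) for r in cases] + [("control", r) for r in controls]
    stacked.sort(key=lambda item: item[1][key])  # the ADD FILES + SORT CASES step
    pairs = []
    prev = None  # previous row that is still unpaired (the LAG analogue)
    for source, row in stacked:
        close = (
            prev is not None
            and prev[0] != source
            and abs(row[key] - prev[1][key]) <= tol
        )
        if close:
            # Always report pairs as (case, control), whichever came first.
            pairs.append((prev[1], row) if prev[0] == "case" else (row, prev[1]))
            prev = None  # both rows are used up
        else:
            prev = (source, row)
    return pairs


cases = [{"id": "a", "Age": 30}, {"id": "b", "Age": 50}]
controls = [{"id": "x", "Age": 32}, {"id": "y", "Age": 70}]

pairs = neighbour_pairs(cases, controls, "Age", 5)
```

Here only case `a` and control `x` are paired; case `b` has no neighbour from the other file within 5 years.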
