MERGING Datasets on 2 key variables?

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

MERGING Datasets on 2 key variables?

SJPostdoc
Hello,

I am working with two datasets to complete a study using patient surveys. The first dataset includes (1) the medical record numbers (MRNs) of 600 patients that will make up my sample, along with (2) the date of survey (DOS) for each MRN. It looks like this:
MRN   DOS
1254 6/30/15
9546 1/12/10
1307 9/4/13

The second dataset (N=5000) has multiple entries/rows for each MRN/patient: since patients take these surveys every year, each MRN has 3, 4, or 5 dates of service and subsequent data for each DOS. This dataset also contains all of the survey data (100+ variables). It looks like this:
MRN DOS     - PAIN - DEPRESSION.......
1254 1/5/09 - 5     - 4
1254 8/15/11 - 5    - 3
1254 5/31/13 - 4    - 0
1254 6/30/15 - 1    -2
9546 2/9/09 - 0     - 1
9546 1/12/10 - 0    - 0

I would like to create a dataset that includes survey data for just those patients/MRNs from my first dataset (eg, MRNs 1254 and 9546 with only the DOS from the first dataset). I've tried to import survey data from the second data set based on MRN using the merge function, but have been unsuccessful since there are triple and quadruple MRNs in the second data set. I wonder if I could merge the two datasets on both MRN -AND- DOS, that way I just end up with the 600 I need.

Any help would be greatly, greatly, appreciated!

Thanks,
StJudePostdoc
Reply | Threaded
Open this post in threaded view
|

Re: MERGING Datasets on 2 key variables?

Bruce Weaver
Administrator
Yes, you can include both MRN and DOS on the BY sub-command for MATCH FILES.   Something like this ought to do the trick.  Make sure both data sets are sorted by MRN and DOS first.

MATCH FILES
 FILE = 'File1' / IN = F1 /
 FILE = 'File2' /
 BY MRN DOS.
EXECUTE.
* Keep records for folks in the first dataset only.
SELECT IF F1.
FREQUENCIES F1.



SJPostdoc wrote
Hello,

I am working with two datasets to complete a study using patient surveys. The first dataset includes (1) the medical record numbers (MRNs) of 600 patients that will make up my sample, along with (2) the date of survey (DOS) for each MRN. It looks like this:
MRN   DOS
1254 6/30/15
9546 1/12/10
1307 9/4/13

The second dataset (N=5000) has multiple entries/rows for each MRN/patient: since patients take these surveys every year, each MRN has 3, 4, or 5 dates of service and subsequent data for each DOS. This dataset also contains all of the survey data (100+ variables). It looks like this:
MRN DOS     - PAIN - DEPRESSION.......
1254 1/5/09 - 5     - 4
1254 8/15/11 - 5    - 3
1254 5/31/13 - 4    - 0
1254 6/30/15 - 1    -2
9546 2/9/09 - 0     - 1
9546 1/12/10 - 0    - 0

I would like to create a dataset that includes survey data for just those patients/MRNs from my first dataset (eg, MRNs 1254 and 9546 with only the DOS from the first dataset). I've tried to import survey data from the second data set based on MRN using the merge function, but have been unsuccessful since there are triple and quadruple MRNs in the second data set. I wonder if I could merge the two datasets on both MRN -AND- DOS, that way I just end up with the 600 I need.

Any help would be greatly, greatly, appreciated!

Thanks,
StJudePostdoc
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).