Dear SPSS Experts, The 50 item self-administered survey questionnaire(with items varied from demographic variables to likert type items) was successfully administered to n=2000 people. Then the filled-out questionnaires were scanned two times and the datafiles were save separately as data1.sav and data2.sav. The items in the datasets are labeled Q1, Q2, ...Q50. Then the values from the two datasets are matched. The goal is to create a third datafile (data3.sav) such that the values are 0 and 1, 1 if matched and 0 if unmatched. For example, if the response of a person to Q1 in data1 and data2 are the same or matched, then the Q1 for data3 would be 1; otherwise 0. How to do this in SPSS? Thank you for your help. |
this works, although you could make it more generic:
file handle dataset_a /name = "C:\Program Files\IBM\SPSS\Statistics\20\Samples\English\employee data.sav".
file handle dataset_b /name = "C:\Program Files\IBM\SPSS\Statistics\20\Samples\English\employee data.sav". begin program.
import spss, spssaux spss.Submit("get file = dataset_a.") vnames = [v for v in spssaux.GetVariableNamesList() if v != "id"] spss.Submit("rename variables (%s=%s)." % (" ".join(vnames), " ".join([v + "_a" for v in vnames]))) spss.Submit(["sort cases by id.", "dataset name a."]) spss.Submit("get file = dataset_b.") vnames = [v for v in spssaux.GetVariableNamesList() if v != "id"] spss.Submit("rename variables (%s=%s)." % (" ".join(vnames), " ".join([v + "_b" for v in vnames]))) spss.Submit(["sort cases by id.", "dataset name b."]) spss.Submit("match files /file = a /file = b /by id.") spss.Submit(["compute %s = %s eq %s." % (v, v + "_a", v + "_b") for v in vnames]) spss.Submit("add files /file = * /keep = id %s" % " ".join(vnames)) end program. exe. Regards,
Albert-Jan ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
Very nice Albert! Here is what I was writing for a non-python solution (although not quite as general).
1) Rename the variables in data2 so they don't conflict with data1, 2) after you match the files a simple do repeat structure will work. Just replace my [Q2List = Q2_1 to Q2_50] with whatever the names of the variables are in the second file. vector Q3_(50,F1.0). do repeat Q1List = Q1 to Q50 /Q2List = Q2_1 to Q2_50 /Q3List = Q3_1 to Q3_50. compute Q3List = (Q1List EQ Q2List). end repeat. This presumes the Q1 ... Q50 variables are contiguous in the datafile. The renaming of variables is annoying, but you could either write a simple macro to take care of that for you or just make a set set of variables via vector with non-conflicting names. Example on smaller datasets below. *******************************************************. data list free /ID Q1 Q2 Q3. begin data 1 4 5 4 2 2 1 2 3 3 3 3 end data. dataset name data1. data list free /ID Q1 Q2 Q3. begin data 1 4 4 4 2 2 2 2 3 3 3 3 end data. dataset name data2. rename variables (Q1 = Q2_1) (Q2 = Q2_2) (Q3 = Q2_3). match files file = 'data1' /file = 'data2' /by ID. dataset name data3. vector Q3_(3,F1.0). do repeat Q1List = Q1 to Q3 /Q2List = Q2_1 to Q2_3 /Q3List = Q3_1 to Q3_3. compute Q3List = (Q1List EQ Q2List). end repeat. exe. list. *******************************************************. |
In reply to this post by E. Bernardo
Use Data > Compare Datasets (COMPARE
DATASETS).
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621 From: "E. Bernardo" <[hidden email]> To: [hidden email], Date: 07/11/2013 02:45 AM Subject: [SPSSX-L] matching values from two different datasets Sent by: "SPSSX(r) Discussion" <[hidden email]> Dear SPSS Experts, The 50 item self-administered survey questionnaire(with items varied from demographic variables to likert type items) was successfully administered to n=2000 people. Then the filled-out questionnaires were scanned two times and the datafiles were save separately as data1.sav and data2.sav. The items in the datasets are labeled Q1, Q2, ...Q50. Then the values from the two datasets are matched. The goal is to create a third datafile (data3.sav) such that the values are 0 and 1, 1 if matched and 0 if unmatched. For example, if the response of a person to Q1 in data1 and data2 are the same or matched, then the Q1 for data3 would be 1; otherwise 0. How to do this in SPSS? Thank you for your help. |
Administrator
|
In reply to this post by Andy W
ANPS (Another Non Python Solution).
-- data list free /ID Q1 Q2 Q3. begin data 1 4 5 4 2 2 1 2 3 3 3 3 end data. dataset name data1 . SAVE OUTFILE 'G:\TEMP\d1'. data list free /ID Q1 Q2 Q3. begin data 1 4 4 4 2 2 2 2 3 3 3 3 end data. dataset name data2 . SAVE OUTFILE 'G:\TEMP\d2'. dataset declare compare. MATRIX . GET D1 / FILE = 'G:\TEMP\d1' / VARIABLES= ALL/ NAMES=Vname. GET D2 / FILE = 'G:\TEMP\d2' / VARIABLES =ALL/ NAMES=Vname. SAVE (D1-D2) / OUTFILE compare / NAMES=VNAME. END MATRIX. DATASET ACTIVATE compare. RECODE ALL (0=1)(ELSE=0). LIST. ID Q1 Q2 Q3 1.00 1.00 .00 1.00 1.00 1.00 .00 1.00 1.00 1.00 1.00 1.00 Number of cases read: 3 Number of cases listed: 3
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
Very nice, David! But I would add a FORMATS command to get rid of the decimals in your "compare" dataset.
RECODE ALL (0=1)(ELSE=0). FORMATS ALL(f1). LIST. I find 1 1 0 1 etc to be much easier on the eyes than 1.00 1.00 .00 1.00.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Albert, Jon, Andy, David, Bruce and all: Thank you for your help. Problem solved! Eins From: Bruce Weaver <[hidden email]> To: [hidden email] Sent: Thursday, July 11, 2013 11:45 PM Subject: Re: matching values from two different datasets Very nice, David! But I would add a FORMATS command to get rid of the decimals in your "compare" dataset. RECODE ALL (0=1)(ELSE=0). *FORMATS ALL(f1).* LIST. I find 1 1 0 1 etc to be much easier on the eyes than 1.00 1.00 .00 1.00. David Marso wrote > ANPS (Another Non Python Solution). > -- > data list free /ID Q1 Q2 Q3. > begin data > 1 4 5 4 > 2 2 1 2 > 3 3 3 3 > end data. > dataset name data1 . > SAVE OUTFILE 'G:\TEMP\d1'. > > data list free /ID Q1 Q2 Q3. > begin data > 1 4 4 4 > 2 2 2 2 > 3 3 3 3 > end data. > dataset name data2 . > SAVE OUTFILE 'G:\TEMP\d2'. > > dataset declare compare. > MATRIX . > GET D1 / FILE = 'G:\TEMP\d1' / VARIABLES= ALL/ NAMES=Vname. > GET D2 / FILE = 'G:\TEMP\d2' / VARIABLES =ALL/ NAMES=Vname. > SAVE (D1-D2) / OUTFILE compare / NAMES=VNAME. > END MATRIX. > DATASET ACTIVATE compare. > RECODE ALL (0=1)(ELSE=0). > LIST. > > ID Q1 Q2 Q3 > > 1.00 1.00 .00 1.00 > 1.00 1.00 .00 1.00 > 1.00 1.00 1.00 1.00 > > > Number of cases read: 3 Number of cases listed: 3 > Andy W wrote >> Very nice Albert! Here is what I was writing for a non-python solution >> (although not quite as general). >> >> 1) Rename the variables in data2 so they don't conflict with data1, 2) >> after you match the files a simple do repeat structure will work. Just >> replace my [Q2List = Q2_1 to Q2_50] with whatever the names of the >> variables are in the second file. >> >> vector Q3_(50,F1.0). >> do repeat Q1List = Q1 to Q50 /Q2List = Q2_1 to Q2_50 /Q3List = Q3_1 to >> Q3_50. >> compute Q3List = (Q1List EQ Q2List). >> end repeat. >> >> This presumes the Q1 ... Q50 variables are contiguous in the datafile. >> The renaming of variables is annoying, but you could either write a >> simple macro to take care of that for you or just make a set set of >> variables via vector with non-conflicting names. >> >> Example on smaller datasets below. >> >> *******************************************************. >> data list free /ID Q1 Q2 Q3. >> begin data >> 1 4 5 4 >> 2 2 1 2 >> 3 3 3 3 >> end data. >> dataset name data1. >> >> data list free /ID Q1 Q2 Q3. >> begin data >> 1 4 4 4 >> 2 2 2 2 >> 3 3 3 3 >> end data. >> dataset name data2. >> rename variables (Q1 = Q2_1) >> (Q2 = Q2_2) >> (Q3 = Q2_3). >> >> match files file = 'data1' >> /file = 'data2' >> /by ID. >> dataset name data3. >> >> vector Q3_(3,F1.0). >> do repeat Q1List = Q1 to Q3 /Q2List = Q2_1 to Q2_3 /Q3List = Q3_1 to >> Q3_3. >> compute Q3List = (Q1List EQ Q2List). >> end repeat. >> exe. >> list. >> *******************************************************. ----- -- Bruce Weaver [hidden email] http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." NOTE: My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/matching-values-from-two-different-datasets-tp5721102p5721110.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
walang anuman, Eins ;-)
Regards,
Albert-Jan ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
Free forum by Nabble | Edit this page |