Dear List members
I am trying use Match File function (ONE TO MANY MERGE) to add variables from one file to another. But i am getting errors. Can someone help me please? BOTH FILES HAVE THE 3 VARIABLES EVENTREFNUM INCIDENTREFNUM AND PERSONCNI TESTFILE1 -------- MANY RECORDS - 625 TESTFILE2 --------- 1 RECORD - 51330 TASK: I have to merge and create a new data file. MY SYNTAX MATCH FILES FILE=TESTFILE1.sav/TABLE=TESTFILE2.sav/BY EventRefNum IncidentRefNum PersonCNI. execute. RESULT Error # 63 in column 18. Text: TESTFILE1.sav The file does not exist. Execution of this command stops. Error # 63 in column 38. Text: TESTFILE2.sav The file does not exist. Note # 5146 There is no working file to restore. You must define a working file before proceeding with your analysis. EXECUTE. However when i run the syntax below using menu DATASET ACTIVATE DataSet11. ----------------------(testfile1) MATCH FILES /FILE=* /TABLE='DataSet13' --------------------(testfile2) /BY EventRefNum IncidentRefNum PersonCNI. EXECUTE. I do get a file with merged records but (only with 625 records). What I need is a file with 51330 records. Hope i have explained this well. I would be grateful if someone could help me? regards thara
|
See
http://www.ats.ucla.edu/stat/spss/modules/merge.htm You only get 625 lines because you have confused the MANY file with the "1" file. [Above] has an example matching DADs to KIDs, which shows that DADs is the one that is specified by TABLE= . The number of lines comes from the FILE= specification for KIDs. Your try in syntax gives very explicit error messages, "File does not exist." So, files by those names do not exist in whatever SPSS is taking as the default directory. Either you have SPSS looking at the wrong directory, or those files are not there. -- Rich Ulrich Date: Mon, 14 May 2012 14:53:58 +1000 From: [hidden email] Subject: HELP WITH MATCH FILES - ONE TO MANY MERGE To: [hidden email] Dear List members I am trying use Match File function (ONE TO MANY MERGE) to add variables from one file to another. But i am getting errors. Can someone help me please? BOTH FILES HAVE THE 3 VARIABLES EVENTREFNUM INCIDENTREFNUM AND PERSONCNI TESTFILE1 -------- MANY RECORDS - 625 TESTFILE2 --------- 1 RECORD - 51330 TASK: I have to merge and create a new data file. MY SYNTAX MATCH FILES FILE=TESTFILE1.sav/TABLE=TESTFILE2.sav/BY EventRefNum IncidentRefNum PersonCNI. execute. RESULT Error # 63 in column 18. Text: TESTFILE1.sav The file does not exist. Execution of this command stops. Error # 63 in column 38. Text: TESTFILE2.sav The file does not exist. Note # 5146 There is no working file to restore. You must define a working file before proceeding with your analysis. EXECUTE. However when i run the syntax below using menu DATASET ACTIVATE DataSet11. ----------------------(testfile1) MATCH FILES /FILE=* /TABLE='DataSet13' --------------------(testfile2) /BY EventRefNum IncidentRefNum PersonCNI. EXECUTE. I do get a file with merged records but (only with 625 records). What I need is a file with 51330 records. Hope i have explained this well. I would be grateful if someone could help me? regards thara
|
Administrator
|
In reply to this post by thara vardhan-2
From the FM:
"TABLE Subcommand TABLE specifies a table lookup (or keyed table) file. A lookup file contributes variables but not cases to the new active dataset. Variables from the table file are added to all cases from other files that have matching values for the key variables. FILE specifies the files that supply the cases." So: Do something like... DATASET ACTIVATE DataSet11. MATCH FILES /FILE=* /FILE='DataSet13' /BY EventRefNum IncidentRefNum PersonCNI. DO IF SUM(EventRefNum -LAG(EventRefNum), IncidentRefNum -LAG(IncidentRefNum ) , PersonCNI-LAG(PersonCNI) ) EQ 0. DO REPEAT v=<<***listOfVarsInOriginalTABLEfile***>>. COMPUTE v=LAG(V). END REPEAT. END IF.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by thara vardhan-2
Thara,
You either need to use the full path name or use the DATASET NAME command for the two datasets being used (then you can use those names in your syntax). This syntax says that you want to add variables from testfile2 (the one/table) into testfile1 (the many/master file). Testfile1 cannot be a table file since it has many records for the id variables (EventRefNum IncidentRefNum PersonCNI).
HOWEVER, Generally, the table file has fewer records than the master file, so it is quite unclear what you are trying to do. Are you attempting to add variables for a selection of cases from testfile2-that would require something like matching the 625 records and then adding them to the (51330-625) records? i.e. 625 records relate to some smaller number of records included in the 51330, but not to all records. DATASET ACTIVATE DataSet11. /*----------------------(testfile1). DATASET NAME TESTFILE1. DATASET ACTIVATE DataSet13. /*----------------------(testfile2). DATASET NAME TESTFILE2. MATCH FILES /FILE=TESTFILE1
/TABLE=TESTFILE2 From: SPSSX(r) Discussion [mailto:[hidden email]]
On Behalf Of Thara Vardhan Dear List members
PRIVILEGED AND CONFIDENTIAL INFORMATION This transmittal and any attachments may contain PRIVILEGED AND CONFIDENTIAL information and is intended only for the use of the addressee. If you are not the designated recipient, or an employee or agent authorized to deliver such transmittals to the designated recipient, you are hereby notified that any dissemination, copying or publication of this transmittal is strictly prohibited. If you have received this transmittal in error, please notify us immediately by replying to the sender and delete this copy from your system. You may also call us at (309) 827-6026 for assistance. |
Although we generally think of a look-up file as having fewer records than the analysis file, there are many scenarios where the opposite is true. You might have a file of a population with lots of demographics that you want to attach to a sample of that population. Perhaps that is true here? From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Melissa Ives Thara, You either need to use the full path name or use the DATASET NAME command for the two datasets being used (then you can use those names in your syntax). This syntax says that you want to add variables from testfile2 (the one/table) into testfile1 (the many/master file). Testfile1 cannot be a table file since it has many records for the id variables (EventRefNum IncidentRefNum PersonCNI). HOWEVER, Generally, the table file has fewer records than the master file, so it is quite unclear what you are trying to do. Are you attempting to add variables for a selection of cases from testfile2-that would require something like matching the 625 records and then adding them to the (51330-625) records? i.e. 625 records relate to some smaller number of records included in the 51330, but not to all records. DATASET ACTIVATE DataSet11. /*----------------------(testfile1). DATASET NAME TESTFILE1. DATASET ACTIVATE DataSet13. /*----------------------(testfile2). DATASET NAME TESTFILE2. MATCH FILES /FILE=TESTFILE1 /TABLE=TESTFILE2 From: SPSSX(r) Discussion [hidden email] On Behalf Of Thara Vardhan Dear List members
PRIVILEGED AND CONFIDENTIAL INFORMATION |
In reply to this post by thara vardhan-2
Use apostrophes to refer to files. Good luck, Mario ==================
Mario Giesel
Munich, Germany |
In reply to this post by ViAnn Beadle
Hi Rich, David, Melissa, Mario and Viann
My sincere thanks to all of you for helping me with this problem of matching ‘One to Many’ records with two data files. However my problem is still unresolved. Before I proceed any further: Rich – I have seen the very good tutorial guide from the UCLA website that gave the examples of the “dads” and “kids” file. This is when tried to translate the syntax with my file names. However I realised that 1) both the files were open 2) directory file path was not specified My default directory is not C:\drive,it is network group’s drive. However I am not able to place it correctly in the syntax. Maybe the path file is quite long Here is my attempt: Mario – I tried both with and without the apostrophe’s but I think the basic issues is specifying the directory. David – your excellent syntax does work but only up to a certain extent – I cannot lag Personcni variable because my testfile2 with 51330 records is unique when I use eventrefnum, incidentrefnum and personcni. This is the reason I specified it as the table file in the match function. There can be multiple records for the same eventrefnum and increfnum – because more than 1 person was involved in the same event but each person has a distinct personcni number. For each personcni address and other personal details are different. Here is some dummy data (testfile2) to illustrate: Erefnum increfnum PersionCNI LKA 27326 7781 29291 2/106 ABC ST 27326 7781 05960 7 MNO ST 27326 7781 10474 54 XYZ RD Testfile1 – contains legal actions attached to these eventrefnum, increfnum and personcni. Issue with file 1 is that there are multiple records for the same (39 records) eventrefnum, increfnum and personcni because of the presence of the variable ‘lawpartcode’ which can be different for each record Here is some dummy data to illustrate my problem: Erefnum increfnum PersionCNI lawpartcode 977435 052869 82242 7093 977435 052869 82242 5587 27326 7781 29291 1188 27326 7781 29291 1246 27326 7781 29291 6156 27326 7781 29291 820 27326 7781 05960 6156 27326 7781 05960 820 27326 7781 05960 1246 27326 7781 05960 6156 27326 7781 05960 820 DATASET ACTIVATE DataSet1. MATCH FILES /FILE=* /FILE='DataSet3' /BY EventRefNum IncidentRefNum PersonCNI. DO IF SUM(EventRefNum -LAG(EventRefNum), IncidentRefNum -LAG(IncidentRefNum ) , PersonCNI-LAG(PersonCNI) ) EQ 0. DO REPEAT v=<<listOfVarsInOriginalTABLEfile>>. COMPUTE v=LAG(V). END REPEAT. END IF. Finally Melissa – you have got it absolutely right – Are you attempting to add variables for a selection of cases from testfile2-that would require something like matching the 625 records and then adding them to the (51330-625) records? i.e. 625 records relate to some smaller number of records included in the 51330, but not to all records. Yes I am attempting to do the above. It works perfectly when I merge incident and person details from testfile2 into this file (test file 1 with 624 records). using menu command DATASET ACTIVATE DataSet11. ----------------------(testfile1) MATCH FILES /FILE=* /TABLE='DataSet13' --------------------(testfile2) /BY EventRefNum IncidentRefNum PersonCNI. EXECUTE. Viann Thank you so much. Yes you got it right too. Although we generally think of a look-up file as having fewer records than the analysis file, there are many scenarios where the opposite is true. You might have a file of a population with lots of demographics that you want to attach to a sample of that population. Perhaps that is true here? My problem is the external organisation lady is insisting that I need to provide it all in one big file! (that is the 51330 records file should have the 624 legal action details). Thank you all once again for your patience in reading through my woes! If any of you could help me understand and proceed with this I would be extremely thankful because I have to create 5 similar files for other incident categories every month. regards Thara
|
Administrator
|
Looks like my code does exactly what one would desire to do in this sort of situation.
Please indicate if I somehow misunderstand the situation. NOTE. Your file snippets as posted are *NOT* sorted in order. Maybe that has something to do with the problem. You *should* post what you expect the file to look like AFTER the match. Anyhow... -------------- DATA LIST /Erefnum increfnm PersCNI LKA (F5,X,F4,X,F5,X, A20). BEGIN DATA 27326 7781 29291 2/106 ABC ST 27326 7781 05960 7 MNO ST 27326 7781 10474 54 XYZ RD END DATA. LIST. SORT CASES BY Erefnum increfnm PersCNI . SAVE OUTFILE "unique.sav". DATA LIST LIST /Erefnum increfnm PersCNI lawcode. BEGIN DATA 977435 052869 82242 7093 977435 052869 82242 5587 27326 7781 29291 1188 27326 7781 29291 1246 27326 7781 29291 6156 27326 7781 29291 820 27326 7781 05960 6156 27326 7781 05960 820 27326 7781 05960 1246 27326 7781 05960 6156 27326 7781 05960 820 END DATA. SORT CASES BY Erefnum increfnm PersCNI . SAVE OUTFILE "multi.sav". MATCH FILES / FILE * / FILE "unique.sav" / BY Erefnum increfnm PersCNI . DO IF SUM(Erefnum -LAG(Erefnum),increfnm -LAG(increfnm ) ,PersCNI-LAG(PersCNI))EQ 0. COMPUTE LKA=LAG(LKA). END IF. LIST. EREFNUM INCREFNM PERSCNI LAWCODE LKA 27326.00 7781.00 5960.00 6156.00 7 MNO ST 27326.00 7781.00 5960.00 820.00 7 MNO ST 27326.00 7781.00 5960.00 1246.00 7 MNO ST 27326.00 7781.00 5960.00 6156.00 7 MNO ST 27326.00 7781.00 5960.00 820.00 7 MNO ST 27326.00 7781.00 10474.00 . 54 XYZ RD 27326.00 7781.00 29291.00 1188.00 2/106 ABC ST 27326.00 7781.00 29291.00 1246.00 2/106 ABC ST 27326.00 7781.00 29291.00 6156.00 2/106 ABC ST 27326.00 7781.00 29291.00 820.00 2/106 ABC ST 977435.0 52869.00 82242.00 7093.00 977435.0 52869.00 82242.00 5587.00 Number of cases read: 12 Number of cases listed: 12
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
** CONTEXT**.
David – your excellent syntax does work but only up to a certain extent – 'non-table' file.because my testfile2 with 51330 records is unique when I use eventrefnum, incidentrefnum and personcni. This is the reason I specified it as the table file in the match function. ---- "testfile2 with 51330 records is unique....." Ignore that because it is duplicated in the 'non-table' file and that is the entire point of this exercise. After the match the additional fields from the 'table-file' are attached to *ONLY* the first record of the KEY combinations. The LAG *applies* in the 'non-table' records and drags these additional fields into the other matching records!!!.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by thara vardhan-2
Thinking off the cuff without testing anything. It seems like you could use your original Match files/file=/table= syntax (with the dataset name commands to simplify your match file syntax and not require
the full path) That results in the 625 records having the updated information from the 51330. THEN To recombine the dataset you would need to replace the 625 records in the 51330 file with the updated 625 from the match files. It seems that this could be done with
UPDATE file=<master file: i.e. the 51330>/file=<transaction file: i.e. the updated 625>/by
EventRefNum IncidentRefNum PersonCNI Read the manual about how update functions to be sure it does exactly what you want. HTH. Melissa From: SPSSX(r) Discussion [mailto:[hidden email]]
On Behalf Of Thara Vardhan Hi Rich, David, Melissa, Mario and Viann
PRIVILEGED AND CONFIDENTIAL INFORMATION This transmittal and any attachments may contain PRIVILEGED AND CONFIDENTIAL information and is intended only for the use of the addressee. If you are not the designated recipient, or an employee or agent authorized to deliver such transmittals to the designated recipient, you are hereby notified that any dissemination, copying or publication of this transmittal is strictly prohibited. If you have received this transmittal in error, please notify us immediately by replying to the sender and delete this copy from your system. You may also call us at (309) 827-6026 for assistance. |
Administrator
|
You will lose the multiple records from the 625 case data set .
I still think the MATCH... LAG'nDrag is likely the 'safest' approach here! -- Using my previous file definitions with Melissa's suggested syntax: MATCH FILES / FILE * / TABLE "unique.sav" / BY Erefnum increfnm PersCNI . UPDATE /FILE "unique.sav" /FILE * / BY Erefnum increfnm PersCNI . FORMATS Erefnum increfnm (N6.0) PersCNI (N5.0). LIST. EREFNUM INCREFNM PERSCNI LKA LAWCODE 027326 007781 05960 7 MNO ST 820.00 027326 007781 10474 54 XYZ RD . 027326 007781 29291 2/106 ABC ST 820.00 977435 052869 82242 5587.00 Number of cases read: 4 Number of cases listed: 4
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
Here is another approach which pushes the "TABLE" into the multiples File but then acquires the non matches from the multiples file without explicitly lagging the variables from the "TABLE" into the multiple records defined by the KEYS. Does require an additional ADD FILES and purge but does not require knowledge of the variable names in the "TABLE" as my first proposed solution requires.
-- * Active file * is the file containing multiple instances of the KEY combos * . MATCH FILES / FILE * / TABLE "unique.sav" / BY Erefnum increfnm PersCNI . ADD FILES/FILE * / IN=@M@ /FILE "unique.sav" /IN=@U@/ BY Erefnum increfnm PersCNI. COMPUTE #FLAG=SUM(Erefnum-LAG(Erefnum),increfnm-LAG(increfnm),PersCNI-LAG(PersCNI)) EQ 0 AND LAG(@M@) AND @U@. SELECT IF NOT #FLAG. LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Free forum by Nabble | Edit this page |