Hi,
I have two big database in two different years like this: Year 1 Year 2 ID V1 V2 ID V1 V2 1 a c 1 b e 2 a j 1 a j 3 b k 2 a k 4 b l 2 c l end 4 b l 7 c k 100 d e . . . I need to create a third column in Year one that returns the observed values of V1 and V2 in Year two. Is that possible? Thank you in advance for advice. |
Administrator
|
Take a look at the /TABLE sub-command for MATCH FILES.
http://pic.dhe.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fsyn_match_files_table.htm You can see an example of it in the "one-to-many merge" example here: http://www.ats.ucla.edu/stat/spss/modules/merge.htm Note that the example given there is a bit dated in that it uses only files stored on the hard disk rather than two open datasets. One more thing: Because your example has variables V1 and V2 in both files, you'll also need to pay attention to the /RENAME sub-command. HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Administrator
|
In reply to this post by tlrodarte
Flatten the Year2 file using CASESTOVARS and then just do a normal 1:1 match into Year1 BY ID.
If this doesn't address the issue then you will have to take the time to explain your problem in greater detail (ie example of actual inputs and desired output).
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
In reply to this post by Bruce Weaver
Bruce,
Are you sure that applies to the OP's request?
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
No, I'm not sure. But then I'm not entirely sure what the OP wants the final file to look like, now that I reread it. ;-)
"I need to create a third column in Year one that returns the observed values of V1 and V2 in Year two." It would seem that they need at least 2 new variables. In the example provided, IDs 1 and 2 each have two rows of data, so two V1 values and two V2 values. Does the OP want the values from the first row only, or all values? Anyway, your approach of applying CASESTOVARS before 1:1 matching is probably the way to go.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
I think we need more information from the OP. I could be that the third column in Year One might be contain a transformation of V1 and V2 in Year Two, e.g., sum, product, etc.
B ________________________________________ From: SPSSX(r) Discussion [[hidden email]] on behalf of Bruce Weaver [[hidden email]] Sent: Sunday, January 19, 2014 1:17 PM To: [hidden email] Subject: Re: Match Files No, I'm not sure. But then I'm not entirely sure what the OP wants the final file to look like, now that I reread it. ;-) "I need to create a third column in Year one that returns the observed values of V1 and V2 in Year two." It would seem that they need at least 2 new variables. In the example provided, IDs 1 and 2 each have two rows of data, so two V1 values and two V2 values. Does the OP want the values from the first row only, or all values? Anyway, your approach of applying CASESTOVARS before 1:1 matching is probably the way to go. David Marso wrote > Bruce, > Are you sure that applies to the OP's request? > Bruce Weaver wrote >> Take a look at the /TABLE sub-command for MATCH FILES. >> >> http://pic.dhe.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fsyn_match_files_table.htm >> >> You can see an example of it in the "one-to-many merge" example here: >> >> http://www.ats.ucla.edu/stat/spss/modules/merge.htm >> >> Note that the example given there is a bit dated in that it uses only >> files stored on the hard disk rather than two open datasets. >> >> One more thing: Because your example has variables V1 and V2 in both >> files, you'll also need to pay attention to the /RENAME sub-command. >> >> HTH. >> >> tlrodarte wrote >>> Hi, >>> >>> I have two big database in two different years like this: >>> >>> Year 1 Year 2 >>> >>> ID V1 V2 ID V1 V2 >>> 1 a c 1 b e >>> 2 a j 1 a j >>> 3 b k 2 a k >>> 4 b l 2 c l >>> end 4 b l >>> 7 c k >>> 100 d e >>> . >>> . >>> . >>> >>> I need to create a third column in Year one that returns the observed >>> values of V1 and V2 >>> in Year two. Is that possible? >>> >>> Thank you in advance for advice. ----- -- Bruce Weaver [hidden email] http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." NOTE: My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Match-Files-tp5723992p5723996.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Instead of third column i should have written third and fourth columns for that example; in fact there are 7 columns (V1...V7) in my database. Also, i only need the information in the first row. Before finding this discussion group i tried to do like the "one-to-many merge" above, but it failed in the sense that the information from the second year did not appear in the Data Editor: the variables (renamed) were there, but the values were missing. I will try CASESTOVARS and return to you later. Thanks again! |
Correction:
... "the information from the FIRST year did not appear in the Data Editor"... |
Administrator
|
In reply to this post by tlrodarte
I suspect you just needed an EXECUTE after your MATCH FILES. When that is the case, you'll see "Transformations pending" in an information area at the bottom of the data editor.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
I see that message in the information area, but using EXECUTE does not change the result: the informations of the first year "disappear" from the database.
|
Administrator
|
OK! At no point do you post your syntax, the specifics of what the outcome should look like etc! If you expect people to assist you might want to be more informative!On Sun, Jan 19, 2014 at 3:55 PM, tlrodarte [via SPSSX Discussion] <[hidden email]> wrote: I see that message in the information area, but using EXECUTE does not change the result: the informations of the first year "disappear" from the database.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
Yes, providing more info at the outset would have been good. Here are the two data sets shown in the first post in the thread.
Year 1 Year 2 ID V1 V2 ID V1 V2 1 a c 1 b e 2 a j 1 a j 3 b k 2 a k 4 b l 2 c l end 4 b l 7 c k 100 d e . . . 1. What do you want the merged dataset to look like? 2. What syntax are you using to try to merge the datasets?
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Administrator
|
It rarely gets easier than this ;-)
CASESTOVARS /ID=ID. MATCH FILES / FILE year1 /IN=IN_Year1 / FILE year2 /IN=IN_year2 / BY ID . LIST. ID v1 v2 v1.1 v1.2 v2.1 v2.2 IN_Year1 IN_year2 1 a c b a e j 1 1 2 a j a c k l 1 1 3 b k 1 0 4 b l b c l k 1 1
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Bruce Weaver
Ok, i will explain better:
Below is the syntax that i used and resulted into the error i mentioned before (data missing): Syntax1.sps My files refers to data related to under graduate students and those students are identified by the numeric variable 'CO ALUNOS'. I have some informations (for example access or not to public financing) about the students in the year of 2009 and i have these students in 2010 and many others. I need to "capture" the students of 2009 in the database of 2010 to get the same informations that i have of 2009 in the data base of 2010 in order to find out the evolution of the students situation. So the merged data should contain the students of 2009 and the informations related to them in 2009 and in 2010. |
Without fixing your eventual problem, here is what is wrong with
this .sps file, and how to improve it. 1) You match by CO_ALUNOS: Therefore, you do not want to RENAME it, and you do not want to DROP it. 2) The file renames everything in the first file, and then DROPs them. Why bother? Use KEEP= CO_ALUNOS and those others will never appear. Or, if you want them, you can't DROP them. David's solution still seems like it was the good one, but you seem to have changed the description of the problem so that it does not apply. If you can put all the data in one file, with ID and YEAR as identifier, and using the one set of variable names... dropping the duplicated or unneeded variables... then you can use VarsToCases to combine the several years. -- Rich Ulrich ---------------------------------------- > Date: Sun, 19 Jan 2014 14:43:12 -0800 > From: [hidden email] > Subject: Re: Match Files > To: [hidden email] > > Ok, i will explain better: > > Below is the syntax that i used and resulted into the error i mentioned > before (data missing): > > Syntax1.sps > <http://spssx-discussion.1045642.n5.nabble.com/file/n5724005/Syntax1.sps> > > My files refers to data related to under graduate students and those > students are identified by the numeric variable 'CO ALUNOS'. I have some > informations (for example access or not to public financing) about the > students in the year of 2009 and i have these students in 2010 and many > others. I need to "capture" the students of 2009 in the database of 2010 to > get the same informations that i have of 2009 in the data base of 2010 in > order to find out the evolution of the students situation. So the merged > data should contain the students of 2009 and the informations related to > them in 2009 and in 2010. > > ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by tlrodarte
At 05:43 PM 1/19/2014, tlrodarte wrote:
>My files refers to data related to under graduate students and those >students are identified by the numeric variable 'CO_ALUNOS'. I have >some informations (for example access or not to public financing) >about the students in the year of 2009 and i have these students in >2010 and many others. I need to "capture" the students of 2009 in >the database of 2010 to get the same informations that i have of >2009 in the data base of 2010 in order to find out the evolution of >the students situation. So the merged data should contain the >students of 2009 and the informations related to them in 2009 and in 2010. You have two things that are making this difficult. One is that the same variables occur in both files, so that you have to rename when matching. The other is that you appear to have (according to your example data) more than one record per student in one of the files. You wrote, earlier, "I only need the information in the first row." Does that mean, only the first record for each student, if there's more than one? See if this does what you want. You have to rename the variables in Year 2 so they don't conflict with names in Year 1; CASESTOVARS is a tricky way to do that. See if this does what you want. |-----------------------------|---------------------------| |Output Created |20-JAN-2014 00:03:06 | |-----------------------------|---------------------------| [Year_1] Student V1 V2 1 a c 2 a j 3 b k 4 b l Number of cases read: 4 Number of cases listed: 4 DATASET ACTIVATE Year_2 WINDOW=FRONT. LIST. List |-----------------------------|---------------------------| |Output Created |20-JAN-2014 00:03:06 | |-----------------------------|---------------------------| [Year_2] Student V1 V2 1 b e 1 a j 2 a k 2 c l 4 b l 7 c k Number of cases read: 6 Number of cases listed: 6 * Get Year 2 data ready for merge: . DATASET ACTIVATE Year_2 WINDOW=FRONT. * Keep only the first record for each student . ADD FILES /FILE= * /BY Student /FIRST=Record1. SELECT IF Record1. * Rename the variables . CASESTOVARS /ID = Student /INDEX = Record1 /GROUPBY = VARIABLE /AUTOFIX = NO . Cases to Variables Notes [suppressed] LIST. List |-----------------------------|---------------------------| |Output Created |20-JAN-2014 00:03:06 | |-----------------------------|---------------------------| [Year_2] Student V1.1 V2.1 1 b e 2 a k 4 b l 7 c k Number of cases read: 4 Number of cases listed: 4 * Merge Year 1 and Year 2 data: . MATCH FILES /FILE=Year_1 /IN=In_Yr_1 /FILE=Year_2 /BY Student. SELECT IF In_Yr_1. LIST. List |-----------------------------|---------------------------| |Output Created |20-JAN-2014 00:03:07 | |-----------------------------|---------------------------| Student V1 V2 V1.1 V2.1 In_Yr_1 1 a c b e 1 2 a j a k 1 3 b k 1 4 b l b l 1 Number of cases read: 4 Number of cases listed: 4 ============================= APPENDIX: Test data and code ============================= * C:\Documents and Settings\Richard\My Documents . * \Technical\spssx-l\Z-2014\ . * 2014-01-19 tlrodarte-Match Files.SPS . * In response to posting . * Date: Sun, 19 Jan 2014 14:43:12 -0800 . * From: tlrodarte <[hidden email]> . * Subject: Re: Match Files . * To: [hidden email] . * and earlier in the same thread. . * It appears to be a straightforward MATCH FILES, adding data from . * one year to student data from another year, complicated by . * having more than one record per student in File 2. . DATA LIST LIST/ Student V1 V2 (F2, 2A1). BEGIN DATA 1 a c 2 a j 3 b k 4 b l END DATA. DATASET NAME Year_1. DATA LIST LIST/ Student V1 V2 (F2, 2A1). BEGIN DATA 1 b e 1 a j 2 a k 2 c l 4 b l 7 c k END DATA. DATASET NAME Year_2. DATASET ACTIVATE Year_1 WINDOW=FRONT. LIST. DATASET ACTIVATE Year_2 WINDOW=FRONT. LIST. * GET Year 2 data ready for merge: . DATASET ACTIVATE Year_2 WINDOW=FRONT. * Keep only the first record for each student . ADD FILES /FILE= * /BY Student /FIRST=Record1. SELECT IF Record1. * Rename the variables . CASESTOVARS /ID = Student /INDEX = Record1 /GROUPBY = VARIABLE /AUTOFIX = NO . LIST. * Merge Year 1 and Year 2 data: . MATCH FILES /FILE=Year_1 /IN=In_Yr_1 /FILE=Year_2 /BY Student. SELECT IF In_Yr_1. LIST. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Free forum by Nabble | Edit this page |