Hi everyone,
I'm trying to merge two datasets, a sample of which looks like this:

Dataset A    Dataset B
900100       900100
900100       900100
900100       900100
900100       900200
900100       900200
900100       900300
900100       900300
900200       900300
900200       900300
900300       900400
900300       900400
900400       900400

I want to match dataset B with dataset A. The datasets correspond to census-designated PUMAs in 2000 (dataset A) and 2010 (dataset B). The PUMAs changed from 2000 to 2010. For example, the 7 '900100' PUMAs in 2000 became 3 '900100' PUMAs in 2010. Any advice or tips are greatly appreciated, and thank you!

Greg

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
So, looking at your two lists, the naïve person sees seven 900100 records in A and three in B. Since she's supposed to match these up, she wants to know which B records, if any, go with which A records. What do you tell her?
Gene Maguin
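
To make the ambiguity concrete, here is a minimal sketch in Python (not SPSS) that counts how often each code appears in the two sample columns from the original post. The variable names are illustrative only. A key-based match needs the key to be unique in at least one of the two files, and the counts show that this does not hold for most of the codes.

```python
from collections import Counter

# Sample PUMA codes copied from the original post.
dataset_a = ["900100"] * 7 + ["900200"] * 2 + ["900300"] * 2 + ["900400"]
dataset_b = ["900100"] * 3 + ["900200"] * 2 + ["900300"] * 4 + ["900400"] * 3

counts_a = Counter(dataset_a)
counts_b = Counter(dataset_b)

# Codes repeated in BOTH files: a match on the code alone cannot say
# which A record belongs with which B record.
ambiguous = sorted(
    code for code in counts_a
    if counts_a[code] > 1 and counts_b.get(code, 0) > 1
)

# Print one line per code: code, count in A, count in B.
for code in sorted(set(counts_a) | set(counts_b)):
    print(code, counts_a.get(code, 0), counts_b.get(code, 0))

print("ambiguous codes:", ambiguous)
```

Only 900400 is unique on one side (once in A), so it is the only code for which a plain table-lookup style match would be well defined.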
Hi Gene,
Thank you for your response. I edited my original post by adding some (brief) context to the datasets.

Greg
Greg, I don't have any experience with PUMA or PUMS files. I have heard of them because somebody here worked with American Community Survey data. When I googled PUMAs, one response was cartographic boundary shapefiles for public use from the census bureau (https://www.census.gov > Geography > Maps & Data > Cartographic Boundary Files). I wonder if, using ARC, you could put the 2000 shapefiles side by side with the 2010 shapefiles and get something useful enough to go forward.
Gene Maguin
The term to search for is "crosswalk". Basically, you will do an areal allocation to go from one set to another.

https://usa.ipums.org/usa/volii/puma00_puma10_crosswalk_spatial.shtml

-----
Andy W
http://andrewpwheeler.wordpress.com/
OK, and thank you. I'm not completely confident working with GIS, though. Can this not be done with an SPSS merge?

Greg
No, you can do this all within SPSS. The link I provided has an Excel spreadsheet you can read into SPSS. In a nutshell, you will need to aggregate the 2000 database based on the proportions in that spreadsheet; then it will be a simple 1-to-1 merge.

-----
Andy W
And this is what confuses me. So I aggregate based on the 2000 PUMAs (using them as the break variable), but since some 2000 PUMAs split into more than one PUMA in 2010, won't I still have to do a 'many to many' merge? (I'm sure I'm missing something.)

Greg
No, the crosswalk file should have both 2000 and 2010 identifiers. It will then have a field that is a proportion. You will want to aggregate whatever 2000 information you have to the 2010 identifiers, using that proportion as a weight. That is about as specific as I can be without more information about your original data, from which I could construct an example with code.

-----
Andy W
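
The weighting step described above can be sketched as follows, in Python rather than SPSS syntax. Everything here is an assumption made for illustration: the values, the crosswalk rows, and the shares are invented, and the names (`values_2000`, `crosswalk`, `allocate_to_2010`) are hypothetical. The real crosswalk file has its own column names, and it is worth checking whether its proportion field is stored as a fraction or a percentage.

```python
from collections import defaultdict

# Hypothetical 2000 data, already aggregated to one count per 2000 PUMA.
# All numbers are made up.
values_2000 = {"900100": 700.0, "900200": 200.0, "900300": 200.0, "900400": 100.0}

# Hypothetical crosswalk rows: (2000 PUMA, 2010 PUMA, share of the 2000
# PUMA allocated to that 2010 PUMA). Each (2000, 2010) pair appears once,
# and the shares for each 2000 PUMA sum to 1.
crosswalk = [
    ("900100", "900100", 0.50),
    ("900100", "900200", 0.25),
    ("900100", "900300", 0.25),
    ("900200", "900300", 1.00),
    ("900300", "900400", 1.00),
    ("900400", "900400", 1.00),
]

def allocate_to_2010(values, rows):
    """Sum the 2000 values into 2010 PUMAs, weighting by the crosswalk share."""
    totals = defaultdict(float)
    for puma00, puma10, share in rows:
        totals[puma10] += values[puma00] * share
    return dict(totals)

estimates_2010 = allocate_to_2010(values_2000, crosswalk)

# Hypothetical 2010 data, one row per 2010 PUMA (made-up numbers).
values_2010 = {"900100": 360.0, "900200": 180.0, "900300": 400.0, "900400": 310.0}

# After the allocation both sides have one row per 2010 PUMA, so the
# final step is a one-to-one match on the 2010 identifier.
merged = {
    puma10: (estimates_2010.get(puma10), values_2010[puma10])
    for puma10 in sorted(values_2010)
}
```

Because each (2000, 2010) pair occurs only once in the crosswalk, joining the 2000 data to it is one-to-many rather than many-to-many, and the weighted sum collapses the result back to one row per 2010 PUMA. This sketch allocates counts; means or rates would need to be rebuilt from allocated numerators and denominators rather than weighted directly.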