|
Hello all! I am new and I subscribed just to ask this question. If your forum is closed off, or if you would like me to contribute first before asking, please let me know and I will do so.
My question is simple and I hope the solution will be, too. I wish to merge 2 files. One file has points on a road, the other has stretches of road. I want to merge them so that the file with the points contains the data for the stretch of road. This is because each stretch contains multiple points, but each point has only 1 stretch.
In the point dataset I have 1 variable that specifies the location of the point on the road in terms of a general hectometer system. In the second dataset I have a start and an end variable on the same hectometer system that give the beginning and end point of the stretch of road. Issues such as identifying the road (by road name), the direction of the traffic, et cetera need not be taken into consideration; those have been dealt with.
I was thinking of making a new variable in both files, based on some greater-than and less-than statements. But the sheer number of statements I would have to write is endless. I hope you guys can help me out.
Thank you!
Miriam |
|
What I would do is make a crosswalk table of all potential points to each stretch of road. I did something similar a while ago to explode a set of street segments with beginning-end numbers using XSAVE.
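A minimal sketch of the crosswalk idea, written in Python rather than SPSS syntax so it can be run anywhere. Everything in it is an assumption made for illustration: the stretch numbers, the sample positions, and the choice to store positions as integer tenths of a hectometer so that the exploded keys match exactly.

```python
# Hypothetical stretches: (stretch_id, start, end) in tenths of a hectometer.
# Integer keys avoid floating-point mismatches when the crosswalk is joined.
stretches = [(1, 0, 12), (2, 13, 87), (3, 88, 345)]

def build_crosswalk(stretches):
    """Explode every stretch into one entry per grid position it covers."""
    crosswalk = {}
    for stretch_id, start, end in stretches:
        lo, hi = sorted((start, end))  # tolerate files where start > end
        for position in range(lo, hi + 1):
            # setdefault keeps the first stretch if two stretches overlap
            crosswalk.setdefault(position, stretch_id)
    return crosswalk

crosswalk = build_crosswalk(stretches)

# Hypothetical point file: positions of measurement points, same units.
points = [5, 13, 200, 400]

# Many-to-one lookup: each point gets the stretch that contains it, or None.
matched = [(p, crosswalk.get(p)) for p in points]
print(matched)
```

The table grows with the total length of all stretches divided by the grid step, so a finer grid means a larger crosswalk.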
If you have any questions feel free to ask whatever. It would probably help though in general to be more specific about what your data look like. |
|
Administrator
|
In reply to this post by Miriam
Hello Miriam. It sounds like you want to merge files via MATCH FILES. The UCLA Statistical Computing website has a tutorial you might find helpful -- it shows how to do the many-to-one kind of matching you describe. You can find it here:
http://www.ats.ucla.edu/stat/spss/topics/data_management.htm Scroll down to "Merging (MATCH merging) SPSS data files". Notice the important distinction between /TABLE and /FILE sub-commands for MATCH FILES. HTH.
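To make the /TABLE versus /FILE distinction concrete, here is a small sketch in Python rather than SPSS syntax. It imitates what a /TABLE lookup does in a many-to-one match: the case file drives the output rows, and the table only contributes values for matching keys. The names and values are hypothetical and are not taken from Miriam's files.

```python
# Hypothetical case file: several points per stretch (many records per key).
points = [
    {"point": "a", "stretch": 1},
    {"point": "b", "stretch": 1},
    {"point": "c", "stretch": 2},
    {"point": "d", "stretch": 9},  # no such stretch in the table
]

# Hypothetical lookup table: exactly one record per key.
stretch_table = {
    1: {"length": 50.0},
    2: {"length": 75.0},
    3: {"length": 20.0},  # never used: no point refers to stretch 3
}

def table_match(cases, table, key):
    """Spread table values over the cases; only the cases create rows."""
    merged = []
    for case in cases:
        row = dict(case)
        # A case without a match keeps its row and gets a missing value.
        row.update(table.get(case[key], {"length": None}))
        merged.append(row)
    return merged

result = table_match(points, stretch_table, "stretch")
for row in result:
    print(row)
```

Stretch 3 never appears in the result because no point refers to it. That is the difference from naming both files on /FILE, where unmatched records from either side would be kept.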
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
|
Hi guys,
Thank you for the answers! I already have plenty of experience with merging files. I just haven't come across this problem yet, and my colleague who has is on holiday. Andy, what you wrote is, I think, what I'm looking for. I will try to apply it tomorrow.
To inform you more, these are the data I work with. They concern road data. The roads in the Netherlands have a starting point, such as a city, and count up from there on both sides of the road (regardless of the direction of traffic). They count in kilometers, or more specifically in hectometers (a hectometer is 100 meters or, more or less, 100 yards).
This is the file with travel time across a stretch of road. It has 3 relevant variables, namely the starting point, the ending point and the length of the stretch of road.

Variable view
Name     Type     Width  Decimals  Label  Values  Missing  Columns  Align  Measure  Role
HmStart  Numeric  8      2                None    None     10       Right  Scale    Input
HmEind   Numeric  8      2                None    None     10       Right  Scale    Input
MLengte  Numeric  8      2                None    None     10       Right  Scale    Input

Data view
HMstart  HmEind  MLengte
192,00   191,50  50,00
192,00   191,50  50,00
192,00   191,50  50,00

This is the file with the intensity of the road, that is, the number of people who travel there at a given time. It only contains the location of the point.

Variable view
Name     Type     Width  Decimals  Label  Values  Missing  Columns  Align  Measure  Role
Hm       Numeric  8      2                None    None     10       Right  Scale    Input

Data view
Hm
181,00
181,00
181,00 |
|
Well, I decided on another solution.
I am now creating a new variable "RDwegvak" in the stretch file, by aggregating the file by HmStart and HmEind. In the result file I give each stretch of road its own unique number. Then I quickly generate syntax with Excel that says: if the hm [point] is between [x] and [y], then RDwegvak = [z]. I feed that back to the intensity (point) file. The last step, merging the two files by my new RDwegvak, is a mere formality.

```
* Syntax for merging travel time (reistijd) and intensity (intensiteit).
Compute RDwegvak = 0 .
DATASET ACTIVATE DataSet1.
DATASET DECLARE RTwegvak_A13.
AGGREGATE
  /OUTFILE='RTwegvak_A13'
  /BREAK=HmStart HmEind
  /MLengte_mean=MEAN(MLengte)
  /N_BREAK=N.
SORT CASES BY HmStart(A) HmEind(A).
SAVE OUTFILE='K:\Analyse fase 2\Case '+
    '7\RDWegvak_A13.sav'
  /COMPRESSED.
DATASET ACTIVATE Dataset1 .
SORT CASES BY HmStart(A) HmEind(A).
MATCH FILES /FILE=*
  /TABLE='RTwegvak_A13'
  /BY HmStart HmEind.
EXECUTE.

GET
  FILE='K:\ZZ0146 Quick scan evaluatie regelscenario’s VCZWN\Analyse\Analyse fase 2\Case 7\intensiteit A13 case 7, minuut.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.

compute rdwegvak = 0 .

If ((HM> 0 ) and (HM< 1.19 )) Rdwegvak = 1 .
If ((HM> 0 ) and (HM< 8.72 )) Rdwegvak = 2 .
If ((HM> 1 ) and (HM< 34.58 )) Rdwegvak = 3 .
```

etcetera.
Thanks for all your help! |
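In a run of IF commands like the one above, a later IF whose condition is also true overwrites the value set by an earlier one, so overlapping ranges quietly give the wrong stretch (with the three ranges shown, a point at 0.5 ends up with 2, not 1). The sketch below, in Python rather than SPSS syntax, shows the same range test driven by a table of stretches instead of generated statements, plus a check for overlaps. The stretch values, the function names and the half-open convention (start <= hm < end) are assumptions for illustration only.

```python
# Hypothetical stretch table: (stretch_id, start_hm, end_hm); values made up.
stretches = [(1, 0.0, 1.2), (2, 1.2, 8.7), (3, 8.7, 34.6)]

def find_overlaps(stretches):
    """Return overlapping neighbours after sorting by lower bound.

    An empty list means that no two stretches overlap at all.
    """
    ordered = sorted(stretches, key=lambda s: min(s[1], s[2]))
    overlaps = []
    for (id_a, a0, a1), (id_b, b0, b1) in zip(ordered, ordered[1:]):
        if min(b0, b1) < max(a0, a1):
            overlaps.append((id_a, id_b))
    return overlaps

def assign_stretch(hm, stretches):
    """Return the id of the first stretch with lower <= hm < upper, else None."""
    for stretch_id, start, end in stretches:
        lo, hi = min(start, end), max(start, end)
        if lo <= hm < hi:
            return stretch_id
    return None

points = [0.5, 1.2, 8.69, 40.0]
assigned = [assign_stretch(hm, stretches) for hm in points]
print(find_overlaps(stretches))
print(assigned)
```

Running the overlap check before assigning anything is a cheap way to catch flawed ranges before they reach the point file.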
|
Administrator
|
In reply to this post by Miriam
As a start you need to post a snippet of the two data files to illustrate the specific issues.
I am not eSPSSecially inclined to exercise InterNeTelepathy at the moment. Best way to get simple solutions to 'simple' questions is to provide a clear, concise but thorough description of the issues. It could be as simple as interleaving the SORTED files with ADD FILES and using LAG or LEAD to link. OTOH I have no idea how your data appear and your 5 minutes are up.
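One possible reading of the ADD FILES plus LAG suggestion, sketched in Python rather than SPSS syntax: stack both files, sort by position, and carry the most recent stretch forward onto each point. This is an interpretation, not necessarily what was meant. It assumes the stretches are contiguous (each one starts where the previous one ends), and all positions and ids are made up.

```python
# Hypothetical inputs; positions in hectometers, values made up.
stretch_starts = [(0.0, 1), (1.2, 2), (8.7, 3)]  # (start_hm, stretch_id)
point_positions = [0.5, 1.2, 5.0, 8.69, 20.0]

# Interleave both files into one list sorted by position. The second tuple
# element places a stretch record ahead of a point at the same position.
records = [(hm, 0, stretch_id) for hm, stretch_id in stretch_starts]
records += [(hm, 1, None) for hm in point_positions]
records.sort(key=lambda r: (r[0], r[1]))

# One pass: remember the last stretch seen and hand it to each point,
# which is the role LAG would play in the interleaved file.
current = None
linked = []
for hm, kind, stretch_id in records:
    if kind == 0:
        current = stretch_id
    else:
        linked.append((hm, current))

print(linked)
```

The cost is one sort and one pass, regardless of how finely the positions are measured.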
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
|
In reply to this post by Miriam
Regards,
Albert-Jan

<snip>
>GET
>  FILE='K:\ZZ0146 Quick scan evaluatie regelscenario’s VCZWN\Analyse\Analyse fase 2\Case 7\intensiteit A13 case 7, minuut.sav'.
>DATASET NAME DataSet1 WINDOW=FRONT.
>
>compute rdwegvak = 0 .
>
>If ((HM> 0 ) and (HM< 1.19 )) Rdwegvak = 1 .
>If ((HM> 0 ) and (HM< 8.72 )) Rdwegvak = 2 .
>If ((HM> 1 ) and (HM< 34.58 )) Rdwegvak = 3 .
>

Besides, using Excel might be okay if this is a one time job, but other than that Excel is evil ;-)

Albert-Jan

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by David Marso
@Albert-Jan, you're right, this example won't work out, but that's because I worked with flawed test-variables. The principle works. I'm a big SPSS-fan as opposed to excel but when it comes to creating repeated syntax of between 20/100 lines, it's good. It's easy :)
@David, I've just posted that I'm new and I ask for comments if I'm not doing it right. So no need to be sarcastic. |
|
Administrator
|
Believe me I get it! I've been answering questions in this forum for over 20 years and used to do tech support, training and consulting at SPSS in Chicago. Don't be so hasty as to consider reading between the lines of my posting a waste of your time. Maybe a waste of my time?
Think about ADD FILES BY... and LAG or LEAD.
RECODE is also your friend.

On Tue, Sep 11, 2012 at 2:37 PM, Miriam [via SPSSX Discussion] <[hidden email]> wrote:
> @David, telepathy? I've just posted that I'm new and I ask for comments if I'm not doing it right. So no need to be sarcastic. The others seem to get it, so better not post anything, as so not to waste both our time?
>
> Albert-Jan, you're right, this example won't work out, but that's because I worked with flawed test-variables. The principle works. I'm a big SPSS-fan as opposed to excel but when it comes to creating repeated syntax of between 20/100 lines, it's good. It's easy :) |
|
I think there was a listserv server 'hiccup' because the same initial Miriam message appeared twice and there were a couple of responses (Bruce and Albert-Jan, I think), then the first initial message, a second Miriam message, David, I think you replied about then and then the initial message was repeated. Maybe I have parts of the sequence wrong but that's how I remember it.
Gene Maguin |
|
In reply to this post by David Marso
Come on, David. No need to be rude.
If you don't want to help, just don't respond.
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621 |
|
In reply to this post by Miriam
> Albert-Jan, you're right, this example won't work out, but that's because I worked with flawed test-variables. The principle works. I'm a big SPSS-fan as opposed to excel but when it comes to creating repeated syntax of between 20/100 lines, it's good. It's easy :)

Hmmm, I still think you could ditch Excel ;-) After the AGGREGATE couldn't you do:

```
dataset activate RTwegvak_A13.
compute Rdwegvak = $casenum.
exe.
```

This assigns an integer to each interval. You'd still need value labels for RdWegvak, but that could be done with Python, or perhaps even with PRINT OUTFILE in conjunction with ADD VALUE LABELS. Hope you won't make any errors --the A13 is very near my house!! ;-) |
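A rough sketch of the "done with Python" idea mentioned above: number the aggregated intervals in file order and build the text of one ADD VALUE LABELS command from them. It is plain Python that only produces a string; nothing here calls into SPSS. The interval values and the label format are invented for illustration.

```python
# Hypothetical aggregated stretch file: one (start_hm, end_hm) per stretch.
intervals = [(0.0, 1.2), (1.2, 8.7), (8.7, 34.6)]

# Number the stretches 1..n in file order, as $casenum would.
numbered = [(n, start, end)
            for n, (start, end) in enumerate(intervals, start=1)]

# Build one ADD VALUE LABELS command that describes each numbered stretch.
label_lines = [f"  {n} '{start:.2f} - {end:.2f}'" for n, start, end in numbered]
syntax = "ADD VALUE LABELS Rdwegvak\n" + "\n".join(label_lines) + "."
print(syntax)
```

The resulting text could be pasted into a syntax window or written to a file and run with INSERT.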
|
In reply to this post by Miriam
I have no idea where you are going with this, but anything that has manual manipulations will surely have a better approach directly in syntax. Before descending into more madness and tangential replies, here is another example given what I believe your input data look like. Again if you have any questions let me know. If this doesn't reflect what your data look like please be explicit in how it doesn't generalize to your situation.
There are ways around saving an actual file through xsave I can think of, but hopefully this approach is amenable to your workflow. |
|
Administrator
|
In reply to this post by Jon K Peck
David's response may have seemed a bit abrupt (especially to a list newbie who is not accustomed to his style). But let's not lose the main point he was making. I.e.,
"Best way to get simple solutions to 'simple' questions is to provide a clear, concise but thorough description of the issues." In this particular case, that translates to posting "snippet[s] of the two data files to illustrate the specific issues." Without that, everyone is just taking their best guess as to what the problem is. Perhaps someone should write (or borrow from elsewhere) a set of tips on how to increase the likelihood of getting useful responses to mailing list posts, and post it here once a month! ;-) HTH. |
|
Re: Merge from 1 point on a road to a stretch of road
I don't often chime in, but I support Bruce on this. And I think that a set of tips on what's needed in the body of the email should also contain some tips about what to put in the subject line as well. It's really difficult for a lot of us who've been on the listserv for awhile, and pretty much try to exhaust the archives or Raynald's site before posting, to search for help in the archives when the subject line is specific to a project rather than a statistic or statistical approach. So maybe we can fashion some guidelines for folks, so there is more consistency, and ultimately less posting for solutions that are readily available in the archives, or other sites, because the subject line is more specific. Just a thought.
Brian |
|
Administrator
|
Here are some pointers for folks posting to some forum on MS Word. If you replaced "Word" with "SPSS", you'd have a good start on some tips for this forum, I think. ;-)
http://word.mvps.org/findhelp/posting.htm |
|
Administrator
|
In reply to this post by Bruce Weaver
Indeed!!! In fact I believe I will open a new thread with a Copy/Paste of that link and add a few hopefully illuminating comments. |
|
Administrator
|
In reply to this post by Andy W
"What I would do is make a crosswalk table of all potential points to each stretch of road."
Why would you want to do this? Just interleave with ADD FILES and LAG or LEAD!! Brute force is unlikely to work in any general case. One fundamental issue is granularity!

Andy W wrote:
> What I would do is make a crosswalk table of all potential points to each stretch of road. I did something similar awhile ago to explode a set of street segments with beginning-end numbers using XSAVE, and so I will paste that to show an example of what I am talking about.
>
> set seed = 10.
> input program.
> loop #i = 1 to 100.
> compute begin_add = TRUNC(RV.UNIFORM(100,1000)).
> compute end_add = begin_add + 100.
> compute street = #i.
> end case.
> end loop.
> end file.
> end input program.
> dataset name sim.
> execute.
>
> *this will save an outfile of all the addresses.
> loop #i = begin_add to end_add.
> compute add = #i.
> xsave outfile = 'save\exploded_table.sav'.
> end loop.
> get file = 'save\exploded_table.sav'.
>
> If you have any questions feel free to ask whatever. It would probably help though in general to be more specific about what your data look like. |
|
I have no idea what you are talking about, care to share an example or elaborate?
To be specific, just take the example datasets I created (the OP still hasn't quite confirmed that this is what the data look like, but that is my best guess). I agree it could be computationally cumbersome with large files (it could "explode" into a really enormous number of records), but in the situations where I have used it, that was not a problem. Also note that (besides the lines to make fake data) the solution I provide is hardly complicated: it takes less than 20 lines of code. You could probably also drop the steps that get the resulting table and sort it (I just didn't figure out on the first try how to get the table sorted correctly from the original XSAVE, but I imagine it is possible). I really have no idea what you are talking about (neither "interleaving with lag" nor "granularity"), so please elaborate.
Andy |
