Dear list members,

I have a problem with matching two files. The key variable is a string, but the cases do not match up, even though the resulting data set shows no visible differences between the values of the key variables. They do differ, though: sorting them and comparing each value with its LAG shows they are not equal, but I do not understand how they differ. The first key variable was originally read from a text file, while the other one originates from an Excel xlsx file.
The key variable is formatted A18.

Syntax example:
match files file =*/file = file2 / by key.

Example of what the new data set looks like:
var1 var2 var3 var5 var5
111  343  123  aaa  .
.    .    .    aaa  123
111  343  123  bbb  .
.    .    .    bbb  123
111  343  123  ccc  .
.    .    .    ccc  123
111  343  123  ddd  .
.    .    .    ddd  123

I encountered this before. Back then it was a small file, and I resolved it in another way. But this one is too large for that.

Thanks in advance.
Maurice

--
___________________________________________________________________
Maurice Vergeer
Department of communication, Radboud University (www.ru.nl)
PO Box 9104, NL-6500 HE Nijmegen, The Netherlands

Visiting Professor Yeungnam University, Gyeongsan, South Korea

Recent publications:
-Vergeer, M., Eisinga, R. & Franses, Ph.H. (forthcoming). Supply and demand effects in television viewing. A time series analysis. Communications - The European Journal of Communication Research.
-Vergeer, M. Lim, Y.S. Park, H.W. (forthcoming). Mediated Relations: New Methods to study Online Social Capital. Asian Journal of Communication.
-Vergeer, M., Hermans, L., & Sams, S. (forthcoming). Online social networks and micro-blogging in political campaigning: The exploration of a new campaign tool and a new campaign style. Party Politics.
-Pleijter, A., Hermans, L. & Vergeer, M. (forthcoming). Journalists and journalism in the Netherlands. In D. Weaver & L. Willnat, The Global Journalist in the 21st Century. London: Routledge.

Webspace
www.mauricevergeer.nl
http://blog.mauricevergeer.nl/
www.journalisteninhetdigitaletijdperk.nl
maurice.vergeer (skype)
___________________________________________________________________
Maurice,
The example data, if it's intended to be that, is not helpful.
At all. So, time for the problem drill.
1) You might swear the key variable is A18, but is
that how SPSS reports it? Is it the same in both
files?
2) When you execute the MATCH FILES, do you get any
errors or warnings of absolutely any kind?
3) Post some true example data from file=* and file=file2
and the result. Label which variable is the key in the example data
set.
Gene Maguin

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Maurice Vergeer
Sent: Wednesday, June 29, 2011 5:41 PM
To: [hidden email]
Subject: match file with string variable
In reply to this post by Maurice Vergeer
Maurice,
You need to list the key variable too if we are to be of any help. Other than that: are there leading blanks in the key variable? That will cause the MATCH to fail. Perhaps list the first 50 resulting cases so we can see what's up.
HTH, David
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Dear David and Gene,
Problem found: trailing spaces. I had already forced the key variables to the same length; otherwise they wouldn't, or at least shouldn't, match. I received no errors, obviously. It is strange, though, that the trailing spaces even exist. The data were downloaded using Twitter's API. Another thing to worry about when using APIs.

Thanks for the help.
Maurice

On Thu, Jun 30, 2011 at 07:05, David Marso <[hidden email]> wrote:
> Maurice,
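For downloaded data like this, one way to guard against stray whitespace is to normalize the key before matching. The following is only a minimal sketch in plain Python, not SPSS syntax; the function name `clean_key` and the sample values are made up for illustration and do not come from the files discussed here.

```python
def clean_key(raw: str) -> str:
    """Strip leading and trailing whitespace from a key value.

    str.strip() with no argument removes every character Python treats
    as whitespace, which includes tabs and the non-breaking space
    (U+00A0), not only the ordinary blank.
    """
    return raw.strip()


# Hypothetical key values as they might arrive from two sources.
from_text_file = ["aaa", "bbb", "ccc"]
from_download = ["aaa ", "\tbbb", "ccc\u00a0"]

cleaned = [clean_key(k) for k in from_download]
print(cleaned == from_text_file)
```

Whitespace inside a key is left alone, so two keys that differ in the middle still count as different.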
As long as the defined lengths of the strings
are the same, trailing spaces in the values shouldn't matter. In fact,
all string values are right-padded with spaces to the defined length.
But for comparison purposes "string"="string ". Leading spaces, however, do matter.

From: Maurice Vergeer <[hidden email]>
To: [hidden email]
Date: 06/29/2011 06:38 PM
Subject: Re: match file with string variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>
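The padding rule described above can be imitated outside SPSS. This is a rough Python sketch of the idea only, assuming a fixed width of 18 to mirror the A18 key and values no longer than that width; the helper `pad` is hypothetical and is not how SPSS is implemented internally.

```python
WIDTH = 18  # mirrors the A18 key from the thread


def pad(value: str, width: int = WIDTH) -> str:
    """Right-pad a value with blanks to the declared width."""
    return value.ljust(width)


# Trailing blanks disappear into the padding, so these compare equal.
trailing_equal = pad("string") == pad("string   ")

# A leading blank shifts every character, so these do not.
leading_equal = pad("string") == pad(" string")

print(trailing_equal, leading_equal)
```

Under these assumptions the first comparison is true and the second is false, which is the same distinction between trailing and leading spaces made above.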
In reply to this post by Maurice Vergeer
Hi Maurice,
Possible problems:

Scenario 1
From your statement, "sorting them and comparing them with lag shows they are not equal, but I do not understand how they differ," I am assuming that you have checked for duplicate entries and saved the file after removing the duplicates. If you have not saved the file after removing the duplicates, please do so. Please note that a blank cell is treated as a valid entry for a string variable, which means it will also be counted as a duplicate entry.

Scenario 2
All files to be matched must first be saved as SPSS files. I am assuming that you have done so.

Scenario 3
All SPSS files need to be sorted in ascending order before any matching takes place. I will do the following:

Warmest Regards
Dorraj Oet
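The two checks mentioned in the scenarios above, duplicate keys (blank keys included) and ascending sort order, can be sketched in Python. This is an illustration only and not the steps from the message above; the function names and sample keys are hypothetical, and Python compares strings by code point, which may not agree with the sort order SPSS uses.

```python
from collections import Counter


def is_sorted_ascending(keys):
    """True if every key is less than or equal to the one after it."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


def duplicated_keys(keys):
    """Return the keys that occur more than once, blank keys included."""
    counts = Counter(keys)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical key column with a repeated value and two blank entries.
keys = ["aaa", "bbb", "bbb", "", "", "ccc"]

print(is_sorted_ascending(keys))
print(duplicated_keys(keys))
print(is_sorted_ascending(sorted(keys)))
```

Here the blank string shows up in the duplicate list next to "bbb", which matches the remark that a blank cell counts as a valid string entry.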
In reply to this post by Rick Oliver-3
Right. The problem couldn't be trailing
blanks, but it could be nonprinting characters, such as tabs or the popular
French non-breaking space character, that look like spaces and differ
between the two files. If you really want to figure out what the
problem was, change the variable formats to AHEX36. Then you can see
the hexadecimal codes for those "blanks". A true blank would
be hex 20. In code page mode, a non-breaking space would be A0.
Regards,
Jon Peck
Senior Software Engineer, IBM
[hidden email]
new phone: 720-342-5621

From: Rick Oliver/Chicago/IBM@IBMUS
To: [hidden email]
Date: 06/29/2011 06:14 PM
Subject: Re: [SPSSX-L] match file with string variable
Sent by: "SPSSX(r) Discussion" <[hidden email]>
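The same kind of hex inspection can be done outside SPSS on the raw key values. A minimal Python sketch follows, assuming Latin-1 as a stand-in for code page mode; the function names and sample strings are hypothetical and are not taken from the files in question.

```python
def hex_codes(value: str, encoding: str = "latin-1") -> str:
    """Return the encoded bytes of value as space-separated hex pairs."""
    return " ".join(f"{byte:02X}" for byte in value.encode(encoding))


def suspicious_positions(value: str):
    """List (index, hex code) for whitespace that is not a plain blank."""
    return [
        (i, f"{ord(ch):02X}")
        for i, ch in enumerate(value)
        if ch.isspace() and ch != " "
    ]


plain = "abc "         # ends in a true blank, hex 20
nbsp = "abc\u00a0"     # ends in a non-breaking space, hex A0
tabbed = "abc\t"       # ends in a tab, hex 09

for sample in (plain, nbsp, tabbed):
    print(hex_codes(sample), suspicious_positions(sample))
```

All three samples look the same when printed as text, but only the first ends in 20; the other two are flagged at position 3.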