|
Administrator
|
Hi Mike,
Check: http://listserv.uga.edu/cgi-bin/wa?A2=ind9606&L=spssx-l&P=R9609&D=0
Put this in your pipe and smoke it !! ;-)
I posted that gem over 14 years ago ;-))))
You'll need to fiddle with it to fit your situation but the basic idea is applicable.
HTH, David

* General Parser *.
DATA LIST / X 1-80 (A).
BEGIN DATA
11-0101-423-7384
END DATA.
VECTOR NUMS(10).
COMPUTE #0=0.
LOOP.
COMPUTE #1=INDEX(X,'-').
COMPUTE #0=#0+1.
IF #1>0 NUMS(#0)=NUMBER(SUBSTR(X,1,#1-1),F8).
COMPUTE X=SUBSTR(X,#1+1).
END LOOP IF #1=0.
COMPUTE NUMS(#0)=NUMBER(X,F8).
MATCH FILES FILE * / DROP X.
LIST.

NUMS1 NUMS2 NUMS3 NUMS4 NUMS5 NUMS6 NUMS7 NUMS8 NUMS9 NUMS10
11.00 101.00 423.00 7384.00 .

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
|
I'm having trouble with a Merge from the GUI or the syntax generated.
I want to add some variables from another copy of the data. The other variables are the same and the cases are the same, so it should be a simple merge. The GUI shows what I'd expect - three additional variables. If I run the merge from the GUI I get a bunch of errors about temporary variables, and the file reverts to what I started with. When I paste the syntax I see a huge string with RENAME and DROP, and when I run it the errors look the same as from the GUI. The syntax generated is like this:

MATCH FILES /FILE=*
  /FILE='DataSet10'
  /RENAME (var1 var2 ... var1442 = d0 d1 ... d1442)
  /DROP = d0 d1 ... d1442.
EXECUTE.

The errors are all about undefined variable names.

When I run the following simple syntax instead, everything seems fine.

MATCH FILES /FILE=*
  /FILE='DataSet10'.
EXECUTE.

Is this a bug in SPSS? I'm running V16.0.1.

Thanks
Mike |
|
Mike
Looks like the GUI syntax renames all the variables and then drops them, so there are none left in the working file. I don't quite understand "the variables are the same", but your best bet is to stick to your simple syntax, (re)name the three you want from the other file first and then do the merge. SPSS will keep the duplicate variables from the first file named and add any different ones from the second. If you want them in a different order you can always use /KEEP <vars in order wanted>
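
A minimal sketch of that suggestion, for illustration only. The variable names (id, oldA, oldB, oldC, extra1 to extra3) are hypothetical and do not come from the actual files; RENAME applies to the file named just before it, and ending KEEP with ALL places any unnamed variables after the listed ones. Without BY this is a parallel match, so it assumes both files hold the same cases in the same order.

```spss
* Sketch only: variable names below are hypothetical placeholders.
MATCH FILES
  /FILE=*
  /FILE='DataSet10'
  /RENAME=(oldA oldB oldC = extra1 extra2 extra3)
  /KEEP=id extra1 extra2 extra3 ALL.
EXECUTE.
```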
|
|
In reply to this post by Mike Pritchard
If the variables are the same and the cases are the same, why do you want to merge? Perhaps you want to use UPDATE rather than MATCH FILES?

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mike Pritchard
Sent: Sunday, September 12, 2010 3:08 PM
To: [hidden email]
Subject: Merge problems |
|
Sorry for my lack of clarity. There is a subset of variables in the latest
working file (that has been modified through coding/recoding and labeling) that is also in the other file. The other file is an earlier version with a few additional variables that were dropped inadvertently from the working file when some operations - primarily SAVE with a different order - were done. So I needed to recover these variables. I don't think UPDATE would work in this case, and there is no need to rename as the variables are not duplicates.

Thanks for your help.
Mike |
|
Mike,
I don't know why you are having the troubles you are having. I'd like to suggest a different tack on the problem. You said 'There is a subset of variables in the latest working file (that has been modified through coding/recoding and labeling) that is also in the other file.' Let's call this fileA. You continue 'The other file is an earlier version with a few additional variables that were dropped inadvertently from the working file when some operations - primarily SAVE with a different order - were done.' Let's call this fileB.

As I understand you, you want to drop some variables from fileA and add those variables back in from fileB. Looking at the syntax in your first post, it seemed that there were variable name change issues, but that wasn't so clear to me.

This is clunky, not svelte at all. But, so what.

Get file=fileB/keep=id x1 to x23/rename (x1 to x23=y1 to y23).
Save outfile=fileB1.
Get file=fileA/drop=y1 to y23.
Match files file=*/file=fileB1/by id.

I noticed that in your posted syntax you had no 'by' subcommand. Perhaps that was omitted for clarity.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mike Pritchard
Sent: Monday, September 13, 2010 10:43 AM
To: [hidden email]
Subject: Re: Merge problems |
|
In reply to this post by Mike Pritchard
Mike
Do you still have the earlier file? If so you
might be better working forward from that again (if you still have the
syntax). I've just done a merge of three files (all cases common variables
in file1, half cases, but different variables in file2, other half of cases and
different variables again in file3) for someone with a similar problem
(first time in 20-odd years) and it actually worked first time! One common
variable appeared twice, but that was because it had a different spelling in one
file.
John
|
|
In reply to this post by Mike Pritchard
At 10:43 AM 9/13/2010, Mike Pritchard wrote:
> There is a subset of variables in the latest working file (that has been modified through coding/recoding and labeling) that is also in the other file. The other file is an earlier version with a

To start with (and it's not what you asked), you have no 'BY' clause in either of your MATCH FILES commands. From the Command Syntax Reference,

> .. If BY is not used, the program performs a parallel (sequential) match, combining the first case from each file, then the second case from each file, and so on, without regard to any identifying values that may be present.

So, one extra case, or one missing one, from either file, and your result can have values for some cases that belong with other cases altogether. Do you have any set of variables that can form a record key within your files? If so, use them.

But as to what you asked about, your syntax

MATCH FILES /FILE=* /FILE='DataSet10'.

works because (CSR again)

> If the same variable name is used in more than one input file, data are taken from the file specified first. Dictionary information is taken from the first file containing value labels, missing values, or a variable label for the common variable. If the first file has no such information, MATCH FILES checks the second file, and so on, seeking dictionary information.

So, for all the variables that appear in both files, you get the value from the active file ('FILE=*'). Fine, if that's what you want, but make sure it is what you want. (And, by the way, this syntax will blow up if any variables from the two files have the same name but are type-incompatible: that is, one numeric and one string, or two strings of different lengths. But your files don't have that problem.)

Now, as you write, the GUI generates syntax,

MATCH FILES /FILE=*

That's because the GUI's code-generating logic takes the premise that all variables (except key variables) are actually different between the files, and if any do have the same name, it's a conflict. So the GUI generates this awkward code to get rid of all variables in DataSet10 that also occur in the active file.

> If I run the merge from the GUI I get a bunch of errors about temporary variables. The errors are all about undefined variable names.

You'd have to give us a few examples of what variable names are 'undefined'. 1,442 variables is a very long RENAME list, but there's no documented limit on the number of variables to RENAME. It looks like there could be 1,442 source variables and 1,443 target variables; might that be true? Although the GUI's code-generator should be smart enough not to let that happen.

Anyway, go ahead and use your simple syntax. However, if I were doing this, I'd load the old file, keeping only key variables and the 'lost' variables I wanted to recover; sort both files by the set of key variables; and use something like (untested),

MATCH FILES /FILE=<newfile> /FILE=<oldfile> /BY <keyvars>.

> The other file is an earlier version with a few variables that were dropped inadvertently from the working file when some operations - primarily SAVE with a different order - were done.

One moral: always end a KEEP list with the keyword 'ALL', unless you're trying to drop some variables. That way, any variables you forget to name will still be there, at the end of the file. |
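
The recovery steps described above could be spelled out roughly as follows. This is an untested sketch under stated assumptions: the file names (old_version.sav, working.sav, working_reordered.sav), the key variable id, and the recovered variables lost1 to lost3 are all hypothetical stand-ins, and it relies on named datasets, which V16 provides.

```spss
* Sketch only: file names, id and lost1 to lost3 are hypothetical placeholders.
GET FILE='old_version.sav'
  /KEEP=id lost1 lost2 lost3.
SORT CASES BY id.
DATASET NAME OldVars.

GET FILE='working.sav'.
SORT CASES BY id.

* The working file is now the active dataset and OldVars stays open under its name.
MATCH FILES
  /FILE=*
  /FILE=OldVars
  /BY id.
EXECUTE.

* Ending the KEEP list with ALL keeps any variables not named explicitly.
SAVE OUTFILE='working_reordered.sav'
  /KEEP=id lost1 lost2 lost3 ALL.
```

Because the old file is trimmed to the key and the lost variables before merging, no variable names other than the key overlap, so nothing needs to be renamed or dropped.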
|
Thanks Richard for some useful additional explanations. Now that I’ve convinced myself that there was some kind of
bug in this version of SPSS, I’ll look closer at using Syntax without GUI
assistance for this kind of thing. I’ve had trouble when trying to use the GUI to get a
starting point for a merge with Keyed variables. I can’t seem to
select any variables from the active data set to move to the Keyed variables
window, only from the excluded variables. But hopefully (next time I need
this), I can just enter /BY with the keyed variables from the active set.
Or not… Cheers Mike |
|
At 07:41 PM 9/13/2010, Mike Pritchard wrote:
> I've had trouble when trying to use the GUI to get a starting point for a merge with Keyed variables. I can't seem to select any variables from the active data set to move to the Keyed variables window, only from the excluded variables.

Yeah. The GUI was probably set up that way on purpose. By their nature, key variables have to be in both input files. As I wrote, the GUI was constructed on the view that every variable present in both files is either a key variable or a conflict.

But it is disquieting to see all variables that occur in both files, including all that might be used as key variables, classified as "Excluded variables". (The actual meaning of that pane is "variable names found in both files", but probably the designers thought that would confuse people.)

> Hopefully (next time I need this), I can just enter /BY with the keyed variables from the active set.

Again, not from the GUI you can't. But I find the GUI of little help in setting up MATCH FILES or ADD FILES, anyway. (Do remember that, inherently, the key can consist only of variables found in both files -- or in all files, if you're merging more than two.)

> Now that I've convinced myself that there was some kind of bug in this version of SPSS, I'll look closer at using Syntax without GUI assistance for this kind of thing.

I'd do precisely that. But you'll find that SPSS won't regard this as a bug. As, indeed, it isn't. It's maybe a glitch: the program was, indeed, designed to work the way it does; but it's the result of a number of decisions, each individually reasonable, that led to an unfortunate result.

===================
APPENDIX: Test data
===================
DATA LIST FREE / a b c.
BEGIN DATA
1 7 9.1
2 8 9.2
3 9 9.3
END DATA.
DATASET NAME Left WINDOW=FRONT.
FORMATS a b (F2) c (F5.1).

DATA LIST FREE /a b d.
BEGIN DATA
1 4 7.1
2 5 7.2
3 6 7.3
END DATA.
FORMATS a b (F2) d (F5.1).
DATASET NAME Right WINDOW=FRONT.
DATASET ACTIVATE Left WINDOW=FRONT.
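
With the two test datasets above, a keyed merge written directly in syntax might look like the sketch below. It is illustrative rather than part of the original appendix, and it assumes a is the key: a occurs in both datasets and both are already in ascending order of a, so no SORT CASES is included.

```spss
* Sketch only: Left is the active dataset (*) and a is assumed to be the key.
MATCH FILES
  /FILE=*
  /FILE=Right
  /BY a.
LIST.
```

Since b exists in both datasets and Left is named first, the merged file should carry Left's values of b, alongside c from Left and d from Right.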
|
