|
can someone help me with this. for three variables (out of many variables that i have in my data set), i only want to keep every unique combination of these three variables once. so i want to remove any duplicates (not duplicates of my data set but duplicates where these three variables have the same result). thx.
|
|
"i only want to keep every unique combination of these three variables once."

What about doing a SELECT IF for cases where the three variables are equal (or the same) and then removing the dupes? That's probably what I'd try in the absence of any better syntax.

Mark

On Tue, Mar 18, 2008 at 9:17 AM, jimjohn <[hidden email]> wrote:
> View this message in context:
> http://www.nabble.com/removing-duplicates-in-spss-tp16121958p16121958.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command.
To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
Something like this will keep the first occurrence of each unique combination (you could use the LAST keyword to select the last occurrence):

SORT CASES BY var1 var2 var3.
MATCH FILES /FILE=* /BY var1 var2 var3 /FIRST=First.
SELECT IF First=1.

The only drawback is that the file must be sorted by the variables of interest, so "first" or "last" is only relative to the sorted order.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Mark Palmberg
Sent: Tuesday, March 18, 2008 9:31 AM
To: [hidden email]
Subject: Re: removing duplicates in spss |
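The drawback mentioned above (that "first" only means first in sorted order) can be softened by adding a tie-breaker to the sort, so that the case kept within each key combination is chosen on purpose. This is only a sketch under assumptions: var1 to var3 are the placeholder key names from the post, and record_date is a hypothetical variable standing in for whatever decides which duplicate you would rather keep.

```spss
* Sketch only, not tested against the poster's data.
* var1 to var3 are the key variables; record_date is a hypothetical tie-breaker.
* Sorting record_date descending puts the most recent case first within each key.
SORT CASES BY var1 (A) var2 (A) var3 (A) record_date (D).
MATCH FILES /FILE=* /BY var1 var2 var3 /FIRST=First.
SELECT IF First=1.
EXECUTE.
```

Because the tie-breaker comes after the three key variables in the sort, the file is still in ascending order on the /BY variables, which is what MATCH FILES needs. If two cases also tie on record_date, which one is kept is again left to the sort.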
|
In reply to this post by jimjohn
At 10:17 AM 3/18/2008, jimjohn wrote:
>can someone help me with this. for three variables (out of many variables that i have in my data set), i only want to keep every unique combination of these three variables once.

See Richard Oliver's logic for *how* to do it. (There are similar alternatives based on LAG, but with a multi-variable key, "/FIRST=" or "/LAST=" logic is easier.)

Now, *whether* to do it. Unless you're absolutely sure that the other values are all the same whenever the three key variables are the same (and it looks like you aren't), you're throwing away information. AND you're creating an output dataset that isn't determined by the input: that is, *which* record you keep isn't defined, but depends on the initial sort order of the records.

Dangerous. Make sure you have a better justification than that it's inconvenient to have duplicate key sets. |
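One way to act on the caution above is to look at the duplicates before deleting anything. The sketch below first counts how many cases share each key combination, then uses the LAG-based alternative mentioned in the post to flag repeats rather than drop them. The names n_in_key and dup are made up for illustration, and the LAG comparison assumes numeric key variables with no missing values (a missing key makes the comparison missing, so that case is not flagged).

```spss
* Sketch only. var1 to var3 are the key variables; n_in_key and dup are made-up names.
* Step 1: add a count of cases per key combination to every case.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=var1 var2 var3
  /n_in_key=N.
FREQUENCIES VARIABLES=n_in_key.

* Step 2: after sorting, flag cases whose key repeats the previous case's key.
SORT CASES BY var1 var2 var3.
COMPUTE dup = 0.
IF (var1 = LAG(var1) AND var2 = LAG(var2) AND var3 = LAG(var3)) dup = 1.
EXECUTE.

* Step 3: only once the flagged cases have been inspected and dropping them is justified.
SELECT IF dup = 0.
EXECUTE.
```

If the FREQUENCIES table shows only the value 1 for n_in_key, there is nothing to remove. Otherwise the cases with dup = 1 are the ones Step 3 would discard, so they can be compared with the retained case on the other variables first.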
|
In reply to this post by jimjohn
Although syntax for this has been posted, building syntax for this problem is what the Data>Identify Duplicate Cases dialog does, and it has lots of bells and whistles. It's a lot easier than hand rolling this.
HTH,
Jon Peck |
