Login  Register

Re: How to delete user-defined duplicate cases

Posted by David Marso on Jun 15, 2012; 3:01pm
URL: http://spssx-discussion.165.s1.nabble.com/How-to-delete-user-defined-duplicate-cases-tp5713684p5713691.html

Or how about a s(L)ick 2 liner ;-)
--
COMPUTE #DATE=MAX(ANY(1,$CASENUM,ID NE LAG(ID),
                                              CTIME.DAYS(DATE-#DATE) GT 30)*DATE,#DATE).
COMPUTE RETAIN=DATE=#DATE.
EXE.
R B wrote
Thank you, David and Gene. Will try out both sets of code. Much appreciated.

Best,

Ryan

On Fri, Jun 15, 2012 at 10:15 AM, David Marso <[hidden email]> wrote:

> Hi Ryan,
> Try this.
> DATA LIST LIST /ID (A5) Date (ADATE).
> BEGIN DATA
> 1G3k4  11/11/2009
> 1G3k4  12/06/2009
> 1G3k4  12/15/2009
> 1G3k4  12/19/2009
> 1G3k4  02/22/2010
> 5TRJ1  11/10/2009
> RQR12  11/10/2009
> END DATA.
> LIST.
> DO IF $CASENUM=1 OR ID NE LAG(ID).
> +  COMPUTE #DATE=DATE.
> +  COMPUTE RETAIN=1.
> ELSE.
> +  DO IF CTIME.DAYS(DATE-#DATE) LT 30.
> +    COMPUTE RETAIN=0.
> +  ELSE.
> +    COMPUTE RETAIN=1.
> +    COMPUTE #DATE=DATE.
> +  END IF.
> END IF.
> EXE.
> LIST.
>
> R B wrote
> >
> > Hello:
> >
> > Let me start with an illustration of my dataset:
> >
> > ID      Date
> > 1G3k4  11/11/2009
> > 1G3k4  12/06/2009
> > 1G3k4  12/15/2009
> > 1G3k4  12/19/2009
> > 1G3k4  02/22/2010
> > 5TRJ1  11/10/2009
> > RQR12  11/10/2009
> > .
> > .
> > .
> >
> > The variable type of "ID" is STRING and the variable type of "Date" is
> > DATE
> > with a format mm/dd/yyyy.
> >
> > I would like to remove "subsequent" cases that are within a 30-day period
> > of the first case associated with a particular ID in a given time period.
> > Let's take "1G3k4"...I would like the case entered on 11/11/2009 (first
> > time this ID appears in the dataset) to be retained, but the case entered
> > on 12/06/2009 should be deleted. Now that we've entered a new 30-day
> > period
> > for that same ID (on 12/15/2009), the first case entered on 12/15/2009
> > should be retained but the following case associated with that same ID on
> > 12/19/2009 sholud be deleted. The next case associated with ID "1G3k4"
> > does
> > not appear until 02/22/2010 and there are no subsequent cases within a
> > 30-day period associated with that ID, so that case is retained and
> > nothing
> > else needs to be done. Same goes with the other two example IDs provided.
> > Hope this step-by-step illustration is not convoluted.
> >
> > I've been playing about with the lag function, but I can't seem to get it
> > to do what I want. Any thoughts would be most appreciated. Apologies if
> > I've asked this question in the past.
> >
> > Thanks,
> >
> > Ryan
> >
>
>
> --
> View this message in context:
> http://spssx-discussion.1045642.n5.nabble.com/How-to-delete-user-defined-duplicate-cases-tp5713684p5713687.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> [hidden email] (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD
>
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"