Hello. I need to select cases from a large dataset based on case-specific criteria. Participants: time-stamp information about each participant observation. Some participants have more observations than others. Date_of_Observation: Time stamp for when each participant was 'recorded' Treatment: each participant received a treatment - at varying times. I would like to select the cases that correspond to the date Treatment was given and all cases following the Treatment. However, I have not been able to come up with the appropriate criteria since each case is unique. Example, for Participant #1 (below), I would like to select Row #2 (treatment) and Row #3 (initial post-treatment observation). For
Participant #2, I would select Row #5. And, for Participant #3, I would like to select Rows 8-10. Row ID Date_of_observation Treatment 1 1 11/01/2013 12:00 0 2 1 11/01/2013 13:00
1 3 1 11/02/2013 5:00 1 4 2 11/10/2013 12:00 0 5 2 11/10/2013 18:00 1 6 3 11/12/2013 2:00 0 7 3 11/12/2013 2:30 0 8 3 11/12/2013 3:00 1 9 3 11/12/2013 23:00 1 10 3 11/13/2013 6:00 1 I look forward to any suggestions you may have. |
This is equivalent to this recent question, http://spssx-discussion.1045642.n5.nabble.com/select-cases-td5723008.html#a5723013, and so the solutions would be exactly the same.
|
Administrator
|
In reply to this post by cla
For the sample data you show, once Treatment switches from 0 to 1, it remains 1 on all subsequent records for that ID. Is that always the case, or is it possible for someone to have 1 followed by 0 at a later date or time? If Treatment never goes back to 0 after having switched to 1, this is very simple:
SELECT IF Treatment. EXECUTE. But if some of the records after the first occurrence of Treatment=1 can have Treatment=0, then it is a bit more complicated, and you need a solution along the lines hinted at by Andy. I'm not sure it's "identical" to the earlier problem, but certainly related. Here's an example with one more ID added to your sample data. The added person has Treatment=0 records following the first Treatment=1 record, and those records are retained. NEW FILE. DATASET CLOSE all. DATA LIST list / Row ID (2f2.0) Date_of_observation(adate10) Obs_Time (time5) Treatment (f1). BEGIN DATA 1 1 11/01/2013 12:00 0 2 1 11/01/2013 13:00 1 3 1 11/02/2013 5:00 1 4 2 11/10/2013 12:00 0 5 2 11/10/2013 18:00 1 6 3 11/12/2013 2:00 0 7 3 11/12/2013 2:30 0 8 3 11/12/2013 3:00 1 9 3 11/12/2013 23:00 1 10 3 11/13/2013 6:00 1 11 4 11/12/2013 2:00 0 12 4 11/12/2013 2:30 1 13 4 11/12/2013 3:00 0 14 4 11/12/2013 23:00 0 15 4 11/13/2013 6:00 0 END DATA. * This assumes the cases are already sorted by ID, Date and Time. * If not, sort them. DO IF ($Casenum EQ 1) OR (ID NE LAG(ID)). /* First row for each ID. - COMPUTE # = Treatment. ELSE. /* Not the first row for an ID. - IF Treatment and NOT LAG(Treatment) # = 1. END IF. SELECT IF #. LIST. Output from LIST: Row ID Date_of_observation Obs_Time Treatment 2 1 11/01/2013 13:00 1 3 1 11/02/2013 5:00 1 5 2 11/10/2013 18:00 1 8 3 11/12/2013 3:00 1 9 3 11/12/2013 23:00 1 10 3 11/13/2013 6:00 1 12 4 11/12/2013 2:30 1 13 4 11/12/2013 3:00 0 14 4 11/12/2013 23:00 0 15 4 11/13/2013 6:00 0 Number of cases read: 10 Number of cases listed: 10 HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by Andy W
Yes. Thank you! However, the post you mention specifies all cases greater than 6 months. How can I set this criterion based on "GE 1 minute after previous time-stamp." In other words, I do not have a uniform time cutoff in which to code. Each participant's cutoff needs to reference their own unique 'pre-treatment' timestamp? Does this make sense? On Wednesday, November 13, 2013 9:51 AM, Andy W <[hidden email]>
wrote: This is equivalent to this recent question, http://spssx-discussion.1045642.n5.nabble.com/select-cases-td5723008.html#a5723013, and so the solutions would be exactly the same. ----- Andy W [hidden email] http://andrewpwheeler.wordpress.com/ -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/selecting-cases-based-on-case-specific-criteria-tp5723036p5723037.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
"Does this make sense?"
No it doesn't since you didn't specify anything about timestamp in your initial query regarding selection criteria. Take a step back and resubmit a more complete scenario. "Each participant's cutoff needs to reference their own unique 'pre-treatment' timestamp?" You completely lose me here! "GE 1 minute after previous time-stamp." DATEDIFF is documented in the FM. Take a look.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Thanks for your candor David. Please see below as I've tried to simplify/clarify the question: Question: How can I identify the time that a given observation occurred and all subsequent observations? Participants: time-stamp information about each observation (type = A and B). Some participants have more observations than others. Observations: There are two kinds, "A=treatment" and "B=non-treatment" and both of these can occur post-treatment. I would like to identify when Observation A was FIRST observed for each participant and all subsequent observations (both A and B).
Date_of_Observation:
Time stamp for when each participant was observed with A or B.
For the data shown below, I would like to find a way to: 1) identify the timestamp for when A was FIRST recorded and then 2) select all observations (which could be both A and B) after that time point. I can easily filter out Observation A . . . But, having a difficult time only filtering the Observation B's that follow that initial Observation A. Example: For Participant #3, I would like to flag Observation A at Row 7 (since it is the first time A was observed for that particular participant) and then select all cases for that same participant that occurs AFTER (i.e., after 11/12/2013 at 2:30pm). In this case, I would select rows 8-10. Thus, each participant's 'flag' or 'cutoff' references their own unique timestamp for when Observation A first occurred. Row ID Date_of_observation Observation
1 1 11/01/2013
12:00 B
2 1 11/01/2013
13:00 A 3 1 11/02/2013
5:00 B
4 2 11/10/2013
12:00 A
5 2 11/10/2013
18:00 B
6 3 11/12/2013 2:00 B
7 3 11/12/2013 2:30 A
8 3 11/12/2013 3:00 B
9 3 11/12/2013 23:00 A
10 3 11/13/2013 6:00 A I hope this makes things a bit more clear. Thank you in advance for your help.
On Wednesday, November 13, 2013 11:56 AM, David Marso <[hidden email]> wrote: "Does this make sense?" No it doesn't since you didn't specify anything about timestamp in your initial query regarding selection criteria. Take a step back and resubmit a more complete scenario. "Each participant's cutoff needs to reference their own unique 'pre-treatment' timestamp?" You completely lose me here! "GE 1 minute after previous time-stamp." DATEDIFF is documented in the FM. Take a look. cla wrote > Yes. Thank you! However, the post you mention specifies all cases greater > than 6 months. How can I set this criterion based on "GE 1 minute after > previous time-stamp."�> > > > In other words, I do not have a uniform time cutoff in which to code. Each > participant's cutoff needs to�reference their own unique 'pre-treatment' > timestamp? > > Does this make sense? > > > > On Wednesday, November 13, 2013 9:51 AM, Andy W < > apwheele@ > > wrote: > > This is equivalent to this recent question, > http://spssx-discussion.1045642.n5.nabble.com/select-cases-td5723008.html#a5723013, > and so the solutions would be exactly the same. > > > > ----- > Andy W > apwheele@ > http://andrewpwheeler.wordpress.com/ > -- > View this message in context: > http://spssx-discussion.1045642.n5.nabble.com/selecting-cases-based-on-case-specific-criteria-tp5723036p5723037.html > Sent from the SPSSX Discussion mailing list archive at Nabble.com. > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] > (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/selecting-cases-based-on-case-specific-criteria-tp5723036p5723045.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
To find the first instance of a treatment A make a new date variable that equals "date_of_observation" when the treatment is A, and some large date in the future when the treatment is B. Aggregate by ID taking the MIN of that new variable. This will be the first date of the treatment A per id, and with this you can flag any observation occurring on subsequent date-times. Example below.
**********************************************. data list list (,) / row id (2F1.0) date (ADATE10) time (TIME5) observation (A1). begin data 1,1,11/01/2013,12:00,B 2,1,11/01/2013,13:00,A 3,1,11/02/2013,5:00,B 4,2,11/10/2013,12:00,A 5,2,11/10/2013,18:00,B 6,3,11/12/2013,2:00,B 7,3,11/12/2013,2:30,A 8,3,11/12/2013,3:00,B 9,3,11/12/2013,23:00,A 10,3,11/13/2013 6:00,A end data. dataset name orig. compute date_of_observation = date + time. formats date_of_observation (DATETIME17). exe. *Make new variable, if not A make a future date. DO IF Observation = "A". COMPUTE TreatDate = date_of_observation. ELSE. COMPUTE TreatDate = DATE.MDY(12,30,2099). END IF. *Agg min of that new variable. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=id /TreatDateStart = MIN(TreatDate). FORMATS TreatDate TreatDateStart (DATETIME17). *Flag if past first treatment date. COMPUTE Flag = (date_of_observation >= TreatDateStart). EXECUTE. **********************************************. |
Administrator
|
Here is a variation on Andy's code (note it won't need to be edited after Dec 31, 2099 ;-)
Note that AGG ignores SYSMIS in dealing with stats, so no need to build future value for B. IF Observation = "A" TreatDate = date + time. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=id /TreatDateStart = MIN(TreatDate). *Flag if past first treatment date. COMPUTE Flag = (date+Time >= TreatDateStart). FORMATS TreatDate TreatDateStart (DATETIME17). LIST.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Andy, David. Wow. Thank you. It worked! On Wednesday, November 13, 2013 3:53 PM, David Marso <[hidden email]> wrote: Here is a variation on Andy's code (note it won't need to be edited after Dec 31, 2099 ;-) Note that AGG ignores SYSMIS in dealing with stats, so no need to build future value for B. IF Observation = "A" TreatDate = date + time. AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=id /TreatDateStart = MIN(TreatDate). *Flag if past first treatment date. COMPUTE Flag = (date+Time >= TreatDateStart). FORMATS TreatDate TreatDateStart (DATETIME17). LIST. Andy W wrote > To find the first instance of a treatment A make a new date variable that > equals "date_of_observation" when the treatment is A, and some large date > in the future when the treatment is B. Aggregate by ID taking the MIN of > that new variable. This will be the first date of the treatment A per id, > and with this you can flag any observation occurring on subsequent > date-times. Example below. > > **********************************************. > data list list (,) / row id (2F1.0) date (ADATE10) time (TIME5) > observation (A1). > begin data > 1,1,11/01/2013,12:00,B > 2,1,11/01/2013,13:00,A > 3,1,11/02/2013,5:00,B > 4,2,11/10/2013,12:00,A > 5,2,11/10/2013,18:00,B > 6,3,11/12/2013,2:00,B > 7,3,11/12/2013,2:30,A > 8,3,11/12/2013,3:00,B > 9,3,11/12/2013,23:00,A > 10,3,11/13/2013 6:00,A > end data. > dataset name orig. > compute date_of_observation = date + time. > formats date_of_observation (DATETIME17). > exe. > > > *Make new variable, if not A make a future date. > DO IF Observation = "A". > COMPUTE TreatDate = date_of_observation. > ELSE. > COMPUTE TreatDate = DATE.MDY(12,30,2099). > END IF. > *Agg min of that new variable. > AGGREGATE > /OUTFILE=* > MODE=ADDVARIABLES > /BREAK=id > /TreatDateStart = MIN(TreatDate). > FORMATS TreatDate TreatDateStart (DATETIME17). > *Flag if past first treatment date. > COMPUTE Flag = (date_of_observation >= TreatDateStart). > EXECUTE. > **********************************************. ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/selecting-cases-based-on-case-specific-criteria-tp5723036p5723051.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Free forum by Nabble | Edit this page |