Hi,
I am trying to select one case from groups of related cases in a big dataset containing hospital stays. For each stay there are several cases that describes parts of the hospital stay and one aggregated case that combines the parts into one case. All the cases contains the same variables. I am trying to select only the cases that describes the first part of a hospital stay. Here is an example syntax that illustrates the variables im working with: DATA LIST LIST / id stay in out level. BEGIN DATA. 1, 2, 5, 7, 1 2, 2, 7, 15, 1 3, 2, 5, 15, 2 4, 3, 12, 18, 1 5, 3, 18, 19, 1 6, 3, 19, 24, 1 7, 3, 12, 24, 2 END DATA. LIST. id = case id (unique for each case) stay = id that relates cases in aggregated stays in = start date (for part or aggregated stay) out = end date (for part or aggregated stay) level = 1: part of aggregated stay level = 2: combined aggregated stay I have no problem filtering out the level=2 cases, I include them only to describe my problem. What I'm looking for is a way to select only the cases with id = 1 and id =4. Can anyone help me? Best regards Lars N. |
Administrator
|
Look at MATCH FILES (FIRST function)
or LAG or SHIFT VALUES. Either of these will add to your joy!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Thank's, I will look into these functions. Best regards Lars N.
|
In reply to this post by David Marso
Hi again
I have tried to look in tho the MATCH FILES and shift values functions, but I still can't figure it out.
As far as I could understand the MATCH FILES is used to combine two or more datasets (have all my cases in one dataset). Regarding the SHIFT VALUES (LAG/LEAD), it seemed to me as they would create a new variable with the value from a chosen variable from the previous or next case (or n'th case). Maybe I just didn't get it? I might have been a but unclear about what I wanted in my original post, as I was in a hurry to leave the office when I posted it. What I want is to select only the one LEVEL=1 case from each unique STAY-id with the lowest IN value. In my example this is the case with ID=1 and ID=4. In my real data set the IN-value is a DATETIME variable, but I can't see that it will be any practical differences in method. Can anyone help me with a syntax (or describe the method) on how to do this based on my example? It would be of great help to me. Best regards Lars N. 13. mai 2014 kl. 15:57 skrev David Marso [via SPSSX Discussion] <[hidden email]>: Look at MATCH FILES (FIRST function) |
Administrator
|
In reply to this post by nessie
In a later post, you wrote:
"I have tried to look in tho the MATCH FILES and shift values functions, but I still can't figure it out. As far as I could understand the MATCH FILES is used to combine two or more datasets (have all my cases in one dataset)." MATCH FILES can indeed be used for merging multiple datasets. But it can also be used with a single dataset to flag the FIRST and LAST rows for a given variable or combination of variables. In your case, you want to flag and keep the FIRST record for each value of your variable STAY. For example... DATA LIST LIST / id stay in out level. BEGIN DATA. 1, 2, 5, 7, 1 2, 2, 7, 15, 1 3, 2, 5, 15, 2 4, 3, 12, 18, 1 5, 3, 18, 19, 1 6, 3, 19, 24, 1 7, 3, 12, 24, 2 END DATA. MATCH FILES file = * / BY stay / FIRST = KeepRecord. SELECT IF KeepRecord. LIST. OUTPUT: id stay in out level KeepRecord 1.00 2.00 5.00 7.00 1.00 1 4.00 3.00 12.00 18.00 1.00 1 Number of cases read: 2 Number of cases listed: 2 Alternatively, you could use the LAG function and the system variable $CASENUM. * Read in the sample data as before. SELECT IF ($casenum EQ 1) OR (stay NE LAG(stay)). LIST. OUTPUT: id stay in out level 1.00 2.00 5.00 7.00 1.00 4.00 3.00 12.00 18.00 1.00 Number of cases read: 2 Number of cases listed: 2 Note that both of these methods assume the data is sorted in such a way that the record you want to keep for each value of STAY is the first one. HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
In reply to this post by nessie
Match Files sets a flag 'first' for the first occurrence of a value of 'stay'. /LAST gives the last... See CSR (TFM)
/PR DATA LIST LIST / id stay in out level. BEGIN DATA. 1, 2, 5, 7, 1 2, 2, 7, 15, 1 3, 2, 5, 15, 2 4, 3, 12, 18, 1 5, 3, 18, 19, 1 6, 3, 19, 24, 1 7, 3, 12, 24, 2 END DATA. LIST. MATCH FILES FILE * /BY stay /FIRST first. EXECUTE. LIST. |
Thank's a lot to all tree you - you made my day :) Best regards Lars N. 13. mai 2014 kl. 19:33 skrev PRogman [via SPSSX Discussion] <[hidden email]>: Match Files sets a flag 'first' for the first occurrence of a value of 'stay'. /LAST gives the last... See CSR (TFM) |
You can also use: DATA LIST LIST
/ id stay in out level.
BEGIN DATA.
1, 2, 5, 7, 1
2, 2, 7, 15, 1
3, 2, 5, 15, 2
4, 3, 12, 18, 1
5, 3, 18, 19, 1
6, 3, 19, 24, 1
7, 3, 12, 24, 2
END DATA.
aggregate /outfile='test.sav' /presorted /break=stay /lowin=min(in) /caseid=first(id) . get file='test.sav' . From: SPSSX(r) Discussion [mailto:[hidden email]]
On Behalf Of nessie Thank's a lot to all tree you - you made my day :) Best regards Lars N. 13. mai 2014 kl. 19:33 skrev PRogman [via SPSSX Discussion] <[hidden email]>:
Match Files sets a flag 'first' for the first occurrence of a value of 'stay'. /LAST gives the last... See CSR (TFM)
DATA LIST LIST /
id stay in out level.
BEGIN DATA.
1, 2, 5, 7, 1
2, 2, 7, 15, 1
3, 2, 5, 15, 2
4, 3, 12, 18, 1
5, 3, 18, 19, 1
6, 3, 19, 24, 1
7, 3, 12, 24, 2
END DATA.
LIST.
MATCH FILES
FILE *
/BY stay
/FIRST first.
EXECUTE.
LIST.
If you reply to this email, your message will be added to the discussion below: To unsubscribe from Selecting one of several related cases,
click here. View this message in context:
Re: Selecting one of several related cases |
Administrator
|
You would then want to put all variables of interest on both sides of the FIRST function.
Might want to specify a DATASET rather than writing the disk file. DATASET DECLARE agg. AGGREGATE OUTFILE agg / BREAK stay / id in out level=FIRST(id in out level). DATASET ACTIVATE agg. I personally would go with Bruce's SELECT IF $CASENUM EQ 1 OR stay NE LAG(stay). (however one must be cautious when using SELECT IF with either $CASENUM or LAG).
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Free forum by Nabble | Edit this page |