Hi guys, this time I am working with a different database that has about 9000 unique patient IDs. Each patient visited the doctor many times, so for each patient I have about 40 different cases that vary in about 10 of 80 variables (variables like sex, gender, etc. remain the same, but things like diagnosis1, diagnosis2, diagnosis3, etc. vary from case to case).
What I want to do is group cases with the same unique ID so that I can analyze, for example, what proportion of patients saw a doctor for reason W. What I thought of was to use CASESTOVARS by ID, but that is going to create way too many variables for each patient (80*40 = 3200). Funnily enough, CASESTOVARS doesn't have a /KEEP subcommand, so I can't just select the few variables that do vary among different visits of the same patient, and with /DROP it is going to be a pain to type out something like 80 variables to be dropped. Any idea how I can do this more efficiently than what I came up with? Thanks! |
Administrator
|
Here's a general approach to get the proportion of patients who saw a doctor for reason W.
1. Compute a flag for the presence of reason W in your diagnosis variables. Use the ANY function for this, and set the flag to 1 if W is found, 0 if not.
2. Use AGGREGATE to get the max value of the flag for each unique ID.
3. Run FREQUENCIES on the max of the flag variable. (Use only the first case for each unique ID if you wrote the max flag value to the same data set in step 2.)
HTH.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
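For readers who want to try the three steps Bruce describes, here is a minimal sketch in SPSS syntax. It is not from the original post: the variable names (id, diagnosis1 to diagnosis10), the dataset name patient_flags, the flag names flagW and everW, and the code 'W' are all assumptions. It also assumes the diagnosis variables are adjacent string variables holding ICD-9 codes.

```spss
* Assumed names: id, diagnosis1 TO diagnosis10 (adjacent string variables).
* 'W' is a placeholder for the diagnosis code of interest.

* Step 1: flag each visit where reason W appears in any diagnosis variable.
COMPUTE flagW = ANY('W', diagnosis1 TO diagnosis10).

* Step 2: one row per patient, keeping the maximum of the visit-level flag.
DATASET DECLARE patient_flags.
AGGREGATE OUTFILE=patient_flags
  /BREAK=id
  /everW = MAX(flagW).

* Step 3: proportion of patients ever seen for reason W.
DATASET ACTIVATE patient_flags.
FREQUENCIES VARIABLES=everW.
```

Because the aggregated result goes to a separate dataset with one row per patient, FREQUENCIES counts each patient once, and the percentage for everW = 1 is the proportion of patients seen for reason W at least once.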
In reply to this post by devoidx
Have you looked at AGGREGATE?

DATASET DECLARE patient_file.
AGGREGATE OUTFILE = patient_file
  /BREAK = id sex gender bla1 bla2
  /diagnosis1 TO diagnosis80 = MAX(diagnosis1 TO diagnosis80).
DATASET ACTIVATE patient_file WINDOW = FRONT.
FREQUENCIES diagnosis1 TO diagnosis80.

If you have binary variables for diagnosis, your patient file will have a value of 1 for patients who have been seen for each condition, and 0 if they have not been treated.

Jim Marks
Sr Market Research Manager, National Market Research
Kaiser Foundation Health Plan of the Mid-Atlantic States, Inc. |
Thanks guys, I'll give the AGGREGATE strategy a try and report back.
|
In reply to this post by Jim Marks
This bit of your post is not so clear: "however each unique patient visited the doctor many times". You have 9000 patients who appear one or more times in your database. What are the 40 different cases about? And how do you get 3200 variables?
How did you go about solving your problem? With respect to Jim’s solution, a key question is whether Dx1 to Dx(n) are precoded categories (each variable stands for one fixed diagnosis) or whether Dx1 is the first recorded ICD code, Dx2 is the second, etc.
If it’s the first, Jim’s solution is perfect (although I assume that equal IDs imply the same demographic data, allowing for coding errors). If it’s the second, then one way is CASESTOVARS (and 3200 variables is not that many); the other way is to create a set of new variables such that NewDx1 is the most common diagnosis, NewDx2 is the second most common, etc. There are lots of possible diagnoses, so this leads to hundreds (thousands?) of NewDx(n) variables. You would then aggregate these NewDx(i) variables.
Gene Maguin
|
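One possible reading of Gene's NewDx idea is a set of 0/1 indicators, one per diagnosis of interest, which are then aggregated to the patient level as in Jim's solution. The sketch below is only an illustration under that reading. The two codes are placeholders borrowed from elsewhere in this thread, not real frequency rankings, and the names NewDx1, NewDx2, EverDx1, EverDx2 and patient_dx are assumptions, as is the layout of ten adjacent string diagnosis variables.

```spss
* Placeholder codes; substitute the ICD-9 codes you care about.
* Assumes diagnosis1 TO diagnosis10 are adjacent string variables.
DO REPEAT newdx = NewDx1 NewDx2
         /code  = '324.50' '456.31'.
  COMPUTE newdx = ANY(code, diagnosis1 TO diagnosis10).
END REPEAT.

* Collapse the visit-level indicators to one row per patient.
DATASET DECLARE patient_dx.
AGGREGATE OUTFILE=patient_dx
  /BREAK=id
  /EverDx1 EverDx2 = MAX(NewDx1 NewDx2).
```

With many diagnoses this gets unwieldy quickly, which is the drawback Gene points out.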
What I meant was that there are 9000 unique patients, and each patient can have many patient records (some have up to 100). So there can be 100 cases in the database for the same patient, varying in either the presenting diagnosis or the procedure that was done on that patient in the office.
I'm not sure what you mean by prerecorded categories. Each diagnosis variable contains ICD-9 codes, depending on the reasons the patient visited the office that particular time. If the patient came in with one complaint, only diagnosis1 will have a value; if the patient came in with multiple complaints, then other diagnosis variables will have values as well. The issue I had with CASESTOVARS was that there is no /KEEP subcommand, so I can't keep only the variables that do vary case by case for the same patient, and /DROP would mean typing out all the 80 variables that remain the same from case to case for the same patient, which seems inefficient. That is why I was looking for other ideas. |
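One common way around the missing /KEEP is to subset the variables before restructuring, rather than inside CASESTOVARS. The sketch below is an illustration, not something posted in the thread. The dataset name visits_slim and the variable range diagnosis1 TO diagnosis10 are assumptions; list whichever variables actually vary between visits.

```spss
* Work on a copy so the full visit-level file stays untouched.
DATASET COPY visits_slim.
DATASET ACTIVATE visits_slim.

* Keep the patient id plus only the variables that vary across visits.
ADD FILES FILE=* /KEEP=id diagnosis1 TO diagnosis10.
EXECUTE.

* CASESTOVARS expects the file to be sorted by the ID variable.
SORT CASES BY id.
CASESTOVARS /ID=id.
```

The wide file still grows with the number of visits per patient, so for a question like "what proportion of patients were seen for reason W" an aggregated flag is usually less work than a restructure.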
Administrator
|
In reply to this post by Bruce Weaver
I would probably blast the entire thing into long format, retaining ID and diagnosis codes. Then:

DATASET DECLARE id_diag.
AGGREGATE OUTFILE=id_diag /BREAK=ID diagnosis_code /ID_DiagCount=N.

This gives you one record per ID x diagnosis. You can then further aggregate, etc.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
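The "blast into long format" step is not spelled out in the post, so here is a hedged sketch of how it might look with VARSTOCASES, followed by the AGGREGATE step from the post. The dataset name long_visits and the variable range diagnosis1 TO diagnosis10 are assumptions, and the diagnosis variables are assumed to be adjacent and of the same type.

```spss
* VARSTOCASES rewrites the active dataset, so work on a copy.
DATASET COPY long_visits.
DATASET ACTIVATE long_visits.

* Stack the diagnosis columns: one row per visit per recorded diagnosis.
* NULL=DROP discards rows where the stacked value is blank or missing.
VARSTOCASES
  /MAKE diagnosis_code FROM diagnosis1 TO diagnosis10
  /KEEP=ID
  /NULL=DROP.

* One record per ID x diagnosis, with the number of times it was recorded.
DATASET DECLARE id_diag.
AGGREGATE OUTFILE=id_diag
  /BREAK=ID diagnosis_code
  /ID_DiagCount=N.
```

Because /KEEP names only ID, none of the other 80-odd variables need to be typed out.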
In reply to this post by devoidx
Prerecorded categories means Dx1 is ICD 324.50, Dx2 is ICD 456.31, etc. You don't have those, so forget that, but I wanted to check.
I'm curious what the max number of diagnoses is that any one person received at any one visit. The specific question you posed could be stated as: what is the frequency distribution of diagnoses by unique people? So what matters in the dataset is patient ID and Dx1 to Dx(n). For what it matters, use the DELETE VARIABLES command to get rid of unwanted variables. I'm guessing that the list of possible diagnoses that one or more people have received might be quite long, unless this is for a highly focused practice, and even then it might not be so short. Rather than CASESTOVARS, which I advocated earlier, I now think VARSTOCASES, which has a DROP subcommand, would be better (are you out there, Richard?), followed by AGGREGATE breaking on patient ID and diagnosis, and then either FREQUENCIES on the diagnosis variable or a second AGGREGATE breaking on diagnosis followed by a LIST command to print the diagnosis-count pairs for each diagnosis.
Gene Maguin |
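The last two steps Gene mentions could be sketched as follows. This assumes a dataset named id_diag already exists with one row per patient and diagnosis (variables ID and diagnosis_code), as produced by the AGGREGATE step David posted. The names dx_counts and n_patients are assumptions made for the example.

```spss
* Assumes id_diag has one row per patient and diagnosis.
DATASET ACTIVATE id_diag.

* Option 1: frequency table, each patient counted once per diagnosis.
FREQUENCIES VARIABLES=diagnosis_code /FORMAT=DFREQ.

* Option 2: a second AGGREGATE, then LIST the diagnosis-count pairs.
DATASET DECLARE dx_counts.
AGGREGATE OUTFILE=dx_counts
  /BREAK=diagnosis_code
  /n_patients=N.
DATASET ACTIVATE dx_counts.
SORT CASES BY n_patients (D).
LIST VARIABLES=diagnosis_code n_patients.
```

Note that the percentages FREQUENCIES prints here are relative to the number of patient-diagnosis rows, not the number of patients. To get the proportion of patients with a given diagnosis, divide n_patients by the number of unique patient IDs.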
Administrator
|
In reply to this post by David Marso
Good call. This would get them all done in one relatively easy step.
--
Bruce Weaver |