|
HI SPSSers,
I've got a file that I thought I could use CASESTOVARS to restructure, but now I'm not so sure. This is a simplified version of what I have now (patient ID, sequential number of each pt's phone call, call date, diagnostic code, most recent date of diagnosis, and alphanumeric code for the cardiac risk factor represented by diagnosis). Each diagnosis is an observation:

Pt_ID   callnum  calldt   ICD9   visit_dt  newlabel
10001   1        1/5/05   401.9  12/4/04   HTN
10001   1        1/5/05   305.1  12/4/04   SMO
10001   2        2/1/06   272.0  5/5/05    HLIP
10001   2        2/1/06   401.9  5/5/05    HTN
10001   2        2/1/06   305.1  5/5/05    SMO
10002*  1        9/28/04
10003   1        1/7/05   272.4  2/1/03    HTN
10003   2        7/9/05   250.0  3/12/05   DIAB
10003   2        7/9/05   272.4  3/1/05    HTN

*this pt had a call but no risk factors or associated visit dates.

This is where I want to go - turn each call into an observation, and create 4 new variables for the 4 risk factors:

Pt_ID   callnum  calldt   visit_dt  RF_1  RF_2  RF_3  RF_4
10001   1        1/5/05   12/4/04   HTN   SMO
10001   2        2/1/06   5/5/05    HLIP  HTN   SMO
10002   1        9/28/04
10003   1        7/7/05   2/1/03    HTN
10003   2        7/9/05   3/12/05   DIAB  HTN

Actually I will have to create new variables for each of the diagnoses and their associated visit dates....just didn't have room to cram all those variables into one line in this message.

So what is tripping me up with CASESTOVARS is how to use the ID subcommand when I need to group the data by, first Pt_ID number and then the call number for that Pt_ID. Can't get a handle on that one. Or is CASESTOVARS not the strategy to use?

Thanks for pointing me in the right direction...

Tanya |
|
At 06:18 PM 6/7/2007, Tanya Temkin wrote:
>I've got a file that I thought I could use CASESTOVARS to restructure.
>This is a simplified version:
>. patient ID,
>. sequential number of each pt's phone call,
>. call date,
>. diagnostic code,
>. most recent date of diagnosis, and
>. alphanumeric code for the cardiac risk factor represented by
>diagnosis).
>
>Each diagnosis is an observation:
>Pt_ID callnum calldt ICD9 visit_dt newlabel
>10001 1 1/5/05 401.9 12/4/04 HTN
>10001 1 1/5/05 305.1 12/4/04 SMO
>10001 2 2/1/06 272.0 5/5/05 HLIP
>10001 2 2/1/06 401.9 5/5/05 HTN
>10001 2 2/1/06 305.1 5/5/05 SMO
>10002* 1 9/28/04
>10003 1 1/7/05 272.4 2/1/03 HTN
>10003 2 7/9/05 250.0 3/12/05 DIAB
>10003 2 7/9/05 272.4 3/1/05 HTN
>
>*this pt had a call but no risk factors or associated visit dates.
>
>I want to turn each call into an observation, and create 4 new
>variables for the 4 risk factors:
>
>Pt_ID callnum calldt visit_dt RF_1 RF_2 RF_3 RF_4
>10001 1 1/5/05 12/4/04 HTN SMO
>10001 2 2/1/06 5/5/05 HLIP HTN SMO
>10002 1 9/28/04
>10003 1 7/7/05 2/1/03 HTN
>10003 2 7/9/05 3/12/05 DIAB HTN
>
>So what is tripping me up with CASESTOVARS is [...]

I had considerably more trouble than I expected. Below is a solution, in SPSS 15 draft output. Note step II., to get one record for each visit with all the variables simply carried over.

I wanted a much simpler solution with just CASESTOVARS, like this:

CASESTOVARS
   /ID        = Pt_ID callnum
   /RENAME    = newlabel=RF
   /SEPARATOR = '_'
   /GROUPBY   = VARIABLE
   /DROP      = ICD9
   /AUTOFIX   = YES.

What defeats this is that it won't treat 'visit_dt' as a fixed variable, even though it IS fixed over all the records of a visit (counting 'missing' as a value). Can anybody do better?

Before the solution: do you want what you asked for? You get (ignoring the variables that are simply carried over),

Pt_ID callnum RF_1 RF_2 RF_3

10001    1    HTN  SMO
10001    2    HLIP HTN  SMO
10002    1
10003    1    HTN
10003    2    DIAB HTN

Risk factors 'HTN' and 'SMO' appear in different columns in different cases, as would other risk factors.
It would make, say, calculating the frequency of occurrence of 'SMO' quite difficult. Would an alternative form be better, with a variable for each possible risk factor you're considering, with value 'Present' or 'Absent' for each visit?

>Actually I will have to create new variables for each of the diagnoses
>and their associated visit dates....just didn't have room to cram all
>those variables into one line in this message.

Does this mean you want to have new variables for the ICD9 codes as well, i.e. include those in the CASESTOVARS the way 'newlabel' is? Or something going beyond that? Well, for another day...

Anyway, here goes:
================================
Solution (SPSS 15 draft output).
It uses datasets (SPSS 14 and 15). For earlier releases, scratch files would work, with considerable reworking of DATASET commands into file-handling commands.
================================
|-----------------------------|---------------------------|
|Output Created               |07-JUN-2007 22:58:29       |
|-----------------------------|---------------------------|
[Original]

Pt_ID callnum     calldt ICD9    visit_dt newlabel

10001    1    01/05/2005 401.9 12/04/2004 HTN
10001    1    01/05/2005 305.1 12/04/2004 SMO
10001    2    02/01/2006 272.0 05/05/2005 HLIP
10001    2    02/01/2006 401.9 05/05/2005 HTN
10001    2    02/01/2006 305.1 05/05/2005 SMO
10002    1    09/28/2004                .
10003    1    01/07/2005 272.4 02/01/2003 HTN
10003    2    07/09/2005 250.0 03/12/2005 DIAB
10003    2    07/09/2005 272.4 03/01/2005 HTN

Number of cases read:  9    Number of cases listed:  9

*  I.   Restructure, ignoring variables to be kept but not  ....... .
*       active in the restructuring.                        ....... .

ADD FILES
   /FILE = Original
   /KEEP = Pt_ID callnum newlabel.
DATASET NAME NewLabl WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |07-JUN-2007 22:58:30       |
|-----------------------------|---------------------------|
[NewLabl]

Pt_ID callnum newlabel

10001    1    HTN
10001    1    SMO
10001    2    HLIP
10001    2    HTN
10001    2    SMO
10002    1
10003    1    HTN
10003    2    DIAB
10003    2    HTN

Number of cases read:  9    Number of cases listed:  9

SORT CASES BY Pt_ID callnum .
CASESTOVARS
   /ID        = Pt_ID callnum
   /RENAME    = newlabel=RF
   /SEPARATOR = '_'
   /GROUPBY   = VARIABLE .

Cases to Variables
|----------------------------|---------------------------|
|Output Created              |07-JUN-2007 22:58:30       |
|----------------------------|---------------------------|
[NewLabl]

Generated Variables
|----------|------|
|Original  |Result|
|Variable  |------|
|          |Name  |
|--------|-|------|
|newlabel|1|RF_1  |
|        |2|RF_2  |
|        |3|RF_3  |
|--------|-|------|

Processing Statistics
|---------------|---|
|Cases In       |9  |
|Cases Out      |5  |
|---------------|---|
|Cases In/Cases |1.8|
|Out            |   |
|---------------|---|
|Variables In   |3  |
|Variables Out  |5  |
|---------------|---|
|Index Values   |3  |
|---------------|---|

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |07-JUN-2007 22:58:30       |
|-----------------------------|---------------------------|
[NewLabl]

Pt_ID callnum RF_1 RF_2 RF_3

10001    1    HTN  SMO
10001    2    HLIP HTN  SMO
10002    1
10003    1    HTN
10003    2    DIAB HTN

Number of cases read:  5    Number of cases listed:  5

*  II.  Create one record per visit, with only identifiers  ....... .
*       and those variables not active in restructuring     ....... .

DATASET ACTIVATE Original WINDOW=FRONT.
DATASET DECLARE  FixedVar WINDOW=MINIMIZED.
MISSING VALUES ICD9(' ').
AGGREGATE OUTFILE =FixedVar
   /BREAK   =Pt_ID callnum
   /calldt  =FIRST(calldt)
   /visit_dt=FIRST(visit_dt).

DATASET ACTIVATE FixedVar WINDOW=FRONT.
LIST.
List
|-----------------------------|---------------------------|
|Output Created               |07-JUN-2007 22:58:31       |
|-----------------------------|---------------------------|
[FixedVar]

Pt_ID callnum     calldt   visit_dt

10001    1    01/05/2005 12/04/2004
10001    2    02/01/2006 05/05/2005
10002    1    09/28/2004          .
10003    1    01/07/2005 02/01/2003
10003    2    07/09/2005 03/12/2005

Number of cases read:  5    Number of cases listed:  5

*  III. Combine to one record per visit, with variables     ....... .
*       not active in restructuring and those restructured  ....... .

MATCH FILES
   /FILE=FixedVar
   /FILE=NewLabl
   /BY Pt_ID callnum.
DATASET NAME Final WINDOW=FRONT.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |07-JUN-2007 22:58:32       |
|-----------------------------|---------------------------|
[Final]

Pt_ID callnum     calldt   visit_dt RF_1 RF_2 RF_3

10001    1    01/05/2005 12/04/2004 HTN  SMO
10001    2    02/01/2006 05/05/2005 HLIP HTN  SMO
10002    1    09/28/2004          .
10003    1    01/07/2005 02/01/2003 HTN
10003    2    07/09/2005 03/12/2005 DIAB HTN

Number of cases read:  5    Number of cases listed:  5

===================
APPENDIX: Test data
===================
* ................................................................. .
* .................   Test data               ..................... .
DATA LIST LIST SKIP=1/
   Pt_ID callnum calldt ICD9 visit_dt newlabel
   ( F5  F2      ADATE10 A5  ADATE10  A6).
BEGIN DATA
Pt_ID callnum calldt ICD9 visit_dt newlabel
10001 1 1/5/05 401.9 12/4/04 HTN
10001 1 1/5/05 305.1 12/4/04 SMO
10001 2 2/1/06 272.0 5/5/05 HLIP
10001 2 2/1/06 401.9 5/5/05 HTN
10001 2 2/1/06 305.1 5/5/05 SMO
10002 1 9/28/04
10003 1 1/7/05 272.4 2/1/03 HTN
10003 2 7/9/05 250.0 3/12/05 DIAB
10003 2 7/9/05 272.4 3/1/05 HTN
END DATA.

* .................   Post after this point   ..................... .
* ................................................................. .
DATASET NAME Original WINDOW=FRONT.
LIST. |
|
Hi,
I read that you're using ICD 9. Do you happen to have ICD-9-CM in an SPSS-friendly format? I currently only have a pdf version and I'd be very happy to have an xls, dbf, sav or whatever version. Or perhaps you could point me to some web address? I found one Thai website, but I realized my thai had become kinda dusty ;-)

Cheers, and thanks in advance,
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Did you know that 87.166253% of all statistics claim a precision of results that is not justified by the method employed? [HELMUT RICHTER]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
In reply to this post by Tanya Temkin
Use this--BUT NOTE: Pt_ID=10003 callnum=2 has 2 different visit_dts listed. This works as you would expect IF these two dates are made identical first. You would also need to rename newlabel to 'RF' to get the varnames you indicate in the expected file.

If you want to keep the visit information together, change GROUPBY to 'INDEX'; that will give ICD9_1, RF_1, ICD9_2, RF_2, ... instead of ICD9_1, ICD9_2, ICD9_3, ICD9_4, RF_1, ... etc., as the syntax below will do.

CASESTOVARS
   /ID = Pt_ID callnum
   /FIXED=CALLDT VISIT_DT
   /GROUPBY = VARIABLE .

Melissa |
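Melissa's note about Pt_ID=10003, callnum=2 suggests checking, before restructuring, which calls carry more than one visit date. The following is only a minimal sketch of such a check, assuming the long file is the dataset named Original from Richard's listing. The dataset name DateCheck and the variables first_dt and last_dt are made up for illustration and appear nowhere in the thread.

```spss
* Hypothetical check: list calls whose records disagree on visit_dt.
* Assumes dataset Original (one record per diagnosis) exists.
DATASET ACTIVATE Original.
DATASET DECLARE  DateCheck.
AGGREGATE OUTFILE = DateCheck
   /BREAK    = Pt_ID callnum
   /first_dt = MIN(visit_dt)
   /last_dt  = MAX(visit_dt).
DATASET ACTIVATE DateCheck.
* Calls with no visit date compare as missing and are not selected.
SELECT IF (first_dt NE last_dt).
LIST.
```

With the test data as first posted, this should flag only Pt_ID 10003, callnum 2 (3/12/05 versus 3/1/05). An empty listing would suggest visit_dt can safely be treated as fixed within a call.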
|
At 09:32 AM 6/8/2007, Melissa Ives wrote:
>Use this--BUT NOTE: Pt_ID=10003 callnum=2 has 2 different visit_dts
>listed. This works as you would expect IF these two dates are made
>identical first.

THANK you, Melissa. So THAT is what I missed.

So, Tanya, here's the much simpler solution. CASESTOVARS is fine, after all. SPSS 15 draft output (WRR:not saved separately):

List
|-----------------------------|---------------------------|
|Output Created               |08-JUN-2007 11:44:47       |
|-----------------------------|---------------------------|
[Original]

Pt_ID callnum     calldt ICD9    visit_dt newlabel

10001    1    01/05/2005 401.9 12/04/2004 HTN
10001    1    01/05/2005 305.1 12/04/2004 SMO
10001    2    02/01/2006 272.0 05/05/2005 HLIP
10001    2    02/01/2006 401.9 05/05/2005 HTN
10001    2    02/01/2006 305.1 05/05/2005 SMO
10002    1    09/28/2004                .
10003    1    01/07/2005 272.4 02/01/2003 HTN
10003    2    07/09/2005 250.0 03/12/2005 DIAB
10003    2    07/09/2005 272.4 03/12/2005 HTN

Number of cases read:  9    Number of cases listed:  9

CASESTOVARS
   /ID        = Pt_ID callnum
   /RENAME    = newlabel=RF
   /SEPARATOR = '_'
   /GROUPBY   = VARIABLE
   /DROP      = ICD9
   /AUTOFIX   = YES.

Cases to Variables
|-----------------------------|---------------------------|
|Output Created               |08-JUN-2007 11:44:47       |
|-----------------------------|---------------------------|
[Original]

Generated Variables
|----------|------|
|Original  |Result|
|Variable  |------|
|          |Name  |
|--------|-|------|
|newlabel|1|RF_1  |
|        |2|RF_2  |
|        |3|RF_3  |
|--------|-|------|

Processing Statistics
|---------------|---|
|Cases In       |9  |
|Cases Out      |5  |
|---------------|---|
|Cases In/Cases |1.8|
|Out            |   |
|---------------|---|
|Variables In   |6  |
|Variables Out  |7  |
|---------------|---|
|Index Values   |3  |
|---------------|---|

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |08-JUN-2007 11:44:47       |
|-----------------------------|---------------------------|
[Original]

Pt_ID callnum     calldt   visit_dt RF_1 RF_2 RF_3

10001    1    01/05/2005 12/04/2004 HTN  SMO
10001    2    02/01/2006 05/05/2005 HLIP HTN  SMO
10002    1    09/28/2004          .
10003    1    01/07/2005 02/01/2003 HTN
10003    2    07/09/2005 03/12/2005 DIAB HTN

Number of cases read:  5    Number of cases listed:  5

===================
APPENDIX: Test data
===================
As before, but changed "3/1/05" to "3/12/05" in last line.

DATA LIST LIST SKIP=1/
   Pt_ID callnum calldt ICD9 visit_dt newlabel
   ( F5  F2      ADATE10 A5  ADATE10  A6).
BEGIN DATA
Pt_ID callnum calldt ICD9 visit_dt newlabel
10001 1 1/5/05 401.9 12/4/04 HTN
10001 1 1/5/05 305.1 12/4/04 SMO
10001 2 2/1/06 272.0 5/5/05 HLIP
10001 2 2/1/06 401.9 5/5/05 HTN
10001 2 2/1/06 305.1 5/5/05 SMO
10002 1 9/28/04
10003 1 1/7/05 272.4 2/1/03 HTN
10003 2 7/9/05 250.0 3/12/05 DIAB
10003 2 7/9/05 272.4 3/12/05 HTN
END DATA. |
|
Thanks Melissa and Richard. I haven't applied your solution yet but am almost set to go - have to do a few other steps before I am ready to restructure.

Ideally I would keep the different dates in the visit_dt field for pt_ID 10003, since people often got different diagnoses on different dates. But if that can't be done (at least not thru CASESTOVARS) I guess I will just retain the "long" file as a reference file to key dates to diagnoses, and use the restructured file to indicate what risk factors are associated with each call (that'll suffice for me to set up dichotomous dummy variables for each risk factor.)

Tanya |
|
In reply to this post by Tanya Temkin
It can be done: just leave visit_dt out of the /FIXED subcommand.
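A minimal sketch of what that might look like, assuming the long file from the thread is the active dataset. Combining Melissa's /FIXED and GROUPBY=INDEX suggestions with Richard's /RENAME and /SEPARATOR is an assumption here, not something either poster ran. So is /AUTOFIX = NO, added so that visit_dt is spread across columns even in files where it happens not to vary within any call.

```spss
* Sketch only: visit_dt is left out of FIXED, so it is restructured .
* together with ICD9 and the risk-factor label, one set per diagnosis .
SORT CASES BY Pt_ID callnum.
CASESTOVARS
   /ID        = Pt_ID callnum
   /FIXED     = calldt
   /AUTOFIX   = NO
   /RENAME    = newlabel=RF
   /SEPARATOR = '_'
   /GROUPBY   = INDEX.
LIST.
```

With GROUPBY=INDEX the result should keep each diagnosis together with its visit date and label, with names along the lines of ICD9_1, visit_dt_1, RF_1, ICD9_2, and so on. The exact names and order are worth confirming in the Generated Variables table that CASESTOVARS prints.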
Melissa |
|
In reply to this post by Tanya Temkin
At 12:43 PM 6/8/2007, Tanya Temkin wrote:
>Ideally I would keep the different dates in the visit_dt field for
>pt_ID 10003, since people often got different diagnoses on different
>dates. But if that can't be done (at least not thru CASESTOVARS) I
>guess I will just retain the "long" file as a reference file to key
>dates to diagnoses, and use the restructured file to indicate what
>risk factors are associated with each call (that'll suffice for me to
>set up dichotomous dummy variables for each risk factor.)

Now we're getting into something deeper: not how to do it, but what you mean to do. What, in short, your data says, and what it means. This is a topic about organizing data that goes under the heading "normalization" in database circles.

Melissa and I both took it that your structure is,

* You have a set of *patients*, with a unique identifier, Pt_ID. You probably have some information that's particular to a patient (name, address, date of birth, ...), but since none of that's in the file you posted, it doesn't come up now.

* A patient may have any number of *calls*. Calls are identified by 'callnum' within each patient; the unique identifier for a call is the combination of Pt_ID and callnum.
. Each call has a 'calldt' and a 'visit_dt' (though the latter may be missing); at least, both Melissa and I thought so. THAT'S the crucial question: does the visit date belong to the 'call', or to the diagnosis within the 'call'? CAN one call have more than one visit date?

* In a call, any number of *diagnoses* may be reached. Each is identified by its 'ICD9' value, and has a 'newlabel' risk-factor code which is a recode of the ICD9 value.

Now, you write "people often got different diagnoses on different dates". I'm sure they do; but do they get different diagnoses on different visit dates *that belong to the same call*? That is, does 'visit_dt' belong to the call (so the combination of Pt_ID and callnum should have only one 'visit_dt'), or to the diagnosis (so the combination of Pt_ID and callnum may have as many visit dates as there are diagnoses)?

What Melissa suggests will work fine for the second case (many visit dates for one call). What we both suggested previously works for the first case (many calls, only one visit date per call). In that first case, a diagnosis may still be given on many dates, but on dates associated with different 'calls'.

Finally, you write that you will

>use the restructured file to indicate what risk factors are associated
>with each call (that'll suffice for me to set up dichotomous dummy
>variables for each risk factor.)

That's what I was wondering about. You can do that from the restructured file, but it may be easier to do it directly from the 'long' file. What's been your thinking about that?

-Onward, and good luck,
Richard |
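A minimal sketch of that last point, building the dummy variables directly from the 'long' file rather than from the restructured one. It assumes the long file is the active dataset with the variable names from the test data, and that 'newlabel' takes only the four codes HTN, SMO, HLIP and DIAB; the f_* names and the resulting indicator names are made up for illustration.

```spss
* Flag each diagnosis record (1 = this risk factor, 0 = not).
COMPUTE f_HTN  = (newlabel = 'HTN').
COMPUTE f_SMO  = (newlabel = 'SMO').
COMPUTE f_HLIP = (newlabel = 'HLIP').
COMPUTE f_DIAB = (newlabel = 'DIAB').

* Collapse to one record per call, where MAX is 1 if any record had the flag.
AGGREGATE OUTFILE = *
  /BREAK = Pt_ID callnum calldt
  /HTN  = MAX(f_HTN)
  /SMO  = MAX(f_SMO)
  /HLIP = MAX(f_HLIP)
  /DIAB = MAX(f_DIAB).
```

Because 'calldt' is constant within a call, putting it on /BREAK simply carries it along. A call with no diagnoses (blank 'newlabel') should come out with all four indicators equal to 0, so it stays in the file.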
|
In reply to this post by Albert-Jan Roskam
At 07:34 AM 6/8/2007, Albert-jan Roskam wrote, to me and the list:
>I read that you're using ICD 9. Do you happen to have ICD-9-CM in an >SPSS-friendly format? I currently only have a pdf version and I'd be >very happy to have an xls, dbf, sav or whatever version. I don't, Albert-jan. Remember, here I'm responding to Tanya Temkin about *her* problem that involves ICD-9 codes; I'm not using them myself. Why don't you post this as a new thread, something like "SPSS-readable version of ICD-9-CM"? -Regards, Richard |
|
Hi everybody,
(I also posted this in another thread earlier this week.) I was wondering if somebody knows where to get an SPSS-readable version of ICD-9-CM (International Classification of Diseases). In particular I'm interested in the medical procedures. Currently I only have a pdf, which is not so practical to work with. The only thing I could find was a dbf with a readme written in Thai language. Not my strongest point. ;-)

Thank you in advance!

Cheers!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Did you know that 87.166253% of all statistics claim a precision of results that is not justified by the method employed? [HELMUT RICHTER]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
Ingenix ( http://www.ingenix.com/Products/Hospitals/CodiComReimMgtHOSP/ ) sells the codes on a disk. The last time I purchased them they were in a flat field format. You could bring them into Excel and then to SPSS. Or someone out there has them in a SQL database in their practice management software and they could export them for you.

Remember that both the ICD-9 codes and CPT codes are copyright protected. So you may want to |
|
In reply to this post by Richard Ristow
I think this may clarify the problem (thanks, Tanya). It's a long posting, I'm afraid.

At 07:59 PM 6/12/2007, Tanya wrote off-list (twice-quoted text is mine):

>>* A patient may have any number of *calls*. Calls are identified by
>>'callnum' within each patient; the unique identifier for a call is
>>the combination of Pt_ID and callnum.
>
>Right.
>
>>. Each call has a 'calldt' and a 'visit_dt' (though the latter may be
>>missing); at least, both Melissa and I thought so. THAT'S the crucial
>>question: does the visit date belong to the 'call', or to the
>>diagnosis within the 'call'? CAN one call have more than one visit
>>date?
>
>We are looking at what risk factors were known to exist at time of
>call - that is, risk factors that had been diagnosed on previous
>visits.

THAT'S the one that I, and I think Melissa as well, missed entirely. I think we both tackled your question a little too quickly, and purely on its terms as you posted it. I took it for granted that there was a call, and a visit to follow up the call. The clue we (or at least I) missed was that the visit dates in your test data are *earlier* than the call dates.

>So the visit date is associated with the diagnosis, but each call can
>have more than one prior visit date at which a risk factor diagnosis
>was made. For example, a patient may have six different
>diabetes-related diagnoses, with each diabetic complication
>"discovered" on different visits.
>
>(The "big picture" context for all this is a cohort study that aims to
>identify demographic and clinical predictors of undiagnosed coronary
>artery disease among persons calling a managed care organization's
>advice line with complaints of apparent cardiac chest pain.)

Ah: "persons calling a managed care organization's advice line". So that's what a 'call' means, distinct from a 'visit'. (I assume a 'visit' is a medical office visit in the usual sense.)

>>* In a call, any number of *diagnoses* may be reached. Each is
>>identified by its 'ICD9' value, and has a 'newlabel' risk-factor code
>>which is a recode of the ICD9 value.

Which, I now see, was simply wrong.

>The number of diagnoses per call is only limited by the finite number
>of diagnoses we are including as indicators of the major cardiac risk
>factors of smoking, hypertension, hyperlipidemia, and diabetes.

Right; the same missed point. Diagnoses pertaining to a call are those arrived at in visits *preceding* the call. As you say:

>If the person got diagnoses A and B prior to call 1, those diagnoses
>(and the dates of visits where those diagnoses were made) would be
>linked to that call.

Now, I conjecture that, since the same diagnosis may easily be reached several times, the visit date is the date of the *last visit before the call on which the diagnosis was reached*. Or is it the *last visit overall* on which the diagnosis was reached?

>So I'll excise those "extra" diagnoses

That is, all but the latest?

> - sort the file by Pt_ID, call number, risk factor code, and then
> date of diagnosis (all in ascending order). Then merge the file to
> itself on those key variables, use LAST subcommand to flag the most
> recent diagnosis date for each risk factor, for each call, and use
> SELECT IF to retain only those flagged observations. For each call,
> what I *should* end up with is up to four associated risk factor
> values, each in a separate observation - each with most recent date
> of visit at which a diagnosis associated with that risk factor was
> given.

Good. Good luck to you, and post again (which is preferable to asking off-list) if you have any difficulties.

>(I realize that I'd better give the people without any risk factors a
>risk factor code value of "none" or something so they stay in the
>file...)

Or, if you just merge back with the original 'call' records, that should take care of it.

>THEN, I can restructure the file via CASESTOVARS per your and
>Melissa's able instructions.

Good enough. To get the form you originally requested, you probably need to /DROP both 'ICD9' and 'visit_dt' on your CASESTOVARS. OR, since you asked earlier (so I've triple-quoted it),

>>>I would keep the different dates in the visit_dt field for pt_ID
>>>10003, since people often got different diagnoses on different
>>>dates. But if that can't be done (at least not thru CASESTOVARS) ...

It can, precisely as Melissa suggested: "It can be done, just leave Visit_dt out of the /FIXED command." If you do this, try /GROUPBY=INDEX on your CASESTOVARS; you may find the result easier to read.

I've written about using an indicator variable for each (possible) risk factor, instead of the structure you requested and that Melissa and I gave the solution for. But that's your judgement.

-Good luck and good analysis,
Richard |
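The two steps discussed in this post (keep only the latest diagnosis date per risk factor per call, then restructure) might be sketched as below. This is untested draft syntax, not anyone's posted solution: it assumes the long file is the active dataset with the test-data variable names, and the flag name 'latest' is made up. The CASESTOVARS subcommands follow the ones posted earlier in the thread, with /GROUPBY=INDEX swapped in and 'visit_dt' left undropped so that each risk factor keeps its own date.

```spss
* Keep only the most recent visit date per call and risk factor.
SORT CASES BY Pt_ID callnum newlabel visit_dt.
MATCH FILES /FILE = *
  /BY Pt_ID callnum newlabel
  /LAST = latest.
SELECT IF (latest = 1).

* One record per call, with each risk factor next to its own visit date.
CASESTOVARS
  /ID = Pt_ID callnum
  /RENAME = newlabel=RF
  /SEPARATOR = '_'
  /GROUPBY = INDEX
  /DROP = ICD9 latest
  /AUTOFIX = YES.
```

A call with no risk factors is a single record with a blank 'newlabel', so it is its own last record and should survive the SELECT IF. To get the originally requested form instead, add 'visit_dt' to the /DROP list.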
|
In reply to this post by Albert-Jan Roskam
Hello Albert-jan,
Please have a look here: http://www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp

If you scroll way down you'll find an ASCII version of ICD codes with the major description. Hope this helps.

Ken |
|
Hi Ken!
This is EXACTLY what I was looking for! *THANKS* a lot! I really appreciate it!

Best wishes,
Albert-Jan |
