insurance claims risk adjustment takes forever

mpirritano

Hello all,

I’m trying to run an insurance claims risk adjustment program, but by my calculations it could take several thousand hours to run.

The program takes the more than 16,000 ICD9 codes and puts diagnoses into groups. My data has 8 diagnoses per claim, and I have almost 3 million claims.

My machine is a dual-core Pentium with 2 GB of RAM running Windows XP, and I am using the Python plugin.

Does it make sense that it could take so long? Someone asked me whether SPSS runs from memory, compared to SAS, which runs from disk. Could this be part of the issue?

Bottom line: does this type of analysis sound possible with SPSS?

Thanks,

Matt

Matthew Pirritano, Ph.D.
Research Analyst IV
Medical Services Initiative (MSI)
Orange County Health Care Agency
(714) 568-5648

Re: insurance claims risk adjustment takes forever

Maguin, Eugene
Matt,

I can't comment on SPSS vs. SAS or on your computer. It seems to me, though,
that your dataset plus the required computations is at a point where the
computational algorithm matters, maybe a great deal. I'm pretty sure there
are others on the list who have experience with big datasets and can comment
better than I can. I'm wondering whether you have the most efficient
algorithm for the required operations. Have you tested alternatives and, if
so, was there enough difference to matter? And, if you'd care to, I'd be
interested to hear a description of your computational algorithm.

Gene Maguin


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: insurance claims risk adjustment takes forever

mpirritano
Thanks, Gene.

My syntax looks like this.

Create the 99 numeric variables that are the diagnostic groups.

Run DO REPEAT over the 8 diagnosis variables:

DO REPEAT diagnosis = diagnosis1 TO diagnosis8.
        (the 16,000+ IF statements that group the 8 diagnoses by ICD9 code,
        each of the form IF (diagnosis = XXXXX) GROUPx = 1.)
END REPEAT.

If the diagnosis codes are missing after these IF statements, then set
GROUPx to zero.

If more efficient code could speed this up, that'd be great! Is there an
alternative to the 16,000+ IF statements? Is there a way to put the recodes
in another file that the syntax draws from? The program was originally
written in SAS and used a macro to bring in these recodes.
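
A rough count of the work this approach implies can be sketched in Python. It uses only the round figures quoted in this thread and assumes that every IF statement is evaluated for every diagnosis of every claim; it does not estimate actual run time on any machine.

```python
# Round figures taken from this thread; all are approximations.
n_claims = 3_000_000       # "almost 3 million claims"
n_diagnoses = 8            # diagnosis1 to diagnosis8
n_if_statements = 16_000   # "16,000+ IF statements"

# Inside DO REPEAT, each IF is evaluated for each diagnosis of each claim.
comparisons = n_claims * n_diagnoses * n_if_statements
print(f"{comparisons:,} comparisons")   # 384,000,000,000 comparisons

# With a single lookup per diagnosis, the count no longer depends on
# how many ICD9 codes there are.
lookups = n_claims * n_diagnoses
print(f"{lookups:,} lookups")           # 24,000,000 lookups
```

The gap between the two counts is why the choice of algorithm, rather than the hardware, is the first thing to look at.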

Thanks
Matt

Re: insurance claims risk adjustment takes forever

Albert-Jan Roskam
Hi,

What if you apply the recodes to the ICD code file (the one with just 16,000 records), create the grouping variable there, and then match the resulting file to the 3M-record file with a table lookup? That means one match per diagnosis, for a total of eight matches. I'm not 100% sure, but it would be great if you could do that without any EXECUTEs between the matches.
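
The table-lookup idea can be illustrated with a minimal Python sketch. It is an analogue of the approach, not SPSS syntax: the dictionary stands in for the recoded ICD code file, and the codes, group numbers, and the names `code_to_group` and `group_flags` are hypothetical.

```python
# Hypothetical stand-in for the recoded ICD code file: code -> group number.
# These codes and group numbers are made up for illustration only.
code_to_group = {"25000": 12, "25001": 12, "4019": 31}

N_GROUPS = 99  # the thread mentions 99 diagnostic groups


def group_flags(diagnoses, lookup=code_to_group, n_groups=N_GROUPS):
    """Return a list of 0/1 flags, one per group, for a single claim."""
    flags = [0] * n_groups
    for code in diagnoses:           # one lookup per diagnosis field
        group = lookup.get(code)     # None for blank or unmapped codes
        if group is not None:
            flags[group - 1] = 1     # groups are numbered from 1
    return flags


claim = ["25000", "4019", "", "", "", "", "", ""]  # 8 diagnosis fields
flags = group_flags(claim)
print([i + 1 for i, f in enumerate(flags) if f])   # [12, 31]
```

Each claim costs eight lookups regardless of how many codes the table holds, which is the point of matching against the small file instead of testing every code in turn.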

Btw, I did something *very* similar to what you're trying to do a few years ago with a 100M-record dataset, and although it took a long time, it didn't take anywhere near "several thousand" hours. 2000 hours = 83 days. Are you serious about this?

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the face of ambiguity, refuse the temptation to guess.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Re: insurance claims risk adjustment takes forever

Richard Ristow
At 11:56 AM 3/29/2010, Pirritano, Matthew wrote:

>My syntax looks like this.
>
>Create the 99 numeric variables that are the diagnostic groups.
>Run do repeat:
>
>Do repeat diagnosis = diagnosis1 to diagnosis8.
>         THE 16,000+ IF STATEMENTS TO GROUP THE 8 DIAGNOSES BASED ON ICD9 CODES.
>         IF DIAGNOSIS = XXXXX THEN GROUPx = 1.
>End repeat.
>
>If more efficient code could speed this up that'd be great! Maybe an
>alternative to the 16,000+ if statements?

RECODE is very fast, far faster than an IF chain. I've no idea
whether you'd need 16,000 RECODE clauses, but I doubt it; surely, you
have ranges of ICD9 codes that all map into the same diagnostic category.
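
To show how ranges shrink the problem, here is a minimal Python sketch of a range-based lookup. It is not RECODE itself, only the same idea; it assumes codes can be compared as integers, and the ranges, group numbers, and the name `group_for` are placeholders, not the real risk adjustment groupings.

```python
from bisect import bisect_right

# Hypothetical ranges: (first code, last code, group). Real groupings would
# come from the risk adjustment specification, not from this sketch.
RANGES = [
    (1, 139, 1),
    (140, 239, 2),
    (240, 279, 3),
]
STARTS = [lo for lo, _, _ in RANGES]  # sorted range starts for binary search


def group_for(code):
    """Return the group whose range contains code, or 0 if none does."""
    i = bisect_right(STARTS, code) - 1    # last range starting at or before code
    if i >= 0 and code <= RANGES[i][1]:
        return RANGES[i][2]
    return 0


print(group_for(250))  # 3
print(group_for(999))  # 0
```

Three range entries here cover 279 individual codes, so the number of clauses tracks the number of ranges rather than the number of codes.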

Re: insurance claims risk adjustment takes forever

statisticsdoc
Matthew,

In keeping with Richard's suggestion, it might help to write out the
RECODE statements for each diagnosis, rather than using DO REPEAT. My
experience with ICD9 codes is that you need one RECODE statement for
each broader diagnostic category. Once you have written the code for
one diagnosis, a global replace in a text editor will let you produce
the code for the second diagnosis, and so forth. I may be mistaken, but
I believe that running the RECODE commands without going through the
DO REPEAT loop will save you time.
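
The global-replace step can also be scripted. The Python sketch below only generates the text of one RECODE command per diagnosis variable. The ranges, the numeric coding of the diagnosis variables, and the target names `dxgroup1` to `dxgroup8` are assumptions for illustration, and the result is one group number per diagnosis rather than the 99 flags of the original program.

```python
# Hypothetical template for one diagnosis variable. The ranges and the
# target variable name dxgroup{n} are placeholders, not the real grouping.
TEMPLATE = (
    "RECODE diagnosis{n} (1 THRU 139 = 1) (140 THRU 239 = 2) "
    "(ELSE = 0) INTO dxgroup{n}."
)


def build_syntax(n_diagnoses=8):
    """Return one RECODE command per diagnosis variable, without DO REPEAT."""
    return [TEMPLATE.format(n=n) for n in range(1, n_diagnoses + 1)]


commands = build_syntax()
print(commands[0])
print(len(commands))  # 8
```

The generated lines can be pasted into a syntax file, so the grouping is written once in the template and repeated mechanically for the other seven variables.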

HTH,

Steve Brand

www.StatisticsDoc.com
