Computing a new variable by selecting on 100's of different diagnosis codes

mcgannmary
Hi,

I am working with a large medical claims file and need to compute a new
variable that represents patients with one of over 300 different diagnosis
codes.  I would normally use the wizard to compute a new variable and enter
the individual diagnosis codes that meet the criteria for inclusion.
However, with hundreds of codes, it will take a very long time to enter them
and there will likely be a lot of typos.

Is there an efficient and accurate way to tell SPSS to select cases that
have one of over 300 codes without  having to type them in in the syntax
editor or the wizard?

Thank you!
Mary



--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Computing a new variable by selecting on 100's of different diagnosis codes

David Marso
Administrator
Where are these target diagnosis codes stored?  
Are they in an electronic format or in a hard copy? [If not electronic you
will have to type them].
Is the criteria variable a single field or a set of fields?  
How many cases in the file?

THESE ARE THE TYPE OF THINGS YOU SHOULD POST IN INITIAL QUERIES.

General solution:

The codes are in a sorted SPSS dataset DSCritCode with a single variable
CritCode.
The raw data are in a dataset Raw, sorted on a unique ID variable [IDvar].
The criteria variables are a set of contiguous fields CV1 TO CV10.
Length of strings? CritCode and CV1 TO CV10 must have identical lengths.

UNTESTED off the top of my head.

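DATASET ACTIVATE Raw.
DATASET COPY CopyRaw.
DATASET ACTIVATE CopyRaw.
* One row per ID per code. IDvar already exists, so keep it rather than create it with /ID.
VARSTOCASES /MAKE CritCode FROM CV1 TO CV10 /KEEP=IDvar.
SORT CASES BY CritCode.
* @FLAG@ is 1 where the code is found in the criteria table, 0 otherwise.
MATCH FILES /FILE=* /TABLE=DSCritCode /IN=@FLAG@ /BY CritCode.
* Collapse back to one row per ID, keeping the largest flag value.
AGGREGATE OUTFILE=* /BREAK=IDvar /@FLAG@=MAX(@FLAG@).
* Attach the per-ID flag to the original data.
MATCH FILES /FILE=Raw /FILE=* /BY IDvar.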

Active file will have a new variable @FLAG@ with a 1 for cases with one or
more of the values of interest.
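
If the target codes already exist in electronic form, they do not have to be typed at all. A minimal sketch, assuming a plain text file with one code per line starting in column 1; the file path and the width of 7 are hypothetical and should be changed to match your file and the declared width of CV1 TO CV10:

```spss
* Hypothetical path and width (7 characters); match the width of CV1 TO CV10.
DATA LIST FILE='C:\data\critcodes.txt' FIXED /CritCode 1-7 (A).
SORT CASES BY CritCode.
DATASET NAME DSCritCode.
```

This would produce the sorted DSCritCode dataset that the solution above expects.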



-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Computing a new variable by selecting on 100's of different diagnosis codes

Jon Peck
In reply to this post by mcgannmary
Here is an example that matches the codes in one command.  Any keys with no match in the lookup table will be sysmis.  No sorting or restructuring of the dataset required.

* The lookup table.
data list free/ value(F8.0) akey(A1).
begin data
10 'a'
20 'b'
100 'z'
end data.
dataset name lookup.

* The main dataset.
data list free/x(f8.0) y(A2).
begin data
1 'a'
2 'b'
5 'a '
10 ''
1 'b'
end data.
dataset name main.
dataset activate main.
* This command creates the result code by looking up the key in the lookup table.
spssinc trans result = resultcodealpha
/initial "extendedTransforms.vlookup('akey', 'value', 'lookup')"
/formula func(y).

list.

       x y  resultcodealpha

       1 a            10.00
       2 b            20.00
       5 a            10.00
      10                  .
       1 b            20.00

The spssinc trans extension command can be installed via the Extensions > Extension Hub menu if you don't already have it.
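
For the original question, what matters is whether a code is on the list, not the value that was looked up. A minimal sketch, assuming the example above has been run so that resultcodealpha exists; the variable name onlist is hypothetical:

```spss
* onlist (hypothetical name) is 1 if the key was found in the lookup table, else 0.
compute onlist = not sysmis(resultcodealpha).
execute.
```

With real data, the lookup table would hold the 300+ diagnosis codes as keys, and any nonmissing value would serve as the result.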




--
Jon K Peck
[hidden email]


Re: Computing a new variable by selecting on 100's of different diagnosis codes

Mary E Mcgann

Thank you, Jon. This is perfect!

Mary

 


Re: Computing a new variable by selecting on 100's of different diagnosis codes

Art Kendall
In reply to this post by Jon Peck
Jon,

Perhaps set a default before the SPSSINC line, something like this:

*initialize to a user missing code.
numeric ResultCodeAlpha (f2).
compute ResultCodeAlpha = -1.

missing values ResultCodeAlpha (low thru -1).
value labels ResultCodeAlpha
-1 'has codes but no code on list'
. . .

If there are input cases with all blank codes should that be treated as a
different kind of user missing?



-----
Art Kendall
Social Research Consultants

Re: Computing a new variable by selecting on 100's of different diagnosis codes

Jon Peck
That would have no effect on the result, because the initial value would be overwritten.  All the code knows is that the lookup did not find a value.  A user could prepare the lookup table with a specific user-missing value if they know which values will be looked up, but it would be easier to post-process the lookup results.


Re: Computing a new variable by selecting on 100's of different diagnosis codes

Art Kendall
I should have thought through the process a little more. Post-processing
makes more sense.

My main idea was that the user knows why the result is missing, so sysmis
should be replaced with a code that records the reason:

either
there were valid codes but none matched the list
or
there were no valid codes.

In many instances the reasoning about missingness would need to take that
into account.
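
The post-processing discussed here could be sketched as follows, assuming the variables from Jon's example (y is the key, resultcodealpha is the lookup result). The codes -1 and -2 and their labels are arbitrary choices for illustration, not something the thread specifies:

```spss
* Replace sysmis with a user-missing code that records why the lookup failed.
* -2 = the key was blank; -1 = a key was present but is not in the lookup table.
do if sysmis(resultcodealpha) and y = ''.
  compute resultcodealpha = -2.
else if sysmis(resultcodealpha).
  compute resultcodealpha = -1.
end if.
missing values resultcodealpha (-2, -1).
value labels resultcodealpha
  -1 'has a code, but not on the list'
  -2 'no code recorded'.
execute.
```

Because this runs after the lookup, it is not overwritten the way a default set before the SPSSINC TRANS command would be.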


