anything like LIKE in SPSS?

anything like LIKE in SPSS?

Tanya Temkin
Hi to all,

I need advice on how to deal with identifying "duplicate" values when
strings are very close but not identical. A good wildcard function would
be helpful, but I don't know of any.

This is a mockup of my data (much simplified) that logs calls regarding
patient complaints:

Pt_No   calldate   SnP   Complaint
1       12/07/04    TSR  SPRAINS-JOINT INJURY & BROKEN BONES
1       12/07/04    RN   SPRAINS/JOINT INJURY/BROKEN BONES
1       12/12/04    TSR  COUGH/COLD
2       3/05/05     TSR  HEAD INJURY
2       3/05/05     RN   HEAD INJURY/TRAUMA
2       3/12/05     TSR  DIAGNOSTIC TEST:RESULTS
2       3/12/05     RN   DIAGNOSTIC TEST:  RESULTS
2       9/01/05     TSR  CAST PROBLEMS
3       7/28/04     TSR  DIZZY/VERTIGO/FAINTING
3       7/28/04     RN   DIZZINESS/VERTIGO/SYNCOPE

Sometimes a teleservice representative (TSR) handles the call on her own;
often she transfers it to a registered nurse (RN). The patient may talk to
the TSR and RN about the same problem. However, as you can see, the text
strings the TSRs and RNs use to document the same patient complaint may
differ - extra spaces, slightly different wording or punctuation....

So much for my plans to use the LAG function to flag TSR and RN handling
of the same complaint on the same call!

I've looked at past postings on use of INDEX and SCAN but I'm not sure
that's what I need here... my SAS-using colleagues tell me that SAS has a
LIKE function that can identify similar words or phrases in different
string values. That sounds like what I need. Is there anything like this
in SPSS?

Oh yes, I'm using V 14. Only base system.


Thanks in advance.

Tanya Temkin
Research Associate
AACC Reporting
Northern California Regional Office
The Permanente Medical Group
(510) 625-6680
TIE 8-428-6680

NOTICE TO RECIPIENT:  If you are not the intended recipient of this
e-mail, you are prohibited from sharing, copying, or otherwise using or
disclosing its contents.  If you have received this e-mail in error,
please notify the sender immediately by reply e-mail and permanently
delete this e-mail and any attachments without reading, forwarding or
saving them.  Thank you.

Re: anything like in SPSS?

Peck, Jon
SPSS does not have a built-in wildcard function, but programmability (optional and free with SPSS Base) provides a powerful regular expression facility that can be used for this purpose.  You have to figure out what patterns to look for, but the regular expression (re) language is very expressive.

In SPSS 14, you have to create a new text file and merge it back to your data.  With SPSS 15 you can do this directly.
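As a rough illustration of the regular expression idea, here is a minimal sketch in plain Python, assuming the programmability route means Python and its standard `re` module. It runs on the mock data from the original post as a list of tuples and does not show the step of reading from or writing back to SPSS. The `normalize`, `same_complaint`, and `flag_pairs` names and the prefix rule are assumptions made for this example, not anything from the post.

```python
import re

# Mock call log from the original post: (Pt_No, calldate, SnP, Complaint).
calls = [
    (1, "12/07/04", "TSR", "SPRAINS-JOINT INJURY & BROKEN BONES"),
    (1, "12/07/04", "RN", "SPRAINS/JOINT INJURY/BROKEN BONES"),
    (1, "12/12/04", "TSR", "COUGH/COLD"),
    (2, "3/05/05", "TSR", "HEAD INJURY"),
    (2, "3/05/05", "RN", "HEAD INJURY/TRAUMA"),
    (2, "3/12/05", "TSR", "DIAGNOSTIC TEST:RESULTS"),
    (2, "3/12/05", "RN", "DIAGNOSTIC TEST:  RESULTS"),
    (2, "9/01/05", "TSR", "CAST PROBLEMS"),
    (3, "7/28/04", "TSR", "DIZZY/VERTIGO/FAINTING"),
    (3, "7/28/04", "RN", "DIZZINESS/VERTIGO/SYNCOPE"),
]


def normalize(text):
    """Collapse every run of non-alphanumeric characters into one space."""
    return re.sub(r"[^A-Z0-9]+", " ", text.upper()).strip()


def same_complaint(a, b):
    """Assumed rule: equal after normalizing, or one is a prefix of the other."""
    a, b = normalize(a), normalize(b)
    return a.startswith(b) or b.startswith(a)


def flag_pairs(rows):
    """Yield (Pt_No, calldate) where a TSR row is directly followed by a
    matching RN row for the same patient and date (rows assumed sorted)."""
    for prev, cur in zip(rows, rows[1:]):
        if (prev[:2] == cur[:2] and prev[2] == "TSR" and cur[2] == "RN"
                and same_complaint(prev[3], cur[3])):
            yield prev[0], prev[1]


flagged = list(flag_pairs(calls))
print(flagged)
```

With these rules the sprain, head injury, and diagnostic test pairs are flagged. The DIZZY/DIZZINESS pair is not, because the wording itself differs; cases like that would need a synonym table or some fuzzier comparison on top of the pattern cleanup.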

-Jon Peck
SPSS

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Tanya Temkin
Sent: Thursday, April 26, 2007 2:52 PM
To: [hidden email]
Subject: [SPSSX-L] anything like LIKE in SPSS?


Re: anything like LIKE in SPSS?

meljr
In reply to this post by Tanya Temkin
Tanya, if both the RN and the TSR have a unique set of complaint descriptions, I would pick one and set up the other to match it.
Example of recoding a TSR description to the matching RN description:
if (Complaint = "SPRAINS-JOINT INJURY & BROKEN BONES") Complaint = "SPRAINS/JOINT INJURY/BROKEN BONES".

I would not usually do this by hand unless I only have a few descriptions.
What I would do is put the matching columns in an Excel spreadsheet and write the syntax around them. Then I copy it into the SPSS syntax. If you use Excel, you will have to play around with it a bit to get it right.
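The same "write the syntax around the pairs" step can also be scripted instead of done in Excel. The following is only a sketch in Python: the `variants` table, the `quote` helper, and `if_statements` are names made up for this example, and only the first pair comes from the post (the others are read off the mock data and may not be the pairs you would actually want to merge).

```python
# Hypothetical lookup table: variant wording -> the wording to keep.
variants = {
    "SPRAINS-JOINT INJURY & BROKEN BONES": "SPRAINS/JOINT INJURY/BROKEN BONES",
    "HEAD INJURY": "HEAD INJURY/TRAUMA",
    "DIAGNOSTIC TEST:RESULTS": "DIAGNOSTIC TEST:  RESULTS",
    "DIZZY/VERTIGO/FAINTING": "DIZZINESS/VERTIGO/SYNCOPE",
}


def quote(value):
    """Wrap a value in double quotes, doubling any embedded double quote
    (assumed here to be the quoting rule the syntax expects)."""
    return '"' + value.replace('"', '""') + '"'


def if_statements(mapping, variable="Complaint"):
    """Build one IF command per pair, in the same shape as the example above."""
    return [
        "if ({0} = {1}) {0} = {2}.".format(variable, quote(old), quote(new))
        for old, new in mapping.items()
    ]


syntax = "\n".join(if_statements(variants))
print(syntax)
```

The printed lines can then be pasted into a syntax window, the same way as the spreadsheet-built version.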

Good luck!
meljr


Re: anything like LIKE in SPSS?

John Painter
Hello,

There is also a LIKE function available in SQL. Do you have a database
program (e.g., Access) available?
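To show what SQL's LIKE does with this kind of data, here is a small sketch using Python's built-in `sqlite3` module rather than Access. The table and column names are assumptions for the example. In standard SQL, `%` matches any run of characters and `_` matches exactly one; Access may use different wildcard characters depending on its query mode, so check its documentation before copying the pattern.

```python
import sqlite3

# In-memory database holding a few rows of the mock data from the original post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (pt_no INTEGER, calldate TEXT, snp TEXT, complaint TEXT)"
)
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?)",
    [
        (1, "12/07/04", "TSR", "SPRAINS-JOINT INJURY & BROKEN BONES"),
        (1, "12/07/04", "RN", "SPRAINS/JOINT INJURY/BROKEN BONES"),
        (1, "12/12/04", "TSR", "COUGH/COLD"),
        (3, "7/28/04", "TSR", "DIZZY/VERTIGO/FAINTING"),
        (3, "7/28/04", "RN", "DIZZINESS/VERTIGO/SYNCOPE"),
    ],
)

# "_" stands in for the single "-" or "/", "%" for " & " or "/".
sprains = conn.execute(
    "SELECT pt_no, snp, complaint FROM calls "
    "WHERE complaint LIKE ? ORDER BY rowid",
    ("SPRAINS_JOINT INJURY%BROKEN BONES",),
).fetchall()

# A shared stem followed by "%" catches both DIZZY and DIZZINESS.
dizzy = conn.execute(
    "SELECT pt_no, snp, complaint FROM calls "
    "WHERE complaint LIKE ? ORDER BY rowid",
    ("DIZZ%",),
).fetchall()

print(sprains)
print(dizzy)
```

Note that LIKE still needs a pattern written per complaint; it matches a string against a wildcard pattern and does not by itself decide that two arbitrary strings are "similar".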

Best,

John


Re: anything like LIKE in SPSS?

Melissa Ives
In reply to this post by Tanya Temkin
Look into the Index function--you can select based on a subset of items--e.g.
Index(Complaint,"SPRAIN") or Index(Complaint,"DIZZ") would get what you want.
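For readers who want to try the substring idea outside SPSS, the sketch below mimics it in Python. The `index` helper is written to behave like INDEX as used above (1-based position of the substring, 0 when absent); the `stems` table and `complaint_group` are hypothetical names for this example, and the stems would have to be chosen by looking at the real complaint list.

```python
def index(haystack, needle):
    """Mimic INDEX: 1-based position of needle in haystack, 0 if absent."""
    return haystack.find(needle) + 1


# Hypothetical keyword stems, one per complaint group.
stems = {
    "SPRAIN": "sprain",
    "DIZZ": "dizzy",
    "HEAD INJ": "head injury",
    "DIAGNOSTIC": "diagnostic test",
}


def complaint_group(complaint):
    """Return the group of the first stem found in the complaint, else None."""
    for stem, group in stems.items():
        if index(complaint, stem) > 0:
            return group
    return None


print(complaint_group("DIZZY/VERTIGO/FAINTING"))
print(complaint_group("DIZZINESS/VERTIGO/SYNCOPE"))
print(complaint_group("COUGH/COLD"))
```

Once both the TSR and RN rows carry the same group label, comparing the label on adjacent rows is an exact-match problem again.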

PRIVILEGED AND CONFIDENTIAL INFORMATION
This transmittal and any attachments may contain PRIVILEGED AND
CONFIDENTIAL information and is intended only for the use of the
addressee. If you are not the designated recipient, or an employee
or agent authorized to deliver such transmittals to the designated
recipient, you are hereby notified that any dissemination,
copying or publication of this transmittal is strictly prohibited. If
you have received this transmittal in error, please notify us
immediately by replying to the sender and delete this copy from your
system. You may also call us at (309) 827-6026 for assistance.

Re: anything like in SPSS?

Marks, Jim
In reply to this post by Peck, Jon
Is there a programmability module for addresses?

--jim

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Peck, Jon
Sent: Thursday, April 26, 2007 2:58 PM
To: [hidden email]
Subject: Re: anything like in SPSS?


Re: anything like in SPSS?

Peck, Jon
There is no module already set up for this, but we put an example in the 4th edition of the Data Management book (PDF downloadable from http://www.spss.com/spss/data_management_book.htm) in which regular expressions are used to parse out parts of addresses that are not rigorously structured.  It is in Chapter 21.  Using these techniques, you can do a lot with a little bit of code.  How much work (code) you would have to do depends on how robust a result you want and how well controlled the input is, so some experimentation would be a good idea.
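To give a flavor of what parsing loosely structured addresses with regular expressions can look like, here is a small Python sketch. It is not the example from the book: the `ADDRESS` pattern, the short list of street types, and the `parse_address` helper are all assumptions made up for illustration, and real address data would need a much more forgiving pattern.

```python
import re

# Hypothetical pattern: street number, street name, optional street type,
# optional unit. Tolerant of extra spaces, a trailing period, and letter case.
ADDRESS = re.compile(
    r"""^\s*
        (?P<number>\d+)\s+
        (?P<street>.+?)
        (?:\s+(?P<type>St|Street|Ave|Avenue|Rd|Road|Blvd)\.?)?
        (?:\s*,?\s*(?:Apt|Unit|\#)\s*(?P<unit>\w+))?
        \s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_address(text):
    """Return the named parts as a dict, or None if the text does not fit."""
    match = ADDRESS.match(text)
    return match.groupdict() if match else None


print(parse_address("123  Main St., Apt 4"))
print(parse_address("45 elm avenue"))
print(parse_address("no number here"))
```

How far to push the pattern is the trade-off described above: each extra optional piece handles more input variants but also makes wrong matches easier, so it is worth testing against a sample of the real data.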

HTH,
Jon Peck

-----Original Message-----
From: Marks, Jim [mailto:[hidden email]]
Sent: Thursday, April 26, 2007 11:21 PM
To: Peck, Jon; [hidden email]
Subject: RE: Re: anything like in SPSS?
