Re: Remove repeated "words" in a string variable

Posted by Jon Peck on
URL: http://spssx-discussion.165.s1.nabble.com/Remove-repeated-words-in-a-string-variable-tp5740541p5740543.html

Here is a simple solution using the SPSSINC TRANS extension command, which can be installed from Extensions > Extension Hub if you don't already have it.
First, it defines a function; then TRANS uses it.  The TYPE value is the length for the output string variable - typically the same as the input length.
This does not preserve the original order of the codes, because a Python set is unordered.  Preserving order would require slightly more complicated code.

begin program python3.
def unique(codes):
    # Split on whitespace, drop duplicates via set(), rejoin with single blanks.
    return " ".join(set(codes.split()))
end program.

spssinc trans result=codes type=200
/formula "unique(codes)".
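
If the order of first appearance matters, the function could be replaced with something along these lines. This is only a sketch: the name unique_ordered is made up for illustration, and it assumes the codes are separated by whitespace, as in the example data.

```python
def unique_ordered(codes):
    """Remove repeated codes, keeping each code at its first position."""
    seen = set()   # codes already encountered
    kept = []      # codes in order of first appearance
    for code in codes.split():
        if code not in seen:
            seen.add(code)
            kept.append(code)
    return " ".join(kept)
```

It would go between begin program python3. and end program. like the function above, with the formula changed to "unique_ordered(codes)".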

On Thu, May 20, 2021 at 11:53 AM BjornAhlstrom <[hidden email]> wrote:
Hi, I'm working on a large dataset. In one string variable I have ICD-10 codes. In many cases they are repeated like this: "T810 J969 T812 R651 J809 T810 N178 B371 M628 B968C T810 T812...". As you can see there are five "T812" and three T810. Sometimes there are 50 repeats of ICU codes, which makes this variable unnecessarily wide. I would like to keep only one of each code for each case.

I have tried the syntax presented in this discussion: http://spssx-discussion.1045642.n5.nabble.com/How-to-remove-duplicate-repeated-character-in-a-variable-td5728880.html
However, the suggested solutions do not give the desired result.

Data example:

DATA LIST LIST / id * icd(a50).
BEGIN DATA
1 "T079 S370 S220 S270 S220 S369 T079 T079 S370 S220 "
2 "J809B N179 J969 R572 J459 J159 J969 J809C R651 N179 "
3 "I609 N179 R572 J809C B371 I609 N179 N179 I609"
END DATA.

Any suggestions?

Thanks in advance,
Björn

Sent from the SPSSX Discussion mailing list archive at Nabble.com.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


--
Jon K Peck
[hidden email]