Hi,
I attempted to send a full example using a saved SPSS data file, but the listserv did not allow that type of posting. Instead, below is a next-best representation of the problem.

* How can I delete embedded control characters which act as returns when
  reported from SPSS; where two rectangular control characters looking
  like [][] are interspersed in long comments; For project, a csv file is
  stored in Access and loaded to SPSS using ODBC.

DATA LIST LIST /ID(A5) COMMENT(A5000).
BEGIN DATA.
1 This is a very long comment with 2 rectangular control characters [][] interspersed[][]
2 But difficult to copy and display since the characters act as a return in SPSS[][]
3 And so on [][] but these are not the true control characters
END DATA.

*The DO REPEAT below was offered by Jon Peck as a possible solution to a similar problem in 2004.

DO REPEAT VAR=COMMENT.
COMPUTE LocHex01 = INDEX(VAR,' ').
IF (LocHex01 > 0)VAR = SUBSTR(VAR,1,LocHex01-1).
END REPEAT.
EXECUTE.

*All the above did was wipe out the comments entirely; Perhaps I was supposed to copy the hidden characters into INDEX(VAR,' '); However, I cannot find a way to copy them.

Any help is appreciated. Thanks.

Kevin
At 07:09 PM 1/30/2007, Kevin Hynes wrote:
>How can I delete embedded control characters which act as returns when
>reported from SPSS; where two rectangular control characters looking
>like [][] are interspersed in long [data strings].

I meant to get to this the last time you posted.

Suppose the characters you want to eliminate are in string variable #BadChar, possibly followed by trailing blanks. To remove all instances of those 'bad' characters and close up the gaps:

*  Eliminate 'bad' characters ..... .
LOOP #Attempt = 1 TO LENGTH(TestStr1)+1.
.  COMPUTE #1BadOne = INDEX(TestStr1,RTRIM(#BadChar),1).
.  DO IF   #1BadOne EQ 0.
.     BREAK.
.  ELSE.
.     COMPUTE SUBSTR(TestStr1,#1BadOne)
                  = SUBSTR(TestStr1,#1BadOne+1).
.  END IF.
END LOOP.

Or, to replace all instances by some character, in this case "+":

*  Replace 'bad' characters by "+" ..... .
LOOP #Attempt = 1 TO LENGTH(TestStr2)+1.
.  COMPUTE #1BadOne = INDEX(TestStr2,RTRIM(#BadChar),1).
.  DO IF   #1BadOne EQ 0.
.     BREAK.
.  ELSE.
.     COMPUTE SUBSTR(TestStr2,#1BadOne,1) = '+'.
.  END IF.
END LOOP.

....................
That leaves the problem of building the list of undesired characters, when the ones you want to eliminate, the control characters, can't well be entered in character-string constants. Instead, generate the list of characters from its ASCII numerical equivalents.

The following technique for finding the ASCII numeric equivalent of characters, or the ASCII character equivalent of a number, is from Raynald Levesque, and was called to our attention by Gene Maguin(*). The following is SPSS 14 draft output, but it should work in all SPSS releases. Careful: 'ASCIIhex' is a *numeric* variable; 'hex' is a *string* variable. The input contains one variable, an A1 variable named 'CHAR'.

*  I.   Get numeric code from characters,   .
*       and characters from numeric.        .
GET FILE=TestChar.

NUMERIC ASCII    (F3)
       /ASCIIhex (PIBHEX02).
STRING  HEX      (A3).
STRING  RECOVER  (A1).

COMPUTE ASCII    = NUMBER(CHAR,PIB1).
COMPUTE ASCIIhex = ASCII.
COMPUTE HEX      = CONCAT('x',STRING(ASCII,PIBHEX02)).
COMPUTE RECOVER  = STRING(ASCII,PIB1).

LIST.

|-----------------------------|---------------------------|
|Output Created               |31-JAN-2007 21:08:21       |
|-----------------------------|---------------------------|
[TestChar]
C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
\2007-01-31 Hynes - Deleting embedded control characters
TestChar.Sav

CHAR ASCII ASCIIhex HEX RECOVER

L      76     4C    x4C L
O      79     4F    x4F O
V      86     56    x56 V
E      69     45    x45 E
       32     20    x20
1      49     31    x31 1
4      52     34    x34 4

Number of cases read:  7    Number of cases listed:  7

....................
The ASCII control characters are 0 through 31 (decimal), plus 127; but you'll have to decide what characters you want to get rid of. If you have a set of ASCII numerical values for 'bad' characters, you can add them to the list with a DO REPEAT.

.  DO REPEAT CharNum = 65 69 73 79 85.
.     COMPUTE #BadChar = CONCAT(RTRIM(#BadChar),
                                STRING(CharNum,PIB1)).
.  END REPEAT.

If you have a contiguous range of values, you can use a LOOP:

.  LOOP #CharNum = 48 TO 57.
.     COMPUTE #BadChar = CONCAT(RTRIM(#BadChar),
                                STRING(#CharNum,PIB1)).
.  END LOOP.

Here's a complete demonstration. For illustration, the 'bad' characters to be eliminated are the vowels (AEIOU) and the digits. This is SPSS 14 draft output:

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |31-JAN-2007 21:08:22       |
|-----------------------------|---------------------------|
TestStr1                  TestStr2

LOVE 14 OF 39 PUPPIES     LOVE 14 OF 39 PUPPIES
LOVE 14 OF 39 PUPPIES     LOVE 14 OF 39 PUPPIES

Number of cases read:  2    Number of cases listed:  2

STRING #BadChar (A25).
DO IF $CASENUM EQ 1.
*  Vowels  .......... .
.  DO REPEAT CharNum = 65 69 73 79 85.
.     COMPUTE #BadChar = CONCAT(RTRIM(#BadChar),
                                STRING(CharNum,PIB1)).
.  END REPEAT.
*  Digits  .......... .
.  LOOP #CharNum = 48 TO 57.
.     COMPUTE #BadChar = CONCAT(RTRIM(#BadChar),
                                STRING(#CharNum,PIB1)).
.  END LOOP.
.  PRINT / 'Eliminate characters ' #BadChar.
END IF.

*  Eliminate 'bad' characters ..... .
LOOP #Attempt = 1 TO LENGTH(TestStr1)+1 IF $CASENUM GT 1.
.  COMPUTE #1BadOne = INDEX(TestStr1,RTRIM(#BadChar),1).
.  DO IF   #1BadOne EQ 0.
.     BREAK.
.  ELSE.
.     COMPUTE SUBSTR(TestStr1,#1BadOne)
                  = SUBSTR(TestStr1,#1BadOne+1).
.  END IF.
END LOOP.

*  Replace 'bad' characters by "+" ..... .
LOOP #Attempt = 1 TO LENGTH(TestStr2)+1 IF $CASENUM GT 1.
.  COMPUTE #1BadOne = INDEX(TestStr2,RTRIM(#BadChar),1).
.  DO IF   #1BadOne EQ 0.
.     BREAK.
.  ELSE.
.     COMPUTE SUBSTR(TestStr2,#1BadOne,1) = '+'.
.  END IF.
END LOOP.

LIST.

List
|-----------------------------|---------------------------|
|Output Created               |31-JAN-2007 21:08:22       |
|-----------------------------|---------------------------|

Eliminate characters AEIOU0123456789

TestStr1                  TestStr2

LOVE 14 OF 39 PUPPIES     LOVE 14 OF 39 PUPPIES
LV  F  PPPS               L+V+ ++ +F ++ P+PP++S

Number of cases read:  2    Number of cases listed:  2

===================
APPENDIX: Test data
===================
*  ....... Test Data    ............... .
FILE HANDLE TestString /NAME=<choose>.
FILE HANDLE TestChar   /NAME=<choose>.

NEW FILE.
INPUT PROGRAM.
.  STRING TestStr1 TestStr2 (A25).
.  LEAVE  TestSTr1 TestStr2.
.  COMPUTE TestStr1 = 'LOVE 14 OF 39 PUPPIES'.
.  COMPUTE TestStr2 = TestStr1.
.  END CASE.
.  END CASE.
END FILE.
END INPUT PROGRAM.
DATASET NAME TestString.
SAVE OUTFILE=TestString.

.  /*-- LIST /*-*/.

STRING CHAR(A1).
LOOP #POS = 1 TO 7 IF $CASENUM EQ 1.
.  COMPUTE CHAR = SUBSTR(TestStr1,#POS,1).
.  XSAVE OUTFILE = TestChar /KEEP=CHAR.
END LOOP.
EXECUTE.

====================
(*) Citation for Gene Maguin: posting to SPSSX-L,
Date:    Wed, 6 Dec 2006 17:04:02 -0500
From:    Gene Maguin <[hidden email]>
Subject: Re: Non-printing characters
To:      [hidden email]
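
Applied to the original question, the same two steps (build the list of unwanted characters from their ASCII codes, then search for any one of them with INDEX and a divisor of 1) might look like the sketch below. It assumes the string variable is named COMMENT, as in Kevin's DATA LIST; that the unwanted characters are ASCII control codes 1 through 31 (the two rectangles are quite possibly a carriage return and a line feed, codes 13 and 10, though nothing in the thread confirms this); and that the session is not in Unicode mode, so that STRING(n,PIB1) yields a single byte. #CtlChar and #Found are illustrative names, not taken from the posts above.

```spss
*  Sketch only: replace ASCII control characters 1-31 in COMMENT by a blank.
*  Assumes a non-Unicode (code page) session, as in SPSS 14.

*  Build the list of control characters once; scratch variables keep
*  their values from case to case.
STRING #CtlChar (A31).
DO IF $CASENUM EQ 1.
.  LOOP #CharNum = 1 TO 31.
.     COMPUTE #CtlChar = CONCAT(RTRIM(#CtlChar),
                                STRING(#CharNum,PIB1)).
.  END LOOP.
END IF.

*  With divisor 1, INDEX returns the position of the first occurrence
*  of ANY single character in the list, or 0 if there is none.
LOOP #Attempt = 1 TO LENGTH(COMMENT)+1.
.  COMPUTE #Found = INDEX(COMMENT,RTRIM(#CtlChar),1).
.  DO IF   #Found EQ 0.
.     BREAK.
.  ELSE.
.     COMPUTE SUBSTR(COMMENT,#Found,1) = ' '.
.  END IF.
END LOOP.
EXECUTE.
```

Each control character (tabs included) is replaced by a blank here so that the words on either side stay separated; to close up the gaps instead, use the SUBSTR(COMMENT,#Found) = SUBSTR(COMMENT,#Found+1) form from the 'Eliminate' loop.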
Hi,
Excel has a text function called 'CLEAN', which deletes non-printable characters (ASCII codes 0 through 31). It's less sophisticated than Richard's pretty syntax, but if it works, it works...

Cheers!
Albert-Jan

--- Richard Ristow <[hidden email]> wrote:

> At 07:09 PM 1/30/2007, Kevin Hynes wrote:
>
> >How can I delete embedded control characters which act as returns when
> >reported from SPSS; where two rectangular control characters looking
> >like [][] are interspersed in long [data strings].
>
> I meant to get to this the last time you posted.
>
=== message truncated ===