A syntax question: char.substr not substr

A syntax question: char.substr not substr

Ron0z
I’ve just had my version of SPSS updated. I had been using ver 22 but now
I’m on ver 25. I run command files exclusively in everything I do. My files
ran error free in ver 22, now I have a few errors. Here’s one: Line 174 was
identified with an error. You’ll see line 174 is a sort command.

And the error that was reported: “The SUBSTR function is deprecated.
Consider using the CHAR.SUBSTR function.”

The code is as follows:

[line 171]if (substr(HomePhone, 1, 1) NE ' ') Phone = concat('[line ',
rtrim(HomePhone), ']').
[line 172]if (substr(MobilePhone, 1, 1) NE ' ') Mobile = concat('[line ',
rtrim(MobilePhone), ']').
[line 173]
[line 174]sort cases by College Course_name FamilyName FirstName.

This seems strange. When I edited SUBSTR to CHAR.SUBSTR the problem
vanished. It’s strange in that the error was not trapped on either line 171
or 172 but waited until the SORT command was encountered.


The same with this one:
select if (substr(ClientID, 1, 3) eq 'CIT').
FREQUENCIES VARIABLES=Citizen atsi.

The error was identified on the FREQUENCIES command and not the SELECT.
Once again changing substr to char.substr corrected the problem.

So, it’s no big deal to type “char.” in front of all my substr entries in my
code. But I wonder why. Does anyone know? And why does the error not show on
the actual line with the syntax problem?





--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: A syntax question: char.substr not substr

PRogman
I believe it is the result of transformation commands being stored, pending
execution, until a command is executed that reads the dataset (explained in
the FM: CSR > Universals > Commands > Command Order). Your IF statements are
transformations, all pending until the SORT command executes. If I remember
correctly, the EXECUTE command is actually an empty procedure that just runs
pending commands, and it is not necessary if the next command is a procedure
that reads the data and executes pending commands anyway. This makes the run
more efficient, by gathering transformations and executing them all in the
same data pass.
Your second example illustrates the possibility of doing some more
transformations on the selected cases before running the FREQUENCIES
command.
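
For readers who think better in code, here is a toy Python model of the
"pending transformations" idea described above. It is only an analogy, not
how SPSS is implemented: the class ToyDataset and its methods transform and
procedure are hypothetical names made up for this sketch. It shows why a
message caused by a queued IF can end up attributed to the later command that
actually reads the data.

```python
# Toy model only: an analogy for deferred transformations, NOT SPSS internals.
class ToyDataset:
    def __init__(self, rows):
        self.rows = rows
        self.pending = []   # transformations waiting for a data pass
        self.log = []       # messages, tagged with the command that ran the pass

    def transform(self, name, func, warning=None):
        """Queue a transformation (like IF or COMPUTE); nothing runs yet."""
        self.pending.append((name, func, warning))

    def procedure(self, name):
        """Run a command that reads the data (like SORT CASES or FREQUENCIES)."""
        # Messages from queued transformations surface now, under this command.
        for _tname, _func, warning in self.pending:
            if warning:
                self.log.append(f"Warning. Command name: {name}. {warning}")
        # One pass over the data applies every pending transformation per row.
        new_rows = []
        for row in self.rows:
            row = dict(row)
            for _tname, func, _warning in self.pending:
                row = func(row)
            new_rows.append(row)
        self.rows = new_rows
        self.pending.clear()
        return self.rows


def add_phone(row):
    # Mirrors the IF in the original post: only fill Phone if HomePhone is non-blank.
    if row["HomePhone"][:1] != " ":
        row["Phone"] = "[" + row["HomePhone"].rstrip() + "]"
    return row


ds = ToyDataset([{"HomePhone": "555 0100  "}, {"HomePhone": " "}])
ds.transform("IF", add_phone, warning="SUBSTR is deprecated.")
log_before = list(ds.log)          # still empty: the IF has not run yet
ds.procedure("SORT CASES")         # the data pass happens here
print(ds.log)
```

In this sketch the log stays empty until procedure() is called, and the
message is then labelled with SORT CASES rather than with the IF that caused
it.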
/PR



Re: A syntax question: char.substr not substr

Bruce Weaver
Administrator
In reply to this post by Ron0z
To be clear, the message you saw was a WARNING, not an ERROR message.  I just
cobbled together this little example:

NEW FILE.
DATASET CLOSE ALL.
DATA LIST LIST / s1 (A5).
BEGIN DATA
abcde
ABCDE
fghij
FGHIJ
END DATA.
STRING s2 s3 (A1).
COMPUTE s2 = SUBSTR(s1,3,1).
COMPUTE s3 = CHAR.SUBSTR(s1,3,1).
LIST.

When I ran this immediately after launching an SPSS session, I saw the same
warning message you described, but execution did not stop, and variables s2
and s3 were both computed correctly.  Here is the relevant output:

List

>Warning # 661.  Command name: LIST
>The SUBSTR function is deprecated.  Consider using the CHAR.SUBSTR
>function.

s1    s2 s3

abcde c  c
ABCDE C  C
fghij h  h
FGHIJ H  H

Number of cases read:  4    Number of cases listed:  4

When I re-ran the syntax after that, I got the same output, but without the
warning message.  So it would appear that the warning is issued once per
session at most.  

I am using v25.0.0.1 for Windows (64-bit).  



--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: A syntax question: char.substr not substr

Rick Oliver
In reply to this post by Ron0z
The substr function actually counts bytes, not characters, so this was historically a bit confusing in multi-byte code pages. In Unicode, some characters that appear in Western code pages (e.g. accented characters) also use more than one byte, adding to the confusion. char.substr always counts characters, regardless of the number of bytes in the character.
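
The byte versus character distinction can be sketched outside SPSS. The
Python below is only an illustration under the assumption that the string is
stored as UTF-8; byte_substr and char_substr are hypothetical helper names
for this sketch, not SPSS functions, and the exact SPSS output for a split
character may differ.

```python
def byte_substr(text, start, length):
    """Byte-counting substring: 1-based start, length measured in UTF-8 bytes."""
    raw = text.encode("utf-8")
    piece = raw[start - 1:start - 1 + length]
    # A character cut in half cannot be decoded; show it as U+FFFD instead.
    return piece.decode("utf-8", errors="replace")


def char_substr(text, start, length):
    """Character-counting substring: 1-based start, length in characters."""
    return text[start - 1:start - 1 + length]


name = "Ren\u00e9e"                      # "Renée": 5 characters, 6 UTF-8 bytes
print(len(name), len(name.encode("utf-8")))
print(char_substr(name, 1, 4))           # four whole characters
print(byte_substr(name, 1, 4))           # four bytes: the accented e is cut in half
```

For plain ASCII text the two helpers return the same thing, which matches
the point that old syntax keeps working as long as every character fits in
one byte.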


Re: A syntax question: char.substr not substr

PRogman
...And according to the FM seven CHAR. prefixed string functions were
released in v16, so they have been around some time...
/PR



Re: A syntax question: char.substr not substr

Rick Oliver
In a recent release (24, I think), the application runs in Unicode mode by default instead of code page mode. (You can change it at any time.) And in Unicode mode, I think the results for the char function can be different than in code page mode (e.g. the aforementioned accented characters, which are two bytes in Unicode). IIRC, the warning was added at the same time to nudge users toward more Unicode-friendly habits.


Re: A syntax question: char.substr not substr

Jon Peck
Adding to what Rick and others have said, here is more than you want to know about characters.

Unicode mode was introduced in V16 and became the default in V22, I think.  The char.* functions were introduced in V16 so that using them in place of the older functions would work whether you were running in Unicode mode, which supports over 100,000 characters, or code page mode, which supports at most 255.  They operate on characters rather than bytes.  However, the old character functions such as substr still work in both modes if you only have simple characters, i.e., characters that are represented in a single byte.  Most modern software works on Unicode characters, which solves what used to be a lot of messy character problems.

One twist is that substr is allowed on the left hand side of a COMPUTE while char.substr is not.  This is because left-side substr replaces a byte string with the same number of bytes, but operating on characters might change the number of bytes required to hold the replacement characters, so that could affect the entire string.  LHS substr was always a bad idea, though, IMO.

Another difference is that string expressions in Unicode mode are automatically RTRIMmed, i.e., trailing blanks are removed, so there is no need to use RTRIM unless you need to use code page mode.

One special new character function, normalize, was added along with the char.* functions.  It returns the canonical form of a character string (NFC) and is useful in situations where there is more than one byte sequence that can represent the same Unicode character.  This is a very rare concern in Statistics but can cause string comparisons to return unequal for two strings that look the same if the byte forms are different.
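
The situation that normalization addresses can be sketched with Python's
standard unicodedata module. This is only an illustration of NFC in general,
not of the SPSS normalize function itself: the same visible character can be
stored as one code point or as a base letter plus a combining accent.

```python
import unicodedata

composed = "\u00e9"       # "é" as a single precomposed code point
decomposed = "e\u0301"    # "e" followed by a combining acute accent

# The two strings look identical on screen but differ in their stored form.
same_before = composed == decomposed
bytes_composed = len(composed.encode("utf-8"))
bytes_decomposed = len(decomposed.encode("utf-8"))

# After NFC normalization both collapse to the precomposed form.
same_after = (unicodedata.normalize("NFC", composed)
              == unicodedata.normalize("NFC", decomposed))

print(same_before, bytes_composed, bytes_decomposed, same_after)
```

Normalizing both sides before comparing is the usual way to avoid two
look-alike strings testing as unequal.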





--
Jon K Peck
[hidden email]


Re: A syntax question: char.substr not substr

Ron0z
Well, thank you folks, that was great, but Jon, you’ve raised my curiosity.
You mentioned a twist. Can you please give me a couple of examples of what
you mean by one and not the other being allowed on the LHS of a COMPUTE
command.




Re: A syntax question: char.substr not substr

Bruce Weaver
Administrator
Here's a simple example similar to the one I posted earlier.

NEW FILE.
DATASET CLOSE ALL.
DATA LIST LIST / s1 (A5).
BEGIN DATA
abcde
ABCDE
fghij
FGHIJ
END DATA.
STRING s2 s3 (A5).
COMPUTE s2 = s1.
COMPUTE s3 = s1.
COMPUTE SUBSTR(s2,3,1) = "X".          /* THIS LINE WORKS.
COMPUTE CHAR.SUBSTR(s3,3,1) = "X". /* THIS LINE DOES NOT WORK.

>Error # 4029 in column 9.  Text: CHAR.SUBSTR
>The function appearing on the left side of the assignment operator (equals
>sign) is not valid in that context.
>Execution of this command stops.
LIST.
List

s1    s2    s3

abcde abXde abcde
ABCDE ABXDE ABCDE
fghij fgXij fghij
FGHIJ FGXIJ FGHIJ

Number of cases read:  4    Number of cases listed:  4





Re: A syntax question: char.substr not substr

Jon Peck
In reply to this post by Ron0z
Here is an example where difficulties occur.  I am in Unicode mode, where the number of bytes per character varies.  I'm not sure that this will all survive the email, but characters 5-7 are a with grave accent, e with grave accent, and o with tilde.
data list list/s(a20).
begin data
abcdxyz999
abcdàèõ999
end data.
dataset name data.
compute len = length(s).
list.

The two strings above have the same number of characters, but their length in bytes differs.
s                         len 
abcdxyz999            10.00 
abcdàèõ999           13.00

The characters at positions 5-7 take one byte each in the first record but two bytes in the second.

So suppose I run this code.
compute substr(s, 5, 3) = "123".
list.
s                         len 
abcd123999              10.00 
abcd123                 13.00

Something weird has happened.  The first record is fine, but the second is not.  You can't see this with list, but in the Data Editor, you can see that the value for the second record is now
abcd123�õ999
The string "123" was inserted where expected, but now I have an invalid character following, because the first two accented characters actually occupy 4 bytes but only 3 bytes were replaced.  So one character is only half there, and the following 999 is not displayed properly by list, because there is an undefined character in the middle of the string.
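
The same half-character effect can be reproduced outside SPSS. The Python
sketch below assumes the string is stored as UTF-8 and uses a hypothetical
helper, overwrite_bytes, to mimic a byte-based left-hand-side substr; it is
an illustration of the mechanism, not SPSS code.

```python
def overwrite_bytes(text, start, replacement):
    """Overwrite bytes in place from a 1-based byte position, like a byte-based LHS substr."""
    raw = bytearray(text.encode("utf-8"))
    new = replacement.encode("utf-8")
    raw[start - 1:start - 1 + len(new)] = new
    # An orphaned half-character cannot be decoded; show it as U+FFFD.
    return bytes(raw).decode("utf-8", errors="replace")


plain = "abcdxyz999"
accented = "abcd\u00e0\u00e8\u00f5999"    # abcdàèõ999: 10 characters, 13 bytes

print(len(accented.encode("utf-8")))
print(overwrite_bytes(plain, 5, "123"))     # all single-byte characters: clean result
print(overwrite_bytes(accented, 5, "123"))  # leaves half of the second accented character
```

The three overwritten bytes are the two bytes of the first accented
character plus the first byte of the second, so the second byte of that
character is left stranded.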

The substr function is allowed here, because it was always legal and would work in code page mode, where those characters take only one byte, but if you run, instead,
compute char.substr(s, 5, 3) = "123".
you get this error message:
>Error # 4029 in column 9.  Text: char.substr 
>The function appearing on the left side of the assignment operator (equals 
>sign) is not valid in that context.

So it would be very hard for a user in general to know how to make the LHS substr function do what was expected in Unicode mode.  The proper code would be to use the replace function or to use RHS substr plus concat to make the change.
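
As a rough sketch of the two safer approaches just mentioned, here is the
character-based equivalent in Python. The helper replace_chars is a
hypothetical name for this illustration; it rebuilds the string from whole
characters (the analogue of RHS substr plus concat), and the built-in
str.replace plays the role of a replace function.

```python
def replace_chars(text, start, length, replacement):
    """Rebuild the string around a 1-based character range: head + replacement + tail."""
    head = text[:start - 1]
    tail = text[start - 1 + length:]
    return head + replacement + tail


accented = "abcd\u00e0\u00e8\u00f5999"    # abcdàèõ999

by_position = replace_chars(accented, 5, 3, "123")
by_content = accented.replace("\u00e0\u00e8\u00f5", "123")

print(by_position)
print(by_content)
```

Because both versions work on whole characters, the result no longer
depends on how many bytes each character happens to occupy.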

It would have been possible to implement LHS char.substr to adjust the whole string on a character basis, but the transformation system code was not designed for that type of operation, and with LHS substr rarely used, especially since the replace function was added, in V14, the development resources were put to better use.

LHS substr in code page mode is ugly but okay, except that in the large character set scripts (Japanese, Korean, traditional and simplified Chinese), characters are already variable length even there.


Re: A syntax question: char.substr not substr

Ron0z
Thanks for the great explanation Jon, and for taking the time to do so. It’s
most appreciated. I think part of my confusion was due to my never having
considered it possible to do anything like

compute substr(s, 5, 3) = "123".

My usage of substr in SPSS and other programming languages had only been for
reading the characters of a string and then doing something with them. I
had never considered it feasible to use substr to insert characters into a
string. Had I wanted to achieve this, I would most likely have used a few
lines of code, with concat possibly being one of them. Though, I have to
say, your example is an elegant and simple solution.
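For anyone else who had not seen substr on the left-hand side before, here is a rough Python model of what that COMPUTE appears to do. It is a sketch under assumptions, not SPSS itself: the helper name `lhs_substr` is invented, positions are 1-based, and I am assuming the new value is truncated or blank-padded to the stated length so the string width never changes.

```python
def lhs_substr(s: str, start: int, length: int, value: str) -> str:
    """Model of `compute substr(s, start, length) = value.` on a fixed-width string."""
    # Assumption: the value is cut or blank-padded to exactly `length` positions.
    patch = value[:length].ljust(length)
    return s[:start - 1] + patch + s[start - 1 + length:]


s = "ABCDEFGHIJ"

# Overwrite positions 5 to 7 in place.
in_place = lhs_substr(s, 5, 3, "123")

# The "few lines with concat" approach: rebuild the string from three pieces.
rebuilt = s[:4] + "123" + s[7:]

print(in_place, rebuilt, len(in_place) == len(s))
```

Both routes give the same string here; the left-hand-side form just avoids spelling out the head and tail pieces.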

I've shared your message around the office.





Re: A syntax question: char.substr not substr

Ron0z
Thanks Jon. While checking out STATS_ADJUST_WIDTHS in the Extensions Hub I
clicked on the More Info button, and noticed the author was “JKP, IBM.”  Is
that you?

I wanted to use syntax with this. I had a bit of difficulty getting started
but using the pull-down to write some code for me helped get me going.

I did a few test runs on copies of my data. I was reluctant to use something
I was unfamiliar with, but got it going in the end. This is the answer.
Thank you.


Observations on the use of STATS ADJUST WIDTHS:
Seeing a few warnings on my first attempt (about data being encoded in a
locale-specific (code page) encoding) put me on edge, but I guess that was
the nature of my data, and the warnings were to be expected. No issue there.

I had initially used variables=all (it was so tempting) but found it best to
specify variables, because not all of my files were identical and it stopped
processing part way through. I was using width=first and was notified
following a run that a variable wasn't found in a later file. When I listed
all the variables in the first file, it ran to completion, creating all files.
With files where the structure was identical, using variables=all was
perfectly fine.

Some files had used the same variable name, but at some point I had shifted
the spec from numeric to string, and of course the width remained larger on
those occasions. That's my problem. Just something I need to look out for.

Loved the suffix option.






Re: A syntax question: char.substr not substr

Jon Peck
Yes, I am JKP.  I wrote most of the 100+ Python and R-based extensions on the Extension Hub.  In my last few years at IBM, I was pretty much free to do whatever I thought was useful.

STATS ADJUST WIDTHS uses standard SPSS commands to open, adjust widths and save files, so any warnings those operations would produce will appear with ADJUST WIDTHS.  If there is a type conflict for a variable or a variable doesn't occur in a subsequent file, you may get a warning, but the command will continue and do the sensible thing.
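As a rough picture of the width-matching idea, here is a small Python sketch. It is not the extension's actual code; the function `plan_widths`, the dictionary layout, and the `max` policy are assumptions made for illustration (only `first` is mentioned in this thread). It takes each file's string variables and declared widths, picks target widths, and lists variables that a later file lacks.

```python
def plan_widths(files, policy="first"):
    """Pick a target width per string variable across several files.

    files: list of dicts mapping variable name -> declared string width,
           in the order the files would be processed.
    Returns (target widths, list of (file index, missing variable name)).
    """
    if policy == "first":
        # Use whatever widths the first file declares.
        target = dict(files[0])
    elif policy == "max":
        # Hypothetical alternative: widest declaration seen in any file.
        target = {}
        for f in files:
            for name, width in f.items():
                target[name] = max(width, target.get(name, 0))
    else:
        raise ValueError(f"unknown policy: {policy}")

    # A variable absent from a file is reported, not treated as fatal.
    missing = [(i, name) for i, f in enumerate(files)
               for name in target if name not in f]
    return target, missing


files = [{"FamilyName": 20, "College": 10}, {"FamilyName": 35}]
first_plan = plan_widths(files, "first")
max_plan = plan_widths(files, "max")
print(first_plan, max_plan)
```

In this toy data, `College` is missing from the second file, so it shows up in the report while the remaining widths are still planned.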

-JKP



--
Jon K Peck
[hidden email]
