Odd, very odd, something


Odd, very odd, something

Maguin, Eugene

Given an A11 variable, initially defined as ‘…..-…..’, where each dot stands for a space, I use the SUBSTR function to replace the 5 spaces on either side of the dash character, ‘-’, with a 5-character string such as ‘Jan08’. The result in the data window shows a black diamond-shaped character with an embedded white question mark where the dash should be. An example is ‘Jan08�Dec10’.

So, naturally, the question is: what is going on?

And how can it be fixed so that the dash character shows instead of the diamond character?

If it matters: version 21, fully patched, not Unicode.

Thanks, Gene Maguin


Re: Odd, very odd, something

Rick Oliver-3
Just a shot in the dark, but is it really a dash, or is it an em-dash or an en-dash?
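
One way to settle that question is to look at the code points in the literal rather than at how it is drawn on screen. Below is a minimal Python sketch, assuming the suspect text can be pasted or read into a Python string; the helper name `describe_non_ascii` is made up for illustration and is not part of Statistics.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (1-based position, code point, Unicode name) for each non-ASCII character."""
    return [
        (pos, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for pos, ch in enumerate(text, start=1)
        if ord(ch) > 127
    ]


# A plain keyboard hyphen (HYPHEN-MINUS) is ASCII, so nothing is reported.
plain = describe_non_ascii("     -     ")

# An en dash, written here as an escape so the source stays ASCII, is reported.
suspect = describe_non_ascii("     \u2013     ")
```

An empty list means the string is pure ASCII; any entry at position 6 would point to a dash that is not the plain hyphen.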

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]





Re: Odd, very odd, something

Jon K Peck
In reply to this post by Maguin, Eugene
The question mark indicates that you have an unprintable character in that location.  If you are not in Unicode mode and using a western code page such as the usual cp1252, there are only a few such character slots.  Please post some code that shows this behavior.


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: Odd, very odd, something

Maguin, Eugene

Rick Oliver: I believe it is an ordinary dash, if such a thing still exists. Specifically, it is the key to the right of the )/0 key.

 

Jon Peck: Here is the syntax.

 

STRING RANGE(A11).
COMPUTE RANGE='     -     '.
VECTOR #MO(36,F3.0)/#AMO(36,A5).
DO REPEAT X=JAN08 TO DEC10/Y=#MO1 TO #MO36/B=#AMO1 TO #AMO36/
   A='Jan08' 'Feb08' 'Mar08' 'Apr08' 'May08' 'Jun08' 'Jul08' 'Aug08' 'Sep08'
   'Oct08' 'Nov08' 'Dec08' 'Jan09' 'Feb09' 'Mar09' 'Apr09' 'May09' 'Jun09'
   'Jul09' 'Aug09' 'Sep09' 'Oct09' 'Nov09' 'Dec09' 'Jan10' 'Feb10' 'Mar10'
   'Apr10' 'May10' 'Jun10' 'Jul10' 'Aug10' 'Sep10' 'Oct10' 'Nov10' 'Dec10'.
+  COMPUTE Y=X.
+  COMPUTE B=A.
END REPEAT.
LOOP #I=1 TO 36.
+  DO IF (#I EQ 1).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,1,5)=#AMO(#I).
+  ELSE IF (#I EQ 36).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,7,5)=#AMO(#I).
+  ELSE.
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I-1)))
         SUBSTR(RANGE,1,5)=#AMO(#I).
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I+1)))
         SUBSTR(RANGE,7,5)=#AMO(#I).
+  END IF.
END LOOP.
EXECUTE.

 

 

 

 

 

 


Re: Odd, very odd, something. Info correction.

Maguin, Eugene
In reply to this post by Maguin, Eugene

I just looked at the Edit > Options > General dialog: the two options in the character encoding section are both grayed out, but the Unicode radio button is selected. So perhaps I am really running in Unicode mode and didn't realize it.

I retyped the line COMPUTE RANGE='     -     '. and re-ran the section: no diamonds, just dashes. Even if I start over, I can't reproduce the problem. So: FWIW.

 

Gene Maguin

 

 

 


Re: Odd, very odd, something. Info correction.

Jon K Peck
If the dash is not the plain ASCII hyphen, then it would occupy more than one byte in UTF-8, and you would get an invalid byte sequence and, hence, the bad display.  I thought we issued an error message for left-hand-side use of substr in Unicode mode, but apparently we do that only for char.substr.

This, then, is an illustration of exactly the problem with lhs substr that I wrote about recently.
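
The effect can be mimicked outside Statistics. The Python sketch below is only a model, not a description of how Statistics is implemented: it assumes an A11 variable in Unicode mode behaves like an 11-byte UTF-8 buffer and that left-hand-side SUBSTR overwrites bytes by position. The helper names (`make_range`, `lhs_substr_bytes`, `fill`) are hypothetical.

```python
WIDTH = 11  # assumed byte width of the A11 variable


def make_range(dash):
    """Encode 5 spaces + dash + 5 spaces, then truncate/pad to WIDTH bytes."""
    raw = ("     " + dash + "     ").encode("utf-8")
    return raw[:WIDTH].ljust(WIDTH)


def lhs_substr_bytes(value, start, length, new):
    """Overwrite `length` bytes starting at 1-based byte position `start`."""
    i = start - 1
    return value[:i] + new[:length].ljust(length) + value[i + length:]


def fill(dash):
    """Apply the two overwrites from the thread and decode for display."""
    value = make_range(dash)
    value = lhs_substr_bytes(value, 1, 5, b"Jan08")
    value = lhs_substr_bytes(value, 7, 5, b"Dec10")
    # errors="replace" substitutes U+FFFD for bytes that are not valid UTF-8.
    return value.decode("utf-8", errors="replace")


with_hyphen = fill("-")        # ASCII hyphen: one byte, sits exactly at byte 6
with_en_dash = fill("\u2013")  # en dash: three bytes, bytes 7-8 get overwritten
```

With an ASCII hyphen the dash occupies byte 6 alone and survives. With an en dash, the second overwrite replaces the trailing bytes of the dash, leaving a lone lead byte that a display layer typically renders as the replacement character (the black diamond with a question mark).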


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: Odd, very odd, something

David Marso
Administrator
In reply to this post by Maguin, Eugene
Gene,
Maybe attach a small subset of data and the actual syntax file to the thread in Nabble?
http://spssx-discussion.1045642.n5.nabble.com/Odd-very-odd-something-tt5723836.html
See the More button on the right!
I don't have time to try to guess the contents of your existing data set.
I assume Jan08 .... Dec10 on the DO REPEAT are existing variables.
Even more important: what are you actually attempting to do here?
Maybe there is a more straightforward syntax?
D.
--

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: Odd, very odd, something. Info correction.

Maguin, Eugene
In reply to this post by Jon K Peck

In this specific case, Replace would work fine because I can specify that only one instance of ‘     ‘ be replaced, which would be the leftmost on the first call and then the rightmost on the second call. However, it seems to me that Substr is (was) more general in that you could overwrite whatever was in the ‘j’ characters beginning at position ‘i’. Is that an accurate summary?

 

Gene Maguin

 

 

 


Re: Odd, very odd, something

Maguin, Eugene
In reply to this post by David Marso
David,
This was the entire section of the file that was executed. It read an Excel file of 94 records. You are certainly welcome to comment, but before putting the Excel file up I'd want to work it over to change some data values, and I'd rather not. Gene

***************************************************************************.
GET DATA /TYPE=ODBC/CONNECT='DSN=Excel Files;DBQ=H:\xxx Data\Data\'+
   'wwww_Days.xlsx;DriverId=1046;MaxBufferSize=2048;PageTimeout=5;'+
   ';QuotedId=Yes'/SQL='SELECT Location, Capacity, Jan08, Feb08, Mar08, '+
   'Apr08, May08, Jun08, Jul08, Aug08, Sep08, Oct08, Nov08, Dec08, Jan09, '+
   'Feb09, Mar09, Apr09, May09, Jun09, Jul09, Aug09, Sep09, Oct09, Nov09, '+
   'Dec09, Jan10, Feb10, Mar10, Apr10, May10, Jun10, Jul10, Aug10, Sep10, '+
   'Oct10, Nov10, Dec10 FROM `DaysCare$`'/ASSUMEDSTRWIDTH=255.
CACHE.
EXECUTE.

ALTER TYPE Location(AMIN).
FORMAT Capacity Jan08 Feb08 Mar08 Apr08 May08 Jun08 Jul08 Aug08 Sep08 Oct08
  Nov08 Dec08 Jan09 Feb09 Mar09 Apr09 May09 Jun09 Jul09 Aug09 Sep09 Oct09
  Nov09 Dec09 Jan10 Feb10 Mar10 Apr10 May10 Jun10 Jul10 Aug10 Sep10 Oct10
  Nov10 Dec10(F4.0).

*  GET RID OF RECORD 94 WHICH HAS A BLANK LOCATION FIELD.
SELECT IF (LOCATION NE '     ').
EXECUTE.

COUNT Vals=Jan08 Feb08 Mar08 Apr08 May08 Jun08 Jul08 Aug08 Sep08 Oct08
  Nov08 Dec08 Jan09 Feb09 Mar09 Apr09 May09 Jun09 Jul09 Aug09 Sep09 Oct09
  Nov09 Dec09 Jan10 Feb10 Mar10 Apr10 May10 Jun10 Jul10 Aug10 Sep10 Oct10
  Nov10 Dec10(MISSING).
FORMAT VALS(F2.0).
COMPUTE VALS=36-VALS.
FREQUENCIES VALS.

COMPUTE SEQ=$CASENUM.
EXECUTE.
FORMAT SEQ(F3.0).

STRING RANGE(A11).
COMPUTE RANGE='     -     '.
VECTOR #MO(36,F3.0)/#AMO(36,A5).
DO REPEAT X=JAN08 TO DEC10/Y=#MO1 TO #MO36/B=#AMO1 TO #AMO36/
   A='Jan08' 'Feb08' 'Mar08' 'Apr08' 'May08' 'Jun08' 'Jul08' 'Aug08' 'Sep08'
   'Oct08' 'Nov08' 'Dec08' 'Jan09' 'Feb09' 'Mar09' 'Apr09' 'May09' 'Jun09'
   'Jul09' 'Aug09' 'Sep09' 'Oct09' 'Nov09' 'Dec09' 'Jan10' 'Feb10' 'Mar10'
   'Apr10' 'May10' 'Jun10' 'Jul10' 'Aug10' 'Sep10' 'Oct10' 'Nov10' 'Dec10'.
+  COMPUTE Y=X.
+  COMPUTE B=A.
END REPEAT.
LOOP #I=1 TO 36.
+  DO IF (#I EQ 1).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,1,5)=#AMO(#I).
+  ELSE IF (#I EQ 36).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,7,5)=#AMO(#I).
+  ELSE.
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I-1)))
         SUBSTR(RANGE,1,5)=#AMO(#I).
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I+1)))
         SUBSTR(RANGE,7,5)=#AMO(#I).
+  END IF.
END LOOP.
EXECUTE.

WRITE OUTFILE='H:\xxx Data\Data\Range_Months.txt' /
   SEQ ' ' LOCATION ' ' RANGE.
EXECUTE.



=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD


Re: Odd, very odd, something

Andy W
I see no need for the left hand side substring in that example. Just build two strings, say "Left" and "Right" - then replace "SUBSTR(RANGE,1,5)" in the code with Left and "SUBSTR(RANGE,7,5)" in the code with Right. Then after the loop concatenate the strings together (with your hyphen in between).

IMO it would be better to use VARSTOCASES and not worry about hard coding all those strings as well (-1 and +1 in the loop then just becomes LAG and LEAD - or you could parse the strings to make actual date variables).
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Odd, very odd, something. Info correction.

Jon K Peck
In reply to this post by Maguin, Eugene
I believe that you had a non-ASCII dash, which takes more than one byte in Unicode mode, and the logic of your code would only work if each character, including the dash, is one byte, so the result is an invalid UTF-8 sequence.  If any of the input fields can also contain accented or other non-ASCII characters, the situation will be even worse.

When you retyped the RANGE string, you apparently got an ascii dash.

It is important for people to stop assuming that a byte is a character and to use the char.* functions that Statistics has provided since V16.  And avoid left hand side substr.
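
To illustrate the difference between counting bytes and counting characters, here is a small Python sketch. Python strings index by code point, which is loosely analogous to what the char.* functions do; this is an analogy under that assumption, not Statistics syntax, and `overwrite_chars` is a made-up helper.

```python
def overwrite_chars(value, start, length, new):
    """Overwrite `length` characters starting at 1-based character position `start`."""
    i = start - 1
    return value[:i] + new[:length].ljust(length) + value[i + length:]


# Template with an en dash in the middle: 11 characters, but 13 bytes in UTF-8.
template = "     \u2013     "
char_count = len(template)
byte_count = len(template.encode("utf-8"))

# Positions are counted in characters, so the dash at position 6 is left intact.
result = overwrite_chars(template, 1, 5, "Jan08")
result = overwrite_chars(result, 7, 5, "Dec10")
```

Because positions 1-5 and 7-11 are character positions here, the multi-byte dash is never split, whatever its encoded length.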


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621





Re: Odd, very odd, something

Andy W
In reply to this post by Maguin, Eugene
Also, you appear to be just finding the first and last valid months in the range, correct? Below is a more minimal example with some simpler code.

DATA LIST FREE /Jan08 Feb08 Mar08 Apr08.
BEGIN DATA
1 1 . .
. 1 1 .
. . 1 1
1 1 1 1
END DATA.
STRING Range (A11).
STRING Left Right (A5).
DO REPEAT X=JAN08 TO Apr08 /A='Jan08' 'Feb08' 'Mar08' 'Apr08'.
IF NOT(MISSING(X)) AND Left = " " Left = A.
IF NOT(MISSING(X)) Right = A.
END REPEAT.
COMPUTE Range = CONCAT(Left,"-",Right).

The first IF statement assigns the label only if X is not missing and Left is still empty, so the first valid value wins. In the second IF statement the last valid value wins. This assumes nothing in the middle is missing; I'm not quite sure what your original code does if that is the case.
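
For readers who want to check the first-valid/last-valid logic outside Statistics, the same idea can be sketched in Python. This is an illustration only: missing values are represented as `None` by assumption, and `month_range` is a hypothetical helper, not anything from the original syntax.

```python
def month_range(labels, values):
    """Return 'first-last' labels of the non-missing values (None means missing)."""
    left = ""
    right = ""
    for label, value in zip(labels, values):
        if value is None:
            continue
        if not left:       # first valid value wins
            left = label
        right = label      # last valid value wins
    return left + "-" + right


labels = ["Jan08", "Feb08", "Mar08", "Apr08"]

# The four cases from the DATA LIST example, with '.' written as None.
rows = [
    [1, 1, None, None],
    [None, 1, 1, None],
    [None, None, 1, 1],
    [1, 1, 1, 1],
]
ranges = [month_range(labels, row) for row in rows]
```

As with the syntax version, gaps in the middle are ignored: only the first and last valid months are reported.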
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Odd, very odd, something

David Marso
Administrator
In reply to this post by Maguin, Eugene
The core of my question is: what do you want to do?
What do you start with?  What do you want to end up with?


Re: Odd, very odd, something

Maguin, Eugene
In reply to this post by Andy W
So much simpler, so much more elegant. Thank you.


Re: Odd, very odd, something

Maguin, Eugene
In reply to this post by David Marso
Hi David,
The whole point was to scan through the 36 month-year values, identify the first and last nonmissing values in the sequence, and then stash the labels of those two values in the Range variable. The complication was that the variables did not have variable labels associated with them. I also knew from what the data were about that there were no embedded missing values.
Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Wednesday, January 08, 2014 8:33 PM
To: [hidden email]
Subject: Re: Odd, very odd, something

The core of my question is what do you want to do.
What do you start with?  What do you want to end up with?


Maguin, Eugene wrote

> David,
> This was the entire section of the file that was executed. It read an
> excel file of 94 records. You are certainly welcome to comment but to
> put the excel file up, I'd want to work it over to change some data
> values and I'd rather not. Gene
>
> ***************************************************************************.
> GET DATA /TYPE=ODBC/CONNECT='DSN=Excel Files;DBQ=H:\xxx Data\Data\'+
>    'wwww_Days.xlsx;DriverId=1046;MaxBufferSize=2048;PageTimeout=5;'+
>    ';QuotedId=Yes'/SQL='SELECT Location, Capacity, Jan08, Feb08, Mar08, '+
>    'Apr08, May08, Jun08, Jul08, Aug08, Sep08, Oct08, Nov08, Dec08, Jan09, '+
>    'Feb09, Mar09, Apr09, May09, Jun09, Jul09, Aug09, Sep09, Oct09, Nov09, '+
>    'Dec09, Jan10, Feb10, Mar10, Apr10, May10, Jun10, Jul10, Aug10, Sep10, '+
>    'Oct10, Nov10, Dec10 FROM `DaysCare$`'/ASSUMEDSTRWIDTH=255.
> CACHE.
> EXECUTE.
>
> ALTER TYPE Location(AMIN).
> FORMAT Capacity Jan08 Feb08 Mar08 Apr08 May08 Jun08 Jul08 Aug08 Sep08 Oct08
>   Nov08 Dec08 Jan09 Feb09 Mar09 Apr09 May09 Jun09 Jul09 Aug09 Sep09 Oct09
>   Nov09 Dec09 Jan10 Feb10 Mar10 Apr10 May10 Jun10 Jul10 Aug10 Sep10 Oct10
>   Nov10 Dec10(F4.0).
>
> *  GET RID OF RECORD 94 WHICH HAS A BLANK LOCATION FIELD.
> SELECT IF (LOCATION NE '     ').
> EXECUTE.
>
> COUNT Vals=Jan08 Feb08 Mar08 Apr08 May08 Jun08 Jul08 Aug08 Sep08 Oct08
>   Nov08 Dec08 Jan09 Feb09 Mar09 Apr09 May09 Jun09 Jul09 Aug09 Sep09 Oct09
>   Nov09 Dec09 Jan10 Feb10 Mar10 Apr10 May10 Jun10 Jul10 Aug10 Sep10 Oct10
>   Nov10 Dec10(MISSING).
> FORMAT VALS(F2.0).
> COMPUTE VALS=36-VALS.
> FREQUENCIES VALS.
>
> COMPUTE SEQ=$CASENUM.
> EXECUTE.
> FORMAT SEQ(F3.0).
>
> STRING RANGE(A11).
> COMPUTE RANGE='     -     '.
> VECTOR #MO(36,F3.0)/#AMO(36,A5).
> DO REPEAT X=JAN08 TO DEC10/Y=#MO1 TO #MO36/B=#AMO1 TO #AMO36/
>    A='Jan08' 'Feb08' 'Mar08' 'Apr08' 'May08' 'Jun08' 'Jul08' 'Aug08' 'Sep08'
>    'Oct08' 'Nov08' 'Dec08' 'Jan09' 'Feb09' 'Mar09' 'Apr09' 'May09' 'Jun09'
>    'Jul09' 'Aug09' 'Sep09' 'Oct09' 'Nov09' 'Dec09' 'Jan10' 'Feb10' 'Mar10'
>    'Apr10' 'May10' 'Jun10' 'Jul10' 'Aug10' 'Sep10' 'Oct10' 'Nov10' 'Dec10'.
> +  COMPUTE Y=X.
> +  COMPUTE B=A.
> END REPEAT.
> LOOP #I=1 TO 36.
> +  DO IF (#I EQ 1).
> +     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,1,5)=#AMO(#I).
> +  ELSE IF (#I EQ 36).
> +     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,7,5)=#AMO(#I).
> +  ELSE.
> +     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I-1)))
>          SUBSTR(RANGE,1,5)=#AMO(#I).
> +     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I+1)))
>          SUBSTR(RANGE,7,5)=#AMO(#I).
> +  END IF.
> END LOOP.
> EXECUTE.
>
> WRITE OUTFILE='H:\xxx Data\Data\Range_Months.txt' /
>    SEQ ' ' LOCATION ' ' RANGE.
> EXECUTE.
>
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@.UGA] On Behalf Of David Marso
> Sent: Wednesday, January 08, 2014 4:54 PM
> To: SPSSX-L@.UGA
> Subject: Re: Odd, very odd, something
>
> Gene,
> Maybe attach a small subset of data and the actual syntax file to the
> thread in Nabble?
> http://spssx-discussion.1045642.n5.nabble.com/Odd-very-odd-something-tt5723836.html
> See the More button on the right!
> I don't have time to try to guess the contents of your existing data set.
> I assume Jan08 .... Dec10 on the DO REPEAT are existing variables.
> Even more important.  What are you actually attempting to do here?
> Maybe there is a more straightforward syntax?
> D.
-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Odd-very-odd-something-tp5723836p5723850.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

Re: Odd, very odd, something

David Marso
Administrator
In reply to this post by Andy W
Very nice Andy!


Re: Odd, very odd, something

David Marso
Administrator

Maybe load the 'labels' programmatically?

VECTOR #MOYR (36,A5).
DO IF $CASENUM=1.
+  COMPUTE #=1.
+  LOOP #Y=1 TO 3.
+    LOOP #M=1 TO 12.
+      COMPUTE #MoYr(#)=CONCAT(CHAR.SUBSTR("JanFebMarAprMayJunJulAugSepOctNovDec",(#M-1)*3+1,3),
                               CHAR.SUBSTR("080910",(#Y-1)*2+1,2)).
+      COMPUTE #=#+1.
+    END LOOP.
+  END LOOP.
END IF.

* Stick Andy's code here using #MOYR1 TO #MoYr36 in second DO REPEAT list.
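As a rough cross-check of the label-building loop above, here is the same slicing arithmetic as a Python sketch. The helper name `month_year_labels` is hypothetical; the two source strings and the 1-based `(m-1)*3+1` / `(y-1)*2+1` offsets are taken from the syntax above and converted to Python's 0-based slices.

```python
from typing import List

MONTHS = "JanFebMarAprMayJunJulAugSepOctNovDec"
YEARS = "080910"


def month_year_labels() -> List[str]:
    """Build 'Jan08' .. 'Dec10' by slicing the two packed strings."""
    labels = []
    for y in range(1, 4):        # mirrors LOOP #Y=1 TO 3
        for m in range(1, 13):   # mirrors LOOP #M=1 TO 12
            month = MONTHS[(m - 1) * 3:(m - 1) * 3 + 3]  # 3 characters per month
            year = YEARS[(y - 1) * 2:(y - 1) * 2 + 2]    # 2 characters per year
            labels.append(month + year)
    return labels


labels = month_year_labels()
print(labels[0], labels[11], labels[12], labels[-1])
```

The slices here count characters, which is what the CHAR.SUBSTR calls in the syntax are meant to do as well.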


Re: Odd, very odd, something

Andy W
I was also thinking that if you know they are going to be months and you know the start month-year, you could use something like:

DATA LIST FREE /Dec08 Jan09 Feb09 Mar09.
BEGIN DATA
1 1 . .
. 1 1 .
. . 1 1
1 1 1 1
END DATA.
STRING Left Right (A6).
COMPUTE #iter = 12.
DO REPEAT X=Dec08 TO Mar09.
  COMPUTE #Month = MOD(#iter - 1,12) + 1.
  COMPUTE #Year = TRUNC((#iter-1)/12) + 2008.
  COMPUTE #Date = DATE.MDY(#Month,1,#Year).
  IF NOT(MISSING(X)) AND Left = " " Left = STRING(#Date,MOYR6).
  IF NOT(MISSING(X)) Right = STRING(#Date,MOYR6).
  COMPUTE #iter = #iter + 1.
END REPEAT.

This requires knowing the start month-year, but if you feed it the variable list you don't need to know the end. I didn't want Gene's work on writing out all of those strings to go to waste though!
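The counter arithmetic in that loop can be checked outside SPSS with a small Python sketch. The names `label_for` and `labels_from` are invented for the illustration, and the "DEC 08" output shape is an assumption about what the MOYR6 format displays, not something verified in this thread.

```python
from typing import List

MONTH_ABBR = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
              "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def label_for(counter: int, base_year: int = 2008) -> str:
    """Map a running month counter (1 = January of base_year) to a label."""
    month = (counter - 1) % 12 + 1            # same as MOD(#iter - 1,12) + 1
    year = (counter - 1) // 12 + base_year    # same as TRUNC((#iter-1)/12) + 2008
    return f"{MONTH_ABBR[month - 1]} {year % 100:02d}"  # assumed MOYR6-like shape


def labels_from(start_counter: int, n_vars: int) -> List[str]:
    """Labels for n_vars consecutive months, starting at start_counter."""
    return [label_for(start_counter + i) for i in range(n_vars)]


labels = labels_from(12, 4)  # start at December 2008, four variables
print(labels)
```

Starting the counter at 12 is what makes the first variable December 2008; only the start has to be supplied, and the number of variables determines the end.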

(IMO I would most often turn this data from wide to long for other analyses, and it would be as simple as VARSTOCASES and then an AGGREGATE with FIRST and LAST in that format.)
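The restructure-then-aggregate idea can be sketched in Python as well. This is not SPSS output: `to_long` and `first_last` are hypothetical helpers, `None` stands in for missing, and dropping missing values during the restructure is assumed to match what VARSTOCASES does by default.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def to_long(rows: Sequence[Sequence[Optional[float]]],
            labels: Sequence[str]) -> List[Tuple[int, str, float]]:
    """Wide to long: one (case id, month label, value) record per non-missing value."""
    long_records = []
    for case_id, row in enumerate(rows, start=1):
        for label, value in zip(labels, row):
            if value is not None:
                long_records.append((case_id, label, value))
    return long_records


def first_last(long_records: Sequence[Tuple[int, str, float]]) -> Dict[int, Tuple[str, str]]:
    """Per case, keep the first and the last label seen (records are in month order)."""
    result: Dict[int, Tuple[str, str]] = {}
    for case_id, label, _value in long_records:
        first, _previous_last = result.get(case_id, (label, label))
        result[case_id] = (first, label)
    return result


labels = ["Jan08", "Feb08", "Mar08", "Apr08"]
rows = [[1, 1, None, None], [None, 1, 1, None], [None, None, 1, 1], [1, 1, 1, 1]]
summary = first_last(to_long(rows, labels))
print(summary)
```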
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: Odd, very odd, something. Info correction.

Albert-Jan Roskam
In reply to this post by Jon K Peck
So many "high" dashes! Why not just have one and only one.

Code Name
U+002D hyphen-minus
U+007E tilde (when used as swung dash)
U+058A armenian hyphen
U+05BE hebrew punctuation maqaf
U+1400 canadian syllabics hyphen
U+1806 mongolian todo soft hyphen
U+2010 hyphen
U+2011 non-breaking hyphen
U+2012 figure dash
U+2013 en dash
U+2014 em dash
U+2015 horizontal bar (=quotation dash)
U+2053 swung dash
U+207B superscript minus
U+208B subscript minus
U+2212 minus sign
U+2E17 double oblique hyphen
U+301C wave dash
U+3030 wavy dash
U+30A0 katakana-hiragana double hyphen
U+FE31 presentation form for vertical em dash
U+FE32 presentation form for vertical en dash
U+FE58 small em dash
U+FE63 small hyphen-minus
U+FF0D fullwidth hyphen-minus

source: http://www.unicode.org/versions/Unicode6.3.0/ch06.pdf, p 196.
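What matters for byte-oriented string functions is how many bytes each of these takes in UTF-8. A short Python sketch, using a handful of code points picked from the list above (the selection and the helper name `utf8_width` are just for illustration):

```python
import unicodedata

# A few code points from the list above.
DASHES = [0x002D, 0x058A, 0x2010, 0x2013, 0x2014, 0x2212, 0xFF0D]


def utf8_width(code_point: int) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(chr(code_point).encode("utf-8"))


for cp in DASHES:
    name = unicodedata.name(chr(cp)).lower()
    print(f"U+{cp:04X} {name:<26} {utf8_width(cp)} byte(s)")
```

Only the plain hyphen-minus is a single byte; the en dash and em dash that word processors and mail clients like to substitute are three bytes each in UTF-8.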

Regards,

Albert-Jan



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

--------------------------------------------
On Thu, 1/9/14, Jon K Peck <[hidden email]> wrote:

 Subject: Re: [SPSSX-L] Odd, very odd, something. Info correction.
 To: [hidden email]
 Date: Thursday, January 9, 2014, 1:41 AM

 I believe that you had a non-ascii dash, which is two bytes in Unicode, and the logic of your code would only work if each character, including the dash, is one byte, so the result is an invalid utf-8 character. If any of the input fields can also contain accented or other non-ascii characters, the situation will be even worse.

 When you retyped the RANGE string, you apparently got an ascii dash.

 It is important for people to stop assuming that a byte is a character and to use the char.* functions that Statistics has provided since V16. And avoid left hand side substr.




 Jon Peck (no "h") aka Kim
 Senior Software Engineer, IBM
 [hidden email]
 phone: 720-342-5621

 From: "Maguin, Eugene" <[hidden email]>
 To: [hidden email],
 Date: 01/08/2014 02:45 PM
 Subject: Re: [SPSSX-L] Odd, very odd, something. Info correction.
 Sent by: "SPSSX(r) Discussion" <[hidden email]>

 I just looked at the edit-options-general box and the two options in character encoding section are both grayed out but the Unicode circle is bulleted. So perhaps I am really running in Unicode and didn't realize it.

 I retyped the line COMPUTE RANGE='     -     '. And re-ran the section and no diamonds, just dashes. Even if I start over, I can't reproduce the problem. So: FWIW.

 Gene Maguin

Re: Odd, very odd, something. Info correction.

Jon K Peck
With over 100,000 characters in Unicode, why scrimp on dashes?


Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




From:        Albert-Jan Roskam <[hidden email]>
To:        [hidden email], Jon K Peck/Chicago/IBM@IBMUS,
Date:        01/10/2014 08:30 AM
Subject:        Re: [SPSSX-L] Odd, very odd, something. Info correction.




So many "high" dashes! Why not just have one and only one.
