Given an A11 variable, initially defined to be ‘…..-…..’, where a dot is a space, I replace using the SUBSTR function the 5 spaces on either side of the dash character, ‘-‘, with a 5 character string such as ‘Jan08’. The result in the data
window shows a black diamond shaped character with an embedded, white question mark character. An example is ‘Jan08�Dec10’. So, naturally, the question is what is going on? And how can it be fixed so that the dash character shows instead of the diamond character? If it matters: 21, fully patched, not Unicode. Thanks, Gene Maguin |
Just a shot in the dark, but is it really
a dash, or is it an em-dash or an en-dash?
Rick Oliver Senior Information Developer IBM Business Analytics (SPSS) E-mail: [hidden email] |
In reply to this post by Maguin, Eugene
The question mark indicates that you have
an unprintable character in that location. If you are not in Unicode
mode and using a western code page such as the usual cp1252, there are
only a few such character slots. Please post some code that shows
this behavior.
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621 |
Rick Oliver: I believe it is an ordinary dash, if such a thing still exists. Specifically, the key to the right of the )/0 key.
Jon Peck: Here is the syntax.

STRING RANGE(A11).
COMPUTE RANGE='     -     '.
VECTOR #MO(36,F3.0)/#AMO(36,A5).
DO REPEAT X=JAN08 TO DEC10/Y=#MO1 TO #MO36/B=#AMO1 TO #AMO36/
   A='Jan08' 'Feb08' 'Mar08' 'Apr08' 'May08' 'Jun08' 'Jul08' 'Aug08' 'Sep08'
   'Oct08' 'Nov08' 'Dec08' 'Jan09' 'Feb09' 'Mar09' 'Apr09' 'May09' 'Jun09'
   'Jul09' 'Aug09' 'Sep09' 'Oct09' 'Nov09' 'Dec09' 'Jan10' 'Feb10' 'Mar10'
   'Apr10' 'May10' 'Jun10' 'Jul10' 'Aug10' 'Sep10' 'Oct10' 'Nov10' 'Dec10'.
+  COMPUTE Y=X.
+  COMPUTE B=A.
END REPEAT.
LOOP #I=1 TO 36.
+  DO IF (#I EQ 1).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,1,5)=#AMO(#I).
+  ELSE IF (#I EQ 36).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,7,5)=#AMO(#I).
+  ELSE.
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I-1))) SUBSTR(RANGE,1,5)=#AMO(#I).
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I+1))) SUBSTR(RANGE,7,5)=#AMO(#I).
+  END IF.
END LOOP.
EXECUTE.
|
In reply to this post by Maguin, Eugene
I just looked at the edit-options-general box, and the two options in the character encoding section are both grayed out, but the Unicode circle is bulleted. So perhaps I am really running in Unicode and didn't realize it. I retyped the line COMPUTE RANGE='     -     '. and re-ran the section: no diamonds, just dashes. Even if I start over, I can't reproduce the problem. So: FWIW.
Gene Maguin
|
If the dash is not the plain ASCII hyphen,
then it would take more than one byte in Unicode mode (an en dash or em
dash is three bytes in UTF-8), and you would get an invalid byte sequence
and, hence, the bad display. I thought we issued an error message
for left-hand-side use of substr in Unicode mode, but apparently we
do that only for char.substr.
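The invalid byte sequence described above can be reproduced outside Statistics. The following is only a rough Python sketch, not SPSS code. It assumes that an A11 string in Unicode mode holds 11 bytes of UTF-8 and that left-hand-side SUBSTR overwrites bytes rather than characters; the en dash is a guess at the culprit, and `simulate` is a made-up helper name.

```python
# Assumption: an A11 variable stores 11 bytes of UTF-8, and
# left-hand-side SUBSTR overwrites byte positions, not characters.
EN_DASH = "\u2013"  # hypothetical culprit: 3 bytes in UTF-8 (E2 80 93)


def simulate(dash: str) -> str:
    """Mimic COMPUTE RANGE=..., then the two left-hand-side SUBSTR calls."""
    # 5 spaces + dash + 5 spaces, cut to the 11-byte width of the variable.
    buf = bytearray(f"     {dash}     ".encode("utf-8")[:11])
    buf[0:5] = b"Jan08"   # SUBSTR(RANGE,1,5)=...  (bytes 1-5)
    buf[6:11] = b"Dec10"  # SUBSTR(RANGE,7,5)=...  (bytes 7-11)
    # Undecodable bytes are shown as U+FFFD, the diamond with a question mark.
    return bytes(buf).decode("utf-8", errors="replace")


ascii_result = simulate("-")
en_dash_result = simulate(EN_DASH)
print(ascii_result)
print(en_dash_result)
```

With the plain hyphen, the dash sits alone in byte 6 and survives. With the en dash, the second overwrite clobbers bytes 7 and 8, which were the tail of the dash, leaving a lone lead byte in position 6 that no longer decodes.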
This, then, is an illustration of exactly the problem with lhs substr that I wrote about recently.
Jon Peck |
In reply to this post by Maguin, Eugene
Gene,
Maybe attach a small subset of data and the actual syntax file to the thread in Nabble? http://spssx-discussion.1045642.n5.nabble.com/Odd-very-odd-something-tt5723836.html See the More button on the right! I don't have time to try to guess the contents of your existing data set. I assume Jan08 .... Dec10 on the DO REPEAT are existing variables. Even more important: what are you actually attempting to do here? Maybe there is a more straightforward syntax? D. --
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Jon K Peck
In this specific case, Replace would work fine because I can specify that only one instance of '     ' (five spaces) be replaced, which would be the leftmost on the first call and then the rightmost on the second call. However, it seems to me that Substr is (was) more general, in that you could overwrite whatever was in the 'j' characters beginning at position 'i'. Is that an accurate summary?
Gene Maguin
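The general "overwrite j characters beginning at position i" operation is safe as long as it counts characters rather than bytes. Here is a small Python sketch of that idea; `overwrite` is a hypothetical helper, not an SPSS function, and Python strings are indexed by character, which is the behaviour the char.* functions are meant to give.

```python
def overwrite(text: str, start: int, new: str) -> str:
    """Replace len(new) characters of text, beginning at 1-based position start."""
    i = start - 1
    return text[:i] + new + text[i + len(new):]


# 11 characters: 5 spaces, an en dash (non-ASCII on purpose), 5 spaces.
template = "     \u2013     "
left = overwrite(template, 1, "Jan08")
both = overwrite(left, 7, "Dec10")
print(both)
```

Because positions are counted in characters, the multi-byte dash in position 6 is left untouched by both overwrites.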
|
In reply to this post by David Marso
David,
This was the entire section of the file that was executed. It read an Excel file of 94 records. You are certainly welcome to comment, but to put the Excel file up, I'd want to work it over to change some data values and I'd rather not.
Gene

***************************************************************************.
GET DATA /TYPE=ODBC/CONNECT='DSN=Excel Files;DBQ=H:\xxx Data\Data\'+
   'wwww_Days.xlsx;DriverId=1046;MaxBufferSize=2048;PageTimeout=5;'+
   ';QuotedId=Yes'/SQL='SELECT Location, Capacity, Jan08, Feb08, Mar08, '+
   'Apr08, May08, Jun08, Jul08, Aug08, Sep08, Oct08, Nov08, Dec08, Jan09, '+
   'Feb09, Mar09, Apr09, May09, Jun09, Jul09, Aug09, Sep09, Oct09, Nov09, '+
   'Dec09, Jan10, Feb10, Mar10, Apr10, May10, Jun10, Jul10, Aug10, Sep10, '+
   'Oct10, Nov10, Dec10 FROM `DaysCare$`'/ASSUMEDSTRWIDTH=255.
CACHE.
EXECUTE.

ALTER TYPE Location(AMIN).
FORMAT Capacity Jan08 Feb08 Mar08 Apr08 May08 Jun08 Jul08 Aug08 Sep08 Oct08
   Nov08 Dec08 Jan09 Feb09 Mar09 Apr09 May09 Jun09 Jul09 Aug09 Sep09 Oct09
   Nov09 Dec09 Jan10 Feb10 Mar10 Apr10 May10 Jun10 Jul10 Aug10 Sep10 Oct10
   Nov10 Dec10(F4.0).

* GET RID OF RECORD 94 WHICH HAS A BLANK LOCATION FIELD.
SELECT IF (LOCATION NE ' ').
EXECUTE.

COUNT Vals=Jan08 Feb08 Mar08 Apr08 May08 Jun08 Jul08 Aug08 Sep08 Oct08
   Nov08 Dec08 Jan09 Feb09 Mar09 Apr09 May09 Jun09 Jul09 Aug09 Sep09 Oct09
   Nov09 Dec09 Jan10 Feb10 Mar10 Apr10 May10 Jun10 Jul10 Aug10 Sep10 Oct10
   Nov10 Dec10(MISSING).
FORMAT VALS(F2.0).
COMPUTE VALS=36-VALS.
FREQUENCIES VALS.

COMPUTE SEQ=$CASENUM.
EXECUTE.
FORMAT SEQ(F3.0).

STRING RANGE(A11).
COMPUTE RANGE='     -     '.
VECTOR #MO(36,F3.0)/#AMO(36,A5).
DO REPEAT X=JAN08 TO DEC10/Y=#MO1 TO #MO36/B=#AMO1 TO #AMO36/
   A='Jan08' 'Feb08' 'Mar08' 'Apr08' 'May08' 'Jun08' 'Jul08' 'Aug08' 'Sep08'
   'Oct08' 'Nov08' 'Dec08' 'Jan09' 'Feb09' 'Mar09' 'Apr09' 'May09' 'Jun09'
   'Jul09' 'Aug09' 'Sep09' 'Oct09' 'Nov09' 'Dec09' 'Jan10' 'Feb10' 'Mar10'
   'Apr10' 'May10' 'Jun10' 'Jul10' 'Aug10' 'Sep10' 'Oct10' 'Nov10' 'Dec10'.
+  COMPUTE Y=X.
+  COMPUTE B=A.
END REPEAT.
LOOP #I=1 TO 36.
+  DO IF (#I EQ 1).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,1,5)=#AMO(#I).
+  ELSE IF (#I EQ 36).
+     IF (NOT(SYSMIS(#MO(#I)))) SUBSTR(RANGE,7,5)=#AMO(#I).
+  ELSE.
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I-1))) SUBSTR(RANGE,1,5)=#AMO(#I).
+     IF (NOT(SYSMIS(#MO(#I))) AND SYSMIS(#MO(#I+1))) SUBSTR(RANGE,7,5)=#AMO(#I).
+  END IF.
END LOOP.
EXECUTE.

WRITE OUTFILE='H:\xxx Data\Data\Range_Months.txt' /
   SEQ ' ' LOCATION ' ' RANGE.
EXECUTE.
|
I see no need for the left hand side substring in that example. Just build two strings, say "Left" and "Right" - then replace "SUBSTR(RANGE,1,5)" in the code with Left and "SUBSTR(RANGE,7,5)" in the code with Right. Then after the loop concatenate the strings together (with your hyphen in between).
IMO it would be better to use VARSTOCASES and not worry about hard coding all those strings as well (-1 and +1 in the loop then just becomes LAG and LEAD - or you could parse the strings to make actual date variables). |
In reply to this post by Maguin, Eugene
I believe that you had a non-ASCII dash,
which takes more than one byte in Unicode mode (three bytes in UTF-8 for
an en dash or em dash), and the logic of your code would only work
if each character, including the dash, is one byte, so the result is an
invalid UTF-8 sequence. If any of the input fields can also contain
accented or other non-ASCII characters, the situation will be even worse.
When you retyped the RANGE string, you apparently got an ascii dash. It is important for people to stop assuming that a byte is a character and to use the char.* functions that Statistics has provided since V16. And avoid left hand side substr.
Jon Peck |
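To make the "a byte is not a character" point concrete, here is a short Python sketch (not SPSS syntax) comparing character counts with UTF-8 byte counts. The sample characters are picked only for illustration; the thread never establishes which dash was actually in the data.

```python
# Illustrative samples only; the actual dash in the data was never identified.
samples = {
    "hyphen-minus": "-",   # U+002D, plain ASCII
    "en dash": "\u2013",
    "em dash": "\u2014",
    "e-acute": "\u00e9",   # an accented letter, for comparison
}

# Every sample is exactly one character, but the UTF-8 byte count varies.
sizes = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

for name, ch in samples.items():
    print(f"{name}: {len(ch)} character, {sizes[name]} byte(s) in UTF-8")
```

Any positional logic written in bytes (positions 1-5, 6, 7-11) holds only for the first row of that output.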
In reply to this post by Maguin, Eugene
Also, you appear to be just finding the first and last valid dates in the range, correct? A more minimal example with some simpler code is below.

DATA LIST FREE /Jan08 Feb08 Mar08 Apr08.
BEGIN DATA
1 1 . .
. 1 1 .
. . 1 1
1 1 1 1
END DATA.
STRING Range (A11).
STRING Left Right (A5).
DO REPEAT X=JAN08 TO Apr08 /A='Jan08' 'Feb08' 'Mar08' 'Apr08'.
IF NOT(MISSING(X)) AND Left = " " Left = A.
IF NOT(MISSING(X)) Right = A.
END REPEAT.
COMPUTE Range = CONCAT(Left,"-",Right).

The first IF statement only assigns the string if X is not missing and Left is still empty. In the second IF statement the last valid value wins. This assumes stuff in the middle isn't missing; I'm not quite sure what your original code does if that is the case. |
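The "first valid value sets Left, last valid value wins for Right" logic can be sketched in Python as follows. This is an illustration of the algorithm rather than a translation that Statistics would run: `month_range` is a hypothetical name, and `None` stands in for a system-missing value.

```python
from typing import Optional, Sequence


def month_range(labels: Sequence[str], values: Sequence[Optional[float]]) -> str:
    """Join the labels of the first and last non-missing values with a hyphen."""
    left = ""
    right = ""
    for label, value in zip(labels, values):
        if value is None:      # None plays the role of system-missing
            continue
        if left == "":         # only the first valid month sets Left
            left = label
        right = label          # every valid month overwrites Right
    return f"{left}-{right}"


labels = ["Jan08", "Feb08", "Mar08", "Apr08"]
rows = [
    [1, 1, None, None],
    [None, 1, 1, None],
    [None, None, 1, 1],
    [1, 1, 1, 1],
]
ranges = [month_range(labels, row) for row in rows]
print(ranges)
```

Because the two halves are built separately and joined once with a literal ASCII hyphen, no positional overwrite of an existing string is needed.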
In reply to this post by Maguin, Eugene
The core of my question is what do you want to do.
What do you start with? What do you want to end up with?
|
In reply to this post by Andy W
So much simpler, so much more elegant. Thank you.
|
In reply to this post by David Marso
Hi David,
The whole point was to scan through the 36 month-year values and identify the first and last non missing values in the sequence and then stash the labels of those two values in the Range variable. The complication was that the variable names did not have variable labels associated with them. I also knew from what the data were about that there were no embedded missing values.
Gene Maguin |
In reply to this post by Andy W
Very nice Andy!
|
Maybe load the 'labels' programmatically?

VECTOR #MOYR (36,A5).
DO IF $CASENUM=1.
+  COMPUTE #=1.
+  LOOP #Y=1 TO 3.
+     LOOP #M=1 TO 12.
+        COMPUTE #MoYr(#)=CONCAT(CHAR.SUBSTR("JanFebMarAprMayJunJulAugSepOctNovDec",(#M-1)*3+1,3),
            CHAR.SUBSTR("080910",(#Y-1)*2+1,2)).
+        COMPUTE #=#+1.
+     END LOOP.
+  END LOOP.
END IF.
* Stick Andy's code here using #MOYR1 TO #MoYr36 in second DO REPEAT list.
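For readers who want to check the index arithmetic, the same label generation can be sketched in Python. The two packed strings are taken from the syntax above; the variable names are illustrative, and Python slices are 0-based, so the "+1" from the 1-based CHAR.SUBSTR positions disappears.

```python
MONTHS = "JanFebMarAprMayJunJulAugSepOctNovDec"  # 12 three-letter names
YEARS = "080910"                                 # 3 two-digit years

labels = []
for y in range(1, 4):          # same counters as #Y and #M above
    for m in range(1, 13):
        month = MONTHS[(m - 1) * 3:(m - 1) * 3 + 3]
        year = YEARS[(y - 1) * 2:(y - 1) * 2 + 2]
        labels.append(month + year)

print(labels[0], labels[12], labels[-1], len(labels))
```

The outer loop runs over years and the inner loop over months, so the labels come out in calendar order, matching the order of the 36 variables.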
|
I was thinking too if you know they are going to be months and you know the start month-year you could use something like:

DATA LIST FREE /Dec08 Jan09 Feb09 Mar09.
BEGIN DATA
1 1 . .
. 1 1 .
. . 1 1
1 1 1 1
END DATA.
STRING Left Right (A6).
COMPUTE #iter = 12.
DO REPEAT X=Dec08 TO Mar09.
COMPUTE #Month = MOD(#iter - 1,12) + 1.
COMPUTE #Year = TRUNC((#iter-1)/12) + 2008.
COMPUTE #Date = DATE.MDY(#Month,1,#Year).
IF NOT(MISSING(X)) AND Left = " " Left = STRING(#Date,MOYR6).
IF NOT(MISSING(X)) Right = STRING(#Date,MOYR6).
COMPUTE #iter = #iter + 1.
END REPEAT.

This takes knowing the start month-year, but if you feed it the variable list you don't need to know the end. I didn't want Gene's work on writing out all of those strings to go to waste, though! (IMO I would most often turn this data from wide to long for other analyses, and it would be as simple as VARSTOCASES and then an aggregate with FIRST and LAST in that format.) |
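The counter arithmetic in that syntax (MOD for the month, TRUNC for the year offset) can be checked with a few lines of Python. This is only a sketch: `month_year` is a hypothetical helper name, and the base year 2008 is the one hard-coded in the example.

```python
def month_year(counter: int, base_year: int = 2008) -> tuple:
    """Map a running month counter (1 = January of base_year) to (month, year)."""
    month = (counter - 1) % 12 + 1           # MOD(#iter - 1, 12) + 1
    year = (counter - 1) // 12 + base_year   # TRUNC((#iter - 1)/12) + 2008
    return month, year


# The example starts at Dec08, i.e. counter 12, and covers four variables.
pairs = [month_year(counter) for counter in range(12, 16)]
print(pairs)
```

Subtracting 1 before the modulo and adding it back afterwards is what keeps December as month 12 instead of wrapping to 0.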
In reply to this post by Jon K Peck
So many "high" dashes! Why not just have one and only one.
Code Name U+002D hyphen-minus U+007E tilde (when used as swung dash) U+058A armenian hyphen U+05BE hebrew punctuation maqaf U+1400 canadian syllabics hyphen U+1806 mongolian todo soft hyphen U+2010 hyphen U+2011 non-breaking hyphen U+2012 figure dash U+2013 en dash U+2014 em dash U+2015 horizontal bar (=quotation dash) U+2053 swung dash U+207B superscript minus U+208B subscript minus U+2212 minus sign U+2E17 double oblique hyphen U+301C wav e da s h U+3030 wav y da s h U+30A0 katakana-hiragana double hyphen U+FE31 presentation form for vertical em dash U+FE32 presentation form for vertical en dash U+FE58 small em dash U+FE63 small hyphen-minus U+FF0D fullwidth hyphen-minus source: http://www.unicode.org/versions/Unicode6.3.0/ch06.pdf, p 196. Regards, Albert-Jan ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -------------------------------------------- On Thu, 1/9/14, Jon K Peck <[hidden email]> wrote: Subject: Re: [SPSSX-L] Odd, very odd, something. Info correction. To: [hidden email] Date: Thursday, January 9, 2014, 1:41 AM I believe that you had a non-ascii dash, which is two bytes in Unicode, and the logic of your code would only work if each character, including the dash, is one byte, so the result is an invalid utf-8 character. � If any of the input fields can also contain accented or other non-ascii characters, the situation will be even worse. When you retyped the RANGE string, you apparently got an ascii dash. It is important for people to stop assuming that a byte is a character and to use the char.* functions that Statistics has provided since V16. � And avoid left hand side substr. 
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621

From: "Maguin, Eugene" <[hidden email]>
To: [hidden email]
Date: 01/08/2014 02:45 PM
Subject: Re: [SPSSX-L] Odd, very odd, something. Info correction.
Sent by: "SPSSX(r) Discussion" <[hidden email]>

I just looked at the edit-options-general box and the two options in character encoding section are both grayed out but the Unicode circle is bulleted. So perhaps I am really running in Unicode and didn't realize it.

I retyped the line COMPUTE RANGE='     -     '. And re-ran the section and no diamonds, just dashes. Even if I start over, I can't reproduce the problem. So: FWIW.

Gene Maguin |
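The byte-versus-character point in Jon Peck's quoted reply can be illustrated outside SPSS. The following is a minimal Python sketch, not the code from the original job (which was never posted). It assumes the stray character was an en dash (U+2013, three bytes in UTF-8) and that the A11 field holds 11 bytes; both are assumptions made only for illustration.

```python
# Sketch: how a byte-position overwrite can corrupt a multi-byte dash.
# Assumptions (not confirmed in the thread): the dash was an en dash,
# and the 11-wide string field is 11 bytes of UTF-8.

EN_DASH = "\u2013"                      # encodes to b"\xe2\x80\x93"
template = " " * 5 + EN_DASH + " " * 5  # 11 characters, but 13 bytes

# Mimic an 11-byte field: the value is cut off after 11 bytes.
raw = bytearray(template.encode("utf-8")[:11])

# Byte-oriented edits that assume one byte per character.
raw[0:5] = b"Jan08"
raw[6:11] = b"Dec10"   # overwrites the 2nd and 3rd bytes of the dash

# The orphaned lead byte 0xE2 is invalid UTF-8 and is displayed as
# U+FFFD, the replacement character (the black diamond).
broken = bytes(raw).decode("utf-8", errors="replace")

# Character-oriented edits leave the dash intact.
chars = list(template)
chars[0:5] = "Jan08"
chars[6:11] = "Dec10"
fixed = "".join(chars)

print(repr(broken))  # 'Jan08\ufffdDec10'
print(repr(fixed))   # 'Jan08\u2013Dec10'
```

Under these assumptions the byte-wise version reproduces the reported 'Jan08�Dec10', while the character-wise version keeps the dash, which is the behaviour the char.* functions are meant to give.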
With over 100,000 characters in Unicode,
why scrimp on dashes?
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621 |
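Given how many dash-like code points appear in the table quoted in this thread, one defensive step is to locate or normalize them before doing any position-based string edits. The sketch below is in Python rather than SPSS syntax; the function names and the chosen subset of code points are hypothetical and only illustrate the idea.

```python
# Sketch: find non-ASCII characters and map common dashes to U+002D.
# The subset of code points below is an illustrative choice taken from
# the dash table quoted in the thread, not an exhaustive list.

DASH_CODE_POINTS = [
    0x2010, 0x2011, 0x2012, 0x2013, 0x2014, 0x2015,
    0x2212, 0xFE58, 0xFE63, 0xFF0D,
]
TO_ASCII_HYPHEN = {cp: "-" for cp in DASH_CODE_POINTS}


def normalize_dashes(text: str) -> str:
    """Replace the listed dash variants with an ASCII hyphen-minus."""
    return text.translate(TO_ASCII_HYPHEN)


def non_ascii_positions(text: str) -> list:
    """Return (index, 'U+XXXX') for every character above U+007F."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ord(ch) > 127]


print(non_ascii_positions("Jan08\u2014Dec10"))  # [(5, 'U+2014')]
print(normalize_dashes("Jan08\u2013Dec10"))     # Jan08-Dec10
```

Running a check like `non_ascii_positions` on a template string would have shown whether the original dash was the ASCII hyphen-minus or one of its look-alikes.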