converting letters to integer numbers

Maguin, Eugene

It’s that question, in various iterations, that I ask every so often.

Given x (A1 format) = ‘R’, for instance, I’d like to convert the value of x to a number that arithmetic can be performed on, i.e., F format.

I know that x can be displayed as AHEX2 with a FORMATS statement, as the manual shows (p. 59-60).

I know there are other ways, such as a DO REPEAT, and probably others that more knowledgeable people know, but is there a straightforward, single-statement, COMPUTE-based approach, maybe involving the string and numeric functions?

Oh, let me add that this is version 21 and I think it is not Unicode, or, at least, the Data Editor does not indicate Unicode, which I think version 22 does.

Thanks, Gene Maguin

 

addendum to my posting

Maguin, Eugene

I was wrong. The data encoding is Unicode.

Re: converting letters to integer numbers

Jon K Peck
In reply to this post by Maguin, Eugene
This will give you the numeric codes for each character.

data list list/letter(a1).
begin data
'a'
'A'
'1'
end data.
dataset name letters.
compute number = number(letter, PIB1).
list.

If your inputs are actually numeric strings, then you can just use F format above.
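
For anyone who wants to check the expected values outside SPSS, here is a minimal Python sketch of what reading a one-character string as a one-byte positive integer amounts to. It is only an illustration: the helper name `byte_code` is made up, and it assumes characters that occupy a single byte in UTF-8 (the ASCII range).

```python
def byte_code(letter: str) -> int:
    """Return the value of the single byte that encodes `letter` in UTF-8.

    Hypothetical helper mirroring the idea behind NUMBER(letter, PIB1):
    interpret the one stored byte as an unsigned integer.
    """
    raw = letter.encode("utf-8")
    if len(raw) != 1:
        raise ValueError("only single-byte characters are handled here")
    return raw[0]


# Same three test characters as in the DATA LIST example.
for ch in ["a", "A", "1"]:
    print(ch, byte_code(ch))  # a 97, A 65, 1 49
```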


V21 (and earlier and later versions) can be in either Unicode or code page format at your option.

Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
[hidden email]
phone: 720-342-5621




Re: converting letters to integer numbers

Rick Oliver-3
I won't ask why someone wants to perform arithmetic on letters, but what about AUTORECODE?

If you need a coding that can be reused and always produce the same autorecoded values, you can create a dataset with all the letters of the alphabet (upper and lower case), and use the SAVE TEMPLATE subcommand to create an autorecode template.

Rick Oliver
Senior Information Developer
IBM Business Analytics (SPSS)
E-mail: [hidden email]




Re: converting letters to integer numbers

Richard Ristow
In reply to this post by Maguin, Eugene
At 06:02 PM 5/7/2014, Maguin, Eugene wrote:

>It's that question in various iterations that i ask every so often.
>Given x (a1 format) = 'R', for instance, I'd like to convert the
>value of x to a number that arithmetic can be performed on, i.e., F format.

and, at 06:06 PM 5/7/2014, added:
>The data encoding is Unicode.

Do you need the Unicode code-point value for the letter, or just
*some* numerical value? There's always

COMPUTE CharNumb = CHAR.INDEX('ABCDEFGHIJKLMNOPQRSTUVWXYZ',Char).

or variations; among other things, adding 64 to this result will give
the correct ASCII or Unicode numeric value for an uppercase letter. Something like this may
be your best bet; at least, it's representation-independent.
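
The lookup idea can be sketched outside SPSS as follows. This is a Python illustration only; `letter_number` is a hypothetical name, and returning 0 for a character that is not in the alphabet string is meant to mimic what an index search does when nothing is found.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def letter_number(ch: str) -> int:
    """1-based position of `ch` in ALPHABET, or 0 if it is not there."""
    return ALPHABET.find(ch) + 1  # str.find returns -1 when absent


pos = letter_number("R")
print(pos, pos + 64, ord("R"))  # 18 82 82
```

Because the result depends only on the lookup string, it does not matter how the character is stored internally.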

Back when character strings were ASCII, the problem of getting the
numeric value for a character, or the character value for a number,
was solved; I've copied the solutions (posted in 2008) to the end of this note.

Now, SPSS uses UTF-8 representation for Unicode. In UTF-8, all the
standard ASCII characters (numeric value 0-127) are represented as
single bytes with the same values as in ASCII, so the same code *may*
work if you don't need to recover numeric values from multi-byte codes.

It *might* work to read, say, a two-byte Unicode character by reading
it with format PIB2. However, that will give you the two-byte UTF-8
representation, which is *not* the Unicode code-point value. It can
be converted, and it would be practicable, but it's a little messy; I
won't work on details at this point.
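
For reference, the conversion for the two-byte case can be sketched in Python. This is a sketch under stated assumptions, not SPSS code: it takes the two bytes separately, in storage order, so it sidesteps the question of how a PIB2 read would order them, and the function name is made up. A two-byte UTF-8 sequence has the bit pattern 110xxxxx 10yyyyyy, and the code point is the 11-bit number xxxxxyyyyyy.

```python
def utf8_pair_to_code_point(b1: int, b2: int) -> int:
    """Combine the two bytes of a two-byte UTF-8 sequence into a code point."""
    # Lead byte must look like 110xxxxx, continuation byte like 10yyyyyy.
    if (b1 & 0xE0) != 0xC0 or (b2 & 0xC0) != 0x80:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Keep the 5 payload bits of the lead byte and the 6 of the continuation.
    return ((b1 & 0x1F) << 6) | (b2 & 0x3F)


b1, b2 = "é".encode("utf-8")  # the bytes 0xC3 and 0xA9
print(utf8_pair_to_code_point(b1, b2), ord("é"))  # 233 233
```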

Reprise:
Date:     Wed, 17 Sep 2008 21:23:01 -0400
From:     Richard Ristow <[hidden email]>
Subject:  Character, ASCII, hex (was, re: substring)
Comments: To: Gene Maguin <[hidden email]>
To:       [hidden email]

. To get the ASCII numeric value of a character, convert it using
function NUMBER and format PIB1.
. To go the other way, from an ASCII numeric value to the
corresponding character, use function STRING, with format PIB1.
. To get the hex for an ASCII numeric value as a two-character
string, use function STRING with format PIBHEX02.

>*  I.  Get numeric code from characters, .
>*      and characters from numeric.      .
>GET FILE=TestChar.
>
>NUMERIC ASCII    (F3)
>        /ASCIIhex (PIBHEX02).
>STRING  HEX      (A3).
>STRING  RECOVER  (A1).
>
>COMPUTE ASCII     = NUMBER(CHAR,PIB1).
>COMPUTE ASCIIhex  = ASCII.
>COMPUTE HEX       = CONCAT('x',STRING(ASCII,PIBHEX02)).
>COMPUTE RECOVER   = STRING(ASCII,PIB1).
>
>LIST.
>|-----------------------------|---------------------------|
>|Output Created               |31-JAN-2007 21:08:21       |
>|-----------------------------|---------------------------|
>[TestChar]
>C:\Documents and Settings\Richard\My Documents\Temporary\SPSS
>   \2007-01-31 Hynes - Deleting embedded control characters
>TestChar.Sav
>
>CHAR ASCII ASCIIhex HEX RECOVER
>
>L      76     4C    x4C L
>O      79     4F    x4F O
>V      86     56    x56 V
>E      69     45    x45 E
>        32     20    x20
>1      49     31    x31 1
>4      52     34    x34 4
>
>Number of cases read:  7    Number of cases listed:  7


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Re: converting letters to integer numbers

Richard Ristow
In reply to this post by Maguin, Eugene
Postscript:

At 06:02 PM 5/7/2014, Maguin, Eugene wrote:

>It's that question in various iterations that i ask every so often.
>Given x (a1 format) = 'R', for instance, I'd like to convert the
>value of x to a number that arithmetic can be performed on, i.e., F format.

Would you say how you're planning to use the numerical values? As
always, that information may help us suggest alternative approaches.

Re: converting letters to integer numbers

Maguin, Eugene
Jon, thank you. It worked excellently. And thank you to all who replied.

Richard, I wrote a tidbit of code to compute random permutations and I wanted to check them for internal consistency. Rather than permuting numbers, I permuted letters. The check is that no letter appears more than once in the string. I had thought that summing the letters' numeric codes would give me that check, but it won't.
Gene Maguin
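
A small Python illustration of why a plain sum of codes cannot detect a repeated letter: two different strings can have the same sum. The function names are made up for this sketch, and counting distinct characters is just one possible alternative check.

```python
def code_sum(s: str) -> int:
    """Sum of the character codes in `s`."""
    return sum(ord(c) for c in s)


def has_repeat(s: str) -> bool:
    """True if any character occurs more than once in `s`."""
    return len(set(s)) < len(s)


# 'AC' has no repeat and 'BB' does, yet both sums are 132.
print(code_sum("AC"), code_sum("BB"))
print(has_repeat("AC"), has_repeat("BB"))  # False True
```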


Re: converting letters to integer numbers

David Marso
Administrator
Hi Gene,
Does the 3rd segment of code do what you require?
/* Generate 10000 cases of uniform (0,1) */.
NEW FILE.
DATASET CLOSE ALL.
DATASET DECLARE one.
MATRIX.
SAVE UNIFORM(10000,1)/OUTFILE one/VARIABLES junk.
END MATRIX.
DATASET ACTIVATE one.

/* Generate permutations of length 5 from {A:Z}*/.
STRING #alpha (A26).
IF ($CASENUM EQ 1) #alpha="ABCDEFGHIJKLMNOPQRSTUVWXYZ".
STRING Permutation (A5).
LOOP #=1 TO 5.
COMPUTE Permutation=CONCAT(Permutation,CHAR.SUBSTR(#alpha,TRUNC(RV.UNIFORM(1,CHAR.LENGTH(#alpha))),1)).
END LOOP.
EXECUTE.

/* Test for repetitions */.
LOOP #=1 TO CHAR.LENGTH(Permutation)-1.
COMPUTE repetition = CHAR.RINDEX(Permutation,CHAR.SUBSTR(Permutation,#,1),1) GT #.
END LOOP IF repetition.
FREQUENCIES repetition.

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: converting letters to integer numbers

Richard Ristow
At 11:07 AM 5/8/2014, David Marso posted in this thread. I think the
code posted has a couple of bugs:
. It'll never select letter 'Z'.
. It draws characters with replacement, so it'll easily generate repetitions

The code below is David's, except,
. I've used old-form string functions rather than CHAR. functions,
because of the age of my SPSS version.
. I've broken up the longest statement, in the loop to generate
permutation, by introducing intermediate scratch variables.

The permutation-generating code is,

COMPUTE #AlphLen = LENGTH(RTRIM(#alpha)).
LOOP #=1 TO 5.
.  COMPUTE #RandLtr = SUBSTR(#alpha,
                              TRUNC(RV.UNIFORM(1,#AlphLen)),
                              1).
.  COMPUTE Permutation=
    CONCAT(RTRIM(Permutation),#RandLtr).
END LOOP.

A. This selects letters at random with replacement, so it can easily
pull duplicate letters.
B. I don't think it will ever select 'Z'. I think you want
TRUNC(RV.UNIFORM(1,#AlphLen+1))
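
Point B can be checked with a quick simulation. The Python sketch below is an illustration only: `draw_index` is a made-up name, and it uses a half-open uniform draw on [1, upper), which is an assumption about the generator rather than a statement about RV.UNIFORM. Truncating a draw from [1, 26) can only give 1 to 25; the upper limit has to be 27 for 26 to appear.

```python
import random


def draw_index(upper: float, rng: random.Random) -> int:
    """Truncate a uniform draw from [1, upper) to an integer."""
    return int(1 + rng.random() * (upper - 1))


rng = random.Random(9104)
n = 26
short = {draw_index(n, rng) for _ in range(100_000)}     # upper limit n
full = {draw_index(n + 1, rng) for _ in range(100_000)}  # upper limit n+1
print(max(short), max(full))  # 25 26
```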

Here's the output of a test run. Note repeated letters in cases 1011,
1012, 1017 and 1018.
|-----------------------------|---------------------------|
|Output Created               |08-MAY-2014 13:03:52       |
|-----------------------------|---------------------------|
  [Cases]
   ID Permutation Repetition

1001 SMDEJ            0
1002 NBPQE            0
1003 BTXLW            0
1004 VCJYO            0
1005 JWFCH            0
1006 CRDBV            0
1007 DTJNE            0
1008 PCRDJ            0
1009 SRKUQ            0
1010 HPIUF            0
1011 HQDOQ            1
1012 XSUGU            1
1013 YSKLF            0
1014 NAKHS            0
1015 PYKLS            0
1016 VQHIG            0
1017 RJOGG            1
1018 RFROU            1
1019 SIPXB            0
1020 ACWIS            0

Number of cases read:  20    Number of cases listed:  20
=========================================
APPENDIX: All code (not saved separately)
=========================================

SET RNG = MT       /* 'Mersenne twister' random number generator */ .
SET MTINDEX = 9104 /*  Pocket calculator RNG                     */ .
NEW FILE.
INPUT PROGRAM.
.  NUMERIC ID (N4).
.  LOOP    ID = 1001 TO 1020.
.    END CASE.
.  END LOOP.
.  END FILE.
END INPUT PROGRAM.
DATASET NAME     Cases WINDOW=FRONT.

/* Generate permutations of length 5 from {A:Z}*/.
STRING  #alpha (A26).
IF ($CASENUM EQ 1) #alpha="ABCDEFGHIJKLMNOPQRSTUVWXYZ".
STRING  #RandLtr (A1).
STRING  Permutation (A5).
NUMERIC #AlphLen (F3).
COMPUTE #AlphLen = LENGTH(RTRIM(#alpha)).
LOOP #=1 TO 5.
.  COMPUTE #RandLtr = SUBSTR(#alpha,
                              TRUNC(RV.UNIFORM(1,#AlphLen)),
                              1).
.  COMPUTE Permutation=
    CONCAT(RTRIM(Permutation),#RandLtr).
END LOOP.
EXECUTE.

/* Test for repetitions */.
NUMERIC Repetition (F2).
LOOP #=1 TO LENGTH(RTRIM(Permutation))-1.
.  COMPUTE repetition = RINDEX(Permutation,
                                SUBSTR(Permutation,#,1),1) GT #.
END LOOP IF repetition.

LIST.

Re: converting letters to integer numbers

David Marso
Administrator
"A. This selects letters at random with replacement, so it can easily
pull duplicate letters.
B. I don't think it will ever select 'Z'. I think you want
TRUNC(RV.UNIFORM(1,#AlphLen+1)) "

A. The sampling with replacement was intentional, in order to generate data that could be tested for repetitions ;-)
B. Good catch, Richard; it should be +1. I could argue that this was intentional as well, in order to keep you and others on your toes ;-)))
Re: converting letters to integer numbers

Maguin, Eugene
David, I want to try out your code and understand why you wrote it the way you did. I expect to learn something.

My code follows. Its purpose is to create random permutations of a preset number of elements, in order to conduct permutation tests on data by computing a statistic for all possible combinations of, in the simplest form, two variables. For an n of 5 cases, there are 120 permutations, so it is easy to list all 120. I have 16 cases, which gives 16!, or about 2.1E13, permutations; thus a random sample of maybe 10k permutations. The creation rules are that every case is listed exactly once in every permutation and that the permutation set be purged of duplicates. The use of letters seems odd, I suppose, but characters use less space than floating point numbers.

So.

INPUT PROGRAM.
SET SEED=987123.
NUMERIC PERM(F2.0).
VECTOR S P(8A1).
LOOP PERM=1 TO 2.
DO REPEAT X=S1 TO S8/Y='A' 'B' 'C' 'D' 'E' 'F' 'G' 'H'.
+  COMPUTE X=Y.
END REPEAT.
LOOP #I=1 TO 8.
COMPUTE #N=TRUNC(RV.UNIFORM(1,8-#I+2)). /* RETURNS VALUES IN OPEN INTERVAL.
COMPUTE P(#I)=S(#N).
*  CONDENSE OUT THE USED VALUE.
DO IF (#N NE 8-#I+1).
LOOP #J=#N TO 8-#I.
COMPUTE S(#J)=S(#J+1).
END LOOP.
END IF.
COMPUTE S(8-#I+1)='-'.
*  PRINT / S1 TO S8 I N J P1 TO P8 (8(A1,1X),3(F1.0,1X),8(A1,1X)).
END LOOP.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

DELETE VARIABLES S1 TO S8.

STRING SET(A8).
COMPUTE SET=CONCAT(P1,P2,P3,P4,P5,P6,P7,P8).
COMPUTE TOTAL=0.
DO REPEAT X=P1 TO P8.
+  COMPUTE TOTAL=TOTAL+10**(NUMBER(X,PIB1)-65).
END REPEAT.
FORMAT TOTAL(F8.0).
FREQUENCIES TOTAL P1 TO P8.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES/BREAK=SET/REPS=NU.
FREQUENCIES REPS.









Re: converting letters to integer numbers

David Marso
Administrator
In reply to this post by David Marso
I previously posted a macro to create random permutations and realized that it was not particularly general and that its comments were not consistent with the code.
Here's another try.
DEFINE !Permute(!POSITIONAL !TOKENS(1)
   / !POSITIONAL !TOKENS(1)
   / FROM !CMDEND !DEFAULT (ABCDEFGHIJKLMNOPQRSTUVWXYZ)).
PRESERVE.
SET MXLOOPS=1000000.
STRING #alpha (!CONCAT("A",!LENGTH(!FROM))).
IF ($CASENUM EQ 1) #alpha=!QUOTE(!FROM).
STRING !2 (!CONCAT("A",!1)).
STRING #ltr (A1).
COMPUTE #=1.
COMPUTE #alphalength=CHAR.LENGTH(#alpha) + 1.
LOOP.
+  COMPUTE #ltr=CHAR.SUBSTR(#alpha,TRUNC(RV.UNIFORM(1,#alphalength))).
+  DO IF CHAR.INDEX(!2,#ltr) EQ 0.
+    COMPUTE !2=CONCAT(RTRIM(!2),#ltr).
+    COMPUTE # = # + 1.
+  END IF.
END LOOP IF # GT !1.
RESTORE.
!ENDDEFINE.
SET MPRINT ON.
/*Create random permutations of length 6 from default set and put in Perm6 */.
!Permute 6 Perm6.
/*Create random permutations of length 5 from arbitrary character set and put in Perm5 */.
!Permute 5 Perm5 FROM 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ .
EXECUTE.
Re: converting letters to integer numbers

David Marso
Administrator
In reply to this post by Maguin, Eugene
Here are a couple of methods using the MATRIX language.
*Notes from FM:
/* GRADE. Rank elements in matrix, using sequential integers for ties */.
/* UNIFORM. Create matrix of uniform random numbers */.

SET MXLOOPS=10000000.

/* LONG VERSION */.
/*Random list of 100000 from 20 elements*/.
MATRIX.
COMPUTE k=20.
COMPUTE nPerm=100000.
COMPUTE perms=MAKE(nperm,k,0).
COMPUTE rand=UNIFORM(nperm,k).
LOOP #=1 TO nPerm.
COMPUTE perms(#,:)=GRADE(rand(#,:)).
END LOOP.
SAVE perms / OUTFILE *.
END MATRIX.

/* This can be reduced to the following */.

MATRIX.
LOOP #=1 TO 100000.
SAVE GRADE(UNIFORM(1,20)) / OUTFILE *.
END LOOP.
END MATRIX.
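
The idea behind both versions is that the ranks of k independent uniform random numbers form a random permutation of 1..k. A minimal Python sketch of that idea follows. The `grade` function here is a stand-in written for this illustration, with ties broken by position; it is not the MATRIX function itself.

```python
import random


def grade(values):
    """Return the rank (1..k) of each element, ascending; ties go by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


rng = random.Random(1)
k = 20
perm = grade([rng.random() for _ in range(k)])
print(perm)  # some ordering of the integers 1..20
```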


/* Exhaustive list */.
MATRIX.
COMPUTE x=8.
COMPUTE p=T({1:x}).

LOOP #=1 TO x-1.
+  COMPUTE goodperm=0.
+  COMPUTE p={KRONEKER(T({1:x}),MAKE(NROW(p),1,1)),KRONEKER(MAKE(x,1,1),p)}.
+  COMPUTE found=MAKE(NROW(p),1,0).
+  COMPUTE ncp=NCOL(p).
+  COMPUTE ncp1=ncp-1.
+  LOOP ##=1 TO NROW(p).
+    COMPUTE target=p(##,1).
+    LOOP ###=2 TO NCp.
+        COMPUTE found(##)=(p(##,###) EQ target).
+    END LOOP IF found(##).
+    COMPUTE goodperm=goodperm + (found(##) EQ 0).
+  END LOOP.

+  COMPUTE row=1.
+  COMPUTE final=MAKE(goodperm,ncp,0).
+  LOOP good=1 TO NROW(p).
+    DO IF found(good)=0.
+      COMPUTE final(row,:)=p(good,:).
+      COMPUTE row=row+1.
+    END IF.
+  END LOOP.
+  COMPUTE p=final.
END LOOP.
SAVE p / OUTFILE * .
END MATRIX.



Maguin, Eugene wrote
David,  I want to try out your code understanding why you wrote it the way you did. I'd expect to learn something.

My code follows. The purpose was to create random permutations of a preset number of elements for the purpose of conducting permutation tests on data by computing a statistic for all possible combinations of, in its most simple form, two variables. For an n of 5 cases, there's 120 permutations. So, easy to list all 120. I have 16 cases. Now, about 1E10 or 11 permutations. Thus a random sample of maybe 10k permutations. The creation rules are that every case is listed exactly once in every permutation and that permutation set be purged of duplicates. The use of letters seems odd, I suppose, but characters use less space than  floating point numbers.

So.

INPUT PROGRAM.
SET SEED=987123.
NUMERIC PERM(F2.0).
VECTOR S P(8A1).
LOOP PERM=1 TO 2.
DO REPEAT X=S1 TO S8/Y='A' 'B' 'C' 'D' 'E' 'F' 'G' 'H'.
+  COMPUTE X=Y.
END REPEAT.
LOOP #I=1 TO 8.
COMPUTE #N=TRUNC(RV.UNIFORM(1,8-#I+2)). /* RETURNS VALUES IN OPEN INTERVAL.
COMPUTE P(#I)=S(#N).
*  CONDENSE OUT THE USED VALUE.
DO IF (#N NE 8-#I+1).
LOOP #J=#N TO 8-#I.
COMPUTE S(#J)=S(#J+1).
END LOOP.
END IF.
COMPUTE S(8-#I+1)='-'.
*  PRINT / S1 TO S8 I N J P1 TO P8 (8(A1,1X),3(F1.0,1X),8(A1,1X)).
END LOOP.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
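
The draw-and-condense loop in the input program above can be sketched in Python as follows. The function name and the seed are illustrative assumptions. A slot is picked at random among the unused values, copied to the output, and the values after it are shifted left so the used one disappears from the pool.

```python
import random

def draw_without_replacement(letters, rng):
    """Build one permutation by repeatedly drawing and condensing the pool."""
    pool = list(letters)
    perm = []
    remaining = len(pool)
    while remaining > 0:
        n = rng.randrange(remaining)        # index of an unused value
        perm.append(pool[n])
        # Condense out the used value by shifting later values left.
        for j in range(n, remaining - 1):
            pool[j] = pool[j + 1]
        pool[remaining - 1] = '-'           # mark the freed slot, as the SPSS code does
        remaining -= 1
    return ''.join(perm)

rng = random.Random(987123)  # arbitrary seed, only for repeatability
perms = [draw_without_replacement('ABCDEFGH', rng) for _ in range(2)]
print(perms)
```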

DELETE VARIABLES S1 TO S8.

STRING SET(A8).
COMPUTE SET=CONCAT(P1,P2,P3,P4,P5,P6,P7,P8).
COMPUTE TOTAL=0.
DO REPEAT X=P1 TO P8.
+  COMPUTE TOTAL=TOTAL+10**(NUMBER(X,PIB1)-65).
END REPEAT.
FORMAT TOTAL(F8.0).
FREQUENCIES TOTAL P1 TO P8.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES/BREAK=SET/REPS=NU.
FREQUENCIES REPS.
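
The TOTAL check and the duplicate count above can be illustrated with a short Python sketch. The sample strings are made up for the example, and `letter_checksum` is a hypothetical name. Each letter contributes a power of ten chosen by its character code, so a string that uses each of A to H exactly once sums to 11111111.

```python
from collections import Counter

def letter_checksum(perm):
    """Sum 10**(code - 65) over the letters, where 'A' has code 65."""
    # ord(ch) - 65 maps 'A'..'H' to 0..7, the role NUMBER(X,PIB1)-65 plays above.
    return sum(10 ** (ord(ch) - 65) for ch in perm)

# Made-up examples: two valid permutations, one with a repeated letter,
# and one exact duplicate of the first.
perms = ['ABCDEFGH', 'HGFEDCBA', 'AACDEFGH', 'ABCDEFGH']
totals = [letter_checksum(p) for p in perms]
reps = Counter(perms)   # analogue of AGGREGATE ... /BREAK=SET /REPS=NU

print(totals)
print(reps)
```

With only eight letters no digit can carry into the next, so any total other than 11111111 signals a missing or repeated letter.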



-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso
Sent: Thursday, May 08, 2014 1:40 PM
To: [hidden email]
Subject: Re: converting letters to integer numbers

"A. This selects letters at random with replacement, so it can easily pull duplicate letters.
B. I don't think it will ever select 'Z'. I think you want
TRUNC(RV.UNIFORM(1,#AlphLen+1)) "

A. The sampling with replacement was intentional, in order to generate data which could be tested for repetitions ;-)
B. Good catch Richard, +1 it should be. I could argue that this was intentional as well, in order to keep you and others on your toes ;-)))
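
The off-by-one in point B can be seen with a short Python sketch. `rv_uniform` is a stand-in written for this example and is only assumed to behave like RV.UNIFORM, in that it never returns the upper bound; the seed is arbitrary.

```python
import math
import random

rng = random.Random(9104)  # arbitrary seed, only for repeatability

def rv_uniform(low, high):
    # Stand-in for RV.UNIFORM: values from low up to, but not reaching, high.
    return low + (high - low) * rng.random()

alph_len = 26
# Original bound: TRUNC(RV.UNIFORM(1, 26)) tops out at 25, so 'Z' is never picked.
short = {math.trunc(rv_uniform(1, alph_len)) for _ in range(100000)}
# Corrected bound: TRUNC(RV.UNIFORM(1, 27)) covers 1 through 26.
fixed = {math.trunc(rv_uniform(1, alph_len + 1)) for _ in range(100000)}

print(max(short), max(fixed))
```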
--

Richard Ristow wrote
> At 11:07 AM 5/8/2014, David Marso posted in this thread. I think the
> code posted has a couple of bugs:
> . It'll never select letter 'Z'.
> . It draws characters with replacement, so it'll easily generate
> repetitions
>
> The code below is David's, except,
> . I've used old-form string functions rather than CHAR. functions,
> because of the age of my SPSS version.
> . I've broken up the longest statement, in the loop to generate
> permutation, by introducing intermediate scratch variables.
>
> The permutation-generating code is,
>
> COMPUTE #AlphLen = LENGTH(RTRIM(#alpha)).
> LOOP #=1 TO 5.
> .  COMPUTE #RandLtr = SUBSTR(#alpha,
>                               TRUNC(RV.UNIFORM(1,#AlphLen)),
>                               1).
> .  COMPUTE Permutation=
>     CONCAT(RTRIM(Permutation),#RandLtr).
> END LOOP.
>
> A. This selects letters at random with replacement, so it can easily
> pull duplicate letters.
> B. I don't think it will ever select 'Z'. I think you want
> TRUNC(RV.UNIFORM(1,#AlphLen+1))
>
> Here's the output of a test run. Note repeated letters in cases 1011,
> 1012, 1017 and 1018.
> |-----------------------------|---------------------------|
> |Output Created               |08-MAY-2014 13:03:52       |
> |-----------------------------|---------------------------|
>   [Cases]
>    ID Permutation Repetition
>
> 1001 SMDEJ            0
> 1002 NBPQE            0
> 1003 BTXLW            0
> 1004 VCJYO            0
> 1005 JWFCH            0
> 1006 CRDBV            0
> 1007 DTJNE            0
> 1008 PCRDJ            0
> 1009 SRKUQ            0
> 1010 HPIUF            0
> 1011 HQDOQ            1
> 1012 XSUGU            1
> 1013 YSKLF            0
> 1014 NAKHS            0
> 1015 PYKLS            0
> 1016 VQHIG            0
> 1017 RJOGG            1
> 1018 RFROU            1
> 1019 SIPXB            0
> 1020 ACWIS            0
>
> Number of cases read:  20    Number of cases listed:  20
> =========================================
> APPENDIX: All code (not saved separately)
> =========================================
>
> SET RNG = MT       /* 'Mersenne twister' random number generator */ .
> SET MTINDEX = 9104 /*  Pocket calculator RNG                     */ .
> NEW FILE.
> INPUT PROGRAM.
> .  NUMERIC ID (N4).
> .  LOOP    ID = 1001 TO 1020.
> .    END CASE.
> .  END LOOP.
> .  END FILE.
> END INPUT PROGRAM.
> DATASET NAME     Cases WINDOW=FRONT.
>
> /* Generate permutations of length 5 from {A:Z}*/.
> STRING  #alpha (A26).
> IF ($CASENUM EQ 1) #alpha="ABCDEFGHIJKLMNOPQRSTUVWXYZ".
> STRING  #RandLtr (A1).
> STRING  Permutation (A5).
> NUMERIC #AlphLen (F3).
> COMPUTE #AlphLen = LENGTH(RTRIM(#alpha)).
> LOOP #=1 TO 5.
> .  COMPUTE #RandLtr = SUBSTR(#alpha,
>                               TRUNC(RV.UNIFORM(1,#AlphLen)),
>                               1).
> .  COMPUTE Permutation=
>     CONCAT(RTRIM(Permutation),#RandLtr).
> END LOOP.
> EXECUTE.
>
> /* Test for repetitions */.
> NUMERIC Repetition (F2).
> /* Test for repetitions */.
> LOOP #=1 TO LENGTH(RTRIM(Permutation))-1.
> .  COMPUTE repetition = RINDEX(Permutation,
>                                 SUBSTR(Permutation,#,1),1) GT #.
> END LOOP IF repetition.
>
> LIST.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to
> LISTSERV@.UGA (not to SPSSX-L), with no body text except the command.
> To leave the list, send the command SIGNOFF SPSSX-L. For a list of
> commands to manage subscriptions, send the command INFO REFCARD.
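
The repetition test in the quoted code compares each character's position with the position of its last occurrence. A minimal Python sketch of that idea follows, checked against a few rows copied from the listing above; the function name `has_repetition` is my own.

```python
def has_repetition(perm):
    """Return True if any character occurs again further to the right."""
    # Only positions before the last one need checking, as in the SPSS loop.
    return any(perm.rfind(ch) > i for i, ch in enumerate(perm[:-1]))

# Rows taken from the test run listing, with the Repetition flag it reported.
cases = {'SMDEJ': 0, 'HQDOQ': 1, 'XSUGU': 1, 'RJOGG': 1, 'RFROU': 1, 'ACWIS': 0}
results = {p: int(has_repetition(p)) for p in cases}
print(results)
```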





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/converting-letters-to-integer-numbers-tp5725888p5725924.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.


Re: converting letters to integer numbers

David Marso
Administrator
Here is a MUCH MORE EFFICIENT rewrite of the exhaustive list method.
Don't know what I was thinking earlier ;-)))
No need to run through twice nor any requirement for a second matrix.

SET MXLOOPS=10000000.
/* Exhaustive list */.
MATRIX.
COMPUTE x=9.
COMPUTE p=T({1:x}).

LOOP #=1 TO x-1.
+  COMPUTE p={KRONEKER(T({1:x}),MAKE(NROW(p),1,1)),KRONEKER(MAKE(x,1,1),p)}.
+  COMPUTE goodrow=1.
+  LOOP ##=1 TO NROW(p).
+    COMPUTE target=p(##,1).
+    LOOP ###=2 TO NCOL(p).
+        COMPUTE found=(p(##,###) EQ target).
+    END LOOP IF found.
+    DO IF (found EQ 0).
+      COMPUTE p(goodrow,:)=p(##,:).
+      COMPUTE goodrow=goodrow+1.
+    END IF.
+  END LOOP.
+  COMPUTE p=p(1:(goodrow-1),:).
END LOOP.
SAVE p / OUTFILE * .
END MATRIX.
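
The single-pass idea in this rewrite, overwriting the matrix with the rows that survive and then truncating it, can be sketched in Python like this. The names `compact_in_place` and `keep` are hypothetical, chosen for the example.

```python
def compact_in_place(rows, keep):
    """Keep only rows passing `keep`, reusing the same list in one pass."""
    good = 0                      # next write position (goodrow, but zero-based)
    for row in rows:
        if keep(row):
            rows[good] = row      # write position never runs ahead of the read position
            good += 1
    del rows[good:]               # analogue of p = p(1:(goodrow-1),:)
    return rows

rows = [(1, 1), (1, 2), (2, 1), (2, 2)]
compact_in_place(rows, lambda r: r[0] not in r[1:])
print(rows)
```

Since a surviving row is always copied to a position at or before the one being read, no unread row is overwritten and no second matrix is needed.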

Re: converting letters to integer numbers

Andy W
Here is a Python program using the itertools module to generate a set of permutations. You can check the docs to see what other types of combinations/permutations can be made: https://docs.python.org/2/library/itertools.html#itertools.permutations

BEGIN PROGRAM.
import spss
import itertools

#making permutations
YourSet = 'ABC'
YourLen = 3
x = list(itertools.permutations(YourSet,YourLen))

#exporting to an SPSS dataset
spss.StartDataStep()
datasetObj = spss.Dataset(name=None)
for i in range(YourLen):
  datasetObj.varlist.append('X'+str(i),1)
for j in x:
  datasetObj.cases.append(j)
spss.EndDataStep()
END PROGRAM.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/

Re: converting letters to integer numbers

David Marso
Administrator
Running with
YourSet = 'ABCDEFGHI'
YourLen = 9
makes SPSS go BOOM!
9! is only 362880
--------------------------
An unknown error has terminated communication with the processor.  The
SPSS Statistics Processor is unavailable.
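
The thread does not establish why this run failed. One thing that can be checked outside SPSS is that the permutations themselves need not all be held in memory at once: `itertools.permutations` is a lazy iterator and can be consumed in batches. The sketch below only counts the rows; the helper `permutation_batches` and the batch size are assumptions, and whether appending cases batch by batch would avoid the crash is untested.

```python
import itertools
import math

def permutation_batches(items, length, batch_size):
    """Yield lists of at most batch_size permutations at a time."""
    perms = itertools.permutations(items, length)   # lazy, nothing stored yet
    while True:
        batch = list(itertools.islice(perms, batch_size))
        if not batch:
            return
        yield batch

total = sum(len(b) for b in permutation_batches('ABCDEFGHI', 9, 50000))
print(total, math.factorial(9))
```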

