finding line break using Python

la volta statistics
Dear all,

I would like to find the position of line breaks in the title of a table
using Python. It seems that the usual '\n' does not work as a search
criterion (see example below).
A second problem is that I get a UnicodeEncodeError when I use a
character such as '®' in the title (to reproduce this error, uncomment the
print strOldTitle line).
Can somebody help me?
Thanks in advance,
Christian


DATA LIST FREE /var1 var2 .
BEGIN DATA
1 2
END DATA.
* Custom Tables.
CTABLES
  /VLABELS VARIABLES=var1 var2 DISPLAY=DEFAULT
  /TABLE var2 BY var1 [COUNT F40.0]
  /CATEGORIES VARIABLES=var1 var2 ORDER=A KEY=VALUE EMPTY=EXCLUDE
  /TITLES
   TITLE= 'First line®' 'second line®' 'third line®'.

BEGIN PROGRAM.
import spss,viewer
spssappObj = viewer.spssapp()
outputdoc = spssappObj.GetDesignatedOutput()
objItems = outputdoc.Items
outputdoc.ClearSelection()

for i in range(objItems.Count):
        objItem = objItems.GetItem(i)
        if objItem.SPSSType == 5 :
                objItem.Selected = True
                objPivotTable = objItem.ActivateTable()
                strOldTitle = objPivotTable.TitleText
#               print strOldTitle
                print strOldTitle.find("\n")
End Program.



*******************************.
la volta statistics
Christian Schmidhauser, Dr.phil.II
Weinbergstrasse 108
Ch-8006 Zürich
Tel: +41 (043) 233 98 01
Fax: +41 (043) 233 98 02
email: mailto:[hidden email]
internet: http://www.lavolta.ch/
Re: finding line break using Python

Peck, Jon
When you are using the OLE automation methods, the line ending characters are a little different.  Change "\n" to "\r" (carriage return), and it will work.
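
To illustrate the change from "\n" to "\r", here is a minimal sketch that runs outside SPSS. It assumes the title comes back as a Unicode string with carriage returns between the lines, as in this thread; the sample string and the helper name find_all are illustrative only and not part of the SPSS API.

```python
# Sample title with carriage returns as line separators (assumed form).
title = u'First line\xae\rsecond line\xae\rthird line\xae'


def find_all(text, sep=u'\r'):
    """Return the index of every occurrence of sep in text."""
    positions = []
    start = text.find(sep)
    while start != -1:
        positions.append(start)
        start = text.find(sep, start + 1)
    return positions


breaks = find_all(title)       # positions of the carriage returns
lines = title.split(u'\r')     # the individual title lines

print(title.find(u'\n'))       # -1: there is no newline in the title
print(breaks)                  # [11, 24]
print(len(lines))              # 3
```

Plain str.find() returns only the first match, so the loop restarts the search one character past each hit to collect them all.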

Your second problem is more complicated to explain.  If you change the print line to
                print repr(strOldTitle)
you will not get the exception.  You will see the hexadecimal escape notation for 8-bit characters such as the registration mark.  It will look like this:
u'First line\xae\rsecond line\xae\rthird line\xae'
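
The exception can be reproduced outside SPSS as well. This is only a sketch, assuming the ASCII codec is the one in effect when the string is printed; the variable names are illustrative.

```python
title = u'First line\xae\rsecond line\xae\rthird line\xae'

# Encoding with the ASCII codec fails on the registration mark (U+00AE).
try:
    title.encode('ascii')
    failed = False
except UnicodeEncodeError:
    failed = True

# An explicit error handler avoids the exception by escaping the character.
escaped = title.encode('ascii', 'backslashreplace')

print(failed)
print(repr(escaped))
```

The 'backslashreplace' handler gives output similar to repr(): the text stays readable and no character is silently dropped.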

By default, Python converts between Unicode and byte strings using the 7-bit ASCII codec (and COM methods generally return Unicode strings), so printing a character outside that range, such as the registration mark, raises the exception.  There are examples in the DisplayDict module of how to control the encoding so that it works with any character set, but here is the magic code.

Inside the program, add this:
import locale, codecs
encoding = locale.getlocale()[1]
ec = codecs.getencoder(encoding)

Fetch the title text like this:
strOldTitle = ec(objPivotTable.TitleText)[0]

That converts the Unicode back to the encoding (code page) you are running in.
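
One caveat with the snippet above: locale.getlocale()[1] can return None when no locale has been set, and codecs.getencoder() then fails. A hedged variant, assuming it is acceptable to fall back to locale.getpreferredencoding() and to replace characters the code page cannot represent, might look like this (encode_for_console is a hypothetical helper name):

```python
import codecs
import locale


def encode_for_console(text, encoding=None):
    """Encode Unicode text to the given or the locale's encoding.

    Characters the target encoding cannot represent become '?'.
    """
    enc = encoding or locale.getlocale()[1] or locale.getpreferredencoding()
    encoder = codecs.getencoder(enc)
    # The encoder returns (encoded_bytes, number_of_characters_consumed).
    return encoder(text, 'replace')[0]


title = u'First line\xae\rsecond line\xae'
as_cp1252 = encode_for_console(title, 'cp1252')
print(repr(as_cp1252))
```

Inside the program, the call would then be encode_for_console(objPivotTable.TitleText), leaving the encoding argument out so that the locale decides.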

-Jon Peck
SPSS




-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of la volta statistics
Sent: Monday, March 19, 2007 12:10 PM
To: [hidden email]
Subject: [SPSSX-L] finding line break using Python


Re: [SPSSX-L] finding line break using Python

la volta statistics
Thanks Jon, that helped
Christian

-----Original Message-----
From: Peck, Jon [mailto:[hidden email]]
Sent: Monday, March 19, 2007 7:11 PM
To: la volta statistics; [hidden email]
Subject: RE: [SPSSX-L] finding line break using Python

