SPSSX Discussion

File information using Python

la volta statistics

File information using Python

Dear all

I have a string variable (gender) coded F = female, M = male, and " " = not
available.

I am trying to create a file information text file using Python (I am a
beginner with Python). When I run a code snippet found in the archives, the
'missing' string somehow shifts the list of value labels one position
upward. Can someone help? I am using SPSS 14.
Thanks, Christian

Example:

DATA LIST /ID 1-2 Gender 4 (A) Smoker 6(A).
begin data.
 1 M Y
 2 F
 3   N
 4 F Y
 5 M Y
 6 F
 7 M Y
 8 M N
 9
10 F Y
end data.

VAR LABEL id 'id'.
VAR LABEL Gender 'Gender'.
VAL LABEL Gender ' ' 'n.a.' 'F' 'female' 'M' 'male'.

VAR LABEL Smoker 'Smoking'.
VAL LABEL Smoker ' ' 'n.a.' 'Y' 'yes' 'N' 'no'.

begin program.
import spss, spssaux

vardict = spssaux.VariableDict()
for v in vardict:
    print "\n","-Variable:", v.VariableName,"\n","--Variable Label: ", v.VariableLabel
    valueLabels = spssaux.GetValueLabels(int(v))
    if valueLabels:
        print "--Value Labels:"
        for lbl in sorted(valueLabels):
            print " ", lbl," ", valueLabels[lbl]
end program.

* Result:
*********.
-Variable: ID
--Variable Label: id

-Variable: Gender
--Variable Label: Gender
--Value Labels:
F n.a. <- Wrong!!
M female <- Wrong!!

-Variable: Smoker
--Variable Label: Smoking
--Value Labels:
N n.a. <- Wrong!!
Y no <- Wrong!!

*Should be:
**********.
-Variable: ID
--Variable Label: id

-Variable: Gender
--Variable Label: Gender
--Value Labels:
n.a.
F female
M male

-Variable: Smoker
--Variable Label: Smoking
--Value Labels:
n.a.
N no
Y yes

*******************************
la volta statistics
Christian Schmidhauser, Dr.phil.II
Weinbergstrasse 108
Ch-8006 Zürich
Tel: +41 (043) 233 98 01
Fax: +41 (043) 233 98 02
mailto:[hidden email]

Rolf Pfister

Re: File information using Python

Christian,

Nice to see that there is another Python user in town!

I think your problem has to do with Python. Probably, Python dictionaries don't like spaces (' ') as keys.

As a workaround, just replace the space with another symbol and it works fine. I tested it successfully with dots ('.') for missing values.
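
A quick check outside SPSS suggests that the dictionary itself is probably not the culprit: a plain Python dict accepts a one-blank string as a key, and sorted() places it ahead of the letters. The dict below is a hand-built stand-in with the shape the loop above expects (value -> label); it is an assumption for illustration, not output from spssaux.

```python
# Hand-built stand-in for a value-label dictionary (assumed shape: value -> label).
value_labels = {' ': 'n.a.', 'F': 'female', 'M': 'male'}

# A single blank is an ordinary string key and can be looked up like any other.
blank_label = value_labels[' ']

# The blank (code point 32) sorts ahead of the uppercase letters.
ordered_values = sorted(value_labels)

for value in ordered_values:
    # %r shows the quotes, so the blank value stays visible in the output.
    print("%r -> %s" % (value, value_labels[value]))
```

If the blank entry is missing from the real dictionary, the loss most likely happens before the labels reach Python code, which is why recoding the blank to '.' still helps as a workaround.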

Perhaps you should check the SPSS Python Forum for this question as well (http://www.spss.com/fusetalk/forum/categories.cfm?catid=9&zb=1235083 ). It's very useful for Python information.

My modified syntax looks like this:

DATA LIST /ID 1-2 Gender 4 (A) Smoker 6(A).
begin data.
 1 M Y
 2 F
 3   N
 4 F Y
 5 M Y
 6 F
 7 M Y
 8 M N
 9
10 F Y
end data.
COMPUTE Gender = REPLACE(Gender,' ','.') .
COMPUTE Smoker = REPLACE(Smoker,' ','.') .
MISSING VALUES Gender Smoker ('.').
VAR LABEL id 'id'.
VAR LABEL Gender 'Gender'.
VAL LABEL Gender '.' 'n.a.' 'F' 'female' 'M' 'male'.
VAR LABEL Smoker 'Smoking'.
VAL LABEL Smoker '.' 'n.a.' 'Y' 'yes' 'N' 'no'.

begin program.
import spss, spssaux

for i in range(spss.GetVariableCount()):
    print "\n","-Variable:", spss.GetVariableName(i),"\n","--Variable Label: ", spss.GetVariableLabel(i)
    valueLabels = spssaux.GetValueLabels(i)
    if valueLabels:
        print "--Value Labels:"
        for value in sorted(valueLabels):
            print " ", value," ", valueLabels[value]
end program.

And I get:

-Variable: ID
--Variable Label: id

-Variable: Gender
--Variable Label: Gender
--Value Labels:
. n.a.
F female
M male

-Variable: Smoker
--Variable Label: Smoking
--Value Labels:
. n.a.
N no
Y yes

Best Regards
Rolf

--
Rolf Pfister
SPSS Switzerland
Schneckenmannstrasse 25
CH-8044 Zürich

Email: [hidden email]
Support: [hidden email]
Visit our website at http://www.spss.ch

Peck, Jon

Re: File information using Python

In reply to this post by la volta statistics

You are doing very well with Python, but you have uncovered a small bug in OMS that caused a blank value not to be in the OMS XML which the GetValueLabels function uses (via XPath). We have posted an update to the spssaux module (2.0.4) on Developer Central that gets around this problem.

I have also suggested a small simplification to your Python code below, although the change does not affect the result. ValueLabels are a property like the others in the dictionary, so you can access them using the same syntax as used for the rest.

HTH,
Jon Peck

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of la volta statistics
Sent: Tuesday, February 06, 2007 6:13 AM
To: [hidden email]
Subject: [SPSSX-L] File information using Python

begin program.
import spss, spssaux

vardict = spssaux.VariableDict()
for v in vardict:
    print "\n","-Variable:", v.VariableName,"\n","--Variable Label: ", v.VariableLabel
    valueLabels = v.ValueLabels
    if valueLabels:
        print "--Value Labels:"
        for lbl in sorted(valueLabels):
            print " ", lbl," ", valueLabels[lbl]
end program.
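
Since the goal is a file information text file, the report lines can be collected in a list and written out at the end. The sketch below runs outside SPSS on hand-built stand-in data; `format_variable_info` and the sample tuples are hypothetical names, and inside a BEGIN PROGRAM block the values would come from the VariableDict loop instead. Values are quoted so that a blank code stays visible.

```python
def format_variable_info(name, label, value_labels):
    """Return report lines for one variable (hypothetical helper)."""
    lines = ["-Variable: %s" % name, "--Variable Label: %s" % label]
    if value_labels:
        lines.append("--Value Labels:")
        for value in sorted(value_labels):
            # Quote the value so a blank code remains visible in the file.
            lines.append("   '%s'   %s" % (value, value_labels[value]))
    return lines


# Stand-in data; inside SPSS these would come from v.VariableName,
# v.VariableLabel and v.ValueLabels.
variables = [
    ("ID", "id", {}),
    ("Gender", "Gender", {" ": "n.a.", "F": "female", "M": "male"}),
]

report = []
for name, label, value_labels in variables:
    report.extend(format_variable_info(name, label, value_labels))
    report.append("")  # blank line between variables

report_text = "\n".join(report)
print(report_text)
```

Writing `report_text` to disk is then an ordinary file write; the path would be whatever suits the project.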

Roberts, Michael

Aggregating datafiles in Python

Hi listers,

I am interested in knowing whether anyone has a solution for aggregating
several files in SPSS using Python and then saving them to a folder.

I ran some code (function included below, but it does not work correctly)
without the /OUTFILE subcommand, but that simply adds a column to the
dataset before saving the files, and I need to be able to reduce the
cases (hence the aggregation).

Any help would be appreciated.

def fileAggr():
    cmd2 = """ SORT CASES BY var1, var2, ..., varn.
AGGREGATE
/PRESORTED
/BREAK=vara, varb
/redundancies 'duplicate entries' = SUM(valid). """
    spss.Submit(cmd2)
    return cmd2

Thanx

Mike
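
One possible direction, offered as a sketch rather than a tested solution: as far as I know, AGGREGATE without an OUTFILE subcommand appends the aggregated variables to the active dataset instead of collapsing cases, which would match the "adds a column" behavior described above. Naming a file on OUTFILE should instead write one case per break group. The helper below only builds the syntax text and does not call SPSS; `build_aggregate_syntax`, the output path, and the use of the N function to count cases per group are all assumptions for illustration.

```python
def build_aggregate_syntax(outfile, break_vars, target="redundancies",
                           label="duplicate entries"):
    """Build SORT CASES + AGGREGATE syntax that writes one case per break group.

    Hypothetical helper: it only returns text and does not call SPSS.
    """
    breaks = " ".join(break_vars)
    lines = [
        "SORT CASES BY %s." % breaks,
        "AGGREGATE",
        "  /OUTFILE='%s'" % outfile,          # named file: cases are collapsed
        "  /PRESORTED",
        "  /BREAK=%s" % breaks,
        "  /%s '%s' = N." % (target, label),  # N counts cases per break group
    ]
    return "\n".join(lines)


# The path and variable names are placeholders.
syntax = build_aggregate_syntax("C:/temp/agg_file1.sav", ["vara", "varb"])
print(syntax)
# Inside SPSS the text would then be run with: import spss; spss.Submit(syntax)
```

Calling the helper once per input file, with a different output path each time, would cover the "several files" case; opening each file with GET FILE beforehand is left out of the sketch.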

la volta statistics

Re: File information using Python

In reply to this post by Peck, Jon

Thanks Jon and Rolf
The update to the spssaux module (2.0.4) works now as it should.

Christian
