SPSSX Discussion

macro/python query, variable creation


Christine-28

macro/python query, variable creation

Hi all,

My data is set up with a long list of groups assessed over 6 time periods on 5 issues. I need to create an aggregated number of counts per T, and N issues per T for each group ('000's of groups). I am hoping to automate this to help with time and to help prevent data entry errors.

My data is currently set up like this (although I have only indicated 2 T's and 3 issues here, it's actually 6 times and 5 issues).

data list free /group t1 t2 x1 x2 x3.
begin data
1 1 . 1 . .
1 1 . 1 . .
1 1 . 1 . .
1 . 1 . . 1
1 . 1 . . 1
1 . 1 . . 1
1 . 1 . . 1
1 . 1 . . .
1 . 1 . . .
2 1 . 1 . .
2 1 . 1 . .
2 1 . . 1 .
2 . 1 1 . .
2 . 1 . 1 .
2 . 1 . 1 .
2 . 1 . 1 .
2 . 1 . 1 .
2 . 1 . . 1
2 . 1 . . 1
end data.

I need to end up with this:

data list free /group x_t1 x_t2 x1t1 x1t2 x2t1 x2t2 x3t1 x3t2.
begin data
1 3 6 3 0 0 0 0 4
2 3 7 2 1 1 4 0 2
end data.

(where x_t1 and x_t2 are counts of all assessments in T1 and T2.)

My current syntax looks like many (42) versions of this with the T's and X's substituted each time:

SORT CASES BY group(a) t1(a) x1(d).
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/PRESORTED
/BREAK=group t1
/nt1_x1=SUM(x1).

I then select the first instance of group and save an outfile of that to get the final result.

I assume there's some way of automating the sorting by each T and X and creating aggregated sums of X but I just can't put the pieces together.
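As a cross-check on the target table above, the counts can be reproduced outside SPSS. The following is a minimal Python sketch, not part of the original post. It assumes the sample rows are held as tuples with None for missing values; the names ROWS, count_by_group and as_row are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Sample rows from the post: (group, t1, t2, x1, x2, x3); None marks a missing value.
ROWS = (
    [(1, 1, None, 1, None, None)] * 3
    + [(1, None, 1, None, None, 1)] * 4
    + [(1, None, 1, None, None, None)] * 2
    + [(2, 1, None, 1, None, None)] * 2
    + [(2, 1, None, None, 1, None)]
    + [(2, None, 1, 1, None, None)]
    + [(2, None, 1, None, 1, None)] * 4
    + [(2, None, 1, None, None, 1)] * 2
)

N_T, N_X = 2, 3  # the sample has 2 times and 3 issues; the real data has 6 and 5


def count_by_group(rows, n_t=N_T, n_x=N_X):
    """Return {group: {name: count}} using the post's names (x_t1, x1t1, ...)."""
    out = defaultdict(lambda: defaultdict(int))
    for row in rows:
        group = row[0]
        ts = row[1:1 + n_t]
        xs = row[1 + n_t:1 + n_t + n_x]
        for ti, t in enumerate(ts, start=1):
            if t != 1:
                continue  # this row was not assessed at time ti
            out[group][f"x_t{ti}"] += 1  # all assessments at time ti
            for xi, x in enumerate(xs, start=1):
                if x == 1:
                    out[group][f"x{xi}t{ti}"] += 1  # issue xi flagged at time ti
    return out


def as_row(group, counts, n_t=N_T, n_x=N_X):
    """Lay the counts out in the column order of the desired DATA LIST."""
    names = [f"x_t{t}" for t in range(1, n_t + 1)]
    names += [f"x{x}t{t}" for x in range(1, n_x + 1) for t in range(1, n_t + 1)]
    return [group] + [counts[name] for name in names]


if __name__ == "__main__":
    result = count_by_group(ROWS)
    for g in sorted(result):
        print(as_row(g, result[g]))
```

Run on the sample rows, this prints the two rows of the desired result, which makes it a convenient reference when checking the output of any SPSS solution.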

I apologise in advance if there is a simple solution to this. I have been using basic syntax for a couple of years but am only just starting to look into programming with Python or macros, and I cannot find a solution at my current skill level. In addition to the solution needed above, I have downloaded the 'Programming and Data Management' text and am wondering whether there are any other text resources for learning Python and macros in this context.

Thank you, Christine

Ruben Geert van den Berg

Re: macro/python query, variable creation

Dear Christine,

If I understood your situation correctly, the code below should work (although there are 36 aggregates instead of the 42 you mentioned, so I may have missed something).

As far as learning macros and/or Python is concerned: I try to learn by running syntax from others, modifying it, and then seeing what happens. I find such examples easier to find for macros than for Python. I think Raynald Levesque's site provided me with the most instructive macro examples (but check out Bruce Weaver's site as well). On SPSS devcentral you can find two .sps files with nice Python examples. If you want, I can send you lots of examples off-list.

It would be lovely, though, if there was a large online Python-SPSS syntax library somewhere.

HTH,

Ruben van den Berg
Consultant Models & Methods
TNS NIPO
Email: [hidden email]
Mobiel: +31 6 24641435
Telefoon: +31 20 522 5738
Internet: www.tns-nipo.com

*Create testdata.

input program.
loop #i=1 to 20.
compute group=(#i gt 10)+1.
end case.
end loop.
end file.
end input program.

do repeat t=t1 to t6.
compute t=rnd(rv.uni(-.5,1.5)).
end repeat.

do repeat x=x1 to x5.
compute x=rnd(rv.uni(-.5,1.5)).
end repeat.

execute./*not needed, but nice to see the data.

*The first six counts can be made with a single aggregate command.

aggregate
/outfile * mode addvariables
/break group
/x_t1 to x_t6=sum(t1 to t6).

*For the next thirty aggregates, nested macro index loops can be used.

define !aggregates()
!do !l1 = 1 !to 5
!do !l2=1 !to 6
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=group !concat("t",!l2)
/!concat("x",!l1,"t",!l2)=SUM(!concat("x",!l1)).
!doend
!doend
!enddefine.

*Macro call, with macro expansion printed in order to see whether the desired syntax is generated.

set mpr on.
!aggregates.
set mpr off.
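To see what the nested macro loops expand to without running SPSS, the same 30 commands can be generated as plain text. This is only an illustrative Python sketch under that assumption: it builds and prints strings, it does not call SPSS, and the function name aggregate_commands is hypothetical.

```python
def aggregate_commands(n_x=5, n_t=6):
    """Build one AGGREGATE command per (issue, time) pair as plain text.

    Mirrors the nested !do loops of the macro: the outer loop runs over
    issues x1..x5, the inner loop over times t1..t6.
    """
    commands = []
    for x in range(1, n_x + 1):
        for t in range(1, n_t + 1):
            commands.append(
                "AGGREGATE\n"
                "  /OUTFILE=* MODE=ADDVARIABLES\n"
                f"  /BREAK=group t{t}\n"
                f"  /x{x}t{t}=SUM(x{x})."
            )
    return commands


if __name__ == "__main__":
    # Print the generated syntax so it can be compared with the macro expansion.
    print("\n\n".join(aggregate_commands()))
```

The printed text should correspond to what "set mpr on" shows for the macro call; how the text would then be handed to SPSS (for example through the Python integration plug-in) is left out of this sketch.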

Date: Thu, 24 Jun 2010 13:55:25 +1000
From: [hidden email]
Subject: macro/python query, variable creation
To: [hidden email]


hillel vardi

Re: macro/python query, variable creation

In reply to this post by Christine-28

Shalom

There is no need to use a macro or Python to do what you need.
As you can see in the syntax that follows, you need only one time variable
with 6 values (1 to 6); that is, for each group and the 5 issues there
can be only one time value.
In your case you can create it with IF commands, as in the syntax. Then
you can aggregate once and restructure.
In the syntax, the order of the resulting variables is not the same as in
your example; use ADD FILES with KEEP to reorder them if needed.
Also, your combination of group 2 and time 2 has no data, and therefore it
is not in the result dataset.

Hillel Vardi
BGU

dataset close all.
data list fixed /group t1 t2 x1 x2 x3 (f1 5f2) .
begin data
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
1 1
1 1
2 1 1
2 1 1
2 1 1
2 1 1
2 1 1
2 1 1
2 1 1
2 1 1
2 1 1
2 1 1
end data .
if t2 eq 1 t1=2 .
execute .
DATASET DECLARE final.
AGGREGATE
/OUTFILE='final'
/BREAK=group t1
/x1_sum=SUM(x1)
/x2_sum=SUM(x2)
/x3_sum=SUM(x3).

DATASET ACTIVATE final .
SORT CASES BY group t1.
CASESTOVARS
/ID=group
/INDEX=t1
/GROUPBY=VARIABLE.
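The three steps described above (one time index per case, a single aggregate, then a long-to-wide restructure) can be sketched in plain Python for readers who want to trace the logic. This is an illustration only, not a translation of SPSS behaviour: the function names are hypothetical, the wide column names are only modelled on the x1_sum style used in the syntax, and empty cells are summed to 0 here whereas SPSS would return a missing value.

```python
from collections import defaultdict


def to_time_index(ts):
    """Collapse indicator columns t1..tn into a single index (1..n), or None."""
    for i, t in enumerate(ts, start=1):
        if t == 1:
            return i
    return None


def aggregate_long(rows, n_t, n_x):
    """Sum each issue per (group, time): the single aggregate step."""
    sums = defaultdict(lambda: [0] * n_x)
    for row in rows:
        group = row[0]
        ts = row[1:1 + n_t]
        xs = row[1 + n_t:1 + n_t + n_x]
        time = to_time_index(ts)
        if time is None:
            continue  # no time indicator set on this row
        cell = sums[(group, time)]
        for i, x in enumerate(xs):
            cell[i] += x or 0  # assumption: missing values count as 0
    return dict(sums)


def cases_to_vars(sums):
    """Spread the long table to one record per group (the restructure step)."""
    wide = {}
    for (group, time), cell in sorted(sums.items()):
        record = wide.setdefault(group, {})
        for i, total in enumerate(cell, start=1):
            record[f"x{i}_sum.{time}"] = total
    return wide
```

A (group, time) combination with no rows simply produces no columns for that group in the wide result, which is the same effect as a combination that is absent from the aggregated dataset.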

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Ramón López

Re: macro/python query, variable creation

Hello all,
Could anyone recommend good online references for getting started with Python programming in SPSS 18?
Thanks in advance.

Ingº. Ramon Lopez (58)-412-8318409

Skype: lopeznomar

2010/6/24 hillel vardi <[hidden email]>


Christine-28

Re: macro/python query, variable creation

As mentioned in my original post, I downloaded 'Programming and Data Management for IBM SPSS Statistics 18' from the SPSS website, http://www.spss.com/sites/dm-book/. I also use Raynald's site.

2010/6/25 Ramón López <[hidden email]>


Albert-Jan Roskam

Re: macro/python query, variable creation

http://docs.python.org/index.html --> I started where it says "start here".
You learn about the various data types, basic statements, the most useful builtin modules, etc.

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

--- On Fri, 6/25/10, Christine <[hidden email]> wrote:

From: Christine <[hidden email]>
Subject: Re: [SPSSX-L] macro/python query, variable creation
To: [hidden email]
Date: Friday, June 25, 2010, 2:53 AM
