Problem running lag syntax

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Problem running lag syntax

Boreak Silk

Hello Everyone,

 

I’m trying to run below lag syntax on an hourly interval dataset, with 30,000+ records, to calculate averages

over the past 52 weeks of the same hour and day but it is not working. SPSS keeps saying

Transformations Pending. Any idea what is wrong with the syntax? I ran it on SPSS 19 for Windows.

 

IF $CASENUM > 8736 Test_L52=(SUM(LAG(Test,168),LAG(Test,336),LAG(Test,504),LAG(Test,672),

 LAG(Test,840),LAG(Test,1008),LAG(Test,1176),LAG(Test,1344),LAG(Test,1512),LAG(Test,1680),

LAG(Test,1848),LAG(Test,2016),LAG(Test,2184),LAG(Test,2352),LAG(Test,2520),LAG(Test,2688),

LAG(Test,2856),LAG(Test,3024),LAG(Test,3192),LAG(Test,3360),LAG(Test,3528),LAG(Test,3696),

LAG(Test,3864),LAG(Test,4032),LAG(Test,4200),LAG(Test,4368),LAG(Test,4536),LAG(Test,4704),

LAG(Test,4872),LAG(Test,5040),LAG(Test,5208),LAG(Test,5376),LAG(Test,5544),LAG(Test,5712),

 LAG(Test,5880),LAG(Test,6048),LAG(Test,6216),LAG(Test,6384),LAG(Test,6552),LAG(Test,6720),

LAG(Test,6888),LAG(Test,7056),LAG(Test,7224),LAG(Test,7392),LAG(Test,7560),LAG(Test,7728),

 LAG(Test,7896),LAG(Test,8064),LAG(Test,8232),LAG(Test,8400),LAG(Test,8568),LAG(Test,8736)))/52.

 

 

Thank in advance,

 

Boreak

 

This email is intended solely for the named addressee.
If you are not the addressee indicated please delete it immediately.

Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

eharrigan
HI Boreak,

Try adding an execute after your last line of syntax.

... LAG(Test,7896),LAG(Test,8064),LAG(Test,8232),LAG(Test,8400),LAG(Test,8568),LAG(Test,8736)))/52.
execute.

Emma
________________________________________
From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Boreak Silk [[hidden email]]
Sent: Thursday, June 06, 2013 6:46 PM
To: [hidden email]
Subject: Problem running lag syntax

Hello Everyone,

I’m trying to run below lag syntax on an hourly interval dataset, with 30,000+ records, to calculate averages
over the past 52 weeks of the same hour and day but it is not working. SPSS keeps saying
Transformations Pending. Any idea what is wrong with the syntax? I ran it on SPSS 19 for Windows.

IF $CASENUM > 8736 Test_L52=(SUM(LAG(Test,168),LAG(Test,336),LAG(Test,504),LAG(Test,672),
 LAG(Test,840),LAG(Test,1008),LAG(Test,1176),LAG(Test,1344),LAG(Test,1512),LAG(Test,1680),
LAG(Test,1848),LAG(Test,2016),LAG(Test,2184),LAG(Test,2352),LAG(Test,2520),LAG(Test,2688),
LAG(Test,2856),LAG(Test,3024),LAG(Test,3192),LAG(Test,3360),LAG(Test,3528),LAG(Test,3696),
LAG(Test,3864),LAG(Test,4032),LAG(Test,4200),LAG(Test,4368),LAG(Test,4536),LAG(Test,4704),
LAG(Test,4872),LAG(Test,5040),LAG(Test,5208),LAG(Test,5376),LAG(Test,5544),LAG(Test,5712),
 LAG(Test,5880),LAG(Test,6048),LAG(Test,6216),LAG(Test,6384),LAG(Test,6552),LAG(Test,6720),
LAG(Test,6888),LAG(Test,7056),LAG(Test,7224),LAG(Test,7392),LAG(Test,7560),LAG(Test,7728),
 LAG(Test,7896),LAG(Test,8064),LAG(Test,8232),LAG(Test,8400),LAG(Test,8568),LAG(Test,8736)))/52.


Thank in advance,

Boreak


This email is intended solely for the named addressee.
If you are not the addressee indicated please delete it immediately.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

Ruben Geert van den Berg
In reply to this post by Boreak Silk
If I understand correctly, each record in your data is a consecutive hour. For each combination of day (1 through 7) and hour (0 through 23) you'd like to calculate a mean over 52 (weeks) data point, right? So you'll have 168 (7 days * 24 hours) new variables.

In that case, you could try something like the following (you'll need to have the Python essentials installed for this):

*1. Create test data.
input program.
loop # = 1 to 8736.
compute outcome = rv.ber(.5).
end case.
end loop.
end file.
end input program.
exe.

*2. Create week, day, hour day_hour as variables.
compute week = tru(($casenum - 1) /168) + 1.
compute day = mod(tru(($casenum - 1) / 24) ,7) + 1.
compute hour = mod(($casenum - 1),24).
compute day_hour=100*day + hour./*temporary variable, needed later on.
exe./*better omitted but allows for quick visual check.

*3. Important.
format day_hour(f3.0).

*4. Compute new variables.
begin program.
import spss,spssaux
cmd = spssaux.CreateXMLOutput("means outcome by day_hour.")
for day in range(1,8):
    for hour in range(24):
        mean = float(spssaux.GetValuesFromXMLWorkspace(cmd,"Report",cellAttrib="number",colCategory="Mean",rowCategory=str(100 * day + hour))[0])
        spss.Submit('compute mean_day_%(day)d_hour_%(hour)d = %(mean)f.'%locals())
end program.

*5. Visual inspection.
means outcome by day by hour.

*6. Delete temp variable.
delete variable day_hour.

HTH,

Ruben
Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

Andy W
In reply to this post by Boreak Silk
IMO you will probably want to look at the CREATE command, in particular the prior moving average. If you create a categorical variable to ID the hour of day and day of week, you can then use split files to create the running mean within that category.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

Maguin, Eugene
In reply to this post by Ruben Geert van den Berg
Although a lot of time has already been invested in the project, I think that, depending on how time is coded, the problem could be worked using aggregate. In a one year period you have 168 blocks of 52 records, each block corresponds to an hour of the week. So, if time were numbered correctly and sorted correctly, the break variable would be HourOfWeek.

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Ruben Geert van den Berg
Sent: Friday, June 07, 2013 4:14 AM
To: [hidden email]
Subject: Re: Problem running lag syntax

If I understand correctly, each record in your data is a consecutive hour.
For each combination of day (1 through 7) and hour (0 through 23) you'd like to calculate a mean over 52 (weeks) data point, right? So you'll have 168 (7 days * 24 hours) new variables.

In that case, you could try something like the following (you'll need to have the Python essentials installed for this):

*1. Create test data.
input program.
loop # = 1 to 8736.
compute outcome = rv.ber(.5).
end case.
end loop.
end file.
end input program.
exe.

*2. Create week, day, hour day_hour as variables.
compute week = tru(($casenum - 1) /168) + 1.
compute day = mod(tru(($casenum - 1) / 24) ,7) + 1.
compute hour = mod(($casenum - 1),24).
compute day_hour=100*day + hour./*temporary variable, needed later on.
exe./*better omitted but allows for quick visual check.

*3. Important.
format day_hour(f3.0).

*4. Compute new variables.
begin program.
import spss,spssaux
cmd = spssaux.CreateXMLOutput("means outcome by day_hour.") for day in range(1,8):
    for hour in range(24):
        mean =
float(spssaux.GetValuesFromXMLWorkspace(cmd,"Report",cellAttrib="number",colCategory="Mean",rowCategory=str(100
* day + hour))[0])
        spss.Submit('compute mean_day_%(day)d_hour_%(hour)d =
%(mean)f.'%locals())
end program.

*5. Visual inspection.
means outcome by day by hour.

*6. Delete temp variable.
delete variable day_hour.

HTH,

Ruben



--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Problem-running-lag-syntax-tp5720617p5720622.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

Andy W
Aggregate won't work for running averages. In addition to Emma's initial advice (problem solved via execute!) here is an example updating Ruben's fake data slightly. I'm confused where Ruben thinks the OP is asking for 168 seperate variables, but maybe I'm confused.

****************************************************.
*Fake data.
input program.
loop #i = 1 to 100000.
compute outcome = rv.ber(.5).
end case.
end loop.
end file.
end input program.
exe.
compute #daybase = DATE.MDY(01,01,2000).
compute #timebase = 0.
compute myDateTime = #daybase + #timebase + ($casenum*600).
formats myDateTime (DATETIME17).

*Categorical variable.
compute #hour = XDATE.HOUR(myDateTime).
compute #day = XDATE.WKDAY(myDateTime).
compute cat = #day + (#hour/100).

*Split file and PMA.
sort cases by cat myDateTime.
split file by cat.
CREATE
  /mean_out=PMA(outcome 52).
split file off.
****************************************************.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

David Marso
Administrator
In reply to this post by Boreak Silk
Use EXECUTE to force a data pass or even better run some stats to check the result.
Looking at that code compels me to suggest something more along the lines of the following.
Certainly more maintainable than dealing with those hard coded numbers!

DEFINE L52 (!POS !TOKENS(1) / !POS !TOKENS(1)  ) .
DO IF $CASENUM GT 8736.
!DO !I=168 !TO 8736 !BY 168.
COMPUTE !1=SUM(!1,LAG(!2,!I)).
!DOEND
COMPUTE !1=!1/52.
END IF.

!ENDDEFINE .

L52 XL_52 X .
EXECUTE.
Boreak Silk wrote
Hello Everyone,

 

I'm trying to run below lag syntax on an hourly interval dataset, with
30,000+ records, to calculate averages

over the past 52 weeks of the same hour and day but it is not working.
SPSS keeps saying

Transformations Pending. Any idea what is wrong with the syntax? I ran
it on SPSS 19 for Windows.

 

IF $CASENUM > 8736
Test_L52=(SUM(LAG(Test,168),LAG(Test,336),LAG(Test,504),LAG(Test,672),

 
LAG(Test,840),LAG(Test,1008),LAG(Test,1176),LAG(Test,1344),LAG(Test,1512
),LAG(Test,1680),

LAG(Test,1848),LAG(Test,2016),LAG(Test,2184),LAG(Test,2352),LAG(Test,252
0),LAG(Test,2688),

LAG(Test,2856),LAG(Test,3024),LAG(Test,3192),LAG(Test,3360),LAG(Test,352
8),LAG(Test,3696),

LAG(Test,3864),LAG(Test,4032),LAG(Test,4200),LAG(Test,4368),LAG(Test,453
6),LAG(Test,4704),

LAG(Test,4872),LAG(Test,5040),LAG(Test,5208),LAG(Test,5376),LAG(Test,554
4),LAG(Test,5712),

 
LAG(Test,5880),LAG(Test,6048),LAG(Test,6216),LAG(Test,6384),LAG(Test,655
2),LAG(Test,6720),

LAG(Test,6888),LAG(Test,7056),LAG(Test,7224),LAG(Test,7392),LAG(Test,756
0),LAG(Test,7728),

 
LAG(Test,7896),LAG(Test,8064),LAG(Test,8232),LAG(Test,8400),LAG(Test,856
8),LAG(Test,8736)))/52.

 

 

Thank in advance,

 

Boreak

 

This email is intended solely for the named addressee.
If you are not the addressee indicated please delete it immediately.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

David Marso
Administrator
That can be further refined by noticing that cases 168 away from each other will differ by only 2 elements.
Use this to implement a provisional means style approach rather than adding 52 things together when 50 have already been added and that lives above in the file.
Leaving the solution for you (TODO) as an exercise in efficient coding ;-)!
--
David Marso wrote
Use EXECUTE to force a data pass or even better run some stats to check the result.
Looking at that code compels me to suggest something more along the lines of the following.
Certainly more maintainable than dealing with those hard coded numbers!

DEFINE L52 (!POS !TOKENS(1) / !POS !TOKENS(1)  ) .
DO IF $CASENUM GT 8736.
!DO !I=168 !TO 8736 !BY 168.
COMPUTE !1=SUM(!1,LAG(!2,!I)).
!DOEND
COMPUTE !1=!1/52.
END IF.

!ENDDEFINE .

L52 XL_52 X .
EXECUTE.
Boreak Silk wrote
Hello Everyone,

 

I'm trying to run below lag syntax on an hourly interval dataset, with
30,000+ records, to calculate averages

over the past 52 weeks of the same hour and day but it is not working.
SPSS keeps saying

Transformations Pending. Any idea what is wrong with the syntax? I ran
it on SPSS 19 for Windows.

 

IF $CASENUM > 8736
Test_L52=(SUM(LAG(Test,168),LAG(Test,336),LAG(Test,504),LAG(Test,672),

 
LAG(Test,840),LAG(Test,1008),LAG(Test,1176),LAG(Test,1344),LAG(Test,1512
),LAG(Test,1680),

LAG(Test,1848),LAG(Test,2016),LAG(Test,2184),LAG(Test,2352),LAG(Test,252
0),LAG(Test,2688),

LAG(Test,2856),LAG(Test,3024),LAG(Test,3192),LAG(Test,3360),LAG(Test,352
8),LAG(Test,3696),

LAG(Test,3864),LAG(Test,4032),LAG(Test,4200),LAG(Test,4368),LAG(Test,453
6),LAG(Test,4704),

LAG(Test,4872),LAG(Test,5040),LAG(Test,5208),LAG(Test,5376),LAG(Test,554
4),LAG(Test,5712),

 
LAG(Test,5880),LAG(Test,6048),LAG(Test,6216),LAG(Test,6384),LAG(Test,655
2),LAG(Test,6720),

LAG(Test,6888),LAG(Test,7056),LAG(Test,7224),LAG(Test,7392),LAG(Test,756
0),LAG(Test,7728),

 
LAG(Test,7896),LAG(Test,8064),LAG(Test,8232),LAG(Test,8400),LAG(Test,856
8),LAG(Test,8736)))/52.

 

 

Thank in advance,

 

Boreak

 

This email is intended solely for the named addressee.
If you are not the addressee indicated please delete it immediately.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Problem running lag syntax

Boreak Silk
Thanks you all for your contribution, especially to David - your
approach worked well without Python.



Boreak


-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
David Marso
Sent: Saturday, 8 June 2013 4:47 AM
To: [hidden email]
Subject: Re: Problem running lag syntax

That can be further refined by noticing that cases 168 away from each
other will differ by only 2 elements.
Use this to implement a provisional means style approach rather than
adding
52 things together when 50 have already been added and that lives above
in the file.
Leaving the solution for you (TODO) as an exercise in efficient coding
;-)!
--

David Marso wrote
> Use EXECUTE to force a data pass or even better run some stats to
> check the result.
> Looking at that code compels me to suggest something more along the
> lines of the following.
> Certainly more maintainable than dealing with those hard coded
numbers!

>
> DEFINE L52 (!POS !TOKENS(1) / !POS !TOKENS(1)  ) .
> DO IF $CASENUM GT 8736.
> !DO !I=168 !TO 8736 !BY 168.
> COMPUTE !1=SUM(!1,LAG(!2,!I)).
> !DOEND
> COMPUTE !1=!1/52.
> END IF.
>
> !ENDDEFINE .
>
> L52 XL_52 X .
> EXECUTE.
> Boreak Silk wrote
>> Hello Everyone,
>>
>>
>>
>> I'm trying to run below lag syntax on an hourly interval dataset,
>> with 30,000+ records, to calculate averages
>>
>> over the past 52 weeks of the same hour and day but it is not
working.

>> SPSS keeps saying
>>
>> Transformations Pending. Any idea what is wrong with the syntax? I
>> ran it on SPSS 19 for Windows.
>>
>>
>>
>> IF $CASENUM > 8736
>> Test_L52=(SUM(LAG(Test,168),LAG(Test,336),LAG(Test,504),LAG(Test,672)
>> ,
>>
>>
>> LAG(Test,840),LAG(Test,1008),LAG(Test,1176),LAG(Test,1344),LAG(Test,1
>> 512
>> ),LAG(Test,1680),
>>
>> LAG(Test,1848),LAG(Test,2016),LAG(Test,2184),LAG(Test,2352),LAG(Test,
>> 252
>> 0),LAG(Test,2688),
>>
>> LAG(Test,2856),LAG(Test,3024),LAG(Test,3192),LAG(Test,3360),LAG(Test,
>> 352
>> 8),LAG(Test,3696),
>>
>> LAG(Test,3864),LAG(Test,4032),LAG(Test,4200),LAG(Test,4368),LAG(Test,
>> 453
>> 6),LAG(Test,4704),
>>
>> LAG(Test,4872),LAG(Test,5040),LAG(Test,5208),LAG(Test,5376),LAG(Test,
>> 554
>> 4),LAG(Test,5712),
>>
>>
>> LAG(Test,5880),LAG(Test,6048),LAG(Test,6216),LAG(Test,6384),LAG(Test,
>> 655
>> 2),LAG(Test,6720),
>>
>> LAG(Test,6888),LAG(Test,7056),LAG(Test,7224),LAG(Test,7392),LAG(Test,
>> 756
>> 0),LAG(Test,7728),
>>
>>
>> LAG(Test,7896),LAG(Test,8064),LAG(Test,8232),LAG(Test,8400),LAG(Test,
>> 856
>> 8),LAG(Test,8736)))/52.
>>
>>
>>
>>
>>
>> Thank in advance,
>>
>>
>>
>> Boreak
>>
>>
>>
>> This email is intended solely for the named addressee.
>> If you are not the addressee indicated please delete it immediately.





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to
email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante
porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff
in abyssum?"
--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/Problem-running-lag-syntax
-tp5720617p5720636.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list
of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD