Including rows for which zero events occurred.

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Including rows for which zero events occurred.

Paul Marchant
I have a file of data with 3 key variables ‘Area Code’, ‘Week Number’ & ‘Event Count’ as well as other ancillary variables.
However a row only exists where at least one event occurred. What I wish to do is include rows for weeks in which zero events occurred, so that every area has a complete time series of counts. How may I do this, please?

Paul Marchant

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Including rows for which zero events occurred.

Maguin, Eugene
I think varstocases might be useful here.  Let's say:

Ac wn ec x1, ..., x(n)
234 3 7
234 7 1
234 8 7
234 13 5
267 2 2
267 8 3
561 9 8
561 10 3
561 18 4

I think I'd do this.
1) Run aggregate (add variables mode) breaking on ac and saving first(wn) and last(wn). Then determine the maximum difference between first and last.
2) Save ac, first(wn), last(wn)  to a new file. Casestovars to reduce the file to a single record per ac.
3) Create vector wn(max difference).
4) Fill wn1 with the week numbers beginning with first and ending with last. This:
Loop #i=first to last.
Compute wn(#i)=#i.
End loop.
5) Now, finally, varstocases. Match this file to the original by ac wn. Recode ec=sysmis to 0.
Caveat. Some ac will have one or more blank wns prior to the varstocases. Check the documentation but I think those records will not be created by default. But if they are, a select if will get rid of them.

There's probably better, more clever ways to do this. Perhaps we'll find out what they are.
Gene Maguin












-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Paul Marchant
Sent: Friday, August 05, 2016 11:26 AM
To: [hidden email]
Subject: Including rows for which zero events occurred.

I have a file of data with 3 key variables ‘Area Code’, ‘Week Number’ & ‘Event Count’ as well as other ancillary variables.
However a row only exists where at least one event occurred. What I wish to do is include rows for weeks in which zero events occurred, so that every area has a complete time series of counts. How may I do this, please?

Paul Marchant

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Including rows for which zero events occurred.

Andy W
In reply to this post by Paul Marchant
See https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/

In a nutshell, you have to make a new dataset with all of the time sets and then left join your original data. Then recode the missing to 0.

See the macro linked at the end of my post, https://dl.dropboxusercontent.com/s/i5cii0zdw78tszc/TimeExpand_Macros.sps?dl=0, that helps with this.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/