Filling out gaps and MA calculation

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Filling out gaps and MA calculation

Robert L
The other day, I was asked to set up a procedure for calculation of weekly moving averages of cases, actually positive covid19 cases, and the first impulse was that this would be easy. But there turned out to be two problems:

1) The data has gaps. For each combination of sex and age group, I have the data for those weeks when there were observed cases. The weeks when there were no cases are not included.

2) The plan was to calculate 2 week moving averages for all subgroups defined by sex and age group. This had been easy if there were no gaps and if there was some AGGREGATE function for calculation of such averages.

The data in a simplifed form looks as follows. The first week should be 40 and the last week 42. So the first subgroup below, there should be a line with week=41, sex=1, agegrp=1 and cases=0. And for the last group there should be a line with week=42, sex=2, agegrp=3 and cases=0.

DATA LIST LIST/week(F2) sex(F1) agegrp(F1) cases(F2).
BEGIN DATA
40 1 1 3
42 1 1 1
40 1 2 8
41 1 2 10
40 1 3 26
41 1 3 25
42 1 3 10
41 2 1 4
42 2 1 1
40 2 2 3
42 2 2 1
40 2 3 16
41 2 3 29
END DATA.
DATASET NAME ma WINDOW=FRONT.

My first idea was to use XSAVE to add data for weeks with no cases, and some loop structure where something like "CREATE /ma=PMA(cases)" could be used. But I don't see how either should be set up. Suggestions anyone?

Robert

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Robert Lundqvist
Reply | Threaded
Open this post in threaded view
|

Re: Filling out gaps and MA calculation

Maguin, Eugene
I'll bet there are cleverer solutions but one way is this.
First, aggregate your data by week. Every week will have one case. (How soon until there are weeks with none).
Second, delete every variable but the week variable.
Third, do this.

do repeat y=week1 week2 week3 week4 week5.
+  compute y=week.
end repeat.
execute.
varstocases make week from week to week5.
compute sex=1+mod(($casenum-1),2).
sort cases by week sex.
compute agegrp=1+mod(($casenum-1),3).
sort cases by week sex agegrp.

Now you got one record for every combination of week, sex, agegrp. Save this file call it 'allweeks'.

Fouth, aggregate your data file as usual.
Fifth, match files with allweeks as the 'file' and the aggregated file as the 'table'.
Sixth, recode sysmis case counts to 0.

Should be it.

Gene Maguin



-----Original Message-----
From: SPSSX(r) Discussion <[hidden email]> On Behalf Of Robert Lundqvist
Sent: Monday, October 19, 2020 9:58 AM
To: [hidden email]
Subject: Filling out gaps and MA calculation

The other day, I was asked to set up a procedure for calculation of weekly moving averages of cases, actually positive covid19 cases, and the first impulse was that this would be easy. But there turned out to be two problems:

1) The data has gaps. For each combination of sex and age group, I have the data for those weeks when there were observed cases. The weeks when there were no cases are not included.

2) The plan was to calculate 2 week moving averages for all subgroups defined by sex and age group. This had been easy if there were no gaps and if there was some AGGREGATE function for calculation of such averages.

The data in a simplifed form looks as follows. The first week should be 40 and the last week 42. So the first subgroup below, there should be a line with week=41, sex=1, agegrp=1 and cases=0. And for the last group there should be a line with week=42, sex=2, agegrp=3 and cases=0.

DATA LIST LIST/week(F2) sex(F1) agegrp(F1) cases(F2).
BEGIN DATA
40 1 1 3
42 1 1 1
40 1 2 8
41 1 2 10
40 1 3 26
41 1 3 25
42 1 3 10
41 2 1 4
42 2 1 1
40 2 2 3
42 2 2 1
40 2 3 16
41 2 3 29
END DATA.
DATASET NAME ma WINDOW=FRONT.

My first idea was to use XSAVE to add data for weeks with no cases, and some loop structure where something like "CREATE /ma=PMA(cases)" could be used. But I don't see how either should be set up. Suggestions anyone?

Robert

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Filling out gaps and MA calculation

Jon Peck
Once the data are regularized to restore the omitted cases, the CREATE procedure can do the moving averages for whatever span is desired, but since each gap will affect the subsequent moving average until the order is fulfilled, there will be more gaps in the MA than in the original data.

On Mon, Oct 19, 2020 at 10:54 AM Maguin, Eugene <[hidden email]> wrote:
I'll bet there are cleverer solutions but one way is this.
First, aggregate your data by week. Every week will have one case. (How soon until there are weeks with none).
Second, delete every variable but the week variable.
Third, do this.

do repeat y=week1 week2 week3 week4 week5.
+  compute y=week.
end repeat.
execute.
varstocases make week from week to week5.
compute sex=1+mod(($casenum-1),2).
sort cases by week sex.
compute agegrp=1+mod(($casenum-1),3).
sort cases by week sex agegrp.

Now you got one record for every combination of week, sex, agegrp. Save this file call it 'allweeks'.

Fouth, aggregate your data file as usual.
Fifth, match files with allweeks as the 'file' and the aggregated file as the 'table'.
Sixth, recode sysmis case counts to 0.

Should be it.

Gene Maguin



-----Original Message-----
From: SPSSX(r) Discussion <[hidden email]> On Behalf Of Robert Lundqvist
Sent: Monday, October 19, 2020 9:58 AM
To: [hidden email]
Subject: Filling out gaps and MA calculation

The other day, I was asked to set up a procedure for calculation of weekly moving averages of cases, actually positive covid19 cases, and the first impulse was that this would be easy. But there turned out to be two problems:

1) The data has gaps. For each combination of sex and age group, I have the data for those weeks when there were observed cases. The weeks when there were no cases are not included.

2) The plan was to calculate 2 week moving averages for all subgroups defined by sex and age group. This had been easy if there were no gaps and if there was some AGGREGATE function for calculation of such averages.

The data in a simplifed form looks as follows. The first week should be 40 and the last week 42. So the first subgroup below, there should be a line with week=41, sex=1, agegrp=1 and cases=0. And for the last group there should be a line with week=42, sex=2, agegrp=3 and cases=0.

DATA LIST LIST/week(F2) sex(F1) agegrp(F1) cases(F2).
BEGIN DATA
40 1 1 3
42 1 1 1
40 1 2 8
41 1 2 10
40 1 3 26
41 1 3 25
42 1 3 10
41 2 1 4
42 2 1 1
40 2 2 3
42 2 2 1
40 2 3 16
41 2 3 29
END DATA.
DATASET NAME ma WINDOW=FRONT.

My first idea was to use XSAVE to add data for weeks with no cases, and some loop structure where something like "CREATE /ma=PMA(cases)" could be used. But I don't see how either should be set up. Suggestions anyone?

Robert

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD


--
Jon K Peck
[hidden email]

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Filling out gaps and MA calculation

Bruce Weaver
Administrator
In reply to this post by Maguin, Eugene
Hi Gene.  Does this generate the AllWeeks dataset you have in mind?

NEW FILE.
DATASET CLOSE ALL.

INPUT PROGRAM.
- LOOP Week = 40 to 42.
-  LEAVE Week.
-  Loop Sex = 1 to 2.
-   LEAVE Sex.
-   LOOP AgeGrp = 1 to 3.
-   END CASE.
-   END LOOP.
-  END LOOP.
- END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
FORMATS Week(F2.0) / Sex AgeGrp(F1).
DATASET NAME AllWeeks.



Maguin, Eugene wrote

> I'll bet there are cleverer solutions but one way is this.
> First, aggregate your data by week. Every week will have one case. (How
> soon until there are weeks with none).
> Second, delete every variable but the week variable.
> Third, do this.
>
> do repeat y=week1 week2 week3 week4 week5.
> +  compute y=week.
> end repeat.
> execute.
> varstocases make week from week to week5.
> compute sex=1+mod(($casenum-1),2).
> sort cases by week sex.
> compute agegrp=1+mod(($casenum-1),3).
> sort cases by week sex agegrp.
>
> Now you got one record for every combination of week, sex, agegrp. Save
> this file call it 'allweeks'.
>
> Fouth, aggregate your data file as usual.
> Fifth, match files with allweeks as the 'file' and the aggregated file as
> the 'table'.
> Sixth, recode sysmis case counts to 0.
>
> Should be it.
>
> Gene Maguin





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Reply | Threaded
Open this post in threaded view
|

Re: Filling out gaps and MA calculation

Maguin, Eugene
Even better! Thanks, Bruce.

-----Original Message-----
From: SPSSX(r) Discussion <[hidden email]> On Behalf Of Bruce Weaver
Sent: Monday, October 19, 2020 3:09 PM
To: [hidden email]
Subject: Re: Filling out gaps and MA calculation

Hi Gene.  Does this generate the AllWeeks dataset you have in mind?

NEW FILE.
DATASET CLOSE ALL.

INPUT PROGRAM.
- LOOP Week = 40 to 42.
-  LEAVE Week.
-  Loop Sex = 1 to 2.
-   LEAVE Sex.
-   LOOP AgeGrp = 1 to 3.
-   END CASE.
-   END LOOP.
-  END LOOP.
- END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
FORMATS Week(F2.0) / Sex AgeGrp(F1).
DATASET NAME AllWeeks.



Maguin, Eugene wrote

> I'll bet there are cleverer solutions but one way is this.
> First, aggregate your data by week. Every week will have one case.
> (How soon until there are weeks with none).
> Second, delete every variable but the week variable.
> Third, do this.
>
> do repeat y=week1 week2 week3 week4 week5.
> +  compute y=week.
> end repeat.
> execute.
> varstocases make week from week to week5.
> compute sex=1+mod(($casenum-1),2).
> sort cases by week sex.
> compute agegrp=1+mod(($casenum-1),3).
> sort cases by week sex agegrp.
>
> Now you got one record for every combination of week, sex, agegrp.
> Save this file call it 'allweeks'.
>
> Fouth, aggregate your data file as usual.
> Fifth, match files with allweeks as the 'file' and the aggregated file
> as the 'table'.
> Sixth, recode sysmis case counts to 0.
>
> Should be it.
>
> Gene Maguin





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Filling out gaps and MA calculation

Robert L
In reply to this post by Robert L
Bruce, Jon, and Art, many thanks!

I had completely missed the way an AllWeeks dataset could be matched with the basic dataset. Neat! I will certainly try to remember this for similar tasks in the future.

 
-----Ursprungligt meddelande-----
Från: SPSSX(r) Discussion [mailto:[hidden email]] För Bruce Weaver
Skickat: den 19 oktober 2020 21:09
Till: [hidden email]
Ämne: Re: Filling out gaps and MA calculation

Hi Gene.  Does this generate the AllWeeks dataset you have in mind?

NEW FILE.
DATASET CLOSE ALL.

INPUT PROGRAM.
- LOOP Week = 40 to 42.
-  LEAVE Week.
-  Loop Sex = 1 to 2.
-   LEAVE Sex.
-   LOOP AgeGrp = 1 to 3.
-   END CASE.
-   END LOOP.
-  END LOOP.
- END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
FORMATS Week(F2.0) / Sex AgeGrp(F1).
DATASET NAME AllWeeks.



Maguin, Eugene wrote

> I'll bet there are cleverer solutions but one way is this.
> First, aggregate your data by week. Every week will have one case.
> (How soon until there are weeks with none).
> Second, delete every variable but the week variable.
> Third, do this.
>
> do repeat y=week1 week2 week3 week4 week5.
> +  compute y=week.
> end repeat.
> execute.
> varstocases make week from week to week5.
> compute sex=1+mod(($casenum-1),2).
> sort cases by week sex.
> compute agegrp=1+mod(($casenum-1),3).
> sort cases by week sex agegrp.
>
> Now you got one record for every combination of week, sex, agegrp.
> Save this file call it 'allweeks'.
>
> Fouth, aggregate your data file as usual.
> Fifth, match files with allweeks as the 'file' and the aggregated file
> as the 'table'.
> Sixth, recode sysmis case counts to 0.
>
> Should be it.
>
> Gene Maguin





-----
--
Bruce Weaver
[hidden email]
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

NOTE: My Hotmail account is not monitored regularly.
To send me an e-mail, please use the address shown above.

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Robert Lundqvist