How to "groupsum" consecutive values within a variable according to conditions


Jonas Steiner
(This post was later updated.)
Hello community

I'm struggling with the following problem in SPSS Version 23.

It's about the duration of periods without any rainfall. The data is sorted by DATE.

I would like to sum up consecutive values in one binary variable called "Trockentag" [0,1] ("dry day" in English) from top to bottom, "grouped": values should be added consecutively and stored in a new variable until the next value of "Trockentag" is 0 (a day when it rains again). The adding should start again at the next value of 1 and continue until the next value is 0 again, and so on.

The result should be a new variable called "duration of drydays" in a new dataset, which shows these grouped sums per value of "YR" [2015, 2014, 2013, ..., 1980] and "Station" [503, 502, 402, 401, 301, etc]. See attached graphic.

Thank you so much in advance!


Greetings
Jonas

Re: How to "groupsum" consecutive values within a variable according to conditions

David Marso
Administrator
Two approaches come to mind:
1. SPLIT FILE followed by CREATE using CSUM.
2. Use the LAG function and COMPUTE with an IF statement.

Note that the file should be sorted ascending by Year and Station for the first solution.
---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Re: How to "groupsum" consecutive values within a variable according to conditions

Kirill Orlov
In reply to this post by Jonas Steiner
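Will this suit? Here var is the binary variable.

* w is a running count of consecutive 1s. LEAVE keeps its value across cases (it starts at 0).
COMPUTE w = w + var.
* Reset the count whenever var is 0.
IF NOT var w = 0.
LEAVE w.
EXECUTE.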

On 20.10.2016 10:58, Jonas Steiner wrote:
> Hello community
>
> I'm struggling with the following problem in SPSS Version 23.
>
> I would like to sum up consecutive values in one binary variable called
> "Trockentag" [0,1] from top to bottom, "grouped": values should be added
> consecutively and stored in a new variable until the next value of
> "Trockentag" is 0. The adding should start again at the next value of 1
> and continue until the next value is 0 again, and so on.
>
> The result should be a new variable in a new dataset which shows these
> grouped sums (say 1,1,3,3,1,6,5,4,1 and so on) per value of "YR"
> [2015, 2014, 2013, ..., 1980] and "Station" [503, 502, 402, 401, 301, etc].
>
> <http://spssx-discussion.1045642.n5.nabble.com/file/n5733327/Problem_jonas.png>
>
> I hope you can relate to the definition of my problem. Thank you so much
> in advance!
>
> Greetings
> Jonas
>
> --
> View this message in context: http://spssx-discussion.1045642.n5.nabble.com/How-to-groupsum-consecutive-values-within-a-variable-according-to-conditions-tp5733327.html
> Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: How to "groupsum" consecutive values within a variable according to conditions

David Marso
Administrator

Nice one!
Except you need
IF NOT(var) ...
And you probably need to deal with the YR and Station transitions?
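
One way those YR and Station transitions might be handled with the same running-count idea is sketched below. It is untested and rests on assumptions: the file is sorted by Station, YR and DATE, and Trockentag contains only 0 and 1 (no missing values). The name w comes from the snippet above; Trockentag, Station and YR come from the original question.

```spss
* Sketch only. Assumes the file is sorted by Station, YR and DATE.
* Assumes Trockentag is 0 or 1 with no missing values.
* w keeps its value across cases because of LEAVE (it starts at 0).
COMPUTE w = w + Trockentag.
* Restart the count on a wet day.
IF (Trockentag EQ 0) w = 0.
* Restart the count at the first case of each new Station or YR.
* On the very first case LAG is missing, so this IF is skipped.
IF (Station NE LAG(Station) OR YR NE LAG(YR)) w = Trockentag.
LEAVE w.
EXECUTE.
```

This yields a running count on every case (1, 2, 3, ... within a dry spell), not one row per spell, so a further step would still be needed to get one duration per spell.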


Re: How to "groupsum" consecutive values within a variable according to conditions

Bruce Weaver
Administrator
David, that NOT(var) advice struck me as odd, so I tried the following:

DATA LIST FREE / var (F1).
BEGIN DATA
1 1 1 0 1 0 0 1 1 1 0 1 1 1 1 1 1 0 0 0 1
END DATA.

COMPUTE w1 = w1+var.
COMPUTE w2 = w2+var.
IF NOT var w1 = 0.
IF NOT(var) w2 = 0.
LEAVE w1 w2.
FORMATS w1 w2 (F5.0).
LIST.

OUTPUT:

var    w1    w2
 
 1      1     1
 1      2     2
 1      3     3
 0      0     0
 1      1     1
 0      0     0
 0      0     0
 1      1     1
 1      2     2
 1      3     3
 0      0     0
 1      1     1
 1      2     2
 1      3     3
 1      4     4
 1      5     5
 1      6     6
 0      0     0
 0      0     0
 0      0     0
 1      1     1
 
Number of cases read:  21    Number of cases listed:  21

;-)

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: How to "groupsum" consecutive values within a variable according to conditions

Jonas Steiner
In reply to this post by Jonas Steiner
Hi everyone

Big thanks to you two for your fast support!

Kirill, your suggestion unfortunately does not seem to work for my exact purpose.




I'm struggling over the following incomplete ideas (TRY1 to TRY3) from a web search. Maybe they can give you a hint?




### TRY1

COMPUTE Tr_addiert = Trockentag.
IF id NE LAG(Trockentag) Tr_addiert = 0.
IF id EQ LAG(Trockentag) Tr_addiert = LAG(Trockentag)+1.

# Got this idea from a Google search. But how do I define "id"? What is "id"? And what about "Station" and "YR"?




### TRY2

DO IF $casenum=1.
    COMPUTE Tr_addiert=Trockentag.
        ELSE IF MISSING(Trockentag).
            COMPUTE Tr_addiert=LAG(var3).
        ELSE.
            COMPUTE Tr_addiert=Trockentag + LAG(Tr_addiert).
END IF.


# How do I bring the variables "Station" and "YR" into it?



### TRY3

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Station YR
  /Tr_add=CSUM(Trockentag).

# Does not work because SPSS can't handle CSUM in an AGGREGATE command.




Best wishes
Jonas

Re: How to "groupsum" consecutive values within a variable according to conditions

David Marso
Administrator
Please read my original posting, RTFM, write some code and do the face-palm thing.

Re: How to "groupsum" consecutive values within a variable according to conditions

Jonas Steiner

I managed to do the following:

DO IF $casenum=1.
    COMPUTE Tr_addiert=Trockentag.
ELSE IF Trockentag=0.
    COMPUTE Tr_addiert=0.
ELSE IF LAG(Trockentag) GT 0.
    COMPUTE Tr_addiert=1+LAG(Tr_addiert).
ELSE.
    COMPUTE Tr_addiert=Trockentag + LAG(Tr_addiert).
END IF.

EXECUTE.

... which counts up the dry days ("Trockentage" in German) consecutively. But that's not exactly what I'm going for.

I want to know the time spans in days (summed consecutive values of 1, interrupted by values of 0, in "Trockentage") per "Station" and "YR".

Re: How to "groupsum" consecutive values within a variable according to conditions

Maguin, Eugene
This problem, which seems like it ought to have been pretty simple, has become a small saga. Jonas, I think it would have been, and would still be, helpful to go back to your original post and show the desired computational result. You describe it, but to me the description is not very understandable.
Gene Maguin

(In reply to the message from Jonas Steiner sent Friday, October 21, 2016 9:13 AM.)


Re: How to "groupsum" consecutive values within a variable according to conditions

David Marso
Administrator
In reply to this post by Jonas Steiner
Untested:
DO IF $casenum EQ 1 OR  Station NE LAG(Station) OR YR NE LAG(YR) .
....
....
REREAD MY FIRST POST!!!!!


Re: How to "groupsum" consecutive values within a variable according to conditions

Jonas Steiner
In reply to this post by Maguin, Eugene
I updated my original post and illustrated it with some new pictures. I think relating the problem to the content of my data is helpful. It should be clear now.




Re: How to "groupsum" consecutive values within a variable according to conditions

Kirill Orlov
It is not quite right to make major updates to your initial question on the site, because we are a mailing list, not a wiki site. You could post the update as an answer in the thread of your question (better) or post an entirely new question.

On 24.10.2016 15:14, Jonas Steiner wrote:
> I updated my original post and illustrated it with some new pictures. I think
> the relation to the content of my data seems helpful. It should be clear
> now.


Re: How to "groupsum" consecutive values within a variable according to conditions

Bruce Weaver
Administrator
Kirill is pointing out that people who do not read the list via the Nabble archive will not see your update.  But rather than posting the update as another message in the thread, you could give the link to the updated post in Nabble:

http://spssx-discussion.1045642.n5.nabble.com/How-to-quot-groupsum-quot-consecutive-values-within-a-variable-according-to-conditions-td5733327.html

Jonas, does this give the result you want?

COMPUTE OriginalOrder = $CASENUM.
RECODE Trockentag (missing=0).
SORT CASES by Station DATE.
COMPUTE #TrockChange = Trockentag NE LAG(Trockentag).
COMPUTE #StationChange = Station NE LAG(Station).
* NOTE that #Break is initialized to value of 0, and retains value across cases.
COMPUTE #Break = SUM(#Break, #TrockChange OR #StationChange).
COMPUTE BreakVar = #Break.
EXECUTE.
FORMATS OriginalOrder BreakVar (F5.0).

* Keep dry day records only.
SELECT IF Trockentag.
FREQUENCIES Trockentag.

DATASET DECLARE new.
AGGREGATE
  /OUTFILE='new'
  /BREAK=BreakVar
  /Station=FIRST(Station)
  /YR=FIRST(YR)
  /OriginalOrder=FIRST(OriginalOrder)
  /DryDur "Duration of dry days" = NU.
DATASET ACTIVATE new.
SORT CASES BY OriginalOrder. /* Remove if not needed.
LIST.



Re: How to "groupsum" consecutive values within a variable according to conditions

David Marso
Administrator
Furthermore, post your revised code after incorporating recent hints.
i.e. put on your thinking cap and get off your ass.

Re: How to "groupsum" consecutive values within a variable according to conditions

David Marso
Administrator
In reply to this post by Bruce Weaver
This is somewhat more concise ;-).

/*Assuming file is sorted by YEAR Station */.

COMPUTE Segment=SUM(LAG(Segment),(Trockentag EQ 0)).
SPLIT FILE BY YEAR Station Segment.
CREATE DaysDry=CSUM(Trockentag ).
MATCH FILES / FILE * / BY YEAR Station Segment /LAST=MaxRain.

/* EXECUTE.



Re: How to "groupsum" consecutive values within a variable according to conditions

Bruce Weaver
Administrator
Yes, it's far more concise.  But to get it to work properly with input data like the OP provided, it is vital to recode SYSMIS to 0 for variable Trockentag.  (When I ran it without including that RECODE, it was a mess.)

NEW FILE.
DATASET CLOSE all.
DATA LIST LIST / Station Trockentag YR (3F5.0).
BEGIN DATA
501 0 2015
501 0 2015
501 1 2015
501 1 2015
501 1 2015
503 1 2015
503 1 2015
503 1 2015
503 1 2015
503 1 2015
503 1 2015
503 1 2015
503 . 2015
503 . 2015
503 1 2015
503 1 2015
503 1 2015
503 1 2015
503 . 2015
503 1 2015
503 1 2015
503 . 2015
503 . 2015
503 . 2015
503 1 2015
503 1 2015
503 1 2015
503 . 2015
503 . 2015
503 1 2015
503 1 2015
END DATA.
DATASET NAME Original.

SORT CASES BY YR Station.
RECODE Trockentag (missing=0). /* You need this.

*****************************************************.
* David's code with YEAR changed to YR.
COMPUTE Segment=SUM(LAG(Segment),(Trockentag EQ 0)).
SPLIT FILE BY YR Station Segment.
CREATE DaysDry=CSUM(Trockentag ).
MATCH FILES / FILE * / BY YR Station Segment /LAST=MaxRain.
*****************************************************.
SPLIT FILE OFF.
TEMPORARY.
SELECT IF MaxRain and DaysDry GT 0.
LIST.

OUTPUT:

Station Trockentag    YR  Segment DaysDry MaxRain
 
   501         1    2015     2.00       3    1
   503         1    2015     2.00       7    1
   503         1    2015     4.00       4    1
   503         1    2015     5.00       2    1
   503         1    2015     8.00       3    1
   503         1    2015    10.00       2    1
 
Number of cases read:  6    Number of cases listed:  6
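
If the spell lengths are wanted in a separate dataset, as in the original question, something along these lines might work once the code above has been run (so that Segment exists and the missing values have been recoded). It is an untested sketch: the dataset name DrySpells and the label text are made up for illustration, and the quoted dataset name on OUTFILE simply follows the AGGREGATE example posted earlier in this thread.

```spss
* Sketch only. Assumes Segment and Trockentag exist as created above.
* DrySpells is a hypothetical dataset name.
DATASET DECLARE DrySpells.
AGGREGATE
  /OUTFILE='DrySpells'
  /BREAK=YR Station Segment
  /DryDur "Duration of dry spell (days)" = SUM(Trockentag).
DATASET ACTIVATE DrySpells.
* Drop segments that contain no dry days at all.
SELECT IF (DryDur GT 0).
LIST.
```

For the test data above, DryDur should match the DaysDry values in the listing, one row per dry spell.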

