Hi All,
I have respondent/record level which requires aggregating by month periods however as the AGGREGATE function doesn't produce a row for where the break cell has a zero count, the resulting aggregated dataset has missing rows for the date period and so it is difficult to do any longitudinal analysis on the data.
Conceptually this is what I am trying to do, as below.
Say I have two datasets, dsMonth and dsYear. dsMonth taking 12 values of the calendar month and dsYear taking how many year periods in the data. I am then trying to create a combined dataset which takes all combination of these two dataset which results in the dataset dsFinal. Any help would be much appreciated.
Thanks in advance. Jignesh
DATA LIST FREE / Month(A3). BEGIN DATA Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec END DATA. DATASET NAME dsMonth.
DATA LIST FREE / Year(A5). BEGIN DATA Y2012 Y2013 Y2014 END DATA. DATASET NAME dsYear.
DATA LIST FREE / Year(A5) Month(A3). BEGIN DATA Y2012 Jan Y2012 Feb Y2012 Mar Y2012 Apr Y2012 May Y2012 Jun Y2012 Jul Y2012 Aug Y2012 Sep Y2012 Oct Y2012 Nov Y2012 Dec Y2013 Jan Y2013 Feb Y2013 Mar Y2013 Apr Y2013 May Y2013 Jun Y2013 Jul Y2013 Aug Y2013 Sep Y2013 Oct Y2013 Nov Y2013 Dec Y2014 Jan Y2014 Feb Y2014 Mar Y2014 Apr Y2014 May Y2014 Jun Y2014 Jul Y2014 Aug Y2014 Sep Y2014 Oct Y2014 Nov Y2014 Dec END DATA. DATASET NAME dsFinal. |
I should point out my in my real data it is not quite straight forward as 12 months and 3 years (otherwise I could just LOOP this). I have to aggregate on two dimensions, one being time period and the other with unknown values however if I aggregate on them both together then I am likely to have empty cells for certain time periods so I aggregate at a total level on each individually to get the overall unique occurrence of each dimension. Then the task at hand is to create a master template table/dataset to be able to match the results of AGGREGATE to (which will have excluded rows but given master template dataset will have complete rows it will result in complete dataset, as desired) Many thanks, Jignesh On 24 September 2014 11:23, Jignesh Sutar <[hidden email]> wrote:
|
I wrote a blog post about this problem, http://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/. Basically what I do is to make a separate dataset that has the complete times and then merge in the aggregated data, then recode system missings to zero.
Here I have a set of macros that ease making the new datasets of time a bit. https://dl.dropboxusercontent.com/s/pbk0lwkexh2ccoy/TimeExpand_Macros.sps |
Thanks Andy, This is very useful! But say I am wanting to aggregate by Gender also and there is missing data for males in Jan, June, July and missing data for Females in April and August. How could I generate a complete dataset for Males and Female in all month periods? (If I have missing data for both males and female in any particular month then my approach wouldst work to generate the initial "overall" dataset and would have to use the approach you take) but for me that an initial starting point. From there I still need to create all combinations of, say Gender, and time period. I'm not aggregating on time period alone. Cheers Jignesh On 24 September 2014 12:34, Andy W <[hidden email]> wrote: I wrote a blog post about this problem, |
I see three choices.
- When generating the expanded dates you can add an extra loop for gender. - Or, after you make the aggregated dataset you can collapse to the individual dates (CASESTOVARS) and then merge into the expanded dates. - Similar to above, instead of using a particular count in the aggregation, create a set of dummy variables for each category and then sum the dummy variables. Example below of the last option. ***************************************************. *Make dummy variables for gender. COMPUTE Male = (Gender = 'M'). COMPUTE Fema = (Gender = 'F'). *Make aggregate counts. DATASET DECLARE AggData. AGGREGATE OUTFILE='AggData' /BREAK = Year Month /Male = SUM(Male) /Fema = SUM(Fema). DATASET ACTIVATE AggData. *Now merge in the expanded dataset. MATCH FILES FILE = * /FILE = 'ExpandData' /BY Year Month. RECODE Male Fema (SYSMIS = 0). *If you want back in long format. VARSTOCASES /MAKE GenderCount FROM Male Fema /INDEX GenderType. VALUE LABELS GenderType 1 'Male' 2 'Female'. ***************************************************. |
I see how that is working. But I only give gender as an example to demonstrate the task at hand. The two initial aggregate shall dictate how many categories there shall be these are otherwise unknown (see original post). So was hoping some kind of combination match could be achieved else perhaps will have to resort to Python to read the results of the two aggregation and generate the appropriate syntax as you demonstrate. Cheers Jignesh On 24 September 2014 13:14, Andy W <[hidden email]> wrote:
|
That is why I suggested some of the other approaches. Try to work out the CASESTOVARS approach I suggested for your particular case.
|
Administrator
|
In reply to this post by Jignesh Sutar
Search my recent posts in this group with keyword KRONECKER then apply the lessons in those posts!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
That should read KRONEKER! The first part of the MATRIX docs is in ERROR.
That said, study this and prosper! Quite obviously this can be readily extended to any number of variables!!!! -- /* data simulation here */. MATRIX. SAVE T({1:100})/OUTFILE * . END MATRIX. COMPUTE V1=TRUNC(RV.UNIFORM(1,13)). COMPUTE V2=TRUNC(RV.UNIFORM(1,13)). CROSSTABS V1 BY V2. AGGREGATE OUTFILE 'C:\TEMP\v1.sav' / BREAK v1 / Count=N. AGGREGATE OUTFILE 'C:\TEMP\v2.sav' / BREAK v2 / Count=N. /* relevant code here */. MATRIX. GET v1 / FILE='C:\TEMP\v1.sav' /VARIABLES v1. GET v2 / FILE='C:\TEMP\v2.sav' /VARIABLES v2. SAVE {KRONEKER(v1,MAKE(NROW(v2),1,1)),KRONEKER(MAKE(NROW(v1),1,1),v2)} /OUTFILE */VARIABLES v1 v2 . END MATRIX.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Brilliant, thanks. Here is another example of it in action with the additional sub command STRINGS specification, for it to work with string data. DATA LIST FREE / Year(A5) Month(A3). BEGIN DATA Y2012 Jan Y2012 Feb Y2012 Mar Y2012 Apr Y2012 May Y2012 Jun Y2012 Jul Y2012 Aug Y2012 Sep Y2012 Oct Y2012 Nov Y2012 Dec Y2013 Jan Y2014 Dec END DATA. DATASET NAME dsFinal. AGGREGATE OUTFILE 'C:\TEMP\v1.sav' / BREAK Year / Count=N. AGGREGATE OUTFILE 'C:\TEMP\v2.sav' / BREAK Month / Count=N. /* relevant code here */. MATRIX. GET Year / FILE='C:\TEMP\v1.sav' /VARIABLES Year. GET Month / FILE='C:\TEMP\v2.sav' /VARIABLES Month. SAVE {KRONEKER(Year,MAKE(NROW(Month),1,1)),KRONEKER(MAKE(NROW(Year),1,1),Month)} /OUTFILE */VARIABLES Year Month /STRINGS=Year Month . END MATRIX. On 24 September 2014 14:20, David Marso <[hidden email]> wrote: That should read KRONEKER! The first part of the MATRIX docs is in ERROR. |
Administrator
|
Interesting that KRONEKER even works with strings!
OTOH: 1*x=x for all x! Hence strings must be stored as numbers in MATRIX so weird things like multiplication work. Little experiment: Are we having fun yet? ;-) MATRIX. COMPUTE x={"a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"; "A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"}. COMPUTE y=MAKE(26,1,0). LOOP i=1 TO 6. COMPUTE y={y,T(x)*i}. END LOOP. PRINT y(:,2:13) /FORMAT "A8". END MATRIX. Run MATRIX procedure: Y(:,2:13) a A a 0 A 0 000008 b000008 a @ A @ y(((((D Q(((((D 00000H b00000H b B b 0 B 0 000008 c000008 b @ B @ z(((((D R(((((D 00000H c00000H c C c 0 C 0 000008 d000008 c @ C @ |(((((D T(((((D 00000H d00000H d D d 0 D 0 000008 f000008 d @ D @ }(((((D U(((((D 00000H f00000H e E e 0 E 0 000008 h000008 e @ E @ ~(((((D V(((((D 00000H h00000H f F f 0 F 0 000008 i000008 f @ F @ (((((D X(((((D 00000H i00000H g G g 0 G 0 000008 j000008 g @ G @ (((((D Y(((((D 00000H j00000H h H h 0 H 0 000008 l000008 h @ H @ (((((D Z(((((D 00000H l00000H i I i 0 I 0 000008 n000008 i @ I @ (((((D [(((((D 00000H n00000H j J j 0 J 0 000008 o000008 j @ J @ (((((D \(((((D 00000H o00000H k K k 0 K 0 000008 p000008 k @ K @ (((((D ^(((((D 00000H p00000H l L l 0 L 0 000008 r000008 l @ L @ (((((D _(((((D 00000H r00000H m M m 0 M 0 000008 t000008 m @ M @ (((((D `(((((D 00000H t00000H n N n 0 N 0 000008 u000008 n @ N @ (((((D b(((((D 00000H u00000H o O o 0 O 0 000008 v000008 o @ O @ (((((D c(((((D 00000H v00000H p P p 0 P 0 000008 x000008 p @ P @ (((((D d(((((D 00000H x00000H q Q q 0 Q 0 000008 z000008 q @ Q @ (((((D e(((((D 00000H z00000H r R r 0 R 0 000008 {000008 r @ R @ (((((D f(((((D 00000H {00000H s S s 0 S 0 000008 |000008 s @ S @ (((((D h(((((D 00000H |00000H t T t 0 T 0 000008 ~000008 t @ T @ (((((D i(((((D 00000H ~00000H u U u 0 U 0 000008 000008 u @ U @ (((((D j(((((D 00000H 00000H v V v 0 V 0 000008 000008 v @ V @ (((((D l(((((D 00000H 00000H w W w 0 W 0 000008 000008 w @ W @ (((((D m(((((D 00000H 00000H x X x 0 X 0 000008 000008 x @ X @ (((((D n(((((D 00000H 00000H y Y y 0 Y 0 000008 000008 y @ Y @ (((((D o(((((D 00000H 00000H z Z z 0 Z 0 000008 000008 z @ Z @ (((((D p(((((D 00000H 00000H ------ END MATRIX -----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Interesting results, if nothing, a great reminder how powerful matrix transformations can be, been some years since I used them, really ought to make us of them more. How would you use the KRONEKER if you had 3 dimensions? Is there a more efficient way of doing it than below: DATA LIST FREE / Gender (A1) Year(A5) Month(A3). BEGIN DATA M Y2012 Jan M Y2012 Feb F Y2012 Mar M Y2012 Apr M Y2012 May F Y2012 Jun M Y2012 Jul M Y2012 Aug M Y2012 Sep M Y2012 Oct M Y2012 Nov M Y2012 Dec M Y2013 Jan M Y2014 Dec END DATA. DATASET NAME dsFinal. AGGREGATE OUTFILE 'C:\TEMP\v0.sav' / BREAK Gender / Count=N. AGGREGATE OUTFILE 'C:\TEMP\v1.sav' / BREAK Year / Count=N. AGGREGATE OUTFILE 'C:\TEMP\v2.sav' / BREAK Month / Count=N. MATRIX. GET Year / FILE='C:\TEMP\v1.sav' /VARIABLES Year. GET Month / FILE='C:\TEMP\v2.sav' /VARIABLES Month. SAVE {KRONEKER(Year,MAKE(NROW(Month),1,1)),KRONEKER(MAKE(NROW(Year),1,1),Month)} /OUTFILE=* /VARIABLES Year Month /STRINGS=Year Month . END MATRIX. dataset name dsV3. dataset activate dsV3. string Combo (a8). compute Combo =concat(Year,Month). add files file=*/keep=Combo. save outfile='C:\TEMP\v3.sav'. MATRIX. GET Gender / FILE='C:\TEMP\v0.sav' /VARIABLES Gender. GET Combo / FILE='C:\TEMP\v3.sav' /VARIABLES Combo. SAVE {KRONEKER(Gender,MAKE(NROW(Combo),1,1)),KRONEKER(MAKE(NROW(Gender),1,1),Combo)} /OUTFILE=* /VARIABLES Gender Combo /STRINGS=Gender Combo . END MATRIX. On 25 September 2014 15:03, David Marso <[hidden email]> wrote: Interesting that KRONEKER even works with strings! |
SQL has a CROSS JOIN feature which produces the Cartesian product of individual tables. It would be great if STAR JOIN could utilize similar features. The START JOIN demo below doesn't work but it would be great if something as simple as that could be specified to achieve the same (in the background it could be calling the matrix command, if necessary)? Not sure if this is currently possible in STAR JOIN, I don't think it is. DATA LIST FREE / Gender (A1) Year(A5) Month(A3). BEGIN DATA M Y2012 Jan M Y2012 Feb F Y2012 Mar M Y2012 Apr M Y2012 May F Y2012 Jun M Y2012 Jul M Y2012 Aug M Y2012 Sep M Y2012 Oct M Y2012 Nov M Y2012 Dec M Y2013 Jan M Y2014 Dec END DATA. DATASET NAME dsFinal. dataset declare dsG. dataset declare dsY. dataset declare dsM. AGGREGATE OUTFILE dsG / BREAK Gender / Count=N. AGGREGATE OUTFILE dsY / BREAK Year / Count=N. AGGREGATE OUTFILE dsM / BREAK Month / Count=N. star join /select * /from dsY /crossjoin dsM /outfile file=*. On 26 September 2014 10:29, Jignesh Sutar <[hidden email]> wrote:
|
Administrator
|
In reply to this post by Jignesh Sutar
Rather than saving the first just use compute.
eg. MATRIX. COMPUTE A={1;2}. COMPUTE B={4;5}. COMPUTE C={7;8}. /* assume the above have been read using GET!. COMPUTE AB={KRONEKER(A,MAKE(NROW(B),1,1)),KRONEKER(MAKE(NROW(A),1,1),B)}. COMPUTE ABC={KRONEKER(AB,MAKE(NROW(C),1,1)),KRONEKER(MAKE(NROW(AB),1,1),C)}. PRINT ABC. END MATRIX. ABC 1 4 7 1 4 8 1 5 7 1 5 8 2 4 7 2 4 8 2 5 7 2 5 8
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
A general solution using my recursion technique ;-)
--- DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND). !IF (!TAIL(!1) !NE "") !THEN !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET !Result=!CONCAT(!HEAD(!1),!HEAD(!T2)). + COMPUTE !Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}. + !K !Result !TAIL(!TAIL(!1)) /!2. !ELSE SAVE !1 / OUTFILE */VARIABLES !2. !IFEND !ENDDEFINE. SET MPRINT ON. MATRIX. COMPUTE A={1;2;3}. COMPUTE B={3;4;5}. COMPUTE C={5;6;7}. COMPUTE D={7;8;9}. COMPUTE E={9;10;11}. COMPUTE F={11;12;13}. COMPUTE G={13;14;15}. !K A B C D E F G/ A B C D E F G. END MATRIX.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Is there a hack for getting this to work for string lengths greater than 8 characters? Or will I have to do a autorecode on the string variable and then run the KRONEKER matrix on that and then match back in the string definition?
Thanks, Jignesh |
In reply to this post by David Marso
Brilliant, this could be very useful! Thanks!
|
Administrator
|
In reply to this post by Jignesh Sutar
MATRIX supports only SHORT strings.
In general, support for strings in MATRIX is woefully limited. I tend to work with only numeric data in MATRIX and yes, AUTORECODE would be the game to play here. Meanwhile, since you seem on the ball (cranially engaged ;-), here is one more variation. DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND). !IF (!TAIL(!1) !NE "") !THEN !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET !Result=!CONCAT(!HEAD(!1),!HEAD(!T2)). + COMPUTE !Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}. + !K !Result !TAIL(!TAIL(!1)) /!2. !ELSE SAVE !1 / OUTFILE */VARIABLES !2. !IFEND !ENDDEFINE. DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND). !IF (!1 !NE "") !THEN GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) . !Load !TAIL(!1) / !TAIL(!2). !IFEND. !ENDDEFINE. DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND). MATRIX. !Load !Files / !Vars . !K !Vars / !Vars . END MATRIX. !ENDDEFINE. !FullCat Files "C:\TEMP\A.SAV" "C:\TEMP\B.SAV" "C:\TEMP\C.SAV" "C:\TEMP\D.SAV" "C:\TEMP\E.SAV" "C:\TEMP\F.SAV" "C:\TEMP\G.SAV" / Vars A B C D E F G. /* Data File simulation */. SET MPRINT ON. MATRIX. COMPUTE A={1;2;3}. COMPUTE B={3;4;5}. COMPUTE C={5;6;7}. COMPUTE D={7;8;9}. COMPUTE E={9;10;11}. COMPUTE F={11;12;13}. COMPUTE G={13;14;15}. SAVE A /OUTFILE "C:\TEMP\A.SAV"/VARIABLES A. SAVE B /OUTFILE "C:\TEMP\B.SAV"/VARIABLES B. SAVE C /OUTFILE "C:\TEMP\C.SAV"/VARIABLES C. SAVE D /OUTFILE "C:\TEMP\D.SAV"/VARIABLES D. SAVE E /OUTFILE "C:\TEMP\E.SAV"/VARIABLES E. SAVE F /OUTFILE "C:\TEMP\F.SAV"/VARIABLES F. SAVE G /OUTFILE "C:\TEMP\G.SAV"/VARIABLES G. END MATRIX. ---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Free forum by Nabble | Edit this page |