Kroneker / Cartesian Product / !Mny2Mny / Unroll

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Kroneker / Cartesian Product / !Mny2Mny / Unroll

Jignesh Sutar
I have an OLAP style aggregated dataset for which I have many dimensions (in example below I demonstrate with only two, Dim1 and Dim2) with a final dimension being time (Month in example below).

I want to force/unroll for ever unique combination of Dim1 and Dim2, every month present in the data. Discarding the scenario that there would be a missing month altogether from all aggregated combinations of Dim1 and Dim2, assume the dataset would contain all the time points of relevance  - at a total level but not perhaps for any individual Dim1*Dim2 level. Andy Wheeler !TimeExpand macro is great for that particular use case: https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/


I actually have a working solution to this problem as demonstrated below, but was wondering if there was an (even) more efficient/direct way of doing this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the capability of achieving this? (I tried but failed)




/ ******************************************************* / .
/* Load David Marsos' KRONEKER Matrix Macro */.
/ ******************************************************* / .
DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND).
!IF (!TAIL(!1) !NE "") !THEN
   !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET
!Result=!CONCAT(!HEAD(!1),!HEAD(!T2)).
+  COMPUTE
!Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}.
+  !K !Result !TAIL(!TAIL(!1)) /!2.
!ELSE
SAVE !1 / OUTFILE */VARIABLES !2.
!IFEND
!ENDDEFINE.

DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND).
!IF (!1 !NE "") !THEN
GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) .
!Load !TAIL(!1) / !TAIL(!2).
!IFEND.
!ENDDEFINE.

DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND).
MATRIX.
!Load !Files / !Vars .
!K !Vars  / !Vars .
END MATRIX.
!ENDDEFINE.
/ ******************************************************* / .

 
/ ******************************************************* / .
/* Demo Data */.
/ ******************************************************* / .
DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
BEGIN DATA.
A 7 1 50
A 7 2 40
A 7 3 20
A 8 1 56
A 8 2 86
A 9 1 45
B 8 2 68
B 8 3 58
B 9 1 57
END DATA.
DATASET NAME DSRaw.


/ ******************************************************* / .
/* CODE */.
/ ******************************************************* / .
DATASET ACTIVATE DSRaw.
SORT CASES BY Dim1 Dim2.
MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx.
CREATE Idx=CSUM(Idx).

/* Extract unique values for each dimension */.
DATASET ACTIVATE DSRaw.
DATASET DECLARE dsIdx.
AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N.
DATASET ACTIVATE dsIdx.
SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count.

DATASET ACTIVATE DSRaw.
DATASET DECLARE dsMonth.
AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N.
DATASET ACTIVATE dsMonth.
SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count.

!FullCat
  Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav"
  / Vars Idx Month.

DATASET NAME DSExpanded.
DATASET ACTIVATE DSRaw.
SORT CASES BY Idx Month.

DATASET ACTIVATE DSExpanded.
SORT CASES BY Idx Month.
MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month.

DATASET ACTIVATE DSExpanded.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to Dim2=MAX(Dim1 TO Dim2).
EXE.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

Jignesh Sutar
Actually a more simpler approach given below, using CASESTOVARS and then VARSTOCASES. I could use SPSSINC SELECT VARIABLES to generate the variable list for the last AGGREGATE and V2C (to dynamically capture varying number of month periods).

Apart from that, can anyone anticipate any flaws taking this approach over perhaps the Cartesian product? Cartesian product seems to be very long winded.


/ ******************************************************* / .
/* Demo Data */.
/ ******************************************************* / .
DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
BEGIN DATA.
A 7 1 50
A 7 2 40
A 7 3 20
A 8 1 56
A 8 2 86
A 9 1 45
B 8 2 68
B 8 3 58
B 9 1 57
END DATA.
DATASET NAME DSRaw.

/ ******************************************************* / .
/* CODE */.
/ ******************************************************* / .
SORT CASES BY Dim1 Dim2.
COMPUTE Month2=Month.
CASESTOVARS /ID=Dim1 Dim2 /AUTOFIX=NO /INDEX=Month.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /Month2.1 to Month2.3=MAX(Month2.1 to Month2.3).
VARSTOCASES /MAKE Month FROM Month2.1 to Month2.3 /MAKE Measure from Measure.1 to Measure.3.




On 10 April 2015 at 19:38, Jignesh Sutar <[hidden email]> wrote:
I have an OLAP style aggregated dataset for which I have many dimensions (in example below I demonstrate with only two, Dim1 and Dim2) with a final dimension being time (Month in example below).

I want to force/unroll for ever unique combination of Dim1 and Dim2, every month present in the data. Discarding the scenario that there would be a missing month altogether from all aggregated combinations of Dim1 and Dim2, assume the dataset would contain all the time points of relevance  - at a total level but not perhaps for any individual Dim1*Dim2 level. Andy Wheeler !TimeExpand macro is great for that particular use case: https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/


I actually have a working solution to this problem as demonstrated below, but was wondering if there was an (even) more efficient/direct way of doing this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the capability of achieving this? (I tried but failed)




/ ******************************************************* / .
/* Load David Marsos' KRONEKER Matrix Macro */.
/ ******************************************************* / .
DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND).
!IF (!TAIL(!1) !NE "") !THEN
   !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET
!Result=!CONCAT(!HEAD(!1),!HEAD(!T2)).
+  COMPUTE
!Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}.
+  !K !Result !TAIL(!TAIL(!1)) /!2.
!ELSE
SAVE !1 / OUTFILE */VARIABLES !2.
!IFEND
!ENDDEFINE.

DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND).
!IF (!1 !NE "") !THEN
GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) .
!Load !TAIL(!1) / !TAIL(!2).
!IFEND.
!ENDDEFINE.

DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND).
MATRIX.
!Load !Files / !Vars .
!K !Vars  / !Vars .
END MATRIX.
!ENDDEFINE.
/ ******************************************************* / .

 
/ ******************************************************* / .
/* Demo Data */.
/ ******************************************************* / .
DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
BEGIN DATA.
A 7 1 50
A 7 2 40
A 7 3 20
A 8 1 56
A 8 2 86
A 9 1 45
B 8 2 68
B 8 3 58
B 9 1 57
END DATA.
DATASET NAME DSRaw.


/ ******************************************************* / .
/* CODE */.
/ ******************************************************* / .
DATASET ACTIVATE DSRaw.
SORT CASES BY Dim1 Dim2.
MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx.
CREATE Idx=CSUM(Idx).

/* Extract unique values for each dimension */.
DATASET ACTIVATE DSRaw.
DATASET DECLARE dsIdx.
AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N.
DATASET ACTIVATE dsIdx.
SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count.

DATASET ACTIVATE DSRaw.
DATASET DECLARE dsMonth.
AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N.
DATASET ACTIVATE dsMonth.
SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count.

!FullCat
  Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav"
  / Vars Idx Month.

DATASET NAME DSExpanded.
DATASET ACTIVATE DSRaw.
SORT CASES BY Idx Month.

DATASET ACTIVATE DSExpanded.
SORT CASES BY Idx Month.
MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month.

DATASET ACTIVATE DSExpanded.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to Dim2=MAX(Dim1 TO Dim2).
EXE.

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

David Marso
Administrator
In reply to this post by Jignesh Sutar
Build a K way AGGREGATE and then drop the result into MATRIX.
Run the KRONEKER (with the unit placeholders) for column 1 , 2-> result(1:2)
then K(result,col3) ->Result(1:3),... and so forth in a LOOP.
----
Jignesh Sutar wrote
I have an OLAP style aggregated dataset for which I have many dimensions
(in example below I demonstrate with only two, Dim1 and Dim2) with a final
dimension being time (Month in example below).

I want to force/unroll for ever unique combination of Dim1 and Dim2, every
month present in the data. Discarding the scenario that there would be a
missing month altogether from all aggregated combinations of Dim1 and Dim2,
assume the dataset would contain all the time points of relevance  - at a
total level but not perhaps for any individual Dim1*Dim2 level. Andy
Wheeler !TimeExpand macro is great for that particular use case:
https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/


I actually have a working solution to this problem as demonstrated below,
but was wondering if there was an (even) more efficient/direct way of doing
this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the
capability of achieving this? (I tried but failed)

http://spssx-discussion.1045642.n5.nabble.com/Macro-Mny2Mny-joins-and-Cartesian-products-v-2-td5724939.html

http://spssx-discussion.1045642.n5.nabble.com/Many-too-many-demo-td5724887.html


/ ******************************************************* / .
/* Load David Marsos' KRONEKER Matrix Macro */.
/ ******************************************************* / .
DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND).
!IF (!TAIL(!1) !NE "") !THEN
   !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET
!Result=!CONCAT(!HEAD(!1),!HEAD(!T2)).
+  COMPUTE
!Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}.
+  !K !Result !TAIL(!TAIL(!1)) /!2.
!ELSE
SAVE !1 / OUTFILE */VARIABLES !2.
!IFEND
!ENDDEFINE.

DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND).
!IF (!1 !NE "") !THEN
GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) .
!Load !TAIL(!1) / !TAIL(!2).
!IFEND.
!ENDDEFINE.

DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND).
MATRIX.
!Load !Files / !Vars .
!K !Vars  / !Vars .
END MATRIX.
!ENDDEFINE.
/ ******************************************************* / .


/ ******************************************************* / .
/* Demo Data */.
/ ******************************************************* / .
DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
BEGIN DATA.
A 7 1 50
A 7 2 40
A 7 3 20
A 8 1 56
A 8 2 86
A 9 1 45
B 8 2 68
B 8 3 58
B 9 1 57
END DATA.
DATASET NAME DSRaw.


/ ******************************************************* / .
/* CODE */.
/ ******************************************************* / .
DATASET ACTIVATE DSRaw.
SORT CASES BY Dim1 Dim2.
MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx.
CREATE Idx=CSUM(Idx).

/* Extract unique values for each dimension */.
DATASET ACTIVATE DSRaw.
DATASET DECLARE dsIdx.
AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N.
DATASET ACTIVATE dsIdx.
SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count.

DATASET ACTIVATE DSRaw.
DATASET DECLARE dsMonth.
AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N.
DATASET ACTIVATE dsMonth.
SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count.

!FullCat
  Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav"
  / Vars Idx Month.

DATASET NAME DSExpanded.
DATASET ACTIVATE DSRaw.
SORT CASES BY Idx Month.

DATASET ACTIVATE DSExpanded.
SORT CASES BY Idx Month.
MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month.

DATASET ACTIVATE DSExpanded.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to
Dim2=MAX(Dim1 TO Dim2).
EXE.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

David Marso
Administrator
Actually, don't aggregate K way, simply do it for each set of marginal.

David Marso wrote
Build a K way AGGREGATE and then drop the result into MATRIX.
Run the KRONEKER (with the unit placeholders) for column 1 , 2-> result(1:2)
then K(result,col3) ->Result(1:3),... and so forth in a LOOP.
----
Jignesh Sutar wrote
I have an OLAP style aggregated dataset for which I have many dimensions
(in example below I demonstrate with only two, Dim1 and Dim2) with a final
dimension being time (Month in example below).

I want to force/unroll for ever unique combination of Dim1 and Dim2, every
month present in the data. Discarding the scenario that there would be a
missing month altogether from all aggregated combinations of Dim1 and Dim2,
assume the dataset would contain all the time points of relevance  - at a
total level but not perhaps for any individual Dim1*Dim2 level. Andy
Wheeler !TimeExpand macro is great for that particular use case:
https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/


I actually have a working solution to this problem as demonstrated below,
but was wondering if there was an (even) more efficient/direct way of doing
this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the
capability of achieving this? (I tried but failed)

http://spssx-discussion.1045642.n5.nabble.com/Macro-Mny2Mny-joins-and-Cartesian-products-v-2-td5724939.html

http://spssx-discussion.1045642.n5.nabble.com/Many-too-many-demo-td5724887.html


/ ******************************************************* / .
/* Load David Marsos' KRONEKER Matrix Macro */.
/ ******************************************************* / .
DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND).
!IF (!TAIL(!1) !NE "") !THEN
   !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET
!Result=!CONCAT(!HEAD(!1),!HEAD(!T2)).
+  COMPUTE
!Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}.
+  !K !Result !TAIL(!TAIL(!1)) /!2.
!ELSE
SAVE !1 / OUTFILE */VARIABLES !2.
!IFEND
!ENDDEFINE.

DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND).
!IF (!1 !NE "") !THEN
GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) .
!Load !TAIL(!1) / !TAIL(!2).
!IFEND.
!ENDDEFINE.

DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND).
MATRIX.
!Load !Files / !Vars .
!K !Vars  / !Vars .
END MATRIX.
!ENDDEFINE.
/ ******************************************************* / .


/ ******************************************************* / .
/* Demo Data */.
/ ******************************************************* / .
DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
BEGIN DATA.
A 7 1 50
A 7 2 40
A 7 3 20
A 8 1 56
A 8 2 86
A 9 1 45
B 8 2 68
B 8 3 58
B 9 1 57
END DATA.
DATASET NAME DSRaw.


/ ******************************************************* / .
/* CODE */.
/ ******************************************************* / .
DATASET ACTIVATE DSRaw.
SORT CASES BY Dim1 Dim2.
MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx.
CREATE Idx=CSUM(Idx).

/* Extract unique values for each dimension */.
DATASET ACTIVATE DSRaw.
DATASET DECLARE dsIdx.
AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N.
DATASET ACTIVATE dsIdx.
SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count.

DATASET ACTIVATE DSRaw.
DATASET DECLARE dsMonth.
AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N.
DATASET ACTIVATE dsMonth.
SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count.

!FullCat
  Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav"
  / Vars Idx Month.

DATASET NAME DSExpanded.
DATASET ACTIVATE DSRaw.
SORT CASES BY Idx Month.

DATASET ACTIVATE DSExpanded.
SORT CASES BY Idx Month.
MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month.

DATASET ACTIVATE DSExpanded.
AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to
Dim2=MAX(Dim1 TO Dim2).
EXE.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

Jignesh Sutar
In reply to this post by David Marso
That's if you want a full Cartesian product against all dimensions Dim1, Dim2 and Month (i.e. 2*3*3=18), no? I had that working actually with your macro, which with you can do that in one scoop (albeit the overhead of having to prep each aggregated dataset for the 3 dimensions). I don't necessarily need a full Cartesian product of all dimensions individual. Only the unique dimensions of Dim1 and Dim2 that exits need to be crossed with all dates so in this example it is 5 unique Dim1, Dim2 combinations and 3 unique months so 5*3=15 final number of rows.


On 10 April 2015 at 20:13, David Marso <[hidden email]> wrote:
Build a K way AGGREGATE and then drop the result into MATRIX.
Run the KRONEKER (with the unit placeholders) for column 1 , 2-> result(1:2)
then K(result,col3) ->Result(1:3),... and so forth in a LOOP.
----

Jignesh Sutar wrote
> I have an OLAP style aggregated dataset for which I have many dimensions
> (in example below I demonstrate with only two, Dim1 and Dim2) with a final
> dimension being time (Month in example below).
>
> I want to force/unroll for ever unique combination of Dim1 and Dim2, every
> month present in the data. Discarding the scenario that there would be a
> missing month altogether from all aggregated combinations of Dim1 and
> Dim2,
> assume the dataset would contain all the time points of relevance  - at a
> total level but not perhaps for any individual Dim1*Dim2 level. Andy
> Wheeler !TimeExpand macro is great for that particular use case:
> https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/
>
>
> I actually have a working solution to this problem as demonstrated below,
> but was wondering if there was an (even) more efficient/direct way of
> doing
> this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the
> capability of achieving this? (I tried but failed)
>
> http://spssx-discussion.1045642.n5.nabble.com/Macro-Mny2Mny-joins-and-Cartesian-products-v-2-td5724939.html
>
> http://spssx-discussion.1045642.n5.nabble.com/Many-too-many-demo-td5724887.html
>
>
> / ******************************************************* / .
> /* Load David Marsos' KRONEKER Matrix Macro */.
> / ******************************************************* / .
> DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND).
> !IF (!TAIL(!1) !NE "") !THEN
>    !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET
> !Result=!CONCAT(!HEAD(!1),!HEAD(!T2)).
> +  COMPUTE
> !Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}.
> +  !K !Result !TAIL(!TAIL(!1)) /!2.
> !ELSE
> SAVE !1 / OUTFILE */VARIABLES !2.
> !IFEND
> !ENDDEFINE.
>
> DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND).
> !IF (!1 !NE "") !THEN
> GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) .
> !Load !TAIL(!1) / !TAIL(!2).
> !IFEND.
> !ENDDEFINE.
>
> DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND).
> MATRIX.
> !Load !Files / !Vars .
> !K !Vars  / !Vars .
> END MATRIX.
> !ENDDEFINE.
> / ******************************************************* / .
>
>
> / ******************************************************* / .
> /* Demo Data */.
> / ******************************************************* / .
> DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
> BEGIN DATA.
> A 7 1 50
> A 7 2 40
> A 7 3 20
> A 8 1 56
> A 8 2 86
> A 9 1 45
> B 8 2 68
> B 8 3 58
> B 9 1 57
> END DATA.
> DATASET NAME DSRaw.
>
>
> / ******************************************************* / .
> /* CODE */.
> / ******************************************************* / .
> DATASET ACTIVATE DSRaw.
> SORT CASES BY Dim1 Dim2.
> MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx.
> CREATE Idx=CSUM(Idx).
>
> /* Extract unique values for each dimension */.
> DATASET ACTIVATE DSRaw.
> DATASET DECLARE dsIdx.
> AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N.
> DATASET ACTIVATE dsIdx.
> SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count.
>
> DATASET ACTIVATE DSRaw.
> DATASET DECLARE dsMonth.
> AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N.
> DATASET ACTIVATE dsMonth.
> SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count.
>
> !FullCat
>   Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav"
>   / Vars Idx Month.
>
> DATASET NAME DSExpanded.
> DATASET ACTIVATE DSRaw.
> SORT CASES BY Idx Month.
>
> DATASET ACTIVATE DSExpanded.
> SORT CASES BY Idx Month.
> MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month.
>
> DATASET ACTIVATE DSExpanded.
> AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to
> Dim2=MAX(Dim1 TO Dim2).
> EXE.
>
> =====================
> To manage your subscription to SPSSX-L, send a message to

> LISTSERV@.UGA

>  (not to SPSSX-L), with no body text except the
> command. To leave the list, send the command
> SIGNOFF SPSSX-L
> For a list of commands to manage subscriptions, send the command
> INFO REFCARD





-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Kroneker-Cartesian-Product-Mny2Mny-Unroll-tp5729195p5729196.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

Richard Ristow
In reply to this post by Jignesh Sutar
At 02:38 PM 4/10/2015, Jignesh Sutar wrote:
>I have an OLAP style aggregated dataset for which I have many
>dimensions (in example below I demonstrate with only two, Dim1 and
>Dim2) with a final dimension being time (Month in example below).
>
>I ... was wondering if there was an (even) more efficient/direct way
>of doing this and perhaps if Richard Ristows' !Mny2Mny was designed
>to, or has the capability of achieving this?

Sorry -- I've had the updated release of !Mny2Mny hanging fire,
awaiting final testing, for many months.  I append the latest
version, which incorporates David Marso's MATRIX logic for
'unrolling' the catalog of records to completely match the set of
records in the output.

There's rudimentary documentation at the head of the file. In
general, I've tracked the behavior of MATCH FILES as closely as I
could; however, !Mny2Mny only works with exactly two input
files.  (If there is a matching key of one or more variables, both
inputs must be sorted on that key.)

This version doesn't have OMS code to suppress printed output from MATRIX.

Let me know of any problems in functioning or documentation.

-Best of luck,
  Richard
=========================
APPENDIX:  Macro !Mny2Mny
=========================
*  ****************************************************************  .
*  Macro for many-to-many merge of two datasets or .SAV files.       .
*  Richard Ristow, March 2014                                        .
*  Contact: [hidden email]                                  .
*                                                                    .
*  The technique of representing a set of consecutive records in a   .
*  file by the largest value of a sequence number was introducted to .
*  the author by David Marso. David also wrote code using MATRIX to  .
*  'unroll' a record containing a pair of such numbers into the set  .
*  of all pairs of sequence numbers represented, which has been      .
*  adapted here.                                                     .
*  ================================================================= .
*                                                                    .
*  All arguments are terminated by '/'.                              .
*                                                                    .
*  Required arguments:                                               .
*  LeftIn   Name of one of the datasets, or saved files, to be joined.
*           (referred to as the "Left input");  may be '*'.          .
*  RightIn  Name of the other  dataset,  or saved file,  to be joined.
*           (referred to as the "Right input"); may not be '*'.      .
*                                                                    .
*  Optional arguments for the join operation:                        .
*  BYvar    Variable or variables on which the inputs are to be      .
*           joined; if not specified, the result will be a Cartesian .
*           product, matching every record from each input to every  .
*           record from the other.  Requirements for BYvar are the   .
*           same as as for BY variables in MATCH FILES:              .
*           all variables must be present and of the same type in    .
*           both inputs; and both inputs must be in ascending order  .
*           on the variable(s).                                      .
*  Join     Type of join (relevant only if BYvar is specified):      .
*           Outer (default), LeftOuter, RightOuter, or Inner --      .
*        .  Outer        Output includes contents of all records     .
*                        from both inputs                            .
*        .  Left Outer   Output includes contents of all records     .
*        or LeftOuter    from the Left input, but only records from  .
*        or LOuter       the Right input that match some left-input  .
*                        record                                      .
*        .  Right Outer  Output includes contents of all records     .
*        or RightOuter   from the Right input, but only records from .
*        or ROuter       the Left input that match some right-input  .
*                        record                                      .
*        .  Inner        Output contains only records from each      .
*                        input that match some record in the other   .
*  Left#    Name of variable to hold sequence numbers of Left  input .
*           records, within values of BYvar. Name should not occur   .
*           in either input file.  Default: Left#                    .
*  Right#   Name of variable to hold sequence numbers of Right input .
*           records, within values of BYvar. Name should not occur   .
*           in either input file.  Default: Right#                   .
*                                                                    .
*  Optional arguments for the inputs:                                .
*  LRename  'RENAME' to be applied to the Left  input                .
*  LDrop    'DROP'   to be applied to the Left  input                .
*  LKeep    'KEEP'   to be applied to the Left  input (default: ALL) .
*  RRename  'RENAME' to be applied to the Right input                .
*  RDrop    'DROP'   to be applied to the Right input                .
*  RKeep    'KEEP'   to be applied to the Right input (default: ALL) .
*                                                                    .
*  Optional argument  for the output:                                .
*  DSname   Name of the output, joined dataset. The joined dataset   .
*           will always be active, but will be unnamed unless        .
*           DSname is specified.                                     .
*  ................................................................  .
*  Reserved names:                                                   .
*                                                                    .
*  By default, variables Left# and Right# are created, and retained  .
*  in the output. (These variables may be given other names by       .
*  specifying macro arguments Left#= and Right#=)                    .
*                                                                    .
*  Variables @Mny2Mny_NoBreak, @Mny2Mny_NewGrp                       .
*            @Mny2Mny_LMaxRec  @Mny2Mny_RMaxRec                      .
*            @Mny2Mny_Grp#     @Mny2Mny_GrpRecs                      .
*        are created, used, and discarded.                           .
*                                                                    .
*  Datasets  @Mny2Mny_Left#d   @Mny2Mny_Right#d                      .
*            @Mny2Mny_Left##   @Mny2Mny_GrpSmry                      .
*        are created, used, and discarded.                           .
*  ****************************************************************  .

DEFINE !Mny2Mny(LeftIn  = !CHAREND('/')
                /RightIn = !CHAREND('/')
                /BYvar   = !CHAREND('/')
                /LRename = !CHAREND('/')
                /LDrop   = !CHAREND('/')
                /LKeep   = !CHAREND('/') !DEFAULT(ALL)
                /RRename = !CHAREND('/')
                /RDrop   = !CHAREND('/')
                /RKeep   = !CHAREND('/') !DEFAULT(ALL)
                /Join    = !CHAREND('/') !DEFAULT(Outer)
                /Left#   = !CHAREND('/') !DEFAULT(Left#)
                /Right#  = !CHAREND('/') !DEFAULT(Right#)
                /DSname  = !CHAREND('/'))

    ECHO " ".
    ECHO !QUOTE(!CONCAT('Macro !','Mny2Mny:')).

*  .......     Section I:   Input parsing and error checking   ....... .
    !LET !Error = No

*  .......     I.A:         Required inputs                    ....... .

    !IF (   !LeftIn  !EQ !NULL
        !OR !RightIn !EQ !NULL) !THEN
        !LET !Error = Yes
        ECHO "Arguments LeftIn and RightIn (the two inputs)".
        ECHO "      are required.".
        !IF  (!LeftIn  !NE !NULL) !THEN
        ECHO !QUOTE(!CONCAT("      (Left input specified: ",
                                   !LeftIn,')')).
        !IFEND
        !IF  (!RightIn !NE !NULL) !THEN
        ECHO !QUOTE(!CONCAT("      (Right input specified: ",
                                   !RightIn,')')).
        !IFEND
    !IFEND

    !IF (   !RightIn !EQ '*') !THEN
        !LET !Error = Yes
        ECHO "The Right input may not be '*'. (The Left input may be)".
    !IFEND

*  .......     I.B:         One-token inputs                   ....... .

    !IF (!TAIL(!Left#)  !NE !NULL) !THEN
        !LET !Error = Yes
        ECHO "'Left#', if specified, must be a single token".
        ECHO !QUOTE(!CONCAT('(Entered: ','Left#,')')).
    !IFEND

    !IF (!TAIL(!Right#) !NE !NULL) !THEN
        !LET !Error = Yes
        ECHO "'Right#', if specified, must be a single token".
        ECHO !QUOTE(!CONCAT('(Entered: ','Right#,')')).
    !IFEND

    !IF (!TAIL(!DSname) !NE !NULL) !THEN
        !LET !Error = Yes
        ECHO "'DSname', if specified, must be a single token".
        ECHO !QUOTE(!CONCAT('(Entered: ',!DSname,')')).
    !IFEND

*  .......     I.C:         Join type                          ....... .

    !IF (!UPCASE(!Join) !EQ 'OUTER') !THEN
       !LET !JType = 'OUTER'
       !LET !JMsg  = 'Outer join'
    !IFEND
    !IF (!UPCASE(!Join) !EQ 'LEFT OUTER') !THEN
       !LET !JType = 'LEFTOUT'
       !LET !JMsg  = 'Left outer join'
    !IFEND
    !IF (!UPCASE(!Join) !EQ 'LEFTOUTER') !THEN
       !LET !JType = 'LEFTOUT'
       !LET !JMsg  = 'Left outer join'
    !IFEND
    !IF (!UPCASE(!Join) !EQ 'LOUTER') !THEN
       !LET !JType = 'LEFTOUT'
       !LET !JMsg  = 'Left outer join'
    !IFEND

    !IF (!UPCASE(!Join) !EQ 'RIGHT OUTER') !THEN
       !LET !JType = 'RIGHTOUT'
       !LET !JMsg  = 'Right outer join'
    !IFEND
    !IF (!UPCASE(!Join) !EQ 'RIGHTOUTER') !THEN
       !LET !JType = 'RIGHTOUT'
       !LET !JMsg  = 'Right outer join'
    !IFEND
    !IF (!UPCASE(!Join) !EQ 'ROUTER') !THEN
       !LET !JType = 'RIGHTOUT'
       !LET !JMsg  = 'Right outer join'
    !IFEND

    !IF (!UPCASE(!Join) !EQ 'INNER') !THEN
       !LET !JType = 'INNER'
       !LET !JMsg  = 'Inner join'
    !IFEND

    !IF (!BYvar         !EQ !NULL)   !THEN
       !LET !JType = 'CARTPROD'
       !LET !JMsg  = 'Cartesian product'
    !IFEND

    !IF (!JType !EQ !NULL) !THEN
        !LET !Error = Yes
        ECHO "Invalid join type:".
        ECHO !QUOTE(!CONCAT('    Join=',!Join)).
    !IFEND


*  .......     Section II:  Informational messages            ........ .
    !IF (!Error !EQ No) !THEN
        ECHO !QUOTE(!CONCAT(!JMsg,' of')).
        ECHO !QUOTE(!CONCAT('   ',!LeftIn)).
        !IF (        !LRename !NE !NULL) !THEN
           ECHO !QUOTE(!CONCAT('      /RENAME=',!LRename))
        !IFEND
        !IF (        !LDrop   !NE !NULL) !THEN
           ECHO !QUOTE(!CONCAT('      /DROP  =',!LDrop))
        !IFEND
        !IF (!UPCASE(!LKeep)  !NE 'ALL') !THEN
           ECHO !QUOTE(!CONCAT('      /KEEP  =',!BYVar,' ',!LKeep))
        !IFEND

        ECHO !QUOTE(!CONCAT('to ',!RightIn)).
        !IF (        !RRename !NE !NULL) !THEN
           ECHO !QUOTE(!CONCAT('      /RENAME=',!RRename))
        !IFEND
        !IF (        !RDrop   !NE !NULL) !THEN
           ECHO !QUOTE(!CONCAT('      /DROP  =',!RDrop))
        !IFEND
        !IF (!UPCASE(!RKeep)  !NE 'ALL') !THEN
           ECHO !QUOTE(!CONCAT('      /KEEP  =',!BYVar,' ',!RKeep))
        !IFEND
        !IF (!BYvar  !NE !NULL)          !THEN
           ECHO !QUOTE(!CONCAT('by ',!BYvar)).
        !IFEND
        !IF (!DSname !NE !NULL)          !THEN
           ECHO !QUOTE(!CONCAT('Output dataset named ',!DSname)).
        !IFEND
      !IFEND

*  .......     Section III: Processing                          ....... .
*  .......     III.A    Copies of the inputs, adding record     ....... .
*                       sequence numbers (within BYvar values)  ....... .
    !IF (!Error !EQ No) !THEN

*  .......     III.A.1  Left  input -> dataset @Mny2Mny_Left#d  ....... .
        !IF (!BYvar !EQ !NULL)
        !THEN
        * ... Left  input, when there are no BYvar vars  ... .
        ADD FILES
           /FILE=!Leftin
          !IF (!LRename !NE !NULL) !THEN
           /RENAME = !LRename
          !IFEND
          !IF (!LDrop   !NE !NULL) !THEN
           /DROP   = !LDrop
          !IFEND
           /KEEP   = !BYvar !LKeep.

        COMPUTE @Mny2Mny_NewGrp  = ($CASENUM EQ 1).
        COMPUTE !Left#           =  $CASENUM.

        !ELSE
        * ... Left  input, when there are    BYvar vars  ... .
        ADD FILES
           /FILE=!Leftin
          !IF (!LRename !NE !NULL) !THEN
           /RENAME = !LRename
          !IFEND
          !IF (!LDrop   !NE !NULL) !THEN
           /DROP   = !LDrop
          !IFEND
           /KEEP   = !BYvar !LKeep
           /BY       !BYvar
           /FIRST  = @Mny2Mny_NewGrp.

         DO IF    @Mny2Mny_NewGrp.
         .  COMPUTE !Left# = 1.
         ELSE.
         .  COMPUTE !Left# = LAG(!Left#) + 1.
         END IF.
         !IFEND

        * ... Left  input, code common to both cases     ... .

        COMPUTE        @Mny2Mny_NoBreak = 1.
        FORMATS        @Mny2Mny_NoBreak
                       @Mny2Mny_NewGrp (F2).
        FORMATS        !Left#          (F4).
        VARIABLE WIDTH !Left#          (6).

        DATASET NAME      @Mny2Mny_Left#d WINDOW=ASIS.
        *  ... End of processing Left  input .
        /*-- LIST /*-*/.

*  .......     III.A.2  Right input -> dataset @Mny2Mny_Right#d ....... .

        !IF (!BYvar !EQ !NULL)
        !THEN
        * ... Right input, when there are no BYvar vars ... .
        ADD FILES
           /FILE=!RightIn
        !IF (!RRename !NE !NULL) !THEN
           /RENAME = !RRename
        !IFEND
        !IF (!RDrop   !NE !NULL) !THEN
           /DROP   = !RDrop
        !IFEND
           /KEEP   = !BYvar !RKeep.

        COMPUTE @Mny2Mny_NewGrp  = ($CASENUM EQ 1).
        COMPUTE !Right#          =  $CASENUM.

        !ELSE
        * ... Right input, when there are    BYvar vars ... .
        ADD FILES
           /FILE=!RightIn
        !IF (!RRename !NE !NULL) !THEN
           /RENAME = !RRename
        !IFEND
        !IF (!RDrop   !NE !NULL) !THEN
           /DROP   = !RDrop
        !IFEND
           /KEEP   = !BYvar !RKeep
           /BY !BYvar
           /FIRST  = @Mny2Mny_NewGrp.

        DO IF    @Mny2Mny_NewGrp.
        .  COMPUTE !Right# = 1.
        ELSE.
        .  COMPUTE !Right# = LAG(!Right#) + 1.
        END IF.
        !IFEND

        * ... Right input, code common to both cases    ... .

        COMPUTE        @Mny2Mny_NoBreak = 1.
        FORMATS        @Mny2Mny_NoBreak
                       @Mny2Mny_NewGrp (F2).
        FORMATS        !Right#         (F4).
        VARIABLE WIDTH !Right#         (6).

        DATASET NAME      @Mny2Mny_Right#d WINDOW=ASIS.
        *  ... End of processing Right input .
        /*-- LIST /*-*/.

*  .......     III.B  Build 'spine' of all keys for all records ....... .
*  .......     III.B.1  Combine record counts from both inputs  ....... .

        DATASET ACTIVATE  @Mny2Mny_Left#d  WINDOW=ASIS.
        DATASET DECLARE   @Mny2Mny_Left##.
        AGGREGATE OUTFILE=@Mny2Mny_Left##
           /BREAK = @Mny2Mny_NoBreak !BYvar
           /@Mny2Mny_LMaxRec = MAX(!Left#).


        DATASET ACTIVATE  @Mny2Mny_Right#d WINDOW=ASIS.
        AGGREGATE OUTFILE=*
           /BREAK   = @Mny2Mny_NoBreak !BYvar
           /@Mny2Mny_RMaxRec = MAX(!Right#).

        MATCH FILES
           /FILE    = @Mny2Mny_Left##
           /FILE    = *
           /BY @Mny2Mny_NoBreak !BYvar.

        RECODE @Mny2Mny_LMaxRec
               @Mny2Mny_RMaxRec (MISSING=0).

        DATASET NAME  @Mny2Mny_GrpSmry.

        /*-- LIST /*-*/.

*  .......     III.B.2  Select, based on join type              ....... .
       !IF  (!Jtype !EQ 'INNER') !THEN
        SELECT IF (@Mny2Mny_LMaxRec GT 0
               AND @Mny2Mny_RMaxRec GT 0).
       !IFEND

       !IF  (!Jtype !EQ 'LEFTOUT') !THEN
        SELECT IF (@Mny2Mny_RMaxRec GT 0).
       !IFEND

       !IF  (!Jtype !EQ 'RIGHTOUT') !THEN
        SELECT IF (@Mny2Mny_LMaxRec GT 0).
       !IFEND

        * ..................................... .
        * For OUTER join and Cartesian product, .
        * no selection is necessary.            .
        * ..................................... .

        /*--  LIST /*-*/.

*  .......     III.B.3  Expand the counts to get list of        ........ .
*                       pairs of record numbers from the        ........ .
*                       groups, by group number                 ........ .

*      @Mny2Mny_Grp# is a sequence number for record groups     ........ .
*      (defined by sets of !BYvar values). It serves as a       ........ .
*      surrogate group identifier within MATRIX, which cannot   ........ .
*      in general accept the !BYvar values directly -- there    ........ .
*      may be several of them, and they may be strings.         ........ .

        COMPUTE @Mny2Mny_Grp#    = $CASENUM.

*      @Mny2Mny_GrpRecs is the number of records that will be   ........ .
*      in the final output, for each group.                     ........ .

        COMPUTE @Mny2Mny_GrpRecs = MAX(@Mny2Mny_LMaxRec,1)
                                  *MAX(@Mny2Mny_RMaxRec,1).

*  .......     MATRIX code by David Marso, adapted here.        ........ .
*  ... OMS  /SELECT TABLES
             /IF COMMANDS = ["Matrix"]
                 SUBTYPES = ["Notes"]
             /DESTINATION   VIEWER = NO
             /TAG = "ZapMATRIX".

        MATRIX.
        +  GET GrpSmry
           /FILE=*
           /VARIABLES=@Mny2Mny_Grp#
                      @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec
                      @Mny2Mny_GrpRecs
           /NAMES=GSnames.

        +  LOOP Group#=1 TO NROW(GrpSmry).

        +     COMPUTE LeftNums=T({1:MMAX({GrpSmry(Group#,2),1})}).
        +     DO IF (GrpSmry(Group#,2) EQ 0).
        +        COMPUTE LeftNums = 0*LeftNums.
        +     END IF.

        +     COMPUTE RghtNums=T({1:MMAX({GrpSmry(Group#,3),1})}).
        +     DO IF (GrpSmry(Group#,3) EQ 0).
        +        COMPUTE RghtNums = 0*RghtNums.
        +     END IF.

        +     SAVE ({MAKE(GrpSmry(Group#,4),1,GrpSmry(Group#,1)),
                     KRONEKER(MAKE(NROW(RghtNums),1,1),LeftNums),
                     KRONEKER(RghtNums,MAKE(NROW(LeftNums),1,1)) })
               / OUTFILE *
               / VARIABLES @Mny2Mny_Grp# !Left# !Right#.

        +  END LOOP.
        END MATRIX.

        FORMATS @Mny2Mny_Grp# !Left# !Right# (F4).
        VARIABLE WIDTH        !Left# !Right# (6).
        /*--  LIST /*-*/.

*  .......     III.B.4  Combine with summary file by group key  ........ .
*                       to get the 'spine': list of all pairs   ........ .
*                       of record numbers by group key.         ........ .

        MATCH FILES
           /TABLE=@Mny2Mny_GrpSmry
           /FILE =*
           /BY    @Mny2Mny_Grp#
           /DROP  @Mny2Mny_GrpRecs
                  @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec.

        /*-- DATASET COPY Spine      /*-*/.

*  .......     III.C  Add the substantive data                  ....... .

        MATCH FILES
           /FILE =*
           /TABLE=@Mny2Mny_Right#d
           /BY   !BYvar !Right#
           /DROP  @Mny2Mny_Grp#
                  @Mny2Mny_NoBreak @Mny2Mny_NewGrp.

        SORT CASES
               BY !BYvar !Left# !Right#.

        MATCH FILES
           /TABLE=@Mny2Mny_Left#d
           /FILE =*
           /BY   !BYvar !Left#
           /DROP  @Mny2Mny_NoBreak @Mny2Mny_NewGrp
           /KEEP !BYvar !Left# !Right# ALL.

        EXECUTE /* so input datasets can be dropped */.

        !IF (!DSname !NE !NULL) !THEN
           DATASET NAME !DSname WINDOW=FRONT.
        !IFEND
        DATASET CLOSE @Mny2Mny_Left#d.
        DATASET CLOSE @Mny2Mny_Right#d.
        DATASET CLOSE @Mny2Mny_Left##.
        DATASET CLOSE @Mny2Mny_GrpSmry.

    !IFEND
!ENDDEFINE.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

Maguin, Eugene
In reply to this post by Jignesh Sutar

This is late in the history of this thread but would you (Jignesh) specify what the unrolled dataset would look like because I don’t see it in the example data you offer. As I understand Richard’s use of the term, I would expect to see the un-unrolled record with repeated data on the measure variable, e.g., measure.1 to measure.n.

Gene Maguin

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jignesh Sutar
Sent: Friday, April 10, 2015 2:38 PM
To: [hidden email]
Subject: Kroneker / Cartesian Product / !Mny2Mny / Unroll

 

I have an OLAP style aggregated dataset for which I have many dimensions (in example below I demonstrate with only two, Dim1 and Dim2) with a final dimension being time (Month in example below).

 

I want to force/unroll for ever unique combination of Dim1 and Dim2, every month present in the data. Discarding the scenario that there would be a missing month altogether from all aggregated combinations of Dim1 and Dim2, assume the dataset would contain all the time points of relevance  - at a total level but not perhaps for any individual Dim1*Dim2 level. Andy Wheeler !TimeExpand macro is great for that particular use case: https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/

 

 

I actually have a working solution to this problem as demonstrated below, but was wondering if there was an (even) more efficient/direct way of doing this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the capability of achieving this? (I tried but failed)

 

 

 

 

/ ******************************************************* / .

/* Load David Marsos' KRONEKER Matrix Macro */.

/ ******************************************************* / .

DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND).

!IF (!TAIL(!1) !NE "") !THEN

   !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET

!Result=!CONCAT(!HEAD(!1),!HEAD(!T2)).

+  COMPUTE

!Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}.

+  !K !Result !TAIL(!TAIL(!1)) /!2.

!ELSE

SAVE !1 / OUTFILE */VARIABLES !2.

!IFEND

!ENDDEFINE.

 

DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND).

!IF (!1 !NE "") !THEN

GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) .

!Load !TAIL(!1) / !TAIL(!2).

!IFEND.

!ENDDEFINE.

 

DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND).

MATRIX.

!Load !Files / !Vars .

!K !Vars  / !Vars .

END MATRIX.

!ENDDEFINE.

/ ******************************************************* / .

 

 

/ ******************************************************* / .

/* Demo Data */.

/ ******************************************************* / .

DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).

BEGIN DATA.

A 7 1 50

A 7 2 40

A 7 3 20

A 8 1 56

A 8 2 86

A 9 1 45

B 8 2 68

B 8 3 58

B 9 1 57

END DATA.

DATASET NAME DSRaw.

 

 

/ ******************************************************* / .

/* CODE */.

/ ******************************************************* / .

DATASET ACTIVATE DSRaw.

SORT CASES BY Dim1 Dim2.

MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx.

CREATE Idx=CSUM(Idx).

 

/* Extract unique values for each dimension */.

DATASET ACTIVATE DSRaw.

DATASET DECLARE dsIdx.

AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N.

DATASET ACTIVATE dsIdx.

SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count.

 

DATASET ACTIVATE DSRaw.

DATASET DECLARE dsMonth.

AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N.

DATASET ACTIVATE dsMonth.

SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count.

 

!FullCat

  Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav"

  / Vars Idx Month.

 

DATASET NAME DSExpanded.

DATASET ACTIVATE DSRaw.

SORT CASES BY Idx Month.

 

DATASET ACTIVATE DSExpanded.

SORT CASES BY Idx Month.

MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month.

 

DATASET ACTIVATE DSExpanded.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to Dim2=MAX(Dim1 TO Dim2).

EXE.

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

Jignesh Sutar
Hi Eugene,

In this thread I posted a solution to what the unrolled dataset would look like. Is my use of the term "unrolled" incorrect to what it has been refereed to as previously? Apologies, if so.



You should notice that in the combination of

Dim1="A" and Dim2=8 ---> Month=3 is missing. 
Dim1="A" and Dim2=9 ---> Month=2 and 3 are missing.
Dim1="B" and Dim2=8 ---> Month=1 is missing.
Dim1="B" and Dim2=1 ---> Month=2 and 3 are missing.

However, the final solution expands/unrolls these dates. This is what I was trying to achieve.



On 13 April 2015 at 15:43, Maguin, Eugene <[hidden email]> wrote:

This is late in the history of this thread but would you (Jignesh) specify what the unrolled dataset would look like because I don’t see it in the example data you offer. As I understand Richard’s use of the term, I would expect to see the un-unrolled record with repeated data on the measure variable, e.g., measure.1 to measure.n.

Gene Maguin

 

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Jignesh Sutar
Sent: Friday, April 10, 2015 2:38 PM
To: [hidden email]
Subject: Kroneker / Cartesian Product / !Mny2Mny / Unroll

 

I have an OLAP style aggregated dataset for which I have many dimensions (in example below I demonstrate with only two, Dim1 and Dim2) with a final dimension being time (Month in example below).

 

I want to force/unroll for ever unique combination of Dim1 and Dim2, every month present in the data. Discarding the scenario that there would be a missing month altogether from all aggregated combinations of Dim1 and Dim2, assume the dataset would contain all the time points of relevance  - at a total level but not perhaps for any individual Dim1*Dim2 level. Andy Wheeler !TimeExpand macro is great for that particular use case: https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/

 

 

I actually have a working solution to this problem as demonstrated below, but was wondering if there was an (even) more efficient/direct way of doing this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the capability of achieving this? (I tried but failed)

 

 

 

 

/ ******************************************************* / .

/* Load David Marsos' KRONEKER Matrix Macro */.

/ ******************************************************* / .

DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND).

!IF (!TAIL(!1) !NE "") !THEN

   !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET

!Result=!CONCAT(!HEAD(!1),!HEAD(!T2)).

+  COMPUTE

!Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}.

+  !K !Result !TAIL(!TAIL(!1)) /!2.

!ELSE

SAVE !1 / OUTFILE */VARIABLES !2.

!IFEND

!ENDDEFINE.

 

DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND).

!IF (!1 !NE "") !THEN

GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) .

!Load !TAIL(!1) / !TAIL(!2).

!IFEND.

!ENDDEFINE.

 

DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND).

MATRIX.

!Load !Files / !Vars .

!K !Vars  / !Vars .

END MATRIX.

!ENDDEFINE.

/ ******************************************************* / .

 

 

/ ******************************************************* / .

/* Demo Data */.

/ ******************************************************* / .

DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).

BEGIN DATA.

A 7 1 50

A 7 2 40

A 7 3 20

A 8 1 56

A 8 2 86

A 9 1 45

B 8 2 68

B 8 3 58

B 9 1 57

END DATA.

DATASET NAME DSRaw.

 

 

/ ******************************************************* / .

/* CODE */.

/ ******************************************************* / .

DATASET ACTIVATE DSRaw.

SORT CASES BY Dim1 Dim2.

MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx.

CREATE Idx=CSUM(Idx).

 

/* Extract unique values for each dimension */.

DATASET ACTIVATE DSRaw.

DATASET DECLARE dsIdx.

AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N.

DATASET ACTIVATE dsIdx.

SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count.

 

DATASET ACTIVATE DSRaw.

DATASET DECLARE dsMonth.

AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N.

DATASET ACTIVATE dsMonth.

SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count.

 

!FullCat

  Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav"

  / Vars Idx Month.

 

DATASET NAME DSExpanded.

DATASET ACTIVATE DSRaw.

SORT CASES BY Idx Month.

 

DATASET ACTIVATE DSExpanded.

SORT CASES BY Idx Month.

MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month.

 

DATASET ACTIVATE DSExpanded.

AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to Dim2=MAX(Dim1 TO Dim2).

EXE.

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

Andy W
In reply to this post by Jignesh Sutar
I can't say whether it is more efficient, but I updated that code to "fill in" time periods for panel datasets. I've posted the updated code at https://dl.dropboxusercontent.com/s/i5cii0zdw78tszc/TimeExpand_Macros.sps?dl=0. Here is an example for your dataset.

******************************.
*Using the PanelFillDate macro, available at https://dl.dropbox.com/s/i5cii0zdw78tszc/TimeExpand_Macros.sps?dl=0.
DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
BEGIN DATA.
A 7 1 50
A 7 2 40
A 7 3 20
A 8 1 56
A 8 2 86
A 9 1 45
B 8 2 68
B 8 3 58
B 9 1 57
END DATA.
DATASET NAME DSRaw.

*Needs an actual date variable, and the ID needs to be a number.
COMPUTE MonthT = DATE.MDY(Month,1,2014).
STRING Uni (A20).
COMPUTE Uni = CONCAT(Dim1,"_|_",STRING(Dim2,F1.0)).
AUTORECODE Uni /INTO Uni_#.

!PanelFillDate OrigData = DSRaw Id = Uni_# DateVar = MonthT Begin = "DATE.MDY(1,1,2014)" End = "DATE.MDY(12,1,2014)" Type = "Months".
*Cleaning up extraneous variables.
MATCH FILES FILE = * /DROP Uni Uni_#.
******************************.

Though this goes the extra step and adds in the empty months.
Andy W
apwheele@gmail.com
http://andrewpwheeler.wordpress.com/
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

Jignesh Sutar
That's very similar to my CASESTOVARS (but I convert the time period rather than ID as you do) and then VARSTOCASES. I like how you use INPUT PROGRAM to explicitly generate the time points, neat!


On 13 April 2015 at 18:29, Andy W <[hidden email]> wrote:
I can't say whether it is more efficient, but I updated that code to "fill
in" time periods for panel datasets. I've posted the updated code at
https://dl.dropboxusercontent.com/s/i5cii0zdw78tszc/TimeExpand_Macros.sps?dl=0.
Here is an example for your dataset.

******************************.
*Using the PanelFillDate macro, available at
https://dl.dropbox.com/s/i5cii0zdw78tszc/TimeExpand_Macros.sps?dl=0.
DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0).
BEGIN DATA.
A 7 1 50
A 7 2 40
A 7 3 20
A 8 1 56
A 8 2 86
A 9 1 45
B 8 2 68
B 8 3 58
B 9 1 57
END DATA.
DATASET NAME DSRaw.

*Needs an actual date variable, and the ID needs to be a number.
COMPUTE MonthT = DATE.MDY(Month,1,2014).
STRING Uni (A20).
COMPUTE Uni = CONCAT(Dim1,"_|_",STRING(Dim2,F1.0)).
AUTORECODE Uni /INTO Uni_#.

!PanelFillDate OrigData = DSRaw Id = Uni_# DateVar = MonthT Begin =
"DATE.MDY(1,1,2014)" End = "DATE.MDY(12,1,2014)" Type = "Months".
*Cleaning up extraneous variables.
MATCH FILES FILE = * /DROP Uni Uni_#.
******************************.

Though this goes the extra step and adds in the empty months.



-----
Andy W
[hidden email]
http://andrewpwheeler.wordpress.com/
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Kroneker-Cartesian-Product-Mny2Mny-Unroll-tp5729195p5729220.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Reply | Threaded
Open this post in threaded view
|

Re: Kroneker / Cartesian Product / !Mny2Mny / Unroll

David Marso
Administrator
DATA LIST /DATE1 (ADATE).
BEGIN DATA
01/20/2009
10/30/2010
END DATA.
MATRIX.
GET Dates / FILE *.
SAVE T({dates(1,1):dates(2,1):86400}) /OUTFILE * /VARIABLES=datefill.
END MATRIX.
FORMATS datefill (ADATE).
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"