I have an OLAP style aggregated dataset for which I have many dimensions (in example below I demonstrate with only two, Dim1 and Dim2) with a final dimension being time (Month in example below). I want to force/unroll for ever unique combination of Dim1 and Dim2, every month present in the data. Discarding the scenario that there would be a missing month altogether from all aggregated combinations of Dim1 and Dim2, assume the dataset would contain all the time points of relevance - at a total level but not perhaps for any individual Dim1*Dim2 level. Andy Wheeler !TimeExpand macro is great for that particular use case: https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/ I actually have a working solution to this problem as demonstrated below, but was wondering if there was an (even) more efficient/direct way of doing this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the capability of achieving this? (I tried but failed) / ******************************************************* / . /* Load David Marsos' KRONEKER Matrix Macro */. / ******************************************************* / . DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND). !IF (!TAIL(!1) !NE "") !THEN !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET !Result=!CONCAT(!HEAD(!1),!HEAD(!T2)). + COMPUTE !Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}. + !K !Result !TAIL(!TAIL(!1)) /!2. !ELSE SAVE !1 / OUTFILE */VARIABLES !2. !IFEND !ENDDEFINE. DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND). !IF (!1 !NE "") !THEN GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) . !Load !TAIL(!1) / !TAIL(!2). !IFEND. !ENDDEFINE. DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND). MATRIX. !Load !Files / !Vars . !K !Vars / !Vars . END MATRIX. !ENDDEFINE. / ******************************************************* / . / ******************************************************* / . /* Demo Data */. / ******************************************************* / . DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0). BEGIN DATA. A 7 1 50 A 7 2 40 A 7 3 20 A 8 1 56 A 8 2 86 A 9 1 45 B 8 2 68 B 8 3 58 B 9 1 57 END DATA. DATASET NAME DSRaw. / ******************************************************* / . /* CODE */. / ******************************************************* / . DATASET ACTIVATE DSRaw. SORT CASES BY Dim1 Dim2. MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx. CREATE Idx=CSUM(Idx). /* Extract unique values for each dimension */. DATASET ACTIVATE DSRaw. DATASET DECLARE dsIdx. AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N. DATASET ACTIVATE dsIdx. SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count. DATASET ACTIVATE DSRaw. DATASET DECLARE dsMonth. AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N. DATASET ACTIVATE dsMonth. SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count. !FullCat Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav" / Vars Idx Month. DATASET NAME DSExpanded. DATASET ACTIVATE DSRaw. SORT CASES BY Idx Month. DATASET ACTIVATE DSExpanded. SORT CASES BY Idx Month. MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month. DATASET ACTIVATE DSExpanded. AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to Dim2=MAX(Dim1 TO Dim2). EXE. |
Actually a more simpler approach given below, using CASESTOVARS and then VARSTOCASES. I could use SPSSINC SELECT VARIABLES to generate the variable list for the last AGGREGATE and V2C (to dynamically capture varying number of month periods). Apart from that, can anyone anticipate any flaws taking this approach over perhaps the Cartesian product? Cartesian product seems to be very long winded. / ******************************************************* / . /* Demo Data */. / ******************************************************* / . DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0). BEGIN DATA. A 7 1 50 A 7 2 40 A 7 3 20 A 8 1 56 A 8 2 86 A 9 1 45 B 8 2 68 B 8 3 58 B 9 1 57 END DATA. DATASET NAME DSRaw. / ******************************************************* / . /* CODE */. / ******************************************************* / . SORT CASES BY Dim1 Dim2. COMPUTE Month2=Month. CASESTOVARS /ID=Dim1 Dim2 /AUTOFIX=NO /INDEX=Month. AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /Month2.1 to Month2.3=MAX(Month2.1 to Month2.3). VARSTOCASES /MAKE Month FROM Month2.1 to Month2.3 /MAKE Measure from Measure.1 to Measure.3. On 10 April 2015 at 19:38, Jignesh Sutar <[hidden email]> wrote:
|
Administrator
|
In reply to this post by Jignesh Sutar
Build a K way AGGREGATE and then drop the result into MATRIX.
Run the KRONEKER (with the unit placeholders) for column 1 , 2-> result(1:2) then K(result,col3) ->Result(1:3),... and so forth in a LOOP. ----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
Actually, don't aggregate K way, simply do it for each set of marginal.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by David Marso
That's if you want a full Cartesian product against all dimensions Dim1, Dim2 and Month (i.e. 2*3*3=18), no? I had that working actually with your macro, which with you can do that in one scoop (albeit the overhead of having to prep each aggregated dataset for the 3 dimensions). I don't necessarily need a full Cartesian product of all dimensions individual. Only the unique dimensions of Dim1 and Dim2 that exits need to be crossed with all dates so in this example it is 5 unique Dim1, Dim2 combinations and 3 unique months so 5*3=15 final number of rows. On 10 April 2015 at 20:13, David Marso <[hidden email]> wrote: Build a K way AGGREGATE and then drop the result into MATRIX. |
In reply to this post by Jignesh Sutar
At 02:38 PM 4/10/2015, Jignesh Sutar wrote:
>I have an OLAP style aggregated dataset for which I have many >dimensions (in example below I demonstrate with only two, Dim1 and >Dim2) with a final dimension being time (Month in example below). > >I ... was wondering if there was an (even) more efficient/direct way >of doing this and perhaps if Richard Ristows' !Mny2Mny was designed >to, or has the capability of achieving this? Sorry -- I've had the updated release of !Mny2Mny hanging fire, awaiting final testing, for many months. I append the latest version, which incorporates David Marso's MATRIX logic for 'unrolling' the catalog of records to completely match the set of records in the output. There's rudimentary documentation at the head of the file. In general, I've tracked the behavior of MATCH FILES as closely as I could; however, !Mny2Mny only works with exactly two input files. (If there is a matching key of one or more variables, both inputs must be sorted on that key.) This version doesn't have OMS code to suppress printed output from MATRIX. Let me know of any problems in functioning or documentation. -Best of luck, Richard ========================= APPENDIX: Macro !Mny2Mny ========================= * **************************************************************** . * Macro for many-to-many merge of two datasets or .SAV files. . * Richard Ristow, March 2014 . * Contact: [hidden email] . * . * The technique of representing a set of consecutive records in a . * file by the largest value of a sequence number was introducted to . * the author by David Marso. David also wrote code using MATRIX to . * 'unroll' a record containing a pair of such numbers into the set . * of all pairs of sequence numbers represented, which has been . * adapted here. . * ================================================================= . * . * All arguments are terminated by '/'. . * . * Required arguments: . * LeftIn Name of one of the datasets, or saved files, to be joined. * (referred to as the "Left input"); may be '*'. . * RightIn Name of the other dataset, or saved file, to be joined. * (referred to as the "Right input"); may not be '*'. . * . * Optional arguments for the join operation: . * BYvar Variable or variables on which the inputs are to be . * joined; if not specified, the result will be a Cartesian . * product, matching every record from each input to every . * record from the other. Requirements for BYvar are the . * same as as for BY variables in MATCH FILES: . * all variables must be present and of the same type in . * both inputs; and both inputs must be in ascending order . * on the variable(s). . * Join Type of join (relevant only if BYvar is specified): . * Outer (default), LeftOuter, RightOuter, or Inner -- . * . Outer Output includes contents of all records . * from both inputs . * . Left Outer Output includes contents of all records . * or LeftOuter from the Left input, but only records from . * or LOuter the Right input that match some left-input . * record . * . Right Outer Output includes contents of all records . * or RightOuter from the Right input, but only records from . * or ROuter the Left input that match some right-input . * record . * . Inner Output contains only records from each . * input that match some record in the other . * Left# Name of variable to hold sequence numbers of Left input . * records, within values of BYvar. Name should not occur . * in either input file. Default: Left# . * Right# Name of variable to hold sequence numbers of Right input . * records, within values of BYvar. Name should not occur . * in either input file. Default: Right# . * . * Optional arguments for the inputs: . * LRename 'RENAME' to be applied to the Left input . * LDrop 'DROP' to be applied to the Left input . * LKeep 'KEEP' to be applied to the Left input (default: ALL) . * RRename 'RENAME' to be applied to the Right input . * RDrop 'DROP' to be applied to the Right input . * RKeep 'KEEP' to be applied to the Right input (default: ALL) . * . * Optional argument for the output: . * DSname Name of the output, joined dataset. The joined dataset . * will always be active, but will be unnamed unless . * DSname is specified. . * ................................................................ . * Reserved names: . * . * By default, variables Left# and Right# are created, and retained . * in the output. (These variables may be given other names by . * specifying macro arguments Left#= and Right#=) . * . * Variables @Mny2Mny_NoBreak, @Mny2Mny_NewGrp . * @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec . * @Mny2Mny_Grp# @Mny2Mny_GrpRecs . * are created, used, and discarded. . * . * Datasets @Mny2Mny_Left#d @Mny2Mny_Right#d . * @Mny2Mny_Left## @Mny2Mny_GrpSmry . * are created, used, and discarded. . * **************************************************************** . DEFINE !Mny2Mny(LeftIn = !CHAREND('/') /RightIn = !CHAREND('/') /BYvar = !CHAREND('/') /LRename = !CHAREND('/') /LDrop = !CHAREND('/') /LKeep = !CHAREND('/') !DEFAULT(ALL) /RRename = !CHAREND('/') /RDrop = !CHAREND('/') /RKeep = !CHAREND('/') !DEFAULT(ALL) /Join = !CHAREND('/') !DEFAULT(Outer) /Left# = !CHAREND('/') !DEFAULT(Left#) /Right# = !CHAREND('/') !DEFAULT(Right#) /DSname = !CHAREND('/')) ECHO " ". ECHO !QUOTE(!CONCAT('Macro !','Mny2Mny:')). * ....... Section I: Input parsing and error checking ....... . !LET !Error = No * ....... I.A: Required inputs ....... . !IF ( !LeftIn !EQ !NULL !OR !RightIn !EQ !NULL) !THEN !LET !Error = Yes ECHO "Arguments LeftIn and RightIn (the two inputs)". ECHO " are required.". !IF (!LeftIn !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(" (Left input specified: ", !LeftIn,')')). !IFEND !IF (!RightIn !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(" (Right input specified: ", !RightIn,')')). !IFEND !IFEND !IF ( !RightIn !EQ '*') !THEN !LET !Error = Yes ECHO "The Right input may not be '*'. (The Left input may be)". !IFEND * ....... I.B: One-token inputs ....... . !IF (!TAIL(!Left#) !NE !NULL) !THEN !LET !Error = Yes ECHO "'Left#', if specified, must be a single token". ECHO !QUOTE(!CONCAT('(Entered: ','Left#,')')). !IFEND !IF (!TAIL(!Right#) !NE !NULL) !THEN !LET !Error = Yes ECHO "'Right#', if specified, must be a single token". ECHO !QUOTE(!CONCAT('(Entered: ','Right#,')')). !IFEND !IF (!TAIL(!DSname) !NE !NULL) !THEN !LET !Error = Yes ECHO "'DSname', if specified, must be a single token". ECHO !QUOTE(!CONCAT('(Entered: ',!DSname,')')). !IFEND * ....... I.C: Join type ....... . !IF (!UPCASE(!Join) !EQ 'OUTER') !THEN !LET !JType = 'OUTER' !LET !JMsg = 'Outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'LEFT OUTER') !THEN !LET !JType = 'LEFTOUT' !LET !JMsg = 'Left outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'LEFTOUTER') !THEN !LET !JType = 'LEFTOUT' !LET !JMsg = 'Left outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'LOUTER') !THEN !LET !JType = 'LEFTOUT' !LET !JMsg = 'Left outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'RIGHT OUTER') !THEN !LET !JType = 'RIGHTOUT' !LET !JMsg = 'Right outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'RIGHTOUTER') !THEN !LET !JType = 'RIGHTOUT' !LET !JMsg = 'Right outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'ROUTER') !THEN !LET !JType = 'RIGHTOUT' !LET !JMsg = 'Right outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'INNER') !THEN !LET !JType = 'INNER' !LET !JMsg = 'Inner join' !IFEND !IF (!BYvar !EQ !NULL) !THEN !LET !JType = 'CARTPROD' !LET !JMsg = 'Cartesian product' !IFEND !IF (!JType !EQ !NULL) !THEN !LET !Error = Yes ECHO "Invalid join type:". ECHO !QUOTE(!CONCAT(' Join=',!Join)). !IFEND * ....... Section II: Informational messages ........ . !IF (!Error !EQ No) !THEN ECHO !QUOTE(!CONCAT(!JMsg,' of')). ECHO !QUOTE(!CONCAT(' ',!LeftIn)). !IF ( !LRename !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /RENAME=',!LRename)) !IFEND !IF ( !LDrop !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /DROP =',!LDrop)) !IFEND !IF (!UPCASE(!LKeep) !NE 'ALL') !THEN ECHO !QUOTE(!CONCAT(' /KEEP =',!BYVar,' ',!LKeep)) !IFEND ECHO !QUOTE(!CONCAT('to ',!RightIn)). !IF ( !RRename !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /RENAME=',!RRename)) !IFEND !IF ( !RDrop !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /DROP =',!RDrop)) !IFEND !IF (!UPCASE(!RKeep) !NE 'ALL') !THEN ECHO !QUOTE(!CONCAT(' /KEEP =',!BYVar,' ',!RKeep)) !IFEND !IF (!BYvar !NE !NULL) !THEN ECHO !QUOTE(!CONCAT('by ',!BYvar)). !IFEND !IF (!DSname !NE !NULL) !THEN ECHO !QUOTE(!CONCAT('Output dataset named ',!DSname)). !IFEND !IFEND * ....... Section III: Processing ....... . * ....... III.A Copies of the inputs, adding record ....... . * sequence numbers (within BYvar values) ....... . !IF (!Error !EQ No) !THEN * ....... III.A.1 Left input -> dataset @Mny2Mny_Left#d ....... . !IF (!BYvar !EQ !NULL) !THEN * ... Left input, when there are no BYvar vars ... . ADD FILES /FILE=!Leftin !IF (!LRename !NE !NULL) !THEN /RENAME = !LRename !IFEND !IF (!LDrop !NE !NULL) !THEN /DROP = !LDrop !IFEND /KEEP = !BYvar !LKeep. COMPUTE @Mny2Mny_NewGrp = ($CASENUM EQ 1). COMPUTE !Left# = $CASENUM. !ELSE * ... Left input, when there are BYvar vars ... . ADD FILES /FILE=!Leftin !IF (!LRename !NE !NULL) !THEN /RENAME = !LRename !IFEND !IF (!LDrop !NE !NULL) !THEN /DROP = !LDrop !IFEND /KEEP = !BYvar !LKeep /BY !BYvar /FIRST = @Mny2Mny_NewGrp. DO IF @Mny2Mny_NewGrp. . COMPUTE !Left# = 1. ELSE. . COMPUTE !Left# = LAG(!Left#) + 1. END IF. !IFEND * ... Left input, code common to both cases ... . COMPUTE @Mny2Mny_NoBreak = 1. FORMATS @Mny2Mny_NoBreak @Mny2Mny_NewGrp (F2). FORMATS !Left# (F4). VARIABLE WIDTH !Left# (6). DATASET NAME @Mny2Mny_Left#d WINDOW=ASIS. * ... End of processing Left input . /*-- LIST /*-*/. * ....... III.A.2 Right input -> dataset @Mny2Mny_Right#d ....... . !IF (!BYvar !EQ !NULL) !THEN * ... Right input, when there are no BYvar vars ... . ADD FILES /FILE=!RightIn !IF (!RRename !NE !NULL) !THEN /RENAME = !RRename !IFEND !IF (!RDrop !NE !NULL) !THEN /DROP = !RDrop !IFEND /KEEP = !BYvar !RKeep. COMPUTE @Mny2Mny_NewGrp = ($CASENUM EQ 1). COMPUTE !Right# = $CASENUM. !ELSE * ... Right input, when there are BYvar vars ... . ADD FILES /FILE=!RightIn !IF (!RRename !NE !NULL) !THEN /RENAME = !RRename !IFEND !IF (!RDrop !NE !NULL) !THEN /DROP = !RDrop !IFEND /KEEP = !BYvar !RKeep /BY !BYvar /FIRST = @Mny2Mny_NewGrp. DO IF @Mny2Mny_NewGrp. . COMPUTE !Right# = 1. ELSE. . COMPUTE !Right# = LAG(!Right#) + 1. END IF. !IFEND * ... Right input, code common to both cases ... . COMPUTE @Mny2Mny_NoBreak = 1. FORMATS @Mny2Mny_NoBreak @Mny2Mny_NewGrp (F2). FORMATS !Right# (F4). VARIABLE WIDTH !Right# (6). DATASET NAME @Mny2Mny_Right#d WINDOW=ASIS. * ... End of processing Right input . /*-- LIST /*-*/. * ....... III.B Build 'spine' of all keys for all records ....... . * ....... III.B.1 Combine record counts from both inputs ....... . DATASET ACTIVATE @Mny2Mny_Left#d WINDOW=ASIS. DATASET DECLARE @Mny2Mny_Left##. AGGREGATE OUTFILE=@Mny2Mny_Left## /BREAK = @Mny2Mny_NoBreak !BYvar /@Mny2Mny_LMaxRec = MAX(!Left#). DATASET ACTIVATE @Mny2Mny_Right#d WINDOW=ASIS. AGGREGATE OUTFILE=* /BREAK = @Mny2Mny_NoBreak !BYvar /@Mny2Mny_RMaxRec = MAX(!Right#). MATCH FILES /FILE = @Mny2Mny_Left## /FILE = * /BY @Mny2Mny_NoBreak !BYvar. RECODE @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec (MISSING=0). DATASET NAME @Mny2Mny_GrpSmry. /*-- LIST /*-*/. * ....... III.B.2 Select, based on join type ....... . !IF (!Jtype !EQ 'INNER') !THEN SELECT IF (@Mny2Mny_LMaxRec GT 0 AND @Mny2Mny_RMaxRec GT 0). !IFEND !IF (!Jtype !EQ 'LEFTOUT') !THEN SELECT IF (@Mny2Mny_RMaxRec GT 0). !IFEND !IF (!Jtype !EQ 'RIGHTOUT') !THEN SELECT IF (@Mny2Mny_LMaxRec GT 0). !IFEND * ..................................... . * For OUTER join and Cartesian product, . * no selection is necessary. . * ..................................... . /*-- LIST /*-*/. * ....... III.B.3 Expand the counts to get list of ........ . * pairs of record numbers from the ........ . * groups, by group number ........ . * @Mny2Mny_Grp# is a sequence number for record groups ........ . * (defined by sets of !BYvar values). It serves as a ........ . * surrogate group identifier within MATRIX, which cannot ........ . * in general accept the !BYvar values directly -- there ........ . * may be several of them, and they may be strings. ........ . COMPUTE @Mny2Mny_Grp# = $CASENUM. * @Mny2Mny_GrpRecs is the number of records that will be ........ . * in the final output, for each group. ........ . COMPUTE @Mny2Mny_GrpRecs = MAX(@Mny2Mny_LMaxRec,1) *MAX(@Mny2Mny_RMaxRec,1). * ....... MATRIX code by David Marso, adapted here. ........ . * ... OMS /SELECT TABLES /IF COMMANDS = ["Matrix"] SUBTYPES = ["Notes"] /DESTINATION VIEWER = NO /TAG = "ZapMATRIX". MATRIX. + GET GrpSmry /FILE=* /VARIABLES=@Mny2Mny_Grp# @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec @Mny2Mny_GrpRecs /NAMES=GSnames. + LOOP Group#=1 TO NROW(GrpSmry). + COMPUTE LeftNums=T({1:MMAX({GrpSmry(Group#,2),1})}). + DO IF (GrpSmry(Group#,2) EQ 0). + COMPUTE LeftNums = 0*LeftNums. + END IF. + COMPUTE RghtNums=T({1:MMAX({GrpSmry(Group#,3),1})}). + DO IF (GrpSmry(Group#,3) EQ 0). + COMPUTE RghtNums = 0*RghtNums. + END IF. + SAVE ({MAKE(GrpSmry(Group#,4),1,GrpSmry(Group#,1)), KRONEKER(MAKE(NROW(RghtNums),1,1),LeftNums), KRONEKER(RghtNums,MAKE(NROW(LeftNums),1,1)) }) / OUTFILE * / VARIABLES @Mny2Mny_Grp# !Left# !Right#. + END LOOP. END MATRIX. FORMATS @Mny2Mny_Grp# !Left# !Right# (F4). VARIABLE WIDTH !Left# !Right# (6). /*-- LIST /*-*/. * ....... III.B.4 Combine with summary file by group key ........ . * to get the 'spine': list of all pairs ........ . * of record numbers by group key. ........ . MATCH FILES /TABLE=@Mny2Mny_GrpSmry /FILE =* /BY @Mny2Mny_Grp# /DROP @Mny2Mny_GrpRecs @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec. /*-- DATASET COPY Spine /*-*/. * ....... III.C Add the substantive data ....... . MATCH FILES /FILE =* /TABLE=@Mny2Mny_Right#d /BY !BYvar !Right# /DROP @Mny2Mny_Grp# @Mny2Mny_NoBreak @Mny2Mny_NewGrp. SORT CASES BY !BYvar !Left# !Right#. MATCH FILES /TABLE=@Mny2Mny_Left#d /FILE =* /BY !BYvar !Left# /DROP @Mny2Mny_NoBreak @Mny2Mny_NewGrp /KEEP !BYvar !Left# !Right# ALL. EXECUTE /* so input datasets can be dropped */. !IF (!DSname !NE !NULL) !THEN DATASET NAME !DSname WINDOW=FRONT. !IFEND DATASET CLOSE @Mny2Mny_Left#d. DATASET CLOSE @Mny2Mny_Right#d. DATASET CLOSE @Mny2Mny_Left##. DATASET CLOSE @Mny2Mny_GrpSmry. !IFEND !ENDDEFINE. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by Jignesh Sutar
This is late in the history of this thread but would you (Jignesh) specify what the unrolled dataset would look like because I don’t see it in the example data
you offer. As I understand Richard’s use of the term, I would expect to see the un-unrolled record with repeated data on the measure variable, e.g., measure.1 to measure.n.
Gene Maguin From: SPSSX(r) Discussion [mailto:[hidden email]]
On Behalf Of Jignesh Sutar I have an OLAP style aggregated dataset for which I have many dimensions (in example below I demonstrate with only two, Dim1 and Dim2) with a final dimension being time (Month in example below). I want to force/unroll for ever unique combination of Dim1 and Dim2, every month present in the data. Discarding the scenario that there would be a missing month altogether from all aggregated combinations of Dim1 and Dim2, assume the dataset
would contain all the time points of relevance - at a total level but not perhaps for any individual Dim1*Dim2 level. Andy Wheeler !TimeExpand macro is great for that particular use case: https://andrewpwheeler.wordpress.com/2014/08/04/aggregating-values-in-time-series-charts/ I actually have a working solution to this problem as demonstrated below, but was wondering if there was an (even) more efficient/direct way of doing this and perhaps if Richard Ristows' !Mny2Mny was designed to, or has the capability of
achieving this? (I tried but failed) / ******************************************************* / . /* Load David Marsos' KRONEKER Matrix Macro */. / ******************************************************* / . DEFINE !K (!POS !CHAREND ("/") /!POS !CMDEND). !IF (!TAIL(!1) !NE "") !THEN !LET !T1=!HEAD(!1) !LET !T2=!HEAD(!TAIL(!1)) !LET !Result=!CONCAT(!HEAD(!1),!HEAD(!T2)). + COMPUTE !Result={KRONEKER(!Head(!1),MAKE(NROW(!T2),1,1)),KRONEKER(MAKE(NROW(!T1),1,1),!T2)}. + !K !Result !TAIL(!TAIL(!1)) /!2. !ELSE SAVE !1 / OUTFILE */VARIABLES !2. !IFEND !ENDDEFINE. DEFINE !Load (!POS !CHAREND ("/") / !POS !CMDEND). !IF (!1 !NE "") !THEN GET !HEAD(!2) / FILE !HEAD(!1) / VARIABLES !HEAD(!2) . !Load !TAIL(!1) / !TAIL(!2). !IFEND. !ENDDEFINE. DEFINE !FullCat (Files !CHAREND ("/") / Vars !CMDEND). MATRIX. !Load !Files / !Vars . !K !Vars / !Vars . END MATRIX. !ENDDEFINE. / ******************************************************* / . / ******************************************************* / . /* Demo Data */. / ******************************************************* / . DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0). BEGIN DATA. A 7 1 50 A 7 2 40 A 7 3 20 A 8 1 56 A 8 2 86 A 9 1 45 B 8 2 68 B 8 3 58 B 9 1 57 END DATA. DATASET NAME DSRaw. / ******************************************************* / . /* CODE */. / ******************************************************* / . DATASET ACTIVATE DSRaw. SORT CASES BY Dim1 Dim2. MATCH FILES FILE=* /BY Dim1 Dim2 /FIRST=Idx. CREATE Idx=CSUM(Idx). /* Extract unique values for each dimension */. DATASET ACTIVATE DSRaw. DATASET DECLARE dsIdx. AGGREGATE OUTFILE=dsIdx /BREAK=Idx /Count=N. DATASET ACTIVATE dsIdx. SAVE OUTFILE="C:\Temp\dsIdx.sav" /DROP=Count. DATASET ACTIVATE DSRaw. DATASET DECLARE dsMonth. AGGREGATE OUTFILE=dsMonth /BREAK=Month /Count=N. DATASET ACTIVATE dsMonth. SAVE OUTFILE="C:\Temp\dsMonth.sav" /DROP=Count. !FullCat Files "C:\Temp\dsIdx.sav" "C:\Temp\dsMonth.sav" / Vars Idx Month. DATASET NAME DSExpanded. DATASET ACTIVATE DSRaw. SORT CASES BY Idx Month. DATASET ACTIVATE DSExpanded. SORT CASES BY Idx Month. MATCH FILES FILE=* /TABLE=DSRaw /BY Idx Month. DATASET ACTIVATE DSExpanded. AGGREGATE OUTFILE=* MODE=ADDVARIABLES OVERWRITE=YES /BREAK=Idx /Dim1 to Dim2=MAX(Dim1 TO Dim2). EXE. ===================== To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
|
Hi Eugene, In this thread I posted a solution to what the unrolled dataset would look like. Is my use of the term "unrolled" incorrect to what it has been refereed to as previously? Apologies, if so. You should notice that in the combination of Dim1="A" and Dim2=8 ---> Month=3 is missing. Dim1="A" and Dim2=9 ---> Month=2 and 3 are missing. Dim1="B" and Dim2=8 ---> Month=1 is missing. Dim1="B" and Dim2=1 ---> Month=2 and 3 are missing. However, the final solution expands/unrolls these dates. This is what I was trying to achieve. On 13 April 2015 at 15:43, Maguin, Eugene <[hidden email]> wrote:
|
In reply to this post by Jignesh Sutar
I can't say whether it is more efficient, but I updated that code to "fill in" time periods for panel datasets. I've posted the updated code at https://dl.dropboxusercontent.com/s/i5cii0zdw78tszc/TimeExpand_Macros.sps?dl=0. Here is an example for your dataset.
******************************. *Using the PanelFillDate macro, available at https://dl.dropbox.com/s/i5cii0zdw78tszc/TimeExpand_Macros.sps?dl=0. DATA LIST LIST / Dim1 (A1) Dim2 (F1.0) Month (F1.0) Measure (F8.0). BEGIN DATA. A 7 1 50 A 7 2 40 A 7 3 20 A 8 1 56 A 8 2 86 A 9 1 45 B 8 2 68 B 8 3 58 B 9 1 57 END DATA. DATASET NAME DSRaw. *Needs an actual date variable, and the ID needs to be a number. COMPUTE MonthT = DATE.MDY(Month,1,2014). STRING Uni (A20). COMPUTE Uni = CONCAT(Dim1,"_|_",STRING(Dim2,F1.0)). AUTORECODE Uni /INTO Uni_#. !PanelFillDate OrigData = DSRaw Id = Uni_# DateVar = MonthT Begin = "DATE.MDY(1,1,2014)" End = "DATE.MDY(12,1,2014)" Type = "Months". *Cleaning up extraneous variables. MATCH FILES FILE = * /DROP Uni Uni_#. ******************************. Though this goes the extra step and adds in the empty months. |
That's very similar to my CASESTOVARS (but I convert the time period rather than ID as you do) and then VARSTOCASES. I like how you use INPUT PROGRAM to explicitly generate the time points, neat! On 13 April 2015 at 18:29, Andy W <[hidden email]> wrote: I can't say whether it is more efficient, but I updated that code to "fill |
Administrator
|
DATA LIST /DATE1 (ADATE).
BEGIN DATA 01/20/2009 10/30/2010 END DATA. MATRIX. GET Dates / FILE *. SAVE T({dates(1,1):dates(2,1):86400}) /OUTFILE * /VARIABLES=datefill. END MATRIX. FORMATS datefill (ADATE).
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Free forum by Nabble | Edit this page |