|
Menu typo corrected. That's what
happens in Juniper pollen season.
Jon Peck (no "h") aka Kim Senior Software Engineer, IBM [hidden email] phone: 720-342-5621 |
|
In reply to this post by David Marso
At 08:57 AM 3/10/2014, David Marso wrote, among other things:
>Building Richard's 'spine' can be done easily with KRONEKER. Substantive response has been moved to thread "Macro !Mny2Mny, for many-to-many merge and Cartesian product" ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by David Marso
At 08:57 AM 3/10/2014, David Marso wrote, under subject line
"Re: News from the SPSS Community": >Building Richard's 'spine' can be done easily with KRONEKER. I'd be interested in how to do this, since I don't find the double-VARSTOCASES approach wholly satisfactory(*). The input from which the 'spine' is generated consists of an indefinite number of records, each containing, . One key value, where the key can consist of any number of variables of any datatypes. (The key value is unique in this file.) . A left-hand record count, GE 0: Rl . A right-hand record count, GE 0: Rr The output, the spine itself, contains the same key values, but with Rl*Rr records for each (or Rl or Rr records, when an outer join is being performed and one of the values is 0). There's one record for each {l,r:1<=l<=Rl,1<=r<=Rr} (except that l is 0 if Rl is 0, and r is 0 if Rr is 0.) I'm appending Appendix I (with the output) and Appendix II (with the test data and all code, including the macro definition) from a small demonstration run. In the output, the pre-spine and spine are listed, following the listing of the merged data. You'll see that both have variables beyond the basics I listed above. Those are variables used in earlier stages, which are carried forward past when they are no longer needed, to the first time they can conveniently be eliminated. Finally, notice that this is a true outer join, with unmatched records from both inputs appearing in the output. ====================================== (*)Definition and use of the 'spine': see Date: Sun, 9 Mar 2014 17:08:56 -0500 From: Richard Ristow <[hidden email]> Subject: Re: Macro !Mny2Mny, for many-to-many merge and Cartesian product To: [hidden email] ======================================= APPENDIX I: Test and demonstration run ======================================= |-----------------------------|---------------------------| |Output Created |10-MAR-2014 14:37:05 | |-----------------------------|---------------------------| [Schools] Category School Art RI School of Design Art Metropolitan Museum Culinary Johnson & Wales Liberal Arts Oberlin Liberal Arts Antioch Liberal Arts Swarthmore Technical MIT Technical Cal Tech Technical ETH Number of cases read: 9 Number of cases listed: 9 ... |-----------------------------|---------------------------| |Output Created |10-MAR-2014 14:37:05 | |-----------------------------|---------------------------| [Subjects] Category Subject Art Painting Art Drawing Art Sculpture Liberal Arts Literature Liberal Arts Philosophy Psychoceramics Carberry Psychoceramics Library science Technical Calculus Technical Physics Technical Chemistry Technical Engineering Number of cases read: 11 Number of cases listed: 11 !Mny2Mny LeftIn=Schools/ RightIn=Subjects/ BYvar=Category. Macro !Mny2Mny: Outer join of Schools to Subjects by Category DATASET NAME Merged WINDOW=FRONT. LIST. List |-----------------------------|---------------------------| |Output Created |10-MAR-2014 14:37:07 | |-----------------------------|---------------------------| [Merged] Category Left# Right# School Subject Art 1 1 RI School of Design Painting Art 1 2 RI School of Design Drawing Art 1 3 RI School of Design Sculpture Art 2 1 Metropolitan Museum Painting Art 2 2 Metropolitan Museum Drawing Art 2 3 Metropolitan Museum Sculpture Culinary 1 0 Johnson & Wales Liberal Arts 1 1 Oberlin Literature Liberal Arts 1 2 Oberlin Philosophy Liberal Arts 2 1 Antioch Literature Liberal Arts 2 2 Antioch Philosophy Liberal Arts 3 1 Swarthmore Literature Liberal Arts 3 2 Swarthmore Philosophy Psychoceramics 0 1 Carberry Psychoceramics 0 2 Library science Technical 1 1 MIT Calculus Technical 1 2 MIT Physics Technical 1 3 MIT Chemistry Technical 1 4 MIT Engineering Technical 2 1 Cal Tech Calculus Technical 2 2 Cal Tech Physics Technical 2 3 Cal Tech Chemistry Technical 2 4 Cal Tech Engineering Technical 3 1 ETH Calculus Technical 3 2 ETH Physics Technical 3 3 ETH Chemistry Technical 3 4 ETH Engineering Number of cases read: 27 Number of cases listed: 27 DATASET ACTIVATE Pre_Spine WINDOW=FRONT. LIST. List |-----------------------------|---------------------------| |Output Created |10-MAR-2014 14:37:07 | |-----------------------------|---------------------------| [Pre_Spine] @Mny2Mny_NoBreak Category @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec 1 Art 2 3 1 Culinary 1 0 1 Liberal Arts 3 2 1 Psychoceramics 0 2 1 Technical 3 4 Number of cases read: 5 Number of cases listed: 5 DATASET ACTIVATE Spine WINDOW=FRONT. LIST. List |-----------------------------|---------------------------| |Output Created |10-MAR-2014 14:37:08 | |-----------------------------|---------------------------| [Spine] [listing has been hand-reformatted] @Mny2Mny_ Category @Mny2Mny_ @Mny2Mny_ Right# Left# NoBreak LMaxRec RMaxRec 1 Art 2 3 1 1 1 Art 2 3 1 2 1 Art 2 3 2 1 1 Art 2 3 2 2 1 Art 2 3 3 1 1 Art 2 3 3 2 1 Culinary 1 0 0 1 1 Liberal Arts 3 2 1 1 1 Liberal Arts 3 2 1 2 1 Liberal Arts 3 2 1 3 1 Liberal Arts 3 2 2 1 1 Liberal Arts 3 2 2 2 1 Liberal Arts 3 2 2 3 1 Psychoceramics 0 2 1 0 1 Psychoceramics 0 2 2 0 1 Technical 3 4 1 1 1 Technical 3 4 1 2 1 Technical 3 4 1 3 1 Technical 3 4 2 1 1 Technical 3 4 2 2 1 Technical 3 4 2 3 1 Technical 3 4 3 1 1 Technical 3 4 3 2 1 Technical 3 4 3 3 1 Technical 3 4 4 1 1 Technical 3 4 4 2 1 Technical 3 4 4 3 Number of cases read: 27 Number of cases listed: 27 ======================================= APPENDIX II: Test data, and all code ======================================= * C:\Documents and Settings\Richard\My Documents . * \Technical\spssx-l\Z-2014 . * \2014-03-10 Marso-Re News from the SPSS Community.SPS . * In response to posting . * Date: Mon, 10 Mar 2014 06:57:24 -0700 . * From: David Marso <[hidden email]> . * Subject: Re: News from the SPSS Community . * To: [hidden email] . * The response will be posted under subject line . * Re: Macro !Mny2Mny, for many-to-many merge and Cartesian product . * David Marso wrote, . * "Building Richard's 'spine' can be done easily with [MATRIX . * function] KRONEKER." . * . * See thread beginning with . * Date: Sun, 9 Mar 2014 17:54:52 -0500 . * From: Richard Ristow <[hidden email]> . * Subject: Macro !Mny2Mny, for many-to-many merge . * and Cartesian product . * To: [hidden email] . * especially . * Date: Sun, 9 Mar 2014 17:08:56 -0500 . * From: Richard Ristow <[hidden email]> . * Subject: Re: Macro !Mny2Mny, for many-to-many merge . * and Cartesian product . * To: [hidden email] . * ................. Matrix <!>Mny2Mny .................... . * ... This version modified, to save copies of the dataset from . * ... which the 'spine' is built, and the spine itself. . DEFINE !Mny2Mny(LeftIn = !CHAREND('/') /RightIn = !CHAREND('/') /BYvar = !CHAREND('/') /LRename = !CHAREND('/') /LDrop = !CHAREND('/') /LKeep = !CHAREND('/') !DEFAULT(ALL) /RRename = !CHAREND('/') /RDrop = !CHAREND('/') /RKeep = !CHAREND('/') !DEFAULT(ALL) /Join = !CHAREND('/') !DEFAULT(Outer) /MaxGrp = !TOKENS(1) !DEFAULT(50)) ECHO " ". ECHO !QUOTE(!CONCAT('Macro !','Mny2Mny:')). * ....... Section I: Input parsing and error checking ....... . !LET !Error = No !IF ( !LeftIn !EQ !NULL !OR !RightIn !EQ !NULL) !THEN !LET !Error = Yes ECHO "Arguments LeftIn and RightIn (the two inputs)". ECHO " are required.". !IFEND !IF ( !RightIn !EQ '*') !THEN !LET !Error = Yes ECHO "The Right input may not be '*'". !IFEND !IF (!UPCASE(!Join) !EQ 'OUTER') !THEN !LET !JType = 'OUTER' !LET !JMsg = 'Outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'LEFT OUTER') !THEN !LET !JType = 'LEFTOUT' !LET !JMsg = 'Left outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'LEFTOUTER') !THEN !LET !JType = 'LEFTOUT' !LET !JMsg = 'Left outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'LOUTER') !THEN !LET !JType = 'LEFTOUT' !LET !JMsg = 'Left outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'RIGHT OUTER') !THEN !LET !JType = 'RIGHTOUT' !LET !JMsg = 'Right outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'RIGHTOUTER') !THEN !LET !JType = 'RIGHTOUT' !LET !JMsg = 'Right outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'ROUTER') !THEN !LET !JType = 'RIGHTOUT' !LET !JMsg = 'Right outer join' !IFEND !IF (!UPCASE(!Join) !EQ 'INNER') !THEN !LET !JType = 'INNER' !LET !JMsg = 'Inner join' !IFEND !IF (!BYvar !EQ !NULL) !THEN !LET !JType = 'CARTPROD' !LET !JMsg = 'Cartesian product' !IFEND !IF (!JType !EQ !NULL) !THEN !LET !Error = Yes ECHO "Invalid join type:". ECHO !QUOTE(!CONCAT(' Join=',!Join)). !IFEND * ....... Section II: Informational messages ........ . !IF (!Error !EQ No) !THEN ECHO !QUOTE(!CONCAT(!JMsg,' of')). ECHO !QUOTE(!CONCAT(' ',!LeftIn)). !IF ( !LRename !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /RENAME=',!LRename)) !IFEND !IF ( !LDrop !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /DROP =',!LDrop)) !IFEND !IF (!UPCASE(!LKeep) !NE 'ALL') !THEN ECHO !QUOTE(!CONCAT(' /KEEP =',!BYVar,' ',!LKeep)) !IFEND ECHO !QUOTE(!CONCAT('to ',!RightIn)). !IF ( !RRename !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /RENAME=',!RRename)) !IFEND !IF ( !RDrop !NE !NULL) !THEN ECHO !QUOTE(!CONCAT(' /DROP =',!RDrop)) !IFEND !IF (!UPCASE(!RKeep) !NE 'ALL') !THEN ECHO !QUOTE(!CONCAT(' /KEEP =',!BYVar,' ',!RKeep)) !IFEND !IF (!BYvar !NE !NULL) !THEN ECHO !QUOTE(!CONCAT('by ',!BYvar)). !IFEND !IFEND * ....... Section III: Processing ....... . * ....... III.A Sequence-numbered copies of the inputs ....... . !IF (!Error !EQ No) !THEN ADD FILES /FILE=!Leftin. NUMERIC @Mny2Mny_NoBreak (F2). COMPUTE @Mny2Mny_NoBreak = 1. ADD FILES /FILE=* !IF (!LRename !NE !NULL) !THEN /RENAME = !LRename !IFEND /BY @Mny2Mny_NoBreak !BYvar /FIRST = @Mny2Mny_NewGrp !IF (!LDrop !NE !NULL) !THEN /DROP = !LDrop !IFEND /KEEP = @Mny2Mny_NoBreak !BYvar !LKeep. DO IF @Mny2Mny_NewGrp. . COMPUTE Left# = 1. ELSE. . COMPUTE Left# = LAG(Left#) + 1. END IF. FORMATS Left# (F4). DATASET NAME @Mny2Mny_Left#d WINDOW=ASIS. /*-- LIST /*-*/. DATASET DECLARE @Mny2Mny_Left##. AGGREGATE OUTFILE=@Mny2Mny_Left## /BREAK = @Mny2Mny_NoBreak !BYvar /@Mny2Mny_LMaxRec = MAX(Left#). ADD FILES /FILE=!RightIn. NUMERIC @Mny2Mny_NoBreak(F2). COMPUTE @Mny2Mny_NoBreak = 1. ADD FILES /FILE=* !IF (!LRename !NE !NULL) !THEN /RENAME = !RRename !IFEND /BY @Mny2Mny_NoBreak !BYvar /FIRST = @Mny2Mny_NewGrp !IF (!RDrop !NE !NULL) !THEN /DROP = !RDrop !IFEND /KEEP = @Mny2Mny_NoBreak !BYvar !RKeep. DO IF @Mny2Mny_NewGrp. . COMPUTE Right# = 1. ELSE. . COMPUTE Right# = LAG(Right#) + 1. END IF. FORMATS Right# (F4). DATASET NAME @Mny2Mny_Right#d WINDOW=ASIS. /*-- LIST /*-*/. * ....... III.B Build 'spine' of all keys for all records ....... . * ....... III.B.1 Combine record counts from the inputs ....... . AGGREGATE OUTFILE=* /BREAK = @Mny2Mny_NoBreak !BYvar /@Mny2Mny_RMaxRec = MAX(Right#). /*-- LIST /*-*/. MATCH FILES /FILE = @Mny2Mny_Left## /FILE = * /BY @Mny2Mny_NoBreak !BYvar. RECODE @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec (MISSING=0). * ....... III.B.2 Select, based on join type ....... . !IF (!Jtype !EQ 'INNER') !THEN SELECT IF (@Mny2Mny_LMaxRec GT 0 AND @Mny2Mny_RMaxRec GT 0). !IFEND !IF (!Jtype !EQ 'LEFTOUT') !THEN SELECT IF (@Mny2Mny_RMaxRec GT 0). !IFEND !IF (!Jtype !EQ 'RIGHTOUT') !THEN SELECT IF (@Mny2Mny_LMaxRec GT 0). !IFEND * ..................................... . * For OUTER join and Cartesian product, . * no selection is necessary. . * ..................................... . * ....... III.B.3 Error message, if record count exceeds ....... . * the allowed maximum ....... . IF $CASENUM EQ 1 #Overflow = 0. DO IF @Mny2Mny_LMaxRec GT !MaxGrp. . DO IF #Overflow EQ 0. . PRINT / 'Left input: ' @Mny2Mny_LMaxRec !IF (!Jtype !EQ 'CARTPROD') !THEN ' records ' !ELSE ' records for key ', !BYvar !IFEND !QUOTE(!CONCAT(' (max is ',!MaxGrp,')')). . COMPUTE #Overflow = 1. . ELSE. . PRINT / 'Left input: ' @Mny2Mny_LMaxRec !IF (!Jtype !EQ 'CARTPROD') !THEN ' records '. !ELSE ' records for key ', !BYvar. !IFEND . END IF. END IF. DO IF @Mny2Mny_RMaxRec GT !MaxGrp. . DO IF #Overflow EQ 0. . PRINT / 'Right input: ' @Mny2Mny_RMaxRec !IF (!Jtype !EQ 'CARTPROD') !THEN ' records ' !ELSE ' records for key ', !BYvar !IFEND !QUOTE(!CONCAT(' (max is ',!MaxGrp,')')). . COMPUTE #Overflow = 1. . ELSE. . PRINT / 'Right input: ' @Mny2Mny_RMaxRec !IF (!Jtype !EQ 'CARTPROD') !THEN ' records '. !ELSE ' records for key ', !BYvar. !IFEND . END IF. END IF. * ....... III.B.4 Build the 'spine', of all record keys ........ . * ....... From the file with counts of records from both ........ . * ....... inputs by BYvar values (which is currently active), ........ . * ....... 'unroll' to build a file with all combinations of ........ . * ....... BYvar values, and pairs of record sequence numbers, ........ . * ....... that will occur in the output. ........ . * ....... This file is the 'spine'. ........ . /**/ DATASET COPY Pre_Spine /*-*/. * 'Unroll' the Right record numbers. VECTOR Rec#(!MaxGrp,F4). DO IF @Mny2Mny_RMaxRec EQ 0. . COMPUTE Rec#(1) = 0. ELSE. . LOOP #Idx = 1 TO MIN(@Mny2Mny_RMaxRec,!MaxGrp). . COMPUTE Rec#(#Idx) = #Idx. . END LOOP. END IF. EXECUTE /* apparently necessary */. !LET !Sfx = !LENGTH (!BLANKS (!MaxGrp)) OMS /SELECT ALL /IF COMMANDS = ["Variables to Cases"] /DESTINATION VIEWER = NO /TAG = "@MNY2MNY_V2C_ZAP". VARSTOCASES /MAKE Right# FROM Rec#1 TO !CONCAT('Rec#',!Sfx). * 'Unroll' the Left record numbers. VECTOR Rec#(!MaxGrp,F4). DO IF @Mny2Mny_LMaxRec EQ 0. . COMPUTE Rec#(1) = 0. ELSE. . LOOP #Idx = 1 TO MIN(@Mny2Mny_LMaxRec,!MaxGrp). . COMPUTE Rec#(#Idx) = #Idx. . END LOOP. END IF. EXECUTE /* apparently necessary */. VARSTOCASES /MAKE Left# FROM Rec#1 TO !CONCAT('Rec#',!Sfx). OMSEND TAG=["@MNY2MNY_V2C_ZAP"]. /**/ DATASET COPY Spine /*-*/. * ....... III.C Add the substantive data ....... . MATCH FILES /FILE=* /TABLE=@Mny2Mny_Right#d /BY !BYvar Right# /DROP @Mny2Mny_NoBreak @Mny2Mny_NewGrp @Mny2Mny_LMaxRec @Mny2Mny_RMaxRec. SORT CASES BY !BYvar Left# Right#. MATCH FILES /TABLE=@Mny2Mny_Left#d /FILE=* /BY !BYvar Left# /DROP @Mny2Mny_NoBreak @Mny2Mny_NewGrp /KEEP !BYvar Left# Right# ALL. EXECUTE /* so input datasets can be dropped */. DATASET CLOSE @Mny2Mny_Left#d. DATASET CLOSE @Mny2Mny_Right#d. DATASET CLOSE @Mny2Mny_Left##. !IFEND !ENDDEFINE. * ................. Test data .................... . DATA LIST FIXED /Category 04-18 (A) School 19-40 (A). BEGIN DATA Art RI School of Design Art Metropolitan Museum Culinary Johnson & Wales Liberal Arts Oberlin Liberal Arts Antioch Liberal Arts Swarthmore Technical MIT Technical Cal Tech Technical ETH END DATA. DATASET NAME Schools. . /*-- LIST /*-*/. DATA LIST FIXED /Category 04-18 (A) Subject 19-35 (A). BEGIN DATA Art Painting Art Drawing Art Sculpture Liberal Arts Literature Liberal Arts Philosophy Psychoceramics Carberry Psychoceramics Library science Technical Calculus Technical Physics Technical Chemistry Technical Engineering END DATA. DATASET NAME Subjects. * ...................................................... . DATASET ACTIVATE Schools WINDOW=FRONT. LIST. DATASET ACTIVATE Subjects WINDOW=FRONT. LIST. !Mny2Mny LeftIn=Schools/ RightIn=Subjects/ BYvar=Category. DATASET NAME Merged WINDOW=FRONT. LIST. DATASET ACTIVATE Pre_Spine WINDOW=FRONT. LIST. DATASET ACTIVATE Spine WINDOW=FRONT. LIST. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
| Free forum by Nabble | Edit this page |
