Administrator
|
This post was updated on .
Hi Richard and whoever is interested:
This is a very crude basic sketch and I have made no effort to turn it into a macro at this point. Basically rather than have the variables as a set be the key, imagine one has mapped these to a single variable using a counter or some other device. The logic in the matrix program then generates the necessary number of records for merging into the two data sets. Since the 'spine' is already ordered properly for the left file we immediately merge it in as a TABLE. SORT to align with the 'right' file and match that. I hope this makes sense ;-))) ---- NEW FILE. DATASET CLOSE ALL. DATA LIST / Category (A16) School (A30). BEGIN DATA Art RI School of Design Art Metropolitan Museum Culinary Johnson & Wales Liberal Arts Oberlin Liberal Arts Antioch Liberal Arts Swarthmore Technical MIT Technical Cal Tech Technical ETH END DATA. SORT CASES BY Category. DATASET NAME f1. LIST. DATA LIST / Category (A16) Subject (A30) . BEGIN DATA Art Painting Art Drawing Art Sculpture Liberal Arts Literature Liberal Arts Philosophy Psychoceramics Carberry Psychoceramics Library science Technical Calculus Technical Physics Technical Chemistry Technical Engineering END DATA. SORT CASES BY Category. DATASET NAME f2. LIST. ADD FILES / FILE f1 / FILE f2 / IN=@IN2@. AUTORECODE Category / INTO AR_Category. IF $CASENUM EQ 1 OR @IN2@ NE LAG(@IN2@) OR Category NE LAG(Category) InternalCount=1. IF MISSING(InternalCount) InternalCount=LAG(InternalCount)+1. DATASET NAME stacked. DATASET DECLARE agg. AGGREGATE OUTFILE agg / BREAK @IN2@ Category AR_Category/ NperCat=N . DATASET Activate stacked. DATASET COPY stack2. SELECT IF NOT(@IN2@). EXECUTE. DELETE VARIABLES Subject. RENAME VARIABLES INTERNALCount=F1INstance. DATASET Activate stack2. SELECT IF @IN2@. EXECUTE. DELETE VARIABLES School. RENAME VARIABLES INTERNALCount=F2INstance. DATASET ACTIVATE agg. AGGREGATE OUTFILE * MODE ADDVARIABLES / BREAK @IN2@ / NCatPerFile=N. DATASET DECLARE spine . MATRIX. GET DATA /FILE * / VARIABLES @IN2@ AR_Category NPerCat NCatPerFile. COMPUTE Base1=Data(1:Data(1,4),2:3). COMPUTE Base2=Data((Data(1,4)+1):NROW(Data),2:3). COMPUTE MaxCat=MMAX(Data(:,2)). COMPUTE X1=MAKE(MaxCat,1,0). COMPUTE X2=MAKE(MaxCat,1,0). LOOP #=1 TO NROW(Base1) . COMPUTE X1(Base1(#,1))=Base1(#,2). END LOOP. LOOP #=1 TO NROW(Base2) . COMPUTE X2(Base2(#,1))=Base2(#,2). END LOOP. COMPUTE M12=RMAX({{X1 &*X2},{X1,X2}}). COMPUTE C={T({1:MaxCat}),X1,X2,M12}. COMPUTE Spine=MAKE(MSUM(M12),3,0). COMPUTE CurRow=1. LOOP #=1 TO NRow(C). COMPUTE LastRow=CurRow+C(#,4)-1. COMPUTE T1=T({1:MMAX({C(#,2),1})}). COMPUTE T2=T({1:MMAX({C(#,3),1})}). COMPUTE T0=MAKE(C(#,4),1,C(#,1)). COMPUTE Spine(CurRow:LastRow,:)= {T0,KRONEKER(T1,MAKE(NROW(T2),1,1)), KRONEKER(MAKE(NROW(T1),1,1),T2)}. COMPUTE CurRow=LastRow+1. END LOOP. SAVE Spine / OUTFILE Spine / VARIABLES AR_Category F1Instance F2Instance. END MATRIX. MATCH FILES FILE spine / TABLE stacked / BY AR_Category F1Instance. SORT CASES BY AR_Category F2Instance. MATCH FILES FILE * / TABLE stack2 / BY AR_Category F2Instance.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
A *VERY* MUCH improved MATRIX solution from previous.
DATASET DECLARE spine . MATRIX. GET DATA /FILE * / VARIABLES @IN2@ AR_Category NPerCat NCatPerFile. COMPUTE Base1=Data(1:Data(1,4),2:3). COMPUTE Base2=Data((Data(1,4)+1):NROW(Data),2:3). COMPUTE MaxCat=MMAX(Data(:,2)). COMPUTE XX=MAKE(MaxCat,2,0). LOOP #=1 TO MaxCat . + DO IF NRow(Base1) GE #. + COMPUTE XX(Base1(#,1),1)=Base1(#,2). + END IF. + DO IF NRow(Base2) GE #. + COMPUTE XX(Base2(#,1),2)=Base2(#,2). + END IF. END LOOP. COMPUTE C={T({1:MaxCat}),XX,RMAX({XX(:,1) &* XX(:,2),XX})}. LOOP #=1 TO NRow(C). COMPUTE T1=T({1:MMAX({C(#,2),1})}). COMPUTE T2=T({1:MMAX({C(#,3),1})}). SAVE ({MAKE(C(#,4),1,C(#,1)), KRONEKER(T1,MAKE(NROW(T2),1,1)), KRONEKER(MAKE(NROW(T1),1,1),T2)}) / OUTFILE Spine / VARIABLES AR_Category F1Instance F2Instance. END LOOP. END MATRIX. MATCH FILES FILE spine / TABLE stacked / BY AR_Category F1Instance. SORT CASES BY AR_Category F2Instance. MATCH FILES FILE * / TABLE stack2 / BY AR_Category F2Instance. EXECUTE. DATASET NAME MergedCartProd. DATASET ACTIVATE MergedCartProd.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Hi David,
I was running your code to really see how it worked, particularly the Kronecker section because my reading and understanding of the function is that it is a product. I'm not asking for a comment on the kronecker function. You're a highly skilled programmer; let me see if I can get it. But before I got there, I ran into a problem with the code. Following is a section of your code with my comments. DATASET Activate stacked. DATASET COPY stack2. * these 4 lines operate on stack2 and functionally delete the f2 dataset. SELECT IF NOT(@IN2@). EXECUTE. DELETE VARIABLES Subject. RENAME VARIABLES INTERNALCount=F1INstance. DATASET Activate stack2. SELECT IF @IN2@. EXECUTE. /* with the execution of this command, stack2 has no records and variables of category, @in2@, AR_category, F1Instance. DELETE VARIABLES School. RENAME VARIABLES INTERNALCount=F2INstance. /*<<< InternalCount does not exist in stack2. Gene Maguin ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
Actually the copy is not active at the point of the code you indicate. Stacked is the active file. I could probably have done this using AUTORECODE with templates but it would be messier.
On Wed, Mar 12, 2014 at 10:11 AM, Maguin, Eugene [via SPSSX Discussion] <[hidden email]> wrote: Hi David,
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Administrator
|
In reply to this post by David Marso
Once again, I find that referring to a dataset name on the SAVE command in the MATRIX program is generating an error for me.
Here is the error message that appears after END MATRIX: Run MATRIX procedure: >Error # 34 in column 12. Text: spine >SPSS Statistics cannot access a file with the given file specification. The >file specification is either syntactically invalid, specifies an invalid >drive, specifies a protected directory, specifies a protected file, or >specifies a non-sharable file. >Execution of this command stops. ------ END MATRIX ----- To avoid it, I have to change spine to * in the SAVE command. Version info: SPSS 21.0 0.2, 64-bit version; Windows 7 Professional, SP1 David, I assume you are not having a problem with this on your system? What is your version info? Here is a slightly modified version of David's syntax that runs error-free for me. NEW FILE. DATASET CLOSE ALL. DATA LIST / Category (A16) School (A30). BEGIN DATA Art RI School of Design Art Metropolitan Museum Culinary Johnson & Wales Liberal Arts Oberlin Liberal Arts Antioch Liberal Arts Swarthmore Technical MIT Technical Cal Tech Technical ETH END DATA. SORT CASES BY Category. DATASET NAME f1. LIST. DATA LIST / Category (A16) Subject (A30) . BEGIN DATA Art Painting Art Drawing Art Sculpture Liberal Arts Literature Liberal Arts Philosophy Psychoceramics Carberry Psychoceramics Library science Technical Calculus Technical Physics Technical Chemistry Technical Engineering END DATA. SORT CASES BY Category. DATASET NAME f2. LIST. ADD FILES / FILE f1 / FILE f2 / IN=@IN2@. AUTORECODE Category / INTO AR_Category. IF $CASENUM EQ 1 OR @IN2@ NE LAG(@IN2@) OR Category NE LAG(Category) InternalCount=1. IF MISSING(InternalCount) InternalCount=LAG(InternalCount)+1. DATASET NAME stacked. DATASET DECLARE agg. AGGREGATE OUTFILE agg / BREAK @IN2@ Category AR_Category/ NperCat=N . DATASET Activate stacked. DATASET COPY stack2. SELECT IF NOT(@IN2@). EXECUTE. DELETE VARIABLES Subject. RENAME VARIABLES INTERNALCount=F1INstance. DATASET Activate stack2. SELECT IF @IN2@. EXECUTE. DELETE VARIABLES School. RENAME VARIABLES INTERNALCount=F2INstance. DATASET ACTIVATE agg. AGGREGATE OUTFILE * MODE ADDVARIABLES / BREAK @IN2@ / NCatPerFile=N. DATASET DECLARE spine . /* Removed this from DM's code . MATRIX. GET DATA /FILE * / VARIABLES @IN2@ AR_Category NPerCat NCatPerFile. COMPUTE Base1=Data(1:Data(1,4),2:3). COMPUTE Base2=Data((Data(1,4)+1):NROW(Data),2:3). COMPUTE MaxCat=MMAX(Data(:,2)). COMPUTE XX=MAKE(MaxCat,2,0). LOOP #=1 TO MaxCat . + DO IF NRow(Base1) GE #. + COMPUTE XX(Base1(#,1),1)=Base1(#,2). + END IF. + DO IF NRow(Base2) GE #. + COMPUTE XX(Base2(#,1),2)=Base2(#,2). + END IF. END LOOP. COMPUTE C={T({1:MaxCat}),XX,RMAX({XX(:,1) &* XX(:,2),XX})}. LOOP #=1 TO NRow(C). COMPUTE T1=T({1:MMAX({C(#,2),1})}). COMPUTE T2=T({1:MMAX({C(#,3),1})}). ***** Change "OUTFILE spine" to "OUTFILE *" *******. SAVE ({MAKE(C(#,4),1,C(#,1)), KRONEKER(T1,MAKE(NROW(T2),1,1)), KRONEKER(MAKE(NROW(T1),1,1),T2)}) / OUTFILE spine / VARIABLES AR_Category F1Instance F2Instance. END LOOP. END MATRIX. ******************* . DATASET NAME spine. /* Added this to DM's code. ******************* . MATCH FILES FILE spine / TABLE stacked / BY AR_Category F1Instance. SORT CASES BY AR_Category F2Instance. MATCH FILES FILE * / TABLE stack2 / BY AR_Category F2Instance. EXECUTE. DATASET NAME MergedCartProd. DATASET ACTIVATE MergedCartProd. FORMATS F1instance F2instance(f2.0). LIST.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
Administrator
|
I am running version 22 on Doze 8.1
--- Here is a fresh version of the code. I have revised the hacky crap involving the copied data set by creating the instance counter at the beginning in each source file. Creating a secondary map file to bridge the spine to the actual category codes back to the source files (f1,f2) spine (AR_Category, Instance1, Instance2) --> map (AR_Category, Category) --> f1 (Category, Instance1) --> f2 (Category, Instance2) Also did a bit of tweaking of the MATRIX code itself. This should handle some pretty large problems as the full KP matrix/spine is not actually stored in full. It writes out subsets (one KP matrix for each original matching category) to the active file as they are computed. One immense advantage is there is NO XSAVE to external data sets as required by Jon's extension or Richard's code. Pats self on back...... ----- NEW FILE. DATASET CLOSE ALL. DATA LIST / Category (A16) School (A30). BEGIN DATA Art RI School of Design Art Metropolitan Museum Culinary Johnson & Wales Liberal Arts Oberlin Liberal Arts Antioch Liberal Arts Swarthmore Technical MIT Technical Cal Tech Technical ETH END DATA. SORT CASES BY Category. IF $CASENUM EQ 1 OR Category NE LAG(Category) F1INstance=1. IF MISSING(F1INstance) F1INstance=LAG(F1INstance)+1. DATASET NAME f1. LIST. DATA LIST / Category (A16) Subject (A30) . BEGIN DATA Art Painting Art Drawing Art Sculpture Liberal Arts Literature Liberal Arts Philosophy Psychoceramics Carberry Psychoceramics Library science Technical Calculus Technical Physics Technical Chemistry Technical Engineering END DATA. SORT CASES BY Category. IF $CASENUM EQ 1 OR Category NE LAG(Category) F2INstance=1. IF MISSING(F2INstance) F2INstance=LAG(F2INstance)+1. DATASET NAME f2. LIST. ADD FILES / FILE f1 / FILE f2 / IN=@IN2@. AUTORECODE Category / INTO AR_Category. DATASET NAME stacked. DATASET DECLARE agg. AGGREGATE OUTFILE agg / BREAK @IN2@ Category AR_Category/ NperCat=N . DATASET DECLARE map. DATASET ACTIVATE agg. AGGREGATE OUTFILE map / BREAK Category AR_Category/ NperCat=N . DATASET ACTIVATE agg. AGGREGATE OUTFILE * MODE ADDVARIABLES / BREAK @IN2@ / NCatPerFile=N. MATRIX. GET DATA /FILE * / VARIABLES AR_Category NPerCat NCatPerFile. COMPUTE B1=Data(1:Data(1,3),1:3). COMPUTE B2=Data((Data(1,3)+1):NROW(Data),1:3). COMPUTE MaxCat=MMAX(Data(:,1)). COMPUTE C={T({1:MaxCat}),MAKE(MaxCat,3,0)}. LOOP #=1 TO NROW(B1). + COMPUTE C(B1(#,1),2)=B1(#,2). END LOOP. LOOP #=1 TO NROW(B2). + COMPUTE C(B2(#,1),3)=B2(#,2). END LOOP. LOOP #=1 TO MaxCat. + COMPUTE C(#,2:3) = CMAX ({C(#,2),C(#,3);1,1}). + COMPUTE C(#,4)=C(#,2) * C(#,3). + SAVE ({MAKE(C(#,4),1,C(#,1)), KRONEKER(T({1:C(#,2)}),MAKE(C(#,3),1,1)), KRONEKER(MAKE(C(#,2),1,1),T({1:C(#,3)}))}) / OUTFILE * / VARIABLES AR_Category F1Instance F2Instance. END LOOP. END MATRIX. /* Merge 'spine' into map file to link to source files */. MATCH FILES FILE * / TABLE map / BY AR_Category . MATCH FILES FILE * / TABLE f1 / BY Category F1Instance. SORT CASES BY AR_Category F2Instance. MATCH FILES FILE * / TABLE f2 / BY Category F2Instance. DATASET NAME MergedCartProd. EXECUTE. DATASET CLOSE agg. DATASET CLOSE map. DATASET CLOSE stacked. DATASET ACTIVATE MergedCartProd. DELETE VARIABLES F1Instance F2Instance AR_Category NperCat. LIST. Category School Subject Art RI School of Design Painting Art Metropolitan Museum Painting Art RI School of Design Drawing Art Metropolitan Museum Drawing Art RI School of Design Sculpture Art Metropolitan Museum Sculpture Culinary Johnson & Wales Liberal Arts Oberlin Literature Liberal Arts Antioch Literature Liberal Arts Swarthmore Literature Liberal Arts Oberlin Philosophy Liberal Arts Antioch Philosophy Liberal Arts Swarthmore Philosophy Psychoceramics Carberry Psychoceramics Library science Technical MIT Calculus Technical Cal Tech Calculus Technical ETH Calculus Technical MIT Physics Technical Cal Tech Physics Technical ETH Physics Technical MIT Chemistry Technical Cal Tech Chemistry Technical ETH Chemistry Technical MIT Engineering Technical Cal Tech Engineering Technical ETH Engineering Number of cases read: 27 Number of cases listed: 27
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
FYI,
I'm running v21.0.0.2 and an old version of v15. This runs on both. Thanks for this. I'm doing more work with strings, and this came along serendipitously. B ----- NEW FILE. DATASET CLOSE ALL. DATA LIST / Category (A16) School (A30). BEGIN DATA Art RI School of Design Art Metropolitan Museum Culinary Johnson & Wales Liberal Arts Oberlin Liberal Arts Antioch Liberal Arts Swarthmore Technical MIT Technical Cal Tech Technical ETH END DATA. SORT CASES BY Category. IF $CASENUM EQ 1 OR Category NE LAG(Category) F1INstance=1. IF MISSING(F1INstance) F1INstance=LAG(F1INstance)+1. DATASET NAME f1. LIST. DATA LIST / Category (A16) Subject (A30) . BEGIN DATA Art Painting Art Drawing Art Sculpture Liberal Arts Literature Liberal Arts Philosophy Psychoceramics Carberry Psychoceramics Library science Technical Calculus Technical Physics Technical Chemistry Technical Engineering END DATA. SORT CASES BY Category. IF $CASENUM EQ 1 OR Category NE LAG(Category) F2INstance=1. IF MISSING(F2INstance) F2INstance=LAG(F2INstance)+1. DATASET NAME f2. LIST. ADD FILES / FILE f1 / FILE f2 / IN=@IN2@. AUTORECODE Category / INTO AR_Category. DATASET NAME stacked. DATASET DECLARE agg. AGGREGATE OUTFILE agg / BREAK @IN2@ Category AR_Category/ NperCat=N . DATASET DECLARE map. DATASET ACTIVATE agg. AGGREGATE OUTFILE map / BREAK Category AR_Category/ NperCat=N . DATASET ACTIVATE agg. AGGREGATE OUTFILE * MODE ADDVARIABLES / BREAK @IN2@ / NCatPerFile=N. MATRIX. GET DATA /FILE * / VARIABLES AR_Category NPerCat NCatPerFile. COMPUTE B1=Data(1:Data(1,3),1:3). COMPUTE B2=Data((Data(1,3)+1):NROW(Data),1:3). COMPUTE MaxCat=MMAX(Data(:,1)). COMPUTE C={T({1:MaxCat}),MAKE(MaxCat,3,0)}. LOOP #=1 TO NROW(B1). + COMPUTE C(B1(#,1),2)=B1(#,2). END LOOP. LOOP #=1 TO NROW(B2). + COMPUTE C(B2(#,1),3)=B2(#,2). END LOOP. LOOP #=1 TO MaxCat. + COMPUTE C(#,2:3) = CMAX ({C(#,2),C(#,3);1,1}). + COMPUTE C(#,4)=C(#,2) * C(#,3). + SAVE ({MAKE(C(#,4),1,C(#,1)), KRONEKER(T({1:C(#,2)}),MAKE(C(#,3),1,1)), KRONEKER(MAKE(C(#,2),1,1),T({1:C(#,3)}))}) / OUTFILE * / VARIABLES AR_Category F1Instance F2Instance. END LOOP. END MATRIX. /* Merge 'spine' into map file to link to source files */. MATCH FILES FILE * / TABLE map / BY AR_Category . MATCH FILES FILE * / TABLE f1 / BY Category F1Instance. SORT CASES BY AR_Category F2Instance. MATCH FILES FILE * / TABLE f2 / BY Category F2Instance. DATASET NAME MergedCartProd. EXECUTE. DATASET CLOSE agg. DATASET CLOSE map. DATASET CLOSE stacked. DATASET ACTIVATE MergedCartProd. DELETE VARIABLES F1Instance F2Instance AR_Category NperCat. LIST. Category School Subject Art RI School of Design Painting Art Metropolitan Museum Painting Art RI School of Design Drawing Art Metropolitan Museum Drawing Art RI School of Design Sculpture Art Metropolitan Museum Sculpture Culinary Johnson & Wales Liberal Arts Oberlin Literature Liberal Arts Antioch Literature Liberal Arts Swarthmore Literature Liberal Arts Oberlin Philosophy Liberal Arts Antioch Philosophy Liberal Arts Swarthmore Philosophy Psychoceramics Carberry Psychoceramics Library science Technical MIT Calculus Technical Cal Tech Calculus Technical ETH Calculus Technical MIT Physics Technical Cal Tech Physics Technical ETH Physics Technical MIT Chemistry Technical Cal Tech Chemistry Technical ETH Chemistry Technical MIT Engineering Technical Cal Tech Engineering Technical ETH Engineering Number of cases read: 27 Number of cases listed: 27 ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/MATRIX-code-to-create-spine-for-R-Ristows-Macro-Mny2Mny-for-many-to-many-merge-and-Cartesian-product-tp5724820p5724853.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
I'm curious Brian, do the previous incarnations work on your v 21.x.x and 15?
In part: NEW FILE. DATASET CLOSE all. DATASET DECLARE tmp. MATRIX. SAVE UNIFORM(100,10) / OUTFILE tmp / VARIABLES x01 TO x10. END MATRIX. I know MATRIX throws a fit when READING from datasets, but in my experience (IIRC) writing to datasets from MATRIX was working fine in code that I have living happily in the wild). (ahem... FIX THIS IBM in v 22 fixpack please. I can't afford to step up to 23 when it is released)
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
David,
Yes. Works like a charm in both versions. I, like you I believe, think first in matrix; but there are admittedly times when it's not as friendly as non-matrix macros. B -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso Sent: Wednesday, March 12, 2014 4:59 PM To: [hidden email] Subject: Re: Revised: Macro !Mny2Mny, for many-to-many merge and Cartesian product I'm curious Brian, do the previous incarnations work on your v 21.x.x and 15? In part: NEW FILE. DATASET CLOSE all. DATASET DECLARE tmp. MATRIX. SAVE UNIFORM(100,10) / OUTFILE tmp / VARIABLES x01 TO x10. END MATRIX. *I know MATRIX throws a fit when READING from datasets*, but in my experience (IIRC) writing to datasets from MATRIX was working fine in code that I have living happily in the wild). (ahem... FIX THIS IBM in v 22 fixpack please. I can't afford to step up to 23 when it is released) ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/MATRIX-code-to-create-spine-for-R-Ristows-Macro-Mny2Mny-for-many-to-many-merge-and-Cartesian-product-tp5724820p5724859.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by David Marso
I'm continuing to have trouble with the code, which is the new version. The problem occurs here
MATRIX. GET DATA /FILE * / VARIABLES AR_Category NPerCat NCatPerFile. And the message is Line: 58 Command: Matrix Information: Matrix syntax error: unrecognized statement name. is it spelled correctly? So I ran a little Input program job to create a data file with three numeric variables and a few hundred cases and then wrote matrix. get x/file */variables x y. end matrix. I got the same message as before when I ran it and the following in the output file. My impression was that the matrix-end matrix command set was part of the 'base'. Am I wrong? Run MATRIX procedure: >Error # 12546 in column 1. Text: _SLINE >MATRIX syntax error: unrecognized statement name. Is it spelled correctly? >Execution of this command stops. Scan error detected in parser. The MATRIX statement skipped. >Error # 12546 in column 1. Text: _SLINE >MATRIX syntax error: unrecognized statement name. Is it spelled correctly? >Execution of this command stops. Scan error detected in parser. The MATRIX statement skipped. ------ END MATRIX ----- -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso Sent: Wednesday, March 12, 2014 11:25 AM To: [hidden email] Subject: Revised: Macro !Mny2Mny, for many-to-many merge and Cartesian product I am running version 22 on Doze 8.1 --- Here is a fresh version of the code. I have revised the hacky crap involving the copied data set by creating the instance counter at the beginning in each source file. Creating a secondary map file to bridge the spine to the actual category codes back to the source files (f1,f2) spine (AR_Category, Instance1, Instance2) --> map (AR_Category, Category) --> f1 (Category, Instance1) --> f2 (Category, Instance2) Also did a bit of tweaking of the MATRIX code itself. This should handle some pretty large problems as the full KP matrix/spine is not actually stored in full. It writes out subsets (one KP matrix for each original matching category) to the active file as they are computed. One immense advantage is there is NO XSAVE to external data sets as required by Jon's extension or Richard's code. Pats self on back...... ----- NEW FILE. DATASET CLOSE ALL. DATA LIST / Category (A16) School (A30). BEGIN DATA Art RI School of Design Art Metropolitan Museum Culinary Johnson & Wales Liberal Arts Oberlin Liberal Arts Antioch Liberal Arts Swarthmore Technical MIT Technical Cal Tech Technical ETH END DATA. SORT CASES BY Category. IF $CASENUM EQ 1 OR Category NE LAG(Category) F1INstance=1. IF MISSING(F1INstance) F1INstance=LAG(F1INstance)+1. DATASET NAME f1. LIST. DATA LIST / Category (A16) Subject (A30) . BEGIN DATA Art Painting Art Drawing Art Sculpture Liberal Arts Literature Liberal Arts Philosophy Psychoceramics Carberry Psychoceramics Library science Technical Calculus Technical Physics Technical Chemistry Technical Engineering END DATA. SORT CASES BY Category. IF $CASENUM EQ 1 OR Category NE LAG(Category) F2INstance=1. IF MISSING(F2INstance) F2INstance=LAG(F2INstance)+1. DATASET NAME f2. LIST. ADD FILES / FILE f1 / FILE f2 / IN=@IN2@. AUTORECODE Category / INTO AR_Category. DATASET NAME stacked. DATASET DECLARE agg. AGGREGATE OUTFILE agg / BREAK @IN2@ Category AR_Category/ NperCat=N . DATASET DECLARE map. DATASET ACTIVATE agg. AGGREGATE OUTFILE map / BREAK Category AR_Category/ NperCat=N . DATASET ACTIVATE agg. AGGREGATE OUTFILE * MODE ADDVARIABLES / BREAK @IN2@ / NCatPerFile=N. MATRIX. GET DATA /FILE * / VARIABLES AR_Category NPerCat NCatPerFile. COMPUTE B1=Data(1:Data(1,3),1:3). COMPUTE B2=Data((Data(1,3)+1):NROW(Data),1:3). COMPUTE MaxCat=MMAX(Data(:,1)). COMPUTE C={T({1:MaxCat}),MAKE(MaxCat,3,0)}. LOOP #=1 TO NROW(B1). + COMPUTE C(B1(#,1),2)=B1(#,2). END LOOP. LOOP #=1 TO NROW(B2). + COMPUTE C(B2(#,1),3)=B2(#,2). END LOOP. LOOP #=1 TO MaxCat. + COMPUTE C(#,2:3) = CMAX ({C(#,2),C(#,3);1,1}). + COMPUTE C(#,4)=C(#,2) * C(#,3). + SAVE ({MAKE(C(#,4),1,C(#,1)), KRONEKER(T({1:C(#,2)}),MAKE(C(#,3),1,1)), KRONEKER(MAKE(C(#,2),1,1),T({1:C(#,3)}))}) / OUTFILE * / VARIABLES AR_Category F1Instance F2Instance. END LOOP. END MATRIX. /* Merge 'spine' into map file to link to source files */. MATCH FILES FILE * / TABLE map / BY AR_Category . MATCH FILES FILE * / TABLE f1 / BY Category F1Instance. SORT CASES BY AR_Category F2Instance. MATCH FILES FILE * / TABLE f2 / BY Category F2Instance. DATASET NAME MergedCartProd. EXECUTE. DATASET CLOSE agg. DATASET CLOSE map. DATASET CLOSE stacked. DATASET ACTIVATE MergedCartProd. DELETE VARIABLES F1Instance F2Instance AR_Category NperCat. LIST. Category School Subject Art RI School of Design Painting Art Metropolitan Museum Painting Art RI School of Design Drawing Art Metropolitan Museum Drawing Art RI School of Design Sculpture Art Metropolitan Museum Sculpture Culinary Johnson & Wales Liberal Arts Oberlin Literature Liberal Arts Antioch Literature Liberal Arts Swarthmore Literature Liberal Arts Oberlin Philosophy Liberal Arts Antioch Philosophy Liberal Arts Swarthmore Philosophy Psychoceramics Carberry Psychoceramics Library science Technical MIT Calculus Technical Cal Tech Calculus Technical ETH Calculus Technical MIT Physics Technical Cal Tech Physics Technical ETH Physics Technical MIT Chemistry Technical Cal Tech Chemistry Technical ETH Chemistry Technical MIT Engineering Technical Cal Tech Engineering Technical ETH Engineering Number of cases read: 27 Number of cases listed: 27 ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/MATRIX-code-to-create-spine-for-R-Ristows-Macro-Mny2Mny-for-many-to-many-merge-and-Cartesian-product-tp5724820p5724853.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
"My impression was that the matrix-end matrix command set was part of the 'base'. Am I wrong?"
You are correct: MATRIX is part of the base. eg: data list free / x y. begin data 1 2 3 4 5 6 end data. MATRIX. GET data / FILE * /VARIABLES x y . PRINT data. END MATRIX. Run MATRIX procedure: DATA 1 2 3 4 5 6 ------ END MATRIX -----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by Maguin, Eugene
For your first problem, consider this from the manual --
• A matrix variable name follows the same rules as those applicable to an ordinary IBM® SPSS® Statistics variable name, except matrix variable names cannot exceed 8 bytes. For your problem with the test program... I see "get x" which names the matrix x, and also a variable x. Would that mess it up? -- Rich Ulrich > Date: Thu, 13 Mar 2014 14:42:52 +0000 > From: [hidden email] > Subject: Re: Revised: Macro !Mny2Mny, for many-to-many merge and Cartesian product > To: [hidden email] > > I'm continuing to have trouble with the code, which is the new version. The problem occurs here > > MATRIX. > GET DATA /FILE * / VARIABLES AR_Category NPerCat NCatPerFile. > > And the message is > Line: 58 Command: Matrix Information: Matrix syntax error: unrecognized statement name. is it spelled correctly? > > > So I ran a little Input program job to create a data file with three numeric variables and a few hundred cases and then wrote > > matrix. > get x/file */variables x y. > end matrix. > > I got the same message as before when I ran it and the following in the output file. My impression was that the matrix-end matrix command set was part of the 'base'. Am I wrong? > > Run MATRIX procedure: > > >Error # 12546 in column 1. Text: _SLINE > >MATRIX syntax error: unrecognized statement name. Is it spelled correctly? > >Execution of this command stops. > Scan error detected in parser. > > The MATRIX statement skipped. > > > >Error # 12546 in column 1. Text: _SLINE > >MATRIX syntax error: unrecognized statement name. Is it spelled correctly? > >Execution of this command stops. > Scan error detected in parser. > > The MATRIX statement skipped. > > > ------ END MATRIX ----- > > > > > -----Original Message----- |
Administrator
|
No, that would be fine as can be seen here.
the VARIABLES clause refers to the variables in the SPSS data file. GET x assigns the contents of whatever is in the SPSS variables x and y to the matrix object x. Do wish they would up the limit on MATRIX object names to something reasonable ( > 8 chars). data list free / x y. begin data 1 2 3 4 5 6 end data. MATRIX. GET x / FILE * /VARIABLES x y . PRINT x. END MATRIX. Run MATRIX procedure: X 1 2 3 4 5 6 ------ END MATRIX -----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
The problem is my not/mis-understanding how it, matrix-end matrix works. I selected/executed the first two lines (Matrix. Get ...) rather than the whole Matrix-end matrix sequence. When I did the first two lines I got the error I earlier posted about. When i did the whole sequence, I didn’t get the error. I haven't yet ran the whole matrix-end matrix sequence in David's code but I'll bet it'll work when I do.
Gene Maguin -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of David Marso Sent: Thursday, March 13, 2014 12:46 PM To: [hidden email] Subject: Re: Revised: Macro !Mny2Mny, for many-to-many merge and Cartesian product No, that would be fine as can be seen here. the VARIABLES clause refers to the variables in the SPSS data file. GET x assigns the contents of whatever is in the SPSS variables x and y to the matrix object x. Do wish they would up the limit on MATRIX object names to something reasonable ( > 8 chars). data list free / x y. begin data 1 2 3 4 5 6 end data. MATRIX. GET x / FILE * /VARIABLES x y . PRINT x. END MATRIX. Run MATRIX procedure: X 1 2 3 4 5 6 ------ END MATRIX ----- Rich Ulrich wrote > For your first problem, consider this from the manual -- > > • A matrix variable name follows the same rules as those applicable to > an ordinary IBM® SPSS® Statistics variable name, except matrix > variable names cannot exceed 8 bytes. > > For your problem with the test program... I see "get x" which names > the matrix x, and also a variable x. Would that mess it up? > > -- > Rich Ulrich > >> Date: Thu, 13 Mar 2014 14:42:52 +0000 >> From: > emaguin@ >> Subject: Re: Revised: Macro !Mny2Mny, for many-to-many merge >> and Cartesian product >> To: > SPSSX-L@.UGA >> >> I'm continuing to have trouble with the code, which is the new version. >> The problem occurs here >> >> MATRIX. >> GET DATA /FILE * / VARIABLES AR_Category NPerCat NCatPerFile. >> >> And the message is >> Line: 58 Command: Matrix Information: Matrix syntax error: >> unrecognized statement name. is it spelled correctly? >> >> >> So I ran a little Input program job to create a data file with three >> numeric variables and a few hundred cases and then wrote >> >> matrix. >> get x/file */variables x y. >> end matrix. >> >> I got the same message as before when I ran it and the following in >> the output file. My impression was that the matrix-end matrix command >> set was part of the 'base'. Am I wrong? >> >> Run MATRIX procedure: >> >> >Error # 12546 in column 1. Text: _SLINE MATRIX syntax error: >> >unrecognized statement name. Is it spelled >> correctly? >> >Execution of this command stops. >> Scan error detected in parser. >> >> The MATRIX statement skipped. >> >> >> >Error # 12546 in column 1. Text: _SLINE MATRIX syntax error: >> >unrecognized statement name. Is it spelled >> correctly? >> >Execution of this command stops. >> Scan error detected in parser. >> >> The MATRIX statement skipped. >> >> >> ------ END MATRIX ----- >> >> >> >> >> -----Original Message----- > [snip] Rich Ulrich wrote > For your first problem, consider this from the manual -- > > • A matrix variable name follows the same rules as those applicable to > an ordinary IBM® SPSS® Statistics variable name, except matrix > variable names cannot exceed 8 bytes. > > For your problem with the test program... I see "get x" which names > the matrix x, and also a variable x. Would that mess it up? > > -- > Rich Ulrich > >> Date: Thu, 13 Mar 2014 14:42:52 +0000 >> From: > emaguin@ >> Subject: Re: Revised: Macro !Mny2Mny, for many-to-many merge >> and Cartesian product >> To: > SPSSX-L@.UGA >> >> I'm continuing to have trouble with the code, which is the new version. >> The problem occurs here >> >> MATRIX. >> GET DATA /FILE * / VARIABLES AR_Category NPerCat NCatPerFile. >> >> And the message is >> Line: 58 Command: Matrix Information: Matrix syntax error: >> unrecognized statement name. is it spelled correctly? >> >> >> So I ran a little Input program job to create a data file with three >> numeric variables and a few hundred cases and then wrote >> >> matrix. >> get x/file */variables x y. >> end matrix. >> >> I got the same message as before when I ran it and the following in >> the output file. My impression was that the matrix-end matrix command >> set was part of the 'base'. Am I wrong? >> >> Run MATRIX procedure: >> >> >Error # 12546 in column 1. Text: _SLINE MATRIX syntax error: >> >unrecognized statement name. Is it spelled >> correctly? >> >Execution of this command stops. >> Scan error detected in parser. >> >> The MATRIX statement skipped. >> >> >> >Error # 12546 in column 1. Text: _SLINE MATRIX syntax error: >> >unrecognized statement name. Is it spelled >> correctly? >> >Execution of this command stops. >> Scan error detected in parser. >> >> The MATRIX statement skipped. >> >> >> ------ END MATRIX ----- >> >> >> >> >> -----Original Message----- > [snip] ----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/MATRIX-code-to-create-spine-for-R-Ristows-Macro-Mny2Mny-for-many-to-many-merge-and-Cartesian-product-tp5724820p5724869.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
In reply to this post by David Marso
At 11:24 AM 3/12/2014, David Marso wrote:
>One immense advantage is there is NO XSAVE to external data sets as required >by Jon's extension or Richard's code. > >Pats self on back...... By all means pat yourself on the back, but my macro did NOT do an XSAVE; it does two MATCH FILES. Read the code, or the documentation I posted, or both. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
Hi Richard,
I took another look at your code and indeed, you do not XSAVE to disk. I apologize for that misattribution. Other than that doesn't the MATRIX approach seem a bit more intuitive than the double VARSTOCASES? David
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
At 04:41 PM 3/13/2014, David Marso wrote:
>Other than that doesn't the MATRIX approach seem a bit more >intuitive than the double VARSTOCASES? I don't know whether it's more intuitive -- the MATRIX syntax takes a lot of hard reading! But MATRIX is definitely the better method, because it doesn't require that the maximum size of an incoming group be specified in the code, or as a macro parameter. That's why I asked if you could work out the MATRIX logic for the spine; and I thank you for doing so. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
Administrator
|
I took a few minutes to add commentary (WTF is going on assurance) to the code .
Should make the reading of the code a bit easier. Here is just the MATRIX piece (slightly modified from previous version). MATRIX. GET DATA /FILE */VARIABLES AR_Category NPerCat NCatPerFile. /* Read first file's index variables into B1 */. COMPUTE B1=Data(1:Data(1,3),:). /* Read second file's index variables into B2 */. COMPUTE B2=Data((Data(1,3)+1):NROW(Data),:). /* Store largest value of the key variable in MaxCat */. COMPUTE MaxCat=MMAX(Data(:,1)). /* Create placeholder matrix indexed by category value */. COMPUTE C={T({1:MaxCat}),MAKE(MaxCat,2,0)}. /* Store number of category instances from each file in C(:,2:3) */. LOOP #=1 TO NROW(B1). + COMPUTE C(B1(#,1),2)=B1(#,2). END LOOP. LOOP #=1 TO NROW(B2). + COMPUTE C(B2(#,1),3)=B2(#,2). END LOOP. /* Loop over each category and build KP and save to active file */. LOOP #=1 TO MaxCat. /* Need to set zeroes to 1 to build single row for missing categories */. + COMPUTE C(#,2:3) = CMAX ({C(#,2),C(#,3);1,1}). /* Example given C(1,:)={1,2,3}, */. /* Result saved to file is {1,1,1; 1,1,2; 1,1,3; 1,2,1; 1,2,2; 1,2,3} */. SAVE ({MAKE(C(#,2)*C(#,3),1,C(#,1)), KRONEKER(T({1:C(#,2)}) ,MAKE(C(#,3),1,1)), KRONEKER(MAKE(C(#,2),1,1),T({1:C(#,3)}))}) / OUTFILE * / VARIABLES AR_Category F1Instance F2Instance. END LOOP. END MATRIX.
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
Free forum by Nabble | Edit this page |