Re: Macro !Mny2Mny, for many-to-many merge and Cartesian product


Richard Ristow
Discussion:

I've tried to make !Mny2Mny clear for general use; for that reason,
I've kept its syntax and behavior as close to those of MATCH FILES as
its different purpose allows. (And as close as the macro language
allows: "BY" is a reserved word in the macro language and can't be
used as an argument name; hence "BYvar" instead.)

Implementing the same logic as an extension command would clearly be
preferable; unfortunately, I'm running an older version of SPSS that
doesn't have the EXTENSION command. Advantages of implementing it as
an extension would include:
. more flexible command syntax;
. handling semantic (run-time) errors, such as named inputs that don't exist;
. dynamically choosing names for scratch variables and datasets, so
there wouldn't have to be reserved names;
. dynamically setting the value of MaxGrps based on the inputs.

Like Jon Peck's STATS CARTPROD(1), !Mny2Mny uses standard SPSS
features; both work with variables of all types, and retain all
variable information. I thought a supplement to STATS CARTPROD would
still be useful; partly to track MATCH FILES more closely, but mainly
to handle the very common case of joins (combining records that match
on a key value) as well as Cartesian products (matching all pairs of
records from the inputs).
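
To make the distinction concrete, here is a minimal sketch in Python
(not SPSS syntax, and not the macro's own logic). The function names
and sample records are hypothetical. The sketch drops records whose key
has no match in the other input, which is an assumption made for
brevity, not a statement of how !Mny2Mny treats them.

```python
from itertools import product

def cartesian(left, right):
    """Cartesian product: pair every record of `left` with every record of `right`."""
    return list(product(left, right))

def join_by_key(left, right, key):
    """Many-to-many join: pair only those records whose `key` values are equal."""
    # Group the right-hand records by key value for quick lookup.
    by_key = {}
    for rec in right:
        by_key.setdefault(rec[key], []).append(rec)
    # Assumption: left records with no matching key are dropped.
    return [(a, b) for a in left for b in by_key.get(a[key], [])]

# Hypothetical sample data: key 1 occurs twice in each input.
left = [{"id": 1, "x": "a"}, {"id": 1, "x": "b"}, {"id": 2, "x": "c"}]
right = [{"id": 1, "y": "p"}, {"id": 1, "y": "q"}, {"id": 3, "y": "r"}]

print(len(cartesian(left, right)))          # 3 * 3 = 9 pairs
print(len(join_by_key(left, right, "id")))  # 2 * 2 = 4 pairs, all for id 1
```

The Cartesian product is the special case of the join in which every
record carries the same key value.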

A many-to-many merge requires, in general, making multiple copies of
the records from all inputs. SPSS has two tools for duplicating
records: using XSAVE within a LOOP; and making the records to be
duplicated a TABLE input to MATCH FILES. STATS CARTPROD uses two
LOOP/XSAVEs; !Mny2Mny uses two MATCH FILES/TABLE=.

This time, I used MATCH FILES/TABLE= mainly to simplify use: it
doesn't need a scratch disk file. It does require creating
sequence-numbered copies of *both* inputs, plus a file I call the
'spine': it contains only the identifying keys of every record
that will appear in the output, and it is the /FILE= input for the two
MATCH FILES/TABLE= commands. I use VARSTOCASES logic to create the 'spine'.
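
The 'spine' approach can be sketched in Python as follows. This is an
illustration of the idea only, under stated assumptions: the function
names are hypothetical, plain nested loops stand in for the VARSTOCASES
step, dictionary lookups stand in for the two MATCH FILES/TABLE=
commands, and keys found in only one input are dropped.

```python
def number_within_key(records, key):
    """Number records 1, 2, ... within each key value.

    Returns a lookup table {(key value, sequence number): record}
    and the count of records per key value.
    """
    counts = {}
    table = {}
    for rec in records:
        k = rec[key]
        counts[k] = counts.get(k, 0) + 1
        table[(k, counts[k])] = rec
    return table, counts

def build_spine(counts_a, counts_b):
    """One (key, seq_a, seq_b) triple for every record of the output."""
    spine = []
    # Assumption: only keys present in both inputs contribute records.
    for k in sorted(counts_a.keys() & counts_b.keys()):
        for i in range(1, counts_a[k] + 1):
            for j in range(1, counts_b[k] + 1):
                spine.append((k, i, j))
    return spine

def many_to_many(a, b, key):
    """Join `a` and `b` on `key` by way of a spine and two table lookups."""
    table_a, counts_a = number_within_key(a, key)
    table_b, counts_b = number_within_key(b, key)
    spine = build_spine(counts_a, counts_b)
    # Each lookup plays the role of one table match against the spine.
    return [{**table_a[(k, i)], **table_b[(k, j)]} for k, i, j in spine]

# Hypothetical sample data.
a = [{"id": 1, "x": "a"}, {"id": 1, "x": "b"}, {"id": 2, "x": "c"}]
b = [{"id": 1, "y": "p"}, {"id": 1, "y": "q"}, {"id": 2, "y": "r"},
     {"id": 3, "y": "s"}]

result = many_to_many(a, b, "id")
print(len(result))  # 2 * 2 for id 1, plus 1 * 1 for id 2 = 5
```

Because the spine holds only keys and sequence numbers, each input
record is stored once and copied only when it is looked up.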

For future implementation, LOOP/XSAVE logic probably has an edge; it
would have a clear edge, if XSAVE could write to datasets. The logic
in STATS CARTPROD supports only the Cartesian product, rather than
matching by key variables; but it could be modified to match by keys.
===============
(1) See posting
Date:     Tue, 4 Mar 2014 14:32:29 -0700
From:     Jon K Peck <[hidden email]>
Subject:  News from the SPSS Community
To:       [hidden email]
