Below is typical VARSTOCASES syntax to unroll a dataset from 'wide' to
'long' organization:

>  ID Brand1 B1Q1 B1Q2 Brand2 B2Q1 B2Q2 Age Gender
>
>   1    5     1    2     4     2    1   25    1
>   2    4     2    2     3     2    1   26    2
>   3    3     1    2     1     1    1   27    1
>
>Number of cases read:  3    Number of cases listed:  3
>
>VARSTOCASES /MAKE Brand FROM Brand1 Brand2
>            /MAKE Q1    FROM B1Q1   B2Q1
>            /MAKE Q2    FROM B1Q2   B2Q2
>            /KEEP = ID Age Gender
>            /NULL = KEEP.
>
>  ID Age Gender Brand Q1 Q2
>
>   1  25    1     5    1  2
>   1  25    1     4    2  1
>   2  26    2     4    2  2
>   2  26    2     3    2  1
>   3  27    1     3    1  2
>   3  27    1     1    1  1
>
>Number of cases read:  6    Number of cases listed:  6

In the 'wide' data, variables Brand1, B1Q1, and B1Q2 are logically
parallel to Brand2, B2Q1, and B2Q2. Each set of three has corresponding
data for one brand, and each set becomes a record in the 'long' file.

But unrolling them takes a separate '/MAKE' for each set of logically
equivalent variables (Brand, Q1 and Q2 in the output). And 'TO' can't
be used for the variable lists, since the variables in each set are not
adjacent in the file; every one of the variables must be named
individually. It's not very elegant, and it would be genuinely awkward
if there were many more than two of the groups.

Analogous to MRSETS, which records one kind of relationship among
variables as a dataset attribute, I'd like to see something like this
(paralleling MRSETS syntax) for 'wide' relationships like the above:

MVGROUPS NAME=(Brand,Q1,Q2)
         VARIABLES= Brand1 TO B2Q2
         LABELS='Brand' 'Quality measure 1' 'Quality measure 2'
   or    LABELSOURCE = VARLABEL
         FORMATSOURCE = VARFORMAT
   or    FORMATS = (F3,F2,F2)

*  VARIABLES may only specify a 'TO' list
*  The number of variables on the list must be an exact multiple of
   the number of groups named
*  Every corresponding variable on the TO list (every 'nth' variable,
   if 'n' groups are named) must be the same type; if string, all must
   be the same length.

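The rules above amount to dealing the TO list out round-robin among the group names. Here is a minimal Python sketch of that reading. It is not anything SPSS implements: `split_groups` and its arguments are hypothetical names, and the type check is simplified to comparing format labels such as 'F3' or 'A10', so two strings of different lengths count as different types.

```python
def split_groups(names, variables, types=None):
    """Deal a contiguous variable list out among `names`, round-robin.

    names     -- group names, e.g. ["Brand", "Q1", "Q2"]
    variables -- variables in file order (what the TO list would expand to)
    types     -- optional dict of variable name -> format label (assumed form)
    """
    n = len(names)
    if n == 0 or len(variables) % n != 0:
        raise ValueError(
            "variable count must be an exact multiple of the group count")
    # Group i takes every n-th variable, starting at position i.
    groups = {name: variables[i::n] for i, name in enumerate(names)}
    if types is not None:
        for name, members in groups.items():
            if len({types[v] for v in members}) > 1:
                raise ValueError(f"group {name} mixes variable types")
    return groups


# The 'wide' variables from the listing, in file order.
wide_vars = ["Brand1", "B1Q1", "B1Q2", "Brand2", "B2Q1", "B2Q2"]
groups = split_groups(["Brand", "Q1", "Q2"], wide_vars)
print(groups)
```

A list whose length is not a multiple of the number of names, or a group that mixes formats, is rejected with a `ValueError` rather than silently truncated.
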
EFFECT: The above specifies three variable groups:

   Brand  vars  Brand1 Brand2
   Q1     vars  B1Q1   B2Q1
   Q2     vars  B1Q2   B2Q2

USES:

*  The name of a group may be used in syntax where a list of variables
   would be accepted, and expands to the set of variables:

MVGROUPS NAME=(Brand,Q1,Q2)
         VARIABLES= Brand1 TO B2Q2.

DO REPEAT M1 = Q1
         /M2 = Q2.
.  IF MISSING (M2) M2 = M1.
END REPEAT.

VARSTOCASES /MAKE Brand FROM Brand
            /MAKE Q1    FROM Q1
            /MAKE Q2    FROM Q2
            /KEEP = ID Age Gender
            /NULL = KEEP.

In both of these cases, the MVGROUPS syntax would be much more compact
and readable than the direct syntax, if there were many more than two
sets of the variables.

*  A group may be indexed as a vector is. This relaxes the rule that
   all elements of a vector must be contiguous. However, since all
   elements of any group are equally spaced in the dataset, it requires
   only a small extension to vector indexing logic.
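
The equal-spacing point can be made concrete. In this Python sketch, element k of a group sits at a fixed stride from the group's first variable, so indexing needs only an offset and a stride rather than contiguity. The names `group_element` and `expand_make` are hypothetical and only illustrate how the proposed MVGROUPS might resolve; they are not existing SPSS behavior.

```python
# Variables in file order, and the group names from the MVGROUPS example.
wide_vars = ["Brand1", "B1Q1", "B1Q2", "Brand2", "B2Q1", "B2Q2"]
group_names = ["Brand", "Q1", "Q2"]


def group_element(group, k):
    """Return the variable that group(k) would refer to (k is 1-based,
    as with SPSS vectors)."""
    stride = len(group_names)           # groups are interleaved
    offset = group_names.index(group)   # position of the group's first variable
    count = len(wide_vars) // stride    # number of elements in each group
    if not 1 <= k <= count:
        raise IndexError(f"{group}({k}) is out of range 1..{count}")
    return wide_vars[offset + (k - 1) * stride]


def expand_make(group):
    """Expand a group name into the /MAKE subcommand it would stand for."""
    count = len(wide_vars) // len(group_names)
    members = [group_element(group, k) for k in range(1, count + 1)]
    return f"/MAKE {group} FROM {' '.join(members)}"


for name in group_names:
    print(expand_make(name))
```

An ordinary contiguous vector is then the special case where the stride is 1.
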