Rectangularize dataset


Javier Meneses
Hi, in Stata there is a command named fillin that adds observations with missing data so that all interactions of varlist exist, thus making a complete rectangularization of varlist. fillin also adds the variable _fillin to the dataset. _fillin is 1 for observations created by fillin and 0 for previously existing observations.

How can I get this in SPSS?

Thanks
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
Re: Rectangularize dataset

David Marso
Administrator
See VARSTOCASES, AGGREGATE, CASESTOVARS and FLIP (in that order) to restructure your data into one column for each variable of interest and one row for each category.
Then enter MATRIX mode and use the KRONEKER function, iterating over columns, to generate all combinations (search this list for examples of creating Cartesian products).
Finally, use ADD FILES to combine the resulting file with your raw data and you are done.
I am leaving the specific code details to you to experiment with.
Feel free to post back with questions if you get stuck.
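
For readers who want to check their result against the behaviour described in the question, here is a minimal Python sketch of what "rectangularizing" means. It is only an illustration of the target output, not SPSS syntax and not Stata's implementation; the function name `fillin`, the sample variables `a`, `b`, `y`, and the use of `None` for missing values are assumptions made for the example.

```python
from itertools import product


def fillin(rows, keys):
    """Return the original rows plus one new row per missing key combination.

    rows: list of dicts (one per observation); keys: variables to rectangularize.
    """
    # Distinct observed values of each key variable.
    levels = [sorted({r[k] for r in rows}) for k in keys]
    # Key combinations that already exist in the data.
    seen = {tuple(r[k] for k in keys) for r in rows}
    # Non-key variables are set to missing (None) in generated rows.
    other = [c for c in rows[0] if c not in keys] if rows else []

    out = [dict(r, _fillin=0) for r in rows]  # existing observations
    for combo in product(*levels):
        if combo not in seen:
            new = dict(zip(keys, combo))
            new.update({c: None for c in other})
            new["_fillin"] = 1  # flag generated observations
            out.append(new)
    return out


data = [
    {"a": 1, "b": 1, "y": 10.0},
    {"a": 1, "b": 2, "y": 11.0},
    {"a": 2, "b": 1, "y": 12.0},
]
filled = fillin(data, ["a", "b"])
```

In this toy data only the combination a=2, b=2 is absent, so `filled` holds the three original rows with `_fillin` 0 and one generated row with `y` missing and `_fillin` 1.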
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
Re: Rectangularize dataset

David Marso
Administrator
This is what I had in mind ;-)
DEFINE !fillin (!POS !CMDEND).
DATASET NAME @D@.

/* Enable restoration of original file order */.
COMPUTE @origorder=$CASENUM.
SORT CASES BY !1.

/* Make copy of data for butchery */.
DATASET COPY @CPD@.
DATASET ACTIVATE @CPD@.

/* Go wide to long */.
VARSTOCASES / MAKE @ FROM !1/ INDEX=CASE_LBL(@).

/* Determine existing sets of values */.
AGGREGATE OUTFILE */ BREAK  CASE_LBL @    /@@=N.

/* Build categories X variables table */.
CASESTOVARS ID=CASE_LBL /DROP @@.
FLIP .

/* Build matrix of combinations */.
MATRIX.
GET @/FILE * / VARIABLES !1 /MISSING=-999999.
COMPUTE @@=@(:,NCOL(@)).
LOOP #=NCOL(@)-1 TO 1 BY -1.
COMPUTE @@={KRONEKER(@(:,#),MAKE(NROW(@@),1,1)),KRONEKER(MAKE(NROW(@(:,#)),1,1),@@)}.
END LOOP.
SAVE @@ /OUTFILE * / VARIABLES !1.
END MATRIX.



/* Interleave raw and generated cases and create status flags for original and generated cases */.
ADD FILES /FILE @D@ / IN=@D@/FILE=*/IN=@filled@/FIRST=@F@/LAST=@L@/BY !1.

/* Clobber missing values from generated table */.
DO REPEAT @=!1.
SELECT IF (@ NE -999999).
END REPEAT.

/* Retain only desired cases and clean up */.
SELECT IF ANY(1,@D@,@filled@ AND @F@).
MATCH FILES / FILE * / DROP @D@ @F@ @L@.
DATASET NAME @filled@.

DATASET CLOSE @CPD@.
!ENDDEFINE.

/* Simulate some data */.
NEW FILE.
DATASET CLOSE ALL.
MATRIX.
SAVE TRUNC(UNIFORM(100000,10)*4) /OUTFILE * /VARIABLES X01 TO X10.
END MATRIX.
SET MPRINT ON PRINTBACK ON.

/* VALID Sample calls */.
!Fillin X01 TO X10 .

!Fillin X01 X02 X03 X04 X05 X06 X07 X08 X09 X10.

!Fillin X01 X02 X04 TO X10 .
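
The Kronecker-product step inside the MATRIX block can be hard to picture, so here is a small Python sketch of the same idea, using plain lists instead of SPSS matrices. The helper names `kron`, `ones`, `hcat` and `all_combinations` are made up for this illustration; `ones(n)` plays the role of `MAKE(n,1,1)`.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def ones(n):
    """n-by-1 column of ones."""
    return [[1] for _ in range(n)]


def hcat(left, right):
    """Join two matrices with the same number of rows side by side."""
    return [l + r for l, r in zip(left, right)]


def all_combinations(columns):
    """Build every combination of values from a list of column vectors.

    Columns are consumed in the order given. Each new column is stretched
    (every value repeated once per existing row) and placed on the left,
    while the rows built so far are tiled once per value of the new column.
    """
    acc = columns[0]
    for col in columns[1:]:
        n_acc, n_col = len(acc), len(col)
        acc = hcat(kron(col, ones(n_acc)), kron(ones(n_col), acc))
    return acc


cols = [[[1], [2]], [[10], [20], [30]]]
combos = all_combinations(cols)
# combos == [[10, 1], [10, 2], [20, 1], [20, 2], [30, 1], [30, 2]]
```

Because each new column is prepended, the column consumed last ends up leftmost and varies slowest, so the rows come out sorted by the leftmost column first. That ordering matters when the generated table is later combined with sorted data using BY.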
