Nested or Hierarchical Data Structures for Multiple SPSS Data Files

Nested or Hierarchical Data Structures for Multiple SPSS Data Files

Kevan Edwards (MDH)

I have several saved SPSS data files, each of which contains information about individuals, but at different levels of observation.  There is a one-to-many relationship among these files. For example, one file contains demographics (1 record per person), another contains utilization of services (1 record for each office visit), and others contain the person’s health history (1 record for each medical condition).  All files share one linkage variable: the person's ID number.

 

My question: is it possible to link my SPSS .sav files, whose records reflect various “levels of observation”, into a “hierarchical” or “nested” file structure using that common key variable (Recipient ID), and if so, how?

 

The “Nested Files” documentation in my SPSS manuals shows how to do it when creating data from scratch (which I would have done had I anticipated how much the data files would grow on this project), but it does not mention building a nested file structure when the related data files are already in SPSS .sav format.

 

I’m trying to avoid writing all the files out to text format and starting over, as I fear losing all my labels and embedded macros.

 

Any suggestions are appreciated.

 

Thanks.

 

Kevan

 

----------------------------

Kevan Edwards M.A., Ph.D.

Research Scientist III
Health Economics Program, DHP

Minnesota Department of Health
PO Box 64975, St. Paul, MN  55164-0975
phone 651-201-3551  fax 651-201-5179
http://www.health.state.mn.us/healtheconomics

 


Re: Nested or Hierarchical Data Structures for Multiple SPSS Data Files

Bruce Weaver
Administrator
Greetings from up the road in NW Ontario.  UCLA's statistical consulting service has some good tutorials on merging files (via MATCH FILES), and on restructuring (via CASESTOVARS and VARSTOCASES).  See the "intermediate data management" section at the following site:

   http://www.ats.ucla.edu/stat/Spss/topics/data_management.htm
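
As one hedged illustration of the restructuring commands mentioned above (untested; the file name health_history.sav is an assumption, and PersonID stands in for the thread's Recipient ID), CASESTOVARS could turn a one-record-per-condition file into one record per person:

```spss
* Assumed file name; PersonID is the common key variable.
GET FILE='health_history.sav'.

* CASESTOVARS expects the cases for each ID to be adjacent, so sort first.
SORT CASES BY PersonID.

* Collapse each person's condition records into a single wide record.
CASESTOVARS
  /ID=PersonID
  /GROUPBY=VARIABLE.
```

Without an /INDEX subcommand, the repeated variables are numbered in the order the records appear within each person. VARSTOCASES goes in the opposite direction.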

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING: 
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Re: Nested or Hierarchical Data Structures for Multiple SPSS Data Files

Richard Ristow
In reply to this post by Kevan Edwards (MDH)
At 03:55 PM 7/29/2009, Kevan Edwards (MDH) wrote:

I have several saved SPSS data files, each of which contains information about individuals at different levels of observation.  There is a one-to-many relationship among these files. For example, one file contains demographics (1 record per person), another contains utilization of services (1 record for each office visit), and others contain the person’s health history (1 record for each medical condition).  All files share one linkage variable: the person's ID number.

My question: is it possible to link my SPSS .sav files, whose records reflect various “levels of observation”, into a “hierarchical” or “nested” file structure using that common key variable (Recipient ID), and if so, how?

You can't get a hierarchical or nested structure in a single SPSS file. But it works fine to have separate files for data at each level of observation; and then summarize (with AGGREGATE) or join (with MATCH FILES) as you need for analysis.
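
As a rough sketch of the summarize-then-join route (untested; the file names demographics.sav and office_visits.sav, the dataset names, and the variable n_visits are assumptions for illustration, not names from the original post), one could count visits per person with AGGREGATE and attach the count to the one-record-per-person file:

```spss
* Assumed file and variable names; adjust to your own data.
GET FILE='office_visits.sav'.
DATASET NAME visits.
SORT CASES BY PersonID.

* One output record per person, holding the number of visit records.
DATASET DECLARE visit_counts.
AGGREGATE
  /OUTFILE='visit_counts'
  /BREAK=PersonID
  /n_visits=N.

* MATCH FILES with BY expects every input to be sorted by the key.
GET FILE='demographics.sav'.
DATASET NAME demog.
SORT CASES BY PersonID.

* Both inputs now have one record per person, so this is a one-to-one match.
MATCH FILES
  /FILE=*
  /FILE='visit_counts'
  /BY PersonID.
EXECUTE.
```

People with no office visits would end up with a system-missing n_visits, which may need recoding to 0 depending on the analysis.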

The “Nested Files” documentation in my SPSS manuals indicates how to do it when creating data from scratch ...

If you look really hard at the documentation, you'll see that it doesn't do that. The 'nested' filetype feature lets you take a file that comes IN nested, and write it with the data from each higher level included on every corresponding record at the lower levels. You could do that from your data, with (untested)

* Both inputs must already be sorted in ascending order of PersonID.
MATCH FILES
  /TABLE=Demographics
  /FILE =Office_Visits
  /BY PersonID.

But it's not a great idea. The resulting file is what, in the database world, is called un-normalized: in this case, breaking the rule that any piece of information (like the person's demographics) should occur in one and only one record.

You're probably fine as you are.  Happy analysis!

-Best wishes,
 Richard
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: Nested or Hierarchical Data Structures for Multiple SPSS Data Files

Reutter, Alex

I think whether this is “not a great idea” depends a bit upon what one plans to do with the merged dataset.  MIXED and GENLIN both use the “long” data structure for repeated measures analyses, in which case each person’s demographics will occur in multiple records.
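
For illustration only, a long-format mixed model might look something like the following sketch (untested). The variable names cost and age are hypothetical, and the model itself is an assumption rather than anything proposed in this thread; the point is only that a person-level covariate such as age is repeated on every visit record.

```spss
* Hypothetical variables: cost (per-visit outcome), age (person-level covariate).
* Data are in long format: one record per office visit, keyed by PersonID.
MIXED cost WITH age
  /FIXED=age | SSTYPE(3)
  /RANDOM=INTERCEPT | SUBJECT(PersonID) COVTYPE(VC)
  /METHOD=REML
  /PRINT=SOLUTION.
```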

 

Cheers,

Alex

 


Re: Nested or Hierarchical Data Structures

Richard Ristow
I'd written,

The 'nested' filetype writes a file with the data from each higher level on every corresponding record at the lower levels. You could do that from your data. But it's not a great idea. The resulting file is what, in the database world, is called un-normalized ...


At 10:29 AM 7/30/2009, Reutter, Alex wrote:

I think whether this is “not a great idea” depends upon what one plans to do with the merged dataset.  MIXED and GENLIN both use the “long” data structure for repeated measures analyses, in which case each person’s demographics will occur in multiple records.

That's correct, and important. Many analyses need the data 'de-normalized' just like that.

The master copy of the data, on disk for permanent use, should indeed be normalized, in however many SPSS files that takes. Then, for analysis, make de-normalized versions as you need, usually with code like

MATCH FILES /FILE=<X>/TABLE=<Y>/BY PersonID.
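
Spelled out a little more fully, that pattern could look like the sketch below (untested). The file names are placeholders assumed for illustration; the visit-level file plays the role of <X> and the person-level file the role of <Y>.

```spss
* Assumed file names; PersonID is the common key.
* Sort the person-level lookup table by the key and save a sorted copy.
GET FILE='demographics.sav'.
SORT CASES BY PersonID.
SAVE OUTFILE='demographics_sorted.sav'.

* Sort the visit-level file by the same key.
GET FILE='office_visits.sav'.
SORT CASES BY PersonID.

* /TABLE spreads each person's demographics across all of that person's visits.
MATCH FILES
  /FILE=*
  /TABLE='demographics_sorted.sav'
  /BY PersonID.

* Keep the result as a separate working file; the normalized masters stay as they are.
SAVE OUTFILE='visits_with_demographics.sav'.
```

Because the inputs are .sav files, variable and value labels come along in the merged file, so nothing needs to be written out to text.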