nested file


Hanna Zaremba
 

Dear colleague,

I wonder whether anyone uses nested files in SPSS. In the near future I will have a huge project whose data concern individuals as well as regions. The regional data will be common to groups of individuals. The problem is to avoid duplicating the same information while still keeping all of it. So far, in such situations, I have simply added the common data (e.g. regional data) to every individual's record. Someone told me that a nested file in SPSS could solve this problem. I found some terse information in the Reference Guide, but it is really not enough to become familiar with the subject. There was also no information on how to do calculations on a hierarchical file (starting with simple frequencies and crosstabulations at the general and regional levels). Any experience, suggestions, or information will be highly appreciated.

Hanna Zaremba

Public Opinion Research Center

Warsaw

Poland
Re: nested file

Beadle, ViAnn
If you are talking about FILE TYPE NESTED, then the lowest-level record type defines a case, and information from the higher-level records is copied to each lowest-level case. There is an example in the SPSS help file and in the reference guide (page 632 in Release 15). Run it and see what you get.

Re: nested file

Richard Ristow
In reply to this post by Hanna Zaremba
At 03:34 AM 5/11/2007, Hanna Zaremba wrote:

>In nearest future I will have data [concerning] individuals as well as
>region. The regional data will be common for group of individuals. The
>problem will be to not multiply the same information and to have all
>information. In such situation I just add common data (e.g. regional
>data) to every individual. Someone told me that Nested File in SPSS
>could solve this problem. I wonder if someone uses Nested File in
>SPSS.

You saw ViAnn Beadle's reference to the documentation for FILE TYPE
NESTED.

The crucial point is that file types apply to *input* files, that
you're reading from an external source, not to the SPSS files you want
to build. So, depending on what your source files look like, FILE TYPE
NESTED may help you; but it has little to do with your real problem.
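
For illustration only, here is a minimal sketch of reading a nested
input file. The layout is an assumption of mine, not your data: a
record-type code in column 1, with type 1 = region record and type 2 =
individual record; RecType, RegnPop, PersonID and Age are made-up names.

```spss
* Sketch only: hypothetical layout with inline data.
* The first RECORD TYPE is the highest level, the last one the lowest.
FILE TYPE NESTED RECORD=RecType 1.
RECORD TYPE 1.
DATA LIST / RegnKey 3-5 RegnPop 7-13.
RECORD TYPE 2.
DATA LIST / PersonID 3-7 Age 9-10.
END FILE TYPE.
BEGIN DATA
1 101 1500000
2 00001 34
2 00002 51
1 102  800000
2 00003 27
END DATA.
LIST.
```

Each type-2 record should become one case, with the values from the
most recent type-1 record spread onto it. In other words, the result is
the de-normalized file, which is why FILE TYPE NESTED doesn't remove
the duplication you're asking about.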

>The regional data will be common for group of individuals. The problem
>will be to not multiply the same information and to have all
>information. In such situation I just add common data (e.g. regional
>data) to every individual.

Right. If you have the regional data in every individual's record, your
database is 'unnormalized' (in database design terminology). Sometimes
that's the best solution, but it's right that you don't like it much.

SPSS is a sort-merge data-management system; you use MATCH FILES where
you'd use a JOIN in SQL. You would create two working saved files, which
I'll call REGIONS and INDIVIDS. There's a variable, which I'll call
RegnKey, that uniquely identifies the regions, and of course is in the
REGIONS records. RegnKey is also in the individual records, and gives
the ID of the region to which the individual belongs.

BOTH MUST BE SORTED IN ASCENDING ORDER BY RegnKey. The file of
individuals may be further sorted on lower-level keys, if desired.
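
A sketch of the sorting step, assuming the two files already exist as
saved SPSS files. The file names are placeholders, and PersonID is a
made-up lower-level key:

```spss
* Hypothetical file names. Substitute your own paths.
FILE HANDLE REGIONS  /NAME='regions.sav'.
FILE HANDLE INDIVIDS /NAME='individs.sav'.

* Sort the regions file by the key and save it.
GET FILE=REGIONS.
SORT CASES BY RegnKey (A).
SAVE OUTFILE=REGIONS.

* Sort individuals by region, then by the assumed person key.
GET FILE=INDIVIDS.
SORT CASES BY RegnKey (A) PersonID (A).
SAVE OUTFILE=INDIVIDS.
```

Defining the file handles once lets later commands refer to the files
simply as REGIONS and INDIVIDS.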

(How you build these files depends on how you receive your data, so I
won't even start making suggestions. It'll be pretty manageable,
though, if your data arrives in any reasonable form at all.)

>And there was no information how you can do calculations on
>hierarchical file (since simple frequentations, crostabulations .. on
>general and regional level).

There are a lot of ways to view a data set like this, but here are the
most basic.

GET FILE=REGIONS.

Now you have a record for every region, whether or not you have any
individuals from that region. You can do any kind of analysis by
regions.

MATCH FILES
    /TABLE=REGIONS
    /FILE =INDIVIDS
    /BY RegnKey.

Now you have a record for every individual, with the data for the
individual's region attached. (That is, the file is 'de-normalized' for
analysis.) There is no record for any region for which you have no
individuals. You can do any kind of analysis by individuals, including
comparing individual characteristics by regional characteristics.
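
For instance, with the merged file as the active dataset, the
frequencies and crosstabulations you asked about might look like the
sketch below. Employed and Income (individual level) and RegnType
(regional level) are made-up variable names; use your own:

```spss
* Assumes the MATCH FILES result above is the active dataset.
* Employed, Income and RegnType are hypothetical variable names.
FREQUENCIES VARIABLES=Employed.
CROSSTABS /TABLES=Employed BY RegnType /CELLS=COUNT COLUMN.
MEANS TABLES=Income BY RegnKey.
```

The CROSSTABS compares an individual characteristic across a regional
one; the MEANS gives a per-region summary of an individual measure.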

One more view, for illustration:

GET FILE=INDIVIDS.
AGGREGATE OUTFILE=*
    /BREAK=RegnKey
    /N_InRegn 'Number of study individuals in region' = NU.
MATCH FILES
    /FILE=REGIONS
    /FILE=*
    /BY RegnKey.
RECODE N_InRegn (MISSING = 0).

Now, once again, you have a record for every region, whether or not you
have any individuals from that region. This time, you also have
variable 'N_InRegn', giving the number of individuals you have from
that region. You can, if you choose, analyze the regions but drop those
for which you have no individuals, or weight by the number of
individuals from the region.
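
A sketch of those two options, assuming the regions file with N_InRegn
is the active dataset. RegnPop is a made-up regional variable:

```spss
* Option 1: drop empty regions, for the next procedure only.
TEMPORARY.
SELECT IF (N_InRegn > 0).
DESCRIPTIVES VARIABLES=RegnPop.

* Option 2: weight each region by its number of study individuals.
WEIGHT BY N_InRegn.
DESCRIPTIVES VARIABLES=RegnPop.
WEIGHT OFF.
```

Note that weighting also leaves out the empty regions, since cases with
a weight of zero are excluded from analysis.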

This doesn't exhaust the possibilities, of course, but you can see the
approach.