|
Greetings All:
This is my first attempt at using a listserv, so please forgive any gaffes.

The issue: As part of a research project, I was provided with an SPSS data file with 15 variables, all numeric except one date-formatted variable (mm/dd/yyyy). The data file is what I would refer to as hierarchical. There are approximately 800 cases, but close to 7000 records. Any individual case may have between 1 and 85 records. Each record is not a 'level' per se, but rather refers to an individual event. The file is sorted by a unique 8-number identifier associated with each case, and sorted within cases by the event date (earliest to latest). There is no record number associated with the individual records.

Ideally, I would like to convert the file to a rectangular file. Does anyone have any suggestions?

Best
Jim

====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L. For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
James,
For a beginner your question is pretty well formulated, though some information is missing for a complete response. Your file is, in fact, already a rectangular file: each row is an event and each column is a variable or attribute of those events. When you say you want it to be a rectangular file, you probably mean that you want a file with one row per individual (i.e. 800 rows) instead of one row per event (7000 rows). If that is so, you may in turn mean two or three different things:

1. The 800-individual file may contain all the events of each individual, side by side. If each event is characterized by, say, K variables, you would have to reserve 85 x K columns for each individual (besides the individual's ID). Some individuals may have data in only one event, others in only two, and some in up to 85 events. It would be a very unwieldy file indeed. Frankly, I do not see the usefulness of this arrangement, although it may be useful for some specific purpose.

2. You may obtain SUMMARY MEASURES of all events pertaining to each individual. For instance: the number of events for each individual, the time elapsed between each individual's first and last events, the average value (for each individual) of some variable characterizing the events, and so on. This can be done quite easily with the AGGREGATE command, using the individual ID as the break variable and defining the summary (aggregate) variables that you desire.

3. You do not say anything about it, but you may also have information ABOUT INDIVIDUALS (age, sex, occupation and so on) besides information about the events. This information is relevant for all events pertaining to a certain individual. If you have that information, where is it? Is it in a special record for the individual, or is it repeated for every particular event pertaining to each individual?
In the first case (a special record for each individual), you may want to select the individuals' records into a separate file and then merge them with the rest, assigning the information about each individual to all events pertaining to that individual, so you can analyze relationships between individual properties and event properties (do older individuals have more severe events than the young? Do women have more events than men?). This requires a combination of the SELECT IF, SAVE and MATCH FILES commands (the last with its TABLE subcommand), on which detailed advice could be given if needed.

Hope this helps you to clarify exactly what you want.

Hector

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of James Wilson
Sent: 29 October 2008 12:02
To: [hidden email]
Subject: Reading in Hierarchical Data
|
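The summary-measures option (point 2 in Hector's reply) can be illustrated outside SPSS. The following is a minimal Python sketch of the same idea, not SPSS syntax and not anyone's actual data: the field names (`case_id`, `event_date`, `severity`) and the sample values are assumptions invented for illustration. It groups event records by case ID, which plays the role of the break variable, and computes the number of events, the days between first and last event, and the mean of one event variable.

```python
from collections import defaultdict
from datetime import date

# Hypothetical long-format records: (case_id, event_date, severity).
# All names and values are invented for illustration only.
events = [
    (10000001, date(2008, 1, 5), 2.0),
    (10000001, date(2008, 3, 9), 4.0),
    (10000002, date(2008, 2, 1), 1.0),
    (10000001, date(2008, 6, 30), 3.0),
]


def summarize(records):
    """Return one summary dict per case_id (the grouping key)."""
    groups = defaultdict(list)
    for case_id, event_date, severity in records:
        groups[case_id].append((event_date, severity))

    summary = {}
    for case_id, rows in groups.items():
        dates = [d for d, _ in rows]
        values = [v for _, v in rows]
        summary[case_id] = {
            "n_events": len(rows),
            # Days elapsed between the earliest and latest event.
            "days_first_to_last": (max(dates) - min(dates)).days,
            "mean_severity": sum(values) / len(values),
        }
    return summary


per_case = summarize(events)
```

The result has one entry per case rather than one per event, which is the "one row per individual" shape discussed above, without the many mostly empty columns of the side-by-side layout.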
|
At 10:01 AM 10/29/2008, James Wilson wrote:
>[I have] an SPSS data file [that] I would refer to as hierarchical.
>There are approximately 800 cases, but close to 7000 records. Any
>individual case may have between 1 and 85 records. Each record is
>not a 'level' per se, but rather refers to individual events. The
>file is sorted by a unique 8-number identifier associated with each
>case, and sorted within cases by the event date (earliest to latest).

See Hector Maletta's excellent remarks.

It is very common to have situations like this: many individuals, each of whom has multiple events recorded, the number varying from individual to individual. A classic case is medical records, with the history of office visits for each patient. The term 'hierarchical' is accurate, though we don't seem to be using it often.

>[In this file], each record is not a 'level' per se, but rather
>refers to individual events.

That is commonly called 'long' data organization. The alternative is one record per individual, with multiple groups of variables, one group for each event; that is called 'wide' organization.

You write,

>I would like to convert the file to a rectangular file.

As Hector wrote, it's not clear what you mean. But you may mean that you want to convert your file from long to wide organization. If so,

* The SPSS command CASESTOVARS does precisely that. See the Command Syntax Reference; or, from the menus, select Data > Restructure > Restructure selected cases into variables

* However, it may not be a good idea. Many SPSS analyses, and most data manipulations, are easier in a long-form file.

Notably, it's far easier to calculate individual summary statistics from a long-form file. The main SPSS command for this is AGGREGATE; or, from the menus, Data > Aggregate... (AGGREGATE is one command for which writing syntax is usually easier than using the menus.)

Good luck! And, as Hector already invited, post further with follow-up questions.
-Richard Ristow

A side issue:

>[The file has] 15 variables, all are numeric except one date
>formatted variable (mm/dd/yyyy).

*Probably* that is an SPSS date variable. If so, it has no inherent format. It can be displayed as 'mm/dd/yyyy', and it sounds like its format has been specified so that it is. But it can also be displayed in any of the other formats in which SPSS can display dates, without changing its underlying values.
|
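To make the long versus wide distinction concrete, here is a minimal Python sketch of a long-to-wide restructuring. It is an illustration of the idea only, not SPSS syntax and not a description of how CASESTOVARS is implemented; the field names (`case_id`, `event_date`, `severity`), the numbered column names and the sample values are all assumptions invented for the example. Events are numbered within each case by date, since the original file carries no record number.

```python
from collections import defaultdict
from datetime import date

# Hypothetical long-format records: (case_id, event_date, severity).
# All names and values are invented for illustration only.
long_rows = [
    (10000001, date(2008, 1, 5), 2.0),
    (10000001, date(2008, 3, 9), 4.0),
    (10000002, date(2008, 2, 1), 1.0),
]


def long_to_wide(rows):
    """Build one dict per case with numbered column groups (date_1, severity_1, ...)."""
    by_case = defaultdict(list)
    for case_id, event_date, severity in rows:
        by_case[case_id].append((event_date, severity))

    # The case with the most events determines how many column groups are needed.
    max_events = max(len(evts) for evts in by_case.values())

    wide = []
    for case_id in sorted(by_case):
        evts = sorted(by_case[case_id])  # order events by date within the case
        row = {"case_id": case_id}
        for i in range(max_events):
            # Cases with fewer events get empty (None) cells in the extra groups.
            event_date, severity = evts[i] if i < len(evts) else (None, None)
            row[f"date_{i + 1}"] = event_date
            row[f"severity_{i + 1}"] = severity
        wide.append(row)
    return wide


wide_rows = long_to_wide(long_rows)
```

Every case receives as many column groups as the busiest case, so with up to 85 events per case most of those cells would be empty for most cases. That is the unwieldiness both replies warn about.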
