Greetings--
Allow me to clarify this question. I have a master file of student enrollment data. Every time a student moves from one school to another, a new record is created that provides a date range for the time the student was enrolled at a particular institution (enrollment date/withdrawal date). There are over 850,000 records for one school year in this file. I have another file that has expulsion and suspension data in it, with a single date for the expulsion/suspension incident. Some kids have multiple instances in one school year, but in different institutions. Some of the enrollment information is missing in this file. I need to pull it out of the other file, but I need only the enrollment record for the student that includes the date of the expulsion/suspension incident. I have some 30 students that are missing this data, representing over 50 records.
I have tried matching the files, but a merge just on ID does not function properly. Some records get matched incorrectly, and I suspect it is because I have variable numbers of records per id in each file. I have worked with other programs where you can create a text file that contains just the id numbers you want to isolate, and then you write a query that uses that text file as a filter so that only the records for those ids are pulled out. My question is whether SPSS has this capability through the syntax language.
Thanks for your thoughts--
Teresa :)
Teresa,
Things are much clearer now than they were in your first message. One small but important discrepancy is whether you do have a common id variable. This morning it sounded like you didn't; now it sounds like you do. I'll assume you do. You're working on data from a school district, I'll bet. I've worked on data like that. And, it's a real joy!
Technically, you have 'many' records in each file with the same id. A regular MATCH FILES won't work because it assumes a one-to-one relationship. If you had a 'one-to-many' relationship you might be able to use the TABLE subcommand. However, you really have the problem of filling in the enrollment dates that bracket the suspension. I think you will have to dig the records out of the enrollment file.
I'd start this way. List out your suspension records that have the missing enrollment info. Use that information to select out the enrollment record for that kid that brackets the suspension date. However, I'd suggest that you use a PRINT statement to structure your output and print all the enrollment records for that kid. I noticed that kids could be suspended and transferred to a new school on the same day or, even more interesting, between the time they left one school and entered another. Probably clerical error. District doesn't have the money to groom the data. This also means that you will have to construct a series of IF statements to insert the correction back into the suspension file.
50 cases is a lot, but you must have 70 or 80 thousand unique kids in that file. So, not very many problems. Data is pretty clean.
Gene Maguin
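For anyone who wants to see the bracketing logic spelled out, here is a minimal sketch in Python rather than SPSS syntax. It only illustrates the selection rule discussed above and is not a drop-in solution. The field names (`id`, `school`, `enroll`, `withdraw`, `incident`), the sample records, and the choice to treat both ends of the date range as inclusive are all assumptions made for the example. Only ids that appear in the incident list are ever looked up, which plays the role of the id filter file.

```python
from datetime import date

# Hypothetical sample data; field names and values are assumptions,
# not the layout of the real enrollment or suspension files.
enrollments = [
    {"id": 101, "school": "A", "enroll": date(2008, 8, 25), "withdraw": date(2008, 11, 14)},
    {"id": 101, "school": "B", "enroll": date(2008, 11, 17), "withdraw": date(2009, 6, 5)},
    {"id": 202, "school": "C", "enroll": date(2008, 8, 25), "withdraw": date(2009, 6, 5)},
]
incidents = [
    {"id": 101, "incident": date(2008, 10, 3)},   # inside the school A spell
    {"id": 101, "incident": date(2008, 11, 15)},  # in the gap between A and B
    {"id": 202, "incident": date(2009, 1, 12)},   # inside the school C spell
]


def index_by_id(records):
    """Group records by student id so each lookup touches only that student's rows."""
    by_id = {}
    for rec in records:
        by_id.setdefault(rec["id"], []).append(rec)
    return by_id


def bracketing_enrollments(incident, by_id):
    """Return every enrollment spell for the student whose date range
    contains the incident date (both ends inclusive, by assumption)."""
    return [
        e for e in by_id.get(incident["id"], [])
        if e["enroll"] <= incident["incident"] <= e["withdraw"]
    ]


by_id = index_by_id(enrollments)
results = [(inc, bracketing_enrollments(inc, by_id)) for inc in incidents]

for inc, matches in results:
    schools = [e["school"] for e in matches]
    # No match means the incident fell in a gap between spells;
    # more than one match means a same-day transfer. Both need a hand check.
    print(inc["id"], inc["incident"], schools or "NO BRACKETING RECORD")
```

An incident with no bracketing record, or with more than one, corresponds to the gap and same-day transfer cases mentioned above, so those are the ones to list out in full and correct by hand.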