|
Working with a large database (over a million records) which contains
student records of every semester enrolled in the university. In addition the database includes other variables relating to student enrollment. Sample of data: Id Matricpd Termunits Term Year Highschool 012 19954 12 Fall 1995 Jeff High 012 19954 07 Spring 1996 Jeff High 123 19954 12 Fall 1995 Lincoln High 123 19954 11 Spring 1996 Lincoln High 123 19954 08 Fall 1996 Lincoln High 234 19954 12 Fall 1995 Frank High 234 19954 10 Spring 1996 Frank High 456 19964 12 Fall 1996 Wash High 456 19964 09 Spring 1997 Wash High 456 19964 07 Fall 1998 Wash High 789 19964 08 Fall 1996 Wash High 789 19964 12 Spring 1997 Wash High 777 19974 12 Fall 1997 Madison High 777 19974 05 Fall 1998 Madison High 777 19974 05 Fall 1999 Madison High 890 19954 10 Fall 1995 Madison High 890 19954 12 Spring 1996 Madison High 890 19954 12 Fall 1996 Madison High 923 19984 12 Fall 1998 Wash High 923 19984 12 Spring 1999 Wash High I would like to select a cohort where (matricpd = 19954 or matricpd = 19964 or matricpd = 19974) and (termunits gt 11) and (highschool = "Wash High" or "Madison High" or "Jeff High") and (year = 1995 or year = 1996 or year = 1997 or year = 1998 or year = 1999) But in addition for the first initial enrollment I would also like to select all of the subsequent terms of enrollment for that student or students. The additional cases may or may not match my original query. Basically if a student meets my select criteria, I want the results to return all of their records. The query would return this from the above data: 012 19954 12 Fall 1995 Jeff High 012 19954 07 Spring 1996 Jeff High 456 19964 12 Fall 1996 Wash High 456 19964 09 Spring 1997 Wash High 456 19964 07 Fall 1998 Wash High 777 19974 12 Fall 1997 Madison High 777 19974 05 Fall 1998 Madison High 777 19974 05 Fall 1999 Madison High I appreciate any help. Thanks. S. Allen ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
It seems to me that you could create an identifier for the client you wish to keep.
Some ideas to shorten your syntax...(UNTESTED, but you'll get the idea)--this creates a 0/1 dichotomy where the following is true (1) or not (0). Compute keepID=(any(matricpd,19954,19964,19974) and (termunits gt 11) and (any(highschool,"Wash High","Madison High","Jeff High") and (year ge 1995 and year le 1999)). Then do an aggregate with mode=addvariables breaking on only ID using max(keepID). Melissa -----Original Message----- From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of S.Allen Sent: Tuesday, June 10, 2008 12:59 PM To: [hidden email] Subject: [SPSSX-L] Selecting multiple cases Working with a large database (over a million records) which contains student records of every semester enrolled in the university. In addition the database includes other variables relating to student enrollment. Sample of data: Id Matricpd Termunits Term Year Highschool 012 19954 12 Fall 1995 Jeff High 012 19954 07 Spring 1996 Jeff High 123 19954 12 Fall 1995 Lincoln High 123 19954 11 Spring 1996 Lincoln High 123 19954 08 Fall 1996 Lincoln High 234 19954 12 Fall 1995 Frank High 234 19954 10 Spring 1996 Frank High 456 19964 12 Fall 1996 Wash High 456 19964 09 Spring 1997 Wash High 456 19964 07 Fall 1998 Wash High 789 19964 08 Fall 1996 Wash High 789 19964 12 Spring 1997 Wash High 777 19974 12 Fall 1997 Madison High 777 19974 05 Fall 1998 Madison High 777 19974 05 Fall 1999 Madison High 890 19954 10 Fall 1995 Madison High 890 19954 12 Spring 1996 Madison High 890 19954 12 Fall 1996 Madison High 923 19984 12 Fall 1998 Wash High 923 19984 12 Spring 1999 Wash High I would like to select a cohort where (matricpd = 19954 or matricpd = 19964 or matricpd = 19974) and (termunits gt 11) and (highschool = "Wash High" or "Madison High" or "Jeff High") and (year = 1995 or year = 1996 or year = 1997 or year = 1998 or year = 1999) But in addition for the first initial enrollment I would also like to select all of the subsequent terms of enrollment for that student or students. The additional cases may or may not match my original query. Basically if a student meets my select criteria, I want the results to return all of their records. The query would return this from the above data: 012 19954 12 Fall 1995 Jeff High 012 19954 07 Spring 1996 Jeff High 456 19964 12 Fall 1996 Wash High 456 19964 09 Spring 1997 Wash High 456 19964 07 Fall 1998 Wash High 777 19974 12 Fall 1997 Madison High 777 19974 05 Fall 1998 Madison High 777 19974 05 Fall 1999 Madison High I appreciate any help. Thanks. S. Allen ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD PRIVILEGED AND CONFIDENTIAL INFORMATION This transmittal and any attachments may contain PRIVILEGED AND CONFIDENTIAL information and is intended only for the use of the addressee. If you are not the designated recipient, or an employee or agent authorized to deliver such transmittals to the designated recipient, you are hereby notified that any dissemination, copying or publication of this transmittal is strictly prohibited. If you have received this transmittal in error, please notify us immediately by replying to the sender and delete this copy from your system. You may also call us at (309) 827-6026 for assistance. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
|
In reply to this post by S.Allen
As a quick first thought and completely untested you should be able to do it with the following steps ( By The Way I fully expect somebody to come up with a more elegant solution !! )
1. Create a new variable called "In_study" ( the name is your choice ) which has value of 1 when the conditions you want are met [ the code will be something like ( ( any ( matricpd, 19954, 19964, 19974) ) ) and (termunits gt 11) and ( ( any ( highschool, "Wash High", "Madison High", "Jeff High") ) ) and ( ( year ge 1995 ) and ( year le 1999 ) ) ] 2. Select cases to a new dataset where "In_Study" = 1 3. this bit is optional but will make life easier and that is to remove from the new data set all variables apart from the student id and In_Study 4. Merge the original dataset with the new dataset matching on student id with the new data set being a keyed table. If my reasoning is right you should have 'marker' on those cases that meet your criteria. Best Wishes John S. Lemon Student Liaison Officer Directorate of Information Technology (DIT) - University of Aberdeen Edward Wright Building: Room G51 Tel: +44 1224 273350 Fax: +44 1224 273372 > -----Original Message----- > From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of > S.Allen > Sent: Tuesday, June 10, 2008 6:59 PM > To: [hidden email] > Subject: Selecting multiple cases > > Working with a large database (over a million records) which contains > student records of every semester enrolled in the university. In addition > the database includes other variables relating to student enrollment. > > > Sample of data: > > Id Matricpd Termunits Term Year Highschool > 012 19954 12 Fall 1995 Jeff High > 012 19954 07 Spring 1996 Jeff High > 123 19954 12 Fall 1995 Lincoln High > 123 19954 11 Spring 1996 Lincoln High > 123 19954 08 Fall 1996 Lincoln High > 234 19954 12 Fall 1995 Frank High > 234 19954 10 Spring 1996 Frank High > 456 19964 12 Fall 1996 Wash High > 456 19964 09 Spring 1997 Wash High > 456 19964 07 Fall 1998 Wash High > 789 19964 08 Fall 1996 Wash High > 789 19964 12 Spring 1997 Wash High > 777 19974 12 Fall 1997 Madison High > 777 19974 05 Fall 1998 Madison High > 777 19974 05 Fall 1999 Madison High > 890 19954 10 Fall 1995 Madison High > 890 19954 12 Spring 1996 Madison High > 890 19954 12 Fall 1996 Madison High > 923 19984 12 Fall 1998 Wash High > 923 19984 12 Spring 1999 Wash High > > > I would like to select a cohort where (matricpd = 19954 or matricpd = 19964 > or matricpd = 19974) and (termunits gt 11) and (highschool = "Wash High" > or > "Madison High" or "Jeff High") and (year = 1995 or year = 1996 or year = > 1997 or year = 1998 or year = 1999) > > But in addition for the first initial enrollment I would also like to > select > all of the subsequent terms of enrollment for that student or students. The > additional cases may or may not match my original query. Basically if a > student meets my select criteria, I want the results to return all of their > records. > > The query would return this from the above data: > > 012 19954 12 Fall 1995 Jeff High > 012 19954 07 Spring 1996 Jeff High > 456 19964 12 Fall 1996 Wash High > 456 19964 09 Spring 1997 Wash High > 456 19964 07 Fall 1998 Wash High > 777 19974 12 Fall 1997 Madison High > 777 19974 05 Fall 1998 Madison High > 777 19974 05 Fall 1999 Madison High > > I appreciate any help. > > Thanks. > S. Allen > > ===================== > To manage your subscription to SPSSX-L, send a message to > [hidden email] (not to SPSSX-L), with no body text except the > command. To leave the list, send the command > SIGNOFF SPSSX-L > For a list of commands to manage subscriptions, send the command > INFO REFCARD The University of Aberdeen is a charity registered in Scotland, No SC013683. ===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
| Free forum by Nabble | Edit this page |
