Hi Everyone,

I have a quick question about deleting cases from an SPSS dataset based on the respondent ID. I have a file of about 8,000 cases, and each has a unique numeric ID that is 15 characters long. I need to delete 177 cases from the dataset using a list of IDs.

Is there a way for me to do this without finding each ID and deleting the case manually?

Thank you in advance,

Colin

Colin M. Valdiserri
Informed Decisions Group, Inc.
8854 Jordan Court | North Ridgeville, Ohio | 44039
cvaldiserri@...
Colin,

Make your list of 177 cases into a file and add a variable, call it ‘drop’, with a value of 1 for all 177 cases. Then match the 177-case file to the 8000-case file. All cases in the 8000 file, except for the 177, will have a value of sysmis on drop. Recode drop from sysmis to 0 and then select if drop eq 0.

I'm assuming that every one of the 177 is also in the 8000. If that is not true, use the TABLE subcommand of the MATCH FILES command rather than the FILE subcommand for the 177 file.

Gene Maguin
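Gene's steps might look roughly like this in syntax. This is only a sketch: the file names master.sav and droplist.sav and the two ID values are made up for illustration, and it assumes the ID variable is named id in both files.

```spss
* Sketch only: file names and ID values are hypothetical.
* Build the drop list and flag every case in it.
DATA LIST FREE / id (F15.0).
BEGIN DATA
100000000000002
100000000000005
END DATA.
COMPUTE drop = 1.
SORT CASES BY id.
SAVE OUTFILE='droplist.sav'.

* Match the flagged list to the master file, both sorted by id.
GET FILE='master.sav'.
SORT CASES BY id.
MATCH FILES /FILE=* /FILE='droplist.sav' /BY id.
* Cases that were not on the list have drop system-missing.
RECODE drop (SYSMIS=0).
SELECT IF drop EQ 0.
EXECUTE.
```

SELECT IF removes the cases from the active dataset for good, so it is probably safest to save the result under a new name rather than over the original file.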
In reply to this post by Colin Valdiserri
Colin,

Assuming you already have a machine-readable list of the IDs you want to delete, and that your variable is called ID and is numeric, try:

  count getrid = id (<list of IDs>) .
  select if getrid = 0 .

If your list of IDs is in strings, someone else will know what to do, but it might be worth creating a serial number for each case using:

  compute serial = $casenum.

... and then trying to find a workaround.

John Hall
Website: www.surveyresearch.weebly.com
Skype: surveyresearcher1
Phone: (+33) (0) 2.33.45.91.47
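For illustration, John's COUNT approach with the list filled in might look like the following. The three ID values are invented placeholders for the real list of 177, and the variable name id is assumed.

```spss
* Sketch only: the ID values are hypothetical placeholders.
* getrid is 1 when id matches any value in the list, otherwise 0.
COUNT getrid = id (100000000000002 100000000000005 100000000000009).
SELECT IF getrid = 0.
EXECUTE.
```

With 177 IDs the value list gets long, but it can be continued over several lines as long as the command ends with a period.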
In reply to this post by Colin Valdiserri
A simple way to do this is to use the UPDATE command.

First, enter the IDs of the records to be deleted as cases in a new dataset, using the same variable type and name as in the master dataset. Compute a new variable in that dataset:

  compute deleteme = 1.

With both datasets sorted by the ID variable, update the master from this list, which will create deleteme in the master. Then use SELECT IF to select all the cases where deleteme is not 1 (it will be system-missing for every case that is not on the list).

HTH,
Jon Peck (no "h") aka Kim
Senior Software Engineer, IBM
new phone: 720-342-5621

From: Colin Valdiserri
Date: 02/24/2012 10:44 AM
Subject: [SPSSX-L] Deleted Select Cases by ID
Sent by: "SPSSX(r) Discussion"
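A rough sketch of the UPDATE route, assuming hypothetical files master.sav and droplist.sav, where droplist.sav holds nothing but the variable id. One thing to watch: a test such as deleteme NE 1 evaluates to missing for the unflagged cases, and SELECT IF drops cases whose condition is missing, so the sketch tests for system-missing instead.

```spss
* Sketch only: all file names are hypothetical.
* droplist.sav holds one case per ID to remove, in a variable named id.
GET FILE='droplist.sav'.
COMPUTE deleteme = 1.
SORT CASES BY id.
SAVE OUTFILE='droplist_flagged.sav'.

* Update the master from the flagged list, both sorted by id.
GET FILE='master.sav'.
SORT CASES BY id.
UPDATE /FILE=* /FILE='droplist_flagged.sav' /BY id.
* Cases absent from the list have deleteme system-missing.
SELECT IF SYSMIS(deleteme).
EXECUTE.
```

Keeping the transaction file down to id and deleteme matters here, because UPDATE would also overwrite master values with any non-missing values of other variables the two files share.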
In reply to this post by Maguin, Eugene
I would only add that it is crucial that both files be sorted in ascending order by ID prior to the MATCH. Using the TABLE subcommand will also allow for multiple cases with the same ID in the 8000-case file.
Finally, you could simply use SELECT IF SYSMIS(drop) rather than recoding sysmis to 0.
---
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
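Put together, the sorted TABLE match with the SYSMIS shortcut could be sketched as follows. It assumes a hypothetical droplist.sav that is already sorted by id, has one row per ID, and carries drop = 1; master.sav is likewise a made-up name.

```spss
* Sketch only: file names are hypothetical.
GET FILE='master.sav'.
SORT CASES BY id (A).
* TABLE treats the drop list as a lookup table keyed on id.
MATCH FILES /FILE=* /TABLE='droplist.sav' /BY id.
* Keep only the cases that found no match in the drop list.
SELECT IF SYSMIS(drop).
EXECUTE.
```

Because the list is only looked up, IDs on the list that do not occur in the master file are not added to the result.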
In reply to this post by John F Hall
Or you could just go jugular with

  SELECT IF NOT(ANY(ID,<list of ids>)).

But in the long run, the MATCH approach posted by Gene is much more scalable and easily maintainable.
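With a few invented IDs standing in for the real list, the one-liner might be written like this (a sketch only, assuming the variable is named id):

```spss
* Sketch only: the ID values are hypothetical placeholders.
* ANY is true when id equals at least one of the listed values.
SELECT IF NOT(ANY(id, 100000000000002,
                      100000000000005,
                      100000000000009)).
EXECUTE.
```

The trade-off is that the list lives inside the syntax itself, so every change to it means editing the command, which is presumably why the MATCH route is easier to maintain.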
In reply to this post by David Marso
On 2/24/2012 1:38 PM, David Marso wrote:
> and finally, you could simply use SELECT IF SYSMIS(drop) rather than
> recoding sysmis to 0.

Recoding sysmis to zero will allow the addition of a value label. That would facilitate Quality Assurance review and clarify for oneself what one intended to do. It is almost a "Law of the Universe" that one will be interrupted between drafts of the process.
Art Kendall
Social Research Consultants
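As a sketch of that idea, the labelled flag can be tabulated before anything is removed. The label texts are only suggestions, and the snippet assumes the match has already created the variable drop.

```spss
* Sketch only: label wording is a suggestion.
RECODE drop (SYSMIS=0).
VALUE LABELS drop 0 'Keep' 1 'On removal list'.
* Check the counts before deleting anything.
FREQUENCIES VARIABLES=drop.
SELECT IF drop EQ 0.
EXECUTE.
```

If every listed ID was found in the master file, the frequency table should show 177 cases flagged for removal before the SELECT IF runs.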