Finding Hidden Duplicates SUSPICIOUS FILE!!!


Eugenio Grant
Hi Guys:



I have a "suspicious" file. By that I mean there could be duplicate
information in it. Obviously I don't mean the same ID number or name
appearing twice. Any ideas on how to check whether it contains duplicate
records?



Maybe someone has had such experience in the past and could point me in the
right direction.



Thanks for your time,

Re: Finding Hidden Duplicates SUSPICIOUS FILE!!!

Bob Schacht-3
At 11:30 AM 1/16/2007, Eugenio Grant wrote:
>Hi Guys:
>
>
>
>I have a "suspicious" file. By that I mean there could be duplicate
>information in it. Obviously not the same ID number, or name twice. Any
>ideas on how to check if there are duplicate records in it...

The first thing you have to decide is what makes a record a "duplicate".
If you simply mean that the values of all variables are the same, you can
check that easily by choosing Data/Identify duplicate cases from the
drop-down menu. Select all variables with CTRL-A and, if you want to,
select the option to move all duplicates to the top of the file.
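
Outside SPSS, the all-variables check amounts to flagging any record that
is identical to one seen earlier. A minimal Python sketch, assuming each
record is a dictionary of hashable values; the function name and the sample
data are hypothetical and not part of SPSS:

```python
def flag_exact_duplicates(rows):
    """Return one boolean per row: True if an identical row appeared earlier."""
    seen = set()
    flags = []
    for row in rows:
        # Build a hashable key from every variable in the record.
        key = tuple(sorted(row.items()))
        flags.append(key in seen)
        seen.add(key)
    return flags


# Hypothetical sample data: the third record repeats the first exactly.
records = [
    {"id": 1, "name": "Ann", "dob": "1970-01-01"},
    {"id": 2, "name": "Bob", "dob": "1980-05-05"},
    {"id": 1, "name": "Ann", "dob": "1970-01-01"},
]

print(flag_exact_duplicates(records))  # -> [False, False, True]
```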

My guess is that you don't mean all variables are the same, but rather that
*some* are. Once you decide which variables, when they match those of
another record, mark a case as a duplicate, you can use the same menu
system to identify your duplicates.

Bob Schacht




>Maybe someone has had such experience in the past and could point me in the
>right direction.
>
>
>
>Thanks for your time,

Robert M. Schacht, Ph.D. <[hidden email]>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814

Re: Finding Hidden Duplicates SUSPICIOUS FILE!!!

Bob Schacht-3
In reply to this post by Eugenio Grant
The SPSSX-L list server accused me of sending a duplicate message
yesterday and therefore refused to post it. However, as far as I can tell,
my message has not appeared on the list even once, so I am sending it
again, with a few additions and deletions.

At 11:30 AM 1/16/2007, Eugenio Grant wrote:

>I have a "suspicious" file. By that I mean there could be duplicate
>information in it. Obviously not the same ID number, or name twice. Any
>ideas on how to check if there are duplicate records in it...

The first thing you have to decide is what makes a record a "duplicate".
If you simply mean that the values of all variables are the same, you can
check that easily by choosing Data/Identify duplicate cases from the
drop-down menu. Select all variables with CTRL-A and, if you want to,
select the option to move all duplicates to the top of the file.

My guess is that you don't mean all variables are the same, but rather that
*some* are. You will have to decide how a duplicate case would be
identified. Once you decide which variables, when they match those of
another record, mark a case as a duplicate, you can use the same menu
system (Data/Identify duplicate cases) and select just that set of
variables to identify your duplicates.
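
The key-variable approach can be sketched outside SPSS as well. This
minimal Python example groups records by a chosen set of variables and
reports only the groups with more than one member. The function name, the
variable names (dob, phone), and the sample records are assumptions made
for illustration only:

```python
from collections import defaultdict


def duplicate_groups(rows, key_vars):
    """Map each repeated key-variable combination to the row indexes sharing it."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        # Only the chosen key variables decide whether two records match.
        key = tuple(row[var] for var in key_vars)
        groups[key].append(index)
    # Keep only combinations that occur more than once.
    return {key: idx for key, idx in groups.items() if len(idx) > 1}


# Hypothetical sample data: different IDs and names, same birth date and phone.
people = [
    {"id": 101, "name": "Ann Lee", "dob": "1970-01-01", "phone": "555-0100"},
    {"id": 102, "name": "Bob Ray", "dob": "1980-05-05", "phone": "555-0111"},
    {"id": 103, "name": "A. Lee", "dob": "1970-01-01", "phone": "555-0100"},
]

print(duplicate_groups(people, ["dob", "phone"]))
```

Here records 0 and 2 differ in ID and name but share a birth date and phone
number, so they come back as one candidate group. Whether that group really
is a duplicate remains a judgment call.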

This is part of the general class of problems that go under the heading,
"When is a duplicate really a duplicate?"

Bob Schacht




>Maybe someone has had such experience in the past and could point me in the
>right direction.
>
>
>
>Thanks for your time,

Robert M. Schacht, Ph.D. <[hidden email]>
Pacific Basin Rehabilitation Research & Training Center
1268 Young Street, Suite #204
Research Center, University of Hawaii
Honolulu, HI 96814