Spotting data anomalies

Luca Meyer
Is there a method for spotting data anomalies (especially influential
cases and outliers) in a multivariate context where the data are a mix of
numeric and categorical variables?

I am thinking of hierarchical clustering-like techniques, but I cannot use
CLUSTER here because I have several thousand cases and the corresponding
matrix is just too large for the procedure to run...

Thank you for any suggestions.

Luca

Mr. Luca MEYER
Market research, data analysis & more
www.lucameyer.com - Tel: +39.339.495.00.21



Re: Spotting data anomalies

Keith McCormick
Hi All,

There is "Identify Unusual Cases", although it requires the Data
Preparation module. The command syntax is DETECTANOMALY. I think that
TwoStep is certainly scalable to large data sets. In fact, DETECTANOMALY
runs TwoStep first, and then finds cases that are distant from their
"peer group". You could probably do a pretty good job with TwoStep and
further analysis, with or without the module, but it is pretty
convenient if you have it.
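
For anyone without the module, the "cluster first, then look for cases far from their peer group" idea can be sketched roughly as below. This is only an illustration in Python, not the DETECTANOMALY algorithm: it assumes the peer groups have already been assigned by some clustering step, and the function name `anomaly_index`, the scoring rule (squared z-scores for numeric fields, minus the log of the category share for categorical fields) and the cutoff of 2 are all my own assumptions.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, pstdev


def anomaly_index(cases, groups, numeric, categorical):
    """Score each case by how far it sits from its own peer group.

    Hypothetical helper: `groups` holds one peer-group label per case,
    produced beforehand by whatever clustering procedure is available.
    """
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)

    deviation = [0.0] * len(cases)
    for idx in members.values():
        # Numeric fields: squared z-score within the peer group.
        for field in numeric:
            values = [cases[i][field] for i in idx]
            mu, sd = mean(values), pstdev(values)
            for i in idx:
                if sd > 0:
                    deviation[i] += ((cases[i][field] - mu) / sd) ** 2
        # Categorical fields: a rare category gives a large -log(share).
        for field in categorical:
            counts = Counter(cases[i][field] for i in idx)
            for i in idx:
                deviation[i] += -math.log(counts[cases[i][field]] / len(idx))

    # Index = a case's deviation relative to the average in its peer group.
    index = [0.0] * len(cases)
    for idx in members.values():
        avg = mean(deviation[i] for i in idx)
        for i in idx:
            index[i] = deviation[i] / avg if avg > 0 else 0.0
    return index


# Made-up toy data: the case at position 4 is the odd one out in group 0.
cases = [
    {"income": 30, "region": "north"},
    {"income": 32, "region": "north"},
    {"income": 31, "region": "north"},
    {"income": 29, "region": "north"},
    {"income": 90, "region": "south"},
    {"income": 70, "region": "east"},
    {"income": 72, "region": "east"},
    {"income": 71, "region": "east"},
]
groups = [0, 0, 0, 0, 0, 1, 1, 1]

scores = anomaly_index(cases, groups, ["income"], ["region"])
flagged = [i for i, s in enumerate(scores) if s > 2]  # assumed cutoff
print(flagged)
```

With this toy data only the case at position 4 ends up above the cutoff. Dividing by the group average makes the index comparable across peer groups of different spread, which is the main reason for scoring within groups rather than over the whole file.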

Hope that helps.

Keith McCormick
www.keithmccormick.com

On 10/11/07, Luca Meyer <[hidden email]> wrote:

> Is there a method for spotting data anomalies (especially influential
> cases and outliers) in a multivariate context where the data are a mix of
> numeric and categorical variables?
>
> I am thinking of hierarchical clustering-like techniques, but I cannot use
> CLUSTER here because I have several thousand cases and the corresponding
> matrix is just too large for the procedure to run...
>
> Thank you for any suggestions.
>
> Luca