Hello SPSS experts,

I'm running an outlier analysis, and the analysis itself works. My problem comes when I display the data with a scatterplot. I analyze 29 variables one by one, and each time a chart is produced I need to change the axis scales. So far I have been doing that by editing the chart in the output, but I would like to do it from syntax to save time. Could you help me with this? I leave a shaft. The syntax I use is below.

These are the 29 variables I am analyzing; for each one I run the outlier analysis.

RECODE G_PAQUETE_PD (MISSING = 0).
RECODE G_ALOJAMIENTO_D (MISSING = 0).
RECODE G_COMIDAS_BEBIDAS_D (MISSING = 0).
RECODE G_TRANSP_INT_TERRESTRE_P (MISSING = 0).
RECODE G_TRANSP_INT_MARITIMO_P (MISSING = 0).
RECODE G_TRANSP_INT_AEREO_P (MISSING = 0).
RECODE G_TRANSP_PARKING_P (MISSING = 0).
RECODE G_TRANSP_REMOLQUE_P (MISSING = 0).
RECODE G_TRANSP_ASISTENCIA_P (MISSING = 0).
RECODE G_TRANSP_OTROS_P (MISSING = 0).
RECODE G_RECREACION_TEMATICOS_P (MISSING = 0).
RECODE G_RECREACION_ESPARCIMIENTO_P (MISSING = 0).
RECODE G_RECREACION_DIVERSIONES_P (MISSING = 0).
RECODE G_RECREACION_DEPORTES_P (MISSING = 0).
RECODE G_ALQUILER_VEHICULOS_P (MISSING = 0).
RECODE G_COMUNICACIONES_P (MISSING = 0).
RECODE G_SALUD_P (MISSING = 0).
RECODE G_COMPRAS_ARTESANIAS_P (MISSING = 0).
RECODE G_COMPRAS_TEXTIL_P (MISSING = 0).
RECODE G_COMPRAS_ROPA_P (MISSING = 0).
RECODE G_COMPRAS_OBJETOS_VALIOSOS_P (MISSING = 0).
RECODE G_COMPRAS_OTROS_P (MISSING = 0).
RECODE G_PAQUETE_INTERNO_PD (MISSING = 0).
RECODE G_SSCULTURAL_PARQUES_EOLICOS_P (MISSING = 0).
RECODE G_SSCULTURAL_TEATRO_P (MISSING = 0).
RECODE G_SSCULTURAL_MUSEOS_P (MISSING = 0).
RECODE G_SSCULTURAL_BIBLIOTECA_P (MISSING = 0).
RECODE G_SSCULTURAL_OTROS_P (MISSING = 0).
RECODE G_OTROS_P (MISSING = 0).

* Identify anomalous cases.
DELETE VARIABLES AnomalyIndex PeerId PeerSize PeerPctSize ReasonVar_1 ReasonVar_2
  ReasonMeasure_1 ReasonMeasure_2 ReasonValue_1 ReasonValue_2 ReasonNorm_1 ReasonNorm_2.
DETECTANOMALY
  /VARIABLES SCALE=G_ALOJAMIENTO_D ID=Correlativo
  /PRINT ANOMALYLIST NORMS ANOMALYSUMMARY REASONSUMMARY CPS
  /SAVE ANOMALY(AnomalyIndex) PEERID(PeerId) PEERSIZE(PeerSize) PEERPCTSIZE(PeerPctSize)
    REASONVAR(ReasonVar) REASONMEASURE(ReasonMeasure) REASONVALUE(ReasonValue)
    REASONNORM(ReasonNorm)
  /HANDLEMISSING APPLY=YES CREATEMISPROPVAR=YES
  /CRITERIA PCTANOMALOUSCASES=2 ANOMALYCUTPOINT=NONE MINNUMPEERS=1 MAXNUMPEERS=15
    NUMREASONS=3.

SORT CASES BY Año tipoviaj.
SPLIT FILE SEPARATE BY Año tipoviaj.

* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=ReasonMeasure_1 AnomalyIndex PeerId
    MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: ReasonMeasure_1=col(source(s), name("ReasonMeasure_1"))
  DATA: AnomalyIndex=col(source(s), name("AnomalyIndex"))
  DATA: PeerId=col(source(s), name("PeerId"), unit.category())
  GUIDE: axis(dim(1), label("Medida de impacto de variable de razón 1"))
  GUIDE: axis(dim(2), label("Índice de anomalía"))
  GUIDE: legend(aesthetic(aesthetic.color.exterior), label("ID de grupo de homólogos"))
  ELEMENT: point(position(ReasonMeasure_1*AnomalyIndex), color.exterior(PeerId))
END GPL.

SPLIT FILE OFF.

Sincerely,
Javier Figueroa
Database processing and analysis
Cell: 5927-4748 / 4970-1940 Home: 2289-0184 |
It is unclear what the problem is.
One way to simplify is to RECODE a variable list: RECODE varlist (MISSING=0). If the variables are next to each other in the file, the command would be RECODE G_PAQUETE_PD TO G_OTROS_P (MISSING=0). However, it is unclear why you are RECODEing these variables.

"I leave a shaft." Perhaps this is output from a translator app?

Remember that DETECTANOMALY points out cases that need to be looked at. Extreme values may well be legitimate. Of course, some values may be outside a legitimate range, e.g., negative ages, 100-year-old elementary school students, etc.

Similarly, DELETE VARIABLES should be simplified by using the "TO" convention.

Do these variables share a common response scale? What kinds of anomalies do you suspect might be there? Values outside the defined value labels? Values that are impossible given the values of other variables? It appears that you are looking at a single variable at a time. Why not the set of variables? If you are just looking for extremes within a single variable, why not just use boxplots in EXPLORE?
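A minimal sketch of the TO convention, under two assumptions about file order that are not confirmed in the thread: that the 29 expenditure variables are adjacent from G_PAQUETE_PD through G_OTROS_P, and that the variables saved by DETECTANOMALY are adjacent from AnomalyIndex through ReasonNorm_2. TO selects every variable lying between the two names, so check the order in Variable View before running this.

```spss
* Assumes G_PAQUETE_PD through G_OTROS_P are adjacent in file order.
RECODE G_PAQUETE_PD TO G_OTROS_P (MISSING=0).
* Run the pending RECODE, since DELETE VARIABLES cannot run with transformations pending.
EXECUTE.
* Assumes the saved anomaly variables are adjacent in file order.
DELETE VARIABLES AnomalyIndex TO ReasonNorm_2.
```

If the saved variables turn out not to be contiguous, listing them by name as in the original post is the safer choice.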
Art Kendall
Social Research Consultants |
Thank you very much for your answer. You are right that I can simplify the RECODE and DELETE VARIABLES commands, although they do work for me as they are; thanks for the advice all the same. Boxplots also work for me, and I will switch to them to visualize the atypical cases better.

My problem is in the chart syntax: I want to know where I can change the scale of the Y axis. By default each variable gets a different scale, and I need the scale to always be the same.

SORT CASES BY Año tipoviaj.
SPLIT FILE SEPARATE BY Año tipoviaj.

* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=G_PAQUETE MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: G_PAQUETE=col(source(s), name("G_PAQUETE"))
  DATA: id=col(source(s), name("$CASENUM"), unit.category())
  COORD: rect(dim(1), transpose())
  GUIDE: axis(dim(1), label("P19. Precio del paquete"))
  ELEMENT: schema(position(bin.quantile.letter(G_PAQUETE)), label(id))
END GPL.

SPLIT FILE OFF.

Thanks for your help,
Javier Figueroa
|
You can add a SCALE statement for either or both axes. For example, SCALE: linear(dim(2), min(0), max(150000)) sets the y-axis scale to run from 0 to 150000.
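As a sketch of where the SCALE statements go, here is the scatterplot from the first post with fixed ranges on both axes. The limits (0 to 10 on the x-axis, 0 to 20 on the y-axis) are placeholders, not values from the thread; choose limits wide enough to cover every variable you plot.

```spss
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=ReasonMeasure_1 AnomalyIndex PeerId
    MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: ReasonMeasure_1=col(source(s), name("ReasonMeasure_1"))
  DATA: AnomalyIndex=col(source(s), name("AnomalyIndex"))
  DATA: PeerId=col(source(s), name("PeerId"), unit.category())
  GUIDE: axis(dim(1), label("Medida de impacto de variable de razón 1"))
  GUIDE: axis(dim(2), label("Índice de anomalía"))
  GUIDE: legend(aesthetic(aesthetic.color.exterior), label("ID de grupo de homólogos"))
  COMMENT: Placeholder limits, fixed so every chart uses the same axes
  SCALE: linear(dim(1), min(0), max(10))
  SCALE: linear(dim(2), min(0), max(20))
  ELEMENT: point(position(ReasonMeasure_1*AnomalyIndex), color.exterior(PeerId))
END GPL.
```

In the one-dimensional boxplot specification the plotted variable appears to sit on dim(1) (its GUIDE statement uses dim(1)), so the SCALE statement there would probably need dim(1) rather than dim(2).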
|
Thank you very much, Jon Peck, Art Kendall, and all the members of this excellent group of SPSS experts. You are always ready to help when it is most needed, and the answers are a chance to learn as well. With this solution and the advice from both of you, I will save a lot of the time I used to spend editing the charts in the output file. This request was perhaps very simple and basic for anyone used to managing charts from syntax.

While I am at it, I have another question. In the chart I can show the value labels, but it displays all of them, which makes the chart very hard to read. Could you tell me how to make the chart label only the outliers, from syntax? I would appreciate any solution to this new question.

Sincerely,
Javier Figueroa
|
You say you want the visualizations on the same scale.
What are the variables? What scale do you want to see? Since you seem to be looking at univariate anomalies, check out EXPLORE boxplots with the dependents together option.
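A minimal sketch of that suggestion in syntax, assuming the variables are on a comparable scale. The three variables listed are simply the first few from the original post, and Correlativo is the ID variable used in the DETECTANOMALY command there; substitute your own list.

```spss
* Boxplots for several dependents in one frame, so they share one axis.
EXAMINE VARIABLES=G_PAQUETE_PD G_ALOJAMIENTO_D G_COMIDAS_BEBIDAS_D
  /COMPARE VARIABLES
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /ID=Correlativo.
```

EXAMINE boxplots label only the outliers and extreme values, using the /ID variable, so this may also reduce the label clutter.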
Art Kendall
Social Research Consultants |
There is no way to directly generate the graph the way you want it, although as Art has pointed out, EXAMINE (Analyze > Descriptive Statistics > Explore) provides some outlier graphics and diagnostics. However, you can generate the graph without the value labels but with a point ID label and then edit it in the Chart Editor and turn on individual labels using Elements > Data Label Mode for the presumably small number of outliers. Another tool that may be useful in understanding outliers, although it does not identify the points, is the STATS BAGPLOT extension command. This is a two-dimensional boxplot or a matrix of these and can be useful in locating bivariate outliers. This extension command, which requires the R Essentials, can be installed from the Utilities menu in Statistics 22 or 23 or the Extensions menu in V24.
|
Thank you very much for your help; it was very useful to me.

Sincerely,
Javier Figueroa
|