Hi,
I have a dataset of around 1200 participants. The research concerns attitudes towards, and awareness of, environmental issues and environmentally friendly behavior.
I'd like to perform multiple regression on the data. The textbook I'm following is Andy Field's Discovering Statistics Using SPSS. I have checked the assumptions of multiple regression, but there is one I'm having difficulty checking: homoscedasticity. The book suggests using the residuals plot to evaluate whether there is homoscedasticity. I'm blind, and I cannot see the plot to decide how it looks. I have checked that no standardized residual is above +3.0 or below -3.0. Is there any test available in SPSS, like the Cook's distance test, that gives me a value I can use to learn about the scatter of the data?
Of course, I can show the data to someone who can look at it for me, but that is only possible next week, and if I can check for myself, why wait? I'm also a bit confused about how robust multiple regression is to violations of the homoscedasticity assumption. In case the data do not meet this assumption, will I have to use some other method instead of multiple regression?
Regards, Faiz.
===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD |
You could use the SPSSINC BREUSCH PAGAN extension command to test homoscedasticity. It requires the R Essentials (and R) as explained on this page: https://developer.ibm.com/predictiveanalytics/downloads. On Thu, Feb 22, 2018 at 9:49 AM, faiz rasool <[hidden email]> wrote: Hi, |
The PDF available below provides (I think) a nice overview of tests of heteroskedasticity -- though available in Stata -- which might provide https://www3.nd.edu/~rwilliam/stats2/l25.pdf
On Thu, Feb 22, 2018 at 12:50 PM, Jon Peck <[hidden email]> wrote:
|
Given this:
https://en.wikipedia.org/wiki/Breusch–Pagan_test
It looks like it would be trivial to calculate in SPSS. In the regression, save the residuals, square them, and rerun the regression using the squared residuals as the dependent variable. Then calculate the BP test using the Wiki formula. Easy, but it's the end of the day for me and I need to attend to dinner.
----- Please reply to the list and not to my personal email. Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" -- Sent from: http://spssx-discussion.1045642.n5.nabble.com/ |
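The steps described in this post can be sketched in plain syntax. This is only a minimal sketch under stated assumptions, not a tested routine: the names y, x1, x2, x3, res_1 and res_sq are placeholders that do not come from the thread, and the statistic suggested in the note afterwards is the Koenker (studentized) form N x R-squared given in the Wikipedia article, not the original Breusch-Pagan form.

```spss
* Step 1: fit the model and save the unstandardized residuals.
* y, x1, x2 and x3 are placeholder names for your own variables.
REGRESSION
  /DEPENDENT y
  /METHOD=ENTER x1 x2 x3
  /SAVE RESID(res_1).

* Step 2: square the saved residuals.
COMPUTE res_sq = res_1**2.
EXECUTE.

* Step 3: auxiliary regression of the squared residuals on the same predictors.
* Note the R Square value in the Model Summary table of this second run.
REGRESSION
  /STATISTICS R ANOVA
  /DEPENDENT res_sq
  /METHOD=ENTER x1 x2 x3.
```

The test statistic is then N times the R Square from step 3, referred to a chi-square distribution with degrees of freedom equal to the number of predictors (3 in this sketch). A p-value can be obtained with 1 - CDF.CHISQ(statistic, df). Everything here is ordinary text output, so it can be read with a screen reader.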
In reply to this post by Faiz Rasool
Did you search the discussion list?
http://spssx-discussion.1045642.n5.nabble.com/BREUSCH-PAGAN-amp-KOENKER-TEST-MACRO-undefined-variables-td5727299.html#a5731408
/PR |
DOH!
I thought that thread title looked familiar ;-)
DRAT...
I modified the macro further to utilize the new support for PIVOT TABLES in ver 25. Also changed the DO IF $CASENUM / PRINT business to use ECHO. Other caveats are still operative regarding renaming of Residuals for multiple runs.
See the original thread:
http://spssx-discussion.1045642.n5.nabble.com/BREUSCH-PAGAN-amp-KOENKER-TEST-MACRO-undefined-variables-td5727299.html#a5731408

*------------.
* BREUSCH-PAGAN & KOENKER TEST MACRO *
* See 'Heteroscedasticity: Testing and correcting in SPSS'
* by Gwilym Pryce, for technical details.
* REVISION HISTORY *.
* Code by Marta Garcia-Granero 2002/10/28.
* Modified by David Marso 2014/09/18
* (changed AGGREGATE and MATCH to use MODE=ADDVARIABLES, slight mods to MATRIX code, some formatting changes)
* Modified by David Marso 2018/02/23
* (Modified Output formatting to support new Pivot Table support in MATRIX).
* The MACRO needs 3 arguments:
* the dependent, the number of predictors and the list of predictors
* (if they are consecutive, the keyword TO can be used) .

* (1) MACRO definition (select and run just ONCE).
DEFINE bpktest( !POSITIONAL !TOKENS(1)
               /!POSITIONAL !TOKENS(1)
               /!POSITIONAL !CMDEND).
* Regression to GET the residuals and residual plots.
REGRESSION
  /STATISTICS R ANOVA
  /DEPENDENT !1
  /METHOD=ENTER !3
  /SCATTERPLOT=(*ZRESID,*ZPRED)
  /RESIDUALS HIST(ZRESID) NORM(ZRESID)
  /SAVE RESID(residual) .
ECHO "Examine the scatter plot of the residuals to detect model misspecification and/or heteroscedasticity" .
ECHO "".
ECHO "Also, check the histogram and np plot of residuals to detect non normality of residuals " .
ECHO "Skewness and kurtosis more than twice their SE indicate non-normality".
* Checking normality of residuals.
DESCRIPTIVES VARIABLES=residual
  /STATISTICS=KURTOSIS SKEWNESS .
* New dependent variable (g) creation.
COMPUTE sq_res=residual**2.
AGGREGATE
  /OUTFILE=* MODE ADDVARIABLES
  /BREAK=
  /rss = SUM(sq_res)
  /N=N.
COMPUTE g=sq_res/(rss/n).
* BP&K tests.
* Regression of g on the predictors.
REGRESSION
  /STATISTICS R ANOVA
  /DEPENDENT g
  /METHOD=ENTER !3
  /SAVE RESID(resid) .
* Routine adapted from Gwilym Pryce.
MATRIX.
COMPUTE p=!2.
GET g / VARIABLES=g.
GET resid / VARIABLES=resid.
COMPUTE sq_res2 = resid&**2.
COMPUTE n = nrow(g).
COMPUTE rss = msum(sq_res2).
COMPUTE m0 = ident(n)-((1/n)*make(n,n,1)).
COMPUTE tss = transpos(g)*m0*g.
COMPUTE regss = tss-msum(sq_res2).
COMPUTE r_sq=1-(rss/tss).
COMPUTE bp_test=0.5*regss.
COMPUTE BP_sig=1-chicdf(bp_test,p).
COMPUTE k_test=n*r_sq.
COMPUTE K_sig=1-chicdf(k_test,p).
*Final report.
PRINT /TITLE " BP&K TESTS".
PRINT { regss , rss , tss, r_sq}
  /TITLE "Sums of Squares Partitioning"
  /FORMAT "F8.4"
  /RLABELS " "
  /CLABELS "Regression SS","Residual SS","Total SS","R-squared".
PRINT {n,p}
  /TITLE "Problem Size"
  /FORMAT "F4.0"
  /RLABELS " "
  /CLABELS "Sample size (N)","Number of predictors (P)".
PRINT {bp_test ,BP_sig ;k_test ,K_sig }
  /TITLE " Breusch-Pagan and Koenker tests for Heteroscedasticity"
  /FORMATS "F8.4"
  /CLABELS "Test Statistic Chi Square (df=P)", "Significance level of Chi-square df=(H0:homoscedasticity)"
  /RLABELS "Breusch-Pagan","Koenker".
END MATRIX.
!ENDDEFINE.

* (2) Sample data (replace by your own)*.
INPUT PROGRAM.
- VECTOR x(20).
- LOOP #I = 1 TO 50.
- LOOP #J = 1 TO 20.
- COMPUTE x(#J) = NORMAL(1).
- END LOOP.
- END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
EXECUTE.

* Sets the mode for displaying output in MATRIX.
SET MDISPLAY=TABLES.

* (3) MACRO CALL (select and run).
* x1 is the dependent and x2 TO x20 the predictors.
BPKTEST x1 19 x2 TO x20 .

PRogman wrote
> Did you search the discussion list?
> http://spssx-discussion.1045642.n5.nabble.com/BREUSCH-PAGAN-amp-KOENKER-TEST-MACRO-undefined-variables-td5727299.html#a5731408
> /PR
|
In reply to this post by Faiz Rasool
I am not a fan of statistical tests of the assumptions for another test or procedure. Such tests often have too little power when n is small and too much power when n is large. Rather than testing, you could just estimate your model via UNIANOVA and allow for heteroscedasticity via the ROBUST sub-command (assuming your SPSS version is recent enough). See the link below for details.
https://www.ibm.com/support/knowledgecenter/en/SSLVMB_25.0.0/statistics_reference_project_ddita/spss/base/syn_unianova_robust.html
HTH.
--
Bruce Weaver [hidden email] http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
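The UNIANOVA suggestion above could look roughly like the following. This is a sketch under assumptions rather than a tested job: y, x1, x2 and x3 are placeholder variable names, and the choice of the HC3 estimator is mine, not something stated in the post. The linked documentation page lists the estimators that the ROBUST subcommand accepts.

```spss
* Placeholder names: y is the outcome, x1 x2 x3 are continuous predictors.
* Needs a release that supports the ROBUST subcommand, such as the one the linked page documents.
UNIANOVA y WITH x1 x2 x3
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PRINT=PARAMETER
  /ROBUST=HC3
  /DESIGN=x1 x2 x3.
```

The coefficients are the same as in an ordinary regression of y on the three predictors. What changes is that the standard errors, t tests and confidence intervals in the robust parameter estimates table no longer rely on constant error variance, so no plot has to be inspected first.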
I agree, in general, but if assumption tests are done, I would substantially relax the significance level employed. Over 40 years ago I wrote a paper on the statistical properties of regression following a preliminary test for autocorrelation, but, alas, nobody was interested. On Sat, Feb 24, 2018 at 2:55 PM Bruce Weaver <[hidden email]> wrote: I am not a fan of statistical tests of the assumptions for another test or -- |
In reply to this post by Bruce Weaver
Like Bruce, I'm not a fan of tests of assumptions, but I do pay attention to the
shape of distributions. In my experience - which is heavily biased towards
using rating scales - 90% or 99% of apparent heteroscedasticity is the fault of "wrong scaling" rather than underlying lumpiness. Scale items can /usually/ be analyzed as they are; scale totals occasionally benefit from transformation. Item
Response Theory uses logistic, though the complication may seem like over-kill; on
the cruder side, square root is most common, after deciding which end should
represent "zero".
Is there big skewness? Are there big outliers? Do these features represent scores
that you would consider at "equal intervals"? Does taking a transformation give
something that is more Normal? If there is an outlier that represents a "real interval",
that raises the question of whether /that/ case actually belongs in a least-squares
analysis of these data; or if it should be removed and discussed as a special case.
If the transformation made no difference in the subsequent analyses and inferences, PIs often liked to present the unmodified analysis along with the comment that doing the analyses using XX-transformation to meet the variance assumptions made no difference.
--
Rich Ulrich
From: SPSSX(r) Discussion <[hidden email]> on behalf of Bruce Weaver <[hidden email]>
Sent: Saturday, February 24, 2018 4:55 PM
To: [hidden email]
Subject: Re: testing for homoscedasticity in SPSS?
[snip, previous] |
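The square-root transformation mentioned in this post, and the check on whether it gives something more Normal, can be sketched as follows. The variable name total and its lowest possible score of 10 are hypothetical, chosen only for illustration.

```spss
* total is a placeholder for a scale total whose lowest possible score is 10.
* Shift the scale so that its low end is zero, then take the square root.
COMPUTE total_sqrt = SQRT(total - 10).
EXECUTE.

* Compare skewness and kurtosis before and after the transformation.
DESCRIPTIVES VARIABLES=total total_sqrt
  /STATISTICS=MEAN STDDEV MIN MAX SKEWNESS KURTOSIS.
```

If the long tail is at the low end instead, the other end of the scale should represent zero: subtract each score from the highest possible score before taking the square root. Skewness and kurtosis are printed as numbers, so the comparison does not depend on reading a histogram.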
Rich, the OP stated they were blind -- I'm not sure how exactly a blind person would be able to apply much of your advice. I might go a bit of a different tack.
For those who are not completely blind, but are visually impaired, you can export SPSS graphs as vector images, in which case you can zoom in and make the chart fill your entire field of vision. The easiest way would be to export charts as PDF files from SPSS. If this is the case, even if you need a screen reader but have some vision, that might work out OK. Even in a blurry scatterplot you could assess heteroscedasticity. You don't need to be able to resolve each individual point in the scatterplot to see the overall distribution of points. Some things, like a histogram, can never really be articulated in a set of statistics.
Either I or other folks on this forum can help with constructing a chart template that makes this easier, such as larger fonts, bigger points, or higher contrast.
If you are entirely blind, I might suggest making a user request to SPSS -- a simple tool to export charts as STL files, which you can then 3D print. (The smaller 3D printers are not that expensive anymore.) It would be a slow process, but by touch you could also easily identify heteroscedasticity. Again, I think that would also be very useful in general for histograms.
----- Andy W [hidden email] http://andrewpwheeler.wordpress.com/ |
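The PDF export described in this post can also be done from syntax instead of the menus. A minimal sketch, assuming the residual plots are already in the active Viewer window; the file path is a placeholder.

```spss
* Run this after the REGRESSION command that draws the residual plots.
* The file path is a placeholder.
OUTPUT EXPORT
  /CONTENTS EXPORT=VISIBLE
  /PDF DOCUMENTFILE='C:\temp\residual_plots.pdf'.
```

EXPORT=VISIBLE writes every visible item in the Viewer, tables as well as charts, so hiding or deleting unrelated output first keeps the file short.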
This is off-topic, but I can't refrain from raising a pet subject which was never really developed into something more concrete: don't you think sound could be used to illustrate the "noise" in scatter plots? Imagine your favourite software connected to a tone generator, where output such as means would be represented by tones of different frequencies and variation by noise. Representing variation on static two-dimensional paper, is that really the best way? Sound could perhaps be a better tool in some cases. And then the visually impaired might even have an advantage in interpreting the output.
Robert
-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Andy W
Sent: Friday, March 02, 2018 1:50 PM
To: [hidden email]
Subject: Re: testing for homoscedasticity in SPSS?
[snip, previous]
Robert Lundqvist