OS: Windows XP
SPSS version: 20
Nature of the data: 50 cases, about 30 variables, no missing data (anything else you need to know?)

I am trying to do a Breusch-Pagan-Koenker test on one of my regression models. I'm using the script provided by Marta Garcia-Granero (2002/10/28), which is widely used. I'm encountering a series of error messages and assume that only the first one(s) are likely to be helpful. The first two are:

Run MATRIX procedure:
BP&K TESTS
==========
>Error encountered in source line # 337
>Error # 12555
>During execution of the GET statement, missing value has been encountered, but
>no MISSING subcommand is specified.
>Execution of this command stops.
>Error encountered in source line # 338
>Error # 12555
>During execution of the GET statement, missing value has been encountered, but
>no MISSING subcommand is specified.
>Execution of this command stops.
>Error encountered in source line # 339

These two messages evidently refer to these lines in the code:

get g /variables=g.
get resid /variables=resid.

Those were variables created earlier in the macro and placed in my data file (along with 5 other variables) by it. The variables are there, one value per case, with no missing values. Apparently the macro is not finding the variables, or not finding all the values for the variables.

The GET command refers to a temporary .sav file that I've called foo.sav. That file does NOT include the g and resid variables. It only contains one row with three values in it. So it looks like the script is creating the right variables but putting them in a place (my original data file) that the script isn't written to access; and instead it looks for them in the foo.sav file. However, I now think that may be a red herring, because I have the same problem with a White's test syntax, and in that case the required variables are saved to the correct temporary file as well as appended to my own data file.

Maybe the script has a flaw, but that seems unlikely since many other people use it successfully. The only change I made was to the name of the output file (specifying a full path instead of just a filename.sav, to make sure the file could be written to an existing folder on my PC). But even with the original syntax for the filename, I get the same problem -- same error messages. Alternatively, maybe I don't understand what's going on; that's quite possible, since I have little experience with SPSS scripts.

I'm appending the whole script (hoping it's not too much for the forum to handle) after my signature.

Thanks for any help,
Roger

-------------script follows-------------

* BREUSCH-PAGAN & KOENKER TEST MACRO *
* See 'Heteroscedasticity: Testing and correcting in SPSS'
* by Gwilym Pryce, for technical details.
* Code by Marta Garcia-Granero 2002/10/28.

* The MACRO needs 3 arguments:
* the dependent, the number of predictors and the list of predictors
* (if they are consecutive, the keyword TO can be used) .

* (1) MACRO definition (select and run just ONCE).

DEFINE bpktest(!POSITIONAL !TOKENS(1) /!POSITIONAL !TOKENS(1) /!POSITIONAL !CMDEND).
* Regression to get the residuals and residual plots.
REGRESSION
 /STATISTICS R ANOVA
 /DEPENDENT !1
 /METHOD=ENTER !3
 /SCATTERPLOT=(*ZRESID,*ZPRED)
 /RESIDUALS HIST(ZRESID) NORM(ZRESID)
 /SAVE RESID(residual) .
do if $casenum=1.
print /"Examine the scatter plot of the residuals to detect"
 /"model misspecification and/or heteroscedasticity"
 /""
 /"Also, check the histogram and np plot of residuals "
 /"to detect non normality of residuals "
 /"Skewness and kurtosis more than twice their SE indicate non-normality ".
end if.
* Checking normality of residuals.
DESCRIPTIVES
 VARIABLES=residual
 /STATISTICS=KURTOSIS SKEWNESS .
* New dependent variable (g) creation.
COMPUTE sq_res=residual**2.
compute constant=1.
AGGREGATE
 /OUTFILE='tempdata.sav'
 /BREAK=constant
 /rss = SUM(sq_res)
 /N=N.
MATCH FILES /FILE=*
 /FILE='tempdata.sav'.
EXECUTE.
if missing(rss) rss=lag(rss,1).
if missing(n) n=lag(n,1).
compute g=sq_res/(rss/n).
execute.
* BP&K tests.
* Regression of g on the predictors.
REGRESSION
 /STATISTICS R ANOVA
 /DEPENDENT g
 /METHOD=ENTER !3
 /SAVE RESID(resid) .
*Final report.
do if $casenum=1.
print /" BP&K TESTS"
 /" ==========".
end if.
* Routine adapted from Gwilym Pryce.
matrix.
compute p=!2.
get g /variables=g.
get resid /variables=resid.
compute sq_res2=resid&**2.
compute n=nrow(g).
compute rss=msum(sq_res2).
compute ii_1=make(n,n,1).
compute i=ident(n).
compute m0=i-((1/n)*ii_1).
compute tss=transpos(g)*m0*g.
compute regss=tss-rss.
print regss
 /format="f8.4"
 /title="Regression SS".
print rss
 /format="f8.4"
 /title="Residual SS".
print tss
 /format="f8.4"
 /title="Total SS".
compute r_sq=1-(rss/tss).
print r_sq
 /format="f8.4"
 /title="R-squared".
print n
 /format="f4.0"
 /title="Sample size (N)".
print p
 /format="f4.0"
 /title="Number of predictors (P)".
compute bp_test=0.5*regss.
print bp_test
 /format="f8.3"
 /title="Breusch-Pagan test for Heteroscedasticity" +
 " (CHI-SQUARE df=P)".
compute sig=1-chicdf(bp_test,p).
print sig
 /format="f8.4"
 /title="Significance level of Chi-square df=P (H0:" +
 "homoscedasticity)".
compute k_test=n*r_sq.
print k_test
 /format="f8.3"
 /title="Koenker test for Heteroscedasticity" +
 " (CHI-SQUARE df=P)".
compute sig=1-chicdf(k_test,p).
print sig
 /format="f8.4"
 /title="Significance level of Chi-square df=P (H0:" +
 "homoscedasticity)".
end matrix.
!ENDDEFINE.

* x1 is the dependent and x2 TO x20 the predictors.

* (3) MACRO CALL (select and run).
BPKTEST PEW0813 6 AIRPOLLLN EDUC FFPRODLN LARGESTECON PARTYCONTROL ENVORG2LN.
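For orientation, the quantities that the MATRIX block computes can be written out as below. This is read off the macro code itself rather than taken from Pryce's text, so treat it as a sketch. The symbols are assumptions introduced here: e_i are the residuals from the first regression, u_i the residuals from regressing g on the predictors, n the number of cases read by GET, and p the number of predictors.

```latex
\begin{align*}
g_i &= \frac{e_i^2}{\tfrac{1}{n}\sum_{j=1}^{n} e_j^2}, &
\mathrm{TSS} &= \sum_{i=1}^{n} (g_i - \bar{g})^2, &
\mathrm{RSS} &= \sum_{i=1}^{n} u_i^2, \\
\mathrm{BP} &= \tfrac{1}{2}\,(\mathrm{TSS} - \mathrm{RSS}), &
R^2 &= 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}, &
\mathrm{Koenker} &= n\,R^2 .
\end{align*}
```

The macro compares both statistics with a chi-square distribution on p degrees of freedom. Every case must contribute a valid g and resid, which is why MATRIX stops as soon as GET meets a missing value.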
Administrator
Roger,
I don't believe having the macro without the DATA is particularly useful! When MATRIX bitches about missing data it is invariably due to the file containing MISSING data. Feel free to ATTACH the relevant data files to the thread. If you are using the UGA portal, that is likely not an option, so go through Nabble instead. Here is your thread (see the More button on the right):
http://spssx-discussion.1045642.n5.nabble.com/template/NamlServlet.jtp?macro=reply&node=5724658
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me. --- "Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?" |
In reply to this post by rkarapin
David,
Thanks for replying. I'd be happy if I could solve this problem by dealing with some missing data. But as I said at the start of my post, there is no missing data in my original dataset. I'll try to attach it here so you can see that. Anyway, the error message is clearly referring to the GET command that is trying to get variables that the macro created, not my original data. None of those variables (the relevant ones are g and resid) has missing data, either. I'll try to attach that here too.

original_data.sav <http://spssx-discussion.1045642.n5.nabble.com/file/n5724660/original_data.sav>
tempdata.sav <http://spssx-discussion.1045642.n5.nabble.com/file/n5724660/tempdata.sav>

I hope you can show me that I've overlooked some missing data.

Roger
Roger,
Look below case 50. There are ~64,000 rows of . values (SYSMIS). You can solve that by fetching the data file, then running the following prior to calling the macro:

SELECT IF ($CASENUM LE 50).
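If hard-coding the case count is inconvenient, a variation on the same fix is to drop every row that lacks the dependent variable. This is only a minimal sketch. It assumes that PEW0813 (the dependent variable in Roger's macro call) is valid for every genuine case; that choice of variable is an assumption made here, not part of the reply above.

```spss
* Show valid and missing counts; a large Missing N reveals empty trailing rows.
FREQUENCIES VARIABLES=PEW0813 /FORMAT=NOTABLE.

* Assumption: every genuine case has a valid value on PEW0813.
SELECT IF NOT MISSING(PEW0813).
EXECUTE.
```

Like SELECT IF ($CASENUM LE 50), this removes rows from the active file, so save under a new name if the original file should be kept.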
David,
Many thanks, it works now. I should have thought of that. Especially since I had seen that the macro wrote zeroes to the rest of the 64000+ cases for a couple of the variables.

Roger
Hi David,
SPSS 23 on Win 10
Breusch-Pagan test script by Marta Garcia-Granero

I get the same error message mentioned by Roger:

Run MATRIX procedure:
>Error encountered in source line # 381
>Error # 12555
>During execution of the GET statement, missing value has been encountered, but
>no MISSING subcommand is specified.
>Execution of this command stops.
>Error encountered in source line # 382

I do have some missing values in my five IVs (coded as 999) and no missing values in my DV. Could you please confirm that the missing values prevent this script from executing? Is there an alternative?

Thank you
Perhaps Marta can address the missing values issue with her script, but there is an extension command, SPSSINC BREUSCH PAGAN (Analyze > Regression > Residual Heteroscedasticity Test), that does handle missing values. This procedure is included in the R Essentials.
In reply to this post by johndoelives
I suppose you are talking about this macro?
http://www.spsstools.net/en/syntax/442/

You might try deleting from the working data file any cases that do not have complete data for all of the variables in your regression model. Suppose your dependent variable is Y and your explanatory variables are X1 to X5.

SELECT IF NMISS(Y,X1 to X5) = 0.
DESCRIPTIVES Y X1 to X5.
* Then call Marta's macro.
BPKTEST Y 5 X1 TO X5.
--
Bruce Weaver bweaver@lakeheadu.ca http://sites.google.com/a/lakeheadu.ca/bweaver/ "When all else fails, RTFM." PLEASE NOTE THE FOLLOWING: 1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above. 2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/). |
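One detail worth checking before applying the filter suggested above: NMISS only counts values that SPSS already treats as missing. If the 999 codes mentioned earlier in the thread have not been declared as user-missing, they would pass the filter and enter the regression as ordinary numbers. A minimal sketch, reusing the placeholder names Y and X1 to X5 from the post above and assuming that 999 is never a legitimate value on those variables:

```spss
* Assumption: 999 marks a missing response on X1 to X5 and is never a real value.
MISSING VALUES X1 TO X5 (999).

* Keep only cases with complete data on every model variable.
SELECT IF NMISS(Y, X1 TO X5) = 0.
EXECUTE.

* Confirm that the remaining N is the same for all variables.
DESCRIPTIVES VARIABLES=Y X1 TO X5.

* Then call the macro as before.
BPKTEST Y 5 X1 TO X5.
```

The TO keyword relies on X1 through X5 being adjacent in the file; otherwise list the variables individually.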
Thanks for your comments. I have found the following solution:
https://sites.google.com/site/ahmaddaryanto/scripts/Heterogeneity-test
Solution? Delve deeper! That cannot possibly solve your real problem with missing values! From the dialog code:

MATRIX.
get mat/variables=!dv !iv
 /names=nms /MISSING=OMIT.
...

Implying you are unknowingly using LISTWISE deletion. See Jon's post above and go from there.
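To make that listwise deletion concrete, here is a hedged sketch of how the two GET lines in the BPKTEST macro's MATRIX block could be written with MISSING=OMIT. It is an illustrative modification, not Marta's code; the matrix name gr is hypothetical, and the snippet assumes g and resid already exist in the active file from an earlier macro run. Reading both variables in a single GET keeps their rows aligned.

```spss
* Hypothetical variant of the macro's GET step; gr is an invented name.
* MISSING=OMIT skips any row with a missing value on g or resid (listwise).
MATRIX.
get gr /variables=g resid /missing=omit.
compute g=gr(:,1).
compute resid=gr(:,2).
compute n=nrow(g).
print n
 /format="f6.0"
 /title="Rows kept after listwise deletion".
END MATRIX.
```

This would only silence error 12555. As far as can be told from the macro, its AGGREGATE step takes N as the number of rows in the file, incomplete ones included, so g would still be scaled by the wrong N. Removing incomplete rows before calling the macro remains the safer route.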
In reply to this post by Jon Peck
Jon,
Can you tell us how the SPSSINC BREUSCH PAGAN implementation handles the missing values?
David
The default is listwise deletion, and stopping the procedure if missing values are found is an option. The procedure runs the specified regression model (categorical variables are automatically converted to factors) and then tests for homoscedasticity.

You can specify that the heteroscedasticity test will be against the alternative that the variance is a function of a chosen set of predictors. Specify the desired predictors in the Variance Model list. If the Variance Model list is empty, the heteroscedasticity test will be against the alternative that the variance is a function of the level of the dependent variable.
So it really doesn't go beyond Marta's macro or the other one the OP located?
Would some sort of data imputation make sense? For each independent variable Xi use remaining independents and dependent to estimate the missing values in Xi. Iterate until stabilized. Run the heteroscedasticity test on this data set? Just thinking about it without any coffee. ;-)
I don't think that Marta's macro provides for a variance model for the alternative. I would be suspicious of testing for heteroscedastic errors after imputation, since the imputed cases are likely to have residuals with higher variance due to the additional error from imputation. Better to exclude such cases.
What are your thoughts on using PAIRWISE prior to running the test?
You are utilizing more information than LISTWISE.
My inclination would be not to use pairwise in this context. With pairwise, the usual distributional properties of the estimators don't exactly apply, so I don't know what the test properties would be. Since heteroscedasticity is a minor problem in regression unless it is severe (and hence easily detectable), the possible additional power of the test wouldn't have a lot of value in most cases, even assuming that its statistical properties are still valid.