Greetings, and apologies for cross-posting.

It is often claimed that normal-based methods such as linear regression are 'robust' and do not give misleading results, even when data are far from normally distributed.

To investigate this claim, several real data sets have been analysed, both using normal-based methods and using methods based on various non-normal distributions. The first scenario, Scenario 1, is given below.

We want to compare the actual concordance of two alternative methods with the predictions of statistical practitioners, such as the committed users of this list. So we are asking for your predictions about concordance for various scenarios.

Scenario 1: Multiple linear regression is performed with a raw and a transformed metric. Predict the % agreement between results from the 2 metrics.

The analyst wants to know which of 21 features significantly predict overall satisfaction.
The raw metric is the proportion of respondents favourable, p.
BUT p is not, and cannot be, normally distributed. So an alternative is the inverse normal, z, corresponding to p.
Best subset linear regression was conducted for 51 separate units: a. using p as metric; b. using z as metric.

Concordance question: How much difference does it make?
Out of all the significant predictors, predict:
% same predictors significant at 95% CL for both p and z
% predictors significant only for p
% predictors significant only for z

Please give your expert predictions at https://www.surveymonkey.com/s/9SY7V7Z
More details about the project at: http://dianakornbrot.wordpress.com/projects/methods-matter/

Dissemination of results
The actual concordance and a summary of the predicted concordance of experts will be published on 16 Feb 2014 at http://dianakornbrot.wordpress.com/projects/methods-matter/

Many thanks for reading this long screed. Comments on the project are very welcome.
best
Diana
_____________________
Professor Diana Kornbrot
Work
University of Hertfordshire
College Lane, Hatfield, Hertfordshire AL10 9AB, UK
voice: +44 (0) 170 728 4626
email: [hidden email]
skype: kornbrotme
Home
19 Elmhurst Avenue
London N2 0LT, UK
voice: +44 (0) 208 444 2081
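For readers who want to try Scenario 1 on their own data, the p-to-z mapping and the three concordance percentages described in the post above might be sketched as follows. This is only an illustration, not the project's actual code: the function names `probit` and `concordance`, the example sets of significant predictors, and the choice of the union of significant predictors as the denominator are all assumptions.

```python
from statistics import NormalDist


def probit(p):
    """Inverse standard normal CDF: the z corresponding to a proportion p (0 < p < 1)."""
    return NormalDist().inv_cdf(p)


def concordance(sig_p, sig_z):
    """Percentages of significant predictors shared by, or unique to, each metric.

    Assumption: the denominator is every predictor significant under
    at least one of the two metrics.
    """
    union = sig_p | sig_z
    if not union:
        return {"both": 0.0, "p_only": 0.0, "z_only": 0.0}
    n = len(union)
    return {
        "both": 100 * len(sig_p & sig_z) / n,
        "p_only": 100 * len(sig_p - sig_z) / n,
        "z_only": 100 * len(sig_z - sig_p) / n,
    }


# Hypothetical feature numbers found significant under each metric (made up).
sig_p = {1, 4, 7, 12}
sig_z = {1, 4, 9, 12}
result = concordance(sig_p, sig_z)
print(result)
```

With these made-up sets, three of the five predictors that are significant anywhere are shared, and one is unique to each metric.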
Diana,
After I read your posting I looked at your website, and while I think I understand the overall question you are asking, I don't understand the construction of your dataset.

To summarize the dataset: people rated 21 features of something and also rated their overall satisfaction, all on a 1-5 scale. Feature ratings were recoded 1-3=0, 4-5=1. There seem to have been 51 groups of people. Each group is analyzed separately because different relationships may be expected in each group.

This I don't get: it seems that the feature ratings were converted to either proportions or "z's" via an inverse normal distribution mapping of the proportions. Either way, haven't you converted your 51*n(g) dataset to an N=51 dataset?

Gene Maguin
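The recoding and aggregation as Gene summarizes them could look roughly like this for a single feature. It is a sketch of his reading of the dataset, not a confirmed description of how it was built; the function names, the unit labels "A" and "B", and the ratings are hypothetical.

```python
from collections import defaultdict


def favourable(rating):
    """Recode a 1-5 rating as described in the thread: 1-3 -> 0, 4-5 -> 1."""
    return 1 if rating >= 4 else 0


def proportion_favourable(rows):
    """rows: iterable of (unit, rating) pairs for one feature.

    Returns {unit: proportion of that unit's ratings recoded to 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # unit -> [favourable count, total count]
    for unit, rating in rows:
        counts[unit][0] += favourable(rating)
        counts[unit][1] += 1
    return {unit: fav / total for unit, (fav, total) in counts.items()}


# Hypothetical respondent-level ratings for two units.
rows = [("A", 5), ("A", 3), ("A", 4), ("A", 2),
        ("B", 1), ("B", 4), ("B", 5), ("B", 5)]
props = proportion_favourable(rows)
print(props)
```

Each unit ends up with one proportion per feature, which is the reduction in the number of rows that the question above is about.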
In reply to this post by Kornbrot, Diana
Complaints:
1. The importance of transformations is seen especially in data where there are outliers. You don't have outliers, to speak of, when your initial data are Likert items. After dichotomizing, the choice of whether or not to use probits is unlikely to make a difference that matters, so long as the proportions are between 20% and 80%.

2. Since the data are measured with Likert scaling of the items, it seems that the natural comparison would be between analyzing the data either (a) while assuming that they are normal, by the usual regression; or (b) while assuming that they should be rank-transformed, by performing regression on the rank-transformed version of the raw scores. It is well known that when there is sufficient power, artifacts of rank-transformed data tend to introduce extra variables (including suppressors) in order to account for the bad fit at the extremes -- assuming that you have the large amount of data needed to find reproducible predictors.

3. "Best subset linear regression" is such a bad idea that this project should be rejected for the fact that it seems to imply that it is *not* (almost always) a bad idea.

--
Rich Ulrich
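One way to check the point about proportions between 20% and 80% is to measure how close to a straight line the probit transform is over that range, compared with a range that reaches into the tails. The sketch below uses only the Python standard library; the helper names, the grid size, and the comparison range of 1% to 99% are assumptions chosen for illustration.

```python
from statistics import NormalDist


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def probit_linearity(lo, hi, steps=61):
    """Correlation between p and its probit z on an evenly spaced grid over [lo, hi]."""
    inv = NormalDist().inv_cdf
    ps = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    zs = [inv(p) for p in ps]
    return pearson_r(ps, zs)


r_mid = probit_linearity(0.20, 0.80)   # the range named in point 1
r_wide = probit_linearity(0.01, 0.99)  # a range that includes the tails
print(f"r over 0.20-0.80: {r_mid:.4f}")
print(f"r over 0.01-0.99: {r_wide:.4f}")
```

A correlation very close to 1 in the middle range means z is nearly a linear rescaling of p there, so a linear regression would be expected to pick out much the same predictors under either metric; the transform bends away from a line only as proportions approach 0 or 1.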
As usual, Rich made a very insightful response.

In addition, CATREG can be used to model the data specifying different assumptions about level of measurement and to compare the results.

Art Kendall
Social Research Consultants