SPSSX Discussion

Statistics Challenge: Does analysis metric matter? Are normal based methods robust?


Kornbrot, Diana

Statistics Challenge: Does analysis metric matter? Are normal based methods robust?

Greetings and apologies for cross-posting

It is often claimed that normal based methods such as linear regression are 'robust' and do not give misleading results, even when data are far from normally distributed.

To investigate this claim, several real data sets have been analysed, both using normal based methods and using methods based on various non-normal distributions. The first scenario, Scenario 1, is given below.

We want to compare the actual concordance of two alternative methods with the predictions of statistical practitioners, such as the committed users of this list. So we are asking for your predictions about concordance for various scenarios.

Scenario 1: Multiple linear regression is performed with a raw and a transformed metric.

Predict % agreement between results from the 2 metrics.
The analyst wants to know which of 21 features significantly predict overall satisfaction.
The raw metric is the proportion of respondents favourable, p.
But p is not, and cannot be, normally distributed. So an alternative is the inverse normal, z, corresponding to p.
Best subset linear regression was conducted for 51 separate units: a. using p as metric; b. using z as metric.
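
A minimal sketch of the transformation described in Scenario 1, assuming z is the standard normal quantile (probit) of p. The proportions below are made up for illustration and are not project data. The post does not say how proportions of exactly 0 or 1 were handled, so this sketch simply rejects them.

```python
from statistics import NormalDist


def probit(p: float) -> float:
    """Return the standard normal quantile z such that Phi(z) = p."""
    if not 0.0 < p < 1.0:
        # Assumption: boundary proportions are rejected; the post is silent on this.
        raise ValueError("p must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(p)


# Hypothetical proportions of favourable respondents (not from the project data).
example_proportions = [0.10, 0.20, 0.50, 0.80, 0.90]
z_values = [probit(p) for p in example_proportions]

for p, z in zip(example_proportions, z_values):
    print(f"p = {p:.2f}  ->  z = {z:+.3f}")
```

The regression in (a) would then use the p values as they stand, and the regression in (b) would use the corresponding z values.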

Concordance Question: How much difference does it make?
Of all the significant predictors, predict what:

% same predictors significant at the 95% confidence level for both p and z

% predictors only significant for p

% predictors only significant for z.
Please give your expert predictions at https://www.surveymonkey.com/s/9SY7V7Z

More details about project at: http://dianakornbrot.wordpress.com/projects/methods-matter/
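
The three percentages asked for in the concordance question can be sketched as set arithmetic. This is only an illustration: the feature labels are hypothetical, and taking the denominator to be every predictor significant under at least one metric is an assumption about what "all the significant predictors" means.

```python
def concordance(sig_p: set, sig_z: set) -> dict:
    """Percentages of predictors significant under both metrics, p only, z only.

    Assumption: the denominator is the union of the two sets, i.e. every
    predictor that is significant under at least one metric.
    """
    union = sig_p | sig_z
    if not union:
        raise ValueError("no significant predictors under either metric")
    total = len(union)
    return {
        "both": 100.0 * len(sig_p & sig_z) / total,
        "p_only": 100.0 * len(sig_p - sig_z) / total,
        "z_only": 100.0 * len(sig_z - sig_p) / total,
    }


# Hypothetical significant features for one unit (labels are invented).
sig_with_p = {"f01", "f04", "f09", "f15"}
sig_with_z = {"f01", "f04", "f12"}

result = concordance(sig_with_p, sig_with_z)
print(result)
```

By construction the three percentages sum to 100, so a prediction for any two of them fixes the third.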

Dissemination of Results

The actual concordance and a summary of the predicted concordance of experts will be published on 16 Feb 2014 at http://dianakornbrot.wordpress.com/projects/methods-matter/

Many thanks for reading this long screed. Comments on the project are very welcome.

best

Diana

_____________________

Professor Diana Kornbrot

Work

University of Hertfordshire

College Lane, Hatfield, Hertfordshire AL10 9AB, UK

voice: +44 (0) 170 728 4626

email: [hidden email]

skype: kornbrotme

Home

19 Elmhurst Avenue

London N2 0LT, UK

voice: +44 (0) 208 444 2081

Maguin, Eugene

Re: Statistics Challenge: Does analysis metric matter? Are normal based methods robust?

Diana,

After I read your posting I looked at your website, and while I think I understand the overall question you are asking, I don’t understand the construction of your dataset. To summarize the dataset: People rated 21 features of something and also rated their overall satisfaction, all on a 1-5 scale. Feature ratings were recoded 1-3=0, 4-5=1. There seem to have been 51 groups of people. Each group is analyzed separately because different relationships may be expected in each group. This I don’t get: It seems that the feature ratings were converted to either proportions or “z’s” via an inverse normal distribution mapping of the proportions. Either way, haven’t you converted your 51*n(g) dataset to an N=51 dataset?
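
The construction as summarized above can be sketched as follows, assuming the steps really are "dichotomize each rating, then take the proportion favourable within each group". The group labels and ratings are invented for illustration.

```python
from collections import defaultdict


def dichotomise(rating: int) -> int:
    """Recode a 1-5 rating: 1-3 -> 0 (not favourable), 4-5 -> 1 (favourable)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    return 1 if rating >= 4 else 0


def proportion_favourable(records):
    """Collapse (group, rating) records to one proportion per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [favourable, total]
    for group, rating in records:
        counts[group][0] += dichotomise(rating)
        counts[group][1] += 1
    return {group: fav / total for group, (fav, total) in counts.items()}


# Hypothetical respondent-level ratings of one feature in two groups.
records = [("A", 5), ("A", 4), ("A", 2), ("A", 3),
           ("B", 1), ("B", 4), ("B", 3), ("B", 2)]

by_group = proportion_favourable(records)
print(by_group)  # one value per group, however many respondents each group has
```

With 51 groups this yields 51 values per feature, which is the collapse from 51*n(g) respondent records to N=51 that the question is about.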

Gene Maguin

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Kornbrot, Diana
Sent: Tuesday, January 28, 2014 7:36 AM
To: [hidden email]
Subject: Statistics Challenge: Does analysis metric matter? Are normal based methods robust?


Rich Ulrich

Re: Statistics Challenge: Does analysis metric matter? Are normal based methods robust?

In reply to this post by Kornbrot, Diana

Complaints:
1. The importance of transformations is especially seen in
data where there are outliers. You don't have outliers, to
speak of, when your initial data are Likert items. After
dichotomizing and choosing (yes/no) whether to use Probits,
no difference is likely to matter so long as the proportions
are between 20% and 80%.

2. Since data are measured with Likert scaling of the items,
it seems that the natural comparison would be between
analyzing the data either (a) while assuming that they
are normal, by the usual regression; or (b) while assuming
that they should be rank-transformed, by performing regression
on the rank-transformed version of the raw scores.

It is well-known that when there is sufficient power, artifacts
of rank-transformed data tend to introduce extra variables
(including suppressors) in order to account for the bad fit
at the extremes -- assuming that you have the large amount of
data needed to find reproducible predictors.

3. "Best subset linear regression" is such a bad idea that
this project should be rejected for the fact that it seems to
imply that it is *not* (almost always) a bad idea.
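
Complaint 1 can be checked numerically. The sketch below is an illustration only: it uses an evenly spaced grid of proportions (an assumption, not the project data) and computes the Pearson correlation between p and its probit z over the 20%-80% range and over a wider 1%-99% range.

```python
from math import sqrt
from statistics import NormalDist, fmean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def probit_correlation(low_pct: int, high_pct: int) -> float:
    """Correlation between p and probit(p) on a 1%-step grid from low to high."""
    ps = [k / 100 for k in range(low_pct, high_pct + 1)]
    zs = [NormalDist().inv_cdf(p) for p in ps]
    return pearson(ps, zs)


r_mid = probit_correlation(20, 80)   # the range named in complaint 1
r_wide = probit_correlation(1, 99)   # a wider range, for comparison

print(f"r over 20%-80%: {r_mid:.4f}")
print(f"r over  1%-99%: {r_wide:.4f}")
```

The closer the correlation is to 1, the closer z is to a linear rescaling of p, and a linear rescaling of a variable does not change which predictors reach significance in an ordinary regression.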

--
Rich Ulrich


=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Art Kendall

Re: Statistics Challenge: Does analysis metric matter? Are normal based methods robust?

As usual, Rich made a very insightful response.

In addition, CATREG can be used to model the data specifying different assumptions about level of measurement and to compare the results.

Art Kendall
Social Research Consultants

On 1/30/2014 1:44 AM, Rich Ulrich [via SPSSX Discussion] wrote:


