SPSSX Discussion

correlation and regression for ordinal and nominal (dependent)

Classic

List

Threaded

32 messages Options

SiriusxTR

correlation and regression for ordinal and nominal (dependent)

Hello. I have a problem. I have an ordinal and nominal variable. I want to do a correlation test. What test should I use? Kendall tau and Spearman are not appropriate as both would have to be ordinal at least, phi not because one is ordinal, eta not because don't have scale. In one book I have read that for Cramer's V you need to have two nominals or dichotomous, in another book I have read that it is ok for nominal+ordinal. So what is the right solution?

For regression I suppose the binomial logic regression would be fine for nominal as dependent?

Thank you.

John F Hall

Re: correlation and regression for ordinal and nominal (dependent)

Not sure if this is covered, but have a look at Jim Ring's stats notes on my website.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of SiriusxTR
Sent: 10 February 2012 09:26
To: [hidden email]
Subject: correlation and regression for ordinal and nominal (dependent)

Hello. I have a problem. I have an ordinal and nominal variable. I want to do

a correlation test. What test should I use? Kendall tau and Spearman are not

appropriate as both would have to be ordinal at least, phi not because one

is ordinal, eta not because don't have scale. In one book I have read that

for Cramer's V you need to have two nominals or dichotomous, in another book

I have read that it is ok for nominal+ordinal. So what is the right

solution?

For regression I suppose the binomial logic regression would be fine for

nominal as dependent?

Thank you.

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/correlation-and-regression-for-ordinal-and-nominal-dependent-tp5471812p5471812.html

Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================

To manage your subscription to SPSSX-L, send a message to

[hidden email] (not to SPSSX-L), with no body text except the

command. To leave the list, send the command

SIGNOFF SPSSX-L

For a list of commands to manage subscriptions, send the command

INFO REFCARD

Poes, Matthew Joseph

Re: correlation and regression for ordinal and nominal (dependent)

This is a complex topic to analyze correctly. I would argue there are two good ways to do this, though interpretation is tricky in both cases. I’d recommend researching the methods to understand them best.

First, since true nominal data has no real scale to it, no higher or lower, but rather just arbitrary values, typical correlations aren’t possible. However, that doesn’t mean we can’t look at how certain nominal responses are associated with certain ordinal responses.

Method 1: I think you could use a cluster analytic technique to cluster ordinal responses based on nominal responses.

Method 2: Categorical regression (Optimal Scaling Technique): This has the ability to combine different types of variables, ordinal, nominal, and continuous, transform the values into all continuous ratio type variables, and create correlations. The problem with this approach is that it’s not very common, it’s not well understood, and the best practice for transforming nominal values for this purpose is rather weak. I’d add to that, that what your interpreting in the end is a bit ambiguous if you don’t the necessary understanding of the method. I’m hoping to have a paper on this topic out for its use in public health research at some point in the future, but that won’t be of any help to you right now.

-Matt

From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of John F Hall
Sent: Friday, February 10, 2012 8:46 AM
To: [hidden email]
Subject: Re: correlation and regression for ordinal and nominal (dependent)

Not sure if this is covered, but have a look at Jim Ring's stats notes on my website.

-----Original Message-----
From: SPSSX(r) Discussion [hidden email] On Behalf Of SiriusxTR
Sent: 10 February 2012 09:26
To: [hidden email]
Subject: correlation and regression for ordinal and nominal (dependent)

Hello. I have a problem. I have an ordinal and nominal variable. I want to do

a correlation test. What test should I use? Kendall tau and Spearman are not

appropriate as both would have to be ordinal at least, phi not because one

is ordinal, eta not because don't have scale. In one book I have read that

for Cramer's V you need to have two nominals or dichotomous, in another book

I have read that it is ok for nominal+ordinal. So what is the right

solution?

For regression I suppose the binomial logic regression would be fine for

nominal as dependent?

Thank you.

View this message in context: http://spssx-discussion.1045642.n5.nabble.com/correlation-and-regression-for-ordinal-and-nominal-dependent-tp5471812p5471812.html

Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================

To manage your subscription to SPSSX-L, send a message to

[hidden email] (not to SPSSX-L), with no body text except the

command. To leave the list, send the command

SIGNOFF SPSSX-L

For a list of commands to manage subscriptions, send the command

INFO REFCARD

Bruce Weaver

Re: correlation and regression for ordinal and nominal (dependent)

Administrator

In reply to this post by SiriusxTR

I assume by ordinal, you mean ordered categories (as opposed to ranks). If so, how many categories are there?

Re logistic regression, did you mean binary logistic (i.e., the nominal outcome variable has two categories)? If I'm not mistaken, binomial logistic regression refers to the situation where you have X successes in N trials for each subject. In the documentation for GENLIN, for example, this is where you would use the format,

GENLIN events-var of trials-var...etc

Thanks for clarifying.

SiriusxTR wrote

Hello. I have a problem. I have an ordinal and nominal variable. I want to do a correlation test. What test should I use? Kendall tau and Spearman are not appropriate as both would have to be ordinal at least, phi not because one is ordinal, eta not because don't have scale. In one book I have read that for Cramer's V you need to have two nominals or dichotomous, in another book I have read that it is ok for nominal+ordinal. So what is the right solution?

For regression I suppose the binomial logic regression would be fine for nominal as dependent?

Thank you.

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/

"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).

Robert Mc

Re: correlation and regression for ordinal and nominal (dependent)

In reply to this post by SiriusxTR

Have you considered multinomial logistic regression?

You could use either the top or bottom category as the reference, or perhaps even the most frequently selected category.

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

In reply to this post by Bruce Weaver

So, I have found the answer in the link above proposed.
To explain myself, even if it is not so interesting: I have patients with prostate cancer, no. of positive fragments in biopsy (ordinal - 3 categories) and biochemical relapse (nominal - dichotomous had or did not have). I wanted to check the correlation between no. of positive fragments at biopsy and relapse, the power of this associations, and check for prediction (Binomial Logistic Regression, I suppose) if can.

Here the answer:
"SPSS does not produce specific measures when the dependent variable is nominal and the independent variable is not. There are two ways of getting round this problem: (i) for 2x2 tables the dependent variable can normally be assumed to have the same level as the independent variable, and one of the measures described below can be used; (ii) otherwise the independent variable must be treated as nominal (losing some information about the relationship between values) and chi-square or Cramer's V can be used."

Bruce Weaver

Re: correlation and regression for ordinal and nominal (dependent)

Administrator

You can use the test of "linear-by-linear" association that appears in the chi-square results from CROSSTABS. For more info:

http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html

HTH.

SiriusxTR wrote

So, I have found the answer in the link above proposed.
To explain myself, even if it is not so interesting: I have patients with prostate cancer, no. of positive fragments in biopsy (ordinal - 3 categories) and biochemical relapse (nominal - dichotomous had or did not have). I wanted to check the correlation between no. of positive fragments at biopsy and relapse, the power of this associations, and check for prediction (Binomial Logistic Regression, I suppose) if can.

Here the answer:
"SPSS does not produce specific measures when the dependent variable is nominal and the independent variable is not. There are two ways of getting round this problem: (i) for 2x2 tables the dependent variable can normally be assumed to have the same level as the independent variable, and one of the measures described below can be used; (ii) otherwise the independent variable must be treated as nominal (losing some information about the relationship between values) and chi-square or Cramer's V can be used."

Rich Ulrich

Re: correlation and regression for ordinal and nominal (dependent)

In reply to this post by SiriusxTR

Here is a better starting place. A dichotomous variable
has only one interval, so it satisfies the need for an "equal
interval" variable.

With three categories only, the "number of positive fragments"
is presumably scored in near-equal intervals, unless there is
a large lack of information on the subject. So the obvious
candidate for a simple test is the ordinary Pearson r. Or a
t-test on the fragments.

Given the small number of score-points, I think I would offer
the Odds-ratios between categories (zero? vs. each of the others)
as the meaningful indicator of effect-size.

--
Rich Ulrich

> Date: Fri, 10 Feb 2012 08:06:12 -0800

> From: [hidden email]
> Subject: Re: correlation and regression for ordinal and nominal (dependent)
> To: [hidden email]
>
> So, I have found the answer in the link above proposed.
> To explain myself, even if it is not so interesting: I have patients with
> prostate cancer, no. of positive fragments in biopsy (ordinal - 3
> categories) and biochemical relapse (nominal - dichotomous had or did not
> have). I wanted to check the correlation between no. of positive fragments
> at biopsy and relapse, the power of this associations, and check for
> prediction (Binomial Logistic Regression, I suppose) if can.
>
> Here the answer:
> "*SPSS does not produce specific measures when the dependent variable is
> nominal and the independent variable is not.* There are two ways of getting
> round this problem: (i) for 2x2 tables the dependent variable can normally
> be assumed to have the same level as the independent variable, and one of
> the measures described below can be used; (ii) *otherwise the independent
> variable must be treated as nominal (losing some information about the
> relationship between values) and chi-square or Cramer's V can be used*."

[...]

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

Thank you everybody for your help.

So I went with Cramer's V and lambda. And I will try the binomial logistic regression. If anybody has any observation, pls do not hesitate.

My question would be: I have 8 scale measurements, same cases, 8 different time points. How can I put it on a line graph, time-line, to show me the means? I have done this in SPSS 17 with legacy>interactive graph, but how can it be done in SPSS 20?

ViAnn Beadle

Re: correlation and regression for ordinal and nominal (dependent)

Here's the help text for summarizing multiple variables using chart
builder. Do it, and paste the syntax to look at it:

Drag a gallery chart or a graphic element onto the canvas.

Drag one of the scale variables that you want to summarize to the y axis.

Drag another scale variable to the top section of the y-axis drop zone.
Drop the variable when you see the plus sign (+) in the drop zone. The
Create Summary Group dialog box then appears.

Note: If the gallery chart contains a point element, the previous action
will create an overlay scatterplot. See the topic Scatterplots and dot plots
for more information. To create summaries of separate variables for a point
element, you must first drag a categorical variable to the x axis.

When you summarize multiple variables, the Chart Builder creates a new
variable whose categories are the individual variables. This "summary group"
variable is put on the x axis and is displayed as INDEX. The y axis displays
the summarized value for each variable. The Chart Builder uses an asterisk
(*) to indicate any of these constructed variables. Note that you can move
the INDEX variable to a paneling or grouping zone if desired. It acts like a
categorical variable.

Click OK to create the summary group variable.

If necessary, drag additional variables to the y-axis drop zone.

As with any chart, you can edit the individual elements. For example, you
could choose the median as the statistic for each variable, or you could
change the scale for one of the axes. See the topic Editing elements for
more information.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
SiriusxTR
Sent: Saturday, February 11, 2012 4:05 AM
To: [hidden email]
Subject: Re: correlation and regression for ordinal and nominal (dependent)

Thank you everybody for your help.

So I went with Cramer's V and lambda. And I will try the binomial logistic
regression. If anybody has any observation, pls do not hesitate.

My question would be: I have 8 scale measurements, same cases, 8 different
time points. How can I put it on a line graph, time-line, to show me the
means? I have done this in SPSS 17 with legacy>interactive graph, but how
can it be done in SPSS 20?

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/correlation-and-regression-for
-ordinal-and-nominal-dependent-tp5471812p5474790.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command SIGNOFF SPSSX-L For a list of
commands to manage subscriptions, send the command INFO REFCARD

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Ray Koopman

Re: correlation and regression for ordinal and nominal (dependent)

In reply to this post by SiriusxTR

The topic title implies that "relapse" is the dependent variable and "number of positive fragments" is the independent variable. Is that inference correct?

> So I went with Cramer's V and lambda. And I will try the binomial logistic regression.
> If anybody has any observation, pls do not hesitate.

Both V and lambda treat the data as unordered categories, discarding the ordinal information in the data.

To extend Rich's comment that a dichotomous variable can always be treated as an interval variable, it can also always be treated as an ordinal variable, so Kendall's tau *would* be appropriate. (You could also use Somers' d, which is to tau_b as a regression coefficient is to a correlation.) Spearman's rho would *not* be appropriate, because using ranks as if they were on an interval scale, which is what rho does, confounds the relative sizes of the intervals between the scale points of the trichotomous variable with the proportion of cases that are at each scale point.

However, all the above-mentioned statistics have the undesirable property that they are artifacts of the marginal distributions of the cases. To escape that, you should follow Rich's suggestion and look at (the logs of) the relapse odds for the three fragment-count categories. If they form an increasing sequence then there is a scaling of the three fragment-count categories that will make the trend linear, and the difference between the third and first log-odds is a measure of the size of the overall trend. On the other hand, if the log-odds do not form an increasing sequence then the location and magnitude of the reversal would also be of interest.

> I have 8 scale measurements, same cases, 8 different time points.

Are you saying that you have an N x 8 (cases x time points) data matrix, in which each cell contains both a fragment count and a relapse value?

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

In reply to this post by ViAnn Beadle

Wow, ViAnn Beadle, you are my hero!!!:))) Now I can uninstall the v.17:)) Really, some things are not written anywhere, and they are so hard to find!
I thank you all for your help, it means so much to me!

Ok, I will write this down in detail. I have a sample of 556 patients with prostate cancer. Biopsies were taken before the surgical intervention. I grouped the positive no. of biopsies in 3 groups (Group 1 - 1-2 positive biopsies, Group 2 - 3-4 positive biopsies, Group 3 - 5-10 positive biopsies), so this would be the ordinal, independent variable. Biochemical relapse after intervention, a nominal, dichotomous variable, yes or no (involves a 3 year period).

Task: correlation and get info on how I could predict the relapse chances based on positive fragments group.
Problem: Choosing the appropriate tests. I have read a lot in the past week, but it seems there is a problem with correlation between ordinal and nominal, no manual likes to talk about it, especially if the outcome variable is the nominal.

After a looooong studying period I have chosen Cramer's v and lambda, because books say that: Kendall-tau can be used only with 2 ordinals, tau-b if (no. columns=no. rows), tau-c if (no. columns<>no. rows), Pearson r only for scale and normally distributed, Spearman only with 2 ordinals, Gamma and Sommer's d only with 2 ordinals; and I quoted above from a book regarding the ordinal+nominal test that says Cramer's V is appropriate.

As I understand, nominal tests can be used for ordinals, some non-parametric for parametric, only you loose some power of the test (e.g. Wilcoxon for a normally distributed scale). But Cramer's and lambda gave me even this way positive, statistically significant results.:))

Now, I want to do a binomial logistic regression test, but first I have to learn about it more, and how to interpret it.

So Sommers'd for Kendall is like lambda for Cramer's???

It would be awesome to have a list which shows the tests paired in 3 , like:
(test which shows existence of correlation )-(test for measure the power of the correlation) - (test for predicting the chances or calculating the outcome variable based on independent)

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

In reply to this post by Ray Koopman

Sorry Ray, the 8 different time-points was a questions separate from this theme, I posted it sooner in legacy> interactive graph in SPSS v 20?, but no one answered it, so I figured if this post was so mediatized, why not ask here, maybe I get an answer, and I got the solution from ViAnn:((((

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

In reply to this post by ViAnn Beadle

Dear ViAnn,

I have another issue, very similar to the other one which you have answered. I am trying to do the same thing with the 8 (n) measurements in time-line, only I categorized them in ordinal variables (each of them has the same 4 categories) and I would like to present them in a clustered bar chart, where the new variable created (Category or Index) is on the x-axis and the results are clustered by the 4 categories. I tried to do it in SPSS 17 with interactive graph, but it is not working. I would like to get as a result a graph with 8 measurements, meaning 8 (n) x 4 bars, Count on Y-axis. I have seen in published papers that it can be done, but I do not know how. Maybe you can help?

David Marso

Re: correlation and regression for ordinal and nominal (dependent)

Administrator

Perhaps restructuring the data using VARSTOCASES might yield a more natural data structure for the creation of such a graph!

SiriusxTR wrote

Dear ViAnn,

I have another issue, very similar to the other one which you have answered. I am trying to do the same thing with the 8 (n) measurements in time-line, only I categorized them in ordinal variables (each of them has the same 4 categories) and I would like to present them in a clustered bar chart, where the new variable created (Category or Index) is on the x-axis and the results are clustered by the 4 categories. I tried to do it in SPSS 17 with interactive graph, but it is not working. I would like to get as a result a graph with 8 measurements, meaning 8 (n) x 4 bars, Count on Y-axis. I have seen in published papers that it can be done, but I do not know how. Maybe you can help?

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

It is not about the data organization. It is a technical question. How can I represent the frecquency (Count on Y-axis) in a clustered bar chart, multiple ordinal variables which have the same categories, variables to be on X-axis, clustered by their categories, not by another variable.

David Marso

Re: correlation and regression for ordinal and nominal (dependent)

Administrator

Such issues are *FREQUENTLY* and fundamentally a function of data structure/organization!!!!!

SiriusxTR wrote

It is not about the data organization. It is a technical question. How can I represent the frecquency (Count on Y-axis) in a clustered bar chart, multiple ordinal variables which have the same categories, variables to be on X-axis, clustered by their categories, not by another variable.

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

Sorry, meanwhile sby sent me additional info. I forgot to mention, that I would like to do all these without using syntax. I am a surgeon, in syntax I am zero.

David Marso

Re: correlation and regression for ordinal and nominal (dependent)

Administrator

Sorry, I won't walk you through any dialogs, *BUT*, the data structure is fundamental!!
Here is the appropriate syntax and I have attached a JPG of the associated graph.
The more modern graphics likely will give you a similar chart (I have an ancient version and don't have relevant knowledge of the current graphics language GPL etc.
FWIW: Syntax is an essential skill if you plan to be able to document your work/leave a reproducible process stream/repeat the analysis. It doesn't really matter that you are a surgeon. Everybody needs to begin somewhere!
BTW: These can be done in the dialogs but I'll leave that to you to puzzle out.
HTH David

--
*Generate some simulated data*.
INPUT PROGRAM.
LOOP caseID=1 to 1000.
+ DO REPEAT v=var01 to var08.
+ compute v=TRUNC(UNIFORM(4))+1.
+ END REPEAT.
+ END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

** JUST GET YOUR OWN DATA and adapt the following.
------------
*.
FREQ var01 TO var08.
VARSTOCASES /MAKE trans1 FROM var01 var02 var03 var04 var05 var06 var07 var08
/INDEX = Index1(trans1)
/KEEP = caseid
/NULL = KEEP.
VALUE LABELS trans1
1 "category 1" 2 "category 2" 3 "category 3" 4 "category 4" .
CROSSTABS TABLE trans1 BY Index1.

IGRAPH /VIEWNAME='Bar Chart' /X1 = VAR(trans1) TYPE = CATEGORICAL /Y = $count
/COLOR = VAR(index1) TYPE = CATEGORICAL CLUSTER /COORDINATE = VERTICAL
/X1LENGTH=3.0 /YLENGTH=3.0 /X2LENGTH=3.0 / CHARTLOOK='NONE'
/CATORDER VAR(index1) (ASCENDING VALUES OMITEMPTY)
/BAR KEY=ON SHAPE = RECTANGLE BASELINE = AUTO.

SiriusxTR wrote

Sorry, meanwhile sby sent me additional info. I forgot to mention, that I would like to do all these without using syntax. I am a surgeon, in syntax I am zero.

SiriusxTR

Re: correlation and regression for ordinal and nominal (dependent)

Dear David,

Thank you for the help, I know I should start somewhere, I did some programming in high-school, it was 10 years ago and I know that to learn a programming language takes time, which currently I do not have.
The .jpg shows what I need, only 1 correction. I need the variables to be on the x-axis and the categories defined by colours, so clustering by categories of the variables.
Maybe there are some new graph templates (not included in SPSS) somewhere which I can use? Or save this syntax as a template in graphs, to use it in a general manner without rewriting the syntax depending the no. of the variable?