SPSSX Discussion

Non-Positive Definite Matrices in Item Analysis and Factor Analysis

Hannah State-Davey

Non-Positive Definite Matrices in Item Analysis and Factor Analysis

All,

I am running an item/reliability analysis and factor analysis on 129
questionnaire items using SPSS 14. I have 204 respondents. However, I am
getting a warning in reliability analysis saying 'the determinant of the
covariance matrix is zero or approximately zero. Statistics based on its
inverse matrix cannot be computed and they are displayed as system missing
values'. What effect does this have on results in reliability analysis?

I know that if using principal components analysis a non-positive definite
covariance matrix is not an issue as no matrices are inverted.

I also thought that if using Pearson's r then you shouldn't obtain a non-
positive definite matrix?

Can anyone shed any light on this for me?

Many thanks in advance

Hannah State-Davey

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Swank, Paul R

Re: Non-Positive Definite Matrices in Item Analysis and Factor Analysis

It is certainly possible to get a dependency in any correlation matrix,
particularly large ones, especially when the number of variables
approaches the number of subjects.
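
A minimal sketch of why the ratio of variables to subjects matters, assuming made-up data (the numbers and the helper names `rank` and `center` are illustrative, not from the original analysis). Once each variable is centered on its mean, the rows of the data sum to zero, so the covariance (or correlation) matrix of p variables on n cases can have rank at most n - 1. With as many variables as cases it is therefore guaranteed to be singular:

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def center(rows):
    """Subtract each column's mean from that column."""
    n = len(rows)
    means = [Fraction(sum(col), n) for col in zip(*rows)]
    return [[Fraction(v) - mu for v, mu in zip(row, means)] for row in rows]

# Hypothetical data: 4 cases (rows) measured on 4 variables (columns).
data = [
    [1, 2, 0, 5],
    [2, 1, 3, 1],
    [4, 0, 1, 2],
    [3, 5, 2, 0],
]

n_cases, n_vars = len(data), len(data[0])
centered_rank = rank(center(data))

# The covariance matrix has the same rank as the centered data, so it
# cannot reach full rank (n_vars) when n_vars > n_cases - 1.
print(centered_rank, "<=", n_cases - 1, "<", n_vars)
```

The same bound applies to the number of cases that actually enter the analysis, which may be smaller than the number of respondents.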

Paul R. Swank, Ph.D.
Professor and Director of Research
Children's Learning Institute
University of Texas Health Science Center - Houston

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Hannah State-Davey
Sent: Monday, June 23, 2008 3:58 AM
To: [hidden email]
Subject: Non-Positive Definite Matrices in Item Analysis and Factor
Analysis

Richard Ristow

Re: Non-Positive Definite Matrices in Item Analysis

In reply to this post by Hannah State-Davey

At 04:58 AM 6/23/2008, Hannah State-Davey wrote:

>I also thought that if using Pearson's r then you shouldn't obtain a
>non-positive definite matrix?

Echoing what Paul Swank wrote: The matrix is singular (and not
positive definite) when any variable is a linear combination of any
set of the others.

>I am running on 129 questionnaire items

Start by running descriptive statistics, to see if any of those 129
show no variance.

After that, look critically at the definitions of your data, if
necessary with someone mathematically inclined. For example, make
sure that if you use a scale that is a sum or average of
questionnaire values, you don't also use those values.
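
The two checks above could be sketched roughly as follows. This is plain Python rather than SPSS syntax, and the item names and responses are invented for illustration:

```python
from statistics import pvariance

# Hypothetical responses: one list per item, one entry per respondent.
items = {
    "q1": [1, 2, 3, 4, 5],
    "q2": [3, 3, 3, 3, 3],  # constant item: no variance
    "q3": [2, 1, 2, 5, 4],
}

# A scale score built from the items. Analysing it together with
# q1..q3 would introduce an exact linear dependency.
items["total"] = [sum(vals) for vals in zip(items["q1"], items["q2"], items["q3"])]

# Check 1: which variables show no variance?
zero_variance = [name for name, vals in items.items() if pvariance(vals) == 0]

# Check 2: is "total" exactly the sum of the other three for every case?
total_is_sum = all(
    t == a + b + c
    for t, a, b, c in zip(items["total"], items["q1"], items["q2"], items["q3"])
)

print("No variance:", zero_variance)
print("total is an exact sum of q1..q3:", total_is_sum)
```

Either finding (a constant item, or a total analysed alongside its own components) is enough to make the matrix singular.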

ViAnn Beadle

Re: Non-Positive Definite Matrices in Item Analysis

The first thing that comes to mind: how many non-missing cases does the OP
end up with across the 129 questionnaire items? There may be few complete
cases, and hence limited variance.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of
Richard Ristow
Sent: Tuesday, June 24, 2008 11:12 AM
To: [hidden email]
Subject: Re: Non-Positive Definite Matrices in Item Analysis

Richard Ristow

Re: Non-Positive Definite Matrices in Item Analysis

At 02:08 PM 6/24/2008, ViAnn Beadle wrote:

>The first thing that comes to mind is how many non-missing cases
>does the OP end up with 129 questionnaire items? There may be few
>cases, hence limited variance.

Right! Thank you. I missed that one, and it's important.

You should run your analysis with listwise deletion of cases, i.e.
drop any that have missing values on any variable. You need to run
descriptives to check for 0 variance, *after* that deletion.

That also goes for linear dependencies. Offhand, I'd think dropping
cases a likelier source of 0 variance than of new linear
dependencies, but I'll bet somebody will think of an instance where
it'd introduce a dependency.
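
As a rough illustration of why the order matters, here is a small Python sketch with invented cases (the variable names and values are hypothetical, and `None` stands in for a missing response). An item that varies across all respondents can become constant once incomplete cases are dropped:

```python
from statistics import pvariance

# Hypothetical cases; None marks a missing response.
cases = [
    {"q1": 1, "q2": 4, "q3": 2},
    {"q1": 2, "q2": 4, "q3": None},
    {"q1": 3, "q2": 4, "q3": 5},
    {"q1": None, "q2": 1, "q3": 3},
]

# Listwise deletion: keep only cases with no missing value on any variable.
complete = [c for c in cases if all(v is not None for v in c.values())]

# Variances computed *after* the deletion.
variances = {
    name: pvariance([c[name] for c in complete]) for name in complete[0]
}
no_variance = [name for name, var in variances.items() if var == 0]

print(len(complete), "complete cases out of", len(cases))
print("No variance after listwise deletion:", no_variance)
```

Here q2 takes the values 4, 4, 4, 1 in the full data, but only the two cases where it equals 4 survive the deletion, so it ends up with zero variance.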

Hannah State-Davey

Re: Non-Positive Definite Matrices in Item Analysis and Factor Analysis

In reply to this post by Hannah State-Davey

All,

Many thanks for your comments on this issue. Apologies for my ignorance, I
am trying to learn all this stuff as quickly as I can, but can someone
give a definition of what they mean by 'dependency'?

Cheers

Art Kendall

Re: Non-Positive Definite Matrices in Item Analysis and Factor Analysis

One of the common ways that we users create a "dependency" or "redundancy"
that results in a singular matrix is to include a set of scale items and
their total together, e.g. as independent variables in a regression. Say I
have 5 items and their total. If the values for the 5 items are known, then
the total is known.

Another common way is to have a set of dichotomous variables for the
levels of a nominal level variable. Say I have 4 groups in my study, which
can be represented by 4 mutually exclusive yes/no variables. Any three of
the dichotomies carry all of the information that the four do, so the set
of four necessarily has redundancy/dependency. That is why one fewer
dichotomous variable than the number of levels of a nominal level variable
is used to represent that variable.
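
The dichotomy example can be checked numerically. The following is only a sketch in plain Python with made-up group memberships; `covariance_matrix` and `determinant` are small helpers written for this illustration, using exact fractions so that a zero determinant is exactly zero rather than "approximately zero":

```python
from fractions import Fraction

def covariance_matrix(columns):
    """Sample covariance matrix of a list of equal-length columns."""
    n = len(columns[0])
    means = [Fraction(sum(col), n) for col in columns]
    return [
        [
            sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n - 1)
            for cb, mb in zip(columns, means)
        ]
        for ca, ma in zip(columns, means)
    ]

def determinant(matrix):
    """Determinant by exact Gaussian elimination."""
    m = [row[:] for row in matrix]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((i for i in range(col, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for i in range(col + 1, len(m)):
            factor = m[i][col] / m[col][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[col])]
    return det

# Hypothetical group membership (4 groups) for 8 cases.
groups = [1, 2, 3, 4, 1, 2, 3, 4]

# One yes/no (1/0) variable per group.
dummies = [[1 if g == k else 0 for g in groups] for k in (1, 2, 3, 4)]

det_all_four = determinant(covariance_matrix(dummies))
det_three = determinant(covariance_matrix(dummies[:3]))

print("determinant with all 4 dichotomies:", det_all_four)
print("determinant with 3 dichotomies:", det_three)
```

With all four dichotomies the determinant is zero, because the four always sum to 1 for every case; dropping any one of them removes the dependency.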

When any variable in a matrix is perfectly predictable from all or
some of the other variables, the proportion of variance accounted for,
i.e., the squared multiple correlation, is 1.00.

Even if your set of variables is not necessarily intended to be items
on a scale, using the RELIABILITY procedure is a quick and easy way to
see the squared multiple correlation of each variable in the set. When
it is 1.00 or nearly so, that variable is involved in the redundancy.

In my experience, the most common sources of this complete
multicollinearity are the two above. If there are a lot of variables
and relatively few cases, you can get near or complete predictability of
a variable from the other variables.

Art Kendall
Social Research Consultants

Richard Ristow

Re: Non-Positive Definite Matrices in Item Analysis

In reply to this post by Hannah State-Davey

At 05:35 AM 6/25/2008, Hannah State-Davey wrote:

>Many thanks for your comments on this issue. Can someone give a
>definition of what they mean by 'dependency'?

Art gave you a run-down of common ways you can get dependence, but I
think you want to know what it is.

The definition is, variables are linearly dependent if there's any
linear combination of them that's always zero.

That doesn't help much, eh what?

Here's the simplest case: X and Y are linearly dependent, because

Y=2*X
or
2*X-Y=0

(In this case, "2*X-Y" is the 'linear combination' that's always zero.)

X Y
2 4
1 2
7 14

That linear dependence is still there, and the matrix is still
singular (and non-positive definite) even if you add a variable that
isn't part of the linear combination:

W X Y
9 2 4
8 1 2
8 7 14

And the dependence can involve more than two variables:

X Y Z
1 3 5
2 1 5
3 4 10
7 5 19

Here (if I've done the arithmetic right)
Z=2*X+Y; or,
2*X+Y-Z=0.

Here, X, Y and Z are linearly dependent; "2*X+Y-Z" is the linear
combination that's always 0.
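
For anyone who wants to check the arithmetic, here is a short Python sketch using the X, Y, Z values from the table above. The `cov` helper is written for this illustration only; it uses exact fractions so the determinant comes out as exactly zero:

```python
from fractions import Fraction

# The X, Y, Z example from the table above.
x = [1, 2, 3, 7]
y = [3, 1, 4, 5]
z = [5, 5, 10, 19]

# The linear combination 2*X + Y - Z, evaluated case by case.
combo = [2 * a + b - c for a, b, c in zip(x, y, z)]

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    mu, mv = Fraction(sum(u), n), Fraction(sum(v), n)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

# 3x3 covariance matrix and its determinant (cofactor expansion).
cols = [x, y, z]
s = [[cov(u, v) for v in cols] for u in cols]
det = (
    s[0][0] * (s[1][1] * s[2][2] - s[1][2] * s[2][1])
    - s[0][1] * (s[1][0] * s[2][2] - s[1][2] * s[2][0])
    + s[0][2] * (s[1][0] * s[2][1] - s[1][1] * s[2][0])
)

print("2*X + Y - Z for each case:", combo)
print("determinant of the covariance matrix:", det)
```

The combination is zero for every case, and the determinant of the covariance matrix is zero, which is the condition the RELIABILITY warning is reporting.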
................
Any better?

-Best regards,
Richard
