
Re: PCA: R-Matrix Determinant =0 and "not positive Definite"

Posted by mzalikhan on Jun 24, 2011; 4:48am
URL: http://spssx-discussion.165.s1.nabble.com/PCA-R-Matrix-Determinant-0-and-not-positive-Definite-tp4512844p4519934.html

The data are missing because this is observational data: either some variables in some cases, or some entire cases, were not recorded.
I am using SPSS 16, and it gives three options in Factor Analysis regarding missing values:
1- Exclude cases listwise (which reduces my cases from 1343 to 810)
2- Exclude cases pairwise (gives a "not positive definite" matrix)
3- Replace with mean (simply replaces the missing values with the variable mean, so valid N = total N)
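
To illustrate why the pairwise option can end in a "not positive definite" matrix while mean replacement does not, here is a minimal Python/NumPy sketch. It is not SPSS code and SPSS's internal computation may differ in detail; the function names and the tiny 9 x 3 data set are made up purely for illustration.

```python
import numpy as np

def corr_pairwise(X):
    """Correlation matrix where each entry uses only the rows observed for that pair."""
    p = X.shape[1]
    R = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            ok = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            R[i, j] = R[j, i] = np.corrcoef(X[ok, i], X[ok, j])[0, 1]
    return R

def corr_mean_substituted(X):
    """Correlation matrix after replacing each missing value by its column mean."""
    filled = np.where(np.isnan(X), np.nanmean(X, axis=0), X)
    return np.corrcoef(filled, rowvar=False)

def is_positive_definite(R, tol=1e-10):
    """True if the smallest eigenvalue of the symmetric matrix R is above tol."""
    return bool(np.linalg.eigvalsh(R).min() > tol)

nan = np.nan
# Hypothetical toy data: every pair of variables is observed on a different set of rows.
X = np.array([
    [1, 1, nan], [2, 2, nan], [3, 3, nan],   # x1 and x2 rise together
    [1, nan, 2], [2, nan, 4], [3, nan, 6],   # x1 and x3 rise together
    [nan, 1, 3], [nan, 2, 2], [nan, 3, 1],   # x2 and x3 move in opposite directions
], dtype=float)

R_pair = corr_pairwise(X)
R_mean = corr_mean_substituted(X)
print("pairwise, smallest eigenvalue:", np.linalg.eigvalsh(R_pair).min())
print("pairwise positive definite?  ", is_positive_definite(R_pair))
print("mean-substituted positive definite?", is_positive_definite(R_mean))
```

The pairwise entries are each valid correlations, but because they come from different subsets of days they need not be mutually consistent, which is how a negative eigenvalue can appear. Mean replacement always yields a proper correlation matrix, at the cost of shrinking variances and correlations.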

I am sorry for not having been elaborate enough about the data.

It is daily meteorological data, recorded at around 15 stations across South East Asia with Bangkok as the center. Meteorological parameters such as daily temperature, humidity, cloud cover, etc. (a total of 41 parameters for all 15 stations) are the variables, and the set of all variables for one day is a case. The data I am dealing with cover the summer season (March-June) of 2000-2010, so initially it is a 1343 (days) x 41 (parameters) matrix. The objective of my study is to find the meteorological patterns prevailing in the region. The methodology I am going to follow is:
1- Find the minimum number of PCs representing the maximum variance in the data (of course I can exclude some of the 41 variables to achieve this).
2- Once I get (for example) six PCs with the respective loadings of the different variables (by the way, on the basis of my literature review, I am expecting no more than 6 PCs), I have to calculate the scores of the 6 PCs for each day* (it should be a 1343 x 6 matrix).
3- Group the days in this matrix using a 2-stage clustering technique:
    i- Apply an average linkage clustering method** to this 1343 x 6 matrix to determine the initial number of clusters and the mean conditions (mean component scores) within each cluster.
    ii- Modify these initial clusters using the K-means clustering technique, with the initial number of clusters and their mean component scores as the initial seed values. This procedure classifies the 1343 days into a certain number of meteorologically homogeneous clusters.

This is the summary of my objectives and methodology. Since I don't have a statistics background, and in fact I am using SPSS for the very first time, I am having issues that I need to discuss with you guys.

I am rephrasing the questions I have right now as:

1- Once I get 6 PCs and the variable loadings (say 15 variables with loadings greater than 0.4 contribute to the 6 PCs with eigenvalues greater than 1), I get a 15 x 6 matrix. How can I calculate the scores of the 6 PCs for each day (*)?
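
For reference, the arithmetic behind component scores can be sketched as follows in Python/NumPy (not SPSS syntax). The sketch assumes a complete 1343 x 15 data matrix with no missing values and uses randomly generated numbers as a stand-in for the real measurements; the function name `pca_scores` is hypothetical. Each day's score on a PC is the standardized data row multiplied by that PC's eigenvector; dividing by the square root of the eigenvalue gives unit-variance scores.

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA on the correlation matrix of X (rows = days, columns = variables)."""
    # Standardize each variable (z-scores, sample standard deviation).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)            # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)           # variables x components
    raw_scores = Z @ eigvecs                        # variance of column k = eigenvalue k
    std_scores = raw_scores / np.sqrt(eigvals)      # unit-variance scores
    return eigvals, loadings, raw_scores, std_scores

# Synthetic stand-in data: 1343 days x 15 variables driven by 6 hidden patterns.
rng = np.random.default_rng(0)
X = rng.normal(size=(1343, 6)) @ rng.normal(size=(6, 15)) + 0.5 * rng.normal(size=(1343, 15))

eigvals, loadings, raw_scores, std_scores = pca_scores(X, n_components=6)
print(loadings.shape)     # 15 x 6 loading matrix
print(std_scores.shape)   # 1343 x 6 score matrix, one row per day
```

Equivalently, the unit-variance scores can be obtained from the 15 x 6 loading matrix alone as `Z @ loadings / eigvals`. In SPSS itself, the Scores button of the Factor Analysis dialog ("Save as variables") writes scores to the data file without any manual calculation.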

2- How do I apply an average linkage clustering method to this 1343 x 6 matrix (**)?
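
A rough sketch of the two-stage idea, written in Python with SciPy rather than SPSS, might look like the following. The 1343 x 6 `scores` matrix here is synthetic (three artificial blobs), and the number of clusters `k = 3` is simply assumed; with real data it would be read off the dendrogram or agglomeration schedule.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2

# Synthetic stand-in for the 1343 x 6 matrix of PC scores (three well-separated groups).
rng = np.random.default_rng(0)
centers = np.array([
    [0, 0, 0, 0, 0, 0],
    [10, 10, 0, 0, 0, 0],
    [-10, 0, 10, 0, 0, 0],
], dtype=float)
sizes = [500, 443, 400]
scores = np.vstack([c + rng.normal(size=(n, 6)) for c, n in zip(centers, sizes)])

# Stage 1: average linkage hierarchical clustering on Euclidean distances.
tree = linkage(scores, method="average", metric="euclidean")
k = 3  # assumed here; normally chosen by inspecting the dendrogram
labels_hier = fcluster(tree, t=k, criterion="maxclust")  # labels start at 1

# Mean component scores of each initial cluster become the K-means seeds.
seeds = np.vstack([scores[labels_hier == c].mean(axis=0)
                   for c in np.unique(labels_hier)])

# Stage 2: K-means started from those seeds refines the day-to-cluster assignment.
centroids, labels_km = kmeans2(scores, seeds, minit="matrix")

print("days per cluster:", np.bincount(labels_km))
```

In SPSS the corresponding steps would be Hierarchical Cluster (method: between-groups linkage) followed by K-Means Cluster with the initial centers read from a file, but the exact dialog options should be checked against the SPSS 16 documentation.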

Peace.