
Re: PCA: R-Matrix Determinant =0 and "not positive Definite"

Posted by David Marso on Jun 24, 2011; 6:05pm
URL: http://spssx-discussion.165.s1.nabble.com/PCA-R-Matrix-Determinant-0-and-not-positive-Definite-tp4512844p4521832.html

To clarify other things...
Adjacent observations are correlated (autocorrelation).
Observations across years are correlated (seasonality).
Also the various parameters are correlated (the object of study).  HOWEVER, they may not be simply correlated at the time point of interest but may be offset in time by a lag or a lead (cross-correlation).
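To make the lag/lead point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the series are simulated, y is built to follow x three steps later, and cross_corr is a hypothetical helper, not an SPSS or library function. It only shows that two series can look uncorrelated at lag 0 while being strongly related at some offset.

```python
import random

def cross_corr(x, y, lag):
    """Sample correlation between x[t] and y[t + lag] (lag may be negative)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sx = sum((v - mx) ** 2 for v in x) ** 0.5
    sy = sum((v - my) ** 2 for v in y) ** 0.5
    # Keep only the time points where both x[t] and y[t + lag] exist
    pairs = [(x[t], y[t + lag]) for t in range(n) if 0 <= t + lag < n]
    return sum((a - mx) * (b - my) for a, b in pairs) / (sx * sy)

random.seed(0)
n = 500
x = [random.gauss(0, 1) for _ in range(n)]

# Hypothetical y: responds to x three time steps later, plus noise
y = []
for t in range(n):
    signal = 0.8 * x[t - 3] if t >= 3 else 0.0
    y.append(signal + random.gauss(0, 0.5))

# Cross-correlation function over lags -5..5
ccf = {lag: cross_corr(x, y, lag) for lag in range(-5, 6)}
best_lag = max(ccf, key=lambda k: abs(ccf[k]))

print("lag with largest |correlation|:", best_lag)
print("correlation at lag 0:", round(ccf[0], 3))
print("correlation at best lag:", round(ccf[best_lag], 3))
```

Note that x here is white noise by construction. With autocorrelated real data the raw cross-correlations are misleading, which is why the series need to be modelled first.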
If you google Multivariate Time Series Analysis you will find TONS of info and numerous PDF files (for example: faculty.washington.edu/ezivot/econ584/notes/multivariatetimeseries.pdf ).
Browsing through this I am scratching my head muttering WTF??? (and BTW I have over 20 years of experience studying statistics and would find this a b**ch to implement in SPSS).
For a seriously 'dumbed down' discussion of Time Series Analysis see here:
http://en.wikipedia.org/wiki/Time_series
(I really hate Wikipedia for anything serious about stats but it will at least make a few issues come to light ).
I am not familiar with the latest and greatest methods of dealing with this sort of data, but way back when the devil was a little boy I took a sequence of graduate stat courses dealing with ARIMA models (prerequisites were 3 other stats courses, one focusing on Linear Models (ANOVA, Multiple Regression etc.)).  Towards the end of the semester we attempted to fit multivariate models with say 3-4 variables.
Say X, Y, Z:
1. Fit an ARIMA model to each of X, Y, Z.
2. Verify that the residuals from the models (Ex, Ey, Ez) are white noise.
3. Estimate the cross-correlation function of Ex, Ey, Ez.
4. Identify transfer functions or reduce dimensionality (PCA etc.)?
etc....
I.e., once you have white-noise residuals, have identified the proper lags and have transformed appropriately, then it MIGHT be appropriate to run a PCA and reduce the dimensionality of the process/vector space.
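A rough illustration of the first two steps listed above (fit a model to a series, then check that the residuals look like white noise), again in plain Python. The assumptions are loud ones: the data are simulated, a least-squares AR(1) fit stands in for proper ARIMA identification, and fit_ar1 and autocorr are hypothetical helpers. A real analysis would use a dedicated time series procedure and a formal test across many lags.

```python
import random

def autocorr(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((v - m) ** 2 for v in series)
    num = sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag))
    return num / denom

def fit_ar1(series):
    """Least-squares fit of z[t] = phi * z[t-1] + e[t] on the mean-centred series.

    Returns (phi, residuals). This is a stand-in for a full ARIMA fit.
    """
    m = sum(series) / len(series)
    z = [v - m for v in series]
    phi = sum(z[t] * z[t - 1] for t in range(1, len(z))) / sum(v * v for v in z[:-1])
    resid = [z[t] - phi * z[t - 1] for t in range(1, len(z))]
    return phi, resid

# Simulated AR(1) series with a true coefficient of 0.7 (an assumption)
random.seed(1)
n = 1000
x = [0.0]
for _ in range(n - 1):
    x.append(0.7 * x[-1] + random.gauss(0, 1))

phi, resid = fit_ar1(x)

raw_r1 = autocorr(x, 1)          # strong autocorrelation in the raw series
resid_r1 = autocorr(resid, 1)    # should be close to zero after the fit
bound = 2 / len(resid) ** 0.5    # approximate 95% band for white noise

print("estimated phi:", round(phi, 3))
print("lag-1 autocorrelation, raw series:", round(raw_r1, 3))
print("lag-1 autocorrelation, residuals:", round(resid_r1, 3))
print("approximate white-noise band: +/-", round(bound, 3))
```

Only after every series has been reduced to residuals like these would it make sense to estimate cross-correlations between them, and only then to think about PCA.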
---
Your modelling problem is complicated further by having multiple locations.
Spatial Correlation...
OTOH, it sounds like your measures are averaged across the locations.
If so then how is it then that you have missing data?  Are the parameters missing for ALL locations on a particular day?
Not sure that averaging these is a proper procedure either.
-----
In other words, this whole thing is much more complicated than dropping a bunch of variables into a dialog listbox, checking a SAVE box and clicking OK !
I'm not sure what others in your field have done previously, but if they did what you are proposing then it's really a GIGO fiasco!
Now please don't get me wrong here, I don't consider myself an **EXPERT** in statistics (there are people on this list that can run circles around my bony a$$).  There have been a lot of advances since I was in school and I don't have ready access to any scholarly journals (such as Psychometrika, or the British Journal of Statistics or the Journal of Time Series and Forecasting, Structural Equations Modelling .......).
OTOH:  I know enough to see a train wreck about to occur and thought it prudent to intervene!
HTH, David

Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"