|
Hi,
I have a query concerning the various uses of partial least squares (PLS). In fact I have quite a few queries! I use spectroscopy to analyse the chemical compounds generated when cooking oils are heated. I have a large data set comprising samples from 6 different oils taken at 3 timepoints, both for untreated controls and for oils treated with 4 different materials before heating (i.e. 6 oils, 5 treatments and 3 timepoints). To my relatively untrained eyes there appears to be a range of analyses that could be applied (and please tell me if I am way off the mark here):
(a) correlation of category membership (X matrix) with either the spectra themselves or spectra-derived compound concentrations (Y matrix), i.e. this would be an inverse regression. I assume that this sort of category description is not appropriate for PLS-DA, where membership constitutes the Y matrix instead. What contrast coefficients should I use for matrices combining time and material, or time, material and oil? Do I in fact exclude one column from each set of category membership dummy variables, as in more typical regression analyses? Is the use of Q2 and R2 values and 70/30 training/validation sets appropriate in this sort of analysis?
(b) spectra (X matrix) versus % fatty acid distribution or generated compound concentrations (Y matrix; both are easily obtained from the spectra for the calibration set), whereby I could compare PLS1 results obtained for each response variable individually with those derived simultaneously (PLS2) for the whole lot;
(c) a PLS-DA analysis of spectral data (X matrix) versus class membership (Y matrix) based on, e.g., acceptable limits for safely ingestible levels of a generated compound, i.e. toxic (“1”) or non-toxic (“0”). Again, do I exclude one of these category membership columns?
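To make the dummy-coding part of (a) and (c) concrete, here is a minimal sketch in plain Python of the two codings I am weighing up for the 6 x 5 x 3 design. The level names are placeholders rather than my real labels, the helper functions are purely illustrative, and it assumes one sample per oil/treatment/timepoint combination. I am not claiming either coding is the right one for PLS.

```python
from itertools import product

# Placeholder level names (assumptions, not the real labels).
OILS = [f"oil{i}" for i in range(1, 7)]        # 6 oils
TREATMENTS = ["control", "A", "B", "C", "D"]   # control + 4 materials
TIMEPOINTS = ["t1", "t2", "t3"]                # 3 timepoints


def dummy_code(value, levels, drop_first=False):
    """Return 0/1 indicator columns for one factor.

    With drop_first=True the first level becomes the reference
    category and gets no column of its own.
    """
    kept = levels[1:] if drop_first else levels
    return [1 if value == level else 0 for level in kept]


def design_matrix(drop_first=False):
    """Build one row per oil/treatment/timepoint combination."""
    rows = []
    for oil, treatment, timepoint in product(OILS, TREATMENTS, TIMEPOINTS):
        rows.append(
            dummy_code(oil, OILS, drop_first)
            + dummy_code(treatment, TREATMENTS, drop_first)
            + dummy_code(timepoint, TIMEPOINTS, drop_first)
        )
    return rows


full = design_matrix(drop_first=False)     # every level keeps its column
reduced = design_matrix(drop_first=True)   # one column dropped per factor

print(len(full), len(full[0]))        # 90 rows, 6 + 5 + 3 = 14 columns
print(len(reduced), len(reduced[0]))  # 90 rows, 5 + 4 + 2 = 11 columns
```

In the full coding every row sums to 3 (one indicator per factor), so the columns within each factor are linearly dependent; in the reduced coding the reference combination (first oil, control, first timepoint) is a row of zeros. My question is which of these two matrices, if either, I should be using.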
Many thanks if you can help!
|