Categorical variable codings

Categorical variable codings

Chrissy Wissy
Hi,
I am trying various regression analyses on data from a study of 6 different cooking oils, each with a control and 4 different material treatments, with all samples taken at 30, 60 and 90 min of heating. What would be a suitable set of categorical variable codes for comparing, e.g., material/time or oil/material/time, i.e. 5 x 3 or 6 x 5 x 3, at all levels and interactions, assuming in the latter case that "oil" can be included as an independent variable?

Thanks

Re: Categorical variable codings

Maguin, Eugene
Chrissy Wissy,

I'm not quite sure I understand what you said. Are you saying you have six
oils, that one sample of each oil was tested under each of five treatments
(counting control), and that each combination of oil and treatment was
sampled at three times? So you have 6*5=30 records with repeated measures on
time?

Or, are you saying you have 15 samples of each oil with each sample being
tested under one treatment and sampled at one time? So, 6*5*3=90 records?

Or, do you have replications of each oil sample under each combination of
treatment and sampling time? So the number of records for a 5x replication
would be either 150 or 450, depending on how time is treated?

Please explain further.

Gene Maguin
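
The record counts in the three readings above can be checked with a short sketch. The level names (oil1, material1, and so on) are placeholders chosen for illustration, not labels from the study, and the 5x replication is the hypothetical figure used in the question.

```python
from itertools import product

# Placeholder level names; only the number of levels comes from the thread.
oils = [f"oil{i}" for i in range(1, 7)]                        # 6 oils
treatments = ["control"] + [f"material{i}" for i in range(1, 5)]  # 5 treatments
times = [30, 60, 90]                                           # minutes of heating

# Reading 1: one record per oil x treatment, time held as repeated measures.
wide_records = list(product(oils, treatments))

# Reading 2: one record per oil x treatment x time.
long_records = list(product(oils, treatments, times))

# Reading 3: hypothetical 5x replication of either layout.
replicates = 5

print(len(wide_records), len(long_records))
print(len(wide_records) * replicates, len(long_records) * replicates)
```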

Re: Categorical variable codings

Chrissy Wissy
I guess your first point is the correct description (I hadn't thought of the repeated measurements angle!). In other words, there are 15 samples in total per oil. I guess part of my confusion comes from not knowing whether I can mix coding schemes within the design, i.e. simple, polynomial, etc. (I am now wondering whether polynomial coding is suitable for the time parameter when this is a repeated measure.)

Thanks!
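
On the question of mixing schemes: in a regression each factor can carry its own coding, and the choice changes how the coefficients are read, not the overall fit. A minimal sketch in plain Python, assuming simple coding (each level against a reference) for the 5 treatments and orthogonal polynomial coding for the 3 time points. The helper names simple_coding and polynomial_coding are made up for this illustration, and the sketch ignores the repeated-measures structure of time entirely.

```python
import math

def simple_coding(k, ref=0):
    """k x (k-1) codes: each column compares one level with level `ref`."""
    rows = [[-1.0 / k] * (k - 1) for _ in range(k)]
    others = [j for j in range(k) if j != ref]
    for col, level in enumerate(others):
        rows[level][col] += 1.0
    return rows

def polynomial_coding(levels):
    """Orthonormal polynomial codes (linear, quadratic, ...) for numeric levels."""
    k = len(levels)
    mean = sum(levels) / k
    centred = [x - mean for x in levels]
    basis = []  # orthonormal vectors, built by Gram-Schmidt from 1, x, x^2, ...
    for degree in range(k):
        v = [c ** degree for c in centred]
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    # Drop the constant term and return one row of codes per level.
    return [list(row) for row in zip(*basis[1:])]

treatment_codes = simple_coding(5, ref=0)       # assumes control is level 0
time_codes = polynomial_coding([30, 60, 90])    # linear and quadratic trend

# One design row per treatment x time cell: 4 treatment columns + 2 time columns.
design = [treatment_codes[t] + time_codes[m] for t in range(5) for m in range(3)]
```

For three equally spaced time points the two time columns are proportional to (-1, 0, 1) and (1, -2, 1), the usual linear and quadratic trend contrasts.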


Re: Categorical variable codings

Maguin, Eugene
Chrissy Wissy,

Ok. You have 90 records in your dataset: one record for each combination of
oil type, treatment and time point (6*5*3=90). I'll assume that you are
familiar with regression and anova. Your initial message talked about main
effects and interactions. You can't fit a model with a triple interaction
because there is only one case for each unique combination of oil, treatment
and time. If you ignore time, then you can fit oil and treatment main effects
and an oil by treatment interaction because you will have three cases for
each unique combination of oil and treatment. You can work out the numbers if
you ignore oil or treatment instead.

Something went profoundly wrong in the design stage of this project if
somebody wanted triple interactions (oil by treatment by time).

I would forget regression and use anova (GLM procedure) because your design
is more easily analyzed as anova.

Gene Maguin
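
The point about the triple interaction can be seen by counting degrees of freedom. The sketch below assumes the 6 x 5 x 3 layout with one record per cell, as described above, and a fully crossed fixed-effects model; the helper term_df is a name made up for this illustration.

```python
from itertools import combinations

levels = {"oil": 6, "treatment": 5, "time": 3}
n_records = 6 * 5 * 3  # one record per cell

def term_df(factors):
    """Degrees of freedom for a main effect or interaction term."""
    df = 1
    for f in factors:
        df *= levels[f] - 1
    return df

# All main effects, two-way interactions and the three-way interaction.
terms = [c for r in (1, 2, 3) for c in combinations(levels, r)]
df_by_term = {" x ".join(t): term_df(t) for t in terms}

# Residual df = records - intercept - model df.
residual_full = n_records - 1 - sum(df_by_term.values())
residual_no_three_way = residual_full + df_by_term["oil x treatment x time"]

print(df_by_term)
print(residual_full, residual_no_three_way)
```

With the three-way term included the model uses every available degree of freedom, so nothing is left to estimate error. Dropping it frees 40.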

Re: Categorical variable codings

Chrissy Wissy
Thanks for the reply. There are of course no replicates as such, i.e. each oil is
heated over time, with aliquots taken out at 30, 60 and 90 min. The oil
experiments represent control, material 1 ... material 4, i.e. 5
experiments. I am measuring the concentrations of various components at
these time points using NMR spectroscopy. Strangely enough, I have just tried
a 3-way ANOVA for each oil with 5 (treatment) X 10 (chemical component) X 3
(time), with time as a block (it can safely be assumed that component level
will increase with time, so I am not interested in that particular
interaction). I got significance for material and component but not for
material X component. However, when all oil results were combined as a MANOVA,
material X component became significant, which suggests that the choice of oil
affects how each material behaves in terms of suppressing these components.
