Hello everybody,
I have eye-tracking data for a total sample (n = 76) at testing point 1 (T1) and for a partial sample (n = 20) at testing point 2 (T2).

I want to compare the eye-tracking performance of the partial sample at T2 with that of the total sample at T1. That is, the total sample should serve as a norm sample, and I want to test whether the T2 partial sample differs from this norm.

Which statistical test do I have to use (bearing in mind the longitudinal design), and how do I do that in SPSS?

If possible: do you have any literature references for an approach to this "problem"?

Best regards,
Sebastian MB
This is fraught with problems. The 20 at time 2 are the same people as 20 of the observations at time 1, but different from the 56 people at time 1 who don't have time 2 data. So if you just compare the means, you are mixing independent and dependent observations. I figure you have 3 options:
1) compare the 20 at time 2 with the same 20 at time 1 using a repeated measures approach or a paired t-test (assuming the distributions are relatively symmetric and unimodal);
2) compare the 20 at time 2 with the 56 at time 1 who didn't have time 2 data using an independent samples test;
3) use a mixed models approach to repeated measures, which does not use listwise deletion. At least then, the variance estimates will be less biased than with listwise deletion.
Dr. Paul R. Swank, Professor
Health Promotion and Behavioral Sciences
School of Public Health
University of Texas Health Science Center Houston

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of SMartin
Sent: Tuesday, February 19, 2013 4:42 AM
To: [hidden email]
Subject: Longitudinal comparison partial vs. whole sample

--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Longitudinal-comparison-partial-vs-whole-sample-tp5718123.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
In reply to this post by Sebastian Mayer
If we assume that you are going to present this to a somewhat critical audience, then you have to state some hypotheses to go along with a rational narrative.

Do the 20 differ between T1 and T2? That is a paired t-test. It is the very sensible first test to perform if you expect changes. For something with (I expect) large individual variation like eye tracking, that test will have more power than a grouped test.

Do the 20 differ from the 56 at T1? That is a simple grouped t-test. It is a very sensible and necessary test to perform if you are going to say anything about both times. If the 20 and the 56 happen to differ at T1 (or differ for some definite reason), that limits the narrative.

After describing those differences (or, preferably, no differences), I suppose that you could carefully explain that you want to treat the 76 at T1 as a normative sample, and (therefore) here is what you would see as that test.

--
Rich Ulrich
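For readers who want the two comparisons above in concrete terms, here is a minimal sketch in Python rather than SPSS syntax. It assumes each subject has already been reduced to one summary score per time point; the function names and the toy values are hypothetical, and only the t statistic and degrees of freedom are computed (no p-value).

```python
import math
import statistics


def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched observations."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


def grouped_t(group_a, group_b):
    """Pooled-variance (Student) t statistic and df for two independent groups."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * statistics.variance(group_a)
        + (nb - 1) * statistics.variance(group_b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se, na + nb - 2


# Toy data: one summary score per subject (hypothetical values).
t1_completers = [10.0, 12.0, 9.0, 11.0, 13.0]   # subjects seen twice, at T1
t2_completers = [11.0, 14.0, 10.0, 11.0, 15.0]  # the same subjects at T2
t1_only = [10.0, 11.0, 12.0, 9.0, 13.0, 11.0]   # subjects with no T2 data

# Question 1: did the completers change from T1 to T2?
t_paired, df_paired = paired_t(t1_completers, t2_completers)
# Question 2: did the completers already differ from the others at T1?
t_grouped, df_grouped = grouped_t(t1_completers, t1_only)
print(f"paired:  t = {t_paired:.3f}, df = {df_paired}")
print(f"grouped: t = {t_grouped:.3f}, df = {df_grouped}")
```

In SPSS itself these would presumably be run with the built-in paired and independent-samples t-test procedures; the sketch only shows what each test compares.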
The t-tests that have been suggested assume that the observations can be reduced to a single number per subject per time point. But for eye-tracking, I wonder if that is the case. The OP needs to say more about the nature of the data. For starters, how many data points per subject are there at T1 and T2?
--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

PLEASE NOTE THE FOLLOWING:
1. My Hotmail account is not monitored regularly. To send me an e-mail, please use the address shown above.
2. The SPSSX Discussion forum on Nabble is no longer linked to the SPSSX-L listserv administered by UGA (https://listserv.uga.edu/).
Dear Paul, Rich and Bruce,

thank you for your quick replies, and sorry I kept you waiting. I was quite busy the past two weeks.

@ Paul: I will go with the mixed models approach. I would be VERY thankful for a pointer to a paper that has used a similar approach OR advice on how to do this in SPSS.

@ Rich: What did you mean by "and (therefore) here is what you would see as that test"?

@ Bruce: I have a few thousand data points per subject.

Best regards,
Sebastian
There is no real trick. Mixed models do not do listwise deletion of missing data. So if you do a repeated measures analysis, the data are used for whatever they can be used for. For example, data available at only one time point can be used to estimate the variance but not the difference in means. I do not do my mixed models in SPSS, so someone else will have to help with the syntax.
Paul R. Swank, Ph.D., Professor
Health Promotion and Behavioral Sciences
School of Public Health
University of Texas Health Science Center Houston
________________________________________
From: SPSSX(r) Discussion [[hidden email]] On Behalf Of Sebastian MB [[hidden email]]
Sent: Monday, March 04, 2013 8:15 AM
To: [hidden email]
Subject: Re: Longitudinal comparison partial vs. whole sample
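The point about what incomplete cases can and cannot contribute may be easier to see with a small example. The Python sketch below is not a mixed model; under made-up data and hypothetical variable names, it only shows which observations are available for which estimate with and without listwise deletion.

```python
import statistics

# Hypothetical long-format records: (subject_id, time_point, score).
records = [
    ("s1", 1, 10.0), ("s1", 2, 12.0),
    ("s2", 1, 11.0), ("s2", 2, 11.5),
    ("s3", 1, 9.0),                     # no T2 measurement
    ("s4", 1, 13.0),                    # no T2 measurement
    ("s5", 1, 12.0), ("s5", 2, 14.0),
]

# Reshape to one {time_point: score} dict per subject.
by_subject = {}
for subject, time_point, score in records:
    by_subject.setdefault(subject, {})[time_point] = score

# Listwise deletion keeps only subjects measured at both time points.
complete = {s: obs for s, obs in by_subject.items() if 1 in obs and 2 in obs}

# The mean change can only be estimated from the complete pairs ...
changes = [obs[2] - obs[1] for obs in complete.values()]
mean_change = statistics.mean(changes)

# ... but every T1 observation can inform the T1 variance.
t1_var_listwise = statistics.variance([obs[1] for obs in complete.values()])
t1_var_all = statistics.variance(
    [obs[1] for obs in by_subject.values() if 1 in obs]
)

print(f"subjects kept by listwise deletion: {len(complete)} of {len(by_subject)}")
print(f"mean change (complete pairs only):  {mean_change}")
print(f"T1 variance, complete cases only:   {t1_var_listwise}")
print(f"T1 variance, all T1 observations:   {t1_var_all}")
```

A likelihood-based mixed model uses the incomplete cases in a more principled way than this, but the division of labour is the same: pairs inform the change, and all observations inform the variance.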
Here are some links from the UCLA website that may be helpful. You can also find several examples in the archive for this mailing list, many of them posted by Ryan.
http://www.ats.ucla.edu/stat/spss/library/spssmixed/mixed.htm
http://www.ats.ucla.edu/stat/spss/examples/alda/default.htm

HTH.
--
Bruce Weaver
In reply to this post by Sebastian Mayer
Sebastian,

Please confirm that the following statements are true:

1. You want to test whether the MEAN score on a continuous dependent variable changes significantly from time 1 to time 2.
2. A sample of n = 76 participants was measured thousands of times at time 1, and a **RANDOM** subset of those participants (n = 20) was measured another several thousand times at time 2.

Some questions...

Is it common practice to measure subjects thousands of times on a given occasion? If so, why? Why not just take one or two measurements? How long does it take to obtain thousands of measurements on a subject? Please elaborate. Speaking of which, can you please tell us more about the dependent variable: what it actually is, how it's measured, the values it can take on, the typical distribution of the [thousands of] data points per subject per time point, etc.?

All of what I stated/asked above is critically important [among other things] in determining how to proceed. If I am able to get a better sense of your research design/procedures, primary research question, dependent variable, sampling methodology, etc., I will do my best to offer up statistical code.

Ryan
In reply to this post by Sebastian Mayer
Sebastian,

I think it will be a lousy paper that uses a mixed model approach on eye-tracking data instead of using the paired t-test. Paul's suggestion that the non-paired data can be used for estimating (perhaps) the variance would be somewhat legitimate if the correlations between pairs were small, and *after* you have established that there is no reason to think that the people with missing information may be different.

My comment about "here is what you would see" ... I was saying that (1) after you do the tests as I have described, for the sake of doing "proper testing," (2) you might then resort (for the sake of convenient presentation) to testing the 76 vs. the 20 as if they were two independent groups, even though they are not. That should be tossed in as something extra, and not as the test that ought to be considered primary and most useful.

--
Rich Ulrich
First off, one can employ a linear mixed model (via REML estimation) on just the paired data (removing subjects who did not provide data at time 2), which will produce the same results as a paired t-test, assuming one forces a compound symmetric structure when parameterizing the linear mixed model. I'd also like to point out that within a linear mixed model, unlike a paired t-test, one can allow for heterogeneous variances across time points. It is not uncommon to see variances decrease substantially over time. Allowing for heterogeneous variances could significantly improve model fit and produce a more statistically powerful test of the question at hand.
As I mentioned in a previous email to the OP, it is still unclear to me how many times individuals were measured at each time point. If individuals were measured thousands of times at a given time point, then I would want to know why, and potentially build these repeated measures into the statistical model. Clearly, a paired t-test would not be a viable option under such circumstances, unless one were to aggregate the data per subject in some way (e.g., take the mean of the thousands of measurements on a given subject at each time point and treat the mean as the subject's score). I'm generally not in favor of such an approach, but could be convinced otherwise, depending on the situation. Admittedly, eye tracking is not my area of expertise, so my understanding of the study may be incorrect.
I agree with Rich that if the data are not missing at random the second time around, we need to try to find out WHY before employing a linear mixed model on all possible data. Frankly, if the data are not missing at random, then I would question any type of analysis on the retained data as well. This is why I posted my message yesterday asking the OP to indicate whether the data were missing at random, among other things.
Ryan
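The aggregation mentioned above (one mean per subject per time point) could look like the following Python sketch. It assumes, hypothetically, that every raw sample measures the same quantity, which may well not hold for eye-tracking data; the record layout and names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def subject_means(samples):
    """Average repeated measurements within each (subject, time point) cell."""
    cells = defaultdict(list)
    for subject, time_point, value in samples:
        cells[(subject, time_point)].append(value)
    return {key: mean(values) for key, values in cells.items()}


# Hypothetical raw samples: (subject_id, time_point, value).
raw = [
    ("s1", 1, 2.0), ("s1", 1, 4.0), ("s1", 2, 5.0), ("s1", 2, 7.0),
    ("s2", 1, 1.0), ("s2", 1, 3.0),  # s2 has no T2 samples
]
scores = subject_means(raw)
for (subject, time_point), score in sorted(scores.items()):
    print(subject, time_point, score)
```

The resulting one-score-per-cell table is the shape a paired t-test expects. Collapsing this way discards the within-occasion variability, which is one reason to consider modelling the repeated measurements directly instead.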
At the end of the day, the OP needs to provide more information about the thousands of data points per subject per time point. Given that it is eye-movement data, I doubt they're all measures of the same thing--so no simple aggregation such as computing a mean. I suspect that 2-dimensional eye position is recorded at some fairly high frequency (hence the thousands of data points), and then other measures are derived from the raw data (involving massive data reduction) -- e.g., direction and latency of an eye movement away from fixation in response to some stimulus. It also would not surprise me if the situation is more complicated than can be sorted out with the exchange of a few e-mails.
HTH.
--
Bruce Weaver
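To make the kind of data reduction described above concrete, here is a deliberately simplified Python sketch that derives a latency and a direction from raw gaze positions. Everything in it is an assumption for illustration: the sample layout, the function name, and the simple distance threshold. Real eye-movement scoring usually relies on established (often velocity-based) detection procedures, and nothing here reflects the OP's actual data.

```python
import math


def first_movement(samples, fixation, onset_ms, threshold):
    """Return (latency_ms, direction_deg) for the first post-onset sample whose
    distance from the fixation point exceeds `threshold`, or None if none does.

    samples: iterable of (time_ms, x, y) in chronological order.
    """
    fx, fy = fixation
    for time_ms, x, y in samples:
        if time_ms < onset_ms:
            continue  # ignore samples recorded before the stimulus appeared
        dx, dy = x - fx, y - fy
        if math.hypot(dx, dy) > threshold:
            direction = math.degrees(math.atan2(dy, dx))
            return time_ms - onset_ms, direction
    return None


# Hypothetical trace sampled every 4 ms; the stimulus appears at 8 ms.
trace = [(0, 0.0, 0.0), (4, 0.1, 0.0), (8, 0.0, 0.1), (12, 0.2, 0.1),
         (16, 3.0, 0.0), (20, 6.0, 0.0)]
result = first_movement(trace, fixation=(0.0, 0.0), onset_ms=8, threshold=1.0)
print(result)
```

Each trial then contributes two derived numbers instead of thousands of raw positions, which is why the per-subject sample count alone says little about what should enter the statistical model.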
"It also would not surprise me if the situation is more complicated than can be sorted out with the exchange of a few e-mails. "
Aha, that explains the existence of Professional Statistical Consultants and why people spend years and years studying for PhDs!!
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis." Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"
In reply to this post by Bruce Weaver
Going off on a tangent here... Let's guess for a moment that there is a standard, acceptable way to convert raw eye-tracking data into two variables (direction and latency) that represent "eye tracking ability." Further, let's assume that these two variables are bivariate normal and are measured at multiple time points. A bivariate repeated measures model could prove quite useful, and could be handled by a mixed model. Such a model intrigues me.

I agree with Bruce that deriving appropriate statistical models for complex designs is difficult and sometimes nearly impossible over email, although I must say that, with the right information provided by the OP, a brilliant statistician with knowledge of the research area could overcome these challenges. Over the years on SAS-L, I've seen some absolutely historic exchanges between (1) OPs who are willing and able to provide the necessary information in a concise way and (2) brilliant statisticians who are willing to think through the problem, and explain and provide complex solutions in a digestible form over email. Those who are members of SAS-L know the statisticians to whom I am referring.

Ryan
In reply to this post by Bruce Weaver
Maybe the OP could provide more information.
Or (what I still assume) maybe there is a good experimental history of using eye-tracking data, and a popular experimental paradigm that uses some standard measures, and there is no reason to tamper with success. Or, if you were to set out to re-examine the basis of using eye-tracking data, it might be proper to design a new study that would include elements of *all* the experimental designs that have been based on the same conventional measures.

I regarded this data problem the way I have regarded EEG data. That data was given to me as a number for each of several frequency bands, all based on thousands of points for every set of 5 numbers, repeated for multiple scalp reference points.
- There were distributional oddities here, and I referred to my PI and the literature before dividing through to get "relative power" and then taking the logits.
- As is usually the case, the simplest single summary was effectively an average, the average frequency.

IIRC, the same PI also gave me eye-tracking data ... which had decent distributions, and which I don't remember ever worrying twice about, and I've buried any recollection of what it consisted of. But it would be a major project, I suspect, to introduce a variation of the scoring.

To me, it seems a bit out of order for a statistician to intrude on the basic measures if no question has been raised about them, either directly by the PI or indirectly by weird data distributions.

--
Rich Ulrich
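The EEG transformation mentioned above (divide through to get relative power, then take logits) can be sketched in a few lines of Python. The band names and power values are hypothetical, and the sketch assumes every band has positive power so that each share lies strictly between 0 and 1.

```python
import math


def relative_power_logits(band_power):
    """Convert absolute band powers to relative power, then to logits."""
    total = sum(band_power.values())
    logits = {}
    for band, power in band_power.items():
        share = power / total  # relative power of this band
        logits[band] = math.log(share / (1.0 - share))
    return logits


# Hypothetical absolute power in five frequency bands at one scalp site.
power = {"delta": 40.0, "theta": 20.0, "alpha": 20.0, "beta": 15.0, "gamma": 5.0}
logits = relative_power_logits(power)
for band, value in logits.items():
    print(f"{band}: {value:.3f}")
```

The logit stretches shares near 0 and 1, which is the usual reason for applying it to proportions before using methods that assume roughly normal data.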