N = 50 persons (cases) give answers on a scale from 1 to 5 for 10 questions (items). The syntax tries to figure out which persons give 'fake' answers by simply answering randomly. The syntax autocorrelates the 10 items of each person (lag 1). Those persons whose answers autocorrelate at lag 1 are considered to be giving fake answers.
Apart from the question of whether this is the proper method, the question is: is there a way to make the syntax faster? Since the syntax runs sequentially, it takes very long. In addition, the macro takes very long to start in the first place. And it is not possible to run the syntax over a large number of cases: from 1,000 cases on, the syntax already needs too much time (and memory). Do you see any possibility to optimise the script?

*examp. file-------------.
input program.
loop person =1 to 100 by 1.
end case.
end loop.
end file.
end input program.
exe.
comp v1 =RV.BINOM(5,0.5).
comp v2 =RV.BINOM(5,0.5).
comp v3 =RV.BINOM(5,0.5).
comp v4 =RV.BINOM(5,0.5).
comp v5 =RV.BINOM(5,0.5).
comp v6 =RV.BINOM(5,0.5).
comp v7 =RV.BINOM(5,0.5).
comp v8 =RV.BINOM(5,0.5).
comp v9 =RV.BINOM(5,0.5).
comp v10 =RV.BINOM(5,0.5).
EXE .
SORT CASES BY person(A).
SAVE OUTFILE='C:\user\testfile.sav'.

*create exampl agg-file ---------------- .
input program.
loop var001 =1 to 1 by 1.
end case.
end loop.
end file.
end input program.
EXECUTE .
comp var002 =0 .
comp person =0 .
SAVE OUTFILE='C:\user\agg.sav'.

* ----------- Macro starts here ------------------------------------------.
DEFINE !makro_stoch (start =!tokens(1)
  /end = !tokens(1)
  /testfile = !tokens(1)
  /aggfile = !tokens(1)
  /oms_outfile = !tokens(1)
  /flipvar_1 = !token(1)
  /flipvar_2 = !token(1)).
!do !var = !start !to !end.
GET FILE=!testfile.
FILTER OFF.
USE !var thru !var /permanent.
EXECUTE.
FLIP VARIABLES=!flipvar_1 to !flipvar_2.
SHIFT VALUES VARIABLE=var001 RESULT=var001_shift LAG=1.
OMS /DESTINATION VIEWER=NO /TAG='suppressall'.
oms select tables
  /destination format = sav outfile=!oms_outfile
  /if commands = ['Correlations'] subtypes = ['Correlations'].
CORRELATIONS /VARIABLES=var001 var001_shift /PRINT=TWOTAIL NOSIG /MISSING=PAIRWISE.
OMSEND .
GET FILE=!oms_outfile.
FILTER OFF.
USE 1 thru 2 /permanent.
EXECUTE.
DELETE VARIABLES Command_ to var001.
EXECUTE .
FLIP VARIABLES=Lagvar0011.
comp person = !var.
EXECUTE .
DELETE VARIABLES CASE_LBL.
EXECUTE .
ADD FILES /FILE=* /FILE=!aggfile.
EXECUTE.
SAVE OUTFILE=!aggfile.
!doend.
SORT CASES BY person(A).
MATCH FILES /FILE=* /FILE=!testfile /BY person.
EXECUTE .
formats var001 var002 (f8.3).
rename variable var001 = pearson.
rename variable var002 = probability.
exe.
RESTORE.
!enddefine.

!makro_stoch start = 1 end = 100 flipvar_1=v1 flipvar_2 =v10
  testfile = 'C:\user\testfile.sav'
  aggfile = 'C:\user\agg.sav'
  oms_outfile ='C:\user\outfile.sav'.
Dr. Frank Gaeth
|
If you are doing what I think you are doing, I think there are faster ways to do this, but I want to make sure I understand what you are claiming. It sounds like you are claiming that if the within-person lag 1 autocorrelation is greater than some value (what value do you have in mind?), then the responses are fakes. Is that true? I think you are incorrect in your thinking, for this reason. Consider two scenarios:
1) Suppose the 10 items form a scale with the average inter-item correlation being .60.
2) Suppose that the 10 items have no relationship to each other.
Suppose you compute a lag 1 autocorrelation of .32 for somebody. What do you conclude?

That said, and before I offer something, tell me how your data are organized. Like this:

Id v1 v2 v3 ... v10
11  2  5  1 ...   4
12  3  2  3 ...   2
13  4  1  3 ...   5

Or like this:

Id
11 2
11 5
11 1
11 ...
11 4
12 3
12 2
12 3
12 ...
12 2

Gene Maguin

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of drfg2008
Sent: Sunday, February 13, 2011 2:17 PM
To: [hidden email]
Subject: Syntax Problem: speed
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Syntax-Problem-speed-tp3383649p3383649.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
=====================
To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command INFO REFCARD
|
Thank you for your response!
My data is structured exactly as in the "example file", please see *examp. file-------------. (top of the text) ... Since the items are definitely uncorrelated (randomly sorted), there should be no strong autocorrelation (lag 1).
regards.
Frank
Dr. Frank Gaeth
|
Frank,
I think this will do what you need, but let me know. I computed the lag 1 autocorrelation directly in syntax.

Gene Maguin

input program.
vector y(10,f1.0).
loop id=1 to 100.
loop #i=1 to 10.
+ compute y(#i)=RV.BINOM(5,0.5).
end loop.
end case.
end loop.
end file.
end input program.
execute.
frequencies y1 to y10.

* 'x' is y1 to y9. 'y' is y2 to y10.
vector y=y1 to y10.
compute xbar=mean(y1 to y9).
compute ybar=mean(y2 to y10).
compute sumxy=0.
compute sumx2=0.
compute sumy2=0.
loop #i=1 to 9.
+ compute sumxy=sumxy+y(#i)*y(#i+1).
+ compute sumx2=sumx2+y(#i)*y(#i).
+ compute sumy2=sumy2+y(#i+1)*y(#i+1).
end loop.
compute corr=(sumxy/9-xbar*ybar)/sqrt((sumx2/9-xbar**2)*(sumy2/9-ybar**2)).
compute xvar=sumx2/9-xbar**2.
compute yvar=sumy2/9-ybar**2.
execute.
format xbar ybar sumxy sumx2 sumy2(f3.0) corr xvar yvar(f10.6).
list id y1 to y10 corr xbar xvar ybar yvar/cases=5.

id y1 y2 y3 y4 y5 y6 y7 y8 y9 y10      corr     xbar     xvar     ybar     yvar
 1  4  0  3  0  2  0  3  2  4   2  -.508840 2.000000 2.444444 1.777778 1.950617
 2  0  2  3  3  1  4  2  3  2   3  -.366900 2.222222 1.283951 2.555556  .691358
 3  2  1  3  3  0  4  2  4  3   2  -.406250 2.444444 1.580247 2.444444 1.580247
 4  3  1  2  2  2  4  2  2  3   1  -.362933 2.333333  .666667 2.111111  .765432
 5  4  3  1  2  3  4  2  2  2   2   .189832 2.555556  .913580 2.333333  .666667

Number of cases read: 5   Number of cases listed: 5

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of drfg2008
Sent: Tuesday, February 15, 2011 2:05 AM
To: [hidden email]
Subject: Re: Syntax Problem: speed
-----
Free University Berlin
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/Syntax-Problem-speed-tp3383649p3385532.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
|
Frank
Here's how you can compute lag 1 autocorrelations for each individual.

*Set up Sample Data in wide format.
input program.
vector y(10,f1.0).
loop id=1 to 100.
loop #i=1 to 10.
+ compute y(#i)=RV.BINOM(5,0.5).
end loop.
end case.
end loop.
end file.
end input program.
execute.

*convert to long format and calculate.
VARSTOCASES
  /MAKE score FROM y1 y2 y3 y4 y5 y6 y7 y8 y9 y10
  /INDEX=Index1(10)
  /KEEP=id
  /NULL=KEEP.
COMPUTE laggedscore = lag(score).
IF id NE lag(id) laggedscore = $SYSMIS.
EXECUTE.
SPLIT FILE LAYERED BY id.
CORRELATIONS
  /VARIABLES=score laggedscore
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
|
In reply to this post by Maguin, Eugene
Thank you, Gene Maguin,
thank you, Garry Gelade!
@ Gene Maguin: this is really great, it works fine and is very quick. It took some time to understand (you compute the autocorrelation without using the function), but in the end that's exactly what I was looking for. You're also right with your critique of the method itself. However, the idea is to presume a fake not whenever there is any correlation at all (since in surveys you always have inter-item correlation), but to identify cases where you have either a very high or a very low correlation. In the first scenario someone would answer like 1, 1, 1, 1, 1, etc.; in the second scenario someone would answer randomly. Where to draw the line is the question. After having computed the autocorrelation over several different samples, I would first see how the r's are distributed. Thanks!
Dr. Frank Gaeth
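For readers who also needed a moment with the direct computation: written out, the loop in Gene's syntax appears to implement the ordinary Pearson formula on the nine pairs (y1,y2), (y2,y3), ..., (y9,y10). The notation below is only a restatement of that syntax, with x-bar the mean of y1 to y9 and y-bar the mean of y2 to y10; the symbol names are chosen here for illustration.

```latex
\[
\bar{x} = \frac{1}{9}\sum_{i=1}^{9} y_i , \qquad
\bar{y} = \frac{1}{9}\sum_{i=1}^{9} y_{i+1}
\]
\[
r_{\text{lag 1}} =
\frac{\frac{1}{9}\sum_{i=1}^{9} y_i \, y_{i+1} - \bar{x}\,\bar{y}}
     {\sqrt{\left(\frac{1}{9}\sum_{i=1}^{9} y_i^{2} - \bar{x}^{2}\right)
            \left(\frac{1}{9}\sum_{i=1}^{9} y_{i+1}^{2} - \bar{y}^{2}\right)}}
\]
% Check against case 1 of the listing (4 0 3 0 2 0 3 2 4 2):
% sum of products = 22, so 22/9 - 2.000000 * 1.777778 = -1.111111
% xvar = 58/9 - 2.000000^2 = 2.444444, yvar = 46/9 - 1.777778^2 = 1.950617
% r = -1.111111 / sqrt(2.444444 * 1.950617) = -0.508840
```

Worked through for the first listed case, this reproduces the corr of -.508840 shown in the output. Note that the two terms under the square root are the variances xvar and yvar, so a respondent with identical answers on all items has a zero denominator and no computable r.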
|
I have been following this discussion with interest, not least because of the looping Gene and Garry suggested (I have yet to master loops in my use of SPSS syntax).
However, given the objective, I am intrigued why you don't just compute a standard deviation for each respondent across the items of interest. That is what I generally do to identify cases with a very high correlation amongst their item responses.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of drfg2008
Sent: 16 February 2011 13:48
To: [hidden email]
Subject: Re: Syntax Problem: speed
|
I thought about that solution, but there seem to be a few little problems:
1. There is a systematic difference in answering items between groups. For example: men answer items differently from women (more variance) -> shall I exclude women because their variance is smaller than that of men (or the other way round)? The same problem goes for age, education, ...
2. Different distributions cause different variance: binomially distributed variables may generate different (less) variance than metric ones.
3. Where is the limit (the maximum or minimum variance, variance = 0)?
4. You cannot identify random answers by variance.
5. You cannot identify systematic answers, like 1,2,3,4,5,6, (as a fake).
Correlation identifies (as far as I am convinced) pure random as well as systematic changes. The autocorrelation should be neither very near 0 nor very near +-1. My only problem, until now, was to compute the autocorrelation for large data. That's now fixed, thanks to Gene Maguin.
regards
Frank
Dr. Frank Gaeth
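To make the comparison in this exchange concrete, here is a minimal sketch of the per-respondent standard deviation placed next to the lag 1 autocorrelation. It is not from the thread: it assumes the wide layout of Gene's example (y1 to y10, with corr already computed by his syntax), and item_sd and straightline are illustrative names only.

```spss
* Sketch only, assuming y1 to y10 and corr exist as in the direct computation.
* item_sd and straightline are illustrative names, not part of the thread.
compute item_sd=sd(y1 to y10).
* straightline is 1 when all ten answers are identical, otherwise 0.
compute straightline=(min(y1 to y10)=max(y1 to y10)).
execute.
formats item_sd(f8.3) straightline(f1.0).
* Put both indicators side by side for a first look.
list id y1 to y10 item_sd straightline corr/cases=5.
```

Listing both values per case lets one see directly where the two indicators agree and where a case with an unremarkable SD still has an unusual corr, which is the situation objections 4 and 5 are about.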
|
If the scale is on something like attitudes or values _and_ the convention of balancing with items from both ends of a bipolar construct has been followed, an SD of zero should flag a case as suspicious. Strongly agreeing (or strongly disagreeing) both with items such as "I like chocolate" and with items such as "I hate chocolate foods" is very suspicious.

On something like an achievement test, a zero SD is OK with a perfect score. However, a zero SD with the worst conceivable score should flag a case as suspicious.

Extreme scores can be a reason to consider a set of responses suspicious. High and low auto-correlations are one example. Another is finding distances between cases via the different methods in PROXIMITIES and then considering cases very far from most cases as suspicious, i.e., some cases are in sparsely populated regions of the multivariate space: the nearest neighbors are far away.

Nowadays there are some procedures (which I have not examined) in <data> <identify unusual cases>. These appear to work. In at least the few instances where I have tried this, the cases flagged as "unusual" do appear to be unusual. (One can consider "unusual" a synonym for "suspicious" in these contexts.) For example, in a 2D space of weight and age, a 300 pound 9 year old would be suspicious.

In my experience, 3D plots with colors for different groups can give some ideas of suspicious cases. So can extreme residuals in regressions etc.

Art Kendall
Social Research Consultants
Except in achievement tests

On 2/16/2011 12:17 PM, drfg2008 wrote:
|
Yes, I would agree so far that detecting suspicious response patterns usually calls for more than one "test". But yours is a more context-driven approach: you have to know a lot about the data and the logical context. This is often not the case when the programmer, who is often enough under time pressure and without context knowledge, has to decide.
As a matter of fact, the autocorrelation is only one out of many possibilities for 'fraud detection'. Logical incoherence is another possibility, and there are a few others. But it takes much more effort. (You could also post so-called 'honey pots', where only those candidates step in whom you want to keep out.)
By the way, an achievement test should never have a zero SD, because this indicates a ceiling effect or a floor effect (it is either too easy or too difficult). This makes the whole test suspicious ;-)
Still, I find the autocorrelation (esp. for opinion surveys) an interesting approach: quick, without any prior knowledge, simple to implement.
Frank
Dr. Frank Gaeth
|
But a programmer could just click <data> <identify unusual cases> without much effort. In fact it could take longer to select variables, etc. than to actually run the syntax.

To check this I just tried it with 14 variables and 91,00 cases in a system file. I clicked the menu, pasted the syntax, highlighted it, and clicked <run selection>. It took about 20 to 25 seconds to click through the GUI menus. Running the syntax took 11.8 seconds on a fast desktop. It took a lot longer to reply to this post. More considered use of the menus could even run to a couple of minutes.

Now that this procedure is here, I'll most likely use it most of the time. Over the years, I have very seldom received really clean data sets. YMMV.

One of the reasons that I have a soapbox about being able to go back to the beginning is that all through an analysis one can find things that look anomalous.

Art Kendall
Social Research Consultants

On 2/16/2011 2:43 PM, drfg2008 wrote:
|
Maybe I missed something.
In my old SPSS 17 on my computer there is no <data> <identify unusual cases>. Is it a feature of later versions? Or something to plug in? Or maybe I'm just a bit uninformed? If it is part of a later version of SPSS, this would be a very good reason to upgrade as fast as possible.
Frank
Dr. Frank Gaeth
|
Identify Unusual Cases generates syntax for the DETECTANOMALY command. It is part of the Data Preparation option, which also includes VALIDATEDATA.

From the help...

The Anomaly Detection procedure searches for unusual cases based on deviations from the norms of their cluster groups. The procedure is designed to quickly detect unusual cases for data-auditing purposes in the exploratory data analysis step, prior to any inferential data analysis. This algorithm is designed for generic anomaly detection; that is, the definition of an anomalous case is not specific to any particular application, such as detection of unusual payment patterns in the healthcare industry or detection of money laundering in the finance industry, in which the definition of an anomaly can be well-defined.

Methods. The DETECTANOMALY procedure clusters cases into peer groups based on the similarities of a set of input variables. An anomaly index is assigned to each case to reflect the unusualness of a case with respect to its peer group. All cases are sorted by the values of the anomaly index, and the top portion of the cases is identified as the set of anomalies. For each variable, an impact measure is assigned to each case that reflects the contribution of the variable to the deviation of the case from its peer group. For each case, the variables are sorted by the values of the variable impact measure, and the top portion of variables is identified as the set of reasons why the case is anomalous.

Jon Peck
Senior Software Engineer, IBM
[hidden email]
312-651-3435

From: drfg2008 <[hidden email]>
To: [hidden email]
Date: 02/16/2011 01:42 PM
Subject: Re: [SPSSX-L] Syntax Problem: speed
Sent by: "SPSSX(r) Discussion" <[hidden email]>
|
Oh, I see. No, we don't have the licence for it.
>Error no. 7079
>There is no licence for SPSS Data Preparation.
>This command will not be executed.
>Specific symptom number: 18
Thanks!
Dr. Frank Gaeth
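For readers who do have the Data Preparation option, a call for the example file might look roughly like the sketch below. It is not taken from the thread, and the subcommands should be checked against the Command Syntax Reference for the installed version. Treating the ten 1-5 items as SCALE rather than CATEGORICAL is an assumption, and AnomalyIndex is just an illustrative name for the saved variable.

```spss
* Sketch only: needs a licence for the Data Preparation option.
* Assumes the layout of the example file with person and v1 to v10.
* AnomalyIndex is an illustrative name for the saved anomaly index.
DETECTANOMALY
  /VARIABLES SCALE=v1 TO v10 ID=person
  /PRINT ANOMALYLIST
  /SAVE ANOMALY(AnomalyIndex).
```

The menu route Art describes (<data> <identify unusual cases>, then Paste) generates the full command with all defaults spelled out, which is the safer starting point.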
|
In reply to this post by drfg2008
I am posting this private message here because it has some interesting arguments:

Well, that conclusion is mainly wrong. The auto-correlation should be near +1 *only* when the person marks each answer close to the immediately prior one. It will *not* detect 1,1,1,1,1,... because if there is no variance, there is (technically) no computable correlation. And "very near zero" is what you ought to expect for legal answers unless the items are arranged in some meaningful order.
-- Rich Ulrich

********************
Reply:
(First: you're right that 1,1,1,... does not produce any correlation. If you can't compute a correlation due to a lack of variance, this should be a serious hint.)
But: the interesting point is that, after I went through a few older studies, in these surveys there were ALWAYS meaningful arrangements of items. So, if you expect a near-zero correlation for legal answers, this assumption (also stated in my first text) is theoretical but not empirical. You would exclude almost everyone. The point seems to be that fake answers are, so to speak, only 'too far away' from the rest of the group.
Dr. Frank Gaeth
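The two points in this exchange (no computable correlation when there is no variance, and fakes being 'too far away' from the group) could be turned into flags along the following lines. This is a sketch under assumptions, not a method from the thread: it presumes corr was computed with Gene's direct syntax, the cut-off of 2 standard deviations is an arbitrary placeholder, and no_corr and far_away are illustrative names.

```spss
* Sketch only, assuming corr exists from the direct computation.
* no_corr and far_away are illustrative names. The cut-off 2 is arbitrary.
* Cases where corr could not be computed should show up as missing.
compute no_corr=missing(corr).
* Look at how the correlations are distributed before choosing any cut-off.
frequencies variables=corr /format=notable /histogram.
* DESCRIPTIVES with SAVE adds the standardised variable Zcorr.
descriptives variables=corr /save.
* far_away stays missing for cases without a computable corr.
compute far_away=(abs(Zcorr)>2).
execute.
frequencies variables=no_corr far_away.
```

Standardising against the observed distribution, rather than against zero, follows the empirical argument above: if the items are arranged meaningfully, typical respondents will not centre on r = 0, so the cut-off has to be relative to the group.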
|