Hi all
I found a strange behaviour in SPSS 15.0.1. When I use the DELETE VARIABLES command, the command that follows it (see the example below) misbehaves: the non-missing value in ABB is then not recognized, and a value of 1 is written into the variable 'StrangeBehaviour' even though ABB is not missing. This happens with and without a subsequent EXECUTE command after the delete statement. The same code runs perfectly well when the 'DELETE VAR Flag' line is omitted. Can somebody reproduce this?

Christian

* Code to reproduce the bug.
new file.
input program.
loop #i = 1 to 200.
compute test = rv.normal(100,15).
end case.
end loop.
end file.
end input program.
execute.
If Test > 70 pat = 1.
IF Test < 80 flag = 1.
IF Flag = 1 ABB = 1.
Exec.
DELETE VAR Flag.
Exec.
* Here comes the bug.
IF Pat = 1 AND (Missing(ABB)) StrangeBehaviour = 1.
Exec.

*******************************
la volta statistics
Christian Schmidhauser, Dr.phil.II
Weinbergstrasse 108
CH-8006 Zürich
Tel: +41 (043) 233 98 01
Fax: +41 (043) 233 98 02
email: mailto:[hidden email]
internet: http://www.lavolta.ch/
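PS: A workaround that should sidestep the problem -- only a sketch, based on nothing more than the observation above that the code behaves when the deletion is left out -- is to postpone DELETE VARIABLES until after every transformation that still depends on the helper variable has been executed, and drop Flag as the very last step:

* Sketch of a workaround (untested): delete Flag only at the end.
If Test > 70 pat = 1.
IF Test < 80 flag = 1.
IF Flag = 1 ABB = 1.
IF Pat = 1 AND (Missing(ABB)) StrangeBehaviour = 1.
Exec.
DELETE VAR Flag.
Exec.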
Hi Christian,
I confirm that I reproduce the described behavior.

Raynald Levesque
[hidden email]
Website: www.spsstools.net
In reply to this post by la volta statistics
At 07:58 AM 4/13/2007, la volta statistics wrote:
>I found a strange behaviour in SPSS 15.0.1. When I use the DELETE
>VARIABLES command, the command that follows it (see the example below)
>misbehaves.

At 07:46 PM 4/13/2007, Raynald Levesque wrote:

>I confirm that I reproduce the described behavior.

So do I. As a twist, the two statements
. IF Flag = 1 ABB = 1.
. COMPUTE ABB2 = Flag.
should (and do) give exactly the same values for variables ABB and ABB2. However, the strange behavior is exhibited only when using variable ABB; using ABB2 gives the proper result. (Not tested: using *only* the code for ABB2.)

SPSS 15.0.1 draft output (WRR-not saved separately). Code as originally posted, except
. 'Exec' statements replaced by 'LIST'.
. Test data generation modified, to work with fewer cases.
. Added code creating and using ABB2, as described above.

>A value of 1 is written into the variable 'StrangeBehaviour' even
>though ABB is not missing.

It is also so written despite 'pat' not having value 1.

Final output listing:
....................
|-----------------------------|---------------------------|
|Output Created               |13-APR-2007 20:37:38       |
|-----------------------------|---------------------------|

 test  pat  ABB  ABB2  StrangeBehaviour  Strange2
99.24  1.00  .  .  1.00  1.00
81.29  1.00  .  .  1.00  1.00
93.77  1.00  .  .  1.00  1.00
90.37  1.00  .  .  1.00  1.00
13.84  .  1.00  1.00  1.00  .
92.29  1.00  .  .  1.00  1.00
21.62  .  1.00  1.00  1.00  .
36.19  .  1.00  1.00  1.00  .
58.75  .  1.00  1.00  1.00  .
29.08  .  1.00  1.00  1.00  .
89.07  1.00  .  .  1.00  1.00
 5.83  .  1.00  1.00  1.00  .
50.49  .  1.00  1.00  1.00  .
80.59  1.00  .  .  1.00  1.00
 9.21  .  1.00  1.00  1.00  .
 4.36  .  1.00  1.00  1.00  .
91.93  1.00  .  .  1.00  1.00
67.93  1.00  1.00  1.00  1.00  .
76.72  1.00  .  .  1.00  1.00
61.97  1.00  1.00  1.00  1.00  .

Number of cases read:  20    Number of cases listed:  20
....................

Full code and output:
....................
* Code to reproduce the bug.
new file.
input program.
loop #i = 1 to 20.
compute test = rv.uniform(0,100).
end case.
end loop.
end file.
end input program.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |13-APR-2007 20:37:37       |
|-----------------------------|---------------------------|

 test
99.24
81.29
93.77
90.37
13.84
92.29
21.62
36.19
58.75
29.08
89.07
 5.83
50.49
80.59
 9.21
 4.36
91.93
67.93
76.72
61.97

Number of cases read:  20    Number of cases listed:  20

If Test > 60 pat = 1.
IF Test < 70 flag = 1.
IF Flag = 1 ABB = 1.
COMPUTE ABB2 = Flag.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |13-APR-2007 20:37:38       |
|-----------------------------|---------------------------|

 test  pat  flag  ABB  ABB2
99.24  1.00  .  .  .
81.29  1.00  .  .  .
93.77  1.00  .  .  .
90.37  1.00  .  .  .
13.84  .  1.00  1.00  1.00
92.29  1.00  .  .  .
21.62  .  1.00  1.00  1.00
36.19  .  1.00  1.00  1.00
58.75  .  1.00  1.00  1.00
29.08  .  1.00  1.00  1.00
89.07  1.00  .  .  .
 5.83  .  1.00  1.00  1.00
50.49  .  1.00  1.00  1.00
80.59  1.00  .  .  .
 9.21  .  1.00  1.00  1.00
 4.36  .  1.00  1.00  1.00
91.93  1.00  .  .  .
67.93  1.00  1.00  1.00  1.00
76.72  1.00  .  .  .
61.97  1.00  1.00  1.00  1.00

Number of cases read:  20    Number of cases listed:  20

DELETE VAR Flag.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |13-APR-2007 20:37:38       |
|-----------------------------|---------------------------|

 test  pat  ABB  ABB2
99.24  1.00  .  .
81.29  1.00  .  .
93.77  1.00  .  .
90.37  1.00  .  .
13.84  .  1.00  1.00
92.29  1.00  .  .
21.62  .  1.00  1.00
36.19  .  1.00  1.00
58.75  .  1.00  1.00
29.08  .  1.00  1.00
89.07  1.00  .  .
 5.83  .  1.00  1.00
50.49  .  1.00  1.00
80.59  1.00  .  .
 9.21  .  1.00  1.00
 4.36  .  1.00  1.00
91.93  1.00  .  .
67.93  1.00  1.00  1.00
76.72  1.00  .  .
61.97  1.00  1.00  1.00

Number of cases read:  20    Number of cases listed:  20

* Here comes the bug.
IF Pat = 1 AND (Missing(ABB)) StrangeBehaviour = 1.
IF Pat = 1 AND (Missing(ABB2)) Strange2 = 1.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |13-APR-2007 20:37:38       |
|-----------------------------|---------------------------|

 test  pat  ABB  ABB2  StrangeBehaviour  Strange2
99.24  1.00  .  .  1.00  1.00
81.29  1.00  .  .  1.00  1.00
93.77  1.00  .  .  1.00  1.00
90.37  1.00  .  .  1.00  1.00
13.84  .  1.00  1.00  1.00  .
92.29  1.00  .  .  1.00  1.00
21.62  .  1.00  1.00  1.00  .
36.19  .  1.00  1.00  1.00  .
58.75  .  1.00  1.00  1.00  .
29.08  .  1.00  1.00  1.00  .
89.07  1.00  .  .  1.00  1.00
 5.83  .  1.00  1.00  1.00  .
50.49  .  1.00  1.00  1.00  .
80.59  1.00  .  .  1.00  1.00
 9.21  .  1.00  1.00  1.00  .
 4.36  .  1.00  1.00  1.00  .
91.93  1.00  .  .  1.00  1.00
67.93  1.00  1.00  1.00  1.00  .
76.72  1.00  .  .  1.00  1.00
61.97  1.00  1.00  1.00  1.00  .

Number of cases read:  20    Number of cases listed:  20
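For completeness, here is a sketch of the variant flagged above as not tested -- creating only the COMPUTE-based copy (ABB2) and skipping the IF-based ABB entirely. It is assembled from the code above but I have not run it:

* Untested sketch: ABB2 (COMPUTE-based) only, no IF-based ABB.
new file.
input program.
loop #i = 1 to 20.
compute test = rv.uniform(0,100).
end case.
end loop.
end file.
end input program.
If Test > 60 pat = 1.
IF Test < 70 flag = 1.
COMPUTE ABB2 = Flag.
LIST.
DELETE VAR Flag.
LIST.
IF Pat = 1 AND (Missing(ABB2)) Strange2 = 1.
LIST.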
Hi Richard and Ray
Thanks for responding. Yes, I have contacted SPSS's Tech Support (in Switzerland), but have got no answer so far. I know it is easy to work around the situation I describe. However, I find it annoying to have to check my data for bugs after having written perfectly valid syntax.

Christian

By the way, I recently found another bug (confirmed by SPSS). They told me they will fix it in version 16. It involves the VARSTOCASES command. Here is an example: I select six cases from a larger data set according to a certain criterion. Sometimes these six cases have only missing values. I save the selected cases and subsequently use an ADD FILES command. When all six files are empty, the resulting file is empty as well. I then use a VARSTOCASES command, which should give me the value 0.00 in a variable named 'trans' for all nine cases (INDEX = Index1(9)). I then run an AGGREGATE command to calculate the mean and the standard deviation of 'trans'. If all six cases selected in the first step had only missing values, I should get a mean of 0.00 and a standard deviation of 0.00. However, when I do this repeatedly as part of a larger routine, I sometimes get an absurdly high value for one of the values in 'trans'. It does not happen every time; the occurrences are not regular, and the wrong mean is not always the same value. Typically, the wrong value appears once or twice when I repeat the procedure 10 times.

Below is code that reproduces the error (at least on my computer) and prints the calculated means. It runs 10 times through the ADD FILES, VARSTOCASES, and AGGREGATE commands. The result is, for example:

mean = .0000
mean = 9.7E+288
mean = .0000
mean = .0000
mean = .0000
mean = .0000
mean = .0000
mean = .0000
mean = 2.9E+184
mean = .0000

The syntax
1. creates six empty files
2. adds them together (resulting in an empty file)
3. restructures the empty file (VARSTOCASES)
4. aggregates the restructured file
5. writes the aggregated values into a syntax file
6. prints the values
7. erases the created syntax files (from step 5)

* BEGIN OF SYNTAX.
* Syntax to reproduce the wrong mean.
*************************************.
NEW FILE.
DATASET CLOSE all.
DATA LIST FREE /qrel Q1a Q2a Q3a Q4a Q5a Q6a Q7a Q8a Q9a
  imp_1 imp_2 imp_3 imp_4 imp_5 imp_6 imp_7 imp_8 imp_9 Caegory Questions.
begin data.
end data.

* Make Data.
DEFINE !DoTbl ().
!LET !nby = 6.
!DO !cnt=1 !TO !nby.
!LET !Path = !QUOTE(!CONCAT('c:\Temp\tmp_',!cnt,'.sav') ).
Save Outfile = !Path.
!DOEND.
!ENDDEFINE.
!DoTbl.

* Make mean.
************.
DEFINE !DoData ().
!LET !nby = 10.
!DO !cnt=1 !TO !nby.
!LET !Path1 = !QUOTE(!CONCAT('c:\Temp\tmp_1.sav') ).
!LET !Path2 = !QUOTE(!CONCAT('c:\Temp\tmp_2.sav') ).
!LET !Path3 = !QUOTE(!CONCAT('c:\Temp\tmp_3.sav') ).
!LET !Path4 = !QUOTE(!CONCAT('c:\Temp\tmp_4.sav') ).
!LET !Path5 = !QUOTE(!CONCAT('c:\Temp\tmp_5.sav') ).
!LET !Path6 = !QUOTE(!CONCAT('c:\Temp\tmp_6.sav') ).
New File.
DATASET CLOSE ALL.
Get FILE = !Path1.
ADD FILES /FILE=*
 /FILE = !Path2
 /FILE = !Path3
 /FILE = !Path4
 /FILE = !Path5
 /FILE = !Path6.
Exec.
OMS /DESTINATION VIEWER=NO .
VARSTOCASES
 /MAKE trans FROM Q1a to Q9a
 /INDEX = Index1(9)
 /NULL = DROP
 /DROP = QREL imp_1 to imp_9 Caegory Questions
 /COUNT = Counter .
Exec.
OMSEND.
Compute dummy = 1.
DATASET DECLARE Agg.
AGGREGATE
 /OUTFILE ='Agg'
 /BREAK = dummy
 /Self_mean = MEAN(trans)
 /Self_sd = SD(trans).
Exec.
DATASET ACTIVE agg.
FORMAT Self_Mean (F8.4).
FORMAT Self_SD (F8.4).
!LET !PathA = !QUOTE(!CONCAT('c:\Temp\myMean_',!cnt,'.sps') ).
!LET !PathB = !QUOTE(!CONCAT('c:\Temp\myMean_',!cnt,'.sps') ).
!LET !myMean = !QUOTE(!CONCAT("DEFINE !Tb_Mean",!cnt,"()") ).
!LET !mySD = !QUOTE(!CONCAT("DEFINE !Tb_Std",!cnt,"()") ).
WRITE OUTFILE !PathA/!myMean/Self_Mean/"!ENDDEFINE."/"Exec.".
WRITE OUTFILE !PathA/!mySD/Self_SD/"!ENDDEFINE."/"Exec.".
EXEC.
!DOEND.
!ENDDEFINE.
!DoData.
Exec.

* Print the created means (should all be '.0000').
***************************************************.
Title "Results:".
DEFINE !DoPrint ().
!LET !nby = 10.
!DO !cnt=1 !TO !nby.
!LET !PathA = !QUOTE(!CONCAT('c:\Temp\myMean_',!cnt,'.sps') ).
INSERT FILE = !PathA.
!LET !Print = !CONCAT('mean = ','!Tb_mean',!cnt).
TITLE !Print.
Exec.
!DOEND.
!ENDDEFINE.
!DoPrint.
Echo " ".

* Clean up.
**********.
DEFINE !DoClean ().
!LET !nby = 10.
!DO !cnt=1 !TO !nby.
!LET !PathA = !QUOTE(!CONCAT('c:\Temp\myMean_',!cnt,'.sps') ).
ERASE FILE = !PathA.
!DOEND.
!LET !nby2 = 6.
!DO !cnt2=1 !TO !nby2.
!LET !Path = !QUOTE(!CONCAT('c:\Temp\tmp_',!cnt2,'.sav') ).
ERASE FILE= !Path.
!DOEND.
!ENDDEFINE.
!DoClean.
EXEC.
* END OF SYNTAX.
*******************************.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Richard Ristow
Sent: Saturday, April 14, 2007 21:35
To: [hidden email]
Subject: Re: Bug in DELETE VAR command?

At 07:58 AM 4/13/2007, la volta statistics wrote:

>I found a strange behaviour of SPSS 15.0.1.

Raynald Levesque and I have both posted, confirming the behavior seen. I think it's time to open a case with Tech Support; have you done this?

As a follow-up: I don't see the problem in 14.0.2. The following is 14.0.2 draft output (WRR-code & output not saved separately), and the results appear to be what they should be. It's the code I ran and posted for 15.0.1, though the values will be different because I didn't explicitly seed the random number generator.

Final output listing:
....................
IF Pat = 1 AND (Missing(ABB)) StrangeBehaviour = 1.
IF Pat = 1 AND (Missing(ABB2)) Strange2 = 1.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-APR-2007 15:23:40       |
|-----------------------------|---------------------------|

 test  pat  ABB  ABB2  StrangeBehaviour  Strange2
28.99  .  1.00  1.00  .  .
 5.52  .  1.00  1.00  .  .
 8.93  .  1.00  1.00  .  .
 5.61  .  1.00  1.00  .  .
60.25  1.00  1.00  1.00  .  .
87.04  1.00  .  .  1.00  1.00
67.79  1.00  1.00  1.00  .  .
63.91  1.00  1.00  1.00  .  .
95.94  1.00  .  .  1.00  1.00
66.97  1.00  1.00  1.00  .  .
75.40  1.00  .  .  1.00  1.00
19.18  .  1.00  1.00  .  .
66.89  1.00  1.00  1.00  .  .
40.62  .  1.00  1.00  .  .
 6.53  .  1.00  1.00  .  .
52.25  .  1.00  1.00  .  .
56.46  .  1.00  1.00  .  .
67.43  1.00  1.00  1.00  .  .
54.80  .  1.00  1.00  .  .
77.63  1.00  .  .  1.00  1.00

Number of cases read:  20    Number of cases listed:  20
....................

Full code and output:
....................
* Code to reproduce the bug.
new file.
input program.
loop #i = 1 to 20.
compute test = rv.uniform(0,100).
end case.
end loop.
end file.
end input program.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-APR-2007 15:23:39       |
|-----------------------------|---------------------------|

 test
28.99
 5.52
 8.93
 5.61
60.25
87.04
67.79
63.91
95.94
66.97
75.40
19.18
66.89
40.62
 6.53
52.25
56.46
67.43
54.80
77.63

Number of cases read:  20    Number of cases listed:  20

If Test > 60 pat = 1.
IF Test < 70 flag = 1.
IF Flag = 1 ABB = 1.
COMPUTE ABB2 = Flag.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-APR-2007 15:23:40       |
|-----------------------------|---------------------------|

 test  pat  flag  ABB  ABB2
28.99  .  1.00  1.00  1.00
 5.52  .  1.00  1.00  1.00
 8.93  .  1.00  1.00  1.00
 5.61  .  1.00  1.00  1.00
60.25  1.00  1.00  1.00  1.00
87.04  1.00  .  .  .
67.79  1.00  1.00  1.00  1.00
63.91  1.00  1.00  1.00  1.00
95.94  1.00  .  .  .
66.97  1.00  1.00  1.00  1.00
75.40  1.00  .  .  .
19.18  .  1.00  1.00  1.00
66.89  1.00  1.00  1.00  1.00
40.62  .  1.00  1.00  1.00
 6.53  .  1.00  1.00  1.00
52.25  .  1.00  1.00  1.00
56.46  .  1.00  1.00  1.00
67.43  1.00  1.00  1.00  1.00
54.80  .  1.00  1.00  1.00
77.63  1.00  .  .  .

Number of cases read:  20    Number of cases listed:  20

DELETE VAR Flag.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-APR-2007 15:23:40       |
|-----------------------------|---------------------------|

 test  pat  ABB  ABB2
28.99  .  1.00  1.00
 5.52  .  1.00  1.00
 8.93  .  1.00  1.00
 5.61  .  1.00  1.00
60.25  1.00  1.00  1.00
87.04  1.00  .  .
67.79  1.00  1.00  1.00
63.91  1.00  1.00  1.00
95.94  1.00  .  .
66.97  1.00  1.00  1.00
75.40  1.00  .  .
19.18  .  1.00  1.00
66.89  1.00  1.00  1.00
40.62  .  1.00  1.00
 6.53  .  1.00  1.00
52.25  .  1.00  1.00
56.46  .  1.00  1.00
67.43  1.00  1.00  1.00
54.80  .  1.00  1.00
77.63  1.00  .  .

Number of cases read:  20    Number of cases listed:  20

* Here comes the bug.
IF Pat = 1 AND (Missing(ABB)) StrangeBehaviour = 1.
IF Pat = 1 AND (Missing(ABB2)) Strange2 = 1.
LIST.

List
|-----------------------------|---------------------------|
|Output Created               |14-APR-2007 15:23:40       |
|-----------------------------|---------------------------|

 test  pat  ABB  ABB2  StrangeBehaviour  Strange2
28.99  .  1.00  1.00  .  .
 5.52  .  1.00  1.00  .  .
 8.93  .  1.00  1.00  .  .
 5.61  .  1.00  1.00  .  .
60.25  1.00  1.00  1.00  .  .
87.04  1.00  .  .  1.00  1.00
67.79  1.00  1.00  1.00  .  .
63.91  1.00  1.00  1.00  .  .
95.94  1.00  .  .  1.00  1.00
66.97  1.00  1.00  1.00  .  .
75.40  1.00  .  .  1.00  1.00
19.18  .  1.00  1.00  .  .
66.89  1.00  1.00  1.00  .  .
40.62  .  1.00  1.00  .  .
 6.53  .  1.00  1.00  .  .
52.25  .  1.00  1.00  .  .
56.46  .  1.00  1.00  .  .
67.43  1.00  1.00  1.00  .  .
54.80  .  1.00  1.00  .  .
77.63  1.00  .  .  1.00  1.00

Number of cases read:  20    Number of cases listed:  20
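PS: In the meantime I am thinking of adding a sanity check right after the AGGREGATE step, so that a corrupted run at least announces itself instead of silently feeding a wrong mean into the rest of the routine. This is only a sketch, using the names from the syntax above and an arbitrary threshold of 1000 (any value far outside the plausible range of 'trans' would do):

* Sketch only: flag implausible aggregated means (threshold chosen arbitrarily).
DATASET ACTIVATE Agg.
COMPUTE suspect = (SYSMIS(Self_mean) OR ABS(Self_mean) > 1000).
EXECUTE.
LIST Self_mean Self_sd suspect.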
In reply to this post by Richard Ristow
Dear colleagues,
I have been working on a project wherein I am creating charts to show how mean GPAs compare ACROSS several different ability groups and BETWEEN two residential groups. I have done the sig tests and all, but am developing charts to SHOW the data to some administrators. Yes, I can show tables of numbers and p values, and present statements that mean 1 is significantly greater than mean 2, etc. Honestly, that is not really what this audience is interested in. I really don't think tables of numbers will make an impact.

The charting scheme I use makes use of error bars to show the uncertainty of my point estimates. As I began thinking about the numbers I had available (SDs, SEs, 95% CIs), I began to wonder, 'Which should I use?' I have used CIs and never really considered an alternative (perhaps due to the nature of my research). I figured someone out there had a really good answer. Well, I have been disappointed. My research has revealed the following: whether you use SE, SD, or CI depends largely on your discipline, your question, your sample size, etc. Strangely enough, I really don't recall a common use of error bars in the education/psych journals that I grew up with. All this is a slap in the face of my hero (cited below) and inventor of the box plot, John Wilder Tukey, who would have considered the display of uncertainty in charts nothing less than standard (read: required) practice. SPSS certainly makes it easy enough to include each of the three types and alter the criteria (CI 90%, SE*2, etc.).

OK, all of that to ask this: Have you all, in your years of experience and practice, developed an opinion about the proper (or preferred) use of SD, SE, or CI error bars in particular situations or with particular audiences in the social sciences? Do you prefer SE (typically about half the width of the CI with decent sample sizes)? Do you even think it's important? I really wonder about this last question, considering the lack of use in the published ed and psych literature.

I respectfully surrender the soapbox to the next confused (or bored) soul.

Mark

********************************************************************************
Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]

'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962)
http://psyphz.psych.wisc.edu/%7Eshackman/mediation_moderation_resources.htm#Resources_for_Within-Subjects_%28Repeated_
The unpublished paper by Christian Schunn is particularly insightful.

HTH,
Alex Shackman

--
Alexander J. Shackman
Laboratory for Affective Neuroscience
Waisman Laboratory for Brain Imaging & Behavior
University of Wisconsin-Madison
1202 West Johnson Street
Madison, Wisconsin 53706
Telephone: +1 (608) 358-5025
FAX: +1 (608) 265-2875
EMAIL: [hidden email]
http://psyphz.psych.wisc.edu/~shackman
This PowerPoint presentation by Jody Culham was also very helpful
http://defiant.ssc.uwo.ca/Jody_web/Culham_Lab_Docs/Advice/ErrorBars_Lecture5.ppt

********************************************************************************
Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]

'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962)
In reply to this post by Mark A Davenport MADAVENP
Mark,
I'm sure you will get more learned opinions than mine, but I have also struggled with the issue. I've never considered using CI, but wrestled with SE vs SD. As I understand it, the choice relates to your purpose: use SE if you want to show how "close" your group means are likely to be to the population mean; use SD if you are interested in showing the variability of your particular samples, without concern for the population values. Thus, if you are using statistical tests of significance, SEs are probably most appropriate. That's my 2 cents.

My piggy-back question: Is there a way in SPSS to have error bars appear on bar charts, particularly if using the GUI? I have yet to figure that one out.

Fred

--
Fredric E. Rose, Ph.D.
Assistant Professor of Psychology
Palomar College
760-744-1150 x2344
[hidden email]
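PS: If it helps to see all three side by side, here is a small syntax sketch (the variable names 'score' and 'group' and the dataset name 'ErrStats' are made up for illustration) that aggregates a measure by group and then derives the standard error and a 95% confidence interval from the SD and the group size, using SE = SD / SQRT(N) and CI = mean +/- t(.975, N-1) * SE:

* Sketch only: assumes no missing values on score within a group.
DATASET DECLARE ErrStats.
AGGREGATE
  /OUTFILE='ErrStats'
  /BREAK=group
  /mean_score = MEAN(score)
  /sd_score = SD(score)
  /n_cases = N.
DATASET ACTIVATE ErrStats.
COMPUTE se_score = sd_score / SQRT(n_cases).
COMPUTE ci95_half = IDF.T(0.975, n_cases - 1) * se_score.
COMPUTE ci95_lo = mean_score - ci95_half.
COMPUTE ci95_hi = mean_score + ci95_half.
EXECUTE.
LIST group mean_score sd_score se_score ci95_lo ci95_hi.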
In reply to this post by Mark A Davenport MADAVENP
Hi Mark
At least in my field of work (biomedical research mainly), I've found that the book "How to Report Statistics in Medicine" (Lang & Secic, ACP Series) is a *must have*. It gives clear indications about which approach to presenting the data, either numerically (tables) or graphically, is best, and why. To keep it short:

1) Avoid the use of "dynamite pushers" graphs (*)
2) Avoid the use of SE. If the graph is simply descriptive, use SD; if you want to show the precision of your estimates, then use CI.

(*) A bit of ASCII art (use courier font to view the figures):

[ASCII sketch: a bar with an error bar sitting on top of it -- the "dynamite pusher" style to avoid]

Use this type of graph instead:

[ASCII sketch: a point with an error bar running through it]

I hope this helps,
Marta García-Granero
In reply to this post by Rose, Fred
Ah, this is weird. Maybe Jon or ViAnn can chime in on this one. I was trying to respond to Fredric's question and found that error bars are easy enough in the old chart method and the Interactive Graphs method. However, when I tried to make the same bar chart using the Chart Builder, the error bars option was greyed out. I could never get it to present itself. What gives?

********************************************************************************
Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]

'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962)
In reply to this post by Alexander J. Shackman-2
Here is another interesting read on the topic of error bars. This is the
one that piqued my self-doubt.

http://scienceblogs.com/cognitivedaily/2007/03/most_researchers_dont_understa.php

********************************************************************************
Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]

'An approximate answer to the right question is worth a good deal more than an exact answer to an approximate question.' --a paraphrase of J. W. Tukey (1962)
In reply to this post by Rose, Fred
To get error bars on charts using the legacy graph dialogs, go to the Options sub-dialog. You can get error bars on bar charts, line charts, area charts, ...

You can also get error bars using the Chart Builder in version 15 by specifying a count, mean, or median statistic and turning on error bars in the properties sheet for the element.
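For syntax users, the legacy GRAPH command will also draw a standalone error bar chart, if I remember the pasted syntax correctly. A minimal sketch (the variable names gpa and restype are made up for illustration, and note that this produces an error bar chart rather than bars with error bars attached):

* Sketch only: mean of gpa by restype with 95% confidence interval bars.
GRAPH
  /ERRORBAR(CI 95)=gpa BY restype.
* The same chart with bars of +/- 2 standard errors instead.
GRAPH
  /ERRORBAR(STERROR 2)=gpa BY restype.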
In reply to this post by Mark A Davenport MADAVENP
Dear Mark,
I can recommend the following article:

Goldstein, H. and Healy, M. J. R. (1995). "The graphical presentation of a collection of means." Journal of the Royal Statistical Society, Series A (Statistics in Society), Vol. 158, No. 1, pp. 175-177.

The method described in this article is often used in the context of multilevel analysis.

Best,
Henrik

--
Henrik Lolle
Institut for Økonomi, Politik og Forvaltning
Fibigerstræde 1
Aalborg Universitet
9200 Aalborg Ø.
Tlf.: 96 35 81 84
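A brief note on what the article proposes, recalled from memory (so treat the exact multiplier as an assumption to check against the paper): rather than plotting conventional 95% intervals, Goldstein and Healy suggest intervals scaled so that two means differ at roughly the 5% level exactly when their intervals fail to overlap; with similar standard errors this works out to about mean +/- 1.39 * SE. A sketch in SPSS syntax, assuming a working file that already holds one mean and SE per group (the variable names group, group_mean, and group_se are made up):

* Sketch only: Goldstein-Healy style comparison intervals, multiplier assumed to be 1.39.
COMPUTE gh_lo = group_mean - 1.39 * group_se.
COMPUTE gh_hi = group_mean + 1.39 * group_se.
EXECUTE.
LIST group gh_lo gh_hi.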