SPSSX Discussion

Graduate course on SPSS


kim.barchard

Graduate course on SPSS

Hello everyone,

I am thinking of teaching a graduate-level course on SPSS. I'm imagining
that it would assume no background with SPSS, but would bring students to a
relatively high level of sophistication. The course would focus on simple
statistical techniques so that this is a course about SPSS, not about
statistics.

Does anyone have assignments and lectures they'd be willing to share? What
about textbook recommendations?

Thanks!

Kim Barchard
Assistant Professor
University of Nevada, Las Vegas

Maguin, Eugene

Re: Graduate course on SPSS

Kim,

I think that is an excellent idea. From time to time and when we can find
them (and have the money to pay them), we have hired grad students to work
on data management and analysis projects. I've noticed that their SPSS
skills are pretty limited because they learned just enough SPSS to get
through a statistics lab. It sounds like you have something else in mind.

I would assume that your students will have had one or more research design
classes, so that they understand different designs, and one or more
statistics classes, so that they understand how to conduct and interpret
different statistical tests. Your class might fit into that area between
design and statistics--an area I'd call 'data management and preparation'.
I think your syllabus is every non-statistics command in the syntax
reference plus the features documented in the non-syntax manuals. I would
probably skip macros, the matrix command set, scripts and Python--unless
there is time to include them, and then I'd do them in the order listed. In
addition, students need an understanding of how to set up data for different
statistical procedures. Lastly, they need to know how to make SPSS output
connect with other programs, especially Word, Excel and PowerPoint.
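
As one hedged illustration of connecting SPSS output with other programs, the sketch below routes the tables from a procedure to an HTML file, which Word or Excel can then open. The file path and variable names are placeholders, and it assumes an SPSS release that includes the OMS command.

```spss
* Send the tables produced between OMS and OMSEND to an HTML file.
* The path and the variables age and income are hypothetical.
OMS /SELECT TABLES
  /DESTINATION FORMAT=HTML OUTFILE='C:\temp\descriptives.htm'.
DESCRIPTIVES VARIABLES=age income.
OMSEND.
```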

I'd say that to start with there are two required 'texts': the syntax manual
and Ray Levesque's book, which you can get from the SPSS website. If you go
into scripts and Python, I don't know what documentation is available, as
there is nothing--as far as I can see--in the manuals directory. Maybe
somebody else can say.

Gene Maguin

Mark A Davenport MADAVENP

Re: Graduate course on SPSS

I replied to Kim privately with an attachment but want to give one more
voice to the importance of Gene's comments. I see very few stats teachers
going much deeper into SPSS than how to use the drop-down menus, leaving
students with a very simplistic view of data analysis. SPSS is NOT a
spreadsheet. Students need to learn early on that SPSS is not just
another expensive calculator. It's a tool for solving problems.

For instance, it may take a student several finger-numbing mouse clicks to
recreate contrast codes created, but messed up, in the previous day's
work.

Or he/she can simply open, edit, and run the syntax written the day before
(not very sophisticated but it's a start):

* COMPARE A1,B1 TO A2,B1.

* Start every case at 0, then recode the two cells being compared.
COMPUTE dummyc_1 = 0.

DO IF (orglevel = 1 & gender = 1).
  RECODE dummyc_1 (0=1).
END IF.

DO IF (orglevel = 1 & gender = 2).
  RECODE dummyc_1 (0=-1).
END IF.
EXECUTE.
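
The same contrast code can also be sketched with plain IF statements. This is an alternative formulation rather than the syntax from the post, and it assumes the same variables, orglevel and gender.

```spss
* Contrast A1,B1 versus A2,B1 using IF instead of DO IF / RECODE.
COMPUTE dummyc_1 = 0.
IF (orglevel = 1 AND gender = 1) dummyc_1 = 1.
IF (orglevel = 1 AND gender = 2) dummyc_1 = -1.
EXECUTE.
```

Either way, fixing a mistake means editing one line and rerunning, not repeating the menu clicks.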

We are very good at teaching students how to generate numbers. We could
do so much more to help them learn how to solve data problems.

For what it's worth.

***************************************************************************************************************************************************************
Mark A. Davenport Ph.D.
Senior Research Analyst
Office of Institutional Research
The University of North Carolina at Greensboro
336.256.0395
[hidden email]

'An approximate answer to the right question is worth a good deal more
than an exact answer to an approximate question.' --a paraphrase of J. W.
Tukey (1962)


Peck, Jon

Re: Graduate course on SPSS

The Data Management book, referred to in Gene's message, has been updated to the 4th edition, and it has greatly expanded content on programmability and Python. The 3rd edition also has a lot of material on this topic.

If you have installed the programmability extension, you will find a large PDF file in the help/programmability subdirectory of your SPSS installation.

And, of course, there are articles and other useful materials on SPSS Developer Central (www.spss.com/devcentral).
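
For readers who have not seen programmability, a minimal sketch of a Python block inside a syntax file follows. It assumes the Python integration plug-in is installed and a data file is open; the exact BEGIN PROGRAM keyword and Python version vary by SPSS release.

```spss
BEGIN PROGRAM PYTHON.
import spss
# List the names of the variables in the active data file.
for i in range(spss.GetVariableCount()):
    print(spss.GetVariableName(i))
# Run ordinary command syntax from Python.
spss.Submit("DESCRIPTIVES VARIABLES=ALL.")
END PROGRAM.
```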

-Jon Peck


Mark A Davenport MADAVENP

Re: Graduate course on SPSS

In reply to this post by kim.barchard

Elena,

There may be some disagreement but I see the teaching of syntax as
'essential', not 'optional'. Some statistical options are not even
available from the menu. However, most beginners and many advanced users
may never miss those options. More to my point is that SPSS, SAS, Stata,
etc. have made statistical analysis an almost mindless process. Before
computers, statistical analysis was for those with the math skill, the
need, motivation, perseverance, time, etc. to painstakingly march through
columns of numbers and pages of matrices. You had to clean and condition
your data manually, often on paper; a process that left you intimately
(and I do mean 'intimately') familiar with your data. The process of
calculating mean squares, sums of squares, and such gave you much more
opportunity to find errors by recognizing intermediate values that seem a
bit off. Then came automation and the whole system went to pot. Now, all
one needs to do is use one's familiarity with MS Windows to open a data
set, pick a procedure, dump some variables into the box and pop out some
parameter estimates. Sadly, with the removal of the math from the hands
of beginning researchers, a vital step in the problem solving process
(stepping back and reviewing your progress, checking the reasonableness of
the numbers at intermediate steps) has been taken away. I think the ease
with which statistics can be generated by computer has significantly
increased the risk of errors, particularly in the case of students.
For my students and for myself, the use of syntax has provided us with a
bit of that sense of intimacy that was lost when SPSS went Windows.

The wonderful thing is you don't need to be a statistician to be a
conscientious data manager. And you don't have to be a programmer to use
syntax. You must only be willing to learn. Honestly, it's not like
learning FORTRAN. There is a great deal a student can do with only some
very simple syntax. In my experience, it only takes a few weeks for
students to appreciate how much time and effort syntax can save,
especially when subsequent lessons require that they dip back into data
and procedures that they have already run. They will quickly become
confident and receptive to the idea of doing some things 'the old way'.
The next thing you know they are playing with macros.

Gene, Jon, and others have given some very good advice. Many books now
teach SPSS by showing what syntax is produced when you set up a window and
press 'PASTE' instead of 'OK'. I would recommend such books. I also have
my students haunt Raynald Levesque's website (http://www.spsstools.net/)
and lurk on message boards. First and foremost, I demand that my students
perform every procedure at least once using the 'PASTE' button rather than
the 'OK' button.
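
For example, opening the Frequencies dialog for a single variable and pressing 'PASTE' typically writes something like the following to a syntax window. The variable name is hypothetical, and the exact subcommands pasted depend on the SPSS version and the options chosen.

```spss
FREQUENCIES
  VARIABLES=gender
  /ORDER=ANALYSIS.
```

Saving that window gives the student a rerunnable record of exactly what was done.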

More on my 2-cents worth (note that it's not worth a dime).

Mark


"Elena Verbitskaya" <[hidden email]>
01/12/2007 12:49 PM

To
"'Mark A Davenport MADAVENP'" <[hidden email]>
cc

Subject
RE: Graduate course on SPSS

Dear Mark,
I am sure you are right. Could you give me any advice for this situation?
In Russia we do not have specialists in biostatistics in medical
universities (we do not have such a specialty at all), and all scientists
have to do the analysis themselves. We have organized some counseling
(three persons in the laboratory), so we have to teach young scientists to
do everything themselves. They have never had any training in programming,
not all of them have good computer skills, but some have practice in
calculating basic statistics on calculators (only Student's t or so)... Do
you think it is wise to teach them syntax?

Elena V. Verbitskaya


Richard Ristow

Re: Graduate course on SPSS

In reply to this post by kim.barchard

At 10:29 AM 1/12/2007, [hidden email] wrote:

>I am thinking of teaching a course [that] would assume no background
>with SPSS, but would bring students to a relatively high level of
>sophistication. The course would focus on simple statistical
>techniques so that this is a course about SPSS, not about statistics.

Some points, recognizing that I'm partly echoing things other
contributors have said.

FIRST, learning SPSS has little to do with learning SPSS. (A special
case of the principle that computing has little to do with computers.)
That is, the crucial knowledge is what you need to do, and the steps
*as* *viewed* *from* *the* *problem* to get there; only then, the
syntax or techniques of your computational tool.

SECOND, having said the above, you need to learn about your
computational tool, SPSS or anything else; how it 'thinks'. For
example, SPSS is a file-spinner; its most natural operation is reading
and processing a file, record by record. The most important commands
are those that work with a whole file (the procedures); those that get
you a file to work with (start with GET FILE, DATA LIST); those that
work with each record, 'case', as it goes by (the transformation
commands); and those that define 'dictionary' attributes of the file
('transformation commands that take effect immediately').
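
A minimal sketch of those four kinds of command, using made-up inline data and hypothetical variable names:

```spss
* Get a file to work with (inline data for illustration).
DATA LIST FREE / id height weight.
BEGIN DATA
1 170 65
2 182 90
3 158 54
END DATA.

* Dictionary command: takes effect immediately.
VARIABLE LABELS height 'Height (cm)' weight 'Weight (kg)'.

* Transformation: applied to each case on the next pass through the data.
COMPUTE ratio = weight / height.

* Procedure: reads the whole file and carries out the pending transformation.
DESCRIPTIVES VARIABLES=height weight ratio.
```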

That brings us to another part of how SPSS thinks: What a file 'looks'
like. That is, the data types variables can have, attributes (name,
labels, formats, missing values).

THIRD, and this needs to be hammered in: much of the work on a
statistical project, maybe 80%, is getting the data ready to run
analyses on. That is *radically* absent from most exercises in
statistics labs, where students never see anything but a cleaned-up
file ready to go. Teach careful practice, finicky practice: label every
variable; label values where that's relevant. Define user-missing
values where that's useful (which is often). Specify a format, an
appropriate format, for every numeric variable. (Neither categorical
variables nor Likert scales should be F8.2)
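
As a sketch of that finicky practice for one five-point item, assuming a hypothetical variable q1 on which 9 was keyed for 'no answer':

```spss
VARIABLE LABELS q1 'Satisfaction with service'.
VALUE LABELS q1
  1 'Very dissatisfied' 2 'Dissatisfied' 3 'Neutral'
  4 'Satisfied' 5 'Very satisfied' 9 'No answer'.
* Treat 9 as user-missing so it stays out of means and totals.
MISSING VALUES q1 (9).
* One digit, no decimals, instead of the default F8.2.
FORMATS q1 (F1.0).
```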

Of course, they'll have enough trouble getting the data in, in the
first place. And then the computations to analyze it, like scale total
scores where the variables read in are the question responses.
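
A scale score computation might be sketched as follows. The item names q1 to q5 are hypothetical, and the TO keyword assumes the items sit next to each other in the file.

```spss
* Total only for cases that answered all five items.
COMPUTE scale_total = SUM.5(q1 TO q5).
* Mean score, allowing one missing answer out of five.
COMPUTE scale_mean = MEAN.4(q1 TO q5).
EXECUTE.
```

Whether to allow missing items at all is a substantive decision; the suffix on the function only records the rule chosen.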

And when you've done this, the descriptive statistics are to be read,
with an eye to meaning and plausibility, not just printed and put in a
binder. (For continuous variables, my usual set of descriptive
statistics is mean, standard deviation, median, minimum, and maximum.)
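
That set of statistics can be requested in one command. A sketch, with hypothetical variable names:

```spss
* Summary statistics only; NOTABLE suppresses the frequency tables.
FREQUENCIES VARIABLES=age income
  /FORMAT=NOTABLE
  /STATISTICS=MEAN STDDEV MEDIAN MINIMUM MAXIMUM.
```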
....................
Of course, you'll also need something to cover the SECOND day of the
course.

-Forwards or backwards,
Richard; and the very best success to you.

Art Kendall-2

Re: Graduate course on SPSS

As usual Richard makes some very good points.
<tongue in cheek>
The one point on which I would disagree is

Richard Ristow wrote:

> At 10:29 AM 1/12/2007, [hidden email] wrote:
> <snip>
>
> THIRD, and this needs to be hammered in: much of the work on a
> statistical project, maybe 80%, is getting the data ready to run
> analyses on.
>

In consulting on something like over 200 doctoral dissertations and a
thousand congressional investigations, I have seen a small handful of
projects where it is as low as 80%. These tended to be using
data from organizations such as Census, NCI, NIMH, or FDA.

<remove tongue from cheek>

Art Kendall
Social Research Consultants

Marta García-Granero

Re: Graduate course on SPSS

In reply to this post by kim.barchard

Hi Kim

I'm joining this discussion a bit late (quite busy teaching SPSS to my
students), but I hope to add a couple of ideas from my own experience
teaching SPSS to biologists and medical researchers. Although all my
teaching material is in Spanish, I think I could adapt part of it
easily (a nice dataset and a flow-chart to select statistical
methods) to English.

Friday, January 12, 2007, 4:29:59 PM, You wrote:

kbUE> I am thinking of teaching a graduate level course on SPSS. I'm imagining
kbUE> that it would assume no background with SPSS, but would bring students to a
kbUE> relatively high level of sophistication. The course would focus on simple
kbUE> statistical techniques so that this is a course about SPSS, not about
kbUE> statistics.

How much time has passed since your potential students learnt
statistics theory? My own experience is that you can't try to teach
them how to do a two-sample t-test with SPSS, if they don't know what
a t-test is, why they should use it in that experimental situation
and how to interpret SPSS output. My own courses intersperse statistics
theory with SPSS teaching.

kbUE> Does anyone have assignments and lectures they'd be willing to share?

This is (schematically) the way I work (I have datasets for every
analysis):

BASIC COURSE (around 2-3 hours per session, some are longer than
others)

First session:
--------------

The basics of dataset creation and manipulation with SPSS using the
data editor, GUI and syntax

Their goal is to create a dataset with 11 variables and 100 cases:

* They start using the data editor (assigning "good" variable names, type
-numeric, string, dates..., variable and value labels, variable level...),
and typing the first two rows of data. The dataset is then saved (we are
using SPSS 13, which doesn't allow simultaneous datasets).
* The other 98 cases are stored in a text file (although I'm
considering using an Excel file instead) and have to be imported to
SPSS. The first dataset is then opened again and the second is added
to it. The full dataset is saved.
* Now, they are shown how to do that with syntax (DATA LIST, VAR LABEL,
ADD FILES...)
* New variables (using the GUI, and later the syntax):
- Age (using the Date Wizard) is computed from BirthDate and StudyDate
- BMI (body mass index) is computed from weight and height
- Obesity is obtained by recoding BMI into 3 categories (<25 kg/m²=0;
25-30 kg/m²=1; >30 kg/m²=2)
* Basic statistics: Frequencies, Descriptives, Explore...
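
The syntax side of this first session might be sketched as below. The file names are placeholders, weight is assumed to be in kg and height in cm, and BirthDate and StudyDate are assumed to be date-format variables; none of these details come from the course material itself.

```spss
* Append the imported cases to the ones typed by hand.
ADD FILES /FILE='C:\course\first2.sav'
  /FILE='C:\course\other98.sav'.
EXECUTE.

* Age in completed years between the two dates.
COMPUTE Age = DATEDIFF(StudyDate, BirthDate, "years").

* Body mass index in kg/m2.
COMPUTE BMI = weight / (height / 100)**2.

* Under 25 = 0, 25 to 30 = 1, over 30 = 2.
* RECODE uses the first range that matches, so 25 and 30 both fall in category 1.
RECODE BMI (25 THRU 30=1) (LOWEST THRU 25=0) (30 THRU HIGHEST=2) INTO Obesity.
VALUE LABELS Obesity 0 'Under 25' 1 '25 to 30' 2 'Over 30'.
EXECUTE.
```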

Second session:
--------------

One and two-samples tests for continuous variables (parametric and non
parametric).

Each experimental situation is presented separately, with a small
dataset, the flow chart is used to determine which statistical method
answers the research question, and the full analysis is undertaken:
Normality (or other conditions, like homogeneity of variances),
parametric test (if normality was fulfilled), and non parametric
equivalent (if normality failed).
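
For the two-independent-samples case, that sequence could look roughly like this. The variables score and group (coded 1 and 2) are hypothetical.

```spss
* Normality check per group: tests of normality and normal Q-Q plots.
EXAMINE VARIABLES=score BY group
  /PLOT NPPLOT
  /STATISTICS NONE.

* Parametric test; the output also reports Levene's test for equal variances.
T-TEST GROUPS=group(1 2)
  /VARIABLES=score.

* Nonparametric equivalent: Mann-Whitney U.
NPAR TESTS /M-W=score BY group(1 2).
```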

Although they use the GUI, they learn to click PASTE button instead of
OK to see the underlying syntax. Some MACROS are used to add some
statistical methods not covered by SPSS (like confidence intervals
for median differences in paired and unpaired samples). We can't teach
them how to write their own MACROS, but they learn the basics: what a
MACRO is, and how to activate and use it.
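
To show what 'activating and using' a macro means, here is a deliberately trivial sketch. The macro name !descr and the variables are invented for illustration; the course macros themselves are not reproduced here.

```spss
* Running the definition once per session activates the macro.
* The same text could also be kept in a file and loaded with INCLUDE.
DEFINE !descr (vars = !CMDEND)
DESCRIPTIVES VARIABLES=!vars
  /STATISTICS=MEAN STDDEV MIN MAX.
!ENDDEFINE.

* After that, the macro is called like any other command.
!descr vars = height weight.
```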

Third session:
-------------

K samples (continuous variables), both parametric and non parametric:
Oneway ANOVA, Kruskal-Wallis, two-way ANOVA, repeated measures ANOVA
and Friedman test.

The same method as before: presentation of each experimental
situation, flow chart to determine the correct statistical approach
and full statistical analysis (normality, parametric and
nonparametric). Again, MACROS are used (multiple comparisons after
Kruskal-Wallis and Friedman test). VARSTOCASES is used to stack the
repeated measures dataset to analyse it using UNIANOVA (model without
interaction).
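
A sketch of that restructuring step, assuming a wide file with a subject identifier id and three hypothetical measurements time1 to time3:

```spss
* Wide to long: one row per subject per occasion.
VARSTOCASES
  /MAKE score FROM time1 time2 time3
  /INDEX=occasion
  /KEEP=id.

* Subject as a random blocking factor, main effects only (no interaction).
UNIANOVA score BY id occasion
  /RANDOM=id
  /DESIGN=id occasion.
```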

They use the GUI and the syntax is pasted.

Fourth session:
--------------

Correlation and regression

Same method: presentation of a problem, using the flow chart to
determine the correct approach. MACROS are used for bivariate
normality, 95%CI for r and for nonparametric linear regression
(Theil's incomplete method). Graphs are also used.

Fifth session:
-------------

Methods for categorical variables: goodness of fit test, contingency
tables (RxC and 2x2), McNemar's and Cochran's test (this last with a
MACRO for multiple comparisons). The use of WEIGHT command is learnt
(when aggregated data are presented). MACROs to compute confidence
intervals for differences in proportions (both paired and unpaired) are
also used.
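
The WEIGHT step for aggregated data can be sketched with a made-up 2x2 table in which each row is one cell and n is its count:

```spss
DATA LIST FREE / exposure outcome n.
BEGIN DATA
1 1 20
1 0 80
0 1 10
0 0 90
END DATA.

* Each row now stands for n cases.
WEIGHT BY n.

CROSSTABS
  /TABLES=exposure BY outcome
  /STATISTICS=CHISQ
  /CELLS=COUNT ROW.
```

Without WEIGHT BY, the same CROSSTABS would see only four cases.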

ADVANCED COURSE

The Basic course knowledge is assumed, and, again, the theory is
explained before using SPSS. We focus on the following topics:

- ANOVA models: randomized blocks, latin square, split-plot, nested
models, mixed factorial model, pure within subjects factorial model,
ANCOVAs. Both UNIANOVA and MIXED are used, and the syntax is manually
modified (to nest some terms within others, or to add SPECIAL
contrasts...)

- Multiple linear regression. Strategies for model development; using
a MACRO to center the interaction terms, model checking.

- Categorical variables: stratified analysis (Mantel-Haenszel OR) and
logistic regression.

- Survival analysis: Kaplan-Meier and Cox regression.

- Meta-analysis: with MACROS (you can find them at Devcentral).
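
Two of these topics, sketched with hypothetical variable names (y, a, b, time, event, group, age); these are illustrations of the command forms, not the course's own examples.

```spss
* Nested model: pasted syntax edited by hand so that b is nested within a.
UNIANOVA y BY a b
  /DESIGN=a b(a).

* Kaplan-Meier by group with a log-rank test; event = 1 marks the event.
KM time BY group
  /STATUS=event(1)
  /PRINT TABLE MEAN
  /TEST LOGRANK.

* Cox regression with two covariates.
COXREG time
  /STATUS=event(1)
  /METHOD=ENTER age group.
```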

--
Regards,
Dr. Marta García-Granero,PhD mailto:[hidden email]
Statistician

---
"It is unwise to use a statistical procedure whose use one does
not understand. SPSS syntax guide cannot supply this knowledge, and it
is certainly no substitute for the basic understanding of statistics
and statistical thinking that is essential for the wise choice of
methods and the correct interpretation of their results".

(Adapted from WinPepi manual - I'm sure Joe Abramson will not mind)

Maguin, Eugene

An analysis question and a power question

In reply to this post by Peck, Jon

All,

I think I know the answer to the analysis question; however, I want to make
sure of that. A project I am working on will be surveying two groups of people.
Each group will be surveyed by two methods: A and B. Group 1 gets method A
then B; Group 2 gets the reverse. The methods are related in that they ask
about the same topics but with different levels of detail and structure. We
are expecting an interaction between method and group. Now then, the reason
for my question. The DVs can be constructed as either dichotomies or counts.
My understanding is that there is no way that spss can analyze this--without
treating the DV as continuous. I can't think of anything that would work but
is there something hiding or something that could be adapted?

Question 2: Given the above design, can anyone offer any advice on how to
compute a power number for this problem?

Thanks, Gene Maguin

kim.barchard

Re: Graduate course on SPSS

In reply to this post by kim.barchard

Hi Marta and everyone else,

In my statistics courses, I teach students the theory, the hand
calculation, and the SPSS calculation for each topic. My purpose in
teaching students SPSS at that point is to teach them how to do the
statistical analysis I have just taught them. With very few exceptions,
these analyses can be done with menus, and if syntax files are absolutely
essential, I provide them. Students become proficient at getting SPSS to
do the analyses that I teach them in class.

However, most psychological research goes beyond the statistical techniques
taught in the two or three graduate level statistics courses that most
psychologists get. And many psychological research projects would benefit
from statistical techniques that cannot be done using the menu system.
Therefore, I would like my students to feel confident in using SPSS in
novel ways. Therefore, I was thinking of teaching a course on SPSS itself.
This course would not teach much in the way of statistics. The goal would
be to focus on SPSS. I would therefore assume knowledge of basic
descriptive and inferential statistics (mean, sd, correlation, t-test), but
nothing else.

Each thing I teach students about SPSS, however, would need to be perceived
as relevant and interesting and useful, or else students will resent the
make-work activities, and will forget these skills when they would be most
useful. Furthermore, unless they find the skills interesting and useful,
they will not feel empowered to explore SPSS on their own. Therefore, each
SPSS skill needs to be taught in the context of a real research question.

It may be that teaching a course on SPSS itself is a bad idea. Perhaps
SPSS should only be taught as part of a statistics or research methods
course. Putting SPSS into the context of such a course provides a
"story" for the skills being taught. One of the list members was talking
about a course on data management. This would provide a story-line that
includes defining variables and variable properties, merging files,
transposing data, reorganizing data when it has been entered in a way that
isn't conducive to analysis, etc. Another option would be a course in
resampling techniques, which could introduce loops and OMS. It is
interesting to me that so far, no one has told me that they teach a course
in SPSS itself.
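
To make those two story-lines concrete, here is a rough sketch of one data management step (merging files) and one building block for resampling (capturing a results table as data with OMS). File paths and variable names are placeholders.

```spss
* Merge a lookup table onto the working file; both must be sorted by id.
SORT CASES BY id.
MATCH FILES /FILE=*
  /TABLE='C:\course\demographics.sav'
  /BY id.
EXECUTE.

* Capture a results table as a data file, e.g. to collect estimates
* across resamples.
OMS /SELECT TABLES
  /IF COMMANDS=['Descriptives'] SUBTYPES=['Descriptive Statistics']
  /DESTINATION FORMAT=SAV OUTFILE='C:\course\desc_results.sav'.
DESCRIPTIVES VARIABLES=score.
OMSEND.
```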

What do you think of the concept of a course that focusses on SPSS skills -
how to go beyond the menus to use SPSS to answer research questions?

Best regards,

Kim

Marta García-Granero <[hidden email]> writes:

Hi Kim

I'm joining this discussion a bit late (quite busy teaching SPSS to my
students), but I hope to add a couple of ideas from my own experience
teaching SPSS to biologists and medical researchers. Although all my
teaching material is is Spanish, I think I could adapt part of it
easily (a nice dataset and a flow-chart to select statistical
methods) to English.

Friday, January 12, 2007, 4:29:59 PM, You wrote:

kbUE> I am thinking of teaching a graduate level course on SPSS. I'm
imagining
kbUE> that it would assume no background with SPSS, but with bring students
to a
kbUE> relatively high level of sophistication. The course would focus on
simple
kbUE> statistical techniques so that this is a course about SPSS, not about
kbUE> statistics.

How much time has spanned since your potential students learnt
statistics theory? My own experience is that you can't try to teach
them how to do a two-sample t-test with SPSS, if they don't know what
a t-test is, why they should use it in that experimental situation
and how to interpret SPSS output. My own courses intersperse statistics
theory with SPSS teaching.

kbUE> Does anyone have assignments and lectures they'd be willing to share?

This is (schematically) the way I work (I have datasets for every
analysis):

BASIC COURSE (around 2/3 hours per session, some are longer than
others)

First session:
--------------

The basics of dataset creation and manipulation with SPSS using the
data editor, GUI and syntax

Their goal is to create a dataset with 11 variables and 100 cases:

* They start using the data editor (assigning "good" variable names, type
-numeric, string, dates..., variable and value labels, variable
level...),
and typing the first two rows of data. The dataset is then saved (we are
using SPSS 13, that doesn't allow simultaneous datasets).
* The other 98 cases are stored in a text file (although I'm
considering using an Excel file instead) and have to be imported to
SPSS. The first dataset is then opened again and the second is added
to it. The full dataset is saved.
* Now, they are shown how to do that with syntax (DATA LIST, VAR LABEL,
ADD FILES...)
* New variables (using the GUI, and later the syntax):
- Age (using the Date Wizard) is computed from BirthDate and StudyDate
- BMI (body mass index) is computed from weight and height
- Obesity is obtained recoding BMI into 3 categories (<25 Kg/m²=0;
25-30 Kg/m²=1; >30 Kg/m²=2)
* Basic statistics: Frequencies, Descriptives, Explore...

Second session:
--------------

One and two-samples tests for continuous variables (parametric and non
parametric).

Each experimental situation is presented separately, with a small
dataset, the flow chart is used to determine which statistical method
answers the research question, and the full analysis is undertaken:
Normality (or other conditions, like homogeneity of variances),
parametric test (if normality was fulfilled), and non parametric
equivalent (if normality failed).

Although they use the GUI, they learn to click the PASTE button instead of
OK to see the underlying syntax. Some MACROS are used to add some
statistical methods not covered by SPSS (like confidence intervals
for median differences in paired and unpaired samples). We can't teach
them how to write their own MACROS, but they learn the basics: what a
MACRO is, and how to activate and use it.
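
To make "activate and use" concrete, here is a deliberately trivial sketch. The macro name !descr and its vars argument are hypothetical and are not one of the course macros; the two data rows reuse the height and weight values from the first two cases of the course dataset.

```spss
* Two cases only, so that the macro call below has something to run on.
DATA LIST LIST /height weight.
BEGIN DATA
164 78
155 74
END DATA.

* Running the DEFINE block once "activates" the macro for the rest of the session.
* A macro stored in a separate syntax file can be activated with INCLUDE FILE instead.
DEFINE !descr (vars = !CMDEND)
DESCRIPTIVES VARIABLES=!vars
  /STATISTICS=MEAN STDDEV MIN MAX.
!ENDDEFINE.

* Once defined, the macro is called like any other command.
!descr vars = height weight.
```

The call expands to an ordinary DESCRIPTIVES command, which is roughly all a student needs to know in order to use a macro written by someone else.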

Third session:
-------------

K samples (continuous variables), both parametric and non parametric:
Oneway ANOVA, Kruskal-Wallis, two-way ANOVA, repeated measures ANOVA
and Friedman test.

The same method as before: presentation of each experimental
situation, flow chart to determine the correct statistical approach
and full statistical analysis (normality, parametric and
nonparametric). Again, MACROS are used (multiple comparisons after
Kruskal-Wallis and Friedman test). VARSTOCASES is used to stack the
repeated measures dataset to analyse it using UNIANOVA (model without
interaction).
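
The stacking step can be sketched as follows. The variable names (id, t1 to t3, score, time) and the four cases are invented for illustration only.

```spss
* Invented wide-format data: one row per subject, three repeated measures.
DATA LIST LIST /id t1 t2 t3.
BEGIN DATA
1 10 12 15
2 9 11 13
3 12 14 18
4 8 9 12
END DATA.

* Stack t1 to t3 into one column, with an index variable for the occasion.
VARSTOCASES
  /MAKE score FROM t1 t2 t3
  /INDEX=time(3)
  /KEEP=id.

* Additive model: subject as a random blocking factor, no id*time interaction.
UNIANOVA score BY id time
  /RANDOM=id
  /DESIGN=id time.
```

After VARSTOCASES the file has 12 rows (4 subjects x 3 occasions), which is the layout UNIANOVA needs for this model.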

They use the GUI and the syntax is pasted.

Fourth session:
--------------

Correlation and regression

Same method: presentation of a problem, using the flow chart to
determine the correct approach. MACROS are used for bivariate
normality, 95%CI for r and for nonparametric linear regression
(Theil's incomplete method). Graphs are also used.

Fifth session:
-------------

Methods for categorical variables: goodness of fit test, contingency
tables (RxC and 2x2), McNemar's and Cochran's test (this last with a
MACRO for multiple comparisons). The use of the WEIGHT command is learnt
(when aggregated data are presented). MACROs to compute confidence
intervals for differences in proportions (both paired and unpaired) are
also used.
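
As an illustration of WEIGHT with aggregated data, the sketch below enters a 2x2 table as four rows, one per cell. The variable names and the cell counts are invented.

```spss
* Invented aggregated data: one row per cell of a 2x2 table.
DATA LIST LIST /exposed disease count.
BEGIN DATA
1 1 20
1 0 80
0 1 10
0 0 90
END DATA.

* Each row now stands for count cases instead of one.
WEIGHT BY count.

CROSSTABS
  /TABLES=exposed BY disease
  /STATISTICS=CHISQ
  /CELLS=COUNT ROW.

* Switch weighting off again so that later procedures are not affected.
WEIGHT OFF.
```

Without WEIGHT BY, CROSSTABS would see only four cases, one in each cell.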

ADVANCED COURSE

The Basic course knowledge is assumed, and, again, the theory is
explained before using SPSS. We focus on the following topics:

- ANOVA models: randomized blocks, latin square, split-plot, nested
models, mixed factorial model, pure within subjects factorial model,
ANCOVAs. Both UNIANOVA and MIXED are used, and the syntax is manually
modified (to nest some terms within others, or to add SPECIAL
contrasts...)

- Multiple linear regression. Strategies for model development; using
a MACRO to center the interaction terms, model checking.

- Categorical variables: stratified analysis (Mantel-Haenszel OR) and
logistic regression.

- Survival analysis: Kaplan-Meier and Cox regression.

- Meta-analysis: with MACROS (you can find them at Devcentral).

--
Regards,
Dr. Marta García-Granero,PhD mailto:[hidden email]
Statistician

Richard Ristow

Re: Graduate course on SPSS

At 06:08 PM 1/15/2007, [hidden email] wrote:

>It may be that teaching a course on SPSS itself is a bad idea. Perhaps
>SPSS should only be taught as part of a statistics or research methods
>course. By putting SPSS into the context of such a course, this
>provides a "story" for the skills being taught.

Let me try an analogy: teaching English composition. (Will members
forgive me for writing 'English composition', on a multi-national list?
English is the only language in which I know rhetoric well. I take the
liberty of using it as my example, not being sure how techniques of
rhetoric, and the teaching of it, may differ in other countries and
languages.)

To write competently, you need to know a lot about English, and about
rhetoric. (By the latter, I mean techniques like stating and then
expanding a topic; controlling the rhythm of sentences; order of
topics, to put the emphasis where you desire it.)

But no matter how much you know about either, you can't write unless
you have something to say. Teaching composition is mainly asking
students to write, short and then longer pieces, on assigned topics.
You teach grammar, rhythm, and rhetoric. But you don't give tests on
students' knowledge of them; you evaluate how students apply them, in
what they write.

Following the analogy, students should learn SPSS through 'stories';
but probably not a single 'story' for the course. That could be as
daunting, and as limiting, as organizing a composition course around
writing a single large paper. There should, then, be different tasks to
accomplish in SPSS: read data and error-check it; descriptive
statistics; simple, and complex, inferential procedures.

Marta's course is like that. I'd consider more of the 'first session'
topics, preparing data for analysis. Maybe several input sets, in
varying degrees of cleanliness. (Marta, I'd be inclined to stick with
text, rather than Excel. Text forces more careful thinking about what
are your variables, what your datatypes.)

Descriptive statistics isn't something to toss off. What does a
frequency table tell you; what, in a frequency table, suggests you need
to think further? What statistics give you a good picture of a
continuous variable; how do you 'see' that picture? (As one example:
What do you learn from how near the mean and median are, to each
other?)

Then, simple inferential statistics. Descriptive statistics often shade
into these: correlations; simple chi-square or t/ANOVA tests for
differences across groups; are themselves a part of 'seeing' a dataset.

(Your students may balk: why all this simple stuff, when we want to do
GLM? Part of the course is teaching the subtlety of the 'simple'.)

Of course, they do each exercise in SPSS. How far they write syntax and
how far use the menus is your judgement. A suggestion, though: they
should always paste and run the syntax from the menus, and turn in that
syntax as part of their work. I imagine exercises in changing a table
in a certain way, by editing the syntax (menus not allowed).

>In my statistics courses, I teach students the theory, the hand
>calculation, and the SPSS calculation for each topic. My purpose in
>teaching students SPSS at that point is to teach them how to do the
>statistical analysis I have just taught them. With very few
>exceptions, these analyses can be done with menus, and if syntax files
>are absolutely essential, I provide them. Students become proficient
>at getting SPSS to do the analyses that I teach them in class.

Fair enough, but give SPSS its place, and not in the background. I
think they should always see the syntax they are running. And, though I
don't know how to teach it, to see a file as a live thing, not a dull
dead source of statistics.

>Each thing I teach students about SPSS, however, would need to be
>perceived as relevant and interesting and useful, or else students
>will resent the make-work activities, and will forget these skills
>when they would be most useful. Furthermore, unless they find the
>skills interesting and useful, they will not feel empowered to explore
>SPSS on their own. Therefore, each SPSS skill needs to be taught in
>the context of a real research question.

Yes, though that should include some very simple ones. They'll think of
descriptive statistics as something you toss off on the way to the real
work. They'll resist learning, but need to learn, that if you don't
know your data from the descriptives, you can mislead yourself badly in
the analysis.

>One of the list members was talking about a course on data
>management. This would provide a story-line that includes defining
>variables and variable properties, merging files, transposing data,
>reorganizing data when it has been entered in a way that isn't
>conducive to analysis, etc.

Don't know if you meant what I wrote. I certainly suggested something
like this. BUT, it should be data management as a means to understand
the data, not management for the sake of management.

>Another option would be a course in resampling techniques, which could
>introduce loops, and OMS.

Better not do this at the beginning. Those are *subtle* techniques, for
someone who doesn't understand SPSS basics, deeply.

>It is interesting to me that so far, no one has told me that they
>teach a course in SPSS itself.

Well, maybe this is why - the same reason nobody (in English-speaking
countries) teaches English, though students need to know it much better
than they do. You teach English by teaching the use of it.

>What do you think of the concept of a course that focuses on SPSS
>skills - how to go beyond the menus to use SPSS to answer research
>questions?

Yes, but I repeat: Every research project stands or falls on how well
the data's read, organized, and understood. Learn that, and you're
ready to go somewhere.

-In joy and Light,
Richard

Bob Schacht-3

Re: Graduate course on SPSS

At 04:21 PM 1/15/2007, Richard Ristow wrote:

>At 06:08 PM 1/15/2007, [hidden email] wrote:
>
>>It may be that teaching a course on SPSS itself is a bad idea. Perhaps
>>SPSS should only be taught as part of a statistics or research methods
>>course. . . .
>Let me try an analogy: teaching English composition. . . .
>To write competently, you need to know a lot about English, and about
>rhetoric. (By the latter, I mean techniques like stating and then
>expanding a topic; controlling the rhythm of sentences; order of
>topics, to put the emphasis where you desire it.)
>
>But no matter how much you know about either, you can't write unless
>you have something to say. . . .

I'd like to take a different tack with this argument. To me, SPSS is a
*tool*. I'll make a more extreme metaphor-- let's compare SPSS to a hammer.
Would you teach a course in uses of the hammer?

That metaphor leads us to considering the use of SPSS as a tool in
implementation of the hypothetico-deductive method, which I think is mostly
the way to do it. What you're trying to do, I think, is more in keeping
with exploratory data analysis:

>Descriptive statistics isn't something to toss off. What does a
>frequency table tell you; what, in a frequency table, suggests you need
>to think further? What statistics give you a good picture of a
>continuous variable; how do you 'see' that picture? (As one example:
>What do you learn from how near the mean and median are, to each other?)

This type of analysis is especially useful in the early stages of analysis,
especially as a data screening exercise. This is an especially good time to
notice the minimums and maximums, and consider whether or not those values
are reasonable for those variables. For example, I recently noticed values
such as "33," "13," and "11" for variables that were Likert scales with
values ranging from 1 to 5. Noticing the shape of distributions can also
be quite useful, especially if the distribution is bimodal.
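
A screening pass of that kind might look like the sketch below. The item names q1 to q3 and the four cases are invented, except that the out-of-range values 33, 13 and 11 are the ones mentioned above; integer codes from 1 to 5 are assumed.

```spss
* Invented Likert items (valid codes 1 to 5) with three data-entry errors.
DATA LIST LIST /id q1 q2 q3.
BEGIN DATA
1 4 5 3
2 33 2 1
3 5 13 11
4 1 1 2
END DATA.

* Minimums and maximums show at a glance whether any value is out of range.
DESCRIPTIVES VARIABLES=q1 q2 q3
  /STATISTICS=MIN MAX.

* Count the out-of-range values per case, then list the offending cases.
COUNT n_bad = q1 q2 q3 (LOWEST THRU 0, 6 THRU HIGHEST).
TEMPORARY.
SELECT IF (n_bad GT 0).
LIST VARIABLES=id q1 q2 q3 n_bad.
```

The listing gives the case ids to check against the original questionnaires.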

>Then, simple inferential statistics. Descriptive statistics often shade
>into these: correlations; simple chi-square or t/ANOVA tests for
>differences across groups; are themselves a part of 'seeing' a dataset.

IMHO this part should be taught along with the hypothetico-deductive method
and hypothesis testing. Again IMHO, EVERY student should know what testing
a hypothesis means and how to do it. They should also know what "testing
the null hypothesis" means, because unless one knows about that, using SPSS
will likely mean merely pushing numbers around. One of the things that I
like about the old chestnut on social statistics by Blalock is that on the
inside cover was a table showing what chapter to look in if your
independent variable is, say, ordinal and your dependent variable is
scale (or any other pair of variable types). This covers most of what you
need to know to test a bivariate hypothesis.

>. . . Of course, they do each exercise in SPSS. How far they write syntax and
>how far use the menus is your judgement. A suggestion, though: they
>should always paste and run the syntax from the menus, and turn in that
>syntax as part of their work. I imagine exercises in changing a table
>in a certain way, by editing the syntax (menus not allowed).

I would be more interested in their understanding of research design, and
would not be so interested in whether or not they used syntax. But I'd also
give'em a few problems that can really only be done using syntax.

>>. . . One of the list members was talking about a course on data
>>management. This would provide a story-line that includes defining
>>variables and variable properties, merging files, transposing data,
>>reorganizing data when it has been entered in a way that isn't
>>conducive to analysis, etc.

One of the most important things is teaching them the difference between
coding that makes data entry easier, vs. coding that makes it easier to
conduct an analysis. There are lots of ways of inputting data that make
data input easy, but may make data analysis much more difficult-- and vice
versa.

>>It is interesting to me that so far, no one has told me that they
>>teach a course in SPSS itself.

This is because SPSS is/should be viewed as a tool, not as an end in itself.

>. . . Every research project stands or falls on how well
>the data's read, organized, and understood. Learn that, and you're
>ready to go somewhere.

Understood? Aye, there's the rub.

Bob

Reutter, Alex

Re: An analysis question and a power question

In reply to this post by Maguin, Eugene

Gene,

If I understand correctly, you have data that could look something like:

id  group  method  order  depvar
1   1      A       1      n1A
1   1      B       2      n1B
2   2      A       2      n2A
2   2      B       1      n2B
3   1      A       1      n1A
3   1      B       2      n1B
4   2      A       2      n2A
4   2      B       1      n2B

Where "id" is the subject id, "order" is the order in which the group members took that method, and "depvar" contains counts. So you could fit, for example:

GENLIN depvar BY group method order
  /MODEL group method group*method
    DISTRIBUTION=POISSON
    LINK=LOG
  /REPEATED
    SUBJECT=id
    WITHINSUBJECT=order
    CORRTYPE=EXCHANGEABLE
  /PRINT MODELINFO FIT SUMMARY SOLUTION WORKINGCORR.

This GENLIN syntax is adapted from the MIXED pasted syntax from the "Using Linear Mixed Models to Analyze a Crossover Trial" case study (Help > Case Studies; then Advanced Models > Linear Mixed Models); the specification is a little different in Genlin, but some of the same issues with crossover trials would apply, I think. In Genlin, of course, you will also want to take care with your choice of link function and distribution.
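
Since the DVs can also be constructed as dichotomies, a binary variant of the same specification might look like the sketch below. This is an assumption-laden illustration, not tested on the actual data: depbin is a hypothetical 0/1 variable in the same long-format file, and the binomial distribution with a logit link is one common choice, not the only one.

```spss
* Assumes depbin is a 0/1 version of the outcome in the same file (hypothetical name).
* REFERENCE=FIRST makes 0 the reference category, so the model is for depbin = 1.
GENLIN depbin (REFERENCE=FIRST) BY group method order
  /MODEL group method group*method
    DISTRIBUTION=BINOMIAL
    LINK=LOGIT
  /REPEATED
    SUBJECT=id
    WITHINSUBJECT=order
    CORRTYPE=EXCHANGEABLE
  /PRINT MODELINFO FIT SUMMARY SOLUTION WORKINGCORR.
```

Only the dependent variable, distribution and link differ from the Poisson version; the repeated-measures part is unchanged.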

Not sure about question 2; have you talked to statistical Tech Support?

Cheers,
Alex

-----Original Message-----
From: SPSSX(r) Discussion [mailto:[hidden email]] On Behalf Of Gene Maguin
Sent: Monday, January 15, 2007 2:30 PM
To: [hidden email]
Subject: An analysis question and a power question

All,

I think I know the answer to the analysis question; however, I want to make
sure of that. A project I am working on will be surveying two groups of people.
Each group will be surveyed by two methods: A and B. Group 1 gets method A
then B; Group 2 gets the reverse. The methods are related in that they ask
about the same topics but with different levels of detail and structure. We
are expecting an interaction between method and group. Now then, the reason
for my question. The DVs can be constructed as either dichotomies or counts.
My understanding is that there is no way that spss can analyze this--without
treating the DV as continuous. I can't think of anything that would work but
is there something hiding or something that could be adapted?

Question 2: Given the above design, can anyone offer any advice on how to
compute a power number for this problem?

Thanks, Gene Maguin

Maguin, Eugene

Re: An analysis question and a power question

Alex,

Yes, that is what the data would look like when arranged in a univariate
structure. Genlin is in 15, is it not? We are using 14. Ok, so that's that.

Thank you.
Gene Maguin

Marta García-Granero

Graduate course on SPSS - Lesson 1

In reply to this post by Marta García-Granero

Hi everybody

After being delayed for several days (duties at the University, trying
to force.. er, ahem!, TEACH some basic SPSS handling knowledge to my
students), I'm back with the original topic.

SPSS experts can safely skip this message; it is dedicated to novice
users.

OK, here's my idea of what a first session with SPSS could be. Anyone
interested in the GUI version of this can ask for an Acrobat file with
the screenshots and all (unfortunately, in Spanish, and I don't plan
to translate it yet, too busy for that...). The rest of the GUI
related material belongs now to the University - although I did all
the writing - (only the first chapter could be considered "free") and
can't be sent, sorry. I can also send the text file with the data to
be imported, but anyone with a bit of knowledge can get their own as I
did (random generation of values).

Tomorrow I'll start the series from one sample testing (parametric &
nonparametric) to factorial ANOVA, correlation & regression, to finish
with categorical data (from goodness of fit to McNemar & Cochran tests
for related samples). I'll need some time to translate the
accompanying MACROS.

CREATION OF A DATASET WITH SPSS 13 AND SOME BASIC HANDLING

* Creating a dataset with only the first two cases *.
DATA LIST LIST/id(F4) name(A8) birthdate studydate (2 EDATE10)
gender(F5) height weight initDPB endDBP initSPB endSBP (6 F4).
BEGIN DATA
1 RPL 28-08-1941 13-07-1998 1 164 78 78 104 176 175
2 IGZ 30-06-1957 09-05-1998 1 155 74 95 114 162 160
END DATA.

* Format and labels *.
VARIABLE LABEL id 'ID Number' /name 'Name'
/birthdate 'Birth Date' /studydate 'Study Date'
/gender 'Gender' /height 'Height (cm)' /weight 'Weight (kg)'
/initDPB 'Initial Diastolic Blood Pressure (mmHg)'
/endDBP 'Final Diastolic Blood Pressure (mmHg)'
/initSPB 'Initial Systolic Blood Pressure (mmHg)'
/endSBP 'Final Systolic Blood Pressure (mmHg)'.
VALUE LABEL gender 0 'Male' 1 'Female'.
VARIABLE WIDTH birthdate studydate(10).
VARIABLE LEVEL gender (NOMINAL).

* Save dataset for later (SPSS 13 - only one dataset - is used) *.
SAVE OUTFILE='C:\SPSS Datasets&Syntax Files\Hypertension Dataset.sav'.

* Next tasks can be (and will be, in due time), modified for SPSS
14/15 and its multiple dataset handling capabilities: *.

* Import the rest of the data (careful with variable names and types) *.
GET DATA /TYPE = TXT
/FILE = 'C:\SPSS Datasets&Syntax Files\Data(3-100).txt'
/DELCASE = LINE /DELIMITERS = " "
/ARRANGEMENT = DELIMITED
/FIRSTCASE = 1 /IMPORTCASE = ALL
/VARIABLES = id F4 name A8 birthdate EDATE10 studydate EDATE10
gender F5 height F4 weight F4 initDPB F4 endDBP F4 initSPB F4 endSBP F4.
SAVE OUTFILE='C:\SPSS Datasets&Syntax Files\Data(3-100).sav'.

* Get original file again and add the second at the end *.
GET FILE='C:\SPSS Datasets&Syntax Files\Hypertension Dataset.sav'.
ADD FILES /FILE=*
/FILE='C:\SPSS Datasets&Syntax Files\Data(3-100).sav'.

EXE. /* Only if you want to see the results in the Data Editor *.

* 4 new variables (using different SPSS "tricks") *.
COMPUTE age = DATEDIF(studydate, birthdate, "years").
COMPUTE bmi = weight/((height/100)**2).
* The 25-30 range is listed first so that exactly 25 goes to category 1 *.
RECODE bmi (25 thru 30=1) (Lowest thru 25=0) (30 thru Highest=2) INTO obesity .
DO IF (initDPB GT 90) OR (initSPB GT 140).
. COMPUTE initHT=1.
ELSE.
. COMPUTE initHT=0.
END IF.
FORMATS age obesity initHT (F5.0).
VARIABLE LABEL age 'Age (years)'/
bmi 'Body Mass Index (Kg/m²)'/
obesity 'Presence of obesity'/
initHT 'Initial Hypertension'.
VAL LAB obesity 0 'No' 1 'Overweight' 2 'Obese'/
initHT 0 'No' 1 'Yes'.
VAR WIDTH age bmi obesity (8).
VAR LEV obesity (ORDINAL) initHT (NOMINAL).

* Complete dataset is saved to disk *.
SAVE OUTFILE='C:\SPSS Datasets&Syntax Files\Hypertension Dataset.sav'.

* Some very basic analyses *.

FREQUENCIES
VARIABLES=gender obesity initHT
/PIECHART FREQ.

FREQUENCIES
VARIABLES=height to bmi
/FORMAT=NOTABLE
/NTILES= 4
/STATISTICS=STDDEV MIN MAX MEAN SKEW SESKEW KURT SEKURT
/HISTOGRAM=NORMAL.

EXAMINE
VARIABLES=height to bmi
/PLOT=BOXPLOT
/STATISTICS=NONE.

That's all.
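
For readers on SPSS 14 or later, the comment in the syntax about multiple datasets could be acted on roughly as sketched below. This is only an outline under stated assumptions: the dataset names main and rest are hypothetical, only two variables are used, and cases 3 and 4 are invented placeholders for the imported text file.

```spss
* First two cases, kept open under the name main.
DATA LIST LIST /id (F4) name (A8).
BEGIN DATA
1 RPL
2 IGZ
END DATA.
DATASET NAME main.

* Remaining cases (invented here), read while main stays open.
DATA LIST LIST /id (F4) name (A8).
BEGIN DATA
3 ABC
4 XYZ
END DATA.
DATASET NAME rest.

* Append rest to main without saving an intermediate file.
DATASET ACTIVATE main.
ADD FILES /FILE=* /FILE=rest.
EXECUTE.
DATASET CLOSE rest.
```

Compared with the SPSS 13 version, the intermediate SAVE and GET FILE steps are no longer needed.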

By the way, Richard, I agree that showing how to use SPSS without a
clear purpose is, at the very least, quite difficult (more than usual,
I mean, my students are always so reluctant to be taught, I think I
have got only 2, out of 120, that are really interested in learning
more...).

RR> At 05:28 AM 1/15/2007, you wrote:

>>I hope to add a couple of ideas from my own experience teaching SPSS
>>to biologists and medical researchers. I think I could easily adapt
>>part of [my teaching material] to English.

RR> What can I say? Yay, yay, Marta. All the good ideas the rest of us have
RR> posted, but farther, better, and in an actual course outline.

--
Regards,
Dr. Marta García-Granero,PhD mailto:[hidden email]
Statistician

---
"It is unwise to use a statistical procedure whose use one does
not understand. SPSS syntax guide cannot supply this knowledge, and it
is certainly no substitute for the basic understanding of statistics
and statistical thinking that is essential for the wise choice of
methods and the correct interpretation of their results".

(Adapted from WinPepi manual - I'm sure Joe Abramson will not mind)