SPSSX Discussion

recoding variables

Gonzales, Dana L

recoding variables

I apologize in advance for submitting what I'm sure is a simple question. I
reviewed the syntax posted on Raynald's site but I'm still unclear. I have a
dataset of 2200 patients with 58 variables. One of the variables is a string
variable containing two data points separated by a slash (/). An example is
provided below.

2/8
6/7
3/7
1/3
1/1
3/6
1/6
0/1
3/10
10/10

Can someone explain how to create the syntax? I found the following but I'm
not sure how to edit it.

DATA LIST LIST /a(A70).
BEGIN DATA

C206/E2101254/F210
E206/ABCDP206/ZF210/G210/X210
END DATA.
LIST.

STRING #(A70).
VECTOR b(5A10).
COMPUTE #=CONCAT(RTRIM(a),'/').
COMPUTE #cnt=1.

LOOP IF INDEX(#,'/')>0.
COMPUTE b(#cnt)=SUBSTR(#,1,INDEX(#,'/')-1).
COMPUTE #cnt=#cnt + 1.
COMPUTE #=SUBSTR(#,INDEX(#,'/')+1).
END LOOP.

EXECUTE.

Thanks!

Dana

Dana Barber Gonzales, PhD

Assistant Director Medical Residency Program

Family and Preventive Medicine

College of Medicine

4301 W. Markham St. Slot 530

Little Rock, AR 72205-7199

Office: 501-686-6593

Fax: 501-686-8421


Pirritano, Matthew

scales of measurement

Hello listers,

Does anyone know the original reference for the scales of measurement (i.e., ordinal, nominal, interval, ratio)?

Thanks,
Matt

Matthew Pirritano, Ph.D.
Assistant Professor of Psychology
Smith Hall 116C
Chapman University
Department of Psychology
One University Drive
Orange, CA 92866
Telephone (714)744-7940
FAX (714)997-6780

Richard Ristow

Re: recoding variables

In reply to this post by Gonzales, Dana L

At 04:07 PM 8/17/2007, Gonzales, Dana L wrote:

>I have a dataset [in which] one of the variables is a string variable
>containing two data points separated by /. An example is provided
>below.

|-----------------------------|---------------------------|
|Output Created |17-AUG-2007 19:39:17 |
|-----------------------------|---------------------------|
TwoPoint

2/8
6/7
3/7
1/3
1/1
3/6
1/6
0/1
3/10
10/10

Number of cases read: 10 Number of cases listed: 10

>Can someone explain how to create the syntax?

Raynald's syntax is correct (small surprise), but for your case of only
two fields in the input string, the following works and you may find it
clearer. SPSS 15 draft output (WRR:not saved separately):

NUMERIC Point1 Point2 (F3).

COMPUTE #Slash = INDEX(TwoPoint,'/').
STRING #OneValu (A8).

DO IF #Slash EQ 0.
. COMPUTE Point1 = NUMBER(TwoPoint,F8).
ELSE.
. COMPUTE #OneValu = SUBSTR(TwoPoint,1,#Slash-1).
. COMPUTE Point1 = NUMBER(#OneValu,F8).
. COMPUTE #OneValu = SUBSTR(TwoPoint, #Slash+1).
. COMPUTE Point2 = NUMBER(#OneValu,F8).
END IF.
LIST.

List
|-----------------------------|---------------------------|
|Output Created |17-AUG-2007 19:39:17 |
|-----------------------------|---------------------------|
TwoPoint  Point1  Point2

2/8            2       8
6/7            6       7
3/7            3       7
1/3            1       3
1/1            1       1
3/6            3       6
1/6            1       6
0/1            0       1
3/10           3      10
10/10         10      10

Number of cases read: 10 Number of cases listed: 10
===================
APPENDIX: Test data
===================
DATA LIST FIXED
/TwoPoint (A8).
BEGIN DATA
2/8
6/7
3/7
1/3
1/1
3/6
1/6
0/1
3/10
10/10
END DATA.
LIST.
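
For readers who want to check the logic outside SPSS, here is a minimal Python sketch of the same two-field split. It is an illustration only, not part of the posted syntax; the function names `to_int` and `split_two_point` are made up for this example. Like the DO IF block above, it fills only the first point when no slash is present, and it returns None where SPSS would leave a value missing.

```python
def to_int(text):
    """Convert text to an integer, or None if it is blank or not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def split_two_point(value, sep="/"):
    """Split a string such as '3/10' into (point1, point2).

    With no separator in the string, only point1 is filled,
    which mirrors the DO IF #Slash EQ 0 branch.
    """
    left, found, right = value.strip().partition(sep)
    point1 = to_int(left)
    point2 = to_int(right) if found else None
    return point1, point2


# Hypothetical test values, including one without a slash
for raw in ["2/8", "0/1", "3/10", "10/10", "7"]:
    print(raw, split_two_point(raw))
```

Only the first slash is treated as the separator, so a value with three or more fields would need the general loop from the original question instead.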

Edgar F. Johns

Re: scales of measurement

In reply to this post by Pirritano, Matthew

I believe it's: Stevens, S.S. (1951). Handbook of experimental psychology.
New York: Wiley.

Edgar
---
Discover Technologies
42020 Koppernick Rd.
Suite 204
Canton, MI 48187
(734) 564-4964
(734) 468-0800 fax

Ken Belzer

Re: scales of measurement

In reply to this post by Pirritano, Matthew

Hi Matt,

I believe the reference I've seen cited most frequently in textbooks is
Stevens's 1946 paper in Science, but a brief search came up with the 1951
chapter as well.

Stevens, S.S. (1946). On the theory of scales of measurement. Science, 103,
677-680.
Stevens, S.S. (1951). Mathematics, measurement and psychophysics. In S.S.
Stevens (Ed.), Handbook of experimental psychology (pp. 1-49). New York: Wiley.

Best,
Ken


Manmit Shrimali-2

Pre-analysis question

In reply to this post by Edgar F. Johns

Team:

I have an exploratory analysis to conduct that includes factor analysis followed by cluster and discriminant analysis. The results will then be used for regression.

Before starting the analysis, I have heard that many people center their variables. Can you please share why we center, in which cases we should do it, and any dos and don'ts? Your help is highly appreciated, as a proper data setup is the foundation for advanced analysis.

Thanks,

Hector Maletta

Re: Pre-analysis question

Manmit,
For factor analysis and discriminant, centering is irrelevant,
since the analysis itself is independent of the units of measurement. In the
case of cluster analysis, standardization (rather than centering) is
required. Some clustering procedures (e.g. Hierarchical Clustering in the
SPSS CLUSTER command) standardize the variables by themselves; other
procedures like QUICK CLUSTER require you to standardize the variables prior
to applying the procedure.
The main point with clustering is not the position of the zero
(centering) but the unit of measurement (standardizing). If the various
variables are measured in different units, a change in unit would give more
or less weight to one variable or another, resulting in a different
solution. For that reason, variables should be measured in units of standard
deviation (centered or not, although they are usually centered on their
respective means).
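
The point about units can be illustrated with a small Python sketch (not SPSS syntax). The data are made up: three cases with a height and a weight, where height is expressed once in metres and once in centimetres. The helper names `zscores`, `distance`, and `rows` are assumptions for this example, and Euclidean distance is assumed as the clustering measure.

```python
import math
import statistics


def zscores(values):
    """Center on the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def distance(a, b):
    """Euclidean distance between two cases."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rows(*columns):
    """Turn per-variable lists into per-case tuples."""
    return list(zip(*columns))


# Hypothetical data: the same heights in two different units
height_m = [1.60, 1.75, 1.90]
height_cm = [h * 100 for h in height_m]
weight_kg = [55.0, 80.0, 70.0]

raw_m = rows(height_m, weight_kg)
raw_cm = rows(height_cm, weight_kg)
std_m = rows(zscores(height_m), zscores(weight_kg))
std_cm = rows(zscores(height_cm), zscores(weight_kg))

# Raw data: the nearest neighbour of case 0 depends on the unit chosen
print(distance(raw_m[0], raw_m[1]), distance(raw_m[0], raw_m[2]))
print(distance(raw_cm[0], raw_cm[1]), distance(raw_cm[0], raw_cm[2]))

# Standardized data: the distances are the same for either unit
print(distance(std_m[0], std_m[2]), distance(std_cm[0], std_cm[2]))
```

With heights in metres, weight dominates and case 0 is closest to case 2; with heights in centimetres, case 0 is closest to case 1. After standardizing, the choice of unit no longer matters.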

Hector


Swank, Paul R

Re: Pre-analysis question

Many years ago when I did my work in cluster analysis it was claimed
that correcting for the standard deviation was appropriate but
correcting for the mean was not. If I remember correctly, it was
something about removing too much information. Perhaps that has changed
but we would typically make all the standard deviations the same but not
the means.
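
As a rough illustration of that practice, the Python sketch below (not SPSS syntax) divides a variable by its standard deviation without subtracting the mean. The values and the function name `scale_by_sd` are made up for this example.

```python
import statistics


def scale_by_sd(values):
    """Divide by the sample standard deviation; the mean is not removed."""
    sd = statistics.stdev(values)
    return [v / sd for v in values]


# Hypothetical scores
scores = [10.0, 12.0, 14.0, 20.0]
scaled = scale_by_sd(scores)

print(statistics.stdev(scaled))  # standard deviation is now 1 (up to rounding)
print(statistics.mean(scaled))   # mean is the original mean divided by the SD, not 0
```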

However, another issue needs to be considered. Some argue that doing a
factor analysis first when there are potentially different populations
mixed in the data is problematic. If the factor structure is different
in different populations then the factor structure across a mixture may
not look like the structure for any single population. The recommendation
then was to alternate between factoring and clustering until the solution
stabilized.
Another possibility is latent mixture modeling as done by Mplus.

Paul R. Swank, Ph.D. Professor
Director of Research
Children's Learning Institute
University of Texas Health Science Center-Houston


Art Kendall-2

Re: recoding variables

In reply to this post by Gonzales, Dana L

If your data are in a character file (.txt, .dat, .asc), try something like
this example syntax, which I generated by using
<file> <open> <data>,
changing the edit box to *.txt, clicking example.txt, and using the wizard.

*put these two lines in "example.txt"
C206/E2101254/F210
E206/ABCDP206/ZF210/G210/X210

Save all your current work, then open a new instance of SPSS. Make sure
that you put warnings, etc. into the output file. <edit> <options>
<viewer>. Cut-and-paste then run the syntax.

GET DATA /TYPE = TXT
/FILE = 'D:\project\example.txt'
/DELCASE = LINE
/DELIMITERS = "/"
/ARRANGEMENT = DELIMITED
/FIRSTCASE = 1
/IMPORTCASE = ALL
/VARIABLES =
V1 A4
V2 A8
V3 A5
V4 A4
V5 A4
.
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.
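
For comparison, a slash-delimited file can be read in much the same way outside SPSS. The Python sketch below is only an illustration, using the standard `csv` module. The function name `read_slash_delimited` and the temporary file standing in for example.txt are assumptions for this example. Short lines are padded with empty strings so that every case has five fields, like V1 to V5 above.

```python
import csv
import os
import tempfile


def read_slash_delimited(path, n_fields=5):
    """Read one case per line, splitting on '/' and padding to n_fields."""
    cases = []
    with open(path, newline="") as handle:
        for fields in csv.reader(handle, delimiter="/"):
            if not fields:
                continue  # skip blank lines
            padded = (fields + [""] * n_fields)[:n_fields]
            cases.append(padded)
    return cases


# Write the two example lines to a temporary file (stand-in for example.txt)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("C206/E2101254/F210\nE206/ABCDP206/ZF210/G210/X210\n")
    example_path = tmp.name

cases = read_slash_delimited(example_path)
os.remove(example_path)

for case in cases:
    print(case)
```

Unlike the A4/A8/A5 widths in the GET DATA command, this sketch does not truncate long values; it only limits the number of fields per case.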

Art Kendall

Social Research Consultant
