creating new variables- should they be initialized to 0?

creating new variables- should they be initialized to 0?

Cleland, Patricia (EDU)

 

Somewhere along the line, I developed the habit or was told to initialize any new variable that I created to 0. 

 

For example, if I wanted to recode the variable AGE into DECADE (LT 10=1, 11 thru 19=2, 20 thru 29=3, etc), I would first initialize DECADE via

COMPUTE DECADE=0.

This means that all cases have a valid value for DECADE.

 

I may do this because, before I learned SPSS, I programmed in FORTRAN, where this was mandatory, and I’ve just kept on doing it all these years. I’ve noticed that my younger colleagues don’t initialize new variables.

 

Do others have any thoughts on whether this is necessary? Good practice?  A waste of time?

 

Thanks.

 

Pat

 


Re: creating new variables- should they be initialized to 0?

ViAnn Beadle

New variables are initialized to SYSMIS, which I think is a good thing as long as you are aware of this when doing transformations with them.

 


Re: creating new variables- should they be initialized to 0?

Art Kendall
I also think it is a good thing that sysmis is assigned if you do not give sufficient instructions for it to do otherwise. It tells you that you have not given the system instructions that allow it to assign a value.

On the other hand, if zero is assigned and declared as a missing value, that can also help in checking your syntax. (I would assign the same number of ages to each interval, except perhaps the last one.)
recode age (lo thru 10=1) (11 thru 20=2) ... (missing=0) into decade.
missing values decade (0).
value labels decade
 0 'age was missing'
 1 'up to 10'
 2 '11 to 20'
. . .

Art Kendall
Social Research Consultants


===================== To manage your subscription to SPSSX-L, send a message to [hidden email] (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD

Re: creating new variables- should they be initialized to 0?

Marks, Jim
In reply to this post by ViAnn Beadle

I initialize variables when:

1) I compute lagged values
2) I plan to browse the data to find unusual or specific results
3) I plan to delete cases
4) I want the frequency tables to include all cases

 

Jim Marks

Director, Market Research

x1616

 


Re: creating new variables- should they be initialized to 0?

John F Hall
In reply to this post by Art Kendall
Art's solution puts 10, 20 etc into higher, possibly the wrong, categories.
 
Try:
 
compute decade = trunc (age/10) .
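
A minimal sketch of how this one-liner behaves, using illustrative test values and the hypothetical variable names decade0 and decade1. TRUNC(age/10) numbers the groups from 0 (ages under 10), so the second COMPUTE adds 1 to match the 1, 2, 3 coding asked for in the original question.

```spss
DATA LIST FREE /age (F5.1).
BEGIN DATA
2 9.9 10 11 19.9 20 20.1
END DATA.

* Groups numbered from 0: under 10 = 0, 10 up to 20 = 1, and so on.
COMPUTE decade0 = TRUNC(age / 10).
* Shifted by one to match the coding LT 10 = 1, 10 to 19 = 2, etc.
COMPUTE decade1 = TRUNC(age / 10) + 1.
FORMATS decade0 decade1 (F2).
LIST.
```

With this approach the boundary ages 10 and 20 land in the higher group, and a case with a missing age should simply come out system-missing.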
 

Importing .POR files

Mike Pritchard

Happy New Year everyone!

 

I’m having trouble with some files in the SPSS Portable file format and wonder if anyone has any suggestions.

 

SPSS 16 is complaining about the files.  Sometimes I get an error about the file format, but I can’t find anything that looks wrong in the file.  Still, I don’t know anything about the POR format, so I may be missing something.  Other times I get an unrecoverable error.  I’ve tried this on different computers with similar results.

 

Does anyone have experience with this kind of issue?  And solving it!  Is there perhaps any more robust path to get the file to SPSS?  Are there other software packages that can read .POR?  What about the file format definitions?

 

Any ideas would be most appreciated.

Thanks

Mike

 

Mike Pritchard | [hidden email] | 5 Circles Research | 425-444-3410 (c) | 425-968-3883 (o)
Research helping companies delight their customers to increase profits since 1993

 


Re: Importing .POR files

Tesiny, Ed
Mike,
Do you know what version of SPSS created the .POR files?
Ed


Boundaries of RECODE ranges

Richard Ristow
In reply to this post by Art Kendall
 From thread "creating new variables- should they be initialized  to 0?"

At 11:33 AM 1/4/2011, Art Kendall wrote:

>If zero is defined as a missing value then that can also help in
>checking your syntax.
>
>recode age (lo thru 10=1)
>            (11 thru 20=2) ...
>            (missing   =0) into decade.
>missing values decade (0).
>value labels decade
>      0 'age was missing'
>      1 'up to 10'
>      2 '11 to 20'

At 12:30 PM 1/4/2011, John F Hall responded:

>Art's solution puts 10, 20 etc into higher, possibly the wrong, categories.

In RECODE, you can specify exactly which categories the boundaries of
ranges will fall into, and you can do it in a way that will work as
well when categorizing variables that can have fractional as well as
integral values. It's a powerful feature of RECODE.

It works because, in RECODE,
a. It's acceptable to specify ranges that overlap
b. If a value falls into more than one (overlapping) range, the first
matching range on the RECODE statement applies.

So if you want to categorize age by decade, and follow the (common)
rule that ages exactly on the boundary go in the higher category, the
following will work even if "age" takes on fractional-year as well as
integer values. Notice that the ranges are listed from highest to
lowest; that is what assigns the boundary values to the higher
category. For the same reason, the "MISSING" clause is first. In this
dataset, '1' is a missing value for "age".

RECODE age
   (MISSING    = 0)
   (80 THRU HI = 9)
   (70 THRU 80 = 8)
   ...
   (20 THRU 30 = 3)
   (10 THRU 20 = 2)
   (LO THRU 10 = 1)  INTO decade.
FORMATS        decade (F2).
MISSING VALUES decade (0).
LIST.


List
|-----------------------------|---------------------------|
|Output Created               |04-JAN-2011 23:39:48       |
|-----------------------------|---------------------------|
    age decade

    1.0    0
    2.0    1
   10.0    2
   11.0    2
   12.0    2
   19.9    2
   20.0    3
   20.1    3

Number of cases read:  8    Number of cases listed:  8
=============================
APPENDIX: Test data, and code
=============================
NEW FILE.
DATA LIST FREE /age (F5.1).
BEGIN DATA
1 2 10 11 12 19.9 20 20.1
END DATA.
MISSING VALUES age(1).

RECODE age
   (MISSING    = 0)
   (80 THRU HI = 9)
   (70 THRU 80 = 8)
   (20 THRU 30 = 3)
   (10 THRU 20 = 2)
   (LO THRU 10 = 1)  INTO decade.
FORMATS        decade (F2).
MISSING VALUES decade (0).
LIST.
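
For contrast, a sketch of the opposite rule under the same first-match behavior: listing the ranges from lowest to highest sends each boundary value to the lower category. The target name decade_lo and the test values are illustrative only.

```spss
DATA LIST FREE /age (F5.1).
BEGIN DATA
2 10 11 19.9 20 20.1
END DATA.

* Ranges run from lowest to highest, so a boundary value matches the lower range first.
RECODE age
   (MISSING    = 0)
   (LO THRU 10 = 1)
   (10 THRU 20 = 2)
   (20 THRU 30 = 3)  INTO decade_lo.
FORMATS        decade_lo (F2).
MISSING VALUES decade_lo (0).
LIST.
```

Here ages 10.0 and 20.0 should come out as 1 and 2, rather than 2 and 3 as in the highest-to-lowest listing.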


Re: Boundaries of RECODE ranges

John F Hall
The initial enquiry asked, "For example, if I wanted to recode the variable AGE into DECADE (LT 10=1, 11 thru 19=2, 20 thru 29=3, etc)". I think 11 was a mistype for 10 and the required grouping is <10, 10 - 19, 20 - 29 etc. For that, COMPUTE DECADE = TRUNC (AGE/10) is a common trick of the trade.
 
On the Undergraduate Income and Expenditure Survey we used TEMP. RECODE with seemingly overlapping values many times, when there were many variables and when the currency values had decimal places. This saved typing .99. When we wanted to keep the original values we used
DO REPEAT x = ~~~ / y = ~~~.
COMPUTE x = trunc (y).
END REPEAT.
 
I think the COMPUTE solution is neater. If 0 is a missing value, it can be so declared, but we used to use 0 for age under one year (eg for children in household).
 

Re: creating new variables- should they be initialized to 0?

Garry Gelade
In reply to this post by Cleland, Patricia (EDU)

Not a good idea. If AGE is missing, for example, and you don’t have an ELSE in your RECODE, DECADE would have a valid value when it shouldn’t.

Also, if you initialize a new variable to 0 and then compute new values for it with an IF statement, you would have the same problem.

In general it’s safer to let SPSS create the new variable with a default value of SYSMIS, unless you specifically want the default to be something else.
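
A minimal sketch of the first point, using made-up test data and the hypothetical names decade_init and decade_plain. It assumes 999 is the user-missing code for age. RECODE ... INTO leaves the target untouched for any input value that no specification matches, so a pre-set 0 survives for the case whose age is missing.

```spss
* Illustrative test data: 999 is assumed to be the user-missing code for age.
DATA LIST FREE /age (F5.1).
BEGIN DATA
5 15 25 999
END DATA.
MISSING VALUES age (999).

* Initialized to 0 first: the case with missing age keeps the 0.
COMPUTE decade_init = 0.
RECODE age (LO THRU 9.99 = 1) (10 THRU 19.99 = 2) (20 THRU 29.99 = 3)
   INTO decade_init.

* Not initialized: the same case stays system-missing.
RECODE age (LO THRU 9.99 = 1) (10 THRU 19.99 = 2) (20 THRU 29.99 = 3)
   INTO decade_plain.

LIST.
```

The fourth case should show decade_init = 0, which looks like a valid code, while decade_plain is system-missing and drops out of later analyses.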

 

Garry Gelade

Business Analytic Ltd

 


Re: creating new variables- should they be initialized to 0?

Allan Reese (Cefas)
In reply to this post by Cleland, Patricia (EDU)

I agree with Garry and others that you should initialize variables in a way that helps identify cases that have been incorrectly allocated. Also, however confident you are in the programming, follow the creation of the variable with a check:

 

crosstabs tables=age by decade / missing=include.

 

“missing=include” is a nice feature that does not seem to be supported by the menu interface.
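
A small sketch of that check in context, with illustrative test data in which 999 is assumed to be the user-missing code for age. Because the missing ages are recoded to a declared missing code (0) rather than left system-missing, MISSING=INCLUDE gives them their own row and column in the table. As far as I know, system-missing values are still left out of the table itself.

```spss
DATA LIST FREE /age (F5.1).
BEGIN DATA
2 10 19.9 20 999
END DATA.
MISSING VALUES age (999).

RECODE age
   (MISSING    = 0)
   (20 THRU 30 = 3)
   (10 THRU 20 = 2)
   (LO THRU 10 = 1)  INTO decade.
MISSING VALUES decade (0).

* With MISSING=INCLUDE the user-missing codes 999 and 0 appear in the table.
CROSSTABS TABLES=age BY decade /MISSING=INCLUDE.
```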

 

Allan

 


Re: creating new variables- should they be initialized to 0?

John F Hall
In reply to this post by Cleland, Patricia (EDU)
Just in case you missed this as it had a different subject.

Re: creating new variables- should they be initialized to 0?

Richard Ristow
In reply to this post by Cleland, Patricia (EDU)
At 10:44 AM 1/4/2011, Cleland, Patricia (EDU) wrote:

>Somewhere along the line, I developed the habit or was told to
>initialize any new variable that I created to 0.  For example, if I
>wanted to recode the variable AGE into DECADE (LT 10=1, 11 thru
>19=2, 20 thru 29=3, etc), I would first initialize DECADE via
>
>COMPUTE DECADE=0.
>
>I may do this because I programmed in FORTRAN, where this was
>mandatory. Do others have any thoughts on whether this is necessary?
>Good practice?

I do it only sometimes, for particular purposes.

It's less crucial in SPSS than in FORTRAN. In FORTRAN, uninitialized
variables can receive unpredictable, and frequently wild, values. In
SPSS, uninitialized variables have definite states: SYSMIS, blank, or 0.

Instead of initializing before a RECODE, I put an ELSE clause in the
RECODE to ensure the variable is always assigned.
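
A sketch of what that might look like, with illustrative test data and 99 as a hypothetical catch-all code. ELSE goes last, so it only picks up values that no earlier specification matched.

```spss
DATA LIST FREE /age (F5.1).
BEGIN DATA
5 15 25 150
END DATA.

* ELSE comes last and catches every value not matched above.
RECODE age
   (MISSING    = 0)
   (20 THRU 30 = 3)
   (10 THRU 20 = 2)
   (LO THRU 10 = 1)
   (ELSE       = 99)  INTO decade.
MISSING VALUES decade (0, 99).
VALUE LABELS decade
   0  'age was missing'
   99 'age not covered by any range'.
FREQUENCIES VARIABLES=decade.
```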

On the other hand, if I'm calculating a variable's value in a DO IF
construct, I usually initialize the variable first, to a special
missing value labeled 'logic error'. That helps catch the subtle bugs
where a DO IF construct fails because a logic test returns 'missing':
any case where the variable comes out with the 'logic error' value
results from such a bug.
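
A minimal sketch of this pattern. The sentinel code -9, the test data, and the user-missing code 999 for age are all assumptions made for illustration.

```spss
DATA LIST FREE /age (F5.1).
BEGIN DATA
5 15 25 999
END DATA.
MISSING VALUES age (999).

* Start every case at a sentinel code, declared missing and labeled.
COMPUTE decade = -9.
MISSING VALUES decade (-9).
VALUE LABELS decade -9 'logic error'.

* For the case with missing age the first test evaluates to missing.
* The whole DO IF structure is then skipped and the sentinel remains.
DO IF (age LT 10).
   COMPUTE decade = 1.
ELSE IF (age LT 20).
   COMPUTE decade = 2.
ELSE.
   COMPUTE decade = 3.
END IF.
FREQUENCIES VARIABLES=decade.
```

Any case still carrying 'logic error' in the frequency table points to a test that returned missing, which here is the case with the missing age.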

And it can be convenient to initialize a variable, or compute a
substantive value, before assigning it a value in one or more IF
statements. In this case, it may be normal for none of the IF
statements to be triggered, and the initial calculated value to be
the desired final value.
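
A short sketch of this last case, with a hypothetical variable agegroup and illustrative cut-offs. The initial 0 is a substantive default that most cases are expected to keep.

```spss
DATA LIST FREE /age (F5.1).
BEGIN DATA
5 15 70 85
END DATA.

* Default value: most cases are expected to keep it.
COMPUTE agegroup = 0.
* Each IF overrides the default only when its test is true.
IF (age GE 65) agegroup = 1.
IF (age GE 80) agegroup = 2.
VALUE LABELS agegroup 0 'under 65' 1 '65 to 79' 2 '80 and over'.
FORMATS agegroup (F1).
LIST.
```

Note that a case with a missing age would also keep the default 0 here, so this pattern suits situations where that is acceptable or where missing values are handled separately.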


Re: creating new variables- should they be initialized to 0?

Art Kendall
Well put.

Art

Art Kendall
Social Research Consultants