SPSSX Discussion

Error in "less than" condition

YurikoKirilovna

Error in "less than" condition

Hello to everybody on the list!

Today I ran into a problem with the "less than" condition. When I compute a
new variable in SPSS syntax using a condition such as "variable < 0.093",
there is no problem, but when I write "variable < (-0.407+0.5)", I get
different results, and I don't know why, because (-0.407+0.5) = 0.093.
Here is my example:

* Generate data.
DATA LIST LIST
/ A (F6.3).
BEGIN DATA
0.093
0.093
0.093
0.005
0.560
END DATA.

* Case 1.
IF (A < 0.093) temp1=1.
EXECUTE.

* Case 2.
IF (A < (-0.407+0.5)) temp2=1.
EXECUTE.

Please help me understand what is happening!

Kind regards!

Yuriko

--
Sent from: http://spssx-discussion.1045642.n5.nabble.com/

=====================
To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

David Marso

Re: Error in "less than" condition

Administrator

Perhaps you should then be using LE (less than or equal).
Consider the difference between A and the computed threshold, scaled up so that it becomes visible:
COMPUTE ValueTest=-0.407+0.5.
COMPUTE Diff=(A-ValueTest)*1000000000000.
FORMATS A ValueTest Diff(F40.16).
LIST A ValueTest Diff.

                A          ValueTest                           Diff

.0930000000000000  .0930000000000000             -.0000277555756156
.0930000000000000  .0930000000000000             -.0000277555756156
.0930000000000000  .0930000000000000             -.0000277555756156
.0050000000000000  .0930000000000000  -88000000000.0000200000000000
.5600000000000001  .0930000000000000  467000000000.0000000000000000
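
The same effect can be reproduced outside SPSS. Below is a minimal Python sketch, assuming only that both programs store numbers as IEEE 754 doubles; the names a_values, literal and computed are made up for this illustration and do not come from the original syntax.

```python
# Values of A as they would be read from the data block (parsed to doubles).
a_values = [0.093, 0.093, 0.093, 0.005, 0.560]

literal = 0.093            # Case 1 threshold: the literal, parsed directly
computed = -0.407 + 0.5    # Case 2 threshold: the result of an addition

for a in a_values:
    temp1 = a < literal            # mirrors IF (A < 0.093)
    temp2 = a < computed           # mirrors IF (A < (-0.407+0.5))
    diff = (a - computed) * 1e12   # same scaling as the Diff column above
    print(f"{a:.3f}  temp1={temp1!s:5}  temp2={temp2!s:5}  Diff={diff:.16f}")

# Collect the flags so the two cases can be compared side by side.
flags_case1 = [a < literal for a in a_values]
flags_case2 = [a < computed for a in a_values]
```

In this sketch the three 0.093 rows are flagged in Case 2 but not in Case 1, because the computed threshold sits a hair above the stored value of 0.093.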


-----
Please reply to the list and not to my personal email.
Those desiring my consulting or training services please feel free to email me.
---
"Nolite dare sanctum canibus neque mittatis margaritas vestras ante porcos ne forte conculcent eas pedibus suis."
Cum es damnatorum possederunt porcos iens ut salire off sanguinum cliff in abyssum?"

Rich Ulrich

Re: Error in "less than" condition

In reply to this post by YurikoKirilovna

"0.5" is represented exactly in the binary coding of computer memory, which uses 0s and 1s to represent powers of two. "0.093" is only represented approximately in binary, so it is bad programming to expect that a value for "0.093" obtained by addition, or by any other computation, will exactly match the value that is assigned when 0.093 is read in.
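
To make the exact-versus-approximate distinction concrete, here is a small Python sketch (Python is used only for illustration; it relies on the same double-precision format). Fraction and Decimal convert a stored double without further rounding, so they show what the machine actually holds. The variable names are invented for this example.

```python
from decimal import Decimal
from fractions import Fraction

# Converting a float to Fraction is exact, so these comparisons test
# whether the stored double equals the decimal value that was typed.
half_is_exact = Fraction(0.5) == Fraction(1, 2)
literal_is_exact = Fraction(0.093) == Fraction(93, 1000)

print(half_is_exact)     # 0.5 is 2**-1, a power of two
print(literal_is_exact)  # 0.093 has no finite binary expansion

# Exact decimal expansions of the two stored doubles.
print(Decimal(0.093))
print(Decimal(-0.407 + 0.5))

# The literal and the computed sum are different doubles.
sum_matches_literal = (-0.407 + 0.5) == 0.093
print(sum_matches_literal)
```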

Rich Ulrich

From: SPSSX(r) Discussion <[hidden email]> on behalf of YurikoKirilovna <[hidden email]>
Sent: Tuesday, February 19, 2019 9:01 AM
To: [hidden email]
Subject: Error in "less than" condition


Jon Peck

Re: Error in "less than" condition

To add to Rich's correct explanation, if you compute -.407+.5 and look at its full precision value, you will see that the actual value is

0.09300000000000003

because the representation of most fractional values cannot be exact even with high precision arithmetic.

You can't see this directly in Statistics, because it won't display that many figures (the least significant part of the number is just noise). But if you compute

compute sum = (-.407+.5) * 1000000000.
formats sum (F30.16).

you will see that little extra showing up.

(In reply to Rich Ulrich's message of Tue, Feb 19, 2019 at 6:49 PM.)

Jon K Peck
[hidden email]

Kirill Orlov

Re: Error in "less than" condition

In reply to this post by YurikoKirilovna

I have a quick question. I hope someone (maybe Jon) can reply.

The average (at least in MATRIX) of 19 values of 1.01 is not exactly the same as the average of 11 values of 1.01 (this is just an example).

Would post-processing the result with

compute average= rnd(average*1E15) / 1E15.
or
compute average= trunc(average*1E15) / 1E15.

always help? That is, will it always map averages that should be equal to the same actual value?


Jon Peck

Re: Error in "less than" condition

Rounding or truncating the value after a large left shift will get rid of the extreme LSBs (least significant bits), as long as the shifted number is still within the valid floating point range (approximately 10**308). Of course, just shifting this way has no effect on the precision of the numbers: you get 53 bits in the mantissa regardless (plus a little trick implemented in hardware for mantissas with a leading zero). It does, however, affect the operation of rnd and trunc.

Also look into the FUZZBITS setting, which is available as a parameter to these functions and as a general preference setting in Edit > Options > Data.
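
As a rough illustration of the scale-round-unscale idea, here is a Python sketch applied to the value from the start of this thread. The helper name snap is hypothetical, and Python's round() resolves exact ties to the even neighbour, which may differ from SPSS RND, so treat this as an analogy rather than a description of how RND behaves.

```python
computed = -0.407 + 0.5   # slightly above the stored value of 0.093
literal = 0.093

def snap(x, scale=1e15):
    """Round x to the nearest multiple of 1/scale (hypothetical helper)."""
    # x * scale must stay inside the floating point range for this to work.
    return round(x * scale) / scale

# After snapping, the computed value and the literal become the same double.
snapped_equal = snap(computed) == snap(literal)
print(computed == literal, snapped_equal)
```

This works here because the noise sits well below the 15th decimal place. Two values that straddle a rounding boundary can still end up on different sides, so it is a mitigation rather than a guarantee.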

From the Help...

Rounding and Truncation of Numeric Values. For the RND and TRUNC functions, this setting controls the default threshold for rounding up values that are very close to a rounding boundary. The setting is specified as a number of bits and is set to 6 at install time, which should be sufficient for most applications. Setting the number of bits to 0 produces the same results as in release 10. Setting the number of bits to 10 produces the same results as in releases 11 and 12.

For the RND function, this setting specifies the number of least-significant bits by which the value to be rounded may fall short of the threshold for rounding up but still be rounded up. For example, when rounding a value between 1.0 and 2.0 to the nearest integer this setting specifies how much the value can fall short of 1.5 (the threshold for rounding up to 2.0) and still be rounded up to 2.0.
For the TRUNC function, this setting specifies the number of least-significant bits by which the value to be truncated may fall short of the nearest rounding boundary and be rounded up before truncating. For example, when truncating a value between 1.0 and 2.0 to the nearest integer this setting specifies how much the value can fall short of 2.0 and be rounded up to 2.0.

If you need very high precision mean or sum calculation, consider the Kahan summation algorithm

https://en.wikipedia.org/wiki/Kahan_summation_algorithm

This is used within Statistics in some places. It is used in AGGREGATE, I recall, but I'm not sure where else.
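
For readers who want to see the idea, a minimal Python sketch of Kahan (compensated) summation follows. It is a textbook version and is not taken from the Statistics source code; the function names and the test data are made up for this example. The naive version uses an explicit loop because recent Python versions make the built-in sum() more accurate for floats.

```python
import math

def naive_sum(values):
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for value in values:
        total += value
    return total

def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    compensation = 0.0                   # estimate of low-order bits lost so far
    for value in values:
        y = value - compensation         # apply the correction from the last step
        t = total + y                    # low-order bits of y may be lost here
        compensation = (t - total) - y   # recover what was just lost
        total = t
    return total

data = [0.1] * 100_000
reference = math.fsum(data)              # correctly rounded sum as a yardstick
naive_error = abs(naive_sum(data) - reference)
kahan_error = abs(kahan_sum(data) - reference)
print(naive_error, kahan_error)
```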

There is an overview of several different high precision algorithms in Python here

https://code.activestate.com/recipes/393090/

The Python math.fsum function implements a high precision scheme if you want to experiment.
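
A short experiment along those lines, using the averages from the question above. This is only a sketch in Python, not a statement about what MATRIX does internally; naive_sum is a hypothetical helper written as an explicit loop so that the result does not depend on how the built-in sum() is implemented.

```python
import math

def naive_sum(values):
    """Plain left-to-right accumulation (hypothetical helper)."""
    total = 0.0
    for value in values:
        total += value
    return total

# Ten copies of 0.1: the plain loop drifts, fsum returns the rounded exact sum.
tenths = [0.1] * 10
print(naive_sum(tenths) == 1.0)
print(math.fsum(tenths) == 1.0)

# Averages of 19 and of 11 copies of 1.01, computed both ways.
naive_19 = naive_sum([1.01] * 19) / 19
naive_11 = naive_sum([1.01] * 11) / 11
mean_19 = math.fsum([1.01] * 19) / 19
mean_11 = math.fsum([1.01] * 11) / 11
print(naive_19 == naive_11)
print(mean_19 == mean_11)
```

With fsum, each mean involves only one rounding in the sum and one in the division, and for these inputs both means come back as the same double as 1.01 itself.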

These algorithms, however, are summing their inputs which are assumed to be exact. If the issue is about binary representation, then you might want to preprocess the numbers first. Python also provides a Decimal data type that avoids the whole decimal to binary problem, but it has considerable overhead, since, unlike binary number operations, it is not supported in the hardware.
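
A minimal sketch of the Decimal route, applied to the comparison that started this thread. The values are built from strings, because building a Decimal from a float would carry the binary error along with it. The variable names are invented for this example.

```python
from decimal import Decimal

# Decimal arithmetic on values given as strings is exact in base 10.
threshold = Decimal("-0.407") + Decimal("0.5")
a = Decimal("0.093")

print(threshold)       # the sum in exact decimal arithmetic
print(a < threshold)   # the comparison now behaves as expected on paper
```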

As you know, comparison of floating point values with fractional parts is imprecise, so you might want to use a small tolerance rather than testing for exact equality.
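
One way to express such a tolerance, sketched in Python. The helper less_than and the tolerance of 1e-9 are assumptions chosen for this illustration; the right tolerance depends on the scale and precision of the data.

```python
import math

def less_than(a, b, tol=1e-9):
    """True only if a is below b by more than tol (hypothetical helper)."""
    return a < b - tol

computed = -0.407 + 0.5

print(less_than(0.093, computed))      # noise-level difference is ignored
print(less_than(0.005, computed))      # a genuinely smaller value still passes
print(math.isclose(0.093, computed))   # equality test with a relative tolerance
```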

(In reply to Kirill Orlov's message of Wed, Feb 20, 2019 at 2:44 AM.)

Jon K Peck
[hidden email]

Kirill Orlov

Re: Error in "less than" condition

Thank you, Jon for the lesson.
