There are 6 relational operators:

Operator | Symbol | Form | Operation |
---|---|---|---|
Greater than | > | x > y | true if x is greater than y, false otherwise |
Less than | < | x < y | true if x is less than y, false otherwise |
Greater than or equals | >= | x >= y | true if x is greater than or equal to y, false otherwise |
Less than or equals | <= | x <= y | true if x is less than or equal to y, false otherwise |
Equality | == | x == y | true if x equals y, false otherwise |
Inequality | != | x != y | true if x does not equal y, false otherwise |

You have already seen how all of these work, and they are pretty intuitive. Each of these operators evaluates to the boolean value true (1), or false (0).

Here’s some sample code using these operators with integers:

```cpp
#include <iostream>

int main()
{
    std::cout << "Enter an integer: ";
    int x;
    std::cin >> x;

    std::cout << "Enter another integer: ";
    int y;
    std::cin >> y;

    if (x == y)
        std::cout << x << " equals " << y << "\n";
    if (x != y)
        std::cout << x << " does not equal " << y << "\n";
    if (x > y)
        std::cout << x << " is greater than " << y << "\n";
    if (x < y)
        std::cout << x << " is less than " << y << "\n";
    if (x >= y)
        std::cout << x << " is greater than or equal to " << y << "\n";
    if (x <= y)
        std::cout << x << " is less than or equal to " << y << "\n";

    return 0;
}
```

And the results from a sample run:

```
Enter an integer: 4
Enter another integer: 5
4 does not equal 5
4 is less than 5
4 is less than or equal to 5
```

These operators are extremely straightforward to use when comparing integers.

**Comparison of floating point values**

Directly comparing floating point values using any of these operators is dangerous. This is because small rounding errors in the floating point operands may cause unexpected results. We discussed rounding errors in detail in section 2.5 -- floating point numbers.

Here’s an example of rounding errors causing unexpected results:

```cpp
#include <iostream>

int main()
{
    double d1(100 - 99.99); // should equal 0.01
    double d2(10 - 9.99); // should equal 0.01

    if (d1 == d2)
        std::cout << "d1 == d2" << "\n";
    else if (d1 > d2)
        std::cout << "d1 > d2" << "\n";
    else if (d1 < d2)
        std::cout << "d1 < d2" << "\n";

    return 0;
}
```

This program prints an unexpected result:

d1 > d2

In the above program, d1 = 0.010000000000005116 and d2 = 0.0099999999999997868. Both numbers are close to 0.01, but d1 is slightly greater than 0.01 and d2 is slightly less, so the two are not equal.

Sometimes the need to do floating point comparisons is unavoidable. In this case, the less than and greater than operators (>, >=, <, and <=) are often used with floating point values as normal. The operators will produce the correct result most of the time, only potentially failing when the two operands are close. Due to the way these operators tend to be used, a wrong result often only has slight consequences.

The equality operator is much more troublesome, since even the smallest rounding error can make two values that should be equal compare as unequal. Consequently, using operator== or operator!= on floating point numbers is not advised. The most common method of doing floating point equality involves using a function that calculates how close the two values are to each other. If the two numbers are "close enough", then we call them equal. The value used to represent "close enough" is traditionally called **epsilon**. Epsilon is generally defined as a small number (e.g. 0.0000001).

New developers often try to write their own “close enough” function like this:

```cpp
#include <cmath> // for fabs()

bool isAlmostEqual(double a, double b, double epsilon)
{
    // if the distance between a and b is no more than epsilon, then a and b are "close enough"
    return fabs(a - b) <= epsilon;
}
```

fabs() is a function in the <cmath> library that returns the absolute value of its parameter. fabs(a - b) returns the distance between a and b as a positive number. This function checks whether the distance between a and b is no more than whatever epsilon value representing “close enough” was passed in. If a and b are close enough, the function returns true.

While this works, it’s not great. An epsilon of 0.00001 is good for inputs around 1.0, too big for numbers around 0.0000001, and too small for numbers like 10,000. This means every time we call this function, we have to pick an epsilon that’s appropriate for our inputs. If we know we’re going to have to scale epsilon in proportion to our inputs, we might as well modify the function to do that for us.

Donald Knuth, a famous computer scientist, suggested the following method in his book “The Art of Computer Programming, Volume II: Seminumerical Algorithms (Addison-Wesley, 1969)”:

```cpp
#include <cmath>

// return true if the difference between a and b is within epsilon percent of the larger of a and b
bool approximatelyEqual(double a, double b, double epsilon)
{
    return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
```

In this case, instead of using epsilon as an absolute number, we’re using epsilon as a multiplier, so its effect is relative to our inputs.

Let’s examine in more detail how the approximatelyEqual() function works. On the left side of the <= operator, the absolute value of a - b tells us the distance between a and b as a positive number. On the right side of the <= operator, we need to calculate the largest value of "close enough" we're willing to accept. To do this, the algorithm chooses whichever of a and b has the larger absolute value (as a rough indicator of the overall magnitude of the numbers), and then multiplies it by epsilon. In this function, epsilon represents a percentage expressed as a fraction. For example, if we want to say "close enough" means a and b are within 1% of the larger of a and b, we pass in an epsilon of 0.01 (1% = 1/100 = 0.01). The value for epsilon can be adjusted to whatever is most appropriate for the circumstances (e.g. 0.01% = an epsilon of 0.0001).

To do inequality (!=) instead of equality, simply call this function and use the logical NOT operator (!) to flip the result:

```cpp
if (!approximatelyEqual(a, b, 0.001))
    std::cout << a << " is not equal to " << b << "\n";
```

Note that while the approximatelyEqual() function will work for many cases, it is not perfect, especially as the numbers approach zero:

```cpp
#include <iostream>

int main()
{
    // a is really close to 1.0, but has rounding errors, so it's slightly smaller than 1.0
    double a = 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1;

    // First, let's compare a (almost 1.0) to 1.0.
    std::cout << approximatelyEqual(a, 1.0, 1e-8) << "\n";

    // Second, let's compare a-1.0 (almost 0.0) to 0.0
    std::cout << approximatelyEqual(a-1.0, 0.0, 1e-8) << "\n";
}
```

Perhaps surprisingly, this returns:

```
1
0
```

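The second call didn’t perform as expected. The math simply breaks down close to zero: when one operand is 0.0, the tolerance on the right side is epsilon times the magnitude of the other operand, which is always smaller than the difference itself (unless that difference is exactly zero).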

One way to avoid this is to use both an absolute epsilon (as we did in the first approach) and a relative epsilon (as we did in Knuth’s approach):

```cpp
#include <cmath> // for fabs()

// return true if the difference between a and b is no more than absEpsilon, or within relEpsilon percent of the larger of a and b
bool approximatelyEqualAbsRel(double a, double b, double absEpsilon, double relEpsilon)
{
    // Check if the numbers are really close -- needed when comparing numbers near zero.
    double diff = fabs(a - b);
    if (diff <= absEpsilon)
        return true;

    // Otherwise fall back to Knuth's algorithm
    return diff <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * relEpsilon);
}
```

In this algorithm, we’ve added a new parameter: absEpsilon. First, we check to see if the distance between a and b is less than our absEpsilon, which should be set at something very small (e.g. 1e-12). This handles the case where a and b are both close to zero. If that fails, then we fall back to Knuth’s algorithm.

Here’s our previous code testing both algorithms:

```cpp
#include <iostream>

int main()
{
    // a is really close to 1.0, but has rounding errors
    double a = 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1;

    std::cout << approximatelyEqual(a, 1.0, 1e-8) << "\n"; // compare "almost 1.0" to 1.0
    std::cout << approximatelyEqual(a-1.0, 0.0, 1e-8) << "\n"; // compare "almost 0.0" to 0.0
    std::cout << approximatelyEqualAbsRel(a-1.0, 0.0, 1e-12, 1e-8) << "\n"; // compare "almost 0.0" to 0.0
}
```

```
1
0
1
```

You can see that with an appropriately picked absEpsilon, approximatelyEqualAbsRel() handles the small inputs correctly.

Comparison of floating point numbers is a difficult topic, and there’s no “one size fits all” algorithm that works for every case. However, approximatelyEqualAbsRel() should be good enough to handle most cases you’ll encounter.


using the first example I was wondering if I could expand it to calculate full mathematical expressions

Are you talking about parsing and calculating the answers to arbitrary mathematical expressions? If so, I don't think you have all the tools that would be required for that. At the very least, I'd expect you'd need to know more about std::string and loops, which are covered in upcoming chapters.

Ok thank you for telling me this

Hello, Alex

Thanks for this great tutorial! I have a stupid question, but I want to ask it for the sake of clarification.

I understood what you explained above, but I have a question about data types for floating point numbers. Can I use the float data type, instead of declaring floating point numbers as double and using Knuth's method? I agree that double makes sense when I need more precision, for example in scientific calculations. I am new to C++. What is your advice? Should we avoid the float data type, or do you have another suggestion? The program below can serve as an example.

```cpp
#include <iostream>

int main()
{
    float d1(100-99.99); // should equal 0.01
    float d2(10-9.99); // should equal 0.01
    std::cout << d1 << " " << d2 << std::endl;

    if (d1 == d2)
        std::cout << "d1 == d2" << "\n";
    else if (d1 > d2)
        std::cout << "d1 > d2" << "\n";
    else if (d1 < d2)
        std::cout << "d1 < d2" << "\n";

    return 0;
}
```

> Can i use float data type instead of declaring float point numbers as a double and using Knuth’s method

No. Floats and doubles have the same limitations/problems with comparison operators. The only difference between floats and doubles is the range, precision, and size (in bytes). All other issues with floating point numbers are identical between floats and doubles.

Personally, I favor double over float unless there's a specific reason to use float (e.g. the size is relevant for a given use case).

i dont get what it means when you use "else if" multiple times, why dont you use "if"? whats the difference?

If you use separate if statements, then the program will execute each one in turn. Multiple if statements may resolve to true.

If you chain if-else statements together, then the program only evaluates the next if statement if the previous one was false. As soon as it finds one that is true, it executes that branch and skips the rest of the chain.

Consider the difference between these two programs:

```cpp
int x = 17;
if (x > 20) // this is false
    std::cout << "x > 20" << '\n'; // so this doesn't print
if (x > 10) // this is true
    std::cout << "x > 10" << '\n'; // so this prints
if (x > 5) // this is also true
    std::cout << "x > 5" << '\n'; // so this prints
```

This program prints:

```
x > 10
x > 5
```

I understand. Thank you very much!

I modified your code to see the output using only the main function, but it does not print anything :(

```cpp
int main()
{
    // if the distance between a and b is less than epsilon, then a and b are "close enough"
    double a=12.0;
    double b=11.99;
    double epsilon=10;
    return fabs(a - b) <= epsilon;
    std::cout<<fabs(a-b);
}
```

Of course it doesn't -- you're returning before your code ever gets to the print statement. Here's what your program should look like:

I think I have to pass the values to isAlmostEqual(), so the code would be modified like this, if I am not mistaken:

```cpp
#include <iostream>
#include <cmath> // for fabs()

int main()
{
    bool isAlmostEqual(double a, double b, double epsilon)
    isAlmostEual(12.0,12.1,11.999);
    // if the distance between a and b is less than epsilon, then a and b are "close enough"
    return fabs(a - b) <= epsilon;
}
```

Error:

```
C:\Users\maria\Desktop\c++\precision\main.cpp|11|error: 'a' was not declared in this scope|
C:\Users\maria\Desktop\c++\precision\main.cpp|11|error: 'b' was not declared in this scope|
C:\Users\maria\Desktop\c++\precision\main.cpp|11|error: 'epsilon' was not declared in this scope|
```

I can't work out the problem, please help :(

Yes, you need to pass in values for all three parameters (a, b, and epsilon). Your call to isAlmostEqual looks correct (minus the typo in the name), but your compiler is confused because you put the definition of function isAlmostEqual() in the wrong place.

Why do we need to do the comparison in the return statement?

```cpp
return fabs(a - b) <= epsilon;
```

Can't I do it in a different way? I tried to compile your same code.

Another question: does epsilon hold a constant value in the library, or are we passing in values from another file?

```cpp
#include <iostream>
#include <cmath> // for fabs()

int main()
{
    bool isAlmostEqual(double a, double b, double epsilon)
    // if the distance between a and b is less than epsilon, then a and b are "close enough"
    return fabs(a - b) <= epsilon;
}
```

ERROR:

```
C:\Users\maria\Desktop\c++\precision\main.cpp|10|error: a function-definition is not allowed here before 'return'|
```

You don't need to do the comparison in the return statement -- you could do it on the line before, temporarily store the result, and return the temporarily stored result. But that just accomplishes the same purpose and doesn't really make your code any clearer for such a simple function and statement.

Your problem here is that you're trying to put a function inside a function. You can't do that in C++. Put function isAlmostEqual() before function main() and you should be fine.

Can fabs() return any number other than 1 or 0?

I think not, because it is used together with the relational operators, and those operators don't return anything except true or false.

But what's the harm in asking!

fabs() can return any double value. The relational operators themselves are what evaluate to true or false.

While describing the example of "Comparison of floating point values", the values of d1(100 - 99.99) & d2(10 - 9.99) are shown as

d1 = 0.0099999999999997868 and d2 = 0.0100000000000005116

But in my attempt to print I find the values as

Precise value of d1: 0.010000000000005116

Precise value of d2: 0.0099999999999997868

This may also change the output of the example to -> " d1 > d2 "

Thank you for your time.

Yup, looks like I transposed the values. Thanks for bringing this up. I've fixed the example.

I don't like Knuth's algorithm because it fails for a certain range of values; surely it's better to always use absolute epsilon but modify the epsilon based on the application it's being used for? I would intuitively think of 'close enough' as 'a certain number of decimal places are equal', not a small percentage difference (which is meaningless near to zero).

I do believe there is a typo.

"This program prints an unexpected result:

d1 > d2"

The ">" should be "<", because you go on to say that d1 is less than d2.

Thanks for catching that! Fixed.

why in donald knuth's algorithm we are multiplying (fabs(a)*epsilon) if 'a' is greater than 'b' and not (fabs(b)*epsilon) if 'a' is smaller than 'b'.from what i am getting is multiplying fabs(a) with epsilon i.e (fabs(a)*epsilon) we are trying to make interval even smaller and then comparing with the result of equation on the left handside.so the whole thing could also be written as something like:

return fabs(a - b) <= (fabs(a) < fabs(b) ? fabs(b)*epsilon : fabs(a) * epsilon);

please correct me if i am getting this wrong.

Yes, all you've done is distribute the epsilon.

Hey Alex!

I made a similar 'close enough' program using Knuth's algorithm and everything was working fine, but I was curious to know the value of epsilon, and I can't use the debugger (I haven't figured out how to use it in Xcode). The program was:

The value of epsilon came out to be 4.94066e-324! Now 3 questions:

1 Is this epsilon's default value?

2 I use the same value (-2) for d1 and d2, so how is the conditional statement supposed to work?

3 I also checked Smokeycow's program and your reply to it, so I thought: shouldn't his version and Knuth's version work the same if both d1 and d2 have the same value (say -2)?

Epsilon is something that needs to be set by you, based on how much tolerance you need. Your epsilon is uninitialized, which means whatever garbage was in the memory allocated to epsilon is now being interpreted as a double. That's why you're getting 4.94066e-324.

Try initializing epsilon to something small, like 1.0e-12.

wow man starting to lose hope with all the math examples. ):

How much math do you think is needed for Isometric 2D games?

> How much math do you think is needed for Isometric 2D games?

Totally depends on what type of isometric game you're interested in creating. For a simple tile-based game, probably not so much. For a shooter or other physics-based game, probably a lot.

Isometric games aside, you'll use relational operators in almost every program you write, so make sure you understand them. :)

Thnx Alex...:-)

You said that d1 evaluates to 0.0099999999999997868 and d2 to 0.0100000000000005116 in the above program. But when I use cout to print the values of d1 and d2, the program gives me 0.01 as output for both variables. Please let me know if I'm missing something.

I'm not sure, but I think you missed an else statement in the following program:

If the code is okay, then what happens if the if statement's expression evaluates to true? Yes, it will return true. But what happens after that? Does the compiler then ignore line 10, where we return the result of Knuth's method?

std::cout prints variables to 6 significant digits by default. You can see their actual values by using std::setprecision() to ask std::cout to show more precision, or by looking at them in a debugger.

There's no need for an else statement in approximatelyEqualAbsRel. If the return statement is executed, the function returns a value to the caller right then and there, and the rest of the function does not execute. So effectively, everything after the if statement is essentially the else case, just implicitly.

I'm trying to fully understand what is going on with both the fabs() function call and the way it works in Knuth's "close enough" bool function. It didn't seem clear why a fabs() call is necessary so many times so I removed them from the conditional operator statement where they did not seem needed ( to me at least! ) and ended up with this:

My own version appeared to work fine until I changed the double values a and b to a negative decimal. If I do this and run the program, it returns an incorrect bool value of false in a program where variables a and b are both the same. Knuth's version has no such problem and returns the correct value of true when using a negative decimal. I believe the problem to be my understanding of the fabs() function call, but I could be completely wrong! If you could shed some light on this that would be great.

Nice investigative work. You almost reached the conclusion yourself.

fabs() returns the absolute value of its parameter.

Knuth's algorithm doesn't care whether the numbers are negative or positive -- it only cares how large they are (in terms of absolute value). Your version only considers which value is greater.

Consider the case where a = -50 and b = 2. Knuth's version uses 50 * epsilon (because a has a magnitude of 50), whereas your version uses 2 * epsilon (because b is greater than -50). This makes your algorithm incorrect, because a clearly has a greater magnitude than b in this case.

Essentially, Knuth's algorithm uses fabs() to handle both positive and negative numbers.

I still dont get it, why we have to take:

Thanks in advance...

Thank u i'll surely go through all the previous tutorials.

Thanks got it. one question are my questions too preliminary level ?

still , Thanks.

A lot of the stuff you've been asking has been covered in previous lessons. For example, the above question of yours was covered in section 1.7 -- Forward declarations.

But there's so much to learn, I'm not surprised it doesn't all stick. I'd definitely recommend re-reading through the old lessons again just to make sure you're comprehending everything.

In my compiler "Dev C++" the above programs aren't compiling, giving an error:

```
In function 'int main()':
[Error] 'approximatelyEqual' was not declared in this scope
[Error] 'approximatelyEqualAbsRel' was not declared in this scope
recipe for target 'Untitled5.o' failed
```

What must have gone wrong?

Did you copy the code for approximatelyEqual() and approximatelyEqualAbsRel() into your code file? If so, are they above main()?

Here’s an alternative function I've come up with:

Not sure if that's more/less efficient, but it's lightweight and easier to read if you're not good with math.

Apparently editing comments breaks code formatting - Sorry that I posted and deleted this several times.

It would also appear that I made an error in the above code, this should be correct:

Note that the function was incorrectly declared as returning double rather than bool.

A note to all comment readers: All comments above this point refer to an old example, which was slightly inferior to the one now in place. Happy learning!

Knuth's version introduces an error with significant digits when dX > 1, so I've updated the code to correct the error. I also noticed my first attempt introduced a new error for dX < 1, so set it to an if statement so it only applies the fix if dX > 1, solving both issues. =3

I'd highly recommend always using the updated code over Knuth's, as Knuth's will invariably cause issues for almost any dX > 1, often returning incorrect answers due to (dEpsilon * fabs(dX)) introducing extra significant digits when unintended.

I'm a little iffy on if dX++ would work or not; it's not needed in this case, so I left it out rather than risk having dX++ be applied twice and affecting both instances of dX. Safer to just flat out tell it exactly what's to be added in this case than trying to automate it.

```cpp
bool IsEqual(double dX, double dY)
{
    const double dEpsilon = 0.00001; // significant digits sets precision
    if (dX <= 1) // correction only applies if inputs are less than 1; avoids errors above 1
        return fabs((dX + 1) - (dY + 1)) <= dEpsilon * fabs((dX + 1));
        // the +1's resolve <1 errors; do not use dX++ due to risk of doubling the effect
    else
        return fabs(dX - dY) <= dEpsilon * fabs(dX); // runs if no <1 errors would occur
}
```

I keep thinking about this thing, and keep coming up with more errors. If dY is a negative number, but still small, between but not equal to 0 and -1, it breaks anyway. Any negative numbers break regardless, but the >-1 <0 range is a pain since I can't think of a way to get around it yet. At least not fully, anyway, not without introducing another error. I can fix numbers <-1 but not in that little gap... seriously need to think on this some more.

I've been thoroughly nerd sniped, boo. =P

I'm going to dig through some of the previous lessons again, since I think there's a function in there that may help me solve the problem.

I, of course, would not mind if anyone else picks at it some. =P

For reference, the issues come down largely to dealing with (dX - dY) having problems with negative numbers (semi-easily fixed), but especially breaking down if dX is close to dY and positive, but dY is negative. Application of fabs() seemed like the obvious fix, but it breaks because it would think 0.999 == -0.999, which... clearly isn't the case.

Maybe I'm just too tired to think about it anymore. Time for bed, I'll worry about it tomorrow and maybe dream up something overnight to fix it. Or just forget the problem exists entirely. I'd rather not, though. It bothers me greatly that I *KNOW* there's a bug in the program, but that I don't know how to fix it. XD

this is probably a silly question, but i keep reading this thinking i'm missing something really obvious. when you use maths in your program, after some point 2+2 will no longer equal 4? .. i like to think of a computer as a pretty dependable calculator, so this discussion of two numbers being "close enough" within a margin of 1% is blowing my mind.

Yeah, it's pretty trippy.

Integer 2 + 2 will always = 4.

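Floating point 2.0 + 2.0 does happen to equal exactly 4.0, because small whole numbers can be stored precisely, but 0.1 + 0.2 may or may not = 0.3.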

It really all comes down to the way the numbers are stored internally. Integers always store precise values. Floating point numbers trade some precision for a much greater range.

Please can anyone explain why we multiply dEpsilon by dX? What is the purpose?

Thanks. Note: even if you see that this message was sent quite a long time ago, your answer will still be welcome.

```cpp
bool IsEqual(double dX, double dY)
{
    const double dEpsilon = 0.000001; // or some other small number
    return fabs(dX - dY) <= dEpsilon * fabs(dX);
}
```

The idea behind the function is simple if we look at it geometrically. The absolute value |dX - dY| is the distance between the two numbers on the real number line. If we multiply epsilon, a very small number, by the absolute value of dX, we get an interval that is a small fraction of dX's magnitude (assuming dX is the number with the larger value). We then check whether |dX - dY| is small enough to "fit" inside dEpsilon * |dX|. If it does, the two numbers are very close, even if they are not equal, since epsilon can be as small as we want. What you have to understand is the idea behind the absolute value and its geometric use: if dX and dY are indeed very close, subtracting them gives a very small interval.

Guys, I'll tell you what I think, if I may. The function that Knuth made has one problem to my mind: it takes only the first value passed in into consideration when calculating the relative distance between the variables. As a mathematician, I think it should take both of them but - attention! - instead of using (x+y) it should make use of max(|x|,|y|). Therefore, you look for this piece of code, I believe:

By the way, I'm sure there must be some kind of max function in C++ but don't know in which library. I have not analysed the function thoroughly as it should be but I think it's the best version so far. Will gladly see a better one.

This seems to be best function.

I think there is a max function somewhere nearby, probably in stdlib.h

but it can be easily written off:

You may have missed a bracket pair in Knuth's algorithm above. What you have written is equivalent to that code - but perhaps more readable.

I find it very curious that "absolute value" is a function in the standard library, but not a built-in operator in C++. You would think that the (otherwise useless) unary "+" operator would be assigned as the operator for the absolute value function. In other words, it seems logical that the expression +(x - y) should evaluate to the absolute value of (x - y).

In any case Alex, I think it would be better to introduce the absolute value function in section 3.2 with the Operators.

Keep up the good work.

+ doesn't absolute value things in normal mathematics. For example, if x = -3, then +x = -3, not 3.

C++ turning the unary plus operator into an absolute value operator would diverge from common mathematics.

What I find weird is the lack of an exponent operator. Who made _that_ decision? :)

I dont get it.

If we have this for example

And we test with x=20.111 and y=20.1101 (close enough) we get this:

that equals true.

But lets use smaller numbers, that are close to each other as on the first example: x=0.001, y=0.0001, we get this:

that equals false.

For me personally, in both examples the numbers are satisfyingly close.

If I needed to compare two close floating point values, I would go for this:

If I need more precision, I would increase nFloatPPrecis depending on how many digits I need.

What do you think?

The issue honestly seems to be that when you're using numbers lower than 1, any time you multiply them against the dEpsilon value, you're also adding in extra significant digits that shouldn't be there.

In your examples above, here's what I see:

20.001-20.0001 <= 0.0001*20.001

0.0009 <= 0.0020001

You start with two numbers with 5 and 6 significant digits. It evaluates to 5 significant digits. The 0.0001*20.001 returns 0.0020001 which is quite a few extra significant digits in theory, but in practice it's honestly really only important out to 0.002; the remainder is irrelevant.

In the latter example of:

0.001-0.0001 <= 0.0001*0.001

0.0009 <= 0.0000001

You start with two numbers with 4 and 5 significant digits. The comparison function, due to multiplying a number below 1 with another below 1 means you increased it to 0.0000001, or 7 significant digits. As such, the accuracy is now greater than the numbers you started with.

To correct this issue, all you'd have to do is rewrite it like this:

```cpp
bool IsEqual(double dX, double dY)
{
    const double dEpsilon = 0.000001; // or some other small number
    return fabs((dX+1) - (dY+1)) <= dEpsilon * fabs((dX+1));
}
```

I was going to use ++ instead of +1, but then realized that could cause issues if dX is evaluated twice with ++ which would break the equation. Same dealie with only having ++ on one side; dX++ - dY++ doesn't work either if you don't include fabs(dX++), and the reverse also causes issues, so the (+1) will have to apply.

What this means now, though, is that you get:

1.001-1.0001 <= 0.0001*1.001

0.0009 <= 0.0001001

The left side remains exactly the same, and the right side retains 4 significant digits of actual value, where it basically ends at 0.0001.

This method ensures that, for numbers less than 1, dEpsilon will always resolve to the same number of significant digits. Yes, there's the 0.0001(001) tacked onto the end, but the important part is the 4th significant digit which will always be what matters when comparing two numbers that are below 1. As such, dEpsilon resolves to whatever you tell it to without adding erroneous extra significant digits.

In this case, it returns false because the error rate is larger than the 4 significant digits requested by dEpsilon, not because you accidentally added more significant digits by multiplying 0.1 by 0.1 and getting 0.01. The original code by Donald Knuth works just fine but bugs out as soon as you get to any value below 1 because multiplying values less than 1 is the same as dividing numbers larger than 1.

However, this introduces a problem where a dX value greater than 1 introduces inaccuracy. As such, the following is my updated version which only removes the <1 error if it occurs, and doesn't mess with anything otherwise.

```cpp
bool IsEqual(double dX, double dY)
{
    const double dEpsilon = 0.00001; // significant digits sets precision
    if (dX <= 1) // correction only applies if inputs are less than 1; avoids errors above 1
        return fabs((dX + 1) - (dY + 1)) <= dEpsilon * fabs((dX + 1));
        // the +1's resolve <1 errors
    else
        return fabs(dX - dY) <= dEpsilon * fabs(dX); // runs if no <1 errors would occur
}
```

Gawd, it took me like 30 seconds to figure out how to fix the problem and over an hour trying to test the code because I couldn't figure out how to read the return from the bool again.

I don't really understand what is happening with fabs(). For example, if dX = 2 and dY = 3, how would I figure out what the result of fabs(dX) or fabs(dX - dY) would/should be? The sentence that states that fabs() is a function that returns the absolute value of its double parameter leads me to believe that dX would evaluate to 4 and dY to 6 in this case, but when compiled and run with the above values, sending fabs(dX) or fabs(dY) to the screen using cout, I get whatever the value of dX or dY was to begin with. When doing the same with fabs(dX - dY), however, I get a value of 1, which would be the same as (dX - dY) anyway. (scratches head)

Heh, that's not what he meant. :) The "double parameter" he was referring to is the parameter of type double (a double-precision floating point value) that is passed to fabs. So fabs doesn't return double the value put into it, it just returns the absolute value of the number sent to it. You can think of fabs as doing something similar to this, though this is REALLY simplified:
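The snippet that belonged here seems to have been lost, so what follows is only a guess at the idea, not the original poster's code and not how the standard library actually implements fabs. simpleFabs is a made-up name for illustration.

```cpp
// Hypothetical, heavily simplified stand-in for fabs():
// takes a value of type double and returns its absolute value.
double simpleFabs(double dValue)
{
    if (dValue < 0)
        return -dValue; // negative input: flip the sign
    return dValue;      // zero or positive input: return it unchanged
}
```

With dX = 2 and dY = 3, simpleFabs(dX) gives back 2, and simpleFabs(dX - dY) receives -1 and gives back 1.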

And remember, when you're using functions, the mathematical expressions inside their parameters are evaluated before the function call is, for example using "fabs(-4 - 6)" is the same as "fabs(-10)".

If a float is generally accurate to 7 decimal places and a double to 16, would accuracy ever really be an issue if I were working with figures with 2 or 4 decimal places? For example, would I see something like

(.0095 * 36.75) > 34.90

evaluate to false since it uses fewer than the 7 decimals that floats tend to be accurate to?

A float is not accurate to 7 decimal places. A float is accurate to approximately 7 significant digits. A significant digit is any digit that is not a placeholder 0, including ones on the left side of the decimal point.

For example, .0095 has two placeholder zeros, and so is only 2 significant figures. 34.90 has 4 significant figures.

There are two types of errors we need to watch out for with floating point values: rounding errors, and precision errors.

Rounding errors can happen with numbers of any length, because some numbers have infinite representations in binary (0.1 for example), and those representations will be truncated. Rounding errors typically make your answer wrong by 0.000001, or some small number like that.
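To make the 0.1 case concrete, here is a small sketch. The function names are invented for this example, and the behavior described assumes a typical platform where double is an IEEE 754 64-bit value. Adding 0.1 ten times should mathematically give 1.0, but each stored 0.1 is slightly off, and the errors accumulate.

```cpp
#include <cmath>

// Adds 0.1 to itself ten times. Mathematically the result is 1.0.
double sumOfTenths()
{
    double dSum = 0.0;
    for (int i = 0; i < 10; ++i)
        dSum += 0.1; // 0.1 has no exact binary representation
    return dSum;
}

// True only if the accumulated sum compares exactly equal to 1.0.
bool sumIsExactlyOne()
{
    return sumOfTenths() == 1.0;
}

// How far the accumulated sum lands from 1.0.
double sumError()
{
    return std::fabs(sumOfTenths() - 1.0);
}
```

On such a platform sumIsExactlyOne() typically comes back false even though sumError() is tiny, which is exactly the kind of small rounding error described above.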

The second and potentially more serious kind of error is the precision error, where your number can't be stored because the floating point representation doesn't have enough bits to hold all of its significant digits. Precision errors are more serious because they can throw your answer off by a much larger amount than rounding errors.

0.0095 * 36.75 = 0.349125, which is 6 significant figures, so in this case, you'll probably be fine in terms of precision errors. Consequently, your answer will only be affected by small rounding errors.

But consider a case like: 0.0095 * 36.7513. Even though 36.7513 has 4 decimal places, it has 6 significant digits. When multiplied by 0.0095, the answer is 0.34913735, which has 8 significant digits. A float can only keep about the first 6 or 7 of them (roughly 0.349137).

Once you get into larger dollar amounts, floats are even less suitable. Consider $100264.75. This is a number with 8 significant digits, even though it only uses 2 decimal places. That is already more digits than a float can reliably hold: at this magnitude adjacent float values are about 0.008 apart, so most cent amounts cannot be stored exactly. As you do mathematical operations on such values, the result is going to drift farther and farther from your intended answer.
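Here is a rough sketch of how you could check this yourself. floatStorageError is a hypothetical helper written for this example, and the numbers assume the usual IEEE 754 32-bit float and 64-bit double. Note that $100264.75 itself happens to be exactly representable in binary (.75 is 1/2 + 1/4), so the sketch also tries $100264.73, an amount picked purely for illustration.

```cpp
#include <cmath>

// Hypothetical helper: stores dAmount in a float, converts it back to double,
// and returns how far the float's value ended up from the original double.
double floatStorageError(double dAmount)
{
    const float fAmount = static_cast<float>(dAmount); // narrowing loses digits
    return std::fabs(static_cast<double>(fAmount) - dAmount);
}
```

Under those assumptions, floatStorageError(100264.75) is 0, while floatStorageError(100264.73) is a few thousandths of a dollar before any arithmetic has been done at all.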

In short, if accuracy is important, use double. Only use floats when accuracy is not that important (e.g. in games, where it doesn't really matter if your character has 137.24 or 137.25 strength).

doesn't 34.90 have 3 significant digits?

I know math teachers usually say that the last zeros don't matter after a decimal, but when you're talking about precision they do. In sciences when you're taking measurements a .39 may be the same value as .3900000 mathematically speaking, but the latter is more precise to 7 significant figures. With the .39 value you cannot be assured the number in the thousandths place is 0.

No,...

34.90 has 4 significant figures.

3.94 has 3 significant figures

03.94 has 3 significant figures

.00000000394 = 3.94 E-9 has 3 significant figures

But 34.90 only has 3 significant figures; especially since 34.90 = 34.9

In terms of value, 34.90 is the same as 34.9. We're used to dropping trailing zeros when dealing with values.

However, in terms of precision, 34.9 is less precise than 34.90. If I tell you that I walked "34.9 miles", maybe I walked 34.85 miles and rounded up, or 34.94 miles and rounded down. But if I tell you I walked "34.90 miles", you know I walked somewhere between 34.895 and 34.905 miles, and that tells you more about the precision of the measurement.

The other posters are correct: 34.90 has 4 significant digits.

I talk about significant digits in lesson 2.5 -- Floating point numbers.