March29

Rounding floating point numbers

A lot of confusion seems to exist around the ability to round floating point numbers, I am going to try to shed a little light on this topic.  Bear with me, the math may get a little involved, but the answers will be worth it (I think).

Float is an estimated data type

Approximate-number data types for use with floating point numeric data. Floating point data is approximate; therefore, not all values in the data type range can be represented exactly.

MSDN SQL Float Article

Float is never an exact number.  The number is internally stored in more than one part and has to be processed into something we humans can understand.  That means something like 1/3 can be represented as 33%, but it is not actually a nice even number.

For the full mathematical treatment of why this is so, feel free to read the fantastic paper “What Every Computer Scientist Should Know About Floating-Point Arithmetic”.  Warning, that paper is not for the feint of heart, it has a lot of math behind it.

The important thing to pick up out of that paper is that all possible numbers between 0 and 1 cannot be represented in 32 or even 64 bits. In order to store a complicated number with decimal places it has to be estimated.  This estimation introduces Rounding Errors.  When you start with one small error, everything that comes afterwards is not what you expect, these types of errors lead to errors like the Patriot Scud Missile Error.

A C# Example of Rounding

If you run the following C# code you will see an object that has a result of 5.1400000000000006.  Even though it looks like op1 = 2.56 and op2 = 2.58.

double type example

double op1 = Math.Round(2.562, 2);
double op2 = Math.Round(2.577, 2);
double opresult = op1 + op2;

Set a breakpoint on the opresult assignment and look at the value.  Now go into your immediate window and type ? opresult.ToString() and you will see an answer that most people probably expect (5.14).  But put your mouse over the opresult object and you see the actual value.

Result of two doubles added

I know at this point it probably seems pretty confusing.  A double is what VistaDB internally stores a FLOAT column type.  Why?  Because they match pretty well in bit size and what they can do, but it does lead to an interesting rounding problem like the one above.

Lets take the same code with a decimal type, what changes?

decimal type example

decimal dop1 = Math.Round(2.562m, 2);
decimal dop2 = Math.Round(2.577m, 2);
decimal dopresult = dop1 + dop2;

Set a breakpoint on the dopresult and look at the value.  This time is correctly shows 5.14, why?  Decimals are not an estimated type.  Either they can fit the value, or you get an overflow exception. 

The decimal keyword denotes a 128-bit data type. Compared to floating-point types, the decimal type has a greater precision and a smaller range, which makes it suitable for financial and monetary calculations. The approximate range and precision for the decimal type are shown in the following table.

MSDN Decimal Page

So the value is 128 bit instead of 64 bit.  That means a LOT more storage space for numbers.  Again hovering over the result of the math addition, but this time we get the result we expected.

 Decimal Operation Result

The m on the end of the number tells the compiler we want it represented as a decimal, without this you would get a compiler error about the types not be convertible.

A SQL Example of Rounding

Now lets take a look at the same logic in SQL code using VistaDB.

-- Gives the result 5.1400000000000006 due to math rounding
select  round(2.562,2) + round(2.577,2);

This is the same results we got in our first test using the float type.  But we can rewrite the SQL a couple different ways to force that same promotion to 128 bit math like this:

select CAST(round(2.562,2) as decimal) + cast(round(2.577,2) as decimal);
-- Gives 5.14 result because of 128 bit math

select  round(CAST(2.562 as DECIMAL),2) + round(CAST(2.577 AS DECIMAL),2);
-- Gives 5.14 result because of 128 bit math

By promoting the type we now get the expected result of 5.14.

Of course you could also have the column type as a decimal and it would have worked as well, but in this case I was demonstrating direct input numbers.  The same is true for any other input to round.  The round function outputs the smallest type it can that will hold the number. 

What about SQL Server?

This minimum value representation is actually the same as the SQL Server docs state, but they don’t appear to always follow that documentation.  For the same above lines you get different results.  The select with no cast gets a 5.14 (seems they auto promoted to a decimal).  But the ones with the cast both return 6.  I think it is because the default scale in SQL Server for a decimal is 0.  Casting the 2.562 as decimal with a scale and precision (38,2 for example) and the statement does work.  But dot net can’t set a scale and precision for decimal types.

I can fully explain the why and how of VistaDB rounding, but the SQL Server rounding leaves me confused.  The following statement will cause a crash in SQL Server:

select ROUND(748.58, -3)

Msg 8115, Level 16, State 2, Line 1
Arithmetic overflow error converting expression to data type numeric.

Even though the docs state that it should work.  VistaDB returns the correct answer in this case (1000).  This leads me to believe that the SQL Server rounding routines are incorrect, or at least non repeatable for correctness.

Best Practice?

This leaves me in the unfortunate position of not really being able to recommend a best practice, other than implementing your own ROUND function that you can count on across any database.  When I looked at rounding for other databases I was amazed that they all have even stranger rules and exceptions to their rules depending upon types, math operations, etc. 

I have always felt that the database should be for storing data, not doing math.  I think this issue reiterates that databases are not really standardized at math operations as well as most people probably think.  If you need a specific math operation to always be the same on any platform, then implementing it in your code may be the only way to accomplish that goal.  With VistaDB and SQL Server you could use something like T-SQL Proc or CLR Proc to give consistent results, or just implement the logic in your application.

Why not just use Decimal internally for everything?

I know someone will be saying that we should just use System.Decimal for everything and be done with it.  That actually doesn’t solve the underlying problem though.  Even if we were to change the internal representation of a float to a System.Decimal there would be other problems.  The first is that a decimal requires 128 bits of RAM, and there are no native operations in the CPU that act on a dot net decimal.  The change to decimal would impact performance, disk, and RAM usage.  But it would also be a seriously breaking change to existing users.

More Information

IEEE Site – If you have serious math requirements (engineering or scientific) you will probably want to use an IEEE math library instead of the built in routines.

Calculate dot net scale and precision – Not directly related, but shows the difficulty in working with decimals in dot net, and how some limitations of their design impact performance.

Comments are closed