Decimal floating point in .NET
In my article on binary floating point types, I mentioned the System.Decimal (or just decimal in C#) type briefly. This article gives more details about the type, including its representation and some differences between it and the more common binary floating point types. From here on, I shall just refer to it as the decimal type rather than System.Decimal, and likewise where float and double are mentioned, I mean the .NET types System.Single and System.Double respectively. To make the article easier on the eyes, I'll leave the names in normal type from here on, too.
What is the decimal type?
The decimal type is just another form of floating point number - but unlike float and double, the base used is 10. If you haven't read the article linked above, now would be a good time to read it - I won't go into the basics of floating point numbers in this article.
The decimal type has the same components as any other floating point number:
a mantissa, an exponent and a sign. As usual, the sign is just a single bit,
but there are 96 bits of mantissa and 5 bits of exponent. However, not all
exponent combinations are valid. Only values 0-28 work, and they are effectively
all negative: the numeric value is sign * mantissa / 10^exponent.
This means the maximum and minimum values of the type are +/- (2^96 - 1), and
the smallest non-zero number in terms of absolute magnitude is 10^-28.
The reason for the exponent being limited is that the mantissa is able to store 28 or 29 decimal digits (depending on its exact value). Effectively, it's as if you have 28 digits which you can set to any value you want, and you can put the decimal point anywhere from the left of the first digit to the right of the last digit. (There are some numbers where you can have a 29th digit to the left of the rest, but you can't have all combinations with 29 digits, hence the restriction.)
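To put the formula into more concrete terms, here's a minimal sketch (the class name is just for illustration): 123.45 can be thought of as a mantissa of 12345 with an exponent of 2, i.e. +1 * 12345 / 10^2.

using System;

public class RepresentationSketch
{
    static void Main()
    {
        // 123.45m is conceptually stored as mantissa 12345, exponent 2:
        // value = sign * mantissa / 10^exponent = +1 * 12345 / 10^2
        Console.WriteLine (12345m / 100m == 123.45m); // True
    }
}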
How is a decimal stored?
A decimal is stored in 128 bits, even though only 102 are strictly necessary.
It is convenient to consider the decimal as three 32-bit integers representing the
mantissa, and then one integer representing the sign and exponent. The top bit
of the last integer is the sign bit (in the normal way, with the bit being set (1) for
negative numbers) and bits 16-23 (the low bits of the high 16-bit word) contain
the exponent. The other bits must all be clear (0). This representation is the one
given by decimal.GetBits(decimal), which returns an array of 4 ints.
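To show how those pieces fit together, here's a minimal sketch (the variable names are my own) which pulls the sign and exponent out of the fourth integer returned by decimal.GetBits:

using System;

public class GetBitsSketch
{
    static void Main()
    {
        decimal d = -123.45m;
        int[] bits = decimal.GetBits (d);

        // bits[0], bits[1] and bits[2] hold the 96-bit mantissa (low, middle, high).
        // bits[3] holds the sign in its top bit and the exponent in bits 16-23.
        bool negative = (bits[3] & int.MinValue) != 0;
        int exponent = (bits[3] >> 16) & 0xff;

        Console.WriteLine (negative); // True
        Console.WriteLine (exponent); // 2 (mantissa 12345, so the value is 12345 / 10^2)
    }
}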
Formatting decimals
Unlike floats and doubles, when .NET is asked to format a decimal into a string
representation, its default behaviour is to give the exact value. This means there
is no need for a decimal equivalent of the DoubleConverter
code of
the binary floating point article. You can, of course, ask it to restrict the
value to a specific precision.
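For instance, here's a minimal sketch (the format string is just one of many ways of restricting the precision):

using System;

public class FormattingSketch
{
    static void Main()
    {
        decimal d = 123.456789m;

        // The default is the exact stored value...
        Console.WriteLine (d);                 // 123.456789

        // ...but a format string can restrict the precision.
        Console.WriteLine (d.ToString ("F2")); // 123.46
    }
}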
Keeping zeroes
Between .NET 1.0 and 1.1, the decimal type underwent a subtle change. Consider the following simple program:
using System;

public class Test
{
    static void Main()
    {
        decimal d = 1.00m;
        Console.WriteLine (d);
    }
}
When I first ran the above (or something similar) I expected it to output just 1 (which is what it would have been on .NET 1.0) - but in fact, the output was 1.00. The decimal type doesn't normalize itself - it remembers how many decimal digits it has (by maintaining the exponent where possible) and on formatting, zero may be counted as a significant decimal digit. I don't know the exact nature of what exponent is chosen (where there is a choice) when two different decimals are multiplied, divided, added etc, but you may find it interesting to play around with programs such as the following:
using System;

public class Test
{
    static void Main()
    {
        decimal d = 0.00000000000010000m;
        while (d != 0m)
        {
            Console.WriteLine (d);
            d = d/5m;
        }
    }
}
Which produces a result of:
0.00000000000010000
0.00000000000002000
0.00000000000000400
0.00000000000000080
0.00000000000000016
0.000000000000000032
0.0000000000000000064
0.00000000000000000128
0.000000000000000000256
0.0000000000000000000512
0.00000000000000000001024
0.000000000000000000002048
0.0000000000000000000004096
0.00000000000000000000008192
0.000000000000000000000016384
0.0000000000000000000000032768
0.0000000000000000000000006554
0.0000000000000000000000001311
0.0000000000000000000000000262
0.0000000000000000000000000052
0.000000000000000000000000001
0.0000000000000000000000000002
Everything's a number
The decimal type has no concept of infinity or NaN (not-a-number) values, and despite the above examples of the same actual number being potentially representable in different forms (eg 1, 1.0, 1.00), the normal == operator copes with these and reports that 1.0 == 1.00, etc.
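Here's a minimal sketch of both points (the class and variable names are just illustrative); with no infinity to fall back on, dividing a decimal by zero throws a DivideByZeroException:

using System;

public class EqualitySketch
{
    static void Main()
    {
        decimal one = 1.0m;
        decimal alsoOne = 1.00m;

        Console.WriteLine (one == alsoOne); // True, despite the different string forms
        Console.WriteLine (one);            // 1.0
        Console.WriteLine (alsoOne);        // 1.00

        try
        {
            Console.WriteLine (one / 0m);   // no infinity to return...
        }
        catch (DivideByZeroException)
        {
            Console.WriteLine ("Division by zero throws rather than returning infinity.");
        }
    }
}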
Accuracy
The decimal type has a larger precision than any of the built-in binary floating point types in .NET, although it has a smaller range of potential exponents. Also, many of the surprising results produced in binary floating point due to inexact representations of the original operands simply go away in decimal floating point, precisely because many operands are specifically represented in source code as decimals. However, that doesn't mean that all operations suddenly become accurate: a third still isn't exactly representable, for instance. The potential problems are just the same as they are with binary floating point. However, most of the time the decimal type is chosen for quantities like money, where operations will be simple and keep things accurate. (For instance, adding a tax which is specified as a percentage will keep the numbers accurate, assuming they're in a sensible range to start with.) Just be aware of which operations are likely to cause inaccuracy, and which aren't.
As a very broad rule of thumb, if you end up seeing a very long string representation (ie most of the 28/29 digits are non-zero) then chances are you've got some inaccuracy along the way: most of the uses of the decimal type won't end up using very many significant figures when the numbers are exact.
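As a minimal sketch of that rule of thumb (the numbers are just illustrative):

using System;

public class AccuracySketch
{
    static void Main()
    {
        // A third isn't exactly representable, and the long string of digits
        // is a strong hint that inaccuracy has crept in along the way.
        decimal third = 1m / 3m;
        Console.WriteLine (third);      // 0.3333333333333333333333333333
        Console.WriteLine (third * 3m); // 0.9999999999999999999999999999

        // Simple money-style operations stay exact.
        decimal price = 1.15m;
        Console.WriteLine (price * 100m); // 115.00
    }
}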
Conclusion
Most business applications should probably be using decimal rather than float or double. My rule of thumb is that manmade values such as currency are usually better represented with decimal floating point: the concept of exactly 1.15 dollars is entirely reasonable, for example. For values from the natural world, such as lengths and weights, binary floating point types make more sense. Even though there is a theoretical "exactly 1.15 metres" it's never going to occur in reality: you're certainly never going to be able to measure exact lengths, and they're unlikely to even exist at the atomic level. We're used to there being a certain tolerance involved.
There is a cost to be paid for using decimal floating point arithmetic, but I believe this is unlikely to be a bottleneck for most developers. As always, write the most appropriate (and readable) code first, and analyse your performance along the way. It's usually better to get the right answer slowly than the wrong answer quickly - especially when it comes to money...