Decimal floating point in .NET
In my article on binary floating point types,
I mentioned the System.Decimal (or just decimal in C#)
type briefly. This article gives more details about the type, including its
representation and some differences between it and the more common binary
floating point types. From here on, I shall just refer to it as the
decimal type rather than System.Decimal, and likewise
where float and double are mentioned, I mean the
.NET types System.Single and System.Double respectively.
To make the article easier on the eyes, I'll leave the names in normal type
from here on, too.
What is the decimal type?
The decimal type is just another form of floating point number - but unlike
float and double, the base used is 10. If you haven't read the article linked
above, now would be a good time to read it - I won't go into the basics of
floating point numbers in this article.
The decimal type has the same components as any other floating point number:
a mantissa, an exponent and a sign. As usual, the sign is just a single bit,
but there are 96 bits of mantissa and 5 bits of exponent. However, not all
exponent combinations are valid. Only values 0-28 work, and they are effectively
all negative: the numeric value is sign * mantissa / 10^exponent.
This means the maximum and minimum values of the type are +/- (2^96 - 1), and
the smallest non-zero number in terms of absolute magnitude is 10^-28.
The reason for the exponent being limited is that the mantissa is able to store
28 or 29 decimal digits (depending on its exact value). Effectively, it's as if you
have 28 digits which you can set to any value you want, and you can put the decimal
point anywhere from the left of the first digit to the right of the last digit.
(There are some numbers where you can have a 29th digit to the left of the rest,
but you can't have all combinations with 29 digits, hence the restriction.)
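As a quick sanity check of those limits, here is a minimal sketch. The class and method names are my own, chosen for illustration; the only framework pieces assumed are decimal.MaxValue, the five-argument decimal constructor (low, mid and high mantissa ints, sign flag, exponent) and System.Numerics.BigInteger.

```csharp
using System;
using System.Numerics;

public class DecimalRange
{
    // Largest possible mantissa: all 96 bits set, i.e. 2^96 - 1.
    public static BigInteger MaxMantissa()
    {
        return BigInteger.Pow(2, 96) - 1;
    }

    // Smallest positive decimal: mantissa 1 with the largest exponent, 28.
    public static decimal SmallestPositive()
    {
        return new decimal(1, 0, 0, false, 28);
    }

    static void Main()
    {
        // decimal.MaxValue has an exponent of 0, so it is just the mantissa.
        Console.WriteLine((BigInteger)decimal.MaxValue == MaxMantissa());  // True

        // 10^-28: a single 1 in the 28th decimal place.
        Console.WriteLine(SmallestPositive());
    }
}
```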
How is a decimal stored?
A decimal is stored in 128 bits, even though only 102 are strictly necessary.
It is convenient to consider the decimal as three 32-bit integers representing the
mantissa, and then one integer representing the sign and exponent. The top bit
of the last integer is the sign bit (in the normal way, with the bit being set (1) for
negative numbers) and bits 16-23 (the low bits of the high 16-bit word) contain
the exponent. The other bits must all be clear (0). This representation is the one
given by decimal.GetBits(decimal) which returns an array of 4 ints.
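The layout described above can be pulled apart with a few shifts and masks. This is only a sketch: Decompose is a hypothetical helper of mine, not a framework method, and it uses BigInteger purely as a convenient way of holding a 96-bit mantissa.

```csharp
using System;
using System.Numerics;

public class DecimalParts
{
    // Splits a decimal into its sign, exponent and 96-bit mantissa.
    public static void Decompose(decimal value, out bool negative,
                                 out int exponent, out BigInteger mantissa)
    {
        // bits[0], bits[1], bits[2] = low, middle and high 32 bits of the mantissa;
        // bits[3] = sign and exponent.
        int[] bits = decimal.GetBits(value);

        negative = bits[3] < 0;              // top bit set means negative
        exponent = (bits[3] >> 16) & 0xFF;   // bits 16-23

        // Reinterpret each int as unsigned before widening and combining.
        mantissa = ((BigInteger)unchecked((uint)bits[2]) << 64)
                 | ((BigInteger)unchecked((uint)bits[1]) << 32)
                 | (BigInteger)unchecked((uint)bits[0]);
    }

    static void Main()
    {
        Decompose(-12.345m, out bool negative, out int exponent, out BigInteger mantissa);
        Console.WriteLine($"negative={negative}, exponent={exponent}, mantissa={mantissa}");
        // negative=True, exponent=3, mantissa=12345
    }
}
```

In other words, -12.345 is held as -(12345 / 10^3).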
Formatting decimals
Unlike floats and doubles, when .NET is asked to format a decimal into a string
representation, its default behaviour is to give the exact value. This means there
is no need for a decimal equivalent of the DoubleConverter code of
the binary floating point article. You can, of course, ask it to restrict the
value to a specific precision.
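A small sketch of both behaviours follows. The method names are mine; the invariant culture is passed explicitly only so that the decimal separator doesn't vary between machines, and "F2" is the standard fixed-point format specifier with two decimal places.

```csharp
using System;
using System.Globalization;

public class DecimalFormatting
{
    // Default formatting: the exact value, with every stored digit.
    public static string Exact(decimal value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }

    // Restricted formatting: fixed-point with two decimal places.
    public static string TwoPlaces(decimal value)
    {
        return value.ToString("F2", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        decimal d = 1234.5678m;
        Console.WriteLine(Exact(d));      // 1234.5678
        Console.WriteLine(TwoPlaces(d));  // 1234.57
    }
}
```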
Keeping zeroes
Between .NET 1.0 and 1.1, the decimal type underwent a subtle change.
Consider the following simple program:
using System;

public class Test
{
    static void Main()
    {
        decimal d = 1.00m;
        Console.WriteLine (d);
    }
}
When I first ran the above (or something similar) I expected it to output
just 1 (which is what it would have been on .NET 1.0) - but in fact,
the output was 1.00. The decimal type doesn't normalize itself - it
remembers how many decimal digits it has (by maintaining the exponent where
possible) and on formatting, zero may be counted as a significant decimal digit.
I don't know the exact nature of what exponent is chosen (where there is a choice)
when two different decimals are multiplied, divided, added etc, but you may
find it interesting to play around with programs such as the following:
using System;

public class Test
{
    static void Main()
    {
        decimal d = 0.00000000000010000m;
        while (d != 0m)
        {
            Console.WriteLine (d);
            d = d/5m;
        }
    }
}
Which produces a result of:
0.00000000000010000
0.00000000000002000
0.00000000000000400
0.00000000000000080
0.00000000000000016
0.000000000000000032
0.0000000000000000064
0.00000000000000000128
0.000000000000000000256
0.0000000000000000000512
0.00000000000000000001024
0.000000000000000000002048
0.0000000000000000000004096
0.00000000000000000000008192
0.000000000000000000000016384
0.0000000000000000000000032768
0.0000000000000000000000006554
0.0000000000000000000000001311
0.0000000000000000000000000262
0.0000000000000000000000000052
0.000000000000000000000000001
0.0000000000000000000000000002
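For addition and multiplication, at least, the choice of exponent is easy to observe. The sketch below relies on the rules given in the C# language specification, as I understand them: addition keeps the larger of the two exponents, and multiplication adds them (before any rounding). Scale is a hypothetical helper of my own, not a framework method.

```csharp
using System;
using System.Globalization;

public class DecimalScale
{
    // Hypothetical helper: reads the exponent from bits 16-23 of the last int.
    public static int Scale(decimal value)
    {
        return (decimal.GetBits(value)[3] >> 16) & 0xFF;
    }

    static void Main()
    {
        decimal a = 1.0m;   // exponent 1
        decimal b = 1.00m;  // exponent 2

        Console.WriteLine((a + b).ToString(CultureInfo.InvariantCulture));  // 2.00
        Console.WriteLine((a * b).ToString(CultureInfo.InvariantCulture));  // 1.000

        Console.WriteLine(Scale(a + b));  // 2
        Console.WriteLine(Scale(a * b));  // 3
    }
}
```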
Everything's a number
The decimal type has no concept of infinity or NaN (not-a-number) values,
and despite the above examples of the same actual number being potentially
representable in different forms (eg 1, 1.0, 1.00) the normal == operator
copes with these and reports that 1.0m == 1.00m, and so on.
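Here is a short sketch of what that means in practice. With no infinity or NaN to fall back on, operations that would produce them in binary floating point throw exceptions instead. The class and method names are mine; the operands are held in local variables rather than written as constants so that the compiler doesn't reject the expressions at compile time.

```csharp
using System;

public class DecimalSpecialCases
{
    // Same numeric value, different stored exponents.
    public static bool EqualDespiteScale()
    {
        decimal a = 1.0m;
        decimal b = 1.00m;
        return a == b;
    }

    // There is no decimal infinity, so dividing by zero throws.
    public static bool DivideByZeroThrows()
    {
        decimal zero = 0m;
        try
        {
            decimal unused = 1m / zero;
            return false;
        }
        catch (DivideByZeroException)
        {
            return true;
        }
    }

    // Going beyond decimal.MaxValue throws rather than saturating.
    public static bool OverflowThrows()
    {
        decimal big = decimal.MaxValue;
        try
        {
            decimal unused = big * 2m;
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(EqualDespiteScale());   // True
        Console.WriteLine(DivideByZeroThrows());  // True
        Console.WriteLine(OverflowThrows());      // True
    }
}
```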
Accuracy
The decimal type has a larger precision than any of the built-in binary
floating point types in .NET, although it has a smaller range of potential
exponents. Also, many operations which yield surprising results in binary
floating point due to inexact representations of the original operands go
away in decimal floating point, precisely because many operands are
specifically represented in source code as decimals. However, that doesn't mean
that all operations suddenly become accurate: a third still isn't exactly
representable, for instance. The potential problems are just the same as they
are with binary floating point. However, most of the time the decimal type
is chosen for quantities like money, where operations will be simple and
keep things accurate. (For instance, adding a tax which is specified as a percentage
will keep the numbers accurate, assuming they're in a sensible range to start with.)
Just be aware of which operations are likely to cause inaccuracy, and which aren't.
As a very broad rule of thumb, if you end up seeing a very long string representation
(i.e. most of the 28/29 digits are non-zero) then chances are you've introduced some inaccuracy
along the way: most of the uses of the decimal type won't end up using very many
significant figures when the numbers are exact.
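To make the distinction concrete, here is a sketch contrasting operations that stay exact with one that doesn't. AddTax is a hypothetical helper of mine, and the 8% rate and 19.99 price are made-up figures for illustration.

```csharp
using System;

public class DecimalAccuracy
{
    // Hypothetical helper: adds a tax given as a percentage of the net amount.
    public static decimal AddTax(decimal net, decimal percent)
    {
        return net + net * percent / 100m;
    }

    static void Main()
    {
        // Binary floating point: none of 0.1, 0.2 or 0.3 is exactly representable.
        double x = 0.1, y = 0.2;
        Console.WriteLine(x + y == 0.3);      // False

        // Decimal floating point: all three are exact.
        decimal a = 0.1m, b = 0.2m;
        Console.WriteLine(a + b == 0.3m);     // True

        // A third still can't be represented exactly, so this round trip loses out.
        decimal third = 1m / 3m;
        Console.WriteLine(third * 3m == 1m);  // False

        // A simple percentage calculation stays exact.
        Console.WriteLine(AddTax(19.99m, 8m) == 21.5892m);  // True
    }
}
```

Note that 1m / 3m is also a good example of the rule of thumb about long string representations: it uses all 28 decimal places.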
Conclusion
Most business applications should probably be using decimal rather than float
or double. My rule of thumb is that manmade values such as currency are usually
better represented with decimal floating point: the concept of exactly 1.25
dollars is entirely reasonable, for example. For values from the natural world,
such as lengths and weights, binary floating point types make more sense. Even
though there is a theoretical "exactly 1.25 metres" it's never going to occur in reality:
you're certainly never going to be able to measure exact lengths, and
they're unlikely to even exist at the atomic level. We're used to there being a certain
tolerance involved.
There is a cost to be paid for using decimal floating point arithmetic, but
I believe this is unlikely to be a bottleneck for most developers. As always,
write the most appropriate (and readable) code first, and analyse your
performance along the way. It's usually better to get the right answer
slowly than the wrong answer quickly - especially when it comes to money...