
Notes for Chapter 3: Parameterized typing with generics

3.1: More persuasion on the benefits of generics

Joe Albahari provided a good example of where generics really can save you from bugs. While you might not have any problem remembering that everything in a particular ArrayList is an int (boxed, of course), you may find it relatively tricky to find all the accesses if you later change your mind and decide it should actually be full of decimal values. With generics and List<T>, that refactoring exercise is trivial because of the extra checking the compiler can provide. In the risky world of ArrayList, you've got to rely on being careful.

Joe also gave another example which he finds usually persuades people of the benefits of generics. Which of these lines would you rather use to find the length of the fourth element of a list?

// When list is an ArrayList
((string) list[3]).Length

// When list is a List<string>
list[3].Length
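
To labour the refactoring point with a sketch of my own (this isn't from the book): change the element type of the generic list and the compiler immediately flags every access which no longer makes sense, whereas the ArrayList version compiles happily and only fails at execution time.

// Generic version: change List<int> to List<decimal> and the
// int-specific access below fails to compile
List<int> amounts = new List<int>();
amounts.Add(10);
int first = amounts[0];

// Non-generic version: the cast compiles whatever the list "should"
// contain, and a mistake only shows up as an exception at execution time
ArrayList oldAmounts = new ArrayList();
oldAmounts.Add(10);
int oldFirst = (int) oldAmounts[0];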

3.2.1: Modifying the results of an indexer

In listing 3.1, I include the following line of code:

frequencies[word]++;

I mention that this may seem odd at first. There are situations where similar code could produce incorrect results - in particular, if you try to change a field or property on a value type which is returned from a dictionary, the compiler will complain. For example (C# 3 code just to be more concise):

using System.Collections.Generic;

struct MutableStruct
{
    public int value;
}

class Test
{
    static void Main()
    {
        var map = new Dictionary<string,MutableStruct>();
        
        map["hello"] = new MutableStruct();
        map["hello"].value++;
    }
}

This gives a compiler error of:

Test.cs(15,9): error CS1612: Cannot modify the return value of 'System.Collections.Generic.Dictionary<string,MutableStruct>.this[string]' because it is not a variable

Of course, I'm sure you wouldn't use a mutable struct anyway, would you?
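
In case it helps, here's roughly what the compiler does with that increment for a Dictionary<string,int> like the one in listing 3.1 - a fetch through the indexer's getter, an increment of the copy, and a store back through the setter (just a sketch of the expansion, not code from the book):

// frequencies[word]++; behaves roughly like this:
int tmp = frequencies[word];  // indexer getter returns a copy of the int
frequencies[word] = tmp + 1;  // indexer setter stores the incremented value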

3.2.1: Regex vs String.Split

People who know my views on regular expressions and their misuse might be surprised to see me use one in listing 3.1. Surely String.Split is a great tool for splitting text into words? Well, yes and no. In particular, "no" when you can have several non-word characters in a row, and several different non-word characters - which is the case in the example. It's all feasible, of course, but in this case the regex was simpler.
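
As a rough illustration of what I mean (this isn't the book's listing, and the separator list is just an example), the regex copes with runs of mixed non-word characters in one go, whereas String.Split needs every separator spelled out explicitly:

string text = "the quick -- brown, fox; jumps high";

// One pattern handles any run of non-word characters
string[] words = Regex.Split(text, @"\W+");

// The String.Split version needs each separator listed, and still relies
// on StringSplitOptions.RemoveEmptyEntries to cope with adjacent separators
string[] alsoWords = text.Split(new char[] { ' ', '-', ',', ';' },
                                StringSplitOptions.RemoveEmptyEntries);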

3.2.3: List.ConvertAll - badly named?

In listing 3.2, I use List.ConvertAll to "convert" each element of a list of integers into its square root (as a double). Is the word "convert" really appropriate here? I guess it depends on what you understand by "conversion". It's not really a different representation of the same value, which is what conversion often means (consider currency conversion, numeric conversions etc). It's really a projection, mapping or transformation.
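
For reference, the sort of call I mean looks like this (a sketch along the lines of listing 3.2, with made-up variable names):

List<int> integers = new List<int>();
integers.Add(1);
integers.Add(4);
integers.Add(9);

// "Converting" here is really projecting each int to a double
List<double> roots = integers.ConvertAll<double>(
    delegate(int x) { return Math.Sqrt(x); });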

3.2.3: MakeList revisited

The MakeList method of listing 3.3 accomplishes its aim of helping to teach about generic methods, type inference etc. However, for real-life use you might want to consider this alternative:

static List<T> MakeList<T> (params T[] elements)
{
    return new List<T>(elements);
}

Despite being so short, this is still useful - because you can use type inference to avoid actually having to specify T in the calling code, which can be very handy in some cases. An alternative might help you to work around the lack of generic covariance:

static List<TResult> MakeList<TSource,TResult> (params TSource[] elements)
    where TSource : TResult
{
    List<TResult> ret = new List<TResult>(elements.Length);
    foreach (TResult t in elements)
    {
        ret.Add(t);
    }
    return ret;
}

Of course you then need to specify the types - but through the wonders of overloading by the number of type parameters, you can actually have both methods at the same time. The ToList and Cast extension methods in LINQ to Objects make this somewhat less important if you're using .NET 3.5...
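
Just to show the calling side (with made-up values), the first form lets type inference do the work, while the second needs its type arguments spelled out:

// T inferred as string
List<string> names = MakeList("Jon", "Marc", "Eric");

// Explicit type arguments, working around the lack of covariance
List<object> objects = MakeList<string, object>("Jon", "Marc", "Eric");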

3.2.3: Consider ArrayList and Hashtable to be deprecated

Is your code still peppered with uses of ArrayList and Hashtable? I suspect a great deal of production code in the wild still uses these non-generic collection types despite "new" code in the same codebase using List<T> and its friends.

It's worth considering the non-generic collections to be effectively deprecated. Indeed, Silverlight 2.0 won't ship with them - so if you're planning to have any common code, you'll need to move over to the generic versions.

3.3.1: Derivation type constraints and implicit reference conversions

The footnote regarding implicit reference conversions is a good example of why it was so great to have Eric Lippert reviewing the book. You see, I originally had this (broken) example in mind:

// Broken code - do not use
using System;
using System.IO;

public class FakeStream
{
    public static implicit operator Stream(FakeStream original)
    {
        return new MemoryStream();
    }
}

class Test
{
    static void ShowLength<T>(T input) where T : Stream
    {
        Console.WriteLine (typeof(T));
        Console.WriteLine (input.Length);
    }
    
    static void Main()
    {
        FakeStream x = new FakeStream();
        // This line is okay, but it's not an implicit reference conversion
        Stream s = x;
        // Compilation error!
        ShowLength(x);
    }
}

Now, I thought this would compile, because there's an implicit conversion from an expression of type FakeStream to Stream as shown by the line with s in it. The input of that conversion is a reference and the output is a reference.

That doesn't mean it's an implicit reference conversion though. The terminology is slightly confusing here - but basically an implicit reference conversion is one where the CLR can just use the original pointer (reference) itself as a pointer (reference) to an instance of the target type. So, this works for things like interfaces and arrays. It doesn't work when the conversion returns a completely different reference, as user-defined conversions generally do.

3.3.1: Derivation type constraints: specification terminology

One part of the C# 3 language spec (10.1.5) talks about derivation type constraints being fulfilled by the type argument "deriving from" the constraint - and I've quoted that in the book. However, this is a fairly crude way of expressing it, and is likely to change in future versions. The actual meaning won't change (unless it's expanded to be less restrictive somehow), but the terminology will be improved.

3.3.1: Naked constraints!

For some reason, the MSDN documentation calls a type parameter constraint (e.g. where T : U) a naked constraint. Neither Eric nor I have any idea why this is so. I didn't realise it wasn't the official terminology until Eric pointed it out to me.

3.3.2: Simple spec != simple implementation

When discussing type inference, I mentioned that the rules in C# 2 are fairly simple. Eric pointed out that what may seem simple in a spec certainly needn't be simple in terms of a compiler implementation - especially when one considers the compiler to be part of an IDE which needs to provide Intellisense, work with incomplete code etc.

In a way, this is similar to writing an XML editor. If you start with an XML parser which correctly spits out errors on invalid XML documents, you've got an awful lot of work ahead of you. Much of the time spent in an XML editor is with an invalid document, as you're in the middle of changing it: the editor needs to understand that and respond accordingly.

Having said all this, the type inference rules in C# 2 really are fairly straightforward from the point of view of a C# developer. The same can't be said of the C# 3 rules! None of this takes away from the great job the C# team have done in terms of implementation.

3.3.2: Activator.CreateInstance<T>()

Joe Albahari pointed out that a good example of a method where you always have to specify the type parameter is Activator.CreateInstance<T>(). With no "normal" parameters, there's nothing for the type inference rules to work with. This is interesting on two counts:

3.3.3: default(...) is an operator

As I point out in the book, default is never mentioned as an operator in the language specification. That doesn't mean it isn't an operator though. I didn't want to be too specific in the book, but I'm willing to trust Eric's judgement on this: it really is an operator.
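
The typical use is in generic code, where you don't know whether T will be a reference type or a value type. A quick sketch (the method name and parameters are just for illustration):

static T GetValueOrDefault<T>(IDictionary<string, T> map, string key)
{
    T value;
    // default(T) is null for reference types and "all zeroes" for value types
    return map.TryGetValue(key, out value) ? value : default(T);
}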

3.3.3: Pair - class or struct?

The Pair type I included as listing 3.6 is implemented as a class. This made a lot of sense in early drafts where it was mutable - but it's a lot more sensible as an immutable type. For one thing, that means that the hashcode and equality operations will stay stable, so long as the values within the pair aren't mutable themselves.

At this point, however, you have to wonder - is there much point in it being a reference type? Why not make it a value type (a struct)? That way a pair of bytes is really cheap, for example.

I tend to default to writing classes rather than structs, but I'm beginning to think that generic types which end up just wrapping other values (based on type parameters) could often be better represented as structs. This is in line with various examples in the framework such as KeyValuePair<TKey, TValue>.
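
A minimal sketch of what the struct version might look like (this isn't listing 3.6, and I've left out the equality and hashcode members for brevity):

struct Pair<T1, T2>
{
    private readonly T1 first;
    private readonly T2 second;

    public Pair(T1 first, T2 second)
    {
        this.first = first;
        this.second = second;
    }

    public T1 First { get { return first; } }
    public T2 Second { get { return second; } }
}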

3.3.3: XOR for hashcodes

The hashcode algorithm used in the Pair class is based on repeatedly multiplying and adding. This is my "default implementation" when I need to create a hashcode. Another common implementation is XORing the hashcodes of the constituent fields - an algorithm I denounce in the book, but without details.

It's not uncommon for two or more fields to be of the same type - in the case of our sample type, Pair<int,int> or Pair<string,string> for example. Let's take the latter type, and consider the following pairs:

("a", "a")
("b", "b")
("c", "c")
("a", "b")
("b", "a")

Again, this is far from uncommon - often the same values crop up together. We'd like all of the above pairs to have different hashcodes, to avoid hash collisions - but there are only two distinct values in the above list if we use XORing. The first three would all be 0, and the last two would be the same as each other. A pair of the form (x, x) will always have a hash of 0, and a pair of the form (x, y) will always have the same hash as (y, x).

In some cases you may know details of the data you're hashing, and be able to use XOR knowing that the above cases won't come up - but for a general purpose algorithm (particularly when dealing with generic types) it's a bad idea.
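
For comparison, here's the shape of the two approaches as they might appear in a Pair<T1,T2> with fields first and second (illustrative sketches rather than the book's code - and obviously a real type would only have one of them):

// Multiply-and-add (my usual default): asymmetric, so (x, y) and (y, x)
// usually end up with different hashes
public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        hash = hash * 23 + (first == null ? 0 : first.GetHashCode());
        hash = hash * 23 + (second == null ? 0 : second.GetHashCode());
        return hash;
    }
}

// XOR: (x, x) always hashes to 0, and (x, y) always collides with (y, x)
public override int GetHashCode()
{
    return (first == null ? 0 : first.GetHashCode())
         ^ (second == null ? 0 : second.GetHashCode());
}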

An early implementation of anonymous types used XOR hashing. Fortunately this was fixed long before release!

3.4.3: Terminology: one interface "extending" another

Eric picked up on the fact that I talk about IEnumerable<T> extending IEnumerable. It's the correct language in terms of the specification, but Eric's not a big fan of the term. He regards extension as being about reuse of implementation, whereas specifying a "base" interface is about saying that "any object fulfilling this contract must also fulfill this other interface".

Me? I like the term just fine. I don't see any necessary connection between extension and implementation. If someone's adding more provisions to a contract of any description, I'm happy to call that an extension. My guess is that Eric is right in terms of a very computer-science-specific use of the word extension, but I think the implication to common developers (like me) is clear.

It's rare for me to not be a stickler for appropriate terminology, but I guess in this case I just haven't been exposed to the strictly correct usage enough to worry too much.

3.5.1: Not quite a Sieve of Eratosthenes

The method for finding primes in listing 3.13 isn't quite the normal Sieve of Eratosthenes, because it removes multiples of numbers we already know to be non-prime. There's no simple way to change the listing while keeping the point of it (i.e. the use of RemoveAll), which is why I kept it as it is.

We could check the list of candidates in the second for loop, to see if factor is present - but of course this ends up being reasonably expensive too if performed in the simplest manner. A binary search would be quicker, but then we're getting significantly more complicated.
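
To give the flavour of the approach without reproducing listing 3.13, here's a rough sketch of the RemoveAll pattern (the variable names and limit are my own):

List<int> candidates = new List<int>();
for (int i = 2; i <= 50; i++)
{
    candidates.Add(i);
}

// Not quite the classic sieve: we remove multiples of every factor,
// even factors we already know to be non-prime (such as 4)
for (int factor = 2; factor * factor <= 50; factor++)
{
    candidates.RemoveAll(delegate(int x) { return x > factor && x % factor == 0; });
}
// candidates now contains the primes up to 50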

3.5.2: ICloneable

ICloneable is effectively a useless interface, because it doesn't indicate whether the copy should be deep or shallow. Even if the interface were separated into two, depending on whether or not you wanted the copy to be deep, it still wouldn't be terribly clear. Just how deep should a deep copy be? Just how shallow should a shallow copy be?

In many ways, the difficulties in expressing copy depth are similar to the difficulties in expressing const semantics.

3.5.4: The naming of SortedList/SortedDictionary

Joe Albahari points out that SortedList is named that way because internally, that's exactly what it is - a list, sorted by key. SortedDictionary on the other hand, is basically a red/black tree. Joe suggests that calling it BinaryTree might have been a bit more sensible. Personally I still think that names which only indicate the implementation rather than the interface aren't terribly useful - which isn't to suggest I have an immediately better suggestion...

3.6.1: Lions and tigers and bears, oh my!

The types used in the description of covariance were originally Base and Derived. Eric rightly pointed out that these take more thought to understand than concrete examples. He tends to use Animal, Giraffe and Turtle - an idea which I copied, partly as a small homage. Unfortunately I had to turn Giraffe into Cat for the sake of getting everything onto the page appropriately - but then again, it's not every day one turns giraffes into cats! Mads Torgersen apparently uses Fruit, Apple and Banana.

Feel free to consider using Person, LanguageGeek and SaneDeveloper if you wish...

3.6.2: Complex<Complex<double>>

Both Marc Gravell and Eric felt compelled to explain that a Complex<Complex<double>> would be something like a quaternion.

I haven't looked it up, and frankly I'm afraid to do so. I gave up brain-bending maths when I left university. Higher-order functions tax my poor brain hard enough these days...

3.6.2: A new solution to generic operators

The fabulously talented Marc Gravell came up with a nifty way of using operators in a generic manner via delegates and expression trees, in .NET 3.5. It's now part of MiscUtil. There's also a general explanation and a usage page for them.
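
The underlying idea, very roughly, is to build a delegate from an expression tree once per type and then cache it. This is just a sketch of the technique rather than MiscUtil's actual code (the class name here is made up):

using System;
using System.Linq.Expressions;

static class AddOperator<T>
{
    // Built once per T, then reused for every call
    public static readonly Func<T, T, T> Add = BuildAdd();

    static Func<T, T, T> BuildAdd()
    {
        ParameterExpression x = Expression.Parameter(typeof(T), "x");
        ParameterExpression y = Expression.Parameter(typeof(T), "y");
        // Throws (at type initialization) if T has no suitable addition operator
        return Expression.Lambda<Func<T, T, T>>(Expression.Add(x, y), x, y).Compile();
    }
}

Calling AddOperator<int>.Add(2, 3) or AddOperator<decimal>.Add(2m, 3m) then just invokes the cached delegate.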

3.6.2: A generic Complex type

Thanks to Marc Gravell's work on generic operators, the idea of a Complex<T> type is no longer in the realm of fantasy. It's still a lot of work, however, which is why it's so lovely that Marc's gone to the trouble of implementing it for us... He stresses that this hasn't been unit tested yet, but here it is in all its glory. (It relies on MiscUtil for generic operator support, of course.)

using System;
using System.Collections.Generic;
using MiscUtil;

struct Complex<T> : IEquatable<Complex<T>> 
{
    private readonly T real, imaginary;
    
    public Complex(T real, T imaginary) 
    {
        this.real = real;
        this.imaginary = imaginary;
    }
    
    public T RelativeLength() 
    {
        return SquareLength(ref this);
    }
    
    public T Real { get { return real; } }
    
    public T Imaginary { get { return imaginary; } }

    public override string ToString() 
    {
        return string.Format("({0}, {1}i)", real, imaginary);
    }
    
    public override bool Equals(object obj) 
    {
        if (obj != null && obj is Complex<T>) 
        {
            Complex<T> other = (Complex<T>)obj;
            return Equals(ref this, ref other);
        }
        return base.Equals(obj);
    }
    
    public bool Equals(Complex<T> other) 
    {
        return Equals(this, other);
    }
    
    private static bool Equals(ref Complex<T> x, ref Complex<T> y) 
    {
        return EqualityComparer<T>.Default.Equals(x.Real, y.Real)
            && EqualityComparer<T>.Default.Equals(x.Imaginary, y.Imaginary);
    }
    
    public static Complex<T> operator +(Complex<T> x, Complex<T> y) 
    {
        return new Complex<T>(
            Operator.Add(x.Real, y.Real),
            Operator.Add(x.Imaginary, y.Imaginary)
        );
    }
    
    public static Complex<T> operator +(Complex<T> x, T y) 
    {
        return new Complex<T>(
            Operator.Add(x.Real, y),
            x.Imaginary
        );
    }
    
    public static Complex<T> operator -(Complex<T> x, Complex<T> y) 
    {
        return new Complex<T>(
            Operator.Subtract(x.Real, y.Real),
            Operator.Subtract(x.Imaginary, y.Imaginary)
        );
    }
    
    public static Complex<T> operator -(Complex<T> x, T y) 
    {
        return new Complex<T>(
            Operator.Subtract(x.Real, y),
            x.Imaginary
        );
    }
    
    public static Complex<T> operator -(Complex<T> x) 
    {
        return new Complex<T>(
            Operator.Negate(x.Real),
            Operator.Negate(x.Imaginary)
        );
    }
    
    public static bool operator ==(Complex<T> x, Complex<T> y) 
    {
        return Equals(ref x, ref y);
    }
    
    public static bool operator !=(Complex<T> x, Complex<T> y) 
    {
        return !Equals(ref x, ref y);
    }
    
    public static Complex<T> operator *(Complex<T> x, Complex<T> y) 
    {
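        // (a + bi)(c + di) = (ac - bd) + (ad + bc)i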
        return new Complex<T>(
            Operator.Subtract(
                Operator.Multiply(x.Real, y.Real),
                Operator.Multiply(x.Imaginary, y.Imaginary)
            ), Operator.Add(
                Operator.Multiply(x.Real, y.Imaginary),
                Operator.Multiply(x.Imaginary, y.Real)
            )
        );
    }
    
    public static Complex<T> operator *(Complex<T> x, T y) 
    {
        return new Complex<T>(
            Operator.Multiply(x.Real, y),
            Operator.Multiply(x.Imaginary, y)
        );
    }
    
    public static Complex<T> operator *(Complex<T> x, int y) 
    {
        return new Complex<T>(
            Operator.MultiplyAlternative(x.Real, y),
            Operator.MultiplyAlternative(x.Imaginary, y)
        );
    }
    
    private static T SquareLength(ref Complex<T> value) 
    {
        return Operator.Add(
            Operator.Multiply(value.Real, value.Real),
            Operator.Multiply(value.Imaginary, value.Imaginary)
        );
    }
    
    public static Complex<T> operator /(Complex<T> x, Complex<T> y) 
    {
        T divisor = SquareLength(ref y),
          real = Operator.Divide(
                Operator.Add(
                    Operator.Multiply(x.Real, y.Real),
                    Operator.Multiply(x.Imaginary, y.Imaginary)
                ), divisor),
          imaginary = Operator.Divide(
                Operator.Subtract(
                    Operator.Multiply(x.Imaginary, y.Real),
                    Operator.Multiply(x.Real, y.Imaginary)
                ), divisor);
        return new Complex<T>(real, imaginary);
    }
    
    public static Complex<T> operator /(Complex<T> x, T y) 
    {
        return new Complex<T>(
            Operator.Divide(x.Real, y),
            Operator.Divide(x.Imaginary, y)
        );
    }
    
    public static Complex<T> operator /(Complex<T> x, int y) 
    {
        return new Complex<T>(
            Operator.DivideInt32(x.Real, y),
            Operator.DivideInt32(x.Imaginary, y)
        );
    }
    
    public static implicit operator Complex<T>(T real) 
    {
        return new Complex<T>(real, Operator<T>.Zero);
    }

    public override int GetHashCode() 
    {
        return (Real == null ? 0 : 17 * Real.GetHashCode())
             + (Imaginary == null ? 0 : Imaginary.GetHashCode());
    }
}

3.6.5: Java's interesting approach to generics

Java's approach to generics is interesting, partly because the variance which is available is at the call site, not the declaration. Neither Eric nor I know of languages which take a similar approach.

In some ways, Java's support for generics is similar to its support for checked exceptions: the exceptions are only checked by the compiler, not by the runtime. It would be perfectly possible to write a pseudo-Java compiler which didn't care about checked exceptions.