C# in Depth

(You are currently looking at the first edition version of this page. This page is also available for the second and third editions.)

All Notes

1.1.1: Productizing Product

The Product class in listing 1.1 leaves a few things to be desired. These are partly stylistic, but still far from arbitrary.

All of these points are ones to think about for real code, which has different motivations than sample code in books. In this case, I didn't want to add modifiers like private and sealed, in order to keep the code as simple as possible.

1.1.2: Implicit casting in foreach loops over generic collections

After changing the returned list of products from an ArrayList to a List<Product>, the book makes the following bold claim:

Similarly, the invisible cast in the foreach loop has gone. It’s hard to tell the difference, given that it’s invisible, but it really is gone. Honest. I wouldn’t lie to you. At least, not in chapter 1...

Well, it's sort of true and it sort of isn't. The compiler effectively converts the foreach loop into code which does contain a cast - but then it notices that the cast is just an identity conversion, so it optimizes it away from the compiled code.
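
To see what that means, it's worth sketching the expansion. This is roughly (though not exactly) the code the compiler generates for a foreach loop over a List&lt;Product&gt;:

// Rough expansion of: foreach (Product product in products) { ... }
List<Product>.Enumerator enumerator = products.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        // The conversion here is Product-to-Product - an identity
        // conversion, so it vanishes from the compiled code
        Product product = (Product) enumerator.Current;
        Console.WriteLine(product);
    }
}
finally
{
    enumerator.Dispose();
}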

In other cases, there will still be a cast in the compiled code. Consider this perfectly valid program:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        List<object> list = new List<object> { "This is a string" };
        
        foreach (string entry in list)
        {
            Console.WriteLine(entry);
        }
    }
}

In this case, there's a cast present in the compiled IL. If the list contains something other than a string, it will still compile but will fail at execution time.

1.1.2: Nullity checking in comparisons and defensive coding

In listing 1.4, the comparer doesn't check whether its parameters are null references, nor whether they're of the right type, nor whether the names of the products are null.

How much this matters in real life really depends on what you're doing. You can write defensive code everywhere, knowing that you'll get the most appropriate exception (or "null is less than everything" behaviour), or you can skimp and cope with the consequences. If you're the only one ever to use your types, you can be a bit more relaxed about parameter checking and the like, but when writing libraries for public consumption you should be a lot more careful.
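
As a sketch of what the defensive extreme might look like (assuming the non-generic comparer of listing 1.4, and a Product type with a Name property):

using System;
using System.Collections;

// A defensive version of the comparer: each decision here (the exception
// type, "null is less than everything") is a deliberate design choice
class ProductNameComparer : IComparer
{
    public int Compare(object x, object y)
    {
        if (x == null)
        {
            return y == null ? 0 : -1; // null is less than everything
        }
        if (y == null)
        {
            return 1;
        }
        Product first = x as Product;
        Product second = y as Product;
        if (first == null || second == null)
        {
            throw new ArgumentException("Both arguments must be Products");
        }
        // string.Compare copes with null names
        return string.Compare(first.Name, second.Name);
    }
}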

1.1.2: Why return a List<T> from GetSampleProducts?

The GetSampleProducts method in the C# 2 and 3 versions of the Product class demonstrates generics - and, in C# 3, collection initializers - but it could do more.

Suppose we'd declared it to return IEnumerable<Product> instead of List<Product>. Then we could have seen iterator blocks with yield return statements. Another major language feature mentioned in the first chapter... So why didn't I do it?

There are two reasons. The first, being totally honest, is that I didn't think of it. It was only when Eric mentioned it at tech review that I saw the possibilities. However, leaving aside the size of change it would have required late on in the editing process, I still don't think it would have been a good idea - simply because iterator blocks scare me slightly. Closures are a bit mind-bending but basically make sense - iterator blocks are a whole extra level of magic.

Don't get me wrong, I love them and use them - but I think that presenting readers with that level of compiler magic that early in the book might be a little off-putting. As soon as you start thinking about the way that at execution time you start dipping into bits of your method at a time, it all gets somewhat bizarre.

Oh, and on a minor point - if the method hadn't returned a List<Product>, it would have been slightly harder to demonstrate sorting in C# 2 :)
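
For the curious, the iterator block version would have looked something like this (assuming the same Product constructor the listings use):

static IEnumerable<Product> GetSampleProducts()
{
    yield return new Product("West Side Story", 9.99m);
    yield return new Product("Assassins", 14.99m);
    yield return new Product("Frogs", 13.99m);
    yield return new Product("Sweeney Todd", 10.99m);
}

No list is ever built - each product is only created when the caller asks for it, which is exactly the execution-time magic described above.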

1.1.3: List.ForEach - friend or foe?

Eric made an interesting comment on listing 1.11, where I demonstrate the ForEach method. His view is that expressions should be free from side effects wherever possible, and statements should be useful only for their side effects. That suggests that you should use the foreach statement to iterate through a collection and perform actions on it, rather than using an expression with a ForEach call.

Listing 1.12 shows the kind of code Eric is more in favour of, demonstrating (in his words) "shiny beauty". Every statement has a side effect, and every expression computes a value.

This is quite a purist viewpoint I suspect - and to be totally honest, I don't think I'm really qualified to discuss its merits. It's interesting to consider though - try applying it to your code, and see whether the results are clearer.
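
To make the two styles concrete (assuming a List&lt;Product&gt; called products):

// Side effect buried in a delegate passed to a method call
products.ForEach(delegate(Product product) { Console.WriteLine(product); });

// Side effect expressed directly as a statement - the style Eric prefers
foreach (Product product in products)
{
    Console.WriteLine(product);
}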

1.2.1: Why did Sun create Java?

As I note in the book, the operating-system independent Java language was clearly a threat to Microsoft. But imagine if Sun hadn't invented it, but someone else (IBM perhaps?) - would Sun not have regarded it as a threat? After all, it's not only operating-system independent, but also hardware-independent - and Sun is a company selling hardware and operating systems!

Did Sun introduce Java with an eye to hurting Microsoft, without necessarily considering the implications on itself? I don't know - but I'm glad it happened anyway. The effect of Java on the industry has certainly been profound.

1.2.3: Slow adoption rates: sad or not?

When discussing C# 2 and .NET 2.0, I expressed regret that they have taken so long to become widely adopted in the industry. I was reminded that the industry exists to make a profit, not to have fun with technology. New technology often (if not always) involves an element of risk, and there's definitely a lot of sense in sitting back while the early adopters take the risks.

I'm sticking by my use of the word "sad" though - because that hesitation to use .NET 2.0 has left many of us developers "in the trenches" being forced to use .NET 1.1 and C# 1 despite the huge productivity gains available with the later technologies. It's sad for us.

I really, really hope that the multi-targeting feature of Visual Studio 2008 along with the smaller conversion process and lack of a new runtime will mean many of us get to use the updated IDE and language features much earlier than we did last time round - the risk is much lower this time.

1.2.5: Is IronPython really a Trojan horse?

I was perhaps a little harsh using the words "Trojan horse" to describe the way that scripting and dynamic language developers will no doubt find the DLR (Dynamic Language Runtime) attractive. It's not a trick, and there aren't going to be hordes of Microsoft developers leaping out of the DLR when night falls. You may wish to think of it as an alternative entrance to the .NET house instead.

I do still think that it's an enticing prospect, however. Eric pointed out that I was characterising the language as the attractive part of the proposition, whereas he'd say that the framework, OS, and users make it an attractive value proposition. I think it really depends on the developer. Does a developer look at a platform, and consider the lack of her favourite language running on that platform a barrier to entry which can be overcome, or does she only look at the platforms her favourite language is available on? I'm sure there are plenty of developers on each side of the fence - but it's certainly a benefit if that favourite language is available on more platforms, such as .NET.

It's also true that although I point out that the IronPython programmer of today may be the C# programmer of tomorrow, the reverse is true too. By making multiple languages available on the same platform, it's much easier to learn a new language and use the right language in the right context.

These are largely cases of looking at the same picture from different angles. Hopefully we can all agree that having a broad spectrum of languages of different styles (functional, static, dynamic, OO etc) all running on a comprehensive platform is a good thing.

1.4.2: Are using directives really harmless?

Snippy is a simple beast. It has a set of namespaces configured, and will emit a using directive for each of them when compiling code. I claim in the book that this is harmless - but is it really?

In the context of Snippy, it's reasonably okay - but it's worth being aware of the consequences of using directives in C# 3, where they introduce not only types but also extension methods.

It's fairly easy to come up with examples where the presence or absence of a using directive can drastically affect behaviour - and it's not obvious that this is the case just from the directive itself. Personally I like to trim unnecessary using directives wherever I can in real code.
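
Here's a sketch of the kind of surprise I mean. The Backwards extension methods below are made up, but the mechanism is real: change which using directive appears in the Demo namespace, and the same call silently changes behaviour.

using System;

namespace FirstExtensions
{
    public static class StringExtensions
    {
        public static string Backwards(this string text)
        {
            char[] chars = text.ToCharArray();
            Array.Reverse(chars);
            return new string(chars);
        }
    }
}

namespace SecondExtensions
{
    public static class StringExtensions
    {
        // Same signature, very different behaviour
        public static string Backwards(this string text)
        {
            return text;
        }
    }
}

namespace Demo
{
    using FirstExtensions; // change to SecondExtensions: still compiles, different output

    class Test
    {
        static void Main()
        {
            Console.WriteLine("hello".Backwards()); // "olleh" with FirstExtensions
        }
    }
}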

2.2.1: Strong vs weak typing

Most of the definitions given for strong and weak typing involve details of what information is available at compile time, what information is available at execution time, and so on. However, Eric pointed out that a lot of the time the definitions really boil down to "strongly typed" meaning a type system the speaker likes, and "weakly typed" meaning a type system the speaker dislikes.

At that level, the terms become relatively meaningless.

2.2.1: Terminology for what is currently "static"

In the book, I don't go into very much detail about what static really means, but Eric wrote a blog entry on static methods containing this:

Static methods are called "static" because it can always be determined exactly, at compile time, what method will be called. That is, the method can be resolved solely by static analysis of the code.

Apparently this caused a bit of a ruckus - the word "static" isn't terribly well chosen. It does mean what it says, but that's not how people actually think of it. Do you think, "Hmm... I want a method which the compiler can resolve with only static analysis" or do you think, "Hmm... I want a method which relates to the type itself rather than any specific instance of the type"? I know I do the latter.

So, what are the other options? VB uses "Shared" which is closer in some ways - but still misses the boat in my view. Sharing involves something being used by more than one person (or instance in this case). Static members aren't shared between instances - they're present even if there are no instances at all!

Other options (in a totally imaginary language) might be "typewide" or "noninstance". Neither of these appeal, to be honest. Do you have any better ideas?

2.2.1: Static, dynamic and middle grounds

There is a middle ground between "totally static" and "totally dynamic". There's "mostly static, but dynamic where necessary" - which is the route VB has chosen. It's also a route C# might take in the future.

2.2.1: More benefits on static typing

Joe Albahari mentions these additional benefits of static typing:

  • Refactoring a large program is dramatically easier and safer with static typing. For instance, if you change a parameter’s type on an internal or private method, all you need do is rebuild, and the compiler will tell you everywhere that needs updating. With a duck-typed language, you have to instead rely on unit tests which invariably don’t give you 100% coverage (especially with UI code) and so you end up with residual bugs. I programmed for 5 years in a duck-typed object-oriented language, and remember that bugs from refactoring would sometimes show up months later!
  • The IDE can perform certain kinds of refactoring automatically, such as renaming a type or member. Having good identifier names, I think, is essential for long-term maintainability. In a duck-typed language, you almost never dare rename anything, once the project reaches a certain size, for fear of introducing bugs.
  • IntelliSense (as you point out) is a showstopper. Ask most people whether they’d be willing to rescind autocompletion for, well, almost anything, and they’d say no.

    2.3.3: When is it appropriate to write your own struct?

    In section 2.3.3 I talk a bit about the differences between structs and classes, but I don't give much guidance on when to use what.

    In my experience, the "default correct choice" is to write classes. When it comes to immutable types, you could argue there's not a huge amount of difference between using a class and using a struct, beyond natural nullability and memory behaviour - but there's a difference in gut feeling.

    By and large, I only write custom structs to represent some sort of basic quantity. For instance, I can imagine writing a Money struct which encapsulated the amount (as a decimal) and the currency (possibly an enum, or more likely a reference type with a small number of shared instances). I'd be very unlikely to encapsulate a person as a struct, or a collection.

    There's more on this topic in the Microsoft design guidelines.

    One important rule to avoid breaking: structs should almost always be immutable. Mutability in structs can cause horrific bugs which are really hard to track down. It's even worse if the mutability is available through an interface the struct implements. Just say no.
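
    To illustrate the kind of bug I mean, here's a small made-up example. The method call compiles happily, but mutates a throwaway copy:

    using System;

    struct MutablePoint // don't do this!
    {
        public int X;
        public void MoveRight() { X++; }
    }

    class Test
    {
        static MutablePoint point = new MutablePoint();
        static MutablePoint Point { get { return point; } }

        static void Main()
        {
            // Compiles fine, but MoveRight mutates the copy returned
            // by the property, which is then thrown away
            Point.MoveRight();
            Console.WriteLine(Point.X); // Prints 0, not 1
        }
    }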

    2.3.4: Boxing copies values

    One point I didn't mention when discussing boxing was that the process of boxing always copies a value. The newly created box doesn't know about the variable (or other expression) that was used to create it - it just knows about the value. So, for example:

    int x = 5;
    object boxed = x; // A copy of the value 5 is boxed
    x = 10; // This doesn't change the value in the box
    Console.WriteLine(boxed); // So this prints 5

    2.3.4: The small cost of boxing

    Even at the level of "hundreds of thousands" of boxing operations, in many cases performance of an application will barely be impacted. As a quick test, I wrote a program to box and unbox integers 100 million times. It took less than a second to run on my laptop (and scaled linearly as I increased the number of iterations).

    Microbenchmarks are notoriously bad indicators of overall performance for various reasons, but it's worth being guided by the kind of scale involved here. Anything which can be done over a hundred million times per second doesn't need to be avoided too much - and certainly not to the extent of bending your design out of shape without hard evidence.
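
    A microbenchmark along these lines is easy to knock together (as ever, treat the numbers as a rough indication of scale, nothing more):

    using System;
    using System.Diagnostics;

    class BoxingBenchmark
    {
        static void Main()
        {
            const int Iterations = 100000000;
            long total = 0;
            Stopwatch stopwatch = Stopwatch.StartNew();
            for (int i = 0; i < Iterations; i++)
            {
                object boxed = i;       // box
                total += (int) boxed;   // unbox
            }
            stopwatch.Stop();
            // Printing the total stops the loop being optimized away
            Console.WriteLine("{0} ({1}ms)", total, stopwatch.ElapsedMilliseconds);
        }
    }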

    2.4.2: How many anonymous types?

    I mention in section 2.4.2 (and later in the book, of course) that if you use two anonymous type creation expressions in a single assembly, and those expressions have the same property types and names in the same order, then you get a single type.

    That's not quite true, although I doubt if you'd ever notice it. Strictly speaking, this is done on a per-netmodule basis. What's a netmodule? It's like a mini-assembly - you can build an assembly from multiple netmodule files. I'm not aware of it being commonly used, but there we go...

    3.1: More persuasion on the benefits of generics

    Joe Albahari provided a good example of where generics really can save you from bugs. While you might not have any problem remembering that everything in a particular ArrayList is an int (boxed, of course) you may find it relatively tricky to find all the accesses if you later change your mind and decide it should actually be full of decimal values. With generics and List<T>, that refactoring exercise is trivial because of the extra checking the compiler can provide. In the risky world of ArrayList you've got to rely on being careful.

    Joe also gave another example which he finds usually persuades people of the benefits of generics. Which of these lines would you rather use to find the length of the fourth element of a list?

    // When list is an ArrayList
    ((string) list[3]).Length

    // When list is a List<string>
    list[3].Length

    3.2.1: Modifying the results of an indexer

    In listing 3.1, I include the following line of code:

    frequencies[word]++;

    I mention that this may seem odd at first. There are situations where similar code could produce incorrect results - in particular, if you try to change a field or property on a value type which is returned from a dictionary, the compiler will complain. For example (C# 3 code just to be more concise):

    using System.Collections.Generic;

    struct MutableStruct
    {
        public int value;
    }

    class Test
    {
        static void Main()
        {
            var map = new Dictionary<string,MutableStruct>();
            
            map["hello"] = new MutableStruct();
            map["hello"].value++;
        }
    }

    This gives a compiler error of:

    Test.cs(15,9): error CS1612: Cannot modify the return value of 'System.Collections.Generic.Dictionary<string,MutableStruct>.this[string]' because it is not a variable

    Of course, I'm sure you wouldn't use a mutable struct anyway, would you?

    3.2.1: Regex vs String.Split

    People who know my views on regular expressions and their misuse might be surprised to see me use one in listing 3.1. Surely String.Split is a great tool for splitting text into words? Well, yes and no. In particular, "no" when you can have several non-word characters in a row, and several different non-word characters - which is the case in the example. It's all feasible, of course, but in this case the regex was simpler.
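
    A quick sketch of the difference, with made-up input:

    using System;
    using System.Text.RegularExpressions;

    class Test
    {
        static void Main()
        {
            string text = "first, second...  third";

            // One entry per word: a run of non-word characters
            // counts as a single separator
            string[] withRegex = Regex.Split(text, @"\W+");

            // An empty entry for every consecutive pair of separators,
            // unless you remember StringSplitOptions.RemoveEmptyEntries
            string[] withSplit = text.Split(' ', ',', '.');

            Console.WriteLine(withRegex.Length); // 3
            Console.WriteLine(withSplit.Length); // 8
        }
    }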

    3.2.3: List.ConvertAll - badly named?

    In listing 3.2, I use List.ConvertAll to "convert" each element of a list of integers into its square root (as a double). Is the word "convert" really appropriate here? I guess it depends on what you understand by "conversion". It's not really a different representation of the same value, which is what conversion often means (consider currency conversion, numeric conversions etc). It's really a projection, mapping or transformation.
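
    The call in question is along these lines - and "projection" certainly describes it better than "conversion":

    List<int> integers = new List<int>();
    integers.Add(1);
    integers.Add(4);
    integers.Add(9);

    // "Converts" each element to its square root - really a projection
    List<double> roots = integers.ConvertAll<double>(
        delegate(int x) { return Math.Sqrt(x); }
    );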

    3.2.3: MakeList revisited

    The MakeList method of listing 3.3 accomplishes its aim of helping to teach about generic methods, type inference etc. However, for real-life use you might want to consider this alternative:

    static List<T> MakeList<T> (params T[] elements)
    {
        return new List<T>(elements);
    }

    Despite being so short, this is still useful - because you can use type inference to avoid actually having to specify T in the calling code. This can be very useful in some cases. An alternative might help you to work round the lack of generic covariance:

    static List<TResult> MakeList<TSource,TResult> (params TSource[] elements)
        where TSource : TResult
    {
        List<TResult> ret = new List<TResult>(elements.Length);
        foreach (TResult t in elements)
        {
            ret.Add(t);
        }
        return ret;
    }

    Of course you then need to specify the types - but through the wonders of overloading by the number of type parameters, you can actually have both methods at the same time. That said, the ToList and Cast extension methods in LINQ to Objects make this somewhat less important if you're using .NET 3.5...
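
    Usage of the two overloads side by side, just to show the inference at work:

    // Type inference works out that T is string: no type arguments needed
    List<string> strings = MakeList("a", "b", "c");

    // Working around the lack of covariance: both type arguments have to
    // be specified, but the result is a List<object> built from strings
    List<object> objects = MakeList<string, object>("a", "b", "c");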

    3.2.3: Consider ArrayList and Hashtable to be deprecated

    Is your code still peppered with uses of ArrayList and Hashtable? I suspect a great deal of production code in the wild still uses these non-generic collection types despite "new" code in the same codebase using List<T> and its friends.

    It's worth considering the non-generic collections to be effectively deprecated. Indeed, Silverlight 2.0 won't ship with them - so if you're planning to have any common code, you'll need to move over to the generic versions.

    3.3.1: Derivation type constraints and implicit reference conversions

    The footnote regarding implicit reference conversions is a good example of why it was so great to have Eric Lippert reviewing the book. You see, I originally had this (broken) example in mind:

    // Broken code - do not use
    using System;
    using System.IO;

    public class FakeStream
    {
        public static implicit operator Stream(FakeStream original)
        {
            return new MemoryStream();
        }
    }

    class Test
    {
        static void ShowLength<T>(T input) where T : Stream
        {
            Console.WriteLine (typeof(T));
            Console.WriteLine (input.Length);
        }
        
        static void Main()
        {
            FakeStream x = new FakeStream();
            // This line is okay, but it's not an implicit reference conversion
            Stream s = x;
            // Compilation error!
            ShowLength(x);
        }
    }

    Now, I thought this would compile, because there's an implicit conversion from an expression of type FakeStream to Stream as shown by the line with s in it. The input of that conversion is a reference and the output is a reference.

    That doesn't mean it's an implicit reference conversion though. The terminology is slightly confusing here - but basically an implicit reference conversion is one where the CLR can just use the original pointer (reference) itself as a pointer (reference) to an instance of the target type. So, this works for things like interfaces and arrays. It doesn't work when the conversion returns a completely different reference, as user-defined conversions generally do.

    3.3.1: Derivation type constraints: specification terminology

    One part of the C# 3 language spec (10.1.5) talks about derivation type constraints being fulfilled by the type argument "deriving from" the constraint - and I've quoted that in the book. However, this is a fairly crude way of expressing it, and is likely to change in future versions. The actual meaning won't change (unless it's expanded to be less restrictive somehow) but the terminology may well be improved.

    3.3.1: Naked constraints!

    For some reason, the MSDN documentation calls a type parameter constraint (e.g. where T : U) a naked constraint. Neither Eric nor I have any idea why this is so. I didn't realise it wasn't the official terminology until Eric pointed it out to me.

    3.3.2: Simple spec != simple implementation

    When discussing type inference, I mentioned that the rules in C# 2 are fairly simple. Eric pointed out that what may seem simple in a spec certainly needn't be simple in terms of a compiler implementation - especially when one considers the compiler to be part of an IDE which needs to provide Intellisense, work with incomplete code etc.

    In a way, this is similar to writing an XML editor. If you start with an XML parser which correctly spits out errors on invalid XML documents, you've got an awful lot of work ahead of you. Much of the time spent in an XML editor is with an invalid document, as you're in the middle of changing it: the editor needs to understand that and respond accordingly.

    Having said all this, the type inference rules in C# 2 really are fairly straightforward from the point of view of a C# developer. The same can't be said of the C# 3 rules! None of this takes away from the great job the C# team have done in terms of implementation.

    3.3.2: Activator.CreateInstance<T>()

    Joe Albahari pointed out that a good example of a method where you always have to specify the type parameter is Activator.CreateInstance<T>(). With no "normal" parameters, there's nothing for the type inference rules to work with. This is interesting on two counts:

    3.3.3: default(...) is an operator

    As I point out in the book, default is never mentioned as an operator in the language specification. That doesn't mean it isn't an operator though. I didn't want to be too specific in the book, but I'm willing to trust Eric's judgement on this: it really is an operator.
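
    In use it looks like this:

    static T GetDefault<T>()
    {
        // null for reference types, "all zeroes" for value types
        return default(T);
    }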

    3.3.3: Pair - class or struct?

    The Pair type I included as listing 3.6 is implemented as a class. This made a lot of sense in early drafts where it was mutable - but it's a lot more sensible as an immutable type. For one thing, that means that the hashcode and equality operations will stay stable, so long as the values within the pair aren't mutable themselves.

    At this point, however, you have to wonder - is there much point in it being a reference type? Why not make it a value type (a struct)? That way a pair of bytes is really cheap, for example.

    I tend to default to writing classes rather than structs, but I'm beginning to think that generic types which end up just wrapping other values (based on type parameters) could often be better represented as structs. This is in line with various examples in the framework such as KeyValuePair<TKey, TValue>.

    3.3.3: XOR for hashcodes

    The hashcode algorithm used in the Pair class is based on repeatedly multiplying and adding. This is my "default implementation" when I need to create a hashcode. Another common implementation is XORing the hashcodes of the constituent fields - an algorithm I denounce in the book, but without details.

    It's not uncommon for two or more fields to be of the same type - in the case of our sample type, Pair<int,int> or Pair<string,string> for example. Let's take the latter type, and consider the following pairs:

    ("a", "a")
    ("b", "b")
    ("c", "c")
    ("a", "b")
    ("b", "a")

    Again, this is far from uncommon - often the same values crop up together. We'd like all of the above pairs to have different hashcodes, to avoid hash collisions - but there are only two distinct values in the above list if we use XORing. The first three would all be 0, and the last two would be the same as each other. A pair of the form (x, x) will always have a hash of 0, and a pair of the form (x, y) will always have the same hash as (y, x).

    In some cases you may know details of the data you're hashing, and be able to use XOR knowing that the above cases won't come up - but for a general purpose algorithm (particularly when dealing with generic types) it's a bad idea.
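
    For comparison, a multiply-and-add implementation along the lines I normally use (assuming fields called first and second; the seed and multiplier are a matter of taste) makes the two fields contribute asymmetrically, so neither of the collision patterns above applies:

    public override int GetHashCode()
    {
        int hash = 17;
        hash = hash * 31 + (first == null ? 0 : first.GetHashCode());
        hash = hash * 31 + (second == null ? 0 : second.GetHashCode());
        return hash;
    }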

    An early implementation of anonymous types used XOR hashing. Fortunately this was fixed long before release!

    3.4.3: Terminology: one interface "extending" another

    Eric picked up on the fact that I talk about IEnumerable<T> extending IEnumerable. It's the correct language in terms of the specification, but Eric's not a big fan of the term. He regards extension as being about reuse of implementation, whereas specifying a "base" interface is about saying that "any object fulfilling this contract must also fulfill this other interface".

    Me? I like the term just fine. I don't see any necessary connection between extension and implementation. If someone's adding more provisions to a contract of any description, I'm happy to call that an extension. My guess is that Eric is right in terms of a very computer science specific use of the word extension, but I think the implication to common developers (like me) is clear.

    It's rare for me to not be a stickler for appropriate terminology, but I guess in this case I just haven't been exposed to the strictly correct usage enough to worry too much.

    3.5.1: Not quite a Sieve of Eratosthenes

    The method for finding primes in listing 3.13 isn't quite the normal Sieve of Eratosthenes, because it removes multiples of numbers we already know to be non-prime. There's no simple way to change the listing while keeping the point of it (i.e. the use of RemoveAll), which is why I kept it as it is.

    We could check the list of candidates in the second for loop, to see if factor is present - but of course this ends up being reasonably expensive too if performed in the simplest manner. A binary search would be quicker, but then we're getting significantly more complicated.
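
    For reference, the heart of that approach is a RemoveAll call shaped something like this (a sketch rather than the listing itself):

    List<int> candidates = new List<int>();
    for (int i = 2; i <= 100; i++)
    {
        candidates.Add(i);
    }
    for (int factor = 2; factor * factor <= 100; factor++)
    {
        // Runs even when factor is itself composite (e.g. 4),
        // which is what makes this not quite a true sieve
        candidates.RemoveAll(delegate(int x)
            { return x > factor && x % factor == 0; });
    }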

    3.5.2: ICloneable

    ICloneable is effectively a useless interface, because it doesn't indicate whether the copy should be deep or shallow. Even if the interface were separated into two, depending on whether or not you wanted the copy to be deep, it still wouldn't be terribly clear. Just how deep should a deep copy be? Just how shallow should a shallow copy be?

    In many ways, the difficulties in expressing copy depth are similar to the difficulties in expressing const semantics.

    3.5.4: The naming of SortedList/SortedDictionary

    Joe Albahari points out that SortedList is named that way because internally, that's exactly what it is - a list, sorted by key. SortedDictionary on the other hand, is basically a red/black tree. Joe suggests that calling it BinaryTree might have been a bit more sensible. Personally I still think that names which only indicate the implementation rather than the interface aren't terribly useful - which isn't to suggest I have an immediately better suggestion...

    3.6.1: Lions and tigers and bears, oh my!

    The types used in the description of covariance were originally Base and Derived. Eric rightly pointed out that these take more thought to understand than concrete examples. He tends to use Animal, Giraffe and Turtle - an idea which I copied, partly as a small homage. Unfortunately I had to turn Giraffe into Cat for the sake of getting everything onto the page appropriately - but then again, it's not every day one turns giraffes into cats! Mads Torgersen apparently uses Fruit, Apple and Banana.

    Feel free to consider using Person, LanguageGeek and SaneDeveloper if you wish...

    3.6.2: Complex<Complex<double>>

    Both Marc Gravell and Eric felt compelled to explain that a Complex<Complex<double>> would be something like a quaternion.

    I haven't looked it up, and frankly I'm afraid to do so. I gave up brain-bending maths when I left university. Higher-order functions tax my poor brain hard enough these days...

    3.6.2: A new solution to generic operators

    The fabulously talented Marc Gravell came up with a nifty way of using operators in a generic manner via delegates and expression trees, in .NET 3.5. It's now part of MiscUtil. There's also a general explanation and a usage page for them.

    3.6.2: A generic Complex type

    Thanks to Marc Gravell's work on generic operators the idea of a Complex<T> type is no longer the realm of fantasy. It's still a lot of work, however, which is why it's so lovely that Marc's gone to the trouble of implementing it for us... He stresses that this hasn't been unit tested yet, but here it is in all its glory. (It relies on MiscUtil for generic operator support, of course.)

    using System;
    using System.Collections.Generic;
    using MiscUtil;

    struct Complex<T> : IEquatable<Complex<T>> 
    {
        private readonly T real, imaginary;
        
        public Complex(T real, T imaginary) 
        {
            this.real = real;
            this.imaginary = imaginary;
        }
        
        public T RelativeLength() 
        {
            return SquareLength(ref this);
        }
        
        public T Real { get { return real; } }
        
        public T Imaginary { get { return imaginary; } }

        public override string ToString() 
        {
            return string.Format("({0}, {1}i)", real, imaginary);
        }
        
        public override bool Equals(object obj) 
        {
            if (obj != null && obj is Complex<T>) 
            {
                Complex<T> other = (Complex<T>)obj;
                return Equals(ref this, ref other);
            }
            return base.Equals(obj);
        }
        
        public bool Equals(Complex<T> other) 
        {
            return Equals(this, other);
        }
        
        private static bool Equals(ref Complex<T> x, ref Complex<T> y) 
        {
            return EqualityComparer<T>.Default.Equals(x.Real, y.Real)
                && EqualityComparer<T>.Default.Equals(x.Imaginary, y.Imaginary);
        }
        
        public static Complex<T> operator +(Complex<T> x, Complex<T> y) 
        {
            return new Complex<T>(
                Operator.Add(x.Real, y.Real),
                Operator.Add(x.Imaginary, y.Imaginary)
            );
        }
        
        public static Complex<T> operator +(Complex<T> x, T y) 
        {
            return new Complex<T>(
                Operator.Add(x.Real, y),
                x.Imaginary
            );
        }
        
        public static Complex<T> operator -(Complex<T> x, Complex<T> y) 
        {
            return new Complex<T>(
                Operator.Subtract(x.Real, y.Real),
                Operator.Subtract(x.Imaginary, y.Imaginary)
            );
        }
        
        public static Complex<T> operator -(Complex<T> x, T y) 
        {
            return new Complex<T>(
                Operator.Subtract(x.Real, y),
                x.Imaginary
            );
        }
        
        public static Complex<T> operator -(Complex<T> x) 
        {
            return new Complex<T>(
                Operator.Negate(x.Real),
                Operator.Negate(x.Imaginary)
            );
        }
        
        public static bool operator ==(Complex<T> x, Complex<T> y) 
        {
            return Equals(ref x, ref y);
        }
        
        public static bool operator !=(Complex<T> x, Complex<T> y) 
        {
            return !Equals(ref x, ref y);
        }
        
        public static Complex<T> operator *(Complex<T> x, Complex<T> y) 
        {
            return new Complex<T>(
                Operator.Subtract(
                    Operator.Multiply(x.Real, y.Real),
                    Operator.Multiply(x.Imaginary, y.Imaginary)
                ), Operator.Add(
                    Operator.Multiply(x.Real, y.Imaginary),
                    Operator.Multiply(x.Imaginary, y.Real)
                )
            );
        }
        
        public static Complex<T> operator *(Complex<T> x, T y) 
        {
            return new Complex<T>(
                Operator.Multiply(x.Real, y),
                Operator.Multiply(x.Imaginary, y)
            );
        }
        
        public static Complex<T> operator *(Complex<T> x, int y) 
        {
            return new Complex<T>(
                Operator.MultiplyAlternative(x.Real, y),
                Operator.MultiplyAlternative(x.Imaginary, y)
            );
        }
        
        private static T SquareLength(ref Complex<T> value) 
        {
            return Operator.Add(
                Operator.Multiply(value.Real, value.Real),
                Operator.Multiply(value.Imaginary, value.Imaginary)
            );
        }
        
        public static Complex<T> operator /(Complex<T> x, Complex<T> y) 
        {
            T divisor = SquareLength(ref y),
              real = Operator.Divide(
                    Operator.Add(
                        Operator.Multiply(x.Real, y.Real),
                        Operator.Multiply(x.Imaginary, y.Imaginary)
                    ), divisor),
              imaginary = Operator.Divide(
                    Operator.Subtract(
                        Operator.Multiply(x.Imaginary, y.Real),
                        Operator.Multiply(x.Real, y.Imaginary)
                    ), divisor);
            return new Complex<T>(real, imaginary);
        }
        
        public static Complex<T> operator /(Complex<T> x, T y) 
        {
            return new Complex<T>(
                Operator.Divide(x.Real, y),
                Operator.Divide(x.Imaginary, y)
            );
        }
        
        public static Complex<T> operator /(Complex<T> x, int y) 
        {
            return new Complex<T>(
                Operator.DivideInt32(x.Real, y),
                Operator.DivideInt32(x.Imaginary, y)
            );
        }
        
        public static implicit operator Complex<T>(T real) 
        {
            return new Complex<T>(real, Operator<T>.Zero);
        }

        public override int GetHashCode() 
        {
            return (Real == null ? 0 : 17 * Real.GetHashCode())
                 + (Imaginary == null ? 0 : Imaginary.GetHashCode());
        }
    }

    3.6.5: Java's interesting approach to generics

    Java's approach to generics is interesting, partly because the variance which is available is at the call site, not the declaration. Neither Eric nor I know of languages which take a similar approach.

    In some ways, Java's support for generics is similar to the support for checked exceptions: the exceptions are only checked in the compiler, not in the runtime. It would be perfectly possible to write a pseudo-Java compiler which didn't care about checked exceptions.

    4.1.2: Magic value traps

    Joe Albahari pointed out that another problem with magic values is that it's easy to forget to check for them:

    There's nothing to stop you from accidentally treating it like a real value, and performing arithmetic on it, causing an error potentially way down the line. If you make the same mistake with a nullable type, you'll get an exception thrown right away. This is probably the most convincing argument for the usefulness of nullable types over magic values.
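
    In code terms, with DateTime.MinValue playing the magic "no value" role:

    // Magic value: nothing stops you doing arithmetic with it
    DateTime lastLogin = DateTime.MinValue; // means "never logged in"
    TimeSpan idle = DateTime.Now - lastLogin; // nonsense result, no exception

    // Nullable: the same mistake blows up immediately
    DateTime? lastLogin2 = null;
    TimeSpan idle2 = DateTime.Now - lastLogin2.Value; // InvalidOperationException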

    4.2.1: Mutable value types in the framework

    Joe Albahari points out that there are a few mutable value types in the framework - in the System.Drawing namespace, Point, Size and Rectangle are all mutable. The same choices have been made in WPF, too.

    Use with care.

    4.2.3: Equality of nulls

    As the book mentions, all the various operators are language-specific. Equality is a particularly good example. In VB, x=y (as an equality comparison, not an assignment) has a nullable result when x or y are nullable. If both sides are null values, then the result is null.

    The language designers decided that was one step too far for C#. Which is the best approach? It's very hard to say. I'm sure that in some scenarios the C# way is clearer, and in others the VB way is clearer.

    It pains me to cede the point, but in this particular case I think that VB has more purity and integrity than C#. Just don't remind me about it too often :)
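
    A quick illustration of the C# choice:

    int? x = null;
    int? y = null;

    // In C#, == on nullable values yields a plain bool:
    // two null values are considered equal
    Console.WriteLine(x == y); // True

    // In VB, the equivalent comparison (x = y) yields a nullable
    // Boolean, which would be Nothing here rather than True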

    4.3.1: Non-nullable reference types

    Just for kicks, imagine a world where all types were non-nullable by default. Sure, the null reference is available when required, but suppose it were prohibited unless you'd explicitly said you'd want to allow it. Any attempt to return a value which might be null could be prohibited in a method declared to return just string, say - whereas if the method were declared to return string? it would be okay.

    An alternative which is still possible would be to use ! as a "non-null reference type" modifier. So a method with a signature of void Foo(string! x) would automatically enforce that x wasn't null, etc.

    Is it worth it at this point? Probably not - it's one of those ideas which is only really workable if it's present right at the start. There's too much code "out there" now.

    4.3.2: Happy birthday, Donald Knuth

    As it happened, Eric reviewed listing 4.4 on January 10th, 2008. He left a note in the margin wishing Donald Knuth a happy 70th birthday. I don't know if a card was sent, however.

    4.3.3: Lifted or not?

    It's unfortunate, but the C# specification is inconsistent in its use of the word lifted - at least at the moment.

    Eric has blogged about this very issue - and I'm proud to reveal that I'm the reader mentioned in the post :)

    4.3.3: Unexpected conversion

    Ross Bradbury posted an interesting question on the C# in Depth Forum. It was sufficiently unexpected that we reckoned it was worth a note.

    He had code which effectively looked like this (I've pared it down a little):

    using System;

    class Test
    {
        public static implicit operator Test(int i)
        {
            Console.WriteLine("Converted from "+i);
            return new Test();
        }
        
        static void Main()
        {
            int a = 10;
            Test normalConversion = a;
            
            int? b = 20;
            Test unexpectedConversion = b;
            
            int? c = null;
            Test evenOdder = c;
        }
    }

    The first conversion (normalConversion = a) is entirely normal - int to Test, using the implicit conversion. This prints 10.

    The second conversion (unexpectedConversion = b) is one which I didn't expect to be legal - after all, we have an int? rather than an int, and there's no implicit conversion from int? to int. However, it compiles and runs, and prints 20.

    What would it print if we were to convert from a null value? Well, that's what the third conversion (evenOdder = c) checks. Again, it compiles and executes without an exception - but nothing gets printed. The result is a null reference.

    It turns out that the C# compiler is handling this as a lifted conversion - it checks whether the source has a value, and returns a null value if not. If there is a value, it performs the appropriate conversion.

    This is a compiler bug (confirmed by Eric). It's not behaving as per the spec (which disallows it) but the behaviour is unlikely to change now as it would break existing code.

    5.0 (Introduction): Foresight or luck?

    Eric admits in comments that the language designers perhaps weren't quite as foresighted as I gave them credit for. However, as he puts it:

    It was foresighted in the sense that the designers knew that if they added generics, iterator and anonymous functions, then that would open up vast new areas for extension of the language. What exactly those areas were going to look like, no one knew.

    Either way, the limited improvements to delegates in C# 2 certainly act as a welcome stepping stone before the full-on functional emphasis of C# 3.

    5.4.2: Predicate<T> in LINQ? Not so much...

    In section 5.4.2, I wrote:

    The Predicate<T> delegate type we've used so far isn't used very widely in .NET 2.0, but it becomes very important in .NET 3.5 where it's a key part of LINQ.

    It's possible that this was true at the time I originally wrote chapter 5 (significantly before .NET 3.5 was released) but it certainly isn't true now. LINQ tends to use Func<TSource,bool> for its predicates (where TSource is the element type of the sequence involved). The two delegates are equivalent, of course - they have the same signature - but it's still not strictly speaking a use of Predicate<T>. Ah well.
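
    To make the distinction concrete (assuming a List&lt;string&gt; called words, and a using directive for System.Linq):

    // LINQ: Where takes a Func<string, bool>
    IEnumerable<string> longWords = words.Where(word => word.Length > 5);

    // .NET 2.0 style: List<T>.FindAll really does take a Predicate<T>
    List<string> alsoLong = words.FindAll(word => word.Length > 5);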

    5.5.1: Was Scheme the first language to support closures?

    Scheme was the first language to require closure semantics. It's possible that there were earlier Lisp dialects which implemented them to a greater or lesser extent - so perhaps my statement at the start of this section is overly bold. If you know of a language earlier than Scheme with closures in, please let me know so I can change this note accordingly...

    5.5.4: Stacks, heaps, and caring too much

    Remember the section in chapter 2 where I mention that in some ways managed developers shouldn't care about whether things are placed on the stack or the heap? Well, you've got Eric to thank for that. I've just always cared by default.

    This section about closures shows a good reason why it's sometimes not worth caring - you end up being less surprised when things move around unexpectedly. Unsurprisingly, Eric puts it best:

    The whole point of managed memory is that every object lives at least as long as it needs to. The idea that "local stuff vanishes" has nothing to do with whether the implementation is "on the stack" or not – stacks are a means to an end, not an end in themselves.

    It had never occurred to me to think in such a liberating, non-implementation-specific way before writing this book. Since reading Eric's comments, I've been coming up with all kinds of bizarre ideas, many of which are completely unworkable - but it's a valuable experience nonetheless.

    5.5.7: Accidental capture of expensive resources

    There's a note of caution I originally wanted to include in the chapter, but it ended up making the whole thing too long - and frankly too negative. In the current implementation of captured variables, any variable which is captured by any anonymous method is captured by all anonymous methods which capture a variable from the same scope. This can - in very rare scenarios - mean that something isn't eligible for garbage collection for far longer than anticipated.

    Rather than go over the details, I'll redirect you to Eric's blog post on the topic. Note that this behaviour certainly isn't mandated, so it's possible that it may be fixed in a future version of the compiler - but it's unlikely to affect very many people.
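
    A sketch of the trap (the names are made up, but the mechanism is the one described above):

    static Action CreateCounter()
    {
        byte[] hugeArray = new byte[100000000];
        int count = 0;

        // Both anonymous methods capture variables from this scope, so the
        // current compiler stores hugeArray and count in a single generated
        // class shared by both delegates...
        Action dumpFirstByte = delegate { Console.WriteLine(hugeArray[0]); };
        Action increment = delegate { count++; };
        dumpFirstByte();

        // ...which means returning just the counter keeps hugeArray
        // alive for as long as the returned delegate is reachable
        return increment;
    }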

    6.2.2: No code is executed until MoveNext is called

    I already point this out in the book, but when you call a method or property implemented with an iterator block, none of the code in the body is executed. This can be very confusing - particularly in unit tests!

    One reason for including this note when it's already mentioned in the book is that it's a fairly frequently asked question. The other is that Eric has a great blog post about it.
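
    The standard demonstration only takes a few lines:

    using System;
    using System.Collections.Generic;

    class Test
    {
        static IEnumerable<int> GetNumbers()
        {
            Console.WriteLine("In the method body"); // not executed at call time!
            yield return 1;
        }

        static void Main()
        {
            IEnumerable<int> numbers = GetNumbers(); // prints nothing at all

            // "In the method body" only appears now, on the first MoveNext call
            foreach (int number in numbers)
            {
                Console.WriteLine(number);
            }
        }
    }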

    6.2.3: Finally blocks in iterators

    It's worth being careful about what you put in a finally block in iterators, precisely because (as shown in the book) the finally may not be executed.

    This will usually only be the case with a malicious caller - let's face it, everyone else is bound to be using foreach - which gives a hint as to the kind of code to include and the kind of code to avoid.

    If you use the finally block for resource cleanup - perhaps implicitly, with a using statement - that's probably okay. If the caller doesn't call Dispose, it's no worse than them failing to call Dispose on the embedded resource.

    If, however, you were to use the finally block for security purposes - for instance, to relinquish some extra rights gained earlier on - then that would be very dangerous.

    Likewise it's probably a bad idea to use lock across a yield return. It's okay to acquire and release a lock in the course of the iterator block without yielding - during that time it'll be running just like a normal method - but if you return a value while holding the lock, the lock will still be held until execution comes back into the iterator block, which may be never!
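
    In code form, the dangerous pattern looks like this (padlock being some shared lock object):

    static readonly object padlock = new object();

    static IEnumerable<int> GetValues()
    {
        lock (padlock)
        {
            // The lock is still held while control is with the caller,
            // between MoveNext calls - potentially forever, if the caller
            // abandons iteration without disposing of the iterator
            yield return 1;
            yield return 2;
        }
    }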

    6.2.3: Why would you use a finally block in an iterator?

    In the book, I explain how finally blocks in iterator blocks work, but I don't actually give a real life case of why they're useful. Fortunately, there's a really nice example which is easy to understand. How many times have you written code like this?

    using (TextReader reader = File.OpenText(filename))
    {
        string line;
        while ( (line=reader.ReadLine()) != null)
        {
            // Do something with the line
        }
    }

    I've certainly processed lots of files a line at a time. Wouldn't it be nice to just be able to foreach over the file instead? Enter LineReader...

    using System.Collections;
    using System.Collections.Generic;
    using System.IO;

    public class LineReader : IEnumerable<string>
    {
        string filename;

        public LineReader(string filename)
        {
            this.filename = filename;
        }

        public IEnumerator<string> GetEnumerator()
        {
            using (TextReader reader = File.OpenText(filename))
            {
                string line;
                while ( (line=reader.ReadLine()) != null)
                {
                    yield return line;
                }
            }
        }
       
        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

    The using statement is just syntactic sugar for a finally block - so as long as you dispose of the iterator, the file will be closed too. Note how the file isn't even opened until the first call to MoveNext. Now you can write code like this:

    foreach (string line in new LineReader("file.txt"))
    {
        // Do something with the line
    }

    Possibly more importantly, you can also use this within LINQ to Objects, processing the sequence of lines in a file just like any other sequence of data items. There's a more fully-featured implementation of LineReader in my Miscellaneous Utility library.

    6.2.4: Why the <> in the names?

    In the book, I mention that the types generated by iterator blocks always begin with <> and point out that this isn't indicating a generic type parameter. There's slightly more to it than that. Although the names are not indicating a generic type, it's no coincidence that they contain symbols used for generics. The name is deliberately constructed so it can't possibly be explicitly referenced by valid C# code.

    The name also isn't CLS-compliant - which is fine, because the type itself is never public, and indeed never should be public. If ever a compiler accidentally made the types generated by iterator blocks public, any assembly claiming to be CLS-compliant would no longer compile, highlighting the compiler bug.

    6.3.2: Differences in design in MiscUtil

    After a couple of iterations of design, the Range class in MiscUtil no longer supports iteration over itself. Instead, you create a RangeIterator which has the concept of a range, which end to start at, and the step to take (implemented with a delegate).

    A Range itself then only has a start, an end, a Comparer, and inclusive/exclusive flags for each end. It's a mathematical interval (which can be closed, open, or half-open at either end). Separating the concerns of iteration and the range itself is much more satisfying in the long run, and also makes immutability somewhat easier to achieve.

    6.3.3: The bug (fixed before printing) in listing 6.7

    There was a subtle bug in my original code for listing 6.7. I think it's instructive to have a look at it, just so you know to avoid it yourselves. (At the same time, it allows me to thank Eric for finding it. I suspect relatively few reviewers would have done so, even though it's nothing to do with knowing C# inside out.)

    Here's the correct code:

    public IEnumerator<T> GetEnumerator()
    {
        T value = start;
        while (value.CompareTo(end) < 0)
        {
            yield return value;
            value = GetNextValue(value);
        }
        if (value.CompareTo(end) == 0)
        {
            yield return value;
        }
    }

    And here's the broken code:

    // Broken! Do not use!
    public IEnumerator<T> GetEnumerator()
    {
        T value = start;
        while (value.CompareTo(end) <= 0)
        {
            yield return value;
            value = GetNextValue(value);
        }
    }

    So what's wrong with the broken code? Well, consider a Range<byte> with a lower bound of 0, an upper bound of 255, stepping up by 1 each time. Clearly that should produce 0, 1, 2 ... 254, 255 and then stop. But how would it stop in the broken code? It can't - once you add 1 to 255 (as a byte) you start back at 0, so no comparison with 255 is ever going to yield a positive value.

    The working code finishes when the value just returned reaches or exceeds the end point. In fact, there's still a bug here. Although we carefully stop if we reach the end point (carefully yielding it due to the range being inclusive) we could still overflow. Take the same example as before, but stepping up by 2 bytes each time. We'd go straight from 254 to 0, creating the same problem as before.

    The most appropriate solution may be to change GetNextValue to indicate when it's noticed overflow (or any other reason not to return a value). For instance, we could change the signature to:

    bool GetNextValue(ref T value)

    The iteration code could then become:

    public IEnumerator<T> GetEnumerator()
    {
        bool hasMore;
        T value = start;
        do
        {
            yield return value;
            hasMore = GetNextValue(ref value);
        }
        while (hasMore && value.CompareTo(end) <= 0);
    }

    I haven't yet implemented this in MiscUtil - it's relatively tricky, compared with the previous simple stepping - but I hope to do so at some point.

    The moral of the story is to never claim that something is "quite straightforward" - it will do its best to prove you wrong!

    6.4: Coroutines and continuation passing style

    In my initial manuscript, I suggested that the language design team might not have considered iterator blocks being used for the continuation passing style coding that the Coordination and Concurrency Runtime supports. Fortunately, Eric corrected me on this and provided more information:

    Though I'm sure the design team did not anticipate that this would be used by robots, the fact that iterators are a form of coroutines was well understood by the design team.

    And yes, coroutines are handy for implementing CPS in languages which do not support it natively, though they are not strictly necessary. Really all you need are first-class functions.

    Since iterators provide closure semantics and are objects you can pass around, they act like first class functions. That they also act like coroutines is a nice bonus.

    See my blog for a discussion of how to implement a simple CPS system in Jscript by using first-class functions.

    Really, is there anything he hasn't written a blog post about?

    n/a - never mentioned!: Easy-to-write-but-inefficient recursive iterators

    Suppose you have a tree structure you wish to iterate (either breadth or depth first). It's very easy to write an iterator which is O(n^2) - i.e. somewhat inefficient. By recursing and creating a new iterator each time, you end up creating a lot of iterators unnecessarily.

    The solutions to this issue end up with less readable but more efficient code. It's up to you which path you take, of course, based on your context.

    Wes Dyer and Eric have both blogged about this issue. See their posts for far more details.
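
    For a concrete picture, here's the easy-to-write version for a simple made-up tree type. Each level of recursion creates a new iterator, and every item yielded at depth d has to bubble up through d of them:

    class Node<T>
    {
        T data;
        List<Node<T>> children = new List<Node<T>>();

        public Node(T data)
        {
            this.data = data;
        }

        public IEnumerable<T> DepthFirst()
        {
            yield return data;
            foreach (Node<T> child in children)
            {
                // A brand new iterator per child, per level
                foreach (T item in child.DepthFirst())
                {
                    yield return item;
                }
            }
        }
    }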

    7.1.3: Partial method corner cases

    Partial methods sound reasonably straightforward, don't they? Well, consider an expression tree which is built from a lambda expression which calls a partial method. If the partial method is removed, the lambda expression can't be converted into an expression tree any more, as there's no such thing as an expression tree which does absolutely nothing.

    I'm so glad I don't write compilers.

    7.2: Things the compiler won't spot

    The C# team missed a few potential restrictions on static classes - things you can do but shouldn't really be able to. I suspect very few people have ever noticed this, but you can find the default value of a static class: default(MyStaticClass).

    The compiler spots if you try to use this for type inference in the cases I've tried, but it's possible that you can trick it if you work hard enough.

    7.2: Making a class static shows your intention

    When describing static classes, I mostly stayed at the mechanical level of what they do, rather than why you'd want them to work that way. I did explain that it helps to keep the type from being misused, but Joe Albahari expresses an important reason much more elegantly:

    I think static classes are important for clarity because [making a class static] states intention.

    Exactly. Whenever there's a simple way of giving your reader a massive clue as to how you've designed your class to be used, that's a really good thing.

    7.3: Choices of defaults

    Eric's comment with respect to the choice of "the most private access available" as the default in C#:

    This is in contrast to, say Jscript .NET, where the defaults are to make as many programs as possible work, so everything is public unless specified otherwise. The language goals of Jscript are different than those of C#.

    7.3: Private - explicit or implicit?

    I've already mentioned that Eric and I disagree over whether or not to include the private modifier when it's not needed. To refresh your memory, my point of view is that by omitting the modifier when something is as private as it can be, you draw attention to other members which are less private. Here's Eric's rebuttal:

    By calling it out explicitly every time, I let the maintenance programmer who may not have the rules memorized know up front what the visibility is.

    Also, making it explicit says "I made this private by design". If it is missing then you cannot tell whether it was private by design or by accident.

    I'm not hugely worried by the first point, but the second is quite convincing. I'm still thinking it over, but I could end up changing my mind :)

    7.5.1: Octal literals, warning pragmas, and \x escapes

    Eric doesn't like lines such as #pragma warning disable 0169 - they provoke comments like this:

    C# does not have octal literals, thank goodness, but this still gives me cognitive dissonance to look at. It reminds me too much of the painfully broken rules in Jscript for dealing with literals in this form...

    I completely agree that it's a good thing that C# doesn't have octal literals. However, it does have the awful \x escape sequences for Unicode characters. Think fast - what's the difference between "I say:\x8Good compiler" and "You say:\x8Bad compiler"? Does one include "Good compiler" and the other include "Bad compiler" in its output? No - because while "G" isn't a valid hex digit, all the characters in "Bad" are. The two literals are the same as "I say:\u0008Good compiler" and "You say:\u8BAD compiler" - but here the difference is much more obvious.

    8.1: Automatic properties - easy to implement (which doesn't mean "cheap")

    I'll let Eric speak for himself, in response to the statement that "Our first feature [automatic properties] is probably the simplest in the whole of C# 3":

    It was by far the easiest to implement. Peter Hallam did it in about an afternoon. (Of course, the implementation cost is a tiny fraction of the design, testing, documenting, etc, cost. There are no cheap features, only some which are less expensive than others.)

    8.1: "Fat" properties?

    Apparently automatic properties were called "fat" (or perhaps "phat") properties for a while internally. I can't think why, given that they're the slim version of normal property implementations! Of course, this kind of thing often becomes a piece of in-house terminology for no discernible reason at all.

    8.1: Why exposing properties rather than fields is best practice

    In chapter 8, I asserted that you shouldn't write non-private fields - that using properties is regarded as best practice. I didn't explain why at the time, but I've now written it up as an article.

    8.2.1: "var" isn't "variant" - "object" is!

    I go to great lengths in the book to explain that the use of implicitly typed local variables (var) isn't the same as making the code dynamically typed, or introducing the Variant type from COM or VB6. However, we already have something approaching that - object! If you declare a variable as being of type object, you can store whatever you like in it. That's not true of implicitly typed variables.
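
    A quick illustration of the difference:

    object o = "hello";
    o = 42;            // Fine: object really will hold anything

    var v = "hello";   // v is statically typed as string...
    // v = 42;         // ...so this line wouldn't compile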

    8.2.2: Additional (and inconsequential) restriction on implicit typing

    The following method call and assignment is legal. Insane, but legal.

    int j = M(j=10);

    It has the semantics of this code:

    int j;
    int temp;
    temp = (j = 10);
    j = M(temp);

    Okay, fair enough - hopefully most readers know that code like this is a recipe for disaster. However, let's try adding implicit typing to the mix:

    var j = M(j=10);

    How do we find out the type of j? We look at the return type of M.
    How do we work out which overload of M to use? We look at the type of j.
    In short, there's a chicken-and-egg problem, so this is declared to be illegal.
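
    To see the problem in code, imagine M has overloads like these (hypothetical, purely for illustration):

    using System;

    class Test
    {
        static int M(int x) { return 0; }
        static string M(string x) { return ""; }

        static void Main()
        {
            int j1 = M(j1 = 10);    // Legal (if horrible): j1 is already known to be int
            Console.WriteLine(j1);  // 0

            // var j2 = M(j2 = 10); // Illegal: inferring the type of j2 needs
            //                      // the overload, and picking the overload
            //                      // needs the type of j2
        }
    }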

    8.2.3: Eric on the pros and cons of implicit typing

    This was such a great comment, I had to include it verbatim. You'll see it reflected to some extent in the book, but it's worth having the full version here.

    All code is an abstraction. Is what the code is “really” doing manipulating data? No. Numbers? Bits? No. Voltages? No. Electrons? Yes, but understanding the code at the level of electrons is a bad idea! The art of coding is figuring out what the right level of abstraction is for the audience.

    In a high level language there is always this tension between WHAT the code does (semantically) and HOW the code accomplishes it. Maintenance programmers need to understand both the what and the how if they’re going to be successful in making changes.

    The whole point of LINQ is that it massively de-emphasizes the "how" and massively emphasizes the "what". By using a query comprehension, the programmer is saying to the future audience "I believe that you should neither know nor care exactly how this result set is being computed, but you should care very much about what the semantics of the resulting set are." They make the code closer to the business process being implemented and farther from the bits and electrons that make it go.

    Implicitly typed locals are just one small way in which you can deemphasize the how and thereby emphasize the what. Whether that is the right thing to do in a particular case is a judgment call. So I tell people that if knowledge of the type is relevant and its choice is crucial to the continued operation of the method, then do not use implicit typing. Explicit typing says "I am telling you how this works for a reason, pay attention". Implicit typing says "it doesn’t matter a bit whether this thing is a List<Customer> or a Customer[], what matters is that it is a collection of customers."

    8.3.1: Town or city?

    In listing 8.2, the Location class has a country and a town. It was noted that most people - at least in the US and Canada - would usually have chosen "city" instead of "town". It's possible that this is a US/UK distinction: in the UK, telling people from elsewhere in the country the name of a reasonably large town you're close to can convey more information than just naming the closest city.

    The most practical reason for my choice, however, is that I wanted to distinguish between Purley and Reading in the example, while sticking with Tom and his friends for data. In fact, Purley is more of a village than a town, but if we'd gone with "city" I'd have needed a different set of people entirely.

    8.3.4: Fear is not the enemy...

    When describing object and collection initializers, I talk about the code being clearer through having less fluff. Eric describes it in a more precise way:

    Again, redundancy is the enemy of information density. Only redundancy which contributes to the understanding of the code should be encouraged; other kinds of redundancy are discouraged.

    8.3.4: Ages vs dates of birth

    In a few examples, I have used Age as a property of a Person - a property which is writable. In real code, this is a bad idea. The moment after an age has been set, it is inaccurate - whereas a date of birth lasts forever. The age can always be calculated as a read-only property, which also allows the type to be (potentially) immutable.
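
    Here's the kind of thing I mean - a sketch which ignores time zones and other calendar subtleties:

    using System;

    public sealed class Person
    {
        private readonly DateTime dateOfBirth;

        public Person(DateTime dateOfBirth)
        {
            this.dateOfBirth = dateOfBirth;
        }

        // Always derived, never stored - so it can't go stale.
        public int Age
        {
            get
            {
                DateTime today = DateTime.Today;
                int age = today.Year - dateOfBirth.Year;
                // Haven't had the birthday yet this year? Knock one off.
                if (dateOfBirth.AddYears(age) > today)
                {
                    age--;
                }
                return age;
            }
        }
    }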

    I used age in the examples as a simple number which could easily be attached to a person - but this shouldn't be copied in real code. In truth, I suspect that very little code shown in books should be copied verbatim. The kind of code which is useful for teaching purposes often highlights one area at the expense of others, partly through the need for brevity in print.

    8.5.1: Introducing LINQ

    LINQ is complicated. It requires a lot of somewhat intertwined features, some of which don't make much sense until you see the bigger picture. Eric suggested that the "bottom up" way I've approached it in the book is possibly best for a book, and that a "top down" approach works better in a live presentation.

    I've tried presenting it in a top-down manner, without much success - but I can do the bottom-up style fairly easily, given enough time and an interested audience. I suspect this has more to do with my presentation skills - and frankly the experience of writing the book in a bottom-up manner - than with the soundness of the idea itself. I'll be keeping an eye out for how other presenters tackle the topic. Feedback is welcome.

    8.5.2: Anonymous types in VB, and immutability

    As I point out in the book, anonymous types in C# are always immutable. I only recently learned that in VB, anonymous types can be wholly or partially mutable.

    That doesn't sound too bad - choice is good, right? But anonymous types are mutable by default in VB. They're only immutable for properties decorated with Key. Ouch.

    While it's obviously good that different languages can choose different paths, it seems odd to me that by default VB has chosen against immutability, while the rest of the world is trying to embrace it for its threading goodness (and other benefits, of course). I'm happy with C#'s choice to make anonymous types immutable, and I'd hope that if the C# team had decided to allow mutability, they'd have made that an explicit choice, with immutability the safer default.

    It's really important that VB developers understand this, by the way - various LINQ operators such as Distinct rely on equality and hashing, and for VB anonymous types those only take the immutable (Key) properties into account... if you're going to use those operators in VB, you must remember the Key modifier.

    9.1.5: Another potential shortcut - lambdas with no arguments

    Although C# 3 has a shortcut for the case of a single argument, it has no parallel for a lambda which doesn't need any arguments. For instance, a Func<int> might be created from () => 5. Numerous syntaxes could have been chosen for such a shortcut - two which spring to mind are |> 5 and |=> 5, using the vertical bar as a sort of "no data here" symbol.
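
    For comparison, here's what you have to write today, along with what the imaginary syntax might have looked like:

    Func<int, int> doubler = x => x * 2;  // Single parameter: parentheses optional
    Func<int> five = () => 5;             // No parameters: the () is compulsory

    // With the hypothetical syntax, perhaps:
    // Func<int> five = |> 5;             // Not real C#!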

    9.2.1: No need for a lambda!

    The code in listing 9.4 can be abbreviated even further than just removing the braces from round the body of the lambda expression. We don't need a lambda expression at all - we can use a simple method group conversion, converting this:

    Action<Film> print = film => Console.WriteLine(film);

    to this:

    Action<Film> print = Console.WriteLine;

    Of course as it's now so brief, we're not getting as much benefit from the extra variable. We can eliminate it completely, leaving this as the rest of the listing:

    films.ForEach(Console.WriteLine);

    films.FindAll(film => film.Year < 1960)
         .ForEach(Console.WriteLine);

    films.Sort((f1,f2) => f1.Name.CompareTo(f2.Name));
    films.ForEach(Console.WriteLine);

    It's a judgement call as to which form is most readable - I quite like keeping the extra variable - but it's worth taking note of this. When you write a "simple" lambda expression, just think that it might not need to be a lambda at all. I'll readily admit this possibility had completely passed me by in this case - thanks are due to Joe Albahari for pointing it out.

    9.2.2: Security in lambdas

    Eric raised an interesting point which I not only hadn't covered, but hadn't even thought about.

    A lambda converted to a delegate becomes a method on the class in question (or a nested child class). If the delegate accesses a private field on the class, that's OK because the code all lives in the class.

    What happens when a lambda is converted to an expression tree and it accesses a private member? We want that scenario to still work, even though when you compile the expression tree, the resulting method is NOT on any type associated with the class.

    In the desktop CLR, what we do is rely upon a new feature called Restricted Skip Visibility, whereby partially or fully trusted code is allowed to view the private state of other code provided that the viewing code is at least as or more trusted than the code that owns the private state.

    Note that hoisted locals are considered private state.

    The implications here are interesting, and things get odder in Silverlight (which doesn't have Restricted Skip Visibility) and in the SQL Server version of the CLR, which doesn't grant the required permission to any code. However, there's good news:

    We are presently attempting to design a more general mechanism into all versions of the CLR so that this notion of "this object possesses a license to mess with the private state of that object" is cleanly represented.
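
    To see the scenario Eric describes, here's a small example of my own - a compiled expression tree reading a private field, which works on the desktop CLR:

    using System;
    using System.Linq.Expressions;

    class Secretive
    {
        private int secret = 42;

        public Func<int> CreateAccessor()
        {
            // The lambda reads a private field via the captured "this".
            // Compiled from an expression tree, the generated method
            // doesn't live in this class - which is exactly where the
            // visibility machinery described above comes in.
            Expression<Func<int>> expression = () => secret;
            return expression.Compile();
        }
    }

    class Test
    {
        static void Main()
        {
            Func<int> accessor = new Secretive().CreateAccessor();
            Console.WriteLine(accessor()); // 42 (on the desktop CLR)
        }
    }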

    9.3.1: Expression trees and parameters

    There was some confusion on the forum around why the expression shown in listing 9.6 doesn't have any parameters. After all, it performs addition on two numbers - doesn't that mean it has two parameters? Well, not quite...

    Think of the expression as a black box - you don't know what goes on inside it. You just feed it some number of parameters, and it will give you a result (assuming it doesn't have a void return type, effectively). In this case the addition is always performed with 2 and 3 as the operands - nothing varies, so there are no parameters. That's why when we compile the expression tree in listing 9.7 we end up with a Func<int> - something which accepts no parameters but returns an integer.

    An alternative way of thinking about this is as a method. Consider this code:

    public static int Add2And3()
    {
        return 2 + 3;
    }

    Again, it's adding 2 and 3 - but the method doesn't take any parameters.
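
    For comparison, here's a parameterless tree built by hand - not the book's exact listing, but the same shape:

    // Build the constant operands
    Expression leftOperand = Expression.Constant(2);
    Expression rightOperand = Expression.Constant(3);

    // Combine them into a single addition node
    Expression add = Expression.Add(leftOperand, rightOperand);

    // Wrap the tree in a lambda with no parameters, and compile it
    Expression<Func<int>> lambda = Expression.Lambda<Func<int>>(add);
    Func<int> compiled = lambda.Compile();

    Console.WriteLine(compiled()); // Prints 5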

    If we want to create an expression tree which does take parameters, we need to change the code a bit. Here's the code to do that, courtesy of Marc Gravell:

    // Declare the parameters
    ParameterExpression firstParam = Expression.Parameter(typeof(int), "x");
    ParameterExpression secondParam = Expression.Parameter(typeof(int), "y");

    // Create an Expression that performs an operation on the parameters
    Expression add = Expression.Add(firstParam, secondParam);
            
    // Create an Expression<T> using Expression.Lambda
    // Note that we specify the parameters again
    Expression<Func<int,int,int>> lambda = 
        Expression.Lambda<Func<int,int,int>>(add, firstParam, secondParam);
            
    // Compile the expression into a delegate
    Func<int,int,int> compiled = lambda.Compile();
            
    // Execute it
    Console.WriteLine(compiled(2,3));

    As you can see, this time when we call compiled we give it two parameters.

    That may help to clear up some misunderstandings, but reader feedback has suggested that expression trees are best understood by just trying them. Write a small test program and play around. Have fun - and don't worry too much if you find expression trees tricky: if you understand the basic concept that they're expressing logic as data, and that you can create them using lambda expressions, that's as much as most developers will ever need to know.

    9.3.3: Lambda to expression tree conversion restrictions

    I asked Eric for more information about what kind of lambda expressions couldn't be turned into expression trees. Bearing in mind the following list, aren't you glad I didn't include it all in the book?

    The following are illegal in an expression tree:

    • anonymous methods
    • statement lambdas
    • expression lambdas with ref/out parameters
    • all the assignment operators
    • the event += -= operators
    • any use of a removed partial method
    • a call to a virtual method via "base"
    • almost anything unsafe that would require unsafe codegen when converted to a delegate:
      • any binary operator where either operand has pointer type
      • any unary operator where the operand has pointer type
      • any sizeof operator on a non-built-in type
      • any conversion from or to a pointer type
    • multidimensional array initializers
    • any method that uses the undocumented "__arglist" feature
    • any "naked" method group
    • any "null typed" expression other than the literal null

    The last four warrant a bit more explanation:

    • Support for multi-d initializers was left out of the expression tree API for no good reason and by the time we realized it, it was too edge-case a scenario to warrant defining/testing/documenting/etc new apis for it.
    • Both declaring and calling methods with C-style variable-number-of-arguments parameter lists are legal in C# for interop purposes; hardly anyone knows that. Search for "__arglist" for details. It's not legal to call one in an expression tree.
    • There are supposed to be no "naked" method groups in C# expressions but there are compiler bugs that make them legal. For example, you can say "M is object", and it will always be false, but not a compiler error, even though it really should be. It's not legal to do this in an expression tree because we have no way of describing the type of the operand. (We could bless this and just make it the constant false, but I would rather not compound our earlier error by blessing it further.)
    • Again, compiler bugs. The spec implies that (null ?? null) is not legal, but the compiler allows it. The type of that expression is "the null type", which again, is not a type according to the specification. Again, it's not legal to do this in an expression tree because we have no way of describing the type. (And again, we could make this the constant null, but let's not bless bad code, that just makes it harder to take the breaking change if we ever fix the bug.)

    Given the size of the list, it seems amazing that expression trees are useful for anything - but as I noted in the book, the limitations very rarely crop up in real code.
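
    As one quick example from the list, here's the statement lambda restriction in action:

    using System;
    using System.Linq.Expressions;

    class Test
    {
        static void Main()
        {
            // An expression lambda converts happily...
            Expression<Func<int, int>> ok = x => x + 1;
            Console.WriteLine(ok);

            // ...but a statement lambda doesn't. Uncomment this and the
            // compiler will refuse the conversion:
            // Expression<Func<int, int>> bad = x => { return x + 1; };
        }
    }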

    9.3.3: infoof and tricks with expression trees

    Eric reveals a feature which has been under discussion:

    We have for many years considered creating an infoof (pronounced "in foof", of course) operator that would be to MethodInfos as typeof is to Types. The problem is, how do you specify the arguments so that overload resolution can select the correct method out of the method group? There’s no good way to do it, and there have always been way more important things to do, so it keeps getting cut.

    However, the fact that the lambda expression conversion captures the exact MethodInfo (via IL's method token support) can be used to your advantage:

    If you want to get a methodinfo for a particular method, you can just say

    Expression<Func<int,string,double>> e = (arg1, arg2)=>M(arg1, arg2);

    And then pull the MethodInfo out of the generated Call node! Of course you end up with a few wasted object allocations along the way, but you are guaranteed to get the same method out that the compiler would have picked for overload resolution.

    Interestingly, I'm pretty sure Marc Gravell came up with exactly the same idea after I'd read Eric's note, but before I included it on this web site.
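
    Spelled out, the trick looks like this - with a stand-in M of my own devising:

    using System;
    using System.Linq.Expressions;
    using System.Reflection;

    class Test
    {
        // Stand-in for the overloaded method you're interested in
        static double M(int arg1, string arg2) { return 0; }

        static void Main()
        {
            // Build an expression tree whose body is a call to M...
            Expression<Func<int, string, double>> e =
                (arg1, arg2) => M(arg1, arg2);

            // ...then fish the MethodInfo out of the Call node.
            MethodInfo method = ((MethodCallExpression) e.Body).Method;
            Console.WriteLine(method);
        }
    }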

    9.4.3: Hindley-Milner type inference

    I won't pretend to fully understand this comment, as I hadn't heard of Hindley-Milner type inference (Wikipedia link) before, other than it being briefly mentioned in Eric's blog:

    A number of people have asked me why we didn’t simply use Hindley-Milner type inference, a la F#, OCAML, etc. Two reasons.
    1. HM type inference is not guaranteed to terminate in a reasonable amount of time; our type inference algorithm guarantees progress every time through the loop and therefore runs in polynomial time. (Though of course, overload resolution can then be exponential in C#, which is unfortunate.)
    2. HM type inference works poorly in a language which has class inheritance; it was designed for functional languages like Haskell with pattern matching rather than inheritance.

    10.2: Extension delegates

    There's one feature I wasn't even aware of when writing the book. I only found out about it when reading the preview of another C# 3 book by Bruce Eckel and Jamie King.

    Basically, the feature allows you to specify an extension method as a method group using extension syntax. It can then be converted to a delegate of an appropriate type as if it were an instance method. As an example, take the Count() extension method on IEnumerable<T>. The actual declared method is static and takes a single parameter, so in some ways we shouldn't logically be able to use it as the target of a Func<int> delegate, which takes no parameters and returns an int, right? Nope...

    using System;
    using System.Linq;

    public class Test
    {
        static void Main()
        {
            string[] x = {"a", "b", "c"};
            
            // Call extension method directly
            Console.WriteLine(x.Count());
            
            // Convert extension method group to delegate
            Func<int> func = x.Count;
            
            // Print out the target of the delegate
            Console.WriteLine(func.Target);
            
            // Print out the value returned by calling the delegate
            Console.WriteLine(func());
        }
    }

    This compiles and executes, with this output:

    3
    System.String[]
    3

    Note how the target of the delegate is the array, even though delegates which use static methods normally have a null target.

    Basically, the long and the short of it is that extension methods can be used as if they were instance methods, not just for calling, but also for conversion into delegates. That's really neat.

    10.2.4: Null references vs null values

    As I mention in section 10.2.4 when describing extension methods, "You can't call instance methods on null references in C#". This is correct (with the understanding that calling an extension method using a null target isn't the same as calling a true instance method) but life becomes more interesting when you consider null values in general. In particular, Nullable<T> allows various methods to be called on a null value. Consider the following code:

    int? foo = null;
    int hash = foo.GetHashCode();
    string text = foo.ToString();
    bool equal = foo.Equals(foo);
    Type type = foo.GetType();

    Without peeking back at chapter 4, what would you expect the results to be? As it happens, the code above does blow up with an exception - but only on the last line. The first three method calls require no boxing, and Nullable<T> handles them just fine. The call to GetType does end up boxing the value into a null reference, which then causes a NullReferenceException.

    Many thanks to Marc Gravell for this interesting example.

    10.2.4: Extra issue when reusing existing names for extension methods

    In section 10.2.4, I give an example of a potentially useful extension method: effectively making string.IsNullOrEmpty usable as if it were an instance method.

    Now, I already give a warning about potentially confusing people reading your code, if they're used to it being a static method - but Intellisense makes this even worse. Here's a screenshot from Visual Studio 2008 SP1. I've declared two extension methods: IsNullOrEmpty and IsNullOrWhitespace. I've then captured the screen after typing "foo." where foo is a variable of type string.

    [Screenshot: Intellisense bug around extension methods in VS2008]

    Note how Intellisense is providing different icons for the two extension methods - it's still displaying the normal "static method" icon for IsNullOrEmpty.

    It's not hugely serious, but it's just another thing to be aware of. I should note that ReSharper has its own Intellisense display, and that does the right thing.
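
    For reference, the two extension methods were declared along these lines (my reconstruction rather than the exact code behind the screenshot):

    public static class StringExtensions
    {
        public static bool IsNullOrEmpty(this string value)
        {
            return string.IsNullOrEmpty(value);
        }

        public static bool IsNullOrWhitespace(this string value)
        {
            return value == null || value.Trim().Length == 0;
        }
    }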

    10.3.1: Features of Enumerable.Range

    One feature of Enumerable.Range which isn't supported by the Range class in chapter 6 is the ability to create an empty range: because both ends of the chapter 6 range are inclusive, it always contains at least one entry.

    This made the code in chapter 6 simpler, but it's nice to be able to specify an empty range. The "real" range class in MiscUtil includes this feature, of course.
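
    The crucial point is that Enumerable.Range takes a starting value and a count rather than two inclusive bounds:

    using System;
    using System.Linq;

    class Test
    {
        static void Main()
        {
            // A count of zero gives a genuinely empty sequence - something
            // the inclusive-ends Range class from chapter 6 can't express.
            Console.WriteLine(Enumerable.Range(10, 0).Any());   // False
            Console.WriteLine(Enumerable.Range(5, 3).Count());  // 3 (i.e. 5, 6, 7)
        }
    }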

    10.3.1: Detecting performance problems

    Deferred execution has a downside: if you write a query which will take a long time to execute, your profiler is unlikely to point you at the query creation point, which is where you're likely to be able to fix it. If you try to create queries close to the code which iterates over the results, you may find it easier to understand the performance characteristics.

    10.3.2: Efficiency of examples

    Before technical review, the query in listing 10.8 called Where after Reverse - in other words, it was inefficient. I knew about this, and already had the callout to explain how the efficiency could be improved, but Eric suggested that the code in the listing should be the more efficient code to start with.

    His reasoning (which I totally agree with) is that sometimes developers take code directly from books, and then fiddle with it until it works for their particular situation - sometimes without reading the surrounding text. Therefore the examples should avoid errors which are then pointed out in the text.

    The moral of the story is two-fold: if you're writing sample code, the listings themselves should demonstrate the best approach you can manage, rather than relying on the surrounding text to fix them up - and if you're copying code from a book, it's worth reading that surrounding text before you do.

    10.4.1: Extending the world... carefully

    Eric has a bit of guidance about the use of extension methods:

    Future versions of the frameworks design guidelines will probably say "please don't put extension methods on object, System.ValueType, System.Enum, unconstrained type parameters, etc."

    We will violate this guideline ourselves in a few key places—a lot of people are asking for an In operator which is the inverse of Contains. static bool In<T>(this T, IEnumerable<T> ts) { ... } so you can say if 12.In(myints). A bit bogus if you ask me, but people like it. Clearly this would then be an extension method which matches every type.

    There's one use of extension methods like this that has proved useful to me - acting on anonymous types, just by reflecting over their properties. I do this in chapter 12 with AsXAttributes and I've also got a version in MiscUtil for AsXElements.

    I also have an extension method for any reference type, which throws an ArgumentNullException (with an optional parameter name) if you call it on a null reference. This makes argument checking easy - I just type foo.ThrowIfNull("foo");.
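
    Here's a sketch of the idea - the real version in MiscUtil has more overloads and may differ in detail:

    using System;

    public static class ArgumentChecking
    {
        public static void ThrowIfNull<T>(this T value, string name) where T : class
        {
            if (value == null)
            {
                throw new ArgumentNullException(name);
            }
        }
    }

    // Usage, at the top of a method:
    // foo.ThrowIfNull("foo");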

    11.1.1: What does LINQ mean?

    Eric gave his own answer to my question of what "counts" as LINQ (as well as emphasizing that it really doesn't matter):

    [...] There is something real and new here. That real and new thing is that the semantics of the query are expressible in the language you are programming in. In the old days of building a SQL string, the semantics of the query aren't in the C# code, they're in the string. The compiler has no way of knowing whether the thing in the string is sensible or not. To me, LINQ is any technology that moves the query logic more into the C#/VB language and out of string manipulation or object model calls.

    11.1.2: Deferred execution and iterator blocks

    One area which apparently causes lots of people pain is understanding that iterator blocks really do only start to get used when you iterate over the results. That was true when we looked at iterators in chapter 6, and it's equally true if you use a custom iterator as a data source (or custom operator) in LINQ.

    Possibly the biggest source of confusion is around parameter validation - if the method does parameter validation, you might instinctively expect it to throw an exception as soon as you call the method, even if you understand that data is only going to be processed when you iterate over the result. This is certainly desirable, and I think it's a pity the language doesn't (currently) support this scenario. I've blogged about this in more detail.
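
    The usual workaround (which the blog post goes into) is to split the operator in two, so that validation happens eagerly while the iteration remains deferred. A sketch, using a Where-alike of my own:

    using System;
    using System.Collections.Generic;

    public static class SequenceExtensions
    {
        // Validation happens as soon as the method is called...
        public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source,
                                                Func<T, bool> predicate)
        {
            if (source == null) throw new ArgumentNullException("source");
            if (predicate == null) throw new ArgumentNullException("predicate");
            return MyWhereImpl(source, predicate);
        }

        // ...while the deferred work lives in a separate iterator block.
        private static IEnumerable<T> MyWhereImpl<T>(IEnumerable<T> source,
                                                     Func<T, bool> predicate)
        {
            foreach (T item in source)
            {
                if (predicate(item))
                {
                    yield return item;
                }
            }
        }
    }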

    11.1.2: Truly silly query expressions

    It's important to understand that the process of translating a query expression into method calls is purely syntactic - that's how it's able to work in radically different ways with different providers.

    I have a blog entry demonstrating just how silly things can get - you can make query expressions call static methods or delegates, if you provide the right (somewhat strange) data source.

    If you build a LINQ provider which makes Select return an int, so be it. You shouldn't expect anyone to actually use it, but the compiler won't care at all.
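
    To prove the point, here's a deliberately silly example of my own - a "provider" whose Select returns an int:

    using System;

    class SillySource
    {
        // The compiler only needs *some* applicable Select method -
        // it doesn't care in the slightest what it returns.
        public int Select<T>(Func<SillySource, T> projection)
        {
            return 42;
        }
    }

    class Test
    {
        static void Main()
        {
            int result = from x in new SillySource() select x;
            Console.WriteLine(result); // 42
        }
    }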

    11.1.3: Naming of imaginary companies and products

    The fictional company was originally called Skeetysoft, but Eric reckons that "PascalCasing is the coolest casing". Who am I to argue, beyond pointing out that he works for Microsoft rather than MicroSoft? ;)

    Additionally, I'm somewhat inconsistent in my choice of naming for products. Currently SkeetySoft is competing with Microsoft (SkeetyOffice/SkeetyMediaPlayer) and Google (SkeetyTalk). I should either have been consistently challenging Microsoft by calling my chat application SkeetyMessenger, or perhaps decided to throw down the gauntlet to Apple as well, with SkeetyTunes as the media app. Good job this is all in my mind, otherwise I'm sure the relevant CEOs would all be quaking in their boots.

    11.1.3: Defect severities

    My choice of Trivial/Minor/Major/Showstopper as the available severities was somewhat arbitrary, but it does have the nice feature of restricting choice quite a lot, leaving clear options. I know that restricting choice sounds like a bad thing, but often too much choice can lead to time being wasted on decisions which make no real difference. Take the Java logging API for example - how often am I really going to care whether a log level is at "fine" or "finer"?

    Eric apparently uses an even smaller set of choices on occasion: Vexing/Boneheaded/Fatal.

    11.3.3: Explicitly specifying an ascending order

    One feature I missed when writing about orderings is the ability to explicitly use the word ascending for a particular ordering. For example,

    from person in people
    orderby person.Name
    select person

    is exactly equivalent to

    from person in people
    orderby person.Name ascending
    select person

    The reason I missed this? I've never seen it used, that I recall. (Yes, I should still have spotted it when going through the spec, of course...)

    11.5.1: Equijoins

    You may have been slightly surprised to see the word equals in join expressions, rather than ==, but there are reasons for it. The two sides of equals aren't symmetrical: each is a separate key selector, scoped to just one of the two sequences (see the next note), and using a distinct keyword emphasizes that this is a key-based equijoin - something which can be implemented efficiently, by hashing for example - rather than an arbitrary Boolean condition.

    11.5.1: Scoping of key selectors in joins

    I mention the "scoping issue" briefly after listing 11.12, but it's worth going into this slightly more fully. Let's consider joining two ranges of numbers together in the obvious way. This query expression is valid:

    from left in Enumerable.Range(0, 10)
    join right in Enumerable.Range(5, 15)
      on left equals right
    select left*right

    This, however, is not:

    // Warning: invalid!
    from left in Enumerable.Range(0, 10)
    join right in Enumerable.Range(5, 15)
      on right equals left
    select left*right

    The left side of the equals only knows about the "main" sequence, whereas the right side only knows about the extra sequence which is being joined with the main one. We can see why it fails when the compiler translation is performed on each of the two expressions above. First the working one:

    Enumerable.Range(0, 10)
              .Join(Enumerable.Range(5, 15), // Extra sequence
                    left => left, // Left key selector
                    right => right, // Right key selector
                    (left, right) => left*right // Result selector
                   )

    That's fine. All the lambda expressions make sense. This is the translation of the broken query expression though:

    // Warning: invalid!
    Enumerable.Range(0, 10)
              .Join(Enumerable.Range(5, 15), // Extra sequence
                    left => right, // Left key selector
                    right => left, // Right key selector
                    (left, right) => left*right // Result selector
                   )

    Neither of the key selector lambda expressions will compile, because they refer to variables which aren't available.

    11.6.1: Copy editors get notes too

    Just a quick nod to my copy editor, Liz, who did such a wonderful job with the book. She mentioned after listing 11.17 that entirely coincidentally, she used to work with a Tara, a Tim and a Darren (three of the SkeetySoft employees).

    Given the trivial nature of some of the notes I've been adding, I see no reason not to include this little factoid. It's nice to get an opportunity to thank her again, too. If I ever write another book, I really want to have Liz as my copy editor again.

    12.1.2: Disposing of DataContexts

    In all my examples, I dispose of the DefectModelDataContext after using it. In your code, you may not want to do this - it can be difficult in some cases. Unlike most types which implement IDisposable, DataContext doesn't really need disposing - at least not in most cases. I asked Matt Warren about this design decision, and here was his response:

    There are a few reasons we implemented IDisposable:

    • If application logic needs to hold onto an entity beyond when the DataContext is expected to be used or valid you can enforce that contract by calling Dispose. Deferred loaders in that entity will still be referencing the DataContext and will try to use it if any code attempts to navigate the deferred properties. These attempts will fail. Dispose also forces the DataContext to dump its cache of materialized entities so that a single cached entity will not accidentally keep alive all entities materialized through that DataContext, which would otherwise cause what appears to be a memory leak.
    • The logic that automatically closes the DataContext connection can be tricked into leaving the connection open. The DataContext relies on the application code enumerating all results of a query, since getting to the end of a resultset triggers the connection to close. If the application uses the IEnumerator's MoveNext method instead of a foreach statement in C# or VB, you can exit the enumeration prematurely. If your application experiences problems with connections not closing and you suspect the automatic closing behavior is not working, you can use the Dispose pattern as a workaround.
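
    For what it's worth, the pattern my examples follow is simply this:

    // Any deferred loading has to happen before the end of the block
    using (DefectModelDataContext context = new DefectModelDataContext())
    {
        // ... run queries here ...
    }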

    12.1.2: Resetting databases: data vs metadata

    In section 12.1.2 on page 319, I note two different ways of resetting the database - one issuing direct SQL commands, and the other using LINQ to SQL to load everything and then delete each entry.

    It should be noted that although these have the same effect in terms of the data stored directly in the tables, the SQL code (in CleanDatabaseQuickly.cs) also resets some metadata in the table - namely the identity values. This can be important occasionally - e.g. in unit tests which might expect the database to be in an absolutely known state. Most production code shouldn't care what ID values are generated, of course.

    13.0 (Introduction): Cheap vs good

    It has been pointed out that writing perfect code is expensive. Sometimes taking the cheap, hacky path can be the most appropriate business decision. This can be difficult to live with if you have a natural quality focus, and it's particularly inconvenient if the options are "grotty hack to make the company survive... but leading to crippling issues later" vs "elegant solution which would pave the way for further development... but the company will die first".

    If the language can make "right" code also "cheap" code, that's fabulous - and really tricky in terms of language design.

    13.1.1: There's more to functional coding than functions as first class data

    I mention the more functional emphasis of C# 3 a number of times in the book. There are various aspects to this: first, there are the language features, which make it simpler to create delegates and possible to create expression trees.

    However, there's also the library aspect - the way that the sequence operators never modify the existing data, for instance. Indeed, comparing List<T>.Sort and IEnumerable<T>.OrderBy shows this difference - sorting an existing collection is a mutating operation, whereas creating a new sequence based on an old one, but with a different order, is not.
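
    A tiny example of the contrast:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Test
    {
        static void Main()
        {
            List<int> list = new List<int> { 3, 1, 2 };

            // OrderBy builds a new ordered sequence; the list is untouched.
            IEnumerable<int> ordered = list.OrderBy(x => x);
            Console.WriteLine(list[0]);          // Still 3
            Console.WriteLine(ordered.First());  // 1

            // Sort mutates the list in place.
            list.Sort();
            Console.WriteLine(list[0]);          // Now 1
        }
    }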

    Immutability is a core principle of functional languages, enabling simpler concurrency and in many cases making it easier to reason about how your program works. I hope it will be more strongly supported in C# 4.

    In the meantime, C# 3 allows you to write functional-style code, and LINQ encourages you to do so - but don't fall into the trap of thinking it's just about lambda expressions. (I've personally got a long way to go when it comes to a functional style of programming, by the way, so don't think I'm looking down on anyone.)

    13.1.1: F#, IronRuby etc - first class citizens?

    It looks hopeful that F# (as one example) will become properly integrated into .NET in some senses. There will be proper Visual Studio support, Intellisense, debugging etc. However, it's likely that there will still be distinct "tiers" of citizenship. If every language were a first class citizen, they would all be treated equally on MSDN, with examples in every language for every method - and that's not likely to happen any time soon.

    This sliding scale of support will prove interesting - in particular, as Sun tries to fully support more languages on the JVM, we should keep an eye on how well integrated those languages are, and how much documentation is tailored for them.

    13.1.2: Feature in a dynamic language != dynamic language feature

    In the book I mention that some of the features in C# 3 which make life more pleasant (such as object/collection initializers and implicitly typed local variables) look like they come from dynamic languages. However, that's not to say they work the same way in C# as they do in truly dynamic languages. Eric puts it well:

    These are features that make statically typed languages look textually more like dynamic languages, but of course none of these are dynamic language features. It's not that dynamic languages just hide the type information from you. In many cases, it is in principle impossible to deduce the type of a thing in a dynamic language. Consider Jscript with "eval", for example, where code at runtime can arbitrarily change the contents of local variables. There’s no "implicit typing" there – there's no typing at all!

    13.2: Inheritance - a single-shot opportunity for specialization

    Eric has another way of talking about the use of delegates instead of inheritance for specialization.

    Indeed. Inheritance is a tremendously powerful tool for sharing implementation details and specializing behaviour. But the fact that you only get "one shot" at inheritance in a single-inheritance world means that you’ve got to take that shot carefully and make sure you're using that power to its best ability. What you’re describing [delegates] is an alternative approach to specialization which is much less "expensive" since it does not waste your inheritance shot on something not worth the power.

    13.4: Massively parallel - but at different powers?

    Eric responded to my prediction of massively parallel processing with this twist:

    What I expect to see is a few "big" cores – 2, 4 or 8 heavy duty processors – and then dozens of weaker cores. What's better, 16 Toyota Camrys, or 4 Camrys and six dozen guys on motor scooters? Obviously it depends on what job you want to get done; but there are lots of tasks that are amenable to this kind of parallelization.

    I agree - this is a definite option, and in some ways is what you already have with a modern PC: the main CPU(s) are powerful, but you don't get very many of them, whereas recent graphics cards have many more effective threads of execution, each doing a very specialised job. If the task you want to perform can be parallelised in a way which is supported by the GPU architecture, you can often achieve much more that way than with the main CPU. (And if you have multiple graphics cards, that takes it to another level again...)

    In other cases, I think there will be very large numbers of "reasonably powerful" main processors. These may be more geared towards servers than consumer technology. We'll have to see...