(You are currently looking at the first edition version of this page. This page is also available for the second and third editions.)

Notes for Chapter 6: Implementing iterators the easy way

6.2.2: No code is executed until MoveNext is called

I already point this out in the book, but when you call a method or property implemented with an iterator block, none of the code in the body is executed. This can be very confusing - particularly in unit tests!

One reason for including this note when it's already mentioned in the book is that it's a fairly frequently asked question. The other is that Eric has a great blog post about it.

6.2.3: Finally blocks in iterators

It's worth being careful about what you put in a finally block in iterators, precisely because (as shown in the book) the finally may not be executed.

This will usually only be the case with a malicious caller - let's face it, everyone else is bound to be using foreach - which gives a hint as to the kind of code to include and the kind of code to avoid.

If you use the finally block for resource cleanup - perhaps implicitly, with a using statement - that's probably okay. If the caller doesn't call Dispose, it's no worse than them failing to call Dispose on the embedded resource.

If, however, you were to use the finally block for security purposes - for instance, to relinquish some extra rights gained earlier on - then that would be very dangerous.

Likewise it's probably a bad idea to use lock across a yield return. It's okay to acquire and release a lock in the course of the iterator block without yielding - during that time it'll be running just like a normal method - but if you return a value while holding the lock, the lock will still be held until execution comes back into the iterator block, which may be never!

6.2.3: Why would you use a finally block in an iterator?

In the book, I explain how finally blocks in iterator blocks work, but I don't actually give a real life case of why they're useful. Fortunately, there's a really nice example which is easy to understand. How many times have you written code like this?

using (TextReader reader = File.OpenText(filename))
{
    string line;
    while ( (line=reader.ReadLine()) != null)
    {
        // Do something with the line
    }
}

I've certainly processed lots of files a line at a time. Wouldn't it be nice to just be able to foreach over the file instead? Enter LineReader...

using System.Collections.Generic;
using System.IO;

public class LineReader : IEnumerable<string>
{
    string filename;

    public LineReader(string filename)
    {
        this.filename = filename;
    }

    public IEnumerator<string> GetEnumerator()
    {
        using (TextReader reader = File.OpenText(filename))
        {
            string line;
            while ( (line=reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
   
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

The using statement is just syntactic sugar for a finally block - so as long as you dispose of the iterator, the file will be closed too. Note how the file isn't even opened until the first call to MoveNext. Now you can write code like this:

foreach (string line in new LineReader("file.txt"))
{
    // Do something with the line
}

Possibly more importantly, you can also use this within LINQ to Objects, processing the sequence of lines in a file just like any other sequence of data items. There's a more fully-featured implementation of LineReader in my Miscellaneous Utility library.

6.2.4: Why the <> in the names?

In the book, I mention that the types generated by iterator blocks always begin with <> and point out that this isn't indicating a generic type parameter. There's slightly more to it than that. Although the names are not indicating a generic type, it's no coincidence that they contain symbols used for generics. The name is deliberately constructed so it can't possibly be explicitly referenced by valid C# code.

The name also isn't CLS-compliant - which is fine, because the type itself is never public, and indeed never should be public. If ever a compiler accidentally made the types generated by iterator blocks public, any assembly claiming to be CLS-compliant would no longer compile, highlighting the compiler bug.

6.3.2: Differences in design in MiscUtil

After a couple of iterations of design, the Range class in MiscUtil no longer supports iteration over itself. Instead, you create a RangeIterator which has the concept of a range, which end to start at, and the step to take (implemented with a delegate).

A Range itself then only has a start, an end, a Comparer , and inclusive/exclusive flags for each end. It's a mathematical interval (which can be closed, open, or half-open at either end). Separating the concerns of iteration and the range itself is much more satisfying in the long run, and also makes immutability somewhat easier to achieve.

6.3.3: The bug (fixed before printing) in listing 6.7

There was a subtle bug in my original code for listing 6.8. I think it's instructive to have a look at it, just so you know to avoid it yourselves. (At the same time, it allows me to thank Eric for finding it. I suspect relatively few reviewers would have done so, even though it's nothing to do with knowing C# inside out.)

Here's the correct code:

public IEnumerator<T> GetEnumerator()
{
    T value = start;
    while (value.CompareTo(end) < 0)
    {
        yield return value;
        value = GetNextValue(value);
    }
    if (value.CompareTo(end) == 0)
    {
        yield return value;
    }
}

And here's the broken code:

// Broken! Do not use!
public IEnumerator<T> GetEnumerator()
{
    T value = start;
    while (value.CompareTo(end) <= 0)
    {
        yield return value;
        value = GetNextValue(value);
    }
}

So what's wrong with the broken code? Well, consider a Range<byte> with a lower bound of 0, an upper bound of 255, stepping up by 1 each time. Clearly that should produce 0, 1, 2 ... 254, 255 and then stop. But how would it stop in the broken code? It can't - once you add 1 to 255 (as a byte) you start back at 0, so no comparison with 255 is ever going to yield a positive value.

The working code finishes when the value just returned reaches or exceeds the end point. In fact, there's still a bug here. Although we carefully stop if we reach the end point (carefully yielding it due to the range being inclusive) we could still overflow. Take the same example as before, but stepping up by 2 bytes each time. We'd go straight from 254 to 0, creating the same problem as before.

The most appropriate solution may be to change GetNextValue to indicate when it's noticed overflow (or any other reason not to return a value). For instance, we could change the signature to:

bool GetNextValue(ref T value)

The iteration code could then become:

public IEnumerator<T> GetEnumerator()
{
    bool hasMore;
    T value = start;
    do
    {
        yield return value;
        hasMore = GetNextValue(ref value);
    }
    while (hasMore && value.CompareTo(end) <= 0)
}

I haven't yet implemented this in MiscUtil - it's relatively tricky, compared with the previous simple stepping - but I hope to do so at some point.

The moral of the story is to never claim that something is "quite straightforward" - it will do its best to prove you wrong!

6.4: Coroutines and continuation passing style

In my initial manuscript, I suggested that the language design team might not have considered iterator blocks being used for the continuation passing style coding that the Coordination and Concurrency Runtime supports. Fortuately, Eric corrected me on this and provided more information:

Though I'm sure the design team did not anticipate that this would be used by robots, the fact that iterators are a form of coroutines was well understood by the design team.

And yes, coroutines are handy for implementing CPS in languages which do not support it natively, though they are not strictly necessary. Really all you need are first-class functions.

Since iterators provide closure semantics and are objects you can pass around, they act like first class functions. That they also act like coroutines is a nice bonus.

See my blog for a discussion of how to implement a simple CPS system in Jscript by using first-class functions.

Really, is there anything he hasn't written a blog post about?

n/a - never mentioned!: Easy-to-write-but-inefficient recursive iterators

Suppose you have a tree structure you wish to iterate (either breadth or depth first). It's very easy to write an iterator which is O(n^2) - i.e. somewhat inefficient. By recursing and creating a new iterator each time, you end up creating a lot of iterators unnecessarily.

The solutions to this issue end up with less readable but more efficient code. It's up to you which path you take, of course, based on your context.

Wes Dyer and Eric have both blogged about this issue. See their posts for far more details.