Simulating INumeric with policies

After reading Luca Bolognese's blog post regarding using C# 4.0's dynamic keyword to simulate INumeric I immediately thought that must be a way to express this with generics instead. Generics, however, do not allow you to abstract over operators (which is what is is needed to really do INumeric) but you can simulate it using adapters as described in this post.

Say I want to write a general average routine that is generic over what type I am using to store the data. That is, I want one routine that works equally for double as well as decimal. What I want to be able to write is something like,

        public static T Average(params T[] values) {
            T total = 0;

            foreach (var v in values) 
                total = total + v;

            return total / values.Length;
        }

where the T is any type that supports adding and dividing. I can't get this code to compile exactly but we can get surprisingly close! My final version looks like,

    class Math<T, A> : PolicyUser<T, A> where A : struct, IOperatorPolicy<T> {
        public static T Average(params T[] values) {
            var total = L(0);

            foreach (var v in values) 
                total = total + v;

            return total / values.Length;
        }
    }

which is much closer to the original than I thought originally could get!

I started by cloning the operator policy and added a FromInt() I will explain below,

    interface IOperatorPolicy<T> {
        T Add(T a, T b);
        T Subtract(T a, T b);
        T Multiply(T a, T b);
        T Divide(T a, T b);
        T FromInt(int value);
    }

Then I cloned the double policy struct and then added a decimal mainly by cut/paste/search/replace

    struct DoubleOperatorPolicy : IOperatorPolicy<double> {
        double IOperatorPolicy<double>.Add(double a, double b) { return a + b; }
        double IOperatorPolicy<double>.Subtract(double a, double b) { return a - b; }
        double IOperatorPolicy<double>.Multiply(double a, double b) { return a * b; }
        double IOperatorPolicy<double>.Divide(double a, double b) { return a / b; }
        double IOperatorPolicy<double>.FromInt(int value) { return value; }
    }

    struct DecimalOperatorPolicy : IOperatorPolicy<decimal> {
        decimal IOperatorPolicy<decimal>.Add(decimal a, decimal b) { return a + b; }
        decimal IOperatorPolicy<decimal>.Subtract(decimal a, decimal b) { return a - b; }
        decimal IOperatorPolicy<decimal>.Multiply(decimal a, decimal b) { return a * b; }
        decimal IOperatorPolicy<decimal>.Divide(decimal a, decimal b) { return a / b; }
        decimal IOperatorPolicy<decimal>.FromInt(int value) { return value; }
    }

The cleaver bit is to define a base type that contains a struct that defines the operators as overloads. The overloads use a policy struct to implement that overloaded operators. This struct also defines implicit conversion operators to hide some (but not all) of the messiness introduced by using this wrapper type. The PolicyUser class looks like,

    class PolicyUser<T, A> where A: struct, IOperatorPolicy<T> {
        static A Policy = new A();
        
        protected struct Value {
            public T V;

            public Value(T v) { V = v; }

            public static Value operator +(Value a, Value b) { return new Value(Policy.Add(a.V, b.V)); }
            public static Value operator -(Value a, Value b) { return new Value(Policy.Subtract(a.V, b.V)); }
            public static Value operator *(Value a, Value b) { return new Value(Policy.Multiply(a.V, b.V)); }
            public static Value operator /(Value a, Value b) { return new Value(Policy.Divide(a.V, b.V)); }
            public static implicit operator Value(T v) { return new Value(v); }
            public static implicit operator Value(int v) { return new Value(Policy.FromInt(v)); }
            public static implicit operator T(Value v) { return v.V; }
        }

        protected static Value L(int value) { return new Value(Policy.FromInt(value)); }
    }

What the Value struct does is allows you to use the type Value in place of the type T whenever you want to use operators. When operators are used you only need to ensure one sub-expression is promoted to Value and the implict operator will take care of the rest.

One remaining oddness is using literals. Implicit operators have there limits which makes using literal a bit strange. This was also odd in the Policy post as well and I used the same trick to make it a little less cumbersome. I introduced a static method L() that takes an integer and converts it to Value allowing you to use integer literals by calling FromInt() I introduced earlier. You can extend to this to allow other types by repeating the pattern for, say double, allowing you to use double literals as well. I didn't because I didn't need it for my example (but probably will if I try to implement something more complicated.

To bind the actual data type to provide the data type and the policy. To make this simpler I created two create implementations of Math.

  class DoubleMath : Math<double, DoubleOperatorPolicy> { }
  class DecimalMath : Math<decimal, DecimalOperatorPolicy> { }

Calling the average method looks like,

    var resultDouble = DoubleMath.Average(1, 2, 3, 4, 5);
    var resultDecimal = DecimalMath.Average(1, 2, 3, 4, 5);

It should be straight forward to see how this could be adapted to a Financial class, as Luca was trying to build, and how the routines could be made independent of the data type used in the calculations.

There you have it. Policies can be used to simulate INumeric without having to resort to using dynamic.

Edit: Mar 9, 2009: Fixed HTML formatting error

Zero the value is not

One of my pet-peeves is code written like this,

if (0 != value) { ... }

This reads backwards to me. I hear echoes of Frank Oz in my head, "if zero the value is not". As cute as this dialect is coming from a pointy-eared muppet, it still takes a bit to translate back into the more common English phrasing, "if the value is not zero". It seems programmers who insist on writing in this style have forgotten that their source should be readable as well as executable. Don't force your readers to have to do grammar gymnastics as well as having to decipher your algorithmic gynmastics.

Once upon a time, this was considered good style in order to avoid accidentally writing something like,

if (value = 0) { ... }

In C#, this is illegal. In most (all?) modern C and C++ compilers, this will generate a warning. (You do compile with warnings reported as errors right?) There ceases to be a good reason to write test expressions backwards. Let's let this style rest in peace.

Even if you think the above form has some redeeming value, let's all agree that any programmer that voluntarily writes something like,

if (false == value) { ... }

should have an electric shock sent directly through their keyboard.

BeginInvoke's last parameter: What's in a name?

I have been playing around with asynchronous programming lately and it bothers me that the last parameter to the begin invoke pattern doesn't have a name that everyone agrees on. The begin invoke pattern is the asynchronous calling pattern were the BeginXxxx(), such as Stream.BeginRead(), takes a set of parameters, a callback method, and some, well, thing, at the end that is carried along with the asynchronous operation that eventually finds it way into the IAsyncResult. The problem is what do we call that thing? In an informal survey of the methods in the BCL that implement this pattern I have found a wide variation. Here is a partial list in somewhat order of popularity,

  • object
  • stateObject
  • state
  • asyncState
  • extraData
  • data

There seems to be little agreement about what to call it. We could pick the most prevalent but the name object occurs the most often because every delegate gets a BeginInvoke() method created for it and in this automatically generated code the parameter is called object. We can't standardize on object because it is a reserved word in several languages so either it would be impossible to specify or awkward (i.e. @object in C#). What I would like is a name that we can all use and we can all agree on.

To name the thing, we first must understand why it it is there at all. It exists to hold some state on behalf of the caller. Having the state object makes programming against the begin invoke pattern easier in languages that do not have anonymous methods that capture local variables. Consider the following program (which intentionally ignores errors because it is only an example),

const string uriFormat = 
    "http://messenger.services.live.com/users/{0}@apps.messenger.live.com/presence";
const string responsePattern = 
    @"""statusText""\s*:\s*""(?<status>\w+).*""displayName""\s*:\s*""(?<name>\w+)""";

static void Main(string[] args) {
    var uri = string.Format(uriFormat, args[0]);
    var request = WebRequest.Create(uri);
    var result = request.BeginGetResponse(PresenceCallback, request);

    // Other interesting work....

    result.AsyncWaitHandle.WaitOne();
}

private static void PresenceCallback(IAsyncResult result) {
    var request = result.AsyncState as WebRequest;
    var response = request.EndGetResponse(result);
    var s = new StreamReader(response.GetResponseStream()).ReadToEnd();
    var search = new Regex(responsePattern);
    var match = search.Match(s);    
    if (match.Success)
        Console.WriteLine("{0} is {1}", 
           match.Groups["name"].Captures[0].Value, 
           match.Groups["status"].Captures[0].Value);
    else
        Console.WriteLine("Unexpected response");
}

This will get the online status of a Windows Live Messanger account given the live ID account number. For example, passing 1afa695addc07e5 as an argument to the above will tell whether or not I am online. In this case I am using last parameter of the BeginGetResponse() method to pass the request itself. This then is cast back to WebRequest in the PresenceCallback() method so I can call EndGetResponse() to retrieve the actual response. As far at the BeginGetResponse() call is concerned, this value is opaque. It ignores it completely and just supplies it blindly in the IAsyncResult. It makes no assumptions about the data at all, it is just something the caller just wants carried around. If I was using anonymous delegates in C# this would look a lot better as,

static void Main(string[] args) {
    var uri = string.Format(uriFormat, args[0]);
    var request = WebRequest.Create(uri);
    request.BeginGetResponse(result => {
        var response = request.EndGetResponse(result);
        var s = new StreamReader(response.GetResponseStream()).ReadToEnd();
        var search = new Regex(responsePattern);
        var match = search.Match(s);    
        if (match.Success)
            Console.WriteLine("{0} is {1}", 
               match.Groups["name"].Captures[0].Value, 
               match.Groups["status"].Captures[0].Value);
        else
            Console.WriteLine("Unexpected response");
        }   
    }, null);

    // Other interesting work....

    result.AsyncWaitHandle.WaitOne();
}

Here the request local variable request is captured automatically by the C# compiler and placed into a compiler generated class as a field. The compiler generated class also contains the code I supplied in the lambda as an instance method. When I refer to response in the lambda the reference is automatically translated into a field lookup in the generated class. Since request is already accessible in the lambda I don't need the last parameter to carry anything useful so I pass null.

Anonymous methods makes using callbacks much easier but since not all languages support anonymous methods or lambdas the BCL standardized on a method pattern for begin invoke that can easily be used by those languages. If the calling pattern did not have a caller state object then the work performed automatically by the C# compiler would have to be repeated manually by the programmer in these, then, second class languages. The .NET team did not want such languages to be second class citizen (especially since neither C# nor VB.NET supported anonymous methods initially) so they required the presence of the caller state object parameter.

Now we know why it is there, what to do we call it? I like state because the parameter represents state the caller want's to preserve. I don't like object because it is a common reserved word. I don't like stateObject because a symbol's type should not be repeated in the name, we already know it is an object by its type. asyncState is acceptable, especially since that is the name it is given by IAsyncResult, but it is a bit redundant, we already know, by context, it is asynchronous state. Plus we should avoid abbreviation, like "async", in symbol names (though it is very common, and asynchronous is very very long, so not that bad). data seems fine to me, it is a synonym to state, but it is overused. extraData I cannot, for the life of me, figure out. Extra for whom? Extra as opposed to what? Unfortunately, this is the name given to the parameter by IHttpAsyncHandler (see, I told you "async" was common).  Its name tells me nothing about what I should do with it. It is very unclear that this value should be mapped to AsyncState in IAsyncResult.

I propose we call it state, with an acceptable variation of asyncState.

Now, if changing a parameter name was not a breaking change...


Trivia:

  • The above uses the Live Services that you can find more about here.
  • The IM presence service returns JSON which I parse using a fancy looking Regex instance. I recommend that production code use DataContractJsonSerializer() instead.

If you are using a loop, you're doing it wrong

"If you are using a loop, you're doing it wrong."

That is the advice one of my college professors told us when he was teaching us APL. APL was designed to perform vector and matrix operations. Programming in APL is an exercise in stringing operators together to perform strange and wonderful things. Using a loop is just wrong and slows things down.

It is similar with LINQ, if you are using a loop you are doing it wrong. I find myself doing a lot of prototyping lately and I am forcing myself to use LINQ; not because I don't like it, far from it, I really like LINQ, but using loops is so ingrained into my psyche that I have to stop myself and force myself to think in LINQ. Every time I am tempted to write a loop that involves a collection or an array I ask myself, could I use LINQ instead? Programmers with more of a database background seem to take to LINQ like a duck to water. They think in sets and vectors, I don't, but I am getting there.

For example sometimes I find myself needing to return an IEnumerable<T> when I have a List<T> but of a different T. This often happens when I want to keep internal implementation details private. My internal List<T> might have the actual class but I need to return an enumerable for some interface. Before I would simply write a loop using the C# 2.0's yield syntax,

List<Foo> foos = new List<Foo>();

...

public IEnumerable<IFoo> Foos {
    get {
        foreach (Foo f in foos)
            yield return f;
    }
}

This loop involves a collection, can I use LINQ? Sure! By using LINQ's Cast<T>() method this can be replaced with,

public IEnumerable<IFoo> Foos {
    get { return foos.Cast<IFoo>(); }
}

If you are trying to find if a list contains an object by some name, you could write a loop like,

public bool Contains(string value) {
    foreach (Foo foo in foos)
        if (foo.Name == value)
            return true;
    return false;
}

Using LINQ this might look like,

public bool Contains(string value) {
    return (from foo in foos where foo.Name == value select foo).Any();
}

A nice thing about LINQ is you can perform complicated queries in pieces. With deferred execution of enumerators, this is fairly efficient as well. This is really helpful in the debugger. In one chunk of code I was writing I needed to coalesce adjacent ranges that are marked with the same text value. Assume you have a structure called Range that looks like,

struct Range {
    public int Start;
    public int Length;
}

and another struct that labels the ranges with names,

struct NamedRange {
    public string Name;
    public Range Range;
}

Now lets have a routine that calculates the range information over some stream,

public IEnumerable<NamedRange> GetNamedRanges(Stream stream) {
    ...
}

Lets assume that name ranges are some things like "whtespace", "reserved", "identifier", "number" "string", etc. as you might expect to receive from a lexical scanner like found in this post.

What I want to do with these ranges is to convert the names into styles such as you might find referencing a CSS style sheet. So, in effect, I am mapping NameRange values to StyledRange values where StyleRange would look something like,

struct StyledRange {
    public Style Style;
    public Range Range;
}

Lets create a dictionary that maps range names to styles such as,

styleMap["number"] = new Style() { Name = "green" };
styleMap["reserved"] = new Style() { Name = "bold" };

I only wanted to highlight numbers and reserved words, for everything else I will use the default style,

defaultStyle = new Style { Name = "normal" };

We can translate our named ranges directly into styled ranges by using a LINQ query expression such as,

var ranges = GetNamedRanges(stream);
var styledRanges = from range in ranges
                   select new StyledRange() {
                       Style = styleMap.MapOrDefault(range.Name) ?? defaultStyle,
                       Range = range.Range
                   };

where MapOrDefault() is an extension method for IDictionary<TKey, TValue> that looks like,

public static TValue MapOrDefault(this IDictionary dictionary, TKey key) {
    TValue result;
    if (dictionary.TryGetValue(key, out result))
        return result;
    return default(TValue);
}

which is patterned after the existing LINQ methods for IEnumerable<T>, FirstOrDefault() and LastOrDefault().

Since many of the ranges that have different names will have the same style, it would be nice to coalesce adjacent styles together so no two adjacent ranges have the same style. In other words, we only want a new styled range when the style changes. The above query expression just produces a one-to-one mapping of named range to styled range. What we need is something that will merge adjacent ranges. Do do this I will introduce another extension method, Reduce<T>() for IEnumerable<T>,

public static IEnumerable<T> Reduce<T>(this IEnumerable<T> e,
    Func<T, T, bool> match,
    Func<T, T, T> reduce) {
    var en = e.GetEnumerator();
    T last;
    if (en.MoveNext()) {
        last = en.Current;
        while (en.MoveNext()) {
            if (!match(last, en.Current)) {
                yield return last;
                last = en.Current;
            }
            else
                last = reduce(last, en.Current);
        }
        yield return last;
    }
}

What this method does is if two adjacent elements match (as defined by the match delegate returning true) they will be reduced into one element by calling the reduce delegate. For example, the Sum<T>()standard extension method could be implemented using Reduce<T>() as,

var sum = numbers.Reduce((a, b) => true, (a, b) => a + b).First();

Now that we have Reduce<T>(), lets reduce the list of styled ranges to coalesce the adjacent ranges with the same style. This can be done by,

styledRanges = styledRanges.Reduce(
    (r1, r2) => r1.Style == r2.Style,
    (r1, r2) => new StyledRange() {
        Style = r1.Style,
        Range = MergeRanges(r1.Range, r2.Range)});

MergeRanges() referenced above, is,

Range MergeRanges(Range r1, Range r2) {
    return new Range() { Start = r1.Start, Length = r2.Start - r1.Start + r2.Length };
}

In my example, this took 18 ranges for a typical line of C# source down to 7 ranges. Pretty good. But I noticed that some of those ranges were styling whitespace as "normal". This seems like a waste; why switch back from green to black text for writing whitespace? Why not combine those with adjacent ranges instead? A simple approach to this is to add the following code prior to mapping the styles,

ranges = ranges.Reduce(
    (r1, r2) => r1.Name == "whitespace" || r2.Name == "whitespace",
    (r1, r2) => new NamedRange() {
        Name = r1.Name == "whitespace" ? r2.Name : r1.Name,
        Range = MergeRanges(r1.Range, r2.Range)
    });

This says to merge whitespace ranges with adjacent non-whitespace ranges. This reduces the range count from 7 to 4. Not bad from the original 18 and that was just for one line of source. This savings adds up quickly over an entire file.

The complete example, made as a function, looks like,

IEnumerable<StyledRange> StyleRanges(IEnumerable<NamedRange> ranges) {

    // Merge whitespace ranges with adjacent non-whitespace ranges
    ranges = ranges.Reduce(
        (r1, r2) => r1.Name == "whitespace" || r2.Name == "whitespace",
        (r1, r2) => new NamedRange() {
            Name = r1.Name == "whitespace" ? r2.Name : r1.Name,
            Range = MergeRanges(r1.Range, r2.Range)});

    // Map named ranges to styles.
    var styledRanges = from range in ranges 
                       select new StyledRange() { 
                           Style = styleMap.MapOrDefault(range.Name) ?? defaultStyle,
                           Range = range.Range };

    // Merge adjacent ranges with the same style.
    styledRanges = styledRanges.Reduce(
        (r1, r2) => r1.Style == r2.Style,
        (r1, r2) => new StyledRange() {
            Style = r1.Style,
            Range = MergeRanges(r1.Range, r2.Range)});

    return styledRanges;
}

There are a few nice things about this function. First, it builds up an IEnumerable<T> but this enumerable doesn't execute until one of its enumerators is enumerated. This is due to the deferred execution of enumerators. Second, even though deferred execution makes single step debugging challenging, you can get a picture of what the function will do by using ToArray() in the watch windows on the intermediate results. This allows you to inspect the intermediate result to see if the mapping or reducing is what you expected. Third, this routine has no loops. It operates on each enumerable as a set. Now some of you will rightly say there is a loop buried in the Reduce<T>() method. True; Reduce<T>() is just like many other LINQ extension methods, they contain loops implied by the returned enumerator. But LINQ allows me to write code that communicates what I am trying to do without it being obscured by the details of how it is done. I think of LINQ not so much as a set of features in C# but a way of programming. A way of programming that, if you are using loops, you are doing it wrong.

Scanners (not the movie)

In my previous post I introduced a program I wrote to that uses the method described in First-Order Logic by Raymond M. Smullyan to prove an arbitrary propositional logic expression is a tautology. In this program I use some techniques that I want to describe. The first is the scanner. When writing scanners I have always hand written the scanner. There are two primary reasons for this, first I have never found scanners that difficult to write and, second, they are very time critical, often taking more than 20% of the overall parsing time, and auto-generated scanners are hard to optimize. Tableaux uses a hand-written scanner. The code for the scanner is in Scanner.cs in the zip file found here.

A scanner is a routine that takes text input and classifies it into a stream of tokens. Scanners typically give one token at a time. The scanner is the lowest level of the parsing process. Parsers are built on top of scanners and use the scanner results to determine the meaning of the source using some kind of grammar. The distinction between the parser and the scanner is somewhat arbitrary. You could just consider each character of the input as a token and build the parser up from there. Doing this is typically slower than a traditional scanner, however, so most parsers use a scanner. The line between the scanner and the parser is usually drawn at the lexical elements such as identifiers, reserved words, numeric and string constants, comments, etc. More formally, a lexical element is a string of characters that can be matched and classified by a regular expression. Scanner generators, such as Lex, use regular expressions to specify the scanner.

The trick to a fast scanner is to reduce the per-character cost of the scanner. One way to do this is to make sure you are executing the fewest number of instructions per-character-scanned as possible. A few of tricks I use to reduce the per-character cost include,

  1. Use the switch statement
    • That included switch or case statement typically heavily optimize the result, often generating jump tables.
  2. Pull fields into local variables and only write them at the before returning.
    • Local variables can be mapped to registers during code generation. Fields never will be. Stores to global memory (i.e. a field of a object allocated from the heap) are very hard for code generators to optimize. Local variable are much easier.
  3. Make checking for boundaries (end of file or end of buffer) appear as a character.
    • This merges the end-of-file check with character classification.
    • In this case I use the character value 0 to indicate end of file. If I was willing to use unsafe code I would have taken advantage of the fact that the CLR ensures a 0 value at the end of every string.

One thing that looks slow that isn't is the GetChar() routine. Normally you would see that and think I am paying for a CALL/RET pair but his gets inlined in retail. I am paying for the end of buffer check every character but, as noted above, this can be removed if I wanted to use unsafe code, definitely overkill for a small program like tableaux.

I went a bit overboard, intentionally, with identifier. It handles C# style identifiers, handling Unicode letter and number classifications. To avoid calling the Unicode classification routines for every character I hard-coded the ASCII part and only call the Unicode classification routines when I encounter a non-ASCII potential identifier character. It is not that the classification routines are slow, it is that every cycle counts in the scanner and avoiding the CALL/RET and character table look is significant for file that contain mostly ASCII characters. Note also that I hard-code some non-identifier characters in the identifier internal loop. This avoids calling the Unicode classification routines for those character, which are the most common characters you might find after an identifier, which avoids calling the Unicode classification routines at all for most input.

To b | !b, that is the question

When I was reading First-Order Logic by Raymond M. Smullyan I was fascinated by the second chapter in which he describes a method for proving a propositional logic expression is a tautology. This method is called the Tableau Method. The Tableau Method is a rather straight forward way to prove a statement is always going to be true by demonstrating that asserting it is false will result in a contradiction. The propositional logic expression language Dr. Smullyan uses is rather simple. I present a modified version of it here. My version contains a not operator (!), a conjunction operator (&), a disjunction operator (|) and an implication operator (→).1 For example,

p & q

is true when both p and q are true;

p | q

is true when either p is true or q is true;

p → q

is true when p is false or p is true and q is also true; and finally,

!p

is true only when p is false. This should be very familiar to any programmer. A propositional logic expression is a tautology if it will always be true regardless of the value of its propositional variables. An example of a simple tautology is,

p | !p

which will be true if p is true or false. This statement is always true. This can be proven by simply demonstrating that asserting it is false results in a contradiction. For example, we assert a statement is false by prefixing it with a capital F or true with a capital T. For the above expression this would look like,

F p | !p

For this to be false both of the sub-expressions would also have to be false. This would then look like

F p
F !p

For !p to be false p would have to be true, which looks like,

T p

Now we have a contradiction, according to propositional logic, p cannot be both true and false so p | !p must always be true. Using the Tableau Method this would look like,

(1) F p | !p
(2) F p (1)
(3) F !p (1)
(4) T p (3)
X

where the X indicates that the tableau ends in a contradiction (i.e. is closed) and the numbers to the left label the expressions and the numbers to the right indicate which expression the assertion is derived from. To produce a tableau you first assert the expression is false then use the following table to add assertions,

Expression of the form Asserts
T !p F p
F !p T p
T p & q T p and T q
F p & q F p or F q
T p | q T p or T q
F p | q F p and F q
T p → q F p or T q
F p → q T p and F q

When two expressions are asserted the table might branch. If the above table says "and" then both statement are asserted directly, but if it contains an "or" then both assertions must lead to a contradiction independently. For example, consider wanting to prove the transitive property of the implication operator, that is (p→q & q→r) → (p→r), this would start off with adding asserts until we are required to branch,

(1) F (p→q & q→r) → (p→r)
(2) T p→q & q→r (1)
(3) F p→r (1)
(4) T p→q (2)
(5) T q→r (2)
(6) T p (3)
(7) F r (3)

We now have to handle (4) and (5) with require us to branch the table. The first branch is the F p branch,

(8) F p (4)
X

This lead immediately to a contradiction because of (6) so this branch of the tableau is closed. We now need handle the T q branch,

(9) T q (4)

This doesn't lead to an immediate contradiction so we need to branch again for (5). The first branch for (5) looks like

(10) F q (5)
X

This again leads to an immediate contradiction, we already asserted that q must be true in (9). The second branch for (5) is the T r branch.

(11) T r (5)
X

This also lead to an immediate contradiction because of (7). Since all branches lead to a contradiction the original expression must always be true.

Putting this altogether we get,

(1) F (p→q & q→r) → (p→r)
(2) T p→q & q→r (1)
(3) F p→r (1)
(4) T p→q (2)
(5) T q→r (2)
(6) T p (3)
(7) F r (3)

(8) F p (4)
X

(9) T q (4)

(10) F q (5)
X

(11) T r (5)
X

This is a simple and elegant way to demonstrate proof by contradiction. Of course I just had to write a program to tell me whether an arbitrary propositional logic expression is a tautology and automatically generate the tableau.

You can find the source project that is works with the just released Visual Studio 2008 here. It requires VS 2008 because I am addicted to some of the new C# features, especially the var keyword.


1 Dr. Sumullyan uses different characters for his operators but his characters are not as commonly included in browser fonts and are not as easy to type on a standard keyboard (using -> as a synonym of →) as the ones I use. << back

Operators, Generics and Policies II: Adapter Policy

The mixin technique in my last post was a bit awkward to use with generic methods. It is much less awkward to use for classes, however. For example, if you want an expression evaluator much like I posted here, but you want to leave the result type open, you can use this technique to do it.

First, I declared a policy interface for the operators I wanted to abstract over,

partial interface IOperatorPolicy<T> {
    T Add(T a, T b);
    T Subtract(T a, T b);
    T Multiply(T a, T b);
    T Divide(T a, T b);
}

Second, I created the operator policies for int and double. I could have created more, decimal, float, etc. but two is sufficient for demonstration.

partial struct IntOperatorPolicy : IOperatorPolicy<int> {
    int IOperatorPolicy<int>.Add(int a, int b) { return a + b; }
    int IOperatorPolicy<int>.Subtract(int a, int b) { return a - b; }
    int IOperatorPolicy<int>.Multiply(int a, int b) { return a * b; }
    int IOperatorPolicy<int>.Divide(int a, int b) { return a / b; }
}

partial struct DoubleOperatorPolicy : IOperatorPolicy<double> {
    double IOperatorPolicy<double>.Add(double a, double b) { return a + b;
    double IOperatorPolicy<double>.Subtract(double a, double b) { return a - b; }
    double IOperatorPolicy<double>.Multiply(double a, double b) { return a * b; }
    double IOperatorPolicy<double>.Divide(double a, double b) { return a / b; }
}

Next are the expression nodes themselves. I wrap them in a generic class that takes the parameters. This mitigates some of the clutter caused by the policy.

partial class Expression<T, OperatorPolicy>
    where OperatorPolicy : struct, IOperatorPolicy<T> {

    static OperatorPolicy _operatorPolicy = new OperatorPolicy();

    public abstract partial class Expr {
        public abstract T Evaluate();
    }

    public partial class Literal : Expr {
        private T _literal;

        public Literal(T literal) { _literal = literal; }

        public override T Evaluate() {
            return _literal;
        }

        public override string ToString() {
            return _literal.ToString();
        }
    }

    public abstract partial class BinaryOp : Expr {
        private Expr _left;
        private Expr _right;

        protected BinaryOp(Expr left, Expr right) { _left = left; _right = right; }

        public override T Evaluate() {
            return EvaluateOp(_left.Evaluate(), _right.Evaluate());
        }

        protected abstract T EvaluateOp(T left, T right);
    }

    public partial class Add : BinaryOp {
        public Add(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Add(left, right);
        }
    }
    
    public partial class Add : BinaryOp {
        public Add(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Add(left, right);
        }
    }

    public partial class Subtract : BinaryOp {
        public Subtract(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Subtract(left, right);
        }
    }

    public partial class Multiply : BinaryOp {
        public Multiply(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Multiply(left, right);
        }
    }

    public partial class Divide : BinaryOp {
        public Divide(Expr left, Expr right) : base(left, right) { }
        protected override T EvaluateOp(T left, T right) {
            return _operatorPolicy.Divide(left, right);
        }
    }
}

Here the Expression class acts as a container of a family of classes. Once you have an instance of one of the Expr derived classes, calling Evaluate() is straight-forward. Unfortunately, constructing one is a bit of a challenge. To get a literal of the int expression you would need to do,

var i1 = new Expression<int, IntOperatorPolicy>.Literal(1);

You can make this a little better by introducing a static method into Expression like,

public static Expr L(T literal) {
    return new Literal(literal);
}

then the above becomes,

var i1 = Expression<int, IntOperatorPolicy>.L(1);

This still a bit awkward especially when you want to construct a more complicated expression,

var e = new Expression<int, IntOperatorPolicy>.Add(i1, i1);

This can be made significantly better by adding operator overloads to Expr as in,

public abstract partial class Expr {
    public abstract T Evaluate();
    public static Expr operator +(Expr a, Expr b) {
        return new Add(a, b);
    }
    public static Expr operator -(Expr a, Expr b) {
        return new Subtract(a, b);
    }
    public static Expr operator *(Expr a, Expr b) {
        return new Multiply(a, b);
    }
    public static Expr operator /(Expr a, Expr b) {
        return new Divide(a, b);
    }
}

now the e assignment can look like,

var e = i1 + i1;

and finally, to make it look even better you can put the code to create the expression in a static method of a class that derives from Expression and closes the type parameters. Here are examples of one for int and a similar one for double.

class TestInt : Expression<int, IntOperatorPolicy> {
    public static void Run() {
        var iv = L(1) + L(2) * L(3);
        Console.WriteLine(iv.Evaluate());
    }
}

class TestDouble : Expression<double, DoubleOperatorPolicy> {
    public static void Run() {
        var dv = L(1.2) + L(3.4) * L(5.6);
        Console.WriteLine(dv.Evaluate());
    }
}

This shows taking advantage of the C# compiler's operator overloading to produce an expression tree and the use of the L() method as a sort of cast to change a literal of T into a Literal<T>. The call to Evaluate() calculates the result of the expression as an int or double. The code generated, as discussed last time, is quite good when JIT'ing outside the debugger and using optimized retail bits.

This use of a policy struct is what I refer to as an adapter policy. The Expression class has an abstraction it supports, represented by IOperatorPolicy<T>, but this abstraction is not supported by any of the interesting types you would want to pass as Expression's T. This is solved by creating an adapter policy and passing that along with T. This allows a generic type to introduce an abstraction that is unknown to the types the generic wants to range over, allow us, as above, to pass int for Expression's T even though int doesn't implement IOperatorPolicy<T> itself.

Next time I will discuss another application of an adapter policy.

Operators, Generics and Policies

One of the limitations of C# generics is you cannot abstract over operators. That is not completely true, you can if the base class used in where clause has operators; but, since operators are more useful, and more common, on value types, that is only marginally helpful. Also, this abstraction is provided by the base class not by using generics. What I would like to do is declare that my type  parameter requires the type to implement a particular operator. For example, to create a generic Add() method I would like to specify that the type parameter must implement the + operator and then be able to use the + operator in my code such as,

// Invalid C#
static T Add<T>(T a, T b) where T: T operator +(T, T) {
    return a + b;
}

C# doesn't support this but you can get somewhat close by supplying a policy struct like I described in this post. The policy interface and the int policy struct would look like,

interface IAddPolicy<T> {
    T Add(T a, T b);
}

struct IntAddPolicy : IAddPolicy<int> {
    public int Add(int a, int b) { return a + b; }
}

This allows you to write the following,

static T Add<T, P>(T a, T b) where P: struct, IAddPolicy<T> {
    return new P().Add(a, b);
}

Unfortunately the policy struct cannot be inferred from the parameters so you have to supply the type parameter explicitly,

var i = Add<int, IntAddPolicy>(3, 4);

which doesn't look pretty and generates less than efficient code for the Add() method on an i386 architecture,

00000000  sub         esp,8 
00000003  xor         eax,eax 
00000005  mov         dword ptr [esp],eax 
00000008  mov         dword ptr [esp+4],eax 
0000000c  mov         byte ptr [esp],0 
00000010  movsx       eax,byte ptr [esp] 
00000014  mov         byte ptr [esp+4],al 
00000018  add         ecx,edx 
0000001a  mov         eax,ecx 
0000001c  add         esp,8 
0000001f  ret              

This generates code to initialize the policy structure we don't actually use. We can get rid of that by recasting this a bit to save the dummy struct allocation by putting the Add() method in a wrapping class that takes the type parameters. The type would allocate the struct instead of the method. This looks like,

class Adder<T, P>
    where P: struct, IAddPolicy<T>
{
    static P AddPolicy = new P();
    static public T Add(T a, T b) { return AddPolicy.Add(a, b); }
}

which generates the following code for Add(),

00000000  add         ecx,edx 
00000002  mov         eax,ecx 
00000004  ret              

This can be called like,

var i2 = Adder<int, IntAddPolicy>.Add(3, 4);

Note that the code in IntAddPolicy gets inlined into the Add() method. Even though this doesn't get inlined into the caller of Add(), it will still be pretty quick with the branch prediction and register aliasing of modern processors. This means that you are not paying much (and in many cases nothing) in code quality for using an AddPolicy over writing out the Add() method explicitly.

To use the Add() method above with another type, such as double, you need to create another add policy struct. This would look like,

struct DoubleAddPolicy : IAddPolicy<double> {
    public double Add(double a, double b) { return a + b; }
}

Using this struct to call Add() will produce the following code for the Add() method,

00000000  fld         qword ptr [esp+0Ch] 
00000004  fadd        qword ptr [esp+4] 
00000008  ret         10h  

This is not quite as good as the code generated for int because it uses the stack to pass the parameters but it is still pretty fast.

We now have a method that can add two values of any type T for which an AddPolicy can be created. This is pretty close to, and more general than, what I originally wanted. Unfortunately, even though the code generated is good, calling this code is very awkward; we have to call a static method of a parameterized class and the policy struct prevents the compiler from being able to use type inferencing to supply the parameters for us. Next time I will describe a more practical application of this technique where the awkwardness is not as noticiable.

C# Mixins - Sort of

I recently needed to create a set of classes that was the combinatorial expansion of several implementation flavors of two independent sets of methods. Why I needed combinatorial expansion is not that interesting but the solution is. What I came up with is a way of supporting a form of mixin's in C#.

Lets take the methods,

void Lock();
bool Unlock();
bool Locked { get; }

If you call Lock(), Locked returns true, until you call Unlock(). Also Unlock() returns true if the object is now unlocked. There are several implementations I can think of for these methods, one uses a Boolean value and doesn't support nested calls to Lock():

void Lock() {
  _locked = true;
}

bool Unlock() {
  _locked = false;
  return true;
}

bool Locked {
  get { return _locked; }
}

Another uses an integer value to count the number of nested calls to Lock() and it remains locked until a matching number of Unlock() calls are made.

void Lock() {
  _count++;
}

void Unlock() {
  _count--;
  return _count == 0;
}

void Locked() {
  return _count > 0;
}

This isn't thread safe so we could provide a relatively thread-safe version in,

void Lock() {
    System.Threading.Interlocked.Increment(ref _count);
}

bool Unlock() {
    return System.Threading.Interlocked.Decrement(ref _count) == 0;
}

bool Locked {
    get { return _count > 0; }
}

I am sure you have already thought of at least one more. Now I have four possible implementations (including yours), which is the correct one? Unfortunately, that depends on how the locking is going to be used. If I know I will never need nested locks, only taking space for a Boolean makes sense. If I know I will need to support nested locks then the int seems like a good choice. If it is going to be used in a multi-threaded application I should probably use the thread-safe version. I now have two choices, pick one implementation or pass it the implementation as a parameter. Picking one implementation is easy, just copy of of the above implementation into the class; but how do you efficiently pass an implementation as a parameter?

From now on I will refer to each implementation as a locking policy. What I want is an efficient way to pass the locking policy to a set of classes. Most obvious way is to write a class for each policy and pass the policy as a parameter to each instance. This might look something like,

abstract class LockingPolicy {
  public abstract void Lock();
  public abstract bool Unlock();
  public abstract bool Locked { get; }
}

class LockableObject {
    private LockingPolicy _lockingPolicy;
  
    public LockableObject(LockingPolicy lockingPolicy) {
        _lockingPolicy = lockingPolicy;
    }
  
    public void Lock() {
        _lockingPolicy.Lock();
    }

    public bool Unlock() {
        return _lockingPolicy.Unlock();
    }

    public bool Locked {
        get { return _lockingPolicy.Locked; }
    }
}

Unfortunately, this has, on average, 36 bytes of overhead per instance on a 32 bit machine (more on 64).  This seems a bit much for such a policy and picking just one implementation, even one too general for the average application, starts looking like a better alternative. We must be able to do better than this.

Since passing the implementation as a parameter to the instances is expensive, why not try to pass the implementation to the class instead? In order to do that we need use a type parameter. What we want is something like,

class LockableObject<LockingPolicy> {
    private LockingPolicy _lockingPolicy = new LockingPolicy();

    public void Lock() {
        _lockingPolicy.Lock();
    }

    public bool Unlock() {
        return _lockingPolicy.Unlock();
    }

    public bool Locked {
        get { return _lockingPolicy.Locked; }
    }
}

In other words, we need to passing a class and delegate the implementation to that class. This does not look significantly better than the previous implementation, however, because we are new'ing up an instance and assigning it to an instance variable which, as before, consumes a minimum of 36 bytes per instance. But if LockingPolicy is a struct instead of a class then it will only add the size of the struct to the LockableObject instance size. In other words, we would only be taking space for either a bool or an int. The size overhead looks good, now how about the delegation? In order to call methods on LockingPolicy the LockingPolicy struct must implement an interface. It cannot be an abstract class because all structs are sealed and cannot be inherited from. The interface for the locking policy is fairly straight forward and looks like,

interface ILockingPolicy {
    void Lock();
    bool Unlock();
    bool Locked { get; }
}

An implementation of the Boolean locking policy looks like,

struct SimpleLockPolicy : ILockingPolicy {
    bool _locked;

    void ILockingPolicy.Lock() {
        _locked = true;
    }
    bool ILockingPolicy.Unlock() {
        _locked = false;
        return false;
    }
    bool ILockingPolicy.Locked {
        get { return _locked; }
    }
}

The counted locking policy looks like,

struct CountedLockPolicy : ILockingPolicy {
    private int _count;
    void ILockingPolicy.Lock() {
        _count++;
    }
    bool ILockingPolicy.Unlock() {
        _count--;
        return _count == 0;
    }
    bool ILockingPolicy.Locked {
        get { return _count > 0; }
    }
}

We now need to add a where clause to the LockableObject class so we can call the methods provided by the struct's methods. The modified LockableObject class looks like,

class LockableObject<LockingPolicy>
    where LockingPolicy : struct, ILockingPolicy 
{
    private LockingPolicy _lockingPolicy = new LockingPolicy();

    public void Lock() {
        _lockingPolicy.Lock();
    }

    public bool Unlock() {
        return _lockingPolicy.Unlock();
    }

    public bool Locked {
        get { return _lockingPolicy.Locked; }
    }
}

The LockableObject can now be instantiated like,

var lockableObject = new LockableObject<SimpleLockPolicy>();

for a lockable object that uses the Boolean implementation. To use the counted locking policy you would construct the object like,

var lockableObject = new LockableObject<CountedLockPolicy>();

The space consumption looks good but the delegation looks expensive it call through an interface which is a virtual. But since we are using structs, which are sealed, the call through the interface can be made static and then be a candidate for inlining. In fact, this is what happens in retail builds when not debugging. This is not entirely free however since there the runtime doesn't inline twice. Implementing Lock() directly instead of through delegation means the Lock() method can be inlined, removing the call entirely, the delegated version inlines the delegation but it still generates a call.

This mechanism for C# mixins is not as convenient as languages that support multiple inheritance, because we have to manually delegate to the policy, but it is just as efficient as we will see demonstrated more clearly in some of the next entries.

Keyed Binary Search

One thing that has always frustrated me about typical binary search routines found in libraries is they always assume you can easily produce a copy of the item you are searching for. What I mean is they typically follow this pattern,

int BinarySearch<T>(IList<T> list, T item);

which takes a list of some type, left open here, and searches for it in the list. This is great if the very next thing you do next is insert item into the list. You already have item in hand so this signature is perfect. But if you don't have a ready item, and item is difficult to produce, you are stuck.

This happened to me just recently in my code. I had a sorted list of a data structure representing a span of source, similar to a TextPointer from WPF, called a TextRange. My version can quickly return the offset of the beginning of the span but, since TextRange tracks source changes, creating a new TextRange is expensive. When text in the editor is change, I receive the location of the change in an event. How do I figure out which one of my TextRanges was modified? Since the list is sorted, the obvious choice is to use a binary search which yields O(Log N) performance. Since my TextRanges are store in a List<TextRange>, obviously I should call List<T>.BinarySearch(). Unfortunately, TextRanges are expensive to produce; too expensive for a per-keystroke operation (I am glossing over some details for the sake of brevity; you need to trust me this is true). I have one of two choices, write my own BinarySearch() or create a fake TextRange that is a special TextRange that is less expensive to create and only used for searching. Neither choice is pleasant. Hand-rolled binary searches are not hard to right but, for some strange reason, I always get them wrong the first time.

What I needed is prebuilt and tested version of BinarySearch() that looks more like,

int BinarySearch<T, K>(IList<T> list, K key, Converter<T, K> converter);

What this binary search routine does is find some key value in the list given a function that can convert any list item into the key. In my example, I would call it like,

int location = BinarySearch(textRangeList, location, TextRangeToLocation);

where TextRangeToLocation looks something like,

int TextRangeToLocation(TextRange range) { return range.Location; }

I don't need to produce a TextRange. I have the location I am looking for handy, I just need to know what, if any, TextRange contains the location. This method would do just that. You can even use a lambda function in C# 3.0 to get this all on one line,

int location = BinarySearch(textRangeList, location, (n) => n.Location);

This is starting to look like a good idea. Let's code it up. Here is the method I ended up with,

public static int BinarySearch<T, K>(IList<T> list, K value, 
    Converter<T, K> convert, Comparison<K> compare) {
    int i = 0;
    int j = list.Count - 1;
    while (i <= j) {
        int m = i + (j - i) / 2;
        int c = compare(convert(list[m]), value);
        if (c == 0)
            return m;
        if (c < 0)
            i = m + 1;
        else
            j = m - 1;
    }
    return ~i;
}

I added an extra parameter, compare. This is required because I didn't put any limits on what K can be and I need a way to compare one K with another. Many types, such as int, already know how to compare values of their own type. Typically types that are comparable, such as int, string, etc., implement IComparable<T>. We can accommodate such types by creating an overloaded version of BinarySearch() that looks like,

public static int BinarySearch<T, K>(IList<T> list, K value, 
    Converter<T, K> convert) where K : IComparable<K> {
    return list.BinarySearch(value, convert, (n, m) => n.CompareTo(m));
}

which uses a lambda function to create a Comparison<T> for any type that supports IComparable<T>. Beware that this doesn't work well with reference types because it doesn't check for null. You can make it safer by putting struct in the where clause but that seemed overly restrictive to me.

Now that I have a BinarySearch() that meets my needs better than the built-in BinarySearch(), my next thought was I should evangelize the benefits my version of BinarySearch() to the team responsible for maintaining the CLR collections and, maybe, someday, I will have it as a native part of the CLR; maybe, someday. Then I remember extensions methods. In C# 3.0, with a slight modification to the above methods, they can appear as if they were already part of the CLR.

public static class ListExtensionMethods {
    public static int BinarySearch<T, K>(this IList<T> list, K value,
        Converter<T, K> convert, Comparison<K> compare) {
        int i = 0;
        int j = list.Count - 1;
        while (i <= j) {
            int m = i + (j - i) / 2;
            int c = compare(convert(list[m]), value);
            if (c == 0)
                return m;
            if (c < 0)
                i = m + 1;
            else
                j = m - 1;
        }
        return ~i;
    }

    public static int BinarySearch<T, K>(this IList<T> list, K value, 
        Converter<T, K> convert) where K : IComparable<K> {
        return list.BinarySearch(value, convert, (n, m) => n.CompareTo(m));
    }
}

My version of BinarySearch() now appears as part of the methods of any type that implements IList<T>. This allows us to write,

int location = textRangeList.BinarySearch(location, (n) => n.Location);

just like it was a native part of the CLR. No more hand-rolled BinarySearch()s!

Now that we have a ListExtensionMethods class, what other missing methods could we add? Here are a few that I thought would be useful,

public static void Sort<T, K>(this List<T> list, 
    Converter<T, K> coverter, Comparison<K> compare) {
    list.Sort((n, m) => compare(convert(n), convert(m)));
}

public static void Sort<T, K>(this List<T> list, 
    Converter<T, K>  converter) where K : IComparable<K> {
    list.Sort((n, m) => convert(n).CompareTo(convert(m)));
}

public static void Swap<T>(this IList<T> list, int index1, 
    int index2) {
    var value = list[index1];
    list[index1] = list[index2];
    list[index2] = value;
}

public static void Shuffle<T>(this IList<T> list) {
    int count = list.Count;
    Random r = new Random();
    for(int i = count - 1; i > 1; i--)
        list.Swap(i, r.Next(i - 1));
}

The Sort() methods above are allow you to extract a key for sorting using the same methods as above. These are not strictly necessary since constructing the lambda function I use is not that complicated but I believe they make the code more readable if you are consistent in how you extract keys from a list, so these are convenient wrappers that allow you to use the same methods to sort and search. Swap() and Shuffle() are throw-ins. They are really only useful for demo code or, in my case, test code. I used them to test the Sort() method which I used to test the BinarySearch() method.

Pick a License

Reading Pick a License, Any License by Jeff Atwood inspired me to pick one for the code I post on this blog. I picked Ms-PL. I certainly intended people to use the code I post, this make that intent more clear. I have modified the my copyright footer to reflect this selection.

PolyDictionary IV: By extension

The technique we used to build PolyDictionary can be used in other places. For example, you can adapt a non-type-checked store to a type-checked store by supplying your own Get() and Set() methods. One place this might be useful is when you encounter something like CodeDom's UserData. It is intended to allow users or producers of a CodeDom to annotate a CodeDom with their own information. Because it uses IDictionary, using UserData is not type checked at compile time. You can introduce type checking by using your own methods to access the dictionary like,

  public static T Get<T>(IDictionary dictionary, Key<T> key) {
      return (T)dictionary[key];
  }

  public static void Set<T>(IDictionary dictionary, Key<T> key, T value) {
      dictionary[key] = value;
  }

which uses the Key<T> class from last time. This allows the compiler to ensure that the type of the value you get out is the same as you put in and vise versa.

With C# 3.0, however, you can get even more creative. You can create extension methods for IDictionary that makes it appear that all IDictionarys have a type checked Get() and Set() like,

    static class IDictionaryExtensions {
        public static T Get<T>(this IDictionary dictionary, Key<T> key) {
            return (T)dictionary[key];
        }
        public static void Set<T>(this IDictionary dictionary, Key<T> key, T value) {
            dictionary[key] = value;
        }
    }

This allows you to write code like,

    Key<string> k1 = new Key<string>();
    Key<int> k2 = new Key<int>();
    Key<double> k3 = new Key<double>();

    IDictionary dictionary = new Hashtable();

    dictionary.Set(k1, "one");
    dictionary.Set(k2, 2);
    dictionary.Set(k3, 3.33);

    string v1 = dictionary.Get(k1);
    int v2 = dictionary.Get(k2);
    double v3 = dictionary.Get(k3);

    Console.WriteLine("v1 = {0}", v1);
    Console.WriteLine("v2 = {0}", v2);
    Console.WriteLine("v3 = {0}", v3);

which allows us to use an IDictionary almost identically to a PolyDictionary

This works great for Get() and Set() but this doesn't work for Add(), unfortunately. The compiler accepts it but it doesn't do what we want. If we use a value of an incorrect type for a particular key we want the compiler to generate and error. In other words, if we change the Set() call for k1 above to,

  dictionary.Add(k1, 1);

we want the compiler to generate an error. This doesn't happen because of method overloading. The object version of Add, Add(object value) is called if the type is incorrect; this is not what we want. We could add an AddSafe() method to the extension class but then we would have to remember to call it instead of Add(), which defeats the purpose in my opinion.

PolyDictionary III: Cast away

The astute among you realized right away that PolyDictionary does not really avoid a cast, it just hides it. The TryGetValue() has a cast in it, highlighted below,

        public bool TryGetValue<K, V>(Key<K, V> key, out V value) {
            object objValue;
            if (_table.TryGetValue(key, out objValue)) {
                value = (V)objValue;
                return true;
            }
            value = default(V);
            return false;
        }

So, can we get rid of the this cast? Yes, but the results are not very pleasant. One approach is to create one dictionary for each unique value type and then find that dictionary by that type and index into it using the key, instead of storing all the values in the same dictionary. This sounds similar to the original problem! All we need is a PolyDictionary that maps T to Dictionary<Key<T>, T>. We could code this but we couldn't run it because we would be defining PolyDictionary in terms of itself, consuming all of memory in the process. We need something like PolyDictionary but isn't a PolyDictionary.

OK, things will now get ugly. Have your barf bags on the stand-by! And, for the love of God, don't ever put this in a production application!

Now that is off my chest, here is one technique to avoid the cast. You can create a nested class SubDictionary that has a static instance variable which maps PolyDictionaries to Dictionary<Key<T>, T>. We can then parameterize SubDictionary by T which will create a unique dictionary for each T (I warned you it was ugly). This is because the CLR treats every instantiation of SubDictionary<T> as a unique type with unique static variables. Essentially we are tricking the CLR into producing a dictionary of T to Dictionary<PolyDictionary,Dictionary<Key<T>,T>>. Since the variable is static we need to find the version that is for the instance we are using, hence the key of PolyDictionary.

Here is the example,

    /* Never, Never use this code! */
    class PolyDictionary {
        
        class SubDictionaries<T> {
            internal static Dictionary<PolyDictionary, Dictionary<Key<T>, T>> Dictionaries = 
                new Dictionary<PolyDictionary,Dictionary<Key<T>,T>>();
        }

        public PolyDictionary() { }

        public void Add<T>(Key<T> key, T value) {
            Dictionary<Key<T>, T> subDictionary = GetDictionary(key);
            subDictionary.Add(key, value);
        }

        public T Get<T>(Key<T> key) {
            Dictionary<Key<T>, T> subDictionary = FindDictionary(key);
            if (subDictionary != null)
                return subDictionary[key];
            throw new KeyNotFoundException();
        }

        public bool TryGetValue<T>(Key<T> key, out T value) {
            Dictionary<Key<T>, T> subDictionary = FindDictionary(key);
            if (subDictionary != null)
                return subDictionary.TryGetValue(key, out value);
            else {
                value = default(T);
                return false;
            }
        }

        Dictionary<Key<T>, T> FindDictionary<T>(Key<T> key) {
            Dictionary<Key<T>, T> subDictionary;
            if (!SubDictionaries<T>.Dictionaries.TryGetValue(this, out subDictionary))
                return null;
            return subDictionary;
        }

        Dictionary<Key<T>, T> GetDictionary<T>(Key<T> key) {
            Dictionary<Key<T>, T> subDictionary;
            if (!SubDictionaries<T>.Dictionaries.TryGetValue(this, out subDictionary)) {
                subDictionary = new Dictionary<Key<T>, T>();
                SubDictionaries<T>.Dictionaries.Add(this, subDictionary);
            }
            return subDictionary;
        }
    }

This code has so many problems I don't know where to start. For one, it never frees memory, the dictionaries are in memory and will never be collected. I only include it to show that it can be done. As you can see, sometimes it is better to have a cast. There might be better techniques that eliminate some of the problems but my feeling is, why bother. The cast is not that bad.

PolyDictionary II: Constructed Keys

In the last entry we defined a dictionary that can contain a value of any type and extra that value out of the dictionary without our code using a cast. The dictionary itself used a cast (which I show a "way" to remove it next time) but our code didn't. One problem it had, however, is it used declared keys instead of constructed keys. A declared key is a key can only be referenced by knowing the variable it is assigned to; a unique object. A constructed key, on the other hand, uses object equality instead of object identity for the key. String keys are constructed keys; you can read them from a file, calculate them or use a literal value. The dictionary will eventually compare the strings for equality. In other words, constructed keys can be constructed from data where declared keys cannot.

Can we modify PolyDictionary to use constructed keys? One way to do it is to create a new Key type that takes the constructed key type as a parameter. One such type is,

    struct Key<K, V> {
        public K Value;
        public Key(K value) {
            this.Value = value;
        }
        public static implicit operator Key<K, V>(K value) {
            return new Key<K, V>(value);
        }
    }

This struct just wraps the key value. I use a struct instead of a class because structs will do a value comparison for equality by default, classes use reference equality.

The existing methods of PolyDictionary do not take this kind of key; but, we can add methods to PolyDictionary that do. For example, we can create an Add() method that looks like,

    public void Add<K, V>(Key<K, V> key, V value);

We can then create a Get() method that looks like,

    public V Get<K, V>(Key<K, V> key);

All these methods can be added directly to PolyDictionary because, through method overloading, they don't interfere with the original, declared key versions.

Using this dictionary looks like,

    Key<string, string> k1 = "k1";
    Key<string, int> k2 = "k2";
    Key<int, double> k3 = 3;

    PolyDictionary dictionary = new PolyDictionary();

    dictionary.Add(k1, "one");
    dictionary.Add(k2, 2);
    dictionary.Add(k3, 3.33);

    string v1 = dictionary.Get(k1);
    int v2 = dictionary.Get(k2);
    double v3 = dictionary.Get(k3);

    Console.WriteLine("v1 = {0}", v1);
    Console.WriteLine("v2 = {0}", v2);
    Console.WriteLine("v3 = {0}", v3);

The complete example of PolyDictionary follows,

    class PolyDictionary {
        private Dictionary<object, object> _table;

        public PolyDictionary() {
            _table = new Dictionary<object, object>();
        }

        public void Add<K, V>(Key<K, V> key, V value) {
            _table.Add(key, value);
        }

        public void Add<T>(Key<T> key, T value) {
            _table.Add(key, value);
        }

        public bool Contains<K, V>(Key<K, V> key) {
            return _table.ContainsKey(key);
        }

        public bool Contains<T>(Key<T> key) {
            return _table.ContainsKey(key);
        }

        public void Remove<K, V>(Key<K, V> key) {
            _table.Remove(key);
        }

        public void Remove<T>(Key<T> key) {
            _table.Remove(key);
        }

        public bool TryGetValue<K, V>(Key<K, V> key, out V value) {
            object objValue;
            if (_table.TryGetValue(key, out objValue)) {
                value = (V)objValue;
                return true;
            }
            value = default(V);
            return false;
        }

        public bool TryGetValue<T>(Key<T> key, out T value) {
            object objValue;
            if (_table.TryGetValue(key, out objValue)) {
                value = (T)objValue;
                return true;
            }
            value = default(T);
            return false;
        }

        public V Get<K, V>(Key<K, V> key) {
            return (V)_table[key];
        }

        public T Get<T>(Key<T> key) {
            return (T)_table[key];
        }

        public void Set<K, V>(Key<K, V> key, V value) {
            _table[key] = value;
        }

        public void Set<T>(Key<T> key, T value) {
            _table[key] = value;
        }
    }

Note: the implementation changed a bit from the previous example. The Get() methods now use the this property instead of calling TryGetValue(). This allows the Get() methods to be inlined at the cost of an additional cast or two in the code.

PolyDictionary

CLR's standard Dictionary<K,T> allows you to express the type of the key and the type value so you don't have to cast on the way in and out of a dictionary. For example you can cache a bunch of strings based on some integer key. I use this in the following program that takes an integer number and writes out the English equivalent.

    class Program {
        static Dictionary<int, string> names = new Dictionary<int, string>();

        static Program() {
            names[0] = "zero";
            names[1] = "one";
            names[2] = "two";
            names[3] = "three";
            names[4] = "four";
            names[5] = "five";
            names[6] = "six";
            names[7] = "seven";
            names[8] = "eight";
            names[9] = "nine";
            names[10] = "ten";
            names[11] = "eleven";
            names[12] = "twelve";
            names[13] = "thirteen";
            names[14] = "fourteen";
            names[15] = "fifteen";
            names[16] = "sixteen";
            names[17] = "seventeen";
            names[18] = "eighteen";
            names[19] = "nineteen";
            names[20] = "twenty";
            names[30] = "thirty";
            names[40] = "forty";
            names[50] = "fifty";
            names[60] = "sixty";
            names[70] = "seventy";
            names[80] = "eighty";
            names[90] = "ninety";
            names[100] = "hundred";
            names[1000] = "thousand";
            names[1000000] = "million";
            names[1000000000] = "billion";
        }

        static int WriteValue(int value, int place) {
            if (value >= place) {
                int subValue = value / place;
                WriteValue(subValue);
                Console.Write(" {0} ", names[place]);
                value = value % place;
            }
            return value;
        }

        static void WriteValue(int value) {
            int originalValue = value;
            if (value < 0) {
                value = -value;
                Console.WriteLine("negative ");
            }
            value = WriteValue(value, 1000000000);
            value = WriteValue(value, 1000000);
            value = WriteValue(value, 1000);
            value = WriteValue(value, 100);
            if (value > 20) {
                int subValue = (value / 10) * 10;
                Console.Write(names[subValue]);
                value = value % 10;
                if (value == 0)
                    return;
                Console.Write("-");
            }
            if (value != 0 || originalValue == 0)
                Console.Write(names[value]);
        }

        static void Main(string[] args) {
            while (true) {
                Console.Write("Enter number: ");
                string line = Console.ReadLine();
                if (line == "exit" || string.IsNullOrEmpty(line))
                    break;
                int value;
                if (int.TryParse(line, out value)) {
                    WriteValue(value);
                    Console.WriteLine();
                }
                else
                    Console.WriteLine("Error: expected number");
            }
        }
    }

This example doesn't use any casts. The expression names[10] is of type string and the index is of type int. This is great when the values of the dictionary are all homogeneous as name is above. But what if I want to store values of varying types but still not use a cast. In other words, I want a heterogeneous, type-checked dictionary. If I don't mind casts, I can just use a Hashtable as in,

    Hashtable table = new Hashtable();

    table.Add("k1", "one");
    table.Add("k2", 2);
    table.Add(3, 3.33);

    string v1 = (string)table["k1"];
    int v2 = (int)table["k2"];
    double v3 = (double)table[3];

    Console.WriteLine("v1 = {0}", v1);
    Console.WriteLine("v2 = {0}", v2);
    Console.WriteLine("v3 = {0}", v3);

Unfortunately, if I later change the code to,

    table.Add("k1", 1);
    table.Add("k2", 2);
    table.Add(3, 3.33);

    string v1 = (string)table["k1"];
    int v2 = (int)table["k2"];
    double v3 = (double)table[3];

The compiler will dutifully compile the code and I will get a runtime error at the first use of the dictionary when it tries to cast an integer to a string. I would like the compiler to catch this type of error instead of having to wait until my user does.

I have keys of varying type and values of varying type. How can I get this and type verification as well? In other words, what I want is something like,

    dictionary.Add(k1, "one");
    dictionary.Add(k2, 2);
    dictionary.Add(k3, 3.33);

    string v1 = dictionary[k1];
    int v2 = dictionary[k2];
    double v3 = dictionary[k3];

What I want to do is be able to infer the returns result of the expression dictionary[k1] from the key. In the above example, there should be something about the k1 key that tells the compiler that the result will be a string and when I change it later to an int, I want the compiler to catch the assignment to string as a type error.

To do this I need a method that can infer the type of the return result from a parameter. A simple example of such a method is,

    T NoOp(T value) { return value; }

which returns the same value of the same type as it retrieves for any value of any type. The return type is inferred from the parameter value's type.

This is not good enough yet. We don't want the key type to be the same as the value type; we want the value type to be arbitrary. To do this we need to encode the value's type in the key. One way is to have the key's type include the value's type as part of the key's type definition. This can be done through supplying a dummy type parameter, a type parameter that is not needed in the definition of the type but the dummy type parameter is carried around by the type and can be used by type inferencing. To do this we need a method something like,

    T Get<T>(Key<T> key) { ... }

which takes a key and extracts the return type from the keys definition. This requires a Key class definition that has one type parameter which is simply,

    class Key<T> { public Key() { } }

Note we don't use T in Key, it is just there so type inferencing will find it. We can now define some keys,

    Key<string> k1 = new Key<string>();
    Key<int> k2 = new Key<int>();
    Key<double> k3 = new Key<double>();

The keys are just arbitrary instances. Type inferencing will now be able to extract T from the key's type and use it as the return result! This means that Get(k1) will be  of type string and Get(k2) will be of type int.

Type inferencing also works the same way for other parameters of the method. For the Add() method we can define it as,

    public void Add<T>(Key<T> key, T value);

With this  Add() the code above will compile because the type of value is also inferred from the key!

Unfortunately, the this property does not accept type parameters so we cannot get the code,

    string v1 = dictionary[k1];
    int v2 = dictionary[k2];
    double v3 = dictionary[k3];

to compile; but if we replace the array indexers with a Get() method we can get close, as in,

    string v1 = dictionary.Get(k1);
    int v2 = dictionary.Get(k2);
    double v3 = dictionary.Get(k3);

This works if dictionary has a Get() method like we described above,

    public T Get<T>(Key<T> key);

There we have it! With the Add() and Get() methods we have the beginnings of a dictionary that can hold any value of any type and retrieve the value back without requiring our use of the dictionary to use a cast. A complete example is given below,

    class Key<T> { public Key() { } }

    class PolyDictionary {
        private Dictionary<object, object> _table;

        public PolyDictionary() {
            _table = new Dictionary<object, object>();
        }

        public void Add<T>(Key<T> key, T value) {
            _table.Add(key, value);
        }

        public bool Contains<T>(Key<T> key) {
            return _table.ContainsKey(key);
        }

        public void Remove<T>(Key<T> key) {
            _table.Remove(key);
        }

        public bool TryGetValue<T>(Key<T> key, out T value) {
            object objValue;
            if (_table.TryGetValue(key, out objValue)) {
                value = (T)objValue;
                return true;
            }
            value = default(T);
            return false;
        }

        public T Get<T>(Key<T> key) {
            T value;
            if (!TryGetValue(key, out value))
                throw new KeyNotFoundException();
            return value;
        }

        public void Set<T>(Key<T> key, T value) {
            _table[key] = value;
        }
    }

WPF Forum Best Practices

Rob's post Best Forum Practices recommends some guidelines for the WPF forum. My favorite is,

Recommended Steps for Answering [Questions]:

1) Answer the question

These are good guidelines for any question/answer style forum, not just WPF's.

Follow-up to Advice

I have to be honest; I didn’t expect the response I received from my Fatherly advice to new programmers article. There have been some very nice things written about it, not only in the comments but in other blogs. I don’t consider myself a very good writer and having my writing style praised was highly unexpected. I thank everyone for their kind words. Also, if you haven’t read the comments posted to the article yet, I highly recommend it. They offer interesting insights from a variety of perspectives.

I also recommend reading some of the blogs entries that have been written in response to it. I would like to point out one in particular from Jeff Atwood: Shipping Isn't Enough. I agree with him. You not only need to focus on shipping but make sure what you are working on is worth shipping. You need to focus on shipping programs people use and enjoy using.

Other bloggers have pointed out that you often don’t that much control over what you are working on, such as Mike-O-Matic’s Shipping Is Enough, Sometimes. And other point out that sometimes you don’t know if what you are working on will be used or not, as in Avonelle’s If a program falls in a forest....

These entries remind me of one more bit of advice: you are not your users. Just because you like something some particular way doesn’t mean that your users will, nor just because you would never use something doesn’t mean nobody needs it. You are not your users; you should get to know them.

The Expression Problem: The JRuby Interpreter

I previously discussed the expression problem could be solved with various patterns including the visitor pattern and using a procedural approach. It is interesting to note that JRuby used to use a visitor pattern but switched to a procedural approach to improve performance. You can read about it here. This is a side note in the article since what they want to do is convert Ruby to run as byte code similar to how JPython works.

Linked List VI: Recursion - see recursion

I finally have a definitive answer to the question of why MyNode is needed. Consider the following, simplified, version of the LinkedList classes I posed,

  class C<X> { class N { } }
  class D: C<D.N> { }

This is really an obfuscated form of

  class N<X> { }
  class C<X> { }
  class D: C<N<?>> { }

where ? needs to be replaced by an infinite list if N<N<...>>. In other words, I am trying to define a type directly in terms of itself, a T where T=N<T>. This is not allowed. An additional type is required to break the cycle, such as M in,

  class M: N<M> { }

Then I can write,

  class D: C<M> { }

So, there we have it, MyNode is necessary to avoid defining a type directly in terms of itself. The unobfuscated version makes this much more clear. MyNode serves the same role as M does above.

Thanks again to Andrew for taking the time to beat this into my thick skull.

Fatherly Advice To New Programmers

It looks like none of my children will become programmers. Instead of letting my fatherly advice to my new programmer son or daughter go to waste, I am going to inflict it on you. If you are newly embarking on the journey that is becoming a programmer, here is advice your father would tell you if he were a programmer. These are things I had to learn the hard way.

Keep Learning: Read. Go to conferences. Subscribe to journals. Take classes. Whatever it takes for you to keep learning, make it a priority. Learn about every language you can find. Take time to learn about any new frameworks, algorithms, techniques, models, paradigms, you can. Each gives you one more tool in your tool chest. Each will help you more easily tackle your next programming problem. Find a mentor, someone much better than you, and learn all they can teach you. Never stop learning.

Learn To Communicate: I often joke that the most important skill you can learn as a programmer is how to draw a rectangle on a white-board. Communication is critical to the job of a programmer. Communicating with customers, clients, users, co-workers, bosses, vice presidents, CEO's, board-members, VC capitalists, all will become important at some point in your career. Learn how to speak in public. Learn how to write in English. Learn to effectively communicate in person. Learn how to persuade without shouting, getting angry, or getting flustered. Learn how to speak without jargon. Help people understand what you are doing. Learn to break things into simple, understandable pieces. Learn to communicate by analogy and symbolism. Learn to communicate.

Be Predictable: Learn how fast you can comfortably program. Wait to predict how long it will take you to complete a task until you understand it. Allow for the unexpected. Plan for vacations and time-off. Live with your predictions. I don't believe I know a problem well enough to predict how long it will take to complete until I can break that task down into sub-tasks that each take no longer than 3 days (often less than one day). Live by this rule, under-promise, over-deliver. It is better to deliver in 10 days what you promised in 15 than to deliver in 10 days what you promised in 5. People depend, schedule, and plan around your predictions. Make them the best you can and make sure you can comfortably do them or you will be asked to live up to your uncomfortable predictions. You will not be good at it at first; to compensate, verify your predictions with someone more experienced. Learn to get better. Be predictable; other depend on you.

Own Up To Your Mistakes: You will make mistakes. How you handle your mistakes is how you will be judged. Learn how to say "I was wrong." If you underestimated how long it will take you to do something, tell people as soon as it is clear to you. If you broke the build, fix it. If you created a bug, fix it. Don't deny the mistake, don't make excuses for the mistake, don't figure out how to hide the mistake, don't blame others for the mistake, do something about it. Take ownership of your mistake or you will repeat it.

Never Let Bad Code Off Your Desk: Your job as a programmer is to write code that works, never let code off your desk you are not sure meets that criteria. Not only does it reflect badly on you, it is much more expensive, and much harder, to find a problem once it leaves your desk than before. Learn to love unit tests. Learn to love code coverage. Learn to test your code better than people who are paid to test it. Be embarrassed about bugs that are found after you have checked-in. Be especially embarrassed when a customer finds the bug. Don't rely on others to find your bugs for you, find them and fix them yourself. Don't hope it will work. Test it. Don't assume it will work. Test it. Don't whatever. Just test it. If you haven't tested it, it doesn't work; of this you can be sure. But, even if you are diligent with testing, bugs will get by you. You will make mistakes but try your best not to.

Programming is Fun But Shipping is Your Job: Programming is fun. It is the joy of discovery. It is the joy of creation. It is the joy of accomplishment. It is the joy of learning. It is fun to see your handiwork displaying on the screen. It is fun to have your co-workers marvel at your code. It is fun to have people use your work. It is fun have your product lauded in public, used by neighbors, and discussed in the press. Programming should be fun and if it isn't, figure out what is making it not fun and fix it. However, shipping isn't fun. I often have said that shipping a product feels good, like when someone stops hitting you. Your job is completing the product, fixing the bugs, and shipping. If bugs need fixing, fix them. If documentation needs writing, write it. If code needs testing, test it. All of this is part of shipping. You don't get paid to program, you get paid to ship. Be good at your job.

Remember these simple statements,

  • Never stop learning.
  • Communication is critical.
  • Under promise, over deliver.
  • "I was wrong."
  • If it is not tested it doesn't work.
  • Programming isn't your job, shipping is.

Linked List V: Revenge of the Type Parameter

I honestly thought I was done with this topic but as it turns out I missed something. I stated firmly that "currently, in C#, we are stuck with the cast" but I was wrong. One of the great things about working at Microsoft is the people you can interact with. In back burner researching a definitive answer to Hallvard's question to my post of why linked list LinkedList<T> cannot be written,

  class LinkedList<T>: LinkedList<T, LinkedList<T>.Node> { }

I asked Andrew Kennedy to look at my page. I like talking to Andrew; I learn at least one new thing every time we talk. This was certainly no exception. He quickly pointed out something that was staring me squarely in the face but I missed entirely, folding my implementation of DoubleLinkedList<T> with DoubleLinkedList<T, N> and closing the node type directly will remove the cast. In other words, DoubleLinkedList<T> can be written without casts as,

    class DoubleLinkedList<T>: LinkedList<T, DoubleLinkedList<T>.DoubleNode>,
        IDoubleLinkedList<T> {

        public class DoubleNode: Node {
            DoubleNode _previous;

            public override DoubleNode Next {
                get {
                    return base.Next;
                }
                set {
                    base.Next = value;
                    value._previous = this;
                }
            }

            public DoubleNode Previous { get { return _previous; } }
        }

        public IEnumerable<T> Backwards {
            get {
                if (Last != null) {
                    DoubleNode current = Last;
                    do {
                        yield return current.Data;
                        current = current.Previous;
                    } while (current != Last);
                }
            }
        }
    }

which closes the node parameter instead of leaving it open. This is one of those head-slapping, "how stupid of me" moments like when someone points out that "you know, if you would have pushed the other pawn instead, you would have had checkmate in three moves." You can see that this is a straight forward application of the requires clause version (declaring it to be DoubleNode instead of requiring it). In my zest to introduce the concepts of the "this type" from LOOM and the requires clause from Scala, I missed what should have been obvious. Silly me.

This solution has the limitation that you cannot derive something like TripleLinkedList<T> from DoubleLinkedList<T> so my example still holds some water (TripleLinkedList<T, N> can be derived from DoubleLinkedList<T, N>) but it is not nearly so compelling. I still think that one of the two features would be very useful but I need a much better motivating example.

While I still don't have a definitive answer to Hallvard's question, I picked up this gem along the way. Not bad so far.

Linked List IV: Removing casts

In the last entry we successfully derived a double-linked list implementation from a single-linked list implementation, but we had to use casts to get it to compile. What we want to do now is see if we can remove the casts.

We will first remove the cast in the Backwards method.

        DoubleNode current = (DoubleNode)Last;

The reason we need the cast is because the base class defines Last to be

        protected Node Last { get { return _last; } }

What we need is the type to be DoubleNode, but this property is defined in a descendent class. To allow us to change the type we need to leave the type open, or, in other words,  add the node type as a type parameter. To accomplish this, I replaced all references to the node type with a type parameter. The new LinkedList class looks like,

    class LinkedList<T, NodeType>: IEnumerable<T>, ILinkedList<T> 
        where NodeType: LinkedList<T, NodeType>.Node, new() {

        public class Node {
            private T _data;
            private NodeType _next;

            public T Data { 
                get { return _data; }
                set { _data = value; } 
            }
            public virtual NodeType Next { 
                get { return _next; } 
                set { _next = value; } 
            }
        }

        NodeType _last;

        public void Add(T value) {
            NodeType newNode = new NodeType();
            newNode.Data = value;
            if (_last == null) {
                _last = newNode;
                newNode.Next = newNode;
            }
            else {
                newNode.Next = _last.Next;
                _last.Next = newNode;
                _last = newNode;
            }
        }

        public void Clear() {
            _last = null;
        }

        public IEnumerator<T> GetEnumerator() {
            if (_last != null) {
                NodeType current = _last;
                do {
                    current = current.Next;
                    yield return current.Data;
                } while (current != _last);
            }
        }

        IEnumerator IEnumerable.GetEnumerator() {
            return GetEnumerator();
        }

        protected NodeType Last { 
            get { return _last; } 
        }
    }

The where clause looks complicated but what it says is that NodeType's actual parameter must derive from the Node type defined in the class. This allows me to use the properties defined for the node class on variables of type NodeType. The new() part says that the actual parameter must also have a default constructor. This allows us to remove the CreateNode() method.

Unfortunately, this definition is a bit difficult to use because you need to define a node class to pass as a parameter. To make this easier, I defined a single parameter version,

    class LinkedList<T>: LinkedList<T, LinkedList<T>.MyNode> {
        public class MyNode: Node { }
    }

I implemented the double-linked list as,

    class DoubleLinkedList<T, NodeType>: LinkedList<T, NodeType>, IDoubleLinkedList<T> 
        where NodeType: DoubleLinkedList<T, NodeType>.DoubleNode, new() {
        
        public class DoubleNode: Node {
            NodeType _previous;

            public override NodeType Next {
                get {
                    return base.Next;
                }
                set {
                    base.Next = value;
                    value._previous = (NodeType)this;
                }
            }

            public NodeType Previous { get { return _previous; } }
        }

        public IEnumerable<T> Backwards {
            get {
                if (Last != null) {
                    NodeType current = Last;
                    do {
                        yield return current.Data;
                        current = current.Previous;
                    } while (current != Last);
                }
            }
        }
    }

This looks almost identical to the previous DoubleLinkedList that I posted last time but it no longer needs a cast in Backwards. This is achieved by constraining types passed as NodeType to descend from DoubleNode instead of just Node. We can pass NodeType to LinkedList<T, NodeType> because the constrains on NodeType are compatible with the constraints for NodeType in LinkedList<T, NodeType>. I require that NodeType derive from DoubleNode which, itself, derives from Node. LinkedList<T, NodeType> requires NodeType to derive from Node; requiring it derive from DoubleNode ensures that is the case. I will also make this easier to use, like LinkedList<T, NodeType> above, by creating DoubleLink<T> from DoubleLink<T, NodeType>, as in,

    class DoubleLinkedList<T>: DoubleLinkedList<T, DoubleLinkedList<T>.MyNode> {
        public class MyNode: DoubleNode { }
    }

But, we still need a cast in the Next property. The reason we need it is, even though we know that NodeType derives from DoubleNode, the compiler only is sure that the type of the object itself is DoubleNode. It cannot prove that it must be a NodeType and there is no way in C# to declare that it must be. Currently, in C#, we are stuck with the cast. A couple languages have tried to deal with this problem. One such language is LOOM by Kim Bruce and others. LOOM allows declaring that a variable must be the same type as this. What that might look like in our example is the following,

        public class DoubleNode: Node {
            this _previous;

            public override NodeType Next {
                get {
                    return base.Next;
                }
                set {
                    base.Next = value;
                    value._previous = this;
                }
            }

            public this Previous { get { return _previous; } }
        }

where I assume that the keyword this is used where the type is assumed to be the type of this. This concept also shows up in Eiffel as like Current. Unfortunately, proving that NodeType and DoubleNode's this type are identitical is a non-trivial problem and has the side affect of adding exact typing to the language or leaves the type system unsound as Eiffel's like Current does.

Another approach is to constrain the decedents of DoubleNode to also to be decedents of NodeType. The only requirement we have on this is that it be assignable to a variable of type NodeType. If we could express that all concrete implementations of DoubleNode are required to be decedents of NodeType this requirement would be met. This is the approach that Scala, by Martin Odersky and others, took to the problem. What this might look like in C# is,

        public abstract class DoubleNode: Node requires NodeType {
            NodeType _previous;

            public override NodeType Next {
                get {
                    return base.Next;
                }
                set {
                    base.Next = value;
                    value._previous = this;
                }
            }

            public NodeType Previous { get { return _previous; } }
        }

This states that descendants of DoubleNode are required to also be decedents of type NodeType (or be NodeType, as is more likely the case). Knowing this we can then remove the cast of this because we are guaranteed it is assignment compatible with NodeType.

Whether either of these extensions make it into C# is anyone's guess. It is still open whether the solutions presented are really better than the problem. Is requiring the cast really that bad? I will leave that up to you to answer. Also I glossed over some tricky problems that would need to be addressed to implement either option, not the least of which are the changes to the CLR.

This all started off innocently enough. We wanted to implement

    interface ILinkedList<T>: IEnumerable<T> {
        void Add(T value);
        void Clear();
    }

and,

    interface IDoubleLinkedList<T>: ILinkedList<T> {
        IEnumerable<T> Backwards { get; }
    }

but things got complicated when we wanted to share code and remove casts. This innocent looking problem is devilishly hard to express in a type-checked object-oriented language.

Linked List III: Double-linked list

A straight forward implementation of a double-linked list that implements the interface,

    interface IDoubleLinkedList<T>: ILinkedList<T> {
        IEnumerable<T> Backwards { get; }
    }

can be implemented as,

    class DoubleLinkedList<T>: IDoubleLinkedList<T>, IEnumerable<T> {

        private class Node {
            public T Data;
            public Node Next;
            public Node Previous;
        }

        private Node _last;

        public DoubleLinkedList() { }

        public void Add(T value) {
            Node newNode = new Node();
            newNode.Data = value;
            if (_last == null) {
                newNode.Next = newNode;
                newNode.Previous = newNode;
                _last = newNode;
            }
            else {
                newNode.Next = _last.Next;
                _last.Next.Previous = newNode;
                _last.Next = newNode;
                newNode.Previous = _last;
                _last = newNode;
            }
        }

        public void Clear() {
            _last = null;
        }

        public IEnumerator<T> GetEnumerator() {
            if (_last != null) {
                Node current = _last;
                do {
                    current = current.Next;
                    yield return current.Data;
                } while (current != _last);
            }
        }

        IEnumerator IEnumerable.GetEnumerator() {
            return GetEnumerator();
        }

        public IEnumerable<T> Backwards {
            get {
                if (_last != null) {
                    Node current = _last;
                    do {
                        yield return current.Data;
                        current = current.Previous;
                    } while (current != _last);
                }
            }
        }
    }

With a casual comparison of this implementation to the implementation of LinkList<T> you will see that most of the methods are the same. Clear() and GetEnumator() are identical. Add() is nearly identical; in fact, it would be identical if we changed the Node implementation to

        private class Node {
            public T Data;
            private Node _next;
            public Node Previous;

            public Node Next {
                get {
                    return _next;
                }
                set {
                    _next = value;
                    value.Previous = this;
                }
            }
        }

With this change to the Node class, the only difference between LinkedList<T> and DoubleLinked<T> is the Node class and the Backwards property. It seems obvious we should use inheritance to share code. DoubleLinked<T> should derive from LinkedList<T>. To allow code sharing we need to make some changes to LinkedList<T>. First we need to change Node implementation to allow DoubleLink<T> to override the behavior of Next,

        protected class Node {
            public T Data;
            private Node _next;
            public virtual Node Next { get { return _next; } set { _next = value; } }
        }

Then we need to add,

        protected Node Last { get { return _last; } }

to allow DoubleLinked<T> to access to _last but not modify it. DoubleLink<T> need Last to implement Backwards. Additionally we need to introduce

        protected virtual Node CreateNode() { return new Node(); }

and change the first statement in the Add() method to,

            Node newNode = CreateNode();

to allow DoubleLinked<T> to override which class to create for the nodes in the list. This allows DoubleLink<T> to create DoubleNode instead of Node. With these changes we can now change the implementation of DoubleLink<T> implementation to,

    class DoubleLinkedList<T>: LinkedList<T>, IDoubleLinkedList<T> {

        protected class DoubleNode: Node {
            public DoubleNode Previous;

            public override Node Next {
                get {
                    return base.Next;
                }
                set {
                    base.Next = value;
                    ((DoubleNode)value).Previous = this;
                }
            }
        }

        public IEnumerable<T> Backwards {
            get {
                if (Last != null) {
                    DoubleNode current = (DoubleNode)Last;
                    do {
                        yield return current.Data;
                        current = current.Previous;
                    } while (current != Last);
                }
            }
        }

        protected override LinkedList<T>.Node CreateNode() { return new DoubleNode();  }
    }

There we have it. We used inheritance to share code. Unfortunately, we had to use casts to get this to work. The casts show we are now running into some limitations in the expressiveness of the C# type system. We know that all the items of the list will be of type DoubleNode but we need to convince the compiler (and the CLR) with a cast in the set part of the Next method and the Backwards property. It would be nice to write this without casts. Next time we look at a way to rid ourselves of one the cast in the Backwards property and the virtual CreateNode() method. We will also look at some possible ways to increase the expressiveness of C#'s type system to allow us to get rid of the final cast.

Linked Lists II: Simple Generics

In the last entry we looked at implementing a collection class using a linked list data structure. This time I will show some ways to use that implementation to implement the generic interfaces introduced last time. The complete class looks like,

    class LinkedList: IEnumerable {

        private class Node {
            public object Data;
            public Node Next;
        }

        private Node _last;

        public LinkedList() { }

        public void Add(object value) {
            Node newNode = new Node();
            newNode.Data = value;
            if (_last == null) {
                newNode.Next = newNode;
                _last = newNode;
            }
            else {
                newNode.Next = _last.Next;
                _last.Next = newNode;
                _last = newNode;
            }
        }

        public void Clear() {
            _last = null;
        }

        public IEnumerator GetEnumerator() {
            if (_last != null) {
                Node current = _last;
                do {
                    current = current.Next;
                    yield return current.Data;
                } while (current != _last);
            }
        }
    }

The interface we want to implement is,

    interface ILinkedList<T>: IEnumerable<T> {
        void Add(T value);
        void Clear();
    }

One obvious, but bad, way to implement this interface is to subclass LinkedList like,

    class LinkedList<T>: LinkedList, ILinkedList<T>, IEnumerable<T> {

        public void Add(T value) { base.Add(value); }

        IEnumerator<T> IEnumerable<T>.GetEnumerator() {
            foreach (T item in this)
                yield return item;
        }
    }

The reason this is so bad is because given,

    LinkedList<string> list = new LinkedList<string>();

the following statements are both legal,

    list.Add("some string");
    list.Add(1);

The reason is because the first statement calls the LinkedList<T>.Add(T value) above, as expected, but the second will call LinkedList.Add(object value), because the generic Add is overloaded with the the non-generic version. Even if it was obscured by the new Add method you could still access Add(object value) by,

    LinkedList list2 = list;
    list2.Add(1);

We wanted the compiler to be able to flag when we add a non-string to the list as an error; this implementation doesn't do that. A slightly better approach is to use delegation instead of inheritance, as in,

    class LinkedListD<T>: ILinkedList<T>, IEnumerable<T> {
        private LinkedList _list = new LinkedList();

        public void Add(T value) { _list.Add(value); }

        public void Clear() { _list.Clear(); }

        IEnumerator<T> IEnumerable<T>.GetEnumerator() {
            foreach (T item in _list)
                yield return item;
        }

        IEnumerator IEnumerable.GetEnumerator() {
            return _list.GetEnumerator();
        }
    }

But this is still not good enough because, not only can we no longer inherit the implementation of Clear() and GetEnumerator(), we also use more memory by having to create an instance of LinkedList to wrap. Additionally, if we wanted to add an Insert() method, that put values at the beginning of the list instead of the end, we would have to manually update LinkedListD. A much better approach is to scrap the first implementation and re-implement the class as a generic class, as in,

    class LinkedList<T>: ILinkedList<T>, IEnumerable<T> {
        private class Node {
            public T Data;
            public Node Next;
        }

        private Node _last;

        public LinkedList() { }

        public void Add(T value) {
            Node newNode = new Node();
            newNode.Data = value;
            if (_last == null) {
                newNode.Next = newNode;
                _last = newNode;
            }
            else {
                newNode.Next = _last.Next;
                _last.Next = newNode;
                _last = newNode;
            }
        }

        public void Clear() {
            _last = null;
        }

        public IEnumerator<T> GetEnumerator() {
            if (_last != null) {
                Node current = _last;
                do {
                    current = current.Next;
                    yield return current.Data;
                } while (current != _last);
            }
        }

        IEnumerator IEnumerable.GetEnumerator() {
            return GetEnumerator();
        }
    }

With this approach, if we really need a list with data type object, we can always use LinkedList<object> which gives us a class essentially the same class as what we started with.  Sometimes it is just better to rewrite. Note that the BCL took a similar approach. List<T> and Dictionary<K,T> do not inherit from ArrayList and Hashtable.

Next time I will try a first crack at implementing IDoubleLinkedList<T>.

Linked Lists I: Simple implementation

Last time I investigated several ways to implement an expression evaluator using a procedural style and several object oriented styles. I used this to demonstrate a data structure that doesn't lend itself well to being representing in an object oriented style. Another problematic structure is a linked list. It is not so much the linked list itself but writing it is such a way that is both type safe and also supports code sharing with, say, a doubly-linked list. What I want to implement is a class that implements the following interface.

    interface ILinkedList<T>: IEnumerable<T> {
        void Add(T value);
        void Clear();
    }

And then a subclass of that class that implements,

    interface IDoubleLinkedList<T>: ILinkedList<T> {
        IEnumerable<T> Backwards { get; }
    }

which inherits the implementation of Add() and Clear() and efficiently implements Backwards. For now, I will ignore the generic parameter and just implement a simple linked list. First, lets create a Node that will form the nodes of the linked list.

    class Node {
        public object Data;
        public Node Next;
    }

A linked list consists of a list of these nodes.

    class LinkedList: IEnumerable {
        Node _last;
    }

When an item is added to the list we will create a new node to hold it and create circular list where the _last instance variable will always point to the last node. This way we can quickly find both the last node and the first node of the list because, since the list is circular, the last node's Next field points to the first node. The reason I chose this particular way of implementing a linked list will be obvious if you have watched this video.

        public void Add(object value) {
            Node newNode = new Node();
            newNode.Data = value;
            if (_last == null) {
                newNode.Next = newNode;
                _last = newNode;
            }
            else {
                newNode.Next = _last.Next;
                _last.Next = newNode;
                _last = newNode;
            }
        }

Clearing the list is simple, just forget all the nodes.

        public void Clear() {
            _last = null;
        }

I implemented IEnumerable interface by using the new yield feature of C# 2.0. I start at the last node of the list, walk forward (to the first node) and yield an element, each successive loop will yield another element until we have yielded the last element (which is pointed to by _last).

        public IEnumerator GetEnumerator() {
            if (_last != null) {
                Node current = _last;
                do {
                    current = current.Next;
                    yield return current.Data;
                } while (current != _last);
            }
        }

Now I have a reasonable implementation of a linked list. Next time we will take several different approaches to implement the interfaces defined above.

A Tour of XAML - Part VI - Forward Compatibility (continued)

Returning to the Party example we where using, lets say you want to author a document that has 10 kazoos when the version 2 assembly is present but replaces them with 4 extra balloons if we only have the version 1 assembly. This can be written using a markup compatibility AlternateContent block. This might look like,

    <Party mc:Ignorable="v2"
      xmln="...assembly-v1-uri..."
      xmlns:v2="...assembly-v2-uri..."
      xmlns:mc="http://schemas.micrsoft.com/winfx/2006/markup-compatibility">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
        <mc:AlternateContent>
            <mc:Choice Requires="v2">
                <Balloon Color="Green" Shape="Dog" />
                <v2:Favor Kind="Kazoo" Quantity="10" />
            </mc:Choice>
            <mc:Fallback>
                <Balloon Color="Green" />
                <Balloon Color="Violet" />
                <Balloon Color="Orange" />
                <Balloon Color="Yellow" />
            </mc:Fallback>
        </mc:AlternateContent>
    </Party>

When the version 1 assembly is present this is interpreted as,

    <Party xmln="...assembly-v1-uri...">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
        <Balloon Color="Green" />
        <Balloon Color="Violet" />
        <Balloon Color="Orange" />
        <Balloon Color="Yellow" />
    </Party>

but in version 2 it is interpreted as,

    <Party xmlns="...assembly-v2-uri...">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
        <Balloon Color="Green" Shape="Dog" />
        <Favor Kind="Kazoo" Quantity="10" />
    </Party>

The content of the Choice element is used when the v2 namespace is available because it is mentioned in the Requires attribute. This corresponds to when the version 2 assembly is in use. If v2 is not available, the content of the Fallback element is used. AlternateContent supports an arbitrary list of Choice elements and the Requires can be an arbitrary list of prefix names.

Of course exchanging balloons for kazoos is not a very compelling example. For a more practical example, let's assume we have following classes in version 1 of an assembly,

    [ContentProperty("Content")]
    class Paragraph {
        private List<Inline> _content = new List<Inline>();
        public List<Inline> Content { get { return _content; } }
    }
    
    abstract class Inline { }
    
    [ContentProperty("Text")]
    class Run : Inline {
        string _text;
        public string Text { get { return _text; } set { _text = value; } }
    }
    
    class Image : Inline {
        private Uri _source;
        public Uri Source { get { return _source; } set { _source = value; } }
    }

Now in the version 2 assembly we add a new class that looks like,

    class Graph : Inline {
        private List<Point> _points = new List<Point>();
        public List<Point> Points { get { return _points; } }
    }

Now we want to write a paragraph that uses the new Graph class when version 2 of the assembly is present, because, for example, it scales better, but in version 1 we will just use a bitmap image of the graph. This might look like,

    <Paragraph mc:Ignorable="v2"
      xmln="...assembly-v1-uri..."
      xmlns:v2="...assembly-v2-uri..."
      xmlns:mc="http://schemas.micrsoft.com/winfx/2006/markup-compatibility">
        <Run>The sales projects for the next quarter are</Run>
        <AlternateContent>
            <mc:Choice Requires="v2">
                <v2:Graph Data="1,2 2,3 3,5" />
            </mc:Choice>
            <mc:Fallback>
                 <Image Source="SalesData.jpg" />
            </mc:Fallback>
        </mc:AlternateContent>
    </Paragraph>

When the version 2 assembly is present, this is interpreted as,

    <Paragraph xmlns="...assembly-v2-uri...">
        <Run>The sales projects for the next quarter are</Run>
        <Graph Data="1,2 2,3 3,5" />
    </Paragraph>

but, when only version 1 available, it is interpreted as,

    <Paragraph xmln="...assembly-v1-uri...">
        <Run>The sales projects for the next quarter are</Run>
        <Image Source="SalesData.jpg" />
    </Paragraph>

using the bitmap instead of of the Graph class. This allows a document author to produce a file that takes advantage of a new feature, such as Graph with its better scaling properties, but still allows older versions to see a the graph at a lower fidelity.

These classes make a better example for another feature of markup compatibility, the ProcessContent attribute. Let's add another class in the version 2 assembly,

    [ContentProperty("Content")]
    class Glow : Inline {
        private List<Inline> _content = new List<Inline>();
        public List<Inline> Content { get { return _content; } }
    }

This class allows us to make the content to appear to glow when drawn. This allows us to write a document that looks like,

    <Paragraph mc:Ignorable="v2"
      xmln="...assembly-v1-uri..."
      xmlns:v2="...assembly-v2-uri..."
      xmlns:mc="http://schemas.micrsoft.com/winfx/2006/markup-compatibility">
        <Run>The word</Run> 
        <v2:Glow><Run>glow</Run></v2:Glow> 
        <Run>will appear to glow when it viewed with version 2.</Run>
    </Paragraph>

Unfortunately, with version 1, this is interpreted as,

    <Paragraph xmln="...assembly-v1-uri...">
        <Run>The word</Run> 
        <Run>will appear to glow when it viewed with version 2.</Run>
    </Paragraph>

The word "glow" disappeared because, by default, when an element is ignored its content is ignored as well. This default can be modified by using the ProcessContent attribute. If the original document is modified to be,

    <Paragraph 
      mc:Ignorable="v2" 
      mc:ProcessContent="v2:Glow"
      xmln="...assembly-v1-uri..."
      xmlns:v2="...assembly-v2-uri..."
      xmlns:mc="http://schemas.micrsoft.com/winfx/2006/markup-compatibility">
        <Run>The word</Run> 
        <v2:Glow><Run>glow</Run></v2:Glow> 
        <Run>will appear to glow when it viewed with version 2.</Run>
    </Paragraph>

This will tell the reader to just ignore the tags for Glow if it doesn't understand it but leave the content intact. This is then interpreted as,

    <Paragraph xmln="...assembly-v1-uri...">
        <Run>The word</Run> 
        <Run>glow</Run>
        <Run>will appear to glow when it viewed with version 2.</Run>
    </Paragraph>

for version 1, and will be interpreted as,

    <Paragraph xmlns:v2="...assembly-v2-uri...">
        <Run>The word</Run> 
        <Glow><Run>glow</Run></Glow> 
        <Run>will appear to glow when it viewed with version 2.</Run>
    </Paragraph>

when version 2 is present which is exactly what we wanted.

In general, forward compatibility allows previous versions of a document reader to read a document intended for later versions and render the content at a lower fidelity. This helps immensely when deploying a newer version of an assembly. During initial deployment it might not be that wide spread. You can have your users start taking advantage of the new features immediately because older versions will still be able to read newer documents, at a lower fidelity, even when they don't have the new assembly.

A Tour of XAML V: Forward compatibility

As we saw last time, XAML supports backwards compatibility. XAML also supports forward compatibility. Forward compatibility is easiest to understand from through a set of examples. Take the balloon example before. Say we want to write a XAML document that says, when you have the version 2 assembly, I want a dog shaped balloon. If, however, you only have the version 1 assembly, a default shaped balloon is fine. In other words, I want a document that contains some version 2 specific references in it but it is OK to just ignore them when you have a version 1 assembly. In our example, we want to mark the Shape property some way that the version 1 reader will know it can ignore the property. In XAML you can use the markup compatibility namespace to accomplish this. You can write,

    <Balloon Color="Red" v2:Shape="Dog" 
      xmlns="...assembly-v1-uri..."
      xmlns:v2="...assembly-v2-uri..."
      xmlns:mc="http://schemas.micrsoft.com/winfx/2006/markup-compatibility"
      mc:Ignorable="v2" />

This takes advantage of the Ignorable attribute of from the markup compatibility namespace. mc:Ignorable="v2" says that any attribute or element from the namespace associated with the "v2" prefix can be ignored and you would still have a document that would have an acceptable meaning. If a XAML reader only has the version 1 assembly, it would interpreted the above document as,

    <Balloon Color="Red" xmlns="...assembly-v1-uri..."  />

If the reader has the version 2 assembly, it would interpret it as,

    <Balloon Color="Red" Shape="Dog" xmlns="...assembly-v2-uri..." />

mc:Ignorable can also applies to elements. To see example of this lets define a class, Party, that is define as,

    [ContentProperty("Balloons")]
    public class Party {
        List<Balloon> _balloons = new List<Balloon>();
        public List<Balloon> Balloons { get { return _balloons; } 
    }

This allows us to define a Party instance that contains balloons. For this example, let's assume that Party was defined in the Version 1 assembly and is unmodified in the Version 2 assembly. We can then write a XAML document that looks like,

    <Party xmln="...assembly-v1-uri...">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
    </Party>

We can now also write a document that will include a third dog shaped balloon but only if we are using the version 2 assembly, and not include if we are still using version 1. It looks like,

    <Party mc:Ignorable="v2"
      xmln="...assembly-v1-uri..."
      xmlns:v2="...assembly-v2-uri..."
      xmlns:mc="http://schemas.micrsoft.com/winfx/2006/markup-compatibility">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
        <v2:Balloon Color="Green" Shape="Dog" />
    </Party>

This will be interpreted exactly as the prior document when reading it with version 1. With version 2, however, it would be interpreted as if it was,

    <Party xmlns="...assembly-v2-uri...">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
        <Balloon Color="Green" Shape="Dog" />
    </Party>

We can also add a new a new class, such as Favor, that can be used in a document such as,

    <Party mc:Ignorable="v2"
      xmln="...assembly-v1-uri..."
      xmlns:v2="...assembly-v2-uri..."
      xmlns:mc="http://schemas.micrsoft.com/winfx/2006/markup-compatibility">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
        <v2:Balloon Color="Green" Shape="Dog" />
        <v2:Favor Kind="Kazoo" Quantity="10" />
    </Party>

This doesn't affect the version 1 interpretation of the document, but adds 10 kazoos to the version 2 interpretation. If the author determines that the kazoos are really necessary to the interpretation of either the reference to v1 can be removed and the document rewritten to,

    <Party xmlns="...assembly-v2-uri...">
        <Balloon Color="Red" />
        <Balloon Color="Blue" />
        <Balloon Color="Green" Shape="Dog" />
        <Favor Kind="Kazoo" Quantity="10" />
    </Party>

or the the ignorable attribute can just be removed which amounts to an equivalent document given subsumption described in the previous entry.

Tour of XAML IV: Backward compatibility

XAML has support for both forward and backward compatibility of XAML documents. To understand backwards compatibility in a more formal sense, think of a XAML reader combined with some set of assemblies as the predicate that defines a set. If the reader with the given set of assemblies accepts a XAML document without error it is part of the set. If it rejects a document, it is not part of the set. You can think of the XAML reader and assemblies as defining a set, {r(s,x)|x} where x is the XAML file and r(s,x) is the XAML reader and s is a set of assemblies. You can then describe the relationship between  two sets of assemblies with respect to XAML. The assemblies imply set schemas that the XAML reader uses to interpret the XAML document. The schema implied by the WPF assemblies states there is a Button element and a DockPanel element and the Button element can be included in a DockPanel as one of it children, for example. The a set of implied schemas  s2 is backwards compatible with s1 if {r(s1, x)|x} ∩ {r(s2, x)|x} = {r(s1, x)|x} or, in other words, s2 is backwards compatible with s1 if the reader will accept same XAML documents with s2 as it accepts with s1.

XAML really doesn't have to do much to claim it supports backwards compatibility because backwards compatibility is really a relationship between two sets of XAML schemas (and the implying assemblies) than it is between XAML documents. XAML, however, simplifies establishing this relationship by allowing one schema to declare it subsumes another schema. That means that, given both s1 and s2, XAML allows s2 to declare that any reference elements of s1 should be considered a reference to elements of s2.

To get more concrete, suppose you had an assembly that declared a single type,

    public class Balloon {
        Color _color;
        
        public Color Color { get { return _color; } set { _color = value; } }
    }

A XAML reader will infer a schema that would allow you to write a XAML document like,

    <Balloon Color="Red" xmlns="...assembly-v1-uri..." />

Now, lets say, in version 2 of the assembly you want to add a shape to the balloon class to support different balloon shapes. In the version 2 assembly you can modify the class to,

    public class Balloon {
        Color _color;
        Shape _shape;
        
        public Color Color { get { return _color; } set { _color = value; } }
        public Shape Shape { get { return _shape; } set { _shape = value; } }
    }

Now we can write a XAML document that refers to the new assembly by writing,

    <Balloon Color="Red" Shape="Dog" xmlns="...assembly-v2-uri..."/>

We, however, still want to read all XAML documents that refer to the previous assembly as well using the new assembly. We can do this by declaring, in the new assembly, that this assembly is compatible with the old assembly with a XmlnsCompatibleWith("...assembly-v2-uri...","...assembly-v1-uri...") attribute. When the XAML reader sees that attribute, it will silently interpret all references to "...assembly-v1-uri..." as if they where referring to "...assembly-v2-uri...". A XAML reader, presented with the version 2 assembly, will treat,

    <Balloon Color="Red" xmlns="...assembly-v1-uri..." />

exactly as if it was written as,

    <Balloon Color="Red" xmlns="...assembly-v2-uri..." />

This allows all XAML documents referring to version 1 assembly to be readable when you only have version 2 of the assembly. In fact, assuming you follow CLR guidelines for evolving an assembly, any XAML file that was legal using the version 1 assembly will be legal using the version 2 assembly.

Cider's January CTP

Cider has been available since December in CTP form. This is really old news and reasons why I haven't blogged about it till now you can see my previous post about my and my machines independent vacations. To get the January CTP see the Channel 9 page http://channel9.msdn.com/wiki/default.aspx/Cider.JanuaryCTPReleaseNotes. You can keep track of our developments through our Channel 9 home page at http://channel9.msdn.com/wiki/default.aspx/Cider.HomePage. I will try to be better about timely blogging updates but given my past history maybe it would be better if you checked the home page now and then...

This month we are focusing on the grid design experience. Even though we shipped it in December we really stole, uh, er, borrowed the grid design experience from Sparkle (aka the Expression Interactive Designer) which is also now available in CTP form available from http://www.microsoft.com/downloads/details.aspx?familyid=ed9f5fb2-4cfc-4d2c-9af8-580d644e3d1d&displaylang=en.

The Cider CTPs are not beta quality. I say that to make sure you know what to expect. We are delivering bits early and often to give all of you a chance to influence our development by giving us early feedback on our design and the general direction we are taking. I encourage all of you to take advantage of this. We are listening and care a lot about what you think.

A Solution to a Puzzle

A few months back Randy Kimmerly posted a blog about one the puzzles on my desk.

Twisted in the correct way it will form a 3x3x3 cube. It is a snake cube puzzle that goes by the commercial name of Cubra©. Randy got excited about the puzzle not for the puzzle itself but by the challenge of writing a program that will produce the solution. He wrote a very clever C# program to produce a solution. I thought the solution would be more concise in Prolog so I volunteered to produce a Prolog version. I had to dust off my Prolog skills, I hadn't used Prolog since I used to support Turbo Prolog for Borland, so my skills are a bit rusty. Here is my solution,

linkTurns([f, f, t, f, t, t, t, t, f, t, t, t, t, t, t, t, 
   t, f, t, t, t, f, t, t, t, f]).

inrange(0). inrange(1). inrange(2).
cell(X, Y, Z) :- inrange(X), inrange(Y), inrange(Z).

move(right, cell(X, Y, Z), cell(X1, Y, Z)) :- X1 is X + 1, cell(X1, Y, Z).
move(left, cell(X, Y, Z), cell(X1, Y, Z)) :- X1 is X - 1, cell(X1, Y, Z).
move(down, cell(X, Y, Z), cell(X, Y1, Z)) :- Y1 is Y + 1, cell(X, Y1, Z).
move(up, cell(X, Y, Z), cell(X, Y1, Z)) :- Y1 is Y - 1, cell(X, Y1, Z).
move(in, cell(X, Y, Z), cell(X, Y, Z1)) :- Z1 is Z + 1, cell(X, Y, Z1).
move(out, cell(X, Y, Z), cell(X, Y, Z1)) :- Z1 is Z - 1, cell(X, Y, Z1).

legalTurn(left, X) :- member(X, [up, down, in, out]).
legalTurn(right, X) :- member(X, [up, down, in, out]).
legalTurn(up, X) :- member(X, [left, right, in, out]).
legalTurn(down, X) :- member(X, [left, right, in, out]).
legalTurn(in, X) :- member(X, [left, right, up, down]).
legalTurn(out, X) :- member(X, [left, right, up, down]).

free(X, Y) :- member(X, Y), !, fail.
free(_, _).

solution(Cells, Moves, [], Cells, Moves).
solution([LastCell|UsedCells], [LastMove|OldMoves], 
   [t|Turns], ResultCells, ResultMoves) :-
   legalTurn(LastMove, NewMove), move(NewMove, LastCell, NewCell), 
   free(NewCell, UsedCells),   solution([NewCell, LastCell|UsedCells], 
   [NewMove, LastMove|OldMoves], Turns, ResultCells, ResultMoves).
solution([LastCell|UsedCells], [LastMove|OldMoves], 
   [f|Turns], ResultCells, ResultMoves) :- 
   move(LastMove, LastCell, NewCell), free(NewCell, UsedCells), 
   solution([NewCell, LastCell|UsedCells], [LastMove, LastMove|OldMoves], 
   Turns, ResultCells, ResultMoves).

solution(Cells, Moves) :- linkTurns(L), solution([cell(0, 0, 0)], 
   [right], L, Cells, Moves).


print_solution([], []).
print_solution([Cell|Cells], [Move|Moves]) :- 
   print_solution(Cells, Moves), write(Move), put(9), write('to '), 
   write(Cell), nl.

goal :- solution(Cells, Moves), print_solution(Cells, Moves).

The clause linkTurns/1 represents where the string of blocks turns and where it doesn't. inrange/1 represents the valid ranges the coordinates of a block can have, 0, 1 or 2. This is used to build cell/3 which represents the locations the blocks can occupy in a 3x3 result. move/3 calculates how to get from one block to another.  legalTurn/2 generates legal turns. free/2 helps decide if a cell is already occupied.  solution/5 calculates the solution; and solution/2 is a simplified wrapper around solution/5. print_solution/2 is a utility clause to help print the solution in a some what readable form. Finally goal is a clause that kicks off the whole process (a dead giveaway to Turbo Prolog roots).

This will only print one solution (because of the implied cut in using I/O) even though there are two possible. To see both use,

  ?- solution(Cells, Moves).

You will notice that both solutions are reflexive of each other so you could say there is really only one solution.

I found an interesting site here that discusses these puzzles as well as gives a break down of how many of these kinds of puzzle there can be.

The Expression Problem: Part IV - Layers

So far I have presented three different approaches to solving the expression problem, procedural, pure object-oriented, and a visitor pattern. This time I will present a technique I will refer to as layers. Like the visitor pattern, it is a modification of the pure object-oriented approach that takes advantage of partial classes. The solution begins very similarly to the other object-oriented approaches, I create an abstract class to represent an abstraction of expressions,

    abstract partial class Expr { }

Notice the addition of the partial modifier. I will take advantage of that shortly. For now I will just flush out the rest of the hierarchy in a way that should, by now, be very familiar. I created a class to represent literals,

    partial class Literal: Expr {
        protected double Value;

        public Literal(double value) { Value = value; }
    }

Notice that, as with the pure object-oriented approach, I can keep the field used to hold the value as protected. This is also true for the class to represent an abstraction of binary operators.

    abstract partial class BinaryOperator: Expr {
        protected Expr Left;
        protected Expr Right;

        public BinaryOperator(Expr left, Expr right) {
            Left = left;
            Right = right;
        }
    }

The actual binary operators are just descendants of BinaryOperator with no additional data.

    partial class Add: BinaryOperator {
        public Add(Expr left, Expr right) : base(left, right) { }
    }

    partial class Subtract: BinaryOperator {
        public Subtract(Expr left, Expr right) : base(left, right) { }
    }

    partial class Multiply: BinaryOperator {
        public Multiply(Expr left, Expr right) : base(left, right) { }
    }

    partial class Divide: BinaryOperator {
        public Divide(Expr left, Expr right) : base(left, right) { }
    }

So far, this is nearly identical to the pure object-oriented approach. Where it differs is in how new operations are added.  To add the Evaluate() method, instead of modifying each class to add the methods, I can take advantage of the fact I left them as partial classes and create additional parts that contain the methods instead. First I created a new part of the Expr class to introduce the virtual method Evaluate().

    abstract partial class Expr {
        public abstract double Evaluate();
    }

Compiling now will generate several errors indicating that the other classes do not implement the Evaluate() method.  This is want I wanted. Using abstract in the above class part lets the compiler help me determine when I have completed the implementation of the evaluate operation. The rest of the class parts for Evaluate() looks like,

    partial class Literal {
        public override double Evaluate() {
            return Value;
        }
    }

    abstract partial class BinaryOperator {
        public sealed override double Evaluate() {
            return EvaluateOp(Left.Evaluate(), Right.Evaluate());
        }

        protected abstract double EvaluateOp(double left, double right);
    }

    partial class Add {
        protected override double EvaluateOp(double left, double right) {
            return left + right;
        }
    }

    partial class Subtract {
        protected override double EvaluateOp(double left, double right) {
            return left - right;
        }
    }

    partial class Multiply {
        protected override double EvaluateOp(double left, double right) {
            return left * right;
        }
    }

    partial class Divide {
        protected override double EvaluateOp(double left, double right) {
            return left / right;
        }
    }

I call each of the collections of class parts a layer. You can think of them much like a transparency teachers uses on overhead projectors. You can build up a class by combining layers of functionality. I usually keep each layer in its own file named for the layer. In this case, Evaluator.cs might be a good choice.

To demonstrate building up a class with through layers I will create a printing layer.

    abstract partial class Expr {
        public abstract void Print();
    }

    partial class Literal {
        public override void Print() {
            Console.Write(Value);
        }
    }

    abstract partial class BinaryOperator {
        public sealed override void Print() {
            Left.Print();
            PrintOp();
            Right.Print();
        }

        protected abstract void PrintOp();
    }

    partial class Add {
        protected override void PrintOp() {
            Console.Write(" + ");
        }
    }

    partial class Subtract {
        protected override void PrintOp() {
            Console.Write(" - ");
        }
    }

    partial class Multiply {
        protected override void PrintOp() {
            Console.Write(" * ");
        }
    }

    partial class Divide {
        protected override void PrintOp() {
            Console.Write(" / ");
        }
    }

As in the other approaches, I will demonstrate how to add support for the Power expression form. I have two choices, I can add it just as we did in the pure object-oriented approach,

    class Power: BinaryExpr {
        public Power(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return Math.Pow(left, right);
        }

        protected override void PrintOp() {
            Console.Write(" ^ ");
        }
    }

or I could divide it into multiple parts that can be co-located with the other similar parts. The data part would look like,

    partial class Power: BinaryExpr {
        public Power(Expr left, Expr right) : base(left, right) { }
    }

The part for the expression layer would look like,

    partial class Power: BinaryExpr {
        protected override double EvaluateOp(double left, double right) {
            return Math.Pow(left, right);
        }
    }

An the printing layer part would look like,

    partial class Power: BinaryExpr {
        protected override void PrintOp() {
            Console.Write(" ^ ");
        }
    }

It is this flexibility that makes layers so appealing. You can decide, case by case, which is the right way to modify the hierarchy.

The Expression Problem: Part III - Visitor Pattern

So far we have looked at two ways to represent the following expression forms,

number Literal
expr + expr Add
expr - expr Subtract
expr * expr Multiply
expr / expr Divide

The first was procedural and the second object-oriented. Using the procedural approach, it was easier to add operations than data types. Using an object-oriented approach, the opposite was true. What I will demonstrate now is a technique that balances some of the advantages and disadvantages of both into a hybrid approach called the Visitor Pattern.

There are several ways to apply the visitor problem to expressions, each with their own unique set of advantages and disadvantages, but I will only cover one of these approaches.

The visitor pattern is a modification of the the object-oriented approach that allows an operation to be separated from the class hierarchy into its own class. To demonstrate how a visitor pattern can be applied, I will first created a visitor interface that will be implemented by all visitors. It looks like,

    public interface IExprVisitor<R> {
        R Visit(Literal literal);
        R Visit(Add add);
        R Visit(Subtract subtract);
        R Visit(Multiply multiply);
        R Visit(Divide divide);
    }1

Here I use a generic type to leave the type of the return result open. The expression evaluator will use double but double isn't an appropriate return result for printing, for example. Using a type parameter allows different visitors to have their own result types.

Now I will create the class hierarchy that will be visited. I will first create an abstract class to represent the expressions, just as I did in the previous object oriented example.

    public abstract class Expr {
        public abstract R Accept<R>(IExprVisitor<R> visitor);
    }

The method Accept() is key to the visitor pattern. It accepts the visitor and, in turn, calls the correct Visit() method on the visitor, based on the type of the instance being visited. To do this, each descendant overrides the Accept() method and calls the correct method. To show how this is done I will implement the Literal expression form.

    public class Literal: Expr {
        public double Value;

        public Literal(double value) { Value = value; }

        public override R Accept<R>(IExprVisitor<R> visitor) {
            return visitor.Visit(this);
        }
    }

The above Literal class is very similar to the original pure object oriented version with the addition of the Accept() method. The implementation of the method is trivial, simply call visitor.Visit(this). The compiler will determine that IExprVisitor<R>.Visitor(Literal literal) is the method to bind to based on the type of the this parameter.

Next, as before, I introduce a base class to represent an abstraction of a binary operator.

    public abstract class BinaryOperator: Expr {
        public Expr Right;
        public Expr Left;

        public BinaryOperator(Expr left, Expr right) {
            Right = right;
            Left = left;
        }
    }

Notice that I do not yet implement the Accept() method. I could have, after adding R Visit(BinaryOperator binop) to the IExprVisitor interface, but I didn't need it for the visitors I wanted to create. I have found that it is usually best to only have the visitors visit the concrete classes and avoid the abstract. A visitor can always simulate visiting a abstract base class by calling a common routine in each of the concrete class visitor methods (you will see this technique below in the Printer visitor). This leaves the visitor less cluttered when the abstract class doesn't need to be visited separately. In other words, my Accept() methods only calls one Visit() method for each instance, and that for the most derived type. This has the additional advantage of allowing the Visit() method to have a meaningful result, which wouldn't be possible if the Accept() method called more than one Visit() methods.

Now that I have a base class for binary operators, I can now implement the binary operator expression forms.

    public class Add: BinaryOperator {
        public Add(Expr left, Expr right) : base(left, right) { }

        public override R Accept<R>(IExprVisitor<R> visitor) {
            return visitor.Visit(this);
        }
    }

    public class Subtract: BinaryOperator {
        public Subtract(Expr left, Expr right) : base(left, right) { }

        public override R Accept<R>(IExprVisitor<R> visitor) {
            return visitor.Visit(this);
        }
    }

    public class Multiply: BinaryOperator {
        public Multiply(Expr left, Expr right) : base(left, right) { }

        public override R Accept<R>(IExprVisitor<R> visitor) {
            return visitor.Visit(this);
        }
    }

    public class Divide: BinaryOperator {
        public Divide(Expr left, Expr right) : base(left, right) { }

        public override R Accept<R>(IExprVisitor<R> visitor) {
            return visitor.Visit(this);
        }
    }

The implementation of these should come as no surprise.  They each override the Accept() method and call the visitor method associated with their type.

Now that we have the class hierarchy in place, I will now implement the expression evaluator as a visitor.

    public class Evaluator: IExprVisitor<double> {
        public Evaluator() { }

        public double Evaluate(Expr expr) {
            return expr.Accept(this);
        }

        double IExprVisitor<double>.Visit(Literal literal) {
            return literal.Value;
        }

        double IExprVisitor<double>.Visit(Add add) {
            return Evaluate(add.Left) + Evaluate(add.Right);
        }

        double IExprVisitor<double>.Visit(Subtract subtract) {
            return Evaluate(subtract.Left) - Evaluate(subtract.Right);
        }

        double IExprVisitor<double>.Visit(Multiply multiply) {
            return Evaluate(multiply.Left) * Evaluate(multiply.Right);
        }

        double IExprVisitor<double>.Visit(Divide divide) {
            return Evaluate(divide.Left) / Evaluate(divide.Right);
        }
    }

The expression evaluator can be used by creating a new visitor and calling the Evaluate() method of the evaluator, passing the root of the expression tree. For example,

Expr expr = new Add(new Multiply(new Literal(10), new Literal(4)),
    new Literal(2));

Evalutor evaluator = new Evaluator();
double result = evaluator.Evaluate(expr);
  
Console.WriteLine("The answer is {0}", result);

The visitor pattern makes it easy to add new operations. To demonstrate this, as before, I will add a print operation. This looks very much like the expression evaluator above.

    public class Printer: IExprVisitor<object> {
        public Printer() { }

        public void Print(Expr expr) {
            expr.Accept(this);
        }

        object IExprVisitor<object>.Visit(Literal literal) {
            Console.Write(literal.Value);
            return null;
        }

        void PrintBinaryOperator(BinaryOperator binOp, char opChar) {
            Print(binOp.Left);
            Console.Write(" {0} ", opChar);
            Print(binOp.Right);
        }

        object IExprVisitor<object>.Visit(Add add) {
            PrintBinaryOperator(add, '+');
            return null;
        }

        object IExprVisitor<object>.Visit(Subtract subtract) {
            PrintBinaryOperator(subtract, '-');
            return null;
        }

        object IExprVisitor<object>.Visit(Multiply multiply) {
            PrintBinaryOperator(multiply, '*');
            return null;
        }

        object IExprVisitor<object>.Visit(Divide divide) {
            PrintBinaryOperator(divide, '/');
            return null;
        }
    }

As this shows, using a parameterized type for the visitor interface is awkward when you don't have anything to return. What I would want to do is implement IExprVisitor<void> but void is not allowed as a type parameter so I use object instead. This means I have to return something, so I always return null. This should be seen as awkwardness introduced by my choice of interface styles, not of the visitor pattern itself.

The visitor pattern shares some of the advantages of the procedural approach, in that it is easy to add new operations and new operations can be added dynamically, but we lose the natural ability to add new data types that is normally associated with an object-oriented approach as well as lose representation encapsulation since the instance fields need to be visible to the visitors. We kept the natural size optimization and the completeness verification by the compiler, however. To demonstrate the difficult of adding a new type, I will again add Power. The first step is to add the new Power type.

    public class Power: BinaryOperator {
        public Power(Expr left, Expr right) : base(left, right) { }

        public override R Accept<R>(IExprVisitor<R> visitor) {
            return visitor.Visit(this);
        }
    }

Now we need to add a method to IExprVisitor<R> for the Power type that each visitor will need to implement. It looks like,

        R Visit(Power power);

Next I need to add a method for in each visitor to handle the new class. For Evaluator it is,

        double IExprVisitor<double>.Visit(Power power) {
            return Math.Pow(Evaluate(power.Left), Evaluate(power.Right));
        }

And and for the Printer class it looks like.

        object IExprVisitor<object>.Visit(Power power) {
            PrintBinaryOperator(power, '^');
            return null;
        }

This is just one way to apply the visitor pattern to the expression problem. Other ways have their own strengths and weaknesses by trading off different advantages and disadvantages of the the procedural and object-oriented styles. Some visitor pattern applications have their own unique advantages, such as being able to mesh functionality by executing several visitors concurrently such as evaluating and printing.

Next time we will investigate a different modified object-oriented approach that is sometimes dubbed groups but I will call layers.


1 - This application of the visitor pattern is patterned after one from Generalized Algebraic Data Types and Object-Oriented Programming. << back <<

The Expression Problem: Part II - Object-oriented

Last time we took a procedural approach to solve the expression problem. This time I will demonstrate an object-oriented approach. As a reminder, we want to model the following forms of expressions,

number Literal
expr + expr Add
expr - expr Subtract
expr * expr Multiply
expr / expr Divide

The most natural thing to do in an object-oriented language is to make each of the above expression forms its own class. First, I will introduce a base class to provide an expression abstraction.

    abstract class Expr {
    }

Now I will now create a sub-class for each expression.

    class Literal: Expr {
        protected double Value;

        public Literal(double value) {
            Value = value;
        }
    }
    
    abstract class BinaryExpr: Expr {
        protected Expr Left;
        protected Expr Right;

        public BinaryExpr(Expr left, Expr right) {
            Left = left;
            Right = right;
        }
    }
    
    class Add: BinaryExpr {
        public Add(Expr left, Expr right) : base(left, right) { }
    }
    
    class Subtract: BinaryExpr {
        public Subtract(Expr left, Expr right) : base(left, right) { }
    }
    
    class Multiply: BinaryExpr {
        public Multiply(Expr left, Expr right) : base(left, right) { }
    }
    
    class Divide: BinaryExpr {
        public Divide(Expr left, Expr right) : base(left, right) { }
    }

I created a base class for the binary expressions which I will take advantage of below.

To evaluate the expression I will add an abstract virtual method, Evaluate(), to Expr.

    abstract class Expr {
        public abstract double Evaluate();
    }

Then I will override the Evaluate() method for each of the above classes. Evaluating a literal is simply returning the value of the literal so I modified the Literal class to,

    class Literal: Expr {
        protected double Value;

        public Literal(double value) {
            Value = value;
        }

        public override double Evaluate() {
            return Value;
        }
    }

All binary expressions need to evaluate their left-hand expression and right-hand expression and then perform their operation on the results. To represent this, I overrode Evaluate() to evaluate both Left and Right and then call a newly introduced EvaluateOp() method. I sealed the Evaluate() method because I want descendants to override EvaluateOp() not Evaluate().

    abstract class BinaryExpr: Expr {
        protected Expr Left;
        protected Expr Right;

        public BinaryExpr(Expr left, Expr right) {
            Left = left;
            Right = right;
        }

        public sealed override double Evaluate() {
            return EvaluateOp(Left.Evaluate(), Right.Evaluate());
        }

        protected abstract double EvaluateOp(double left, double right);
    }

Now I can implemented the concrete descendents of BinaryExpr.

    class Add: BinaryExpr {
        public Add(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left + right;
        }
    }

    class Subtract: BinaryExpr {
        public Subtract(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left - right;
        }
    }

    class Multiply: BinaryExpr {
        public Multiply(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left * right;
        }
    }

    class Divide: BinaryExpr {
        public Divide(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left / right;
        }
    }

The advantages an object-oriented approach are,

  1. New data types can be added without affecting any of the other data types.
  2. Each data type is encapsulated, only the data type needs to have access the instance variables.
  3. All the operations affecting the instance variables are in one place, the methods of instance variable's class.
  4. New operations can be added as abstract methods. The compiler will then generate an error if an operation is not implemented for one of the leaf classes.
  5. Efficient storage is natural. Adding field to one of the expression types do not affect the others.

To demonstrate how easy it is to add a new data type I will again add support for Power.

expr ^ expr Power

To do this I will add a class to represent the Power expression form. This looks like,

    class Power: BinaryExpr {
        public Power(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return Math.Pow(left, right);
        }
    }

Note that the Power data type can be added without modifying the other classes.

The disadvantages to an object-oriented approach are,

  1. It is difficult to add new operations because adding an operation requires modifying all the classes.
  2. The logic for each operation is spread over multiple classes and often multiple files. This makes it cumbersome to get a complete picture what the operation does.
  3. It is very difficult to dynamically add operations and can't be done without careful planning.

To demonstrate some of the difficulties of adding a new operation, I will add the Print operation. First I will modify the base Expr class to add an abstract Print() method.

    abstract class Expr {
        public abstract double Evaluate();
        public abstract void Print();
    }

Now I will override this method in each class similar to the way I did for the Evaluate() method.

    class Literal: Expr {
        protected double Value;

        public Literal(double value) {
            Value = value;
        }

        public override double Evaluate() {
            return Value;
        }

        public override void Print() {
            Console.Write(Value);
        }
    }

    abstract class BinaryExpr: Expr {
        protected Expr Left;
        protected Expr Right;

        public BinaryExpr(Expr left, Expr right) {
            Left = left;
            Right = right;
        }

        public sealed override double Evaluate() {
            return EvaluateOp(Left.Evaluate(), Right.Evaluate());
        }

        protected abstract double EvaluateOp(double left, double right);

        public sealed override void Print() {
            Left.Print();
            PrintOp();
            Right.Print();
        }

        protected abstract void PrintOp();
    }

    class Add: BinaryExpr {
        public Add(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left + right;
        }

        protected override void PrintOp() {
            Console.Write(" + ");
        }
    }

    class Subtract: BinaryExpr {
        public Subtract(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left - right;
        }

        protected override void PrintOp() {
            Console.Write(" - ");
        }
    }

    class Multiply: BinaryExpr {
        public Multiply(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left * right;
        }

        protected override void PrintOp() {
            Console.Write(" * ");
        }
    }

    class Divide: BinaryExpr {
        public Divide(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return left / right;
        }

        protected override void PrintOp() {
            Console.Write(" / ");
        }
    }
    
    class Power: BinaryExpr {
        public Power(Expr left, Expr right) : base(left, right) { }

        protected override double EvaluateOp(double left, double right) {
            return Math.Pow(left, right);
        }

        protected override void PrintOp() {
            Console.Write(" ^ ");
        }
    }

As you can see, each of the classes needed to be modified to implement Print(). You can also note that the compiler complained until each of the classes implemented the Print() operation. This is because we used an abstract method in the Expr class.

If you compare the procedural approach vs. the object-oriented approach you will notice that they are pretty much opposites. It is easy to add operations in the procedural approach but difficult using an object-oriented approach. Adding data types is difficult using in a procedural approach, but easy in an object-oriented approach. Data needs to be public in procedural, but can be private in object oriented. When deciding which approach to use you need to try an predict what will occur more often, adding new data types or adding new operations. What I will present next are some compromise solutions that balance the advantages and disadvantages.

The Expression Problem: Part I - Procedural

Expressions are difficult to represent because they form a matrix of operations and data types. Languages make it either easy to add new operations but difficult to add data types or easy to add new data types but hard to add new operations. This problem isn't limited to expressions, of course, but it is a very clear and familiar example so I will use it to characterize the more general problem. I will present four ways to approach the problem, each with their particular strengths and weaknesses. For my examples I representing a simple expression language comprising of the following expression forms,

number Literal
expr + expr Add
expr - expr Subtract
expr * expr Multiply
expr / expr Divide

Some operations I want to perform on these expressions include evaluating expressions and printing expressions. The operations and the expressions form a matrix of functionality, expression kind by operation,

  Evaluate Print
Literal evaluate literal print literal
Add evaluate add print add
Subtract evaluate subtract print subtract
Multiply evaluate multiply print multiply
Divide evaluate divide print divide

Languages either make it easier to add new columns, that is operations, or add new rows, that is new expression kinds. Procedural and functional languages typically make it easy to add new operations. Object-oriented languages typically make it easier to add new expression kinds represented as new data types.

The first approach I will demonstrate is procedural. I define a data type, Expr, that will hold the operator and the operands.

    enum Oper { Add, Subtract, Multiply, Divide, Literal }

    class Expr {
        public Oper Oper;
        public Expr Left;
        public Expr Right;
        public double Value;

        public Expr(Oper oper, Expr left, Expr right) { ... }
        public Expr(double value) { ... }
    }1

The expression evaluation operation for this data structure looks like,

    static double Evaluate(Expr e) {
        switch (e.Oper) {
            case Oper.Add: return Evaluate(e.Left) + Evaluate(e.Right);
            case Oper.Subtract: return Evaluate(e.Left) - Evaluate(e.Right);
            case Oper.Multiply: return Evaluate(e.Left) * Evaluate(e.Right);
            case Oper.Divide: return Evaluate(e.Left) / Evaluate(e.Right);
            case Oper.Literal: return e.Value;
            default: Debug.Fail("Unknown operator"); return 0;
        }
    }

The advantages to a procedural approach are,

  1. It is very simple to add additional operations (such as Evaluate()).
  2. All the logic for an operation is centralized in one place, and often in a single routine.
  3. New operations can be added without having the modify existing operations.
  4. New operations can be also be added dynamically without affecting or needing to recompile existing operations.

To demonstrate some of the advantages of using a procedural approach we will add a print operation. It looks like,

    static void Print(Expr e) {
        switch (e.Oper) {
            case Oper.Add: Print(e.Left); Console.Write(" + "); Print(e.Right); break;
            case Oper.Subtract: Print(e.Left); Console.Write(" - "); Print(e.Right); break;
            case Oper.Multiply: Print(e.Left); Console.Write(" * "); Print(e.Right); break;
            case Oper.Divide: Print(e.Left); Console.Write(" / "); Print(e.Right); break;
            case Oper.Literal: Console.Write(e.Value); break;
            default: Debug.Fail("Unknown operator"); break;
        }
    }2

Adding the Print operation on the Expr type doesn't require any modifications to Expr or Evaluate().

Some disadvantages to a procedural approach are,

  1. It is difficult to add additional data types (in our example, each data type represents a different operator) because adding a new type affects all operations.
  2. Each operation must account for every data type.
  3. The internal representation of the data type must be accessible to every operation. This makes even minor changes to the data layout have far reaching effects and often leads to difficult to maintain implementations.
  4. It is very difficult, and requires careful planning, to dynamically add data types.
  5. Efficient storage of the data types requires variant record support in the language, as discussed in note 1 below, (which can be simulated though inheritance in object-oriented languages but I don't provide and example of that).

To demonstrate the difficulty of adding a data type, let's add power support.

expr ^ expr Power

This represents the expression xn. To implement this we first we need to modify the enumeration to add Power,

    enum Oper { Add, Subtract, Multiply, Divide, Literal, Power }

Next we need to add an additional case statement to Evaluate()'s switch statement,

    case Oper.Power: return Math.Pow(left, right);

An finally we need to also add another case statement to Print's switch statement that looks like,

    case Oper.Power: Print.(e.Left); Console.Write(" ^ "); Print(e.Right); break;

Note we had to modify both operations and the Oper enumeration. If we had several operations, we would have had to modify most, if not all, of them. Also, note that the compiler didn't help us find any of the places we need to modify since the code was legal after each of the above modifications. If we had made the changes to just Evaluate() for example, the compiler wouldn't have complained, but we would receive a runtime error if we called Print() with an expression tree that contains a Power node.

To learn more about the expression problem and its history, see Kim Bruce's excellent paper, Some Challenging Typing Issues in Object-Oriented Languages. Be warned, it will be a bit of a spoiler for some of what I will cover.


1 - A casual observer will notice that Expr is a bit wasteful in that it allocates room for a value for every node instead of only for literals. It also takes up space for Left and Right for literals. In C#, removing this waste is a bit awkward to express and makes the Evaluate() method more complicated, so I ignored the waste in favor of simplicity.

In C++, this is a bit less awkward, and might look like,

    struct Expr {
       Oper            oper;
       union {
            struct {
                Expr   *left;
                Expr   *right;
            };
            double      value;
        };
        Expr(Oper oper, Expr *left, Expr *right) { ... }
        Expr(double value) { ... }
    };

with a very similar evaluation function, but for now we will stick with C#. << back <<

2 - For simplicity, I ignored operator precedence. Later I will present a version of this routine that handles precedence correctly. << back <<

More Cider at the PDC

Yet more Cider stuff. Brian and Mark tag teamed together at the PDC to present how to add design-time behavior to WPF (Avalon) controls. You can find their presentation here... (Maybe Corbin will be able to view this one ;-).

I seem to not be in any of the recent videos. I am really not that bashful. Here I am in the audience (Mark points to me). I wasn't on stage because I had to leave early to catch a plane. I will probably show up sooner or later in one of these.

Cider on MSDN

Mike Harsh and Mark Boulter demo'ed Cider for MSDN. Check it out...

Cider on Channel 9

Mark Boulter and Brian Pepin make an appearance on Channel 9 to explain and demo Cider. Check it out...

Cider architecture

Brian Pepin has started to blog about Cider's architecture. Check it out...

A Tour of XAML - Part III - Type converters

Fundamentally XAML is an object initialization language. It creates object and sets values to properties. XAML, however, is not limit to creating reference types. You can also create instances of any type that has an associated type converter. For example, XAML can create an instance of a brush with the following code,

  <Brush>Red</Brush>

which creates a red solid color brush by calling the type converter associated with Brush. This means that,

  <Button>
    <Button.Background>
      <Brush>Red</Brush>
    <Button.Background>
  </Button>

is equivilent to,

  <Button Background="Red" />

XAML can be used to create value types, such as an integer, through its type converter as in,

  <s:Int32>27</s:Int32>

or a string,

  <s:String>It is the time for all good men...</s:String>

by using string's type converter. Note that XAML always refers to the CLR name for the type, not a language specific name such as "int" in C# or "Integer" in Pascal. Both of these types refer to the System.Int32 class in the CLR and XAML uses the CLR name, Int32. Note also that neither the XAML nor the WPF (Avalon) namespaces refer to the System namespace in mscorlib.dll where these types live. This mean that somewhere in the document you need to declare the prefix. This might look something like:

  <?Mapping ClrNamespace="System" XmlNamespace="System" 
  Assembly="mscorlib"?>
  <Window
    xmlns="http://schemas.microsoft.com/winfx/avalon/2005"
    xmlns:s="System">
      <Button>
        <s:String>It is the time for all good men...</s:String>
      </Button>
  </Window>

Of course, for string, it is easier to write it,

  <Window xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <Button>It is the time for all good men...</Button>
  </Window>

but the string for is useful when you want multiple strings in a collection, say, for example, as the content of a list box,

  <?Mapping ClrNamespace="System" XmlNamespace="System" 
  Assembly="mscorlib"?>
  <Window
    xmlns="http://schemas.microsoft.com/winfx/avalon/2005"
    xmlns:s="System">
      <ListBox>
        <s:String>It is the time for all good men...</s:String>
        <s:String>to come to the aid of their country.</s:String>
      </ListBox>
  </Window>

Note that if this had been written,

  <Window xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <ListBox>
      It is the time for all good men...
      to come to the aid of their country.
    </ListBox>
  </Window>

the list box would only contain one string, not two. This is because all continuous text content is considered one string and not two. If you load this in WPF you will also notice that the string get treated as "It is time for all good men... to come to the aid of their country" where the spaces and line breaks have been collapsed. I will discuss the rules for whitespace normalization (collapsing) in a subsequent post. For now, if you are familiar with HTML, what XAML does shouldn't come as any surprise.

OOPSLA '05

Next week I will be attending OOPSLA. This will be my first technical conference where I will just be a regular attendee. I have been to more conferences than I can count but I was always there on some official reason either as a presenter or booth worker, or there for some special event such as to launch a product or attend some award presentation. If you are there too, feel free to stop me in the hall.

Cider demo'ed at the PDC

If you missed the PDC and Eric Rudder's keynote address where Cider was demonstrated you can watch it here. Mark Boulter demonstrates it with Joe Marini demonstrating Sparkle. Their part begins around 37 minutes into the keynote.

A Tour of XAML - Part II: Property Elements?

XAML provides two ways to set a property, either through a Property Attribute or through a Property Element. We covered Property Attributes last time, this time we will focus on the Property Element.

A Property Element is an element that, instead of creating an object instance, set the value of a property. The following is equivalent to the example given in Part I but uses Property Elements instead of Property Attributes.

  <Button xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <Button.Content>
      Hello, World!
    </Button.Content>
  </Button>

The loader distinguishes a Object Element from a Property Element by the presents of the '.' in the name. The first part of the name is a reference to a type and the second part, after the dot, refers to the property name (we will discuss exactly why this is when we discuss Attached Properties). Here we use Button in both Property Elements. Property Elements are necessary when the value of the property cannot be expressed as a string. For example, you can use a Property Element to set the content of a Button to be another Button (not sure why you would want that, but suspend disbelieve for a second) as in,

  <Button xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <Button.Content>
      <Button Content="Hello, World!" />
    </Button.Content>
  </Button>

This can be written more explicitly as,

  <Button xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <Button.Content>
      <Button>
        <Button.Content>
          Hello, World!
        </Button.Content>
      </Button>
    </Button.Content>
  </Button>

As you can see, using Property Elements can become a bit repetitive. To mitigate this, one of the properties of a class can be designated as the Content Property which tells the load to assign direct content of the element to that property. In the case of Button (and all ContentControls) the Content Property is, oddly enough, set to the property named Content. This allows us to write something like,

  <Button xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    Hello, World!
  </Button>

for the first example above, and,

  <Button xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <Button>Hello, World!</Button>
  </Button>

for the second. Most Avalon classes have a Content Property defined. For Panel's, such as DockPanel, Grid, and Canvas, their Content Property is set to their Children collection. Any direct content of a Panel is assumed to be content intended for the Children property. This means the following,

  <StackPanel xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <Button>OK</Button>
    <Button>Cancel</Button>
  </StackPanel>

is an abbreviated version of,

  <StackPanel xmlns="http://schemas.microsoft.com/winfx/avalon/2005">
    <StackPanel.Children>
      <Button>
        <Button.Content>
          OK
        </Button.Content>
      </Button>
      <Button>
        <Button.Content>
          Cancel
        </Button.Content>
      </Button>
    </StackPanel.Children>
  </StackPanel>

Now, with Property Elements and Content Property Attributes and Object Element and Property Attribute from Part I, we have covered the fundamentals of XAML.

A Tour of XAML - Part I: What is XAML?

Today I begin a tour of XAML. As Chris said, I have been part of development of XAML. I came fairly late to the game, however, more closely to the end of the Avalon development cycle than at the beginning, and XAML was already fairly well defined when I arrived. I did work to push some changes into the language, primarily driving consistency and regularity in the language as well as put developing features that support versioning and extensibility. We will cover some of those things in our tour, but lets start at the beginning.

The first question we will cover in our tour is what is XAML anyway? Simply, XAML is an XML based object initialization language. Its job in life is to create a tree of objects. For example, consider the following XAML file,

  <Button Content="Hello, world!" 
    xmlns="http://schemas.microsoft.com/winfx/avalon/2005" />

This creates an instance of System.Windows.Controls.Button and sets the Content property to the sting value "Hello, world!". The reason System.Windows.Button is selected and not, say System.Windows.Forms.Button, is because of the xmlns declaration in the Button tag. The magic behind the URI "http://schemas.microsoft.com/winfx/avalon/2005" is hidden in the Avalon assemblies themselves. They contain an assembly attribute that looks like,

  [assembly: XmlnDefinition(
	"http://schemas.microsoft.com/winfx/avalon/2005",
	"System.Windows.Controls")]

It actually has several more definitions like this to associate several other CLR namespaces with the Avalon XML namespace, but this is the one used above. Because PresentationFramework.dll is in the GAC or referenced by the project if the XAML is compiled into assembly, the Avalon loader will find the XmlnsDefinition in the assembly and it is able to determine that it is Avalon's button we want to create.

Now lets take a look at the phrase Content="Hello, world!". This causes the loader to look up the property called Content and sets its value. The Content property is introduced in Button's ContentControl base class and is of type System.Object. Since there is no applicable type converter for System.Object, the loader uses the default type of an attribute, string, and calls the setter method for Content. Things are slightly more complicated than this, but this should give you the basic idea. This gives us an instance of the System.Windows.Controls.Button class with its Content property set to the value "Hello, world!", which is, hopefully, what we wanted.

If we change the XAML to,

  <Button Content="Hello, world!" IsDefault="True"
    xmlns="http://schemas.microsoft.com/winfx/avalon/2005" />

adding the phrase IsDefault="True" which causes the loader to find the IsDefault property introduced on the Button class itself. The IsDefault property is of type System.Boolean which has a type converter, System.ComponentModel.BooleanConverter, associated with it. The loader will pass the string, "True", to the type converter associated with the property (or, as in this case, the property's type) prior to setting the property's value; after which we will have an instance of the System.Windows.Controls.Button class with its Content property set to the value "Hello, world!", and its IsDefault property set to True.

This time we covered an Object Element, an XML element used to create an object, and we used a Property Attribute, an XML element used to set the value of an object instance's property. Next time we will get use a Property Element and take advantage of the Content Property Attribute.

Cider is Announced

Cider is now announced! In case you missed it at the PDC, Cider is the code name for the Visual Studio Designer for WPF (pronounced Avalon according to Chris) that will be delivered in Orcas. We also announced the Expression product line that includes a different designer called Sparkle. Other than the obvious difference between Sparkle and Cider, Cider is in VS, Sparkle is a stand-alone product, Cider will be a designer for developers, Sparkle will be a designer targeted at professional designers.

For those that were asking about what I was working on, I wasn't trying to be coy, I just didn't want to spill the beans early.

I will be more forthcoming in the months ahead and try to keep you up-to-date with our progress.

If I missed you at the PDC, I am sorry about that. Feel free to ask your questions here or via e-mail (chuckj directed through microsoft.com).

Off to the PDC

I am off to PDC. This will be my first PDC as an "insider" instead of a regular attendee, I am looking forward to it. We are announcing some very exciting things! I am not going to be presenting but I will be available to chat. I plan to spend most of my time in the "Big Room". If you don't know what that is now, you will once you get there. I should be easy to spot, I will be the guy in a blue Microsoft shirt and tan pants... ;-).

Comment Toggle

When writing a prototype for a syntax highlighter, more years ago than I care to admit, I came across an interesting property of the two character comment delimiters found in the C family of languages as well as Pascal. I have shown this to several people and many had never seen it before so I thought I would share it more widely.

Take C# for example, you can quickly toggle between two blocks of code by using the three characters /*/ in the middle. These three characters act as a comment toggle. If you are in a block comment, it will terminate the comment, if not, it starts one. This combined with the unconditional comment terminator sequence /* */ which ensures that the characters follow are not in a block comment,  but is still a legal sequence of characters outside a comment, you can toggle between blocks of code. For example,

    /* */
    Console.WriteLine("Option 1");
    /*/
    Console.WriteLine("Option 2");
    /* */

This will print “Option 1” to the console. The second WriteLine() is commented out. With a simple one character change, removing the trailing “/” in the first comment, will cause “Option 2” to print instead.

    /* *
    Console.WriteLine("Option 1");
    /*/
    Console.WriteLine("Option 2");
    /* */

The first WriteLine() is commented out and the second is now not. I use this sometimes to quickly switch between two optional implementations of routines when I am rewriting code, or when constructing test applications.

You can use the unconditional comment terminator to alternate between several versions such as,

    /* */
    Console.WriteLine("Option 1");
    /* *
    Console.WriteLine("Option 2");
    /* *
    Console.WriteLine("Option 3");
    /* */

To select a different option you need to change two characters, such as removing the trailing "/" of the first comment and adding one to the third, as in,

    /* *
    Console.WriteLine("Option 1");
    /* *
    Console.WriteLine("Option 2");
    /* */
    Console.WriteLine("Option 3");
    /* */

which will cause "Option 3" to print instead of "Option 1".

Pascal can do the same thing with the its block comment as in,

    (* *)
    Writeln('Option 1');
    (*)
    Writeln('Option 2');
    (* *)

Note that some implementations of C and C++ support nested comments and, therefore, the sequence /*/ always starts a comment instead of toggling, so this sequence is not as useful.

The reason this was important to my prototype is a bit hard to explain and is not nearly as interesting as the toggle itself.

Sorting IV - Radix Sort

So far we have learned a couple things about making algorithms faster from sorting,

  1. Look for relationships in the data (or relationships between the operations performed on the data) you are not taking advantage of. In our case this was the transitive property of comparisons.
  2. Try organizing the data differently that might better take advantage of a relationship. In our case, this was a tree instead an array.

To make the sort any faster we need something that can get performing better than an O(N log N). For this we need to take advantage of some property of the data we have been ignoring. We have examined some general sorting techniques that can be applied to any type of array element as long as any two elements can be compared. If, on the other hand, we assume we are sorting integers we can get a bit more clever.

If we have an array of size N and knew that we that the array contained values from 1 to N, we could trivially sort the array by filling it with values 1 to N, in order, such as,

  static void FillArray(int[] a) {
      for (int i = 0; i < a.Length; i++)
          a[i] = i;
  }

Even though this isn't very useful, since such a case rarely, if ever, occurs, it tells us that if we can make assumptions about the data we are dealing with, we can make a radical improvement on the sort. Can we take a similar approach with an array that can contain any value of integer? An integer is actually made up of 4 bytes (in C#, in other languages your mileage may vary). Each byte ranges in value from 0 to 255. If you had 256 buckets, you could put all the integers into one of the 256 buckets based on the most significant byte of the integer. You could then take one of these buckets and do the same thing each of its values using instead the second most significant digits. You can then repeat the whole thing 4 times for each of the bytes and you can see you would have the array completely sorted but an enormous cost of memory. You could take a recursive approach but you would still wind up with at least 1024 (4 * 256) buckets active at the same time, but it looks promising since this appears O(N).

It turns out that we can reduce the bucket count to 256 by starting with the least significant byte and ensuring that for subsequent bytes we distribute to their buckets in order. Before I explain why this works, lets at some code that implements it,

    static void RadixSort(int[] a) {
        List<int>[] buckets = new List<int>[256];

        for (int i = 0; i < buckets.Length; i++)
            buckets[i] = new List<int>();

        for (int j = 0; j < 4; j++) {

            for (int i = 0; i < a.Length; i++) {
                int index = (a[i] >> j * 8) & 0xFF;
                buckets[index].Add(a[i]);
            }

            int idx = 0;

            for (int i = 0; i < buckets.Length; i++) {
                List<int> bucket = buckets[i];
                for (int k = 0; k < bucket.Count; k++)
                    a[idx++] = bucket[k];
                bucket.Clear();
            }
        }
    }

This works because after the first pass the entire array is ordered by the lowest order byte. This is because we rebuild the array in bucket order, all the 0 bytes first, 1 bytes after them, etc. The second pass removes the values from the array in order and places them in the buckets. Since they are inserted into the buckets in sorted order, if we extract them from the buckets in order they will still be sorted; therefore, after the second pass we will have an array sorted by the lowest two significant digits. The third pass gives us an array sorted by the lowest three significant digits, and finally, the last pass gives us a sorted array.

The above algorithm still takes up quite a bit more memory than the original array we are sorting. Even though the buckets are not that big they are still a lot bigger than the elements themselves (each only 4 bytes). To further reduce the size we can take advantage of the fact that the total number of elements in all the buckets will be N. We could use a temporary array the same size as the original array for our buckets. The trick with this is to know where each bucket should start. To know this we need to know how many items will be in each bucket. If we take an additional pass on the data we can use one pass to figure out how many items are in each bucket and then, in the second pass, distribute the data into each bucket. The following version of the radix sort takes this approach,

        static void RadixSort(int[] a) {
            int[] source = a;
            int[] destination = new int[a.Length];
            int[] indexes = new int[257];

            for (int j = 0; j < 4; j++) {
                for (int i = 0; i < indexes.Length; i++)
                    indexes[i] = 0;
                for (int i = 0; i < a.Length; i++) {
                    int index = (source[i] >> j * 8) & 0xFF;
                    indexes[index + 1]++;
                }
                for (int i = 1; i < indexes.Length; i++)
                    indexes[i] += indexes[i - 1];
                for (int i = 0; i < a.Length; i++) {
                    int index = (source[i] >> j * 8) & 0xFF;
                    destination[indexes[index]++] = source[i];
                }
                int[] temp = source;
                source = destination;
                destination = temp;
            }
        }

This algorithm has one overly clever trick; to avoid copies it uses the original array as the bucket storage for the even passes by swapping source and destination. Since it does this, after the last pass the original array will contain the fully sorted data. If the key length is odd, the final array would need to be copied from the bucket storage array.

This looks O(N), but is it? As it turns out, for non-duplicate values, it is still O(N log N) because it takes O(log N) digits to represent each integer value uniquely (here our digits where 0-256 but the above would work with 2 buckets where we used bits instead of bytes). For each digit we need a pass over the data giving us O(N log N). In practice, such as above, the arrays are not totally filed to capacity (and if they were, as we also noted above, we could use an O(N) algorithm) so the radix sort tends to perform better than O(N log N). Some of the problems with the radix sort is that it consumes memory proportional to the size of the data where a QuickSort only takes O(log N) amount of data (consumed as stack space in my example) and the HeapSort and Selection Sort take a fixed size of data. Also my naive implementation of the radix sort above doesn't handle negative integers correctly. I will leave why and how to correct it as an exercise to the reader ;-).

Here we have learned a third useful tool to help us make our algorithms faster, see if there is some characteristic of the data itself you can take advantage of. You will end up with a less reusable algorithm but it might be much faster.  In my tests, on array sizes of 4,000,000 integers of random data, the radix sort above is about twice as fast as an optimized version of a QuickSort.

Sorting Part III - HeapSort

Last time we looked at the QuickSort and found that it was much better, in most cases, than the selection sort, but there was still a pathological case where the QuickSort wasn’t better. Can we guarantee we are better than the O(N2) of the selection sort? Can we guarantee O(N log N)? To do this we need to find something in QuickSort we can do better. The Achilles’ heal for the QuickSort is selecting the median value of the partition. Several techniques have been developed to avoid the worst case scenarios (like selecting three values and taking the median; in other words, median of three, or just picking a random item in the partition) but none of these guarantee O(N log N). We need a more radical approach.

We need a way of preserving the comparisons; much like QuickSort does, but that doesn’t use partitions. As is often the case, when you need a radical new approach, you should try a different data structure (much like choosing a different coordinate system in calculus). If we organize the array into a binary tree, we can find the largest element in the array in O(N log N) comparisons. This is because each element only needs to be compared with its parents, which is O(log N) comparisons, and we need to do this for every element in the in the array, O(N), giving us O(N log N). This sounds promising, but how do you organize an array into a binary tree? This is easier than it sounds. If you number an array from 1 to n, where each number is the index of the elements in the array, you can treat the array as a binary tree by defining the left node of any element i to be at index 2i and the right node to be at 2i+1 (or the other way around, if you must, but I like to think of the odd children as right). To create the desired tree, we need to ensure that for every element in the array at index i, it is the greater than its children. In other words, for all elements of array a, ai ≥ a2i for 2i ≤ n and ai ≥ a2i+1 for 2i+1 ≤ n. An array that obeys this constraint is called a heap (hence the name HeapSort).

Forming a heap is fairly straight forward and follows the informal description given above; you swap each element, from 2 to n, with its parent until you come across a parent that is greater than the element. The algorithm for this might look something like,

    static void FormHeap(int[] a) {
        // form the heap
        for (int i = 1; i < a.Length; i++) {
            int j = i;
            while (j > 0) {
                int k = (j + 1) / 2 - 1;
                if (a[k] < a[j])
                    Swap(a, j, k);
                else
                    break;
                j = k;
            }
        }
    }

Note, that since C# numbers the array from 0 to n-1 instead of from 1 to n, we need to subtract add one to the index j and then subtract it again while calculating the parent’s index.

Now that we have the have the greatest element in the array, we know which element should be at the end of the array. We need only swap it with the last element of the array to put it there, but doing so will violate the heap constraint above since an is guaranteed to be less than or equal to both a2 and a3. To restore order back to the heap we must push the new first element down in the heap until it is greater than or equal to both of its children or it is childless. This can be done by selecting the greatest child and swapping positions with it then repeating until it is greater than the children or it has no children. Since we are following a single line of succession, an element will only have, at most log2n generations (or levels, as it is more traditional called) below it. This means we can restore order to the heap in O(log N) comparisons. This might look like,

    static void ReformHeap(int[] a, int i) {
        int j = 0;
        while (true) {
            int k = (j + 1) * 2 - 1;
            if (k >= i)
                 break;
            if (k + 1 >= i || a[k] > a[k + 1]) {
                if (a[j] > a[k])
                    break;
                Swap(a, k, j);
                j = k;
            }
            else {
                if (a[j] > a[k + 1])
                    break;
                Swap(a, k + 1, j);
                j = k + 1;
            }
        }
    }

We now need to do this n times to completely sort the array, meaning that we are guaranteed to sort a heap in O(N log N) comparisons. This gives us O(N log N) operations to form a heap and O(N log N) operations to sort a heap which means that we can sort any array, using a heap in O(N log N) (since big O notation throws away the constant 2 in 2N log N). Putting this together gives us the HeapSort invented by J.W.J. Williams,

    static void HeapSort(int[] a) {
        // form the heap
        for (int i = 1; i < a.Length; i++) {
            int j = i;
            while (j > 0) {
                int k = (j + 1) / 2 - 1;
                if (a[k] < a[j])
                    Swap(a, j, k);
                else
                    break;
                j = k;
            }
        }

        // Pick the top of the heap and place it last, reform the 
        // heap, then take the new top and place it next to last.
        for (int i = a.Length - 1; i > 0; i--) {
            // The top of the heap contains the highest value, swap
            // it the last member
            Swap(a, 0, i);
            // Reform the heap it. It contains a relatively low
            // value on top. Replace it with the greatest child.
            int j = 0;
            while (true) {
                int k = (j + 1) * 2 - 1;
                if (k >= i)
                    break;
                if (k + 1 >= i || a[k] > a[k + 1]) {
                    if (a[j] > a[k])
                        break;
                    Swap(a, k, j);
                    j = k;
                }
                else {
                    if (a[j] > a[k + 1])
                        break;
                    Swap(a, k + 1, j);
                    j = k + 1;
                }
            }
        }
    }

Here we have it. An algorithm that is guaranteed to perform on the order of O(N log N) operations to sort an array! And here, you are apt to say, “Why doesn’t everyone use a HeapSort instead of a QuickSort if HeapSort is so much better?” Good question, I am glad you asked. There are several reasons why, but I will save that for another time.

Sorting Part II - QuickSort

As we noted in Part I, a selection sort is an easy sort to write but unfortunately it performs on the order of n2 comparisons. To produce an improved version we need to find a piece of information that the selection sort is throwing away and take advantage of it.

Selection sort will compare a value of the array with all of the other values of the array to determine which is the smallest. It doesn’t take advantage to the fact that comparing say, a10 to a20 and comparing a20 to, say, a40 could tell us something about the relationship between a10 and a40. What we could do is take some element of a and compare it to every other element in a and partition the array into two pieces, one containing all the elements less than the selected element and all elements greater than or equal to the selected element. By transitivity, we would know that each element in the first partition is less than all elements in the second partition even if we don’t compare them directly. We could then partition the partitions again and again, in Xeno-esque fashion, until the array only contains partitions of one element in length. Since each partition will be less than its successor, the array will be sorted.

An algorithm that takes this very approach is the QuickSort, invented by C.A.R. Hoare. A version of which looks like the following,

    public class Sorts {

        public static void QuickSort(int[] a) {
            QSort(a, 0, a.Length - 1);
        }

        private static void QSort(int[] a, int l, int r) {
            // Partition the array into two parts, one with all values 
            // greater than a certain value in the array, one with all 
            // less.
            int v = a[(l + r) / 2];
            int i = l;
            int j = r;
            while (i < j) {
                // Find the first values that don't belong and swap 
                // them. Repeat until all values are in the right 
                // partitions.
                while (a[i] < v)
                    i++;
                while (a[j] > v)
                    j--;
                if (i <= j) {
                    if (j > i)
                        Swap(a, i, j);
                    i++;
                    j--;
                }
            }
            // Recursively partition the array until we can't any 
            // longer. When this happens, the array is sorted.
            if (j > l)
                QSort(a, l, j);
            if (r > i)
                QSort(a, i, r);
        }
    }

If we get lucky and always guess the median value of each of the partitions we will only perform O(N log N) comparisons. The worst thing we can do is select a value that, given an n length partition will give us two partitions one of length n-1 and the other of length 1. This would put us back where we started; with the selection sort. This would happen if we selected the first (or last) element in the partition and the array was already sorted. To avoid this, the above code selects an element from the middle of the partition, so, if the array is sorted, it will be the median. If the array is in random order, it is not better or worse than the first element.

This, however, doesn’t guarantee we will get the elusive O(N log N) performance. Many cases we will be less than O(N log N), those cases where we don’t magically pick the median. In some pathological cases, we could arrive very close to O(N2). An example of which is an array ordered 1..m,1..m where m is n/2. Next we will examine an algorithm that guarantees O(N log N) performance.

Sorting Part I - Selection Sort

When I run across a particularly fast elegant algorithm I often find myself asking, “Why it is fast?” How can I make my algorithms fast like the author of this one did? I have found that the answer usually is found in how the algorithm manages data. How does the algorithm preserve work and prevent recalculating information? To step back a bit, there are only three ways to make some code go faster, find a way to make what it is doing faster, find a way to do less, find some other time to do it. The most elegant algorithms do the second well; they do less. For a few entries I want to look at various algorithms to see what makes them fast (or not so fast).

The first set of algorithms I want to investigate are some of the most fundamental, sorting. The simplest sort to write is the selection sort. In C# it looks like,

  static void SelectionSort(int[] a) {
         for (int i = 0; i < a.Length - 1; i++)
             for (int j = i + 1; j < a.Length; j++)
                 if (a[i] > a[j])
                     Swap(a, i, j);
  }

where Swap looks like,

  static void Swap<T>(T[] a, int i, int j) {
      T v = a[i];
      a[i] = a[j];
      a[j] = v;
  }

This sort finds the smallest item in the list and puts it first, and then it finds the second smallest and puts it second, etc. Even though it is very easy to write and relatively easy to verify, it isn’t very fast. We could improve this by in-lining the Swap() method. Additionally, we can limit the number of swaps to a.Length by only swapping the value we know is the smallest. A version that includes both optimizations is,

  static void ImprovedSelectionSort(int[] a) {
      for (int i = 0; i < a.Length - 1; i++) {
          int min = i;
          for (int j = i + 1; j < a.Length; j++)
              if (a[min] > a[j])
                  min = j;
          if (i == min) continue;
          int v = a[i];
          a[i] = a[min];
          a[min] = v;
      }
  }

These optimizations are examples of making what we are doing faster. Unfortunately, we still with do on the order of O(N2) comparisons even though we reduced the number of moves to O(N) and removed any overhead of doing a call in the middle of the loop. What we really want to do is less. To do less we need to come up with information we are not using or information we are throwing away.

Since we are trying to optimize the number of comparisons (since we got the moves down to O(N)) we should examine them closely. We need to come up with a way to make each comparison do more. One thing we should notice is the above algorithm totally ignores the transitive nature of a compare. That is if A > B and B > C then A > C. If we know that a[10] > a[11] and a[11] > a[12] we don’t need to physically do the comparison between a[10] and a[12] to know at a[10] > a[12], but how do we take advantage of that? In my next "sorting" post I will investigate an algorithm that takes advantage of this.

My Life as a GUI Builder Builder

Frankly I am not sure how to respond to Beware The GUI Builder. I have spent nearly all my programming career building what this article describes as  GUI builder (I have always called them designers, but that's me, I hate the term GUI). It is not often you come across an opinion piece that condemns nearly your entire body of work as worse than worthless (unless you are a lawyer or politician, that is). Let me see if I can find myself and respond.

At this point, if you haven't already, read the referenced article, if just for the clever poem. My response will make more sense that way (well, hopefully, anyway), and I apologize for my lack of a poem (well, not really; be glad I didn't try one).

"major failings of this application genre"

Don't be fooled (OK, you can be fooled if you want but I don't have to like it). This article is really just a diatribe about the short-coming of a particular unnamed Java based SWT GUI builder with passing references to Swing builders. Many of the flaws of "GUI Builders" it points out are not really faults of the entire genre, but, rather, flaws in the combination of Java, SWT, and this unnamed GUI builder. I wish the author would have named it, at least I could then verify the claims made in the piece, but I will trudge ahead anyway, blissful in my ignorance. Many of the flaws are common to a wide range of designers, but not all, and certainly not mine, of course ;-).

Reliance on Static Code Analysis

Not all designers rely on "static code analysis". Many, such as all the ones I have worked on, store the result of the designer in as a serialized image of the form. This is what code generating form designers do as well, such as WinForms, but the format of the serialization of the form is in code. I also have to quibble with the description of how the loading of the form is done, "By 'static code analysis', I refer to those aspects of the code's behavior that can be determined from consideration of it's structure alone, without regard to it's runtime behavior." I am not sure what this means. All the code-generating form designers I am familiar with have a language interpreter that can execute a subset of the serialization language. The subset is usually restricted from executing flow control logic, but not always as in Visual dBASE. The reasons for these limitations is to simplify translating gestures in the designer to changes in the code that was generated. In an example given in the article, it would be difficult for a designer to figure out the mapping between the array,

    String[] colors = { "Red", "Orange", "Yellow",  "Green", "Blue",
        "Indigo", "Violet" };

and the value passed to the setText() method in the for loop (which isn't actually there in the example, but should be). But, in my mind, this is no excuse, if the language allows it the designer should support it. My solution in the past is to use a language that only allows what the designer supports. In the case of Delphi it was the .DFM file format, and currently, in my "new" job, it is XAML. In these cases, the language is designed in such a way that it is limited to supporting only that which the designer can figure out. This is one of the several reasons I don't like the code generator style of serialization, programmers are tempted to futz with it and the will always use a feature of the serialization language the designer doesn't support. The authors has a point, Java does not make very good serialization language (nor does any other general purpose language).

Going back to the example used in this section (instead of skipping ahead as above), localization and handling dynamic values are typically where designers excel, not fall down. If you separate the logic of the form from the initialization state, you can have localizers just edit the initialization information, not your code (good reason not to code-gen, by the way). The article refers to resource bundles in "capable" GUI builders but it goes beyond that. The entire initialization state should be localizable, not just strings or what the developer might feel generous about allowing the poor localizer to fiddle with. The localizer might need to entirely reorient the form for a right-to-left language, reorganize the form in a more culturally friendly order, etc. If layout is done in code this is notoriously difficult, if not impossible, for localization teams to deal with. Most localization teams I have had the pleasure to interact with require the framework to have a form designer for these very reasons. Don't layout a form in code. Just don't. What assumption you have about how the form will look is culturally biased. Writing layout in code just makes the bias impossible to remove. Localization is a tough, thankless job. Don't make it harder.

As for dynamic values based on "male" and "female", is easy to do a one-liner, but data binding is what is really needed here. A form should follow good MVC (Model-View-Controller) principles, where appropriate, and this example screams model-view. If you have a model that controls changes the salutation options, a good class library will have a data-bound version of a radio group that will update accordingly. This is not a GUI builder problem, it is a framework/data-binding problem.

Masking Understanding and Inhibiting Learning

This whole section is arguing against a false premise. No designer, ever, was written with the idea that the user would never have to code again. (Well, I have to admit, the marketing teams for them often have said that, but they were lying). Good designers are written to smooth and facilitate the transition between designing a form and writing code, allowing the user to do those things that are better done in a designer in the designer, and leave those things better done in code, done in code. They also where never intended be the only way you learn the framework (although they do help a lot) or take the place of good framework knowledge. Their job is to make using the framework easier, period. The author has a good point, most of your time is not spent in the designer, you spend most of you time in code. That is when the designer really shines, when you don't have to use it much to get the job done.

"How do you visually represent these infinitely variable sets of constraints amongst components, and in such a way that they can be configured with mouse clicks and keyboard operations? The answer is 'With great difficulty'."

This is a criticism of frameworks, not designers. The framework should present a model where this is straight forward. The designer should make using it easy to use and easily discoverable. Also, remember, doing this in code might be easier for you, it just makes other people's jobs impossible. You are not the only one that needs to control the layout of the form. A localizer might have a different idea of where the OK button should go. And, to be blunt, if your code has "infinitely variable sets of constraints" on anything, you should consider a different line of work.

Missed Opportunities for Reuse

I covered most of this above, use data-binding; understand data-binding; love data-binding. The second part, the address form example, is a job for nested forms or data templates. Each of the address should be a an instance of a single nested form or data template (depending on the framework you choose). This is where data-binding shines again since moving data in an out of controls is what data-binding does. This is a case where you shouldn't need to write much code. The code is shared if you use the sharing techniques provided by the framework and the designer.

Dependency Upon Another Tool

Ideally the framework and the designer come from the same company and are rev'ed at the same time. If you get your designer from two different places, the author has a point. In my opinion, the designer and the framework should be developed together with full knowledge of each other. In Java, these are typically provided separately for reasons I will never understand.

Signs That Your Predecessors Used A GUI Builder

Most of these are either artifacts of code generation, covered above, or just signs that your predecessor didn't leave because he wanted to. You can write bad code in any language and and using any tool. These are common problems, granted, but well written code is not common. As a retort, here are some signs that your predecessor didn't use a GUI designer,

  • The form looks hideously ugly; something only a Unix user would love.
  • You are considering rewriting it because you cannot change it without breaking something in the layout.
  • There are dead spots, places where the controls are obviously missing on the form. This happened because it was just easier to hide the control than figure out how to re-layout the form or because the predecessor forgot to save the "prototype".
  • Your not sure the form you are looking at in the UI or what code creates it.
  • The localization team curses when your predecessor's name is mentioned.

OK, so mine are not as witty...

Conclusion

They have some utility for mocking up interfaces. You can create static "snapshots" of your interface and show them to users, perhaps employing printouts of them as paper prototypes.

This is truly damning with faint praise. If this is all you use a designer for, go get a better designer. Seriously. They exist and hopefully you now know what to look for in order to find a good one.