If you are using a loop, you're doing it wrong

"If you are using a loop, you're doing it wrong."

That is the advice one of my college professors told us when he was teaching us APL. APL was designed to perform vector and matrix operations. Programming in APL is an exercise in stringing operators together to perform strange and wonderful things. Using a loop is just wrong and slows things down.

It is similar with LINQ, if you are using a loop you are doing it wrong. I find myself doing a lot of prototyping lately and I am forcing myself to use LINQ; not because I don't like it, far from it, I really like LINQ, but using loops is so ingrained into my psyche that I have to stop myself and force myself to think in LINQ. Every time I am tempted to write a loop that involves a collection or an array I ask myself, could I use LINQ instead? Programmers with more of a database background seem to take to LINQ like a duck to water. They think in sets and vectors, I don't, but I am getting there.

For example sometimes I find myself needing to return an IEnumerable<T> when I have a List<T> but of a different T. This often happens when I want to keep internal implementation details private. My internal List<T> might have the actual class but I need to return an enumerable for some interface. Before I would simply write a loop using the C# 2.0's yield syntax,

List<Foo> foos = new List<Foo>();

...

public IEnumerable<IFoo> Foos {
    get {
        foreach (Foo f in foos)
            yield return f;
    }
}

This loop involves a collection, can I use LINQ? Sure! By using LINQ's Cast<T>() method this can be replaced with,

public IEnumerable<IFoo> Foos {
    get { return foos.Cast<IFoo>(); }
}

If you are trying to find if a list contains an object by some name, you could write a loop like,

public bool Contains(string value) {
    foreach (Foo foo in foos)
        if (foo.Name == value)
            return true;
    return false;
}

Using LINQ this might look like,

public bool Contains(string value) {
    return (from foo in foos where foo.Name == value select foo).Any();
}

A nice thing about LINQ is you can perform complicated queries in pieces. With deferred execution of enumerators, this is fairly efficient as well. This is really helpful in the debugger. In one chunk of code I was writing I needed to coalesce adjacent ranges that are marked with the same text value. Assume you have a structure called Range that looks like,

struct Range {
    public int Start;
    public int Length;
}

and another struct that labels the ranges with names,

struct NamedRange {
    public string Name;
    public Range Range;
}

Now lets have a routine that calculates the range information over some stream,

public IEnumerable<NamedRange> GetNamedRanges(Stream stream) {
    ...
}

Lets assume that name ranges are some things like "whtespace", "reserved", "identifier", "number" "string", etc. as you might expect to receive from a lexical scanner like found in this post.

What I want to do with these ranges is to convert the names into styles such as you might find referencing a CSS style sheet. So, in effect, I am mapping NameRange values to StyledRange values where StyleRange would look something like,

struct StyledRange {
    public Style Style;
    public Range Range;
}

Lets create a dictionary that maps range names to styles such as,

styleMap["number"] = new Style() { Name = "green" };
styleMap["reserved"] = new Style() { Name = "bold" };

I only wanted to highlight numbers and reserved words, for everything else I will use the default style,

defaultStyle = new Style { Name = "normal" };

We can translate our named ranges directly into styled ranges by using a LINQ query expression such as,

var ranges = GetNamedRanges(stream);
var styledRanges = from range in ranges
                   select new StyledRange() {
                       Style = styleMap.MapOrDefault(range.Name) ?? defaultStyle,
                       Range = range.Range
                   };

where MapOrDefault() is an extension method for IDictionary<TKey, TValue> that looks like,

public static TValue MapOrDefault(this IDictionary dictionary, TKey key) {
    TValue result;
    if (dictionary.TryGetValue(key, out result))
        return result;
    return default(TValue);
}

which is patterned after the existing LINQ methods for IEnumerable<T>, FirstOrDefault() and LastOrDefault().

Since many of the ranges that have different names will have the same style, it would be nice to coalesce adjacent styles together so no two adjacent ranges have the same style. In other words, we only want a new styled range when the style changes. The above query expression just produces a one-to-one mapping of named range to styled range. What we need is something that will merge adjacent ranges. Do do this I will introduce another extension method, Reduce<T>() for IEnumerable<T>,

public static IEnumerable<T> Reduce<T>(this IEnumerable<T> e,
    Func<T, T, bool> match,
    Func<T, T, T> reduce) {
    var en = e.GetEnumerator();
    T last;
    if (en.MoveNext()) {
        last = en.Current;
        while (en.MoveNext()) {
            if (!match(last, en.Current)) {
                yield return last;
                last = en.Current;
            }
            else
                last = reduce(last, en.Current);
        }
        yield return last;
    }
}

What this method does is if two adjacent elements match (as defined by the match delegate returning true) they will be reduced into one element by calling the reduce delegate. For example, the Sum<T>()standard extension method could be implemented using Reduce<T>() as,

var sum = numbers.Reduce((a, b) => true, (a, b) => a + b).First();

Now that we have Reduce<T>(), lets reduce the list of styled ranges to coalesce the adjacent ranges with the same style. This can be done by,

styledRanges = styledRanges.Reduce(
    (r1, r2) => r1.Style == r2.Style,
    (r1, r2) => new StyledRange() {
        Style = r1.Style,
        Range = MergeRanges(r1.Range, r2.Range)});

MergeRanges() referenced above, is,

Range MergeRanges(Range r1, Range r2) {
    return new Range() { Start = r1.Start, Length = r2.Start - r1.Start + r2.Length };
}

In my example, this took 18 ranges for a typical line of C# source down to 7 ranges. Pretty good. But I noticed that some of those ranges were styling whitespace as "normal". This seems like a waste; why switch back from green to black text for writing whitespace? Why not combine those with adjacent ranges instead? A simple approach to this is to add the following code prior to mapping the styles,

ranges = ranges.Reduce(
    (r1, r2) => r1.Name == "whitespace" || r2.Name == "whitespace",
    (r1, r2) => new NamedRange() {
        Name = r1.Name == "whitespace" ? r2.Name : r1.Name,
        Range = MergeRanges(r1.Range, r2.Range)
    });

This says to merge whitespace ranges with adjacent non-whitespace ranges. This reduces the range count from 7 to 4. Not bad from the original 18 and that was just for one line of source. This savings adds up quickly over an entire file.

The complete example, made as a function, looks like,

IEnumerable<StyledRange> StyleRanges(IEnumerable<NamedRange> ranges) {

    // Merge whitespace ranges with adjacent non-whitespace ranges
    ranges = ranges.Reduce(
        (r1, r2) => r1.Name == "whitespace" || r2.Name == "whitespace",
        (r1, r2) => new NamedRange() {
            Name = r1.Name == "whitespace" ? r2.Name : r1.Name,
            Range = MergeRanges(r1.Range, r2.Range)});

    // Map named ranges to styles.
    var styledRanges = from range in ranges 
                       select new StyledRange() { 
                           Style = styleMap.MapOrDefault(range.Name) ?? defaultStyle,
                           Range = range.Range };

    // Merge adjacent ranges with the same style.
    styledRanges = styledRanges.Reduce(
        (r1, r2) => r1.Style == r2.Style,
        (r1, r2) => new StyledRange() {
            Style = r1.Style,
            Range = MergeRanges(r1.Range, r2.Range)});

    return styledRanges;
}

There are a few nice things about this function. First, it builds up an IEnumerable<T> but this enumerable doesn't execute until one of its enumerators is enumerated. This is due to the deferred execution of enumerators. Second, even though deferred execution makes single step debugging challenging, you can get a picture of what the function will do by using ToArray() in the watch windows on the intermediate results. This allows you to inspect the intermediate result to see if the mapping or reducing is what you expected. Third, this routine has no loops. It operates on each enumerable as a set. Now some of you will rightly say there is a loop buried in the Reduce<T>() method. True; Reduce<T>() is just like many other LINQ extension methods, they contain loops implied by the returned enumerator. But LINQ allows me to write code that communicates what I am trying to do without it being obscured by the details of how it is done. I think of LINQ not so much as a set of features in C# but a way of programming. A way of programming that, if you are using loops, you are doing it wrong.

03/31/2008 2:48 AM

Fantastic post!!

I'm a newbye to LINQ, so reading this examples can make me understand it better.

Greetings from Spain

ragundo | mailto:ragundoAT NOSPAMgmail dot com | ragundoAT NOSPAMgmail dot com

03/31/2008 7:30 AM

return (from foo in foos where foo.Name == value select foo).Any();

????

Ahem.....

return foos.Any(foo=>foo.Name == value);

James Curran | www.honestillusion.com | jamescurranAT NOSPAMmvps dot org

03/31/2008 8:39 AM

You make good points. It is similar to trying to teach people that iterating through database tables with cursors is a bad idea; not just because of the performance implications, but because you are not using the set-based features of SQL to communicate your intent.

I do have one gripe though.

"This loop involves a collection, can I use LINQ? Sure! By using LINQ's Cast<T>() method this can be replaced with..."

Cast<T> is not a LINQ method. It is an extension method on IEnumerable<T> which happens to also implement one of LINQ's standard query operators. LINQ = "Language INtegrated Query", but there is nothing resembling a language integrated query in "return foos.Cast<IFoo>();"

I am honestly not trying to be pedantic :) But I have noticed a trend lately among many .NET bloggers to conflate the term "LINQ" with a lot of C# 3.0 features which, while they help enable the LINQ syntax, are not actually part of LINQ themselves. I bring it up only because I have encountered many developers who are trying to get up to speed in C# 3.0 who become confused when members of the C# community lump non-LINQ features into the LINQ feature set.

David Nelson | www.commongenius.com | davidAT NOSPAMcommongenius dot com

03/31/2008 9:41 AM

> LINQ = "Language INtegrated Query"

I think you are letting your SQL background color your interpretation of LINQ. When you see LINQ, I suspect you see a SELECT statement written a little funny. When I look at LINQ, with my procedural background, I see function calls written a little funny. You can make an argument that you are more right than I am because of the name, who's to say whether Cast<T>() is more LINQ than Select<T>() or TypeOf(). Must an expression only filter to be truely LINQ? I don't believe I am using the term LINQ incorrectly to include Cast<T>().

Like I said in the post, I don't think of LINQ as a set of C# features but as a way of programming. A way that includes methods that imbed loops like Cast<T>().

Chuck Jazdzewski | www.removingalldoubt.com | chuckjAT NOSPAMmicrosoft dot com

03/31/2008 9:45 AM

> return foos.Any(foo=>foo.Name == value);

Yep. That would have been more concise. I knew I would make such mistakes when I started this blog, hence the name.

Chuck Jazdzewski | www.removingalldoubt.com | chuckjAT NOSPAMmicrosoft dot com

03/31/2008 11:00 AM

"I think you are letting your SQL background color your interpretation of LINQ."

I'm not sure why you jumped to the conclusion that my "SQL background" would be the primary lens through which I view LINQ, especially since you have no idea what my background is. I do have a decent understanding of SQL which I have acquired through years of developing database-backed applications. But it is hardly the overriding perspective from which I view imperative programming language features.

In actuality, I see LINQ as a hybrid between imperative languages (like C#) and declarative languages (like SQL), and so far my opinion is that it doesn't fit particularly well into either paradigm. Maybe it isn't supposed to, but the fact that it is a language feature in an imperative language makes me wish that it fit into that paradigm better than it does.


"...who's to say whether Cast<T>() is more LINQ than Select<T>() or TypeOf()...I don't believe I am using the term LINQ incorrectly to include Cast<T>()."

I think you misunderstood. Obviously Cast<T> is just as much a standard query operator as Select<T> or TypeOf. My problem is not that you included Cast<T> with the others; it is that you would refer to standard query operators as a part of LINQ, when they are not.

If you look through the specification for standard query operators (http://download.microsoft.com/download/5/8/6/5868081c-68aa-40de-9a45-a3803d8134b8/standard_query_operators.doc), you will not find any mention of LINQ. That is because they are not part of LINQ; they are "an API that enables querying of any .NET array or collection." LINQ takes advantage of that API in order to implement a "language-integrated" form of querying; but to call those operators a part of LINQ is to put the cart before the horse. LINQ is built on top of the standard query operators, not the other way around.

Again, I am not trying to be pedantic, or to nitpick at you for the sake of nitpicking. My concern is to address a common misconception which I have personally observed as being confusing to developers who are trying to learn C# 3.0.

David Nelson | www.commongenius.com | davidAT NOSPAMcommongenius dot com

03/31/2008 11:39 AM

> you have no idea what my background is...

You are right. Sorry about that. I had no intent to offend. I am sorry if any was taken.

> ...but to call those operators a part of LINQ is to put the cart before the horse.

You are defining the LINQ too narrowly. The query operators are just as much a part of LINQ as the syntactic sugar used to call them.

To switch metaphors, LINQ is not the engine, the cars or the caboose; it is the whole train.

I don't think lumping query operators in with LINQ is in any way a misconception. They were created as part of the LINQ project, for use in LINQ expressions, called by the de-sugared LINQ syntax, etc. If you are finding people confusing the two, they are not confused. They might be confused, however, by the insistence on such a distinction.

Chuck Jazdzewski | www.removingalldoubt.com | chuckjAT NOSPAMmicrosoft dot com

03/31/2008 3:01 PM

"I had no intent to offend."

Not offense taken. I just prefer that my arguments be evaluated on their merit rather than on an ill-conceived perception of my own perspective.

"To switch metaphors, LINQ is not the engine, the cars or the caboose; it is the whole train."

That's not the way I see it. If that were the case, it would not be called "Language INtegrated Query"; it would have a more general name which did not include the concept of a language. The fact that the designers specifically chose to refer to the feature as "language integrated" means to me that they intended the language-integrated parts to be considered separately from the language-neutral parts (such as the standard query operators); otherwise why give it that name? To me, LINQ in C# is the query syntax which operators over instances for which standard query operators have been defined (either with instance or extension methods). The standard query operators enable LINQ, but they are not part of it.

This is the source of the confusion that I referenced. Since the framework libraries which contain the standard query operators over IEnumerable are not language-specific, they cannot be considered to be a part of any language-specific feature (such as LINQ).

Perhaps I am wrong. Perhaps I have completely misinterpreted the intentions of the .NET design teams who created LINQ, and they actually intended it to be a cross-language feature with language-specific components. However, if that is the case, then I have to conclude that LINQ ("Language INtegrated Query") is the single most poorly named product in Microsoft's history.

David Nelson | www.commongenius.com | davidAT NOSPAMcommongenius dot com

03/31/2008 4:24 PM

> ... and they actually intended it to be a cross-language feature with language-specific components

That is exactly what was intended. That is, for example, why the query operators are from the System.Linq namespace and not form Microsoft.CSharp.Linq, which they would be if Linq was a C# specific thing. Things in System and below are intended as part of the platform, not for a particular langauge. LINQ was intended to be usable from any language, not just C# and VB.

> However, if that is the case, then I have to conclude that LINQ ("Language INtegrated Query") is the single most poorly named product in Microsoft's history.

I don't agree. Bob was pretty bad.

I don't think LINQ is that bad if you consider that LINQ is more than just the query syntax. It is made of of multiple pieces including a query API and language specific syntantic sugar. Like I said above, LINQ is the whole train, not just one car.

Chuck Jazdzewski | www.removingalldoubt.com | chuckjAT NOSPAMmicrosoft dot com

04/04/2008 6:29 AM

The headline should be "If you are using a loop, somebody might still be able to read your code without need to analyse it" :)

Don't get me wrong, I use LINQ quite often and love it. However, in this case it becomes too complex to grasp while reading the source code and thus hard to manage IMHO. I rather write a "multi-line" if e.g. instead of using the one-line ?? -approach.

Great post nonetheless to show how powerful LINQ is.

Holger Flick | http://flickdotnet.de

04/06/2008 6:01 PM

If you're using C# + LINQ, you're still doing it wrong.

An APL solution would probably fit in one line. There's an APL dialect called K which would probably make that go down to 25 chars or so.

LINQ is more verbose than APL, but not easier to grok. Go for the real thing.

OB-1

04/06/2008 6:54 PM

Nice piece, it really makes me wonder how idiomatic C# code is going to evolve.

Jonathan Allen | www.infoq.com/news/2008/04/No-more-loops | grauenwolfAT NOSPAMgmail dot com

04/06/2008 7:03 PM

If you like the "fuctional" aspects of LINQ then try out F#.

erik | geekswithblogs.net/erik

04/06/2008 8:11 PM

I fail to see how this is programming "without" loops. Sure, you've avoided using the for/foreach keywords but it doesn't change what is actually happening. I'm sure the IL being generated is still remarkably similar to what you would get if you did use those keywords.

That said, there is nothing wrong with moving towards a more functional programming style, as long as your code is still readable and maintainable as Holger Flick mentioned above.

I'm glad this sort of stuff is finally possible in a mainstream compiled language. C# is looking like the most usable 'enterprisey' language out there. It sure beats Java and C/C++ when it comes to preserving programmer sanity.

mike | http://www.digsby.com | mikeatdigsbydotcom

04/07/2008 4:43 PM

> Don't get me wrong, I use LINQ quite often and love it. However, in this case it becomes too complex to grasp while reading the source code and thus hard to manage IMHO. I rather write a "multi-line" if e.g. instead of using the one-line ?? -approach

I have tried this same algorithm before using loops and it doesn't look particularly easy to understand, certainly much less than my proposed function. The biggest problem is, as I mentioned in the last paragraph, the what gets jumbled together with the how.

I am certainly willing to be proved wrong if you can come up with an alternative that uses loops that is easier to understand.

Chuck Jazdzewski | www.removingalldoubt.com | chuckjAT NOSPAMmicrosoft dot com

04/07/2008 4:59 PM

> I fail to see how this is programming "without" loops

I guess I failed in my attempt to pre-empt this line of argument in my last paragraph with "some of you will rightly say there is a loop buried in the Reduce<T>() method" ;-).

I am not saying loops are bad. I am saying that you are not taking full advantage of LINQ if you write one yourself. Whether that is good or bad I will leave up to the beholder.

Chuck Jazdzewski | www.removingalldoubt.com | chuckjAT NOSPAMmicrosoft dot com

04/07/2008 5:06 PM

> An APL solution would probably fit in one line.

I have never considered the terseness of APL a good thing.

> LINQ is more verbose than APL, but not easier to grok.

I have never found APL easy to grok dispite the expertice of my teacher so this statement is not true for me. Nor, I suspect would it be for the people who would come after me that would be burdened to not only learn my code but APL (or K or J or...) as well.

Chuck Jazdzewski | www.removingalldoubt.com | chuckjAT NOSPAMmicrosoft dot com

04/18/2008 11:35 AM

List processing features make it easier to do this search but seriously increase the effort needed to understand the algorithm being used. Maintenance is much harder when the application includes many of these list processing operations.

This code is easy when you write it but much harder to understand by the next developer given that the next developer will not have a full understanding of the application.

Lastly, simple operations, such as finding an item in a list, should be immediately obvious to even the most inexperienced of developers.


Greg | mailto:gAT NOSPAMg dot net | gAT NOSPAMg dot net

Add New

Name

Email

Homepage

Security Word

Type in the security Word

Content (HTML not allowed)