Fabian's Mix

Mixins, .NET, and more

.NET 4.0 expression trees: Code gen, blocks, and visitors


Yesterday, I did some research on how expression trees have changed with the upcoming version 4.0 of the .NET framework. Of course, this is important for LINQ providers (and thus also for re-linq) because LINQ providers may have to change when Microsoft changes the format and/or capabilities of expression trees. However, it’s also interesting to people performing custom code generation, because .NET 4.0 expression trees have gained a lot of new functionality related to custom code generation.

Let’s start with the latter: expression trees are becoming more suited for custom code generation. How’s that?

LambdaExpression has always had a Compile method that allows you to compile an expression tree to a dynamic method and execute it in the same process. It does not allow you to embed a compiled expression tree into a dynamic assembly generated via Reflection.Emit, however – and this has changed with .NET 4.0. By using the new CompileToMethod feature, expression trees can now be emitted into a MethodBuilder, which enables people to combine expression trees with Reflection.Emit. As in most new expression tree features, you can definitely see the influence of the DLR here 🙂
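To make the difference concrete, here is a minimal sketch that compiles the same lambda both ways. It assumes the .NET Framework 4.x, since CompileToMethod is not available on .NET Core and later. The names CompileToMethodSketch, SketchAssembly, SketchModule, Doubler, and Double are made up for illustration. As far as I can tell, the target MethodBuilder has to be a static method on a TypeBuilder, and the lambda must not capture live objects.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper class; all names here are illustrative only.
public static class CompileToMethodSketch
{
    // The classic route: Compile() produces a delegate backed by a dynamic method.
    public static int ViaCompile(int input)
    {
        Expression<Func<int, int>> doubler = x => x * 2;
        Func<int, int> compiled = doubler.Compile();
        return compiled(input);
    }

    // The .NET 4.0 route: emit the lambda body into a MethodBuilder
    // that belongs to a Reflection.Emit-generated type.
    public static int ViaCompileToMethod(int input)
    {
        Expression<Func<int, int>> doubler = x => x * 2;

        AssemblyBuilder assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("SketchAssembly"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("SketchModule");
        TypeBuilder type = module.DefineType("Doubler", TypeAttributes.Public);

        // The target method is declared static; the signature mirrors the lambda.
        MethodBuilder method = type.DefineMethod(
            "Double",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int),
            new[] { typeof(int) });

        doubler.CompileToMethod(method);

        // Bake the type and call the generated method via reflection.
        Type created = type.CreateType();
        return (int)created.GetMethod("Double").Invoke(null, new object[] { input });
    }
}
```

Calling CompileToMethodSketch.ViaCompile(21) and CompileToMethodSketch.ViaCompileToMethod(21) should give the same result; the second variant simply ends up as an ordinary method in the generated assembly.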

In addition, there is a whole bunch of new expression node types: “expression” trees can now also include statements! BlockExpression allows you to group a sequence of “expression” statements together (discarding the values of all but the last one), GotoExpression allows you to implement control flow, and BinaryExpression has been extended so that it can represent assignments. This means that .NET 4.0 expression trees really can represent arbitrary code constructs, which is a very good thing! Bart De Smet has a blog post illustrating expression trees with statements.
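As a rough illustration of these statement nodes, the following sketch builds a factorial function out of a block, assignments, and a loop with a break label. The class and variable names (StatementTreeSketch, result, done) are my own choices, not anything prescribed by the framework. Expression.Assign yields a BinaryExpression, and Expression.Break yields a GotoExpression.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper class; names are illustrative only.
public static class StatementTreeSketch
{
    public static Func<int, int> BuildFactorial()
    {
        ParameterExpression n = Expression.Parameter(typeof(int), "n");
        ParameterExpression result = Expression.Variable(typeof(int), "result");
        // Label the loop jumps to when it is done; it carries the int result.
        LabelTarget done = Expression.Label(typeof(int), "done");

        BlockExpression body = Expression.Block(
            new[] { result },                                    // local variable
            Expression.Assign(result, Expression.Constant(1)),   // result = 1
            Expression.Loop(
                Expression.IfThenElse(
                    Expression.GreaterThan(n, Expression.Constant(1)),
                    Expression.Block(
                        // result = result * n
                        Expression.Assign(result, Expression.Multiply(result, n)),
                        // n = n - 1
                        Expression.Assign(n, Expression.Subtract(n, Expression.Constant(1)))),
                    // break out of the loop, yielding result (a GotoExpression)
                    Expression.Break(done, result)),
                done));

        // The loop is the last expression in the block, so its value
        // becomes the value of the whole block.
        return Expression.Lambda<Func<int, int>>(body, n).Compile();
    }
}
```

With this in place, StatementTreeSketch.BuildFactorial()(5) evaluates to 120. None of this could be expressed with .NET 3.5's node types.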

However, for LINQ providers, it would actually be a very bad thing to have these kinds of constructs in LINQ expressions. Imagine having to translate a GotoExpression to SQL, for example. The good news is that, according to this blog post comment, it’s unlikely that LINQ providers will have to deal with any of the new expression node types added by .NET 4.0, as the compilers still don’t allow statement expressions in LINQ queries.

So, what else is new in .NET 4.0’s System.Linq.Expressions namespace? Well, for example, there is finally a public ExpressionVisitor class. Expression visitors are at the core of every LINQ provider, and it has always been a pity that .NET 3.5’s ExpressionVisitor was internal. Therefore, re-linq had to define its own ExpressionTreeVisitor base class.

However, there’s more to .NET 4.0’s ExpressionVisitor than the ordinary “switch on node type and dispatch to the respective strongly typed Visit method” that every LINQ visitor base class has consisted of until now: Microsoft have finally implemented real double dispatch for expressions! In .NET 4.0, the Expression base class now has an Accept method, and the concrete expression classes dispatch to the visitors, just as Gamma et al. described in 1994! And the sweetest part of it: they’ve thought of including an extension expression model. I’ll write about this in a separate blog post, but in a nutshell, I consider this a great design improvement over .NET 3.5’s expression trees.
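To give an idea of what working with the public visitor looks like, here is a small sketch of a visitor that rewrites additions into subtractions. The class names AddToSubtractVisitor and VisitorSketch are invented for this example. Only the VisitBinary override is needed; the base class takes care of dispatching to it and of rebuilding the surrounding lambda.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical visitor: replaces every Add node with a Subtract node.
public sealed class AddToSubtractVisitor : ExpressionVisitor
{
    protected override Expression VisitBinary(BinaryExpression node)
    {
        if (node.NodeType == ExpressionType.Add)
        {
            // Visit the operands first so nested additions are rewritten too.
            Expression left = Visit(node.Left);
            Expression right = Visit(node.Right);
            return Expression.Subtract(left, right);
        }

        // Everything else is handled by the default traversal.
        return base.VisitBinary(node);
    }
}

// Hypothetical helper class; names are illustrative only.
public static class VisitorSketch
{
    public static Func<int, int, int> Rewrite(Expression<Func<int, int, int>> source)
    {
        // Visiting an Expression<T> gives back an Expression<T> with the new body.
        var rewritten = (Expression<Func<int, int, int>>)new AddToSubtractVisitor().Visit(source);
        return rewritten.Compile();
    }
}
```

For example, VisitorSketch.Rewrite((a, b) => a + b)(5, 3) evaluates to 2 instead of 8, because the addition in the tree has been swapped for a subtraction before compilation.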

All in all, I think .NET 4.0 expression trees really add a lot to their 3.5 counterparts, and I’m looking forward to using them. If you want to read more about the topic, I’d suggest perusing the DLR expression tree specs, as MSDN doesn’t really have much information at the moment.

Written by Fabian

February 18th, 2010 at 4:47 pm
