Fabian's Mix

Mixins, .NET, and more

re-linq: Subquery boundaries after GroupBy, Union, and similar operators?

without comments

Working on https://www.re-motion.org/jira/browse/RM-4093 (UnionResultOperator should act as a query source) made me think about the nature of result operators (i.e., query operators operating on the result of a from/where/join/orderby/select block) that mangle the incoming data so much that it can’t be associated with the original data any more.

For example, consider three queries:

  (from o in Orders
   select o)
      .Take (4)
      .Any (o2 => o2.OrderDate > DateTime.Now)

  • The result of the from/select block is a sequence of Order objects that are directly traceable to the MainFromClause: [o] (re-linq’s notation for: reference to the query source called "o").
  • The result of the Take operator is still a sequence of the same Order objects: [o].
  • Therefore, the "o2" in the Any operator still refers to the items produced by the MainFromClause: [o].OrderDate > DateTime.Now.

  (from o in Orders
   select o)
      .GroupBy (o => o.OrderDate)
      .Any (g => g.Key > DateTime.Now)

  • As in the first query, the result of the from/select block is a sequence of items coming from the MainFromClause: [o].
  • The result of the GroupBy operator is a sequence of IGrouping<DateTime, Order> objects which are no longer traceable to the MainFromClause, but only to the GroupResultOperator (let’s call it “g”): [g].
  • Therefore, the "g" in the Any operator refers to items produced by the GroupResultOperator: [g].Key > DateTime.Now.

  (from o in Orders
   select o)
      .Union (OtherOrders)
      .Any (o2 => o2.OrderDate > DateTime.Now)

  • As in the first query, the result of the from/select block is a sequence of items coming from the MainFromClause: [o].
  • The result of the Union operator is a sequence of Order objects which are, however, no longer traceable to the MainFromClause, since they could potentially come from the second query source (OtherOrders). They are only traceable to the UnionResultOperator (let’s call it “u”): [u].
  • Therefore, the "o2" in the Any operator refers to items produced by the UnionResultOperator: [u].OrderDate > DateTime.Now.

Okay, so far so good. However, the question is how useful it is to have a result operator refer to another result operator coming before it. Back-ends that produce SQL will always have to add a sub-query boundary after such a result operator, because the produced SQL should look something like this:

  SELECT CASE
    WHEN EXISTS(
      SELECT *
      FROM (SELECT * FROM [OrderTable] AS [t0]
            UNION
            SELECT * FROM [OtherOrderTable]) AS [q0]
      WHERE ([q0].[OrderDate] > @1)
      )
    THEN 1
    ELSE 0
    END AS [value]

As you can see, the part before the Any (which is equivalent to the EXISTS clause) needs to be moved into a sub-query – there is a sub-query boundary after the UnionResultOperator. Therefore, we could change re-linq’s front-end to (always) automatically insert such boundaries after operators such as Union and GroupBy:

  (from u in
       (from o in Orders
        select o)
       .Union (OtherOrders)
   select u)
      .Any (u => u.OrderDate > DateTime.Now)
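To make the two shapes concrete, here is a minimal LINQ-to-Objects sketch. It only illustrates that the flat query and the query with an explicit sub-query boundary yield the same result in memory. It does not involve re-linq at all, and the Order class and the method names are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Order entity used in the queries above.
public class Order
{
  public DateTime OrderDate { get; set; }
}

public static class SubQueryBoundaryDemo
{
  // Union and Any are applied directly to the from/select block.
  public static bool Flat (IEnumerable<Order> orders, IEnumerable<Order> otherOrders, DateTime now)
  {
    return (from o in orders
            select o)
        .Union (otherOrders)
        .Any (o2 => o2.OrderDate > now);
  }

  // The Union result is wrapped in its own from/select block, so that
  // Any operates on a query source "u" instead of on the Union operator.
  public static bool WithBoundary (IEnumerable<Order> orders, IEnumerable<Order> otherOrders, DateTime now)
  {
    return (from u in
                (from o in orders
                 select o)
                .Union (otherOrders)
            select u)
        .Any (u => u.OrderDate > now);
  }
}
```

For in-memory sequences, both methods always return the same value; the difference only matters for what a LINQ provider gets to see in the QueryModel.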

The question I’m asking myself is whether there is a good reason for not making this change. Is there any advantage to keeping GroupBy and Union in the same query as the following operators?

I’d appreciate any input on this on our mailing list: https://groups.google.com/d/topic/re-motion-users/MgZcKlHAn1g/discussion.

Written by Fabian

October 12th, 2012 at 9:57 am

Posted in re-linq

re-linq: Now on NuGet (with symbols)

without comments

In July 2011, Henrik Feldt created a NuGet package for re-linq 1.13.111. Only one year later, we took notice of that, when Chris Eldredge and Gordon Watts gently asked us to update the outdated package.

Starting with re-linq 1.13.161, we’ve therefore decided to regularly publish NuGet packages in the course of our weekly build. The release policy is currently the same as for our CodePlex releases: whenever we have changes such as bugfixes or new features, we’ll release a new build of re-linq.

Note that the versioning of re-linq is still bound to that of re-motion as, for now, the two projects are built together. This means that, at the moment, re-linq does not follow semantic versioning – new releases may contain fixes, new features, or breaking changes, no matter how the version number changed. Also, strictly speaking, all 1.13 builds are internal builds without any support guarantees. On the other hand, these are also the builds we’re using at rubicon, and we have confidence in their quality, so there isn’t much reason not to use them.

Also note that the Remotion.Linq NuGet package only contains the re-linq front-end. If there is demand for a SQL Backend package, please tell us so.

The packages we release to NuGet are symbol packages, which means that you can debug them and step into the code if you want to. You need to configure Visual Studio to disable Just My Code, enable source server support, and add SymbolSource.org as a symbol file location. And then…

[Image: Debugging re-linq result]

There is one caveat with debugging: The NuGet package contains the Release build of re-linq, so debugging might sometimes be bumpy, since the source code does not reflect the optimizations made by the C# compiler. But even so, when I tried it, it worked quite nicely.

Written by Fabian

August 28th, 2012 at 12:08 pm

Posted in re-linq

re-mix: The Mixin-Based Composition Pattern

without comments

At rubicon, we implement various applications based on our business case processing product called ActaNova. ActaNova is a piece of software that implements electronic records management (and more) in a standard way. Those specific applications we implement on top of it extend the standard domain model, UI, database schema, and so on with application-specific behavior and structure.

This has quite a few interesting effects on the technical implementation. For example, we need a way to allow the UI to be extended (and modified) by modules added to the application. In this post, however, I’d like to concentrate on the idea of building a domain model that is extensible by modules within the application. And I’ll use a pattern based on mixins (as provided by re-mix), which we call the Mixin-Based Composition Pattern.

The pattern allows us to do two things:

  • Define a domain class as an aggregation of functionality defined by different modules within the application, and
  • extend the domain class from modules added to the application.

Consider a simple Person class, as it could be defined by an electronic records management software. Note that I’m not talking about the actual current or future implementation of ActaNova here, I’m just inventing an example that lets me illustrate the pattern. Let’s say a Person has the following features, implemented by various modules within the records management product:

  • It is an entity in the domain model, so it has got an ID.
  • It also has a first name, last name, middle names, and date of birth as standard properties.
  • It is versioned, which means that changes made to its data don’t simply overwrite any previous values, but put them into a history object. Versioning is implemented in a generic fashion by the Versioning module.
  • It is also tenant-bound, which means that it belongs to a specific tenant within the application (and only in the context of that tenant can it be loaded, edited, and changed). Tenant handling is implemented in a generic fashion by the Tenant module.

A specific employee records management application is built around the standard records management software. It creates an Employee domain class that derives from Person. Different modules extend the standard Person class as follows:

  • The Employee domain class has a job description property in addition to the inherited Person properties.
  • The Human Resources module adds the current salary of the employee.
  • The Organizational Structure module adds the current manager of the employee.

Here’s a picture of how you could model this domain by using mixins:

[Image: ComposedObjects]

As you can see, the Person class aggregates the versioning and tenant binding functionality defined by the respective modules by using generic mixins exposed from those modules. The Employee class derives from the Person class and is extended by mixins from within the Human Resources and the Organizational Structure modules. This is a combination of simulating multiple inheritance via mixins (reusing code by including mixins into your classes) and use-case slicing, also implemented via mixins (implementing different slices of your code separately, use case by use case). (And in case you’re wondering, the OnLoaded and OnCommitting methods in the picture are considered to be infrastructure methods exposed by all domain objects.)

Using re-mix, this all works really nicely, with one caveat: when you have an instance of Person or Employee and want to access those properties added by mixins, you need to cast the instance to IVersionedObject, IPaidEntity, and so on. That is not very nice, which is why the Mixin-Based Composition Pattern adds the following conventions:

  • A composed object (such as Person) must provide a This property of a type TComposedInterface. The TComposedInterface type must be an interface derived from all the interfaces introduced by used mixins. (In the example, that would be an ICompletePerson interface, inheriting from IVersionedObject and ITenantBoundObject.) The TComposedInterface can also provide properties defined by the core object itself. (I.e., ICompletePerson can also provide the ID, FirstName, LastName, MiddleNames, and DateOfBirth properties.)
  • Mixins that extend a specific class must also provide a conversion method As…() as an extension method for instances of that target class. The conversion method returns the instance cast to the interface introduced by the mixin.

These conventions allow client code to access all members added by used mixins via the This property and all members added by extending mixins via the As…() extension methods. This makes it very convenient to work with a domain implemented via mixins. Here’s an example of how the Employee class shown above would be used:

var employee = ObjectFactory.Create<Employee>();

Console.WriteLine (employee.ID);
Console.WriteLine (employee.FirstName);

// Properties added by mixins included by the Person/Employee classes
Console.WriteLine (employee.This.TenantID);
Console.WriteLine (employee.This.VersionID);

// Properties added by mixins extending the Person/Employee classes
Console.WriteLine (employee.AsOrganizedEntity().Manager);
Console.WriteLine (employee.AsPaidEntity().Salary);

re-mix has a ComposedObject<TComposedInterface> base class that provides the This property, so implementing the Person class could be as simple as this:

public interface ICompletePerson : IVersionedObject<Person>, ITenantBoundObject {}

[Uses (typeof (VersioningMixin<Person>))]
[Uses (typeof (TenantBindingMixin))]
public class Person : ComposedObject<ICompletePerson>
{
  public Person (int id)
  {
    ID = id;
    MiddleNames = new List<string>();
  }

  public int ID { get; private set; }
  public string FirstName { get; set; }
  public string LastName { get; set; }
  public List<string> MiddleNames { get; private set; }
  public DateTime DateOfBirth { get; set; }

  // etc.
}

(I’ve given the IVersionedObject interface a generic parameter, just to show that this is possible.)

But even if you can’t (or don’t want to) use the ComposedObject base class, implementing the This property by hand is very simple:

public class
    : IHasCompleteInterface<TComposedInterface>
{
  // etc.

  public TComposedInterface This
  {
    get { return (TComposedInterface) this; }
  }
}

The IHasCompleteInterface marker interface just causes re-mix to include the given interface on the domain object type (without the domain programmer having to implement its methods).

Here’s an example of one of those As…() extension methods, as it would come with the HumanResources module:

public static class EmployeeExtensions
{
  public static IPaidEntity AsPaidEntity (this Employee employee)
  {
    return (IPaidEntity) employee;
  }
}

No magic there, either, just some syntactic sugar to avoid users having to make those casts.

To summarize, by implementing the Mixin-Based Composition Pattern, you can:

  • build your domain classes by means of aggregating functionality from different modules in the application by using mixins exposed from those modules,
  • access members added by those mixins nicely via the This property,
  • extend your domain model from external modules by having mixins in there that extend the domain classes, and
  • access members added by those mixins nicely via the extension methods provided with the mixins.

This makes for very extensible, yet nicely factorized and simple to use domain models.

(One word about persistence and O/R mapping: Our own O/R mapper “re-store” recognizes mixed entities, and it persists properties added by the mixins into the database. It’s of course possible to implement the same behavior for other O/R mappers and persistence frameworks as well if they provide the necessary extensibility points. Whether to put the mixin data into the same database tables as the core domain data is another topic and depends on the specific situation.)

Written by Fabian

March 16th, 2012 at 2:05 pm

Posted in mixins

re-mix: Encapsulate and share ASP.NET MVC controller actions

with 2 comments

Jan Jonas has written a blog post about how to encapsulate commonly used controller actions using mixins with ASP.NET MVC 3. I think that it’s a really cool idea to use re-mix for this; however, Jan is facing a few stumbling blocks that I want to comment on. (Actually, I wanted to post this as a comment to his blog post, but for some reason, the comment form isn’t working.)

I understand Jan is saying that ASP.NET MVC doesn’t find the action methods added by mixins because the proxies generated by re-mix don’t inherit from ControllerBase. Therefore he needs to implement a custom ControllerActionInvoker to get ASP.NET MVC to invoke action methods introduced by mixins.

Actually, the proxies re-mix generates do inherit from the mixed classes; in Jan’s example, those are Controller1Controller and Controller2Controller, i.e., the proxies do inherit (indirectly) from ControllerBase. Then why isn’t ASP.NET MVC finding those controller actions by default?

Maybe the problem is that the proxies implement the mixin interfaces (in the example, IEchoControllerMixin) explicitly, and ASP.NET MVC probably doesn’t search for explicit interface implementations. However, one can tell re-mix to implement interface members publicly, either by putting the MemberVisibilityAttribute on the methods in the mixin, or by specifying IntroducedMemberVisibility = MemberVisibility.Public when applying the UsesAttribute. (You can also derive a custom attribute from UsesAttribute that sets IntroducedMemberVisibility to MemberVisibility.Public by default, since you’ll always want to do that for controller mixins.)
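The difference between explicit and public interface implementations can be seen with plain reflection. The following is a minimal sketch that does not use re-mix or ASP.NET MVC; the interface and class names are invented, and whether ASP.NET MVC’s action lookup behaves exactly like this public-method search is an assumption on my part.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical interface, standing in for something like IEchoControllerMixin.
public interface IEchoActions
{
  string Echo (string id);
}

// Implements the interface explicitly: the method is not a public member of the class.
public class ExplicitImplementation : IEchoActions
{
  string IEchoActions.Echo (string id) { return id; }
}

// Implements the interface publicly: the method is a public member of the class.
public class PublicImplementation : IEchoActions
{
  public string Echo (string id) { return id; }
}

public static class VisibilityDemo
{
  // Searches public instance methods only, by simple name.
  public static bool HasPublicMethod (Type type, string name)
  {
    return type
        .GetMethods (BindingFlags.Public | BindingFlags.Instance)
        .Any (m => m.Name == name);
  }
}
```

A search for a public "Echo" method succeeds on PublicImplementation and fails on ExplicitImplementation, because explicit implementations are emitted as private methods.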

If, for some reason, that still doesn’t work, one could put the CreateActionInvoker override into the target class (Controller1Controller) rather than the mixin, or into a different mixin (e.g., a ControllerMixinSupportMixin). That would enable one to use multiple controller mixins without generating conflicts on the CreateActionInvoker method.

A second problem Jan is facing is that the controller mixins can’t access the protected members declared by the controller and its base classes. However, re-mix allows this: to give mixins access to protected members of the target class, one can specify a custom interface as the TTarget parameter to the Mixin<TTarget> base class. Just define a public interface redeclaring all the controller methods the mixins should be able to access and specify that for the TTarget type parameter. The target class doesn’t actually have to implement that interface, it just needs to provide all the members, with either public or protected visibility.

I’ve adapted Jan’s example accordingly:

public interface IMixedController
{
  ViewResult View();
  ViewResult View (string viewName);
  ...
}
public class EchoControllerMixin : Mixin<IMixedController>, IEchoControllerMixin
{
  public ActionResult Echo(string id)
  {
    var result = new ContentResult();
    var controllerName =
        Target.ControllerContext.RouteData.Values["controller"].ToString();
    result.Content = string.Format(
        "re-mix added Echo action to controller '{0}'<hr/>You said: {1}", 
        controllerName,
        !string.IsNullOrEmpty(id) ? id : "<i>nothing</i>");
    return result;
  }
}
public class UsesControllerMixinAttribute : UsesAttribute
{
  public UsesControllerMixinAttribute (Type mixinType)
    : base (mixinType)
  {
    IntroducedMemberVisibility = MemberVisibility.Public;
  }
}
[UsesControllerMixin(typeof(EchoControllerMixin))]
public class Controller1Controller : Controller
{
}

 

Feel free to ask at the discussion list (http://groups.google.com/group/re-motion-users) if you face any other issues with re-mix.

Update (2012-01-11): Fixed base attribute ctor call in code.

Update (2012-01-16): As mentioned in the comments, Jan Jonas has published a blog post discussing my suggestions.

Written by Fabian

January 10th, 2012 at 10:09 am

Posted in mixins

re-linq: How to recognize if a method is a query operator?

without comments

On the users’ mailing list, Alex Norcliffe, Lead Architect of Umbraco 5, describes a problem they are currently facing with re-linq. As an illustration, consider the following two queries, especially the sub-queries within the where clauses:

var query1 = from c in QuerySource
             where c.Assistants.Any ()
             select c;

var query2 = from c in QuerySource
             where c.GetAssistants().Any()
             select c;

When re-linq analyzes those sub-queries within the where clauses, the first query will produce a SubQueryExpression with a QueryModel whose MainFromClause has the following expression: [c].Assistants. In other words, the items produced by the sub-query are those identified by the Assistants property.

The second query, however, will produce an exception:

Remotion.Linq.Parsing.ParserException : Cannot parse expression 'c' as it has an unsupported type. Only query sources (that is, expressions that implement IEnumerable) and query operators can be parsed.

----> Remotion.Linq.Utilities.ArgumentTypeException : Expected a type implementing IEnumerable<T>, but found 'Remotion.Linq.UnitTests.Linq.Core.TestDomain.Cook'.

Why’s that?

re-linq assumes that all methods occurring in a query operator call chain should be treated like query operators (Where, Select, etc.). This means that for the sub-query within query2, re-linq regards c as the start of the query operator chain. And, since c’s type does not implement IEnumerable<T>, it throws the exception shown above. Even if c’s type implemented IEnumerable<T>, an exception would be thrown stating that GetAssistants() “is currently not supported”, unless one registers a custom node parser for that method.

Of course, what Alex actually wanted was for re-linq to treat query1 and query2 in an equivalent way, i.e., to produce a SubQueryExpression with a QueryModel whose MainFromClause has the following expression: [c].GetAssistants().

There is an easy workaround for now (see the mailing list), but I’m wondering how we could change re-linq to produce this result out of the box. I can think of two possibilities, both of which have certain drawbacks:

1 – Have re-linq treat MethodCallExpressions the same way as MemberExpressions. I.e., if the method has a registered node parser, treat it as a query operator. Otherwise, treat it (and all expression parts of the call chain before it) as the start of the query.

This would work quite well in the scenario shown above, and it would be nicely symmetric to how MemberExpressions work in re-linq.

However, it would be a significant breaking change with regard to diagnostics. Consider this example, in which CustomOperator is actually a custom query operator:

source.Where(…).CustomOperator().Select(…)

Currently, re-linq will throw an exception that it can’t parse CustomOperator() if one forgets to register the respective parser, and the LINQ provider backend won’t even get a QueryModel to process.

If we change this behavior, the frontend will no longer throw an exception, and the backend will suddenly get a MainFromClause with a FromExpression of "[source].Where(…).CustomOperator()". I think it would be difficult for LINQ provider implementers to understand why exactly this occurs. I can even imagine people believing this must be "right" (as no exception occurred) and starting to manually parse the Where(…) and CustomOperator() calls, effectively reimplementing logic from re-linq…

2 – Have re-linq only treat MethodCallExpressions called on enumerables as query operators. Otherwise, treat them (and all expression parts of the call chain before the method call) as the start of the query.

This would also work in the given scenario, and it has the advantage of still providing good diagnostics when methods taking IEnumerable<T> have no associated expression node parser. However, it’s still a heuristic way of parsing, and it is asymmetric (both with MemberExpressions and in itself). Consider the following three examples:

instanceImplementingEnumerable.StartQueryHere().Where(…)
instanceNotImplementingEnumerable.StartQueryHere().Where(…)
instanceImplementingEnumerable.StartQueryHere.Where(…)

re-linq would parse the StartQueryHere method in the first example as a query operator (and throw an exception if there isn’t an expression node parser registered for it). The StartQueryHere method and property in the second and third example, on the other hand, would parse just fine. I believe this is just as difficult to understand.
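As a rough illustration of the check behind option 2, here is a minimal sketch built only on System.Linq.Expressions. It is not re-linq code; the class and method names are invented, and the Cook class merely mirrors the type from the exception message above. Treating the first argument of any static method as its "instance" is a simplification that works for extension methods such as Enumerable.Any.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical domain type with a method that returns the start of a sub-query.
public class Cook
{
  public IEnumerable<Cook> GetAssistants () { return new List<Cook>(); }
}

public static class QueryOperatorHeuristic
{
  // Returns true if the method is called on something implementing IEnumerable<T>.
  public static bool IsCalledOnEnumerable (MethodCallExpression call)
  {
    // For static (extension) methods, Object is null and the source is the first argument.
    Expression source = call.Object ?? call.Arguments.FirstOrDefault();
    return source != null && ImplementsGenericEnumerable (source.Type);
  }

  private static bool ImplementsGenericEnumerable (Type type)
  {
    return IsGenericEnumerable (type) || type.GetInterfaces().Any (IsGenericEnumerable);
  }

  private static bool IsGenericEnumerable (Type type)
  {
    return type.IsGenericType && type.GetGenericTypeDefinition() == typeof (IEnumerable<>);
  }
}
```

For the expression c.GetAssistants().Any(), the check returns true for the Any call and false for the GetAssistants call, so under option 2 the latter would become the start of the sub-query.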

What do other people think of these two options? If you want to see this scenario supported out of the box, please give me some feedback about it on the developer mailing list: http://groups.google.com/group/re-motion-dev/t/f9f6198bbbecd796.

Written by Fabian

December 20th, 2011 at 9:29 am

Posted in re-linq

Getting Visual Studio 2010 SP1 to run elevated when launching .sln files

with 16 comments

For different reasons, I want Visual Studio 2010 to always run as an administrator on my Windows Server 2008 R2 machine (that has UAC enabled). Therefore, I have the “Run as administrator” checkbox checked on the Compatibility tab of Windows Explorer’s Properties dialog for devenv.exe:

[Image: Compatibility properties devenv]

This causes Windows to always show me the UAC prompt when I run Visual Studio 2010:

[Image: UAC prompt devenv]

Unfortunately, it also causes double clicking solution files in Explorer to stop working.

The reason is that Visual Studio associates .sln files with a special program, called VSLauncher.exe, which inspects the solution file in order to decide what version of Visual Studio to open it with. This enables side-by-side installations of different versions of Visual Studio to work correctly. When VSLauncher.exe is executed by Windows Explorer because I double-clicked a solution file, it is run with normal privileges, and is therefore not permitted to run devenv.exe, which requires elevation. VSLauncher thus silently fails.

The obvious solution is to also check “Run as administrator” for VSLauncher.exe, which, in my case, is located in “C:\Program Files (x86)\Common Files\microsoft shared\MSEnv”.

And, of course, the obvious solution doesn’t work. Any more.

With my installation, it used to work just fine, but after I installed SP1 for Visual Studio 2010, or maybe even earlier, Windows somehow started to ignore my “Run as administrator” checkbox, and VSLauncher.exe would silently fail again.

After some research, I found that the reason for Windows ignoring my compatibility setting was that VSLauncher.exe now had a manifest embedded, which contained the following fragment:

<requestedPrivileges>
   <requestedExecutionLevel level="asInvoker" uiAccess="false">
   </requestedExecutionLevel>
</requestedPrivileges>

So, VSLauncher.exe now specified that it always wanted to be run at the same execution level as its invoker. And, since of course the program must know better than the user, this caused Windows to ignore my own execution level setting.

And now, to the solution. Since Windows wouldn’t let me override what the program said it wanted, I needed to override what the program said it wanted.

To do that, I used the Manifest Tool that comes with the Windows SDK (and thus with Visual Studio):

mt -inputresource:"VSLauncher.exe" -out:VSLauncher.exe.manifest

This command extracted the manifest from VSLauncher.exe into a file called VSLauncher.exe.manifest. I then edited the manifest to request the desired execution level:

<requestedPrivileges>
   <requestedExecutionLevel level="requireAdministrator" uiAccess="false">
   </requestedExecutionLevel>
</requestedPrivileges>

Then, I could write back the manifest:

mt -outputresource:VSLauncher.exe -manifest VSLauncher.exe.manifest

With the desired result:

[Image: UAC prompt VSLauncher]

One note of caution: Please make a backup copy of VSLauncher.exe before manipulating the manifest. And proceed at your own risk.

This trick should also work with Windows 7, by the way.

Written by Fabian

May 3rd, 2011 at 9:53 am

re-linq: Customizability explained

without comments

In a previous post, I mentioned a set of features for better customizability of how the re-linq front-end parses expression trees. This time, I want to explain why you’d use those features, and how to decide which one to use.

To start, here’s an updated version of the diagram showing re-linq’s pipeline:

[Image: re-linq pipeline 2]

(Read my previous post for an explanation of this pipeline.)

There are four important points of extensibility in that pipeline:

  • The query parser,
  • the expression tree processors,
  • the expression transformers, and
  • the expression nodes.

Let’s take a look at each of them.

Replacing the Query Parser

The query parser’s responsibility is to build a QueryModel from a LINQ expression tree. This responsibility is defined by the IQueryParser interface, and implemented by the QueryParser class. The latter performs quite a good job, so why would you want to replace it?

In reality, you wouldn’t usually want to replace re-linq’s query parser, but maybe you need to extend it. By implementing IQueryParser and decorating the existing QueryParser class, you can:

  • Do anything you want with the expression tree before re-linq gets to see it;
  • adapt the QueryModel after re-linq is done constructing it;
  • react to the fact that re-linq is asked to construct a QueryModel; e.g., to implement logging, or to perform some checks; or
  • bypass the re-linq pipeline altogether; e.g., if you want to perform caching at the QueryModel level.

The point about caching deserves some explanation. Let’s say you have a LINQ provider that translates queries to SQL. If your users issue the same queries again and again and your LINQ provider proves to be a bottleneck, you might need some possibility to cache the generated SQL queries based on the incoming expression trees. re-linq currently does not have any support for caching query translation results, and while it’s possible to build a caching subclass of QueryProviderBase that intercepts Execute, this doesn’t really fit re-linq’s architecture with the IQueryExecutor interface.

It is, however, easily possible to implement a two-part cache using the existing architecture. First, implement a caching IQueryParser implementation that calculates a cache key from the incoming expression tree, and caches QueryModel references based on that key. Then, implement a caching IQueryExecutor that keeps track of the generated SQL by QueryModel reference.
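Here is a minimal sketch of what such a two-part cache could look like. To keep it compilable on its own, QueryModel and IQueryParser are declared as simplified stand-ins for re-linq’s types, and CachingQueryParser, CountingQueryParser, and SqlCache are names invented for this illustration. Using Expression.ToString() as the cache key is a simplistic assumption: it can produce the same key for trees that differ in captured values, so a real implementation needs a more careful key. The sketch is also not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Simplified stand-ins for re-linq's QueryModel and IQueryParser.
public class QueryModel { }

public interface IQueryParser
{
  QueryModel GetParsedQuery (Expression expressionTreeRoot);
}

// Part 1: a decorator that caches QueryModel references by a key derived from the tree.
public class CachingQueryParser : IQueryParser
{
  private readonly IQueryParser _inner;
  private readonly Dictionary<string, QueryModel> _cache = new Dictionary<string, QueryModel>();

  public CachingQueryParser (IQueryParser inner)
  {
    if (inner == null)
      throw new ArgumentNullException ("inner");
    _inner = inner;
  }

  public QueryModel GetParsedQuery (Expression expressionTreeRoot)
  {
    // Assumption: the string representation is good enough as a key for this sketch.
    string key = expressionTreeRoot.ToString();

    QueryModel model;
    if (!_cache.TryGetValue (key, out model))
    {
      model = _inner.GetParsedQuery (expressionTreeRoot);
      _cache.Add (key, model);
    }
    return model;
  }
}

// Part 2: what a caching executor could use to remember generated SQL per QueryModel
// reference (QueryModel does not override Equals, so the lookup is by reference).
public class SqlCache
{
  private readonly Dictionary<QueryModel, string> _sqlByModel = new Dictionary<QueryModel, string>();

  public string GetOrGenerate (QueryModel model, Func<QueryModel, string> generator)
  {
    string sql;
    if (!_sqlByModel.TryGetValue (model, out sql))
    {
      sql = generator (model);
      _sqlByModel.Add (model, sql);
    }
    return sql;
  }
}

// Test double that stands in for the real parser and counts how often it is asked to parse.
public class CountingQueryParser : IQueryParser
{
  public int Calls { get; private set; }

  public QueryModel GetParsedQuery (Expression expressionTreeRoot)
  {
    Calls++;
    return new QueryModel();
  }
}
```

Because the first cache returns the same QueryModel reference for equal keys, the second cache can be a plain dictionary keyed by that reference.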

That said, replacing the query parser is probably not the most interesting extensibility point in re-linq, so let’s go on to the next one.

Adding (or replacing) Expression Tree Processors

Expression tree processors are implementations of the IExpressionTreeProcessor interface that re-linq applies to the expression tree before that tree is actually analyzed. You can add your own processors to replace and transform the expression tree if your LINQ provider needs to do this. re-linq already defines two processors that are included in the pipeline by default: the partial evaluator and the transforming processor.

The partial evaluator is responsible for analyzing the parsed expression tree for any sub-expressions that can be evaluated in memory and replacing those sub-expressions with the result of the evaluation.

Here’s an example. Suppose you write a query such as the following:

from o in Orders
where o.Date == DateTime.Today
select o

In this query, the partial evaluator will detect that the sub-expression “DateTime.Today” can be evaluated locally and will replace it with a constant expression that holds the respective date. (Note: If you implement query caching as explained above, keep this in mind!)
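The core idea can be sketched with nothing but System.Linq.Expressions: a sub-expression that does not reference any query parameters is compiled, invoked, and replaced by a constant. This is only an illustration of the principle, not re-linq’s actual partial evaluator (which also has to find the evaluatable sub-expressions first); the class and method names are invented.

```csharp
using System;
using System.Linq.Expressions;

public static class LocalEvaluationSketch
{
  // Evaluates a sub-expression in memory and wraps the result in a ConstantExpression.
  // Only valid for sub-expressions that do not reference any query parameters
  // (such as DateTime.Today in the query above).
  public static ConstantExpression EvaluateLocally (Expression subExpression)
  {
    if (subExpression == null)
      throw new ArgumentNullException ("subExpression");

    // Wrap the sub-expression in a parameterless lambda, compile it, and run it.
    LambdaExpression lambda = Expression.Lambda (subExpression);
    object value = lambda.Compile().DynamicInvoke();

    // Keep the static type of the original sub-expression.
    return Expression.Constant (value, subExpression.Type);
  }
}
```

Applied to the body of () => DateTime.Today, this yields a constant holding the date at the time of evaluation, which is exactly why a cached QueryModel can become stale.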

By default, the partial evaluator is the first processor to be executed. You can replace it if you need a custom evaluator.

The second default processor, the transforming processor, is responsible for executing the expression transformers explained below.

To add or replace expression tree processors, create a customized instance of the QueryParser and ExpressionTreeParser classes. See this description for the details: https://www.re-motion.org/jira/browse/RM-3721.

You’d typically add your own processor if you have an expression visitor that needs to be applied to the expression tree prior to query analysis. Note, however, that the number of expression visitors involved in query parsing can negatively affect the performance of your LINQ provider. For simple processing, expression transformers may be the better alternative, so let’s look at those next.

Adding (or replacing) Expression Transformers

Expression transformers are a light-weight, efficient way of transforming sub-expressions in an expression tree. They work similarly to the Visit… methods of the ExpressionTreeVisitor class, but unlike expression visitors, transformers are only meant for local transformations (i.e., transformations of expression patterns that can be detected by looking at a single expression and its (more or less) immediate children). Transformers are written for a dedicated expression type (e.g., MethodCallExpression), and they should not build up any state that spans multiple expression nodes in a tree.

Here’s an example from re-linq’s source code:

/// <summary>
/// Replaces calls to <see cref="Nullable{T}.Value"/> and <see cref="Nullable{T}.HasValue"/>
/// with casts and null checks. This allows LINQ providers
/// to treat nullables like reference types.
/// </summary>
public class NullableValueTransformer : IExpressionTransformer<MemberExpression>
{
  public ExpressionType[] SupportedExpressionTypes
  {
    get { return new[] { ExpressionType.MemberAccess }; }
  }

  public Expression Transform (MemberExpression expression)
  {
    ArgumentUtility.CheckNotNull ("expression", expression);

    if (expression.Member.Name == "Value"
        && IsDeclaredByNullableType (expression.Member))
      return Expression.Convert (expression.Expression, expression.Type);
    else if (expression.Member.Name == "HasValue"
        && IsDeclaredByNullableType (expression.Member))
      return Expression.NotEqual (
          expression.Expression,
          Expression.Constant (null, expression.Member.DeclaringType));
    else
      return expression;
  }

  private bool IsDeclaredByNullableType (MemberInfo memberInfo)
  {
    return memberInfo.DeclaringType.IsGenericType
        && memberInfo.DeclaringType.GetGenericTypeDefinition()
            == typeof (Nullable<>);
  }
}

The NullableValueTransformer implements the IExpressionTransformer<T> interface for MemberExpression because it will handle that expression type, similar to an expression visitor implementing VisitMemberExpression. The SupportedExpressionTypes property determines exactly which expressions this transformer should handle. (If the type parameter and the expression types don’t match, an exception is thrown at run-time.)

When the expression tree is parsed, the transforming processor (see above) will visit each sub-expression of the expression tree and pick the corresponding transformers based on the node type values. When it picks the NullableValueTransformer, it calls the Transform method, and the transformer may then decide whether to return the (untransformed) input expression, or to return a different expression. In the example, the transformer replaces calls to the Nullable<T>.Value and HasValue properties with cast expressions and null checks.
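To see what the two rewritten forms look like when they are executed, here is a minimal sketch that uses only System.Linq.Expressions and no re-linq types. The class and method names (NullableRewriteDemo, BuildHasValueCheck, BuildValueAccess) are made up for this illustration; the sketch builds the replacement expressions by hand instead of running the transformer.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical demo class; it only rebuilds the expressions the transformer would return.
public static class NullableRewriteDemo
{
  // "n.HasValue" rewritten as "n != null".
  public static Func<int?, bool> BuildHasValueCheck ()
  {
    var n = Expression.Parameter (typeof (int?), "n");
    var rewritten = Expression.NotEqual (n, Expression.Constant (null, typeof (int?)));
    return Expression.Lambda<Func<int?, bool>> (rewritten, n).Compile();
  }

  // "n.Value" rewritten as "(int) n".
  public static Func<int?, int> BuildValueAccess ()
  {
    var n = Expression.Parameter (typeof (int?), "n");
    var rewritten = Expression.Convert (n, typeof (int));
    return Expression.Lambda<Func<int?, int>> (rewritten, n).Compile();
  }

  public static void Main ()
  {
    var hasValue = BuildHasValueCheck();
    var value = BuildValueAccess();

    Console.WriteLine (hasValue (42));   // True
    Console.WriteLine (hasValue (null)); // False
    Console.WriteLine (value (42));      // 42
  }
}
```

The rewritten forms behave like the original member accesses when compiled in memory, but a back-end that translates the tree only has to understand casts and null comparisons.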

The transformers are called “inside out”, i.e., child nodes are transformed before their parent and ancestor nodes. When more than one transformer qualifies for the same expression, the transformers are called in a chain in the order of registration. When a transformer changes the expression, the chain is aborted and transformers are again chosen for the new expression (which may have a different type than the original one).
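The calling order described above can be sketched with the ExpressionVisitor class from System.Linq.Expressions. This is not re-linq’s implementation: SketchTransformingVisitor, its Register method, and the FoldConstantAdd transformer are hypothetical names, and transformers are modeled as plain delegates keyed by node type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical, simplified stand-in for a transforming processor; not re-linq's code.
public class SketchTransformingVisitor : ExpressionVisitor
{
  private readonly Dictionary<ExpressionType, List<Func<Expression, Expression>>> _chains
      = new Dictionary<ExpressionType, List<Func<Expression, Expression>>>();

  // Transformers for the same node type are kept in registration order.
  public void Register (ExpressionType nodeType, Func<Expression, Expression> transformer)
  {
    List<Func<Expression, Expression>> chain;
    if (!_chains.TryGetValue (nodeType, out chain))
    {
      chain = new List<Func<Expression, Expression>>();
      _chains.Add (nodeType, chain);
    }
    chain.Add (transformer);
  }

  public override Expression Visit (Expression node)
  {
    if (node == null)
      return null;

    // "Inside out": the children are visited (and transformed) first.
    var visited = base.Visit (node);

    List<Func<Expression, Expression>> chain;
    if (!_chains.TryGetValue (visited.NodeType, out chain))
      return visited;

    foreach (var transformer in chain)
    {
      var transformed = transformer (visited);
      if (transformed != visited)
        return Visit (transformed); // abort the chain, choose transformers anew
    }
    return visited;
  }
}

public static class FoldingDemo
{
  // Example transformer: folds "constant + constant" for ints into one constant.
  public static Expression FoldConstantAdd (Expression expression)
  {
    var binary = (BinaryExpression) expression;
    var left = binary.Left as ConstantExpression;
    var right = binary.Right as ConstantExpression;
    if (left != null && right != null && left.Value is int && right.Value is int)
      return Expression.Constant ((int) left.Value + (int) right.Value);
    return expression;
  }

  public static Expression Run ()
  {
    var visitor = new SketchTransformingVisitor();
    visitor.Register (ExpressionType.Add, FoldConstantAdd);

    // (1 + 2) + 3
    var tree = Expression.Add (
        Expression.Add (Expression.Constant (1), Expression.Constant (2)),
        Expression.Constant (3));
    return visitor.Visit (tree);
  }

  public static void Main ()
  {
    Console.WriteLine (((ConstantExpression) Run()).Value); // prints 6
  }
}
```

Because the inner addition is folded before the outer one is examined, the outer node already has two constant operands when its transformer runs. A purely local transformer can therefore collapse the whole tree without tracking any state across nodes.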

re-linq comes with a number of predefined transformers, including the nullable value transformer, a few VB syntax transformers, a transformer that detects invocations of LambdaExpressions, and a set of transformers that add metadata to constructor invocations for tuple types.

I’d think that most pre-processing requirements that a LINQ provider may have can be efficiently implemented as transformers. To add custom transformations, you again need to create customized instances of the QueryParser and ExpressionTreeParser classes; see https://www.re-motion.org/jira/browse/RM-3721.

Custom Expression Nodes

Last, but not least, there are the expression node parsers used by re-linq for translating query operators into query model clauses and result operators. I’ve written about these before; see https://www.re-motion.org/blogs/mix/archive/2010/10/28/re-linq-extensibility-custom-query-operators.aspx. Add your own node parsers to support non-default query operators. Again, https://www.re-motion.org/jira/browse/RM-3721 shows how they are integrated into the pipeline.

Well, that’s it. I have to say I’m quite happy with these customization features. If you have any comments, questions, or suggestions regarding them, feel free to post to our mailing list: http://groups.google.com/group/re-motion-users.

Written by Fabian

April 29th, 2011 at 3:06 pm

Posted in re-linq

re-mix: Now on CodePlex

Via the re-motion Team Blog: http://remix.codeplex.com/.

Check the team blog for details.

Written by Fabian

April 29th, 2011 at 1:35 pm

Posted in mixins

re-linq: Weekly builds now on CodePlex

Starting with build 1.13.92, we’ve changed our build system to automatically upload the re-linq weekly builds to CodePlex if any re-linq features were implemented in that week.

Previously, we published our weekly builds on our own server (https://re-motion.org/builds/) and only created CodePlex releases for “major” changes. For example, it was planned to publish the next CodePlex release only when re-linq’s SQL backend was finished.

That had the disadvantage, however, of giving the impression that re-linq wasn’t being worked on. For example, version 1.13.41 had been the “recommended release” for a very long time. It was released on January 13, 2010, and some people assumed that re-linq hadn’t been changing at all in the meantime. This was not correct – a lot of changes were made to re-linq between 1.13.41 and 1.13.92.

This concept was described on re-linq’s CodePlex homepage, but apparently, this wasn’t enough. So, now, every weekly build will be published on CodePlex. re-linq is developed in a test-driven way, and we usually consider our weekly builds to be “stable” (i.e., ready for production). Here’s what the homepage has to say about stability:

Note that due to the goodness of TDD, weekly builds are generally considered stable and we do often use those in production. However, if you need a bug fix you will have to upgrade to a newer version. Hotfixes are only produced for release versions (even/odd scheme: release versions have even minor version numbers, such as the upcoming 1.14.0, and hotfixes will be numbered 1.14.1, 1.14.2 etc.).

So, be aware that there can be “breaks” between minor releases for non-even builds (like 1.13), and always check out the release notes for the versions between your current one and the new one if you want to upgrade. The easiest way to do this is via our JIRA.

Apart from that, just keep monitoring the releases on CodePlex (RSS link) in order to get notified of new versions.

Written by Fabian

January 31st, 2011 at 10:55 am

Posted in re-linq

re-linq: A lot of new customizability

This week, we’ve implemented a set of features for easier customizability of the way re-linq’s front-end handles expression trees. With those features, you, as a re-linq-based LINQ provider author, can now:

  • extend, customize, or even replace the whole process of how to get from a LINQ expression tree to a re-linq QueryModel;
  • add custom expression tree transformation steps (or replace the existing ones); and
  • add light-weight expression transformation logic.

I’ll explain those in detail. Consider the following picture, which shows the important classes involved in the execution of a LINQ query with a re-linq based LINQ provider:

[Figure: the re-linq pipeline]

As you can see, a user starts the process by either calling GetEnumerator on an instance of a class derived from QueryableBase<T> or by using one of the methods for immediate execution (such as Count, Single, First, etc.) defined by the .NET framework’s Queryable class (passing in an instance derived from QueryableBase<T>). Both cause QueryProviderBase.Execute to be called. Up to here, everything is according to the standard LINQ provider implementation patterns.

Now comes the part specific to re-linq. The QueryProviderBase class will now ask a QueryParser to parse the query into a QueryModel, and then pass that QueryModel to an implementation of IQueryExecutor. The executor must be defined by the actual LINQ provider – this is where the actual data is queried from the underlying data source.

In order to produce the QueryModel, the QueryParser asks an ExpressionTreeParser to parse the given query expression tree into a chain of ExpressionNodeTypes. For each top-level query operator, one or more node instances are created, and all of those nodes are then asked to apply themselves to the QueryModel, thereby building the model that is returned.

The ExpressionTreeParser performs several steps before it creates the expression nodes: first, it partially evaluates the expression tree (i.e., it simplifies those parts of the tree that can be executed in memory); then, it applies some transformation steps; and finally, it creates the nodes for the top-level query operators using a user-extensible registry. (For simplicity, I’ve left out the analysis of method call arguments, which involves the detection of sub-queries, from the picture.)
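To give a rough idea of what “partially evaluating” means, here is a minimal sketch that replaces references to captured local variables with their current values. It is not re-linq’s partial evaluator, which handles far more cases; CapturedVariableEvaluator and PartialEvaluationDemo are hypothetical names, and only member accesses on constant closure objects are evaluated.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical sketch: evaluates member accesses on constants (e.g., captured locals) in memory.
public class CapturedVariableEvaluator : ExpressionVisitor
{
  protected override Expression VisitMember (MemberExpression node)
  {
    var visited = (MemberExpression) base.VisitMember (node);

    // A captured local shows up as a field access on a constant closure object.
    if (visited.Expression is ConstantExpression)
    {
      var value = Expression.Lambda (visited).Compile().DynamicInvoke();
      return Expression.Constant (value, visited.Type);
    }

    // Everything else (e.g., members of query parameters) is left untouched.
    return visited;
  }
}

public static class PartialEvaluationDemo
{
  public static Expression<Func<int, bool>> Run ()
  {
    int limit = 10;
    Expression<Func<int, bool>> query = x => x > limit;

    // After the visit, the body compares x to the constant 10 instead of the closure field.
    return (Expression<Func<int, bool>>) new CapturedVariableEvaluator().Visit (query);
  }

  public static void Main ()
  {
    var body = (BinaryExpression) Run().Body;
    Console.WriteLine (((ConstantExpression) body.Right).Value); // prints 10
  }
}
```

After this step, later stages only see a constant where the query author referred to a local variable, so they never need to deal with compiler-generated closure classes.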

Now that you understand the re-linq (front-end) pipeline, let me try again to explain what we’ve added. You can now:

  • extend, customize, or even replace the whole process of how to get from a LINQ expression tree to a re-linq QueryModel by providing a custom query parser;
  • add custom expression tree transformation steps (or replace the existing ones) by adding custom expression tree processing steps; and
  • add light-weight expression transformation logic by adding custom expression transformers.

I’ll probably describe the details of each of those possibilities in separate blog posts. For now, it will suffice to say that the new customizability features will be released with re-linq build 1.13.92 on CodePlex tomorrow (2011-01-28). By tomorrow evening (CET), the release notes will also be completed (available via JIRA).

Written by Fabian

January 27th, 2011 at 6:44 pm

Posted in re-linq