Fabian's Mix

Mixins, .NET, and more

re-linq: Dealing with sub-queries in a from clause


On our re-motion users list, Fabio Maulo asks:

for this query
from Dog o in session.Query<Dog>() select o

I would catch the Cast method but it is parsed as SubQueryExpression.

Have you any advise ?
Thanks.

My short answer is: Look for the CastResultOperator in the ResultOperators collection of the SubQueryExpression’s QueryModel. Detect that the inner QueryModel is trivial by inspecting QueryModel.IsIdentityQuery(). Do this in your back-end’s respective VisitSubQueryExpression visitor method, or use the SubQueryFromClauseFlattener.

The long answer:

re-linq’s front-end works in such a way that it always wraps expressions that contain query operators (Select, Where, OrderBy, Count, Cast, and similar method calls) into QueryModel instances. If those expressions are part of a larger expression tree (e.g., in the expression of a FromClause), re-linq will wrap the QueryModel into a SubQueryExpression and include that in the larger tree.

This sub-query system has several advantages. It is highly consistent in both representation and analysis, because you can usually apply the same QueryModel analyzer to the outer and inner queries. It is also easy and fast to parse: no context is needed when re-linq’s structural parser encounters a Select MethodCallExpression, for example.

However, it means you have to explicitly deal with those SubQueryExpressions in your parser if you need to support such queries.
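A side note on where the Cast in Fabio's query comes from in the first place: the C# compiler translates a from clause with an explicitly typed range variable (from Dog o in ...) into a Cast<Dog>() call on the query source. The following LINQ-to-Objects sketch shows the equivalence. The Dog class and the CastDemo helper are made up for this illustration and have nothing to do with re-linq or NHibernate.

```csharp
using System;
using System.Collections;
using System.Linq;

public class Dog
{
    public string Name = "";
}

public static class CastDemo
{
    public static string[] Names()
    {
        // ArrayList is non-generic and has no Select method of its own, so the
        // query below only compiles because "from Dog o" adds a Cast<Dog>() call.
        var kennel = new ArrayList { new Dog { Name = "Rex" }, new Dog { Name = "Fido" } };

        // Query syntax with an explicitly typed range variable ...
        var viaQuerySyntax = from Dog o in kennel select o.Name;

        // ... and the method-call form the compiler produces for it.
        var viaMethodSyntax = kennel.Cast<Dog>().Select(o => o.Name);

        // Both sequences contain the same names, in the same order.
        return viaQuerySyntax.Concat(viaMethodSyntax).ToArray();
    }
}
```

With an IQueryable source, that Cast call ends up in the expression tree, and that is the operator re-linq turns into a CastResultOperator.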

You have several options:

Option 1, if your target has good support for sub-queries (SQL does, for example), you can just translate those SubQueryExpressions into target sub-queries as they are. In the expression visitor parsing the expressions of from clauses, in the VisitSubQueryExpression method, apply your QueryModelVisitor to the expression’s inner QueryModel, then insert the resulting query as a sub-query into your result.

I’ll explain by using SQL. In the example above, you’d have the following:

SELECT [q0].*
FROM (SELECT [t0].* FROM [DogTable] [t0]) [q0]
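To make the recursion behind this statement a bit more concrete, here is a minimal sketch of the idea. Note that none of the types below are re-linq's real classes: SimpleQueryModel, TableSource, SubQuerySource, and SqlGenerator are stand-ins I made up so that the example compiles on its own. The sketch also leaves out result operators, so the Cast simply doesn't show up in the SQL.

```csharp
using System;

// Stand-ins for illustration only; these are NOT re-linq's real classes.
public abstract class FromSource { }

public sealed class TableSource : FromSource
{
    public readonly string Table;
    public TableSource(string table) { Table = table; }
}

// Plays the role of a SubQueryExpression sitting in a from clause.
public sealed class SubQuerySource : FromSource
{
    public readonly SimpleQueryModel QueryModel;
    public SubQuerySource(SimpleQueryModel queryModel) { QueryModel = queryModel; }
}

public sealed class SimpleQueryModel
{
    public readonly FromSource From;
    public SimpleQueryModel(FromSource from) { From = from; }
}

public sealed class SqlGenerator
{
    private int _tableCount;
    private int _subQueryCount;

    // Plays the role of the QueryModelVisitor: one SELECT per query model.
    public string VisitQueryModel(SimpleQueryModel model)
    {
        string source = VisitFromSource(model.From, out string alias);
        return "SELECT [" + alias + "].* FROM " + source + " [" + alias + "]";
    }

    private string VisitFromSource(FromSource source, out string alias)
    {
        switch (source)
        {
            case TableSource table:
                alias = "t" + _tableCount++;
                return "[" + table.Table + "]";

            case SubQuerySource subQuery:
                // The "VisitSubQueryExpression" step: apply the same query model
                // visitor to the inner model, then embed the result in parentheses.
                string inner = VisitQueryModel(subQuery.QueryModel);
                alias = "q" + _subQueryCount++;
                return "(" + inner + ")";

            default:
                throw new NotSupportedException("Unknown from source: " + source.GetType().Name);
        }
    }
}
```

Feeding it new SimpleQueryModel(new SubQuerySource(new SimpleQueryModel(new TableSource("DogTable")))) produces the statement shown above, on a single line. The point is that the sub-query case needs no special knowledge: it reuses the code path that handles the outermost query.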

You can choose to add an optimization step to the VisitSubQueryExpression method: if the QueryModel is trivial (QueryModel.IsIdentityQuery()) and all of its result operators can be handled inline, you can simply “flatten” the sub-query instead of parsing the inner QueryModel:

SELECT [t0].* FROM [DogTable] [t0]

As you can see, for this flattening, you have to take the FromExpression of the inner QueryModel and use it as the FromExpression of the outer query model. (If you also want to flatten non-trivial queries, you’ll need to perform an additional reference replacement step, but I won’t go into this now.)

This is how we do it in re-linq’s SQL back-end (or how we will do it, since the flattener hasn’t been implemented yet). The advantage is that this concept supports all constructs that build SubQueryExpressions automatically, and that you can still build optimized statements.

Option 2, if your target has no good support for sub-queries, you can still detect the SubQueryExpression in the expression visitor’s VisitSubQueryExpression method. Check whether the inner QueryModel is trivial (IsIdentityQuery()) and whether all the result operators can be expressed inline. If so, use the inner QueryModel’s FromExpression and modify it according to the result operators. You can do this, for example, using a dedicated SubQueryModelVisitor that returns a flattened result, which you can then insert into your outer query. If the QueryModel is non-trivial (e.g., if somebody wrote “from o in session.Query<Dog>().Where (…).OrderBy (…) select o”), throw a NotSupportedException or something similar.
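The decision logic for this could look roughly like the sketch below. Again, this is a self-contained stand-in and not re-linq code: MiniQueryModel, its string-based clause lists, and FlattenOrThrow are assumptions of mine, and the IsIdentityQuery() here only imitates what I'd expect the real method to check (no body clauses, and a selector that just returns the range variable).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for illustration only; this is NOT re-linq's QueryModel.
public sealed class MiniQueryModel
{
    public string FromTable = "";                               // stands in for the FromExpression
    public List<string> BodyClauses = new List<string>();       // e.g. "Where", "OrderBy"
    public List<string> ResultOperators = new List<string>();   // e.g. "Cast", "Count"
    public bool SelectsRangeVariable = true;                    // "select o"

    // Imitates the idea of IsIdentityQuery(): nothing but "from o in X select o".
    public bool IsIdentityQuery()
    {
        return BodyClauses.Count == 0 && SelectsRangeVariable;
    }
}

public static class SubQueryFlattening
{
    // Result operators this hypothetical back-end can express inline.
    private static readonly HashSet<string> InlineOperators = new HashSet<string> { "Cast" };

    // Returns the source the outer query should use, or throws if the
    // sub-query cannot be flattened.
    public static string FlattenOrThrow(MiniQueryModel inner)
    {
        if (!inner.IsIdentityQuery())
            throw new NotSupportedException("Only trivial sub-queries are supported in a from clause.");

        var unsupported = inner.ResultOperators.Where(op => !InlineOperators.Contains(op)).ToList();
        if (unsupported.Count > 0)
            throw new NotSupportedException("Cannot inline result operator(s): " + string.Join(", ", unsupported));

        // In this model, Cast requires no change to the source at all.
        return inner.FromTable;
    }
}
```

For Fabio's query, the inner model would have FromTable = "DogTable" and a single "Cast" result operator, so FlattenOrThrow returns "DogTable". Add a "Where" body clause or a "Count" result operator and it throws instead.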

This is similar to option 1, but the optimization becomes the default (and only) route.

Option 3, if you feel this is too complicated to do in your back-end, you can perform a recursive sub-query flattening step on the outermost QueryModel before you feed it into your back-end’s analyzer or transformer. I’ve blogged about this before, and re-linq already contains a simple SubQueryFromClauseFlattener. If you want to use that, you’ll probably need to override CheckFlattenable() for the example above in order to make sure that the CastResultOperator is ignored.
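The shape of such a pre-pass, including the override hook, can be sketched as follows. Please read this as a toy model: Model, Flattener, and CastIgnoringFlattener are names I invented, and the CheckFlattenable method here only mirrors the name of the real extension point. The actual SubQueryFromClauseFlattener works on re-linq's QueryModel and may apply different checks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for illustration only; this is NOT re-linq's QueryModel.
public sealed class Model
{
    public object Source;   // a table name (string) or a nested Model (a sub-query)
    public List<string> ResultOperators = new List<string>();

    public Model(object source, params string[] resultOperators)
    {
        Source = source;
        ResultOperators.AddRange(resultOperators);
    }
}

public class Flattener
{
    // Pre-pass: run once on the outermost model before the back-end sees it.
    public void Flatten(Model model)
    {
        if (model.Source is Model inner)
        {
            Flatten(inner);               // flatten deeper levels first
            CheckFlattenable(inner);      // throws if this level must stay a sub-query
            model.Source = inner.Source;  // pull the inner source up into the outer model
        }
    }

    // Default policy: refuse any sub-query that carries result operators.
    protected virtual void CheckFlattenable(Model subQuery)
    {
        if (subQuery.ResultOperators.Count > 0)
            throw new NotSupportedException("Sub-queries with result operators cannot be flattened.");
    }
}

// Relaxed policy for queries like Fabio's: a lone Cast is simply ignored.
public sealed class CastIgnoringFlattener : Flattener
{
    protected override void CheckFlattenable(Model subQuery)
    {
        if (subQuery.ResultOperators.Any(op => op != "Cast"))
            throw new NotSupportedException("Only Cast can be ignored while flattening.");
    }
}
```

With var model = new Model(new Model("DogTable", "Cast")), calling new CastIgnoringFlattener().Flatten(model) leaves model.Source set to "DogTable", whereas the base Flattener throws a NotSupportedException for the same input.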

If the existing SubQueryFromClauseFlattener is not enough for your purposes (or you don’t want to throw an exception if the sub-query cannot be flattened), you can implement your own transformation class in a similar way, of course.

The advantage of this option is that it keeps logic out of your back-end, which is likely complex enough anyway. The disadvantage is, however, that it requires a separate pass over the QueryModel. If performance is very important, it’s probably not the best option.

To summarize, I realize that at first glance, re-linq’s sub-query model might sometimes look like overkill, especially when a SubQueryExpression is created just because of a single Cast, Count, or Single method call. On the other hand, it makes it possible to handle Cast, Count, or Single the same way, whether they are at the top of the expression tree or embedded in a Select expression, for example. And they can be handled the same way whether they directly follow a query source (session.Query<Dog>().Count()) or not (session.Query<Dog>().Where(…).Select(…).Count()). To fully leverage this simplification, you need a target system that can also represent those sub-queries. If you have one, prefer option 1 from above.

Written by Fabian

September 23rd, 2010 at 7:44 am

Posted in re-linq
