Fabian's Mix

Mixins, .NET, and more

re-linq: Update Number Three

Tempus fugit, as they say, and it has already been more than three weeks since my last update on re-linq’s progress. Needless to say, there’s been a lot going on in the code base. Here’s a recap of what has changed.

Result operators

  • We redesigned the result operators (previously: result modifications). Those are query operators such as Distinct, First, or Count, which are not part of a query’s clauses, but which act on the query’s result set, grouping, filtering, aggregating, or choosing single elements. They were also moved from the SelectClause to the QueryModel.
  • When a result operator is followed by a “normal” query method, such as Where, Select, or OrderBy, we now wrap everything coming before that query method into a subquery. Previously, we just reordered those query methods, putting them before the result operator. That is, of course, wrong, because the order of query methods is usually very important in a LINQ query. (Although there are cases where you do want to reorder clauses.)
  • We changed the handling of result operators that have (optional) selectors and predicates. We now handle those exactly as if there were Where and Select method calls directly in front of the operator.
  • We fixed the Take result operator: its Count property is now an Expression rather than an integer because Take might refer to another query part delivering the number of items to take.
  • We implemented lots of new result operators: Skip, Reverse, Union, Intersect, Except, Average, LongCount, DefaultIfEmpty, Cast, OfType, and Contains.
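
To illustrate why the order matters, here is a minimal sketch using plain LINQ to Objects (not re-linq's own API; the class and method names are made up for this example). It contrasts the query as written with the reordered one, shows the equivalent subquery form, and shows the predicate overload of Count next to its Where-plus-Count equivalent. The last method shows a Take whose count depends on another query part, which is why a constant integer is not enough to model it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; it only uses standard System.Linq operators.
public static class ResultOperatorDemo
{
    static readonly int[] Numbers = { 5, 1, 4, 2, 3, 6 };

    // Take(3) followed by Where: the filter only sees the first three items.
    public static List<int> TakeThenWhere() =>
        Numbers.Take(3).Where(n => n > 2).ToList();

    // What reordering the query methods would compute instead.
    public static List<int> WhereThenTake() =>
        Numbers.Where(n => n > 2).Take(3).ToList();

    // Subquery form: everything before Where becomes the source of a new query.
    public static List<int> SubqueryForm() =>
        (from n in (from m in Numbers select m).Take(3)
         where n > 2
         select n).ToList();

    // A result operator with a predicate ...
    public static int CountWithPredicate() =>
        Numbers.Count(n => n % 2 == 0);

    // ... behaves like a Where call directly in front of the plain operator.
    public static int WhereThenCount() =>
        Numbers.Where(n => n % 2 == 0).Count();

    // The count passed to Take refers to the outer range variable.
    public static List<int> TakeWithDependentCount() =>
        (from limit in new[] { 1, 2 }
         from n in Numbers.Take(limit)
         select n).ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", TakeThenWhere()));          // 5, 4
        Console.WriteLine(string.Join(", ", WhereThenTake()));          // 5, 4, 3
        Console.WriteLine(string.Join(", ", SubqueryForm()));           // 5, 4
        Console.WriteLine(CountWithPredicate() == WhereThenCount());    // True
        Console.WriteLine(string.Join(", ", TakeWithDependentCount())); // 5, 5, 1
    }
}
```

The first two results differ, so moving Where in front of Take changes the meaning of the query. The subquery form preserves it.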

Data stream modeling and query execution

  • We developed and implemented a model of the data that streams from the Select clause through result operators (being transformed in that process) and finally out of the QueryModel. This data is represented in an abstract form as IStreamedDataInfo objects; concrete data values are represented as IStreamedData objects. You can obtain the data info via QueryModel.GetOutputDataInfo(); this replaces QueryModel.GetResultType().
  • Based on IStreamedData, we improved the ExecuteInMemory facilities of the result operators. This means that a LINQ provider based on re-linq can now easily execute result operators in memory that it can’t translate to its destination query language. This should of course be used very carefully, as fetching all data into memory and then filtering it there might quickly become very inefficient. However, it’s a quick way to get started.
  • We eliminated IExecutionStrategy. Executing queries is now performed by QueryModel.Execute() in conjunction with IStreamedDataInfo.
  • We refactored IQueryExecutor: there’s now an additional ExecuteSingle() method for queries that end with a single-item result operator, such as First, Last, Single, Min, or Max.
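
The in-memory fallback idea can be sketched as follows. This is not re-linq code: the class, the fake "backend", and all method names are assumptions made up for illustration, and only standard System.Linq calls are used. The sketch pretends that the backend can filter but cannot reverse, so the reversal is applied to the fetched items in memory. It also shows the difference between an operator that yields a sequence and one that yields a single item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a provider whose backend can filter but not reverse.
public static class InMemoryFallbackDemo
{
    static readonly string[] Table = { "Ada", "Bob", "Cleo", "Dan" };

    // Pretend this part of the query was translated and run by the backend.
    public static IEnumerable<string> ExecuteOnBackend(Func<string, bool> filter) =>
        Table.Where(filter);

    // Untranslatable sequence operator, applied to the fetched items in memory.
    public static IEnumerable<string> ReverseInMemory(IEnumerable<string> fetched) =>
        fetched.Reverse();

    // Single-item operator: the result is one value, not a sequence.
    public static string FirstInMemory(IEnumerable<string> fetched) =>
        fetched.First();

    public static void Main()
    {
        IEnumerable<string> fetched = ExecuteOnBackend(name => name.Length == 3);
        IEnumerable<string> reversed = ReverseInMemory(fetched);

        Console.WriteLine(string.Join(", ", reversed));  // Dan, Bob, Ada
        Console.WriteLine(FirstInMemory(reversed));      // Dan
    }
}
```

Note that everything the backend returns has to be materialized before the in-memory step runs, which is where the inefficiency mentioned above comes from.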

Group-By support

  • We added support for GroupBy. We regard group operations as result operators, i.e. they are attached to a QueryModel and executed on the query’s result set. GroupBy is interesting insofar as it is the only query operator (so far) that also acts as a query item source. This means that a result operator following a GroupOperator (for example another GroupOperator) will have the IGroupings produced by the GroupOperator as its input.
  • In order to support GroupOperator.ExecuteInMemory(), we’ve developed a reverse resolver. This takes a resolved expression such as [student].Name (where [student] is a reference to a query item source) as well as the structure of the data streaming into the GroupOperator (e.g. new {[student], [course]}) and produces a LambdaExpression that evaluates the expression when passed an input item. We call this reverse resolving because it’s exactly the opposite of what our field access resolution mechanism does when a LINQ query is parsed.
    If you now ask why we implemented this even though the information is available when parsing queries _before_ we resolve expressions, well, the answer is simple: we have a transformable query model. You can easily append a new GroupOperator or change the selector of a SelectClause at runtime. This means that we cannot just keep the information we have from parsing, because that information might be completely outdated. So we have to recalculate the LambdaExpressions if we want to perform an ExecuteInMemory operation.
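
As a rough illustration of what reverse resolving has to produce, the following sketch hand-builds such a lambda with System.Linq.Expressions. It is not the re-linq implementation: the StudentCourse type stands in for the streamed structure new {[student], [course]}, and all type and method names are assumptions for this example. The resulting delegate evaluates "[student].Name" against one streamed item and is then used as an in-memory grouping key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed class Student
{
    public Student(string name) { Name = name; }
    public string Name { get; }
}

public sealed class Course
{
    public Course(string title) { Title = title; }
    public string Title { get; }
}

// Hypothetical shape of the items streaming into the group operator.
public sealed class StudentCourse
{
    public StudentCourse(Student student, Course course) { Student = student; Course = course; }
    public Student Student { get; }
    public Course Course { get; }
}

public static class ReverseResolveDemo
{
    // Builds "input => input.Student.Name" from scratch.
    public static Func<StudentCourse, string> BuildKeySelector()
    {
        ParameterExpression input = Expression.Parameter(typeof(StudentCourse), "input");
        // Locate the query source inside the streamed item ...
        MemberExpression student = Expression.Property(input, nameof(StudentCourse.Student));
        // ... then re-apply the member access of the resolved expression.
        MemberExpression name = Expression.Property(student, nameof(Student.Name));
        return Expression.Lambda<Func<StudentCourse, string>>(name, input).Compile();
    }

    // Uses the compiled lambda as the key selector of an in-memory grouping.
    public static Dictionary<string, int> GroupInMemory(IEnumerable<StudentCourse> items) =>
        items.GroupBy(BuildKeySelector()).ToDictionary(g => g.Key, g => g.Count());

    public static void Main()
    {
        var ada = new Student("Ada");
        var bob = new Student("Bob");
        var items = new[]
        {
            new StudentCourse(ada, new Course("Math")),
            new StudentCourse(ada, new Course("Physics")),
            new StudentCourse(bob, new Course("Math")),
        };

        foreach (var pair in GroupInMemory(items).OrderBy(p => p.Key))
            Console.WriteLine(pair.Key + ": " + pair.Value);  // Ada: 2, then Bob: 1
    }
}
```

Because the lambda is built from the current shape of the streamed data, it stays valid even if the query was transformed after parsing.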

Eager fetching

  • We rewrote eager fetching. Eager fetch requests are now represented by result operators attached to the QueryModel. The ad-hoc fetchRequests parameters passed to IQueryExecutor are gone; instead, FetchFilteringQueryModelVisitor should be used to extract fetch requests from a QueryModel (if a query executor supports eager fetching).
  • The query methods representing the entry points to eager fetching were moved to re-store. For users of re-linq, this means that eager fetching is now an opt-in functionality: just provide the respective query methods if you want to support eager fetching. Otherwise, don’t.

Other query methods

  • We implemented support for the Join and GroupJoin query methods.
  • We now also support Enumerable’s query methods. This means that expressions such as from expressions or where conditions that use Enumerable.Select() or similar methods are now parsed as subqueries.
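
For readers wondering what such queries look like, here is a small sketch. It runs against LINQ to Objects via AsQueryable() rather than a re-linq provider, and the Student type and method names are made up for this example. The point is only that the expression tree handed to a provider contains calls to Enumerable methods inside the where condition and the additional from expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Student
{
    public Student(string name, params string[] courses)
    {
        Name = name;
        Courses = courses.ToList();
    }

    public string Name { get; }
    public List<string> Courses { get; }
}

public static class EnumerableSubqueryDemo
{
    static readonly Student[] Students =
    {
        new Student("Ada", "Math", "Physics"),
        new Student("Bob", "History"),
    };

    // The where condition calls Enumerable.Any on an in-memory collection.
    public static IQueryable<string> MathStudents() =>
        from s in Students.AsQueryable()
        where s.Courses.Any(course => course == "Math")
        select s.Name;

    // The additional from expression is built with Enumerable.Where.
    public static IQueryable<string> LongCourseNames() =>
        from s in Students.AsQueryable()
        from c in s.Courses.Where(course => course.Length > 4)
        select s.Name + ": " + c;

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", MathStudents()));      // Ada
        Console.WriteLine(string.Join(", ", LongCourseNames()));   // Ada: Physics, Bob: History

        // The Enumerable call is part of the expression tree a provider receives.
        Console.WriteLine(MathStudents().Expression.ToString().Contains("Any"));  // True
    }
}
```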

Other refactorings

  • We restructured the classes in the Remotion.Data.Linq namespace. We moved our Data Model as well as everything related to SQL generation to a Backend sub-namespace. The classes in this namespace are only relevant to LINQ providers producing SQL (such as the one for re-store).
  • We did lots of other, minor refactorings.
  • And we also fixed a few conceptual bugs.

Wow, that’s a long list. But we had nearly four weeks, and we did use them well, I think.

Written by Fabian

July 31st, 2009 at 6:38 pm