Entity Framework: (anti)pattern Repository

A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction.

Entity Framework already ships with ready-made implementations of both patterns: DbSet<T> is the Repository and DbContext is the Unit of Work. Nevertheless, I often see colleagues build their own repository implementations on top of the ones that already exist in Entity Framework.

Most often, one of two approaches is used:

  • Generic Repository as an attempt to abstract away from a particular ORM.
  • Repository as a set of queries to a selected database table (DAO pattern).

Each of these approaches has its own downsides.

Generic Repository

When discussing the architecture of a new project, we often hear the question: “What if we want to switch to another ORM?” The usual answer is: “Let’s make a Generic Repository that will encapsulate the interaction with a particular data access technology.”

And thus we get a new abstraction layer that translates the well-known, well-designed, and documented API of a beloved ORM into our own makeshift API with no documentation.

A typical repository interface looks like this:
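Concrete interfaces differ between projects, but a minimal sketch might read as follows. The name IRepository<TEntity> and all of its members are illustrative assumptions, not the API of any particular library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical generic repository contract, shown only for illustration.
public interface IRepository<TEntity> where TEntity : class
{
    TEntity GetById(object id);
    IReadOnlyList<TEntity> GetAll();
    IReadOnlyList<TEntity> Find(Expression<Func<TEntity, bool>> predicate);

    void Add(TEntity entity);
    void Update(TEntity entity);
    void Remove(TEntity entity);
}
```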

So now you can easily switch to another ORM if you suddenly need it.

Not really! What if, when implementing our miracle repository, we used the unique features of a particular ORM? When migrating to a new ORM, we’ll have to invent workarounds in the business logic layer to somehow emulate what the previous ORM provided out of the box.

It turns out that in order to ensure a smooth migration, we must write the implementation of our repository for all popular libraries in advance.

Thus, to write a Generic Repository, you need to:

  • Gather your thoughts.
  • Design the interface.
  • Write an implementation for the ORM selected for the project.
  • Write an implementation for alternative ORMs.
  • Remove all the unique features of each library from the interface.
  • Explain to teammates why they cannot now use the unique features of their favorite ORM.
  • Maintain implementations for different ORMs in an up-to-date state. After all, frameworks are also evolving!
  • Explain to a manager why you are spending time on this instead of performing immediate tasks.

Fortunately, there are people who have already done this for us. If you really need to be independent of the ORM, you can use one of the ready-made implementations, for example the one from the ASP.NET Boilerplate project. Besides Repository, it contains many other interesting things.

But it’s better to leave things as they are. IDbSet<T> already exposes the entire set of CRUD operations, and because it inherits from IQueryable<T>, it also supports the whole range of LINQ queries, including asynchronous operations.

Repository as a set of queries

People often mistake an implementation of another pattern, the Data Access Object, for a repository, or implement both patterns in the same class. The class then gains methods such as GetByLogin() and GetByName() in addition to the CRUD operations.

While the project is “green”, everything is fine. The queries live in their corresponding files, and the code is structured. But as the project grows, new features are added, and new queries along with them. The repositories swell and turn into unmaintainable monsters.

Then queries appear that join several tables and return a Data Transfer Object rather than a domain object. And the question arises: which repository should such queries go into? All this happens because the Single Responsibility Principle (SRP) is violated when queries are grouped by database table rather than by business feature.

In addition to this, DAO methods also have other disadvantages:

  • They are difficult to test, although EF Core tries to address this problem with its in-memory database provider.

  • They do not support reuse and composition.

For example, if you have a DAO interface:
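A hypothetical sketch of such an interface is given here. Only FilterByDateAndTag() is named in the text; the Post entity and the other two method names are assumptions made for illustration. Note that every method returns an already materialized list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this illustration.
public class Post
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public string Tag { get; set; }
}

// Hypothetical DAO: one method per query, each returning loaded data.
public interface IPostDao
{
    IReadOnlyList<Post> FilterByDate(DateTime date);
    IReadOnlyList<Post> FilterByTag(string tag);
    IReadOnlyList<Post> FilterByDateAndTag(DateTime date, string tag);
}
```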

We cannot use the two previous methods to implement FilterByDateAndTag().

So what should we do?

Use the Query Builder and Specification patterns.

.NET provides a ready-made implementation of the Query Builder pattern: IQueryable<T> and the set of LINQ extension methods.

Let’s analyze the queries in our project. In accordance with the Pareto principle, 80% of the queries will be:

  • either a lookup of an entity by id: context.Entities.Find(id),

  • or a filter on a single field: context.Entities.Where(e => e.Property == value).

Out of the remaining 20%, a substantial portion will be unique for each individual business case. Such queries can be left inside business logic services.

Only during refactoring should the repeated parts of queries be extracted into extension methods on IQueryable<T>, and the recurring conditions into specifications.
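As a rough illustration of such an extraction, the sketch below turns two recurring filters into IQueryable<T> extension methods. The Post entity and the method names ByDate(), ByTag(), and ByDateAndTag() are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Post
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public string Tag { get; set; }
}

public static class PostQueries
{
    // Each method only appends a Where() to the expression tree; nothing is executed here.
    public static IQueryable<Post> ByDate(this IQueryable<Post> posts, DateTime date)
        => posts.Where(p => p.Date == date);

    public static IQueryable<Post> ByTag(this IQueryable<Post> posts, string tag)
        => posts.Where(p => p.Tag == tag);

    // Composition is plain method chaining over the same query.
    public static IQueryable<Post> ByDateAndTag(this IQueryable<Post> posts, DateTime date, string tag)
        => posts.ByDate(date).ByTag(tag);
}
```

With Entity Framework the source would be a DbSet<Post>, so both filters become part of a single query. For a quick in-memory check, the same methods can be called on new List<Post>().AsQueryable().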

Specification

A specification represents a business rule as a Boolean predicate that takes a domain entity as input. This means specifications can be composed with Boolean operators.

Fowler and Evans define the specification as:
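In C# terms, the classic definition boils down to an interface along these lines. This is a sketch: the interface name ISpecification<T> is a common convention assumed here, not a type from Entity Framework.

```csharp
// Classic specification contract: a predicate evaluated against an in-memory object.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}
```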

But such specifications cannot be used with IQueryable<T>.

In LINQ to Entities, Expression<Func<T, bool>> plays the role of a specification. But such expressions cannot be combined with Boolean operators, and they cannot be used directly in LINQ to Objects.

Let’s try to combine both approaches. Let’s add the ToExpression() method:

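And the Specification<T> base class:

Now we need to overload the Boolean operators &&, ||, and !. To do this, we have to do rather strange things. C# does not allow && and || to be overloaded directly. According to the C# Language Specification [7.11.2], if a type overloads the operators true, false, &, and |, then x && y is evaluated as false(x) ? x : (x & y), and x || y as true(x) ? x : (x | y). So if both true and false always return false, & is called instead of &&, and | instead of ||.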

Specification.cs

We also need to replace the argument of one of the expressions with ExpressionVisitor:

ParameterReplacer.cs
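The two pieces described above can be sketched together as follows. This is a minimal version written under assumptions, not the code of the library mentioned later in the article: the class names follow the file names, but the constructor, the private Combine() helper, and the lazy compilation are choices made for this example.

```csharp
using System;
using System.Linq.Expressions;

// Replaces one lambda parameter with another so that two bodies share a parameter.
public sealed class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression _from;
    private readonly ParameterExpression _to;

    public ParameterReplacer(ParameterExpression from, ParameterExpression to)
    {
        _from = from;
        _to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node)
        => node == _from ? _to : base.VisitParameter(node);
}

public class Specification<T>
{
    private readonly Expression<Func<T, bool>> _expression;
    private Func<T, bool> _compiled; // built lazily, only for in-memory checks

    public Specification(Expression<Func<T, bool>> expression)
    {
        _expression = expression ?? throw new ArgumentNullException(nameof(expression));
    }

    // For LINQ to Entities: hand the expression tree to the query provider.
    public Expression<Func<T, bool>> ToExpression() => _expression;

    // For LINQ to Objects and unit tests: run the compiled predicate.
    public bool IsSatisfiedBy(T entity)
        => (_compiled ?? (_compiled = _expression.Compile()))(entity);

    // Both return false, so && always falls through to & and || to |.
    public static bool operator true(Specification<T> spec) => false;
    public static bool operator false(Specification<T> spec) => false;

    public static Specification<T> operator &(Specification<T> left, Specification<T> right)
        => Combine(left, right, Expression.AndAlso);

    public static Specification<T> operator |(Specification<T> left, Specification<T> right)
        => Combine(left, right, Expression.OrElse);

    public static Specification<T> operator !(Specification<T> spec)
        => new Specification<T>(Expression.Lambda<Func<T, bool>>(
            Expression.Not(spec._expression.Body), spec._expression.Parameters));

    private static Specification<T> Combine(
        Specification<T> left, Specification<T> right,
        Func<Expression, Expression, BinaryExpression> merge)
    {
        ParameterExpression parameter = left._expression.Parameters[0];
        // Rebind the right-hand body to the left-hand parameter before merging.
        Expression rightBody = new ParameterReplacer(right._expression.Parameters[0], parameter)
            .Visit(right._expression.Body);
        return new Specification<T>(
            Expression.Lambda<Func<T, bool>>(merge(left._expression.Body, rightBody), parameter));
    }
}
```

Given two specifications named positive and even over int, the expression positive && !even produces a new Specification<int>. It can be checked in memory with IsSatisfiedBy(3), or passed to a query provider with Where(spec.ToExpression()).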

And convert the Specification<T> to Expression for use within other expressions:

SpecificationExpander.cs

Now we can

– test our specifications:

– combine our specifications:

– use them in LINQ to Entities:

If you do not like the magic with operators, you can use the ready-made Specification implementation from ASP.NET Boilerplate, or PredicateBuilder from LinqKit.

Extension methods to IQueryable

Extension methods on IQueryable<T> can be an alternative to specifications. For example:

The problem here is that while the first extension method works as expected, the second one throws an exception, because it is called inside the expression tree passed to SelectMany(), and LINQ to Entities cannot translate it.

Let’s try to improve the situation. For this we need:

  • ExpressionVisitor, which will expand our extension methods.

  • A decorator for IQueryable<T>, which will be called by our ExpressionVisitor.

  • The AsExpandable() extension method that will wrap IQueryable<T> in the decorator.

  • The [Expandable] attribute, with which we will mark the extension methods for expansion. After all, Where() or Select() are extension methods as well, and you do not need to expand them.

QueryableExtensions.cs

Now we need to implement the IQueryable<T> and IQueryProvider interfaces:

VisitableQuery.cs

VisitableQueryProvider.cs
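A stripped-down sketch of such a decorator pair is shown here, covering the synchronous interfaces only. The class names follow the file names above, but the bodies are an assumption-laden illustration rather than the library's code. The AsVisitable() helper and the ConstantReplacer demo visitor are hypothetical. Because each call re-visits the tree built so far, the sketch assumes the visitor is idempotent.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Provider decorator: rewrites every expression tree before the real provider sees it.
public sealed class VisitableQueryProvider : IQueryProvider
{
    private readonly IQueryProvider _inner;
    private readonly ExpressionVisitor _visitor;

    public VisitableQueryProvider(IQueryProvider inner, ExpressionVisitor visitor)
    {
        _inner = inner;
        _visitor = visitor;
    }

    public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
        => new VisitableQuery<TElement>(
            _inner.CreateQuery<TElement>(_visitor.Visit(expression)), this);

    // The non-generic path is rewritten but not re-wrapped in this sketch.
    public IQueryable CreateQuery(Expression expression)
        => _inner.CreateQuery(_visitor.Visit(expression));

    public TResult Execute<TResult>(Expression expression)
        => _inner.Execute<TResult>(_visitor.Visit(expression));

    public object Execute(Expression expression)
        => _inner.Execute(_visitor.Visit(expression));
}

// Query decorator: exposes the inner query but routes LINQ calls to the provider above.
public sealed class VisitableQuery<T> : IOrderedQueryable<T>
{
    private readonly IQueryable<T> _inner;

    public VisitableQuery(IQueryable<T> inner, VisitableQueryProvider provider)
    {
        _inner = inner;
        Provider = provider;
    }

    public Type ElementType => _inner.ElementType;
    public Expression Expression => _inner.Expression;
    public IQueryProvider Provider { get; }

    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)_inner).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => ((IEnumerable)_inner).GetEnumerator();
}

public static class VisitableExtensions
{
    // Hypothetical helper that wraps a query in the decorator.
    public static IQueryable<T> AsVisitable<T>(this IQueryable<T> source, ExpressionVisitor visitor)
        => new VisitableQuery<T>(source, new VisitableQueryProvider(source.Provider, visitor));
}

// Demo visitor (hypothetical): rewrites one string constant into another.
public sealed class ConstantReplacer : ExpressionVisitor
{
    private readonly string _from;
    private readonly string _to;

    public ConstantReplacer(string from, string to)
    {
        _from = from;
        _to = to;
    }

    protected override Expression VisitConstant(ConstantExpression node)
        => node.Value is string s && s == _from ? Expression.Constant(_to) : base.VisitConstant(node);
}
```

The design choice is that the wrapper never executes anything itself. It only gets a chance to rewrite the tree each time a LINQ operator is appended or the query is executed, and then delegates to the original provider.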

VisitorExtensions.cs

There is one subtlety. To support asynchronous operations such as ToListAsync(), additional interfaces have to be implemented: IDbAsyncEnumerable<T> for Entity Framework and IAsyncEnumerable<T> for EF Core. Therefore, it is better to use the actual implementation. It is based on LinqKit’s ExpandableQuery but allows any ExpressionVisitor to be used.

And, finally, the ExpressionVisitor itself:

ExtensionExpander.cs

ExtensionRebinder.cs

Now we can

– use the extension methods inside Expression Tree:

TL;DR

Dear colleagues, do not try to abstract yourself from the chosen framework! As a rule, the framework already provides a sufficient level of abstraction. Otherwise, why is it needed?

The full code for extensions is available on GitHub: EntityFramework.CommonTools, and in NuGet:

Benchmarks in my project:

DatabaseQueryBenchmark.cs

The approximate results

It seems that everything is cached as it should be. The query compilation time increases by 15-30%.

Dmitriy Panyushkin

In 2012, graduated from Lomonosov Moscow State University, Faculty of Mechanics and Mathematics. Dmitriy is engaged in developing enterprise systems on the .NET platform and is now working for QuantumArt. He is fond of web development.

  • good read, however it all looks a lot of ceremony (which I guess we cannot avoid). have you ever witnessed underlying database engine switch in projects? heard it many times, done also abstractions in repositories back in time, but never ever had real example when we needed to switch databases..

  • Victor Cejudo

    And then, if you want to stop using Entity Framework because some mapping problems that NHibernate doesn’t have, or because of a specific aspect of an inherited database that you can’t do with an ORM, you need to provide all the functionality of IQueryable.

    Wouldn’t it be simpler just make a method in the interface called FindByFields that gets a data structure and make that combination of queries in the implementation? What if Entity Framework shows a poor performance and I want to call a stored procedure instead?

    The specification pattern is for business logic, not to expose database aspects to the business.

    I do want to get collections as the result of the repository, not IQueryable because I want the unit of work to be completely closed and disposed when the repository returns, not to have potential state problems with the result of the methods.
    The caller should not be aware of that.