IQueryable is Tight Coupling

From time to time I encounter people who attempt to express an API in terms of IQueryable<T>. That's almost always a bad idea. In this post, I'll explain why.

In short, the IQueryable<T> interface is one of the best examples of a Header Interface that .NET has to offer. It's almost impossible to fully implement it.

Please note that this post is about the problematic aspects of designing an API around the IQueryable<T> interface. It's not an attack on the interface itself, which has its place in the BCL. It's also not an attack on all the wonderful LINQ methods available on IEnumerable<T>.

You can say that IQueryable<T> is one big Liskov Substitution Principle (LSP)violation just waiting to happen. In the next two section, I will apply Postel's law to explain why that is.

Consuming IQueryable<T> #

The first part of Postel's law applied to API design states that an API should be liberal in what it accepts. In other words, we are talking about input, so an API that consumes IQueryable<T> would take this generalized shape:

IFoo SomeMethod(IQueryable<Bar> q);

Is that a liberal requirement? It most certainly is not. Such an interface demands of any caller that they must be able to supply an implementation of IQueryable<Bar>. According to the LSP we must be able to supply any implementation without changing the correctness of the program. That goes for both the implementer of IQueryable<Bar> as well as the implementation of SomeMethod.

At this point it's important to keep in mind the purpose of IQueryable<T>: it's intended for implementation by query providers. In other words, this isn't just some sequence of Bar instances which can be filtered and projected; no, this is a query expression which is intended to be translated into a query somewhere else - most often some dialect of SQL.

That's quite a demand to put on the caller.

It's certainly a powerful interface (or so it would seem), but is it really necessary? Does SomeMethod really need to be able to perform arbitrarily complex queries against a data source?

In one recent discussion, it turns out that all the developer really wanted to do was to be able to select based on a handful of simple criteria. In another case, the developer only wanted to do simple paging.

Such requirements could be modeled much simpler without making huge demands on the caller. In both cases, we could provide specialized Query Objects instead, or perhaps even simpler just a set of specialized queries:

IFoo FindById(int fooId);
 
IFoo FindByCorrelationId(int correlationId);

Or, in the case of paging:

IEnumerable<IFoo> GetFoos(int page);

This is certainly much more liberal in that it requires the caller to supply only the required information in order to implement the methods. Designing APIs in terms of Role Interfaces instead of Header Interfaces makes the APIs much more flexible. This will enable you to respond to change.

Exposing IQueryable<T> #

The other part of Postel's law states that an API should be conservative in what it sends. In other words, a method must guarantee that the data it returns conforms rigorously to the contract between caller and implementer.

A method returning IQueryable<T> would take this generalized shape:

IQueryable<Bar> GetBars();

When designing APIs, a huge part of the contract is defined by the interface (or base class). Thus, the return type of a method specifies a conservative guarantee about the returned data. In the case of returning IQueryable<Bar> the method thus guarantees that it will return a complete implementation of IQueryable<Bar>.

Is that conservative?

Once again invoking the LSP, a consumer must be able to do anything allowed by IQueryable<Bar> without changing the correctness of the program.

That's a big honking promise to make.

Who can keep that promise?

Current Implementations #

Implementing IQueryable<T> is a huge undertaking. If you don't believe me, just take a look at the official Building an IQueryable provider series of blog posts. Even so, the interface is so flexible and expressive that with a single exception, it's always possible to write a query that a given provider can't translate.

Have you ever worked with LINQ to Entities or another ORM and received a NotSupportedException? Lots of people have. In fact, with a single exception, it's my firm belief that all existing implementations violate the LSP (in fact, I challenge my readers to refer me to a real, publicly available implementation of IQueryable<T> that can accept any expression I throw at it, and I'll ship a free copy of my book to the first reader to do so).

Furthermore, the subset of features that each implementation supports varies from query provider to query provider. An expression that can be translated by the Entity framework may not work with Microsoft's OData query provider.

The only implementation that fully implements IQueryable<T> is the in-memory implementation (and referring to this one does not earn you a free book). Ironically, this implementation can be considered a Null Object implementation and goes against the whole purpose of the IQueryable<T> interface exactly because it doesn't translate the expression to another language.

Why This Matters #

You may think this is all a theoretical exercise, but it actually does matter. When writing Clean Code, it's important to design an API in such a way that it's clear what it does.

An interface like this makes false guarantees:

public interface IRepository
{
    IQueryable<T> Query<T>();
}

According to the LSP and Postel's law, it would seem to guarantee that you can write any query expression (no matter how complex) against the returned instance, and it would always work.

In practice, this is never going to happen.

Programmers who define such interfaces invariably have a specific ORM in mind, and they implicitly tend to stay within the bounds they know are safe for that specific ORM. This is a leaky abstraction.

If you have a specific ORM in mind, then be explicit about it. Don't hide it behind an interface. It creates the illusion that you can replace one implementation with another. In practice, that's impossible. Imagine attempting to provide an implementation over an Event Store.

The cake is a lie.

Comments

Anthony Johnston #

The differences between the different providers is even more subtle than NotSupportedException's
If you have the following query you get different results with the EF LINQ provider and the LINQ to Objects provider

var r = from x in context.Xs select new { x.Y.Z }

dependant on the type of join between the sets or if Y is null

makes in-memory testing difficult

2012-03-26 17:24 UTC

Sergio Romero #

I will look for the implementation of IQueryable<T> that does not violate the principle but, what if I already own your book? :)

P.S. Kudos by the way. I love every single piece of it.

2012-03-26 20:12 UTC

Vytautas #

While swapping implementations from a specific ORM to an event store is a major undertaking, it is not very difficult to swap one persistence mechanism to another when you query persistent view models (which are usually found in the same type of applications where you can find an event store). They usually have automatic properties and (next to) no relations so queries on them are relatively simple to not throw many NotSupportedExceptions.

Also, having a generic repository as an intermediate step between an ORM and a specific repository lets you unit test specific repositories and/or use them with multiple persistence mechanisms (which is sometimes desirable). Despite the leaks, this "pattern" does have some uses in my book when used VERY carefully (although it is misused in 99 cases out of 100 when you see it).

2012-03-26 20:43 UTC

Jon #

With your suggested approach, don't you just end up with a large interface of different queries? If not, how do you decide to split them up? Wouldn't it be better to have a more general purpose IQuery < T >.Execute()?

2012-03-26 21:17 UTC

Amir #

I've created an IRepository interface which doesn't consume or expose any IQueryable object. But I've created a separate library (abstraction) on the top of this abstract IRepository and name it "EntityRepository" which is abstract, implements the IRepository and is dedicated to EF. EntityRepository and its derived types may consume or expose IQueryable.
Everybody who wants to implement the repositories using EF would use EntityRepository and everybody who wants to use other ORMs can create his own derived abstraction from IRepository.

But the consumer codes in my library just use the IRepository inteface (Ex. A CRUD ViewModel base class).

It it correct?

2012-03-27 19:42 UTC

Mark Seemann #

Amir, that sounds correct. IQueryable is a useful tool, but it can be somewhat misleading as part of an API (interfaces or base classes). On concrete classes it's less misleading.

2012-03-28 05:03 UTC

Matt Warren #

You should take a look at some of Bart De Smet's stuff, he talks about IQueryable and it's problems.

See http://bartdesmet.net/blogs/bart/archive/2008/08/15/the-most-funny-interface-of-the-year-iqueryable-lt-t-gt.aspx and also take a look at the section 9min 30 secs into this talk http://channel9.msdn.com/Events/PDC/PDC10/FT10

2012-03-28 10:47 UTC

Daniel Cazzulino #

-- IFoo SomeMethod(IQueryable{Bar} q);

Who on Earth defines an API that *takes* an IQueryable? Nobody I know, and
I've never had the need to. So the first half of the post is kinda
irrelevant ;)

-- Implementing IQueryable{T} is a huge undertaking.

That's why NOBODY has to. The ORM you use will (THEY will take the huge
undertaking and they have, go look for EF and NHibernate and even RavenDB
who do provide such an API), as well as Linq to Objects for your testing
needs.

So, the IRepository.Query{T} is doing *exactly* what its definition
promises: "Mediates between the domain and data mapping layers using a
collection-like interface for accessing domain objects." (
http://martinfowler.com/eaaCatalog/repository.html)

In the ORM-implemented case, it goes against the data mapping layer, in the
Linq to Objects case, it does directly over the collection of objects.

It's the PERFECT abstraction to make code that goes against a DB/repository
unit-testable. (not that it frees you from having full integration
end-to-end tests that go against the *real* repository to make sure you're
not using any unsupported constructs against the in-memory version).

IQueryable{T} got rid of an entire slew of useless interfaces to abstract
the real repository. We should wholeheartedly embrace it and move forward,
instead of longing for the days when we had to do all that manually ;)

2012-03-28 14:19 UTC

Daniel Cazzulino #

-- Imagine attempting to provide an implementation over an Event Store.

That's the thing, if this interface is used in the read side of a CQRS
system, there's no event store there. Views need flexible queryability. No?

2012-03-28 14:20 UTC

Josh Elster #

I'm with kzu here -- I'm not really buying into your claim that IQueryable is a bad (leaky, etc) abstraction using the logic you provide. A method returning an IQueryable is *only* providing a guarantee that it describes a query over a set of objects. Nothing more. It gives no guarantees that the query can be executed, only what the query looks like.

Saying this:

public IQueryable{Customer} GetActiveCustomers() {
return CustomersDb.Where(x => x.IsActive).ToList().AsQueryable();
}

tells consumers that the return value of the method is a representation of how the results will be materialized when the query is executed. Saying this, OTOH:

public IEnumerable{Customer} GetActiveCustomers() {
return CustomersDb.Where(x => x.IsActive).ToList();
}

explicitly tells consumers that you are returning the *results* of a query, and it's none of the consuming code's business to know how it got there.

Materialization of those query results is what you seemed to be focused on, and that's not IQueryable's task. That's the task of the specific LINQ provider. IQueryable's job is only to *describe* queries, not actually execute them.

2012-03-28 15:21 UTC

Ethan J. Brown #

Completely agree with Mark here - IQueryable in, for instance, a repository interface may seem like a natural fit, but in fact it's a very very bad idea.

A repository interface should be explicit about the operations it provides over the set of data, and should not open the door to arbitrary querying that is not part of the applications overall design / architecture. YAGNI

I know that this may fly in the face of those who are used to arbitrary late-bound SQL style query flexibility and code-gen'd ORMs, but this type of setup is not one that's conducive to performance or scaling in the long term. ORMs may be good to get something up and running without a lot of effort (or thought into how the data is used), but they rapidly show their cracks / leaks over time. This is why you see sites like StackOverflow as they reach scale adopt MicroORMs that are closer to the metal like Dapper instead of using EF, L2S or NHibernate. (Other great Micro ORMs are Massive, OrmLite, and PetaPoco to name just a few.)

Is it work to be explicit in your contracts around data access? Absolutely... but this is a better long-term design decision when you factor in testability and performance goals. How do you test the perf characteristics of an API that doesn't have well-defined operations? Well... you can't really.

Also -- consider stores that are *not* SQL based as Mark alludes to. We're using Riak, where querying capabilities are limited in scope and there are performance trade-offs to certain types of operations on secondary indices. We accept this tradeoff b/c the operational characteristics of Riak are unparalleled and it has great latency characteristics when retrieving data. We need to be explicit about how we allow data to be retrieved in our data access layer because some things are a no-go in our underlying store (for instance, listing keys is extremely expensive at the moment in Riak).

Bind yourself to IQueryable and you've handcuffed yourself. IQueryable is simply not a fit in a lot of cases, and in others it's total overkill... data access is not one size fits all. Embrace the polyglot!

2012-03-29 12:09 UTC

Daniel #

Thanks for your blog post. One big problem with APIs that are based on IQueryable is testability, see this question for example: http://stackoverflow.com/questions/9921647/unit-testing-with-queries-defined-in-extension-methods.

However, I don't yet see how to design an easy to use and read API with query objects. Maybe you could provide a repository implementation that works without IQueryable but is still readable and easy to use?

2012-03-29 12:35 UTC

Nikos Baxevanis #

Daniel, according to your blog post http://blogs.clariusconsulting.net/kzu/how-to-design-a-unit-testable-domain-model-with-entity-framework-code-first/ you seem to do quite a lot of work to make the code testable, even re-implementing the Include method. Claiming that IQueryable is the "perfect" abstraction here doesn't convince me.

2012-03-29 12:39 UTC

Daniel #

What do you think about the API outlined in this post: http://stackoverflow.com/a/9943907/572644

2012-03-30 13:29 UTC

Mark Seemann #

Looks good, Daniel. I'd love to hear your experience with this once you've tried it on for some time :)

2012-03-30 14:15 UTC

Joshua Mouch #

I agree that it can be annoying when you try to materialize a IQueryable to find that something in the query is not supported by the referenced QueryProvider.

However, exposing IQueryable in an API is so powerful that it's worth this minor annoyance.

2012-03-30 19:35 UTC

Steven Pena #

Excellent points - couldn't agree more. Even when a method of IQueryable has been implemented, you can't even guarantee the same behavior. For example - look at "Contains". The EF Linq provider uses %, which will match any case, while the LinqToObjects provider will match by case. I tend to use IQueryable in my DAL and surface up a more meaningful and controlled API through my Repositories. This prevents that leaky abstraction and allows me to control the behavior of that persistence layer (caching, eager loading of aggregates...)

2012-03-31 12:46 UTC

Nathan Brown #

What does this say for things like ASP.Net Web API? It is asking you to expose your data through an IQueryable. I tried to use the OData stuff that was using IQueryable and could never get it to fully work with NHibernate because of specific unsupported query structures.

I like the idea of the OData or Web API queries, but would rather they just expose the query parameters and let us build adapters from the query to the datastore as relevant.

2012-03-31 18:24 UTC

Demis Bellot #

Completely agree with this. IQueryable/OData is an anti-pattern in the making - especially for external Web APIs, which is why it will never be an out-of-the-box option in ServiceStack: http://stackoverflow.com/questions/9577938/odata-with-servicestack

The whole idea of web service interfaces is to expose a technology-agnostic interoperable API to the outside world.

Exposing an IQueryable/OData endpoint effectively couples your services to using OData indefinitely as you wont be able to feasibly determine what 'query space' existing clients are binded to, i.e. what existing queries/tables/views/properties you need to freeze/support indefinitely. This is an issue when exposing any implementation at the surface area of your API, as it limits your ability to add your own custom logic on it, e.g. Authorization, Caching, Monitoring, Rate-Limiting, etc. And because OData is really slow, you'll hit performance/scaling problems with it early. The lack of control over the endpoint, means you're effectively heading for a rewrite: https://coldie.net/?tag=servicestack

Lets see how feasible is would be to move off an oData provider implementation by looking at an existing query from Netflix's OData api:

http://odata.netflix.com/Catalog/Titles?$filter=Type%20eq%20'Movie'%20and%20(Rating%20eq%20'G'%20or%20Rating%20eq%20'PG-13')

This service is effectively now coupled to a Table/View called 'Titles' with a column called 'Type'.

And how it would be naturally written if you weren't using OData:

http://api.netflix.com/movies?ratings=G,PG-13

Now if for whatever reason you need to replace the implementation of the existing service (e.g. a better technology platform has emerged, it runs too slow and you need to move this dataset over to a NoSQL/Full-TextIndexing-backed sln) how much effort would it take to replace the OData impl (which is effectively binded to an RDBMS schema and OData binary impl) to the more intuitive impl-agnostic query below? It's not impossible, but as it's prohibitively difficult to implement an OData API for a new technology, that rewriting + breaking existing clients would tend to be the preferred option.

Letting internal implementations dictate the external facing url structure is a sure way to break existing clients when things need to change. This is why you should expose your services behind Cool URIs, i.e. logical permanent urls (that are unimpeded by implementation) that do not change, as you generally don't want to limit the technology choices of your services.

It might be a good option if you want to deliver adhoc 'exploratory services' on it, but it's not something I'd ever want external clients binded to in a production system. And if I'm only going to limit its use to my local intranet what advantages does it have over just giving out a read-only connection string? which will allow others use with their favourite Sql Explorer, Reporting or any other tools that speaks SQL.

2012-07-31 05:42 UTC

Daniel Cazzulino #

@Nikos "re-implementing the Include method": you mean that one-liner??

The testable interface abstraction allows for one-liners in the ORM-bound implementation, as well as the testable fake. Doesn't get any simpler than that.

That hardly seems like complicated stuff to me.

2012-08-31 15:44 UTC

Daniel Cazzulino #

@Demis +1. Exposing OData is a different thing altogether. Not totally convinced with that either myself.

2012-08-31 15:46 UTC

Ron #

@Demis, couldn't you use a service pattern to address those issues?

2013-02-21 16:49 UTC

Mario Majcica #

Although this post is "oldish" I do think that is still not answered. I was checkinga quite nice idea of UoW and Repository pattern and EF implementation proposal and got stuck again at IQueryable question. Recently I read a lot about this question and could not find anything that suits me perfectly. Is your opinion changed after more than two years time? What about passing a Expression<Func<T, bool>> predicate to our Find method? Thanks

2014-08-06 17:46 UTC

Mark Seemann #

Mario, thank you for writing. It occasionally happens that I change my mind, as I learn new skills and gain more experience, but in this case, I haven't changed my mind. I indirectly use IQueryable<T> when I occasionally have to query a relational database with the Entity Framework or LINQ to SQL (which happens extremely rarely these days), but I never design my own interfaces around IQueryable<T>; my reasons for introducing an interface is normally to reduce coupling, but introducing any interface that exposes or depends on IQueryable<T> doesn't reduce coupling.

In Agile Principles, Patterns, and Practices in C#, Robert C. Martin explains in chapter 11 that "clients [...] own the abstract interfaces" - in other words: the client, which invokes methods on the interface, should define the methods that the interface should expose, based on what it needs - not based on what an arbitrary implementation might be able to implement. This is the design philosophy I always follow. A client shouldn't have to need to be able to perform arbitrary queries against a data access layer. If it does, it's such a Leaky Abstraction anyway that the interface isn't going to help; in such cases, just let the client talk directly to the ORM or the ADO.NET objects. That's much easier to understand.

The same argument also goes against designing interfaces around Expression<Func<T, bool>>. It may seem flexible, but the fact is that such an expression can be arbitrarily complex, so in practice, it's impossible to guarantee that you can translate every expression to all SQL dialects; and what about OData? Or queries against MongoDB?

Still, I rarely use ORMs at all. Instead, I increasingly rely on simple abstractions like Drain when designing my systems.

2014-08-09 12:22 UTC

Published: Monday, 26 March 2012 13:53:31 UTC

IQueryable is Tight Coupling by Mark Seemann

Consuming IQueryable<T> #

Exposing IQueryable<T> #

Current Implementations #

Why This Matters #

Comments

Wish to comment?

Published

Tags