Do ORMs reduce the need for mapping?

With some Entity Framework examples in C#.

In a recent comment, a reader asked me to expand on my position on object-relational mappers (ORMs), which is that I'm not a fan:

I consider ORMs a waste of time: they create more problems than they solve.

Code That Fits in Your Head, subsection 12.2.2, footnote

While I acknowledge that only a Sith deals in absolutes, I favour clear assertions over guarded language. I don't really mean it that categorically, but I do stand by the general sentiment. In this article I'll attempt to describe why I don't reach for ORMs when querying or writing to a relational database.

As always, any exploration of such a kind is made in a context, and this article is no exception. Before proceeding, allow me to delineate the scope. If your context differs from mine, what I write may not apply to your situation.

Scope #

It's been decades since I last worked on a system where the database 'came first'. The last time that happened, the database was hidden behind an XML-based RPC API that tunnelled through HTTP. Not a REST API by a long shot.

Since then, I've worked on various systems. Some used relational databases, some document databases, some worked with CSV, or really old legacy APIs, etc. Common to these systems was that they were not designed around a database. Rather, they were developed with an eye to the Dependency Inversion Principle, keeping storage details out of the Domain Model. Many were developed with test-driven development (TDD).

When I evaluate whether or not to use an ORM in situations like these, the core application logic is my main design driver. As I describe in Code That Fits in Your Head, I usually develop (vertical) feature slices one at a time, utilising an outside-in TDD process, during which I also figure out how to save or retrieve data from persistent storage.

Thus, in systems like these, storage implementation is an artefact of the software architecture. If a relational database is involved, the schema must adhere to the needs of the code; not the other way around.

To be clear, then, this article doesn't discuss typical CRUD-heavy applications that are mostly forms over relational data, with little or no application logic. If you're working with such a code base, an ORM might be useful. I can't really tell, since I last worked with such systems at a time when ORMs didn't exist.

The usual suspects #

The most common criticism of ORMs (that I've come across) is typically related to the queries they generate. People who are skilled in writing SQL by hand, or who are concerned about performance, may look at the SQL that an ORM generates and dislike it for that reason.

It's my impression that ORMs have come a long way over the decades, but frankly, the generated SQL is not really what concerns me. It never was.

In the abstract, Ted Neward already outlined the problems in the seminal article The Vietnam of Computer Science. That problem description may, however, be too theoretical to connect with most programmers, so I'll try a more example-driven angle.

Database operations without an ORM #

Once more I turn to the trusty example code base that accompanies Code That Fits in Your Head. In it, I used SQL Server as the example database, and ADO.NET as the data access technology.

I considered this more than adequate for saving and reading restaurant reservations. Here, for example, is the code that creates a new reservation row in the database:

public async Task Create(int restaurantId, Reservation reservation)
{
    if (reservation is null)
        throw new ArgumentNullException(nameof(reservation));
 
    using var conn = new SqlConnection(ConnectionString);
    using var cmd = new SqlCommand(createReservationSql, conn);
    cmd.Parameters.AddWithValue("@Id", reservation.Id);
    cmd.Parameters.AddWithValue("@RestaurantId", restaurantId);
    cmd.Parameters.AddWithValue("@At", reservation.At);
    cmd.Parameters.AddWithValue("@Name", reservation.Name.ToString());
    cmd.Parameters.AddWithValue("@Email", reservation.Email.ToString());
    cmd.Parameters.AddWithValue("@Quantity", reservation.Quantity);
 
    await conn.OpenAsync().ConfigureAwait(false);
    await cmd.ExecuteNonQueryAsync().ConfigureAwait(false);
}
 
private const string createReservationSql = @"
    INSERT INTO [dbo].[Reservations] (
        [PublicId], [RestaurantId], [At], [Name], [Email], [Quantity])
    VALUES (@Id, @RestaurantId, @At, @Name, @Email, @Quantity)";

Yes, there's mapping, even if it's 'only' from a Domain Object to command parameter strings. As I'll argue later, if there's a way to escape such mapping, I'm not aware of it. ORMs don't seem to solve that problem.

This, however, seems to be the reader's main concern:

"I can work with raw SQL ofcourse... but the mapping... oh the mapping..."

qfilip

It's not a concern that I share, but again I'll remind you that if your context differs substantially from mine, what doesn't concern me could reasonably concern you.

You may argue that the above example isn't representative, since it only involves a single table. No foreign key relationships are involved, so perhaps the example is artificially easy.

In order to work with a slightly more complex schema, I decided to port the read-only in-memory restaurant database (the one that keeps track of the restaurants - the tenants - of the system) to SQL Server.

Restaurants schema #

In the book's sample code base, I'd only stored restaurant configurations as JSON config files, since I considered it out of scope to include an online tenant management system. Converting to a relational model wasn't hard, though. Here's the database schema:

CREATE TABLE [dbo].[Restaurants] (
    [Id]               INT            NOT NULL,
    [Name]             NVARCHAR (50)  NOT NULL UNIQUE,
    [OpensAt]          TIME           NOT NULL,
    [LastSeating]      TIME           NOT NULL,
    [SeatingDuration]  TIME           NOT NULL
    PRIMARY KEY CLUSTERED ([Id] ASC)
)
 
CREATE TABLE [dbo].[Tables] (
    [Id]               INT            NOT NULL IDENTITY,
    [RestaurantId]     INT            NOT NULL REFERENCES [dbo].[Restaurants](Id),
    [Capacity]         INT            NOT NULL,
    [IsCommunal]       BIT            NOT NULL
    PRIMARY KEY CLUSTERED ([Id] ASC)
)

This little subsystem requires two database tables: One that keeps track of the overall restaurant configuration, such as name, opening and closing times, and another database table that lists all a restaurant's physical tables.

You may argue that this is still too simple to realistically capture the intricacies of existing database systems, but conversely I'll remind you that the scope of this article is the sort of system where you develop and design the application first; not a system where you're given a relational database upon which you must create an application.

Had I been given this assignment in a realistic setting, a relational database probably wouldn't have been my first choice. Some kind of document database, or even blob storage, strikes me as a better fit. Still, this article is about ORMs, so I'll pretend that there are external circumstances that dictate a relational database.

To test the system, I also created a script to populate these tables. Here's part of it:

INSERT INTO [dbo].[Restaurants] ([Id], [Name], [OpensAt], [LastSeating], [SeatingDuration])
VALUES (1, N'Hipgnosta', '18:00', '21:00', '6:00')
 
INSERT INTO [dbo].[Tables] ([RestaurantId], [Capacity], [IsCommunal])
VALUES (1, 10, 1)
 
INSERT INTO [dbo].[Restaurants] ([Id], [Name], [OpensAt], [LastSeating], [SeatingDuration])
VALUES (2112, N'Nono', '18:00', '21:00', '6:00')
 
INSERT INTO [dbo].[Tables] ([RestaurantId], [Capacity], [IsCommunal])
VALUES (2112, 6, 1)
 
INSERT INTO [dbo].[Tables] ([RestaurantId], [Capacity], [IsCommunal])
VALUES (2112, 4, 1)
 
INSERT INTO [dbo].[Tables] ([RestaurantId], [Capacity], [IsCommunal])
VALUES (2112, 2, 0)
 
INSERT INTO [dbo].[Tables] ([RestaurantId], [Capacity], [IsCommunal])
VALUES (2112, 2, 0)
 
INSERT INTO [dbo].[Tables] ([RestaurantId], [Capacity], [IsCommunal])
VALUES (2112, 4, 0)
 
INSERT INTO [dbo].[Tables] ([RestaurantId], [Capacity], [IsCommunal])
VALUES (2112, 4, 0)

There are more rows than this, but this should give you an idea of what data looks like.

Reading restaurant data without an ORM #

Due to the foreign key relationship, reading restaurant data from the database is a little more involved than reading from a single table.

public async Task<Restaurant?> GetRestaurant(string name)
{
    using var cmd = new SqlCommand(readByNameSql);
    cmd.Parameters.AddWithValue("@Name", name);
    
    var restaurants = await ReadRestaurants(cmd);
    return restaurants.SingleOrDefault();
}
 
private const string readByNameSql = @"
    SELECT [Id], [Name], [OpensAt], [LastSeating], [SeatingDuration]
    FROM [dbo].[Restaurants]
    WHERE [Name] = @Name
 
    SELECT [RestaurantId], [Capacity], [IsCommunal]
    FROM [dbo].[Tables]
    JOIN [dbo].[Restaurants]
    ON [dbo].[Tables].[RestaurantId] = [dbo].[Restaurants].[Id]
    WHERE [Name] = @Name";

There are more than one option when deciding how to construct the query. You could make one query with a join, in which case you'd get rows with repeated data, and you'd then need to detect duplicates, or you could do as I've done here: Query each table to get multiple result sets.

I'm not claiming that this is better in any way. I only chose this option because I found the code that I had to write less offensive.

Since the IRestaurantDatabase interface defines three different kinds of queries (GetAll(), GetRestaurant(int id), and GetRestaurant(string name)), I invoked the rule of three and extracted a helper method:

private async Task<IEnumerable<Restaurant>> ReadRestaurants(SqlCommand cmd)
{
    var conn = new SqlConnection(ConnectionString);
    cmd.Connection = conn;
 
    await conn.OpenAsync();
    using var rdr = await cmd.ExecuteReaderAsync();
 
    var restaurants = Enumerable.Empty<Restaurant>();
    while (await rdr.ReadAsync())
        restaurants = restaurants.Append(ReadRestaurantRow(rdr));
 
    if (await rdr.NextResultAsync())
        while (await rdr.ReadAsync())
            restaurants = ReadTableRow(rdr, restaurants);
 
    return restaurants;
}

The ReadRestaurants method does the overall work of opening the database connection, executing the query, and moving through rows and result sets. Again, we'll find mapping code hidden in helper methods:

private static Restaurant ReadRestaurantRow(SqlDataReader rdr)
{
    return new Restaurant(
        (int)rdr["Id"],
        (string)rdr["Name"],
        new MaitreD(
            new TimeOfDay((TimeSpan)rdr["OpensAt"]),
            new TimeOfDay((TimeSpan)rdr["LastSeating"]),
            (TimeSpan)rdr["SeatingDuration"]));
}

As the name suggests, ReadRestaurantRow reads a row from the Restaurants table and converts it into a Restaurant object. At this time, however, it creates each MaitreD object without any tables. This is possible because one of the MaitreD constructors takes a params array as the last parameter:

public MaitreD(
    TimeOfDay opensAt,
    TimeOfDay lastSeating,
    TimeSpan seatingDuration,
    params Table[] tables) :
    this(opensAt, lastSeating, seatingDuration, tables.AsEnumerable())
{
}

Only when the ReadRestaurants method moves on to the next result set can it add tables to each restaurant:

private static IEnumerable<Restaurant> ReadTableRow(
    SqlDataReader rdr,
    IEnumerable<Restaurant> restaurants)
{
    var restaurantId = (int)rdr["RestaurantId"];
    var capacity = (int)rdr["Capacity"];
    var isCommunal = (bool)rdr["IsCommunal"];
    var table = isCommunal ? Table.Communal(capacity) : Table.Standard(capacity);
 
    return restaurants.Select(r => r.Id == restaurantId ? AddTable(r, table) : r);
}

As was also the case in ReadRestaurantRow, this method uses string-based indexers on the rdr to extract the data. I'm no fan of stringly-typed code, but at least I have automated tests that exercise these methods.

Could an ORM help by creating strongly-typed classes that model database tables? To a degree; I'll discuss that later.

In any case, since the entire code base follows the Functional Core, Imperative Shell architecture, the entire Domain Model is made of immutable data types with pure functions. Thus, ReadTableRow has to iterate over all restaurants and add the table when the Id matches. AddTable does that:

private static Restaurant AddTable(Restaurant restaurant, Table table)
{
    return restaurant.Select(m => m.WithTables(m.Tables.Append(table).ToArray()));
}

I can think of other ways to solve the overall mapping task when using ADO.NET, but this was what made most sense to me.

Reading restaurants with Entity Framework #

Does an ORM like Entity Framework (EF) improve things? To a degree, but not enough to outweigh the disadvantages it also brings.

In order to investigate, I followed the EF documentation to scaffold code from a database I'd set up for only that purpose. For the Tables table it created the following Table class and a similar Restaurant class.

public partial class Table
{
    public int Id { get; set; }
 
    public int RestaurantId { get; set; }
 
    public int Capacity { get; set; }
 
    public bool IsCommunal { get; set; }
 
    public virtual Restaurant Restaurant { get; set; } = null!;
}

Hardly surprising. Also, hardly object-oriented, but more about that later, too.

Entity Framework didn't, by itself, add a Tables collection to the Restaurant class, so I had to do that by hand, as well as modify the DbContext-derived class to tell it about this relationship:

entity.OwnsMany(r => r.Tables, b =>
{
    b.Property<int>(t => t.Id).ValueGeneratedOnAdd();
    b.HasKey(t => t.Id);
});

I thought that such a simple foreign key relationship would be something an ORM would help with, but apparently not.

With that in place, I could now rewrite the above GetRestaurant method to use Entity Framework instead of ADO.NET:

public async Task<Restaurants.Restaurant?> GetRestaurant(string name)
{
    using var db = new RestaurantsContext(ConnectionString);
    var dbRestaurant = await db.Restaurants.FirstOrDefaultAsync(r => r.Name == name);
    if (dbRestaurant == null)
        return null;
 
    return ToDomainModel(dbRestaurant);
}

The method now queries the database, and EF automatically returns a populated object. This would be nice if it was the right kind of object, but alas, it isn't. GetRestaurant still has to call a helper method to convert to the correct Domain Object:

private static Restaurants.Restaurant ToDomainModel(Restaurant restaurant)
{
    return new Restaurants.Restaurant(
        restaurant.Id,
        restaurant.Name,
        new MaitreD(
            new TimeOfDay(restaurant.OpensAt),
            new TimeOfDay(restaurant.LastSeating),
            restaurant.SeatingDuration,
            restaurant.Tables.Select(ToDomainModel).ToList()));
}

While this helper method converts an EF Restaurant object to a proper Domain Object (Restaurants.Restaurant), it also needs another helper to convert the table objects:

private static Restaurants.Table ToDomainModel(Table table)
{
    if (table.IsCommunal)
        return Restaurants.Table.Communal(table.Capacity);
    else
        return Restaurants.Table.Standard(table.Capacity);
}

As should be clear by now, using vanilla EF doesn't reduce the need for mapping.

Granted, the mapping code is a bit simpler, but you still need to remember to map restaurant.Name to the right constructor parameter, restaurant.OpensAt and restaurant.LastSeating to their correct places, table.Capacity to a constructor argument, and so on. If you make changes to the database schema or the Domain Model, you'll need to edit this code.

Encapsulation #

This is the point where more than one reader wonders: Can't you just..?

In short, no, I can't just.

The most common reaction is most likely that I'm doing this all wrong. I'm supposed to use the EF classes as my Domain Model.

But I can't, and I won't. I can't because I already have classes in place that serve that purpose. I also will not, because it would violate the Dependency Inversion Principle. As I recently described, the architecture is Ports and Adapters, or, if you will, Clean Architecture. The database Adapter should depend on the Domain Model; the Domain Model shouldn't depend on the database implementation.

Okay, but couldn't I have generated the EF classes in the Domain Model? After all, a class like the above Table is just a POCO Entity. It doesn't depend on the Entity Framework. I could have those classes in my Domain Model, put my DbContext in the data access layer, and have the best of both worlds. Right?

The code shown so far hints at a particular API afforded by the Domain Model. If you've read my book, you already know what comes next. Here's the Table Domain Model's API:

public sealed class Table
{ 
    public static Table Standard(int seats)
 
    public static Table Communal(int seats)
 
    public int Capacity { get; }
 
    public int RemainingSeats { get; }
 
    public Table Reserve(Reservation reservation)
 
    public T Accept<T>(ITableVisitor<T> visitor)
}

A couple of qualities of this design should be striking: There's no visible constructor - not even one that takes parameters. Instead, the type affords two static creation functions. One creates a standard table, the other a communal table. My book describes the difference between these types, and so does the Maître d' kata.

This isn't some frivolous design choice of mine, but rather quite deliberate. That Table class is a Visitor-encoded sum type. You can debate whether I should have modelled a table as a sum type or a polymorphic object, but now that I've chosen a sum type, it should be explicit in the API design.

"Explicit is better than implicit."

The Zen of Python

When we program, we make many mistakes. It's important to discover the mistakes as soon as possible. With a compiled language, the first feedback you get is from the compiler. I favour leveraging the compiler, and its type system, to prevent as many mistakes as possible. That's what Hillel Wayne calls constructive data. Make illegal states unrepresentable.

I could, had I thought of it at the time, have introduced a predicative natural-number wrapper of integers, in which case I could have strengthened the contract of Table even further:

public sealed class Table
{ 
    public static Table Standard(NaturalNumber capacity)
 
    public static Table Communal(NaturalNumber capacity)
 
    public NaturalNumber Capacity { get; }
 
    public int RemainingSeats { get; }
 
    public Table Reserve(Reservation reservation)
 
    public T Accept<T>(ITableVisitor<T> visitor)
}

The point is that I take encapsulation seriously, and my interpretation of the concept is heavily inspired by Bertrand Meyer's Object-Oriented Software Construction. The view of encapsulation emphasises contracts (preconditions, invariants, postconditions) rather than information hiding.

As I described in a previous article, you can't model all preconditions and invariants with types, but you can still let the type system do much heavy lifting.

This principle applies to all classes that are part of the Domain Model; not only Table, but also Restaurant:

public sealed class Restaurant
{
    public Restaurant(int id, string name, MaitreD maitreD)
 
    public int Id { get; }
    public string Name { get; }
    public MaitreD MaitreD { get; }
 
    public Restaurant WithId(int newId)
 
    public Restaurant WithName(string newName)
 
    public Restaurant WithMaitreD(MaitreD newMaitreD)
 
    public Restaurant Select(Func<MaitreD, MaitreD> selector)
}

While this class does have a public constructor, it makes use of another design choice that Entity Framework doesn't support: It nests one rich object (MaitreD) inside another. Why does it do that?

Again, this is far from a frivolous design choice I made just to be difficult. Rather, it's a result of a need-to-know principle (which strikes me as closely related to the Single Responsibility Principle): A class should only contain the information it needs in order to perform its job.

The MaitreD class does all the heavy lifting when it comes to deciding whether or not to accept reservations, how to allocate tables, etc. It doesn't, however, need to know the id or name of the restaurant in order to do that. Keeping that information out of MaitreD, and instead in the Restaurant wrapper, makes the code simpler and easier to use.

The bottom line of all this is that I value encapsulation over 'easy' database mapping.

Limitations of Entity Framework #

The promise of an object-relational mapper is that it automates mapping between objects and database. Is that promise realised?

In its current incarnation, it doesn't look as though Entity Framework supports mapping to and from the Domain Model. With the above tweaks, it supports the database schema that I've described, but only via 'Entity classes'. I still have to map to and from the 'Entity objects' and the actual Domain Model. Not much is gained.

One should, of course, be careful not drawing too strong inferences from this example. First, proving anything impossible is generally difficult. Just because I can't find a way to do what I want, I can't conclude that it's impossible. That a few other people tell me, too, that it's impossible still doesn't constitute strong evidence.

Second, even if it's impossible today, it doesn't follow that it will be impossible forever. Perhaps Entity Framework will support my Domain Model in the future.

Third, we can't conclude that just because Entity Framework (currently) doesn't support my Domain Model it follows that no object-relational mapper (ORM) does. There might be another ORM out there that perfectly supports my design, but I'm just not aware of it.

Based on my experience and what I see, read, and hear, I don't think any of that likely. Things might change, though.

Net benefit or drawback? #

Perhaps, despite all of this, you still prefer ORMs. You may compare my ADO.NET code to my Entity Framework code and conclude that the EF code still looks simpler. After all, when using ADO.NET I have to jump through some hoops to load the correct tables associated with each restaurant, whereas EF automatically handles that for me. The EF version requires fewer lines of code.

In isolation, the fewer lines of code the better. This seems like an argument for using an ORM after all, even if the original promise remains elusive. Take what you can get.

On the other hand, when you take on a dependency, there's usually a cost that comes along. A library like Entity Framework isn't free. While you don't pay a licence fee for it, it comes with other costs. You have to learn how to use it, and so do your colleagues. You also have to keep up to date with changes.

Every time some exotic requirement comes up, you'll have to spend time investigating how to address it with that ORM's API. This may lead to a game of Whac-A-Mole where every tweak to the ORM leads you further down the rabbit hole, and couples your code tighter with it.

You can only keep up with so many things. What's the best investment of your time, and the time of your team mates? Learning and knowing SQL, or learning and keeping up to date with a particular ORM?

I learned SQL decades ago, and that knowledge is still useful. On the other hand, I don't even know how many library and framework APIs that I've both learned and forgotten about.

As things currently stand, it looks to me as though the net benefit of using a library like Entity Framework is negative. Yes, it might save me a few lines of code, but I'm not ready to pay the costs just outlined.

This balance could tip in the future, or my context may change.

Conclusion #

For the kind of applications that I tend to become involved with, I don't find object-relational mappers particularly useful. When you have a rich Domain Model where the first design priority is encapsulation, assisted by the type system, it looks as though mapping is unavoidable.

While you can ask automated tools to generate code that mirrors a database schema (or the other way around), only classes with poor encapsulation are supported. As soon as you do something out of the ordinary like static factory methods or nested objects, apparently Entity Framework gives up.

Can we extrapolate from Entity Framework to other ORMs? Can we infer that Entity Framework will never be able to support objects with proper encapsulation, just because it currently doesn't?

I can't say, but I'd be surprised if things change soon, if at all. If, on the other hand, it eventually turns out that I can have my cake and eat it too, then why shouldn't I?

Until then, however, I don't find that the benefits of ORMs trump the costs of using them.

Comments

Vlad #

One project I worked on was (among other things) mapping data from database to rich domain objects in the way similar to what is described in this article. These object knew how to do a lot of things but were dependant on related objects and so everything neded to be loaded in advance from the database in order to ensure correctness. So having a Order, OrderLine, Person, Address and City, all the rows needed to be loaded in advance, mapped to objects and references set to create the object graph to be able to, say, display shipping costs based on person's address.

The mapping step involved cumbersome manual coding and was error prone because it was easy to forget to load some list or set some reference. Reflecting on that experience, it seems to me that sacrificing a bit of purity wrt domain modelling and leaning on an ORM to lazily load the references would have been much more efficient and correct.

But I guess it all depends on the context..?

2023-09-19 13:17 UTC

qfilip #

Thanks. I've been following recent posts, but I was too lazy to go through the whole PRing things to reply. Maybe that's a good thing, since it forces you to think how to reply, instead of throwing a bunch of words together quickly. Anyways, back to business.

I'm not trying to sell one way or the other, because I'm seriously conflicted with both. Since most people on the web tend to fall into ORM category (in .NET world at least), I was simply looking for other perspective, from someone more knowledgable than me.

The following is just my thinking out loud...

You've used DB-first approach and scaffolding classes from DB schema. With EF core, the usual thing to do, is the opposite. Write classes to scaffold DB schema. Now, this doesn't save us from writing those "relational properties", but it allows us to generate DB update scripts. So if you have a class like:

class SomeTable
{
    public int Id;
    public string Name;
}

and you add a field:

class SomeTable
{
    public int Id;
    public string Name;
    public DateTime Birthday;
}

you can run

add-migration MyMigration   // generate migration file
update-database             // execute it

This gives you a nice way to track DB chages via Git, but it can also introduce conflicts. Two devs cannot edit the same class/table. You have to be really careful when making commits. Another painful thing to do this way is creating DB views and stored procedures. I've honestly never saw a good solution for it. Maybe trying to do these things is a futile effort in the first place.

The whole

readByNameSql = @"SELECT [Id], [Name], [OpensAt], [LastSeating], [SeatingDuration]...

is giving me heebie jeebies. It is easy to change some column name, and introduce a bug. It might be possible to do stuff with string interpolation, but at that point, I'm thinking about creating my own framework...

The most common reaction is most likely that I'm doing this all wrong. I'm supposed to use the EF classes as my Domain Model. - Mark Seemann

One of the first things that I was taught on my first job, was to never expose my domain model to the outside world. The domain model being EF Core classes... These days, I'm thinking quite the opposite. EF Core classes are DTOs for the database (with some boilerplate in order for framework to do it's magic). I also want to expose my domain model to the outside world. Why not? That's the contract after all. But the problem with this, is that it adds another layer of mapping. Since my domain model validation is done in class constructor, deserialization of requests becomes a problem. Ideally, it should sit in a static method. But in that case I have: jsonDto -> domainModel -> dbDto. The No-ORM approach also still requires me to map domainModel to command parameters manually. All of this is a tedious, and very error prone process. Especially if you have the case like vlad mentioned above.

Minor observation on your code. People rarely map things from DB data to domain models when using EF Core. This is a horrible thing to do. Anyone can run a script against a DB, and corrupt the data. It is something I intend to enforce in future projects, if possible. Thank you F# community.

I can't think of anything more to say at the moment. Thanks again for a fully-fledged-article reply :). I also recommend this video. I haven't had the time to try things he is talking about yet.

2023-09-21 19:27 UTC

Mark Seemann #

Vlad, qfilip, thank you for writing.

I think your comments warrant another article. I'll post an update here later.

2023-09-24 15:57 UTC

qfilip #

Quick update from me again. I've been thinking and experimenting with several approaches to solve issues I've written about above. How idealized world works and where do we make compromises. Forgive me for the sample types, I couldn't think of anything else. Let's assume we have this table:

type Sheikh = {
    // db entity data
    Id: Guid
    CreatedAt: DateTime
    ModifiedAt: DateTime
    Deleted: bool

    // domain data
    Name: string
    Email: string // unique constraint here
    
    // relational data
    Wives: Wife list
    Supercars: Supercar list
}

I've named first 3 fields as "entity data". Why would my domain model contain an ID? It shouldn't care about persistence. I may save it to the DB, write it to a text file, or print it on a piece of paper. Don't care. We put IDs because data usually ends up in a DB. I could have used Email here to serve as an ID, because it should be unique, but we also like to standardize these stuff. All IDs shall be uuids.

There are also these "CreatedAt", "ModifiedAt" and "Deleted" columns. This is something I usually do, when I want soft-delete functionality. Denoramalize the data to gain performance. Otherwise, I would need to make... say... EntityStatus table to keep that data, forcing me to do a JOIN for every read operation and additional UPDATE EntityStatus for every write operation. So I kinda sidestep "the good practices" to avoid very real complications.

Domain data part is what it is, so I can safely skip that part.

Relational data part is the most interesting bit. I think this is what keeps me falling back to EntityFramework and why using "relational properties" are unavoidable. Either that, or I'm missing something.

Focusing attention on Sheikh table here, with just 2 relations, there are 4 potential scenarios. I don't want to load stuff from the DB, unless they are required, so the scenarios are:

Load Sheikh without relational data
Load Sheikh with Wives
Load Sheikh with Supercars
Load Sheikh with Wives and Supercars

2^NRelations I guess? I'm three beers in on this, with only six hours left until corporate clock starts ticking, so my math is probably off.

God forbid if any of these relations have their own "inner relations" you may or may not need to load. This is where the (magic mapping/to SQL translations) really start getting useful. There will be some code repetition, but you'll just need to add ThenInclude(x => ...) and you're done.

Now the flip side. Focusing attention on Supercar table:

type Supercar = {
    // db entity data
    ...

    // domain data
    Vendor: string
    Model: string
    HorsePower: int
    
    // relational data
    Owner: Sheikh
    OwnerId: Guid
}

Pretty much same as before. Sometimes I'll need Sheikh info, sometimes I won't. One of F# specific problems I'm having is that, records require all fields to be populated. What if I need just SheikhID to perform some domain logic?

                    let tuneSheikhCars (sheikhId) (hpIncrement) (cars) =
                        cars
                        |> List.filter (fun x -> x.Owner.Id = sheikhId)
                        |> List.map (fun x -> x with { HorsePower = x.HorsePower + hpIncrement })

Similar goes for inserting new Supercar. I want to query-check first if Owner/Sheikh exists, before attempting insertion. You can pass it as a separate parameter, but code gets messier and messier.

No matter how I twist and turn things around, in the real world, I'm not only concerned by current steps I need to take to complete a task, but also with possible future steps. Now, I could define a record that only contains relevant data per each request. But, as seen above, I'd be eventually forced to make ~ 2^NRelations of such records, instead of one. A reusable one, that serves like a bucket for a preferred persistence mechanism, allowing me to load relations later on, because nothing lives in memory long term.

I strayed away slightly here from ORM vs no-ORM discussion that I've started earlier. Because, now I realize that this problem isn't just about mapping things from type A to type B.

2023-10-08 23:24 UTC

opcoder #

I wonder if EF not having all the features we want isn't a false problem. I feel like we try to use the domain entities as DTOs and viceversa, breaking the SRP principle. But if we start writing DTOs and use them with EF, we would need a layer to map between the DTOs and the entities (AutoMapper might help with this?). I'm sure this has been discussed before.

2023-10-09 6:56 UTC

qfilip #

opcoder not really, no... Automapper(s) should only be used for mapping between two "dumb objects" (DTOs). I wouldn't drag in a library even for that, however, as it's relatively simple to write this thing yourself (with tests) and have full control / zero configuration when you come to a point to create some special projections. As for storing domain models in objects, proper OOP objects, with both data and behaviour, I don't like that either. Single reason for that is: constructors. This is where you pass the data to be validated into a domain model, and this is where OOP has a fatal flaw for me. Constructors can only throw exceptions, giving me no room to maneuver. You can use static methods with ValidationResult as a return type, but now we're entering a territory where C#, as a language, is totally unprepared for.

2023-10-13 17:00 UTC

Iker #

Just my two cents:

Yes, it is possible to map the NaturalNumber object to an E.F class property using ValueConverters. Here are a couple of articles talking about this:

But even though you can use this, you may still encounter another use cases that you cannot tackle. E.F is just a tool with its limitations, and there will be things you can do with simple C# that you can not do with E.F.

I think you need to consider why you want to use E.F, understand its strengths and weaknesses, and then decide if it suits your project.

Do you want to use EF solely as a data access layer, or do you want it to be your domain layer?. Maybe for a big project you can use only E.F as a data access layer and use old plain C# files for domain layer. In a [small | medium | quick & dirty] project use as your domain layer.

There are bad thing we already know:

Increased complexity.
There will be things you can not do. So you must be carefull you will not need something E.F can not give you.
You need to know how it works. For example, know that accessing myRestaurant.MaitreD implies a new database access (if you have not loaded it previously).

But sometimes E.F shines, for example:

You are programing against the E.F model, not against a specific database, so it is easier to migrate to another database.
Maybe migrate to another database is rare, but it is very convenient to run tests against an in-memory SQLite database. Tests against a real database can be run in the CD/CI environment, for example.
Having a centralized point to process changes (SaveChanges) allows you to easily do interesting things: save "CreatedBy," "CreatedDate," "ModifiedBy," and "ModifiedDate" fields for all tables or create historical tables (if you do not have access to the SQL Server temporal tables).
Global query filters allow you to make your application multi-tenant with very little code: all tables implement IByClient, a global filter for theses tables... and voilà, your application becomes multi-client with just a few lines.

I am not a E.F defender, in fact I have a love & hate reletaionship with it. But I believe it is a powerful tool for certain projects. As always, the important thing is to think whether it is the right tool for your specific project :)

2023-10-15 16:43 UTC

Mark Seemann #

Thank you, all, for writing. There's more content in your comments than I can address in one piece, but I've written a follow-up article that engages with some of your points: Domain Model first.

Specifically regarding the point of having to hand-write a lot of code to deal with multiple tables joined in various fashions, I grant that while typing isn't a bottleneck, the more code you add, the greater the risk of bugs. I'm not trying to be dismissive of ORMs as a general tool. If you truly, inescapably, have a relational model, then an ORM seems like a good choice. If so, however, I don't see that you can get good encapsulation at the same time.

And indeed, an important responsibility of a software architect is to consider trade-offs to find a good solution for a particular problem. Sometimes such a solution involves an ORM, but sometimes, it doesn't. In my world, it usually doesn't.

Do I breathe rarefied air, dealing with esoteric problems that mere mortals can never hope to encounter? I don't think so. Rather, I offer the interpretation that I sometimes approach problems in a different way. All I really try to do with these articles is to present to the public the ways I think about problems. I hope, then, that it may inspire other people to consider problems from more than one angle.

Finally, from my various customer engagements I get the impression that people also like ORMs because 'entity classes' look strongly typed. As a counter-argument, I suggest that this may be an illusion.

2023-10-23 06:45 UTC

Published: Monday, 18 September 2023 14:40:00 UTC

Do ORMs reduce the need for mapping? by Mark Seemann

Scope #

The usual suspects #

Database operations without an ORM #

Restaurants schema #

Reading restaurant data without an ORM #

Reading restaurants with Entity Framework #

Encapsulation #

Limitations of Entity Framework #

Net benefit or drawback? #

Conclusion #

Comments

Wish to comment?

Published

Tags