ploeh blog

At the Boundaries, Applications are Not Object-Oriented

Tuesday, 31 May 2011 13:27:11 UTC

My recent series of blog posts about Poka-yoke Design generated a few responses (I would have been disappointed had this not been the case). Quite a few of these reactions relate to various serialization or translation technologies usually employed at application boundaries: Serialization, XML (de)hydration, UI validation, etc. Note that such translation happens not only at the perimeter of the application, but also at the persistence layer. ORMs are also a translation mechanism.

Common to most of the comments is that lots of serialization technologies require the presence of a default constructor. As an example, the XmlSerializer requires a default constructor and public writable properties. Most ORMs I've investigated seem to have the same kind of requirements. Windows Forms and WPF Controls (UI is also an application boundary) also must have default constructors. Doesn't that break encapsulation? Yes and no.

Objects at the Boundary #

It certainly would break encapsulation if you were to expose your (domain) objects directly at the boundary. Consider a simple XML document like this one:

<name>
  <firstName>Mark</firstName>
  <lastName>Seemann</lastName>
</name>

Whether or not we have formal a contract (XSD), we might stipulate that both the firstName and lastName elements are required. However, despite such a contract, I can easily create a document that breaks it:

<name>
  <firstName>Mark</firstName>
</name>

We can't enforce the contract as there's no compilation step involved. We can validate input (and output), but that's a different matter. Exactly because there's no enforcement it's very easy to create malformed input. The same argument can be made for UI input forms and any sort of serialized byte sequence. This is why we must treat all input as suspect.

This isn't a new observation at all. In Patterns of Enterprise Application Architecture, Martin Fowler described this as a Data Transfer Object (DTO). However, despite the name we should realize that DTOs are not really objects at all. This is nothing new either. Back in 2004 Don Box formulated the Four Tenets of Service Orientation. (Yes, I know that they are not in vogue any more and that people wanted to retire them, but some of them still make tons of sense.) Particularly the third tenet is germane to this particular discussion:

Services share schema and contract, not class.

Yes, and that means they are not objects. A DTO is a representation of such a piece of data mapped into an object-oriented language. That still doesn't make them objects in the sense of encapsulation. It would be impossible. Since all input is suspect, we can hardly enforce any invariants at all.

Often, as Craig Stuntz points out in a comment to one of my previous posts, even if the input is invalid, we want to capture what we did receive in order to present a proper error message (this argument also applies on machine-to-machine boundaries). This means that any DTO must have very weak invariants (if any at all).

DTOs don't break encapsulation because they aren't objects at all.

Don't be fooled by your tooling. The .NET framework very, very much wants you to treat DTOs as objects. Code generation ensues.

However, the strong typing provided by such auto-generated classes gives a false sense of security. You may think that you get rapid feedback from the compiler, but there are many possible ways you can get run-time errors (most notably when you forget to update the auto-generated code based on new schema versions).

An even more problematic result of representing input and output as objects is that it tricks lots of developers into dealing with them as though they represent the real object model. The result is invariably an anemic domain model.

More and more, this line of reasoning is leading me towards the conclusion that the DTO mental model that we have gotten used to over the last ten years is a dead end.

What Should Happen at the Boundary #

Given that we write object-oriented code and that data at the boundary is anything but object-oriented, how do we deal with it?

One option is to stick with what we already have. To bridge the gap we must then develop translation layers that can translate the DTOs to properly encapsulated domain objects. This is the route I take with the samples in my book. However, this is a solution that more and more I'm beginning to think may not be the best. It has issues with maintainability. (Incidentally, that's the problem with writing a book: at the time you're done, you know so much more than you did when you started out… Not that I'm denouncing the book - it's just not perfect…)

Another option is to stop treating data as objects and start treating it as the structured data that it really is. It would be really nice if our programming language had a separate concept of structured data… Interestingly, while C# has nothing of the kind, F# has tons of ways to model data structures without behavior. Perhaps that's a more honest approach to dealing with data… I will need to experiment more with this…

A third option is to look towards dynamic types. In his article Cutting Edge: Expando Objects in C# 4.0, Dino Esposito outlines a dynamic approach towards consuming structured data that shortcuts auto-generated code and provides a lightweight API to structured data. This also looks like a promising approach… It doesn't provide compile-time feedback, but that's only a false sense of security anyway. We must resort to unit tests to get rapid feedback, but we're all using TDD already, right?

In summary, my entire series about encapsulation relates to object-oriented programming. Although there are lots of technologies available to represent boundary data as ‘objects', they are false objects. Even if we use an object-oriented language at the boundary, the code has nothing to do with object orientation. Thus, the Poka-yoke Design rules don't apply there.

Now go back and reread this post, but replace ‘DTO' with ‘Entity' (or whatever your ORM calls its representation of a relational table row) and you should begin to see the contours of why ORMs are problematic.

P.S. 2022-05-02. See also At the boundaries, applications aren't functional.

Comments

Marc Gravell #

Absolutely. I spend a lot of time in this space, an while I enjoy using (and maintaining) tools to make this transition simple, whenever there isthe first hint of a problem I always advise "add a dedicated DTO". A similar question (using interfaces in WCF messages) came up on stackoverflow last week. But ultimately, any boundary talks "data", not behaviour. It also emphasises why BinaryFormatter (implementation-aware) sucks so much for ***any*** kind of data exchange.

2011-05-31 14:53 UTC

Phil Sandler #

Arg, had a longer comment but it got eaten when I clicked Save Comment.

Anyway the gist of it: great post, I commpletely agree that data objects aren't domain objects, and thinking of them as structured data is very liberating. Using Entities/DTOs/data objects as a composible part of a domain object is a nice way to create that separation which still allows you to (potentially directly) use ORM entities while avoiding having an anemic domain model. So, for example, CustomerDomain would/could contain a CustomerEntity, instead of trying to add behavior to CustomerEntity (which might be a generated class).

2011-05-31 16:37 UTC

ploeh blog danish software design

Objects at the Boundary #

What Should Happen at the Boundary #

Comments

When are Default Constructors OK? #

Comments

Improved Design #

Comments

Code Smell: Automatic Reference Type Property #

Code Smell: Automatic Value Type Property #

Improved Design: Guard Clause #

Improved Design: Value Object Property #

Comments

Design Smell #

Improved Design #

Comments

Smell Example #

Improvement: Constructor Injection #

Improvement: Abstract Factory #

Comments

Comments

Comments