Name by role by Mark Seemann
Consider naming variables according to their role, instead of their type.
My recent article on good names might leave you with the impression that I consider good names unimportant. Not at all. That article was an attempt at delineating the limits of naming. Good names aren't the panacea some people seem to imply, but they're still important.
As the cliché goes, naming is one of the hardest problems in software development. Perhaps it's hard because you have to do it so frequently. Every time you create a variable, you have to name it. It's also an opportunity to add clarity to a code base.
A common naming strategy is to name objects after their type:
Reservation? reservation = dto.Validate(id);
or:
Restaurant? restaurant = await RestaurantDatabase.GetRestaurant(restaurantId);
There's nothing inherently wrong with a naming scheme like this. It often makes sense. The reservation
variable is a Reservation
object, and there's not that much more to say about it. The same goes for the restaurant
object.
In some contexts, however, objects play specific roles. This is particularly prevalent with primitive types, but can happen to any type of object. It may help the reader if you name the variables according to such roles.
In this article, I'll show you several examples. I hope these examples are so plentiful and varied that they can inspire you to come up with good names. Most of the code shown here is part of the sample code base that accompanies my book Code That Fits in Your Head.
A variable introduced only to be named #
In a recent article I showed this code snippet:
private bool SignatureIsValid(string candidate, ActionExecutingContext context) { var sig = context.HttpContext.Request.Query["sig"]; var receivedSignature = Convert.FromBase64String(sig.ToString()); using var hmac = new HMACSHA256(urlSigningKey); var computedSignature = hmac.ComputeHash(Encoding.ASCII.GetBytes(candidate)); var signaturesMatch = computedSignature.SequenceEqual(receivedSignature); return signaturesMatch; }
Did you wonder about the signaturesMatch
variable? Why didn't I just return the result of SequenceEqual
, like the following?
private bool SignatureIsValid(string candidate, ActionExecutingContext context) { var sig = context.HttpContext.Request.Query["sig"]; var receivedSignature = Convert.FromBase64String(sig.ToString()); using var hmac = new HMACSHA256(urlSigningKey); var computedSignature = hmac.ComputeHash(Encoding.ASCII.GetBytes(candidate)); return computedSignature.SequenceEqual(receivedSignature); }
Visual Studio even offers this as a possible refactoring that it'll do for you.
The inclusion of the signaturesMatch
variable was a conscious decision of mine. I felt that directly returning the result of SequenceEqual
was a bit too implicit. It forces readers to make the inference themselves: Ah, the two arrays contain the same sequence of bytes; that must mean that the signatures match!
Instead of asking readers to do that work themselves, I decided to do it for them. I hope that it improves readability. It doesn't change the behaviour of the code one bit.
Test roles #
When it comes to unit testing, there's plenty of inconsistent terminology. One man's mock object is another woman's test double. Most of the jargon isn't even internally consistent. Do yourself a favour and adopt a consistent pattern language. I use the one presented in xUnit Test Patterns.
For instance, the thing that you're testing is the System Under Test (SUT). This can be a pure function or a static method, but when it's an object, you're going to create a variable. Consider naming it sut. A typical test also defines other variables. Naming one of them sut
clearly identifies which of them is the SUT. It also protects the tests against the class in question being renamed.
[Fact] public void ScheduleSingleReservationCommunalTable() { var table = Table.Communal(12); var sut = new MaitreD( TimeSpan.FromHours(18), TimeSpan.FromHours(21), TimeSpan.FromHours(6), table); var r = Some.Reservation; var actual = sut.Schedule(new[] { r }); var expected = new[] { new TimeSlot(r.At, table.Reserve(r)) }; Assert.Equal(expected, actual); }
The above test follows my AAA.formatting heuristic. In all, it defines five variables, but there can be little doubt about which one is the sut
.
The table
and r
variables follow the mainstream practice of naming variables after their type. They play no special role, so that's okay. You may balk at such a short variable name as r
, and that's okay. In my defence, I follow Clean Code's N5 heuristic for long and short scopes. A variable name like r
is fine when it only spans three lines of code (four, if you also count the blank line).
Consider also using the variable names expected
and actual
, as in the above example. In many unit testing frameworks, those are the argument names for the assertion. For instance, in xUnit.net (which the above test uses) the Assert.Equals
overloads are defined as Equal<T>(T expected, T actual)
. Using these names for variables makes the roles clearer, I think.
The other #
The above assertion relies on structural equality. The TimeSlot
class is immutable, so it can safely override Equals
(and GetHashCode
) to implement structural equality:
public override bool Equals(object? obj) { return obj is TimeSlot other && At == other.At && Tables.SequenceEqual(other.Tables); }
I usually call the downcast variable other
because, from the perspective of the instance, it's the other object. I usually use that convention whenever an instance interacts with another object of the same type. Among other examples, this happens when you model objects as semigroups and monoids. The Angle struct, for example, defines this binary operation:
public Angle Add(Angle other) { return new Angle(this.degrees + other.degrees); }
Again, the method argument is in the role as the other object, so naming it other
seems natural.
Here's another example from a restaurant reservation code base:
public bool Overlaps(Seating other) { if (other is null) throw new ArgumentNullException(nameof(other)); return Start < other.End && other.Start < End; }
The Overlaps
method is an instance method on the Seating
class. Again, other
seems natural.
Candidates #
The Overlaps
method looks like a predicate, i.e. a function that returns a Boolean value. In the case of that method, other
indicates the role of being the other object, but it also plays another role. It makes sense to me to call predicate input candidates. Typically, you have some input that you want to evaluate as either true or false. I think it makes sense to think of such a parameter as a 'truth candidate'. You can see one example of that in the above SignatureIsValid
method.
There, the string
parameter is a candidate
for having a valid signature.
Here's another restaurant-related example:
public bool WillAccept( DateTime now, IEnumerable<Reservation> existingReservations, Reservation candidate) { if (existingReservations is null) throw new ArgumentNullException(nameof(existingReservations)); if (candidate is null) throw new ArgumentNullException(nameof(candidate)); if (candidate.At < now) return false; if (IsOutsideOfOpeningHours(candidate)) return false; var seating = new Seating(SeatingDuration, candidate.At); var relevantReservations = existingReservations.Where(seating.Overlaps); var availableTables = Allocate(relevantReservations); return availableTables.Any(t => t.Fits(candidate.Quantity)); }
Here, the reservation in question is actually not yet a reservation. It might be rejected, so it's a candidate
reservation.
You can also use that name in TryParse
methods, as shown in this article.
Data Transfer Objects #
Another name that I like to use is dto
for Data Transfer Objects (DTOs). The benefit here is that as long as dto
is unambiguous in context, it makes it easier to distinguish between a DTO and the domain model you might want to turn it into:
[HttpPost("restaurants/{restaurantId}/reservations")] public async Task<ActionResult> Post(int restaurantId, ReservationDto dto) { if (dto is null) throw new ArgumentNullException(nameof(dto)); var id = dto.ParseId() ?? Guid.NewGuid(); Reservation? reservation = dto.Validate(id); if (reservation is null) return new BadRequestResult(); var restaurant = await RestaurantDatabase.GetRestaurant(restaurantId); if (restaurant is null) return new NotFoundResult(); return await TryCreate(restaurant, reservation); }
By naming the input parameter dto
, I keep the name reservation
free for the domain object, which ought to be the more important object of the two.
I could have named the input parameter"A Data Transfer Object is one of those objects our mothers told us never to write."
reservationDto
instead of dto
, but that would diminish the 'mental distance' between reservationDto
and reservation
. I like to keep that distance, so that the roles are more explicit.
Time #
You often need to make decisions based on the current time or date. In .NET the return value from DateTime.Now is a DateTime
value. Typical variable names are dateTime
, date
, time
, or dt
, but why not call it now
?
private async Task<ActionResult> TryCreate(Restaurant restaurant, Reservation reservation) { using var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled); var reservations = await Repository.ReadReservations(restaurant.Id, reservation.At); var now = Clock.GetCurrentDateTime(); if (!restaurant.MaitreD.WillAccept(now, reservations, reservation)) return NoTables500InternalServerError(); await Repository.Create(restaurant.Id, reservation).ConfigureAwait(false); scope.Complete(); return Reservation201Created(restaurant.Id, reservation); }
This is the TryCreate
method called by the above Post
method. Here, DateTime.Now
is hidden behind Clock.GetCurrentDateTime()
in order to make execution repeatable, but the idea remains: the variable represents the current time or date, or, with a bit of good will, now
.
Notice that the WillAccept
method (shown above) also uses now
as a parameter name. That value's role is to represent now
as a concept.
When working with time, I also sometimes use the variable names before
and after
. This is mostly useful in integration tests:
[Fact] public async Task GetCurrentYear() { using var api = new LegacyApi(); var before = DateTime.Now; var response = await api.GetCurrentYear(); var after = DateTime.Now; response.EnsureSuccessStatusCode(); var actual = await response.ParseJsonContent<CalendarDto>(); AssertOneOf(before.Year, after.Year, actual.Year); Assert.Null(actual.Month); Assert.Null(actual.Day); AssertLinks(actual); }
While you can inject something like a Clock
dependency in order to make your SUT deterministic, in integration tests you might want to see behaviour when using the system clock. You can often verify such behaviour by surrounding the test's Act phase with two calls to DateTime.Now
. This gives you the time before
and after
the test exercised the SUT.
When you do that, however, be careful with the assertions. If such a test runs at midnight, before
and after
might be two different dates. If it runs on midnight December 31, it might actually be two different years! That's the reason that the test passes as long as the actual.Year
is either of before.Year
and after.Year
.
Invalid values #
While integration tests often test happy paths, unit tests should also exercise error paths. What happens when you supply invalid input to a method? When you write such tests, you can identify the invalid values by naming the variables or parameters accordingly:
[Theory] [InlineData(null)] [InlineData("")] [InlineData("bas")] public async Task PutInvalidId(string invalidId) { var db = new FakeDatabase(); var sut = new ReservationsController( new SystemClock(), new InMemoryRestaurantDatabase(Some.Restaurant), db); var dummyDto = new ReservationDto { At = "2024-06-25 18:19", Email = "colera@example.com", Name = "Cole Aera", Quantity = 2 }; var actual = await sut.Put(invalidId, dummyDto); Assert.IsAssignableFrom<NotFoundResult>(actual); }
Here, the invalid input represent an ID. To indicate that, I called the parameter invalidId
.
The system under test is the Put
method, which takes two arguments:
public Task<ActionResult> Put(string id, ReservationDto dto)
When testing an error path, it's important to keep other arguments well-behaved. In this example, I want to make sure that it's the invalidId
that causes the NotFoundResult
result. Thus, the dto
argument should be as well-behaved as possible, so that it isn't going to be the source of divergence.
Apart from being well-behaved, that object plays no role in the test. It just needs to be there to make the code compile. xUnit Test Patterns calls such an object a Dummy Object, so I named the variable dummyDto
as information to any reader familiar with that pattern language.
Derived class names #
The thrust of all of these examples is that you don't have to name variables after their types. You can extend this line of reasoning to class inheritance. Just because a base class is called Foo
it doesn't mean that you have to call a derived class SomethingFoo
.
This is something of which I have to remind myself. For example, to support integration testing with ASP.NET you'll need a WebApplicationFactory<TEntryPoint>. To override the default DI Container configuration, you'll have to derive from this class and override its ConfigureWebHost
method. In an example I've previously published I didn't spend much time thinking about the class name, so RestaurantApiFactory
it was.
At first, I named the variables of this type factory
, or something equally devoid of information. That bothered me, so instead tried service
, which I felt was an improvement, but still too vapid. I then adopted api
as a variable name, but then realised that that also suggested a better class name. So currently, this defines my self-hosting API:
public sealed class SelfHostedApi : WebApplicationFactory<Startup>
Here's how I use it:
[Fact] public async Task ReserveTableAtNono() { using var api = new SelfHostedApi(); var client = api.CreateClient(); var dto = Some.Reservation.ToDto(); dto.Quantity = 6; var response = await client.PostReservation("Nono", dto); var at = Some.Reservation.At; await AssertRemainingCapacity(client, at, "Nono", 4); await AssertRemainingCapacity(client, at, "Hipgnosta", 10); }
The variable is just called api
, but the reader can tell from the initialisation that this is an instance of the SelfHostedApi
class. I like how that communicates that this is an integration test that uses a self-hosted API. It literally says that.
This test also uses the dto
naming convention. Additionally, you may take note of the variable and property called at
. That's another name for a date and time. I struggled with naming this value, until Karsten Strøbæk suggested that I used the simple word at: reservation.At
indicates the date and time of the reservation without being encumbered by awkward details about date and time. Should we call it date
? time
? dateTime
? No, just call it at
. I find it elegant.
Conclusion #
Sometimes, a Reservation
object is just a reservation
, and that's okay. At other times, it's the actual
value, or the expected
value. If it represents an invalid reservation in a test case, it makes sense to call the variable invalidResevation
.
Giving variables descriptive names improves code quality. You don't have to write comments as apologies for poor readability if a better name communicates what the comment would have said.
Consider naming variables (and classes) for the roles they play, rather than their types.
On the other hand, when variable names are in the way, consider point-free code.
Comments
Excellent name suggestions. Thanks for sharing them :)
The name
other
is good, but I preferthat
because I think it is a better antonym ofthis
(the keyword for the current instance) and because it has the same number of letters asthis
.I typically try to avoid turning some input into either true or false. In particular, I find it confusing for the syntax to say that some instance is a
Reservation
while the semantics says that it "is actually not yet a reservation". I think of this as an example of primitive obsession. Strictly speaking, I think "Primitive Obsession is when the code relies too much on primitives." (aka, on primitive types). In my mind though, I have generalized this to cover any code that relies too much on weaker types. Separate typesReservation
andbool
are weaker than separate typesReservation
andCandidateReservation
. I think Alexis King summarized this well with a blog post titled Parse, don’t validate.And yet, my coworkers and I have engaged in friendly but serious debates for years about which of those two approaches is better. My argument, essentially as given above, is for separate types
Reservation
andCandidateReservation
. The main counterargument is that these types are the same except for a database-generated ID, so just represent both using one type with an optional ID.Have you thought about this before?
I prefer to emphasize the roles even more by using the names
dto
andmodel
. We are in the implementation of the (Post) route for"restaurants/{restaurantId}/reservations"
, so I think it is clear from context that thedto
andmodel
are really a reservation DTO and a reservation model.Tyson, thank you for writing. Certainly, I didn't intent my article to dictate names. As you imply, there's room for both creativity and subjectivity, and that's fine. My suggestions were meant only for inspiration.
Yes; I would think twice before deciding to model a domain type with a database-generated ID. A server-generated ID is an implementation detail that shouldn't escape the data access layer. If it does, you have a leaky abstraction at hand. Sooner or later, it's going to bite you.The
Reservation
class in the above examples has this sole constructor:You can't create an instance without supplying an ID. On the other hand, any code can conjure up a GUID, so no server is required. At the type-level, there's no compelling reason to distinguish between a reservation and a candidate reservation.
Granted, you could define two types,
Reservation
andCandidateReservation
, but they'd be isomorphic. In Haskell, you'd probably use anewtype
for one of these types, and then you're back at Alexis King's blog.Usually, doing things frequently means mastering them pretty quickly. Not so for naming. I guess, there are multiple issues:
Naming variables by their roles is a great idea!
Many thanks for another awesome post, I enjoyed reading it.
Tomas, thank you for writing.
Good point; I hadn't thought about that. I think that the reasons you list are valid.As an additional observation, it may be that there's a connection to the notion of deliberate practice. As the catch-phrase about professional experience puts it, there's a difference between 20 years of experience and one year of experience repeated 20 times.
Doing a thing again and again generates little improvement if one does it by rote. One has to deliberately practice. In this case, it implies that a programmer should explicitly reflect on variable names, and consider more than one option.
I haven't met many software developers who do that.