Consider naming variables according to their role, instead of their type.

My recent article on good names might leave you with the impression that I consider good names unimportant. Not at all. That article was an attempt at delineating the limits of naming. Good names aren't the panacea some people seem to imply, but they're still important.

As the cliché goes, naming is one of the hardest problems in software development. Perhaps it's hard because you have to do it so frequently. Every time you create a variable, you have to name it. It's also an opportunity to add clarity to a code base.

A common naming strategy is to name objects after their type:

Reservation? reservation = dto.Validate(id);

or:

Restaurant? restaurant = await RestaurantDatabase.GetRestaurant(restaurantId);

There's nothing inherently wrong with a naming scheme like this. It often makes sense. The reservation variable is a Reservation object, and there's not that much more to say about it. The same goes for the restaurant object.

In some contexts, however, objects play specific roles. This is particularly prevalent with primitive types, but can happen to any type of object. It may help the reader if you name the variables according to such roles.

In this article, I'll show you several examples. I hope these examples are so plentiful and varied that they can inspire you to come up with good names. Most of the code shown here is part of the sample code base that accompanies my book Code That Fits in Your Head.

A variable introduced only to be named #

In a recent article I showed this code snippet:

private bool SignatureIsValid(string candidate, ActionExecutingContext context)
{
    var sig = context.HttpContext.Request.Query["sig"];
    var receivedSignature = Convert.FromBase64String(sig.ToString());
 
    using var hmac = new HMACSHA256(urlSigningKey);
    var computedSignature = hmac.ComputeHash(Encoding.ASCII.GetBytes(candidate));
 
    var signaturesMatch = computedSignature.SequenceEqual(receivedSignature);
    return signaturesMatch;
}

Did you wonder about the signaturesMatch variable? Why didn't I just return the result of SequenceEqual, like the following?

private bool SignatureIsValid(string candidate, ActionExecutingContext context)
{
    var sig = context.HttpContext.Request.Query["sig"];
    var receivedSignature = Convert.FromBase64String(sig.ToString());
 
    using var hmac = new HMACSHA256(urlSigningKey);
    var computedSignature = hmac.ComputeHash(Encoding.ASCII.GetBytes(candidate));
 
    return computedSignature.SequenceEqual(receivedSignature);
}

Visual Studio even offers this as a possible refactoring that it'll do for you.

The inclusion of the signaturesMatch variable was a conscious decision of mine. I felt that directly returning the result of SequenceEqual was a bit too implicit. It forces readers to make the inference themselves: Ah, the two arrays contain the same sequence of bytes; that must mean that the signatures match!

Instead of asking readers to do that work themselves, I decided to do it for them. I hope that it improves readability. It doesn't change the behaviour of the code one bit.

Test roles #

When it comes to unit testing, there's plenty of inconsistent terminology. One man's mock object is another woman's test double. Most of the jargon isn't even internally consistent. Do yourself a favour and adopt a consistent pattern language. I use the one presented in xUnit Test Patterns.

For instance, the thing that you're testing is the System Under Test (SUT). This can be a pure function or a static method, but when it's an object, you're going to create a variable. Consider naming it sut. A typical test also defines other variables. Naming one of them sut clearly identifies which of them is the SUT. It also protects the tests against the class in question being renamed.

[Fact]
public void ScheduleSingleReservationCommunalTable()
{
    var table = Table.Communal(12);
    var sut = new MaitreD(
        TimeSpan.FromHours(18),
        TimeSpan.FromHours(21),
        TimeSpan.FromHours(6),
        table);
 
    var r = Some.Reservation;
    var actual = sut.Schedule(new[] { r });
 
    var expected = new[] { new TimeSlot(r.At, table.Reserve(r)) };
    Assert.Equal(expected, actual);
}

The above test follows my AAA.formatting heuristic. In all, it defines five variables, but there can be little doubt about which one is the sut.

The table and r variables follow the mainstream practice of naming variables after their type. They play no special role, so that's okay. You may balk at such a short variable name as r, and that's okay. In my defence, I follow Clean Code's N5 heuristic for long and short scopes. A variable name like r is fine when it only spans three lines of code (four, if you also count the blank line).

Consider also using the variable names expected and actual, as in the above example. In many unit testing frameworks, those are the argument names for the assertion. For instance, in xUnit.net (which the above test uses) the Assert.Equals overloads are defined as Equal<T>(T expected, T actual). Using these names for variables makes the roles clearer, I think.

The other #

The above assertion relies on structural equality. The TimeSlot class is immutable, so it can safely override Equals (and GetHashCode) to implement structural equality:

public override bool Equals(object? obj)
{
    return obj is TimeSlot other &&
           At == other.At &&
           Tables.SequenceEqual(other.Tables);
}

I usually call the downcast variable other because, from the perspective of the instance, it's the other object. I usually use that convention whenever an instance interacts with another object of the same type. Among other examples, this happens when you model objects as semigroups and monoids. The Angle struct, for example, defines this binary operation:

public Angle Add(Angle other)
{
    return new Angle(this.degrees + other.degrees);
}

Again, the method argument is in the role as the other object, so naming it other seems natural.

Here's another example from a restaurant reservation code base:

public bool Overlaps(Seating other)
{
    if (other is null)
        throw new ArgumentNullException(nameof(other));
 
    return Start < other.End && other.Start < End;
}

The Overlaps method is an instance method on the Seating class. Again, other seems natural.

Candidates #

The Overlaps method looks like a predicate, i.e. a function that returns a Boolean value. In the case of that method, other indicates the role of being the other object, but it also plays another role. It makes sense to me to call predicate input candidates. Typically, you have some input that you want to evaluate as either true or false. I think it makes sense to think of such a parameter as a 'truth candidate'. You can see one example of that in the above SignatureIsValid method.

There, the string parameter is a candidate for having a valid signature.

Here's another restaurant-related example:

public bool WillAccept(
    DateTime now,
    IEnumerable<Reservation> existingReservations,
    Reservation candidate)
{
    if (existingReservations is null)
        throw new ArgumentNullException(nameof(existingReservations));
    if (candidate is null)
        throw new ArgumentNullException(nameof(candidate));
    if (candidate.At < now)
        return false;
    if (IsOutsideOfOpeningHours(candidate))
        return false;
 
    var seating = new Seating(SeatingDuration, candidate.At);
    var relevantReservations =
        existingReservations.Where(seating.Overlaps);
    var availableTables = Allocate(relevantReservations);
    return availableTables.Any(t => t.Fits(candidate.Quantity));
}

Here, the reservation in question is actually not yet a reservation. It might be rejected, so it's a candidate reservation.

You can also use that name in TryParse methods, as shown in this article.

Data Transfer Objects #

Another name that I like to use is dto for Data Transfer Objects (DTOs). The benefit here is that as long as dto is unambiguous in context, it makes it easier to distinguish between a DTO and the domain model you might want to turn it into:

[HttpPost("restaurants/{restaurantId}/reservations")]
public async Task<ActionResult> Post(int restaurantId, ReservationDto dto)
{
    if (dto is null)
        throw new ArgumentNullException(nameof(dto));
 
    var id = dto.ParseId() ?? Guid.NewGuid();
    Reservation? reservation = dto.Validate(id);
    if (reservation is null)
        return new BadRequestResult();
 
    var restaurant = await RestaurantDatabase.GetRestaurant(restaurantId);
    if (restaurant is null)
        return new NotFoundResult();
 
    return await TryCreate(restaurant, reservation);
}

By naming the input parameter dto, I keep the name reservation free for the domain object, which ought to be the more important object of the two.

"A Data Transfer Object is one of those objects our mothers told us never to write."

I could have named the input parameter reservationDto instead of dto, but that would diminish the 'mental distance' between reservationDto and reservation. I like to keep that distance, so that the roles are more explicit.

Time #

You often need to make decisions based on the current time or date. In .NET the return value from DateTime.Now is a DateTime value. Typical variable names are dateTime, date, time, or dt, but why not call it now?

private async Task<ActionResult> TryCreate(Restaurant restaurant, Reservation reservation)
{
    using var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled);
 
    var reservations = await Repository.ReadReservations(restaurant.Id, reservation.At);
    var now = Clock.GetCurrentDateTime();
    if (!restaurant.MaitreD.WillAccept(now, reservations, reservation))
        return NoTables500InternalServerError();
 
    await Repository.Create(restaurant.Id, reservation).ConfigureAwait(false);
 
    scope.Complete();
 
    return Reservation201Created(restaurant.Id, reservation);
}

This is the TryCreate method called by the above Post method. Here, DateTime.Now is hidden behind Clock.GetCurrentDateTime() in order to make execution repeatable, but the idea remains: the variable represents the current time or date, or, with a bit of good will, now.

Notice that the WillAccept method (shown above) also uses now as a parameter name. That value's role is to represent now as a concept.

When working with time, I also sometimes use the variable names before and after. This is mostly useful in integration tests:

[Fact]
public async Task GetCurrentYear()
{
    using var api = new LegacyApi();
 
    var before = DateTime.Now;
    var response = await api.GetCurrentYear();
    var after = DateTime.Now;
 
    response.EnsureSuccessStatusCode();
    var actual = await response.ParseJsonContent<CalendarDto>();
    AssertOneOf(before.Year, after.Year, actual.Year);
    Assert.Null(actual.Month);
    Assert.Null(actual.Day);
    AssertLinks(actual);
}

While you can inject something like a Clock dependency in order to make your SUT deterministic, in integration tests you might want to see behaviour when using the system clock. You can often verify such behaviour by surrounding the test's Act phase with two calls to DateTime.Now. This gives you the time before and after the test exercised the SUT.

When you do that, however, be careful with the assertions. If such a test runs at midnight, before and after might be two different dates. If it runs on midnight December 31, it might actually be two different years! That's the reason that the test passes as long as the actual.Year is either of before.Year and after.Year.

Invalid values #

While integration tests often test happy paths, unit tests should also exercise error paths. What happens when you supply invalid input to a method? When you write such tests, you can identify the invalid values by naming the variables or parameters accordingly:

[Theory]
[InlineData(null)]
[InlineData("")]
[InlineData("bas")]
public async Task PutInvalidId(string invalidId)
{
    var db = new FakeDatabase();
    var sut = new ReservationsController(
        new SystemClock(),
        new InMemoryRestaurantDatabase(Some.Restaurant),
        db);
 
    var dummyDto = new ReservationDto
    {
        At = "2024-06-25 18:19",
        Email = "colera@example.com",
        Name = "Cole Aera",
        Quantity = 2
    };
    var actual = await sut.Put(invalidId, dummyDto);
 
    Assert.IsAssignableFrom<NotFoundResult>(actual);
}

Here, the invalid input represent an ID. To indicate that, I called the parameter invalidId.

The system under test is the Put method, which takes two arguments:

public Task<ActionResult> Put(string id, ReservationDto dto)

When testing an error path, it's important to keep other arguments well-behaved. In this example, I want to make sure that it's the invalidId that causes the NotFoundResult result. Thus, the dto argument should be as well-behaved as possible, so that it isn't going to be the source of divergence.

Apart from being well-behaved, that object plays no role in the test. It just needs to be there to make the code compile. xUnit Test Patterns calls such an object a Dummy Object, so I named the variable dummyDto as information to any reader familiar with that pattern language.

Derived class names #

The thrust of all of these examples is that you don't have to name variables after their types. You can extend this line of reasoning to class inheritance. Just because a base class is called Foo it doesn't mean that you have to call a derived class SomethingFoo.

This is something of which I have to remind myself. For example, to support integration testing with ASP.NET you'll need a WebApplicationFactory<TEntryPoint>. To override the default DI Container configuration, you'll have to derive from this class and override its ConfigureWebHost method. In an example I've previously published I didn't spend much time thinking about the class name, so RestaurantApiFactory it was.

At first, I named the variables of this type factory, or something equally devoid of information. That bothered me, so instead tried service, which I felt was an improvement, but still too vapid. I then adopted api as a variable name, but then realised that that also suggested a better class name. So currently, this defines my self-hosting API:

public sealed class SelfHostedApi : WebApplicationFactory<Startup>

Here's how I use it:

[Fact]
public async Task ReserveTableAtNono()
{
    using var api = new SelfHostedApi();
    var client = api.CreateClient();
    var dto = Some.Reservation.ToDto();
    dto.Quantity = 6;
 
    var response = await client.PostReservation("Nono", dto);
 
    var at = Some.Reservation.At;
    await AssertRemainingCapacity(client, at, "Nono", 4);
    await AssertRemainingCapacity(client, at, "Hipgnosta", 10);
}

The variable is just called api, but the reader can tell from the initialisation that this is an instance of the SelfHostedApi class. I like how that communicates that this is an integration test that uses a self-hosted API. It literally says that.

This test also uses the dto naming convention. Additionally, you may take note of the variable and property called at. That's another name for a date and time. I struggled with naming this value, until Karsten Strøbæk suggested that I used the simple word at: reservation.At indicates the date and time of the reservation without being encumbered by awkward details about date and time. Should we call it date? time? dateTime? No, just call it at. I find it elegant.

Conclusion #

Sometimes, a Reservation object is just a reservation, and that's okay. At other times, it's the actual value, or the expected value. If it represents an invalid reservation in a test case, it makes sense to call the variable invalidResevation.

Giving variables descriptive names improves code quality. You don't have to write comments as apologies for poor readability if a better name communicates what the comment would have said.

Consider naming variables (and classes) for the roles they play, rather than their types.

On the other hand, when variable names are in the way, consider point-free code.


Comments

Excellent name suggestions. Thanks for sharing them :)

I usually call the downcast variable other because, from the perspective of the instance, it's the other object. I usually use that convention whenever an instance interacts with another object of the same type.

The name other is good, but I prefer that because I think it is a better antonym of this (the keyword for the current instance) and because it has the same number of letters as this.

It makes sense to me to call predicate input candidates. Typically, you have some input that you want to evaluate as either true or false. I think it makes sense to think of such a parameter as a 'truth candidate'. You can see one example of that in the above SignatureIsValid method.

...

Here, the reservation in question is actually not yet a reservation. It might be rejected, so it's a candidate reservation.

I typically try to avoid turning some input into either true or false. In particular, I find it confusing for the syntax to say that some instance is a Reservation while the semantics says that it "is actually not yet a reservation". I think of this as an example of primitive obsession. Strictly speaking, I think "Primitive Obsession is when the code relies too much on primitives." (aka, on primitive types). In my mind though, I have generalized this to cover any code that relies too much on weaker types. Separate types Reservation and bool are weaker than separate types Reservation and CandidateReservation. I think Alexis King summarized this well with a blog post titled Parse, don’t validate.

And yet, my coworkers and I have engaged in friendly but serious debates for years about which of those two approaches is better. My argument, essentially as given above, is for separate types Reservation and CandidateReservation. The main counterargument is that these types are the same except for a database-generated ID, so just represent both using one type with an optional ID.

Have you thought about this before?

By naming the input parameter dto, I keep the name reservation free for the domain object, which ought to be the more important object of the two.

...

I could have named the input parameter reservationDto instead of dto, but that would diminish the 'mental distance' between reservationDto and reservation. I like to keep that distance, so that the roles are more explicit.

I prefer to emphasize the roles even more by using the names dto and model. We are in the implementation of the (Post) route for "restaurants/{restaurantId}/reservations", so I think it is clear from context that the dto and model are really a reservation DTO and a reservation model.

2020-11-30 20:46 UTC

Tyson, thank you for writing. Certainly, I didn't intent my article to dictate names. As you imply, there's room for both creativity and subjectivity, and that's fine. My suggestions were meant only for inspiration.

The main counterargument is that these types are the same except for a database-generated ID, so just represent both using one type with an optional ID.

Have you thought about this before?

Yes; I would think twice before deciding to model a domain type with a database-generated ID. A server-generated ID is an implementation detail that shouldn't escape the data access layer. If it does, you have a leaky abstraction at hand. Sooner or later, it's going to bite you.

The Reservation class in the above examples has this sole constructor:

public Reservation(Guid id, DateTime at, Email email, Name name, int quantity)

You can't create an instance without supplying an ID. On the other hand, any code can conjure up a GUID, so no server is required. At the type-level, there's no compelling reason to distinguish between a reservation and a candidate reservation.

Granted, you could define two types, Reservation and CandidateReservation, but they'd be isomorphic. In Haskell, you'd probably use a newtype for one of these types, and then you're back at Alexis King's blog.

2020-12-02 7:43 UTC
...naming is one of the hardest problems in software development. Perhaps it's hard because you have to do it so frequently.

Usually, doing things frequently means mastering them pretty quickly. Not so for naming. I guess, there are multiple issues:

  1. Words are ambiguous. The key is, not to do naming in isolation, the context matters. For example, it's difficult to come up with a good name for a method when we don't have a good name for its class, the whole component, etc. Similar with Clean Code's N5: the meaning of a short variable is clear in a small scope, closed context.
  2. Good naming requires deep understanding of the domain. Developers are usualy not good at the business they model. Sadly, it often means "necessary evil" for them.

Naming variables by their roles is a great idea!

Many thanks for another awesome post, I enjoyed reading it.

2020-12-11 09:11 UTC

Tomas, thank you for writing.

doing things frequently means mastering them pretty quickly. Not so for naming. I guess

Good point; I hadn't thought about that. I think that the reasons you list are valid.

As an additional observation, it may be that there's a connection to the notion of deliberate practice. As the catch-phrase about professional experience puts it, there's a difference between 20 years of experience and one year of experience repeated 20 times.

Doing a thing again and again generates little improvement if one does it by rote. One has to deliberately practice. In this case, it implies that a programmer should explicitly reflect on variable names, and consider more than one option.

I haven't met many software developers who do that.

2020-12-15 9:19 UTC


Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 30 November 2020 06:31:00 UTC

Tags



"Our team wholeheartedly endorses Mark. His expert service provides tremendous value."
Hire me!
Published: Monday, 30 November 2020 06:31:00 UTC