Synchronizing concurrent teams by Mark Seemann
Or, rather: Try not to.
A few months ago I visited a customer and as the day was winding down we got to talk more informally. One of the architects mentioned, in an almost off-hand manner, "we've embarked on a SAFe journey..."
"Yes..?" I responded, hoping that my inflection would sound enough like a question that he'd elaborate.
Unfortunately, I'm apparently sometimes too subtle when dealing with people face-to-face, so I never got to hear just how that 'SAFe journey' was going. Instead, the conversation shifted to the adjacent topic of how to coordinate independent teams.
I told them that, in my opinion, the best way to coordinate independent teams is to not coordinate them. I don't remember exactly how I proceeded from there, but I probably said something along the lines that I consider coordination meetings between teams to be an 'architecture smell'. That the need to talk to other teams was a symptom that teams were too tightly coupled.
I don't remember if I said exactly that, but it would have been in character.
The architect responded: "I don't like silos."
How do you respond to that?
Autonomous teams #
I couldn't very well respond that silos are great. First, it doesn't sound very convincing. Second, it'd be an argument suitable only in a kindergarten. Are not! -Are too! -Not! -Too! etc.
After feeling momentarily checked, for once I managed to think on my feet, so I replied, "I don't suggest that your teams should be isolated from each other. I do encourage people to talk to each other, but I don't think that teams should coordinate much. Rather, think of each team as an organism on the savannah. They interact, and what they do impact others, but in the end they're autonomous life forms. I believe an architect's job is like a ranger's. You can't control the plants or animals, but you can nurture the ecosystem, herding it in a beneficial direction."
That ranger metaphor is an old pet peeve of mine, originating from what I consider one of my most under-appreciated articles: Zookeepers must become Rangers. It's closely related to the more popular metaphor of software architecture as gardening, but I like the wildlife variation because it emphasizes an even more hands-off approach. It removes the illusion that you can control a fundamentally unpredictable process, but replaces it with the hopeful emphasis on stewardship.
How do ecosystems thrive? A software architect (or ranger) should nurture resilience in each subsystem, just like evolution has promoted plants' and animals' ability to survive a variety of unforeseen circumstances: Flood, draught, fire, predators, lack of prey, disease, etc.
You want teams to work independently. This doesn't mean that they work in isolation, but rather they they are free to act according to their abilities and understanding of the situation. An architect can help them understand the wider ecosystem and predict tomorrow's weather, so to speak, but the team should remain autonomous.
Concurrent work #
I'm assuming that an organisation has multiple teams because they're supposed to work concurrently. While team A is off doing one thing, team B is doing something else. You can attempt to herd them in the same general direction, but beware of tight coordination.
What's the problem with coordination? Isn't it a kind of collaboration? Don't we consider that beneficial?
I'm not arguing that teams should be antagonistic. Like all metaphors, we should be careful not to take the savannah metaphor too far. I'm not imagining that one team consists of lions, apex predators, killing and devouring other teams.
Rather, the reason I'm wary of coordination is because it seems synonymous with synchronisation.
In Code That Fits in Your Head I've already discussed how good practices for Continuous Integration are similar to earlier lessons about optimistic concurrency. It recently struck me that we can draw a similar parallel between concurrent team work and parallel computing.
For decades we've known that the less synchronization, the faster parallel code is. Synchronization is costly.
In team work, coordination is like thread synchronization. Instead of doing work, you stop in order to coordinate. This implies that one thread or team has to wait for the other to catch up.
Unless work is perfectly evenly divided, team A may finish before team B. In order to coordinate, team A must sit idle for a while, waiting for B to catch up. (In development organizations, idleness is rarely allowed, so in practice, team A embarks on some other work, with consequences that I've already outlined.)
If you have more than two teams, this phenomenon only becomes worse. You'll have more idle time. This reminds me of Amdahl's law, which briefly put expresses that there's a limit to how much of a speed improvement you can get from concurrent work. The limit is related to the percentage of the work that can not be parallelized. The greater the need to synchronize work, the lower the ceiling. Conversely, the more you can let concurrent processes run without coordination, the more you gain from parallelization.
It seems to me that there's a direct counterpart in team organization. The more teams need to coordinate, the less is gained from having multiple teams.
But really, Fred Brooks could you have told you so in 1975.
Versioning #
A small development team may organize work informally. Work may be divided along 'natural' lines, each developer taking on tasks best suited to his or her abilities. If working in a code base with shared ownership, one developer doesn't have to wait on the work done by another developer. Instead, a programmer may complete the required work individually, or working together with a colleague. Coordination happens, but is both informal and frequent.
As development organizations grow, teams are formed. Separate teams are supposed to work independently, but may in practice often depend on each other. Team A may need team B to make a change before they can proceed with their own work. The (felt) need to coordinate team activities arise.
In my experience, this happens for a number of reasons. One is that teams may be divided along wrong lines; this is a socio-technical problem. Another, more technical, reason is that zookeepers rarely think explicitly about versioning or avoiding breaking changes. Imagine that team A needs team B to develop a new capability. This new capability implies a breaking change, so the teams will now need to coordinate.
Instead, team B should develop the new feature in such a way that it doesn't break existing clients. If all else fails, the new feature must exist side-by-side with the old way of doing things. With Continuous Deployment the new feature becomes available when it's ready. Team A still has to wait for the feature to become available, but no synchronization is required.
Conclusion #
Yet another lesson about thread-safety and concurrent transactions seems to apply to people and processes. Parallel processes should be autonomous, with as little synchronization as possible. The more you coordinate development teams, the more you limit the speed of overall work. This seems to suggest that something akin to Amdahl's law also applies to development organizations.
Instead of coordinating teams, encourage them to exist as autonomous entities, but set things up so that not breaking compatibility is a major goal for each team.