Microservices: Back to the Future

Andrew Hagedorn, June 2020

Earlier in 2020 a talk made the rounds about how Segment had moved towards microservices and ultimately reverted back to a monolith. This resonated with me. Not just because in 2014 this happened to me on a much smaller scale at Zocdoc, but because when my team moved successfully towards a microservice architecture in 2016 it was not a foregone conclusion that it would be a success. Could a few choices made differently have put us in the same boat?

Microservices Boomerang

In 2014 our cycle time in the monolith was increasing, our development team was growing, and we had a relatively new platform team looking to make its mark. These were the perfect ingredients for a project to move to microservices. The solution left a lot to be desired:

A mostly homegrown framework with a lack of tooling
Custom system for network communication
Co-mingling of the service and monolith data
No strategy for fake or mock services; you had to run everything all the time

After a very long time the larger team got a chance to start extracting functionality from our monolith, and it was not pretty. We quickly realized we were building a distributed monolith and killed the project entirely. In retrospect, despite the conceptual failures in our approach, the structural issues were more problematic. We were in a data center with very little flexibility, which hamstrung our ability to truly innovate on our platform and roll out new infrastructure pieces. Even had we done everything right, we would not have been able to scale the microservice platform far enough to actually be successful.

A Second Attempt

A year or two later our technical landscape had shifted in some ways, but in other ways it was the same. The structural issues were removed by moving to the cloud, but our monolith was still a drain on our productivity despite continued investment in making it easier to work with. The time was ripe for another attempt to move towards microservices, but we wanted to learn from our recent past. The key learnings we had were:

Strong Contracts: Instead of an ambiguous custom transfer layer we used documented, explicit contracts over HTTP between services. This meant that there was no guesswork in how to communicate with a service and also allowed us to easily generate clients in different languages.
Isolation: Services were isolated in multiple ways. First, they did not share access to their data, so there was no chance of an implicit coupling via a database. Second, strong contracts allowed us to give each service its own CI and deploy pipeline. Third, within those CI pipelines each service only interacts with mock or fake versions of its dependencies, so the codebases can be developed independently.
Infrastructure as Code: For reproducibility and documentation we used Ansible and CloudFormation to generate our AWS resources and deploy new versions of our code.
Containers: The unit of deployment would be containers instead of a zipped set of files
Testing in Production: We did not invest in creating a true-to-life staging environment and did not attempt to fully integrate our systems outside of production. This certainly adds risk around deployments, but we felt this would be a more tractable problem and would provide more value long term.
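
As a rough illustration of how strong contracts and isolation fit together, here is a minimal sketch in Python. It is not our actual code: the scheduling service, the AppointmentRequest and AppointmentResponse types, and every field name are hypothetical, chosen only to show the shape of the idea. The contract is an explicit, typed request and response, and the consuming code depends only on that contract, so in CI a fake can stand in for the real HTTP client.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class AppointmentRequest:
    """Hypothetical request contract; field names are illustrative only."""
    patient_id: str
    provider_id: str
    start_time_iso: str


@dataclass(frozen=True)
class AppointmentResponse:
    """Hypothetical response contract returned by the scheduling service."""
    appointment_id: str
    status: str


class SchedulingClient(Protocol):
    """What callers depend on: the contract, not a concrete transport."""

    def book(self, request: AppointmentRequest) -> AppointmentResponse:
        ...


class FakeSchedulingClient:
    """In-memory stand-in used in CI instead of the real HTTP-backed client."""

    def __init__(self):
        # Records every request so tests can assert on what was sent.
        self.booked = []

    def book(self, request: AppointmentRequest) -> AppointmentResponse:
        self.booked.append(request)
        return AppointmentResponse(
            appointment_id=f"fake-{len(self.booked)}",
            status="confirmed",
        )


def book_and_describe(client: SchedulingClient, request: AppointmentRequest) -> str:
    """Example consumer: works with any client that satisfies the contract."""
    response = client.book(request)
    return f"{response.appointment_id}:{response.status}"
```

In production the same book_and_describe function would be handed a client that makes the real HTTP call. In the service's own pipeline it gets FakeSchedulingClient, so nothing else needs to be running.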

However, just having those learnings was not enough to ensure that we were successful. The things that were most impactful were around two points:

We got our key service boundaries roughly correct
We didn't over-index on the micro in microservices

After some time we realized that we did not have the right level of monitoring or observability and needed to reinvest in the platform aspects of the system, but these only add value when they are layered on top of a codebase that is workable. Had we gotten either of those two points wrong, we could have drowned in either operational overhead or a codebase so fragmented that it was significantly harder to work with.

It isn't all roses...

So we have microservices and a monolith now and teams are able to work productively and independently. However, as I look back on that time period I think we made some key strategic mistakes.

First, while using containerized applications did open many doors, one of those doors was the ability to use many different languages. This has added a significant amount of friction when you move through different parts of the codebase and means that we often have to develop the same platform components many times instead of just once. There is an argument that some languages provide significant value in certain use cases, but I would argue that in most cases having a coherent codebase is the better option.

Language drift was also symptomatic of the second key mistake: technology skew. This time period was the organization's introduction to AWS, and without any guardrails the team used it to its fullest. At times this manifested in architectural complexity without any real reason, and at other times it manifested in the selection of duplicative technologies. Instead of refining a small set of technologies we had a great deal of expertise in, we ended up owning a large set of often overlapping ones. Similar to language drift, general technological drift meant that we diluted our expertise as a larger team.

Finally, we followed the hype train a little too closely. Whether it was CQRS, event sourcing, or the latest React trend we more often than not used a technology or pattern because it was cool rather than because it fit our use case.

What would I do today?

A fun thought exercise is to ask where the business and technical organization would be if we had never moved to microservices. The key questions are:

Should we have moved to microservices?
How could we have changed our approach to be in a better place today?
Did we provide value to the business, or just to our resumes?

Ultimately, I think the answers are probably that yes, we should have moved to microservices, and that we are better able to provide value to the business because of it. However, I would have taken a more deliberate approach rather than something that more closely resembled the wild west. Had there been a stronger shared strategy and vision across the technology team, with a solid technical review process, we would have avoided a lot of accidental complexity in our architecture and technology stack.
