Helix Engineering: Our approach to microservices, part 1
Microservices have many benefits over monolithic applications for engineering groups that build large-scale distributed systems—but they have many pitfalls, too. On our journey to a microservice architecture, the Helix Engineering team has dealt with many of these pitfalls, learned important lessons, and developed a world-class platform that we’re extremely proud to call our own. In this post, I’ll give you an introduction to our architecture; in future posts, we’ll dive a little deeper into different aspects of it and where we’re headed next.
Microservices have a lot of hype behind them, but they’re not a panacea. A monolith is much simpler operationally and often conceptually. But a monolith suffers from several fundamental problems that get worse at scale:
- More brittle; every change can, in theory, break everything
- Dependency hell
- Difficult to find the boundaries
- More difficult to understand
- Changes are riskier
- Deployment is slower and needs much more coordination between teams
- Testing is slower
I’m a big proponent of components at any level. I liked classes when I learned object-oriented programming, but then I saw how they can be abused. Later I discovered interfaces and binary components, which solved many problems in desktop application development. On the front end, component-based frameworks finally became dominant after a couple of decades of global head-scratching. But on the server side, there was no good solution at the architectural level. Microservices bring components to the backend, all the way from development to deployment. Containers and the DevOps movement made a big difference here, making it possible to apply agile techniques and fast iterations from ideation to deployment.
Microservices at Helix
At Helix, we use Go to program our platform backend services. We may have the odd service implemented in Scala or Python, but Go is predominantly our Go-to (see what I did there?) language. Some of the benefits of Go are:
- Blazing-fast compilation
- Great performance
- Strong typing
- Statically linked executables (no runtime dependencies)
Note that we have a whole separate bioinformatics software pipeline that is primarily Python-based, but that will be a subject for another blog post.
Our platform services usually have a single purpose, but some services grew organically, do a lot of work, and stretch the concept of “micro.” Most of our services have a dedicated relational database where they persist their state. Services never share a database.
The vast majority of our services are internal and just talk to each other. A select few services expose public APIs. For example, the Helix Marketplace API is the backend service that supports the Helix.com web site and store, the Helix Genomics API provides authenticated, authorized, and secure access for partner applications to user genomics data, and the Helix Partner API provides status and notifications to partners about user interactions with the Helix platform.
The services rely on a shared Go library developed at Helix. The library wraps several open-source libraries and provides general-purpose functionality as well as standard ways to accomplish tasks such as registering and exposing REST endpoints, environment-specific configuration, logging, authentication and database helpers.
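To make the idea of a shared library more concrete, here is a minimal sketch of what such a wrapper could look like. It is not the actual Helix library: the names `Config`, `LoadConfig`, `Service`, `Handle`, and the `APP_ENV` variable are all assumptions made up for illustration, and it uses only the Go standard library rather than the open-source packages the real one wraps.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"os"
)

// Config holds environment-specific settings (hypothetical fields).
type Config struct {
	Env      string
	LogLevel string
	Addr     string
}

// LoadConfig returns settings for the named environment and falls back
// to development defaults for unknown or empty names.
func LoadConfig(env string) Config {
	switch env {
	case "production":
		return Config{Env: env, LogLevel: "info", Addr: ":8080"}
	case "staging":
		return Config{Env: env, LogLevel: "debug", Addr: ":8080"}
	default:
		return Config{Env: "development", LogLevel: "debug", Addr: "localhost:8080"}
	}
}

// Service bundles what every service needs: config, a logger, a router.
type Service struct {
	Name   string
	Config Config
	Logger *log.Logger
	mux    *http.ServeMux
}

// NewService creates a service with a prefixed logger and an empty router.
func NewService(name string, cfg Config) *Service {
	return &Service{
		Name:   name,
		Config: cfg,
		Logger: log.New(os.Stderr, "["+name+"] ", log.LstdFlags),
		mux:    http.NewServeMux(),
	}
}

// Handle registers a REST endpoint for a single HTTP method and logs
// every request that reaches it.
func (s *Service) Handle(method, path string, h http.HandlerFunc) {
	s.mux.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
		if r.Method != method {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		s.Logger.Printf("%s %s", r.Method, r.URL.Path)
		h(w, r)
	})
}

// ServeHTTP lets a Service be used anywhere an http.Handler is expected.
func (s *Service) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	s.mux.ServeHTTP(w, r)
}

// newDemoService wires up a service with one health endpoint.
func newDemoService() *Service {
	svc := NewService("demo", LoadConfig(os.Getenv("APP_ENV")))
	svc.Handle(http.MethodGet, "/health", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]string{"status": "ok", "env": svc.Config.Env})
	})
	return svc
}

// do runs one in-memory request against h and returns status and body.
func do(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := do(newDemoService(), http.MethodGet, "/health")
	fmt.Println(status, body)
}
```

The point of a wrapper like this is that every service registers endpoints, reads configuration, and logs in the same way, so moving between codebases involves less relearning.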
The whole premise of microservices is that a service should be small and focused on one task. But how small, exactly? There is a point of diminishing returns. The goal is to build a complex system that works as a whole, not an eclectic collection of beautiful microservices where each one is pure but putting them together and maintaining them is a nightmare.
At Helix, we follow the principle that microservices own their data. Each service has its own relational database. That means that the data (especially relational data) can guide us in dividing our system into microservices. If, in your data model, you define a set of tables with some relationships between them, you probably want all the code that needs to access and modify this data to live in a single microservice.
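The ownership rule can be sketched in a few lines of Go. Everything here is hypothetical: `profileService`, `shippingService`, and `ProfileAPI` are invented names, and two maps stand in for related tables in a service's dedicated relational database. The sketch only shows the shape of the boundary, where one service holds the related data and others reach it through an API.

```go
package main

import (
	"errors"
	"fmt"
)

// ProfileAPI is the only way other services reach profile data.
type ProfileAPI interface {
	ShippingLabel(userID int) (string, error)
}

// profileService owns two related "tables". The maps stand in for the
// service's dedicated relational database.
type profileService struct {
	users     map[int]string // user id -> name
	addresses map[int]string // user id -> address (refers to users)
}

func newProfileService() *profileService {
	return &profileService{
		users:     map[int]string{},
		addresses: map[int]string{},
	}
}

// AddUser writes both related tables together, which is why this code
// lives in the service that owns them.
func (p *profileService) AddUser(id int, name, address string) {
	p.users[id] = name
	p.addresses[id] = address
}

var errNotFound = errors.New("user not found")

// ShippingLabel joins the two tables and returns only the result.
func (p *profileService) ShippingLabel(userID int) (string, error) {
	name, ok := p.users[userID]
	if !ok {
		return "", errNotFound
	}
	return name + ", " + p.addresses[userID], nil
}

// shippingService never touches the maps; it depends only on the API.
type shippingService struct {
	profiles ProfileAPI
}

func (s shippingService) Dispatch(userID int) (string, error) {
	label, err := s.profiles.ShippingLabel(userID)
	if err != nil {
		return "", fmt.Errorf("dispatch: %w", err)
	}
	return "shipping to " + label, nil
}

func main() {
	profiles := newProfileService()
	profiles.AddUser(1, "Ada", "1 Main St")

	out, err := shippingService{profiles: profiles}.Dispatch(1)
	fmt.Println(out, err)
}
```

In a real deployment the `ProfileAPI` call would cross the network, but the dependency direction is the same: the consumer knows the interface, not the schema.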
Now, this is not that easy because some data models are large and complicated, which leads to big microservices. Furthermore, it’s not always clear whether code that needs access to a particular piece of data should live in the service responsible for the data or should access it through an API. But sometimes it’s an easier decision. Some microservices provide general-purpose functionality to many other services, and it’s pretty clear that you want to bundle all of that functionality together. For example, we have an email service at Helix that wraps our external marketing email vendor and allows other services to send individual emails or batch emails. This service is used by seven other services. This is a great example of identifying a piece of standalone functionality and packaging it in a microservice that is solely responsible for providing it to a large crowd of consuming services.
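As a rough illustration of the email service pattern, the sketch below wraps a vendor behind a small interface and offers single and batch sends. The `Vendor` interface, the `Message` type, and the method names are assumptions for the example and say nothing about the real service or the vendor's actual API. A `fakeVendor` stands in for the external provider so the code runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a single outgoing email (hypothetical shape).
type Message struct {
	To, Subject, Body string
}

// Vendor abstracts the external email provider (hypothetical interface).
type Vendor interface {
	Deliver(m Message) error
}

// EmailService is what consuming services call. Only this service
// knows which vendor sits behind it.
type EmailService struct {
	vendor Vendor
}

// Send validates one message and hands it to the vendor.
func (s *EmailService) Send(m Message) error {
	if m.To == "" {
		return errors.New("email: missing recipient")
	}
	return s.vendor.Deliver(m)
}

// SendBatch attempts every message, returning how many were delivered
// and a joined error describing any failures (nil if none failed).
func (s *EmailService) SendBatch(msgs []Message) (int, error) {
	var errs []error
	sent := 0
	for _, m := range msgs {
		if err := s.Send(m); err != nil {
			errs = append(errs, fmt.Errorf("to %q: %w", m.To, err))
			continue
		}
		sent++
	}
	return sent, errors.Join(errs...)
}

// fakeVendor records messages instead of sending them.
type fakeVendor struct {
	delivered []Message
}

func (f *fakeVendor) Deliver(m Message) error {
	f.delivered = append(f.delivered, m)
	return nil
}

func main() {
	vendor := &fakeVendor{}
	svc := &EmailService{vendor: vendor}

	sent, err := svc.SendBatch([]Message{
		{To: "a@example.com", Subject: "Welcome", Body: "Hello"},
		{To: "", Subject: "Oops", Body: "No recipient"},
	})
	fmt.Println(sent, len(vendor.delivered), err)
}
```

Because consumers depend only on `Send` and `SendBatch`, swapping the vendor would mean changing one service rather than all of its callers.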
Stay tuned for more
That’s it for part one. In our next chapter, I go into some detail on our tech stack, authentication, and more. Check it out here!
After that, I dive into the CI/CD pipeline, testing, and where we’re headed next. You can jump to it here.
To make sure you’re seeing the latest from the Helix Engineering team, follow along here!