Helix Engineering: Our approach to microservices, part 3
In Part 1 of our series, we touched on the benefits of microservices, why Helix uses them, and how we’ve progressed down that path. Last week, we looked at authentication, authorization, and APIs. In the conclusion to our three-part series on microservices, we’ll run through the CI/CD pipeline, testing, and where we’re headed next.
The CI/CD pipeline
Helix uses GoCD for continuous integration and deployment. Each service and application lives in its own Git repo on GitHub. Developers work on feature branches, and whenever they push changes to a branch, GoCD agents running in a test environment test those changes. We have homegrown scripts that version and tag every build and enforce uniform behavior when rebasing. When a PR is reviewed, accepted, and merged into master, it is tested again; the changes then propagate automatically to the staging environment, where they are tested once more. Finally, the changes move to production.
As part of the build process, each service is packaged into a Docker image that is pushed to Amazon Elastic Container Registry (ECR). The same image is pulled into each environment and deployed on Amazon Elastic Container Service (ECS). Another important aspect of deployment is provisioning AWS resources such as instances, IAM roles, firewall rules, and load balancers. We use Terraform to specify all of these resources, and more homegrown scripts ensure that we follow a uniform process.
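To make the "same image in every environment" idea concrete, here is a minimal Go sketch. It is an illustration, not the actual Helix tooling: the `ImageRef` type, the account ID, the repository name, the tag, and the environment names are all hypothetical. The only fixed part is the standard ECR image URI layout, `<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>`.

```go
package main

import "fmt"

// ImageRef identifies one build of one service in ECR.
// Every value used with it below is a hypothetical placeholder.
type ImageRef struct {
	AccountID string // AWS account that owns the registry
	Region    string // AWS region of the registry
	Repo      string // ECR repository name (assumed: one per service)
	Tag       string // build tag (assumed: produced by the versioning scripts)
}

// URI renders the reference in the standard ECR format:
// <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>
func (r ImageRef) URI() string {
	return fmt.Sprintf("%s.dkr.ecr.%s.amazonaws.com/%s:%s",
		r.AccountID, r.Region, r.Repo, r.Tag)
}

func main() {
	img := ImageRef{
		AccountID: "123456789012",
		Region:    "us-east-1",
		Repo:      "example-service",
		Tag:       "1.4.2",
	}
	// Each environment receives the identical image URI; nothing is
	// rebuilt per environment, so what was tested is what ships.
	for _, env := range []string{"test", "staging", "production"} {
		fmt.Printf("%s -> %s\n", env, img.URI())
	}
}
```

Promoting one immutable, tagged image means the artifact that passed tests in staging is byte-for-byte the artifact that runs in production.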
Testing
At Helix, we take testing very seriously. How much testing you need is always an open question, and there is no single answer, because different parts of the system are more or less critical. We practice multi-tier testing that includes both unit tests and end-to-end tests for each service. In a microservices application, a service often needs to talk to other services, and some of these may be third-party services that might not have a test environment. We have several solutions, such as:
- Mocking dependencies
- Creating test data in dedicated test environments
- Hitting endpoints in the staging environment for read-only tests
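The first option, mocking dependencies, can be sketched in Go as follows. This is a minimal example under stated assumptions, not code from the Helix codebase: `InventoryClient`, `fakeInventoryClient`, and `CanShip` are hypothetical names, and the checks in `main` use plain Go rather than Ginkgo so that the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// InventoryClient is a hypothetical interface over a downstream service.
// Production code would implement it with a real network client.
type InventoryClient interface {
	Stock(sku string) (int, error)
}

// fakeInventoryClient is an in-memory stand-in used only in tests.
type fakeInventoryClient struct {
	stock map[string]int
}

// Stock returns the canned quantity, or an error for an unknown SKU.
func (f fakeInventoryClient) Stock(sku string) (int, error) {
	n, ok := f.stock[sku]
	if !ok {
		return 0, errors.New("unknown sku")
	}
	return n, nil
}

// CanShip is the code under test. It depends only on the interface,
// so a test can pass the fake instead of calling the real service.
func CanShip(c InventoryClient, sku string) (bool, error) {
	n, err := c.Stock(sku)
	if err != nil {
		return false, fmt.Errorf("checking stock for %s: %w", sku, err)
	}
	return n > 0, nil
}

func main() {
	fake := fakeInventoryClient{stock: map[string]int{"sku-1": 3, "sku-2": 0}}
	for _, sku := range []string{"sku-1", "sku-2", "sku-3"} {
		ok, err := CanShip(fake, sku)
		fmt.Println(sku, ok, err)
	}
}
```

Because the caller accepts an interface rather than a concrete client, the same function can be exercised against the fake in unit tests and against a real client in end-to-end tests.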
Our primary tool for testing Go code is the Ginkgo testing framework. For front-end code we primarily use Jest and Nightwatch, with Flow type checks for good measure, and a smattering of Selenium to drive the browser.
Error reporting, logging, and instrumentation
Any non-trivial distributed system must keep track of what’s going on inside. At Helix, we use Rollbar for central error reporting, Sumo Logic for central logging, and New Relic for collecting metrics.
Helix uses PagerDuty to stay on top of issues. When something goes wrong, the on-call engineer is notified and can start diagnosing the problem. For infrastructure-related issues, we have a separate rotation of InfraOps engineers. In a microservices environment, it is not always easy to pinpoint the root cause, because data flows between many services and a problem at the beginning of the chain might only be discovered far downstream. This is where robust error messages and logging come into play.
Where we’re headed next
The Helix platform is a work in progress. New business needs and operational improvements demand constant work from the Helix Engineering team. We are very proud of what we have built and excited to develop and refine it further. Here are some of the upcoming challenges we will be facing:
- Balancing developer productivity with system stability and security
- New service provisioning and setup
- Cross-service testing
- Cross-service troubleshooting
- AWS cost reduction
- AWS limits and quotas
- Introducing and migrating to new technologies, tools, and processes
In the future, there are many technologies and capabilities we want to evaluate, incorporate into the Helix platform, or use more frequently. Here is a partial list:
- Using the new React context API
- Cross-service testing
- Performance and load testing
- Dynamic configuration
- Queue-based service interactions
- Fargate and Kubernetes
- Blue-green and canary deployments
Stay tuned for more
So, that was a whirlwind high-level tour of the Helix microservice-oriented architecture. The story is still unfolding! In future blog posts, we’ll drill in deeper and examine the cool stuff we build here. Stay tuned!