We revisited our distributed tracing setup and incorporated Kubernetes pod metadata into it, significantly enhancing our engineers’ ability to troubleshoot problems that cut across microservices.
A little less than two years ago, SoundCloud began the journey of replacing our homegrown deployment platform, Bazooka, with Kubernetes. Kubernetes automates deployment, scaling, and management of containerized applications.
If you’re a regular visitor to this blog, you might be aware that we have been transitioning to a microservices based architecture over the past four to five years, as we have shared insights into the process and the related challenges on multiple occasions. To recap, adopting a microservices architecture has allowed us to regain team autonomy by breaking up our monolithic backend into dozens of decoupled services, each encapsulating a well defined portion of our product domain. Every service is built and deployed individually, communicating with other services over the network via light-weight data interchange formats such as JSON or Thrift. What we haven’t touched on so far is how a microservice at SoundCloud looks backstage.
Building and operating services distributed across a network is hard. Failures are inevitable. The way forward is having resiliency as a key part of design decisions.
This post talks about two key aspects of resiliency when doing RPC at scale - the circuit breaker pattern, and its power combined with client-side load balancing.
In a previous series of blog posts, we covered our decision to move away from a monolithic architecture, replacing it with microservices, interacting synchronously with each other over HTTP, and asynchronously using events. In this post, we review our progress toward this goal, and talk about the conditions and strategy required to decommission our monolith.
Since we started breaking our monolith and introduced a microservices architecture we rely a lot on synchronous request-response style communication. In this blog post we’ll go over our current status and some of the lessons we learned.