A previous post on this blog ended with the following paragraph:
“We might also replace JSON with a more efficient serialization protocol like protocol buffers. This would further improve performance and should result in less handwritten serialization and deserialization code but this still needs more experimentation and investigation.”
To follow up on the above, this article announces the release of Twinagle, an open source implementation of the Twirp protocol for Scala/Finagle.
The backend of SoundCloud is designed around a set of microservices. As detailed previously, services expose a REST-like interface and use HTTP and JSON to communicate. Our services are usually implemented in Scala, making use of Twitter’s Finagle library, but we have a fair amount of Ruby and Go powering production as well.
This choice has worked well for us, because tooling around HTTP and JSON is widespread and allows for easy debugging by humans. However, the very flexibility offered by the stack has, in significant ways, slowed down development. This is because, for each endpoint, developers need to decide:
Internal guidelines and recommendations have only helped so much — variations tend to creep in between teams and over time. Small inconsistencies between server and client implementations have caused outages (e.g. a client mistakenly made use of camelCase for field names but the server expected snake_case).
The “artisanal” approach to HTTP API development and consumption can be frustrating: Even where guidelines are followed perfectly, an engineer still needs to forensically examine application code in order to get comprehensive information about how any given endpoint works.
As a result of discussion in our weekly engineering forum, a subgroup of engineers decided to explore alternatives. The group homed in on Twirp.
“The Twirp wire protocol is a simple RPC protocol based on HTTP and Protocol Buffers (proto). The protocol uses HTTP URLs to specify the RPC endpoints, and sends/receives proto messages as HTTP request/response bodies.”
Engineers compose API contracts in the protobuf Interface Description Language (IDL). They then use those contracts to generate server stubs and clients using protobuf tooling. The generated code takes care of HTTP request/response parsing and serialization to either JSON or the binary protobuf format. All requests are plain HTTP POST requests, and the wire protocol itself is simple and well thought out (thank you, Twirp engineering).
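To make the wire format concrete, here is a minimal sketch of how a client might assemble a JSON-encoded Twirp call. It is written in Python purely for brevity and is not Twinagle code. The `/twirp` path prefix is the default in the published Twirp specification, and the package, service, method, and field names are hypothetical.

```python
import json
from urllib.request import Request


def build_twirp_request(base_url, package, service, method, message):
    """Build (but do not send) a JSON-encoded Twirp request.

    Assumes the default "/twirp" prefix, so the route has the form
    POST <base_url>/twirp/<package>.<Service>/<Method>.
    """
    url = f"{base_url.rstrip('/')}/twirp/{package}.{service}/{method}"
    body = json.dumps(message).encode("utf-8")
    return Request(
        url,
        data=body,
        # Use "application/protobuf" instead when sending the binary encoding.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical service and message, for illustration only.
request = build_twirp_request(
    "http://localhost:8080",
    "example.authorization",
    "TrackAuthorization",
    "Authorize",
    {"track_ids": ["1", "2"]},
)
```

Because every call is a POST to a URL derived from the contract, there is nothing left for individual teams to decide about verbs, paths, or field casing.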
Twirp seemed like a great fit: It enabled us to start introducing IDLs to simplify microservice development in a way that incrementally “slotted into” a complex backend. We looked at gRPC but decided against it, and our reasons for this decision closely match those stated in the original Twirp blog post. We already run some Go services, and Twirp has a Go implementation we could just drop in and use.
However, more than 70 percent of microservices we deploy are based on Scala + Finagle…
Due to the simplicity of the Twirp protocol, we decided to implement it ourselves. And we are pleased to announce the release of Twinagle, our implementation of Twirp for Finagle.
The project is split into two components, outlined below.
An SBT plugin that takes a protobuf interface and uses it to generate:
For instructions and code examples, please see the Twinagle project page.
The first high-impact production service at SoundCloud to adopt Twinagle was our track authorization service. This service is responsible for deciding how users can access tracks on SoundCloud based on the user’s location, client application, and subscription status. Should they hear an ad? Should they be able to see the track exists in other countries? Every time the SoundCloud backend handles a track identifier, this service gets a request. Needless to say, this service receives a large number of requests.
In addition, JSON response bodies for track lookups can be large: Up to 100 tracks may be requested at a time, and in practice, the average request is for 25 tracks. However, the structure of these responses is not complex; it consists of a long list of relatively simple objects.
The track authorization service is only used by one client application, meaning we only needed to perform one client migration. A combination of systems-integration simplicity and handling of significant request load made this service a natural early candidate for conversion to Twinagle.
First, we implemented a new Twinagle-based endpoint in the authorization service. We defined messages in the protobuf file that are based on existing domain model classes found in the service’s codebase, and we implemented a thin layer to convert between the two. In exchange for this small amount of extra boilerplate code, we were able to reuse the existing business logic code, untouched. As a result, we could limit potential bugs to serialization and deserialization code.
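The shape of such a conversion layer can be sketched as follows. This is an illustration in Python rather than our Scala code, and every name in it is hypothetical: `TrackPolicyMessage` stands in for a class generated from the protobuf file, `TrackPolicy` for an existing domain model class, and `decide` for the existing business logic.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    """Hypothetical domain enum."""
    ALLOWED = "allowed"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class TrackPolicy:
    """Hypothetical domain model class used by the business logic."""
    track_id: int
    access: Access


@dataclass(frozen=True)
class TrackPolicyMessage:
    """Stand-in for a protobuf-generated message class."""
    track_id: int
    access: str


def to_message(policy):
    """Domain model -> wire message."""
    return TrackPolicyMessage(policy.track_id, policy.access.value)


def from_message(message):
    """Wire message -> domain model."""
    return TrackPolicy(message.track_id, Access(message.access))


BLOCKED_IDS = {2}  # placeholder data for the example


def decide(track_id):
    """Placeholder for the existing business logic, which stays untouched."""
    access = Access.BLOCKED if track_id in BLOCKED_IDS else Access.ALLOWED
    return TrackPolicy(track_id, access)


def handle(track_ids):
    """Endpoint handler: only this thin layer knows about wire messages."""
    return [to_message(decide(track_id)) for track_id in track_ids]
```

Only `to_message`, `from_message`, and `handle` are new code, so a bug introduced by the migration has to live in one of those few lines.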
Once we deployed the new endpoint, we brought the Twirp library and protobuf file into the client application. We took a similar approach to the client side as we did on the server, building in a thin (de)serialization layer that isolates the existing business logic.
Then we had the option of using both the REST and Twirp versions of the endpoint from the client app. We redirected client traffic using a percentage-based rollout flag in two phases. In the first phase, we made duplicate requests and compared the results. This meant that with a rollout percentage of 1 percent, the client made a request to both the REST and Twirp endpoints 1 percent of the time and reported discrepancies in the responses, but returned only the REST endpoint’s response. We caught a few critical bugs this way. We then moved on to use the Twirp endpoint only — and to stop duplicating requests. We rolled out to 100 percent with no outages and no bugs on the first attempt as a result of this process.
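The two rollout phases described above can be sketched roughly like this. The code is a simplified Python illustration, not our client implementation: the function names, the `report` callback, and the injectable `roll` parameter (used here to make the behavior deterministic) are all assumptions.

```python
import random


def in_rollout(percent, roll=None):
    """Return True for roughly `percent` percent of calls."""
    if roll is None:
        roll = random.random() * 100  # uniform in [0, 100)
    return roll < percent


def fetch_phase_one(request, rest_call, twirp_call, percent, report, roll=None):
    """Phase 1: always answer from REST; shadow a sample of calls to Twirp."""
    rest_response = rest_call(request)
    if in_rollout(percent, roll):
        try:
            twirp_response = twirp_call(request)
        except Exception as error:
            # A failing shadow call must never affect the real response.
            report(request, rest_response, error)
        else:
            if twirp_response != rest_response:
                report(request, rest_response, twirp_response)
    return rest_response


def fetch_phase_two(request, rest_call, twirp_call, percent, roll=None):
    """Phase 2: serve a sample of calls from Twirp only, without duplication."""
    if in_rollout(percent, roll):
        return twirp_call(request)
    return rest_call(request)
```

In phase one the caller never sees the Twirp result, so a bug in the new path shows up only as a reported discrepancy rather than as a user-facing failure.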
The results of this migration were positive but produced some mild surprises. We expected a drop in response time, but we only saw marginal benefits in the 90th percentile, and no benefits in the 50th or 99th percentiles. This was likely because most of the response time comes from accessing the database, and not from request/response serialization and deserialization. On the other hand, the variability of 90th percentile latency dropped significantly.
We saw a huge drop in bandwidth between the two services; network usage decreased by more than 40 percent when using binary protobuf instead of JSON! We also discovered that many fields in the response were being ignored by the client application, and we were able to confidently clean things up in the new Twirp endpoint. Finally, we deleted all of the custom JSON serialization and deserialization code in both services, which is always a satisfying developer experience!
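For intuition about where the bandwidth savings come from, the toy example below encodes one list entry both ways. It hand-rolls the protobuf varint encoding in Python so that it has no dependencies; the two-field message (`track_id` as field 1, `playable` as field 2) is hypothetical, and the sizes it produces are not meant to reproduce the figure we measured in production.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_entry(track_id, playable):
    """Encode field 1 (track_id) and field 2 (playable) as varint fields."""
    tag_track_id = (1 << 3) | 0  # field number 1, wire type 0 (varint)
    tag_playable = (2 << 3) | 0  # field number 2, wire type 0 (varint)
    return (
        encode_varint(tag_track_id) + encode_varint(track_id)
        + encode_varint(tag_playable) + encode_varint(int(playable))
    )


entry = {"track_id": 123456, "playable": True}
as_json = json.dumps(entry, separators=(",", ":")).encode("utf-8")
as_binary = encode_entry(entry["track_id"], entry["playable"])

print(len(as_json), len(as_binary))
```

JSON repeats every field name in every entry, whereas the binary format replaces each name with a one-byte tag. For a long list of small objects, that repetition makes up a large share of the payload.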
This initial service conversion confirmed our hopes for Twinagle: Protobuf contracts make services easier to understand, allow us to get rid of an entire class of low-value code, and even result in some nice efficiency benefits. We are actively converting old services to use Twinagle, and all new SoundCloud services are now Twinagle-based. To try it out for yourself, check out the project page.
The implementation and rollout of Twinagle have been a great cross-team effort. Special thanks to:
And extra special thanks to Twitch engineering, for Twirp!