Go at SoundCloud

SoundCloud is a polyglot company, and while we’ve always operated with Ruby on Rails at the top of our stack, we’ve got quite a wide variety of languages represented in our backend. I’d like to describe a bit about how—and why—we use Go, an open-source language that recently hit version 1.

It’s in our company DNA that our engineers are generalists, rather than specialists. We hope that everyone will be at least conversant about every part of our infrastructure. Even more, we encourage engineers to change teams, and even form new ones, with as little friction as possible. An environment of shared code ownership is a perfect match for expressive, productive languages with low barriers to entry, and Go has proven to be exactly that.

Go has been described by several engineers here as a WYSIWYG language. That is, the code does exactly what it says on the page. It’s difficult to overemphasize how helpful this property is toward the unambiguous understanding and maintenance of software. Go explicitly rejects “helper” idioms and features like the Uniform Access Principle, operator overloading, default parameters, and even exceptions, on the basis that they create more problems through ambiguity than they solve in expressivity. There’s no question that these decisions carry a cost in keystrokes (especially, as most new engineers on Go projects lament, during error handling), but the payoff is that those same new engineers can easily and immediately build a complete mental model of the application. I feel confident in saying that the time from zero to productive commits is shorter in Go than in any other language we use; sometimes, dramatically so.
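To make the keystrokes-for-clarity trade concrete, here is a minimal sketch of explicit error handling, written against current Go. The `parsePort` helper and the `ErrEmpty` value are hypothetical names invented for illustration; they are not taken from any SoundCloud codebase.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// ErrEmpty is a hypothetical sentinel error used only in this sketch.
var ErrEmpty = errors.New("empty input")

// parsePort is a hypothetical helper. Every failure path is visible at the
// call site as an ordinary return value; nothing is thrown past the caller.
func parsePort(s string) (int, error) {
	if s == "" {
		return 0, ErrEmpty
	}
	n, err := strconv.Atoi(s)
	if err != nil {
		// Wrap the underlying error so the caller keeps the full context.
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range", n)
	}
	return n, nil
}

func main() {
	for _, in := range []string{"8080", "", "http"} {
		port, err := parsePort(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("port:", port)
	}
}
```

The `if err != nil` blocks are the cost; the benefit is that a reader can see every way the function can fail without leaving the page.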

Go’s strict formatting rules and its “only one way to do things” philosophy mean we don’t waste much time bikeshedding about style. Code reviews on a Go codebase tend to be more about the problem domain than the intricacies of the language, which everyone appreciates.

Further, once an engineer has a working knowledge of Effective Go, there seems to be very little friction in moving from “how the application behaves today” to “how the application should behave in the ideal case.” Should a slow response from this backend abort the entire request? Should we retry exactly once, and then serve partial results? This agent has been acting strangely: can we install a 250ms timeout? Every high-level scenario in the behavior of a system can be expressed in a straightforward and idiomatic implementation, without the need for libraries or frameworks. Removing layers of abstraction reduces complexity; plainly stated, simpler code is better code.
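As a rough illustration of those scenarios, the sketch below applies a 250ms timeout to each backend call, retries a failed call exactly once, and then serves whatever partial results succeeded. It uses only the standard library of current Go. The `backend` type and the `callWithTimeout`, `fetchAll`, and `demo` functions are hypothetical names chosen for this example, and the backends are queried one after another purely to keep the sketch short.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// backend is a hypothetical stand-in for a call to a downstream service.
type backend func(ctx context.Context) (string, error)

// callWithTimeout runs b and gives up once d has elapsed.
func callWithTimeout(parent context.Context, b backend, d time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(parent, d)
	defer cancel()

	type result struct {
		val string
		err error
	}
	ch := make(chan result, 1) // buffered so the goroutine never blocks on send
	go func() {
		v, err := b(ctx)
		ch <- result{v, err}
	}()

	select {
	case r := <-ch:
		return r.val, r.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// fetchAll queries each backend in turn, retries a failed call exactly once,
// and returns the partial results together with the number of failures.
func fetchAll(ctx context.Context, backends []backend, d time.Duration) (results []string, failed int) {
	for _, b := range backends {
		v, err := callWithTimeout(ctx, b, d)
		if err != nil {
			v, err = callWithTimeout(ctx, b, d) // retry exactly once
		}
		if err != nil {
			failed++
			continue
		}
		results = append(results, v)
	}
	return results, failed
}

// demo wires up three made-up backends: one fast, one too slow, one flaky.
func demo() ([]string, int) {
	fast := func(ctx context.Context) (string, error) { return "fast", nil }
	slow := func(ctx context.Context) (string, error) {
		select {
		case <-time.After(2 * time.Second):
			return "slow", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
	calls := 0
	flaky := func(ctx context.Context) (string, error) {
		calls++
		if calls == 1 {
			return "", errors.New("transient failure")
		}
		return "flaky", nil
	}
	return fetchAll(context.Background(), []backend{fast, slow, flaky}, 250*time.Millisecond)
}

func main() {
	results, failed := demo()
	fmt.Println("results:", results, "failed:", failed)
}
```

Changing the policy (abort the whole request on the first failure, say, or fan the calls out in parallel) is a matter of editing a few lines in `fetchAll` rather than configuring a framework.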

Go has some other nice properties that we’ve taken advantage of. Static typing and fast compilation enable us to do near-realtime static analysis and unit testing during development. They also mean that building, testing and rolling out Go applications through our deployment system is as fast as it gets.

In fact, fast builds, fast tests, fast peer reviews and fast deployment mean that some ideas can go from the whiteboard to running in production in less than an hour. For example, the search infrastructure on Next is driven by Elasticsearch, but managed and interfaced with the rest of SoundCloud almost exclusively through Go services. During validation testing, we realized that we needed the ability to mark indexes as read-only in certain circumstances, and needed the indexing applications to detect and respect this new dimension of index state. Adding the abstraction in the code, polling a new endpoint to reliably detect the state, changing the relevant indexing behaviors, and writing tests for them, all took half an afternoon. By the evening, the changes had been deployed and had been running under load for hours. That kind of velocity, especially in a statically typed, natively compiled language, is exhilarating.
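The general shape of that change can be sketched in a few dozen lines. Everything named here is an assumption made for illustration: the `Index` type, the `stateFunc` callback (which in a real service would wrap a request to the state endpoint), and the choice to keep the last known state when a poll fails. None of it is the actual SoundCloud code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrReadOnly is returned when a write is attempted on a read-only index.
var ErrReadOnly = errors.New("index is read-only")

// stateFunc reports whether the index is currently read-only. It is a
// hypothetical abstraction over whatever endpoint exposes that state.
type stateFunc func() (bool, error)

// Index is a toy in-memory index guarded by a mutex.
type Index struct {
	mu       sync.Mutex
	readOnly bool
	docs     []string
}

// Add stores doc unless the index is marked read-only.
func (ix *Index) Add(doc string) error {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	if ix.readOnly {
		return ErrReadOnly
	}
	ix.docs = append(ix.docs, doc)
	return nil
}

// refresh polls once and records the reported state. On error the last
// known state is kept (an assumption of this sketch).
func (ix *Index) refresh(check stateFunc) error {
	ro, err := check()
	if err != nil {
		return err
	}
	ix.mu.Lock()
	ix.readOnly = ro
	ix.mu.Unlock()
	return nil
}

// watch calls refresh on every tick until stop is closed.
func (ix *Index) watch(check stateFunc, every time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			_ = ix.refresh(check) // keep the last known state on error
		case <-stop:
			return
		}
	}
}

func main() {
	ix := &Index{}
	fmt.Println("add a:", ix.Add("a"))

	// A long-running service would start `go ix.watch(check, interval, stop)`;
	// here refresh is called directly so the demo is deterministic.
	_ = ix.refresh(func() (bool, error) { return true, nil })
	fmt.Println("add b:", ix.Add("b"))

	_ = ix.refresh(func() (bool, error) { return false, nil })
	fmt.Println("add c:", ix.Add("c"))
}
```

Because the state check is just a function value, a unit test can substitute a stub for the network call, which is part of why a change like this can be written and tested in an afternoon.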

I mentioned our build and deployment system. It’s called Bazooka, and it’s designed to be a platform for managing the deployment of internal services. (We’ll be open-sourcing it pretty soon; stay tuned!) Scaling 12-Factor apps over a heterogeneous network can be thought of as one large, complex state machine, full of opportunities for inconsistency and race conditions. Go was a natural choice for this kind of job. Idiomatic Go, which shares state by communicating over channels rather than by locking shared memory, makes safe concurrency the path of least resistance; Bazooka developers can reason about the complexity of their problem without being distracted by the complexity of their tools. And Bazooka makes use of Doozer to coordinate its shared state, which, in addition to being the only open-source implementation of Paxos in the wild (that we’re aware of), is also written in Go.
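Bazooka’s internals aren’t shown here, so the following is only a generic sketch of the style of concurrency Go encourages: one goroutine owns a piece of state, and every other goroutine talks to it over a channel. The `scaleRequest` type and the `runScheduler` and `scale` functions are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// scaleRequest asks the owning goroutine to adjust a service's instance
// count. All names in this sketch are hypothetical.
type scaleRequest struct {
	service string
	delta   int
	reply   chan int // receives the count after the change
}

// runScheduler owns the counts map. No other goroutine touches it, so the
// map needs no lock.
func runScheduler(reqs <-chan scaleRequest) {
	counts := make(map[string]int)
	for r := range reqs {
		n := counts[r.service] + r.delta
		if n < 0 {
			n = 0
		}
		counts[r.service] = n
		r.reply <- n
	}
}

// scale sends one request and waits for the resulting count.
func scale(reqs chan<- scaleRequest, service string, delta int) int {
	reply := make(chan int)
	reqs <- scaleRequest{service: service, delta: delta, reply: reply}
	return <-reply
}

func main() {
	reqs := make(chan scaleRequest)
	go runScheduler(reqs)

	// Ten goroutines scale the same service up concurrently.
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			scale(reqs, "web", 1)
		}()
	}
	wg.Wait()

	fmt.Println("web instances:", scale(reqs, "web", 0))
	close(reqs)
}
```

All updates are serialized through the channel, so concurrent callers never observe a half-applied change, and the scheduling logic itself reads like ordinary sequential code.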

Altogether, SoundCloud maintains about half a dozen services and over a dozen repositories written entirely in Go. And we’re increasingly turning to Go when spinning up new backend projects.

Interested in writing Go to solve real problems and build real products? We’d love to hear from you!