SoundCloud for Developers

Discover, connect and build

Backstage Blog

  • July 7th, 2014 iOS Mobile Building the new SoundCloud iOS application — Part I: The reactive paradigm By Mustafa Sezgin & Jan Berkel

    Recently, SoundCloud launched its new iOS application, a complete rewrite of the existing one. The Mobile engineering team saw this as an opportunity to build a solid foundation for the future of SoundCloud on iOS and to experiment with new technologies and processes at the same time.

    In the world of mobile, you deal with data, errors, threads and concurrency a lot. The common scenario starts with a user tapping on the screen. The application jumps off the main UI thread, does some I/O-related work and some database-related operations, and applies some transformations to the data. The application then jumps back to the UI thread to render the new information on the screen.

    Both the Android and iOS platforms provide tools to deal with the different aspects of this scenario, yet they are far from ideal. Some of them provide little in terms of error handling, which forces you to write boilerplate code, while others force you to deal with low-level concurrency primitives. You might also have to add libraries to your project just to avoid writing filtering and sorting predicate code.

    We knew early on that we wanted to avoid these issues, which is where the functional reactive paradigm and Reactive Cocoa came into the picture.

    In short, Reactive Cocoa allows you to create composable, event-driven (finite or infinite) streams of data while making use of functional composition to perform transformations on those streams. Erik Meijer is best known in this space for his Reactive Extensions on the .NET platform, which also spawned the JVM-based implementation RxJava. By adopting this paradigm, we now have a uniform way of dealing with data models, along with operators that apply transformations to those data models while taking care of low-level concurrency primitives, so that one does not have to be concerned with threads or the difficult task of concurrent programming.

    Let's take an example

    Like most mobile applications, the SoundCloud iOS application is a typical case of an API client with local storage. It fetches JSON data from the API via HTTP and parses it into API model objects. The persistent store technology we use is Core Data. We decided early on that we wanted to isolate the API from our storage representation so there is a final mapping step involved where we convert API models to Core Data models.

    We break this down into smaller units of work:

    1. Execute a network request. Parse the JSON response.
    2. Transform JSON objects into API model objects.
    3. Transform API model objects into Core Data models.

    1. Executing the network request

    For simplicity, assume that the network-access layer implements the following method:

    - (RACSignal *)executeRequest:(NSURL *)url;

    We do not pass in any delegates or callback blocks; we just give it an NSURL and get back a RACSignal representing a possibly asynchronous operation, or future. To obtain the data from that operation, we can subscribe to the signal using subscribeNext:error:completed:

    - (RACDisposable *)subscribeNext:(void (^) (id result))nextBlock
                               error:(void (^) (NSError *error))errorBlock
                           completed:(void (^) (void))completedBlock

    You might recognize the familiar-looking error and success-callback blocks from other asynchronous APIs. This is where some of Reactive Cocoa's and FRP's strengths lie as we shall see later.

    2. Parsing the JSON response

    After the network request has been made, the JSON response needs to be parsed into an API model representation. For this we use a thin layer around GitHub's Mantle library, which wraps parsing in a RACSignal and pushes the result (or error) to the subscriber:

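    - (RACSignal *)parseResponse:(id)data {
      return [RACSignal createSignal:^RACDisposable *(id<RACSubscriber> subscriber) {
        NSError *error = nil;
        // Map the JSON dictionary onto the ApiTrack model class.
        id apiModel = [MTLJSONAdapter modelOfClass:ApiTrack.class
                               fromJSONDictionary:data
                                            error:&error];
        if (error) {
          [subscriber sendError:error];
        } else {
          [subscriber sendNext:apiModel];
          [subscriber sendCompleted];
        }
        return nil; // nothing to clean up on disposal
      }];
    }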

    To achieve the composition of operations that we have mentioned earlier, we wrapped the functionality of existing libraries with signals, where appropriate.

    3. Persisting the API model with Core Data

    In our architecture, the database represents the single source of truth. Therefore, to show tracks to the user, we first need to store them as Core Data objects. We have a collection of adapter classes that are responsible for mapping API model objects to Core Data model objects. An ApiTrackAdapter might look as follows:

    - (RACSignal *)adaptObject:(ApiTrack *)apiTrack {
      return [[self findOrBuildTrackWithUrn:apiTrack.urn]
                               map:^(CoreDataTrack *coreDataTrack) {
          // Copy the API model's fields onto the Core Data object.
          coreDataTrack.title = apiTrack.title;
          // set other properties
          return coreDataTrack;
      }];
    }

    Putting it all together

    We now have the building blocks to issue a network request, parse the JSON, and store it as a Core Data object. RAC makes it very easy to compose the individual methods functionally by feeding the output of each operation as an input to the next one. The following example uses flattenMap:

    -(RACSignal *)loadAndStoreTrack:(NSURL *)url {
      return [[requestHandler executeRequest:url] flattenMap:^(id json) {
        return [[parser parseResponse:json] flattenMap:^(ApiTrack *track) {
          return [adapter adaptObject:track];
        }];
      }];
    }

    The flattenMap: method maps or transforms values emitted by a signal and produces a new signal as a result. In this example, the signal returned by loadAndStoreTrack: either sends the adapted Core Data track object or sends an error if any of the operations failed. In addition to flattenMap:, there is a whole range of predefined functional operators, like filter: or reduce:, that can be applied to signals.
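
    The chaining behaviour described above is not specific to Objective-C. The following is a minimal, synchronous sketch in Ruby, not ReactiveCocoa's API: Outcome, execute_request, parse_response and adapt_object are hypothetical stand-ins for a signal and the three units of work, and the URLs are placeholders. It only illustrates how each step feeds the next and how the first error skips the remaining steps.

```ruby
require "json"

# Hypothetical stand-in for a signal that has already produced a value or an error.
class Outcome
  attr_reader :value, :error

  def self.ok(value)
    new(value, nil)
  end

  def self.failure(error)
    new(nil, error)
  end

  def initialize(value, error)
    @value = value
    @error = error
  end

  def error?
    !@error.nil?
  end

  # Run the block only on success; the block must itself return an Outcome.
  # On error the block is skipped and the error is passed along unchanged.
  def flat_map
    error? ? self : yield(@value)
  end
end

# Step 1: pretend to fetch a JSON body (no real network access).
def execute_request(url)
  url.start_with?("https://") ? Outcome.ok('{"title": "foo"}') : Outcome.failure("bad url")
end

# Step 2: parse the body, turning parser exceptions into error outcomes.
def parse_response(body)
  Outcome.ok(JSON.parse(body))
rescue JSON::ParserError => e
  Outcome.failure(e.message)
end

# Step 3: map the parsed data onto a (pretend) storage model.
def adapt_object(api_track)
  Outcome.ok({ title: api_track["title"] })
end

def load_and_store_track(url)
  execute_request(url).flat_map do |body|
    parse_response(body).flat_map do |api_track|
      adapt_object(api_track)
    end
  end
end

stored = load_and_store_track("https://example.invalid/tracks/1")
failed = load_and_store_track("ftp://example.invalid/tracks/1")
```

    Here stored carries the adapted track, while failed carries the error from the first step and never reaches parsing or adapting. Unlike a RACSignal, this sketch models neither asynchrony nor multiple values.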

    RAC Schedulers

    We left out one powerful feature of RAC, which is the ability to parametrize concurrency. To ensure that the application stays responsive, we want to perform the network I/O and model parsing on a background queue.

    Core Data operations are different: we do not have a choice there. They have to be executed on a predefined private queue; otherwise we risk creating deadlocks in our application.

    With the help of RACScheduler we can easily control where the side-effects of a signal are performed by simply calling subscribeOn: on it with a custom scheduler implementation:

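    -(RACSignal *)loadAndStoreTrack:(NSURL *)url {
      return [[requestHandler executeRequest:url] flattenMap:^(id json) {
        return [[parser parseResponse:json] flattenMap:^(ApiTrack *track) {
          // coreDataScheduler: placeholder for the custom Core Data-aware scheduler
          return [[adapter adaptObject:track] subscribeOn:coreDataScheduler];
        }];
      }];
    }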

    Here, we use a scheduler that is aware of the current Core Data context to ensure that adaptObject: is executed on the right queue by wrapping everything internally with performBlock:.

    If we want to update our UI with the title of the track we just fetched, we could do something like the following:

    [[trackService loadAndStoreTrack:trackUrl] subscribeNext:^(Track *track) {
       self.trackView.text = track.title;
    } error:^(NSError *error) {
      // handle errors
    }];

    To ensure that this final update happens on the UI thread we can tell RAC to deliver us the information back on the main thread by using the deliverOn: method:

    [[[trackService loadAndStoreTrack:trackUrl]
      deliverOn:[RACScheduler mainThreadScheduler]]
      subscribeNext:^(Track *track) {
       self.trackView.text = track.title;
    } error:^(NSError *error) {
      // handle errors
    }];

    By breaking down each operation within this common scenario into isolated units of work, it becomes easier to perform the operations we need on the desired threads by taking advantage of Reactive Cocoa's scheduling abilities. The functional reactive paradigm has also helped us to compose these independent operations one after another by using operators such as flattenMap:. Adopting FRP and Reactive Cocoa has had its difficulties, though, and we have learned many lessons along the way.

    Steep learning curve

    Adopting FRP requires a change of perspective, especially for developers who are not used to a functional programming style. Methods do not return values directly; they return intermediate objects (signals) that take callbacks. This can lead to more verbose code, especially when the code is heavily nested, which is common when doing more complex things with RAC.

    Therefore, it is important to have short and well-named methods. For example, a method signature like -(RACSignal *)signal does not communicate anything about the type of values the caller is going to receive.

    Another problem is the sheer number of methods or operators defined on base classes like RACStream and RACSignal. In practice only a few (like flattenMap: or filter:) are used on a regular basis, but the remaining 80% tend to confuse developers who are new to the framework.

    Memory management

    Memory management can be problematic because of RAC's heavy use of blocks which can easily lead to retain cycles. They can be avoided by breaking the cycle with weak references to self (@weakify / @strongify macros).

    One of RAC's promises is to reduce the amount of state you need to keep around in your code. This is true but you still need to manage the state introduced by the framework itself, which comes in the form of RACDisposable, an object returned as a result of signal subscription. A common pattern we introduced is to bind the lifetime of the subscription to the lifetime of the object with asScopedDisposable:

    self.disposable = [[signal subscribeNext:^(id value) { /* stuff */ }] asScopedDisposable];

    Overdoing it

    It is easy to fall into the trap of trying to apply FRP to every single problem one encounters (also known as the golden hammer syndrome), thereby unnecessarily complicating the code. Defining clear boundaries and rules between the reactive and non-reactive parts of the code base is important to minimize verbosity and to use the power of FRP and Reactive Cocoa where appropriate.


    Performance

    There are inherent performance problems within RAC. For example, a simple imperative for loop is guaranteed to execute much faster than flattenMap:, which introduces a lot of internal method dispatching, object allocation, and state handling.

    In most cases this overhead is not noticeable, especially when I/O latency is involved, as in the preceding examples.

    However, in situations where performance really matters, such as fast UI rendering, it makes sense to avoid RAC completely.


    Debugging

    We found debugging to be a non-issue if your application components are well designed and individually tested. Backtraces tend to get longer, but this can be alleviated with some extra tooling like custom LLDB filters. A healthy amount of debug logging across critical components also does not hurt.


    Testing

    Testing a method that returns a RACSignal is more complicated than testing code that returns plain value objects, but it can be made less painful with a testing library that supports custom matchers. We have created a collection of matchers for expecta that lets us write concise tests. For example:

    RACSignal *signal = [subject executeRequest:url];
    expect(signal).to.sendSingle(@{ @"track": @{ @"title": @"foo" } });

    We found that adopting FRP tends to produce easily testable components because they are generally designed to perform one single task, which is to produce an output given a specific input.

    It took a while for the team to get up to speed with FRP and Reactive Cocoa and to learn for which parts of the application it can be used most effectively. It has now become an indispensable part of our mobile development efforts, both on Android and iOS. The functional reactive approach has made it easier to build complex functionality out of smaller pieces whilst simplifying concurrency and error handling.

  • July 3rd, 2014 Data Real-time counts with Stitch By Emily Green

    We made Stitch to provide counts and time-series of counts in real-time.

    Stitch was initially developed to do the timelines and counts for our stats pages. This is where users can see which of their tracks were played and when.

    SoundCloud Stats Screenshot

    Stitch is a wrapper around a Cassandra database. It has a web application that provides read-access to the counts through an HTTP API. The counts are written to Cassandra in two distinct ways, and it's possible to use either or both of them:

    For real-time updates, Stitch has a processor application that handles a stream of events coming from a broker and increments the appropriate counts in Cassandra.
    The batch part is a MapReduce job running on Hadoop that reads event logs, calculates the overall totals, and bulk loads this into Cassandra.

    The problem

    The difficulty with real-time counts is that incrementing is a non-idempotent operation, which means that if you apply the same increment twice you get a different value than when you apply it once. If an incident affects our data pipeline and the counts are wrong, we cannot fix them by simply re-feeding the day's events through the processors; we would risk double counting.
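
    To make the difference concrete, here is a small Ruby sketch that uses plain hashes in place of Cassandra counters. The event shape and the names increment and overwrite are assumptions made for illustration only.

```ruby
# Plain hashes stand in for the counter store.
counts = Hash.new(0)
event = { track: "track-1", plays: 1 }

# Incrementing is not idempotent: replaying the same event changes the result.
increment = ->(store, e) { store[e[:track]] += e[:plays] }
2.times { increment.call(counts, event) }

# Overwriting with a precomputed total is idempotent: replaying it is harmless.
totals = {}
overwrite = ->(store, track, total) { store[track] = total }
2.times { overwrite.call(totals, "track-1", 1) }
```

    After the replay, counts holds 2 plays for a single real play, while totals still holds 1.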

    Our first solution

    Initially, Stitch only supported real-time updates and addressed this problem with a MapReduce job named The Restorator that performed the following actions:

    1. Calculated the expected totals
    2. Queried Cassandra to get the values it had for each counter
    3. Calculated the increments needed to apply to fix the counters
    4. Applied the increments
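
    The four steps above can be sketched in Ruby as follows. This is not The Restorator's code: corrective_increments is a hypothetical helper, hashes stand in for the MapReduce output and for Cassandra, and the locking against the real-time processors is left out.

```ruby
# Steps 1 and 2 are assumed done: expected_totals comes from the batch
# calculation, stored_counts from reading the current counter values.
# Step 3: work out the increment each counter needs (possibly negative).
def corrective_increments(expected_totals, stored_counts)
  expected_totals.each_with_object({}) do |(counter, expected), fixes|
    delta = expected - stored_counts.fetch(counter, 0)
    fixes[counter] = delta unless delta.zero?
  end
end

expected = { "track-1" => 10, "track-2" => 4, "track-3" => 7 }
stored = { "track-1" => 12, "track-2" => 4 }

fixes = corrective_increments(expected, stored)

# Step 4: apply the increments to the stored counters.
fixes.each { |counter, delta| stored[counter] = stored.fetch(counter, 0) + delta }
```

    Only counters that actually drifted receive an increment, and after step 4 the stored values match the expected totals.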

    Meanwhile, to stop the sand shifting under its feet, The Restorator needed to coordinate a locking system between itself and the real-time processors, so that the processors did not try to simultaneously apply increments to the same counter, resulting in a race condition. It used ZooKeeper for this.

    As you can probably tell, this was quite complex, and it could take a long time to run. But despite this, it did indeed work.

    Our second solution

    We got a new use case: a team wanted to run Stitch purely in batch. This is when we added the batch layer and took the opportunity to revisit the way Stitch was dealing with the non-idempotent increments problem. We evolved to a Lambda Architecture-style approach, where we combine a fast real-time layer for a possibly inaccurate but immediate count with a slow batch layer for an accurate but delayed count. The two sets of counts are kept separately and updated independently, possibly even living on different database clusters. It is up to the reading web application to return the right version when queried. At its naïvest, it returns the batch counts instead of the real-time counts whenever they exist.
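
    At its simplest, the read path described above could look like the following Ruby sketch. The hashes stand in for the two separately stored sets of counts and count_for is a hypothetical name; Stitch's actual implementation is not shown in this post.

```ruby
# Prefer the batch count when one exists; otherwise fall back to the
# real-time count, and to zero when neither layer knows the counter.
def count_for(counter, batch_counts, realtime_counts)
  batch_counts.fetch(counter) { realtime_counts.fetch(counter, 0) }
end

batch_counts = { "track-1" => 100 }
realtime_counts = { "track-1" => 103, "track-2" => 5 }

track_1 = count_for("track-1", batch_counts, realtime_counts)
track_2 = count_for("track-2", batch_counts, realtime_counts)
track_3 = count_for("track-3", batch_counts, realtime_counts)
```

    Here track-1 is served from the batch layer even though the real-time layer has a newer value, which is exactly the trade-off of the naive strategy.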

    Stitch Diagram

    Thanks go to Kim Altintop and Omid Aladini who created Stitch, and John Glover who continues to work on it with me.

    If this sounds like the sort of thing you'd like to work on too, check out our jobs page.

  • June 13th, 2014 Scala Finagle Ruby Architecture Building Products at SoundCloud—Part III: Microservices in Scala and Finagle By Phil Calçado

    In the first two parts of this series, we talked about how SoundCloud started breaking away from a monolithic Ruby on Rails application into a microservices architecture. In this part we will talk a bit more about the platforms and languages in which we tend to write these microservices.

    At the same time that we started the process of building systems outside the Mothership (our Rails monolith) we started breaking our large team of engineers into smaller teams that focused on one specific area of our platform.

    It was a phase of high experimentation, and instead of defining which languages or runtimes these teams should use, we had a rule of thumb: "write it in whatever you feel confident enough putting in production and being on-call for".

    This led to a Cambrian Explosion of languages, runtimes and skills. We had systems being developed in everything from Perl to Julia, including Haskell, Erlang, and node.js.

    While this process proved quite productive in creating new systems, we started having problems when maintaining them. The bus factor for several of our systems was very low, and we eventually decided to consolidate our tools.

    Based on the expertise and preferences across teams, and an assessment of the industry and our peers, we decided to stick to the JVM and select JRuby, Clojure, and Scala as our company-wide supported languages for product development. For infrastructure and tooling, we also support Go and Ruby.

    It turns out that selecting the runtime and language is just one step in building products in a microservices architecture. Another important aspect an organization has to think about is what stack to use for things like RPC, resilience, and concurrency.

    After some research and prototyping, we ended up with three alternatives: a pure Netty implementation, the Netflix stack, and the Finagle stack.

    Using pure Netty was tempting at first. The framework is well documented and maintained, and the support for HTTP, our main protocol for RPC, is good. After a while, though, we found ourselves implementing abstractions on top of it to do basic things for the concurrency and resilience requirements of our systems. If such abstractions were to be required, we would rather use something that exists than re-invent the wheel.

    We tried the Netflix stack, and a while back Joseph Wilk wrote about our experience with Hystrix and Clojure. Hystrix does very well in the resilience and concurrency requirements, but its API based on the Command pattern was a turnoff. In our experience, Hystrix commands do not compose very well unless you also use RxJava, and although we use this library for several back-end systems and our Android application, we decided that the reactive approach was not the best for all of our use cases.

    We then started trying out Finagle, a protocol-agnostic RPC system developed by Twitter and used by many companies our size. Finagle does very well in our three requirements, and its design is based on a familiar and extensible Pipes-and-Filters meets Futures model.

    The first issue we found with Finagle is that, as opposed to the other alternatives, it is written in Scala, so the language runtime jar file is required even for a Clojure or JRuby application. We decided that this wasn’t too important: although it adds about 5MB to the transitive dependencies' footprint, the language runtime is very stable and does not change often.

    The other big issue was to adapt the framework to our conventions. Twitter uses mostly Thrift for RPC; we use HTTP. They use ZooKeeper for Service Discovery; we use DNS. They use a Java properties-based configuration system; we use environment variables. They have their own telemetry system; we have our own telemetry system (we're not ready to show it just yet, but stay tuned for some exciting news there). Fortunately, Finagle has some very nice abstractions for these areas, and most of the issues were solved with very minimal changes and there was no need to patch the framework itself.

    We then had to deal with the very messy state of Futures in Scala. Heather Miller, from the Scala core team, explained the history and changes introduced by newer versions of the language in a great presentation. But in summary, what we have across the Scala ecosystem are several different implementations of Futures and Promises, with Finagle coupled to Twitter's Futures. Although Scala allows for compatibility between these implementations, we decided to use Twitter's everywhere, and invest time in helping the Finagle community move closer to the most recent versions of Scala rather than debug weird issues that this interoperability might spawn.

    With these issues addressed, we focused on how best to develop applications using Finagle. Luckily, Finagle’s design philosophy is nicely described by Marius Eriksen, one of its core contributors, in his paper Your Server as a Function. You don’t need to follow these principles in your userland code, but in our experience everything integrates much better if you do. Using a functional programming language like Scala makes following these principles quite easy, as they map very well to pure functions and combinators.
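
    The idea of services and filters as plain functions can be illustrated outside of Scala. The Ruby sketch below is not Finagle's API and leaves out Futures entirely: a service is modelled as a callable from request to response, a filter as a callable taking the request and the next service, and compose, logging_filter, auth_filter and echo_service are hypothetical names.

```ruby
# A "service" is any callable from request to response.
echo_service = ->(request) { { status: 200, body: "hello #{request[:path]}" } }

# A "filter" receives the request and the next service; wrapping a service
# with a filter yields another service, so filters can be stacked freely.
def compose(filter, service)
  ->(request) { filter.call(request, service) }
end

log = []
logging_filter = lambda do |request, service|
  log << "-> #{request[:path]}"
  response = service.call(request)
  log << "<- #{response[:status]}"
  response
end

auth_filter = lambda do |request, service|
  if request[:token] == "secret"
    service.call(request)
  else
    { status: 401, body: "unauthorized" }
  end
end

app = compose(logging_filter, compose(auth_filter, echo_service))

ok = app.call({ path: "/tracks", token: "secret" })
denied = app.call({ path: "/tracks" })
```

    Because each piece is a function with no hidden state of its own, cross-cutting concerns such as logging or authentication can be added or removed without touching the service they wrap.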

    We have used Finagle for HTTP, Thrift, memcached, Redis, and MySQL. Every request to the SoundCloud platform is very likely hitting at least one of our Finagle-powered microservices, and the performance we have from these is quite amazing.

    In the last part of this series of blog posts, we will be talking about how Finagle and Scala are being used to move away from a one-size-fits-all RESTful API to optimized back-ends for our applications.

  • June 12th, 2014 Scala Finagle Ruby Architecture Building Products at SoundCloud—Part II: Breaking the Monolith By Phil Calçado

    In the previous post, we talked about how we enabled our teams to build microservices in Scala, Clojure, and JRuby without coupling them with our legacy monolithic Rails system. After the architecture changes were made, our teams were free to build their new features and enhancements in a much more flexible environment. An important question remained, though: how do we extract the features from the monolithic Rails application called Mothership?

    Splitting a legacy application is never easy, but luckily there are plenty of industry and academic publications to help you out.

    The first step in any activity like this is to identify and apply the criteria used to define the units to be extracted. At SoundCloud, we decided to use what Eric Evans and Martin Fowler call a Bounded Context. An obvious example of a Bounded Context in our domain was user-to-user messages. This was a well-contained feature set, highly cohesive, and not too coupled with the rest of the domain, as it just needs to hold references to users.

    After we identified the Bounded Context, the next task was to find a way to extract it. Unfortunately, Rails’ ActiveRecord framework often leads to a very coupled design. The code dealing with such messages was as follows:

    def index
      if (InboxItem === item)
        respond mailbox_items_in_collection.index.paginate(:page => params[:page])
      else
        respond mailbox_items_in_collection.paginate(
          :joins => "INNER JOIN messages ON #{safe_collection}_items.message_id =",
          :page  => params[:page],
          :order => 'messages.created_at DESC')
      end
    end

    Because we wanted to extract the messages’ Bounded Context into a separate microservice, we needed the code above to be more flexible. The first step we took was to refactor this code into what Michael Feathers describes as a seam:

    A seam is a place where you can alter behavior in your program without editing in that place.

    So we changed our code a little bit:

    def index
      conversations = cursor_for do |cursor|
        # delegates to conversations_service#conversations_for
      end
      respond collection_for(conversations, :conversations)
    end

    The first version of the conversations_service#conversations_for method was not that different from the previous code; it performed the exact same ActiveRecord calls.

    We were ready to extract this logic into a microservice without having to refactor lots of controllers and other Presentation Layer code. We first replaced the implementation of conversations_service#conversations_for with a call to the service:

    def conversations_for(user, offset = 0, limit = 50)
      response = @http_client.do_get(service_path(user), pagination(offset, limit))
      parse_response(user, response)
    end

    We avoided big-bang refactorings as much as we could, and this required us to have the microservices working together with the old Mothership code for as long as it took to completely extract the logic into the new microservice.
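
    The seam described above can be sketched as two interchangeable implementations behind the same conversations_for method. Everything here other than the conversations_for signature is an assumption made for illustration: LocalConversations stands in for the ActiveRecord-backed version, and RemoteConversations, FakeHttpClient and the request path are hypothetical.

```ruby
# Stand-in for the original in-process implementation.
class LocalConversations
  def initialize(rows)
    @rows = rows
  end

  def conversations_for(user, offset = 0, limit = 50)
    @rows.select { |row| row[:user] == user }.slice(offset, limit) || []
  end
end

# Stand-in for the implementation that calls the new microservice.
class RemoteConversations
  def initialize(http_client)
    @http_client = http_client
  end

  def conversations_for(user, offset = 0, limit = 50)
    @http_client.do_get("/users/#{user}/conversations", { offset: offset, limit: limit })
  end
end

# The calling code depends only on conversations_for, so the implementation
# can be swapped without editing this class. That is the seam.
class ConversationsController
  def initialize(conversations_service)
    @conversations_service = conversations_service
  end

  def index(user)
    @conversations_service.conversations_for(user, 0, 2)
  end
end

# Test double that echoes what it was asked for instead of doing HTTP.
class FakeHttpClient
  def do_get(path, params)
    [{ path: path, offset: params[:offset], limit: params[:limit] }]
  end
end

rows = [
  { user: "alice", subject: "hi" },
  { user: "bob", subject: "yo" },
  { user: "alice", subject: "re: hi" },
  { user: "alice", subject: "demo" }
]

local = ConversationsController.new(LocalConversations.new(rows)).index("alice")
remote = ConversationsController.new(RemoteConversations.new(FakeHttpClient.new)).index("alice")
```

    Both variants are driven through the same controller code, which is what allows the old and new implementations to coexist during a gradual migration.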

    As described before, we did not want to use the Mothership’s database as the integration point for microservices. That database is an Application Database, and making it an Integration Database would cause problems because we would have to synchronize any change in the database across many different microservices that would now be coupled to it.

    Although using the database as the integration point between systems was not planned, we had the new microservices accessing the Mothership’s database during the transition period.

    This brought up two important issues. During the whole transition period, the new microservices could not change the relational model in MySQL or, even worse, use a different storage engine. For extreme cases, like user-to-user messages, where a thread-based model was replaced by a chat-like one, we had cronjobs keep the different databases synchronized.

    The other issue was related to the Semantic Events system described in Part I. The way our architecture and infrastructure were designed requires events to be emitted where the state change happened, and this ought to be a single system. Because we could not have both the Mothership and the new microservice emitting events, we had to implement only the read-path endpoints until we were ready to make the full switch from the Mothership to the new microservice. This was less problematic than we first thought, but it did impact product prioritization because the features to be delivered were constrained by this strategy.

    By applying these principles we were able to extract most services from the Mothership. Currently we have only the most coupled part of our domain there, and products like the new user-to-user messaging system were built completely decoupled from the monolith.

    In the next part, we will look at how we use Scala and Finagle to build our microservices.

  • June 11th, 2014 Scala Finagle Ruby Architecture Building Products at SoundCloud —Part I: Dealing with the Monolith By Phil Calçado

    Most of SoundCloud's products are written in Scala, Clojure, or JRuby. This wasn't always the case. Like other start-ups, SoundCloud was created as a single, monolithic Ruby on Rails application running on the MRI, Ruby's official interpreter, and backed by memcached and MySQL.

    We affectionately call this system Mothership. Its architecture was a good solution for a new product used by several hundreds of thousands of artists to share their work, collaborate on tracks, and be discovered by the industry.

    The Rails codebase contained both our Public API, used by thousands of third-party applications, and the user-facing web application. With the launch of the Next SoundCloud in 2012, our interface to the world became mostly the Public API: we built all of our client applications on top of the same API that partners and developers used.

    Diagram 1

    These days, we have about 12 hours of music and sound uploaded every minute, and hundreds of millions of people use the platform every day. SoundCloud combines the challenges of scaling both a very large social network and a media distribution powerhouse.

    To scale our Rails application to this level, we developed, contributed to, and published several components and tools to help run database migrations at scale, be smarter about how Rails accesses databases, process a huge number of messages, and more. In the end, we decided to fundamentally change the way we build products, as we felt we were always patching the system and not resolving the fundamental scalability problem.

    The first change was in our architecture. We decided to move towards what is now known as a microservices architecture. In this style, engineers separate domain logic into very small components. These components expose a well-defined API and implement a Bounded Context, including its persistence layer and any other infrastructure needs.

    Big-bang refactoring has bitten us in the past, so the team decided that the best approach to the architecture changes would be not to split the Mothership immediately, but rather to stop adding anything new to it. All of our new features were built as microservices, and whenever a larger refactoring of a feature in the Mothership was required, we extracted the code as part of that effort.

    This started out very well, but soon enough we detected a problem. Because so much of our logic was still in the Rails monolith, pretty much all of our microservices had to talk to it somehow.

    One option around this problem was to have the microservices access the Mothership database directly. This is a very common approach in some corporate settings, but because this database is a Public, but not Published, Interface, it usually leads to many problems when we need to change the structure of shared tables.

    Instead, we went for the only Published Interface we had, which was the Public API. Our internal microservices would behave exactly like the applications that third-party organizations develop to integrate with the SoundCloud platform.

    Diagram 2

    Soon enough, we realized that there was a big problem with this model: our microservices needed to react to user activity. The push-notifications system, for example, needed to know whenever a track had received a new comment so that it could inform the artist about it. At our scale, polling was not an option. We needed to create a better model.

    We were already using AMQP in general and RabbitMQ in particular: in a Rails application you often need a way to dispatch slow jobs to a worker process to avoid hogging the concurrency-weak Ruby interpreter. Sebastian Ohm and Tomás Senart presented the details of how we use AMQP, but over several iterations we developed a model called Semantic Events, where changes in the domain objects result in a message being dispatched to a broker and consumed by whichever microservice finds the message interesting.
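
    The shape of this model can be sketched with an in-memory broker in Ruby. This is only an illustration of the publish/subscribe idea, not AMQP or RabbitMQ client code: the Broker class, the event names and the payload fields are all hypothetical.

```ruby
# Minimal in-memory broker: handlers register interest in an event type and
# are called for every published event of that type.
class Broker
  def initialize
    @subscribers = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(event_type, &handler)
    @subscribers[event_type] << handler
  end

  def publish(event_type, payload)
    @subscribers[event_type].each { |handler| handler.call(payload) }
  end
end

broker = Broker.new

# Two independent consumers of the same domain event.
notifications = []
broker.subscribe("comment.created") do |event|
  notifications << "notify #{event[:artist]} about comment on #{event[:track]}"
end

comment_counts = Hash.new(0)
broker.subscribe("comment.created") { |event| comment_counts[event[:track]] += 1 }

# The system where the state change happened emits the event once; it does
# not need to know who consumes it.
broker.publish("comment.created", { track: "track-1", artist: "artist-9" })
broker.publish("track.played", { track: "track-1" })
```

    The second event has no subscribers and is simply dropped, which mirrors the idea that each microservice consumes only the messages it finds interesting.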

    Diagram 3

    This architecture enabled Event Sourcing, which is how many of our microservices deal with shared data, but it did not remove the need to query the Public API. For example, you might need all fans of an artist and their email addresses to notify them about a new track.

    While most of the data was available through the Public API, we were constrained by the same rules we enforced on third-party applications. It was not possible, for example, for a microservice to notify users about activity on private tracks as users could only access public information.

    We explored several possible solutions to the problem. One of the most popular alternatives was to extract all of the ActiveRecord models from the Mothership into a Ruby gem, effectively making the Rails model classes a Published Interface and a shared component. There were several important issues with this approach, including the overhead of versioning the component across so many microservices, and the fact that it had become clear that microservices would be implemented in languages other than Ruby. Therefore, we had to think about a different solution.

    In the end, the team decided to use Rails' engines feature (or plugins, depending on the framework's version) to create an Internal API that is available only within our private network. To control what could be accessed internally, we used OAuth 2.0 when an application is acting on behalf of a user, with different authorisation scopes depending on which microservice needs the data.

    Diagram 4

    Although we are constantly removing features from the Mothership, having both a push and pull interface to the old system makes sure that we do not couple our new microservices to the old architecture. The microservice architecture has proven itself crucial to developing production-ready features with much shorter feedback cycles. External-facing examples are the visual sounds, and the new stats system.