Building Products at SoundCloud—Part II: Breaking the Monolith

In the previous post, we talked about how we enabled our teams to build microservices in Scala, Clojure, and JRuby without coupling them with our legacy monolithic Rails system. After the architecture changes were made, our teams were free to build their new features and enhancements in a much more flexible environment. An important question remained, though: how do we extract the features from the monolithic Rails application called Mothership?

Splitting a legacy application is never easy, but luckily there are plenty of industry and academic publications to help you out.

The first step in any activity like this is to identify and apply the criteria used to define the units to be extracted. At SoundCloud, we have decided to use the work of Eric Evans and Martin Fowler in what is called a Bounded Context. An obvious example of Bounded Context in our domain was user-to-user messages. This was a well-contained feature set, highly cohesive, and not too coupled with the rest of the domain, as it just needs to hold references to users.

After we identified the Bounded Context, the next task was to find a way to extract it. Unfortunately, Rails’ ActiveRecord framework often leads to a very coupled design. The code dealing with such messages was as follows:

 def index
  if (InboxItem === item)
    respond mailbox_items_in_collection.index.paginate(:page => params[:page])
  else
    respond mailbox_items_in_collection.paginate(
      :joins => "INNER JOIN messages ON #{safe_collection}_items.message_id = messages.id",
      :page  => params[:page],
      :order => 'messages.created_at DESC')
  end
end

Because we wanted to extract the messages’ Bounded Context into a separate microservice, we needed the code above to be more flexible. The first step we took was to refactor this code into what Michael Feathers describes as a seam:

A seam is a place where you can alter behavior in your program without editing in that place.

So we changed our code a little bit:

def index
  conversations = cursor_for do |cursor|
    conversations_service.conversations_for(
    current_user,
    cursor[:offset],
    cursor[:limit])
  end

  respond collection_for(conversations, :conversations)
end

The first version of the conversations_service#conversations_for method was not that different from the previous code; it performed the exact same ActiveRecord calls.

We were ready to extract this logic into a microservice without having to refactor lots of controllers and other Presentation Layer code. We first replaced the implementation of conversations_service#conversations_for with a call to the service:

def conversations_for(user, offset = 0, limit = 50)
  response = @http_client.do_get(service_path(user), pagination(offset, limit))
  parse_response(user, response)
end

We avoided big-bang refactorings as much as we could, and this required us to have the microservices working together with the old Mothership code for as long as it took to completely extract the logic into the new microservice.

As described before, we did not want to use the Mothership’s database as the integration point for microservices. That database is an Application Database, and making it an Integration Database would cause problems because we would have to synchronize any change in the database across many different microservices that would now be coupled to it.

Although using the database as the integration point between systems was not planned, we had the new microservices accessing the Mothership’s database during the transition period.

This brought up two important issues. During the whole transition period, the new microservices could not change the relational model in MySQL—or, even worse, use a different storage engine. For extreme cases, like user-to-user messages where a threaded-based model was replaced by a chat-like one, we had cronjobs keep different databases synchronized.

The other issue was related to the Semantic Events system described in Part I. The way our architecture and infrastructure was designed requires events to be emitted where the state change happened, and this ought to be a single system. Because we could not have both the Mothership and the new microservice emitting events, we had to implement only the read-path endpoints until we were ready to make the full switch from the Mothership to the new microservice. This was less problematic than what we first thought, but nevertheless it did impact product prioritization because features to be delivered were constrained by this strategy.

By applying these principles we were able to extract most services from the Mothership. Currently we have only the most coupled part of our domain there, and products like the new user-to-user messaging system were built completely decoupled from the monolith.

In the next part, we will look at how we use Scala and Finagle to build our microservices.