Real-Time Counts with Stitch

July 3rd, 2014 by Emily Green

Here at SoundCloud, we built a system called Stitch to provide counts, and time series of counts, in real time.

Stitch was initially developed to provide timelines and counts for our stats pages, where users can see which of their tracks were played and when.

Stitch is a wrapper around a Cassandra database. It has a web application that provides read access to the counts through an HTTP API. The counts are written to Cassandra in two distinct ways, and it’s possible to use either one or both of them:

Real Time: For real-time updates, Stitch has a processor application that consumes a stream of events from a broker and increments the appropriate counts in Cassandra.
Batch: The batch part is a MapReduce job running on Hadoop that reads event logs, calculates the overall totals, and then bulk loads them into Cassandra.
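
To make the two write paths concrete, here is a minimal sketch in Python. It is not Stitch's actual code: an in-memory dictionary stands in for Cassandra, and the class and function names, the event fields (`track_id`, `played_at`), and the hourly bucket size are all assumptions made for illustration. It also assumes that a batch load overwrites whatever the real-time path has written for the same key.

```python
from collections import defaultdict
from datetime import datetime


def hour_bucket(ts):
    """Truncate a timestamp to its hour (hourly granularity is an assumption)."""
    return ts.strftime("%Y-%m-%dT%H:00")


class CountStore:
    """In-memory stand-in for the Cassandra tables; not Stitch's real schema."""

    def __init__(self):
        self.counts = defaultdict(int)  # (track_id, bucket) -> play count

    def increment(self, track_id, bucket, by=1):
        # Real-time path: one increment per incoming event.
        self.counts[(track_id, bucket)] += by

    def bulk_load(self, totals):
        # Batch path: overwrite counts with totals recomputed from the logs.
        self.counts.update(totals)

    def timeline(self, track_id):
        # Read path: time series of counts for one track, ordered by bucket.
        return sorted((b, n) for (t, b), n in self.counts.items() if t == track_id)


def process_event(store, event):
    """Real-time processor: handle a single event taken from the broker."""
    store.increment(event["track_id"], hour_bucket(event["played_at"]))


def batch_totals(event_log):
    """Batch job: recompute every total from the full event log."""
    totals = defaultdict(int)
    for event in event_log:
        totals[(event["track_id"], hour_bucket(event["played_at"]))] += 1
    return dict(totals)


# Hypothetical play events, used only to exercise the sketch.
events = [
    {"track_id": "t1", "played_at": datetime(2014, 7, 3, 10, 5)},
    {"track_id": "t1", "played_at": datetime(2014, 7, 3, 10, 40)},
    {"track_id": "t2", "played_at": datetime(2014, 7, 3, 10, 20)},
    {"track_id": "t1", "played_at": datetime(2014, 7, 3, 11, 15)},
]

realtime_store = CountStore()
for event in events:
    process_event(realtime_store, event)

batch_store = CountStore()
batch_store.bulk_load(batch_totals(events))
```

With the same events as input, both stores end up holding identical counts, which is why either path can be used alone or the two can be combined.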

