SoundCloud for Developers

Discover, connect and build

You're browsing posts of the category Announcements

  • May 18th, 2015 Announcements API Apple's June 1 64-bit Deadline By Erik Michaels-Ober

    In October 2014, Apple announced that all submissions to the App Store must include 64-bit support by June 1, 2015. The SoundCloud API for Cocoa contains 32-bit dependencies and will not be updated, because it has been discontinued. Anyone using the SoundCloud API for Cocoa will need to migrate away from it if they wish to update their app after June 1.

    To ease this transition we have built a sample app that demonstrates how to authorize a user via OAuth using only built-in Foundation libraries.

    Once an access_token has been obtained via OAuth, you can make GET requests like so:

    let urlSession = NSURLSession.sharedSession()
    let urlString = "https://api.soundcloud.com/me"
    let urlComponents = NSURLComponents(string: urlString)!
    urlComponents.queryItems = [ NSURLQueryItem(name: "oauth_token", value: "insert an OAuth token here")]
    let url = urlComponents.URL!
    
    let dataTask = urlSession.dataTaskWithRequest(NSURLRequest(URL: url)) { (data, response, error) -> Void in
       // data is nil when the request fails, so check it before parsing
       if data != nil, let jsonOutput = NSJSONSerialization.JSONObjectWithData(data, options: nil, error: nil) as? [String:AnyObject] {
           // do stuff with JSON
       }
    }
    
    dataTask.resume()
    

    POST requests that contain multipart data (e.g. uploading a track) look like this:

    import UIKit
    
    class ViewController: UIViewController {
        func uploadTrack(title: String, trackPath: String) {
            let urlSession = NSURLSession.sharedSession()
            let urlRequest = getURLRequest(title, audioPath: trackPath)
            let task = urlSession.dataTaskWithRequest(urlRequest) { (data, response, error) -> Void in
                if let httpResponse = response as? NSHTTPURLResponse {
                    println("returned \(httpResponse.statusCode)")
                }
                if data != nil, let response = NSString(data: data, encoding: NSUTF8StringEncoding) {
                    println(response)
                }
                if let err = error {
                    println(err)
                }
            }
            task.resume()
        }
    
        func getURLRequest(title: String, audioPath: String) -> NSURLRequest {
            let boundary = NSUUID().UUIDString
            let request = NSMutableURLRequest(URL: NSURL(string: "https://api.soundcloud.com/tracks")!)
            request.HTTPMethod = "POST"
            request.HTTPBody = getPostData("insert an OAuth token here", boundary: boundary, title: title, audioPath: audioPath)
            let contentType = "multipart/form-data; boundary=" + boundary
            request.setValue(contentType, forHTTPHeaderField: "Content-Type")
            return request
        }
    
        func getPostData(token: String, boundary: String, title: String, audioPath: String) -> NSData {
            let boundaryStart = "--\(boundary)\r\n"
            let boundaryEnd = "\r\n--\(boundary)--\r\n"
            let bodyData : NSMutableData = NSMutableData()
    
            // add the token
            var tokenSection = boundaryStart
            tokenSection += "Content-Disposition: form-data; name=\"oauth_token\"\r\n\r\n"
            tokenSection += "\(token)\r\n"
            bodyData.appendData(tokenSection.dataUsingEncoding(NSUTF8StringEncoding)!)
    
            // add the track title
            var titleSection = boundaryStart
            titleSection += "Content-Disposition: form-data; name=\"track[title]\"\r\n\r\n"
            titleSection += "\(title)\r\n"
            bodyData.appendData(titleSection.dataUsingEncoding(NSUTF8StringEncoding)!)
    
            // add the audio file
            let trackData = NSData(contentsOfFile: audioPath)!
            var trackSection = boundaryStart
            trackSection += "Content-Disposition: form-data; name=\"track[asset_data]\"; "
            trackSection += "filename=\"\(audioPath.lastPathComponent)\"\r\n"
            trackSection += "Content-Type: application/octet-stream\r\n"
            trackSection += "\r\n"
            bodyData.appendData(trackSection.dataUsingEncoding(NSUTF8StringEncoding)!)
            bodyData.appendData(trackData)
            bodyData.appendData(boundaryEnd.dataUsingEncoding(NSUTF8StringEncoding)!)
            return bodyData
        }
    }
    

    Note: This example assumes access tokens never expire. We encourage you not to make this assumption in your production code. Instead, build your app assuming that tokens will periodically expire and can be refreshed using a refresh token. For details on how to use a refresh token, see Section 1.5 of the OAuth 2.0 specification.
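
    As a rough sketch of the refresh step, the function below builds the form-encoded body that Section 6 of the OAuth 2.0 specification describes for a refresh request. It is written in current Swift syntax rather than the Swift 1.2 used above. The function names and the choice to send client_id and client_secret in the body are assumptions made for illustration, not part of the sample app, and the token endpoint to POST the body to is left out because this post does not name it.

```swift
import Foundation

// Percent-encode one form value, leaving only ASCII unreserved characters as-is.
func formEncode(_ value: String) -> String {
    let allowed = CharacterSet(charactersIn:
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")
    return value.addingPercentEncoding(withAllowedCharacters: allowed) ?? value
}

// Build the body of a refresh request (hypothetical helper).
// grant_type and refresh_token are the parameters required by OAuth 2.0;
// passing the client credentials in the body is an assumption of this sketch.
func refreshRequestBody(refreshToken: String, clientID: String, clientSecret: String) -> String {
    let fields: [(String, String)] = [
        ("grant_type", "refresh_token"),
        ("refresh_token", refreshToken),
        ("client_id", clientID),
        ("client_secret", clientSecret),
    ]
    return fields.map { "\($0.0)=\(formEncode($0.1))" }.joined(separator: "&")
}
```

    The resulting string would be sent as the body of a POST request with a Content-Type of application/x-www-form-urlencoded. A successful response carries a new access token to use in place of the expired one.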

  • February 2nd, 2015 Announcements API Linked partitioning to replace offset-based pagination By Erik Michaels-Ober

    The SoundCloud API will be dropping support for offset-based pagination on March 2, 2015, in favor of linked partitioning.

    To page through a JSON response, pass the linked_partitioning=1 parameter with your request. The API will return a collection, along with a next_href property if there are additional results. To fetch the next page of results, follow that URI. If the response does not contain a next_href property, you have reached the end of the results.

    You can read more about linked partitioning in the Pagination section of our HTTP API Guide, including code examples in JavaScript, PHP, Python, and Ruby.

    The limit parameter continues to be supported with linked partitioning. The default limit is 50 with a maximum value of 200.
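
    The paging loop described above could look roughly like the following Swift sketch. The collection and next_href property names come from this post. The Page type, the function names, and the fetch closure (a stand-in for whatever performs the HTTP GET) are hypothetical, and the caller is assumed to put linked_partitioning=1 and any limit in the first URL.

```swift
import Foundation

// One page of a linked-partitioning response: the items plus an optional link to the next page.
struct Page {
    let collection: [[String: Any]]
    let nextHref: String?
}

// Parse the JSON body of a response that was requested with linked_partitioning=1.
func parsePage(_ data: Data) -> Page? {
    guard let object = try? JSONSerialization.jsonObject(with: data),
          let json = object as? [String: Any],
          let collection = json["collection"] as? [[String: Any]] else {
        return nil
    }
    // A missing or null next_href means this is the last page.
    return Page(collection: collection, nextHref: json["next_href"] as? String)
}

// Follow next_href links until a response no longer contains one.
// `fetch` is a hypothetical stand-in that returns the response body for a URL.
func fetchAll(startingAt firstURL: String, fetch: (String) -> Data?) -> [[String: Any]] {
    var results: [[String: Any]] = []
    var nextURL: String? = firstURL
    while let url = nextURL, let data = fetch(url), let page = parsePage(data) {
        results += page.collection
        nextURL = page.nextHref
    }
    return results
}
```

    Because the loop only ever follows the URI it was handed, it does not need to know how the server encodes the position of the next page.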

    Please update your code to replace the offset parameter with linked_partitioning. If you have any questions about this update, please notify us via email.

  • January 26th, 2015 Announcements Open Source Monitoring Go Prometheus: Monitoring at SoundCloud By Julius Volz, Björn Rabenstein

    In previous blog posts, we discussed how SoundCloud has been moving towards a microservice architecture. Soon we had hundreds of services, with many thousands of instances running and changing at the same time. With our existing monitoring set-up, mostly based on StatsD and Graphite, we ran into a number of serious limitations. What we really needed was a system with the following features:

    • A multi-dimensional data model, so that data can be sliced and diced at will, along dimensions like instance, service, endpoint, and method.

    • Operational simplicity, so that you can spin up a monitoring server where and when you want, even on your local workstation, without setting up a distributed storage backend or reconfiguring the world.

    • Scalable data collection and decentralized architecture, so that you can reliably monitor the many instances of your services, and independent teams can set up independent monitoring servers.

    • Finally, a powerful query language that leverages the data model for meaningful alerting (including easy silencing) and graphing (for dashboards and for ad-hoc exploration).

    All of these features existed in various systems. However, we could not identify a system that combined them all until a colleague started an ambitious pet project in 2012 that aimed to do so. Shortly thereafter, we decided to develop it into SoundCloud's monitoring system: Prometheus was born.

  • November 17th, 2014 Announcements API XML responses deprecated By Erik Michaels-Ober

    The SoundCloud API will be dropping support for Extensible Markup Language (XML) responses. XML will be phased out on the following schedule:

    1. XML is currently the default response format for requests without an explicit format specified in the path (e.g. /tracks) or Accept header. This default will be changed to JSON on December 1, 2014.
    2. Explicit requests for XML — specified either in the path (e.g. /tracks.xml) or an Accept: application/xml header — will continue to be supported until December 15, 2014. After that point, only JSON responses will be supported.
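
    One way to stay unaffected by the change of default is to ask for JSON explicitly. The Swift sketch below sets an Accept: application/json header, by analogy with the Accept: application/xml header mentioned above. The jsonRequest helper is a hypothetical name, and the code uses current Swift syntax.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives in this module on Linux
#endif

// Build a GET request that states the desired response format explicitly,
// so it does not depend on the API's default format.
func jsonRequest(for url: URL) -> URLRequest {
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    request.setValue("application/json", forHTTPHeaderField: "Accept")
    return request
}
```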

    SoundCloud has been using JSON exclusively for internal APIs for several years. Dropping support for XML in our public API will allow us to focus on providing consistent and reliable service.

    If your app still uses XML responses, please start working to upgrade it to JSON immediately. If your app does not currently use XML responses, it should be unaffected by this change.

    If you are unable to migrate your app from XML to JSON for some reason, we recommend accessing the SoundCloud API through a proxy server that converts JSON to XML.

    If you have any questions about this update, please let us know via email.

  • May 9th, 2014 Announcements Go Open Source Roshi: a CRDT system for timestamped events By Peter Bourgon

    Let's talk about the stream.

    The SoundCloud stream represents stuff that's relevant to you primarily via your social graph, arranged in time order, newest-first. The atom of that data model, an event, is a simple enough thing.

    • Timestamp
    • User who did the thing
    • Identifier of the thing that was done

    For example, when A-Trak reposts Skrillex, that repost is a single event.

    If you followed A-Trak, you'd want to see that repost event in your stream. Easy. The difficult thing about time-ordered events is scale, and there are basically two strategies for building a large-scale time-ordered event system.

    Data models

    Fan out on write means everybody gets an inbox.

    That's how it works today: we use Cassandra, and give each user a row in a column family. When A-Trak reposts Skrillex, we fan-out that event to all of A-Trak's followers, and make a bunch of inserts. Reads are fast, which is great. But writes carry perverse incentives: the more followers you have, the longer it takes to persist all of your updates. Storage requirements are also quadratic against user growth and follower count (i.e. affiliation density). And mutations, e.g. changes in the social graph, become costly or unfeasible to implement at the data layer. It works, but it's unwieldy in a lot of dimensions.
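
    The write-side cost can be seen in a small in-memory sketch. This is an illustration in Swift, not the Cassandra schema: the Event and FanOutOnWriteStore types and their field names are hypothetical.

```swift
// An event in the stream: when it happened, who did it, and what was done.
struct Event: Equatable {
    let timestamp: Int   // e.g. seconds since the epoch
    let user: String     // user who did the thing
    let item: String     // identifier of the thing that was done
}

// In-memory stand-in for "everybody gets an inbox".
struct FanOutOnWriteStore {
    var followers: [String: [String]] = [:]   // user -> users who follow them
    var inboxes: [String: [Event]] = [:]      // user -> their materialized stream

    // One insert per follower, so the cost of a write grows with follower count.
    // Returns the number of inbox inserts performed.
    mutating func publish(_ event: Event) -> Int {
        let recipients = followers[event.user] ?? []
        for follower in recipients {
            inboxes[follower, default: []].append(event)
        }
        return recipients.count
    }

    // A read is a single lookup, returned newest-first.
    func stream(for user: String) -> [Event] {
        return (inboxes[user] ?? []).sorted { $0.timestamp > $1.timestamp }
    }
}
```

    Note that every follower's inbox holds its own copy of the event, which is where the storage growth comes from, and that changing who follows whom does not touch copies that were already written.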

    At some point, those caveats and restrictions started affecting our ability to iterate on the stream. To keep up with product ideas, we needed to address the infrastructure. And rather than tackling each problem in isolation, we thought about changing the model.

    The alternative is fan in on read.

    When A-Trak reposts Skrillex, it's a single append to A-Trak's outbox. When users view their streams, the system will read the most recent events from the outboxes of everyone they follow, and perform a merge. Writes are fast, storage is minimal, and since streams are generated at read time, they naturally represent the present reality. (It also opens up a lot of possibilities for elegant implementations of product features and experiments.)
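
    A minimal in-memory sketch of that model follows, written in Swift for illustration. The Event and FanInOnReadStore types are hypothetical, and a real system would read the outboxes concurrently rather than in a simple loop.

```swift
// An event in the stream: when it happened, who did it, and what was done.
struct Event: Equatable {
    let timestamp: Int   // e.g. seconds since the epoch
    let user: String     // user who did the thing
    let item: String     // identifier of the thing that was done
}

// In-memory stand-in for "everybody gets an outbox".
struct FanInOnReadStore {
    var following: [String: [String]] = [:]   // user -> users they follow
    var outboxes: [String: [Event]] = [:]     // user -> events they produced

    // A write is a single append, regardless of follower count.
    mutating func publish(_ event: Event) {
        outboxes[event.user, default: []].append(event)
    }

    // A read gathers the outboxes of everyone the user follows,
    // sorts newest-first, and cuts to the requested length.
    func stream(for user: String, limit: Int) -> [Event] {
        let sources = following[user] ?? []
        let merged = sources.flatMap { outboxes[$0] ?? [] }
        return Array(merged.sorted { $0.timestamp > $1.timestamp }.prefix(limit))
    }
}
```

    Because the stream is computed from the current follow list on every read, unfollowing someone removes their events from the next read without rewriting any stored data.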

    Of course, reads are difficult. If you follow thousands of users, making thousands of simultaneous reads, time-sorting, merging, and cutting within a typical request-response deadline isn't trivial. As far as we know, nobody operating at our scale builds timelines via fan-in-on-read. And we presume that's due at least in part to the challenges of reads.

    Yet we saw potential here. Storage reduction was actually huge: we projected a complete fan-in-on-read data size for all users on the order of a hundred gigabytes. At that size, it's feasible to keep the data set in memory, distributed among commodity servers. The problem then becomes coördination: how do you reliably and correctly populate that data system (writes), and materialize views from up to thousands of sources by hard deadlines (reads)?

    Enter the CRDT

    If you're into so-called AP data systems, you've probably run into the term CRDT recently. CRDTs are conflict-free replicated data types: data structures for distributed systems. The tl;dr on CRDTs is that by constraining your operations to only those which are associative, commutative, and idempotent, you sidestep a lot of the complexity in distributed programming. (See: ACID 2.0 and/or CALM theorem.) That, in turn, makes it straightforward to guarantee eventual consistency in the face of failure.
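
    To make the three properties concrete, here is a small Swift sketch using one simple merge rule: keep the highest timestamp seen for each member. The State alias and the merge function are illustrative and are not taken from any particular CRDT library.

```swift
// Replica state: the latest timestamp seen for each member.
typealias State = [String: Int]

// Merge two replicas by keeping the larger timestamp for every member.
func merge(_ a: State, _ b: State) -> State {
    return a.merging(b) { max($0, $1) }
}

let x: State = ["t1": 10]
let y: State = ["t1": 20, "t2": 5]
let z: State = ["t3": 7]

// Commutative: the order of the two arguments does not matter.
let commutative = merge(x, y) == merge(y, x)
// Associative: the grouping of merges does not matter.
let associative = merge(merge(x, y), z) == merge(x, merge(y, z))
// Idempotent: applying the same update again changes nothing.
let idempotent = merge(merge(x, y), y) == merge(x, y)

print(commutative, associative, idempotent)  // prints "true true true"
```

    With these properties, replicas can receive updates late, out of order, or more than once and still arrive at the same state.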

    With a bit of thinking, we were able to map a fan-in-on-read stream product to a data model that could be implemented with a specific type of CRDT. We were then able to focus on performance, optimizing our reads without becoming overwhelmed by incidental complexity imposed by the consistency model.

    Roshi

    The result of our work is Roshi, a distributed storage system for time-series events. It implements what we believe is a novel CRDT set type, closely resembling an LWW-element-set with inline garbage collection. At its core, it uses the Redis ZSET sorted set to store state, and orchestrates self-repairing reads and writes on top, in a stateless operational layer. We spent a long while optimizing the read path to support our latency and QPS requirements, and we're confident that Roshi will accommodate our exponential growth for years. It took about six developer months to build, and we're in the process of rolling it out now.
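
    For readers new to the idea, the following Swift sketch shows a simplified last-writer-wins element set that keeps only the most recent operation per member. It is an assumption-laden illustration, not Roshi's implementation: Roshi is written in Go and stores its state in Redis, and the LWWSet type and its method names here are hypothetical.

```swift
// A last-writer-wins element set: each member keeps only its most recent operation.
struct LWWSet {
    enum Op: Equatable { case add, remove }
    struct Entry: Equatable {
        let timestamp: Int
        let op: Op
    }

    var entries: [String: Entry] = [:]

    // Record an operation only if it is strictly newer than what is stored.
    // Ties keep the existing entry in this sketch; a real system needs a
    // deterministic rule for equal timestamps.
    mutating func apply(_ op: Op, member: String, at timestamp: Int) {
        if let existing = entries[member], existing.timestamp >= timestamp { return }
        entries[member] = Entry(timestamp: timestamp, op: op)
    }

    // Members whose latest operation is an add, newest first.
    func members() -> [String] {
        return entries
            .filter { $0.value.op == .add }
            .sorted { $0.value.timestamp > $1.value.timestamp }
            .map { $0.key }
    }
}
```

    Since an older operation never overwrites a newer one, replaying or reordering the same operations with distinct timestamps leaves the set in the same final state.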

    Roshi is fully open-source, and all the gory technical details are in the repository, so please do check it out. I hope it's easy to grok: at the time of writing, it's 5000 lines of Go, of which 2300 are tests. And we intend to keep the codebase lean, explicitly not adding features that are outside of the tightly defined problem domain.

    Open-sourcing our work naturally serves the immediate goal of providing usable software to the community. We hope that Roshi may be a good fit for problems in your organizations, and we look forward to collaborating with anyone who's interested in contributing. Open-sourcing also serves another, perhaps more interesting goal, which is advancing a broader discussion about software development. The obvious reaction to Roshi is to ask why we didn't implement it with an existing, proven data system like Cassandra. But we too often underestimate the costs of doing that: costs like mapping your domain to the generic language of the system, learning the subtleties of the implementation, operating it at scale, and dealing with bugs that your likely novel use cases may reveal. There are even second-degree costs: when software engineering is reduced to plumbing together generic systems, software engineers lose their sense of ownership, which is the foundation of craftsmanship and software quality.

    Given a well-defined problem, a specific solution may be far less costly than a generic version: there's a smaller domain translation, a much smaller surface area, and less operational friction. We hope that Roshi stands in evidence for the case that the practice of software engineering can be a more thoughtful and crafted process. Software that is "invented here" can, in the right circumstances, deliver outstanding business value.

    Roshi was a team effort. I'm deeply indebted to the amazing work of Tomás Senart, Björn Rabenstein, and Johan Uhle, without whom Roshi would have never been possible.