In the past, the Search Team at SoundCloud had high lead times for making updates to Elasticsearch clusters, either during the implementation of a new feature or simply while fixing a bug. This was because both tasks require us to reindex our catalog from scratch, which means reindexing more than 720 million users, tracks, playlists, and albums. Altogether, this process took up to one week, though there was even one scenario where it almost took one month to roll out a bug fix.
In this post, I would like to share the concrete Elasticsearch tweaks we made so that we can now reindex our entire catalog in one hour.
SoundCloud consists of hundreds of millions of tracks, people, albums, and playlists, and navigating this vast collection of music and personalities poses a large challenge, particularly with so many covers, remixes, and original works all in one place.