July 24th, 2012

Go at SoundCloud

SoundCloud is a polyglot company, and while we’ve always operated with Ruby on Rails at the top of our stack, we’ve got quite a wide variety of languages represented in our backend. I’d like to describe a bit about how—and why—we use Go, an open-source language that recently hit version 1.

It’s in our company DNA that our engineers are generalists, rather than specialists. We hope that everyone will be at least conversant about every part of our infrastructure. Even more, we encourage engineers to change teams, and even form new ones, with as little friction as possible. An environment of shared code ownership is a perfect match for expressive, productive languages with low barriers to entry, and Go has proven to be exactly that.

Go has been described by several engineers here as a WYSIWYG language. That is, the code does exactly what it says on the page. It’s difficult to overemphasize how helpful this property is toward the unambiguous understanding and maintenance of software. Go explicitly rejects “helper” idioms and features like the Uniform Access Principle, operator overloading, default parameters, and even exceptions, on the basis that they create more problems through ambiguity than they solve in expressivity. There’s no question that these decisions carry a cost in keystrokes (especially, as most new engineers on Go projects lament, during error handling), but the payoff is that those same new engineers can easily and immediately build a complete mental model of the application. I feel confident in saying that the time from zero to productive commits is shorter in Go than in any other language we use; sometimes, dramatically so.

Go’s strict formatting rules and its “only one way to do things” philosophy mean we don’t waste much time bikeshedding about style. Code reviews on a Go codebase tend to be more about the problem domain than the intricacies of the language, which everyone appreciates.

Further, once an engineer has a working knowledge of Effective Go, there seems to be very little friction in moving from “how the application behaves today” to “how the application should behave in the ideal case.” Should a slow response from this backend abort the entire request? Should we retry exactly once, and then serve partial results? This agent has been acting strangely: can we install a 250ms timeout? Every high-level scenario in the behavior of a system can be expressed in a straightforward and idiomatic implementation, without the need for libraries or frameworks. Removing layers of abstraction reduces complexity; plainly stated, simpler code is better code.

Go has some other nice properties that we’ve taken advantage of. Static typing and fast compilation enable us to do near-realtime static analysis and unit testing during development. They also mean that building, testing, and rolling out Go applications through our deployment system is as fast as it gets.

In fact, fast builds, fast tests, fast peer reviews, and fast deployment mean that some ideas can go from the whiteboard to running in production in less than an hour. For example, the search infrastructure on Next is driven by Elastic Search, but managed and interfaced with the rest of SoundCloud almost exclusively through Go services. During validation testing, we realized that we needed the ability to mark indexes as read-only in certain circumstances, and needed the indexing applications to detect and respect this new dimension of index state. Adding the abstraction in the code, polling a new endpoint to reliably detect the state, changing the relevant indexing behaviors, and writing tests for them all took half an afternoon. By the evening, the changes had been deployed and had been running under load for hours. That kind of velocity, especially in a statically-typed, natively-compiled language, is exhilarating.

I mentioned our build and deployment system. It’s called Bazooka, and it’s designed to be a platform for managing the deployment of internal services. (We’ll be open-sourcing it pretty soon; stay tuned!) Scaling 12-Factor apps over a heterogeneous network can be thought of as one large, complex state machine, full of opportunities for inconsistency and race conditions. Go was a natural choice for this kind of job. Idiomatic Go is safely concurrent by default; Bazooka developers can reason about the complexity of their problem without being distracted by the complexity of their tools. And Bazooka makes use of Doozer to coordinate its shared state, which—in addition to being the only open-source implementation of Paxos in the wild (that we’re aware of)—is also written in Go.

Altogether, SoundCloud maintains about half a dozen services and over a dozen repositories written entirely in Go. And we’re increasingly turning to Go when spinning up new backend projects.

Interested in writing Go to solve real problems and build real products? We’d love to hear from you!

Peter Bourgon
  • B S

    Thanks for the article. I am also in need of integration with elasticsearch, and wonder how you approached this. Do you share any part of your code with the community? Thanks.

  • Kashif Rasul

    actually zookeeper from apache is another opensource paxos and there is also libpaxos another opensource albeit academic implementation…

  • http://www.amoraes.info/ André

    More and more people are starting to realize that Go is in fact a great language, not because it has many “cool features” but because it can solve hard problems without too much trouble.

    Good to know about one more user of Go.

  • Moshe Revah

    “user” might ~sound~ like an understatement. That’s another company using Go in production :-)

  • http://twitter.com/mosmann Michael Mosmann

    I am not sure if this feeling comes a little bit from this green field situation. Complexity does not come from code by itself, it comes from the real world. And not every time you can choose the best tool (f.i. a programming language) for the job.

  • http://twitter.com/globalo Christoph Sturm

    interesting that you say all your engineers are generalists, and on your jobs page you have very detailed job titles. also there is no job for a go programmer listed :)

  • Hubba

    You have a Go binary running as a service, but how do you ensure that it is ‘live’? Right now I run a cron script that checks and restarts processes as a lightweight solution, but before I start running a more beefy process monitor I wonder if there is not a more Go-centric solution? How do you keep your Go processes running smoothly?

  • Anonymous

    I’ve had success with supervisord for exactly this use case. works well without being too beefy.

  • Peter Bourgon

    Complexity originates in the Real World™, but code itself can be a complexity multiplier. For example, when the language is an impedance mismatch to the problem.

  • http://twitter.com/cloudhead Alexis Sellier

    Zookeeper uses its own protocol called Zab

    https://cwiki.apache.org/ZOOKEEPER/zab-vs-paxos.html

  • Anonymous

    At disqus we use runit to keep our services up. http://smarden.org/runit/

  • brad clawsie

    great post. i also have had the pleasure of writing a large system in go. its easy to get up to speed with the tool and there are very few surprises. channels are a huge win. the community is smart and engaged, and most people who pick up go don’t want to put it back down in favor of other tools

  • http://twitter.com/goldjunge Alexander Simmerl

    Most of the services are managed by the described deployment system(Bazooka). The system components of Bazooka are managed by http://smarden.org/runit/, as we use it for process management within the rest of our infrastructure.

  • Manuel Barkhau

    I tried to do continuous testing in Go, but didn’t find a decent solution. Are you using anything to run tests automatically when files change during development? If so, how are you doing it?

  • Albert Strasheim

    We’ve had great success with systemd (Linux only), including using features like socket activation.

  • Anonymous

    What are you using to help in debugging? 

  • http://twitter.com/bsdf ༟༟

    HELLO I LOVE PROGRAMMING GOLANG AND AM WORKING KNOWLEDGE OF EFFECTIVE GO. I LOVE GREAT GERMAN MUSIC CULTURE ASH RA TEMPEL AND EFDEMIN. PLEASE HIRE

  • Daniel Velkov

    Maybe you want to add Soundcloud as an Elastic Search user: http://www.elasticsearch.org/users/

  • Anonymous

    wow I need to check that out. socket activation sounds pretty cool

  • Antimonio

    There is no mention of Go language in the positions listed in your Jobs page ;)

  • Peter Bourgon

    I personally have Sublime Text 2 + GoSublime set up, which in its default state gives you realtime type- and error-checking. I’ve also defined a Build System for ‘go install && go test’, which I can fire off with Cmd-B.

  • http://twitter.com/eikke Nicolas Trangez

    Open-source real-world-used (Multi-)Paxos implementation? => http://arakoon.org (In the 2.0 branch on Github the state-machine is lifted for easier re-use in other applications)

    Disclaimer: I’m one of the co-authors.

  • marymemmanuel

    interesting………

  • jrusselltressa

    nice

  • Richard Gallacher

    Why did my high volume of traffic not show up on my stats? Why have I not had notice of my going pro?