• Collins - Infrastructure Management for Engineers

    mobocracy:

    At Tumblr we strive to automate as much as is reasonable. Automation helps us manage thousands of servers, our MySQL topology, our software deployments, our configuration updates, A/B testing, etc. As your production environment grows it generally becomes less and less consistent and more and more difficult to manage, even with tools like Puppet and Chef. Eventually you need a central point of truth from which you can determine the state of any given asset in your environment.

    When we started building a second production datacenter last year we needed a way to manage the intake process for the thousands of servers that would be getting shipped to us. Although we built Collins to support our secondary datacenter, we deployed it to the legacy production environment in February 2012. At that point we already had roughly 1500 production servers yet we had no consistent view of the environment. The initial problem we set out to solve was simply to inventory the production environment and get a sense of what servers were in use, which servers weren’t, and where there might be cost savings. After completing the inventory process, Collins quickly found another use in helping automate the management of our MySQL topology via Jetpants.

    Pretty quickly most of our infrastructure was using or populating Collins data in some way: Puppet, Func, the deployment tool, host provisioning, graphing/trending, proxy configuration, Nagios configuration, DNS configuration, etc. Today an engineer at Tumblr can log in to Collins using their LDAP credentials, find an available host, click Provision, and be on their new dev box in under an hour. This is actually part of the developer on-boarding process.

    In our recipes document you can find some sample use cases for Collins like:

    Today we are open sourcing Collins, the Tumblr infrastructure management system. Collins was developed using the Play framework but was designed so that people without any Java/Scala experience could integrate with it using the API or via Callbacks. Here at Tumblr we have Bash, Python, and Ruby integrations with Collins, all developed by different people. There is also a PHP SDK for Collins in the works.

    We are releasing the following components, available under the Apache License v2.0:

    The Documentation is available under the Creative Commons BY 3.0 license.

    Collins can be integrated with the Jetpants MySQL management toolkit through an open source plugin called jetpants_collins. This plugin allows Jetpants to use Collins as the single point of truth for your hardware inventory, automatically querying the list of pools, shards, hosts, and database instances in your infrastructure. Furthermore, every change you make to your MySQL topology using Jetpants (master promotions, shard splits, cloning replicas, etc.) will be reflected in Collins immediately and automatically.

    Over the next month we will also be open sourcing a number of other related components which you can find out more about here and here.

    In the meantime, here are some more links to get you started:

    A number of people were responsible for helping make Collins so successful at Tumblr. Big thanks to Dan Simon, Steve Salevan, Joshua Hoffman, Dallas Marlow, Brad McDuffie, Evan Elias and all the rest of the Tumblr engineering team. Additionally, a number of companies helped beta test Collins and provided feedback. Thanks to all of you!

    Oh, and if you’re interested in this type of work, we’re hiring!

  • Managing large sharded MySQL topologies with Jetpants (PDF)

    Slides from our talk at Percona Live NYC are now available. The presentation covers the design and implementation of Jetpants, Tumblr’s open source toolkit for interacting with hundreds of MySQL database servers.

    Interested in working with relational databases at scale? Tumblr is currently hiring database engineers in NYC!

  • JavaScript/native bridge for iOS’s UIWebView

    bryan:

    A Hacker News commenter, in response to Zach Williams’ Tumblr for iOS deconstruction post, asked for documentation on how to pass data from JavaScript running in a UIWebView to a native controller. This is all it takes:

  • Jetpants installation on Ubuntu

    Tim Ellis of PalominoDB wrote up a great walk-through for installing and using Jetpants on Ubuntu as well as Linux Mint.

    We’ll be rolling in some of his changes in the next release, to make Jetpants run on more Linux distros out of the box.

  • What have we been up to?

    For a long time I’ve wanted to put up some kind of a projects/presentations page. We’re usually pretty busy and don’t always have time to talk about what we’re up to, but we are always writing code and always making good progress.

    Tumblr finally has a GitHub page to keep people up to date on some of the things we’re working on, as well as presentations from engineers here.

    I’ve also gone ahead and put in a ‘Coming Soon’ section highlighting some of the things we’re planning on releasing in the very near future.

    I hope that over the coming months we’ll be releasing more and more of the great stuff we’ve been building here at Tumblr. And of course, the documentation repo is open source as well.

    Thanks to Chris Aniszczyk at Twitter for being such a tremendous open source advocate over there and inspiring us to release early and often. I also borrowed much of their docs code, so thanks for that as well.

    Keep an eye out for new stuff on this blog and through the GitHub pages.

  • Tumblr is going to TechCrunch Disrupt!

    developers:

    Want to build something cool? How about a free t-shirt?

    Our very own John Bunting will be attending the TechCrunch Disrupt Hackathon to help people build apps for following the world’s creators!

    He’ll be doing a workshop on how to use the Tumblr API that will hopefully inspire you to use our dataset to create the next big application. If you use the Tumblr API, you’re eligible to win a $500 Amazon Gift Card for you and your team!

    If you’re going to be there, give a shout out (John is @codingjester). John will have stickers and t-shirts for anyone who swings by. We hope to see you there!

  • Tumblr @ Percona Live NYC, Oct 1-2

    MySQL is an integral component of Tumblr’s architecture. For performance reasons, we run Percona Server, a drop-in replacement for MySQL with major enhancements; we’ve also relied heavily on Percona for their consulting expertise for years. So naturally we’re very excited to attend the New York edition of the Percona Live MySQL Conference, coming up on October 1-2!

    On October 2 we’ll be presenting the design and implementation of Jetpants, Tumblr’s open source toolkit for interacting with hundreds of database servers. Our session will delve into the motivations behind our toolkit, explain its internals, and discuss Tumblr’s database growth story as a whole.

    If that’s not enough Jetpants for you, how about a live demonstration? Tim Ellis of PalominoDB will be giving a tutorial on October 1 covering the modern ecosystem of MySQL sharding options and tools, including Jetpants.

    Several Tumblr engineers will be in attendance at Percona Live. If you’d like to chat about our engineering efforts (or even employment opportunities), come say hello!

  • Developers: ChangeLog for 08/24/12

    developers:

    Today we’re announcing the release of a brand new feature: our Tagged API, which allows developers to find posts based on their tags.

    Let’s take a look at an example with the gif tag.

    http://api.tumblr.com/v2/tagged?tag=gif&api_key=your_consumer_key
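
    If you want to script that request, here is a minimal shell sketch. The endpoint and the tag and api_key parameters are taken from the example above; the function name tagged_url is just an illustration, and the sketch assumes the tag needs no URL-encoding.

```shell
#!/bin/sh
# Build a Tagged API request URL from a tag and a consumer key.
# Assumption: the tag contains no characters that need percent-encoding
# (a tag with spaces, for example, would have to be encoded first).
tagged_url() {
    tag=$1
    api_key=$2
    printf 'http://api.tumblr.com/v2/tagged?tag=%s&api_key=%s\n' "$tag" "$api_key"
}

# Usage (makes a real HTTP request, so it is left commented out):
# curl -s "$(tagged_url gif your_consumer_key)"
```

    Keeping the URL construction in its own function makes it easy to swap in a different tag or key without touching the request itself.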
    

    The response will be a payload of

  • mobocracy:

    mobocracy:

    This is what migrating a billion cache objects into a new cache pool looks like.

    • Top left: total number of objects per host
    • Top right: memory available vs used
    • Bottom left: memcache commands per host (writes only)
    • Bottom right: network utilization per host

    Good times on a Thursday night.

  • /geek: Git Fu 3

    codingjester:

    Decided I needed a pre-commit hook for Tumblr while working on the main code base. The pre-commit hook below will automatically lint-check all of your PHP files and bail out if you’re trying to commit broken code.

    #!/usr/bin/env bash
    git diff --cached --name-status | awk '{print $2}' | grep -e...
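
    The listing above is cut off here. As a separate, minimal sketch of the same idea (not the original script), the hook below lints staged PHP files with php -l and aborts the commit if any of them fail. It assumes php is on your PATH and that the file is installed as .git/hooks/pre-commit; the function names only_php and lint_files are made up for this example.

```shell
#!/bin/sh
# Sketch of a PHP-linting pre-commit hook; not the original script.

# Keep only paths ending in .php from a newline-separated list on stdin.
only_php() {
    grep -E '\.php$' || true
}

# Lint each path read from stdin with `php -l`; return 1 if any file fails.
# Note: this checks the working-tree copy of each file, which can differ
# from what is staged if you have unstaged edits.
lint_files() {
    status=0
    while IFS= read -r file; do
        [ -n "$file" ] || continue
        if ! php -l "$file"; then
            echo "PHP lint failed: $file" >&2
            status=1
        fi
    done
    return "$status"
}

# Run only when installed as .git/hooks/pre-commit, so the functions above
# can also be sourced and tried out on their own.
if [ "$(basename "$0")" = "pre-commit" ]; then
    # --diff-filter=ACM: added, copied, or modified files (skips deletions).
    git diff --cached --name-only --diff-filter=ACM | only_php | lint_files || {
        echo "Commit aborted: fix the errors above." >&2
        exit 1
    }
fi
```

    Using --name-only with --diff-filter=ACM avoids parsing status columns and skips deleted files, which php -l could not open anyway.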