• OpenTSDB proxy

    We use OpenTSDB to store the majority of our time series server and application statistics here at Tumblr. We recently began a project to migrate OpenTSDB from an existing HBase cluster running an older version of HBase to a new cluster with newer hardware running the latest stable version of HBase.

    We wanted to have some historical data in the new cluster before we switched to it. Within Tumblr we have a variety of applications generating these metrics, and it was not practical for us to change all of them to double write this data. Instead, we chose to replace the standard OpenTSDB listeners with a proxy that would do the double writing for us. While we could have used HBase's CopyTable tool or written our own tool to backfill historical data from the old cluster, double writing for an initial period allowed us to avoid putting additional load on our existing cluster. This strategy also allowed us to move queries for recent data to the new cluster earlier than the full cutover.

    The tsd_proxy is written in Clojure and relies heavily on Lamina and Aleph, which in turn build on top of Netty. We have been using it in our production infrastructure for over two months now, sustaining writes at or above 175k/s, and it has been working well for us. We are open sourcing this proxy in the hope that others might find a use for it as well.

    The tsd_proxy listens on a configurable port and can forward the incoming data stream to multiple end points. It can filter the incoming stream and reject data points that don’t match a configurable set of regular expressions. It can also queue the incoming stream and re-attempt delivery if one of the end points is down, and the queue size can be limited so you don’t blow through your heap. The README has more information on how to set this up.
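
    The behaviour described above can be sketched roughly as follows. This is not the actual tsd_proxy code (which is Clojure); it is a minimal Python model under stated assumptions. The names `FanOutProxy`, `allow_patterns`, and `max_queue` are hypothetical, the regular expressions are assumed to be matched against the whole OpenTSDB `put <metric> <timestamp> <value> <tags>` line, and dropping the oldest queued line when the queue is full is just one possible policy.

```python
import re
from collections import deque


class FanOutProxy:
    """Toy model: filter incoming lines, copy them to every end point,
    and keep a bounded per-end-point backlog while an end point is down."""

    def __init__(self, endpoints, allow_patterns, max_queue=1000):
        # endpoints: dict of name -> callable(line) that raises OSError on failure
        self.endpoints = endpoints
        self.allow = [re.compile(p) for p in allow_patterns]
        # deque(maxlen=...) silently drops the oldest item once it is full
        self.backlog = {name: deque(maxlen=max_queue) for name in endpoints}

    def accepts(self, line):
        # Assumption: a line passes if any configured regex matches it.
        return any(p.search(line) for p in self.allow)

    def handle(self, line):
        if not self.accepts(line):
            return False  # rejected by the filter
        for name in self.endpoints:
            self.backlog[name].append(line)
            self.flush(name)
        return True

    def flush(self, name):
        # Deliver queued lines in order; stop at the first failure and
        # leave the rest queued for a later retry.
        queue, send = self.backlog[name], self.endpoints[name]
        while queue:
            try:
                send(queue[0])
            except OSError:
                return
            queue.popleft()
```

    In this sketch each end point has its own queue, so a slow or unavailable cluster never blocks writes to the healthy one, and `max_queue` bounds how much memory an outage can consume.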

  • Supporting Keyboard Shortcuts without UITextView

    In Issue 5 of objc.io, Peter Steinberger wrote about Hidden Gems and Workarounds in iOS 7 and briefly mentioned the new class UIKeyCommand, which supports keyboard shortcuts, but he added a caveat:

    Now don’t get too excited; there are some caveats. This only works when the keyboard is visible (if there’s some first responder like UITextView.) For truly global hotkeys, you still need to revert to…hackery.

    With the recent addition of Keyboard Shortcuts to the Tumblr app, we figured out a way to get around this requirement, by taking advantage of the fact that UIViewController is a UIResponder.

    @implementation TMDashboardViewController
    
    #pragma mark - UIResponder
    
    - (BOOL)canBecomeFirstResponder {
        return YES;
    }
    
    - (NSArray *)keyCommands {
        static NSArray *keyCommands;
    
        static dispatch_once_t onceToken;
        dispatch_once(&onceToken, ^{
            UIKeyCommand *composeCommand = [UIKeyCommand keyCommandWithInput:@"c" modifierFlags:UIKeyModifierAlternate action:@selector(composeShortcut:)];
    
            keyCommands = @[composeCommand];
        });
    
        return keyCommands;
    }
    
    #pragma mark - TMDashboardViewController
    
    - (void)composeShortcut:(UIKeyCommand *)command {
        [self compose];
    }
    
    @end
    
  • We’ve launched a bug bounty program!

    A few weeks back, we launched a cohesive form that the Tumblr community can use to submit security bugs and exploits to us for review. We’ve had great responses so far, and a small number of those submissions became hot bugs for me and my team to fix. After they were addressed, it fell to me to figure out how to send honoraria to the talented researchers who had found them.  

    Our first attempt resulted in a pretty inefficient system: a bunch of trips to the Duane Reade on Broadway and 19th for gift cards; lots of stamps from the newsstand at 23rd and 5th for postage to foreign countries; and a whole lot of expense reports for the Finance department to process.

    No more. We’ve streamlined. We’re geared up to handle as many requests as we receive and to deliver the bounties expeditiously. We’ve also better outlined the exact terms of the program—we want everyone to be clear on which parts of Tumblr are worth your efforts.

    The terms of the bounty are listed here. Happy hunting!

  • Hi, I’m Kyle, an engineer here at Tumblr. I work on search and other cool stuff that makes Tumblr the best blogging platform in the world.

    This Martin Luther King Day weekend, Tumblr and the Black Techies, an organization I founded to promote diversity in engineering, are collaborating to host a hackathon. The MLK DreamCode Hackathon will take place at Tumblr HQ beginning the morning of Saturday, January 18th and go on through the evening of Sunday, January 19th.

    I’m excited because we’ve invited a diverse group of engineers and designers to take part. Over the 30 hours, we hope to see hacks that further Dr. King’s values of equality, peace, and social justice.

    I’ll be blogging the events from the DreamCode Hackathon at mlkdreamcode.tumblr.com. Be sure to follow the mlkdreamcode blog and use the #mlkDreamCode tag for your hacks.

  • Key-Value Observing is a divisive API, to say the least. Despite its (well-documented) flaws, I personally tend to favor it when I want to know if a property’s value changes, but most developers I talk to (including both of my fellow iOS developers here at Tumblr) tend to prefer overriding setters. Here’s a case where I think KVO works slightly better than overriding setters does.

    Say you have a custom view controller subclass with a property, and changes to that property’s value will result in some modifications being made to the view controller’s view or subviews. An example from the Tumblr codebase does exactly this:

    - (void)setContainerScrollable:(BOOL)containerScrollable {
        if (_containerScrollable != containerScrollable) {
            _containerScrollable = containerScrollable;
    
            self.container.scrollEnabled = containerScrollable;
            self.tableView.scrollEnabled = !containerScrollable;
        }
    }
    

    Looks simple enough, right? Now you can simply do the following:

    TMContainerViewController *controller = [[TMContainerViewController alloc] init];
    controller.containerScrollable = YES;
    

    Of course, there’s a problem with this. Since we’re calling the custom setter before the controller’s view has necessarily loaded, it won’t have the desired effect. At best, the subviews we operate on inside the overridden setter will be nil and our custom behavior won’t be applied. At worst, we’ll refer to self.view in our implementation, the view will be loaded prematurely, and something unexpected could occur.

    So how can we fix this? One way is to make sure our setter is called again after the view is loaded, and to prevent the custom logic from being executed beforehand:

    - (void)setContainerScrollable:(BOOL)containerScrollable {
        // No equality check here: viewDidLoad re-assigns the current value
        // so that it gets applied once the views exist.
        _containerScrollable = containerScrollable;
    
        if ([self isViewLoaded]) {
            self.container.scrollEnabled = containerScrollable;
            self.tableView.scrollEnabled = !containerScrollable;
        }
    }
    
    - (void)viewDidLoad {
        // View set-up
    
        self.containerScrollable = self.isContainerScrollable;
    }
    

    This should work, but calling a getter and passing its return value to the same property’s setter doesn’t strike me as being particularly elegant. What if we factor out our custom logic into a separate private instance method?

    - (void)updateViewsForContainerScrollability {
        self.container.scrollEnabled = self.isContainerScrollable;
        self.tableView.scrollEnabled = !self.isContainerScrollable;
    }
    
    - (void)setContainerScrollable:(BOOL)containerScrollable {
        if (_containerScrollable != containerScrollable) {
            _containerScrollable = containerScrollable;
    
            if ([self isViewLoaded]) {
                [self updateViewsForContainerScrollability];
            }
        }
    }
    
    - (void)viewDidLoad {
        // View set-up
    
        [self updateViewsForContainerScrollability];
    }
    

    This is a fine solution, and will work as expected. That being said, let’s look at another approach to the same problem using KVO.

    Here’s what our observation code looks like:

    // Unique context pointer so we only handle observations registered by this class
    static void *TMContainerViewControllerKVOContext = &TMContainerViewControllerKVOContext;
    
    - (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object change:(NSDictionary *)change
                           context:(void *)context {
        if (context == TMContainerViewControllerKVOContext) {
            if (object == self && [keyPath isEqualToString:@"containerScrollable"]) {
                self.container.scrollEnabled = self.isContainerScrollable;
                self.tableView.scrollEnabled = !self.isContainerScrollable;
            }
        }
        else {
            [super observeValueForKeyPath:keyPath ofObject:object change:change context:context];
        }
    }
    

    Now, we’re still faced with the same issue of needing to ensure that this code runs both A) as soon as the view is loaded and B) any time the property’s value is changed going forward. Thankfully, NSKeyValueObservingOptionInitial provides this exact behavior.

    - (void)viewDidLoad {
        // View set-up
    
        [self addObserver:self forKeyPath:@"containerScrollable"
                  options:NSKeyValueObservingOptionInitial
                  context:TMContainerViewControllerKVOContext];
    }
    

    Since we’re no longer overriding the setter, the property’s value can be changed completely independently of the view being loaded. When our view is set up, we add an observer that is called immediately with the initial property value, and called again whenever the property is changed in the future.

    KVO code can be messy, can result in problems if used incorrectly, and certainly isn’t the best tool to use in all situations. But if you ask me, this is a pretty good example of when it does come in handy.

  • Efficient distribution of cacheable HTTP requests at Tumblr

    Tumblr uses HAProxy to load balance HTTP requests and other types of TCP traffic across pools of application servers, in order to distribute our workloads evenly and ensure that we’re using our machines efficiently. HAProxy supports consistent hashing of requests to specific servers, which is a requirement for efficiently caching responses. Consistent hashing ensures that only a few mappings (of requests to servers) are redistributed when a server is added to or removed from the pool, which maintains a higher cache hit ratio than a traditional modulo distribution method.
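
    To make the difference between modulo distribution and consistent hashing concrete, here is a small Python sketch that removes one server from a pool of ten and counts how many keys change servers under each scheme. It is an illustration only, not HAProxy's implementation: the `h32` hash (the first four bytes of MD5), the number of ring points per server, and the key and server names are all assumptions made for the example.

```python
import bisect
import hashlib


def h32(text):
    # Stand-in 32-bit hash (first 4 bytes of MD5); not HAProxy's function.
    return int.from_bytes(hashlib.md5(text.encode()).digest()[:4], "big")


def modulo_pick(key, servers):
    return servers[h32(key) % len(servers)]


def build_ring(servers, points_per_server=100):
    # Each server owns many points on a 32-bit ring.
    return sorted((h32(f"{s}#{i}"), s) for s in servers for i in range(points_per_server))


def ring_pick(key, ring):
    # First point at or after the key's hash, wrapping around at the end.
    i = bisect.bisect_left(ring, (h32(key), ""))
    return ring[i % len(ring)][1]


def moved_fraction(pick_before, pick_after, keys):
    moved = sum(1 for k in keys if pick_before(k) != pick_after(k))
    return moved / len(keys)


keys = [f"/post/{n}" for n in range(5000)]
before = [f"cache{n}" for n in range(10)]
after = before[:-1]  # remove one server from the pool

modulo_moved = moved_fraction(
    lambda k: modulo_pick(k, before), lambda k: modulo_pick(k, after), keys)

ring_before, ring_after = build_ring(before), build_ring(after)
ring_moved = moved_fraction(
    lambda k: ring_pick(k, ring_before), lambda k: ring_pick(k, ring_after), keys)

print(f"modulo: {modulo_moved:.0%} of keys moved, ring: {ring_moved:.0%} of keys moved")
```

    With the modulo mapping most keys land on a different server after the removal, while the ring only moves the keys that belonged to the removed server, which is why cached responses on the remaining servers stay useful.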

    An important aspect of the hashing algorithm is ensuring that load is distributed evenly across the pool. HAProxy uses the SDBM hashing function by default, but about a year ago we began testing the efficiency of alternative algorithms using the same hashing criteria as we see in production. We found that the hashing algorithm best suited to the input given to our HAProxy instances at Tumblr was DJB2.
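
    For readers unfamiliar with the two functions, the textbook definitions of DJB2 and SDBM are short enough to show in full. The Python below follows the commonly published forms, truncated to 32 bits; HAProxy's C implementation may differ in detail, and the `pick_server` helper is a hypothetical addition that only shows a hash being turned into a server index.

```python
MASK32 = 0xFFFFFFFF


def djb2(data: bytes) -> int:
    """Textbook DJB2: hash = hash * 33 + byte, starting from 5381."""
    h = 5381
    for byte in data:
        h = ((h << 5) + h + byte) & MASK32
    return h


def sdbm(data: bytes) -> int:
    """Textbook SDBM: hash = byte + (hash << 6) + (hash << 16) - hash."""
    h = 0
    for byte in data:
        h = (byte + (h << 6) + (h << 16) - h) & MASK32
    return h


def pick_server(hash_fn, host: str, n_servers: int) -> int:
    # Simple modulo mapping, just to show the hash in use.
    return hash_fn(host.encode("ascii")) % n_servers
```

    Both functions fold one byte at a time into a running 32-bit value; they differ only in the multiplier (33 for DJB2, 65599 for SDBM) and the starting value, which is enough to change how evenly a given set of inputs spreads across servers.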

    The improvement we see when using DJB2 is best illustrated via a graph of connections from Tumblr’s Varnish pools to the application nodes. The Varnish pools are consistently hashed backends to HAProxy. Switching the algorithm improved the load distribution on the Varnish pool, which is reflected in the convergence of connections to application nodes, ultimately leading to better cache usage.

    [Graph: connections from Tumblr’s Varnish pools to the application nodes]

    Please continue reading for more on the details of hashing as implemented in HAProxy and the patch we have pushed upstream, which lets you try out these options.

    Hash functions strive to have little correlation between input and output. The heart of a hash function is its mixing step. The behavior of the mixing step largely determines the degree to which a hash function is collision resistant. Hash functions that are collision resistant are more likely to provide an even distribution of load.

    The purpose of the mixing function is to spread the effect of each message bit throughout all the bits of the internal state. Ideally, every bit in the hash state is affected by every bit in the message, and the function performs that operation as quickly as possible for the sake of program performance. A function is said to satisfy the strict avalanche criterion if, whenever a single input bit is complemented (toggled between 0 and 1), each of the output bits changes with a probability of one half for an arbitrary selection of the remaining input bits.
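
    The avalanche property can be checked empirically. The Python sketch below flips each input bit in turn and reports the average fraction of output bits that change, which is a coarse proxy for the strict avalanche criterion (the criterion itself is stated per output bit). The mixer used here, the 32-bit finalizer from MurmurHash3, is a stand-in chosen for the example and is not claimed to be the function HAProxy applies; `avalanche_score` and the sampling scheme are likewise assumptions.

```python
MASK32 = 0xFFFFFFFF


def fmix32(h: int) -> int:
    """MurmurHash3's 32-bit finalizer, used here only as an example mixer."""
    h &= MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def avalanche_score(mix, samples=256) -> float:
    """Average fraction of output bits that flip when one input bit flips.
    A value near 0.5 is what a well-mixing function should produce."""
    flipped = 0
    trials = 0
    for n in range(samples):
        x = (n * 2654435761) & MASK32  # spread the sample inputs out
        base = mix(x)
        for bit in range(32):
            diff = base ^ mix(x ^ (1 << bit))
            flipped += bin(diff).count("1")
            trials += 32
    return flipped / trials


# With no mixing at all, each flipped input bit changes exactly 1 of 32 output bits.
identity_score = avalanche_score(lambda x: x)
mixed_score = avalanche_score(fmix32)
print(f"identity: {identity_score:.3f}, fmix32: {mixed_score:.3f}")
```

    A score far below one half means that similar inputs produce similar outputs, which is exactly the situation that leads to clustered positions on a consistent hashing ring.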

    To guard against a combination of hash function and input that results in a high rate of collisions, HAProxy implements an avalanche algorithm on the result of the hashing function. The avalanche is always applied when using the consistent hashing directive. It is intended to provide a good distribution for small input variations. The result is well suited to fit over a 32-bit space with enough variation that a randomly picked number falls equally before any server position, which is ideal for consistently hashed backends, a common use case for caches.

    In additional tests involving alternative algorithms for hash input and an option to trigger avalanche, we found that different algorithms perform better on different criteria. DJB2 performs well when hashing ASCII text, which makes it a good choice for hashing HTTP Host headers. Other alternatives perform better on numbers and are a good choice when using the source IP. The results also vary with the use of the avalanche flag.
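
    One simple way to run this kind of comparison yourself is to hash a sample of keys with each function and look at how unevenly they fall across servers. The Python sketch below does that with synthetic host names and textual addresses; the `imbalance` metric, the sample data, and the pool size of ten are all assumptions for illustration, and HAProxy hashes a source address in its binary form rather than as text, so real results will differ.

```python
from collections import Counter

MASK32 = 0xFFFFFFFF


def djb2(data: bytes) -> int:
    h = 5381
    for byte in data:
        h = (h * 33 + byte) & MASK32
    return h


def sdbm(data: bytes) -> int:
    h = 0
    for byte in data:
        h = (h * 65599 + byte) & MASK32
    return h


def imbalance(hash_fn, keys, n_servers):
    """Busiest server's share of keys relative to a perfectly even split
    (1.0 means perfectly even; larger means more skew)."""
    counts = Counter(hash_fn(k.encode("ascii")) % n_servers for k in keys)
    return max(counts.values()) / (len(keys) / n_servers)


# Synthetic inputs; real traffic will behave differently.
hosts = [f"blog{n}.tumblr.com" for n in range(10000)]
addresses = [f"10.{n // 256 % 256}.{n % 256}.{n % 7 + 1}" for n in range(10000)]

for label, keys in (("host headers", hosts), ("source addresses", addresses)):
    for fn in (djb2, sdbm):
        print(f"{label:16} {fn.__name__:5} imbalance={imbalance(fn, keys, 10):.2f}")
```

    Feeding the same harness a sample of your own production keys is a more reliable guide than any synthetic data set.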

    What are the effects of the DJB2 and avalanche algorithms on your production deployments? Would it not be great to have an option that lets you play with the hash function and determine via configuration if avalanche is beneficial to you?

    Now you can. Tumblr’s patches for enabling the hashing alternative and an option to trigger avalanche are now available in HAProxy 1.5. Let us know the results of your testing.
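
    As a starting point for experimenting, a backend using these options might look something like the fragment below. The backend name, server names, and addresses are placeholders, and the choice of hashing on the Host header is only an example; check the `hash-type` entry in the HAProxy 1.5 configuration manual for the options available in your build.

```
backend varnish_pool
    # Hash on the HTTP Host header (ASCII text)
    balance hdr(host)
    # consistent hashing, DJB2 function, avalanche modifier enabled
    hash-type consistent djb2 avalanche
    server cache1 10.0.0.1:80 check
    server cache2 10.0.0.2:80 check
```

    Swapping `djb2` for `sdbm`, or dropping the `avalanche` modifier, lets you compare the resulting load distribution on the same traffic.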

  • keithmcknight:

    really happened the other night. true story.

  • Last month, Corinne and I attended the Grace Hopper Celebration of Women in Computing. As product engineers, we were delighted to meet other women in our field and celebrate their technical expertise.

    To increase diversity in the tech space, Tumblr is a proud supporter of Hacker School and their initiatives. This fall, we’re sponsoring Daphne, a coder in the current batch, and we can’t wait to see what amazing things she’ll create next.

  • mt:

    codingjester talking about tumblr APIs at yahoodevelopers Hack USA!

  • My Summer at Tumblr: Luke Cycon

    My summer as an Engineering Intern at Tumblr? You might expect me to say I found myself in a chaotic new world. Truth is, it was pretty tame. Now don’t mistake tame for uninteresting or easy. Of those, I can assure you, my summer was neither.

    True to my interests, I worked on Tumblr’s back-end systems. In terms of size, the code-bases I worked on were larger than what I was used to, but not by a tremendous amount. What made for an interesting summer of development was the volume of data my code was expected to handle. I primarily worked on our Firehose. One could imagine the amount of data that flows through there. I spent my summer ironing out issues, refactoring “messy” parts of the code-base, and implementing runtime performance boosts.

    I am quite proud of the features I contributed to the code-bases I worked with here. Small things, such as seeing a performance graph change shape, or the resident memory of a service drop because of a feature you added, are powerful. Every improvement I made, I did so knowing that it would positively affect the experiences of the millions of creators that use Tumblr.

    Working mainly in Scala, a language I consider to be my second favorite, I felt “comfortable” editing the code from day one. As the weeks passed, I began to notice my understanding of both my code-bases and the libraries we worked with growing quite intimate. I no longer required the documentation nearby to use a library; I no longer needed to look up the behavior of “that one constructor” for a class in the core library. From there, I saw the true power of the abstractions in which we deal as engineers. To say I grew as a developer would be an understatement.

    One thing I learned over the summer? I couldn’t tell you if I wanted to. What this summer has meant to me is more than a collection of tips and tricks that are nice to know; it’s been a chance to grow as a software engineer. Working in such a violently creative environment with a group of such incredible and talented people has been an amazing experience. I am excited to head back to California, but it will be hard to leave this place and these people behind. I suppose I won’t have to, not entirely at least. Moving forward, I will be working part-time with Tumblr from school.

    Tumblr is something different in a powerful way, and it has been nothing short of exhilarating to be a part of it.