• Ordered Broadcasts in Android

    Simulating background / foreground notifications in Android

    In response to a recent article I wrote for objc.io detailing how notifications in Android differ from those in iOS, a reader messaged me and asked:

    “How do I detect if an Application is in the foreground when a notification arrives? Android seems to be missing this functionality, where iOS has it readily available.” paraphrased

    It’s true: there is no intrinsic differentiation between foreground and background notifications in Android. This stems from a fundamental difference in how Android handles notifications altogether. In Android, the developer has full control over the lifecycle and presentation of a notification, and as such, it could be presented at any time. Whether the app is actively running or not, the developer is responsible for presenting notifications to the user, and the same callbacks fire in all scenarios. In light of this question, I created a sample project to demonstrate one approach to this problem.

    OrderedBroadcast Example (Github)

    See the demo video (YouTube)

    Ordered Broadcast Strategy

    One strategy for mitigating this problem is using a not-so-well-known API called sendOrderedBroadcast (used in place of sendBroadcast), available on any Context within your application. An ordered broadcast takes the same intent you would use with a normal broadcast; the primary difference lies in the receiver. By setting a priority on the IntentFilter using setPriority(int priority), you tell the system that this receiver should be called before any lower-priority ones. Let’s take a look at some code.

    @Override
    protected void onResume() {
     super.onResume();
     IntentFilter filter = new IntentFilter(AlarmReceiver.ACTION_RECEIVE_NOTIFICATION);
     // The default priority is 0. Receivers with higher priority
     // values are called before the default; lower values come after it.
     filter.setPriority(1);

     // It’s good practice to register your BroadcastReceivers in
     // onResume() and unregister them in onPause(), unless you have
     // a good reason not to.
     registerReceiver(mForegroundReceiver, filter);
    }
    

    When registering a receiver programmatically, we have the ability to set a priority on it. You might have seen this before without ever knowing why you’d use it. Well, now you know! Ordered broadcasts consult this priority before delivering to the receivers: receivers with a higher priority catch the broadcast first, and it is then passed on to lower-priority ones (the default priority is 0).

    The beauty of using an ordered broadcast is that you (the developer) can decide whether or not you want that broadcast propagated. For example, if you have two BroadcastReceivers catching a broadcast, one as a foreground receiver and one as a background receiver, you can tell the foreground receiver to abort the broadcast using abortBroadcast(), so that any lower-priority receivers won’t catch it.

    private BroadcastReceiver mForegroundReceiver = new BroadcastReceiver() {
        @Override
        public void onReceive(Context context, Intent intent) {
            // Don’t send this intent to anyone else!
            abortBroadcast();

            // Let the user know we received a broadcast (if we want).
            Toast.makeText(MainActivity.this, R.string.received_in_foreground, Toast.LENGTH_SHORT).show();
        }
    };
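
    For completeness, the sending side is just a normal intent dispatched with sendOrderedBroadcast. A minimal sketch, assuming the same AlarmReceiver.ACTION_RECEIVE_NOTIFICATION action string used in the registration snippet above:

```java
// Somewhere in your alarm / push handling code:
Intent intent = new Intent(AlarmReceiver.ACTION_RECEIVE_NOTIFICATION);
// The second argument is an optional receiver permission; null means
// any receiver may see the broadcast.
context.sendOrderedBroadcast(intent, null);
```

    A second receiver registered in the manifest at the default priority of 0 then acts as the background handler: it only runs when no foreground receiver aborted the broadcast, and it can post a status bar notification instead of updating on-screen UI.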
    


    Summary

    That’s it! Using the ordered broadcast strategy, you can send the same intents for background and foreground notifications, and display them in different ways by utilizing different priorities.

    You can go even crazier with this approach by setting different priorities in different Activities. Maybe when you’re on the main screen you want to intercept all notifications, but on subscreens you only want to intercept notifications related to that specific screen. The possibilities are endless!

  • Who doesn’t love animated GIFs?

    Believe it or not, support for GIFs at Tumblr was a happy accident! When Tumblr put together the code for handling JPEGs, support for GIFs (and PNGs) happened to come along using the same code. Perhaps even more surprising is that the tools used to handle GIFs at Tumblr hadn’t changed much from those early days.

    The image above is an original from sukme that could not be posted to Tumblr last June. It also would have failed if he’d tried last Sunday. If you click through to the original post, you will see a muddy, reduced-saturation mess. All this because our resizer couldn’t handle the original.

    I’ve got ninety-nine problems and the GIF is one

    There is a lot of misinformation about GIF limits on Tumblr, so let me set the record straight: We don’t count colors or frames or pixels. We only count bytes and seconds. Every image that comes in is scaled to a number of smaller sizes and the smaller your image is, the fewer resizes need to happen, which means less time. 

    We had two core failure modes in our prior resizer. First, some images would take as much as several minutes to convert; this was not directly attributable to color, dimensions, or frame count, but to a mysterious mix of all of them. Second, some images would balloon in size (600KB at 400x400, 27MB at 250x250).

    The unpredictability of these failures made our GIF limits feel arbitrary and terrible to the end users. Some have gone so far as to threaten monkey kicks. I don’t want to get kicked by a monkey, so we started working hard late last year to fix it. 

    A proposed solution

    Some of you may have seen this post where the performance of our current converter was compared with a new “mystery” converter. The mystery converter was roughly 1000x faster on the “slapping” GIF and happened to look great, but had quality problems on other images. Those were explored more fully here a couple of days later.

    If you haven’t figured it out yet, the mystery converter is gifsicle.

    Getting a better handle on it

    To get an unbiased test set, I took a random sample of roughly 90K GIFs that Tumblr users tried to upload, not limiting the corpus only to those that succeeded. These were tested against the current converter, resizing down to the next size we produce. Each resize is given up to 20 seconds to complete in our application, but all resizes must complete in 30 seconds. All resizes must be under 1MB or we will convert the first frame to JPEG and call it a day. 

    2.6% of my 90K GIFs took longer than 20 seconds to resize. This is an underestimation of how many GIFs would be rejected for time because this is only one of several resizes required. A whopping 17.1% of all GIFs were over 1MB. Even if we bump up to 2MB, the rejection rate is 2.75%. The converter was making over 25% of all resizes larger than the higher-resolution originals! The total rejection rate for my sample set was 4.46% of all original GIFs uploaded. 

    Using gifsicle is so much faster that our CPU rejection rate drops to 0.00% on my test set. Also, just under 99% of all images were smaller when resized than they were at their original resolution. The size rejection rate was a much lower 0.59%.

    Gifsicle problems

    As compelling as the performance of gifsicle is, the quality problems are too much to ignore. We played around with the code a bit, but eventually we just got in touch with the author, Dr. Eddie Kohler. The specifics are in this post, but the short version is that Eddie was able to improve quality by adding some more advanced resampling methods as well as palette expansion for small-palette images. This increased our size rejection rate to 0.68% while still keeping us well under our CPU budget. 

    Proving it

    Image processing is all about choices. How do you resample? Do you sharpen? Where in the workflow is gamma correction applied, if at all? The list goes on and on. 

    As you can imagine from the performance differences, our previous converter and gifsicle take very different approaches to GIF resizing. The output images look different. Sometimes it is slight, sometimes it is significant, but there is no way we could put out a converter that messes up your images, even if it messes them up quickly. 

    We set up a qualitative study. The goal was simply to prove that we weren’t doing worse than our old converter, not necessarily that we were doing better. This study was opened up to all Tumblr employees, as well as some “randomly selected” outsiders (my friends and family). Participants were presented with one of two questions:

    1.) Given an original and 1 resize, decide whether it is ok, unacceptable, or completely broken.

    2.) Given an original and 2 resizes (it was randomly chosen which was left and which was right; sometimes they were identical), choose the better image or say there is no difference.

    The results were everything I could have hoped for. The “acceptable” test showed that users found gifsicle better at producing acceptable results (87% vs. 84%), but not by a statistically significant margin (p=0.086), and that gifsicle produced fewer broken GIFs (0.71% vs. 1.38%), but again not enough to say it is definitively better (p=0.106). The “better” test found users preferring gifsicle 37% of the time and the prior converter only 16% of the time, but users also preferred one identical image over the other 27% of the time. Again, it is hard to say that gifsicle is better, but it is clear that it is no worse.

    Putting it all together

    The development and testing described above took from late October until the beginning of March. Packaging, deployment, and integration took only a couple of weeks!

    We aren’t done. There is work underway exploring how we handle JPEGs and PNGs. There are a slew of features that we can go after. This was a big step, a necessary step, but not the end for sure. 

    We are a community, it takes a village, there’s no “i” in GIF

    This project couldn’t have happened without the excellent work of Eddie Kohler in creating, maintaining, and enhancing gifsicle. Tumblr’s Site Reliability Engineering group packaged and helped deploy gifsicle onto hundreds and hundreds of machines in our datacenter. Tumblr’s Security Team vetted the code, both by inspection and by attacking it to make sure we stay safe. This was all for the awesome Tumblr creators, but I have to mention qilme/sukme (same dude, two blogs), reallivingartist, and especially gnumblr for their help in understanding and ultimately attacking this monstrous problem.

  • Meet Dr. Eddie Kohler, a GIF creator’s best friend!

    Scattered across the wacky animated set above is Eddie Kohler, professor of computer science at Harvard and the author of gifsicle since 1997. When it came time for Tumblr to reexamine how we manipulate GIFs, every engineer who looked at the problem inevitably came upon gifsicle, and every engineer eventually came to the same conclusion: the performance is stunning, but the quality just isn’t there. If you read between the lines of this post, sampling was obviously the issue.

    Late last year, we got in touch with Eddie. After we came to a mutual understanding of the problem, Eddie agreed to come visit us in New York. We spent Friday the 13th basking in the warm, coal-fired glow of the GIF format and how to process it. 

    By the end of the day we had a handshake-deal for Tumblr to sponsor some feature development on gifsicle, and what we are releasing now is the result of that work. 

    I would love to say there was a mutual “eureka” moment, but that would be a lie. Eddie showed up with some brilliant ideas about how to handle resizing while maintaining performance and quality. 

    Resampling:

    Eddie added several resampling methods, including some hybrid modes. None are as fast as the “naïve” default method, but the results are simply much better. 

    Palette:

    Our old tool threw away all the palette information, resized as if there were no color limits, and then took a second pass to try to create the optimum palette for the image. This is slow, takes a ton of memory, and can leave images looking muddy unless you sharpen them afterwards.

    Gifsicle takes a very different approach. Scaling and resampling use 45-bit RGB colors (extra precision to allow a safe round-trip through gamma correction), but the results are fit to the original image’s color palette. This works for the vast majority of images while still avoiding the problem of having to choose which colors will be selected for the 512-color maximum palette (256 global and 256 frame-local colors). Despite my skepticism, this works amazingly well.

    The last change made here was to allow optional expansion of the palette for small-palette images. When reducing the size of a 2-color black-and-white GIF, it is nice to be able to use a few shades of gray for some of the pixels. 

    Results

    As previously mentioned, we took a little speed hit by changing the resampling. That meant that we were only 10x faster than our previous converter instead of being 12-15x faster. The images are significantly smaller, too. Perhaps the biggest thing is that it is highly unlikely that a resize to smaller dimensions will create a bigger file. So now the animation you lovingly crafted and optimized to be under a megabyte won’t surprise you by timing out or exploding in size and getting rejected. 

    The bottom line: our rejection rate using our old tool is estimated to be 4.46% of all original GIFs. Using gifsicle reduces that to 0.68% of all submitted GIFs, and no rejection of GIFs under 1MB. Oh, and your submissions will complete much faster. 

    Eddie isn’t a Tumblr user, so send any thanks to him on Twitter.

  • Tumblr Hosts New York Android Developers Meetup

    Nearly 200 Android developers stopped by the Tumblr office for three talks focusing on the importance of design in Android. Our own kevinthebigapple spoke on the importance of building beautiful, design-first software for the platform. Many thanks to the organizers of this fine event and everyone who showed up!

  • OpenTSDB proxy

    We use OpenTSDB to store the majority of our time series server and application statistics here at Tumblr. We recently began a project to migrate OpenTSDB from an existing HBase cluster running an older version of HBase to a new cluster with newer hardware and running the latest stable version of HBase.

    We wanted a way to have some historical data in the new cluster before we switched to it. Within Tumblr we have a variety of applications generating these metrics, and it was not very practical for us to change all of them to double-write this data. Instead, we chose to replace the standard OpenTSDB listeners with a proxy that would do this double writing for us. While we could have used HBase copy table or written our own tool to backfill historical data from the old cluster, double writing for an initial period allowed us to avoid adding additional load on our existing cluster. This strategy also allowed us to move queries for recent data to the new cluster earlier than the full cutover.

    The tsd_proxy is written in Clojure and relies heavily on Lamina and Aleph, which in turn build on top of Netty. We have been using this in our production infrastructure for over two months now, sustaining writes at or above 175k/s, and it has been working well for us. We are open sourcing this proxy in the hope that others might find a use for it as well.

    The tsd_proxy listens on a configurable port and can forward the incoming data stream to multiple endpoints. It can also filter the incoming stream and reject data points that don’t match a (configurable) set of regular expressions, and it can queue the incoming stream and re-attempt delivery if one of the endpoints is down. It is also possible to limit the queue size so you don’t blow through your heap. The README has more information on how to set this up.

  • Supporting Keyboard Shortcuts without UITextView

    In Issue 5 of objc.io, Peter Steinberger wrote about Hidden Gems and Workarounds in iOS 7 and briefly mentions the new class UIKeyCommand for supporting keyboard shortcuts, but he caveats:

    Now don’t get too excited; there are some caveats. This only works when the keyboard is visible (if there’s some first responder like UITextView.) For truly global hotkeys, you still need to revert to…hackery.

    With the recent addition of Keyboard Shortcuts to the Tumblr app, we figured out a way to get around this requirement by taking advantage of the fact that UIViewController is a UIResponder.

    @implementation TMDashboardViewController
    
    #pragma mark - UIResponder
    
    - (BOOL)canBecomeFirstResponder {
        return YES;
    }
    
    - (NSArray *)keyCommands {
        static NSArray *keyCommands;
    
        static dispatch_once_t onceToken;
        dispatch_once(&onceToken, ^{
            UIKeyCommand *composeCommand = [UIKeyCommand keyCommandWithInput:@"c" modifierFlags:UIKeyModifierAlternate action:@selector(composeShortcut:)];
    
            keyCommands = @[composeCommand];
        });
    
        return keyCommands;
    }
    
    #pragma mark - TMDashboardViewController
    
    - (void)composeShortcut:(UIKeyCommand *)command {
        [self compose];
    }
    
    @end
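
    One detail that’s easy to miss: keyCommands is only consulted along the responder chain, so the view controller must actually become first responder when no other responder (such as a text view) is active. A sketch of one way to do that; the exact hook you choose may vary:

```objc
- (void)viewDidAppear:(BOOL)animated {
    [super viewDidAppear:animated];

    // With no text field up, claim first responder status ourselves so
    // that our keyCommands are reachable (canBecomeFirstResponder above
    // returning YES is what makes this call succeed).
    [self becomeFirstResponder];
}
```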
    
  • We’ve launched a bug bounty program!

    A few weeks back, we launched a cohesive form that the Tumblr community can use to submit security bugs and exploits to us for review. We’ve had great responses so far, and a small number of those submissions became hot bugs for me and my team to fix. After they were addressed, it fell to me to figure out how to send honoraria to the talented researchers who had found them.  

    Our first attempt resulted in a pretty inefficient system: a bunch of trips to the Duane Reade on Broadway and 19th for gift cards; lots of stamps from the newsstand at 23rd and 5th for postage to foreign countries; and a whole lot of expense reports for the Finance department to process.

    No more. We’ve streamlined. We’re geared up to handle as many requests as we receive and to deliver the bounties expeditiously. We’ve also better outlined the exact terms of the program—we want everyone to be clear on which parts of Tumblr are worth your efforts.

    The terms of the bounty are listed here. Happy hunting!

  • Hi, I’m Kyle, an engineer here at Tumblr. I work on search and other cool stuff that makes Tumblr the best blogging platform in the world.

    This Martin Luther King Day weekend, Tumblr and the Black Techies, an organization I founded to promote diversity in engineering, are collaborating to host a hackathon. The MLK DreamCode Hackathon will take place at Tumblr HQ beginning the morning of Saturday, January 18th and go on through the evening of Sunday, January 19th.

    I’m excited because we’ve invited a diverse group of engineers and designers to take part. Over the 30 hours, we hope to see hacks that further Dr. King’s values of equality, peace, and social justice.

    I’ll be blogging the events from the Dream Code Hackathon at mlkdreamcode.tumblr.com. Be sure to follow the mlkdreamcode blog and use the #mlkDreamCode tag for your hacks.

    Key-Value Observing is a divisive API, to say the least. Despite its (well-documented) flaws, I personally tend to favor it when I want to know if a property’s value changes, but most developers I talk to (including both of my fellow iOS developers here at Tumblr) tend to prefer overriding setters. Here’s a case where I think KVO works slightly better than overriding setters does.

    Say you have a custom view controller subclass with a property, and changes to that property’s value will result in some modifications being made to the view controller’s view or subviews. An example from the Tumblr codebase does exactly this:

    - (void)setContainerScrollable:(BOOL)containerScrollable {
        if (_containerScrollable != containerScrollable) {
            _containerScrollable = containerScrollable;
    
            self.container.scrollEnabled = containerScrollable;
            self.tableView.scrollEnabled = !containerScrollable;
        }
    }
    

    Looks simple enough, right? Now you can simply do the following:

    TMContainerViewController *controller = [[TMContainerViewController alloc] init];
    controller.containerScrollable = YES;
    

    Of course, there’s a problem with this. Since we’re calling the custom setter before the controller’s view has necessarily loaded, it won’t have the desired effect. At best, the subviews we operate on inside the overridden setter will be nil and our custom behavior won’t be applied. At worst, we’ll refer to self.view in our implementation, the view will be loaded prematurely, and something unexpected could occur.

    So how can we fix this? One way is to make sure our setter logic runs again after the view is loaded, and to guard against the custom logic being executed beforehand:

    - (void)setContainerScrollable:(BOOL)containerScrollable {
        if (_containerScrollable != containerScrollable) {
            _containerScrollable = containerScrollable;
    
            if ([self isViewLoaded]) {
                self.container.scrollEnabled = containerScrollable;
                self.tableView.scrollEnabled = !containerScrollable;
            }
        }
    }
    
    - (void)viewDidLoad {
        // View set-up
    
        self.containerScrollable = self.isContainerScrollable;
    }
    

    This should work, but calling a getter and passing its return value to the same property’s setter doesn’t strike me as particularly elegant. What if we factor out our custom logic into a separate private instance method?

    - (void)updateViewsForContainerScrollability {
        self.container.scrollEnabled = self.isContainerScrollable;
        self.tableView.scrollEnabled = !self.isContainerScrollable;
    }
    
    - (void)setContainerScrollable:(BOOL)containerScrollable {
        if (_containerScrollable != containerScrollable) {
            _containerScrollable = containerScrollable;
    
            if ([self isViewLoaded]) {
                [self updateViewsForContainerScrollability];
            }
        }
    }
    
    - (void)viewDidLoad {
        // View set-up
    
        [self updateViewsForContainerScrollability];
    }
    

    This is a fine solution, and will work as expected. That being said, let’s look at another approach to the same problem using KVO.

    Here’s what our observation code looks like:

    - (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object change:(NSDictionary *)change 
                           context:(void *)context {
        if (context == TMContainerViewControllerKVOContext) {
             if (object == self && [keyPath isEqualToString:@"containerScrollable"]) {
                self.container.scrollEnabled = self.isContainerScrollable;
                self.tableView.scrollEnabled = !self.isContainerScrollable;
            }
        }
        else {
            [super observeValueForKeyPath:keyPath ofObject:object change:change context:context];
        }
    }
    

    Now, we’re still faced with the same issue of needing to ensure that this code runs both A) as soon as the view is loaded and B) any time the property’s value is changed going forward. Thankfully, NSKeyValueObservingOptionInitial provides this exact behavior.

    - (void)viewDidLoad {
        // View set-up
    
      [self addObserver:self forKeyPath:@"containerScrollable"
                options:NSKeyValueObservingOptionInitial
                context:TMContainerViewControllerKVOContext];
    }
    

    Since we’re no longer overriding the setter, the property’s value can be changed completely independently of the view being initialized. When our view is set up, we add an observer that is called immediately with the initial property value, and called again whenever the property changes in the future.
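
    One caveat the snippets above don’t show: KVO observers must be removed before the observed object goes away, or you risk a crash. Since the observer is added in viewDidLoad, a guarded removal in dealloc (a sketch, not taken from the Tumblr codebase) would look like:

```objc
- (void)dealloc {
    // The observer is only registered once the view has loaded, so only
    // remove it in that case.
    if ([self isViewLoaded]) {
        [self removeObserver:self
                  forKeyPath:@"containerScrollable"
                     context:TMContainerViewControllerKVOContext];
    }
}
```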

    KVO code can be messy, can result in problems if used incorrectly, and certainly isn’t the best tool to use in all situations. But if you ask me, this is a pretty good example of when it does come in handy.

  • Efficient distribution of cacheable http requests at Tumblr

    Tumblr uses Haproxy for load balancing HTTP requests and other types of TCP traffic to pools of application servers in order to evenly distribute our workloads and ensure that we’re using our machines efficiently. Haproxy supports consistent hashing of requests to specific servers, which is a requirement for efficiently caching responses. Consistent hashing ensures that only a few mappings (of requests to servers) are redistributed when a server is added or removed from the pool, which results in a higher maintained hit ratio when dealing with caching systems than a traditional modulo distribution method.

    An important aspect of the hashing algorithm is to ensure that load is distributed evenly across the pool. Haproxy uses the SDBM hashing function by default, but about a year ago we began doing testing to determine the efficiency of alternative algorithms using the same hashing criteria as we see in production. We found that the hashing algorithm most suited for the input given to our Haproxy instances at Tumblr was DJB2.
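
    DJB2 itself is a tiny algorithm: start from the magic constant 5381 and fold each input byte in with h = h * 33 + byte. A minimal Java sketch for illustration (Haproxy’s actual implementation is in C):

```java
// Illustrative sketch of the DJB2 string hash. The multiply-by-33 mixing
// step gives it good behavior on short ASCII strings like hostnames.
public class Djb2 {
    public static int hash(byte[] data) {
        int h = 5381;
        for (byte b : data) {
            h = (h * 33) + (b & 0xFF); // h * 33 is often written (h << 5) + h
        }
        return h;
    }

    public static void main(String[] args) {
        // E.g. hashing an HTTP Host header, the input Haproxy would use
        // when balancing on the host.
        System.out.println(hash("example.tumblr.com".getBytes()));
    }
}
```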

    The improvement we see when using DJB2 is best illustrated by a graph of connections from Tumblr’s varnish pools to the application nodes. The varnish pools are consistently hashed backends behind Haproxy. The effect of switching the algorithm is an improvement in the load distribution on the varnish pool, which is reflected in the convergence of connections to application nodes, ultimately leading to better cache usage.


    Please continue reading for more on the details of hashing as implemented in Haproxy and the patch we have pushed upstream which lets you try out these options.

    Hash functions strive to have little correlation between input and output. The heart of a hash function is its mixing step. The behavior of the mixing step largely determines the degree in which a hash function is collision resistant. Hash functions that are collision resistant are more likely to provide an even distribution of load.

    The purpose of the mixing function is to spread the effect of each message bit throughout all the bits of the internal state, and to do so as quickly as possible for the sake of performance. Ideally, every bit in the hash state is affected by every bit in the message. A function is said to satisfy the strict avalanche criterion if, whenever a single input bit is complemented (toggled between 0 and 1), each of the output bits changes with a probability of one half for an arbitrary selection of the remaining input bits.

    To guard against a combination of hash function and input that results in high rate of collisions, Haproxy implements an avalanche algorithm on the result of the hashing function. The avalanche is always applied when using the consistent hashing directive. It is intended to provide a good distribution for little input variations. The result is quite suited to fit over a 32-bit space with enough variations so that a randomly picked number falls equally before any server position, which is ideal for consistently hashed backends, a common use case for caches.
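
    To make the technique concrete, here is a classic 32-bit avalanche mix in Java (the Jenkins-style six-shift integer hash). This is an illustration of the general idea, not Haproxy’s exact code:

```java
// A classic 32-bit avalanche mix: alternating adds, XORs, and shifts so
// that flipping one input bit changes roughly half of the output bits.
public class Avalanche {
    public static int mix(int a) {
        a = (a + 0x7ed55d16) + (a << 12);
        a = (a ^ 0xc761c23c) ^ (a >>> 19);
        a = (a + 0x165667b1) + (a << 5);
        a = (a + 0xd3a2646c) ^ (a << 9);
        a = (a + 0xfd7046c5) + (a << 3);
        a = (a ^ 0xb55a4f09) ^ (a >>> 16);
        return a;
    }

    public static void main(String[] args) {
        // Nearby inputs should land far apart after mixing.
        System.out.println(Integer.toHexString(mix(1)));
        System.out.println(Integer.toHexString(mix(2)));
    }
}
```

    Applying a mix like this to the raw hash output is what lets small input variations (consecutive ports, similar hostnames) spread evenly over the 32-bit space before the consistent-hash ring lookup.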

    In additional tests involving alternative algorithms for hash input and an option to trigger avalanche, we found different algorithms perform better on different criteria. DJB2 performs well when hashing ASCII text, which makes it a good choice for hashing HTTP Host headers. Other alternatives perform better on numbers and are a good choice when using the source IP. The results also vary with use of the avalanche flag.

    What are the effects of the DJB2 and avalanche algorithms on your production deployments? Would it not be great to have an option that lets you play with the hash function and determine via configuration if avalanche is beneficial to you?

    Now you can. Tumblr’s patches for enabling the hashing alternative and an option to trigger avalanche are now available in Haproxy 1.5. Let us know the results of your testing.