Resizing Gifs On-Demand

Why dynamically resize images in the first place?

Before the dawn of on-demand resizing at Tumblr, every posted image was resized into seven or eight different sizes, and each was saved into our backing media store (a massive S3 bucket). This made serving our images very fast—just grab the size you want right from the bucket! While this was great, it also meant that any changes to our image processing would not affect any images we had already saved (billions of images). If we were to upgrade image quality, add a new size crop, or change how we handle taking down media, the effect would only be marginal…what a bummer! The cost of storing all the resizes as separate files (petabytes of data!), along with a lack of agility moving forward, motivated us to adopt a dynamic resizing and serving strategy.

We began with resizing jpg and png images on-demand instead of persisting each different resize crop in our S3 bucket. This has been a great success; our “Dynamic Image Resizer” churns through over 6,000 images a second, at a roundtrip request latency of only 250ms per image. Not having to store the resizes saves us tens of thousands of dollars a month! So, the natural question was, can we also do this for gifs and make a “Dynamic Gif Resizer?”

The problem with resizing gifs on-demand

Gifs, as a medium, are a wonderful thing. They capture a special or hilarious moment and repeat it back to you, forever. However, the actual Graphics Interchange Format leaves much to be desired. Last touched in 1989, the format is woefully outdated, and this begets massive, low-quality animated images. When compared to video format counterparts (H.264 and the like), a gif file can be tens of times larger at similar visual quality. Many companies have punted on the gif file format entirely; imgur released their gifv format, which wraps an mp4 video. Instagram will loop your video clips, but will flatten gifs to a still image. However, as the true “home of the gif,” Tumblr isn’t ever giving up on your gif files!

Resize it faster

A while ago, one of my colleagues @dngrm posted about updates we made in our gif resizing technology. Essentially, we switched our gif resizing library from ImageMagick to gifsicle with great success: we got lower latency and higher-quality results. In order to resize a gif in a realistic timeframe for on-demand resizing and serving, we proposed some changes to gifsicle that parallelize the resizing step. Since a gif is just a stack of image frames, we figured that resizing them using a thread pool could lead to a performance improvement. Luckily for us (and the world!), gifsicle author Eddie Kohler accepted and merged our changes into gifsicle. With this new threaded resize option in gifsicle, we gained about a 1.5-2x speed-up in resizing an average gif compared with vanilla gifsicle. This brought down the average wall time of a gif resize to about 100ms. The entire gif resize request (downloading the upstream image, resizing the gif, and serving the response) is now only 400ms on average.
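To make the "stack of frames on a thread pool" idea concrete, here is a minimal Python sketch. It is not the gifsicle change itself (that is C), and it assumes each frame is an independent, fully rendered grid of palette indices, which real gifs do not always guarantee. The names `resize_frame` and `resize_gif_frames` are made up for illustration, and the nearest-neighbor scaling is just the simplest stand-in for a real resampler.

```python
from concurrent.futures import ThreadPoolExecutor


def resize_frame(frame, new_w, new_h):
    """Nearest-neighbor resize of one frame (a list of pixel rows)."""
    old_h, old_w = len(frame), len(frame[0])
    return [
        [frame[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def resize_gif_frames(frames, new_w, new_h, workers=4):
    """Resize every frame on a thread pool; map() keeps the frame order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda f: resize_frame(f, new_w, new_h), frames))


# Two hypothetical 4x4 frames of palette indices, scaled down to 2x2.
frames = [[[i] * 4 for _ in range(4)] for i in range(2)]
small = resize_gif_frames(frames, 2, 2)
```

Pure-Python threads will not reproduce the speed-up described above because of the interpreter lock; the sketch only shows the shape of the work split, where each worker takes whole frames and the results are reassembled in order.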

Cache is king

To make all this possible, Tumblr relies heavily on CDNs to cache massive amounts of static content and avoid repeated work. Thanks to an incredibly high cache hit ratio on our CDNs, the Dynamic Gif Resizer only gets a little over 1,000 resize requests per second.

On top of that, we rely on conditional GET requests and 304 Not Modified responses to cut down on the amount of real work we must do at the resizing level. The share of 304s we serve fluctuates between 30% and 50% of all gif responses, which saves us a tremendous amount of compute time!
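As a rough illustration of why a 304 saves compute, the sketch below answers a conditional GET before any resizing happens. It is an assumption-laden toy, not our nginx module: `handle_get`, `source_version`, and the ETag scheme (source version plus target width) are all hypothetical, and it only handles a single exact-match `If-None-Match` value.

```python
def handle_get(request_headers, source_version, width, resize):
    """Return (status, headers, body); skip the resize when the client's copy is current.

    source_version is a hypothetical identifier for the stored original,
    and resize is a callable that does the expensive work for a width.
    """
    etag = f'"{source_version}-w{width}"'
    if request_headers.get("If-None-Match") == etag:
        # The client already holds this exact resize: no pixels are touched.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, resize(width)


calls = []


def fake_resize(width):
    """Stand-in for the real resizer; records each time it is invoked."""
    calls.append(width)
    return b"GIF89a..."


first = handle_get({}, "abc123", 250, fake_resize)
second = handle_get({"If-None-Match": first[1]["ETag"]}, "abc123", 250, fake_resize)
```

The key design point is that the validator is computable from request metadata alone, so the 304 path never has to download or resize the image.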

Putting it all together

The resizer itself is an nginx server with a custom module that does the upstreaming and resizing, and is written in C. The jpg/png resizer utilizes OpenCV for image manipulation, while the gif resizer uses the aforementioned gifsicle library.
Our fleet of resizers and their surrounding architecture are housed in AWS. The main motivation for this was colocation with our image store (S3) and the ability to automatically scale our instance count up and down, depending on time of day (our traffic pattern is heavily cyclic over a 24h window). The rest of Tumblr’s architecture is housed in our own DC. Below is a minimalistic diagram of our resizer setup.

[Diagram: resizer setup]

Thanks

Both the Image Resizer and Gif Resizer were massive undertakings, and a lot of people deserve credit for fantastic work:
Massive thanks to co-developer @naklin, and to @neerajrajgure who helped with improvements.
To @dngrm, @michaelbenedict, @heinstrom, @yl3w, and @jeffreyweston for architecture help and sage advice.
To our AWS enterprise support team Frank Cincotta, Shaun Qualheim, Darrell DeCosta, and Dheeraj Achra.
And to Eddie Kohler who helped clean up my ugly gifsicle changes and let them be a part of his library.

Questions? Comments?

Talk to me on tumblr using our new messaging system! My tumblr is @hashtag-content

gif gif resizer resizer engineering tumblr staff
Who doesn’t love animated GIFs?

Believe it or not, support for GIFs at Tumblr was a happy accident! When Tumblr put together the code for handling JPEGs, support for GIFs (and PNGs) happened to also work using the same code. Perhaps even more surprising is that the tools used to handle GIFs at Tumblr hadn’t changed much from those early days.

The image above is an original from sukme that could not be posted to Tumblr last June. It also would have failed if he’d tried last Sunday. If you click-through to the original post, you will see a muddy, reduced-saturation mess. All this because our resizer couldn’t handle the original. 

I’ve got ninety-nine problems and the GIF is one

There is a lot of misinformation about GIF limits on Tumblr, so let me set the record straight: We don’t count colors or frames or pixels. We only count bytes and seconds. Every image that comes in is scaled to a number of smaller sizes and the smaller your image is, the fewer resizes need to happen, which means less time. 

We had two core failure modes in our prior resizer:

1.) Some images would take as much as several minutes to convert. This was not directly attributable to color, dimensions, or frame count, but a mysterious mix of all of them.

2.) Some images would balloon in size (600KB at 400x400, 27MB at 250x250).

The unpredictability of these failures made our GIF limits feel arbitrary and terrible to the end users. Some have gone so far as to threaten monkey kicks. I don’t want to get kicked by a monkey, so we started working hard late last year to fix it. 

A proposed solution

Some of you may have seen this post where the performance of our current converter was compared with a new “mystery” converter. The mystery converter was roughly 1000x faster on the “slapping” GIF and happened to look great, but had quality problems on other images. Those were more fully explored here a couple of days later.

If you haven’t figured it out yet, the mystery converter is gifsicle.

Getting a better handle on it

To get an unbiased test set, I took a random sample of roughly 90K GIFs that Tumblr users tried to upload, not limiting the corpus only to those that succeeded. These were tested against the current converter, resizing down to the next size we produce. Each resize is given up to 20 seconds to complete in our application, but all resizes must complete in 30 seconds. All resizes must be under 1MB or we will convert the first frame to JPEG and call it a day. 
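The limits above can be read as a small decision rule. The following Python sketch encodes one interpretation of them; it is not our production code. It assumes that "all resizes must complete in 30 seconds" means the sum of the individual resize times, and that 1MB means 10**6 bytes. `ResizeResult` and `classify_upload` are hypothetical names.

```python
from dataclasses import dataclass

PER_RESIZE_SECONDS = 20.0  # budget for any single resize
TOTAL_SECONDS = 30.0       # budget for all resizes of one upload (assumed to be a sum)
MAX_BYTES = 1_000_000      # assumption: "1MB" taken as 10**6 bytes


@dataclass
class ResizeResult:
    seconds: float
    size_bytes: int


def classify_upload(results):
    """Return 'timeout', 'jpeg_fallback', or 'ok' for one upload's resizes."""
    if any(r.seconds > PER_RESIZE_SECONDS for r in results):
        return "timeout"
    if sum(r.seconds for r in results) > TOTAL_SECONDS:
        return "timeout"
    if any(r.size_bytes >= MAX_BYTES for r in results):
        # Too big: the first frame is converted to JPEG instead.
        return "jpeg_fallback"
    return "ok"
```

For example, `classify_upload([ResizeResult(15, 10), ResizeResult(16, 10)])` hits the overall time budget even though each resize is individually under 20 seconds.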

2.6% of my 90K GIFs took longer than 20 seconds to resize. This is an underestimation of how many GIFs would be rejected for time because this is only one of several resizes required. A whopping 17.1% of all GIFs were over 1MB. Even if we bump up to 2MB, the rejection rate is 2.75%. The converter was making over 25% of all resizes larger than the higher-resolution originals! The total rejection rate for my sample set was 4.46% of all original GIFs uploaded. 

Using gifsicle is so much faster that our CPU rejection rate drops to 0.00% on my test set. Also, just under 99% of all images were smaller when resized than they were at their original resolution. The size rejection rate was a much lower 0.59%.

Gifsicle problems

As compelling as the performance of gifsicle is, the quality problems are too much to ignore. We played around with the code a bit, but eventually we just got in touch with the author, Dr. Eddie Kohler. The specifics are in this post, but the short version is that Eddie was able to improve quality by adding some more advanced resampling methods as well as palette expansion for small-palette images. This increased our size rejection rate to 0.68% while still keeping us well under our CPU budget. 

Proving it

Image processing is all about choices. How do you resample? Do you sharpen? Where in the workflow is gamma correction applied, if at all? The list goes on and on. 

As you can imagine from the performance differences, our previous converter and gifsicle take very different approaches to GIF resizing. The output images look different. Sometimes it is slight, sometimes it is significant, but there is no way we could put out a converter that messes up your images, even if it messes them up quickly. 

We set up a qualitative study. The goal was simply to prove that we weren’t doing worse than our old converter, not necessarily that we were doing better. This study was opened up to all Tumblr employees, as well as some “randomly selected” outsiders (my friends and family). Participants were presented with one of two questions:

1.) Given an original and 1 resize, decide whether it is ok, unacceptable, or completely broken.

2.) Given an original and 2 resizes (left and right positions were chosen at random, and sometimes the two were identical), choose the better image or say there is no difference.

The results were everything I could have hoped for. The “acceptable” test showed that users found gifsicle better at producing acceptable results (87% vs. 84%), but not by a statistically significant amount (p=0.086) and that gifsicle produced fewer broken GIFs (0.71% vs. 1.38%), but again not enough to say it is definitively better (p=0.106). The “better” test found users preferring gifsicle 37% of the time, the prior converter only 16% of the time, but users also preferred one identical image over the other 27% of the time. Again, it is hard to say that gifsicle is better, but it is clear that it is no worse.
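For readers who want to see how a p-value like the ones above can come out of two percentages, here is a sketch of a two-sided, two-proportion z-test in Python. This is an assumption about method: the post does not say which test was actually used, and the study's sample sizes are not given, so the counts in the example are purely hypothetical and will not reproduce the reported p-values.

```python
import math


def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """Two-sided p-value for H0: both groups share one underlying proportion."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so nothing to distinguish
    z = (p_a - p_b) / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts only: 87% vs. 84% "acceptable" out of 1,000 ratings each.
p = two_proportion_p_value(870, 1000, 840, 1000)
```

The same three-point gap becomes far more convincing with more ratings, which is why a difference like 87% vs. 84% can still fail to reach significance in a modest study.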

Putting it all together

The development and testing described above took from late October until the beginning of March. Packaging, deployment, and integration took only a couple of weeks!

We aren’t done. There is work underway exploring how we handle JPEGs and PNGs. There are a slew of features that we can go after. This was a big step, a necessary step, but not the end for sure. 

We are a community, it takes a village, there’s no “i” in GIF

This project couldn’t have happened without the excellent work of Eddie Kohler in creating, maintaining, and enhancing gifsicle. Tumblr’s Site Reliability Engineering group packaged and helped deploy gifsicle onto hundreds and hundreds of machines in our datacenter. Tumblr’s Security Team vetted the code, both by inspection and by attacking it to make sure we stay safe. This was all for the awesome Tumblr creators, but I have to mention qilme/sukme (same dude, two blogs), reallivingartist, and especially gnumblr for their help in understanding and ultimately attacking this monstrous problem.

gif gif limit gifsicle tumblr

Meet Dr. Eddie Kohler, a GIF creator’s best friend!

Scattered across the wacky animated set above is Eddie Kohler, professor of computer science at Harvard and the author of gifsicle since 1997. When it came time for Tumblr to reexamine how we manipulate GIFs, every engineer who looked at the problem inevitably came upon gifsicle, and every engineer eventually came to the same conclusion: the performance is stunning, but the quality just isn’t there. If you read between the lines of this post, sampling was obviously the issue.

Late last year, we got in touch with Eddie. After we came to a mutual understanding of the problem, Eddie agreed to come visit us in New York. We spent Friday the 13th basking in the warm, coal-fired glow of the GIF format and how to process it. 

By the end of the day we had a handshake-deal for Tumblr to sponsor some feature development on gifsicle, and what we are releasing now is the result of that work. 

I would love to say there was a mutual “eureka” moment, but that would be a lie. Eddie showed up with some brilliant ideas about how to handle resizing while maintaining performance and quality. 

Resampling:

Eddie added several resampling methods, including some hybrid modes. None are as fast as the “naïve” default method, but the results are simply much better. 

Palette:

Our old tool threw away all the palette information, resizing as if there were no color limits, and then took a second pass to try to create the optimum palette for the image. This is slow and takes a ton of memory and can leave images looking muddy unless you sharpen them afterwards. 

Gifsicle takes a very different approach. Scaling and resampling use 45-bit RGB colors (extra precision to allow a safe round-trip through gamma correction), but the results are fit to the original image’s color palette. This works for the vast majority of images while still avoiding the problem of having to choose which colors will be selected for the 512-color maximum palette (256 global and 256 frame-local colors). Despite my skepticism, this works amazingly well.

The last change made here was to allow optional expansion of the palette for small-palette images. When reducing the size of a 2-color black-and-white GIF, it is nice to be able to use a few shades of gray for some of the pixels. 
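The "fit the results to the original palette" step can be pictured as a nearest-color lookup. The Python sketch below is an illustration only, not gifsicle's code: it assumes plain squared distance in RGB, ignores gamma handling and palette expansion, and uses hypothetical names (`nearest_palette_index`, `fit_to_palette`) and made-up sample colors.

```python
def nearest_palette_index(color, palette):
    """Index of the palette entry closest to color (squared RGB distance)."""
    r, g, b = color
    return min(
        range(len(palette)),
        key=lambda i: (palette[i][0] - r) ** 2
        + (palette[i][1] - g) ** 2
        + (palette[i][2] - b) ** 2,
    )


def fit_to_palette(pixels, palette):
    """Map high-precision resampled colors back onto the existing palette."""
    return [nearest_palette_index(p, palette) for p in pixels]


# A tiny hypothetical palette and three resampled (non-integer) colors.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
resampled = [(10.5, 8.0, 12.25), (250.0, 240.0, 245.5), (200.0, 30.0, 40.0)]
indices = fit_to_palette(resampled, palette)
```

Because the palette is reused rather than rebuilt, there is no second quantization pass over the whole image, which is where the old tool spent much of its time and memory.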

Results

As previously mentioned, we took a little speed hit by changing the resampling. That meant that we were only 10x faster than our previous converter instead of being 12-15x faster. The images are significantly smaller, too. Perhaps the biggest thing is that it is highly unlikely that a resize to smaller dimensions will create a bigger file. So now the animation you lovingly crafted and optimized to be under a megabyte won’t surprise you by timing out or exploding in size and getting rejected. 

The bottom line: our rejection rate using our old tool is estimated to be 4.46% of all original GIFs. Using gifsicle reduces that to 0.68% of all submitted GIFs, and no rejection of GIFs under 1MB. Oh, and your submissions will complete much faster. 

Eddie isn’t a Tumblr user, so send any thanks to him on Twitter.

gif gifsicle gif limt