We hear quite a bit about BIG DATA and REAL-TIME stuff. But I’m never sure exactly how BIG the data is or how REAL the time is. Seems like these are often just cool words that the cool kids like to throw around. The Tumblr engineering staff is too busy to worry about being cool, but we do have BIG data and we are making it available in REAL time — starting today. We’re happy to announce that the Tumblr fire hose is open, thanks to our partnership with GNIP.
As Tumblr has grown (more than 50 million blogs and more than 20 billion posts), so has interest in the underlying data. Our API has been a great way to pull small slices of it, but now enterprise customers can get a complete real-time feed of all public data.
Handling all of this has been a significant but worthwhile engineering effort; the feed will give insight into the BIG and REAL creative community that powers Tumblr and produces a lot of fascinating data.
In a future post, we’ll talk about the underlying technical bits that make all this magic happen (hint: Kafka, Finagle). For now, go take a drink from the fire hose.