Batch Data & Streaming Data in one Atom (with Jove Zhong)
April 24, 2024 · 51 min

Every database has to juggle the need to process new data and to query old data. That task falls to any system that “does stuff and remembers stuff”. But it’s quite hard to really optimise one system for both use cases. There are different constraints on new and old data, and as a system gets larger and larger, those differences multiply to breaking point. That’s something Twitter’s engineers were figuring out in the 2010s.

One solution that came up in those years was the Lambda Architecture: a two-pronged approach that recognises the divide between new and old data, and works hard to blend the two together seamlessly in userspace. But that seamless blending is easier said than done. It’s nearly all bespoke work.

What if you could get it off the shelf? Let someone else do the work of combining two different kinds of database into one neat package? That's the question of the week as we look at the recently open-sourced project Proton, and its attempt to be the Lambda Architecture in a box…

Proton Docs: https://docs.timeplus.com/proton

Proton Source: https://github.com/timeplus-io/proton

Timeplus: https://www.timeplus.com/

Kris on Mastodon: http://mastodon.social/@krisajenkins

Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/

Kris on Twitter: https://twitter.com/krisajenkins

#podcast #softwareengineering #databases #dataengineering