
Why Kafka?
Kafka (the platform, not the author) is a foundational tool for a real-time distributed data stack. It looks scary to the uninitiated, but it is pretty simple: it handles real-time data streaming.
Making scary things not scary can be a challenge. We carry biases toward what we already know, leadership is understandably skeptical of "over-engineering," and adopting and integrating a new technology demands more than the kind of coding you get by plugging an AI tool into an IDE.
In short, solving scary problems means finding a tool or library that standardizes the local setup, so that any developer can write code seamlessly on a workstation and move that same code into production.
The constant problem in real-world systems is heterogeneous database needs. Often we have a Postgres database for operational data and a separate analytical store for historical and reporting data.
Kafka is an excellent tool for bridging the two: it streams changes out of the operational database and into the analytical store as they happen.
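To make that concrete, here is a minimal sketch using the kafka-python client (an assumption; any Kafka client follows the same pattern). A hypothetical orders topic carries events from the operational side, and an independent consumer on the analytical side reads them back. The broker address and topic name are placeholders for a local setup, not taken from the linked project.

```python
# pip install kafka-python
import json
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"  # assumed local broker address
TOPIC = "orders"           # hypothetical topic name

# Operational side: publish each new order as a JSON event.
producer = KafkaProducer(
    bootstrap_servers=BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(TOPIC, {"order_id": 42, "amount": 19.99})
producer.flush()

# Analytical side: an independent consumer reads the same stream
# and could load each event into a warehouse table.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop after 5s of silence, for the demo
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:
    print(message.value)  # e.g. {'order_id': 42, 'amount': 19.99}
```

The design point is decoupling: the producer knows nothing about the analytical store, so new consumers can attach to the same stream later without touching the operational code.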
The best approach to learning Kafka is to run it locally and watch exactly what it does. This project does just that: https://github.com/timowlmtn/bigdataplatforms/tree/master/src/kafka
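Before running any of that logic, it helps to confirm a local broker is actually reachable. Here is a small sanity check, again a sketch assuming kafka-python and the default listener on localhost:9092:

```python
# pip install kafka-python
from kafka import KafkaAdminClient

# Quick check that a locally running broker answers.
admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
print(admin.list_topics())  # names of topics on the local cluster
admin.close()
```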