%{
  title: "Rebuilding Our Data Pipeline with Broadway",
  author: "Jane Doe",
  tags: ~w(elixir broadway data-engineering),
  description: "How we replaced our Kafka consumer with Broadway for 10x throughput"
}
---

Last quarter we hit a wall with our homegrown Kafka consumer. Message lag was growing, backpressure was non-existent, and our on-call engineers were losing sleep. We decided to rebuild on [Broadway](https://github.com/dashbitco/broadway).

## Why Broadway?

Broadway gives us three things our old consumer lacked:

- **Batching** — messages are grouped before hitting the database, cutting our write volume by 90%
- **Backpressure** — producers only send what consumers can handle
- **Fault tolerance** — failed messages are retried automatically with configurable strategies

## The migration

We ran both pipelines in parallel for two weeks, comparing output row-by-row. Once we confirmed parity, we cut over with zero downtime.

```elixir
defmodule MyApp.EventPipeline do
  use Broadway

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        # BroadwayKafka handles group membership and offset commits for us.
        module: {BroadwayKafka.Producer,
         [
           hosts: [localhost: 9092],
           group_id: "my_app_events",
           topics: ["events"]
         ]}
      ],
      # Ten concurrent processors decode messages in parallel.
      processors: [default: [concurrency: 10]],
      # Flush to the database every 100 messages or every 500 ms,
      # whichever comes first.
      batchers: [default: [batch_size: 100, batch_timeout: 500]]
    )
  end

  @impl true
  def handle_message(_, message, _) do
    # Decode to atom keys so the rows match the field names that
    # Repo.insert_all expects. (`keys: :atoms` creates atoms at runtime,
    # which is fine here because we control the event schema.)
    Broadway.Message.update_data(message, &Jason.decode!(&1, keys: :atoms))
  end

  @impl true
  def handle_batch(_, messages, _, _) do
    rows = Enum.map(messages, & &1.data)
    MyApp.Repo.insert_all("events", rows)
    messages
  end
end
```

## Results

After the migration, our p99 processing latency dropped from 12s to 180ms, and we haven't had a single page about consumer lag since.
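One thing the pipeline module above glosses over is what happens when a payload fails to decode. A sketch of how the fault-tolerance bullet plays out in practice, assuming the same `MyApp.EventPipeline` module (the `handle_failed/2` body is illustrative, not our production code):

```elixir
# Inside MyApp.EventPipeline — a hardened handle_message, plus Broadway's
# optional handle_failed/2 callback.

@impl true
def handle_message(_, message, _) do
  Broadway.Message.update_data(message, &Jason.decode!/1)
rescue
  # A malformed payload marks only this message as failed; the rest of
  # the batch keeps flowing instead of crashing the processor.
  error -> Broadway.Message.failed(message, Exception.message(error))
end

@impl true
def handle_failed(messages, _context) do
  # Last stop before Broadway acknowledges the failures: log them, emit
  # telemetry, or forward to a dead-letter topic here.
  Enum.each(messages, fn msg ->
    IO.warn("event failed: #{inspect(msg.status)}")
  end)

  messages
end
```

The key property is isolation: one bad message no longer takes down the batch it arrived with.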
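For completeness, the pipeline is started like any other supervised process. A minimal sketch, assuming a conventional `MyApp.Application` module (names here are illustrative):

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      MyApp.Repo,
      # A Broadway pipeline is a plain child spec; if it crashes, the
      # supervisor restarts it and consumption resumes from the last
      # committed Kafka offset.
      MyApp.EventPipeline
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```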