%{
  title: "Rebuilding Our Data Pipeline with Broadway",
  author: "Jane Doe",
  tags: ~w(elixir broadway data-engineering),
  description: "How we replaced our Kafka consumer with Broadway for 10x throughput"
}
Last quarter we hit a wall with our homegrown Kafka consumer. Message lag was growing, backpressure was non-existent, and our on-call engineers were losing sleep. We decided to rebuild on Broadway.
Why Broadway?
Broadway gives us three things our old consumer lacked:
- Batching — messages are grouped before hitting the database, cutting our write volume by 90%
- Backpressure — producers only send what consumers can handle
- Fault tolerance — failed messages are retried automatically with configurable strategies
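That last point deserves a concrete illustration. A minimal sketch of explicit failure handling (the module shape matches the pipeline below; the logging and error handling details are illustrative assumptions, not our production code):

```elixir
# In handle_message/3, a decode error can mark the message as failed
# instead of crashing the processor:
@impl true
def handle_message(_processor, message, _context) do
  case Jason.decode(message.data) do
    {:ok, decoded} -> Broadway.Message.put_data(message, decoded)
    {:error, reason} -> Broadway.Message.failed(message, inspect(reason))
  end
end

# Failed messages are handed to the optional handle_failed/2 callback
# before acknowledgement, where they can be logged or dead-lettered:
@impl true
def handle_failed(messages, _context) do
  require Logger
  Enum.each(messages, &Logger.warning("dropping event: #{inspect(&1.data)}"))
  messages
end
```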
The migration
We ran both pipelines in parallel for two weeks, comparing output row-by-row. Once we confirmed parity, we cut over with zero downtime.
```elixir
defmodule MyApp.EventPipeline do
  use Broadway

  alias Broadway.Message

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        module: {BroadwayKafka.Producer,
         [
           hosts: [localhost: 9092],
           group_id: "my_app_events",
           topics: ["events"]
         ]}
      ],
      processors: [default: [concurrency: 10]],
      # Flush a batch at 100 messages or after 500 ms, whichever comes first.
      batchers: [default: [batch_size: 100, batch_timeout: 500]]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    Message.update_data(message, &Jason.decode!/1)
  end

  @impl true
  def handle_batch(_batcher, messages, _batch_info, _context) do
    # insert_all on a schemaless source expects atom column keys, so the
    # string keys from Jason.decode!/1 are converted first (all columns are
    # assumed to already exist as atoms).
    rows =
      Enum.map(messages, fn message ->
        Map.new(message.data, fn {k, v} -> {String.to_existing_atom(k), v} end)
      end)

    MyApp.Repo.insert_all("events", rows)
    messages
  end
end
```
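The row-by-row comparison during the parallel run could be sketched like this (the table names `events_old` and `events_new`, the `inserted_at` column, and the time-window shape are assumptions for illustration, not our actual schema):

```elixir
# Hypothetical parity check: compare row counts between the legacy table
# and the Broadway-fed table for a given time window.
import Ecto.Query

def counts_match?(repo, since) do
  old_count =
    repo.aggregate(
      from(e in "events_old", where: e.inserted_at >= type(^since, :utc_datetime)),
      :count
    )

  new_count =
    repo.aggregate(
      from(e in "events_new", where: e.inserted_at >= type(^since, :utc_datetime)),
      :count
    )

  old_count == new_count
end
```

Counts alone won't catch payload drift, so a real check would also sample rows and diff their contents, but this was enough to gate each stage of the cutover.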
Results
After the migration, our p99 processing latency dropped from 12s to 180ms and we haven't had a single page about consumer lag since.