%{
  title: "Rebuilding Our Data Pipeline with Broadway",
  author: "Jane Doe",
  tags: ~w(elixir broadway data-engineering),
  description: "How we replaced our Kafka consumer with Broadway for 10x throughput"
}
Last quarter we hit a wall with our homegrown Kafka consumer. Message lag was growing, backpressure was non-existent, and our on-call engineers were losing sleep. We decided to rebuild on Broadway.
Why Broadway?
Broadway gives us three things our old consumer lacked:
- Batching — messages are grouped before hitting the database, cutting our write volume by 90%
- Backpressure — producers only send what consumers can handle
- Fault tolerance — failed messages are retried automatically with configurable strategies
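That last point deserves a concrete illustration. A minimal sketch of explicit failure handling (the module shape matches the pipeline below; the logging and error handling details are illustrative assumptions, not our production code):

```elixir
# In handle_message/3, a decode error can mark the message as failed
# instead of crashing the processor:
@impl true
def handle_message(_processor, message, _context) do
  case Jason.decode(message.data) do
    {:ok, decoded} -> Broadway.Message.put_data(message, decoded)
    {:error, reason} -> Broadway.Message.failed(message, inspect(reason))
  end
end

# Failed messages are handed to the optional handle_failed/2 callback
# before acknowledgement, where they can be logged or dead-lettered:
@impl true
def handle_failed(messages, _context) do
  require Logger
  Enum.each(messages, &Logger.warning("dropping event: #{inspect(&1.data)}"))
  messages
end
```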
The migration
We ran both pipelines in parallel for two weeks, comparing output row-by-row. Once we confirmed parity, we cut over with zero downtime.
```elixir
defmodule MyApp.EventPipeline do
  use Broadway

  alias Broadway.Message

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        module: {BroadwayKafka.Producer,
         [
           hosts: [localhost: 9092],
           group_id: "my_app_events",
           topics: ["events"]
         ]}
      ],
      processors: [default: [concurrency: 10]],
      # Flush a batch at 100 messages or after 500 ms, whichever comes first.
      batchers: [default: [batch_size: 100, batch_timeout: 500]]
    )
  end

  @impl true
  def handle_message(_processor, message, _context) do
    Message.update_data(message, &Jason.decode!/1)
  end

  @impl true
  def handle_batch(_batcher, messages, _batch_info, _context) do
    # insert_all on a schemaless source expects atom column keys, so the
    # string keys from Jason.decode!/1 are converted first (all columns are
    # assumed to already exist as atoms).
    rows =
      Enum.map(messages, fn message ->
        Map.new(message.data, fn {k, v} -> {String.to_existing_atom(k), v} end)
      end)

    MyApp.Repo.insert_all("events", rows)
    messages
  end
end
```
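The row-by-row comparison during the parallel run could be sketched like this (the table names `events_old` and `events_new`, the `inserted_at` column, and the time-window shape are assumptions for illustration, not our actual schema):

```elixir
# Hypothetical parity check: compare row counts between the legacy table
# and the Broadway-fed table for a given time window.
import Ecto.Query

def counts_match?(repo, since) do
  old_count =
    repo.aggregate(
      from(e in "events_old", where: e.inserted_at >= type(^since, :utc_datetime)),
      :count
    )

  new_count =
    repo.aggregate(
      from(e in "events_new", where: e.inserted_at >= type(^since, :utc_datetime)),
      :count
    )

  old_count == new_count
end
```

Counts alone won't catch payload drift, so a real check would also sample rows and diff their contents, but this was enough to gate each stage of the cutover.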
Results
After the migration, our p99 processing latency dropped from 12s to 180ms and we haven't had a single page about consumer lag since.