Module kafka


Kafka streaming ELT (Extract → Load → Transform).

Kafka is not a file connector. RDP treats it the way stream frameworks such as Flink and Kafka Streams do, as three separate stages:

  1. Extract — poll a bounded window from a topic (poll_kafka_window) or accept records from your host consumer (elt_load_kafka_records).
  2. Load — land raw/semi-structured rows to storage (Parquet, Postgres COPY, object store) with offsets preserved — no heavy transform in the hot path.
  3. Transform — run Polars SQL / pipeline JSON on landed data in a separate job or stage.
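The three stages above can be sketched with plain in-memory types. This is only an illustration of the separation of concerns, not the crate's implementation: `RawRecord`, `extract_window`, `load_raw`, `transform_sum` and `sample_log` are hypothetical names invented here, a `Vec` stands in for the landing storage, and a simple sum stands in for the Polars SQL stage.

```rust
// Hypothetical stand-ins for illustration; none of these names come from the crate.
#[derive(Debug, Clone, PartialEq)]
struct RawRecord {
    partition: i32,
    offset: i64,
    payload: String, // kept unparsed until the transform stage
}

/// Extract: take at most `max_records` records at or after `from_offset`.
fn extract_window(log: &[RawRecord], from_offset: i64, max_records: usize) -> Vec<RawRecord> {
    log.iter()
        .filter(|r| r.offset >= from_offset)
        .take(max_records)
        .cloned()
        .collect()
}

/// Load: append the window to the landing area untouched, offsets included.
fn load_raw(landing: &mut Vec<RawRecord>, window: Vec<RawRecord>) {
    landing.extend(window);
}

/// Transform: runs later and only over landed rows.
/// Here it sums the payloads that parse as integers and skips the rest.
fn transform_sum(landing: &[RawRecord]) -> i64 {
    landing
        .iter()
        .filter_map(|r| r.payload.trim().parse::<i64>().ok())
        .sum()
}

/// A small fake topic with one malformed payload at offset 1.
fn sample_log() -> Vec<RawRecord> {
    ["10", "oops", "5", "7"]
        .iter()
        .enumerate()
        .map(|(i, p)| RawRecord {
            partition: 0,
            offset: i as i64,
            payload: p.to_string(),
        })
        .collect()
}
```

Note that the malformed payload is landed as-is and only rejected in `transform_sum`; keeping parsing out of the load step is what lets a later transform job be rerun over the same landed rows.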

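The poll window (Vec<KafkaStreamRecord>) bounds how many records are held in memory at once and how much work sits between checkpoints. It is a backpressure and checkpoint-sizing control, not a unit of batch ETL.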
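One way to picture the window as a checkpoint unit is the loop below. It is a sketch under assumptions, not the crate's code: `Checkpoint` and `drain_in_windows` are hypothetical, a slice stands in for a single partition, and the `land` closure stands in for the load step. The checkpoint advances only after a window has been landed, so a failure costs at most one window of rework.

```rust
/// Hypothetical checkpoint: index of the next record to read from one partition.
#[derive(Debug, Default, PartialEq)]
struct Checkpoint {
    next_offset: usize,
}

/// Drains `backlog` in windows of at most `max_window` records.
/// Returns the number of windows landed, or the first error from `land`.
fn drain_in_windows<F>(
    backlog: &[String],
    max_window: usize,
    cp: &mut Checkpoint,
    mut land: F,
) -> Result<usize, String>
where
    F: FnMut(&[String]) -> Result<(), String>,
{
    assert!(max_window > 0, "window size must be positive");
    let mut windows = 0;
    while cp.next_offset < backlog.len() {
        let start = cp.next_offset;
        let end = (start + max_window).min(backlog.len());
        // Land first; on error, return without moving the checkpoint.
        land(&backlog[start..end])?;
        cp.next_offset = end;
        windows += 1;
    }
    Ok(windows)
}
```

With five records and `max_window = 2`, the loop lands windows of 2, 2 and 1 records. If the second `land` call fails, `next_offset` stays at 2 and a retry resumes from there. A smaller window means less memory held and less rework on failure, at the cost of more frequent checkpoints.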

Enable native I/O with --features kafka. See docs/KAFKA_ELT.md.

Functions

elt_load_kafka_records_json