Alex Spinov

Redpanda Connect Has a Free API: The Stream Processing Tool That Transforms Data Between Any Two Systems

You need to move data from Kafka to PostgreSQL. Or from S3 to Elasticsearch. Or from webhooks to BigQuery. Instead of writing a custom ETL script for each pair, Redpanda Connect (formerly Benthos) pipes data between any of its 200+ connectors, applying transformations along the way.

What Is Redpanda Connect?

Redpanda Connect is an open-source stream processing tool. It reads data from inputs, applies transformations (via a language called Bloblang), and writes to outputs. It supports 200+ connectors: Kafka, AWS S3, PostgreSQL, HTTP, MQTT, AMQP, GCP, and many more.
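
That model reduces to a three-stage YAML file: an input, an optional pipeline of processors, and an output. A minimal sketch using the documented stdin and stdout connectors (the mapping itself is illustrative):

# minimal.yaml — the three-stage shape every pipeline follows
input:
  stdin: {}

pipeline:
  processors:
    - bloblang: |
        # content() returns the raw message bytes; coerce to string, then uppercase
        root = content().string().uppercase()

output:
  stdout: {}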

The Free Tool

Redpanda Connect is free to use and open source (the core and most connectors are Apache-2.0; a small set of enterprise connectors require a Redpanda license):

  • 200+ connectors: Kafka, S3, PostgreSQL, HTTP, MQTT, GCP, Azure
  • Bloblang: Powerful data transformation language
  • Single binary: No JVM, no runtime dependencies
  • Batching: Automatic micro-batching for throughput
  • Error handling: Dead letter queues, retries, backoff (see the dead-letter sketch after this list)
  • Monitoring: Prometheus metrics, tracing
  • Config-driven: YAML configuration, no code needed
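
The error handling deserves a concrete shape. A hedged sketch of a dead-letter pattern using the documented fallback output: messages the first output rejects after its retries fall through to the next one. The DSN and dead-letter topic name here are hypothetical.

# Anything the Postgres sink rejects falls through to a Kafka dead-letter topic.
output:
  fallback:
    - sql_insert:
        driver: postgres
        dsn: "postgres://user:pass@localhost:5432/analytics"  # hypothetical DSN
        table: events
        columns: ["payload"]
        args_mapping: 'root = [ content().string() ]'
    - kafka:
        addresses: ["localhost:9092"]
        topic: "events_dlq"  # hypothetical dead-letter topic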

Quick Start

Install:

curl -LO https://github.com/redpanda-data/connect/releases/latest/download/redpanda-connect_linux_amd64.tar.gz
tar xzf redpanda-connect_linux_amd64.tar.gz

Pipe data between systems:

# config.yaml — Read from Kafka, transform, write to PostgreSQL
input:
  kafka:
    addresses: ["localhost:9092"]
    topics: ["user_events"]
    consumer_group: "pg-sink"

pipeline:
  processors:
    - bloblang: |
        root.user_id = this.userId
        root.event = this.eventType
        root.timestamp = this.ts.ts_parse("2006-01-02T15:04:05Z")
        root.properties = this.data.string()

output:
  sql_insert:
    driver: postgres
    dsn: "postgres://user:pass@localhost:5432/analytics"
    table: events
    columns: ["user_id", "event", "timestamp", "properties"]
    args_mapping: |
      root = [
        this.user_id,
        this.event,
        this.timestamp,
        this.properties
      ]
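
Before wiring the whole pipeline, the Bloblang mapping can be tested in isolation. Redpanda Connect inherits Benthos's blobl subcommand, which applies a mapping to JSON documents on stdin (verify the subcommand with redpanda-connect --help on your build):

echo '{"userId":"u1","eventType":"click"}' | redpanda-connect blobl 'root.user_id = this.userId'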

Run:

redpanda-connect run config.yaml
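
It's also worth linting the config first; the lint subcommand (inherited from Benthos) catches typos and unknown fields before anything connects:

redpanda-connect lint config.yaml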

HTTP webhook to S3:

input:
  http_server:
    path: /webhooks
    allowed_verbs: ["POST"]

pipeline:
  processors:
    - bloblang: |
        root = this
        root.received_at = now()
        root.source = meta("http_server_request_path")

output:
  aws_s3:
    bucket: my-webhook-archive
    path: "webhooks/${!timestamp_unix()}-${!uuid_v4()}.json"
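
To try it locally: with no address set, the http_server input attaches to the service-wide HTTP server, which listens on port 4195 by default, so a POST to the configured path should land an object in the bucket (port and payload shown are illustrative):

curl -X POST http://localhost:4195/webhooks \
  -H 'Content-Type: application/json' \
  -d '{"event":"signup","user":"u_123"}'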

Why Teams Choose Redpanda Connect

A data team maintained 15 custom Python scripts to move data between Kafka, S3, PostgreSQL, and Elasticsearch. Each script had different error handling, retry logic, and monitoring. After replacing them with Redpanda Connect configs, they had consistent error handling, built-in retries, and Prometheus metrics for every pipeline. Total YAML: 200 lines replacing 3,000 lines of Python.

Who Is This For?

  • Data engineers building ETL/ELT pipelines
  • Backend developers moving data between microservices
  • DevOps teams needing log routing and transformation
  • Anyone writing custom scripts to pipe data between systems

Start Streaming

Redpanda Connect replaces custom data pipeline scripts with simple YAML configs. 200+ connectors, powerful transformations, zero code.

Need help with data pipelines or stream processing? I build custom data solutions — reach out to discuss your project.


Found this useful? I publish daily deep-dives into developer tools and APIs. Follow for more.
