Alex Spinov

Redpanda Connect Has a Free API: The Stream Processing Tool That Transforms Data Between Any Two Systems

You need to move data from Kafka to PostgreSQL. Or from S3 to Elasticsearch. Or from webhooks to BigQuery. Instead of writing a custom ETL script for each pair, Redpanda Connect (formerly Benthos) pipes data between any of its 200+ connectors, applying transformations along the way.

What Is Redpanda Connect?

Redpanda Connect is an open-source stream processing tool. It reads data from inputs, applies transformations (via a language called Bloblang), and writes to outputs. It supports 200+ connectors: Kafka, AWS S3, PostgreSQL, HTTP, MQTT, AMQP, GCP, and many more.
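
That model reduces to a three-stage YAML file: an input, an optional pipeline of processors, and an output. A minimal sketch using the documented stdin and stdout connectors (the mapping itself is illustrative):

# minimal.yaml — the three-stage shape every pipeline follows
input:
  stdin: {}

pipeline:
  processors:
    - bloblang: |
        # content() returns the raw message bytes; coerce to string, then uppercase
        root = content().string().uppercase()

output:
  stdout: {}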

The Free Tool

Redpanda Connect is free to use and open source (the core and most connectors are Apache-2.0; a small set of enterprise connectors require a Redpanda license):

  • 200+ connectors: Kafka, S3, PostgreSQL, HTTP, MQTT, GCP, Azure
  • Bloblang: Powerful data transformation language
  • Single binary: No JVM, no runtime dependencies
  • Batching: Automatic micro-batching for throughput
  • Error handling: Dead letter queues, retries, backoff (see the dead-letter sketch after this list)
  • Monitoring: Prometheus metrics, tracing
  • Config-driven: YAML configuration, no code needed
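
The error handling deserves a concrete shape. A hedged sketch of a dead-letter pattern using the documented fallback output: messages the first output rejects after its retries fall through to the next one. The DSN and dead-letter topic name here are hypothetical.

# Anything the Postgres sink rejects falls through to a Kafka dead-letter topic.
output:
  fallback:
    - sql_insert:
        driver: postgres
        dsn: "postgres://user:pass@localhost:5432/analytics"  # hypothetical DSN
        table: events
        columns: ["payload"]
        args_mapping: 'root = [ content().string() ]'
    - kafka:
        addresses: ["localhost:9092"]
        topic: "events_dlq"  # hypothetical dead-letter topic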

Quick Start

Install:

curl -LO https://github.com/redpanda-data/connect/releases/latest/download/redpanda-connect_linux_amd64.tar.gz
tar xzf redpanda-connect_linux_amd64.tar.gz

Pipe data between systems:

# config.yaml — Read from Kafka, transform, write to PostgreSQL
input:
  kafka:
    addresses: ["localhost:9092"]
    topics: ["user_events"]
    consumer_group: "pg-sink"

pipeline:
  processors:
    - bloblang: |
        root.user_id = this.userId
        root.event = this.eventType
        root.timestamp = this.ts.ts_parse("2006-01-02T15:04:05Z")
        root.properties = this.data.string()

output:
  sql_insert:
    driver: postgres
    dsn: "postgres://user:pass@localhost:5432/analytics"
    table: events
    columns: ["user_id", "event", "timestamp", "properties"]
    args_mapping: |
      root = [
        this.user_id,
        this.event,
        this.timestamp,
        this.properties
      ]
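
Before wiring the whole pipeline, the Bloblang mapping can be tested in isolation. Redpanda Connect inherits Benthos's blobl subcommand, which applies a mapping to JSON documents on stdin (verify the subcommand with redpanda-connect --help on your build):

echo '{"userId":"u1","eventType":"click"}' | redpanda-connect blobl 'root.user_id = this.userId'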

Run:

redpanda-connect run config.yaml
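
It's also worth linting the config first; the lint subcommand (inherited from Benthos) catches typos and unknown fields before anything connects:

redpanda-connect lint config.yaml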

HTTP webhook to S3:

input:
  http_server:
    path: /webhooks
    allowed_verbs: ["POST"]

pipeline:
  processors:
    - bloblang: |
        root = this
        root.received_at = now()
        root.source = meta("http_server_request_path")

output:
  aws_s3:
    bucket: my-webhook-archive
    path: "webhooks/${!timestamp_unix()}-${!uuid_v4()}.json"
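
To try it locally: with no address set, the http_server input attaches to the service-wide HTTP server, which listens on port 4195 by default, so a POST to the configured path should land an object in the bucket (port and payload shown are illustrative):

curl -X POST http://localhost:4195/webhooks \
  -H 'Content-Type: application/json' \
  -d '{"event":"signup","user":"u_123"}'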

Why Teams Choose Redpanda Connect

A data team maintained 15 custom Python scripts to move data between Kafka, S3, PostgreSQL, and Elasticsearch. Each script had different error handling, retry logic, and monitoring. After replacing them with Redpanda Connect configs, they had consistent error handling, built-in retries, and Prometheus metrics for every pipeline. Total YAML: 200 lines replacing 3,000 lines of Python.

Who Is This For?

  • Data engineers building ETL/ELT pipelines
  • Backend developers moving data between microservices
  • DevOps teams needing log routing and transformation
  • Anyone writing custom scripts to pipe data between systems

Start Streaming

Redpanda Connect replaces custom data pipeline scripts with simple YAML configs. 200+ connectors, powerful transformations, zero code.

Need help with data pipelines or stream processing? I build custom data solutions — reach out to discuss your project.


Found this useful? I publish daily deep-dives into developer tools and APIs. Follow for more.
