K for Fullstack Frontend

Running an Arweave Gateway in the Dark Web

Censorship resistance is a vital topic for decentralized storage solutions like Arweave. Permanent storage is just one variable of the equation. Governments can still prevent access to the perpetual storage, and users might face repercussions when accessing it anyway. Tor solves these two issues by anonymizing the connections between a user and a service like an Arweave gateway.

This article will explain how to turn an AR.IO node into an Onion Service by leveraging Docker Compose.

Target Audience

You should be comfortable with the Linux command line and the Git version control system; a basic understanding of HTTP and Docker won't hurt either.

Prerequisites

You need Git, Docker, and a browser that's able to handle .onion addresses (e.g., Tor Browser or Brave).

Background

Let's get some background on the technology you will use first.

What is Tor?

The Tor Project is a set of technologies that allow you to use the Internet anonymously; someone who monitors your traffic when using the Tor network can't find out what websites or services you're accessing. Not even the services themselves know who you are when accessing them.

When you send a service request into the Tor network, it gets relayed through multiple Tor nodes, and in the end, one random exit node calls the service, and its response is relayed back to you. On the way, the request is encrypted repeatedly, in layers, like an onion. That's where the name Tor comes from; initially, it was an acronym for The Onion Router (TOR).

Tor Network Architecture with HTTP Service

What is an Onion Service?

An Onion Service is a proxy server that runs alongside a service and makes that service reachable through the Tor network.

Usually, you go through 3 types of Tor nodes when accessing services via Tor.

  1. An entry node accepting a regular connection from a client outside the Tor network.
  2. Several intermediate nodes that relay the connection under layered encryption, so no single node in the network has all the information needed to trace the request back to the client.
  3. An exit node that will proxy the connection to the desired service on the Internet.

The big issue in this scenario is the exit node. After the connection leaves the Tor network, Tor's encryption no longer protects it, so anyone who runs an exit node can read any traffic that isn't otherwise encrypted (for example, with HTTPS) and try to use it to identify users of particular services.

An Onion Service gets around this issue by removing the need for an exit node. It's a Tor node that runs alongside the service, making it directly accessible on the Tor network. The service provider terminates the Tor connection on the same machine, or at least the same local network, that hosts the service.

Tor Network Architecture with Onion Service

Why Run an Arweave Gateway as an Onion Service?

First, it makes the data on Arweave accessible for everyone, even people in countries where governments block access to Arweave or track who is accessing it for persecution purposes.

Second, converting a service to an Onion Service means putting it into the Dark Web. User identities are unknown to the service, and the server's location and its owner's identity are also unknown to the users. Since hosting an Arweave gateway might be an issue in some countries—it can potentially distribute problematic information—running it as an Onion Service allows a gateway operator to stay safe.

Also, an Onion Service comes with quality-of-life improvements for gateway operators like encryption and NAT punching; you no longer need SSL certificates and can run a gateway on a local network behind a NAT.

Implementation

Now that you understand what Tor is about, let's get going!

The tasks that await us are:

  • Adding a Tor server to the AR.IO node's Docker Compose cluster
  • Optional: Improving security by filtering headers via Envoy
  • Optional: Improving UX by advertising the Tor address via headers and lowering hops in the Tor network

Creating the docker-compose.yaml File

To get started, create a docker-compose.yaml with the following content:

---
version: '3.0'

services:
  onion:
    image: fphammerle/onion-service
    ports:
      - 80:8000
    environment:
      VIRTUAL_PORT: '8000'
      TARGET: envoy:3000
    volumes:
      - type: volume
        target: /var/lib/tor
      - type: volume
        target: /onion-service
      - type: tmpfs
        target: /tmp
        tmpfs: {size: 4k}
    read_only: true
    cap_drop: [ALL]
    security_opt: [no-new-privileges]

  envoy:
    image: ghcr.io/ar-io/ar-io-envoy:latest
    build:
      context: envoy/
    expose:
      # expose lists container ports only; it takes no host:container mapping
      - "3000"
      - "9901"
    environment:
      - LOG_LEVEL=info
      - TVAL_AR_IO_HOST=core
      - TVAL_AR_IO_PORT=4000
      - TVAL_GATEWAY_HOST=${TRUSTED_GATEWAY_HOST:-arweave.net}
      - TVAL_GRAPHQL_HOST=${GRAPHQL_HOST:-core}
      - TVAL_GRAPHQL_PORT=${GRAPHQL_PORT:-4000}
      - TVAL_ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}

  core:
    image: ghcr.io/ar-io/ar-io-core:latest
    expose:
      # expose lists container ports only; it takes no host:container mapping
      - "4000"
    volumes:
      - ${CHUNKS_DATA_PATH:-./data/chunks}:/app/data/chunks
      - ${CONTIGUOUS_DATA_PATH:-./data/contiguous}:/app/data/contiguous
      - ${HEADERS_DATA_PATH:-./data/headers}:/app/data/headers
      - ${SQLITE_DATA_PATH:-./data/sqlite}:/app/data/sqlite
      - ${TEMP_DATA_PATH:-./data/tmp}:/app/data/tmp
    environment:
      - NODE_ENV=${NODE_ENV:-production}
      - LOG_FORMAT=${LOG_FORMAT:-simple}
      - TRUSTED_NODE_URL=${TRUSTED_NODE_URL:-}
      - TRUSTED_GATEWAY_URL=https://${TRUSTED_GATEWAY_HOST:-arweave.net}
      - START_HEIGHT=${START_HEIGHT:-}
      - STOP_HEIGHT=${STOP_HEIGHT:-}
      - SKIP_CACHE=${SKIP_CACHE:-}
      - SIMULATED_REQUEST_FAILURE_RATE=${SIMULATED_REQUEST_FAILURE_RATE:-}
      - INSTANCE_ID=${INSTANCE_ID:-}
      - AR_IO_WALLET=${AR_IO_WALLET:-}
      - ADMIN_API_KEY=${ADMIN_API_KEY:-}
      - BACKFILL_BUNDLE_RECORDS=${BACKFILL_BUNDLE_RECORDS:-}
      - FILTER_CHANGE_REPROCESS=${FILTER_CHANGE_REPROCESS:-}
      - ANS104_UNBUNDLE_FILTER=${ANS104_UNBUNDLE_FILTER:-}
      - ANS104_INDEX_FILTER=${ANS104_INDEX_FILTER:-}
      - ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}
      - SANDBOX_PROTOCOL=${SANDBOX_PROTOCOL:-}

This file is the Docker Compose configuration that ships with the AR.IO node, with a few changes:

  • An added onion service handles all the Tor traffic and relays it to the envoy service, which, in turn, forwards it to the AR.IO node.
  • The ports of the envoy and core services are now private to the cluster, so the gateway is only accessible as an Onion Service.

Running the Cluster

You can run the cluster immediately; no build step is required.

docker compose up

Testing the Gateway

If every container starts correctly, this command will get you the .onion address:

docker compose exec onion cat /onion-service/hostname

It should look like a random string with .onion at the end.
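As a quick sanity check, the address can be validated before you paste it into a browser. The sketch below assumes a current (version 3) Onion Service, whose hostname is 56 base32 characters (a-z and 2-7) followed by .onion. The is_onion_v3 function is a hypothetical helper, not part of the onion-service image.

```shell
# Hypothetical helper: succeed if the argument looks like a v3 .onion hostname,
# i.e. 56 base32 characters (a-z, 2-7) followed by ".onion".
is_onion_v3() {
  printf '%s\n' "$1" | grep -Eq '^[a-z2-7]{56}\.onion$'
}

# Example usage (requires the running cluster from above; -T disables the
# pseudo-TTY so the output contains no carriage returns):
#   addr=$(docker compose exec -T onion cat /onion-service/hostname)
#   is_onion_v3 "$addr" && echo "looks valid: $addr"
```

The check only looks at the shape of the hostname; it says nothing about whether the service is reachable.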

You can use it with clients that can handle .onion addresses, like the Tor Browser and Brave.

Getting the network info:

http://<ONION_ADDRESS>/info

Getting the UDL:

http://<ONION_ADDRESS>/yRj4a5KMctX_uOmKWCFJIjmY8DeJcusVk6-HzLiM_t8
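If you prefer the command line over a browser, the same requests can be sent with curl through a local Tor SOCKS proxy. This is a minimal sketch that assumes a Tor daemon is listening on 127.0.0.1:9050 (Tor's default SOCKS port; Tor Browser uses 9150 instead). The onion_url and fetch_via_tor functions and the TOR_SOCKS variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: build a gateway URL from an .onion host and a path.
# A leading slash on the path is optional.
onion_url() {
  printf 'http://%s/%s\n' "$1" "${2#/}"
}

# Hypothetical helper: fetch a path through a local Tor SOCKS proxy.
# Assumes a Tor daemon on 127.0.0.1:9050 unless TOR_SOCKS is set.
# --socks5-hostname makes the proxy resolve the name, which .onion hosts need.
fetch_via_tor() {
  curl --silent --show-error \
    --socks5-hostname "${TOR_SOCKS:-127.0.0.1:9050}" \
    "$(onion_url "$1" "$2")"
}

# Example usage:
#   fetch_via_tor "<ONION_ADDRESS>" /info
```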

Optional: Filtering Headers

While turning any service into an Onion Service is quite simple, actually staying anonymous depends on more factors. Check out the Tor docs on operational security before attempting to run this setup in production!

One of the steps you can take here is to filter out headers an attacker might use to identify your server or what software it runs.

Cloning the ar-io-node Repository

Since you must update the Envoy image to remove the headers, clone the ar-io-node repository from GitHub.

git clone https://github.com/ar-io/ar-io-node.git

Updating the Envoy Configuration

Then make the following changes to the ar-io-node/envoy/envoy.template.yaml file.

Add this code directly below the route_config: line:

                  response_headers_to_remove:
                    - server
                    - x-server-upstream-envoy

Note: Ensure correct indentation!

If you find any other problematic headers, add them to this list.

Replacing the docker-compose.yaml File

Overwrite the ar-io-node/docker-compose.yaml file with the one created above and run Docker Compose again, this time with the --build flag, so the Envoy Docker image contains your changes.

docker compose up --build

After this, the problematic headers are gone.
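One way to confirm this is to dump the response headers and search them for the names you removed. The sketch below is a hypothetical helper that reads raw headers on standard input; the header names are the two from the Envoy change above, so extend the pattern if you added more.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print any
# whose names match the ones removed above.
# Exit status 0 means none were found; 1 means at least one is still present.
find_leaky_headers() {
  if grep -Ei '^(server|x-server-upstream-envoy):'; then
    return 1
  fi
  return 0
}

# Example usage (assumes a local Tor SOCKS proxy on 127.0.0.1:9050):
#   curl -s -o /dev/null -D - --socks5-hostname 127.0.0.1:9050 \
#     "http://<ONION_ADDRESS>/info" | find_leaky_headers && echo "clean"
```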

Optional: Using the Onion-Location Header and Lowering Hops

In many cases, it's fine that the gateway itself isn't anonymous, but you still want to give users the option to access it anonymously.

A person in Europe might not mind the authorities knowing they're running an Arweave gateway, but a person in China might not be allowed to access this gateway.

If this is the case, you have some UX optimization potential!

  1. Expose the Envoy proxy publicly so the gateway works over regular HTTP again.
  2. Add an Onion-Location header via Envoy to advertise the .onion addresses for the resources the gateway exposes.
  3. Reduce latency by lowering the number of hops between users and the Onion Service. Usually, there are six hops, three from the client and three from the server, but Onion Services can use one hop from the server if only the users need to be anonymous.

Cloning the ar-io-node Repository

Clone the ar-io-node repository to make changes to the Envoy configuration.

Note: This step is unnecessary if you already did the optional header removal step.

git clone https://github.com/ar-io/ar-io-node.git

Updating the docker-compose.yaml File

As the onion service will write its .onion address to the filesystem, the onion and envoy services will share a volume so that Envoy can read it.

The Envoy proxy will be public again since this scenario doesn't require hiding it.

Finally, the single-hop mode lowers the latency in the Tor network.

Replace the ar-io-node/docker-compose.yaml content with the following code:

---
version: "3.0"

volumes:
  onion-data:

services:
  onion:
    image: fphammerle/onion-service
    ports:
      - 80:8000
    environment:
      VIRTUAL_PORT: "8000"
      TARGET: envoy:3000
      NON_ANONYMOUS_SINGLE_HOP_MODE: 1
    volumes:
      - type: volume
        target: /var/lib/tor
      - type: volume
        source: onion-data
        target: /onion-service
      - type: tmpfs
        target: /tmp
        tmpfs: { size: 4k }
    read_only: true
    cap_drop: [ALL]
    security_opt: [no-new-privileges]

  envoy:
    image: ghcr.io/ar-io/ar-io-envoy:latest
    build:
      context: envoy/
    ports:
      - 3000:3000
      - 9901:9901
    environment:
      - LOG_LEVEL=info
      - TVAL_AR_IO_HOST=core
      - TVAL_AR_IO_PORT=4000
      - TVAL_GATEWAY_HOST=${TRUSTED_GATEWAY_HOST:-arweave.net}
      - TVAL_GRAPHQL_HOST=${GRAPHQL_HOST:-core}
      - TVAL_GRAPHQL_PORT=${GRAPHQL_PORT:-4000}
      - TVAL_ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}
    volumes:
      - type: volume
        source: onion-data
        target: /onion-service

  core:
    image: ghcr.io/ar-io/ar-io-core:latest
    build:
      context: .
    expose:
      # expose lists container ports only; it takes no host:container mapping
      - "4000"
    volumes:
      - ${CHUNKS_DATA_PATH:-./data/chunks}:/app/data/chunks
      - ${CONTIGUOUS_DATA_PATH:-./data/contiguous}:/app/data/contiguous
      - ${HEADERS_DATA_PATH:-./data/headers}:/app/data/headers
      - ${SQLITE_DATA_PATH:-./data/sqlite}:/app/data/sqlite
      - ${TEMP_DATA_PATH:-./data/tmp}:/app/data/tmp
    environment:
      - NODE_ENV=${NODE_ENV:-production}
      - LOG_FORMAT=${LOG_FORMAT:-simple}
      - TRUSTED_NODE_URL=${TRUSTED_NODE_URL:-}
      - TRUSTED_GATEWAY_URL=https://${TRUSTED_GATEWAY_HOST:-arweave.net}
      - START_HEIGHT=${START_HEIGHT:-}
      - STOP_HEIGHT=${STOP_HEIGHT:-}
      - SKIP_CACHE=${SKIP_CACHE:-}
      - SIMULATED_REQUEST_FAILURE_RATE=${SIMULATED_REQUEST_FAILURE_RATE:-}
      - INSTANCE_ID=${INSTANCE_ID:-}
      - AR_IO_WALLET=${AR_IO_WALLET:-}
      - ADMIN_API_KEY=${ADMIN_API_KEY:-}
      - BACKFILL_BUNDLE_RECORDS=${BACKFILL_BUNDLE_RECORDS:-}
      - FILTER_CHANGE_REPROCESS=${FILTER_CHANGE_REPROCESS:-}
      - ANS104_UNBUNDLE_FILTER=${ANS104_UNBUNDLE_FILTER:-}
      - ANS104_INDEX_FILTER=${ANS104_INDEX_FILTER:-}
      - ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}
      - SANDBOX_PROTOCOL=${SANDBOX_PROTOCOL:-}

Updating Envoy's docker-entrypoint.sh File

You must put the .onion address into an environment variable to use in the Envoy configuration.

Add the following line into the ar-io-node/envoy/docker-entrypoint.sh file, just below the # update env vars comment.

export TVAL_ONION_HOST=$(cat /onion-service/hostname)

The ytt line that follows will gather all environment variables starting with TVAL_ and use them to generate the envoy.yaml. Since you mounted the volume the onion service uses to save its address, the envoy service can read it.
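On a first start, the onion service may not have written its hostname file yet when Envoy's entrypoint runs, in which case a plain cat would fail. A more defensive variant could wait for the file first. This is only a sketch: wait_for_file is a hypothetical helper, and the 60-second default timeout is an arbitrary assumption.

```shell
# Hypothetical helper: wait until a file exists and is non-empty, then print it.
# Arguments: path, optional timeout in seconds (default 60, an arbitrary choice).
wait_for_file() {
  path=$1
  remaining=${2:-60}
  while [ ! -s "$path" ]; do
    if [ "$remaining" -le 0 ]; then
      echo "timed out waiting for $path" >&2
      return 1
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  cat "$path"
}

# Possible use in docker-entrypoint.sh, in place of the plain cat:
#   TVAL_ONION_HOST=$(wait_for_file /onion-service/hostname) || exit 1
#   export TVAL_ONION_HOST
```

Splitting the assignment from the export keeps the helper's exit status visible, so the entrypoint can stop instead of generating a configuration with an empty host.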

Updating the Envoy Configuration

Now that you have the .onion address, you must put it in the right header.

Add this code to the ar-io-node/envoy/envoy.template.yaml file directly below the route_config: line.

                  response_headers_to_add:
                    - header:
                        key: "Onion-Location"
                        value: #@ "http://" + data.values.ONION_HOST + "%REQ(:path)%"

Note: Check the indentation!

A few things will now happen every time the envoy service starts.

  1. It mounts the onion-data volume shared with the onion service.
  2. It reads the .onion address from the onion-data volume and puts it in the TVAL_ONION_HOST environment variable.
  3. It replaces #@ "http://" + data.values.ONION_HOST + "%REQ(:path)%" with "http://<ONION_HOST>%REQ(:path)%", where <ONION_HOST> is the content of the TVAL_ONION_HOST variable.

Then, on every request it handles, the Envoy proxy does the following:

  1. It will replace %REQ(:path)% with the path that was requested.
  2. It will append that path to the .onion address.
  3. It will add the new URL as an Onion-Location header to the response.

If you request https://example.com/info, the header could look like this:

Onion-Location: http://32r2f29g9gc9gd3rtap1e10qj0d38h4f.onion/info

This header allows Tor-compatible browsers to display a button that opens the gateway via Tor.
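To check the header without a Tor-compatible browser, you could request the gateway over regular HTTP and pull the value out of the response headers. The sketch below assumes the Envoy port is published on localhost:3000, as in the Compose file above; onion_location is a hypothetical helper.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# value of the Onion-Location header, if present.
# Header names are matched case-insensitively.
onion_location() {
  tr -d '\r' | grep -i '^onion-location:' | sed 's/^[^:]*:[[:space:]]*//'
}

# Example usage (assumes Envoy is published on localhost:3000):
#   curl -s -o /dev/null -D - http://localhost:3000/info | onion_location
```

The printed URL should be the .onion address followed by the path you requested.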

Summary

Setting up an Arweave gateway as an Onion Service is surprisingly easy, especially if you're building on Docker.

Getting the address of the Onion Service into your own service is more work, especially if it runs in a Docker container you can't build yourself.

To review, we now have three ways to set up an Arweave gateway, each with its pros and cons.

Three Different AR.IO/Onion Setups

  • The default AR.IO setup has the lowest latency.
  • The pure Onion Service setup gives the best privacy for clients and servers but also has quite high latency because of six additional indirections in the Tor network.
  • The hybrid setup only protects the clients, but has lower latency than the pure setup, since only four indirections happen in Tor.
