Pratham Chauhan

Posted on Jul 2

How to Let a Vercel App Read a Private ClickHouse on EC2 (Using Cloudflare Tunnel)

#aws #database #networking #tutorial

A start to finish walkthrough, with every piece of jargon explained in plain language.

Why you might need this

Sooner or later, most teams end up with two things that need to talk to each other
but live in different worlds.

On one side is a private backend: a database, an internal API, an admin panel,
or some other service running on a server you control (an EC2 box, a machine in your
office, a virtual private cloud). You deliberately keep it locked away from the
public internet, because it holds data that would be dangerous to expose. Locking it
down is the right call for security.

On the other side is a program hosted somewhere else: an app on Vercel, a
serverless function, a scheduled job, a third party tool, or an AI agent. It needs to
read from that private backend to do its work, but it runs outside your private
network, so it has no way in.

This is a genuinely awkward gap, and it shows up constantly:

A dashboard or reporting app on Vercel needs to query a database that only lives inside your private network.
A scheduled job or an AI agent needs to pull fresh data every hour from a warehouse that has no public address.
A webhook or a partner integration needs to reach an internal service without you exposing that service to the whole internet.
You are running something locally or on a home server and want a stable, secure web address for it without buying a static IP or fighting with your router.

The tempting shortcuts all make things worse. Opening a port to the internet turns a
carefully hidden service into a target. Restricting by IP address fails when the
outside program has no fixed address (Vercel and most serverless platforms do not).
Building a custom API server in front of it just moves the exposed door and adds a
second thing to secure and maintain.

What you actually want is a way for the outside program to reach the private service
that:

never requires opening an incoming port,
does not depend on the outside program having a fixed address,
encrypts the traffic and puts real authentication in front of it, and
costs little or nothing to run.

That is exactly what a Cloudflare Tunnel gives you, and this guide walks through
setting one up end to end. The specific example is an AI agent on Vercel reading a
ClickHouse database on EC2, but the same pattern works for any private service and any
outside consumer. Once you understand the shape of it, you will reach for it again and
again.

The problem, in one sentence

We run a ClickHouse database on an EC2 server that is not reachable from the
public internet, and we need a program hosted on Vercel to read from it safely.

That sentence has four pieces of jargon. Here is what each one means:

ClickHouse: a database built for analytics. It stores huge tables of events (in our case, every tool call and session from our apps) and answers questions like "how many errors happened in the last hour" very fast.
EC2: a virtual server you rent from Amazon Web Services (AWS). Think of it as a computer in Amazon's data center that we control.
Not publicly accessible: the server's firewall blocks incoming connections from the internet. Nothing outside can knock on its door. This is good for security, but it means our outside program cannot reach the database directly.
Vercel: a hosting platform where our program (an AI agent) runs. Vercel runs code in the cloud, and its servers do not have a fixed address, which matters later.

So the puzzle is: the database is deliberately locked away inside a private
network, and the program that needs it lives somewhere else entirely.

Why the obvious ideas are the wrong ideas

"Just open the database port to the internet." You could change the firewall
to allow the world to connect to ClickHouse. But now your database is exposed to
every bot on the internet, protected only by a password. One leaked password or
one unpatched bug and your data is gone. We do not want the database reachable
from the open internet at all.

"Only allow Vercel's IP address through the firewall." A firewall can be told
"only accept connections from this specific address." The problem is Vercel does
not give your program a fixed address. Its servers pick from a large, changing
pool of addresses, so there is no single address to allow. This approach does not
work here.

"Build a small API server in front of the database." A common instinct is to
write a little web service (for example in Express, a popular Node.js web
framework) that sits in front of ClickHouse, checks a password, and forwards
queries. The catch: that server still has to accept incoming connections from the
internet, so you have just moved the exposed door from the database to the API,
and now you have a second program to maintain and secure. Unless that API tightly
restricts what queries are allowed, it adds work without adding safety.

We want something better: a way for the outside program to reach the database
without opening any incoming door at all.

The tool that solves it: Cloudflare Tunnel

Cloudflare is a company that sits in front of websites to make them faster and
safer. One of its free products is Cloudflare Tunnel.

Here is the key idea, and it is the part most people get backwards.

A normal setup (a "reverse proxy") works like this: the outside service dials
in to your server. That requires your server to have an open door (an open
port), which is exactly what we are trying to avoid.

A Tunnel works the opposite way. You run a small program on your server called
cloudflared. That program dials out to Cloudflare and holds the
connection open. When a request arrives for your service, Cloudflare pushes it
down that outgoing connection to cloudflared, which then hands it to
ClickHouse over the server's own internal network.

Why this is safe and clever:

No incoming door is ever opened. The connection is outbound only, started by your own server. Your firewall stays fully closed to the outside.
Your server's address is never published. You do not register an IP address anywhere, and the public web address resolves to Cloudflare rather than to your server, so there is nothing to point at and nothing to attack directly.
You never hand Cloudflare an SSH key or any access to your box. The little program you run reaches out to them, not the other way around.

A few more terms you will meet:

Port: a numbered channel on a server for a specific service. ClickHouse's web interface listens on port 8123. "Opening a port" means allowing outside connections to that number.
localhost: a shortcut name a computer uses to mean "myself." If ClickHouse and cloudflared run on the same server, cloudflared reaches the database at localhost:8123, which is purely internal and never touches the internet.
DNS (Domain Name System): the internet's phone book. It turns a name like ch.test.com into an address a computer can connect to.
Nameservers: the specific servers that hold the phone book entries for your domain. Whoever runs your nameservers controls your domain's DNS.

What you need before you start

An EC2 server where ClickHouse is running, and the ability to log into it over SSH (the standard way to get a remote terminal on a server).
A domain name. Cloudflare Tunnel needs to attach your service to a web address, and to do that the domain must be managed by Cloudflare. Moving your main company domain is a big change, so the clean move is to buy a cheap, throwaway domain just for internal infrastructure (this guide uses test.com). It costs a few dollars a year and keeps your real domain untouched.
A free Cloudflare account.
Your program on Vercel that will do the reading.

A note on the domain, because it trips people up: Cloudflare's free plan requires
the whole domain to use Cloudflare's nameservers. You cannot hand over just one
subdomain on the free plan (that feature, called "partial setup," is reserved for
the paid Business and Enterprise plans). This is exactly why a separate throwaway
domain is the easy answer: you move the entire throwaway domain to Cloudflare and
never touch your production domain.

Phase 1: Put the domain on Cloudflare

What we are doing and why: we are making Cloudflare the manager of the throwaway
domain's phone book (its DNS), because the Tunnel can only attach a web address to a
domain that Cloudflare controls.

In Cloudflare, click Add a site, type your domain (test.com), and choose the Free plan.
Cloudflare scans the domain's existing records and shows them to you. For a fresh throwaway domain these are just parking entries from the registrar, and you do not need to add anything. The one address you will actually use (ch.test.com) gets created automatically later, so there is nothing to add by hand here.
Click Continue to activation. Cloudflare gives you two nameserver addresses (something like dana.ns.cloudflare.com).
Go to your domain registrar (where you bought the domain, for example Namecheap), find the nameserver setting, switch it from the registrar's default to Custom, and paste in Cloudflare's two nameservers.
Wait for the domain to show Active in Cloudflare. This can take a few minutes and occasionally up to a day. Cloudflare emails you when it is ready.

Phase 2: Create the Tunnel and get its token

What we are doing and why: we are creating the Tunnel on Cloudflare's side and
getting a token. A token is a long secret string that acts as the Tunnel's
password. The little program on our server will use it to prove it is allowed to
connect. We are using the token method (rather than the browser login method)
because it is simpler on a server that has no web browser.

In the Cloudflare dashboard, open Zero Trust (this is Cloudflare's security product area; it also appears under the name "Cloudflare One").
Go to Networks, then Tunnels, then Create a tunnel.
Choose the connector type Cloudflared and give the tunnel a name (this guide uses ch, short for ClickHouse).
On the install screen, choose your server's operating system. Ubuntu, a very common Linux for EC2, is built on Debian, so choose Debian. Then choose the architecture (the chip type). Run uname -m on your server to check: x86_64 means choose 64-bit, and aarch64 means choose arm64 (used by Amazon's Graviton servers).
Cloudflare now shows a set of commands with your token baked in. Keep this tab open.

Phase 3: Install the connector on your server

What we are doing and why: we are installing cloudflared (the little program that
holds the outbound connection) and setting it up as a service, which means the
operating system keeps it running in the background and restarts it automatically if
the server reboots.

Cloudflare shows three command boxes on the install screen. You run the first two and
skip the third.

Log into your EC2 server over SSH.
Copy the first box ("Install cloudflared") using its copy icon, paste it into the terminal, and run it. This adds Cloudflare's software source and installs the program.
Copy the second box ("Install as service"), paste, and run it. It looks like sudo cloudflared service install eyJ... where the long eyJ... string is your token. This registers the connection and starts it running in the background.
Skip the third box ("Or, run tunnel"). The word "Or" is the giveaway: it is an alternative that runs in the foreground and stops the moment you close the terminal. We want the background service, not this.
Check it is running:
```
sudo systemctl status cloudflared
```
You want to see active (running).

Back in the Cloudflare dashboard, the tunnel's status should turn to Healthy
within a few seconds. Healthy means the outbound connection from your server to
Cloudflare is up. It will still say "No routes" because we have not told it what to
serve yet, which is the next step.

Phase 4: Point a web address at ClickHouse

What we are doing and why: the tunnel is connected, but it does not yet know what to
do with requests. We now create a route (older versions of the dashboard call this a
"public hostname") that says "when a request comes in for ch.test.com, forward it to
ClickHouse at localhost:8123."

Open your tunnel and go to the Routes tab, then click Add route.
You are shown four route types. Choose Published application. It means "publish a local service to the internet at a web address." The other three (Private hostname, Private CIDR, Workers VPC) require special client software and are not what we need.
Fill in the form:
- Subdomain: ch
- Domain: test.com
- Path: leave empty (empty means "match every request")
- Service Type: HTTP
- Service URL: http://localhost:8123 (this is ClickHouse on the same server; if ClickHouse runs on a different machine inside your private network, use that machine's private address instead of localhost)
Click Add route. Cloudflare automatically creates the DNS record for ch.test.com, so you do not have to add it by hand.

Test it from your own laptop:

curl "https://ch.test.com/?query=SELECT%201"

A couple of terms here:

curl: a command line tool for making web requests. It is the quickest way to check that an address responds.
The quotes around the address matter. The zsh shell (the default terminal on modern Macs) treats a bare ? as a special "match any file" character, and without quotes it errors with no matches found. Quoting the address turns off that behavior.

If you get back 1, the full path works: your laptop reached Cloudflare, Cloudflare
pushed the request down the tunnel to your server, and cloudflared handed it to
ClickHouse, which answered SELECT 1 with 1.
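
The %20 in that address is just a URL-encoded space. As a minimal sketch of how a program could build the same probe address (the function name buildProbeUrl is made up for illustration, and ch.test.com is the guide's placeholder hostname):

```typescript
// Hypothetical helper: builds the same probe URL the curl command uses.
function buildProbeUrl(host: string, sql: string): string {
  // encodeURIComponent turns the space in "SELECT 1" into %20,
  // so the query survives being placed inside a URL.
  return "https://" + host + "/?query=" + encodeURIComponent(sql);
}

// Value: "https://ch.test.com/?query=SELECT%201"
const probeUrl = buildProbeUrl("ch.test.com", "SELECT 1");
```

Encoding the query in code avoids the shell quoting problem entirely, since the address never passes through zsh.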

Phase 5: Give the app a limited, read only database account

What we are doing and why: right now anything hitting the tunnel could run any query
using the database's main account. We create a separate account with the least power
possible: it can only read the two tables our program needs, and it cannot write,
change, or delete anything. This is the principle of "least privilege," and it means a
mistake or a leaked credential can do far less damage.

Key terms:

readonly user: a database account restricted to read only queries. It cannot insert, update, delete, or change settings.
GRANT: the SQL command that gives an account permission to do a specific thing.
Least privilege: give an account exactly the access it needs and nothing more.

On the server, open the ClickHouse client. Creating accounts requires an
administrator account (usually called default), not an ordinary app account. If
you try as a limited user you will see an "ACCESS_DENIED" error about needing the
"CREATE USER" grant. Connect as the admin:
```
clickhouse-client -u default
```
(Add --password 'yourpassword' if the admin account has one.)

First generate a strong random password and save it somewhere safe:
```
openssl rand -base64 24
```
Create the read only account and grant it read access to just the two tables. Note
that our tables live in a database named new_database, so the grants name
new_database.your_table_name and new_database.your_other_table_name. If your tables live
somewhere else, change the database name to match.
```
CREATE USER agent_ro IDENTIFIED BY 'PASTE_THE_GENERATED_PASSWORD' SETTINGS readonly = 1;
GRANT SELECT ON new_database.your_table_name TO agent_ro;
GRANT SELECT ON new_database.your_other_table_name TO agent_ro;
CREATE QUOTA agent_ro_q FOR INTERVAL 1 hour MAX queries = 2000, result_rows = 50000000 TO agent_ro;
```
The last line is a quota: a safety limit so a runaway program cannot hammer the
database with unlimited queries.

Test the new account through the tunnel. The address includes ?database=new_database
so ClickHouse knows which database to look in (our program sends this automatically):

curl -H "X-ClickHouse-User: agent_ro" -H "X-ClickHouse-Key: PASTE_THE_GENERATED_PASSWORD" \
  "https://ch.test.com/?database=new_database" \
  --data-binary "SELECT count() FROM your_table_name FINAL FORMAT JSON"

You should get back a small block of JSON with a count.
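
For a program, that JSON block still has to be turned into a number. The sketch below is one hedged way to do it; the name readSingleCount is made up for illustration, and it assumes the reply has a data array whose first row holds the count in its first column. Depending on server settings, 64-bit integers may arrive quoted as strings, so both forms are handled:

```typescript
// Only the part of a FORMAT JSON reply that this sketch reads.
interface ClickHouseJsonReply {
  data: Array<Record<string, unknown>>;
}

// Hypothetical helper: extracts the number from a one-row, one-column
// result such as SELECT count() ... FORMAT JSON.
function readSingleCount(body: string): number {
  const reply = JSON.parse(body) as ClickHouseJsonReply;
  const row = reply.data[0];
  if (row === undefined) {
    throw new Error("reply contained no rows");
  }
  const firstColumn = Object.keys(row)[0];
  const value = row[firstColumn];
  // 64-bit integers may be quoted ("42") or bare (42).
  const count = typeof value === "string" ? Number(value) : value;
  if (typeof count !== "number" || isNaN(count)) {
    throw new Error("first column is not a number");
  }
  return count;
}

// Value: 42
const exampleCount = readSingleCount('{"data":[{"count()":"42"}],"rows":1}');
```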

Confirm the read only lock actually works. This next command tries to create a table
and should be rejected, which is exactly what we want:

curl -H "X-ClickHouse-User: agent_ro" -H "X-ClickHouse-Key: PASTE_THE_GENERATED_PASSWORD" \
  "https://ch.test.com/?database=new_database" \
  --data-binary "CREATE TABLE x (a Int) ENGINE=Memory"

Two small things that commonly go wrong here:

Getting the database name wrong, or forgetting ?database=new_database, produces an "UNKNOWN_TABLE" error because ClickHouse looks in the wrong place.
A typo like ?databse=new_database produces an "UNKNOWN_SETTING" error because ClickHouse reads the misspelled word as a setting name.
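
If your program logs the raw error text, a small lookup can turn these error names into the advice above. This is only a sketch: the function name hintForClickHouseError is made up, and it assumes the error text contains the symbolic error name, which may not hold for every ClickHouse version.

```typescript
// Hypothetical helper: maps the error names mentioned in this guide to a hint.
// Assumes the symbolic name (e.g. UNKNOWN_TABLE) appears in the error text.
function hintForClickHouseError(errorText: string): string {
  if (errorText.indexOf("UNKNOWN_TABLE") !== -1) {
    return "Check the database name and that ?database=... is present.";
  }
  if (errorText.indexOf("UNKNOWN_SETTING") !== -1) {
    return "Check the spelling of every URL parameter, such as database.";
  }
  if (errorText.indexOf("ACCESS_DENIED") !== -1) {
    return "The account lacks a grant for this statement.";
  }
  return "No hint available; read the full error text.";
}
```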

Phase 6: Add a second lock with a Cloudflare Access service token

What we are doing and why: the database password is one lock. We add a second,
independent lock in front of the web address itself, so that even reaching ClickHouse's
front door requires a separate secret. This is defense in depth: two locks, so one
failing does not expose everything.

Key terms:

Cloudflare Access: a Cloudflare feature that stands in front of a web address and refuses anyone who cannot prove they are allowed.
Service token: a machine to machine credential, made of a Client ID and a Client Secret (a public name and a private password, roughly). Automated programs send these as two request headers to get through Access. A "header" is a small labeled piece of extra information attached to a web request.

In Zero Trust, go to Access controls, then Service credentials, then Service Tokens, and click Create Service Token. Name it (for example your-agent) and generate it. Copy the Client ID and Client Secret now, because the secret is shown only once.
Go to Access controls, then Applications, then Add an application, and choose Self-hosted. Set the application's hostname to ch.test.com.
Add a policy with the action Service Auth (this specifically means "let approved machines in without a human login screen"). Under Include, choose Service Token and select the token you just created. Save the policy and the application.
Verify the lock. Without the token, the request should now be blocked:
```
curl -so /dev/null -w "%{http_code}\n" "https://ch.test.com/?query=SELECT%201"
```
This prints only the HTTP status code, which should no longer be 200 (with a Service Auth policy it is usually 403).

With both the token headers and the database credentials, it should succeed:

    curl -H "CF-Access-Client-Id: YOUR_CLIENT_ID" \
         -H "CF-Access-Client-Secret: YOUR_CLIENT_SECRET" \
         -H "X-ClickHouse-User: agent_ro" -H "X-ClickHouse-Key: YOUR_DB_PASSWORD" \
         "https://ch.test.com/?database=new_database" \
         --data-binary "SELECT 1 FORMAT JSON"

If the first is blocked and the second returns 1, you now have two independent locks
in place, and zero open incoming ports on your server.
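
The same request can be prepared in code. The sketch below only assembles the URL, headers, and body that the curl command sends; it does not send anything. The names TunnelCredentials, PreparedRequest, and prepareQuery are made up for illustration, while the four header names are the ones used in the curl command above.

```typescript
// Hypothetical shapes for this sketch; not part of any library.
interface TunnelCredentials {
  accessClientId: string;     // Cloudflare Access service token ID
  accessClientSecret: string; // Cloudflare Access service token secret
  dbUser: string;             // ClickHouse account, e.g. agent_ro
  dbPassword: string;         // ClickHouse password
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request that passes both locks: Access first, then ClickHouse.
function prepareQuery(
  host: string,
  database: string,
  sql: string,
  creds: TunnelCredentials
): PreparedRequest {
  return {
    url: "https://" + host + "/?database=" + encodeURIComponent(database),
    method: "POST",
    headers: {
      "CF-Access-Client-Id": creds.accessClientId,
      "CF-Access-Client-Secret": creds.accessClientSecret,
      "X-ClickHouse-User": creds.dbUser,
      "X-ClickHouse-Key": creds.dbPassword,
    },
    body: sql, // the SQL travels in the POST body, like --data-binary
  };
}
```

In a runtime with a global fetch, the result could be sent with something like fetch(request.url, request). Keeping the assembly separate from the sending makes it easy to check the headers without touching the network.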

Phase 7: Point the Vercel program at the database

What we are doing and why: the program on Vercel needs to know the address, the
database account, and the Access token. These are supplied as environment variables,
which are named settings you give a program without writing them into its code (so
secrets never live in the source).

Set these on the Vercel project (in Settings, under Environment Variables):

CLICKHOUSE_HOST=ch.test.com
CLICKHOUSE_PORT=443
CLICKHOUSE_SECURE=true
CLICKHOUSE_DATABASE=new_database
CLICKHOUSE_USERNAME=agent_ro
CLICKHOUSE_PASSWORD=your_database_password
CF_ACCESS_CLIENT_ID=your_client_id
CF_ACCESS_CLIENT_SECRET=your_client_secret

A note on the values:

Port 443 with CLICKHOUSE_SECURE=true: 443 is the standard port for secure web traffic (HTTPS). Cloudflare serves your tunnel address over HTTPS, so the program connects on 443 with encryption on, even though ClickHouse itself is plain HTTP on 8123 behind the tunnel.
The two CF_ACCESS_* values are what get the program past the Access lock from Phase 6.

Redeploy the Vercel app so the new settings take effect, then run whatever triggers the
program to read ClickHouse. If it returns data instead of a connection error, the whole
chain is working.
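
A missing or misnamed variable is the most likely reason for a connection error at this point, so it helps to validate the settings once at startup. The following is a minimal sketch, not the program's actual code: the names Env, ClickHouseConfig, and loadClickHouseConfig are made up, and the variable names are the ones listed above.

```typescript
// Hypothetical types for this sketch.
type Env = Record<string, string | undefined>;

interface ClickHouseConfig {
  baseUrl: string;
  database: string;
  username: string;
  password: string;
  accessClientId: string;
  accessClientSecret: string;
}

const REQUIRED_VARS: string[] = [
  "CLICKHOUSE_HOST", "CLICKHOUSE_PORT", "CLICKHOUSE_SECURE",
  "CLICKHOUSE_DATABASE", "CLICKHOUSE_USERNAME", "CLICKHOUSE_PASSWORD",
  "CF_ACCESS_CLIENT_ID", "CF_ACCESS_CLIENT_SECRET",
];

// Reads the settings and fails loudly if any are missing or empty.
function loadClickHouseConfig(env: Env): ClickHouseConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("missing environment variables: " + missing.join(", "));
  }
  const get = (name: string): string => env[name] as string;
  // CLICKHOUSE_SECURE=true means HTTPS, which is what the tunnel address serves.
  const scheme = get("CLICKHOUSE_SECURE") === "true" ? "https" : "http";
  return {
    baseUrl: scheme + "://" + get("CLICKHOUSE_HOST") + ":" + get("CLICKHOUSE_PORT"),
    database: get("CLICKHOUSE_DATABASE"),
    username: get("CLICKHOUSE_USERNAME"),
    password: get("CLICKHOUSE_PASSWORD"),
    accessClientId: get("CF_ACCESS_CLIENT_ID"),
    accessClientSecret: get("CF_ACCESS_CLIENT_SECRET"),
  };
}
```

On Vercel this would be called as loadClickHouseConfig(process.env). With the values from this guide, baseUrl comes out as https://ch.test.com:443.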

A quick security checklist before you call it done

Rotate any secret that has touched a chat window, a screenshot, or a shared note.
If you pasted the database password anywhere while setting up, change it and
update the Vercel environment variable to match:
```
ALTER USER agent_ro IDENTIFIED BY 'a_new_password';
```
Keep the account read only. Do not grant it more than the tables it needs.
Leave the server's incoming firewall closed. The whole point of the tunnel is that
you never need to open a port. If you did open one earlier to test, close it again.

The one gotcha to know about

If the tunnel refuses to turn "Healthy," the usual cause is that your server's
outbound internet access is blocked. cloudflared prefers a fast protocol called
QUIC, which runs over UDP port 7844. If your network blocks outbound UDP, force it
to fall back to HTTP/2 over TCP instead: in the tunnel's configuration set the
connector protocol to http2 (outbound TCP on port 7844 must still be allowed). You
still never open any incoming port; this only changes how the outgoing connection
is made.

The whole thing in one picture

Your laptop / the Vercel app
        |
        v  (request to https://ch.test.com, carrying the Access token + DB login)
   Cloudflare edge
        |
        |  Cloudflare Access checks the service token (lock #1)
        |
        v  pushes the request DOWN the connection your server opened
   cloudflared  (running on your EC2 server, connection was outbound only)
        |
        v  http://localhost:8123, staying inside the private network
   ClickHouse  (checks the read only account + password, lock #2)
        |
        v
     answer travels back the same way

No incoming port was ever opened. The database was never exposed to the internet. Two
independent locks guard the path, and the program on Vercel reads exactly the two tables
it is allowed to, and nothing else.

DEV Community