Aiji Uejima

Reduce the latency of Prisma Data Proxy by self-hosting

What is this post?

This post describes how slow the Prisma Data Proxy can be, and how I solved the problem by writing my own library to self-host it.

The library created and introduced in this post is released as OSS & published to npm, so please feel free to use it.

aiji42 / prisma-data-proxy-alt

Alternative Prisma Data Proxy

This is a library that replaces the Prisma Data Proxy (cloud.prisma.io) and lets you self-host it.

In order to deploy your project to edge runtimes (such as Cloudflare Workers or Vercel Edge Functions) and use Prisma, you will need to use the Prisma Data Proxy.
However, at present, instances can only be built in limited areas, and there are also delays caused by cold standby. This is a very stressful problem.

Therefore, we have created a server library to replace Prisma Data Proxy. With it, you are free from these stressful limitations. You can deploy it on any platform, in any region you like, and use any data source you like, such as Supabase or Planetscale.

No changes are required to your Prisma client code. Just set DATABASE_URL to the URL of the server you self-hosted with this library.
This is not an official library, but it works the same as…

What is Prisma Data Proxy?

Prisma.io provides a proxy server for database connection management and pooling.

From https://www.prisma.io/data-platform

(Figure: With the Proxy / Without the Proxy)

Edge runtimes such as Cloudflare Workers and Vercel Edge Functions cannot open a native (TCP) connection to a database.
The Data Proxy therefore sits between the Worker and the database: the Worker talks to the proxy over HTTP, and the proxy holds the actual database connection.

(Figure: Prisma Data Proxy overview)

Prisma Data Proxy Weaknesses

The Data Proxy can be built by creating an instance on the web console at https://cloud.prisma.io.
However, as of July 7, 2022, there are only two regions to choose from: Northern Virginia and Frankfurt.

(Figure: regions)

And since it is a serverless service, it is also affected by the latency caused by cold standby (cold starts).

Most Data Proxy use cases involve accessing a data source from the edge, for example from Cloudflare Workers, but the benefits of running at the edge are diminished if the latency of each data request is high.

Measuring from Japan, with an instance in Northern Virginia connected to a Planetscale database in the same region, I observed a latency of around 2.6s with cold standby and around 600ms without it.

With this performance, it is not realistic to use it in production.

I wanted to take advantage of Prisma's powerful type generation capabilities, so I decided to self-host the Data Proxy in the Japan region and build it on an architecture that would be less susceptible to cold standby.

Reasoning about Data Proxy implementation

Since the server-side code of the Prisma Data Proxy is not publicly available, I treated it as a black box and inferred its behavior from the client-side implementation.

The following source code shows that the Prisma client communicates with the Data Proxy by sending POST requests to a GraphQL endpoint.

https://github.com/prisma/prisma/blob/main/packages/engine-core/src/data-proxy/DataProxyEngine.ts#L140-L151

  private async requestInternal<T>(body: Record<string, any>, headers: Record<string, string>, attempt: number) {
    try {
      this.logEmitter.emit('info', {
        message: `Calling ${await this.url('graphql')} (n=${attempt})`,
      })

      const response = await request(await this.url('graphql'), {
        method: 'POST',
        headers: { ...headers, ...this.headers },
        body: JSON.stringify(body),
        clientVersion: this.clientVersion,
      })

To confirm this, I inserted a console.log into the client source code and inspected the request that is actually sent.

// db.link.findMany({ select: { id: true, url: true, User: true }, where: { id: 1 } })

query query {
  findManyLink(where: { id: 1 }) {
    id
    url
    User {
      id
      createdAt
      updatedAt
      name
      email
    }
  }
}
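
As a rough sketch of what this means on the wire, the request can be pictured as an HTTP POST whose JSON body carries the GraphQL query. The excerpt above only confirms the POST method and the JSON-serialized body; the endpoint URL, the `variables` field, the `Content-Type` header, and the bearer-token header below are assumptions for illustration, and `buildProxyRequest` is a hypothetical helper, not part of Prisma.

```typescript
// Hypothetical shape of the JSON body; the real client may send more fields.
interface GraphQLBody {
  query: string
  variables: Record<string, unknown>
}

interface ProxyRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

// Build (but do not send) a POST request that carries a GraphQL query.
// `endpoint` is a placeholder: the real client derives its URL internally.
function buildProxyRequest(endpoint: string, apiKey: string, query: string): ProxyRequest {
  const body: GraphQLBody = { query, variables: {} }
  return {
    url: endpoint,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // Assumption: the API key travels as a bearer token.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  }
}

const exampleRequest = buildProxyRequest(
  'https://proxy.example.com/graphql', // placeholder endpoint
  'secret',
  'query { findManyLink(where: { id: 1 }) { id url } }',
)
console.log(exampleRequest.init.method, exampleRequest.url)
```

The resulting `url` and `init` can be passed to `fetch(url, init)` in any runtime that provides it, which is why this style of access works where TCP connections do not.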

Alternative Data Proxy

Finally, I have completed an implementation of the Alternative Data Proxy, which is available on npm and free to use.

(Figure: alternative data proxy)

Setup

yarn add prisma-data-proxy-alt

You need to set environment variables. This library also supports .env.

PRISMA_SCHEMA_PATH=/absolute/path/for/your/schema.prisma
DATABASE_URL={database URL scheme e.g. postgresql://postgres:pass@db:5432/postgres?schema=public}
DATA_PROXY_API_KEY={random string for authentication}
PORT={server port e.g. 3000}
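
Before launching, it can help to check that these variables are present. The following is a minimal pre-flight sketch, not part of the library: `loadProxyConfig` and `ProxyConfig` are hypothetical names, and falling back to port 3000 when PORT is unset is an assumption taken from the example value above.

```typescript
interface ProxyConfig {
  schemaPath: string
  databaseUrl: string
  apiKey: string
  port: number
}

// Validate the environment variables listed above and return them as a typed object.
function loadProxyConfig(env: Record<string, string | undefined>): ProxyConfig {
  const required = ['PRISMA_SCHEMA_PATH', 'DATABASE_URL', 'DATA_PROXY_API_KEY']
  const missing = required.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }

  // Assumption: use 3000 (the example port) when PORT is not set.
  const rawPort = env['PORT'] ?? '3000'
  const port = Number(rawPort)
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`)
  }

  return {
    schemaPath: env['PRISMA_SCHEMA_PATH'] as string,
    databaseUrl: env['DATABASE_URL'] as string,
    apiKey: env['DATA_PROXY_API_KEY'] as string,
    port,
  }
}

const exampleConfig = loadProxyConfig({
  PRISMA_SCHEMA_PATH: '/app/prisma/schema.prisma',
  DATABASE_URL: 'postgresql://postgres:pass@db:5432/postgres?schema=public',
  DATA_PROXY_API_KEY: 'example-key',
  PORT: '3000',
})
console.log(`schema: ${exampleConfig.schemaPath}, port: ${exampleConfig.port}`)
```

In a real project you would pass `process.env` instead of the literal object.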

Launch proxy server

yarn pdp

A self-signed SSL certificate is required to run it locally.

Deploy to Cloud Run

Create a Dockerfile.

FROM node:16.15-bullseye-slim as base

RUN apt-get update && apt-get install -y tini ca-certificates \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*

WORKDIR /app

FROM base as builder

COPY package.json .
COPY yarn.lock .
COPY prisma/schema.prisma ./prisma/schema.prisma

RUN yarn install

RUN yarn prisma generate

FROM base

COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json

ENV PRISMA_SCHEMA_PATH=/app/node_modules/.prisma/client/schema.prisma

USER node

ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["yarn", "pdp"]

Create cloudbuild.yml

steps:
  - name: 'gcr.io/kaniko-project/executor:latest'
    args:
      - --destination=gcr.io/$PROJECT_ID/prisma-data-proxy-alt:$SHORT_SHA
      - --destination=gcr.io/$PROJECT_ID/prisma-data-proxy-alt:latest
      - --cache=true
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - run
      - deploy
      - prisma-data-proxy-alt
      - --image
      - gcr.io/$PROJECT_ID/prisma-data-proxy-alt:latest
      - --region
      - $_REGION
      - --allow-unauthenticated
      - --set-env-vars
      - DATABASE_URL=$_DATABASE_URL
      - --set-env-vars
      - DATA_PROXY_API_KEY=$_DATA_PROXY_API_KEY
substitutions:
  _REGION: asia-northeast1
  _DATABASE_URL: your_database_url
  _DATA_PROXY_API_KEY: your_api_key

Create a new trigger from the GCP Cloud Build web console and link it to your repository.

Set _REGION, _DATABASE_URL, and _DATA_PROXY_API_KEY in the substitution values.

(Figure: cloud build substitution values)

  • _REGION: The Cloud Run region to deploy to
  • _DATABASE_URL: Connection URL of your data source (MySQL, PostgreSQL, etc.)
  • _DATA_PROXY_API_KEY: An arbitrary string used as the API key when connecting to the data proxy
    • e.g. prisma://your.deployed.domain?api_key={DATA_PROXY_API_KEY} (keep it secret)
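
Because the API key is the only thing protecting the proxy, the server has to compare it on every request. The post does not show how the library does this, so the following is only a sketch of one way such a check could look. It assumes the client forwards the key in an `Authorization: Bearer ...` header; `isAuthorized` and `sameString` are hypothetical helpers.

```typescript
// Compare two strings without returning early at the first differing character.
function sameString(a: string, b: string): boolean {
  let diff = a.length ^ b.length
  const len = Math.max(a.length, b.length)
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` turns that into 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0)
  }
  return diff === 0
}

// Assumption: the key arrives as "Authorization: Bearer <key>".
function isAuthorized(authorizationHeader: string | undefined, expectedKey: string): boolean {
  const prefix = 'Bearer '
  if (expectedKey.length === 0) return false
  if (!authorizationHeader || !authorizationHeader.startsWith(prefix)) return false
  return sameString(authorizationHeader.slice(prefix.length), expectedKey)
}

console.log(isAuthorized('Bearer example-key', 'example-key'))
console.log(isAuthorized('Bearer something-else', 'example-key'))
```

A request that fails this check would typically be rejected with a 401 before any query reaches the database.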

Connect from client

On the client side, generate the Prisma client in Data Proxy mode with the --data-proxy flag (see the official documentation).

Set DATABASE_URL using the domain of the server you deployed and the API key (DATA_PROXY_API_KEY) you set for it.

DATABASE_URL=prisma://${YOUR_DEPLOYED_PROJECT_DOMAIN}?api_key=${DATA_PROXY_API_KEY}

Now you can connect to the Alternative Data Proxy from your application. 🎉
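
To make the connection string format concrete, here is a small sketch that assembles and parses a `prisma://` URL with the standard `URL` class. `buildDataProxyUrl` and `parseDataProxyUrl` are hypothetical helpers for illustration, and URL-encoding the key is an assumption that only matters if the key contains special characters.

```typescript
// Assemble a connection string in the format shown above.
function buildDataProxyUrl(domain: string, apiKey: string): string {
  // Assumption: encode the key so characters like "&" or spaces stay inside api_key.
  return `prisma://${domain}?api_key=${encodeURIComponent(apiKey)}`
}

// Split a connection string back into its host and API key.
function parseDataProxyUrl(connectionString: string): { host: string; apiKey: string } {
  const url = new URL(connectionString)
  if (url.protocol !== 'prisma:') {
    throw new Error(`Expected a prisma:// URL, got ${url.protocol}//`)
  }
  const apiKey = url.searchParams.get('api_key')
  if (!apiKey) {
    throw new Error('The connection string has no api_key parameter')
  }
  return { host: url.host, apiKey }
}

const exampleUrl = buildDataProxyUrl('your.deployed.domain', 'example-key')
const exampleParts = parseDataProxyUrl(exampleUrl)
console.log(exampleUrl, exampleParts.host)
```

Running the parser against your own DATABASE_URL is a quick way to catch a misspelled variable name or a missing `api_key` before deploying.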

Performance

Let's actually connect and measure the performance.

Prerequisites

I used Planetscale for the database and placed it in the same region as each Data Proxy instance.

  1. Official Data Proxy provided by cloud.prisma.io (Northern Virginia) + Planetscale (Northern Virginia)
  2. Alternative Data Proxy deployed on Cloud Run (Tokyo) + Planetscale (Tokyo)
  3. Alternative Data Proxy deployed on Cloud Run (Northern Virginia) + Planetscale (Northern Virginia)
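
The post does not show how the samples were taken. A minimal harness, assuming each sample is the wall-clock time of one awaited request, could look like the sketch below. `measureLatency` and `average` are hypothetical helper names, and `op` stands in for a real query such as the `db.link.findMany` call shown earlier.

```typescript
// Arithmetic mean of a non-empty list of samples (in milliseconds).
function average(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error('Cannot average an empty list of samples')
  }
  return samples.reduce((sum, ms) => sum + ms, 0) / samples.length
}

// Run `op` sequentially `runs` times and record how long each call takes.
async function measureLatency(
  op: () => Promise<unknown>,
  runs: number,
): Promise<{ samples: number[]; avg: number }> {
  const samples: number[] = []
  for (let i = 0; i < runs; i++) {
    const start = Date.now()
    await op()
    samples.push(Date.now() - start)
  }
  return { samples, avg: average(samples) }
}

// Placeholder operation; replace it with a real Prisma query when measuring.
const placeholderOp = async (): Promise<void> => {}

measureLatency(placeholderOp, 5).then((result) => {
  console.log(`samples: ${result.samples.join(', ')} / avg: ${result.avg}ms`)
})
```

Running the calls sequentially rather than in parallel keeps each sample independent of the others, at the cost of a longer total run.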

Measurement results

Run | Official PD Virginia | Self-Hosted PD Tokyo | Self-Hosted PD Virginia
1   | 669.82ms             | 98.33ms              | 243.41ms
2   | 685.02ms             | 110.36ms             | 235.07ms
3   | 747.65ms             | 95.04ms              | 242.25ms
4   | 639.58ms             | 91.52ms              | 242.83ms
5   | 634.05ms             | 106.34ms             | 254.64ms
Avg | 675.23ms 🥉          | 100.32ms 🥇          | 243.64ms 🥈

Once again, we can see that the latency of the official Data Proxy is quite high.
What is surprising is that not only is the self-hosted Data Proxy in the Tokyo region faster, but the self-hosted Data Proxy in Northern Virginia, the same region as the official one, also has much lower latency than the official Data Proxy.
Presumably, the official Data Proxy opens a new connection to the database on every request, which adds to the latency.

Official Data Proxy regions will be added in due course, but the latency caused by cold standby and database connections cannot be eliminated, so it seems well worth using this Alternative Data Proxy.

Summary

  • Solved Prisma Data Proxy's weaknesses - region limitations and cold standby latency - by self-hosting a replacement server
  • The source of Prisma Data Proxy is not publicly available, but from the client code and the actual requests I inferred that the protocol is GraphQL, and I was able to turn that into a library!

I've managed to get Prisma to work satisfactorily from Cloudflare Workers.

The library created and introduced in this article is released as OSS & published to npm, so please feel free to use it.
