Miguel Ángel Cabrera Miñagorri

Running AI locally in your users' browsers

We all know how great AI is. However, there are still two major problems: data privacy and cost.

Most applications using AI right now are connected to cloud APIs. These APIs log prompts and context, and in some cases they use that data to train models. That means any sensitive data you include in them is potentially exposed.

Most web applications integrate AI features using the following schema:

Schema of AI integration in web applications

The problem here is that the application server needs to send the user data to the AI API, which is a third-party service, so we cannot really know what will happen with that data.
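For illustration, a minimal sketch of that server-side pattern might look like this (the endpoint and payload follow the common OpenAI-style chat completions format; the function name and model are assumptions, not code from any specific app):

```typescript
// Typical server-side integration: the user's data leaves your
// infrastructure and is forwarded to a third-party AI API.
async function summarize(userText: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      // The user's (possibly sensitive) text is sent off-device here.
      messages: [{ role: "user", content: `Summarize: ${userText}` }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}
```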

But why don't we just run the AI on the user's device instead of in the cloud? I have been testing this for a few weeks with amazing results. I found three main advantages:

  1. The user data is never sent to a third party. It always remains on the user's device.
  2. It's free for the app developer: you don't need to pay for the user's inference, because it happens directly on their device.
  3. The scalability is unlimited, as every single new user brings their own computation power.

Let's take a quick look at how the previous schema changes when we offload the AI computation to the user's device:

Schema of running AI locally in the user's browser

It's a very simple concept. The user uses the web application as always, but when a task requires AI computation, instead of calling a third-party API, we send it to the user, and their device performs that computation in the most secure place possible: locally.
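As a concrete (if simplified) sketch, running a model directly in the browser is already possible today with a library like Transformers.js; the model name and options below are example assumptions:

```typescript
// Minimal in-browser inference with Transformers.js.
// The model is downloaded once, cached by the browser,
// and every prompt is processed entirely on the user's device.
import { pipeline } from "@xenova/transformers";

// "Xenova/distilgpt2" is an example of a small model that
// fits comfortably in a browser tab.
const generator = await pipeline("text-generation", "Xenova/distilgpt2");

// The prompt never leaves the device.
const output = await generator("The benefits of local AI are", {
  max_new_tokens: 50,
});

console.log(output);
```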

This is not just a dream, it's already fully functional, and I created a platform called Offload so that everyone can use this architecture easily, changing just a few lines of code. The SDK handles everything behind the scenes, from downloading a model that fits on the user's device, to helping you manage prompts and evaluate prompt responses locally, sending the evaluation results back to you without exposing the user data. Everything works transparently with a single function invocation.
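To give a feel for that developer experience, here is a hypothetical sketch; the package name, function, and options are illustrative assumptions, not Offload's actual API:

```typescript
// Hypothetical sketch of an Offload-style integration.
// The import path and function below are illustrative
// assumptions, not the real Offload SDK surface.
import { runPrompt } from "@offload/sdk"; // hypothetical package name

const userNote = "Meeting notes: discussed the Q3 roadmap..."; // example user data

// One call: the SDK picks a model that fits the device, downloads
// it if needed, and runs the prompt locally in the browser.
const answer = await runPrompt({
  prompt: `Summarize this note: ${userNote}`, // never leaves the device
});

console.log(answer);
```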

I am looking for web developers who may benefit from this, even if it's just for hobby projects. So, if you like this approach, ping me! I would love to help you set it up in your application, and you will see that it is actually really simple to migrate within minutes.

Integrate Offload in your application
