Odinaka Joy
Running Machine Learning Models in the Browser Using onnxruntime-web

🚀 AI in the browser? Yes, it’s possible.

Most machine learning models live on the backend, meaning every prediction requires a server call. But I wanted to run my model directly in the browser: predictions are faster, data stays private, and there is no extra infrastructure to maintain. That’s exactly what I did using the onnxruntime-web library.

Let me walk you through how I deployed my Mental Health Treatment Prediction model into the frontend of my app, SoulSync, using onnxruntime-web.

Here’s the process we will follow:

  1. Prepare new input data to match the model’s training data
  2. Encode and format the inputs correctly
  3. Load the ONNX model in the browser with onnxruntime-web
  4. Run inference and display the results

📌 Prerequisites & Setup

You will need:

  1. Basic Python knowledge - to train/export your model.

  2. Basic ML knowledge – how models are built and exported.

  3. Basic JavaScript knowledge - to run your model in the browser.

  4. An ONNX model file. I used my exported Mental Health Treatment Prediction model.

  5. A frontend environment – a plain HTML/JavaScript project or a framework like React. I used Next.js.

  6. Place your exported ONNX model in the public/ (or assets/) folder of your project.

  7. Install the library:

npm install onnxruntime-web

This gives us the onnxruntime-web package, which runs ONNX models in the browser using WebAssembly (WASM) or WebGL for acceleration.
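By default, onnxruntime-web picks a backend for you, but you can also choose one explicitly when creating a session. A minimal sketch, assuming a placeholder model path in public/ (run inside an async function or ES module):

import * as ort from "onnxruntime-web";

// Multi-threaded WASM needs cross-origin isolation, so 1 thread is the safe default
ort.env.wasm.numThreads = 1;

// List providers in order of preference; putting "webgl" first would try GPU acceleration
const session = await ort.InferenceSession.create("/model.onnx", {
  executionProviders: ["wasm"],
});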


📌 Preparing Your Data for Inference

For inference to work, your new input data must match the training data in format, encoding, and order.

  1. Collect the same features you used during training. For my project, these included:
    • age, gender, family_history, work_interfere, no_employees, remote_work, leave, etc.
  2. Encode them the same way as during training:
    • Binary Encoding (Yes/No values)
    • Ordinal Encoding (ordered categories, e.g., “Very easy” → 0, …)
    • One-hot Encoding (features with three categories, e.g., Yes/No/Not sure)
  3. Verify order and number of features – must exactly match the training features.

Example: my model was trained on 21 features.
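Because a silent feature mismatch is the most common failure mode here, it helps to fail fast before inference. A small sketch with a hypothetical helper (the count is specific to my model):

// Guard: the encoded object must hold exactly the 21 training features
const EXPECTED_FEATURES = 21;

export function toFeatureVector(encoded) {
  const values = Object.values(encoded);
  if (values.length !== EXPECTED_FEATURES) {
    throw new Error(`Expected ${EXPECTED_FEATURES} features, got ${values.length}`);
  }
  return values;
}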

 

🎯 Encoding Input Data

Here’s how I encoded form data from users in my app:

const binaryMap = { Yes: 1, No: 0 };
const workInterfereMap = { Never: 0, Rarely: 1, Sometimes: 2, Often: 3, "Not specified": -1 };
const noEmployeesMap = { "1-5": 0, "6-25": 1, "26-100": 2, "100-500": 3, "500-1000": 4, "More than 1000": 5 };
const leaveMap = { "Very easy": 0, "Somewhat easy": 1, "Don't know": 2, "Somewhat difficult": 3, "Very difficult": 4 };

export const encodeInput = (formData) => {
  const age = parseInt(formData.age, 10) || 30;
  // Default missing binary answers to 0 ("No") so undefined never reaches the tensor
  const family_history = binaryMap[formData.family_history] ?? 0;
  const work_interfere = workInterfereMap[formData.work_interfere] ?? -1;
  const no_employees = noEmployeesMap[formData.no_employees] ?? 2;
  const remote_work = binaryMap[formData.remote_work] ?? 0;
  const leave = leaveMap[formData.leave] ?? 2;
  const obs_consequence = binaryMap[formData.obs_consequence] ?? 0;

  // One-hot and binary encodings for categorical features
  const gender_Male = formData.gender === "Male" ? 1 : 0;
  const gender_Other = formData.gender === "Other" ? 1 : 0;
  const benefits_Yes = formData.benefits === "Yes" ? 1 : 0;
  const benefits_No = formData.benefits === "No" ? 1 : 0;
  const care_options_Not_sure = formData.care_options === "Not sure" ? 1 : 0;
  const care_options_Yes = formData.care_options === "Yes" ? 1 : 0;
  const wellness_program_Yes = formData.wellness_program === "Yes" ? 1 : 0;
  const wellness_program_No = formData.wellness_program === "No" ? 1 : 0;
  const seek_help_Yes = formData.seek_help === "Yes" ? 1 : 0;
  const seek_help_No = formData.seek_help === "No" ? 1 : 0;
  const anonymity_Yes = formData.anonymity === "Yes" ? 1 : 0;
  const anonymity_No = formData.anonymity === "No" ? 1 : 0;
  const mental_vs_physical_Yes = formData.mental_vs_physical === "Yes" ? 1 : 0;
  const mental_vs_physical_No = formData.mental_vs_physical === "No" ? 1 : 0;

  // Key order matters: Object.values() later relies on insertion order,
  // so these must follow the exact training feature order.
  return {
    age,
    family_history,
    work_interfere,
    no_employees,
    remote_work,
    leave,
    obs_consequence,
    gender_Male,
    gender_Other,
    benefits_No,
    benefits_Yes,
    "care_options_Not sure": care_options_Not_sure,
    care_options_Yes,
    wellness_program_No,
    wellness_program_Yes,
    seek_help_No,
    seek_help_Yes,
    anonymity_No,
    anonymity_Yes,
    mental_vs_physical_No,
    mental_vs_physical_Yes,
  };
};

This ensures new inputs match the 21 training features exactly.
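To make the mapping concrete, here is a made-up form submission run through encodeInput (the values are hypothetical, but the option labels match the maps above):

const formData = {
  age: "29",
  gender: "Male",
  family_history: "No",
  work_interfere: "Sometimes",
  no_employees: "26-100",
  remote_work: "Yes",
  leave: "Don't know",
  obs_consequence: "No",
  benefits: "Yes",
  care_options: "Not sure",
  wellness_program: "No",
  seek_help: "Yes",
  anonymity: "Yes",
  mental_vs_physical: "No",
};

const encoded = encodeInput(formData);
console.log(Object.values(encoded).length); // 21, matching the training features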

🎯 Loading the ONNX Model with onnxruntime-web

Now, let’s load the model and run inference:

import * as ort from "onnxruntime-web";

export async function runInference(encodedInputData) {
  try {
    // The model lives in public/, so it is served from the site root
    const session = await ort.InferenceSession.create("/mental_health_model_deployment.onnx");

    const inputArray = Object.values(encodedInputData);

    // Create a tensor of shape [1, num_features]
    const tensor = new ort.Tensor("float32", Float32Array.from(inputArray), [1, inputArray.length]);

    // Input name must match what was used when exporting the ONNX model
    const feeds = { float_input: tensor };

    // Run inference
    const results = await session.run(feeds);

    const label = Number(results.label.data[0]);
    const probabilities = Array.from(results.probabilities.data);

    const classes = ["No Treatment", "Needs Treatment"];

    return {
      predictedClass: classes[label],
      probabilities: {
        [classes[0]]: probabilities[0],
        [classes[1]]: probabilities[1],
      },
    };
  } catch (e) {
    console.error("Error during ONNX inference:", e);
    throw e;
  }
}
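One caveat: the function above creates a new session on every call, which re-fetches and re-initializes the model each time. A sketch of a simple memoized session (the helper name is mine):

// Create the session once and reuse it across predictions
let sessionPromise = null;

function getSession() {
  sessionPromise ??= ort.InferenceSession.create("/mental_health_model_deployment.onnx");
  return sessionPromise;
}

// Then inside runInference: const session = await getSession();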

🎯 Running Inference

Example output:

{
    "predictedClass": "No Treatment",
    "probabilities": {
        "No Treatment": 0.9706928730010986,
        "Needs Treatment": 0.02930714190006256
    }
}

🎯 Displaying Predictions to the User

Finally, you can show the results back in your frontend UI:
(Screenshot: the Mental Health Treatment Prediction result as displayed in the app.)
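In a React/Next.js app, a component along these lines could render the object returned by runInference (a sketch, not the exact SoulSync markup):

function PredictionResult({ result }) {
  if (!result) return null;
  return (
    <div>
      <h3>Prediction: {result.predictedClass}</h3>
      <ul>
        {Object.entries(result.probabilities).map(([label, p]) => (
          <li key={label}>
            {label}: {(p * 100).toFixed(1)}%
          </li>
        ))}
      </ul>
    </div>
  );
}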


Conclusion

And that’s it. We successfully deployed a machine learning model in the browser using onnxruntime-web.

This approach makes predictions:

  • Faster - no backend round trips
  • Private - data stays on the user’s device
  • Accessible - works anywhere with just a browser

If you’d like to try this yourself, check out my demo app SoulSync.

GitHub Repo: Link

Happy coding!!!
