DEV Community

Manoj Kumar Patra
Manoj Kumar Patra

Posted on

3 2

Design Patterns for Resilient Serving - Batch Serving

Batch serving is useful when we need to carry out predictions asynchronously unlike stateless serving function where we can process one instance or at most a few 1000 instances embedded in a single request.

Examples include:

  1. Determine whether to reorder a stock-keeping unit - needs to be carried out on an hourly basis
  2. Creating personalized songs playlist
  3. Recommendation engines with periodic refresh rates - say, the periodic refresh rate is per hour, then, we carry out inferences for only those users who visited the website in last on hour

To achieve asynchronous predictions, batch serving makes use of distributed data processing infrastructures such as BigQuery, Apache Beam, etc.

Consider this example below, where we run inference on approx. 1.5 million rows of data using BigQuery:

WITH all_complaints AS (
SELECT * FROM ML.PREDICT(MODEL external_model,
  (SELECT consumer_complaint_narrative AS reviews
   FROM `bigquery-public-data`.cfpb_complaints.complaint_database
   WHERE consumer_complaint_narrative IS NOT NULL
  )
))
SELECT * FROM all_complaints
ORDER BY positive_review_probability DESC LIMIT 5
Enter fullscreen mode Exit fullscreen mode

Here, the following operations take place in order:

  1. Read consumer_complaint_narrative column from dataset where consumer_complaint_narrative is not NULL. Let's assume this is a total of X values. These are then distributed across N shards.
  2. N workers process each of N shards to read the data and do the inference using the model files.
  3. Each of the N workers find the 5 most positive complaints from the shard they processed.
  4. Take the (5 * N) complaints, sort them and then select 5 from the actual result.

Postmark Image

Speedy emails, satisfied customers

Are delayed transactional emails costing you user satisfaction? Postmark delivers your emails almost instantly, keeping your customers happy and connected.

Sign up

Top comments (0)

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay