Saeed

Optimizing Line Chart Performance with LTTB Algorithm

When working with large datasets, rendering all points in a line chart can cause significant performance issues. For example, plotting 50,000 data points directly can overwhelm the browser and make the chart unresponsive. Tools like amCharts and ApexCharts struggle with such datasets, while ECharts performs better but still isn't optimized for extremely large datasets.

To address this, I combined ECharts with the Largest Triangle Three Buckets (LTTB) algorithm. LTTB is a downsampling algorithm that reduces data points while preserving the visual structure of the dataset.
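Here's a minimal sketch of that combination, using the lttb function defined later in this post (the fetchSeries helper and the chart container id are placeholders of mine, not part of the original setup):

import * as echarts from "echarts";

// Placeholder: load your own ~50,000-point series here.
const rawData: Point[] = fetchSeries();

// Downsample to roughly one point per horizontal pixel before rendering.
const reduced = lttb(rawData, 1000);

const chart = echarts.init(document.getElementById("chart")!);
chart.setOption({
  xAxis: { type: "value" },
  yAxis: { type: "value" },
  series: [
    {
      type: "line",
      showSymbol: false, // skip per-point symbols on large series
      data: reduced.map((p) => [p.x, p.y]),
    },
  ],
});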

In this post, I’ll explain how the LTTB algorithm works step by step with an example of reducing 10 points to 4. For each step, I'll include the relevant code, explanations, and resulting data transformations.

LTTB Algorithm Code
Here’s the full code for the LTTB algorithm:

interface Point {
  x: number;
  y: number;
}

export function lttb(data: Point[], threshold: number): Point[] {
  const dataLength = data.length;
  if (threshold >= dataLength || threshold === 0) {
    return data; // No downsampling needed
  }

  const sampled: Point[] = [];
  let sampledIndex = 0;
  const bucketSize = (dataLength - 2) / (threshold - 2);
  let a = 0; // Start point
  let nextA = 0;

  sampled[sampledIndex++] = data[a]; // Add the first point

  for (let i = 0; i < threshold - 2; i++) {
    // Average the points of the next bucket; this average becomes one
    // corner of the triangle used to rank points in the current bucket.
    let avgX = 0;
    let avgY = 0;
    let avgRangeStart = Math.floor((i + 1) * bucketSize) + 1;
    let avgRangeEnd = Math.floor((i + 2) * bucketSize) + 1;
    avgRangeEnd = avgRangeEnd < dataLength ? avgRangeEnd : dataLength;
    const avgRangeLength = avgRangeEnd - avgRangeStart;

    for (; avgRangeStart < avgRangeEnd; avgRangeStart++) {
      avgX += data[avgRangeStart].x;
      avgY += data[avgRangeStart].y;
    }

    avgX /= avgRangeLength;
    avgY /= avgRangeLength;

    // Scan the current bucket for the point that maximizes the triangle area.
    let rangeOffs = Math.floor((i + 0) * bucketSize) + 1;
    const rangeTo = Math.floor((i + 1) * bucketSize) + 1;
    const pointAX = data[a].x;
    const pointAY = data[a].y;

    let maxArea = -1;

    for (; rangeOffs < rangeTo; rangeOffs++) {
      // Twice the area of the triangle (A, candidate, next-bucket average);
      // the constant factor of 1/2 is dropped since only comparisons matter.
      const area = Math.abs(
        (pointAX - avgX) * (data[rangeOffs].y - pointAY) -
        (pointAX - data[rangeOffs].x) * (avgY - pointAY)
      );
      if (area > maxArea) {
        maxArea = area;
        nextA = rangeOffs;
      }
    }

    sampled[sampledIndex++] = data[nextA]; // Add the most important point
    a = nextA;
  }

  sampled[sampledIndex++] = data[dataLength - 1]; // Add the last point

  return sampled;
}
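A quick standalone usage sketch (the synthetic sine-wave data is mine, purely for illustration):

// Build a synthetic 50,000-point series (illustration only).
const raw: Point[] = Array.from({ length: 50_000 }, (_, i) => ({
  x: i,
  y: 100 * Math.sin(i / 500) + 5 * Math.random(),
}));

const downsampled = lttb(raw, 1000);
console.log(raw.length, "->", downsampled.length); // 50000 -> 1000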

Example: Reducing 10 Points to 4 Points
We’ll reduce the following dataset of 10 points to 4 points using the LTTB algorithm:

Original Dataset

const data = [
  { x: 1, y: 10 },
  { x: 2, y: 20 },
  { x: 3, y: 15 },
  { x: 4, y: 25 },
  { x: 5, y: 30 },
  { x: 6, y: 20 },
  { x: 7, y: 40 },
  { x: 8, y: 35 },
  { x: 9, y: 45 },
  { x: 10, y: 50 }
];
const threshold = 4; // Reduce to 4 points

Step 1: Add the First Point

sampled[sampledIndex++] = data[a]; // Add the first point

βœ… The algorithm always starts by keeping the first data point, so the downsampled series begins at exactly the same value as the original.

πŸ“Š Resulting Data:

sampled = [
  { x: 1, y: 10 }
];

Step 2: Calculate the Bucket Size

const bucketSize = (dataLength - 2) / (threshold - 2);

βœ… The first and last points are always kept outside the buckets, so only the 8 interior points get bucketed. The bucket size determines how many of those points fall into each "bucket." In this case:

bucketSize = (dataLength βˆ’ 2) / (threshold βˆ’ 2) = (10 βˆ’ 2) / (4 βˆ’ 2) = 4

So each bucket contains 4 data points: points 2–5 and points 6–9.
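To make the index arithmetic concrete, here is a small sketch of mine (not from the original post) that prints the ranges the code above computes for this dataset:

const dataLength = 10;
const threshold = 4;
const bucketSize = (dataLength - 2) / (threshold - 2); // 4

for (let i = 0; i < threshold - 2; i++) {
  // Current bucket: candidates for selection.
  const from = Math.floor(i * bucketSize) + 1;
  const to = Math.floor((i + 1) * bucketSize) + 1;
  // Next bucket: averaged to anchor the triangle.
  const avgFrom = Math.floor((i + 1) * bucketSize) + 1;
  const avgTo = Math.min(Math.floor((i + 2) * bucketSize) + 1, dataLength);
  console.log(`bucket ${i}: scan indices ${from}-${to - 1}, average indices ${avgFrom}-${avgTo - 1}`);
}
// bucket 0: scan indices 1-4, average indices 5-8
// bucket 1: scan indices 5-8, average indices 9-9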

Step 3: Average the Next Bucket (Points 6–9)

let avgX = 0;
let avgY = 0;
let avgRangeStart = Math.floor((i + 1) * bucketSize) + 1;
let avgRangeEnd = Math.floor((i + 2) * bucketSize) + 1;
avgRangeEnd = avgRangeEnd < dataLength ? avgRangeEnd : dataLength;
const avgRangeLength = avgRangeEnd - avgRangeStart;

for (; avgRangeStart < avgRangeEnd; avgRangeStart++) {
  avgX += data[avgRangeStart].x;
  avgY += data[avgRangeStart].y;
}

avgX /= avgRangeLength;
avgY /= avgRangeLength;

βœ… A subtle detail: in each iteration, the algorithm averages the points of the next bucket, not the current one. For the first iteration the next bucket holds points 6–9 (indices 5–8), so the average is computed by summing their x and y values and dividing by the count:

avgX = (6 + 7 + 8 + 9) / 4 = 7.5

avgY = (20 + 40 + 35 + 45) / 4 = 35

Step 4: Find the Most Significant Point in the First Bucket (Points 2–5)

let rangeOffs = Math.floor((i + 0) * bucketSize) + 1;
const rangeTo = Math.floor((i + 1) * bucketSize) + 1;
const pointAX = data[a].x;
const pointAY = data[a].y;

let maxArea = -1;

for (; rangeOffs < rangeTo; rangeOffs++) {
  const area = Math.abs(
    (pointAX - avgX) * (data[rangeOffs].y - pointAY) -
    (pointAX - data[rangeOffs].x) * (avgY - pointAY)
  );
  if (area > maxArea) {
    maxArea = area;
    nextA = rangeOffs;
  }
}

βœ… The algorithm scans the current bucket (points 2–5) and picks the point that forms the largest triangle with the previously selected point A = (1, 10) and the next bucket's average (7.5, 35).
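Working the area expression out by hand for each candidate (my own arithmetic; remember the code drops the constant factor of Β½):

area(2, 20) = |(1 βˆ’ 7.5)(20 βˆ’ 10) βˆ’ (1 βˆ’ 2)(35 βˆ’ 10)| = |βˆ’65 + 25| = 40
area(3, 15) = |(1 βˆ’ 7.5)(15 βˆ’ 10) βˆ’ (1 βˆ’ 3)(35 βˆ’ 10)| = |βˆ’32.5 + 50| = 17.5
area(4, 25) = |(1 βˆ’ 7.5)(25 βˆ’ 10) βˆ’ (1 βˆ’ 4)(35 βˆ’ 10)| = |βˆ’97.5 + 75| = 22.5
area(5, 30) = |(1 βˆ’ 7.5)(30 βˆ’ 10) βˆ’ (1 βˆ’ 5)(35 βˆ’ 10)| = |βˆ’130 + 100| = 30

(2, 20) wins with area 40, so it becomes the next selected point.

πŸ“Š Resulting Data: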

sampled = [
  { x: 1, y: 10 },
  { x: 2, y: 20 }
];

Step 5: Second Bucket (Points 6–9)

The algorithm repeats the process for the second bucket (points 6–9). This time the next bucket contains only the final point (10, 50), so the average is simply:

avgX = 10 / 1 = 10

avgY = 50 / 1 = 50

Point A is now (2, 20), the point selected in the previous step.
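Running the candidates through the same area expression (again my own arithmetic, Β½ factor dropped):

area(6, 20) = |(2 βˆ’ 10)(20 βˆ’ 20) βˆ’ (2 βˆ’ 6)(50 βˆ’ 20)| = |0 + 120| = 120
area(7, 40) = |(2 βˆ’ 10)(40 βˆ’ 20) βˆ’ (2 βˆ’ 7)(50 βˆ’ 20)| = |βˆ’160 + 150| = 10
area(8, 35) = |(2 βˆ’ 10)(35 βˆ’ 20) βˆ’ (2 βˆ’ 8)(50 βˆ’ 20)| = |βˆ’120 + 180| = 60
area(9, 45) = |(2 βˆ’ 10)(45 βˆ’ 20) βˆ’ (2 βˆ’ 9)(50 βˆ’ 20)| = |βˆ’200 + 210| = 10

(6, 20) wins with area 120, so it is the most significant point in this bucket.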

πŸ“Š Updated Data:

sampled = [
  { x: 1, y: 10 },
  { x: 2, y: 20 },
  { x: 6, y: 20 }
];

Step 6: Add the Last Point

sampled[sampledIndex++] = data[dataLength - 1]; // Add the last point

βœ… The algorithm always ends by adding the last point, so the series finishes at its true final value.

πŸ“Š Final Data:

sampled = [
  { x: 1, y: 10 },
  { x: 2, y: 20 },
  { x: 6, y: 20 },
  { x: 10, y: 50 }
];
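As a sanity check, running the function end to end reproduces this result (a quick sketch, assuming the lttb function and dataset defined above):

const result = lttb(data, threshold);
console.log(result);
// [
//   { x: 1, y: 10 },
//   { x: 2, y: 20 },
//   { x: 6, y: 20 },
//   { x: 10, y: 50 }
// ]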

Conclusion
By combining ECharts with the LTTB algorithm, I was able to downsample my dataset efficiently and improve chart performance significantly. If you work with large datasets, this method is a great way to balance performance and visual accuracy. Let me know if you’ve tried this approach or if you have other techniques to share!
