DEV Community

Bonthu Durga Prasad

High Performance Computing Storage in OCI using Lustre File System

As cloud workloads evolve, especially in areas like high-performance computing (HPC), machine learning, and big data analytics, traditional storage systems often become a bottleneck. These workloads require high throughput, low latency, and parallel file access.

In Oracle Cloud Infrastructure (OCI), high-performance storage requirements can be addressed using the Lustre File System, a parallel distributed file system designed for large-scale workloads.

This article explores how Lustre works and how it can be used in OCI environments.

What is Lustre File System?

Lustre is a parallel distributed file system designed for environments that require high-speed access to large datasets.

It is commonly used in:

  • High Performance Computing (HPC)
  • Artificial Intelligence and Machine Learning
  • Scientific simulations
  • Big data processing

Unlike traditional file systems, Lustre distributes data across multiple storage nodes to achieve high performance.

Why Use Lustre in OCI?

Cloud-based HPC workloads demand:

  • High throughput
  • Scalable storage
  • Parallel access from multiple compute nodes

Lustre provides:

  • Parallel read/write operations
  • Horizontal scalability
  • High bandwidth performance

This makes it ideal for workloads where multiple compute instances process large datasets simultaneously.

Lustre Architecture Overview

Lustre is built using multiple components working together.

Key Components

  • Metadata Server (MDS) → Stores file metadata
  • Object Storage Servers (OSS) → Store actual data
  • Clients → Compute instances accessing the file system

Architecture Flow

Compute Nodes (Clients)
        ↓
Metadata Server (MDS)
        ↓
Object Storage Servers (OSS)
        ↓
Distributed Storage

In this architecture:

  • Clients request metadata from MDS
  • Data is read/written from OSS nodes
  • Operations happen in parallel for high performance

How Lustre Works

When a client accesses a file:

  • The client sends a metadata request to the MDS
  • The MDS returns the file's location information
  • The client reads or writes data directly on the OSS nodes, without routing it through the MDS
  • Data transfer happens in parallel across those OSS nodes

This parallel architecture significantly improves performance.
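To make the parallel transfer concrete, here is a minimal sketch of how a striped layout could map a byte offset in a file to a storage target. It assumes a simple round-robin (RAID-0 style) layout; the function name, the 1 MiB stripe size, and the stripe count of 4 are illustrative choices, not values taken from OCI.

```python
MiB = 1024 * 1024

def stripe_location(offset, stripe_size, stripe_count):
    """Map a byte offset in a file to (target_index, offset_within_target).

    Assumes a plain round-robin layout: stripe 0 goes to target 0,
    stripe 1 to target 1, and so on, wrapping around after stripe_count.
    """
    stripe_number = offset // stripe_size
    target_index = stripe_number % stripe_count
    # Each target stores every stripe_count-th stripe, packed back to back.
    offset_within_target = (
        (stripe_number // stripe_count) * stripe_size + offset % stripe_size
    )
    return target_index, offset_within_target

# Illustrative values only: 1 MiB stripes spread over 4 targets.
for offset in (0, 1 * MiB, 4 * MiB, 5 * MiB + 10):
    print(offset, stripe_location(offset, stripe_size=MiB, stripe_count=4))
```

Because consecutive stripes land on different targets, a client reading a large contiguous range can pull from several storage servers at once, which is where the throughput gain comes from.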

Real-World Use Cases

Lustre is widely used in scenarios such as:

  1. Machine Learning Training

    Training large models requires fast access to massive datasets.

  2. Scientific Research

    Simulations generate huge amounts of data that must be processed quickly.

  3. Media Rendering

    Video processing and rendering workflows benefit from high throughput.

Benefits of Lustre in OCI

  • High throughput storage
  • Scalable architecture
  • Parallel data access
  • Optimized for HPC workloads

Best Practices

When using Lustre in OCI:

  • Use multiple compute nodes for parallel processing
  • Design workloads for distributed execution
  • Monitor performance and I/O usage
  • Use high-performance networking for better throughput
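As one way to apply the first two practices inside a single client, the sketch below reads a large file in fixed-size chunks from several threads. It uses only the Python standard library and works on any POSIX-style mount; the function names, the 4 MiB chunk size, and the worker count are assumptions for illustration rather than tuned recommendations.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def read_chunk(path, offset, length):
    """Read `length` bytes starting at `offset` using a private file handle."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def parallel_read(path, chunk_size=4 * 1024 * 1024, workers=8):
    """Read a whole file by issuing chunk reads concurrently.

    The result is held in memory, so this suits files that fit in RAM;
    a real pipeline would more likely process each chunk as it arrives.
    """
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the offsets.
        chunks = pool.map(lambda off: read_chunk(path, off, chunk_size), offsets)
        return b"".join(chunks)
```

A call such as `parallel_read("/mnt/lustre/dataset.bin")` would use it, where the mount path is hypothetical. The same idea scales out across compute nodes by giving each node its own range of offsets or its own subset of files.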

Lustre File System Limits

Lustre limits are per availability domain:
Resource                      Limit
Max file systems              8 per tenant per availability domain
Max capacity per file system  200 TB
Aggregate throughput          200 Gbps per tenancy per availability domain
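For capacity planning, these limits translate into a rough lower bound on transfer time. The sketch below assumes decimal units (1 Gbps = 10^9 bits per second, 1 TB = 10^12 bytes) and that a workload could use the full aggregate throughput, which real jobs rarely achieve; the function name is illustrative.

```python
TB = 10**12  # decimal terabyte, in bytes

def min_transfer_seconds(data_bytes, throughput_gbps):
    """Best-case time to move data_bytes at throughput_gbps (gigabits/second)."""
    bytes_per_second = throughput_gbps * 1e9 / 8
    return data_bytes / bytes_per_second

# Reading a full 200 TB file system at the 200 Gbps aggregate limit.
seconds = min_transfer_seconds(200 * TB, 200)
print(f"{seconds:.0f} s, about {seconds / 3600:.1f} hours")
```

At 200 Gbps the ceiling is 25 GB/s, so scanning a full 200 TB file system takes at least 8000 seconds, a little over two hours, before any per-client or network overhead is counted.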

The Lustre client is mandatory for any VM or compute instance that needs to access a Lustre file system.
On Oracle Linux, the Lustre client works only with the Red Hat Compatible Kernel (RHCK).

Syncing Lustre with Object Storage

OCI Lustre can sync data with Object Storage for cost-effective long-term storage:

  1. Import
    • Pull objects from Object Storage → Lustre
    • Use case: AI training, data processing

  2. Export
    • Push files from Lustre → Object Storage
    • Use case: Save processed results

OCI Lustre file systems require a Lustre client kernel module on every instance that mounts them.
However:

  • Oracle Linux boots the Unbreakable Enterprise Kernel (UEK) by default, which is not compatible with the Lustre client
  • You must therefore switch the instance to the Red Hat Compatible Kernel (RHCK)
  • You must then build the Lustre client from source, unless a prebuilt package exists for your kernel version
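Before installing the client, it can help to confirm which kernel an instance is actually running. The sketch below is a heuristic, not an official check: it assumes UEK release strings contain the marker "uek" (for example something shaped like `5.15.0-100.96.32.el8uek.x86_64`), and both sample strings in the comments are illustrative.

```python
import platform

def is_uek(release):
    """Heuristic: treat a kernel release string containing 'uek' as UEK."""
    return "uek" in release.lower()

# platform.release() returns the running kernel's release string,
# the same value that `uname -r` prints.
release = platform.release()
if is_uek(release):
    print(f"{release}: looks like UEK, switch to RHCK before installing the Lustre client")
else:
    print(f"{release}: no UEK marker found")
```

A release string without the marker, such as `4.18.0-513.5.1.el8_9.x86_64`, would be reported as not UEK. That alone does not prove the kernel is RHCK, only that the UEK marker is absent.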
