Ryo Suwito

BSFS: A Security-First Approach to Block Storage

When examining modern storage systems, most implementations bolt security onto existing architectures as an afterthought. BSFS (Block Storage File System) takes the opposite approach: building cryptographic security into the fundamental design from day one.

Core Security Principles

BSFS combines four security mechanisms that distinguish it from conventional file systems:

Random Block Allocation Strategy
Rather than allocating blocks sequentially, BSFS deliberately randomizes block assignment. This is meant to hinder correlation attacks in which an adversary infers file boundaries, access patterns, or relationships between data by analyzing block locality.
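
As a rough illustration of the idea (not BSFS's actual allocator), the sketch below draws block indices from a free pool using the operating system's CSPRNG. The `allocate_blocks` helper and the set-based free pool are assumptions made for this example.

```python
import secrets


def allocate_blocks(free_blocks: set, count: int) -> list:
    """Pick `count` distinct block indices uniformly at random from the free pool.

    Hypothetical helper for illustration; the real BSFS allocator may differ.
    """
    if count > len(free_blocks):
        raise ValueError("not enough free blocks")
    rng = secrets.SystemRandom()  # backed by the OS CSPRNG
    chosen = rng.sample(sorted(free_blocks), count)
    free_blocks.difference_update(chosen)  # mark the chosen blocks as used
    return chosen


free = set(range(1024))
blocks = allocate_blocks(free, 4)
print(blocks)  # four distinct indices in no particular order
```

Because consecutive blocks of one file land at unrelated positions, the on-disk layout alone says little about which blocks belong together.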

Encrypted Metadata Architecture
Each partition's Block Allocation Table (BAT) is encrypted with AES-256-CBC. Without the correctly derived key, file-to-block mappings remain opaque, ensuring metadata confidentiality alongside data protection.

Cryptographic Tenant Isolation
Multi-tenant environments achieve isolation through HKDF-based key derivation rather than filesystem permissions. Each tenant works with its own derived keys, so one tenant's key material cannot decrypt another tenant's metadata.
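
To make the derivation step concrete, here is a minimal sketch of HKDF (RFC 5869) with HMAC-SHA256, built from the standard library only. BSFS itself uses OpenSSL; the `tenant_key` helper, the `bsfs-tenant:` context label, and the choice of SHA-256 are assumptions for illustration, not the project's actual parameters.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output too long")
    # Extract: condense the input key material into a pseudorandom key.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: stretch the PRK into `length` bytes bound to `info`.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_key(master_key: bytes, tenant_id: str) -> bytes:
    # "bsfs-tenant:" is a made-up context label for this sketch.
    return hkdf_sha256(master_key, salt=b"", info=b"bsfs-tenant:" + tenant_id.encode())


master = bytes(32)  # placeholder only; use a real random key in practice
key_a = tenant_key(master, "tenant-a")
key_b = tenant_key(master, "tenant-b")
print(key_a != key_b)  # True
```

The same master key yields unrelated 32-byte keys per tenant, and the derivation is deterministic, so no per-tenant key needs to be stored.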

Atomic Update Semantics
Copy-on-write mechanisms ensure transactional consistency. BAT updates serve as commit points, guaranteeing that operations either complete entirely or leave no partial state.
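
The commit-point idea can be modeled in a few lines. The toy `TinyStore` class below is an in-memory sketch, not BSFS code: it ignores encryption and persistence, and its names are invented for this example. New data goes only into blocks the BAT does not reference, and a single BAT update then switches the file to the new version.

```python
class TinyStore:
    """Toy in-memory model: a block array plus a BAT mapping file id -> block list."""

    def __init__(self, num_blocks: int):
        self.blocks = [None] * num_blocks
        self.bat = {}

    def _free_indices(self) -> list:
        used = {i for idxs in self.bat.values() for i in idxs}
        return [i for i in range(len(self.blocks)) if i not in used]

    def write(self, file_id: str, chunks: list) -> None:
        free = self._free_indices()
        if len(chunks) > len(free):
            raise OSError("out of space")  # the old version is untouched
        new_idxs = free[:len(chunks)]
        # 1. Write new data only into blocks the BAT does not reference.
        for idx, chunk in zip(new_idxs, chunks):
            self.blocks[idx] = chunk
        # 2. Commit point: one BAT update switches to the new version.
        self.bat[file_id] = new_idxs

    def read(self, file_id: str) -> bytes:
        return b"".join(self.blocks[i] for i in self.bat[file_id])


store = TinyStore(num_blocks=4)
store.write("doc", [b"v1-a", b"v1-b"])
store.write("doc", [b"v2-a", b"v2-b"])  # lands in the two other blocks
try:
    store.write("doc", [b"x"] * 3)      # needs 3 free blocks, only 2 are free
except OSError:
    pass
print(store.read("doc"))  # b'v2-av2-b'
```

A failed write never touches blocks that the current BAT entry points to, so readers see either the old version or the new one. The cost is that an update temporarily needs free space for both versions.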

Implementation Architecture

Tenant Storage Structure:
├── Partition 0
│   ├── BAT (AES-256-CBC encrypted)
│   └── Data Blocks (2MB default)
├── Partition 1
│   ├── BAT (encrypted with derived keys)
│   └── Data Blocks
└── Additional Partitions...

The system maintains a clean separation between encrypted metadata and raw block storage, enabling efficient operations while preserving security boundaries.

API Design Philosophy

#include <stdint.h>
#include <string.h>
#include <uuid/uuid.h>

#include "bsfs.h"

int main(void) {
    // Placeholder: fill with 32 bytes from a cryptographically secure source.
    // As written, the key is all zeros.
    uint8_t master_key[32] = {0};
    bsfs_tenant_t tenant;

    // Initialize tenant with cryptographic isolation
    if (bsfs_tenant_init(&tenant, "storage.blob", master_key) != 0) {
        return 1;
    }

    // Atomic file operations
    uuid_t file_id;
    uuid_generate(file_id);

    const char *data = "Sensitive data payload";
    int rc = bsfs_write_file(&tenant, file_id, (uint8_t*)data, strlen(data));

    // Cleanup releases all cryptographic material, whether or not the write succeeded
    bsfs_tenant_cleanup(&tenant);
    return rc == 0 ? 0 : 1;
}

The API deliberately minimizes attack surface by providing only essential operations while maintaining cryptographic hygiene throughout the lifecycle.

Technical Specifications

Current Implementation Limits:

  • Single partition support (extensible architecture)
  • 64 files per partition maximum
  • 2GB file size ceiling
  • 2MB default block size with configurability
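
These limits can be related with a little arithmetic. The sketch below assumes that "2MB" and "2GB" mean binary units (2 MiB and 2 GiB); the article does not say so explicitly. The constant names and the `blocks_needed` helper are made up for this example.

```python
BLOCK_SIZE = 2 * 1024 * 1024   # 2MB default block size (assumed to mean 2 MiB)
MAX_FILE_SIZE = 2 * 1024 ** 3  # 2GB file size ceiling (assumed to mean 2 GiB)
MAX_FILES = 64                 # files per partition


def blocks_needed(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks a file occupies, rounding the last partial block up."""
    if not 0 <= file_size <= MAX_FILE_SIZE:
        raise ValueError("file size outside supported range")
    return -(-file_size // block_size)  # ceiling division


print(blocks_needed(MAX_FILE_SIZE))              # 1024
print(blocks_needed(1))                          # 1
print(MAX_FILES * blocks_needed(MAX_FILE_SIZE))  # 65536
```

Under these assumptions a maximum-size file spans 1024 blocks, and a one-byte file still occupies a full block.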

Security Features:

  • AES-256-CBC for BAT encryption
  • HKDF key derivation for tenant separation
  • Random block allocation for pattern obfuscation
  • Atomic updates via copy-on-write

Dependencies:

  • OpenSSL for cryptographic primitives
  • UUID library for file identification
  • Standard C library with POSIX extensions

Future Development Vectors

The roadmap indicates several compelling directions:

Scaling Improvements: Multi-partition support and increased file limits would enable enterprise deployment scenarios.

Language Bindings: Python ctypes wrappers and Node.js native modules could broaden adoption while maintaining the C performance foundation.

Python Integration Strategy

The Python wrapper implementation would leverage ctypes for direct shared library interaction, providing both low-level access and high-level abstractions:

Shared Library Integration:

import ctypes
import uuid
from ctypes import c_char_p, c_uint8, c_size_t, POINTER, Structure
from pathlib import Path

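class BSFSTenant(Structure):
    # Placeholder layout: this must match sizeof(bsfs_tenant_t) in bsfs.h,
    # otherwise bsfs_tenant_init will write past the end of this buffer.
    _fields_ = [("opaque_data", ctypes.c_void_p)]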

class BSFS:
    def __init__(self, lib_path="libbsfs.so"):
        self.lib = ctypes.CDLL(lib_path)
        self._setup_function_signatures()

    def _setup_function_signatures(self):
        # Define C function prototypes
        self.lib.bsfs_tenant_init.argtypes = [
            POINTER(BSFSTenant), c_char_p, POINTER(c_uint8)
        ]
        self.lib.bsfs_tenant_init.restype = ctypes.c_int

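        self.lib.bsfs_write_file.argtypes = [
            POINTER(BSFSTenant), POINTER(c_uint8),
            POINTER(c_uint8), c_size_t
        ]
        self.lib.bsfs_write_file.restype = ctypes.c_int

        # Prototypes below mirror how the wrapper in this article calls them;
        # check them against bsfs.h before relying on them.
        self.lib.bsfs_read_file.argtypes = [
            POINTER(BSFSTenant), POINTER(c_uint8),
            POINTER(POINTER(c_uint8)), POINTER(c_size_t)
        ]
        self.lib.bsfs_read_file.restype = ctypes.c_int

        self.lib.bsfs_tenant_cleanup.argtypes = [POINTER(BSFSTenant)]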

High-Level Pythonic Interface:

import secrets
from typing import Dict, Optional


class BSFSError(Exception):
    """Raised when a BSFS library call returns a non-zero status."""

class BSFSStorage:
    def __init__(self, storage_path: Path, master_key: Optional[bytes] = None):
        self.storage_path = storage_path
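        # A randomly generated key is not persisted: data written with it cannot
        # be read back in a later session unless the caller saves master_key.
        self.master_key = master_key or secrets.token_bytes(32)
        if len(self.master_key) != 32:
            raise ValueError("master_key must be exactly 32 bytes")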
        self._bsfs = BSFS()
        self._tenant = BSFSTenant()

    def __enter__(self):
        result = self._bsfs.lib.bsfs_tenant_init(
            ctypes.byref(self._tenant),
            str(self.storage_path).encode(),
            (c_uint8 * 32)(*self.master_key)
        )
        if result != 0:
            raise BSFSError(f"Failed to initialize tenant: {result}")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self._bsfs.lib.bsfs_tenant_cleanup(ctypes.byref(self._tenant))

    def write_file(self, file_id: uuid.UUID, data: bytes) -> None:
        """Write file with atomic guarantees"""
        uuid_bytes = (c_uint8 * 16)(*file_id.bytes)
        # One bulk copy instead of unpacking every byte into a Python int
        data_array = (c_uint8 * len(data)).from_buffer_copy(data)

        result = self._bsfs.lib.bsfs_write_file(
            ctypes.byref(self._tenant),
            uuid_bytes,
            data_array,
            len(data)
        )
        if result != 0:
            raise BSFSError(f"Write operation failed: {result}")

    def read_file(self, file_id: uuid.UUID) -> bytes:
        """Read file by UUID"""
        uuid_bytes = (c_uint8 * 16)(*file_id.bytes)
        data_ptr = POINTER(c_uint8)()
        size = c_size_t()

        result = self._bsfs.lib.bsfs_read_file(
            ctypes.byref(self._tenant),
            uuid_bytes,
            ctypes.byref(data_ptr),
            ctypes.byref(size)
        )

        if result != 0:
            raise BSFSError(f"Read operation failed: {result}")

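        # Copy the C buffer into a Python bytes object
        data = ctypes.string_at(data_ptr, size.value)
        # Assumes the library allocated the buffer with malloc and expects the
        # caller to release it; `free` is looked up through the library handle.
        self._bsfs.lib.free(data_ptr)
        return data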

# Usage example demonstrating high-level interface
def secure_document_storage():
    storage_path = Path("./secure_docs.blob")

    with BSFSStorage(storage_path) as storage:
        # Store sensitive document
        doc_id = uuid.uuid4()
        sensitive_data = b"Classified research data with cryptographic protection"

        storage.write_file(doc_id, sensitive_data)

        # Retrieve and verify
        retrieved = storage.read_file(doc_id)
        assert retrieved == sensitive_data

        return doc_id

# Higher-level usage: a name-to-UUID registry on top of BSFSStorage (kept in memory only)
class SecureDocumentManager:
    def __init__(self, storage_path: Path, encryption_key: bytes):
        self.storage_path = storage_path
        self.encryption_key = encryption_key
        self._file_registry: Dict[str, uuid.UUID] = {}

    def store_document(self, name: str, content: bytes) -> uuid.UUID:
        """Store document with name mapping"""
        doc_id = uuid.uuid4()

        with BSFSStorage(self.storage_path, self.encryption_key) as storage:
            storage.write_file(doc_id, content)

        self._file_registry[name] = doc_id
        return doc_id

    def retrieve_document(self, name: str) -> bytes:
        """Retrieve document by name"""
        if name not in self._file_registry:
            raise KeyError(f"Document '{name}' not found")

        doc_id = self._file_registry[name]

        with BSFSStorage(self.storage_path, self.encryption_key) as storage:
            return storage.read_file(doc_id)

This Python integration wraps the underlying C implementation in idiomatic Python interfaces. The context manager ensures the tenant is cleaned up even when an exception is raised inside the `with` block, and ctypes calls the shared library directly. The wrapper as written is not zero-copy: data is copied into ctypes arrays on write and into a new bytes object on read.

Advanced Features: Block-level deduplication, streaming I/O, and background garbage collection represent natural evolution paths.

Distributed Architecture: Multi-node clustering with replication would enable fault-tolerant deployments.

Assessment

BSFS represents a thoughtful approach to secure storage design. Rather than retrofitting security onto existing architectures, it builds cryptographic principles into the fundamental data structures and algorithms.

The codebase is available at github.com/ryo-suwito/bsfs for examination and contribution.
