Most modern storage systems bolt security onto an existing architecture as an afterthought. BSFS (Block Storage File System) takes the opposite approach: cryptographic security is part of the fundamental design from day one.
Core Security Principles
BSFS combines four security mechanisms that distinguish it from conventional file systems:
Random Block Allocation Strategy
Rather than allocating blocks sequentially, BSFS deliberately randomizes block assignment. This makes it harder for an adversary to infer file boundaries, access patterns, or relationships between pieces of data by analyzing which blocks sit next to each other.
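To make the idea concrete, here is a minimal Python sketch of random allocation. It is not taken from the BSFS source: the `allocate_blocks` helper and the free-set representation are assumptions made for illustration, and the real allocator may track free space differently.

```python
import secrets


def allocate_blocks(free_blocks: set, count: int) -> list:
    """Pick `count` distinct free block indices uniformly at random.

    Hypothetical helper: BSFS's own allocator is not shown in the article.
    """
    if count > len(free_blocks):
        raise ValueError("not enough free blocks")
    rng = secrets.SystemRandom()  # randomness comes from the OS CSPRNG
    chosen = rng.sample(sorted(free_blocks), count)
    free_blocks.difference_update(chosen)  # mark the chosen blocks as used
    return chosen


free = set(range(1024))
blocks = allocate_blocks(free, 4)
print(blocks)     # four scattered indices, different on every run
print(len(free))  # 1020
```

Because the indices are drawn without regard to position, consecutive chunks of one file are unlikely to land in adjacent blocks.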
Encrypted Metadata Architecture
Each partition's Block Allocation Table (BAT) is encrypted with AES-256-CBC. Without the correctly derived key, the file-to-block mappings remain opaque, so metadata stays confidential alongside the data it describes.
Cryptographic Tenant Isolation
Multi-tenant environments achieve isolation through HKDF-based key derivation rather than filesystem permissions. Each tenant works with its own derived keys, so one tenant's key material cannot decrypt another tenant's metadata, and isolation does not depend on permission checks being enforced correctly.
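The sketch below shows how HKDF can turn one master key into separate per-tenant keys. It implements HKDF with HMAC-SHA-256 as described in RFC 5869 using only the Python standard library. The `tenant_bat_key` helper, the tenant identifiers, and the layout of the `info` string are assumptions for illustration; the article does not say which hash or context string BSFS uses.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-256: extract a PRK, then expand it."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA-256")
    # Extract: an empty salt is replaced by a block of zero bytes
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_bat_key(master_key: bytes, tenant_id: str, partition: int) -> bytes:
    """Hypothetical helper: derive a 32-byte BAT key for one tenant and partition."""
    info = b"bsfs/bat/" + tenant_id.encode() + b"/" + str(partition).encode()
    return hkdf_sha256(master_key, salt=b"", info=info)


master = bytes(32)  # placeholder only; use a random key in practice
key_a = tenant_bat_key(master, "tenant-a", 0)
key_b = tenant_bat_key(master, "tenant-b", 0)
print(key_a != key_b)  # True: different context strings give different keys
```

Binding the tenant and partition into the `info` parameter is what separates the derived keys; the master key itself never needs to be used directly for encryption.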
Atomic Update Semantics
Copy-on-write mechanisms ensure transactional consistency. BAT updates serve as commit points, guaranteeing that operations either complete entirely or leave no partial state.
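A toy in-memory model can illustrate why treating the BAT update as the commit point gives all-or-nothing behavior. The `Partition` class below is a sketch written for this article's description, not BSFS code; it allocates sequentially for brevity and has a `fail_before_commit` flag purely to simulate a crash.

```python
class Partition:
    """Toy copy-on-write store: the BAT swap is the only commit point."""

    def __init__(self, num_blocks: int):
        self.blocks = [None] * num_blocks
        self.bat = {}  # file_id -> tuple of block indices (committed state)

    def write_file(self, file_id, chunks, fail_before_commit=False):
        # Blocks referenced by the committed BAT are never overwritten
        in_use = {b for blks in self.bat.values() for b in blks}
        free = [i for i in range(len(self.blocks)) if i not in in_use]
        if len(chunks) > len(free):
            raise OSError("no free blocks")
        new_blocks = free[:len(chunks)]
        for idx, chunk in zip(new_blocks, chunks):
            self.blocks[idx] = chunk  # write new data beside the old data
        if fail_before_commit:
            raise RuntimeError("simulated crash before commit")
        new_bat = dict(self.bat)
        new_bat[file_id] = tuple(new_blocks)
        self.bat = new_bat  # commit: one reference swap publishes the write

    def read_file(self, file_id) -> bytes:
        return b"".join(self.blocks[i] for i in self.bat[file_id])


part = Partition(num_blocks=8)
part.write_file("doc", [b"version-1"])
try:
    part.write_file("doc", [b"version-2"], fail_before_commit=True)
except RuntimeError:
    pass
print(part.read_file("doc"))  # b'version-1': the failed write left no trace
```

Blocks written by the failed attempt are not referenced by any BAT entry, so they count as free again on the next write and no separate rollback step is needed.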
Implementation Architecture
Tenant Storage Structure:
├── Partition 0
│   ├── BAT (AES-256-CBC encrypted)
│   └── Data Blocks (2MB default)
├── Partition 1
│   ├── BAT (encrypted with derived keys)
│   └── Data Blocks
└── Additional Partitions...
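The system keeps encrypted metadata separate from raw block storage, which preserves the security boundary without requiring the table to be rewritten for every block access. Note that the diagram shows the intended multi-partition layout; the current implementation supports a single partition.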
API Design Philosophy
#include <stdint.h>
#include <string.h>
#include <uuid/uuid.h>

#include "bsfs.h"

int main(void) {
    /* Placeholder: load 32 bytes from a cryptographically secure source */
    uint8_t master_key[32] = {0};
    bsfs_tenant_t tenant;

    // Initialize tenant with cryptographic isolation
    if (bsfs_tenant_init(&tenant, "storage.blob", master_key) != 0) {
        return -1;
    }

    // Atomic file operations
    uuid_t file_id;
    uuid_generate(file_id);
    const char *data = "Sensitive data payload";
    int rc = bsfs_write_file(&tenant, file_id, (uint8_t *)data, strlen(data));

    // Cleanup releases all cryptographic material, even if the write failed
    bsfs_tenant_cleanup(&tenant);
    return rc == 0 ? 0 : -1;
}
The API deliberately keeps the attack surface small: it exposes only a handful of essential operations (initialize, write, read, clean up), and cleanup releases key material when the tenant is closed.
Technical Specifications
Current Implementation Limits:
- Single partition support (extensible architecture)
- 64 files per partition maximum
- 2GB file size ceiling
- 2MB default block size with configurability
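
As a rough illustration of what these limits imply, the arithmetic below works out the block counts. It assumes binary units (2MB = 2 MiB, 2GB = 2 GiB), the default block size, and that the file-count and file-size limits apply independently; none of these assumptions is confirmed by the article.

```python
BLOCK_SIZE = 2 * 1024**2       # 2 MiB default block
MAX_FILE_SIZE = 2 * 1024**3    # 2 GiB per-file ceiling
MAX_FILES = 64                 # files per partition


def blocks_needed(size_bytes: int) -> int:
    """Blocks occupied by a file, rounding up to whole blocks."""
    return -(-size_bytes // BLOCK_SIZE)


blocks_per_max_file = blocks_needed(MAX_FILE_SIZE)
max_blocks = MAX_FILES * blocks_per_max_file
max_capacity_gib = max_blocks * BLOCK_SIZE // 1024**3

print(blocks_per_max_file)   # 1024
print(max_blocks)            # 65536
print(max_capacity_gib)      # 128
print(blocks_needed(1))      # 1: even a one-byte file occupies a full block
```

Under these assumptions a single partition tops out at 128 GiB of file data, and small files are expensive because each one consumes at least one 2 MiB block.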
Security Features:
- AES-256-CBC for BAT encryption
- HKDF key derivation for tenant separation
- Random block allocation for pattern obfuscation
- Atomic updates via copy-on-write
Dependencies:
- OpenSSL for cryptographic primitives
- UUID library for file identification
- Standard C library with POSIX extensions
Future Development Vectors
The roadmap indicates several compelling directions:
Scaling Improvements: Multi-partition support and increased file limits would enable enterprise deployment scenarios.
Language Bindings: Python ctypes wrappers and Node.js native modules could broaden adoption while maintaining the C performance foundation.
Python Integration Strategy
The Python wrapper implementation would leverage ctypes for direct shared library interaction, providing both low-level access and high-level abstractions:
Shared Library Integration:
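import ctypes
import uuid
from ctypes import c_char_p, c_uint8, c_size_t, POINTER, Structure
from pathlib import Path

class BSFSTenant(Structure):
    # Placeholder layout: the fields and size must mirror bsfs_tenant_t in
    # bsfs.h, otherwise bsfs_tenant_init will write past this structure.
    _fields_ = [("opaque_data", ctypes.c_void_p)]

class BSFS:
    def __init__(self, lib_path="libbsfs.so"):
        self.lib = ctypes.CDLL(lib_path)
        self._setup_function_signatures()

    def _setup_function_signatures(self):
        # Define C function prototypes
        self.lib.bsfs_tenant_init.argtypes = [
            POINTER(BSFSTenant), c_char_p, POINTER(c_uint8)
        ]
        self.lib.bsfs_tenant_init.restype = ctypes.c_int
        self.lib.bsfs_write_file.argtypes = [
            POINTER(BSFSTenant), POINTER(c_uint8),
            POINTER(c_uint8), c_size_t
        ]
        self.lib.bsfs_write_file.restype = ctypes.c_int
        # Prototypes for the remaining calls used by the wrapper; the read
        # signature is inferred from how read_file calls it.
        self.lib.bsfs_read_file.argtypes = [
            POINTER(BSFSTenant), POINTER(c_uint8),
            POINTER(POINTER(c_uint8)), POINTER(c_size_t)
        ]
        self.lib.bsfs_read_file.restype = ctypes.c_int
        self.lib.bsfs_tenant_cleanup.argtypes = [POINTER(BSFSTenant)]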
High-Level Pythonic Interface:
from typing import Optional, Dict
import secrets

class BSFSError(Exception):
    """Raised when a BSFS library call returns a non-zero status."""

class BSFSStorage:
    def __init__(self, storage_path: Path, master_key: Optional[bytes] = None):
        self.storage_path = storage_path
        # A generated key lives only in this object: keep a copy of it,
        # or the stored data cannot be read back in a later session.
        self.master_key = master_key or secrets.token_bytes(32)
        if len(self.master_key) != 32:
            raise ValueError("master_key must be exactly 32 bytes")
        self._bsfs = BSFS()
        self._tenant = BSFSTenant()

    def __enter__(self):
        result = self._bsfs.lib.bsfs_tenant_init(
            ctypes.byref(self._tenant),
            str(self.storage_path).encode(),
            (c_uint8 * 32).from_buffer_copy(self.master_key)
        )
        if result != 0:
            raise BSFSError(f"Failed to initialize tenant: {result}")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self._bsfs.lib.bsfs_tenant_cleanup(ctypes.byref(self._tenant))

    def write_file(self, file_id: uuid.UUID, data: bytes) -> None:
        """Write file with atomic guarantees"""
        uuid_bytes = (c_uint8 * 16).from_buffer_copy(file_id.bytes)
        # from_buffer_copy copies the payload into a C-compatible array
        data_array = (c_uint8 * len(data)).from_buffer_copy(data)
        result = self._bsfs.lib.bsfs_write_file(
            ctypes.byref(self._tenant),
            uuid_bytes,
            data_array,
            len(data)
        )
        if result != 0:
            raise BSFSError(f"Write operation failed: {result}")

    def read_file(self, file_id: uuid.UUID) -> bytes:
        """Read file by UUID"""
        uuid_bytes = (c_uint8 * 16).from_buffer_copy(file_id.bytes)
        data_ptr = POINTER(c_uint8)()
        size = c_size_t()
        result = self._bsfs.lib.bsfs_read_file(
            ctypes.byref(self._tenant),
            uuid_bytes,
            ctypes.byref(data_ptr),
            ctypes.byref(size)
        )
        if result != 0:
            raise BSFSError(f"Read operation failed: {result}")
        # Copy the C buffer into a Python bytes object
        data = ctypes.string_at(data_ptr, size.value)
        # Assumes the library allocated the buffer with malloc and that
        # free() resolves through the loaded library's libc dependency.
        self._bsfs.lib.free(data_ptr)
        return data
# Usage example demonstrating high-level interface
def secure_document_storage():
    storage_path = Path("./secure_docs.blob")
    with BSFSStorage(storage_path) as storage:
        # Store sensitive document
        doc_id = uuid.uuid4()
        sensitive_data = b"Classified research data with cryptographic protection"
        storage.write_file(doc_id, sensitive_data)
        # Retrieve and verify
        retrieved = storage.read_file(doc_id)
        assert retrieved == sensitive_data
    return doc_id
# Name-based document registry built on BSFSStorage.
# The name-to-UUID mapping is held in memory only and is lost when the
# process exits; BSFSError from the storage layer propagates to the caller.
class SecureDocumentManager:
    def __init__(self, storage_path: Path, encryption_key: bytes):
        self.storage_path = storage_path
        self.encryption_key = encryption_key
        self._file_registry: Dict[str, uuid.UUID] = {}

    def store_document(self, name: str, content: bytes) -> uuid.UUID:
        """Store document with name mapping"""
        doc_id = uuid.uuid4()
        with BSFSStorage(self.storage_path, self.encryption_key) as storage:
            storage.write_file(doc_id, content)
        # Register the name only after the write has succeeded
        self._file_registry[name] = doc_id
        return doc_id

    def retrieve_document(self, name: str) -> bytes:
        """Retrieve document by name"""
        if name not in self._file_registry:
            raise KeyError(f"Document '{name}' not found")
        doc_id = self._file_registry[name]
        with BSFSStorage(self.storage_path, self.encryption_key) as storage:
            return storage.read_file(doc_id)
This Python integration leaves the cryptographic work in the underlying C implementation while providing idiomatic Python interfaces. The context manager ensures the tenant is cleaned up even when an exception is raised. The wrapper is not zero-copy, however: data is copied into a ctypes array on write and copied back into a Python bytes object on read, so very large files pay for an extra copy in each direction.
Advanced Features: Block-level deduplication, streaming I/O, and background garbage collection represent natural evolution paths.
Distributed Architecture: Multi-node clustering with replication would enable fault-tolerant deployments.
Assessment
BSFS represents a thoughtful approach to secure storage design. Rather than retrofitting security onto existing architectures, it builds cryptographic principles into the fundamental data structures and algorithms.
The codebase is available at github.com/ryo-suwito/bsfs for examination and contribution.