From the Author:
D-MemFS was featured in Python Weekly Issue #737 (March 19, 2026) under Interesting Projects, Tools and Libraries. Being picked up by one of the most widely-read Python newsletters confirmed that in-memory I/O bottlenecks and memory management are truly universal challenges for developers everywhere. This series is my response to that interest.
🧭 About this Series: The Two Sides of Development
To provide a complete picture of this project, I’ve split each update into two perspectives:
- Side A (Practical / from Qiita): Implementation details, benchmarks, and technical solutions.
- Side B (Philosophy / from Zenn): The development war stories, AI-collaboration, and design decisions.
Introduction
When you want to use a RAM disk on Windows, tools like ImDisk Toolkit or OSFMount are usually the first candidates that come to mind. However, they share a common restriction: they require administrative privileges or a paid license, so you often can't use them in CI environments or on shared development machines.
But if you slightly change your perspective, another approach emerges.
Isn't the memory of the Python process itself a RAM disk from the perspective of the Python code?
In this article, I will introduce how to leverage D-MemFS as a "Python-exclusive software RAM disk" that requires neither drivers nor administrative privileges.
The Conclusion First: Comparison of Methods
Let's first summarize "which method you should choose."
| Comparison Item | `tempfile` on SSD | RAM Disk (ImDisk, etc.) | D-MemFS |
|---|---|---|---|
| Admin Privileges | Not Required | Required | Not Required |
| External Tools | Not Required | Required | Not Required (pip only) |
| Volatile (Auto-delete) | △ Manual Delete | △ Lost on Reboot | ✅ Auto-collected by GC |
| Cross-Platform | ✅ | ❌ Windows Only | ✅ Win/Mac/Linux |
| Access from Other Processes | ✅ | ✅ | ❌ Python Internal Only |
| Memory Limit Management | ❌ | ❌ | ✅ Hard Quotas + Memory Guard |
| Sequential I/O | ~1.9 GB/s | ~2.0 GB/s | ~1.9 GB/s |
| Random Access I/O | ~1.4 GB/s | ~1.3 GB/s | ~1.4 GB/s |
| Usage in CI Environments | ✅ | ❌ Permission Wall | ✅ |
Conclusion: If it's pure Python processing without the need to call external commands, D-MemFS is the easiest and fastest choice.
💡 v0.3.0 New Feature: Memory Guard
While Hard Quotas manage the budget within the virtual FS, the Memory Guard introduced in v0.3.0 checks the host machine's remaining physical memory and rejects a write up front when there isn't enough. Even a 4 GiB quota is meaningless if the machine only has 2 GiB free; Memory Guard prevents exactly this situation. Details will be covered in Side A - Part 3: Design Philosophy of Hard Quotas.
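The two-layer check can be sketched as pure logic. This is a hypothetical illustration of the idea, not dmemfs's actual implementation; the function name `guarded_write`, its parameters, and the exception types are invented for the example:

```python
def guarded_write(size: int, quota_used: int, quota_max: int,
                  free_physical: int, reserve: int = 256 * 1024 * 1024) -> None:
    """Reject a write that would bust the quota OR starve the host.

    Hypothetical sketch: the quota is the budget inside the virtual FS,
    while the guard compares the request against the host's free RAM
    (minus a safety reserve).
    """
    if quota_used + size > quota_max:
        raise OSError("quota exceeded")           # Hard Quota layer
    if size > free_physical - reserve:
        raise MemoryError("host memory too low")  # Memory Guard layer

# A 4 GiB quota doesn't help when only 2 GiB of RAM is actually free:
guarded_write(1024, 0, 4 * 2**30, free_physical=2 * 2**30)  # small write: OK
```

A 3 GiB write under the same conditions would pass the quota check but trip the guard.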
Premise: Why Do We Want a RAM Disk?
The primary uses for a RAM disk are twofold:
- High-speed temporary file processing — To speed up processing by eliminating disk I/O latency.
- Avoiding writes to disk — To protect SSD write endurance, and to avoid leaving sensitive data behind.
Both of these are very common requirements in Python testing, CI, and data processing pipelines.
D-MemFS: Python-Exclusive Software RAM Disk
D-MemFS (dmemfs) is a pure Python in-memory file system library.
```bash
pip install D-MemFS
```
- Standard library only (Zero external dependencies)
- No administrative privileges required
- Runs on Windows / macOS / Linux
- Automatically deleted when the process terminates (volatile)
Replacing tempfile with D-MemFS
The most typical pattern is replacing tempfile.TemporaryDirectory().
Before Replacement (Writes to disk occur)
```python
import tempfile
import os

def process_data(raw: bytes) -> bytes:
    with tempfile.TemporaryDirectory() as tmpdir:
        input_path = os.path.join(tmpdir, "input.bin")
        output_path = os.path.join(tmpdir, "output.bin")

        with open(input_path, "wb") as f:
            f.write(raw)

        # Some processing (example assumes calling an external command)
        with open(input_path, "rb") as f:
            data = f.read()
        result = bytes(b ^ 0xFF for b in data)  # Dummy transformation

        with open(output_path, "wb") as f:
            f.write(result)

        with open(output_path, "rb") as f:
            return f.read()
```
After Replacement (Completely contained in memory)
```python
from dmemfs import MemoryFileSystem

def process_data(raw: bytes) -> bytes:
    mfs = MemoryFileSystem()
    mfs.mkdir("/tmp")

    with mfs.open("/tmp/input.bin", "wb") as f:
        f.write(raw)

    with mfs.open("/tmp/input.bin", "rb") as f:
        data = f.read()
    result = bytes(b ^ 0xFF for b in data)  # Dummy transformation

    with mfs.open("/tmp/output.bin", "wb") as f:
        f.write(result)

    with mfs.open("/tmp/output.bin", "rb") as f:
        return f.read()
```
Disk access drops to zero. When `mfs` goes out of scope it is garbage-collected, so no explicit clean-up is needed.
Automatic Release via GC
MemoryFileSystem is collected by the GC when it exits the scope. You can use it similarly to TemporaryDirectory.
```python
from dmemfs import MemoryFileSystem

def process():
    mfs = MemoryFileSystem(max_quota=64 * 1024 * 1024)  # 64 MiB budget
    mfs.mkdir("/work")
    with mfs.open("/work/data.csv", "wb") as f:
        f.write(b"id,value\n1,100\n2,200\n")
    with mfs.open("/work/data.csv", "rb") as f:
        print(f.read().decode())

process()
# When the function returns, mfs is collected by the GC and all contents are wiped.
```
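The scope-based lifecycle can be observed with stdlib tools alone. Here a plain container stands in for `MemoryFileSystem` (the `Store` class is invented for this illustration); a `weakref` confirms that the object, and everything it holds, is gone once the function returns:

```python
import gc
import weakref

class Store:
    """Stand-in for an in-memory FS: just holds bytes in a dict."""
    def __init__(self):
        self.files = {}

def process():
    store = Store()
    store.files["/work/data.csv"] = b"id,value\n1,100\n"
    return weakref.ref(store)  # a weak reference does not keep store alive

ref = process()
gc.collect()  # on CPython the refcount already hit zero at scope exit
print(ref() is None)  # True: the store and its contents are gone
```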
Benchmarks
The following are benchmark results comparing D-MemFS, tempfile on a RAM Disk, and tempfile on an SSD (repeat=5, warmup=1).
Measurement Environment: Windows, X:\TEMP (RAM Disk) / C:\TempX (SSD)
Throughput values are calculated from 512 MiB of write + read.
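As a reference point, a minimal version of that write + read methodology can be reproduced with `io.BytesIO` and `time.perf_counter`. This sketch uses a smaller 64 MiB payload so it runs quickly, and it assumes throughput counts bytes written plus bytes read (hence the factor of 2):

```python
import io
import time

def measure_throughput(total_bytes: int, chunk: int = 1024 * 1024) -> float:
    """Write then read total_bytes through io.BytesIO; return GB/s."""
    payload = b"\x00" * chunk
    buf = io.BytesIO()
    start = time.perf_counter()
    for _ in range(total_bytes // chunk):   # sequential write
        buf.write(payload)
    buf.seek(0)
    while buf.read(chunk):                  # sequential read back
        pass
    elapsed = time.perf_counter() - start
    return (2 * total_bytes) / elapsed / 1e9  # bytes moved per second

print(f"{measure_throughput(64 * 1024 * 1024):.2f} GB/s")
```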
Mass read/write of small files (300 files × 4 KiB)
| Method | Time (ms) |
|---|---|
| `io.BytesIO` | 6 |
| D-MemFS | 51 |
| tempfile (on RAM Disk) | 207 |
| tempfile (on SSD) | 267 |
Stream read/write (16 MiB, 64 KiB chunks)
| Method | Time (ms) |
|---|---|
| tempfile (on RAM Disk) | 20 |
| tempfile (on SSD) | 21 |
| `io.BytesIO` | 62 |
| D-MemFS | 81 |
Random Access (16 MiB)
| Method | Time (ms) |
|---|---|
| D-MemFS | 34 |
| tempfile (on SSD) | 35 |
| tempfile (on RAM Disk) | 37 |
| `io.BytesIO` | 82 |
Large Capacity Stream read/write (512 MiB, 1 MiB chunks)
| Method | Time (ms) |
|---|---|
| D-MemFS | 529 |
| tempfile (on RAM Disk) | 514 |
| tempfile (on SSD) | 541 |
| `io.BytesIO` | 2,258 |
Random reading of numerous files (10,000 files × 4 KiB)
| Method | Time (ms) |
|---|---|
| D-MemFS | 1,280 |
| tempfile (on RAM Disk) | 6,310 |
| tempfile (on SSD) | 8,601 |
Key Points:
- For many small files, D-MemFS is overwhelmingly faster. Because file open/close overhead is practically zero, it is about 4× faster with 300 files and over 5× faster for random reads of 10,000 files.
- For large streams (512 MiB+, large chunks), D-MemFS and tempfile are equivalent: memory bandwidth becomes the bottleneck, so the storage location barely matters.
- For small-to-medium streams around 16 MiB, tempfile is faster. D-MemFS's per-chunk overhead is relatively higher, so the smaller the chunk, the wider the gap.
- `io.BytesIO` is the fastest for single-stream usage, but it lacks file-management features (paths, directories, quotas).
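For completeness, the single-stream `io.BytesIO` baseline looks like this; note there are no paths or directories, just one seekable buffer:

```python
import io

# Single-stream scratch space: fastest, but no paths, directories, or quotas.
buf = io.BytesIO()
buf.write(b"header\n")
buf.write(b"row1\n")
buf.seek(0)              # rewind before reading back
print(buf.getvalue())    # b'header\nrow1\n'
print(buf.read(7))       # b'header\n'
```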
Using in Asynchronous Code (AsyncMemoryFileSystem)
For asyncio-based code, use AsyncMemoryFileSystem.
```python
import asyncio
from dmemfs import AsyncMemoryFileSystem

async def main():
    mfs = AsyncMemoryFileSystem(max_quota=32 * 1024 * 1024)
    await mfs.mkdir("/data")

    # Asynchronous write
    async with await mfs.open("/data/result.bin", "wb") as f:
        await f.write(b"async result data\n")

    # Asynchronous read
    async with await mfs.open("/data/result.bin", "rb") as f:
        content = await f.read()
        print(content)

asyncio.run(main())
```
Internally, it serializes access with an `asyncio.Lock`, so it is safe to use from multiple coroutines concurrently.
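The pattern itself is plain stdlib asyncio. This sketch (not dmemfs's internals; the `SharedBuffer` class is invented for the illustration) shows how a lock keeps concurrent coroutine mutations consistent:

```python
import asyncio

class SharedBuffer:
    """Minimal illustration of asyncio.Lock-serialized access."""
    def __init__(self):
        self._lock = asyncio.Lock()
        self._chunks: list[bytes] = []

    async def append(self, data: bytes) -> None:
        async with self._lock:  # only one coroutine mutates at a time
            self._chunks.append(data)

async def main() -> int:
    buf = SharedBuffer()
    # 100 coroutines write concurrently; the lock serializes the appends.
    await asyncio.gather(*[buf.append(b"x") for _ in range(100)])
    return len(buf._chunks)

print(asyncio.run(main()))  # 100
```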
Example: Parallel Downloads and Immediate Processing
```python
import asyncio
import aiohttp
from dmemfs import AsyncMemoryFileSystem

async def fetch_and_process(session, url: str, mfs, path: str):
    async with session.get(url) as resp:
        data = await resp.read()
        async with await mfs.open(path, "wb") as f:
            await f.write(data)

async def main():
    urls = [
        ("https://example.com/a.json", "/cache/a.json"),
        ("https://example.com/b.json", "/cache/b.json"),
        ("https://example.com/c.json", "/cache/c.json"),
    ]
    mfs = AsyncMemoryFileSystem(max_quota=64 * 1024 * 1024)
    await mfs.mkdir("/cache")

    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[
            fetch_and_process(session, url, mfs, path)
            for url, path in urls
        ])

    # Process all files while they sit ready in memory
    for _, path in urls:
        async with await mfs.open(path, "rb") as f:
            data = await f.read()
            print(f"{path}: {len(data)} bytes")

asyncio.run(main())
```
Limitations
While D-MemFS is extremely useful as a software RAM disk, it has fundamental limitations compared to a real RAM disk.
| Limitation | Description |
|---|---|
| Restricted to the Python Process | It cannot be accessed from other processes, so it cannot be used with external commands or `subprocess`. |
| Volatile | When the process ends, all contents disappear (it does not persist until explicitly written out). |
| Cannot be Mounted | It cannot be exposed to the OS as a drive letter or mount point. |
| Not compatible with `os.PathLike` (Intentional) | A design decision to prevent confusing virtual paths with host paths. |
Use Cases
| Use Case | Description |
|---|---|
| Testing | For tests involving the file system using pytest. As a replacement for the tmp_path fixture. |
| CI | Speeding up builds on GitHub Actions or Azure Pipelines by reducing disk I/O. |
| Data Processing Pipelines | Handling all intermediate files entirely in memory. |
| Archive Extraction/Processing | Extracting zip/tar archives into memory and processing the internal files without writing to disk. |
| Dealing with Sensitive Data | Temporary processing where you don't want to leave any footprints on disk. |
| Mocking without External Dependencies | Intercepting file system access during testing. |
Extracting Archives into Memory
This is an example of extracting a zip file without writing it to disk and processing the internal files.
```python
import io
import zipfile
from dmemfs import MemoryFileSystem

def process_zip(zip_bytes: bytes) -> dict[str, int]:
    """Extract the zip contents into the in-memory FS and return each file's size."""
    mfs = MemoryFileSystem(max_quota=256 * 1024 * 1024)
    mfs.mkdir("/extracted")

    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            data = zf.read(name)
            with mfs.open(f"/extracted/{name}", "wb") as f:
                f.write(data)

    # Process the files on the MFS (here: collect each file's size)
    result = {}
    for entry in mfs.listdir("/extracted"):
        path = f"/extracted/{entry}"
        result[entry] = mfs.stat(path)["size"]
    return result
```
You can do similar things with tar.gz.
```python
import io
import tarfile
from dmemfs import MemoryFileSystem

def process_targz(targz_bytes: bytes) -> list[str]:
    """Extract a tar.gz into the in-memory FS and return the file list."""
    mfs = MemoryFileSystem(max_quota=512 * 1024 * 1024)
    mfs.mkdir("/extracted")

    with tarfile.open(fileobj=io.BytesIO(targz_bytes), mode="r:gz") as tf:
        for member in tf.getmembers():
            if member.isfile():
                f = tf.extractfile(member)
                if f is not None:
                    # Reproduce the directory hierarchy
                    parts = member.name.split("/")
                    for depth in range(1, len(parts)):
                        d = "/extracted/" + "/".join(parts[:depth])
                        try:
                            mfs.mkdir(d)
                        except FileExistsError:
                            pass
                    with mfs.open(f"/extracted/{member.name}", "wb") as out:
                        out.write(f.read())
    return mfs.listdir("/extracted")
```
The key point is that nothing is written to disk. Compared with the traditional pattern of extracting a downloaded archive to a temporary directory before processing, this also spares your SSD's write endurance.
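The reverse direction, packing results back into an archive, can also stay entirely in memory using nothing but the stdlib. A minimal sketch (the helper `build_zip` is invented for this example):

```python
import io
import zipfile

def build_zip(files: dict[str, bytes]) -> bytes:
    """Pack a mapping of name -> bytes into a zip archive, entirely in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)  # no temp files involved
    return buf.getvalue()

archive = build_zip({"a.txt": b"hello", "sub/b.txt": b"world"})

# Round-trip check: the archive is readable straight from bytes.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    print(zf.namelist())     # ['a.txt', 'sub/b.txt']
    print(zf.read("a.txt"))  # b'hello'
```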
Using as a pytest Fixture
```python
import pytest
from dmemfs import MemoryFileSystem

@pytest.fixture
def mfs():
    """Provides an independent in-memory FS for each test."""
    return MemoryFileSystem(max_quota=32 * 1024 * 1024)

def test_write_and_read(mfs):
    mfs.mkdir("/work")
    with mfs.open("/work/test.txt", "wb") as f:
        f.write(b"test data")
    with mfs.open("/work/test.txt", "rb") as f:
        assert f.read() == b"test data"

def test_quota_is_isolated(mfs):
    """Each test gets an independent quota."""
    mfs.mkdir("/data")
    with mfs.open("/data/file.bin", "wb") as f:
        f.write(b"x" * 1024)
    st = mfs.stat("/data/file.bin")
    assert st["size"] == 1024
```
Conclusion
The problem of RAM disks requiring administrative privileges on Windows can be avoided entirely within Python code by using D-MemFS. Just run `pip install D-MemFS` and replace `TemporaryDirectory`, and you gain a dramatic speed-up over `tempfile` on an SSD for workloads dominated by many small files.
For pure Python processing that doesn't rely on external commands, D-MemFS is the easiest and fastest choice.
🔗 Links & Resources
- GitHub: https://github.com/nightmarewalker/D-MemFS
- Original Japanese Article: 管理者権限不要でWindowsにRAMディスクを構築する ― Python環境のI/O高速化手法
If you find this project interesting, a ⭐ on GitHub would be the best way to support my work!