
Posted on • Originally published at qiita.com

[Side A] Building a RAM Disk on Windows without Admin Privileges — Python I/O Acceleration Techniques

From the Author:
D-MemFS was featured in Python Weekly Issue #737 (March 19, 2026) under Interesting Projects, Tools and Libraries. Being picked up by one of the most widely-read Python newsletters confirmed that in-memory I/O bottlenecks and memory management are truly universal challenges for developers everywhere. This series is my response to that interest.

🧭 About this Series: The Two Sides of Development

To provide a complete picture of this project, I’ve split each update into two perspectives:

  • Side A (Practical / from Qiita): Implementation details, benchmarks, and technical solutions.
  • Side B (Philosophy / from Zenn): The development war stories, AI-collaboration, and design decisions.

Introduction

When you want to use a RAM disk on Windows, tools like ImDisk Toolkit or OSFMount are usually the first candidates that come to mind. However, they share a common restriction: they require administrative privileges or a paid license. As a result, you often can't use them in CI environments or on shared development machines.

But if you slightly change your perspective, another approach emerges.

Isn't the memory of the Python process itself a RAM disk from the perspective of the Python code?

In this article, I will introduce how to leverage D-MemFS as a "Python-exclusive software RAM disk" that requires neither drivers nor administrative privileges.

The Conclusion First: Comparison of Methods

Let's first summarize "which method you should choose."

| Comparison Item | tempfile on SSD | RAM Disk (ImDisk, etc.) | D-MemFS |
| --- | --- | --- | --- |
| Admin Privileges | Not Required | Required | Not Required |
| External Tools | Not Required | Required | Not Required (pip only) |
| Volatile (Auto-delete) | △ Manual Delete | △ Lost on Reboot | ✅ Auto-collected by GC |
| Cross-Platform | ✅ | ❌ Windows Only | ✅ Win/Mac/Linux |
| Access from Other Processes | ✅ | ✅ | ❌ Python Internal Only |
| Memory Limit Management | ❌ | △ Fixed size at creation | ✅ Hard Quotas + Memory Guard |
| Sequential I/O | ~1.9 GB/s | ~2.0 GB/s | ~1.9 GB/s |
| Random Access I/O | ~1.4 GB/s | ~1.3 GB/s | ~1.4 GB/s |
| Usage in CI Environments | ✅ | ❌ Permission Wall | ✅ |

Conclusion: If it's pure Python processing without the need to call external commands, D-MemFS is the easiest and fastest choice.

💡 v0.3.0 New Feature: Memory Guard
While Hard Quotas manage the budget within the virtual FS, the Memory Guard introduced in v0.3.0 checks the host machine's remaining physical memory and rejects a write up front if there isn't enough. Even a 4 GiB quota is meaningless if the machine only has 2 GiB free; Memory Guard prevents exactly this situation. Details will be covered in Side A - Part 3: Design Philosophy of Hard Quotas.
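To make the idea concrete, here is a minimal sketch of what a memory-guard check conceptually does. This is not D-MemFS's actual implementation: the available memory is injected as a parameter for illustration (a real guard would query the OS), and the function name and safety margin are my own.

```python
# Illustrative sketch only, NOT D-MemFS's actual code: the idea of a
# "memory guard" is to reject a write before it lands if the host's
# available physical memory would drop below a safety margin.

def guarded_write(buffer: bytearray, data: bytes, available_bytes: int,
                  safety_margin: int = 256 * 1024 * 1024) -> None:
    """Reject the write if it would push the host below the safety margin."""
    if len(data) > available_bytes - safety_margin:
        raise MemoryError(
            f"write of {len(data)} bytes would exceed available memory"
        )
    buffer.extend(data)

buf = bytearray()
guarded_write(buf, b"x" * 1024, available_bytes=1 * 1024 ** 3)  # fits
try:
    # Only ~100 MiB free: below the margin, so the write is rejected
    guarded_write(buf, b"x" * 1024, available_bytes=100 * 1024 * 1024)
except MemoryError:
    print("rejected")  # → rejected
```

The key property is that the check happens before any allocation, so a failing write leaves the file system unchanged.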

Premise: Why Do We Want a RAM Disk?

The primary uses for a RAM disk are twofold:

  1. High-speed temporary file processing — To speed up processing by eliminating disk I/O latency.
  2. Avoiding writes to disk — To protect SSD write endurance, and to avoid leaving sensitive data behind.

Both of these are very common requirements in Python testing, CI, and data processing pipelines.

D-MemFS: Python-Exclusive Software RAM Disk

D-MemFS (dmemfs) is a pure Python in-memory file system library.

pip install D-MemFS
  • Standard library only (Zero external dependencies)
  • No administrative privileges required
  • Runs on Windows / macOS / Linux
  • Automatically deleted when the process terminates (volatile)

Replacing tempfile with D-MemFS

The most typical pattern is replacing tempfile.TemporaryDirectory().

Before Replacement (Writes to disk occur)

import tempfile
import os

def process_data(raw: bytes) -> bytes:
    with tempfile.TemporaryDirectory() as tmpdir:
        input_path = os.path.join(tmpdir, "input.bin")
        output_path = os.path.join(tmpdir, "output.bin")

        with open(input_path, "wb") as f:
            f.write(raw)

        # Some processing (an example assuming calling an external command)
        with open(input_path, "rb") as f:
            data = f.read()
        result = bytes(b ^ 0xFF for b in data)  # Dummy conversion

        with open(output_path, "wb") as f:
            f.write(result)

        with open(output_path, "rb") as f:
            return f.read()

After Replacement (Completely contained in memory)

from dmemfs import MemoryFileSystem

def process_data(raw: bytes) -> bytes:
    mfs = MemoryFileSystem()
    mfs.mkdir("/tmp")

    with mfs.open("/tmp/input.bin", "wb") as f:
        f.write(raw)

    with mfs.open("/tmp/input.bin", "rb") as f:
        data = f.read()

    result = bytes(b ^ 0xFF for b in data)

    with mfs.open("/tmp/output.bin", "wb") as f:
        f.write(result)

    with mfs.open("/tmp/output.bin", "rb") as f:
        return f.read()

Disk access drops to zero. When mfs goes out of scope, the GC collects it, so no explicit clean-up is necessary.

Automatic Release via GC

MemoryFileSystem is collected by the GC when it exits the scope. You can use it similarly to TemporaryDirectory.

from dmemfs import MemoryFileSystem

def process():
    mfs = MemoryFileSystem(max_quota=64 * 1024 * 1024)
    mfs.mkdir("/work")

    with mfs.open("/work/data.csv", "wb") as f:
        f.write(b"id,value\n1,100\n2,200\n")

    with mfs.open("/work/data.csv", "rb") as f:
        print(f.read().decode())

process()
# When exiting the function, mfs is collected by GC, and all contents are wiped.

Benchmarks

The following are benchmark results comparing D-MemFS, tempfile on a RAM Disk, and tempfile on an SSD (repeat=5, warmup=1).

Measurement Environment: Windows, X:\TEMP (RAM Disk) / C:\TempX (SSD)
Throughput values are calculated from 512 MiB of write + read.

Mass read/write of small files (300 files × 4 KiB)

| Method | Time (ms) |
| --- | --- |
| io.BytesIO | 6 |
| D-MemFS | 51 |
| tempfile (on RAM Disk) | 207 |
| tempfile (on SSD) | 267 |

Stream read/write (16 MiB, 64 KiB chunks)

| Method | Time (ms) |
| --- | --- |
| tempfile (on RAM Disk) | 20 |
| tempfile (on SSD) | 21 |
| io.BytesIO | 62 |
| D-MemFS | 81 |

Random Access (16 MiB)

| Method | Time (ms) |
| --- | --- |
| D-MemFS | 34 |
| tempfile (on SSD) | 35 |
| tempfile (on RAM Disk) | 37 |
| io.BytesIO | 82 |

Large Capacity Stream read/write (512 MiB, 1 MiB chunks)

| Method | Time (ms) |
| --- | --- |
| D-MemFS | 529 |
| tempfile (on RAM Disk) | 514 |
| tempfile (on SSD) | 541 |
| io.BytesIO | 2,258 |

Random reading of numerous files (10,000 files × 4 KiB)

| Method | Time (ms) |
| --- | --- |
| D-MemFS | 1,280 |
| tempfile (on RAM Disk) | 6,310 |
| tempfile (on SSD) | 8,601 |

Key Points:

  • For many small files, D-MemFS is overwhelmingly faster. Because file open/close overhead is practically zero, it is about 4 times faster with 300 files and over 5 times faster for random reads of 10,000 files.
  • For large streams (512 MiB+, large chunks), D-MemFS and tempfile are equivalent: memory bandwidth becomes the bottleneck, so the storage location makes little difference.
  • For small-to-medium streams around 16 MiB, tempfile is faster. D-MemFS's per-chunk overhead is relatively higher, so the smaller the chunk size, the wider the gap.
  • io.BytesIO is the fastest for simple single-buffer usage (it wins the small-file test by a wide margin), but lacks file management features (paths, directories, quotas).
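If you want to reproduce this kind of measurement on your own machine, the small-files scenario can be sketched with nothing but the standard library. This is a simplified harness, not the article's actual benchmark script (D-MemFS is omitted so the snippet has zero dependencies), and absolute numbers will differ per machine.

```python
import io
import os
import tempfile
import time

N_FILES, SIZE = 300, 4 * 1024
PAYLOAD = os.urandom(SIZE)

def bench(fn, repeat=3):
    """Return the best wall-clock time (ms) over several runs."""
    times = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        times.append((time.perf_counter() - start) * 1000)
    return min(times)

def many_small_tempfiles():
    # Write and read back 300 small files through the real file system
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(N_FILES):
            path = os.path.join(tmp, f"{i}.bin")
            with open(path, "wb") as f:
                f.write(PAYLOAD)
            with open(path, "rb") as f:
                f.read()

def many_small_bytesio():
    # Same workload kept entirely in memory
    store = {}
    for i in range(N_FILES):
        store[i] = io.BytesIO(PAYLOAD)
    for buf in store.values():
        buf.read()

print(f"tempfile:   {bench(many_small_tempfiles):8.1f} ms")
print(f"io.BytesIO: {bench(many_small_bytesio):8.1f} ms")
```

Taking the minimum over several runs (rather than the mean) is the usual way to reduce noise from OS caching and scheduling in micro-benchmarks.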

Using in Asynchronous Code (AsyncMemoryFileSystem)

For asyncio-based code, use AsyncMemoryFileSystem.

import asyncio
from dmemfs import AsyncMemoryFileSystem

async def main():
    mfs = AsyncMemoryFileSystem(max_quota=32 * 1024 * 1024)
    await mfs.mkdir("/data")

    # Asynchronous write
    async with await mfs.open("/data/result.bin", "wb") as f:
        await f.write(b"async result data\n")

    # Asynchronous read
    async with await mfs.open("/data/result.bin", "rb") as f:
        content = await f.read()
        print(content)

asyncio.run(main())

Internally, it incorporates exclusive control based on asyncio.Lock, making it safe to access simultaneously from multiple coroutines.
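The exclusion pattern itself can be sketched generically. The toy class below is not the dmemfs API; it just shows the asyncio.Lock shape described above: every mutation of the shared state is serialized through a single lock, so concurrent coroutines can't interleave mid-write.

```python
import asyncio

class LockedStore:
    """Toy in-memory store: a single asyncio.Lock serializes all
    mutations, the same pattern described for AsyncMemoryFileSystem."""

    def __init__(self):
        self._files: dict[str, bytes] = {}
        self._lock = asyncio.Lock()

    async def write(self, path: str, data: bytes) -> None:
        async with self._lock:  # only one coroutine mutates at a time
            self._files[path] = self._files.get(path, b"") + data

    async def read(self, path: str) -> bytes:
        async with self._lock:
            return self._files[path]

async def main():
    store = LockedStore()
    # 10 coroutines append concurrently; the lock keeps each append atomic.
    await asyncio.gather(*[store.write("/log", b"x") for _ in range(10)])
    print(len(await store.read("/log")))  # → 10

asyncio.run(main())
```

Without the lock, a read-modify-write like the dictionary append above could lose updates if a coroutine yields between the read and the write.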

Example: Parallel Downloads and Immediate Processing

import asyncio
import aiohttp
from dmemfs import AsyncMemoryFileSystem

async def fetch_and_process(session, url: str, mfs, path: str):
    async with session.get(url) as resp:
        data = await resp.read()
    async with await mfs.open(path, "wb") as f:
        await f.write(data)

async def main():
    urls = [
        ("https://example.com/a.json", "/cache/a.json"),
        ("https://example.com/b.json", "/cache/b.json"),
        ("https://example.com/c.json", "/cache/c.json"),
    ]

    mfs = AsyncMemoryFileSystem(max_quota=64 * 1024 * 1024)
    await mfs.mkdir("/cache")

    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[
            fetch_and_process(session, url, mfs, path)
            for url, path in urls
        ])

    # Process all files while they are readily available in memory
    for _, path in urls:
        async with await mfs.open(path, "rb") as f:
            data = await f.read()
            print(f"{path}: {len(data)} bytes")

asyncio.run(main())

Limitations

While D-MemFS is extremely useful as a software RAM disk, it has fundamental limitations compared to a real RAM disk.

| Limitation | Description |
| --- | --- |
| Restricted to the Python Process | It cannot be accessed from other processes, so it cannot be used with external commands or subprocess. |
| Volatile | All contents disappear when the process ends (nothing persists unless explicitly written out). |
| Cannot be Mounted | It cannot be exposed to the OS as a drive letter or mount point. |
| Not Compatible with os.PathLike (Intentional) | A design decision to prevent confusing virtual paths with host paths. |
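The first two limitations imply a hybrid pattern when you genuinely need an external command: spill the in-memory bytes to a real temporary file just for that step, then delete it. Here is a stdlib-only sketch of that pattern; nothing in it is a dmemfs API, and the child process is a Python one-liner standing in for your real external tool.

```python
import os
import subprocess
import sys
import tempfile

def run_external_on_bytes(data: bytes) -> bytes:
    """Spill in-memory data to a real temp file only for the external
    step, since a virtual path means nothing to another process."""
    # delete=False so the child process can also open it on Windows
    with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
        tmp.write(data)
        host_path = tmp.name
    try:
        # Example child process: print the file's size
        out = subprocess.run(
            [sys.executable, "-c",
             "import sys, os; print(os.path.getsize(sys.argv[1]))",
             host_path],
            capture_output=True, check=True,
        )
        return out.stdout.strip()
    finally:
        os.unlink(host_path)  # remove the spill file immediately

print(run_external_on_bytes(b"hello"))  # → b'5'
```

The disk write is confined to the one step that requires it, and the file is unlinked as soon as the external command returns.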

Use Cases

| Use Case | Description |
| --- | --- |
| Testing | File-system-heavy tests with pytest, as a replacement for the tmp_path fixture. |
| CI | Speeding up builds on GitHub Actions or Azure Pipelines by reducing disk I/O. |
| Data Processing Pipelines | Handling all intermediate files entirely in memory. |
| Archive Extraction/Processing | Extracting zip/tar archives into memory and processing the contents without writing to disk. |
| Dealing with Sensitive Data | Temporary processing where you don't want to leave any footprints on disk. |
| Mocking without External Dependencies | Intercepting file system access during testing. |
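As a concrete sketch of the last row, one common shape for intercepting file access is dependency injection: production code accepts an open-like callable, so tests can swap in an in-memory stand-in. The names below are hypothetical, not a D-MemFS API; io.BytesIO plays the role of the injected file.

```python
import io
from typing import Callable

def count_lines(path: str, opener: Callable = open) -> int:
    """Production code takes an open-like callable so tests can
    inject an in-memory implementation (hypothetical pattern)."""
    with opener(path, "rb") as f:
        return sum(1 for _ in f)

# In a test, intercept file access with an in-memory stand-in:
fake_files = {"/data/log.txt": b"a\nb\nc\n"}

def fake_open(path, mode="rb"):
    # BytesIO supports the context-manager and iteration protocols,
    # so it drops in for a real binary file object
    return io.BytesIO(fake_files[path])

print(count_lines("/data/log.txt", opener=fake_open))  # → 3
```

The same injection point accepts mfs.open from a MemoryFileSystem instance, which is what makes D-MemFS usable as a test double.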

Extracting Archives into Memory

This is an example of extracting a zip file without writing it to disk and processing the internal files.

import io
import zipfile
from dmemfs import MemoryFileSystem

def process_zip(zip_bytes: bytes) -> dict[str, int]:
    """Extract contents of the zip to MFS in memory, returning the size of each file."""
    mfs = MemoryFileSystem(max_quota=256 * 1024 * 1024)
    mfs.mkdir("/extracted")

    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            data = zf.read(name)
            with mfs.open(f"/extracted/{name}", "wb") as f:
                f.write(data)

    # Process files on MFS (here returning a list of sizes)
    result = {}
    for entry in mfs.listdir("/extracted"):
        path = f"/extracted/{entry}"
        result[entry] = mfs.stat(path)["size"]
    return result

You can do similar things with tar.gz.

import io
import tarfile
from dmemfs import MemoryFileSystem

def process_targz(targz_bytes: bytes) -> list[str]:
    """Extract tar.gz into MFS in memory and return the file list."""
    mfs = MemoryFileSystem(max_quota=512 * 1024 * 1024)
    mfs.mkdir("/extracted")

    with tarfile.open(fileobj=io.BytesIO(targz_bytes), mode="r:gz") as tf:
        for member in tf.getmembers():
            if member.isfile():
                f = tf.extractfile(member)
                if f is not None:
                    # Reproduce directory hierarchy
                    parts = member.name.split("/")
                    for depth in range(1, len(parts)):
                        d = "/extracted/" + "/".join(parts[:depth])
                        try:
                            mfs.mkdir(d)
                        except FileExistsError:
                            pass
                    with mfs.open(f"/extracted/{member.name}", "wb") as out:
                        out.write(f.read())

    return mfs.listdir("/extracted")

The key point is that nothing is written to disk. Compared to the traditional pattern of extracting downloaded archives to a temporary directory before processing, you don't wear down your SSD's write endurance.

Using as a pytest Fixture

import pytest
from dmemfs import MemoryFileSystem

@pytest.fixture
def mfs():
    """Provides an independent in-memory FS for each test."""
    return MemoryFileSystem(max_quota=32 * 1024 * 1024)

def test_write_and_read(mfs):
    mfs.mkdir("/work")
    with mfs.open("/work/test.txt", "wb") as f:
        f.write(b"test data")
    with mfs.open("/work/test.txt", "rb") as f:
        assert f.read() == b"test data"

def test_quota_is_isolated(mfs):
    """Each test has an independent quota."""
    mfs.mkdir("/data")
    with mfs.open("/data/file.bin", "wb") as f:
        f.write(b"x" * 1024)
    st = mfs.stat("/data/file.bin")
    assert st["size"] == 1024

Conclusion

The problem of needing administrative privileges for RAM disks on Windows can be completely avoided within Python code by using D-MemFS. Just by running pip install D-MemFS and replacing TemporaryDirectory, you can gain a dramatic speed improvement over using tempfile on an SSD.

For pure Python processing that doesn't rely on external commands, D-MemFS is the easiest and fastest choice.


🔗 Links & Resources

If you find this project interesting, a ⭐ on GitHub would be the best way to support my work!
