Hey everyone! If you're a Go developer who's ever gotten frustrated with slow file ops, especially when dealing with big files or lots of reads and writes, meet mmapfile: my little project that's basically a drop-in replacement for the standard *os.File, but often way faster. It uses memory-mapped I/O to skip a bunch of the usual overhead.
Let's chat about what it is, why it's cool, and how you can use it.
So, What is mmapfile?
mmapfile is a Go library that acts just like *os.File, but under the hood it maps files directly into your program's memory. Instead of making a system call for every little read or write, you get straight access to the file data. It's like having the file right there in RAM without actually loading it all up front.
It works on different systems, and if your platform doesn't support memory mapping, it just reads the whole file into memory as a fallback. See Platform Support.
The end result is way faster file operations, especially for jumping around in files or reading A LOT.
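If you're curious what "mapping a file into memory" looks like without the library, here's a minimal sketch using the standard syscall package. To be clear, this is not mmapfile's code, just the underlying idea, and it only builds on Unix-like systems such as Linux and macOS. The names mapFile and mappedWord are made up for the example.

```go
// Unix-only sketch: syscall.Mmap is not available on Windows.
package main

import (
	"fmt"
	"log"
	"os"
	"syscall"
)

// mapFile maps the whole file read-only and returns the mapped bytes.
// The caller should pass the slice to syscall.Munmap when done.
func mapFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // the mapping stays valid after the descriptor is closed

	info, err := f.Stat()
	if err != nil {
		return nil, err
	}
	if info.Size() == 0 {
		return nil, fmt.Errorf("cannot map empty file %q", path)
	}
	return syscall.Mmap(int(f.Fd()), 0, int(info.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
}

// mappedWord writes a small sample file, maps it, and slices one word out.
func mappedWord() (string, error) {
	tmp, err := os.CreateTemp("", "mmap-sketch-*.txt")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("hello, mapped world"); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}

	data, err := mapFile(tmp.Name())
	if err != nil {
		return "", err
	}
	defer syscall.Munmap(data)

	// Indexing the slice reads from the mapping; there is no read call per access.
	// Converting to string copies the bytes, so the result outlives the mapping.
	return string(data[7:13]), nil
}

func main() {
	word, err := mappedWord()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(word) // prints "mapped"
}
```

Once the mapping exists, reading is just slicing. That's the overhead the library is skipping.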
Why?
I built this to be easy to swap in. Here are the highlights:
- It's Basically Just *os.File: it implements all the interfaces you know and love:
  - io.Reader, io.Writer, io.Seeker.
  - io.ReaderAt, io.WriterAt, io.Closer.
  - And more like io.ReaderFrom, io.WriterTo, io.StringWriter.

  So, you can often just replace your os.Open calls with mmapfile.Open and be done. I also write Semgrep rules to automatically detect *os.File usage and suggest mmapfile replacements. See Semgrep Rules.
- Zero-Copy: the star of the show is the Bytes() method. It gives you a direct view into the file's memory. No copying, no extra memory use, just point and read.
- Works Everywhere: Linux, macOS, Windows, even the weirder Unix variants. It handles the differences for you.
- Safe for Multiple Threads: ReadAt and WriteAt are thread-safe, so you can have goroutines hammering away without issues.
- No Memory Waste: most operations don't allocate anything on the heap (read: keeps your GC chill).
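The swap is easiest when your code depends on the io interfaces rather than on *os.File itself. Here's a rough sketch of that shape. It uses os.Open so it runs with only the standard library; the comment marks the one line where, going by the interface list above, mmapfile.Open would go. The helper names (countLines, countLinesInFile, countSampleLines) are invented for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"os"
)

// countLines depends only on io.Reader, so it does not care what kind of
// file handle it is given.
func countLines(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	n := 0
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

// countLinesInFile is the only place that knows how the file gets opened.
func countLinesInFile(path string) (int, error) {
	f, err := os.Open(path) // the swap would happen here: mmapfile.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	return countLines(f)
}

// countSampleLines writes a three-line temp file and counts its lines.
func countSampleLines() (int, error) {
	tmp, err := os.CreateTemp("", "lines-*.txt")
	if err != nil {
		return 0, err
	}
	defer os.Remove(tmp.Name())
	if _, err := io.WriteString(tmp, "alpha\nbeta\ngamma\n"); err != nil {
		tmp.Close()
		return 0, err
	}
	if err := tmp.Close(); err != nil {
		return 0, err
	}
	return countLinesInFile(tmp.Name())
}

func main() {
	n, err := countSampleLines()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("lines:", n) // prints "lines: 3"
}
```

Keeping the open call in one small function means trying a different file implementation is a one-line change.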
Getting Started
Get it with:
go get go.dw1.io/mmapfile
Quick example:
package main

import (
	"fmt"
	"log"

	"go.dw1.io/mmapfile"
)

func main() {
	f, err := mmapfile.Open("data.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// read like normal
	buf := make([]byte, 100)
	n, err := f.Read(buf)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Read %d bytes: %s\n", n, buf[:n])

	// or grab the whole thing directly
	data := f.Bytes()
	fmt.Printf("File contents: %s\n", data)
}
For creating files:
f, err := mmapfile.OpenFile("newfile.txt", os.O_RDWR|os.O_CREATE, 0644, 1024*1024) // 1MB file
The API (Nothing Fancy btw)
It supports most os.OpenFile flags, minus a couple we'll talk about later. Here's what you can do:
| Method | What it does |
|---|---|
| Read([]byte) | Read some bytes, move the cursor |
| ReadAt([]byte, int64) | Read from a spot without moving the cursor |
| Write([]byte) | Write bytes, move the cursor |
| WriteAt([]byte, int64) | Write to a spot without moving the cursor |
| Seek(int64, int) | Jump to a position |
| ReadFrom(io.Reader) | Pull data from a reader |
| WriteTo(io.Writer) | Push data to a writer |
| Close() | Shut it down |
| Sync() | Save changes to disk |
| Stat() | Get file info |
| Name() | File name |
| Len() | File size |
| Bytes() | Direct memory access |
The sweet spot is random access. Use ReadAt/WriteAt for the best speed.
See Go reference.
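To make the random-access point concrete, here's a small sketch of fixed-size record access written against io.ReaderAt and io.WriterAt. It runs on a plain *os.File so it needs nothing beyond the standard library; since the article lists both interfaces as implemented, the same helpers should accept an mmapfile handle too, though I haven't shown that here. The 8-byte record layout and the function names are assumptions for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"log"
	"os"
)

const recordSize = 8 // one little-endian uint64 per record (made-up layout)

// readRecord fetches record i with a single ReadAt. The cursor is never touched.
func readRecord(r io.ReaderAt, i int64) (uint64, error) {
	var buf [recordSize]byte
	n, err := r.ReadAt(buf[:], i*recordSize)
	if n < recordSize {
		return 0, err // ReadAt always reports an error on a short read
	}
	return binary.LittleEndian.Uint64(buf[:]), nil
}

// writeRecord overwrites record i in place.
func writeRecord(w io.WriterAt, i int64, v uint64) error {
	var buf [recordSize]byte
	binary.LittleEndian.PutUint64(buf[:], v)
	_, err := w.WriteAt(buf[:], i*recordSize)
	return err
}

// squaresRoundTrip stores i*i in ten record slots, then reads slot 7 back.
func squaresRoundTrip() (uint64, error) {
	f, err := os.CreateTemp("", "records-*.bin")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// Fix the size up front, the way a fixed-size mapping would need.
	if err := f.Truncate(10 * recordSize); err != nil {
		return 0, err
	}
	for i := int64(0); i < 10; i++ {
		if err := writeRecord(f, i, uint64(i*i)); err != nil {
			return 0, err
		}
	}
	return readRecord(f, 7)
}

func main() {
	v, err := squaresRoundTrip()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("record 7 =", v) // prints "record 7 = 49"
}
```

Because every access carries its own offset, there's no shared cursor to coordinate, which is also what makes this style friendly to goroutines.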
Performance
I ran some benchmarks against regular *os.File, and mmapfile just destroys it in most cases.
Reads
- Tiny stuff (1KB): 50x faster.
- Huge files (1GB): Still 3x faster.
- Parallel reads: 10-12x faster (no matter the size).
Writes
- Small (1KB): 51x faster.
- Medium (100KB): 6x faster.
- Big sequential (500MB+): A bit slower because the kernel's write buffering wins there.
Other Stuff
- Seek: 29x faster.
- WriteTo: 254x faster.
- Overall: About 6x faster on average.
There are no system calls for most ops, just direct memory access. Fast, simple, no surprises.
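Numbers like these depend a lot on hardware and workload, so it's worth measuring your own. Here's a rough timing sketch, not the benchmark suite behind the figures above: it times random ReadAt calls through an io.ReaderAt, run here against *os.File as the baseline. Passing an mmapfile handle to the same function would be the comparison, under the assumption that it satisfies io.ReaderAt as the article states. All names, sizes, and counts are arbitrary choices for the example.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"math/rand"
	"os"
	"time"
)

// timeRandomReads performs n reads of chunk bytes at pseudo-random offsets
// within the first size bytes of r and returns the total elapsed time.
// It expects size >= chunk.
func timeRandomReads(r io.ReaderAt, size int64, chunk, n int) (time.Duration, error) {
	rng := rand.New(rand.NewSource(1)) // fixed seed: same offsets on every run
	buf := make([]byte, chunk)
	start := time.Now()
	for i := 0; i < n; i++ {
		off := rng.Int63n(size - int64(chunk) + 1)
		if _, err := r.ReadAt(buf, off); err != nil && err != io.EOF {
			return 0, err
		}
	}
	return time.Since(start), nil
}

// measureOSFile builds a 1 MiB temp file and times 10000 reads of 1 KiB each.
func measureOSFile() (time.Duration, error) {
	f, err := os.CreateTemp("", "bench-*.bin")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	const size = 1 << 20
	if _, err := f.Write(make([]byte, size)); err != nil {
		return 0, err
	}
	return timeRandomReads(f, size, 1024, 10000)
}

func main() {
	d, err := measureOSFile()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("os.File: 10000 random 1 KiB reads took %v\n", d)
}
```

For anything you plan to publish, a proper testing.B benchmark run with go test -bench is the better tool. This is just the quick sanity check.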
When to Use It (and When Not To)
mmapfile shines in these spots:
- Big files with random access: Like databases or parsing binary files.
- Lots of reading: Configs, static data, lookup tables.
- Memory-mapped DBs: Fixed-size stuff, logs that just append.
- Shared memory between processes: Multiple programs reading the same file.
- High-frequency I/O: Thousands of small ops per second.
Skip it for:
- Streaming data: Like from networks or pipes.
- Files that grow a lot: Needs fixed size.
- Huge sequential writes: Kernel buffering beats user-space copies.
- Tiny, rare files: Setup cost isn't worth it.
Gotchas and Limits
Nothing is perfect. Watch out for:
- Fixed size: Can't grow files after opening. Set size when creating.
- No truncate: To change size, close and reopen.
- No append mode: os.O_APPEND is not supported.
- Cursor ops are slower: Stick to ReadAt/WriteAt.

Also, Bytes() gives you a slice that's only good until Close(). And if you write to that slice when the file was opened read-only, you WILL crash.
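If you need data to outlive the file handle, copy it out first. The sketch below shows the pattern with a plain byte slice standing in for the mapped view, so it runs without the library; overwriting the backing array is only a stand-in for the mapping going away (with a real mapping, touching the slice after Close() would be invalid rather than merely stale). The names keep and copyBeforeClose are made up.

```go
package main

import "fmt"

// keep returns a private copy of view that stays valid after whatever
// backs view (for example, a mapping) is gone.
func keep(view []byte) []byte {
	return append([]byte(nil), view...)
}

// copyBeforeClose shows the difference between holding a view and holding a copy.
func copyBeforeClose() (string, string) {
	backing := []byte("header:payload") // stand-in for mapped file memory
	view := backing[:6]                 // shares memory with backing
	saved := keep(view)                 // owns its own memory

	// Stand-in for the mapping going away: the shared bytes change underneath view.
	for i := range backing {
		backing[i] = 'x'
	}
	return string(view), string(saved)
}

func main() {
	view, saved := copyBeforeClose()
	fmt.Println(view)  // prints "xxxxxx"
	fmt.Println(saved) // prints "header"
}
```

The copy costs an allocation, so it's a trade: zero-copy while the file is open, one copy for anything you keep afterwards.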
Thread Safety
Built for concurrency:
- ReadAt/WriteAt: Safe to call from multiple goroutines.
- Read/Write/Seek: Share a cursor, so lock if concurrent.
- Close: Don't call while others are running.
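Here's how those rules tend to look in code. This is a sketch against the standard io interfaces, using a bytes.Reader as the data source so it runs on its own; it says nothing about mmapfile's internals. parallelSum uses only ReadAt from several goroutines, while lockedReader wraps cursor-based Read calls in a mutex. Both waits finish before anything would be closed. All names are invented for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"sync"
	"sync/atomic"
)

// parallelSum splits [0, size) into one chunk per worker and sums each
// chunk's bytes in its own goroutine, using only ReadAt.
func parallelSum(r io.ReaderAt, size int64, workers int) (uint64, error) {
	chunk := (size + int64(workers) - 1) / int64(workers)
	sums := make([]uint64, workers)
	errs := make([]error, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			off := int64(w) * chunk
			if off >= size {
				return
			}
			n := chunk
			if off+n > size {
				n = size - off
			}
			buf := make([]byte, n)
			if got, err := r.ReadAt(buf, off); got < len(buf) {
				errs[w] = err
				return
			}
			for _, b := range buf {
				sums[w] += uint64(b)
			}
		}(w)
	}
	wg.Wait() // all readers are done before the caller closes anything
	var total uint64
	for w := range sums {
		if errs[w] != nil {
			return 0, errs[w]
		}
		total += sums[w]
	}
	return total, nil
}

// lockedReader serializes cursor-based reads so goroutines can share one cursor.
type lockedReader struct {
	mu sync.Mutex
	r  io.Reader
}

func (l *lockedReader) Read(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.r.Read(p)
}

// drain reads r to EOF from several goroutines and returns the total byte count.
func drain(r io.Reader, workers int) int64 {
	lr := &lockedReader{r: r}
	var total int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, 64)
			for {
				n, err := lr.Read(buf)
				atomic.AddInt64(&total, int64(n))
				if err != nil {
					return
				}
			}
		}()
	}
	wg.Wait()
	return total
}

// concurrencyDemo runs both patterns over 1000 bytes that each hold the value 1.
func concurrencyDemo() (uint64, int64, error) {
	data := bytes.Repeat([]byte{1}, 1000)
	sum, err := parallelSum(bytes.NewReader(data), int64(len(data)), 4)
	return sum, drain(bytes.NewReader(data), 4), err
}

func main() {
	sum, n, err := concurrencyDemo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(sum, n) // prints "1000 1000"
}
```

The ReadAt version needs no lock at all, which is one more reason to prefer it when goroutines are involved.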
Real-World Uses
Perfect for:
- DB engines: Quick jumps to data pages.
- Log munching: Parsing giant log files.
- Config loading: Fast parsing of big configs.
- File caches: Persistent, memory-backed caches.
- Science stuff: Working with binary datasets.
Wrapping Up
If you're hacking on Go apps with heavy file I/O, mmapfile could be your new best friend. The speedups for reads and random access are killer for modern apps.
But hey, it's not for everything. Think about your use case; do you need growing files? Streaming? If not, give it a shot.
This is pre-1.0, so things might change. Go check it out on GitHub, try it, and let me know what you think!