Introduction
While the .NET ecosystem provides robust libraries for working with ZIP files, developers often encounter limitations due to internal or sealed classes that prevent straightforward extension or modification. In this article, I will discuss why and how I created FastZipEntry, a .NET library designed to efficiently retrieve specific entries from a ZIP archive without extracting the entire archive or iterating through all entries.
The Motivation
Limitations in the Existing Libraries
The ZIP handling libraries that Microsoft ships with .NET are powerful but sometimes restrictive. Many of the useful classes and methods are marked internal or sealed: internal members are not accessible from outside their own assembly, and sealed classes cannot be inherited. This restriction poses a significant challenge when you need to tweak or extend the functionality to suit specific needs.
Need for Efficient Entry Retrieval
In many scenarios, you might need to access a specific entry in a ZIP archive without extracting the entire archive or loading all entries into memory. This is particularly important for large archives where performance and memory consumption become critical concerns. Unfortunately, the default System.IO.Compression.ZipArchive does not provide a straightforward way to achieve this.
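For comparison, here is a minimal sketch of a single-entry lookup that uses only the built-in API. `ZipArchive.GetEntry` is part of System.IO.Compression; the `ZipBaseline` class, the `TryExtract` method, and the paths are hypothetical names used for illustration. The lookup works, but it goes through the entry collection that `ZipArchive` builds from the archive's central directory, and `GetEntry` only offers a case-sensitive match.

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipBaseline
{
    // Hypothetical helper: copies one entry to outputPath using only ZipArchive.
    // Returns false when the archive has no entry with that exact name.
    public static bool TryExtract(string zipPath, string entryName, string outputPath)
    {
        using FileStream zipStream = new FileStream(zipPath, FileMode.Open, FileAccess.Read);
        using ZipArchive archive = new ZipArchive(zipStream, ZipArchiveMode.Read);

        // GetEntry returns null when nothing matches; names are compared ordinally.
        ZipArchiveEntry? entry = archive.GetEntry(entryName);
        if (entry is null)
        {
            return false;
        }

        using Stream entryStream = entry.Open();
        using FileStream output = new FileStream(outputPath, FileMode.Create, FileAccess.Write);
        entryStream.CopyTo(output);
        return true;
    }
}
```

A call such as `ZipBaseline.TryExtract("path/to/your.zip", "desired_entry.txt", "out.txt")` is enough for small archives; the cost of materializing every entry only becomes noticeable as the archive grows.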
The Solution: FastZipEntry
To overcome these limitations, I created FastZipEntry, a library available as a NuGet package that allows efficient retrieval of specific entries from a ZIP archive. It builds on modified code from Microsoft's System.IO.Compression and extends that code to meet the needs outlined above.
Key Features
- Efficient Retrieval: Locate and retrieve specific ZIP entries by name without extracting the entire archive.
- Decompression: Supports and exposes the Deflate64 decompression algorithm, also sourced from Microsoft’s codebase.
Implementation Details
Leveraging Microsoft’s Source Code
The core of FastZipEntry is based on the source code from Microsoft's System.IO.Compression library, which is available under the MIT license. Sadly, due to the internal and sealed nature of many classes, I had to copy and modify the necessary code to enable the required functionality. This approach, while not ideal, was necessary to provide the flexibility and performance benefits that FastZipEntry offers.
Adding Deflate64 Support
In addition to the core functionality, I also integrated the Deflate64 algorithm for decompression. This algorithm, taken from Microsoft’s source code, is essential for handling archives that use this specific compression method. By including this in FastZipEntry, I ensured that the library can handle a wider range of ZIP archives.
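To make "this specific compression method" concrete: the ZIP format stores a 16-bit compression method in each local file header, where 0 means stored, 8 means Deflate, and 9 means Deflate64. The sketch below reads that field from a header at the stream's current position. It is an illustration of the file format only, not FastZipEntry's implementation; the `ZipMethodProbe` class and its method name are hypothetical.

```csharp
using System.IO;
using System.Text;

public static class ZipMethodProbe
{
    // Signature that starts every local file header ("PK\x03\x04", little-endian).
    private const uint LocalFileHeaderSignature = 0x04034b50;

    // Returns the compression method of the local file header at the stream's
    // current position, or null when no such header starts there.
    public static ushort? ReadCompressionMethod(Stream stream)
    {
        using BinaryReader reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true);
        try
        {
            if (reader.ReadUInt32() != LocalFileHeaderSignature)
            {
                return null;
            }

            reader.ReadUInt16();        // version needed to extract
            reader.ReadUInt16();        // general purpose bit flag
            return reader.ReadUInt16(); // compression method: 0 stored, 8 Deflate, 9 Deflate64
        }
        catch (EndOfStreamException)
        {
            return null; // stream too short to hold a header
        }
    }
}
```

Most archives begin with a local file header, so probing the start of the file usually reports the first entry's method, but an empty or self-extracting archive will not, in which case the method returns null.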
Usage Example
using System;
using System.IO;
using System.Text;
using FastZipEntry;

// Open a ZIP file and create a ZipEntryAccess instance
using FileStream zipFileStream = new FileStream("path/to/your.zip", FileMode.Open, FileAccess.Read);
ZipEntryAccess zipEntryAccess = new ZipEntryAccess(zipFileStream, Encoding.UTF8);

// Retrieve a specific entry from the ZIP file (case-insensitive name match)
string entryName = "desired_entry.txt";
ZipEntry? entry = zipEntryAccess.RetrieveZipEntry(entryName, StringComparison.OrdinalIgnoreCase);

if (entry != null)
{
    // Decompress the entry and write it to disk
    using Stream entryStream = entry.Open();
    using FileStream outputStream = new FileStream("path/to/extracted/desired_entry.txt", FileMode.Create, FileAccess.Write);
    entryStream.CopyTo(outputStream);
}
else
{
    Console.WriteLine("Entry not found.");
}
Conclusion
I hope FastZipEntry proves useful in your projects, and I welcome any contributions or feedback.
- You can install the FastZipEntry package from NuGet.
- You can also check out the source code in the project's GitHub repository.
Acknowledgments
This library is based on modified code from the Microsoft System.IO.Compression repository, and includes the Deflate64 algorithm from the same source.