A 10-Month WinRAR Alternative Journey
Note: I'm mentally exhausted, so I had AI write this article. If there are errors or contradictions, please forgive me. If you ask in the comments, I'll give you detailed and proper answers. The date is October 15, 03:46 AM. I finished the project 2 hours ago.
Introduction: From Dream to Code
In January 2025, I embarked on a new adventure called Pagonic. My dream was to offer a modern, open-source alternative to archiving software like WinRAR that has been ingrained in our lives for years. This adventure lasted a full ten months; I wrote code line by line, ran tests on terabytes of test data, got stuck, searched for solutions, and learned so much. This article tells the story of those ten eventful months in a sincere voice.
Starting Point: Why Pagonic?
Archivers like WinRAR and 7-Zip have dominated the market for years, but they have serious limitations in both performance and user experience: WinRAR is closed-source, neither fully exploits modern hardware, and neither takes advantage of modern AI techniques. That's why I decided to write a Python-based, AI-powered, super-fast, and modular compression engine. I named it Pagonic simply because the name felt fun.
When I started the project, I was a high school graduate preparing for university exams, and my confidence was high. Both exam prep and coding were running in parallel. Although most of my time was spent studying, I devoted evenings and weekends to Pagonic. With this project, I both improved my software skills and learned to be patient.
Performance Tests and Initial Observations
The first prototype was a simple compressor based on the zlib library. In initial tests, I achieved an average compression speed of 230 MB/s, with the memory_pool method reaching 365 MB/s, and a 98.3% success rate (944/960 tests). Later optimizations (memory pool, SIMD acceleration, and parallel thread management) increased performance, but it never came anywhere near millions of megabytes per second. An interim report, the "Day 5 V2" document, contained astronomical figures like 7,267,288 MB/s alongside maximum decompression speeds of around 700 MB/s; these later turned out to be measurement or reporting errors.
The truth, according to the current test results: average compression speed 230.2 MB/s and maximum 365.8 MB/s; average decompression speed 160.9 MB/s and maximum 636.1 MB/s. The average compression ratio was 37.4%, and 42.7% of tests achieved speeds over 100 MB/s.
In tests, I used different file types and sizes: text, binary, image, archive, executable, mixed, database, video, audio, document, code, and log. Size-wise, I prepared samples of 1 MB, 5 MB, 15 MB, 50 MB, 100 MB, 500 MB, 1 GB, 2 GB, and 3 GB. The results showed:
Compression speeds settled at more realistic values in recent tests: the memory_pool method was fastest at an average of 365.8 MB/s, followed by modular_full at 287.2 MB/s, ai_assisted at 165.5 MB/s, and the standard method at around 102.5 MB/s. The data shows that memory_pool is still the clear leader thanks to its memory pooling and SIMD copying, while the standard method remains simpler but slower.
Decompression is still slower than compression, but the gap is no longer dramatic: the parallel_decompression method is fastest at an average of 257.5 MB/s, followed by simd_crc32_decompression (138.9 MB/s), legacy_decompression (138.5 MB/s), and hybrid_decompression (108.7 MB/s). Decompression still lags behind compression, but only by a small factor rather than an order of magnitude.
The AI-powered strategy system still selects methods based on file type. In recent tests, the average AI confidence score was 0.82, with 96 high-confidence, 320 medium-confidence, and 64 low-confidence decisions. Average speeds reached 427.5 MB/s for code files, 376.4 MB/s for audio, 351.0 MB/s for database, and 344.9 MB/s for mixed files, while text and binary files stayed around 230-236 MB/s. So the AI reaches hundreds of megabytes per second on complex file types, but performance remains limited on plain text and binary data.
Note: These test results were obtained on a system with a powerful processor (a modern multi-core CPU) and an NVMe SSD, on mostly simple-structured test data. On average real-world computers and more complex data, Pagonic's ZIP compression speed is generally around twice that of WinRAR and 7-Zip. For example, in PeaZip's comparison, WinRAR (ZIP default) compressed a 1.22 GB dataset in about 24 seconds, while 7-Zip (ZIP medium) took 118 seconds; those times correspond to compression speeds of roughly 50 MB/s for WinRAR and 10 MB/s for 7-Zip. In another test by Tom's Hardware on a Core i9-13900K, 7-Zip's compression speed was measured at 150 MB/s and its decompression speed at 2,600 MB/s. In another user study, WinRAR's compression speed was reported at around 24.3 GB/hour (approximately 6-7 MB/s). These comparisons suggest that Pagonic, with average speeds of 230 MB/s compression and 160 MB/s decompression, can match or beat existing archivers in ZIP format, often by a factor of two or more. Of course, on systems without SSDs, file read/write will slow things down, so measured speeds will be lower; still, Pagonic offers a strong alternative for .zip files.
Repeating these tests was very important to understand data consistency. I measured variance and stability by compressing and testing the same file on different days. Thus, I turned to optimizations that would minimize performance fluctuation.
AI Strategy: How Does Smart Compression Work?
The most innovative part at the heart of Pagonic was the AI-powered strategy engine. This engine dynamically selected the most suitable compression method by analyzing file type and size. I wrote a simple pattern recognition module to classify file types and observed the following:
For database, archive, and executable file types, the AI made high-confidence choices, with average speeds of 107.4 MB/s for executables, 166.3 MB/s for archives, and 351.0 MB/s for database files. For text and binary files, speeds stayed around 230-236 MB/s, a reminder that throughput depends heavily on how compressible the data actually is.
The average AI confidence is 0.82, with a confidence range of 0.59-0.98; there were 96 high-confidence, 320 medium-confidence, and 64 low-confidence decisions. These statistics show that the AI mostly makes decisions at medium and high confidence levels.
This strategy module had a modular structure; it laid the groundwork for adding different algorithms in the future. The AI's main task was: "What is this file, which method will compress it fastest and most efficiently?" Thus, it could automatically switch between methods like memory_pool or modular_full.
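As a rough sketch of what such a selector can look like: the method names match the ones reported above, but the extension lists, thresholds, and confidence values below are illustrative assumptions, not Pagonic's actual heuristics.

```python
import os

# Illustrative groupings and thresholds only.
TEXT_LIKE = {".txt", ".log", ".csv", ".md"}
LARGE_FILE = 512 * 1024 * 1024  # 512 MB

def choose_strategy(path: str) -> tuple[str, float]:
    """Pick a compression method and a confidence score from file type and size."""
    ext = os.path.splitext(path)[1].lower()
    size = os.path.getsize(path)

    if size >= LARGE_FILE:
        return "memory_pool", 0.9    # pooled buffers and SIMD copies pay off on big inputs
    if ext in TEXT_LIKE:
        return "standard", 0.6       # plain zlib path; fine for text but slower
    return "modular_full", 0.8       # default pipeline for mixed/binary content
```

The real engine weighed more signals than extension and size, but the shape is the same: classify the input, return a method plus a confidence score, and let the pipeline act on it.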
Problems Encountered
Like every software project, I encountered many problems with Pagonic. The most critical ones were:
- ZIP32 compatibility issue with 3 GB+ files: In the fallback and threading modes, where I used Python's zipfile module, headers were written incorrectly for files of 3 GB and above. Tests expected a size of 4,294,967,295 bytes (0xFFFFFFFF, the ZIP32 maximum) while the file was actually 3 GB (3,221,225,472 bytes). This led to 16 tests failing. The root of the problem was that the plain ZIP32 format cannot represent such sizes correctly without ZIP64 support, and the zipfile-based path didn't handle this properly.
- Memory monitoring issues: In some tests, the memory monitoring system wasn't working properly and gave warnings like "Memory monitoring failed, using fallback: 204.8 MB". Real memory usage couldn't be measured accurately. This created uncertainty in performance analysis.
- Low performance on text and binary files: The AI system could only reach around 230 MB/s on text and binary files, quite low compared to other types. For example, code files hit 427 MB/s while text files stayed at 230 MB/s. The cause was that the algorithm selected an inefficient strategy for these types.
- Adverse effect of the zipfile fallback: Initially, I used the zipfile module as the fallback for everything. But it could fail even in the 2.5-3 GB range, and worse, the fallback kicking in conflicted with the header format I wrote in the other modes. As a result, files over 3 GB were corrupted, which effectively pulled the theoretical 4 GB limit down to about 2.5 GB.

Solution Searches and Strategy Change

After identifying the problems, there were two main solution paths:

- Integrate ZIP64 support: this required a more complex implementation with advanced header structures, so I considered postponing it to a later version.
- Develop my own header writing system and drop zipfile for large files: this was the faster solution. Files under 2 GB would still be compressed with zipfile, while files between 2 and 4 GB would use my own minimal header system (MinimalZipWriter). That way I could use the full potential of the ZIP32 limit and support files up to 4 GB.

At this point, I exchanged ideas with ChatGPT many times. It suggested a hybrid solution: zipfile for files under 2 GB (the reliable part) and MinimalZipWriter for the 2-4 GB range, with ZIP64 to be added later. I adopted this approach.

MinimalZipWriter and Hybrid System

I decided to write MinimalZipWriter. This module would:
- Write the LocalFileHeader, CentralDirectory, and EOCD (End of Central Directory) sections completely by hand
- Write zlib-compressed data at the correct offsets
- Calculate CRC32 and size values correctly and place them in the headers
- Kick in for 2 GB+ files, so zipfile wouldn't be used at all
Developing this module gave me quite a hard time. I had to get many small details right: header offsets, little-endian/big-endian conversions, buffer management. But as a result, files up to 4 GB compressed without problems, headers were created properly, and the errors I had been getting with 3 GB+ files were resolved. The performance test success rate returned to the 98.3% level and even improved slightly.
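To give a concrete sense of what writing those headers by hand involves, here is a minimal single-entry ZIP32 writer sketched in Python. The layout (local file header, central directory, EOCD) follows the ZIP specification, but the code is an illustrative reconstruction under deliberate simplifications (one file, data held in memory, timestamps left at zero), not Pagonic's actual MinimalZipWriter.

```python
import struct
import zlib

def write_minimal_zip(out_path: str, name: str, data: bytes) -> None:
    """Write a one-entry ZIP32 archive by hand (illustrative sketch)."""
    # Raw deflate stream (no zlib wrapper), as required inside ZIP files.
    co = zlib.compressobj(6, zlib.DEFLATED, -15)
    compressed = co.compress(data) + co.flush()
    crc = zlib.crc32(data) & 0xFFFFFFFF
    fname = name.encode("utf-8")

    with open(out_path, "wb") as f:
        local_offset = f.tell()
        # Local file header: signature PK\x03\x04, method 8 = deflate.
        f.write(struct.pack("<IHHHHHIIIHH",
                            0x04034B50, 20, 0, 8, 0, 0,
                            crc, len(compressed), len(data),
                            len(fname), 0))
        f.write(fname)
        f.write(compressed)

        cd_offset = f.tell()
        # Central directory entry: signature PK\x01\x02, points back at the local header.
        f.write(struct.pack("<IHHHHHHIIIHHHHHII",
                            0x02014B50, 20, 20, 0, 8, 0, 0,
                            crc, len(compressed), len(data),
                            len(fname), 0, 0, 0, 0, 0, local_offset))
        f.write(fname)
        cd_size = f.tell() - cd_offset

        # End of central directory record: signature PK\x05\x06, one entry.
        f.write(struct.pack("<IHHHHIIH",
                            0x06054B50, 0, 0, 1, 1, cd_size, cd_offset, 0))
```

A real writer for 2-4 GB files would stream the input in chunks, compute the CRC and compressed size on the fly, and then either seek back to patch the header fields or use a data descriptor; that offset bookkeeping is exactly where things got painful.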
Memory Monitoring and Performance Optimizations
Memory Pool: I set up a memory pool to prevent small buffers from being constantly created and destroyed during compression. This way, memory usage stayed at around 82 MB.
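As a rough illustration of the idea (not Pagonic's implementation), a buffer pool can be as simple as handing out pre-allocated bytearrays and taking them back, so the compressor stops allocating a fresh buffer per chunk; the buffer size and count below are arbitrary.

```python
from collections import deque

class BufferPool:
    """Reuse fixed-size bytearrays instead of allocating one per chunk (illustrative)."""

    def __init__(self, buffer_size: int = 4 * 1024 * 1024, count: int = 16):
        self.buffer_size = buffer_size
        self._free = deque(bytearray(buffer_size) for _ in range(count))

    def acquire(self) -> bytearray:
        # Fall back to a fresh allocation if the pool is temporarily exhausted.
        return self._free.popleft() if self._free else bytearray(self.buffer_size)

    def release(self, buf: bytearray) -> None:
        self._free.append(buf)
```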
SIMD CRC32 and SIMD Memory Copy: I accelerated CRC calculations and memory copies with SIMD instructions. This also provided a significant performance increase, especially on large files.
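From Python there is no direct access to SIMD intrinsics, so in practice this means delegating to a C implementation (CPython's zlib.crc32, which can be SIMD-accelerated depending on the underlying zlib build) and feeding it large chunks while carrying the running value; a minimal sketch:

```python
import zlib

def crc32_of_file(path: str, chunk_size: int = 1 << 20) -> int:
    """Compute a file's CRC-32 in 1 MB chunks, carrying the running value across calls."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```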
AI Pattern Recognition: I expanded the dataset to correctly identify file types and added heuristic adjustments that optimize AI's decisions.
Cold Path & Hot Path Optimization: To solve the low-performance problem with text and binary files, I tested several combinations such as LZ77 + Huffman and RLE + Delta. In what I call the cold path (low-performance scenarios), I selected a lighter algorithm, and when switching to the hot path, I compressed more aggressively. Speed on text/binary files increased from 230 MB/s to as much as 300 MB/s.
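The cold-path/hot-path split can be pictured as probing how well the data compresses and then choosing a light or aggressive setting for the rest of the stream. The sketch below uses zlib levels as a stand-in for the actual algorithm switch (LZ77 + Huffman vs. RLE + Delta), and the probe size and threshold are assumptions, not Pagonic's tuned values.

```python
import zlib

def adaptive_compress(chunks, probe_size: int = 64 * 1024):
    """Probe the first chunk, then compress the rest with a light or aggressive level."""
    out = []
    level = 1                                    # cold path: cheap default
    for i, chunk in enumerate(chunks):
        if i == 0 and chunk:
            probe = bytes(chunk[:probe_size])
            ratio = len(zlib.compress(probe, 1)) / len(probe)
            level = 6 if ratio < 0.7 else 1      # hot path: data compresses well, spend more CPU
        out.append(zlib.compress(bytes(chunk), level))
    return out
```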
Marketing and Community Engagement
To grow the project, I wrote articles on Dev.to. Thanks to brainstorming sessions with ChatGPT, I found SEO-focused and attention-grabbing titles. For example:
"Pagonic: The AI-Powered Compression Engine That Could Beat WinRAR"
"Open Source AI + File Compression = Meet Pagonic"
"The System That Smartly Compresses 3GB+ Files: How Was Pagonic Developed?"
I also posted on Reddit, X (Twitter), and Discord. I explained project details in threads and backed the benchmark results with visuals. These posts became a nice showcase for both new users and potential contributors.
Publishing Plan and Versions
Before publishing the project, I drew a roadmap:
V1.0 - Initial Release (Hybrid System)
ZIP32 support (files up to 4 GB)
MinimalZipWriter (for files between 2-4 GB)
AI-powered compression (82% confidence)
12 file type support
Memory pool and SIMD optimizations
Basic error handling (category-based)
First prototype of GUI interface
V1.1 - First Update (After User Feedback)
ZIP64 support (4 GB+ files)
Memory monitoring system improvement
Advanced algorithms in Text/Binary optimization (Adaptive LZ77, faster RLE)
AI confidence analysis vs performance graphs (scatter plot)
Graceful degradation in error handling
V1.2 - Advanced Features
AES-256 encryption and password protection
Corrupt archive repair (recovery records)
Dynamic model updates with machine learning
Cloud integrations (Google Drive, Dropbox, OneDrive)
Real-time performance analysis
User-defined compression profiles
Final Stage: Not Being Able to Finish the Project
By October 2025, ten months had gone by. With intense university exam preparations, personal life, and other projects all competing for time, Pagonic's development slowed down. Although the MinimalZipWriter module and the hybrid system worked, the GUI was never fully finished, and ZIP64 integration required both time and motivation. In the end, I made a note saying "I ended the 10-month Pagonic adventure, result: couldn't finish it."
This sentence may sound sad, but it was actually both a relief and a lesson. Not being able to finish a project doesn't mean failure. Learning from failure, being honest with yourself, and sometimes knowing when to let go are also important skills. When I decided to abandon the project, I actually found great inner peace.
Gains and What I Learned
What these ten months brought me is endless:
I learned to use the C-based zlib library from Python with high performance
I gained in-depth knowledge about SIMD instructions and memory management
I learned concepts like file streaming, memory pools, and adaptive buffers hands-on while working with large files
I designed and implemented an AI-powered strategy, and experienced firsthand what it takes to calculate model confidence and optimize decisions
I once again understood how important planning, testing, and feedback loops are in software development
I learned how to promote and gather contributions within the open-source community
Most importantly, I understood that even if I don't see a project as "completed," the learning process and the experience gained are the greatest success
Starting Over from Scratch: Language Choice and a New Beginning
Choosing Python for this ten-month journey was a bold move, but it was a problematic choice from the start. When you're building a ZIP engine or a WinRAR alternative, speed and efficiency are decisive, so compiled languages like C++ or Rust would have been far more suitable. But at the time I didn't know those languages; I was new to coding and only just learning how to do research with AI. In Pagonic's early days, I didn't even properly know how to use AI.
Looking back today, I can see I'm much more capable at research. In the last two months, I developed and published several Chrome extensions and a mobile app, and they now bring in monthly income. The reason I could pull all of that off is the skills I gained through Pagonic. This project made me fall in love with software again; it taught me the idea of "vibe coding," helped me understand algorithms and data structures, and pushed me to keep learning and experimenting. Without Pagonic, today's projects wouldn't have been possible.
That's why, with my current knowledge and experience, I've decided to start the Pagonic project from scratch. In the new journey, I'll pick a performant language instead of Python (C++ or Rust) and focus on a more stable, fast, and reliable engine. This time, having learned the lessons and tested the methods, building Pagonic is far more achievable. My current goal is to build Pagonic 2.0 by fixing the shortcomings of the first version and using the right technologies.
Closing: The Adventure Didn't End, It Changed Direction
The Pagonic project didn't end the way I intended. But this adventure gave me tremendous experience with fast compression engines, file formats, AI integration, and performance optimization. It showed me that Python is limiting in terms of performance and convinced me that languages like C++ or Rust are better suited to projects like this. With the knowledge I now have, I've decided to rewrite Pagonic from scratch; not someday, but now.
While writing code, testing, and writing these lines, my biggest motivation was curiosity. If you're reading this and you also want to build something, take courage. Projects sometimes don't get completed, sometimes unexpected errors occur. But what you learn along the way, the skills you acquire, and your own development are actually more valuable than the project itself.
As a final word: Pagonic gave me the opportunity to know myself. It taught me to be patient, not to give up, and to let go when necessary. Thank you to everyone reading this adventure. Every end is a new beginning; I'm setting out again to rewrite Pagonic's story from scratch. This time I have more knowledge, more experience, and a more suitable language choice.
Made with love, countless hours of debugging, and way too much coffee