Choosing a key derivation function (KDF) is a trade-off between how long a legitimate user waits for verification and how much a brute-force attack costs the attacker. For my bachelor’s thesis project, Vaulton, I wanted to look past theoretical recommendations and get empirical data on how these algorithms perform across different hardware tiers.
I benchmarked Argon2id, Bcrypt, and PBKDF2 across a range of devices from an Intel iGPU to an RTX 5080 to identify their real-world breaking points. This post covers the data from those tests, the hardware bottlenecks I identified, and how different architectures respond to memory-hard functions.
The Hardness of the Problem
- PBKDF2-SHA256: A NIST-standard algorithm that is purely compute-bound. Its largest weakness today is its negligible memory requirement, which makes it highly parallelizable on GPUs.
- Bcrypt: Based on the Blowfish cipher, it introduces a small memory requirement (about 4 KB of S-box state) that complicates GPU implementation. It still holds up reasonably well against GPUs, but its memory usage is fixed and cannot be tuned upward as hardware improves.
- Argon2id: The winner of the Password Hashing Competition (2015). It is designed specifically to be memory-hard, forcing attackers to dedicate physical RAM to every hash computed in parallel, which significantly limits the parallelization capabilities of GPUs and ASICs.
Test Environment
Benchmarks were performed across three distinct hardware configurations representing various levels of compute power and memory architectures.
| ID | GPU | GPU Memory | CPU |
|---|---|---|---|
| System 1 | Intel® Arc™ Graphics (iGPU) | 8018 MB (Shared) | Intel® Core™ Ultra 7 265K |
| System 2 | NVIDIA GTX 1660 Ti | 6143 MB (VRAM) | Intel® Core™ i5-9300HF |
| System 3 | NVIDIA GeForce RTX 5080 | 16302 MB (VRAM) | Intel® Core™ Ultra 7 265K |
Note: Attacker throughput was measured using Hashcat 7.1.2 (offline attack model, single device).
Methodology: Normalizing for User Cost
When choosing parameters for a password manager, the goal is to maximize security while maintaining a consistent user experience. I normalized the parameters so that the verification time falls into three distinct "UX profiles" (Low, Medium, and High) based on performance on a mid-range laptop (System 2).
| Profile | Algorithm | Hashcat mode | Parameters | Verification Time (System 2) |
|---|---|---|---|---|
| Low | PBKDF2 | 10900 | 250,000 iterations | 0.157s |
| Low | Bcrypt | 3200 | Cost Factor 11 | 0.186s |
| Low | Argon2id | 34000 | 64 MB RAM | 0.178s |
| Medium | PBKDF2 | 10900 | 500,000 iterations | 0.315s |
| Medium | Bcrypt | 3200 | Cost Factor 12 | 0.371s |
| Medium | Argon2id | 34000 | 128 MB RAM | 0.370s |
| High | PBKDF2 | 10900 | 1,000,000 iterations | 0.628s |
| High | Bcrypt | 3200 | Cost Factor 13 | 0.743s |
| High | Argon2id | 34000 | 256 MB RAM | 0.741s |
Note: All Argon2id tests used 3 iterations and a parallelism factor of 1.
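One way to arrive at iteration counts like those in the table is to time the KDF on the reference machine and scale linearly, since PBKDF2's cost grows linearly with the iteration count. The sketch below uses Python's standard `hashlib.pbkdf2_hmac`; the helper names, the probe size, and the dummy password are illustrative assumptions, not the tooling used for the measurements above.

```python
import hashlib
import os
import time
from typing import Optional


def time_pbkdf2(iterations: int, password: bytes = b"correct horse",
                salt: Optional[bytes] = None) -> float:
    """Return wall-clock seconds for one PBKDF2-HMAC-SHA256 derivation."""
    salt = salt or os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
    return time.perf_counter() - start


def calibrate_iterations(target_seconds: float,
                         probe_iterations: int = 20_000) -> int:
    """Estimate the iteration count that takes about target_seconds here.

    A single probe run is extrapolated linearly. Real tooling would
    average several runs to smooth out scheduling noise.
    """
    probe_time = max(time_pbkdf2(probe_iterations), 1e-9)  # avoid division by zero
    return max(1, round(probe_iterations * target_seconds / probe_time))
```

Calling `calibrate_iterations(0.157)` on a given machine would give a rough PBKDF2 iteration count for the Low profile on that machine. The same timing loop could be pointed at Bcrypt or Argon2id bindings to match their parameters to the same time budget.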
Benchmark Results (Throughput in H/s)
The following table presents the raw throughput (Hashes per second) for each algorithm at the normalized security levels across all systems.
| Hardware | Algorithm | Low | Medium | High |
|---|---|---|---|---|
| System 1 (iGPU) | PBKDF2-SHA256 | 1726 | 755 | 382 |
| System 1 (iGPU) | Bcrypt | 34 | 17 | 9 |
| System 1 (iGPU) | Argon2id | 17 | 4 | 1 |
| System 2 (1660 Ti) | PBKDF2-SHA256 | 3998 | 2054 | 996 |
| System 2 (1660 Ti) | Bcrypt | 346 | 175 | 87 |
| System 2 (1660 Ti) | Argon2id | 360 | 101 | 26 |
| System 3 (RTX 5080) | PBKDF2-SHA256 | 25236 | 12612 | 6271 |
| System 3 (RTX 5080) | Bcrypt | 2390 | 1195 | 598 |
| System 3 (RTX 5080) | Argon2id | 1164 | 332 | 87 |
Baseline Comparison (SHA-256)
For context, the raw SHA-256 throughput (Baseline, Hashcat mode 1400) on these systems was:
- iGPU: 1084 MH/s
- GTX 1660 Ti: 2329 MH/s
- RTX 5080: 15263 MH/s
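To put the baseline in perspective, dividing the raw SHA-256 rate by a KDF's rate gives a rough "work per guess" factor. The short sketch below does this for PBKDF2-SHA256 on the High profile, using only figures from the results table and the list above; the variable and function names are my own.

```python
# Raw SHA-256 rate (Hashcat mode 1400) in H/s, from the baseline list.
SHA256_BASELINE_HS = {
    "iGPU": 1084e6,
    "GTX 1660 Ti": 2329e6,
    "RTX 5080": 15263e6,
}

# PBKDF2-SHA256 rate on the High profile (1,000,000 iterations) in H/s,
# from the results table.
PBKDF2_HIGH_HS = {
    "iGPU": 382,
    "GTX 1660 Ti": 996,
    "RTX 5080": 6271,
}


def slowdown(baseline_hs: float, kdf_hs: float) -> float:
    """How many raw hashes the device could compute in the time of one KDF guess."""
    return baseline_hs / kdf_hs


for device, baseline in SHA256_BASELINE_HS.items():
    factor = slowdown(baseline, PBKDF2_HIGH_HS[device])
    print(f"{device}: one PBKDF2 guess costs about {factor / 1e6:.1f} million raw SHA-256 hashes")
```

On all three devices the factor lands between two and three million, which is the order of magnitude one would expect from 1,000,000 HMAC iterations, each of which needs more than one SHA-256 compression.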
Analysis: The Attacker's Bottleneck

Figure 1: Comparison of security horizon across different password hashing algorithms.
The Resilience of Bcrypt
In terms of raw H/s throughput, Bcrypt remains relatively efficient for attackers compared to memory-hard algorithms. On the RTX 5080, it sustains roughly 6.9x the throughput of Argon2id on the High profile (598 H/s vs 87 H/s).
As Figure 1 shows, that throughput gap carries over directly to the security horizon of a master password. For the same keyspace, Bcrypt is estimated to hold for 39 years, while Argon2id extends that to 268 years. Bcrypt is far more resistant than PBKDF2, but it is still clearly outclassed by the jump in security that Argon2id's memory hardness provides.
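The link between throughput and security horizon is a plain division: keyspace over guesses per second. The sketch below illustrates it with a hypothetical keyspace of 10^12 candidates (an assumption for illustration only, not the keyspace behind Figure 1) and the RTX 5080 High-profile rates from the results table.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(keyspace: int, hashes_per_second: float) -> float:
    """Worst-case years to try every candidate in the keyspace at a fixed rate."""
    return keyspace / hashes_per_second / SECONDS_PER_YEAR


# Hypothetical keyspace for illustration only; not the one behind Figure 1.
KEYSPACE = 10**12

# RTX 5080 throughput on the High profile, from the results table.
bcrypt_years = years_to_exhaust(KEYSPACE, 598)
argon2id_years = years_to_exhaust(KEYSPACE, 87)

# The keyspace cancels out, so the ratio depends only on the two throughputs.
horizon_ratio = argon2id_years / bcrypt_years
print(f"Argon2id horizon is {horizon_ratio:.2f}x Bcrypt's")
```

Because the keyspace cancels, the ratio is just 598 / 87, or roughly 6.9x, which is consistent with the 268-year and 39-year estimates quoted above.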
The Memory Wall of Argon2id
The true advantage of Argon2id is not low throughput as such, but GPU resistance through memory hardness. PBKDF2 scales readily with hardware power (from 996 H/s on a 1660 Ti to 6271 H/s on a 5080 on the High profile, about 6.3x), whereas Argon2id gains far less from the same hardware jump (26 H/s to 87 H/s, about 3.3x).
On the "High" profile (256 MB), the RTX 5080 cracks Argon2id roughly 72 times slower than PBKDF2 (87 H/s vs 6271 H/s), even though the user spends about the same time on verification.
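The same comparison can be made for every profile, which shows how the gap widens as the memory requirement grows. This is a small sketch over the RTX 5080 rows of the results table; the dictionary layout and function name are illustrative.

```python
# RTX 5080 throughput in H/s, copied from the results table.
RTX_5080_HS = {
    "PBKDF2-SHA256": {"Low": 25236, "Medium": 12612, "High": 6271},
    "Argon2id": {"Low": 1164, "Medium": 332, "High": 87},
}


def pbkdf2_over_argon2id(profile: str) -> float:
    """How many PBKDF2 guesses the GPU makes per Argon2id guess on a profile."""
    return RTX_5080_HS["PBKDF2-SHA256"][profile] / RTX_5080_HS["Argon2id"][profile]


for profile in ("Low", "Medium", "High"):
    print(f"{profile}: PBKDF2 is {pbkdf2_over_argon2id(profile):.0f}x faster to attack")
```

The factor grows from about 22x on Low (64 MB) to about 38x on Medium (128 MB) and about 72x on High (256 MB), while the user-facing verification times stay roughly matched within each profile.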

Figure 2: Hardware scaling factor between GTX 1660 Ti and RTX 5080.
As shown in Figure 2, newer and faster hardware yields much smaller gains against Argon2id. The more RAM each hash requires, the less the attacker's raw compute power matters, because the GPU quickly hits a VRAM bandwidth and capacity wall.
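The scaling factors can be recomputed directly from the results table by dividing each RTX 5080 figure by the matching GTX 1660 Ti figure. The sketch below does that; the names are my own and the numbers are copied from the table.

```python
PROFILES = ("Low", "Medium", "High")

# Throughput in H/s per algorithm, in (Low, Medium, High) order.
GTX_1660_TI = {
    "PBKDF2-SHA256": (3998, 2054, 996),
    "Bcrypt": (346, 175, 87),
    "Argon2id": (360, 101, 26),
}
RTX_5080 = {
    "PBKDF2-SHA256": (25236, 12612, 6271),
    "Bcrypt": (2390, 1195, 598),
    "Argon2id": (1164, 332, 87),
}


def scaling_factors(algorithm: str) -> dict:
    """Speed-up of the RTX 5080 over the GTX 1660 Ti for each profile."""
    old, new = GTX_1660_TI[algorithm], RTX_5080[algorithm]
    return {profile: n / o for profile, o, n in zip(PROFILES, old, new)}


for algorithm in RTX_5080:
    factors = scaling_factors(algorithm)
    summary = ", ".join(f"{p}: {f:.1f}x" for p, f in factors.items())
    print(f"{algorithm}: {summary}")
```

PBKDF2 and Bcrypt both gain more than 6x from the newer card on every profile, while Argon2id stays below 3.5x.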
Architectural Anomalies: The 1660 Ti Parity
An unexpected result occurred in the "Low" profile tests on the GTX 1660 Ti, where Argon2id (360 H/s) reached parity with Bcrypt (346 H/s).
This parity was absent on the other systems: both the iGPU and the RTX 5080 ran Argon2id at roughly half of Bcrypt's throughput on the same profile. This suggests that the 1660 Ti sits at a point of convergence where the computational bottleneck of Bcrypt and the memory bottleneck of Argon2id happen to align, producing nearly identical throughput. On shared-memory or very high-compute systems, the memory hardness of Argon2id becomes the dominant bottleneck much earlier.
Conclusion
For modern applications requiring high security against GPU-based attacks:
- PBKDF2 is not recommended: it has no meaningful memory requirement, so it is the easiest of the three to parallelize on GPUs.
- Bcrypt is a solid secondary choice but lacks the tunable memory hardness of newer algorithms.
- Argon2id is the superior option. By forcing a significant memory footprint (e.g., 256 MB), it substantially blunts the massive parallelization advantage of high-end GPUs.
In my implementation for Vaulton, I provide two tiered options for password security:
- Standard: Argon2id (128MB, 3 iterations, 1p) - Balance of performance and security.
- Hardened: Argon2id (256MB, 4 iterations, 1p) - Maximizing memory hardness for high-value protection.
These configurations are intended to maximize the computational and physical cost of a brute-force attack, forcing even high-end hardware like the RTX 5080 to operate at a fraction of its potential throughput due to memory bottlenecks. On low-memory mobile devices, higher memory settings (e.g., 256MB) may introduce noticeable latency or fail under constrained environments, so adaptive parameter tuning per device should be considered for commercial products.
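As a rough sketch of what adaptive parameter tuning could look like, the two tiers can be modelled as data and selected by available device memory. Everything below is hypothetical: the class and function names, the 4x headroom policy, and the reading of "MB" as MiB are my assumptions for illustration, not Vaulton's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Argon2idProfile:
    name: str
    memory_mib: int
    iterations: int
    parallelism: int

    @property
    def memory_kib(self) -> int:
        """Memory cost in KiB, the unit many Argon2 bindings expect."""
        return self.memory_mib * 1024


# The two tiers described above (assuming MB means MiB).
STANDARD = Argon2idProfile("Standard", memory_mib=128, iterations=3, parallelism=1)
HARDENED = Argon2idProfile("Hardened", memory_mib=256, iterations=4, parallelism=1)


def choose_profile(available_mib: int, headroom: float = 4.0) -> Optional[Argon2idProfile]:
    """Pick the strongest tier that fits with some memory headroom.

    Hypothetical policy: require `headroom` times the tier's memory to be
    available. Returns None when neither tier fits, leaving the fallback
    decision to the caller.
    """
    for profile in (HARDENED, STANDARD):
        if available_mib >= profile.memory_mib * headroom:
            return profile
    return None
```

With this policy a device reporting 4096 MiB available would get the Hardened tier, one reporting 600 MiB would get Standard, and one reporting 256 MiB would get `None`. What to do in that last case (a lower tier, a warning, or refusing to proceed) is a product decision this sketch deliberately leaves open.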
Discussion
In systems implementing a zero-knowledge architecture, the responsibility for performing these heavy KDF calculations shifts to the client side (e.g., a browser or mobile app), while the server might only perform a lightweight rehash of authentication verifiers before storing them in the database.
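A minimal sketch of that split is shown below. It is an illustration under stated assumptions rather than a description of any particular product: PBKDF2 from Python's standard library stands in for Argon2id so the example runs without third-party packages, and the function names, HMAC labels, and single SHA-256 server rehash are all hypothetical choices.

```python
import hashlib
import hmac
from typing import Tuple


def client_derive(password: str, salt: bytes, iterations: int) -> Tuple[bytes, bytes]:
    """Client side: run the slow KDF, then split the result into two keys.

    PBKDF2 is only a stdlib stand-in here; the design discussed in this
    post would run Argon2id at this step.
    """
    master = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                                 iterations, dklen=32)
    # Separate labels keep the encryption key independent of what the server sees.
    encryption_key = hmac.new(master, b"encryption", hashlib.sha256).digest()
    verifier = hmac.new(master, b"authentication", hashlib.sha256).digest()
    return encryption_key, verifier


def server_store(verifier: bytes) -> bytes:
    """Server side: one cheap hash so the database never holds the verifier itself."""
    return hashlib.sha256(verifier).digest()


def server_check(verifier: bytes, stored: bytes) -> bool:
    """Server side: constant-time comparison against the stored rehash."""
    return hmac.compare_digest(hashlib.sha256(verifier).digest(), stored)
```

In this arrangement only the verifier ever leaves the client, the encryption key stays local, and the expensive step runs once per login on the user's own hardware.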
How are you handling the trade-off between client-side performance and high-security KDF parameters? Are there specific edge cases or mobile device limitations you've encountered when requiring significant amounts of RAM for client-side hashing?
I am particularly interested in feedback on the trade-off between higher security (stronger parameters) and user cost (UX degradation).
Cover image by Athena Sandrini on Pexels, with minor edits.