It's remarkable too that none of the other commenters noted this. As the adage goes: just because you read it on the internet doesn't make it true! Good catch.
I have updated the benchmarks with more data. WDYT now?
I think it's a lot better! I suspect that some of the differences now come down to technical trivia: superficially irrelevant details you happened to choose when implementing these programs. But that's the real world for you. The current data highlights much more clearly just how many req/sec any of these options can handle, and I think that's the real takeaway here: the web frameworks themselves are unlikely to be a significant bottleneck in any real-world use case. And if the language and/or framework matters for heavier, real workloads, well, that's the kind of thing you can't microbenchmark well; you need a real use case.
What the current data also highlights more clearly is just how finicky performance at this level is; for example, the way the program using the Go HTTP stack is apparently much more efficient than the program you labelled TCP, or how much the wrk and drill results differ. And that's important to understand: microbenchmarks are notoriously flaky and sensitive to all kinds of details you don't actually care about. Taking a microbenchmark to mean that task X takes time Y is usually the wrong way to think about it; it takes time Y only in the hyper-specific circumstances of the test, and generalizing such a specific result is quite error-prone.
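To make that concrete, here's a purely hypothetical sketch (assumed names and ports, not your actual programs) of how two "equivalent" Go servers can differ in exactly this kind of detail: a stock net/http handler gets keep-alive and buffered writes for free, while a hand-rolled TCP responder that closes every connection pays a fresh handshake per request, which on its own can swing a req/sec microbenchmark.

    package main

    // Hypothetical illustration: two "hello" servers that can microbenchmark
    // very differently because of one detail: connection reuse.

    import (
        "fmt"
        "net"
        "net/http"
    )

    func main() {
        // Hand-rolled TCP responder: "Connection: close" means every request
        // in a load test pays a fresh TCP handshake.
        go func() {
            ln, err := net.Listen("tcp", ":8081")
            if err != nil {
                return
            }
            for {
                conn, err := ln.Accept()
                if err != nil {
                    continue
                }
                go func(c net.Conn) {
                    defer c.Close()
                    buf := make([]byte, 4096)
                    c.Read(buf) // naively read (part of) the request and discard it
                    c.Write([]byte("HTTP/1.1 200 OK\r\nContent-Length: 5\r\nConnection: close\r\n\r\nhello"))
                }(conn)
            }
        }()

        // net/http version: keep-alive, buffered writes and header parsing for free.
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprint(w, "hello")
        })
        http.ListenAndServe(":8080", nil)
    }

Whether anything like that is going on in your TCP variant I obviously don't know; the point is just that a single choice like this can dominate the numbers without the framework itself mattering at all.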
I think this is an excellent takeaway.