Some of these improvements were rather straightforward and are applicable to most codebases:
- Remove unnecessary work (in our case, when returning a string response)
- Lazily construct complex class instances (such as …)
- Pass around references instead of performing a table lookup
- Replace defineProperty() calls with private class fields
  - Though private symbols turned out to be even faster
- Use class when instantiating similar-shaped objects
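Two of the items above, lazy construction and private class fields, can be sketched together. This is a minimal illustration and not Osgood's actual code: `HeaderMap`, `IncomingRequest`, and the `constructed` counter are hypothetical names made up for the example. The expensive object is only built the first time it is read, and the private fields are declared up front so every instance has the same shape.

```javascript
// Hypothetical stand-in for an object that is costly to build.
let constructed = 0;

class HeaderMap {
  constructor(rawPairs) {
    constructed++; // count how many times we pay the construction cost
    this.entries = new Map(rawPairs.map(([k, v]) => [k.toLowerCase(), v]));
  }

  get(name) {
    return this.entries.get(name.toLowerCase());
  }
}

// Hypothetical request wrapper, assumed for illustration only.
class IncomingRequest {
  // Private class fields are declared on the class, so no per-instance
  // Object.defineProperty() call is needed to hide them.
  #rawHeaders;
  #headers = null;

  constructor(url, rawHeaders) {
    this.url = url;
    this.#rawHeaders = rawHeaders;
  }

  // Build the HeaderMap on first access, then reuse the same instance.
  get headers() {
    if (this.#headers === null) {
      this.#headers = new HeaderMap(this.#rawHeaders);
    }
    return this.#headers;
  }
}

const req = new IncomingRequest("/hello", [["Content-Type", "text/plain"]]);
console.log(constructed); // 0: nothing built yet
console.log(req.headers.get("content-type")); // text/plain
req.headers; // second access reuses the cached instance
console.log(constructed); // 1
```

A handler that returns a plain string and never reads `headers` skips the construction cost entirely, which is the kind of unnecessary work the first bullet refers to.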
Many of these optimizations are ones you wouldn't necessarily want to make within app code. However, with Osgood being a large-audience platform for running application code, it makes sense to optimize as much as possible and benefit a large number of applications.
Using the wrk benchmarking tool, we saw a roughly 3x improvement, measured in requests per second (r/s), when running a simple “Hello, World!” benchmark with 10 concurrent requests: osgood@0.1.0 runs at 25,261 r/s whereas osgood@0.2.1 runs at 77,450 r/s! (For reference, Node.js v12.7.0 runs at 31,159 r/s.)
As you can see, Osgood runs much faster as the concurrency increases. We built Osgood with concurrency in mind from the beginning, so these results aren't that surprising. Under the hood, Osgood makes use of Tokio. From the Tokio homepage:
Applications built with Tokio are concurrent out of the box. Tokio provides a multithreaded, work-stealing, task scheduler tuned for async networking work loads.
Here are some raw numbers from these benchmarks, which also show that the standard deviation of response time dropped by more than an order of magnitude:
    $ wrk -d 60 -c 10 http://localhost:3000/hello # osgood 0.1.0
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency     3.26ms    9.91ms 123.57ms   92.77%
        Req/Sec    12.69k     2.91k   16.98k    73.83%
    Requests/sec:  25261.70

    $ wrk -d 60 -c 10 http://localhost:3000/hello # osgood 0.2.1
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   140.86us  219.40us  15.27ms   97.41%
        Req/Sec    38.92k     2.30k   45.89k    71.38%
    Requests/sec:  77449.91

    $ wrk -d 60 -c 10 http://localhost:3000/hello # node v12.7.0
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency   321.69us   96.95us  11.08ms   98.41%
        Req/Sec    15.66k     1.18k   17.50k    76.54%
    Requests/sec:  31159.16

    $ wrk --version
    wrk 4.0.0 [epoll]
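To make the comparison concrete, the ratios can be worked out directly from the wrk output above. The snippet below is only a small arithmetic sketch: the `results` object and the `ratio` helper are names made up for this example, and the figures are copied from the runs shown, with the latency standard deviations converted to microseconds.

```javascript
// Figures copied from the wrk runs above (latency stdev in microseconds).
const results = {
  "osgood 0.1.0": { reqPerSec: 25261.70, latencyStdevUs: 9910 },
  "osgood 0.2.1": { reqPerSec: 77449.91, latencyStdevUs: 219.40 },
  "node v12.7.0": { reqPerSec: 31159.16, latencyStdevUs: 96.95 },
};

// Hypothetical helper: how many times larger a is than b.
const ratio = (a, b) => a / b;

const oldOsgood = results["osgood 0.1.0"];
const newOsgood = results["osgood 0.2.1"];
const node = results["node v12.7.0"];

// Throughput of 0.2.1 relative to 0.1.0 (about 3.07x).
const speedupVsOld = ratio(newOsgood.reqPerSec, oldOsgood.reqPerSec);

// Throughput of 0.2.1 relative to Node.js (about 2.49x).
const speedupVsNode = ratio(newOsgood.reqPerSec, node.reqPerSec);

// How much the latency standard deviation shrank between releases (about 45x).
const stdevReduction = ratio(oldOsgood.latencyStdevUs, newOsgood.latencyStdevUs);

console.log(`throughput vs osgood 0.1.0: ${speedupVsOld.toFixed(2)}x`);
console.log(`throughput vs node v12.7.0: ${speedupVsNode.toFixed(2)}x`);
console.log(`latency stdev reduction:    ${stdevReduction.toFixed(1)}x`);
```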
The code used for these benchmarks is available here.
We're pretty happy with the performance gains we've been able to make. That said, we have more plans to make it even faster. One feature we're planning to implement is optional auto-scaling of workers (a feature which gave a 2.5x improvement over the …).
While the average latency of osgood@0.2.1 is less than half that of Node.js, the max is still higher. This means there should still be some room to optimize garbage collection and get more consistent results.
As always, patches are welcome, and if you see an area where performance could be improved, we'd love to get a PR from you!
Wanna get your hands on this faster version of Osgood? Visit the releases page and download the latest version!