AlexMikhalev

Originally published at alexmikhalev.Medium

Benchmarks for BERT Large Question Answering inference for RedisAI and RedisGears


Photo by Aliaksei manlyx on Unsplash

Updated walkthrough and tests:

https://reference-architecture.ai/docs/bert-qa-benchmarking/ (BERT Large Uncased for Question Answering, ~0.088387 seconds)

Prerequisites for running the benchmark:

This assumes you are running Debian or Ubuntu and have Docker and docker-compose installed (or can create a virtual environment via conda):

git clone -b benchmark --recurse-submodules git@github.com:applied-knowledge-systems/the-pattern.git

cd the-pattern

./bootstrap_benchmark.sh

The script should end with a curl call to the qasearch API. Redis caching is disabled for the benchmark, and the cluster is small: 8 nodes in total (fixed in config.sh).

The curl call should look like this:

curl -i -H "Content-Type: application/json" -X POST -d '{"search":"Who performs viral transmission among adults?"}' http://localhost:8080/qasearch

HTTP/1.1 200 OK
Server: gunicorn
Date: Fri, 15 Oct 2021 22:11:23 GMT
Connection: close
Content-Type: application/json
Content-Length: 426

{"links":[{"created_at":"2001","rank":29,"source":"C0001486","target":"C0152083"}],"results":[{"answer":"","sentence":"Initially the 5 most gene 1 of the viral genome is translated into the viral A dROp which then replicates the viral genomic ANAs into negative strand ANAs","sentencekey":"sentence:PMC302072.xml:{8YG}:11","title":"Heterogeneous nuclear ribonucleoprotein A1 regulates RNA synthesis of a cytoplasmic virus"}]}

The response contains a sentence key that includes the shard id; alternatively, grab the "Cache key" from docker logs -f rgcluster. You also need to find, from the logs, the port of the shard that corresponds to the hashtag (also known as the shard id: the part in curly brackets, like {8YG}). The same value appears in the output of the export_load script.
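As a rough illustration of the step above, here is a minimal bash sketch that pulls the hashtag out of a sentence key and assembles a cache key from it. The function names extract_hashtag and build_cache_key are hypothetical, and the key layout (bertqa + hashtag + sentence key without its "sentence:" prefix + question without the trailing "?") is inferred from the examples in this post, not taken from the-pattern source, so treat the "Cache key" line in the logs as the authoritative value.

```shell
#!/usr/bin/env bash
# Sketch only: the key layout is inferred from the examples in this post.

# Print the first {...} group found in a key, braces included.
# Returns non-zero if the key has no {...} group.
extract_hashtag() {
  local key="$1"
  case "$key" in
    *\{*\}*) ;;
    *) return 1 ;;
  esac
  local rest="${key#*\{}"          # drop everything up to and including the first '{'
  printf '{%s}\n' "${rest%%\}*}"   # keep the text before the first '}'
}

# Assemble a cache key (assumed layout):
#   bertqa{tag}_<sentence key without "sentence:">_<question without trailing "?">
build_cache_key() {
  local sentence_key="$1" question="$2"
  local tag
  tag="$(extract_hashtag "$sentence_key")" || return 1
  printf 'bertqa%s_%s_%s\n' "$tag" "${sentence_key#sentence:}" "${question%\?}"
}
```

For example, extract_hashtag 'sentence:PMC302072.xml:{8YG}:11' prints {8YG}. If you would rather not read the logs for the port, redis-cli -c -p 30003 -h 127.0.0.1 cluster keyslot "<cache key>" returns the hash slot for a key, and cluster nodes lists which node (and port) serves each slot range.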

Check that the call works:

redis-cli -c -p 30003 -h 127.0.0.1 get "bertqa{8YG}_PMC302072.xml:{8YG}:10_Who performs viral transmission among adults"

Then run the benchmark:

redis-benchmark -p 30004 -h 127.0.0.1 -n 10 get "bertqa{356}_PMC126080.xml:{356}:1_Who performs viral transmission among adults"

-n is the total number of requests to send.

You can also add:

--csv if you want the output in CSV format

--precision 3 if you want more decimal places in the latency (ms) output

More information about the benchmarking tool: https://redis.io/topics/benchmarks
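To show how these flags fit together, here is a small bash sketch of a wrapper around the benchmark command. The function name run_bertqa_benchmark and the DRY_RUN variable are hypothetical conveniences, not part of the-pattern; the host is the local one used throughout this post, and the port and key must be the ones you found for your own shard.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the redis-benchmark call from this post.
# Set DRY_RUN=1 to print the command instead of running it.

run_bertqa_benchmark() {
  local port="$1" key="$2" count="${3:-10}"
  # All options must come before the command being benchmarked (get <key>).
  local cmd=(redis-benchmark -h 127.0.0.1 -p "$port" -n "$count"
             --csv --precision 3 get "$key")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%q ' "${cmd[@]}"   # shell-quoted, so it can be pasted back into a terminal
    printf '\n'
  else
    "${cmd[@]}"
  fi
}
```

A call such as run_bertqa_benchmark 30004 "bertqa{356}_PMC126080.xml:{356}:1_Who performs viral transmission among adults" 10 then runs the same benchmark as above with CSV output and three decimal places.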

If you don't have redis-benchmark installed locally, you can run it inside the container instead:

docker exec -it rgcluster /bin/sh -c "redis-benchmark -r 10000 -n 10000 PING"

The platform only has 20 articles loaded and runs 8 Redis nodes (4 masters + 4 replicas), so answer relevance will be poor, but the setup doesn't need a lot of memory.
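If you want to confirm the cluster layout yourself, one option is to count node roles in the output of the standard CLUSTER NODES command. The helper below is a minimal sketch (count_cluster_roles is a hypothetical name); it relies on the third field of each output line holding the node flags, where replicas are reported with the legacy "slave" flag.

```shell
#!/usr/bin/env bash
# Sketch only: summarise node roles from `redis-cli cluster nodes` output on stdin.
# Field 3 of each line holds comma-separated flags such as "myself,master" or "slave".

count_cluster_roles() {
  awk '{
         if ($3 ~ /master/)     masters++
         else if ($3 ~ /slave/) replicas++
       }
       END { printf "masters=%d replicas=%d\n", masters, replicas }'
}
```

Usage, assuming one of the shard ports from this post is reachable: redis-cli -p 30003 -h 127.0.0.1 cluster nodes | count_cluster_roles. With the default configuration described here, this should report 4 masters and 4 replicas.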

There are many ways to optimise this deployment, for example adding FP16 quantization and ONNX Runtime; this script will be a good starting point.
