Julien Simon

Posted on • Originally published at julsimon.Medium on

Video: Accelerate Transformer inference with AWS Inferentia 2

AWS Inferentia2 is now generally available, and I couldn’t resist testing it with BERT models and comparing results with Inferentia1.

This thing is FAST and looks very cost-effective. Check it out!
