
Julien Simon

Originally published at julsimon.Medium

Deploying SuperNova-Lite on Inferentia2: the best 8B model for $1 an hour!

In this video, you will learn about Llama-3.1-SuperNova-Lite, the best open-source 8B model available today according to the Hugging Face Open LLM Leaderboard.

Llama-3.1-SuperNova-Lite is an 8B parameter model developed by Arcee.ai, based on the Llama-3.1-8B-Instruct architecture. It is a distilled version of the larger Llama-3.1-405B-Instruct model, leveraging offline logits extracted from the 405B parameter variant. This 8B variation of Llama-3.1-SuperNova maintains high performance while offering exceptional instruction-following capabilities and domain-specific adaptability.
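
To give a rough idea of what logit-based distillation involves (this is not Arcee's actual training recipe, just a minimal sketch), the student is trained to match the softened output distribution of the teacher using logits that were precomputed offline from the 405B model. The temperature and the loss scaling below are arbitrary placeholders:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.
    teacher_logits are loaded from disk ("offline"), not recomputed per step."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # 'batchmean' matches the KL definition; the T^2 factor keeps gradients
    # on the same scale as a standard cross-entropy term it may be combined with
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t ** 2)
```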

I'll show you how to compile SuperNova-Lite on the fly and deploy it on a SageMaker endpoint powered by an inf2.xlarge instance, the smallest Inferentia2 instance available, at only $0.99 an hour!
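
Here is a minimal sketch of what such a deployment could look like with the SageMaker Python SDK and the Hugging Face TGI Neuronx container, which compiles the model for Neuron when the endpoint loads it. The environment variables, sequence-length settings, and the arcee-ai/Llama-3.1-SuperNova-Lite model id are my assumptions rather than the exact configuration shown in the video:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# Hugging Face TGI container built for AWS Neuron (Inferentia2)
image_uri = get_huggingface_llm_image_uri("huggingface-neuronx")

model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={
        "HF_MODEL_ID": "arcee-ai/Llama-3.1-SuperNova-Lite",  # assumed Hub id
        "HF_NUM_CORES": "2",            # inf2.xlarge exposes 2 NeuronCores
        "HF_AUTO_CAST_TYPE": "fp16",
        "MAX_BATCH_SIZE": "1",
        "MAX_INPUT_TOKENS": "3686",
        "MAX_TOTAL_TOKENS": "4096",
    },
)

# Neuron compilation happens when the container loads the model on the endpoint
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",
    volume_size=64,
    container_startup_health_check_timeout=1800,  # leave time for compilation
)

response = predictor.predict({
    "inputs": "Explain what AWS Inferentia2 is in two sentences.",
    "parameters": {"max_new_tokens": 128, "do_sample": False},
})
print(response)
```

On-the-fly compilation takes a while on a single-chip instance, hence the generous startup timeout; remember to delete the endpoint when you're done so the $0.99/hour billing stops.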
