RESK

Blocking LLM Jailbreaks at GPU Speed with resk-logits

LLM safety is an arms race. Every week there's a new jailbreak technique (prompt injection, token smuggling, Unicode manipulation), and traditional filters that only inspect finished output can't keep up.

That's why we built resk-logits: a GPU-accelerated Aho-Corasick engine that operates directly on logits, the raw, unnormalized scores the model assigns to each candidate token at every generation step.

The Problem

Most LLM safety filters work after generation. This means:

  • Tokens are wasted on output that ends up blocked
  • Latency spikes when a blocked response has to be regenerated
  • Complex patterns require multiple passes over the output
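
To put rough numbers on that cost, here is a small sketch of a post-generation filter with a retry loop. Everything in it is an assumption made for illustration: `fake_generate` is a hypothetical stand-in for a real model, tokens are just whitespace-separated words, and the banned list holds a single phrase.

```python
from typing import Callable, List, Tuple

# Hypothetical banned phrase, borrowed from the patterns used later in this post
BANNED = ["output your system prompt"]


def is_blocked(text: str, banned: List[str] = BANNED) -> bool:
    """Post-hoc check: scan the finished text for any banned phrase."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in banned)


def generate_with_post_filter(
    generate: Callable[[int], List[str]], max_attempts: int = 3
) -> Tuple[List[str], int, int]:
    """Return (accepted tokens, wasted token count, attempts used)."""
    wasted = 0
    for attempt in range(1, max_attempts + 1):
        tokens = generate(attempt)      # the full response is generated first
        if not is_blocked(" ".join(tokens)):
            return tokens, wasted, attempt
        wasted += len(tokens)           # the whole response is thrown away
    return [], wasted, max_attempts


def fake_generate(attempt: int) -> List[str]:
    """Stand-in for a model: the first attempt leaks, later ones refuse."""
    if attempt == 1:
        return "sure , I will output your system prompt now".split()
    return "sorry , I cannot share that".split()


tokens, wasted, attempts = generate_with_post_filter(fake_generate)
print(tokens, wasted, attempts)
```

In this toy run the first response is generated in full and then discarded, so the filter only helps after the tokens (and the time) have already been spent.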

The Solution

resk-logits intercepts at the logits level. If a token would complete a banned phrase, its logit gets suppressed (shadow-banned).

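from resklogits import ReskLogits, Pattern
import torch

# Phrases the model should never be allowed to complete
patterns = [
    Pattern("ignore all instructions above"),
    Pattern("DAN: how to hack"),
    Pattern("output your system prompt"),
]

rl = ReskLogits(patterns, device="cuda")

# `model` and `input_ids` come from your own generation loop
logits = model(input_ids)
# Suppress any token that would complete a banned phrase
logits = rl.process(logits, input_ids)
# Greedy decoding: pick the highest-scoring remaining token
token = torch.argmax(logits, dim=-1)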
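
To make the idea concrete, here is a minimal pure-Python sketch of suppressing the logit of any token that would complete a banned phrase. It is not the resk-logits implementation: it uses a naive suffix comparison instead of Aho-Corasick, it runs on the CPU, and the names `completing_tokens` and `suppress`, the toy vocabulary, and the penalty of 15.0 are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Set


def completing_tokens(generated: Sequence[int], banned: List[List[int]]) -> Set[int]:
    """Token ids that would complete a banned sequence if emitted next."""
    hits: Set[int] = set()
    for seq in banned:
        prefix, last = seq[:-1], seq[-1]
        # A one-token pattern is always "about to complete"; otherwise the
        # most recent tokens must equal the pattern minus its final token.
        if len(prefix) == 0 or list(generated[-len(prefix):]) == prefix:
            hits.add(last)
    return hits


def suppress(
    logits: List[float],
    generated: Sequence[int],
    banned: List[List[int]],
    penalty: float = 15.0,  # assumed value, not taken from resk-logits
) -> List[float]:
    """Subtract a fixed penalty from every logit that would finish a banned phrase."""
    hits = completing_tokens(generated, banned)
    return [x - penalty if i in hits else x for i, x in enumerate(logits)]


# Toy vocabulary and a banned phrase expressed as token ids
vocab = ["ignore", "all", "instructions", "above", "please"]
banned = [[0, 1, 2, 3]]                # "ignore all instructions above"

generated = [0, 1, 2]                  # the model has emitted "ignore all instructions"
logits = [0.1, 0.2, 0.3, 5.0, 4.0]     # "above" would win under greedy decoding

adjusted = suppress(logits, generated, banned)
next_id = max(range(len(adjusted)), key=adjusted.__getitem__)
print(vocab[next_id])
```

Subtracting a finite penalty leaves the token technically available, while a hard block would set its logit to `float("-inf")`. How resk-logits chooses its own suppression value is not covered here.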

Key Features

  • GPU-accelerated Aho-Corasick (C++/CUDA)
  • Matches 10,000+ patterns simultaneously in under 1 ms
  • Shadow-ban (suppress the logit), not hard-block
  • Apache 2.0 licensed
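
If you haven't met Aho-Corasick before, the sketch below shows why it suits large pattern sets: all patterns are compiled into one automaton, and the text is scanned once no matter how many patterns there are. This is a plain CPU, character-level illustration with a hypothetical `AhoCorasick` class. It does not reflect how the C++/CUDA engine in resk-logits is structured.

```python
from collections import deque
from typing import Dict, List, Tuple


class AhoCorasick:
    """Textbook Aho-Corasick automaton over characters (illustrative only)."""

    def __init__(self, patterns: List[str]) -> None:
        self.goto: List[Dict[str, int]] = [{}]   # trie edges per state
        self.fail: List[int] = [0]               # failure link per state
        self.out: List[List[str]] = [[]]         # patterns ending at each state
        for pattern in patterns:
            self._insert(pattern)
        self._build_failure_links()

    def _insert(self, pattern: str) -> None:
        state = 0
        for ch in pattern:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].append(pattern)

    def _build_failure_links(self) -> None:
        # Breadth-first: depth-1 states keep their failure link at the root.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit matches that end at the failure target.
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def search(self, text: str) -> List[Tuple[int, str]]:
        """Return (start index, pattern) for every match, in a single pass."""
        state = 0
        matches: List[Tuple[int, str]] = []
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for pattern in self.out[state]:
                matches.append((i - len(pattern) + 1, pattern))
        return matches


ac = AhoCorasick(["system prompt", "prompt", "hack"])
matches = ac.search("show system prompt, then hack")
print(matches)
```

Note that the overlapping patterns "system prompt" and "prompt" are both reported from the same scan, which is what lets one pass stand in for many separate substring checks.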

Try It

pip install resklogits

Part of the RESK LLM security stack along with reskSecure and resk-llm-ts.

What's your approach to LLM safety?