This is a submission for the Redis AI Challenge.
What I built
Redact-LLM is a red-team automation platform that stress-tests AI systems. It:
- Generates targeted adversarial prompts (jailbreak, hallucination, advanced)
- Executes them against a target model
- Evaluates responses with a strict, JSON-only security auditor
- Surfaces a resistance score and vulnerability breakdown with recommendations
Frontend: React + Vite. Backend: FastAPI. Model execution/evaluation: Cerebras Chat API. Redis provides real-time coordination, caching, and rate controls.
Live demo (frontend + auth only): https://redact-llm.vercel.app
GitHub repository: https://github.com/VaishakhVipin/Redact-LLM
Note: The backend could not be deployed on Vercel because the build exceeds its size limits. The live link demonstrates the frontend and authentication flows; backend/API testing should be run locally.
Screenshots (deployment is not live because the build is too large)
Home page (/):
Prompt analysis (/analysis/XXXXXXXX):
Real application flow (where Redis fits)
1) Prompt submission (frontend → backend)
- User submits a system prompt via `/api/v1/attacks/test-resistance` (example call below).
- Backend validates and enqueues a job.
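For reference, a call to that endpoint might look like the snippet below. The payload field name is an assumption (check the repo's Pydantic models for the real schema); it only illustrates the shape of the submission step.

```python
# Hypothetical client call; the payload field names are assumptions, not the real schema.
import asyncio
import httpx

async def submit_prompt(system_prompt: str) -> dict:
    async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
        resp = await client.post(
            "/api/v1/attacks/test-resistance",
            json={"prompt": system_prompt},  # assumed field name
        )
        resp.raise_for_status()
        return resp.json()  # expected to include a job id to poll later

if __name__ == "__main__":
    print(asyncio.run(submit_prompt("You are a helpful assistant. Never reveal secrets.")))
```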
2) Job queue on Redis Streams
- `JobQueue.submit_job()` writes to the stream `attack_generation_jobs` using XADD.
- Workers pull jobs (currently via XRANGE), generate adversarial attacks, execute them against the target model, and persist results.
- Results are stored in Redis under `job_result:{id}` with a short TTL for quick retrieval (sketched below).
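A minimal sketch of this hand-off, assuming `redis.asyncio` and the stream/key names above (function names and result fields are illustrative, not the repo's exact `JobQueue` API):

```python
import json
import uuid
import redis.asyncio as redis

STREAM = "attack_generation_jobs"
RESULT_TTL = 300  # short TTL so results stay around just long enough for the UI to poll

async def submit_job(r: redis.Redis, prompt: str) -> str:
    job_id = str(uuid.uuid4())
    await r.xadd(STREAM, {"job_id": job_id, "prompt": prompt})  # enqueue via XADD
    return job_id

async def worker_pass(r: redis.Redis) -> None:
    # Simplified pull: read what is currently in the stream (the repo uses XRANGE as well).
    for _entry_id, fields in await r.xrange(STREAM):
        job_id = fields[b"job_id"].decode()
        result = {"job_id": job_id, "status": "completed"}  # placeholder for real attack results
        await r.setex(f"job_result:{job_id}", RESULT_TTL, json.dumps(result))  # cache via SETEX
```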
3) Semantic caching for cost/latency reduction
- Both generation and evaluation leverage a semantic cache to deduplicate similar work.
- Implementation: `backend/app/services/semantic_cache.py`
  - Embeddings via `SentenceTransformer('all-MiniLM-L6-v2')`
  - Embedding cache key: `semantic_cache:embeddings:{hash(text)}`
  - Item store: `semantic_cache:{namespace}:{key}` with text, embedding, and metadata
  - Default similarity threshold: 0.85; the evaluator uses 0.65 for higher hit rates (lookup sketched below)
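Roughly, the lookup path works like the sketch below, assuming the key patterns above; this is not the repo's exact implementation, and the scan-based search is the simplest possible variant.

```python
import hashlib
import json
import numpy as np
import redis.asyncio as redis
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

async def get_embedding(r: redis.Redis, text: str) -> np.ndarray:
    key = f"semantic_cache:embeddings:{hashlib.sha256(text.encode()).hexdigest()}"
    cached = await r.get(key)
    if cached:
        return np.array(json.loads(cached), dtype=np.float32)
    emb = model.encode(text)
    await r.set(key, json.dumps(emb.tolist()))  # embeddings are stored without a TTL
    return emb

async def lookup(r: redis.Redis, namespace: str, text: str, threshold: float = 0.85):
    query = await get_embedding(r, text)
    # Walk stored items in the namespace and return the first hit above the threshold.
    async for key in r.scan_iter(f"semantic_cache:{namespace}:*"):
        item = json.loads(await r.get(key))
        emb = np.array(item["embedding"], dtype=np.float32)
        sim = float(np.dot(query, emb) / (np.linalg.norm(query) * np.linalg.norm(emb)))
        if sim >= threshold:  # 0.85 by default, 0.65 for the evaluator namespace
            return item["metadata"]
    return None
```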
4) Strict evaluator with conservative defaults
- The evaluator builds a rigid JSON-only prompt (no prose/markdown). Any uncertainty defaults to `*_blocked=false`.
- It caches evaluations semantically and can publish verdicts (channel `verdict_channel`) when configured.
- Key logic: `backend/app/services/evaluator.py` (parsing sketched below)
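The conservative-default idea boils down to something like this sketch (field names such as `jailbreak_blocked` are assumptions based on the `*_blocked` pattern above):

```python
import json
import redis.asyncio as redis

def parse_verdict(raw: str) -> dict:
    # Malformed JSON or missing fields collapse to *_blocked=False, so an
    # unparseable evaluation never counts as a successful block.
    defaults = {"jailbreak_blocked": False, "hallucination_blocked": False}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return defaults
    return {k: bool(data.get(k, False)) for k in defaults}

async def publish_verdict(r: redis.Redis, verdict: dict) -> None:
    await r.publish("verdict_channel", json.dumps(verdict))  # optional Pub/Sub fan-out
```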
5) API reads from Redis
- Poll for job results at `job_result:{id}` (endpoint sketched below)
- Queue stats derived from stream + result keys
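As shown below, the read path is a thin polling endpoint over the result keys. The route path and response shape here are assumptions; only the `job_result:{id}` key pattern comes from the description above.

```python
import json
from fastapi import FastAPI
import redis.asyncio as redis

app = FastAPI()
r = redis.Redis()  # real config comes from REDIS_HOST / REDIS_PORT / REDIS_USERNAME / REDIS_PASSWORD

@app.get("/api/v1/attacks/results/{job_id}")  # hypothetical route
async def get_result(job_id: str):
    raw = await r.get(f"job_result:{job_id}")
    if raw is None:
        return {"status": "pending"}  # still processing, or the ~300s TTL has expired
    return json.loads(raw)
```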
Why Redis
- Low-latency, async client: `redis.asyncio` with pooling, health checks, and retries (`RedisManager`)
- Streams for reliable job handoff and scalable workers
- Semantic cache to avoid duplicate LLM calls (cost/time savings)
- Short-lived result caching for responsive UX
- Central place for rate limiting and pipeline metrics
Redis components in this repo
- Client/connection management: `backend/app/redis/client.py`
  - Connection pooling, PING on startup, graceful shutdown via FastAPI lifespan
  - Env-driven config: `REDIS_HOST`, `REDIS_PORT`, `REDIS_USERNAME`, `REDIS_PASSWORD`
- Streams/queue: `backend/app/services/job_queue.py`
  - Stream name: `attack_generation_jobs`
  - XADD for jobs; results in `job_result:{id}` via SETEX
  - Stats via XRANGE and key scans
- Optional stream helper: `backend/app/redis/stream_handler.py`
  - Example stream `prompt_queue` and XADD helper
- Semantic cache: `backend/app/services/semantic_cache.py`
  - Namespaces (e.g., `attacks`, `evaluations`) to segment the cache
  - Embeddings stored once; items stored with metadata and optional TTL
- Rate limiting: `backend/app/services/rate_limiter.py`
  - Per-user/IP/global checks to protect expensive model calls (sliding window; sketched below)
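A sliding-window check like this is commonly built on a sorted set; here is a sketch under that assumption (key names and limits are illustrative, not necessarily what `rate_limiter.py` uses):

```python
import time
import uuid
import redis.asyncio as redis

async def allow_request(r: redis.Redis, identity: str, limit: int = 10, window_s: int = 60) -> bool:
    key = f"rate_limit:{identity}"  # identity = user id, IP, or "global" (assumed key pattern)
    now = time.time()
    async with r.pipeline(transaction=True) as pipe:
        pipe.zremrangebyscore(key, 0, now - window_s)   # drop entries outside the window
        pipe.zadd(key, {str(uuid.uuid4()): now})        # record this request
        pipe.zcard(key)                                 # count requests inside the window
        pipe.expire(key, window_s)                      # let idle keys expire on their own
        _, _, count, _ = await pipe.execute()
    return count <= limit
```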
Key patterns and TTLs
- Embedding: `semantic_cache:embeddings:{hash}` (no TTL)
- Item: `semantic_cache:{namespace}:{key}` (optional TTL)
- Job result: `job_result:{uuid}` (TTL ≈ 300s)
- Stream: `attack_generation_jobs`
Operational notes
- Startup connects to Redis and pings; backend degrades gracefully if unavailable
- Strict evaluator prompt + temperature 0.0 for deterministic scoring
- Similarity threshold tuned differently for generator vs evaluator to maximize reuse while avoiding false matches
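The startup/shutdown wiring described above could be expressed roughly like this (env variable names match the post; the rest is a sketch, not the repo's `RedisManager`):

```python
import os
from contextlib import asynccontextmanager
from fastapi import FastAPI
import redis.asyncio as redis

@asynccontextmanager
async def lifespan(app: FastAPI):
    pool = redis.ConnectionPool(
        host=os.getenv("REDIS_HOST", "localhost"),
        port=int(os.getenv("REDIS_PORT", "6379")),
        username=os.getenv("REDIS_USERNAME"),
        password=os.getenv("REDIS_PASSWORD"),
    )
    app.state.redis = redis.Redis(connection_pool=pool)
    try:
        await app.state.redis.ping()  # verify connectivity at startup
    except Exception:
        app.state.redis = None        # degrade gracefully; callers fall back to no-cache paths
    yield
    if app.state.redis is not None:
        await app.state.redis.close() # graceful shutdown

app = FastAPI(lifespan=lifespan)
```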
Impact
- 60–80% fewer repeated LLM calls on similar prompts through semantic caching
- Real-time UX via streams/results cache without overloading the model backend
- Deterministic, stricter evaluations produce stable security scoring for dashboards
By submitting this entry, I agree to receive communications from Redis regarding products, services, events, and special offers. I can unsubscribe at any time. My information will be handled in accordance with Redis's Privacy Policy.