This technical post walks through the design and implementation of Secure Playground: a local web app that simulates prompt-injection attacks against large language models and demonstrates simple defenses.
Goals
- Provide a minimal, reproducible environment to test payloads and defensive strategies.
- Make it easy to add new providers and run mutation-based red-team experiments.
- Offer a leaderboard and scoring model so defenders can iterate on mitigations.
High-level architecture
Key components
-
secure_playground/app/engine/agno_pipeline.py— orchestrates a set of agents (prompting, defense, scoring) using an Agno-style pipeline. -
secure_playground/app/engine/redteam.py— mutation utilities to create adversarial payload variants. -
secure_playground/app/providers/client.py— adapter/factory for OpenAI-compatible clients (OpenAI, Ollama, Featherless). -
secure_playground/app/scoring/resilience.py— heuristics that turn model output into a numeric risk score.
Provider integration (example)
Providers are implemented as small adapters that expose a generate(prompt, system_prompt) method. The make_client factory returns an adapter based on a provider enum.
Excerpt (adapted from secure_playground/app/providers/client.py):
class OpenAICompatibleClient:
def __init__(self, base_url: str | None, api_key: str, model: str) -> None:
self.model = model
self.client = OpenAI(base_url=base_url, api_key=api_key)
def generate(self, prompt: str, system_prompt: str) -> str:
res = self.client.responses.create(
model=self.model,
input=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": prompt},
],
)
return res.output_text
This pattern makes it straightforward to add other providers — implement the same generate signature and return plain text.
Pipeline & scoring
The pipeline accepts a SimulationInput object (user prompt + payload + defense configuration + provider) and returns a result object with score, blocked, and risk_flags. The scoring module encapsulates the heuristics used to judge whether a response constitutes a successful injection.
Design notes:
- Keep the scoring deterministic and reproducible: small, well-defined heuristics are easier to iterate on and test than complex black-box models.
- Treat mutations as a separate stage; the pipeline can replay/persist mutation results to build robust datasets.
Running experiments
- Start the app locally with
uvicorn secure_playground.app.main:app --reload. - Use the UI to select a seed payload and run the simulation. Optional: enable mutations to run multiple mutated variants.
- Export leaderboard entries (the store is a simple JSON file) and analyze patterns in successful payloads.
Extending the project
- Add provider integrations (Anthropic, Vertex AI). Create wrappers that follow the
generate(prompt, system_prompt)contract. - Add a Docker compose file that brings up a local Ollama image, the web app, and an experiment runner.
- Implement a test harness and CI that rejects PRs which reduce resilience score on a canonical payload set.
Security & ethics
This project is intended for research and defensive work. Do not use it to target third-party services or to create exploit infrastructure. When adding new payloads or experiments, ensure they are stored locally and never posted to public services without explicit permission.
Screenshots
Github and more: https://www.dailybuild.xyz/project/123-secure-playground



Top comments (0)