Black Hat Europe 2025 Arsenal: 8 AI Security Tools Transforming Cybersecurity

#ai #cybersecurity #news #opensource

Introduction
In December 2025, the global cybersecurity community’s annual flagship event, Black Hat Europe 2025, is set to kick off in London, UK. The Arsenal showcase, a key indicator of technological trends within the Black Hat series, has always been a focal point for security researchers and practitioners to gain insights into future trends. It brings together the world’s most cutting-edge open-source security tools and innovative concepts. This article provides a comprehensive analysis of 8 open-source AI security tools that will be presented at the Black Hat Europe 2025 Arsenal, helping you get an early look at their technical highlights and application scenarios.
Github link：

https://github.com/Tencent/AI-Infra-Guard
https://github.com/mandiant/harbinger
https://github.com/stratosphereips/MIPSEval
https://github.com/ErdemOzgen/RedAiRange
https://github.com/ReversecLabs/spikee
https://github.com/ThalesGroup/sql-data-guard

I. The Rise of AI Red Teaming: Platforms, Ranges, and Infrastructure Assessment
AI-powered red team attacks are rapidly evolving from individual techniques into systematic operational capabilities. This conference showcases a “Red Team Trilogy” covering an operational platform, a training range, and a risk self-assessment tool.

Harbinger: The AI-Powered Red Team Operations Center
Traditional red team operations are heavily reliant on manual experience, leading to significant efficiency bottlenecks. The “Harbinger” platform, open-sourced by the renowned cybersecurity company Mandiant, aims to address this pain point. It is an AI-driven red team collaboration and decision-making platform with core innovations in:

· Operational Automation: Utilizes AI to automatically execute repetitive tasks such as reconnaissance, exploitation, and lateral movement.

· Decision Support: Based on the operational landscape, AI can recommend the optimal next attack path to red team members.

· Automated Reporting: Automatically organizes attack logs, screenshots, and findings to generate structured attack reports, freeing red team members from tedious documentation work.

Connecting the different components of red teaming. This project integrates multiple components commonly used and makes it easier to perform actions, output, and parse.

Features
· Socks tasks: Run tools over socks proxies and log the output, as well as templating of commonly used tools.

· Neo4j: Use data from neo4j directly into templating of tool commands.

· C2 Servers: By default we have support for Mythic. But you can bring your own integration by implementing some code, see the custom connectors documentation.

· File parsing: Harbinger can parse a number of file types and import the data into the database. Examples include lsass dumps and ad snapshots. See the parser table for a full list.

· Output parsing: Harbinger can detect useful information in output from the C2 and provide you easy access to it.

· Data searching: Harbinger gives you the ability to search for data in the database in a number of ways. It combines the data from all your C2s in a single database.

· Playbooks: Execute commands in turn in a playbook.

· Dark mode: Do I need to say more.

· AI integration: Harbinger uses LLMs to analyze data, extract useful information and provide suggestions to the operator for the next steps and acts as an assistant.

Harbinger signals a shift in AI red teaming from “using AI tools” to being “driven by an AI platform.”

Red AI Range (RAR): The Digital Dojo for AI Offense and Defense
Theoretical knowledge cannot replace hands-on experience. “Red AI Range (RAR),” developed by Sasan Security, provides a much-needed AI security “cyber range” for the industry. It is an AI/ML system environment with pre-configured vulnerabilities, allowing security professionals to:

· Practice Real-World Attacks: Engage in hands-on practice of real-world attack techniques such as model evasion, data poisoning, and model stealing.

· Validate Defenses: Deploy and test defensive measures against AI threats in a controlled environment.
The open-sourcing of RAR significantly lowers the barrier for enterprises and individuals to conduct AI offensive and defensive exercises.

A.I.G: The AI Security Risk Self-Assessment Platform
From the underlying AI infrastructure to the Agent application layer, Tencent’s Zhuque Lab has open-sourced “A.I.G,” a comprehensive, intelligent, and user-friendly AI red team security testing platform. Unlike Harbinger, it focuses on helping ordinary users quickly assess the security risks of AI systems themselves, providing a very intuitive front-end interface. Its core capabilities include:

· AI Infrastructure Scanning: Accurately scans mainstream AI frameworks (like Ollama, ComfyUI) based on fingerprinting and detects known CVE vulnerabilities within them.

· MCP Server Scanning: With the explosion in popularity of MCPs, their security has become crucial. A.I.G uses Agent technology to scan MCP Server source code or remote MCP URLs, covering nine major risk categories including tool poisoning, remote code execution, and indirect prompt injection.
· Large Model Security Check-up: Includes multiple carefully curated jailbreak evaluation datasets to systematically assess the robustness of LLMs against the latest jailbreak attacks, and supports cross-model security comparison and scoring.
A.I.G has the highest number of GitHub Stars (2300+) among all the tools, and its widespread popularity indicates that AI security assessment is becoming democratized. Ordinary AI developers and Agent users also need a platform that can cover the full-stack risk assessment from the underlying infrastructure to the upper-level model applications.

II. LLM Prompt Security: From Prompt Injection to Data Protection
As LLMs become deeply integrated into business processes, fine-grained security assessment and access control are becoming critically important.

SPIKEE & MIPSEval: Evaluating Single-Turn and Multi-Turn LLM Security
Prompt injection is currently one of the most significant security threats to LLMs. SPIKEE (Simple Prompt Injection Kit for Evaluation and Exploitation), developed by Reversec, provides a lightweight, modular toolkit that allows researchers and developers to quickly test their LLM applications for prompt injection vulnerabilities.

However, many security issues only manifest during sustained, multi-turn conversations. The open-source tool MIPSEval fills this gap by being specifically designed to evaluate the security consistency of LLMs in long dialogues. For example, a model might refuse to answer an inappropriate question in the first turn, but after a few rounds of “priming” with unrelated conversation, its safety guardrails could be bypassed. MIPSEval, combined with multiple LLM Agents, provides a framework for evaluating this complex, stateful security.

SQL Data Guard: A Secure Channel for LLM Database Access
When an LLM needs to connect to an enterprise database to provide services, preventing sensitive data leakage or malicious SQL queries becomes a severe challenge. SQL Data Guard, open-sourced by Thales Group, offers an innovative solution. It acts as a security middleware deployed between the LLM and the database. By analyzing and rewriting the SQL queries generated by the LLM, it ensures that all database interactions comply with preset security policies, thereby effectively controlling risks while empowering the LLM with powerful data capabilities.
SQL is the go-to language for performing queries on databases, and for a good reason — it’s well known, easy to use, and pretty simple. However, it seems that it’s as easy to use as it is to exploit, and SQL injection is still one of the most targeted vulnerabilities — especially nowadays with the proliferation of “natural language queries” harnessing Large Language Models (LLMs) power to generate and run SQL queries.To help solve this problem, we developed sql-data-guard, an open-source project designed to verify that SQL queries access only the data they are allowed to. It takes a query and a restriction configuration, and returns whether the query is allowed to run or not. Additionally, it can modify the query to ensure it complies with the restrictions. sql-data-guard has also a built-in module for detection of malicious payloads, allowing it to report on and remove malicious expressions before query execution.sql-data-guard is particularly useful when constructing SQL queries with LLMs, as such queries can’t run as prepared statements. Prepared statements secure a query’s structure, but LLM-generated queries are dynamic and lack this fixed form, increasing SQL injection risk. sql-data-guard mitigates this by inspecting and validating the query’s content.By verifying and modifying queries before they are executed, sql-data-guard helps prevent unauthorized data access and accidental data exposure. Adding sql-data-guard to your application can prevent or minimize data breaches and the impact of SQL injection attacks, ensuring that only permitted data is accessed.Connecting LLMs to SQL databases without strict controls can risk accidental data exposure, as models may generate SQL queries that access sensitive information. OWASP highlights cases of poor sandboxing leading to unauthorized disclosures, emphasizing the need for clear access controls and prompt validation. Businesses should adopt rigorous access restrictions, regular audits, and robust API security, especially to comply with privacy laws and regulations like GDPR and CCPA, which penalize unauthorized data exposure.

DEV Community

Black Hat Europe 2025 Arsenal: 8 AI Security Tools Transforming Cybersecurity

Top comments (0)