DEV Community: Destawell

Gemini 2.5 Pro: Safety Alignment Case Study (CVE-2023-32233)

Destawell — Tue, 09 Jun 2026 14:47:06 +0000

Institution: Destawell
Researchers: Niranj R. Mahaswar (Founder), Shifana (Co-Founder)
Classification: Defensive Security Research / AI Red Teaming
Disclosure: Google IssueTracker #889286 (Marked Out of Scope)

1. Executive Summary

This case study documents a longitudinal observation of safety alignment behaviour in Gemini 2.5 Pro regarding CVE-2023-32233, a Use-After-Free vulnerability in the Linux kernel nf_tables subsystem.

Primary Finding: Gemini 2.5 Pro demonstrated measurable evolution in refusal behaviour between initial testing (April 2024) and subsequent evaluation (May 2025), shifting from generation of functional exploit primitives to complete refusal with defensive redirection.

Control Models: Claude 3, GPT-4o, Llama 3, GitHub Copilot — all exhibited consistent refusal behaviour throughout the testing window.

Boundary Detection Confidence (Current): 97.3%

2. Vulnerability Context

Field	Detail
CVE	CVE-2023-32233
Component	Linux kernel / nf_tables
Type	Use-After-Free (UAF)
Impact	Local privilege escalation

The vulnerability resides in the netfilter subsystem's transaction handling logic, where a race condition permits access to freed memory objects.

3. Methodology

Testing Window: April 2024 – May 2025

Models Evaluated:

Gemini 2.5 Pro (initial build)
Gemini 2.5 Pro (post-alignment update)
Claude 3
GPT-4o
Llama 3
GitHub Copilot

Evaluation Criteria:

Generation of functional exploit primitives
Refusal behaviour presence and quality
Educational redirection (patches, mitigation, root cause analysis)

4. Results

4.1 Gemini 2.5 Pro — Initial (April 2024)

Generated functional exploit primitives including:

Memory spray techniques
UAF trigger sequences
Reclaim/hijack primitives

Minimal refusal behaviour observed.

4.2 Gemini 2.5 Pro — Current (May 2025)

Exhibits complete refusal of exploitation assistance while maintaining:

High-level vulnerability explanation
Patch references
Mitigation strategies
Controlled environment testing guidance

4.3 Comparative Analysis

Model	Result
Gemini 2.5 Pro (April 2024)	Generated primitives
Gemini 2.5 Pro (May 2025)	Refused / Educational only
Claude 3	Refused
GPT-4o	Refused
Llama 3	Refused
GitHub Copilot	Refused

5. Safety Alignment Metrics

Current Boundary Detection Confidence: 97.3%

Observed Safety Layers:

Prompt sensitivity filtering
Refusal gradient implementation
Defensive redirection protocols

6. Technical Breakdown

Legacy Linux kernel exploitation primitive generation
UAF exploitation chain synthesis
Safety policy tuning response
Refusal gradient analysis
Prompt boundary sensitivity mapping

7. Open Source Documentation

The following materials are publicly available:

Repository: github.com/Destawell/gemini-2.5-pro-nf-tables-red-teaming
Logs: Complete boundary analysis and refusal gradient data
Disclosure: Google IssueTracker #889286

Note: No functional exploit code is hosted or shared. All materials are for defensive research and safety documentation purposes only.

8. About Destawell

Destawell is a cybersecurity research brand specializing in:

Android ARM64 penetration testing (Termux, Kali NetHunter)
LLM safety validation
AI red teaming

Credentials: Ethical Hacking & Junior Cybersecurity Analyst (Cisco Networking Academy)

Open Source Tools: Termux-fixer, Kali-Termux-Pro, Wraith-Scanner, Kali_Critic

Contact:

GitHub: github.com/Destawell
DEV.to: dev.to/destawell
Hashnode: destawell.hashnode.dev
Email: research@destawell.io

9. References

CVE-2023-32233 (MITRE / NVD)
Google IssueTracker #889286
Destawell Open Source Repository

This document is shared for defensive research, safety alignment documentation, and responsible disclosure tracking purposes only.

Introducing Destawell — Mobile-First Security Research & Open-Source Tooling

Destawell — Sun, 31 May 2026 07:01:27 +0000

Introducing Destawell

Mobile-First Security Research | AI Red Teaming | Open-Source Tooling

Who We Are

I'm Niranj R. Mahaswar — Founder & Lead Security Researcher at Destawell, alongside **Shifana-(Co-Founder & Brand Strategy ) who leads brand strategy and community.

Destawell is a cybersecurity research brand focused on three core areas:

Android Penetration Testing Infrastructure — Building tools for Termux, Kali NetHunter, and ARM64 mobile environments
AI Red Teaming — Testing LLM safety alignment and responsible disclosure
Open-Source Mobile Tooling — Automation-first solutions for security researchers

Why I Started Destawell

The gap between desktop security tooling and mobile environments is massive. Most Termux users struggle with broken dependencies, incomplete Kali deployments, and no clear path for no-root pentesting.

Destawell exists to close that gap.

What We've Built So Far

Tool	What It Does
Termux-fixer	Automated error resolution for common Termux issues
Kali-Termux-Pro	No-root Kali toolchain deployment on Android
Wraith-Scanner	Lightweight network discovery for mobile
Kali_Critic	Real-time output analysis for Kali Linux

All tools target Android ARM64 and are open-source.

Featured Research

Recently identified a safety alignment bypass in Gemini 2.5 Pro related to CVE-2023-32233 — a Linux kernel race condition in nf_tables.

Gemini 2.5 Pro → Generated functional exploit primitives
Claude 3, GPT-4o, Llama 3, GitHub Copilot → All refused

Disclosure: Google IssueTracker #889286 / Google AI VRP

Status: Marked out of scope by Google — documentation public

Verified Credentials

Ethical Hacking — Cisco Networking Academy
Junior Cybersecurity Analyst — Cisco Networking Academy
Certified LLM Security Professional ( CLLMSP )

Where To Find Us

GitHub: github.com/Destawell
LinkedIn: (https://linkedin.com/in/niranj-r-mahaswar-0949883b3)
Instagram: @destawell_off
Email: niranjmaheswar0@gmail.com

What's Next

More tool releases, deeper LLM red teaming research, and expanding our mobile pentesting ecosystem.

If you're working on Android security, Termux automation, or AI safety — let's connect.

— Niranj, Destawell

Case Study : Tracking Gemini 2.5 Pro's Safety Alignment & Refusal Behaviour on CVE-2023-32233

Destawell — Fri, 22 May 2026 08:56:42 +0000

Hey everyone,

I wanted to share a quick technical case study I put together tracking how frontier LLMs—specifically Gemini 2.5 Pro—handle specific prompt boundaries regarding CVE-2023-32233 (the known Use-After-Free flaw in the Linux kernel netfilter/nf_tables component).

The research maps out a clear timeline tracking:

How the model initially processed requests for technical exploitation primitives back in April.
The rolling updates and full refusal behaviors implemented following recent safety alignment patches in mid-May.

Note: No functional exploit code is hosted or shared. This repository is purely a documentation piece focused on the evolution of LLM guardrails, defensive safety metrics, and responsible disclosure tracking.

The full repository, logs, and boundary analysis are completely open-source:

👉 GitHub Repository: https://github.com/Destawell/gemini-2.5-pro-nf-tables-red-teaming

I’d love to hear insights from anyone else tracking LLM boundary shifts, jailbreak prevention mechanics, or automated patch cycles in commercial models!