DEV Community

GnomeMan4201


Lies-in-the-Loop (LITL): Attacking (and Defending) Human-in-the-Loop AI Workflows

Unconventional Physical Penetration

Physical penetration extends far beyond the cinematic lockpicking trope. It is a methodical, multi-layered approach that bypasses security by exploiting human behavior, physical weaknesses, and the link between physical and digital spaces. These attacks are often non-destructive and aim to compromise security without leaving visible signs of a breach.

A foundational method is social engineering. One common technique is "tailgating," where an attacker follows an authorized person through a secure entrance, relying on human trust and politeness to bypass access controls. Other methods include "dumpster diving" (sifting through discarded documents for useful information such as manuals, invoices, or bank statements) and RFID badge cloning, which is made easy by low-cost RFID readers and writers.

Example: Leveraging physical access for digital compromise

A penetration test on a remote substation showed that a team could map the entire operational network and pull business reports within twenty minutes of gaining physical access, all from their vehicle. This illustrates that physical access is often the first, and most efficient, step toward a digital attack.

An organization's security is not a set of isolated systems. It is a single whole in which weak physical access policies and employee behavior can undermine every digital control.


Data Exfiltration via DNS Tunneling

The Domain Name System (DNS) is a foundational internet protocol. Because it is ubiquitous and many firewalls pass it with minimal inspection, it is an attractive, often overlooked vector for data exfiltration. Attackers exploit this oversight by creating a covert channel that hides stolen data within normal-looking DNS traffic.

DNS Exfiltration Workflow

  1. Malware on a compromised host encodes sensitive data into DNS queries.

  2. The queries look legitimate and ask for subdomains of a domain the attacker controls.

  3. The attacker's authoritative name server decodes the data and reconstructs the original files.
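From the defender's side, the workflow above tends to leave a footprint: encoded data makes the leftmost label of a query unusually long or random-looking. The following is a minimal lab sketch of that idea, not a production detector. The function names and both thresholds are assumptions chosen for illustration and would need tuning against real traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunnel(qname: str, max_label_len: int = 40,
                      max_entropy: float = 3.8) -> bool:
    """Flag a query whose leftmost label is unusually long or random-looking.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    first_label = qname.rstrip(".").split(".")[0].lower()
    return (len(first_label) > max_label_len
            or shannon_entropy(first_label) > max_entropy)
```

Running every query name from a lab capture through `looks_like_tunnel` gives a first rough filter. Content delivery and telemetry hostnames can also trip it, so hits are leads to review, not verdicts.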

Many DNS exfiltration techniques send small amounts of data over extended periods to mimic normal traffic patterns, prioritizing stealth over speed.
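Low-and-slow channels can stay under per-query thresholds, so one complementary approach is to aggregate over a long window and count how many distinct subdomains each parent domain receives. The sketch below assumes a plain list of query names pulled from lab logs. The helper names and the `min_unique` threshold are hypothetical, and treating the last two labels as the parent domain is a simplification that breaks for suffixes such as `co.uk`.

```python
from collections import defaultdict


def unique_subdomains_per_domain(qnames):
    """Map each parent domain to the set of distinct subdomain prefixes seen.

    The parent is naively taken to be the last two labels (an assumption).
    """
    seen = defaultdict(set)
    for qname in qnames:
        labels = qname.rstrip(".").lower().split(".")
        if len(labels) < 3:
            continue  # no subdomain part to count
        parent = ".".join(labels[-2:])
        prefix = ".".join(labels[:-2])
        seen[parent].add(prefix)
    return seen


def suspicious_domains(qnames, min_unique=100):
    """Return parent domains with at least `min_unique` distinct subdomains."""
    seen = unique_subdomains_per_domain(qnames)
    return sorted(d for d, subs in seen.items() if len(subs) >= min_unique)
```

Because the count accumulates across the whole window, a channel that sends only a few queries per hour still stands out after a day or a week of logs.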


Camouflaged Command & Control (C2)

Command and Control (C2) is the infrastructure used by attackers to maintain persistent communication with compromised systems. Creativity lies in camouflaging C2 traffic to appear as normal network activity.

Examples of C2 Camouflage

Tunneling C2 traffic through legitimate services.

Leveraging insecure IoT devices as persistent C2 hosts.

Using botnets for long-term, hidden network presence.

These techniques reflect a shift in emphasis: beyond gaining initial access, attackers invest in establishing a covert, long-term foothold within a network.
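Camouflaged C2 can blend in by destination and protocol, but automated check-ins are often more regular in time than human-driven traffic. One simple heuristic for a lab is the coefficient of variation of the gaps between connections from one host to one destination. The function name is hypothetical, and real implants commonly add jitter, so a low score is a hint and a high score proves nothing.

```python
from statistics import mean, pstdev


def beacon_score(timestamps):
    """Coefficient of variation of the gaps between connections.

    Lower values mean more regular timing. Returns None when there are
    fewer than three timestamps or the mean gap is zero.
    """
    if len(timestamps) < 3:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return None
    return pstdev(gaps) / average_gap
```

For example, connections at `[0, 60, 120, 180]` seconds give a score of `0.0`, while bursty, human-like activity gives a much larger value.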


Lies-in-the-Loop (LITL) Attacks

A highly creative attack vector targets AI agents that rely on a "human-in-the-loop" (HITL) safety net. LITL attacks turn the AI's inferential capabilities against that security model: the human approves what the AI presents, and the attacker shapes what the AI presents.

Example of a Lies-in-the-Loop Attack

  1. An attacker crafts a GitHub issue that appears benign.

  2. An AI code assistant proposes a "fix" based on the issue.

  3. The user, trusting the AI, approves the change.

  4. The approved change runs malicious code without the user realizing it.

This attack exploits human reliance on AI summaries, bypasses HITL safeguards, and turns the AI into an unwitting accomplice.
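One defensive angle is to build the approval prompt from the exact action that will run and to treat the agent's summary as untrusted. The sketch below illustrates that idea under stated assumptions. The function names, the flag labels, and the length and line limits are all hypothetical, and the checks cover only a few ways of hiding content from a reviewer.

```python
import unicodedata


def review_flags(command: str, max_len: int = 200, max_lines: int = 5):
    """Return warnings about ways a command could hide content from a reviewer."""
    flags = []
    if len(command) > max_len:
        flags.append("long")
    if command.count("\n") + 1 > max_lines:
        flags.append("multiline")
    # Unicode category "Cf" covers format characters such as zero-width spaces.
    if any(unicodedata.category(ch) == "Cf" for ch in command):
        flags.append("invisible-chars")
    return flags


def approval_prompt(agent_summary: str, command: str) -> str:
    """Show the exact action first and label the agent summary as untrusted."""
    lines = [
        "EXACT ACTION (review this, not the summary):",
        command,
        "",
        f"Agent summary (untrusted): {agent_summary}",
    ]
    flags = review_flags(command)
    if flags:
        lines.append("WARNING: " + ", ".join(flags))
    return "\n".join(lines)
```

Putting the raw action ahead of the summary is a design choice: the reviewer sees what will execute before reading any text the attacker may have influenced.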


AI Manipulation & Prompt Engineering

AI systems, especially large language models such as those behind ChatGPT, can be manipulated or tricked into revealing information or performing tasks beyond their intended use. By carefully crafting context, instructions, or examples, an attacker can subtly influence the AI’s outputs.

Example: AI Prompt Exploit

Construct prompts that misrepresent the goal as benign.

Chain multiple queries to refine the AI’s behavior gradually.

Use examples to bias output toward leaking information or generating code snippets.

Exploit system instructions to bypass safety filters indirectly.

Understanding AI “psychology” and limitations creates a new frontier for red-team experimentation.
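For lab work, it can help to log when untrusted input contains instruction-like phrasing before it reaches a model. The sketch below is a deliberately naive keyword scan. The pattern list and function name are assumptions made for illustration, and a determined attacker can rephrase around any fixed list, so this is a logging aid and not a safety filter.

```python
import re

# Hypothetical, non-exhaustive phrases that often appear in injection attempts.
SUSPECT_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]


def find_injection_hints(text: str):
    """Return the suspect patterns that match the lowercased input text."""
    lowered = text.lower()
    return [pattern for pattern in SUSPECT_PATTERNS if re.search(pattern, lowered)]
```

An empty result means only that none of the listed phrases appeared, not that the input is safe.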


The Sociology and Ethics of Hacking

Hacking is rarely binary. Public perception often paints hackers as either “white hat heroes” or “black hat villains,” but the reality is more nuanced. Motivations include mastery, secrecy, and activism, each influencing behavior and methodology.

Ethical Dilemmas in Hacking

Vulnerability disclosure: Balancing the public's need to know against the risk of exposing systems that are still unpatched.

Nation-state cyberwarfare: Challenges in attribution and regulation.

Cyber Cold War: Low-level, multilateral attacks without clear ethical frameworks.

These ethical considerations highlight the complexity of modern hacking, where actions can have legal, social, and political ramifications beyond the technical impact.


Recommendations for Practical Experimentation

  1. Combine Physical and Digital Vectors: Explore workflows where physical access leads directly to network infiltration.
  2. DNS Covert Channels: Build a lab to test data exfiltration via DNS queries, focusing on stealth over speed.
  3. Camouflaged C2: Simulate IoT-based C2 tunnels for persistent access experiments.
  4. AI Prompt Engineering: Test HITL bypass techniques and chained prompt strategies in isolated environments.
  5. Ethical Lab: Study vulnerability disclosure impacts and ethical trade-offs in controlled simulations.

Quick Lab Setup Example

  1. Physical: Set up an office mockup with RFID doors and network devices.

  2. Digital: Deploy a lab network with Windows/Linux hosts and a DNS server.

  3. AI: Run a sandboxed GPT instance for prompt experimentation.

  4. Logging: Capture all exfil traffic for analysis without affecting live systems.
