DEV Community

Fagner Brack
Fagner Brack

Posted on • Originally published at fagnerbrack.com on

Detection of AI-Generated Text and the P vs NP Problem

The detection of AI-generated text has become a hot topic nowadays. However, the relationship between this challenge and the P vs NP problem, one of the most enigmatic questions in computer science, sheds light on the true complexity of the task. Dive into this captivating exploration to learn why AI-generated text detection may be an unattainable goal unless you've found the solution for the P vs NP problem.

The P vs NP Problem: A Brief Overview

The P vs NP problem is a fundamental question in computer science that has puzzled mathematicians and programmers for decades. It revolves around the relationship between two classes of problems: those that are solvable in polynomial time (P) and those whose solutions can only be verified in polynomial time but are much harder to solve (NP).

P problems are considered “easy” to solve because they can be tackled efficiently by algorithms. Examples include sorting a list of numbers, finding the shortest path between two points in a graph, or determining if a number is prime.

NP problems, on the other hand, are those for which solutions can be verified quickly, but finding a solution may be time-consuming. Examples include the travelling salesperson problem, where the goal is to find the shortest route that visits a series of cities and returns to the starting point, or the subset sum problem, which involves finding a subset of numbers that sum to a given target value.

The P vs NP question asks two questions:

Is every problem in NP also in P (P = NP)?

In other words, does every problem whose solution can be quickly verified also has an efficient algorithm to find the solution? If P equals NP, it would mean that even the most complex problems have efficient algorithms to solve them, which would have profound implications for cryptography, optimization, and many other fields.

Are some problems in NP but not in P (P ≠ NP)?

In other words, do some problems whose solution cannot be quickly verified NOT have an efficient algorithm to find the solution? If P does not equal NP, then there are problems that are inherently more difficult to solve than to check.

Today, all of our cryptography infrastructure is built under the assumption that P ≠ NP, so anything that proves them wrong is worth global news for a long, long time. More than when we came up with the Theory of Relativity.

The Clay Mathematics Institute has designated the P vs NP problem as one of its seven Millennium Prize Problems, offering a $1 million prize to anyone who can provide a correct solution. Despite decades of research, no one has been able to prove whether P equals NP or not, and the question remains one of the most significant open problems in computer science, with a million-dollar prize for whoever solves it.

The Connection between P vs NP and AI Text Detection

One-way hashes are mathematical functions that take an input (or “message”) and return a fixed-size string of bytes, usually as a hash value. The output appears random and is uniquely associated with the input. Under the assumption of P ≠ NP, it’s impossible to reverse-engineer the input from the output and even small changes to the input result in vastly different hash values. One-way hashes are widely used in cryptography, password storage, and data integrity verification.

hash = fn(message)

Now, let’s connect one-way hashes, the P vs NP problem, and AI-generated text detection. AI-generated content is produced by complex algorithms, such as large language models (LLMs), which are designed to create human-like text. The process of generating text from these models is a black box and can be thought of as a one-way function, similar to a hash function. Under the P ≠ NP assumption, once the text is generated, it’s incredibly difficult to determine the exact model, its parameters, or the prompt used to create the text.

content = fn(prompt)

Detecting AI-generated text is an NP problem because, although it might be possible to verify whether a given text is AI-generated, finding an efficient algorithm to consistently and accurately identify AI-generated content remains elusive. If P were equal to NP, an efficient algorithm would exist, enabling us to accurately detect AI-generated text without false positives.

However, since the P vs NP question remains unsolved, we don’t know if such an algorithm exists, and you have to be very sceptical of anyone who claims they found one and didn't win the million-dollar prize yet. They might have found an algorithm without touching the P/NP problem, but that discovery will be as groundbreaking. It's one thing to break a hash function that you can inspect and test reproducibly; another completely different is to reverse-engineer an LLM which is changing all the time and may not yield the same response for the same input prompt. You can use the same seed, but updates to the model will change the behaviour regardless.

Current AI Generation Tools and Their Limitations

In practice, detecting AI-generated text is challenging because these algorithms produce increasingly human-like content, making it hard to differentiate from genuine human writing. Additionally, AI-generated text can be modified with simple edits, further evading detection. For example, give me any AI-generated content, and I can bypass any detector with less than 5 edits for every 1000 words.

For example: According to the AI content detector, “ZeroGPT,” 92.26% of the United States Constitution was allegedly written by AI.

As a result, without a breakthrough in solving the P vs NP problem (such as proving P = NP) or a breakthrough in an AI detection algorithm (which is almost impossible because AI is a dynamic everchanging black box), it’s unlikely that we’ll develop a foolproof method to detect AI-generated content without false positives. It will be a game of catchup where every model update will change its default template and trigger a change to every AI Detection algorithm.

The Unlikely Solution: P = NP

In the context of AI-generated text detection, proving that P equals NP could have significant implications. If P were equal to NP, it would mean that an efficient algorithm exists for all problems whose solutions can be verified quickly, including the detection of AI-generated text. This would allow for the development of a foolproof method to identify machine-generated content without false positives or false negatives, ultimately enhancing the accuracy and reliability of AI text detection.

However, proving P = NP does not automatically provides the answer. It's possible to prove P = NP or P ≠ NP without actually providing an example.

Also, the likelihood of proving P equals NP remains uncertain, as the question has persisted as one of the most challenging and enigmatic problems in computer science. If someone were to successfully demonstrate that P equals NP, they would not only revolutionize the field but also receive substantial rewards and recognition.

Additionally, solving the problem would likely lead to international acclaim, a lasting impact on various industries, and potential advancements in cryptography, optimization, and other fields that rely on complex problem-solving algorithms. That person (or team) would definitely win a Turing Award.

An impossible task?

The detection of AI-generated text remains an elusive goal, closely tied to the unsolved P vs NP problem. Until this enigma is unravelled, confidently identifying AI-generated content remains a distant dream. In the meantime, be very sceptical of those who promise to be "AI detectors", and how they managed to do so without solving a problem that nobody else was able to.

Do you know anyone who claims who have developed an AI-generated text? Hit me up, and I will put them to the challenge.

Thanks for reading. If you have feedback, contact me on Twitter, LinkedIn or Github.

Related Material

Top comments (0)