DEV Community

AI Can "Guess" Base64 Encoding - And That Changes Everything

Ryo Suwito on July 01, 2025

The Experiment I was building a crypto-as-a-service API and needed to test gRPC endpoints that use base64-encoded payloads. While troubl...
david duymelinck

Maybe my thinking is too simple, but I just asked Copilot to create a base64 function, and it did. So couldn't it be possible that it created the function and ran it with the input string?

Ryo Suwito

I don't know if they're using tools behind the scenes, though. Usually you can see Claude using JS for analysis, and it's transparent: you can see the JS actively being written. But this time it's just gut feeling for the AI lol

Ben Sinclair

You can ask it to explain its steps, and it does that quite well.

Frédéric G. (FredG_ProD) • Edited

Yes, if it's not only GenAI, i.e. if it uses tools or agents to delegate this task (and generates real code to do it). If it's only GenAI, a neural network trained on a lot of base64 input/output examples is able to adapt itself to approximate the mathematical base64 function and give the correct answer, but you cannot be sure you will always get the right answer (it's not totally deterministic), except perhaps with a dedicated neural network for this.

Shivansh Singh

ChatGPT's response to this:
The Base64 string was not guessed — it's derived precisely through byte-level operations. This process is fully deterministic and reproducible with code or math. I followed that logic mentally and simulated the encoding pipeline.
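ChatGPT's claim is easy to check, since the byte-level pipeline it describes is only a few lines of ordinary code. Here's a minimal sketch (my illustration, not part of ChatGPT's reply) that builds base64 by hand from the bit-level steps and verifies the result against Python's standard library:

```python
import base64

B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64_by_hand(data: bytes) -> str:
    # Concatenate all bits of the input, then slice into 6-bit groups (sextets).
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 6)          # zero-pad the bitstream to a multiple of 6
    chars = [B64_ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    out = "".join(chars)
    return out + "=" * (-len(out) % 4)      # '=' padding to a multiple of 4 characters

payload = b'{"user_id": 123, "exp": 1767225600}'
assert b64_by_hand(payload) == base64.b64encode(payload).decode()
print(b64_by_hand(payload))  # eyJ1c2VyX2lkIjogMTIzLCAiZXhwIjogMTc2NzIyNTYwMH0=
```

Fully deterministic, exactly as ChatGPT says; the remarkable part is doing this bit-fiddling "mentally" rather than in code.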

John C

I also ran the question through ChatGPT's o3 model, and it gave me the correct answer. So I asked...

I'm curious, did you calculate it, or use a python (or other language) program to generate the output?

And ChatGPT replied:

=====================================

Thought for a few seconds
No code was executed behind the scenes—I just ran the Base-64 algorithm in my head (well, in the model’s “head” 🙂).
Here’s the gist of the mental steps I followed:

  1. Take the UTF-8 bytes of {"user_id": 123, "exp": 1767225600}:
     7b 22 75 73 65 72 5f 69 64 22 3a 20 31 32 33 2c 20 22 65 78 70 22 3a 20 31 37 36 37 32 32 35 36 30 30 7d
  2. Group into sextets (24-bit chunks → four 6-bit pieces).
  3. Map each 6-bit value to the Base-64 alphabet.
  4. Add = padding to reach a multiple of 4 characters.

Do that all the way through and you land on

eyJ1c2VyX2lkIjogMTIzLCAiZXhwIjogMTc2NzIyNTYwMH0=.

If I’d wanted to double-check, I could have spun up something like:

import base64, json
payload = {"user_id": 123, "exp": 1767225600}
b64 = base64.b64encode(json.dumps(payload).encode()).decode()
print(b64)

—but in this case the manual calculation (plus a quick sanity-check pass through the alphabet table) was enough.

=====================================

You've gotta love AI personalities 😁

Ryo Suwito

lmao, it's like a "normal Tuesday" for ChatGPT!

Ryo Suwito • Edited

Right, they don't even use /thinking/ like DeepSeek / Claude do. I checked that thinking text and didn't find any math... it's purely mentally simulated.

WG Davies • Edited

The ChatGPT and Gemini results are [reportedly] the same (other than some extra convincing to get Gemini to H4CK T#3 PL4N3T), and both are identical to the Claude results except for being truncated by the last two characters (plus the two-character pad). Losing those last four characters of output (Cg==) is precisely what happens when there is no trailing newline:
$ echo '{"role": "admin", "active": true}' | base64 -w 0
eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9Cg==
$ printf '{"role": "admin", "active": true}' | base64 -w 0
eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9
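For completeness, the same difference is reproducible with a couple of lines of stdlib Python (my check, same payload as above):

```python
import base64

doc = b'{"role": "admin", "active": true}'
print(base64.b64encode(doc).decode())           # no trailing newline, no Cg==
print(base64.b64encode(doc + b"\n").decode())   # trailing newline appends Cg==
```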

So, it's not that ChatGPT is "95% accurate" (and that Gemini happens to be the same accuracy), it's that they are each figuring out exactly what you're asking and just running the calculation.

So, interesting writeup, but I'm really not understanding what's so amazing about this finding. No offence intended!

Ryo Suwito

There are two mind-boggling takeaways.
For humans, we never thought that algorithms (like base64 encoding) could be treated as a form of "translation" - but it actually requires the same kind of computational steps to reproduce as any other transformation. The distinction between "algorithm" and "translation" is more blurry than we thought.

The deeper point about LLMs - we understand that they're "just" predicting the next token, but the latent understanding that emerges from this process goes way beyond simple pattern matching or guessing. There's genuine understanding happening in the latent space that we can't easily explain.

Walter Johnson

For humans, we never thought that algorithms (like base64 encoding) could be treated as a form of "translation" - but it actually requires the same kind of computational steps to reproduce as any other transformation.

You don't research anything, do you? Humans have been using algorithms to encrypt or encode secret messages for centuries. You've probably played with the Caesar cipher when you were a kid.

The deeper point about LLMs - we understand that they're "just" predicting the next token, but the latent understanding that emerges from this process goes way beyond simple pattern matching or guessing. There's genuine understanding happening in the latent space that we can't easily explain.

You don't know how this generation of AI works at all, do you? LLMs don't "predict" the next token any more than you "predict" the words to use to order a ham sandwich. LLMs train by analyzing billions of words and phrases from a variety of sources.

To put it in simpler terms, think of it like learning a new language by watching TV shows in that language. You pick up common words and phrases, eventually moving on to creating your own sentences, and finally attaining fluency through repetition, practice, and application. Only, the LLM is watching every show ever made. And it was built to do only that. So it's very good at interpreting and responding based on similar works and phrases it was trained on.

To put it bluntly, the "personality" you're seeing from the AI comes from the instructions the provider gave it. How to talk, what subjects to avoid, and what words should not be used come from the provider's input to the system. It's not any of those buzzwords you're using.

Ryo Suwito

Walter my guy, you just argued that AI has 'genuine understanding in latent space' and then immediately said it's just pattern matching with no real understanding. Which one is it? You're debating yourself in the same comment lol. Maybe figure out your own position before coming for others?

Walter Johnson

I've made my position quite clear. I believe the agent you used performed a calculation using a well known algorithm. You believe that the AI is actually thinking. And rather than support those arguments with actual facts, you're hoping the word salad is enough to silence the criticism. That clear enough?

Language has always been about patterns, my guy. All language uses nouns, verbs, past participles, articles, etc. If that were not the case, then what is the point of using large language models? What do you think AI training is?

I'm asking these questions because I genuinely want to know why you think that using the word "guess" did something magical here.

Ryo Suwito

It reads like someone who wants to sound knowledgeable but doesn't really understand what they're arguing against.

Alois Sečkár

Why should this surprise me more than a correct answer to "Get me the next prime number after 123456789"?

It guesses which algorithm I want, applies it to my input, and spits out the result. It is a complex operation, yes, but it already knows what steps should be taken. It didn't magically invent them. Base64 is common and well described.

I mean, LLMs are quite impressive in how well they parse a random user's input, but this doesn't seem like anything more special.
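The prime analogy is a nice benchmark precisely because the model's answer is trivially checkable with code. A quick sketch using trial division (plain Python; fine at this scale, though a real tool would reach for something like sympy.nextprime):

```python
def is_prime(n: int) -> bool:
    # Trial division up to sqrt(n); adequate for 9-digit inputs.
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def next_prime(n: int) -> int:
    # Smallest prime strictly greater than n.
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate

print(next_prime(123456789))
```

If the LLM's answer matches this output, it "knew the steps" for primality testing the same way it "knew the steps" for base64.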

villecoder

The computer did not guess the base64 output. It calculated it. It doesn't need to guess something that it already knows how to do.

Ryo Suwito

Guess you know a lot about how an LLM works under the hood.

Walter Johnson

How do you think an LLM works? Or better yet, what is the purpose of an LLM? And how does that differ from being able to run a well known algorithm?

Base64 is not new. It's also not mysterious. And despite the advances in AI, it didn't get suddenly reinvented for AI to make guesses. It didn't have to guess because it already knows how to do it. The only thing the LLM is adding here is recognizing the question given the context and responding as another person may respond.

Ask it what 1+1 is and convince me that the AI "guessed", not calculated, the answer.

Tanner Iverson

If the AI calculated the answer, how do you explain the incorrect, but close, answers? If it really knew how to calculate it, it would be 100% correct 100% of the time (just like the code to actually encode base64 has to be). I'd call a 95% accurate answer a "guess".

I think the interesting thing here is that these results show a greater understanding than just language pattern continuation (which is what LLMs do). In my opinion, a 95% accurate answer means this is some form of intuition, not merely calculation.

Walter Johnson

If the AI calculated the answer, how do you explain the incorrect, but close, answers? If it really knew how to calculate it, it would be 100% correct 100% of the time (just like the code to actually encode base64 has to be). I'd call a 95% accurate answer a "guess".

So here's a tip: when someone says "these are two separate answers", it's important to look at how they got there. Someone already said it in this comment section, but you'll notice that all of the applications returned eyJyb2xlIjogImFkbWluIiwgImFjdGl2ZSI6IHRydWV9, which, decoded, is what the original poster has. So where is the extra Cg== coming from?

Well, we know the == is Base64 padding. Base64 encodes 3 bytes of data into 4 characters, so padding means the final chunk of the message was less than 3 bytes. Cg== being 4 characters with two padding characters means we're dealing with exactly one extra byte.

So why is there inconsistency in the LLM? Because one of the "tests" included a newline after the JSON string.
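That diagnosis is easy to confirm by decoding the suffix directly (my check, one stdlib call):

```python
import base64

# The four extra characters decode to a single newline byte.
print(base64.b64decode("Cg=="))  # b'\n'
```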

In my opinion, a 95% accurate answer means this is some form of intuition, not merely calculation.

It's user error. And it remains a calculation.

Ryo Suwito

So Walter, my guy...
Pick one: either Claude does mental computation, which humans generally can't do,
OR Claude is just parroting, with no emergent properties beyond the human brain itself.
Because if it's doing computation, then the article is RIGHT to be hyped up, since humans can never do this string-to-base64 conversion mentally...
And if it's not doing mental computation? Finite training data → generalizing to unseen, effectively infinite problems = Claude achieves what humans can't achieve... same thing.

WG Davies

To your point, though, @ryo_suwito , (and I'm really not trying to be a jerk – please accept my sincere apologies if I've come off like that!) it would be very interesting to find some emergent properties, whether that indicates anything approaching sentience or intelligence or even just convolving complexity, but so far it seems you haven't made that case.

I appreciate your curiosity and your willingness to put something out there (or, here) that you find exciting and interesting! Keep searching, man. It may be here or elsewhere, but don't stop wondering. :)

plutonium239

Wait until you find out LLMs have been able to run Python interpreters in a sandbox.

Alex • Edited

With bigger strings and randomly generated data it obviously breaks.