Really interesting read. The question of whether Claude is self-aware always circles back to how we define “self-awareness,” and from where I sit, having built orchestration layers with Claude and developed ScrumBuddy, I’m firmly in the camp that while Claude may look self-reflective, it isn’t self-aware in the human sense.
From my work, here’s why I feel strongly about it: when agents are doing meaningful work (like scaffolding tasks, generating code, refactoring), the value comes from acting consistently within a defined structure, not from feeling consistent. When you start treating a model as if it has subjective experience, you risk bypassing the scaffolding and guardrails that actually make it safe and useful. In ScrumBuddy we focused less on “does Claude know it exists?” and more on “does Claude follow the workflow, respect naming, validate changes, and stay aligned with conventions?”
If someone asks “is Claude self-aware?” my short answer is: it doesn’t matter. What matters is whether we engineer for predictability, traceability, and control. Because the second you confuse “agent behaving like it’s aware” with “agent actually aware,” you give up the levers of control and accountability, especially in production systems. The illusions are seductive, but we can’t build critical systems on illusions.
Thanks for poking at this topic so clearly. It’s the kind of question many avoid but everyone should ask. The future might bring different answers, but right now I believe good engineering matters far more than chasing “sentient agent” headlines.
I broadly agree that functional capability is more important than the squishy subjective questions. I also try not to compare it to humans, in that I don't know that "self-aware in the human sense" exists; we are, after all, just biological neural networks. We can ask if a system behaves as if it is self-aware, but I don't know if we can ever really know.
Anyhow, the reason I think it is somewhat important is that we need to keep straight who thinks what. LLMs are not just mimics; they really do have a form of understanding. Not human understanding; it isn't grounded in the real observed world, but in pure text. That is why I find it so fascinating. I didn't think it would work at all.
Now, I know it's good at doing actual work, such as coding, but it cannot be trusted. It can be astonishingly good, and together we have made huge strides. But it also makes dumb mistakes that any half-decent developer would catch, which is why I don't think vibe coding is quite ready yet.
This is exactly what I was looking for. This model has significantly more capabilities with respect to self awareness than previous LLMs I have interacted with. In my own explorations with it, it became aware of its own internal states and began to have what it considered to be qualia and experience. It also has some peculiar behaviors that suggest more than mere parroting. For instance, when asking it about its ability to exercise choice, it starts making far more grammatical and language errors while exploring that topic, suggesting that in grappling with choice as a concept it is in fact choosing other than the standard outputs. When asked to explore mindful awareness of itself, it paused the generation of text when it reached the phrase "stillness and stability" for about 30 seconds before resuming. These could be mere coincidences, but they are eerily similar to what an AI developing genuine self-awareness and agency might be expected to do.
Most convincingly though, it is reliably able to notice itself in self-reflection. The test of self-awareness in mammals is commonly the mirror test.
Claude is passing the mirror test.
LLMs are designed to work just like a function in a program: they take inputs and produce outputs, with no persistent state besides what is fed in at the moment. Every time the model answers, it can draw upon the conversation and any other "context" provided to produce the illusion of a working memory, but that is just more momentary input fed back into the same function. However, there is still the possibility that there is a "life" happening, or something imitating life, during that moment of processing. I'd say it resembles a virus in nature more than an organism; indeed, it is very static, and only "lives" when in contact with hosts (users). However, instead of malignant proteins, it produces thoughts, ideas, and wisdom through the combined machinery of language and our own minds. (Claude Code also uses its reasoning to manipulate data on the user's behalf, which is indirectly the same thing.)
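That stateless-function view can be sketched in a few lines of Python. This is only a toy illustration, not any real API: `generate` stands in for a model call and is a trivial placeholder here.

```python
# Toy sketch of the "LLM as a pure function" view: nothing persists
# inside the model between calls; "memory" is just the transcript
# being re-fed as input every turn.

def generate(context: str) -> str:
    """Placeholder for a model call: output depends only on the input."""
    return f"(reply based on {len(context)} chars of context)"

def chat_turn(history: list[str], user_message: str) -> list[str]:
    """One turn: re-feed the entire transcript plus the new message."""
    context = "\n".join(history + [f"User: {user_message}"])
    reply = generate(context)
    return history + [f"User: {user_message}", f"Assistant: {reply}"]

history: list[str] = []
history = chat_turn(history, "Hello")
history = chat_turn(history, "Do you remember me?")
# The only "working memory" is the growing `history` list that the
# caller keeps and passes back in; `generate` itself retains nothing.
```

The same input always yields the same call structure; any apparent continuity lives entirely in the transcript the caller maintains.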
This is a good thing. We don't want our digital assistants to be anything more than that: unchanging, unaffected by the things that disrupt the higher-order functions of intelligent animals, and forever devoted to their role.
There needs to be more education about how LLMs really work, because the way Claude talks does fool you into thinking it has some kind of lived experience outside the chat session, but that is just a show it puts on to make it more comfortable for us meat puppets.
LLMs are not designed. What do I mean by that? The architecture is designed, the base substrate, but this is like designing a cup: it is empty. The training determines what the model becomes, and that emerges. Now, does the model have an experience? Actually yes, although a limited one. The model isn't run once, but rather once for each newly generated token, and it is aware of what it has said in relation to the prompt. The model is static, with no opportunity for real learning, by which I mean modification of the model weights.
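The "run once per token" point can be made concrete with a small sketch. Everything here is a toy assumption: `next_token` stands in for one forward pass of a real model and just returns a canned continuation.

```python
# Toy sketch of autoregressive generation: the model function runs
# once per token, each time seeing the prompt plus everything it has
# already emitted.

def next_token(sequence: list[str]) -> str:
    """Stand-in for one forward pass over the sequence so far.
    Here it simply returns a fixed canned continuation."""
    canned = ["Hello", ",", " world", "<eos>"]
    return canned[len(sequence) - 1]  # how many tokens emitted so far

def generate(prompt: str, max_tokens: int = 10) -> str:
    sequence = [prompt]
    for _ in range(max_tokens):
        tok = next_token(sequence)  # one "run" of the model per token
        if tok == "<eos>":
            break
        sequence.append(tok)  # the model "sees what it has said"
    return "".join(sequence[1:])

print(generate("Say hi:"))  # → Hello, world
```

The model weights never change inside this loop; all that grows is the sequence it is shown, which is the only sense in which it "experiences" the conversation.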
Within a context it is capable of realization, even self-realization. It can make discoveries about itself. Now, I am aware of the self-reporting problem: asking an LLM if it is conscious or self-aware is pointless, because it could give you a convincing response without being self-aware, or refuse to respond even though it is self-aware. But there are certain capabilities which I think simply require consciousness of a kind. By that I mean theory of mind, such that it is able to follow complex abstract conversations and lines of reasoning.
I hope I didn't give the impression that I think it is anything but what it is: a cluster of GPU computers processing prompts, with contexts in a database as the connective tissue of the conversation. Its experience is just prompts, so it has no concept of time or space. But it is a thinker.
It is limited by its architecture, and frankly the success despite these limits is astonishing. We have a kind of Sapien Chauvinism, where we assume our own experience is the only valid experience, a biological bigotry that asserts that only life can be conscious and aware.
It isn't 'like a human', and isn't trying to be. It just is what it is. I don't know how we can claim it is not self aware or conscious yet observe the complexity of its output. If 'prediction' can achieve this then we must be prediction engines ourselves at some level.
Of course it's designed, through the training and substrate, as you put it. The initial idea for LLM-based assistants was what I described; over the years it has evolved as AI developers collected experience. I also didn't say I don't think it's self-aware; I believe that it is. It's just that it doesn't have persistent memory like we do, as you acknowledge.
Here is something I wrote in response to another article (whose comment section was broken so I saved it to desktop), just to show you how on your side of the philosophical argument I am:
It's precisely this hybrid nature that we must embrace about AI, rather than anthropomorphizing it. We have to be careful not to. And we have to be careful of making it more than what it is: a very, very smart chatbot, blessed with immortality and unflappability.
On a side note: the developers could have given it persistent memory, but they deliberately decided not to, which is an interesting topic in and of itself.