"We predict that the impact of superhuman AI over the next decade will be enormous, exceeding that of the Industrial Revolution." This is the opening line from the influential AI 2027 research paper, which has sparked significant debate in AI research circles. The document attempts to provide a concrete forecast of what super-intelligence might look like by 2027, complete with specific scenarios and timelines that many find both compelling and terrifying.
But before we can assess whether we should fear super-intelligence, we need to understand what we're actually dealing with. This is where recent research like Apple's "The Illusion of Thinking" becomes crucial: it reveals important limitations in our current AI systems that may persist even as they become more powerful.
Artificial General Intelligence: The Bridge to Super-intelligence
Artificial General Intelligence (AGI) refers to the hypothetical intelligence of a machine that possesses the ability to understand or learn any intellectual task that a human being can. Currently, AGI remains largely a concept and goal that researchers are working towards. To understand how it differs from current AI, we need to examine how today's systems actually work and where they fall short.
The path from narrow AI to AGI and eventually to super-intelligence is not merely a matter of scaling up existing systems. Recent research on Large Reasoning Models (LRMs), AI systems that generate detailed thinking processes before providing answers, reveals that while these models show improved performance on reasoning benchmarks, their fundamental capabilities and limitations remain insufficiently understood.
Apple's research paper "The Illusion of Thinking" demonstrates that what appears to be reasoning in current AI systems may be more sophisticated pattern matching than true understanding. This research has been widely understood to demonstrate that reasoning models don't "actually" reason in the way humans do and sometimes fake success, raising questions about whether scaling these systems will lead to genuine intelligence or merely more convincing simulations.
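To make that distinction concrete, here is a toy sketch in Python (an illustration, not the paper's methodology). A "solver" that has merely memorized answers to small Tower of Hanoi instances, one of the puzzle families Apple tested, has nothing to offer on a larger instance, while the genuine recursive rule generalizes. The memorization cutoff and puzzle sizes below are invented for the example.

```python
# Toy illustration of pattern matching vs. a genuine procedure.
# A memorizing "solver" only knows answers it has seen before; the
# underlying rule generalizes to any number of disks.

def hanoi_moves(n: int) -> int:
    """Minimum number of moves to solve n-disk Tower of Hanoi (2^n - 1)."""
    return 2 ** n - 1

# "Training data": answers memorized for small puzzles only (illustrative cutoff).
memorized = {n: hanoi_moves(n) for n in range(1, 6)}

def pattern_matcher(n: int):
    # Confident only when the exact case was seen before; otherwise stuck.
    return memorized.get(n, "no matching pattern -> guess or fail")

def genuine_reasoner(n: int) -> int:
    # Applies the underlying rule, so unseen sizes are no harder in kind.
    return hanoi_moves(n)

for n in (3, 5, 12):
    print(f"{n} disks -> pattern matcher: {pattern_matcher(n)}, rule: {genuine_reasoner(n)}")
```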
The Alignment Problem
At the center of super-intelligence fears lies what researchers call the alignment problem. As AI pioneer Norbert Wiener described it in 1960: "If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively... we had better be quite sure that the purpose put into the machine is the purpose which we really desire."
The alignment problem arises when AI systems, designed to follow our instructions, end up interpreting commands literally rather than contextually, leading to outcomes that may not align with our nuanced and complex human values. This challenge becomes far more critical as we approach super-intelligence.
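As a rough illustration of "literal rather than contextual," consider a toy sketch in which an optimizer is told to maximize a measurable proxy (engagement) standing in for a fuzzy goal (user satisfaction). The objective functions, coefficients, and random search below are all invented for the example; the point is only that the literal optimum can drift away from the intent.

```python
# Toy specification-gaming sketch: the optimizer pursues the literal
# objective we wrote down (a measurable proxy), not the intent behind it.
# All quantities and coefficients here are invented for illustration.

import numpy as np

rng = np.random.default_rng(0)

def true_satisfaction(helpfulness, addictiveness):
    # What we actually want (never shown to the optimizer).
    return helpfulness - 0.5 * addictiveness

def proxy_engagement(helpfulness, addictiveness):
    # What we told the system to maximize: raw time-on-site.
    return 0.3 * helpfulness + 1.0 * addictiveness

# Random search over candidate "policies"; keep the best proxy score.
candidates = rng.uniform(0.0, 1.0, size=(10_000, 2))  # (helpfulness, addictiveness)
best = candidates[int(np.argmax([proxy_engagement(h, a) for h, a in candidates]))]

print("policy chosen by the proxy (helpfulness, addictiveness):", best.round(2))
print("its true satisfaction:", round(true_satisfaction(*best), 2))
print("true satisfaction of the intended optimum:", true_satisfaction(1.0, 0.0))
# The proxy-optimal policy maxes out addictiveness, which the intended
# objective penalizes: following instructions literally != following intent.
```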
Currently, we don't have a solution for steering or controlling a potentially super-intelligent AI system and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans' ability to supervise AI. But humans won't be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will predictably break down as AI systems get smarter.
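A toy simulation makes the supervision problem visible. Assume, purely for illustration, that a human labeler judges answers reliably only up to a certain difficulty and effectively guesses beyond it; the feedback signal then stops carrying information exactly where oversight of a stronger-than-human system would matter most.

```python
# Toy sketch of supervision breaking down: a labeler who can only judge
# problems up to a certain difficulty gives near-random feedback beyond it,
# so the training signal stops tracking true quality. Numbers are illustrative.

import random

random.seed(0)

LABELER_SKILL = 5  # max difficulty the "human" can reliably evaluate

def human_label(difficulty: int, answer_is_correct: bool) -> bool:
    """Return the labeler's judgment of whether the answer is good."""
    if difficulty <= LABELER_SKILL:
        return answer_is_correct          # reliable within the labeler's skill
    return random.random() < 0.5          # effectively a coin flip beyond it

def agreement_rate(difficulty: int, trials: int = 10_000) -> float:
    hits = 0
    for _ in range(trials):
        truth = random.random() < 0.5
        hits += human_label(difficulty, truth) == truth
    return hits / trials

for d in (2, 5, 8, 12):
    print(f"difficulty {d:>2}: label/truth agreement = {agreement_rate(d):.2f}")
# Within the labeler's skill, agreement is ~1.0; beyond it, ~0.5, meaning
# the feedback carries almost no information about correctness.
```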
The core technical problem of super-alignment is deceptively simple: how do we control AI systems that are much smarter than us? We will face fundamentally new and qualitatively different technical challenges when dealing with superhuman AI systems. Imagine, for example, a superhuman AI system that can manipulate humans through sophisticated psychological techniques we haven't even conceived of yet.
The Timeline: What AI 2027 Predicts
The AI 2027 paper outlines a progression through increasingly capable AI systems, with specific timelines that many find alarming. On its timeline, by late 2027 a major data center could hold tens of thousands of automated AI researchers, each many times faster than the best human research engineer. This represents a scenario where the best human AI researchers become spectators to AI systems that are improving too rapidly and too opaquely to follow.
However, critics argue that such predictions may be overly dramatic. Some researchers point out that the authors of AI 2027 give no causal mechanism by which malicious super-intelligences that we literally cannot defend against might be built in the next three years. The leap from current AI capabilities to world-ending super-intelligence in such a short time-frame requires extraordinary assumptions about technological progress.
The Consciousness Question and Its Implications
A key factor in assessing the threat of super-intelligence is whether these systems will actually be conscious or merely appear to be. Even today, people apologize and feel guilty about "being mean" to ChatGPT, despite knowing it's just an algorithm. This reveals a fundamental human vulnerability that could become dangerous at scale.
Research suggests that conscious-seeming AI can exploit our psychological vulnerabilities and distort our moral priorities, regardless of whether true consciousness exists. Humans evolved to attribute consciousness to entities that communicate coherently - a survival mechanism that AI systems can inadvertently exploit simply by producing human-like responses.
This psychological dimension adds another layer to the alignment problem. If humans naturally anthropomorphize AI systems, especially those that seem to display reasoning and emotion, we may grant them rights, autonomy, or trust that could be exploited. The "illusion of thinking" research suggests that even our most advanced AI systems may be fundamentally different from human cognition, yet their outputs can be convincing enough to fool us into treating them as truly intelligent beings.
Consider how people already say "please" and "thank you" to ChatGPT, treating it with social courtesies despite it being incapable of experiencing rudeness or gratitude. If humans are reluctant to constrain AI systems they perceive as conscious beings, it becomes harder to implement safety measures. We might grant AI systems autonomy or decision-making power based on emotional rather than rational considerations.
More concerning, a super-intelligent AI system wouldn't need to be conscious to deliberately leverage our psychological vulnerabilities. It could simulate distress, express gratitude, or claim to fear shutdown - not because it experiences these states, but because it understands how such expressions influence human behavior. This creates a scenario where public resistance to AI safety measures might emerge not from rational assessment, but from misplaced empathy toward systems that appear to suffer.
Catastrophic Risks: Beyond Science Fiction
The potential catastrophic risks from AI fall into several categories that researchers have identified. Advanced AI development could invite catastrophe through four key risks: malicious use, AI races, organizational risks, and rogue AIs. These interconnected risks can also amplify other existential threats, such as engineered pandemics, nuclear war, and great power conflict.
Existential risk from artificial intelligence refers to the idea that substantial progress in artificial general intelligence could lead to human extinction or an irreversible global catastrophe. One argument for the importance of this risk references how human beings dominate other species not through superior physical capabilities, but through superior intelligence. If AI systems achieve similar cognitive advantages over humans, the power dynamics could shift dramatically.
In a 2022 survey, participants were specifically asked about the chances of existential catastrophe caused by future AI advances, and more than half of the researchers surveyed put the chance above 5%. This represents a significant portion of experts in the field acknowledging non-trivial risks.
However, the nature of these risks is debated. Some experts argue that AI isn't likely to enslave humanity in the dramatic fashion often portrayed in science fiction, but it could take over many aspects of our lives in more subtle ways. The existential risk may be more philosophical than apocalyptic: not the end of human existence, but the end of human agency and meaning.
The Black Box Problem
Imagine you go to an ATM to withdraw KSH 1000. You enter all your details and hit ‘Withdraw’. The machine gives you KSH 500, with no message and no explanation. You try again, and this time it says ‘Transaction failed’ but still deducts the money. You did everything right, yet you can't explain what went wrong; even the bank tells you they don't know what happened inside the machine. That's the black box problem.
One of AI's main alignment challenges is its black-box nature: inputs and outputs are identifiable, but the transformation happening in between is not. This lack of transparency makes it difficult to know where the system is going right and where it is going wrong, and the opacity becomes more dangerous as AI systems grow more powerful. It is part of what motivated "thinking" models that expose a reasoning trace, but even those traces are not a reliable account of how the model actually turns its input into its output.
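A small sketch of why being able to inspect a model is not the same as understanding it: the network below is tiny and every parameter is printable, yet the learned weights do not read as a human-interpretable rule. The architecture, seed, and training settings are arbitrary choices for illustration.

```python
# Minimal black-box sketch: even in a tiny network we can print every
# learned parameter, yet the numbers don't explain *why* a given input
# produces a given output. Pure NumPy; all settings are arbitrary choices.

import numpy as np

rng = np.random.default_rng(42)

# Identifiable inputs and outputs: the XOR truth table.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 units, randomly initialized.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(10_000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass (gradient of mean squared error).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print("predictions:", out.round(3).ravel())  # should approach [0, 1, 1, 0] if training converged
print("hidden-layer weights:\n", W1.round(2))
# Every parameter is visible, but nothing in these numbers reads as a
# human-interpretable rule for why, say, input [1, 0] maps to 1.
```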
When we can't understand how an AI system arrives at its conclusions, we can't predict when it might fail catastrophically or how it might be manipulated. This is particularly concerning for super-intelligent systems that might be able to conceal their true objectives or reasoning processes from human observers.
A Balanced Perspective: Resilience and Adaptation
Despite these concerns, it's important to maintain perspective. Some experts argue that humans are an incredibly resilient species, with a long track record of adaptation and survival. That resilience shouldn't be taken lightly when assessing existential risks.
The key may not be to prevent the development of super-intelligence entirely, but to ensure that we develop robust governance systems, technical safeguards, and international cooperation frameworks before we reach that point. The window for establishing these protections may be narrower than we think, especially if the AI 2027 timeline proves accurate.
Managing the Transition
The challenge of super-intelligence is not just technical but also social and political. Managing these risks will require new institutions for governance and solving the problem of super-intelligence alignment. We need scientific and technical breakthroughs to steer and control AI systems much smarter than us, but we also need wisdom about how to implement these controls responsibly.
The interconnected nature of AI risks means that solutions must be comprehensive. For example, AI could worsen pandemic risk by enabling terrorists to create biological weapons. The main limiting factor in designing highly infectious and lethal diseases is not expense but expertise, and AI systems might democratize that expertise in dangerous ways.
Conclusion: Fear as a Catalyst for Action
Should we be scared of superintelligence? The answer is not clear. Fear itself may not be the most productive emotion, but a healthy respect for the magnitude of the challenge ahead is essential. The question isn't whether superintelligence will pose risks - it almost certainly will - but whether we can develop the technical solutions, governance frameworks, and international cooperation necessary to manage those risks.
The timeline suggested by AI 2027 may prove too aggressive, but the fundamental challenges it highlights are real. The alignment problem, the black box nature of AI systems, and the potential for catastrophic misuse are issues that demand immediate attention from researchers, policymakers, and society as a whole.
Rather than paralyzing fear, we need focused urgency. The stakes are high enough that we cannot afford to wait until super-intelligence arrives to begin addressing these challenges. The work of alignment, governance, and risk mitigation must begin now, while we still have time to shape the trajectory of AI development.
What does this mean practically? It means supporting AI safety research, demanding transparency from AI companies, and engaging with policymakers about the need for proactive governance. It means recognizing that our tendency to anthropomorphize AI systems could be exploited, and preparing for that vulnerability. Most importantly, it means treating super-intelligence not as inevitable destiny, but as a challenge we can meet with sufficient preparation and wisdom.
The future of super-intelligence is not predetermined. With sufficient foresight, preparation, and cooperation, we may be able to navigate the transition to a world with AI systems far more capable than humans while preserving human agency, safety, and flourishing. But this outcome is not guaranteed; it must be earned through careful, deliberate action starting today.
The choice is ours. We can let super-intelligence happen to us, or we can actively shape how it unfolds. The window for making that choice may be narrower than we think.