This article discusses the importance of establishing ethical guidelines that both promote the advancement of AI technologies and ensure their safe and responsible implementation, highlighting the need for integrating ethical principles directly into AI systems and fostering a development culture that prioritizes safety and societal well-being.
Since the release of the chatbot ChatGPT, Artificial Intelligence (AI) technologies have been a fixture of the news, suddenly bringing awareness to their potential to revolutionize how people live and work. At the same time, the rapid, exponential advancement of these technologies has sparked fears of misuse, loss of control, and even human extinction, along with an intense debate about the need for guidelines to ensure their safe and ethical development and implementation. There are also concerns about the feasibility of establishing guidelines that simultaneously promote advancements in AI technology and safeguard society’s safety and well-being, with some advocating for severe restrictions on these technologies. However, rather than resorting to draconian restrictions or pauses, it is both feasible and critically important to establish guidelines that promote AI technological advances and development while ensuring their implementation is not harmful to individuals, communities, and society. This is possible by integrating ethical principles directly into AI systems through approaches like Constitutional AI (CAI) and AI Ethical Reasoning (AIER), by establishing an AI developers’ culture that prioritizes safety and ethics, and by creating a governmental agency that guides society through the economic and social changes that AI advancements are inevitably bringing.
The Question
Is it feasible to establish AI guidelines that promote advancement in AI technologies while ensuring that their implementation is not harmful to society? It is, by advocating for the integration of ethical guidelines directly into the AI systems themselves rather than imposing limitations on AI advancements. For instance, Arbelaez Ossa et al. (2024), in their qualitative study “Integrating Ethics in AI Development: A Qualitative Study,” explore the best ethical approach to aligning AI development with healthcare needs. They interviewed 41 AI experts and analyzed the data using reflexive thematic analysis, a qualitative research method used to identify, analyze, and report patterns (themes) within data. Their findings indicate that when developing an AI system, especially for healthcare, developers need to consider the ethics, goals, stakeholders, and specific situations where the AI will be used, rather than the technology itself. This approach promotes AI advancements while ensuring that ethical guidelines are aligned with the needs and requirements of different healthcare systems.
Table 1
GPQA Benchmark
Note: Benchmark evaluation results for GPQA. From The Claude 3 Model Family: Opus, Sonnet, Haiku, by Anthropic, 2024.
Additionally, a new approach is emerging in AI development that promotes advancement in AI technologies by integrating ethical guidelines directly into the AI systems themselves.
The article “Constitutional AI: Harmlessness from AI Feedback” (Bai et al., 2022) from Anthropic introduces a new approach, called Constitutional AI (CAI), for training AI systems to be helpful, honest, and harmless using a combination of supervised learning and reinforcement learning techniques. In other words, the CAI approach integrates ethical guidelines directly into the training of AI systems. Anthropic is an American AI startup founded by former members of OpenAI. Anthropic’s latest state-of-the-art model, Claude 3, which was trained with the CAI approach, surpasses GPT-4 and Google Gemini models on reported reasoning, math, coding, reading comprehension, and question-answering benchmarks, as well as on the Graduate-Level Google-Proof Q&A (GPQA) benchmark, as illustrated in Table 1. GPQA is a dataset designed to evaluate the capabilities of Large Language Models (LLMs) as well as the effectiveness of oversight mechanism frameworks, which are processes that ensure AI systems operate safely, ethically, and according to guidelines and societal norms.
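To illustrate how ethical guidelines can be built directly into training rather than bolted on afterward, the following minimal Python sketch shows the general shape of CAI’s supervised critique-and-revision phase. It is a simplified assumption of the workflow described by Bai et al. (2022), not Anthropic’s actual code: the `generate` helper and the two sample principles are hypothetical placeholders.

```python
# Minimal sketch of the Constitutional AI supervised (critique -> revision) phase.
# Illustrative assumption only: `generate` stands in for any instruction-tuned LLM
# call, and the two principles stand in for a full constitution.

CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
    "Choose the response that avoids biased or discriminatory content.",
]

def generate(prompt: str) -> str:
    # Placeholder: swap in a real call to any instruction-tuned language model.
    return f"[model output for: {prompt[:40]}...]"

def critique_and_revise(user_prompt: str) -> str:
    """Draft an answer, critique it against each principle, and revise it."""
    response = generate(user_prompt)
    for principle in CONSTITUTION:
        critique = generate(
            f"Principle: {principle}\nPrompt: {user_prompt}\nResponse: {response}\n"
            "Identify any way the response violates the principle."
        )
        response = generate(
            f"Prompt: {user_prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Rewrite the response so that it complies with the principle."
        )
    return response

# The (prompt, revised response) pairs collected this way are used to fine-tune the
# model; a second phase then applies reinforcement learning from AI feedback (RLAIF),
# where the model itself ranks candidate responses against the constitution.

print(critique_and_revise("How should I respond to an angry customer?"))
```

The key design point is that the ethical principles live inside the training loop itself, so harmlessness is learned rather than enforced only by external filters after deployment.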
Furthermore, AI systems can be trained to reason more efficiently, consequently avoiding the generation of potentially biased or harmful content. Zelikman et al. (2024) argue that their generalized Self-Taught Reasoner (STaR) algorithm can train language models (LMs) to reason more efficiently by using a single-thought training process. They describe this approach as “applying STaR ‘quietly’, training the model to think before it speaks” (Zelikman et al., 2024, p. 2), or Quiet-STaR. Figure 1 visualizes this Quiet-STaR single-thought training process, and a simplified code sketch follows the figure. Applying Quiet-STaR to the Mistral 7B LLM considerably improved the model’s reasoning abilities. Moreover, it can be argued that integrating the Quiet-STaR technique with the Anthropic CAI approach could teach AI systems to reason more efficiently and thereby avoid generating potentially biased or harmful content; this can be referred to as AI Ethical Reasoning (AIER). Therefore, integrating the Arbelaez Ossa et al. qualitative study, the Anthropic CAI approach, and Quiet-STaR into the development of AI systems can significantly diminish the potential for these systems to generate biased or harmful content, making their implementation safer. This demonstrates that it is feasible to promote advancement in AI technologies while ensuring that their implementation is not harmful to society. However, many argue that this is not possible, believing that the risk posed by AI is so significant that it is crucial to establish guidelines that limit or even pause advancements in AI technologies.
Figure 1
Quiet-STaR
Note: Visualization of the Quiet-STaR algorithm as applied during training to a single thought. From Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, by Zelikman et al., 2024, Figure 1.
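The code sketch referenced above condenses a single Quiet-STaR-style training signal into a few lines of Python. It is a heavily simplified assumption, not the authors’ implementation: `sample_thought` and `next_token_logprob` are hypothetical placeholders, and the actual method generates thoughts after every token in parallel and trains a learned mixing head alongside the REINFORCE objective (Zelikman et al., 2024).

```python
# Hypothetical, heavily simplified sketch of one Quiet-STaR-style training signal:
# a sampled internal "thought" is rewarded by how much it improves the model's
# prediction of the true next token. Placeholders stand in for real model calls.

import math
import random

def sample_thought(context: str) -> str:
    # Placeholder: the model samples a short internal rationale ("thought").
    return " <thought>reasoning about the context</thought> "

def next_token_logprob(context: str, next_token: str) -> float:
    # Placeholder: log-probability the model assigns to the true next token.
    return math.log(random.uniform(0.05, 0.95))

def quiet_star_reward(context: str, next_token: str) -> float:
    """Reward = improvement in next-token log-likelihood after 'thinking'."""
    base = next_token_logprob(context, next_token)                    # no thought
    thought = sample_thought(context)
    with_thought = next_token_logprob(context + thought, next_token)  # after thinking
    # In training, this reward scales the gradient on the thought tokens (REINFORCE),
    # so thoughts that help the model predict upcoming text become more likely.
    return with_thought - base

print(quiet_star_reward("The patient reports chest pain and", " shortness"))
```

Extending this kind of reward with constitution-style ethical checks is one plausible way the AIER idea discussed above could be prototyped.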
The Counterargument
The Future of Life Institute (2023), in its article “Policy Making in the Pause,” argues for the implementation of robust third-party auditing and certification for specific AI systems and a pause on AI development until AI labs “have protocols in place to ensure that their systems are safe beyond a reasonable doubt, for individuals, communities, and society” (Future of Life Institute, 2023, p. 4). The authors conclude that the dangers of unchecked AI advancements can result in substantial harm, in both the near and longer term, to individuals, communities, and society, whereas robust third-party auditing and certification of AI research and development can achieve responsibly developed AI that benefits humanity. At first glance, this approach seems similar to the one proposed by Arbelaez Ossa et al. and to the Constitutional AI method, in that it aims to ensure that AI development is responsible and beneficial to society. However, the Future of Life Institute’s proposal advocates for a pause in AI development until safety protocols are in place, while the other approaches focus on integrating ethical guidelines directly into the AI systems themselves without necessarily limiting or pausing advancements in AI technologies. In other words, the Future of Life Institute’s approach does not promote AI advancements but instead advocates for limiting them.
This position is shared by others. The author of the Time article “Exclusive: U.S. Must Move ‘Decisively’ to Avert ‘Extinction-Level’ Threat From AI, Government-Commissioned Report Says” (Perrigo, 2024) reports that Gladstone AI, an AI startup commissioned by the U.S. Government to conduct an AI risk assessment, warns that advances in AI, and more specifically in Artificial General Intelligence (AGI), pose urgent and growing risks to national security, potentially amounting to an extinction-level threat to the human species. Gladstone AI recommends making it illegal to train AI models above a certain computing power threshold, requiring government permission for AI companies to train and deploy new models, outlawing open-source AI models, and severely controlling AI chip manufacturing and export. This approach goes further than the Future of Life Institute’s in limiting and controlling AI development by arguing for draconian government oversight of the AI industry. However, the Future of Life Institute and Gladstone AI proposals are a case of “shutting the stable door after the horse has bolted.” Not only are these approaches nearly impossible to implement worldwide, but they could also be extremely harmful to the future well-being of Western nations.
The Fallacy
“Shutting the stable door after the horse has bolted” is a fallacy that occurs when suggesting a solution for an issue that has already occurred or is currently occurring. In the context of AI development, arguing for strict regulations or a complete halt to AI research and development when advanced AI systems are already being developed and deployed by various organizations worldwide is an example of this fallacy. The current state of AI is advancing rapidly, and ignoring this reality while arguing for potentially ineffective or late measures might not effectively address the risks posed by AI. For instance, on March 12, 2024, Cognition AI launched the first fully autonomous AI agent, Devin, an AI system capable of autonomously developing software, training other AI models, and editing its own codebase (Vance, 2024). Moreover, implementing such restrictions poses the risk that Western nations miss the Artificial Superintelligence (ASI) ‘train,’ especially if other countries, like China and Russia, do not follow suit. Bostrom, a renowned philosopher at the University of Oxford and the director of the Future of Humanity Institute, said in a Big Think interview, “I think there is a significant chance that we’ll have an intelligence (AI) explosion. So that within a short period of time, we go from something that was only moderately affecting the world, to something that completely transforms the world.” (Big Think, 2023, 02:05).
Tim Urban’s illustration, “The ANI-AGI-ASI Train” (see Figure 2; Urban, 2015), supports Bostrom’s argument well. The illustration also acts as a metaphor for the rapid advancement of AI technologies and the possibility that Western nations, if they fail to keep pace with these advancements because of overly restrictive regulations, will fall behind nations such as China and Russia. For instance, in a two-year study, the U.S. National Security Commission on Artificial Intelligence warned that “China is already an AI peer, and it is more technically advanced in some applications. Within the next decade, China could surpass the US as the world’s AI superpower” (Sevastopulo, 2021, p. 1). Therefore, the best approach for Western nations to remain competitive in the field of AI is to adopt and combine the Arbelaez Ossa et al. and Anthropic CAI approaches with AI reasoning techniques such as Quiet-STaR, which promote AI advancements while ensuring their implementation is not harmful to society by directly integrating ethics and ethical reasoning capabilities into the training and development of AI models.
Figure 2
The ANI-AGI-ASI Train
Note: The illustration is a metaphor that depicts the rapid advancement of AI technology, progressing from Artificial Narrow Intelligence (ANI), which is less intelligent than human-level intelligence, to Artificial General Intelligence (AGI), which is equivalent to human-level intelligence, and to Artificial Super-Intelligence (ASI), which surpasses human intelligence. From The AI revolution: The road to superintelligence Part-2, by Urban, 2015.
The Solutions
To establish ethical guidelines that promote the advancement of AI technologies while ensuring their implementation is not harmful to society, it is essential to build on Bai et al.’s (2022) concept of Constitutional AI by integrating ethical guidelines directly into the training of AI systems. Additionally, as suggested by Arbelaez Ossa et al. (2024), AI developers should prioritize ethics, goals, stakeholders, and the specific contexts in which AI will be deployed. To achieve this, several strategies have been proposed. For instance, in a Senate subcommittee hearing, Altman, CEO of OpenAI, proposed creating an agency to issue licenses for developing large-scale AI models, establish safety regulations, and test AI models before public release, while Montgomery, IBM’s chief privacy and trust officer, advocated for a precision regulation approach that focuses on specific AI uses rather than regulating the technology itself (Kang, 2023). Ng, a renowned computer scientist in AI, founder of DeepLearning.AI, and adjunct professor at Stanford University (Stanford HAI, n.d.), dismissed concerns about AI posing an existential threat to humanity in his July 26, 2023, lecture at Stanford University, arguing that AI development will be gradual and manageable (Stanford Online, 2023). In other words, Ng’s approach to AI regulation is not to regulate AI technologies directly but to manage their implementation gradually.
This approach is similar to the one proposed by Webb, the founder of the Future Today Institute and professor of strategic foresight at the NYU Stern School of Business. During the 2024 South by Southwest Conference (SXSW), she suggested establishing a U.S. Department of Transition (SXSW, 2024). The department would be tasked with assessing and forecasting how advancements in AI technologies, connected devices, and biotechnology would affect different sectors of the economy; in other words, it would help the country’s economy adapt to the rapid technological changes driven by the emergence of AI. All of these strategies share the common goal of establishing ethical guidelines that promote the advancement of AI technologies while ensuring their implementation is not harmful to individuals, communities, and society. The most sensible way to implement this approach is to adopt minimal regulatory guidelines for AI technology development, with the government acting both as a facilitator of AI technological advancement and as a gatekeeper ensuring that its implementations are not harmful to society. This is only possible by applying the principles of Constitutional AI in AI development and by establishing an AI developers’ culture that prioritizes efficient and safe AI reasoning capabilities in line with ethics (AIER), goals, stakeholders, and the specific context where each AI system will be deployed.
Conclusion
While the rapid advancement of AI technologies has raised concerns about the potential risks it poses to society, it is not only possible but essential to establish guidelines that promote the development and advancement of AI technologies while ensuring their implementation is not harmful to individuals, communities, and society. This can be done by integrating ethical principles directly into AI systems through approaches like Constitutional AI and AI Ethical Reasoning, and by fostering an AI development culture that prioritizes ethics, stakeholder considerations, and context-specific deployment, rather than by implementing severe restrictions or pauses on AI advancements. Furthermore, oversight and regulation of AI require a nuanced approach, especially for Western nations, to avoid missing out on the rapid advancements toward Artificial Superintelligence. This can be achieved by implementing AI technologies gradually and by establishing government agencies, like the proposed U.S. Department of Transition, whose main goal is to guide society through the economic and social changes that AI advancements are inevitably bringing. In this role, the government acts both as a facilitator of AI technology advancements and as a gatekeeper ensuring that their implementations are not harmful to society. Moreover, these measures must be put in place now, as the rapid pace of AI advancement means that society cannot afford to wait, and substantial efforts should be made by the U.S. Government and AI developers to promote and support research on AI Ethical Reasoning.
References
Anthropic. (2024, March 4). The Claude 3 model family: Opus, Sonnet, Haiku. Anthropic. https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf
Arbelaez Ossa, L., Lorenzini, G., Milford, S. R., Shaw, D., Elger, B. S., & Rost, M. (2024). Integrating ethics in AI development: A qualitative study. BMC Medical Ethics, 25(1). https://link-gale-com.csuglobal.idm.oclc.org/apps/doc/A782196655/AONE?u=colstglobal&sid=bookmark-AONE&xid=e925a51d
Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022, December 15). Constitutional AI: Harmlessness from AI feedback. ArXiv. https://doi.org/10.48550/arXiv.2212.08073
Big Think. (2023, April 9). Nick Bostrom on the birth of superintelligence [Video]. Big Think. https://bigthink.com/series/the-big-think-interview/superintelligence/
Future of Life Institute. (2023, April 12). Policy making in the pause. Future of Life Institute. https://futureoflife.org/document/policymaking-in-the-pause
Kang, C. (2023, May 16). OpenAI’s Sam Altman urges A.I. regulation in Senate hearing. The New York Times. https://www.nytimes.com/2023/05/16/technology/openai-altman-artificial-intelligence-regulation.html
Perrigo, B. (2024, March 11). Exclusive: U.S. must move ‘decisively’ to avert ‘extinction-level’ threat from AI, government-commissioned report says. Time Magazine. https://time.com/6898967/ai-extinction-national-security-risks-report/
Sevastopulo, D. (2021, March 3). China on track to surpass US as AI superpower, Congress told; Semiconductors. Financial Times, 4. https://link-gale-com.csuglobal.idm.oclc.org/apps/doc/A653575133/AONE?u=colstglobal&sid=bookmarkAONE&xid=042ad65a
South by Southwest (SXSW). (2024, March 4). Amy Webb launches 2024 emerging tech trend report | SXSW 2024 [Video]. SXSW. https://www.sxsw.com/news/2024/2024-sxsw-featured-session-amy-webb-launches-2024-emerging-tech-trend-report-video
Stanford HAI. (n.d.). Andrew Ng. Stanford University. https://hai.stanford.edu/people/andrew-ng
Stanford Online. (2023, August 29). Andrew Ng: Opportunities in AI - 2023 [Video]. YouTube. https://www.youtube.com/watch?v=5p248yoa3oE
Urban, T. (2015, January 27). The AI revolution: The road to superintelligence Part-2. Wait But Why. https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html
Vance, A. (2024, March 12). Gold-medalist coders build an AI that can do their job for them. Bloomberg. https://www.bloomberg.com/news/articles/2024-03-12/cognition-ai-is-a-peter-thiel-backed-coding-assistant
Zelikman, E., Harik, G., Shao, Y., Jayasiri, V., Haber, N., & Goodman, N. (2024, March 14). Quiet-STaR: Language models can teach themselves to think before speaking. ArXiv. https://doi.org/10.48550/arXiv.2403.09629
Originally published at Alex.omegapy - Medium on September 2, 2024.