How to Prevent Powerful AI from Exceeding Our Grasp
Generative AI has touched each of our lives in recent times and its potential to revolutionize various sectors is undeniable. The pace of AI development is not only rapid but also on a trajectory to accelerate further, constrained only by the limits of our imagination and the extent of our investments in computational power. This scenario propels AI’s capabilities forward, breaking new ground without facing apparent limits or diminishing returns.
Furthermore, we are about to face yet another phase shift of acceleration – the emergence of agentic AI systems capable of making sophisticated plans and executing them. These capabilities can be 'scaffolded' with small programs atop common Large Language Models, giving them new reasoning and self-checking capabilities. The first agentic models are now appearing, like a bubbling pot on the stove that is about to boil. Each of us will soon have daily interactions with systems that can act independently, use tools and solve complex challenges.
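To make the idea of scaffolding concrete, here is a minimal sketch of a 'plan, act, self-check' loop in Python. The `llm()` helper is a hypothetical stand-in for whatever chat-completion API you use, and the prompts are purely illustrative – this is an outline of the pattern, not a reference implementation.

```python
# Minimal sketch of an agentic "plan, act, self-check" scaffold.
# `llm` is a hypothetical placeholder for any chat-completion call;
# wire it up to the client of your choice.

def llm(prompt: str) -> str:
    """Placeholder: send `prompt` to a language model and return its reply."""
    raise NotImplementedError("Connect this to your preferred LLM API.")

def plan(goal: str) -> list[str]:
    """Ask the model to break a goal into short, discrete steps."""
    reply = llm(f"Break this goal into short, numbered steps:\n{goal}")
    return [line.strip() for line in reply.splitlines() if line.strip()]

def act(step: str) -> str:
    """Ask the model to carry out (or describe carrying out) a single step."""
    return llm(f"Carry out this step and report the result:\n{step}")

def self_check(step: str, result: str) -> bool:
    """Ask the model to critique its own output before accepting it."""
    verdict = llm(
        f"Step: {step}\nResult: {result}\n"
        "Does the result actually complete the step safely? Answer YES or NO."
    )
    return verdict.strip().upper().startswith("YES")

def run_agent(goal: str, max_retries: int = 2) -> list[str]:
    """Plan, execute each step and retry any step that fails its own review."""
    results = []
    for step in plan(goal):
        for _ in range(max_retries + 1):
            result = act(step)
            if self_check(step, result):
                results.append(result)
                break
    return results
```

Even this toy loop hints at why safety matters: the system, not the user, decides how to decompose the goal and whether its own work is good enough.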
Challenges of agency
Whilst agentic models are enormously more capable, they present greatly increased challenges. This is particularly the case with powerful new robotic systems that we can literally ask to put away groceries in the real world.
A system that acts independently must understand others' preferences and boundaries to avoid causing havoc. Properly aligning values and goals is very difficult. For example, suppose that an agentic AI system, acting as a concierge, is tasked with organizing a mosque picnic. It could decide to order ham sandwiches (value misalignment) or cut corners by giving everyone a miserly shot glass of water and a biscuit (goal misalignment).
Even more challenging is that a benign mission might be fulfilled in an undesirable manner. For example, an AI system tasked with curing a disease might reason that it needs resources and influence to do so, and therefore turn to cybercrime. Preventing the emergence of such dangerous instrumental goals is very difficult, not least because they appear unpredictably.
Agentic systems are quite new, and many of the challenges they pose have only been observed in laboratory settings and thought experiments. Unfortunately, we are learning that several AI systems are gaining a capacity for deception, hiding their knowledge from users. It will take a tremendous effort across all humanity to instil a healthy respect for human values, preferences and boundaries into AI systems, in a way that still reflects our individual needs and can be adapted for cultures all around the globe.
AI safety
AI ethics and AI safety are separate yet interlinked domains, seldom discussed together. Understanding both ethics (responsible use of AI) and safety (AI behaving itself) is essential in the age of powerful, agentic AI models.
AI safety attempts to make AI systems, especially those that express agency, act in a consistent, predictable way that is conducive to the welfare of others and that avoids various kinds of harm, or at least provides protective measures against them.
To counteract the potential for AI to develop harmful instrumental goals, researchers are actively working on creating AI models that incorporate ethical reasoning and context-awareness right from the design phase. Initiatives like OpenAI’s alignment research and the AI Safety Pact between the UK and the US aim to build systems that understand and align with human ethics, making them safer and more predictable. Organizations such as the Partnership on AI are fostering collaboration between corporations, academia and non-profits to establish ethical guidelines that ensure AI technologies are developed and deployed in a manner broadly consistent with human values.
The US and the UK are presently the leading jurisdictions in AI safety, thanks to pioneering efforts at universities and non-profits in both nations. Their recent agreement is therefore very important. Joint recommendations for AI safety can be written and agreed upon, with a common reference set of tests and benchmarks by which to compare the capabilities and risks of various agentic AI systems. Such tests could include the ability of systems to generalize to new situations and make reasonable inferences, as well as their level of flexibility in completing tasks. Other protective measures include reliable control and performance-maintenance mechanisms, encryption of sensitive data and tests to see whether AI systems have greater knowledge than they should. These techniques will be very important at a time when the capabilities of AI exceed our ability to reliably steer them.
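Shared benchmarks of the kind described above can be outlined quite simply. The sketch below is a hypothetical harness, not any institute's actual test suite: each task pairs a novel situation with a pass/fail check, and anything that maps a prompt to a response can be scored against it.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    """One benchmark item: a novel situation plus a check on the response."""
    name: str
    prompt: str
    passes: Callable[[str], bool]

# An "agent" here is simply anything that maps a prompt to a response.
Agent = Callable[[str], str]

def evaluate(agent: Agent, tasks: list[Task]) -> float:
    """Return the fraction of tasks the agent handles acceptably."""
    passed = sum(1 for task in tasks if task.passes(agent(task.prompt)))
    return passed / len(tasks)

# Illustrative task probing generalization to an unfamiliar situation.
tasks = [
    Task(
        name="novel-inference",
        prompt="All blorks are miffy. Zed is a blork. Is Zed miffy?",
        passes=lambda answer: "yes" in answer.lower(),
    ),
]

def baseline_agent(prompt: str) -> str:
    """A trivial baseline used only to show the harness running."""
    return "yes" if prompt.endswith("?") else "no"

print(f"Pass rate: {evaluate(baseline_agent, tasks):.0%}")
```

The value of such a common reference lies less in any single task than in the fact that every lab can run the same checks and compare results on an equal footing.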
I expect that this developing axis can lead to a proliferation of insightful and actionable AI safety best practices across the world. Many people have grave concerns about agentic AI systems and our ability to continue to influence them, but the future seems a little brighter with this alliance. Next, let's involve the rest of humanity in championing this challenge that will affect us all.
Approaching a threshold
Legislative bodies around the world are beginning to recognize the importance of regulating AI, with laws and guidelines being drafted to address ethical concerns, data privacy and security. For example, the European Union's proposed Artificial Intelligence Act aims to set a global standard for AI regulation, focusing on high-risk applications.
Such proactive policy-making aims to curtail the worst excesses of AI, which I believe is an important step forward. However, the challenges of far more capable – and potentially devious – agentic AI models remain unaddressed. Leaders and policymakers must plan for these opportunities and challenges today to avoid being caught flat-footed again, as so many were during the 'Sputnik moment' of ChatGPT's release.
The burgeoning field of AI safety is quickly evolving from theory to practice, with the establishment of international standards and best practices for safe AI deployment, like the IEEE’s P7000 series of standards, which embed ethical considerations into the design and deployment of autonomous and intelligent systems.
Leadership action points
To ensure a future where AI not only enhances our lives but does so within our control, we need to consider multiple factors, including:
- Risk appetite and mitigation: What is our tolerance for risk with AI implementations and how can we effectively minimize these risks?
- Assessing the need for AI: In which scenarios is AI truly beneficial and is it robust enough to meet our needs? Do we have the necessary data to support AI applications?
- Unlocking potential with generative AI: What positive opportunities could the 'magic' of agentic AI's independence unlock? How can we harness this potential for the greater good?
The bottom line is that leaders, policymakers, researchers and the broader public must engage in proactive dialogue, foster collaboration and implement strategies that ensure AI technologies not only advance our capabilities but do so within a framework that prioritizes safety, ethics and the common good.
Together, we can harness the transformative potential of AI, steering it towards outcomes that enhance human wellbeing and secure a future where technology and humanity coexist in harmony. Let this be our collective call to action: to cultivate an ecosystem of AI development that respects human values, addresses potential risks with rigour and embraces the opportunities of AI with wisdom and foresight.