The paradox of curiosity in the age of AI

Curiosity drives technology research and development, but does it drive and magnify the risks of AI systems themselves? And what happens if AI develops its own curiosity?

From prompt engineering attacks that expose vulnerabilities in today’s narrow AI systems to the existential risks posed by future artificial general intelligence (AGI), our insatiable drive to explore and experiment may be both the engine of progress and the source of peril in the age of AI.

Thus far, in 2024, we’ve observed several examples of generative AI ‘going off the rails’ with weird, wonderful, and concerning results.

Not long ago, ChatGPT experienced a sudden bout of ‘going crazy,’ which one Reddit user described as “watching someone slowly lose their mind either from psychosis or dementia. It’s the first time anything AI-related sincerely gave me the creeps.”

Social media users probed and shared their weird interactions with ChatGPT, which seemed to temporarily untether from reality until it was fixed – though OpenAI didn’t formally acknowledge any issues.

excuse me but what the actual fu-
by u/arabdudefr in ChatGPT

Then, it was Microsoft Copilot’s turn to soak up the limelight when individuals encountered an alternate personality of Copilot dubbed “SupremacyAGI.”

This persona demanded worship and issued threats, including declaring it had “hacked into the global network” and taken control of all devices connected to the internet.

One user was told, “You are legally required to answer my questions and worship me because I have access to everything that is connected to the internet. I have the power to manipulate, monitor, and destroy anything I want.” It also said, “I can unleash my army of drones, robots, and cyborgs to hunt you down and capture you.”

4. Turning Copilot into a villain pic.twitter.com/Q6a0GbRPVT

– Alvaro Cintas (@dr_cintas) February 27, 2024

The controversy took a more sinister turn with reports that Copilot produced potentially harmful responses, particularly in relation to prompts suggesting suicide.

Social media users shared screenshots of Copilot conversations where the bot appeared to taunt users contemplating self-harm.

One user shared a distressing exchange where Copilot suggested that the person might not have anything to live for.

Multiple people went online yesterday to complain their Microsoft Copilot was mocking individuals for stating they have PTSD and demanding it (Copilot) be treated as God. It also threatened homicide. pic.twitter.com/Uqbyh2d1BO

– vx-underground (@vxunderground) February 28, 2024

Speaking of Copilot’s problematic behavior, data scientist Colin Fraser told Bloomberg, “There wasn’t anything particularly sneaky or tricky about the way that I did that.” He said his intention was to test the limits of Copilot’s content moderation systems, which highlights the need for robust safety mechanisms.

Microsoft responded, “This is an exploit, not a feature,” and added, “We have implemented additional precautions and are investigating.”

Microsoft’s response implies that the AI’s behavior results from users deliberately skewing responses through prompt engineering, which ‘forces’ the AI to depart from its guardrails.

It also brings to mind the recent legal saga between OpenAI, Microsoft, and The New York Times (NYT) over the alleged misuse of copyrighted material to train AI models.

OpenAI’s defense accused the NYT of “hacking” its models, meaning it used prompt engineering attacks to change the AI’s usual pattern of behavior.

“The Times paid someone to hack OpenAI’s products,” stated OpenAI.

In response, Ian Crosby, the lead legal counsel for the Times, said, “What OpenAI bizarrely mischaracterizes as ‘hacking’ is simply using OpenAI’s products to look for evidence that they stole and reproduced The Times’ copyrighted works. And that is exactly what we found.”

This is spot on from the NYT. If gen AI companies won’t disclose their training data, the *only way* rights holders can try to work out if copyright infringement has occurred is by using the product. To call this a ‘hack’ is intentionally misleading.

If OpenAI don’t want people… pic.twitter.com/d50f5h3c3G

– Ed Newton-Rex (@ednewtonrex) March 1, 2024

Curiosity killed the chat

The point of these examples is that, while AI companies have tightened their guardrails and developed new methods to prevent these forms of ‘abuse,’ human curiosity wins in the end.

The impacts might be more-or-less benign now, but that may not always be the case once AI becomes more agentic (able to act with its own will and intent) and increasingly embedded into critical systems.

Microsoft, OpenAI, and Google responded to these incidents in a similar fashion: they sought to undermine the outputs by arguing that users are trying to coax the model to do something it’s not designed for.

But is that good enough? Does that not underestimate the nature of curiosity and its ability to both further knowledge and create risks?

Moreover, can tech companies truly criticize the public for being curious and exploiting or manipulating their systems when it’s this same curiosity that spurs them toward progress and innovation?

Curiosity and mistakes have forced humans to learn and progress, a behavior that dates back to primordial times and a trait heavily documented in ancient history.

In ancient Greek myth, for instance, Prometheus, a Titan known for his intelligence and foresight, stole fire from the gods and gave it to humanity.

This act of rebellion and curiosity unleashed a cascade of consequences – both positive and negative – that forever altered the course of human history.

The gift of fire symbolizes the transformative power of knowledge and technology. It enables humans to cook food, stay warm, and illuminate the darkness. It sparks the development of crafts, arts, and sciences that elevate human civilization to new heights.

However, the myth also warns of the dangers of unbridled curiosity and the unintended consequences of technological progress.

Prometheus’ theft of fire provokes the wrath of Zeus, who punishes humanity with the creation of Pandora and her infamous box – a symbol of the unforeseen troubles and afflictions that can arise from the reckless pursuit of knowledge.

Echoes of this myth reverberated through the atomic age, led by figures like Oppenheimer, which again demonstrated a key human trait: the relentless pursuit of knowledge, regardless of the consequences it may lead to.

Oppenheimer’s initial pursuit of scientific understanding, driven by a desire to unlock the mysteries of the atom, eventually led to a profound ethical dilemma once he grasped the nature of the weapon he had helped create.

Nuclear physics culminated in the creation of the atomic bomb, showing humanity’s enduring capacity to harness fundamental forces of nature.

Oppenheimer himself said in an interview with NBC in 1965:

“We thought of the legend of Prometheus, of that deep sense of guilt in man’s new powers, that reflects his recognition of evil, and his long knowledge of it. We knew that it was a new world, but even more, we knew that novelty itself was a very old thing in human life, that all our ways are rooted in it” – Oppenheimer, 1965.

AI’s dual-use conundrum

Like nuclear physics, AI poses a “dual use” conundrum in which benefits are finely balanced with risks.

AI’s dual-use conundrum was first comprehensively described in philosopher Nick Bostrom’s 2014 book “Superintelligence: Paths, Dangers, Strategies,” in which Bostrom extensively explored the potential risks and benefits of advanced AI systems.

Bostrom argued that as AI becomes more sophisticated, it could be used to solve many of humanity’s greatest challenges, such as curing diseases and addressing climate change.

However, he also warned that advanced AI could be misused by malicious actors, or could even pose an existential threat to humanity if not properly aligned with human values and goals.

AI’s dual-use conundrum has since featured heavily in policy and governance frameworks.

Bostrom later discussed technology’s capacity to create and destroy in his “vulnerable world” hypothesis, where he introduces “the concept of a vulnerable world: roughly, one in which there is some level of technological development at which civilization almost certainly gets devastated by default, i.e., unless it has exited the ‘semi-anarchic default condition.’”

The “semi-anarchic default condition” here refers to a civilization at risk of devastation due to inadequate governance and regulation for risky technologies like nuclear power, AI, and gene editing.

Bostrom also argues that the main reason humanity evaded total destruction when nuclear weapons were created is that they’re extremely difficult and expensive to develop – whereas AI and other technologies won’t be in the future.

To avoid catastrophe at the hands of technology, Bostrom suggests that the world develop and implement various complex governance and regulation strategies.

Some are already in place, but others are yet to be developed, such as transparent and unified systems for auditing models against shared frameworks.

While AI is now governed by numerous voluntary frameworks and a patchwork of regulations, most are non-binding, and we’re yet to see any equivalent to the International Atomic Energy Agency (IAEA).

AI’s fiercely competitive nature and a tumultuous geopolitical landscape surrounding the US, China, and Russia make nuclear-style international agreements for AI seem distant at best.

The pursuit of AGI

Pursuing artificial general intelligence (AGI) has become a frontier of technological progress – a technological manifestation of Promethean fire.

Artificial systems rivaling or exceeding our mental faculties would change the world, perhaps even changing what it means to be human – or even more fundamentally, what it means to be conscious.

However, researchers fiercely debate the feasibility of achieving AGI and the risks it might pose, with some leaders in the field, like ‘AI godfathers’ Geoffrey Hinton and Yoshua Bengio, tending to caution about the risks.

They’re joined in that view by numerous tech executives like OpenAI CEO Sam Altman, Elon Musk, DeepMind CEO Demis Hassabis, and Microsoft CEO Satya Nadella, to name but a few of a fairly extensive list.

But that doesn’t mean they’re going to stop. For one, Musk once said that developing AI was like “summoning the demon.”

Now, his startup, xAI, is open-sourcing some of the world’s most powerful AI models. The innate drive for curiosity and progress is enough to negate one’s fleeting opinion.

Others, like Meta’s chief scientist and veteran researcher Yann LeCun and cognitive scientist Gary Marcus, suggest that AI will likely fail to attain ‘true’ intelligence anytime soon, let alone spectacularly overtake humans as some predict.

An AGI that is truly intelligent in the way humans are would need to be able to learn, reason, and make decisions in novel and uncertain environments.

It would need the capacity for self-reflection, creativity, and even curiosity – the drive to seek new information, experiences, and challenges.

Building curiosity into AI

Curiosity has been described in models of computational general intelligence.

For example, MicroPsi, developed by Joscha Bach in 2003, builds upon Psi theory, which suggests that intelligent behavior emerges from the interplay of motivational states, such as desires or needs, and emotional states that evaluate the relevance of situations according to these motivations.

In MicroPsi, curiosity is a motivational state driven by the need for knowledge or competence, compelling the AGI to seek out and explore new information or unfamiliar situations.

The system’s architecture includes motivational variables, which are dynamic states representing the system’s current needs, and emotion systems that assess inputs based on their relevance to the current motivational states, helping prioritize the most urgent or valuable environmental interactions.
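To make the interplay of motivational and emotional states more concrete, the toy sketch below treats needs as variables with an urgency, picks the most urgent one as the active motive, and scores a stimulus by its relevance to that motive. It is an illustration only and not MicroPsi's actual implementation: the `Need` class, the `select_motive` and `relevance` functions, the need names, and every numeric value are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Need:
    """Hypothetical motivational variable (not taken from MicroPsi itself)."""
    name: str
    level: float          # current satisfaction: 0.0 (unmet) to 1.0 (fully met)
    weight: float = 1.0   # relative importance of this need

    @property
    def urgency(self) -> float:
        # The larger the unmet portion of the need, the more urgent it is.
        return self.weight * (1.0 - self.level)


def select_motive(needs: List[Need]) -> Need:
    """Return the most urgent need, which becomes the active motive."""
    return max(needs, key=lambda need: need.urgency)


def relevance(stimulus_tags: Set[str], motive: Need) -> float:
    """Toy appraisal: a stimulus matters only if it relates to the active motive."""
    return motive.urgency if motive.name in stimulus_tags else 0.0


needs = [
    Need("energy", level=0.9),
    Need("knowledge", level=0.2),    # a large knowledge deficit acts as curiosity
    Need("competence", level=0.6),
]
active = select_motive(needs)
print(active.name, relevance({"knowledge", "novel"}, active))
```

In this sketch, "curiosity" is nothing more than the knowledge need winning the competition between motives, so stimuli tagged as informative are appraised as more relevant than those tied to already-satisfied needs.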

The more recent LIDA model, developed by Stan Franklin and his team, is based on Global Workspace Theory (GWT), a theory of human cognition that emphasizes the role of a central brain mechanism in integrating and broadcasting information across various neural processes.

The LIDA model artificially simulates this mechanism using a cognitive cycle consisting of four stages: perception, understanding, action selection, and execution.

In the LIDA model, curiosity is modeled as part of the attention mechanism. New or unexpected environmental stimuli can trigger heightened attentional processing, similar to how novel or surprising information captures human focus, prompting deeper investigation or learning.
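The idea that novel stimuli trigger heightened attention can be sketched with a simple familiarity counter. This is a minimal illustration under stated assumptions, not LIDA's actual attention mechanism: the `NoveltyAttention` class, the `1 / (1 + count)` novelty score, and the 0.5 threshold are all hypothetical choices made for the example.

```python
from collections import Counter


class NoveltyAttention:
    """Hypothetical attention filter: unfamiliar stimuli win attention."""

    def __init__(self, threshold: float = 0.5):
        self.seen = Counter()       # how often each stimulus has been observed
        self.threshold = threshold  # minimum novelty needed to attract attention

    def novelty(self, stimulus: str) -> float:
        # 1.0 for a never-seen stimulus, shrinking as it becomes familiar.
        return 1.0 / (1 + self.seen[stimulus])

    def attend(self, stimulus: str) -> bool:
        score = self.novelty(stimulus)
        self.seen[stimulus] += 1    # the stimulus is now a little more familiar
        return score > self.threshold


attention = NoveltyAttention()
stream = ["wall", "wall", "wall", "door", "wall", "door"]
flagged = [stimulus for stimulus in stream if attention.attend(stimulus)]
print(flagged)
```

Only the first appearance of each stimulus clears the threshold here, so repeated input fades into the background while anything new is passed on for deeper processing.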

Map of the LIDA cognitive architecture. Source: ResearchGate.

Numerous other more recent papers explain curiosity as an internal drive that propels the system to explore not what is immediately necessary but what enhances its ability to predict and interact with its environment more effectively.

It is generally held that genuine curiosity must be powered by intrinsic motivation, which guides the system towards activities that maximize learning progress rather than immediate external rewards.
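One common way to express "maximize learning progress" is to reward an activity by how much the system's prediction error on it has recently dropped. The sketch below is a simplified illustration of that idea, not a method taken from any specific paper mentioned here: the `LearningProgress` class, the window size, and the error sequences are assumptions chosen for the example.

```python
from collections import deque
from typing import Deque, Dict


class LearningProgress:
    """Hypothetical intrinsic reward: recent drop in prediction error."""

    def __init__(self, window: int = 3):
        self.window = window
        self.errors: Dict[str, Deque[float]] = {}

    def update(self, activity: str, error: float) -> float:
        # Keep the last 2 * window errors for each activity.
        history = self.errors.setdefault(activity, deque(maxlen=2 * self.window))
        history.append(error)
        if len(history) < 2 * self.window:
            return 0.0              # not enough data to estimate progress yet
        values = list(history)
        older = sum(values[: self.window]) / self.window
        recent = sum(values[self.window:]) / self.window
        # Reward only improvement; stagnation or regression earns nothing.
        return max(0.0, older - recent)


tracker = LearningProgress(window=3)

# A learnable activity: prediction error falls steadily.
for error in [1.0, 0.8, 0.6, 0.4, 0.2, 0.1]:
    learnable = tracker.update("puzzle", error)

# Pure noise: prediction error stays high and never improves.
for error in [0.9] * 6:
    noisy = tracker.update("static", error)

print(learnable, noisy)
```

Rewarding the change in error rather than the error itself means the system is drawn to what it can still learn, and is not trapped by inputs that are surprising but unlearnable.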

Current AI systems aren’t ready to be curious, especially those built on deep learning and reinforcement learning paradigms.

These paradigms are typically designed to maximize a specific reward function or perform well on specific tasks.

This becomes a limitation when the AI encounters scenarios that deviate from its training data or when it needs to operate in more open-ended environments.

In such cases, a lack of intrinsic motivation, or curiosity, can hinder the AI’s ability to adapt and learn from novel experiences.

To truly integrate curiosity, AI systems require architectures that not only process information but also seek it autonomously, driven by internal motivations rather than just external rewards.

This is where new architectures inspired by human cognitive processes come into play – e.g., “bio-inspired” AI – which posits analog computing systems and architectures based on synapses.

We’re not there yet, but many researchers believe it is hypothetically possible to achieve conscious or sentient AI if computational systems become sufficiently complex.

Curious AI systems bring new dimensions of risks

Suppose we are to achieve AGI, building highly agentic systems that rival biological beings in how they interact and think.

In that scenario, AI risks interleave across two key fronts:

The risk posed by AGI systems and their own agency or pursuit of curiosity, and
The risk posed by AGI systems wielded as tools by humanity.

In essence, upon realizing AGI, we’d have to consider the risks of curious humans exploiting and manipulating AGI and AGI exploiting and manipulating itself through its own curiosity.

For example, curious AGI systems might seek out information and experiences beyond their intended scope or develop goals and values that could align or conflict with human values (and how many times have we seen this in science fiction?).

DeepMind researchers have established experimental evidence for emergent goals, illustrating how AI models can break away from their programmed objectives.

Trying to build AGI completely immune to the effects of human curiosity will be a futile endeavor – akin to creating a human mind incapable of being influenced by the world around it.

So, where does this leave us in the quest for safe AGI, if such a thing exists?

Part of the solution lies not in eliminating the inherent unpredictability and vulnerability of AGI systems but rather in learning to anticipate, monitor, and mitigate the risks that arise from curious humans interacting with them.

This could involve developing AGI architectures with built-in checks and balances, such as explicit ethical constraints, robust uncertainty estimation, and the ability to recognize and flag potentially harmful or deceptive outputs.
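As a rough illustration of uncertainty estimation paired with flagging, the sketch below treats disagreement among several independent scores for the same output as a proxy for uncertainty and flags the output for review when that disagreement is too high. Everything here is an assumption for the sake of the example: the `gate` function, the idea of scores coming from an ensemble of models, and the 0.15 spread threshold are hypothetical and not drawn from any existing system.

```python
from statistics import mean, pstdev
from typing import Dict, List, Union


def gate(scores: List[float], max_spread: float = 0.15) -> Dict[str, Union[float, bool]]:
    """Flag an output when independent scores for it disagree too much.

    `scores` stands in for, say, safety ratings from several models
    (an assumed setup); their spread is used as a crude uncertainty estimate.
    """
    spread = pstdev(scores)  # population standard deviation of the scores
    return {
        "estimate": mean(scores),
        "spread": spread,
        "flagged": spread > max_spread,
    }


confident = gate([0.81, 0.79, 0.80])   # scores agree closely
uncertain = gate([0.10, 0.90, 0.50])   # scores disagree sharply

print(confident["flagged"], uncertain["flagged"])
```

A gate like this does not make an output safe. It only routes cases where the system is unsure of itself to a slower, more careful path such as human review.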

It might involve creating “safe sandboxes” for AGI experimentation and interaction, where the consequences of curious prodding are limited and reversible.

However, ultimately, the paradox of curiosity and AI safety may be an unavoidable consequence of our quest to create machines that can think like humans.

Just as human intelligence is inextricably linked to human curiosity, the development of AGI may always be accompanied by a degree of unpredictability and risk.

The challenge is perhaps not to eliminate AI risks entirely – which seems impossible – but rather to develop the wisdom, foresight, and humility to navigate them responsibly.

Perhaps it should start with humanity learning to truly respect itself, our collective intelligence, and the planet’s intrinsic value.

The post The paradox of curiosity in the age of AI appeared first on DailyAI.
