A Trialogue on AI Existential Risks
A fictional story about why parts of the AI Safety community need more nuance in their discourse.
10/5/2025 · 16 min read


The bar hadn't changed. Anh noticed this immediately; the same scuffed wooden floors, the same Edison bulbs casting their warm, insufficient light, even what appeared to be the same bartender, though that was unlikely after three years. She arrived first, claimed their old corner booth, and ordered a gin and tonic that tasted exactly as overpriced as she remembered.
Hakim came next, slightly out of breath, his messenger bag stuffed with what looked like printed reports. He'd always been that way: a digital native who trusted paper. They embraced, and she felt him wince.
"Still doing the crossfit thing?" he asked.
"Powerlifting now. You're still doing the—" she gestured at his general posture, "—computer person slouch?"
"It's called intellectual intensity." He slid into the booth. "Where's Mira?"
"Late, obviously. Some things don't change."
But Mira wasn't that late. She arrived fifteen minutes later with apologies and an air of barely contained velocity, as though she'd been running multiple parallel threads in her mind and had reluctantly garbage-collected all but the one dedicated to being here. Her hair was shorter. She was wearing what Anh had learned to recognize as "tech casual": expensive clothing designed to look inexpensive.
"God, you both look exactly the same," Mira said, which wasn't quite true but felt so, the way old friends collapse time.
They caught up on the obvious things first. Anh's move to Philadelphia for the pharma job, managing AI integration projects for drug discovery. Hakim's pivot from a PhD in neural networks to AI policy work in Brussels, navigating the EU AI Act and its discontents. Mira's position at an AI safety research org in the Valley, working on interpretability and alignment.
"So we're all AI people now," Anh said. "When did that happen?"
"I don’t know for you, but I’ve worked on it since 2012!" Hakim said. "I’d say that, around 2023, for many it stopped being a niche hobby and started being... everything everywhere all at once?"
Mira took a long drink of her beer. "When it started being an emergency, you mean."
Of course, there it was. The old tension, surfacing like muscle memory. Anh had known it would come up, had almost looked forward to it. They'd had versions of this argument in graduate school, late nights in someone's apartment, before any of them knew where they'd end up and definitely before the arguments had stakes.
"I don't think I'd call it an emergency," Anh said carefully.
"What would you call it?" Mira leaned forward. "We're building something that might be smarter than us, we have no idea how to control it, and we're in an arms race to build it faster. That looks like an emergency to me!"
"It's a situation that requires careful attention," Hakim offered, in what Anh recognized as his diplomat voice.
"Which is different from an emergency. Emergencies make people stupid."
"People are already stupid," Mira said. "That's the problem."
Anh laughed despite herself. "Is that the sales pitch for AI safety these days? 'Join us, everyone else is stupid'?"
"I mean, am I wrong?" Mira's grin was sharp. "Look at what we're doing. We're racing toward something we don't understand, driven by market incentives and national security paranoia, and the main safety plan is 'hopefully we'll figure it out.' If that's not stupid, what is?"
"It's human," Anh said. "Which is different. Humans have been racing toward things they don't fully understand since we came down from the trees. It usually works out."
"Usually isn't good enough when the downside is extinction."
"Okay, but—" Hakim held up his hand. "Can we back up? Because I think we're starting from different premises. Mira, when you say we're building something smarter than us, what do you mean by 'smarter'?"
Mira looked at him like he'd asked what water was. "The ability to achieve goals in a wide range of environments. The ability to model the world accurately and plan effectively. General problem-solving capability across domains."
"Right. But that's already a very specific framing." Hakim had pulled out a pen and was drawing on a napkin, a habit that Anh found endearing and annoying in equal measure. "You're defining intelligence as this scalar thing, this single dimension we can turn up like a volume knob. Human-level intelligence, superhuman intelligence, AGI, ASI... it's all on one scale. But I’d ask you if that’s how intelligence actually works."
"It's how it seems to be working," Mira said. "Each generation of models is more capable than the last across basically every benchmark. That looks pretty unidimensional to me."
"Benchmarks that we designed," Hakim said. "That test for things we think of as intelligence. But okay, let's even grant that for a moment. Let's say we do build something with intelligence that exceeds humans on every cognitive dimension we can measure. Why does that necessarily translate into the ability to take over the world?"
"Because intelligence is power," Mira said immediately. "If you can think better than everyone else, you can outmaneuver them. You can hack systems, manipulate people, accumulate resources, and, of course, build more intelligence. It's the thing that matters most."
Anh had been watching this exchange with growing interest. "Oh, yeah? I mean, I know a lot of very intelligent people. Most of them are not powerful."
"That's because they're constrained," Mira said. "By bodies, by social structures, by resource limitations. An AI wouldn't have the same constraints."
"Wouldn't it?" Anh signaled for another round. "I spend my days watching AI systems try to navigate pharma R&D. These are sophisticated models, getting better every month. And yeah, they're very useful! They can predict protein structures, suggest novel compounds, parse through literature. But they can't actually do the experiments. They can't navigate the regulatory environment. They can't convince a skeptical review board or manage a clinical trial. All of that requires things that aren't intelligence."
"Such as?"
"Social capital. Institutional knowledge. Physical presence. Credibility. The ability to buy equipment, hire people, wait for results. An AI might be able to design a perfect drug, but if it can't synthesise it, test it, get it approved, manufacture it, and distribute it... what does it matter how smart it is?"
Mira was shaking her head. "You're thinking too small. A sufficiently intelligent AI would find ways around those constraints. It would manipulate people into doing what it wanted. It would accumulate resources through financial markets. It would—"
"Would it, though?" Hakim interrupted. "Or is this what we think would happen because it's what an intelligent human may try to do? But an AI isn't a human. It doesn't have our evolutionary history, our social instincts, our embodied experience of the world."
"That makes it more dangerous, not less," Mira said. "Because we can't predict what such an AI would do."
"Maybe. Or maybe it means our intuitions about what 'sufficiently intelligent' systems would do are just anthropomorphic projections." Hakim was warming to his theme now, speaking faster. "For example, there's this recent paper titled 'Lessons from a Chimp', I don't know if you've seen it, about AI deception research. It basically argues that a lot of the scary scenarios overattribute human traits to other agents (à la Daniel Dennett with his intentional stance). But that's a huge assumption."
Mira frowned. "Yeah, I read the paper and I think it misses the point. Sure, we shouldn't assume AIs will be deceptive in exactly the ways humans are. But the argument here is that powerful optimization processes will be deceptive because deception is instrumentally useful. That's not anthropomorphism, it's game theory."
"Is it?" Hakim leaned back. "Or is it game theory written by humans, about situations humans care about, using intuitions derived from human social behavior? Game theory doesn't emerge from the void. Someone has to decide what the players are, what they want, what moves are available, what information they have. All of those decisions are already laden with assumptions."
"So your position is what? We can't reason about this at all?"
"No, of course not. You know that I sympathise with the worries of the AI Safety school of thought. My position is we should be really careful about which parts of our reasoning are solid and which parts are us telling ourselves a story that feels compelling." Hakim paused. "You know what bothers me about a lot of some circles within the AI safety discourse? It has this aura of inevitability. Like: “the intelligence explosion is coming, alignment is nearly impossible, we have one shot to get it right or we're doomed”. It's too clean, too logical, too—"
"Rational," Anh supplied.
"Right. It has the aesthetic of perfect rationality. But (and I say this with love, Mira, because I know this is your world) I think sometimes that aesthetic gets in the way of actually understanding what's happening."
Mira had gone very still. "Elaborate."
"Okay." Hakim gathered his thoughts. "Look, when I was still doing technical work, I was all about the elegant proof, the clean mathematical argument. Then I went to Brussels and spent a year trying to help shape AI regulation. And you know what I learned? The actual world is nothing like a formal proof. It's this giant mess of competing interests and historical accidents and personality conflicts and things that almost worked once so now everyone assumes they're the right approach. At the same time, the classic AI existential risk story goes something like: we build an AI, it gets smarter, it gets smarter faster, it reaches some critical threshold where it can improve itself, then it undergoes explosive recursive self-improvement until it's godlike, and then either we've aligned it or we're dead. Right?"
"Simplified, but yes."
"But that story requires a bunch of things to be true. It requires that intelligence is the kind of thing that can scale smoothly without hitting bottlenecks. It requires that self-improvement is easy compared to initial improvement. It requires that there are no physical constraints (e.g. energy, compute, data) that slow things down. It requires that the AI can convert intelligence into world-changing power without friction. And most fundamentally, it requires that we live in a world where pure intelligence is the dominant force, where being smarter just straightforwardly makes you win."
"And you don't think we live in that world," Mira said quietly.
"I think we live in a world that's much messier than that. Where power comes from a weird mix of intelligence, resources, social connections, institutional authority, physical capability, timing, luck. Where systems are complex and nonlinear and definitely full of friction. Where a being that's much smarter than us might still struggle to get anything done because intelligence alone isn't enough."
Anh jumped in. "Yeah, I see this all the time. We'll have an AI system that's brilliant at some narrow task, way better than any human I know of. But then it has to interface with the real world, with all its bureaucracy and physics and human stubbornness, and suddenly it's not so godlike anymore. It's just another tool that needs to be wielded by someone who understands the full context."
"But that's now," Mira said. "That's with systems that are still narrow, still limited. The argument is about what happens when we get to general intelligence, and then beyond it."
"Sure, but—" Anh paused, choosing her words. "I guess I'm skeptical that 'general intelligence' is even a coherent concept in the way people talk about it. Like, I'm generally intelligent, right? I can do a lot of different things. But I can't do theoretical physics, I can't speak Mandarin, I can't perform surgery. My intelligence is general in some ways and specific in others, and it's all embedded in a particular body with particular needs in a particular social context. An AI's intelligence would be general in different ways and specific in different ways, embedded in different constraints. It's not just 'human intelligence but more.'"
"That's actually one of my biggest frustrations with the discourse," Hakim said. "This idea that there's a single axis called intelligence, and you just slide up it until you get to 'can do anything.' But intelligence in the real world is always contextual. It's always embedded. Even if you built a system that could, in principle, learn any skill… so what? Learning takes time, requires data and happens in a specific environment. The system would still be shaped by what it learned first, what it had access to, what problems it was trying to solve."
"Okay, but imagine an AI that could do all of those things," Mira pressed. "Theoretical physics, languages, surgery, everything. That's what we're headed toward."
"Maybe, maybe" Hakim said. "But even then, again… so what? I can imagine a human who's an expert in all of those domains. I mean, of course they'd be impressive as hell, but they still couldn't take over the world. Because the world isn't a puzzle you solve by being smart. It's this massive coordination problem involving billions of people with different goals and beliefs and capabilities."
"Unless you could manipulate all those people."
"There it is!" Hakim said, "That's the manipulation hypothesis again. That a sufficiently smart AI could just hack human psychology and get us to do whatever it wants. But I spent a year in policy trying to convince people of things I knew were true, with all the advantages of being human myself, sharing their evolutionary history and social context. And it was really, really hard. People are not manipulable in the way that framing suggests."
"They're not?" Mira's eyebrow arched. "Tell that to every advertising agency, every political campaign, every social media platform optimizing for engagement."
"Okay, fair," Hakim conceded. "But even those systems, with massive resources and tons of data about human behavior, they're not reliably controlling people. They're nudging probabilities. They're affecting populations, not individuals. And they work because they're operating within the existing grain of human psychology, making people more of what they already were, not fundamentally rewiring them."
"Plus," Anh added, "those systems have something an AI in a box doesn't have: they're already integrated into our lives. We've voluntarily given them access. A hypothetical misaligned AI would have to bootstrap its way to that level of integration, which is... that's actually the hard part."
Hakim was animated now. "Also, this is what I mean about anthropomorphism. We imagine a super-intelligent AI would be super-persuasive, super-manipulative, able to hack human psychology at will. But that assumes human psychology is simple enough to hack, that persuasion works like debugging code. Real humans are messy. They're inconsistent. They're embedded in social contexts that constrain them. They have bodies that affect their cognition. The same argument that works perfectly in one moment fails in another for reasons that have nothing to do with logic."
"You're basically arguing that humans are too stupid to be manipulated by something smart. Which is... not comforting," Mira said.
"No, I'm arguing we're too complex, which is different. Our irrationality might actually be protective in this context. We're not coherent utility maximisers, so you can't beat us just by being better at maximising utility."
Anh laughed. "That's very philosophical for someone who used to prove theorems about optimal stopping strategies."
"I spent years thinking about optimization," Hakim said. "And that's how I know the real world isn't optimizable. Or at least, not in the ways the math suggests."
There was a pause. The bar had filled up around them, other conversations layering into an ambient hum. Anh watched Mira, wondering what she was thinking. They'd had so many arguments like this in the past, but there was something different now. Stakes, maybe. Or weariness.
"You think I'm wrong," Mira said finally. It wasn't a question.
"I think you're telling a particular story," Hakim said gently. "A story that for sure has a certain elegance and internal consistency. And, don't get me wrong, maybe that story is also true! But I also believe it's easy to mistake elegance for truth. Easy to think that because an argument flows logically from premises, it must be right, when actually the premises themselves are uncertain."
"But we have to act on something," Mira said. "We can't just throw up our hands and say it's all too complex. People are building AGI right now. If there's even a ten percent chance the risk story is right, shouldn't we treat it as an emergency?"
"Maybe," Anh said. "But my concern is that if you treat something as an emergency, you get emergency reasoning. You get panicked decision-making, you get groupthink, you get people accepting wild claims because they're scary rather than because they're true. You get a community that's so focused on one particular threat model that it can't update when evidence comes in."
"What evidence?" Mira's voice had an edge now. "The evidence won't come until it's too late."
"See, that's..." Hakim paused. "That's what I mean about unfalsifiability. A hypothesis that can't be tested until after the world ends; do you see how that's epistemically problematic?"
"I see that you're both being awfully confident about something that hasn't happened yet. Maybe intelligence doesn't scale the way I think. Maybe there are more bottlenecks than the standard story assumes. But maybe there aren't. Maybe we build something next year that surprises everyone. Maybe all these frictions you're talking about turn out to be trivial for a system that's smart enough. We don't know."
"Exactly," Anh said. "We don't know. Which is why I think the AI safety community, and I say this as someone who's sympathetic, needs to be more honest about uncertainty. About the ways its models might be wrong. About the assumptions it's making."
"But we are uncertain," Mira protested. "That's why we talk about probability distributions, why we—"
"No, you're numerically uncertain," Hakim interrupted. "You assign probabilities to outcomes within your model. But you're not model-uncertain. You're not seriously considering that the whole frame might be wrong."
Mira opened her mouth, closed it, took a drink. "Okay. Maybe. But if the frame is wrong, what's the alternative? Just... don't worry about it?"
"Worry about specific problems," Anh said. "Problems we can actually observe and measure. How do we make AI systems that don't discriminate? How do we ensure they're robust to distribution shift? How do we make them interpretable enough that we can understand their failures? Those are all tractable questions with real-world constraints."
"Those are all good questions," Mira said. "But they're also rearranging deck chairs if the ship is going to sink anyway."
"Or," Hakim said, "they're the actual work that needs to be done, and the ship-sinking story is not happening. I don't know which is true. But I do know that one of those approaches allows for learning and updating and responding to what actually happens, and the other one doesn't."
"Because the other one thinks we only get one shot," Anh said.
"And because the other one has already decided what the ending is," Hakim added. "And that's not science. That's eschatology."
Mira stared at her beer. For a moment, Anh thought she might get angry; the old Mira would have, the one who'd argued with professors twice her age about Bayesian epistemology and never backed down. But this Mira just looked tired.
"Maybe you're right," she said quietly. "Maybe I am stuck in a frame. But God, I hope you're right about the rest of it too. Because if the standard story is even half true, and we're not preparing for it..."
"Then we'll face it when it comes," Anh said. "The way humans always have. Imperfectly, messily, with too little information and too much disagreement. But also with all the resources we don't usually count. Social structures, physical constraints, sheer bloody-minded complexity. All the things that make life hard but also make it survivable."
"And look," Hakim said, "I'm not saying ‘don't work on safety’. I'm saying be honest about what you know and what you're assuming. Build the field in a way that can update. Make arguments that don't require people to buy into an entire worldview before they can engage."
Mira was quiet for a moment. "You know what's funny? I went into this field because I thought it was the most rational way to approach AI. Maximize expected value, reason carefully about tail risks, don't let cognitive biases blind you to low-probability high-impact scenarios. But maybe... maybe that kind of perfect rationality is itself a bias. Maybe it makes you miss things that aren't in the model."
"Like what?" Anh asked.
"Like this." Mira gestured at the three of them. "Like the fact that we've been arguing for two hours and none of us has convinced the others, but we've all updated at least a little. That's not something you'd predict from a purely rational agent framework. It's messy and it’s social and it depends on trust and history and on the fact that we've shared enough drinks over the years that we know how to push each other without breaking anything."
"You're describing the very thing you've been dismissing," Hakim said with a smile. "The friction, the social dynamics, the complexity. That's not a bug. It's what makes this conversation possible."
"Yeah…" Mira said slowly. "Yeah, maybe that's right. It feels right, at least." She paused. "Okay. I'll make a deal with you both. I'll try to be more model-uncertain, more honest about assumptions, more open to the possibility that my frame is incomplete. But you have to promise me something too."
"What's that?" Anh asked.
"Keep engaging. Don't write off AI safety because some people in the community are too confident or too focused on a single narrative. The capabilities jumps, the unexplained emergent behaviors, the feeling that we're moving faster than we understand,... those are real problems. Even if the solution isn't 'assume doom and work backwards.'"
"That's fair," Hakim said. "I can commit to that."
"Me too," Anh said. "And honestly? I'm glad you're the one doing this work. Because if someone has to be worrying about existential risk, I'd rather it be someone who's willing to update when we push back."
"I notice I'm updating more than you two are," Mira pointed out.
"You're updating toward sanity," Anh said with a grin. "That's different."
"Eh! That's exactly the attitude that makes people not take safety seriously! This assumption that caring about x-risk means you're being irrational…"
"Okay, that's..." Anh stopped. "Actually, that's fair. I am being a bit of an ass."
Hakim laughed. "Character growth! In a bar conversation! Who would have thought?"
They ordered another round. The conversation drifted to other things. Mira's cat, Hakim's nightmare apartment in Brussels, Anh's attempts to learn Portuguese. Small things, human things, the texture of lives lived in the present tense rather than the conditional future.
Later, as they were leaving, Hakim turned to Mira. "Hey. I know I was pushing back hard in there. But for what it's worth, I'm glad someone's thinking about this stuff. Even if I'm not sure about all of it."
"Even if you think I'm in a doomer cult?"
"Even then. Someone needs to be checking the parachutes. I just want to make sure we're not so focused on the parachutes that we forget to look where we're actually jumping."
Mira laughed. "That metaphor got away from you."
"Most of them do," he admitted.
They hugged goodbye on the street corner, promising to do this again soon, knowing it might be another few years. Anh watched her friends walk off in different directions, toward different futures, different answers to the same impossible question.
She thought about intelligence, about power, about all the ways humans were more and less than the sum of their reasoning. About frictions and bottlenecks and the beautiful complexity of a world that refused to be optimized.
Then she pulled out her phone and ordered a rideshare, because even the most philosophical evening eventually bumps into the mundane problem of getting home.
Some things, at least, stayed simple.