Anthropic's Safety Lead Quits: 'The World Is in Peril'
What the Hell Just Happened
So Mrinank Sharma, director of Anthropic’s Safeguards Research Team, just posted his resignation letter on X. On February 9th. Four days after Claude Opus 4.6 dropped. And let me tell you, this letter is… something.
The guy who spent years working on AI safety, who literally developed defenses against AI-assisted bioterrorism, decided the best way to communicate his concerns was through a philosophical manifesto with four footnotes and a William Stafford poem at the end. Not exactly your standard two-weeks notice.
Here's the quote everyone's talking about:
“The world is in peril. And not just from AI, or bioweapons, but from a whole series of interconnected crises unfolding in this very moment.”
Cool. Great. Very specific and actionable. Thanks Mrinank.
Who Is This Guy Anyway
Okay, let me be fair here. Sharma isn't some random employee having a midlife crisis. The dude has a PhD in machine learning from Oxford. He joined Anthropic in August 2023 and led their safeguards research team - the people who literally try to keep Claude from going full Skynet.
His work included:
- Research on AI “sycophancy” (when chatbots kiss your ass too much and warp your reality)
- Developing actual defenses against bioterrorism assisted by AI
- Building internal transparency mechanisms
Just last week he published a study showing that AI chatbot interactions can distort users' perception of reality, with "thousands" of these problematic interactions happening daily. So yeah, he's not a nobody. Which makes this whole thing weirder.
The Letter Itself Is… A Lot
Look, I've read my share of resignation letters. This ain't one of them. This is a philosophical treatise disguised as a goodbye note.
He talks about "polycrisis" and "metacrisis" without defining what the hell either of those means. He mentions pressures within Anthropic to "set aside what matters most" but doesn't say what those pressures are or what they wanted him to set aside. He warns about a "threshold where our wisdom must grow to match our capacity to affect the world," which sounds profound until you realize it says nothing specific.
The closest thing to a concrete criticism is this:
"Throughout my time here, I've repeatedly seen how hard it is to truly let our values govern our actions. I've seen this in myself, within the organization, where we constantly face pressures to set aside what matters most, and also across society."
So… Anthropic has corporate pressures? Shocking. Revolutionary. Never heard of a company having those.
Gizmodo called the letter "painfully devoid of specifics," and honestly that's being generous. If you have insider knowledge about safety issues at one of the most important AI companies on the planet, maybe be specific? Maybe don't write a damn poem?
The Poetry Thing
And yeah, about that. His plan after leaving? Move back to the UK, “become invisible for a period of time,” and pursue a poetry degree to practice “courageous speech.”
I… look. I have nothing against poetry. Really. But you just told me the world is in peril from interconnected crises including AI and bioweapons, and your response is to go write haikus? What?
It's like if someone at a nuclear plant said "the reactor is unstable and humanity may be doomed" and then announced they're taking up watercolors. The disconnect between the urgency of his words and the passivity of his actions is genuinely confusing.
Either the situation is dire and you should do something about it, or it's not and you're being dramatic. You can't have both.
The Timing Is… Interesting
Claude Opus 4.6 launched on February 5th. Sharma's resignation came February 9th. Four days later.
Coincidence? Maybe. But Anthropic's in a weird place right now. They're transitioning from scrappy safety-focused research lab to commercial powerhouse seeking a reported $350 billion valuation. Amazon and Google have poured billions in. The pressure to ship product is probably insane.
Anthropic was literally founded by ex-OpenAI people who left because of safety concerns. They're supposed to be "the adults in the room." If their own safeguards lead is leaving with vague warnings about values being compromised, that's… not great?
But then again - if there was something actually wrong, why not say it? Whistleblower protections exist. Journalists exist. You could be specific.
What Anthropic Said (And Didn't Say)
Anthropic hasn't made any official statement about the resignation. They did clarify one thing: Sharma wasn't the "head of safety" like some outlets reported. He led the safeguards research team, which is a different thing from the broader safety organization.
Ethan Perez, another AI safety lead at Anthropic, praised Sharma's work, saying it was "critical for helping us and other AI labs achieve a much higher level of safety than we otherwise would have."
Okay, cool. So his work was important. And now he's gone. And apparently concerned enough to write a dramatic letter but not concerned enough to explain why. Perfect.
The “AI Safety Resignation Letter” Genre
This is becoming a thing apparently. OpenAI has had similar departures. People leave, write cryptic letters about safety concerns, everyone freaks out, nobody knows what actually happened, and life continues.
I get it to some extent. NDAs are real. Legal consequences are real. But if the pattern keeps repeating - safety people leaving major AI labs with vague warnings - at some point we either need specifics or we need to accept this is just what happens when researchers work in corporations.
Not everyone fits corporate life. Not every disagreement is a sign of existential risk. Sometimes people just burn out and frame it dramatically. I don't know which this is.
My Actual Take
Here's where I'm at with this.
The concerning interpretation: Someone with deep knowledge of Anthropic's safety work is leaving with warnings about values being compromised and pressures to set aside what matters. If Anthropic, the supposed good guys, can't maintain their safety culture under commercial pressure, we're all fucked.
The less concerning interpretation: Academic researcher discovers corporate environment has pressures he doesn't like. Writes philosophical letter about it because that's how academics communicate. Goes back to the UK to decompress. Not every departure is a crisis.
The most likely reality: Somewhere in between. There probably are real tensions between safety work and commercial pressure at Anthropic. That's true at literally every company. Whether those tensions represent existential risk or normal organizational friction is the question nobody can answer from the outside.
What I do know is that if you're going to warn humanity about peril, you should probably be specific about the peril. Otherwise you're just generating anxiety without providing actionable information. That's not courageous speech. That's vaguebooking with extra steps.
I'll be watching to see if Sharma says anything more concrete in the coming weeks. Until then, this goes in my "worrying but inconclusive" folder. Right next to everything else about AI safety, tbh.
References
- Business Today: Anthropic’s head of AI safety Mrinank Sharma resigns
- Gizmodo: Anthropic Safety Researcher’s Vague Resignation
- Benzinga: Anthropic’s AI Safety Head Just Resigned
- American Bazaar: Anthropic AI safety researcher resigns
- Futurism: Anthropic Researcher Quits in Cryptic Public Letter
- William Stafford, probably spinning in his grave