Perspective

The World Has Arms Control Regimes, But AI Companies Are Not Answering to Them

Javaid Iqbal Sofi / Apr 3, 2026

The first chemical munition, a GB (sarin)-filled M55 rocket, is destroyed at JACADS in 1990. (Wikimedia Commons)

When Anthropic posted a job listing for a “Policy Manager, Chemical Weapons and High-Yield Explosives,” it was easy to read as a curiosity. A Silicon Valley AI company is hiring someone whose core responsibility is to prevent a language model from helping a user synthesize a nerve agent or build a dirty bomb. This is not a role that fits neatly into the usual taxonomy of tech jobs.

But the listing is not a curiosity. It is a symptom of something policymakers have been slow to name directly: the frontier AI industry is constructing its own internal arms control infrastructure, in the absence of any external requirement to do so, with no accountability to any institution beyond itself. Without external oversight, these arrangements will not survive the competitive pressures that the companies themselves have identified. And the companies building them have, in their own policy documents, essentially said as much.

The person Anthropic hires for this position will design evaluation methodologies for assessing what Claude can do in the domain of chemical weapons, explosives synthesis, and energetic materials. They will develop strategies to identify and mitigate potential misuse in model outputs and run rapid-response protocols when Anthropic detects escalating queries in these categories. Applicants are required to have at least five years of direct experience in chemical weapons or explosives defense and working knowledge of radiological dispersal devices, commonly called dirty bombs.

OpenAI reportedly posted a near-identical role for a researcher focused on biological and chemical risks. Two of the most powerful AI companies in the world are, simultaneously, hiring people whose professional backgrounds would previously have been found at defense ministries, national laboratories, or international security bodies.

These are not compliance hires in the conventional sense. They are, in functional terms, performing the work of verification officers at international bodies. The difference is that traditional verification officers operate within treaty frameworks, answer to governments, and interface with international institutions. These new hires answer to a board of directors and a CEO navigating a competitive commercial market.

Meanwhile, the international body with the deepest relevant expertise already exists. The Organisation for the Prohibition of Chemical Weapons has 193 member states and a verification regime backed by international law. In March 2026, it released a report on how AI intersects with the Chemical Weapons Convention, noting that AI-enabled tools are already transforming chemical research. Molecular modeling and AI-assisted synthesis planning now allow researchers to identify chemical pathways faster and with less expertise than was previously required; predictive toxicity analysis has accelerated the process further. For the OPCW, these same tools could lower the barrier to designing harmful chemicals that could be weaponized.

The OPCW wants AI companies to engage with it. It is building capacity-building programs, consulting with industry and academia, and pushing member states to develop shared norms. What it cannot do is compel a private company in San Francisco to submit its model evaluations for external review or disclose the results of internal red-teaming. It has no mechanism to require coordination with its technical secretariat before deploying a new model version. That gap — between an international body with deep domain expertise and no authority over AI companies, and AI companies with enormous capabilities and no international obligations — is where we currently live.

In February 2026, Anthropic released version 3.0 of its Responsible Scaling Policy — the voluntary internal framework the company uses to manage catastrophic risks as its models become more capable. The document is detailed and serious. It includes new requirements for periodic Risk Reports, external expert review in high-risk scenarios, whistleblower protections, and a Frontier Safety Roadmap against which the company commits to publicly grade its own progress.

But the update drew attention primarily for what it removed. Previous versions of the RSP included a binding commitment: if Anthropic could not demonstrate adequate safety measures before crossing a capability threshold, it would pause development. That commitment is gone. In its place are voluntary public goals the company describes as ambitious but explicitly non-binding.

The reasoning Anthropic offered was candid. Unilateral safety commitments, the company acknowledged, do not work if competitors are not making equivalent ones. A company that stops training while others continue does not make the world safer — it cedes market position and, eventually, the ability to shape the technology at all. The only durable solution is coordination. External rules. Something that applies to everyone.

Anthropic’s chief science officer, Jared Kaplan, put it plainly: “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

The Centre for the Governance of AI at Oxford drew the logical conclusion in its analysis of RSP 3.0: “If the core problem is collective action, Anthropic should push for stronger regulation, according to its own logic.” The report adds that while Anthropic appears to be taking some steps in this direction, “its efforts seem to lag behind what its own logic suggests.”

The companies best positioned to design workable AI governance frameworks are the ones that understand the technology most deeply. They are also the companies with the strongest competitive incentive to avoid any framework that constrains them more than their rivals. They can diagnose the problem clearly. Acting on the diagnosis is harder.

Even within the voluntary safety infrastructure that exists, coverage is uneven. Frontier AI safety work has concentrated heavily on pandemic-scale biological risks, modeling scenarios in which a lone actor uses AI to engineer a pathogen capable of mass casualties. That is a legitimate threat, but it has drawn attention away from others. Chemical weapons, improvised explosive attacks, and radiological devices — threats that are lower-profile than a pandemic but considerably more accessible to a motivated actor — receive substantially less systematic attention.

Labs publish evaluations of whether their models could enable a pandemic; they do not typically detail whether those same models could assist in a chemical attack or help someone circumvent export controls on precursor materials. The hiring of dedicated chemical weapons policy managers at two major labs suggests the companies have registered this concern. What is less clear is whether the evaluations those managers conduct will ever be visible to anyone outside the companies.

The 2026 International AI Safety Report, a multi-institutional effort coordinated through the UK AI Safety Institute, found that most risk-management practices at frontier labs remain voluntary. A handful of jurisdictions have begun formalizing limited requirements — California’s SB-53 and elements of the EU AI Act now require frontier developers to publish risk frameworks; New York’s RAISE Act will add similar obligations when it takes effect in 2027. None specifically addresses the weapons-domain evaluations that Anthropic and OpenAI are now staffing internally. What existing law requires (publishing a risk framework) and what the risk actually demands (mandatory disclosure of weapons-domain evaluations to external bodies with real authority) are not close to each other.

The architecture isn’t hard to sketch. It would require mandatory disclosure of model evaluations related to chemical, biological, radiological, and nuclear (CBRN) risks — not published voluntarily on a company blog, but submitted to an external body with the domain expertise to assess them. Formal, durable channels between AI developers and existing institutions like the OPCW, whose technical secretariat is actively seeking exactly this kind of engagement, are a prerequisite. And the rules would need to apply across the industry, not only to the companies that happen to prioritize safety in any given competitive cycle.

The people Anthropic and OpenAI are hiring are almost certainly capable professionals taking on genuinely consequential work. The critique here is structural, not individual. These officers have no institutional counterparts in government with comparable technical depth. They have no formal obligation to share their findings with the international bodies whose mandates cover precisely the threats they are assessing. They can be overruled by commercial considerations. They can be let go.

The question for policymakers is not whether to act, but whether to act before the architecture is constructed entirely inside the companies it was meant to oversee.

Authors

Javaid Iqbal Sofi
Javaid Iqbal Sofi is a Public Policy researcher at Virginia Tech whose work focuses on AI governance, global technology regulation, and the societal implications of emerging technologies.
