Perspective

The White House Wants to Vet AI Models. It Won’t Solve the Safety Problem.

Emma Hatheway / May 7, 2026

President Donald J. Trump delivers remarks at a Crypto conference at the Mar-a-Lago Club, Saturday, April 25, 2025, in Palm Beach, Florida. (Official White House photo by Daniel Torok)

On Monday, The New York Times reported that the Trump administration is weighing an executive order to create a working group of industry executives and government officials to develop options to give the federal government initial access to new AI models to vet them before release. According to the Times, one option under discussion is “having the NSA, the White House Office of the National Cyber Director and the director of national intelligence oversee the model review,” though such review would not necessarily result in a model being blocked from public release.

This is a meaningful shift from an administration that spent its first year dismantling Biden-era AI safety frameworks, and it signals that public concern may have finally registered with government officials. It remains to be seen what the working group might produce. But merely reviewing a model without consequence does not amount to meaningful oversight, and a working group co-designed with the companies being reviewed does not meet the standard of independence that real oversight requires.

While Anthropic's recent decision to withhold Mythos over cybersecurity risks has been regarded as an attempt at responsibility, it is also a reminder that we are relying on the discretion of a small number of executives to make decisions that affect the public. A safety regime that depends on which CEO is in charge, or whom they choose to brief, is not a safety regime. Replacing it with a federal review staffed by the intelligence community and shaped directly by the industry's largest companies would not be any better.

Today, most AI research capacity sits inside the companies selling AI products. Nearly 80 percent of global AI computing power is privately owned, and the majority of researchers capable of evaluating frontier systems work for the labs building them. A decade ago, universities led AI research. Today, nearly 70 percent of new AI PhDs go directly into the private sector, drawn by compensation packages academic institutions cannot match. The researchers who remain in academia rely on corporate partnerships for the compute their work requires, which shapes what questions get asked.

Fear of falling behind China has turned this consolidation into policy: achieving "unquestioned and unchallenged global technological dominance" is the administration's stated goal. In December 2025, the White House issued an executive order to combat state-level AI regulation. The result is a system in which the same companies racing to ship products and satisfy investors also decide whether their systems are safe.

While AI companies employ world-class researchers, they operate under constraints that make truly independent safety testing unlikely. In other high-stakes industries, we don't rely on companies to evaluate their own risks. Pharmaceutical companies, for instance, are required to conduct clinical trials under a framework established by the Food and Drug Administration (FDA). When products affect millions of lives, we've long recognized that independent verification matters. In the AI sector, equivalent safeguards do not exist. Testing happens behind closed doors, and third-party auditing is limited because it requires data and computing power that are largely controlled by private companies. If the only experts capable of auditing AI systems work for the companies being regulated, there is no real enforcement.

Some version of this proposed solution already exists. The Center for AI Standards and Innovation (CAISI, formerly the AI Safety Institute) was created to evaluate AI systems for the federal government, until it was reorganized under the current administration. (Industry sources told the Times it was “sidelined.”) But CAISI operates through “voluntary agreements” with developers, largely focuses on national security risks like cybersecurity and biosecurity, and describes itself as industry's “primary point of contact” within the government rather than as an independent evaluator. A day after the White House proposal was reported, CAISI announced expanded testing agreements with Google DeepMind, Microsoft, and xAI, in addition to existing agreements with Anthropic and OpenAI. While CAISI’s evaluations for security risks may be rigorous, they are not independent of industry influence.

A serious independent research capacity would look different. It would exist outside the industry and expand beyond the intelligence community in order to evaluate the full range of public concerns, such as data privacy and information integrity, rather than national security risks alone.

METR, a nonprofit research organization, has spent the last several years building rigorous evaluations of frontier AI models. It is given pre-deployment access to models at OpenAI and Anthropic, and receives compute credits from the labs it works with. METR is helping to assess capabilities and publishes its findings to raise public awareness, but it cannot act alone. Imagine a network of independent research institutes housed at universities or nonprofits, equipped with computing resources and stable funding insulated from commercial pressure. These labs would test systems before and after deployment and publish their methods and results publicly. Today, a few independent organizations like METR can test existing models, while others that lack a relationship with the AI labs are often denied full access to models and training data. With proper access and compute, independent researchers could replicate company safety claims and assess data bias, national security risks, and threats to information integrity. They could also build public benchmarks and tools that would improve future AI development.

Some argue that external scrutiny will stifle innovation. But the data indicate that trust is a prerequisite for large-scale adoption: a global survey from KPMG, a consultancy, and the University of Melbourne found that 54 percent of respondents are wary of AI. Independent evaluation builds confidence and a more sustainable ecosystem. Others argue that the cost of AI research and compute is too high. But funding could come from both government investment and philanthropic support; funds could be redirected from existing tech subsidies, and diversified funding would reduce undue influence.

History shows that self-regulation fails when commercial incentives and public safety diverge. As lawmakers debate AI regulation, the public should demand one basic safeguard: independent testing that does not rely on company assurances alone.

Authors

Emma Hatheway
Emma Hatheway studies technology policy at Columbia's School of International and Public Affairs and is a graduate intern at All Tech Is Human, a nonprofit focused on building a more responsible tech ecosystem. Before graduate school, she worked as a product strategist in marketing technology.
