A cross-institutional team of researchers from Google DeepMind, Microsoft Research, Columbia University, t54 Labs, and Virtuals Protocol has released a new research paper proposing the Agentic Risk Standard (ARS) — a framework that applies financial risk management principles to AI agent transactions.
Through 5,000 rounds of simulation, the researchers found that agent underwriting services can reduce losses in financial transactions by up to 61%.
The paper, entitled "Quantifying Trust: Financial Risk Management for Trustworthy AI Agents," introduces a settlement-layer protocol that uses escrow, underwriting, and collateralization to protect users from financial loss when autonomous AI systems execute tasks involving payments or assets.
The full paper is available on arXiv.
The Problem: AI Agents Are Moving Real Money
AI agents are rapidly evolving from chatbots into autonomous systems that write code, file taxes, manage customer service, and execute financial transactions. As these systems take on tasks with real economic consequences, users face a fundamental problem: existing AI safety research focuses on improving model behavior but cannot eliminate the possibility of failure.
Large language models are inherently stochastic: no amount of training can drive the probability of failure to zero.
The researchers point to concrete evidence: in a 2025 autonomous crypto trading competition, most AI agents lost money, with one model losing 63% of its capital while others dropped by 30–56%.
The researchers identify this as a "guarantee gap" — a disconnect between the probabilistic reliability that AI safety techniques provide and the enforceable guarantees users need before delegating high-stakes tasks. Without a way to bound potential losses, users rationally limit AI delegation to low-risk tasks, constraining the broader adoption of agent-based services.
How ARS Works: Escrow, Underwriting, and Collateral
Rather than attempting to make AI models perfect, ARS takes a complementary approach inspired by how traditional industries have managed uncertainty for centuries. Financial markets use clearinghouses and margin requirements. Doctors carry malpractice insurance. Construction companies post performance bonds. The framework applies this logic to AI agents through two modes:
Standard service tasks (generating a report, writing code, preparing a document): Payment is held in escrow and released only after the work is verified.
Fund-handling tasks (trading, currency conversion, financial API calls): An underwriting layer is added — a risk-bearing party evaluates the task, prices the risk, may require the agent provider to post collateral, and commits to reimbursing the user under specified failure conditions.
The entire transaction lifecycle is formalized as a deterministic state machine with explicit fund-control rules. Regardless of how an AI agent behaves internally, the financial outcome for the user is governed by auditable, enforceable settlement logic.
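The settlement logic described above can be sketched as a small deterministic state machine. The state and event names below are hypothetical stand-ins for the paper's formal definitions; they are meant only to illustrate the idea that every fund movement is gated by an explicit, auditable transition table.

```python
from enum import Enum, auto

class State(Enum):
    INITIATED = auto()
    ESCROWED = auto()       # user funds locked in escrow
    UNDERWRITTEN = auto()   # risk priced, collateral posted (fund-handling mode)
    EXECUTING = auto()
    VERIFIED = auto()
    SETTLED = auto()        # escrow released to the agent provider
    REIMBURSED = auto()     # user made whole from collateral/underwriter

# Allowed transitions: (event, current state) -> next state.
# Anything not listed here is an illegal fund movement.
TRANSITIONS = {
    ("deposit",     State.INITIATED):    State.ESCROWED,
    ("underwrite",  State.ESCROWED):     State.UNDERWRITTEN,
    ("execute",     State.ESCROWED):     State.EXECUTING,   # standard service task
    ("execute",     State.UNDERWRITTEN): State.EXECUTING,   # fund-handling task
    ("verify_ok",   State.EXECUTING):    State.VERIFIED,
    ("verify_fail", State.EXECUTING):    State.REIMBURSED,
    ("release",     State.VERIFIED):     State.SETTLED,
}

def step(state: State, event: str) -> State:
    """Apply one event; rejecting unknown (event, state) pairs is what
    makes the settlement outcome deterministic regardless of agent behavior."""
    key = (event, state)
    if key not in TRANSITIONS:
        raise ValueError(f"illegal transition: {event!r} from {state.name}")
    return TRANSITIONS[key]

# Happy path for a fund-handling task:
s = State.INITIATED
for ev in ("deposit", "underwrite", "execute", "verify_ok", "release"):
    s = step(s, ev)
print(s.name)  # SETTLED
```

The key property is that the user's financial outcome depends only on which transitions fire, never on the agent's internal behavior: a failed verification routes deterministically to reimbursement rather than settlement.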
Simulation Results: Up to 61% Loss Reduction
The paper includes a simulation study modeling users, AI agent providers, and underwriters interacting through the ARS protocol across 5,000 episodes. Key findings include:
The mechanism consistently reduced user losses compared to an ecosystem with no underwriting, with loss reduction ranging from 24% to 61% depending on pricing and risk estimation settings.
The collateral mechanism independently deterred 15–20% of risky transactions from executing in the first place, since fraud or misexecution now carries a direct cost for the agent side.
Tighter underwriting improves user protection and underwriter solvency but introduces friction that can reduce market participation — mirroring tradeoffs that exist in traditional insurance and financial markets.
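The shape of these findings can be illustrated with a toy Monte Carlo comparison. Every parameter below (failure rate, reimbursement fraction, deterrence rate) is hypothetical and chosen only to show the two mechanisms at work; none are the paper's calibrated values, and the resulting numbers are not the paper's results.

```python
import random

def simulate(episodes: int, underwritten: bool, seed: int = 0) -> float:
    """Toy model: total user loss over repeated funded tasks.
    All parameters are illustrative assumptions, not the paper's settings."""
    rng = random.Random(seed)
    p_fail = 0.10          # assumed chance an agent misexecutes a funded task
    task_value = 100.0     # assumed value at risk per task
    reimburse_rate = 0.9   # assumed fraction of loss the underwriter covers
    deter_rate = 0.15      # assumed share of risky tasks deterred by collateral
    total_loss = 0.0
    for _ in range(episodes):
        if underwritten and rng.random() < deter_rate:
            continue  # collateral requirement deters the transaction entirely
        if rng.random() < p_fail:
            loss = task_value
            if underwritten:
                loss *= 1 - reimburse_rate  # user reimbursed from collateral
            total_loss += loss
    return total_loss

base = simulate(5000, underwritten=False)
protected = simulate(5000, underwritten=True)
print(f"illustrative loss reduction: {1 - protected / base:.0%}")
```

Even in this stripped-down form, the two channels the paper identifies are visible: deterrence removes some risky transactions before they happen, and reimbursement caps the user's downside on the failures that remain.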
What the Researchers Say
"Most trustworthy AI research aims to reduce the probability of failure. That work is essential, but probability is not a guarantee. ARS takes a complementary approach: instead of trying to make the model perfect, we formalize what happens financially when it isn't. The result is a settlement protocol where user protection is deterministic, not probabilistic." — Wenyue Hua, Senior Researcher at Microsoft Research
"The industry is building increasingly autonomous AI agents but hasn't addressed what happens when they fail with someone's money. That's the problem t54 Labs was founded to solve, and the proposed Agentic Risk Standard represents our thinking alongside leading researchers across the industry and academia. We're publishing it openly because the wider ecosystem needs to recognize that financial risk management for AI agents isn't optional — it's foundational." — Chandler Fang, Founder of t54 Labs
The Research Team and Backing
The paper is co-authored by researchers across five institutions: Wenyue Hua (Microsoft Research), Tianyi Peng (Columbia University), Chi Wang (Google DeepMind), Ian Kaufman and Chandler Fang (t54 Labs), and Bryan Lim (Virtuals ACP). The research represents the individual scholarly contributions of the authors and does not represent the positions of their respective employers.
t54 Labs, which builds trust and risk infrastructure for the agentic economy, raised a $5M seed round led by Anagram, with participation from Franklin Templeton, Ripple, and other strategic investors. The company's work on agent risk assessment and payment infrastructure informed the problem framing and protocol design of ARS.
As AI agents increasingly handle real assets onchain and off, the ARS framework represents one of the first formal attempts to bridge the gap between AI safety research and enforceable financial protections. The open-source standard is available for review and implementation by the broader ecosystem.