Research

ATLAS

Continual learning framework for LLM agents. Teacher-Student architecture with persistent learning memory.

CL-Bench

An evaluation harness for benchmarking continually learning agents.

Scaling Verification

Ensemble-of-judges architecture for process-level reward modeling. Achieves 93.7% on RewardBench V2.

Publications

Writing

Where Should Test-Time Compute Go? Surprisal-Guided Selection in Verifiable Environments

Selecting the model's least confident correct solutions recovers oracle performance at zero cost.

Frontier Security Agents Don't Lack Detection. They Lack Restraint.

Measuring incident response agent calibration under adversarial evidence.

When Sampling Beats Training: Multi-Turn RL's Cost-Benefit Problem

When to invest in training vs. inference-time compute for multi-turn agents.

Rethinking Evaluation for Agents That Never Stop Learning

How do we evaluate an agent that keeps changing?

Building a World Model of Consequence

What world models are, how to train them, and how they sit alongside agents.

Continual Learning for Stateful Agent Systems

Teaching agents to learn from their own trajectories.

On-Policy Distillation

Knowledge transfer for agent learning.

The ATLAS Reward System

How we built a reward system achieving 93.7% on RewardBench V2.

Inference-Time Continual Learning

Gradient-free adaptation.

How We Define Learning

Learning as changes in behavior and competence over time, not one-off accuracy gains.

My Agents Keep Failing. Yours Will Too.

Failure modes and mitigations for production agent systems.

Everything is Changing...Again

Parenting in the age of AI and the future of human-computer interaction.

Open Source Contributions

Multi-turn RL training cookbooks and recipes
Model serving optimizations