Research
ATLAS
Continual learning framework for LLM agents. Teacher-Student architecture with persistent learning memory.
Scaling Verification
Ensemble-of-judges architecture for process-level reward modeling. 93.7% on RewardBench V2.
Publications
Writing
Where Should Test-Time Compute Go? Surprisal-Guided Selection in Verifiable Environments
Selecting the model's least confident correct solutions recovers oracle performance at zero cost.
Frontier Security Agents Don't Lack Detection. They Lack Restraint.
Measuring incident response agent calibration under adversarial evidence.
When Sampling Beats Training: Multi-Turn RL's Cost-Benefit Problem
When to invest in training vs. inference-time compute for multi-turn agents.
Rethinking Evaluation for Agents That Never Stop Learning
How do we evaluate an agent that keeps changing?
Building a World Model of Consequence
What world models are, how to train them, and how they sit alongside agents.
Continual Learning for Stateful Agent Systems
Teaching agents to learn from their own trajectories.
On-Policy Distillation
Knowledge transfer for agent learning.
The ATLAS Reward System
How we built a reward system achieving 93.7% on RewardBench V2.
Inference-Time Continual Learning
Gradient-free adaptation.
How We Define Learning
Learning as changes in behavior and competence over time, not one-off accuracy gains.
My Agents Keep Failing. Yours Will Too.
Failure modes and mitigations for production agent systems.
Everything is Changing...Again
Parenting in the age of AI and the future of human-computer interaction.