Thoughts
Learning Action Macros for Computer-Use Agents
April 6, 2026
On mining repeated behaviors from traces to build higher-level actions for agents.
Evaluation Awareness
March 9, 2026
On what happens when models start reasoning about being evaluated.
Training Reasoning Models: Notes as I Try to Understand GRPO and DAPO
February 16, 2026
Notes on GRPO, DAPO, and training language models to reason.