Ideas Worth Publishing.

Technical deep dives, research updates, and perspectives on AGI safety from the TAL Corp team. We write about what we are learning — including the parts that did not go as planned.

5 Posts

4 Authors

5 Topics

Since 2025

Featured

Interpretability · Deep Dive · February 14, 2026 · 8 min read

Why Interpretability Is Load-Bearing for AGI Safety

We cannot align what we cannot understand. Mechanistic interpretability is not a nice-to-have — it is the foundation everything else rests on.

Ananya Krishnaswamy

Head of Interpretability

All Posts.

Alignment · Research Update · January 28, 2026 · 6 min read

The Corrigibility Problem Gets Harder as Models Get Better

Naive corrigibility constraints degrade as capability increases. Here is what we found — and what we are doing about it.

James Mercer

Evaluation · Announcement · January 12, 2026 · 5 min read

Introducing SAFE-AGENT: A Benchmark for Autonomous AI Safety

We are open-sourcing a 1,240-task benchmark suite for evaluating whether AI agents respect their autonomy envelopes.

Soo-Jin Park

Agentic Systems · Technical · December 18, 2025 · 7 min read

Emergent Misalignment in Multi-Agent Systems

When multiple AI agents coordinate, new safety failure modes emerge that are invisible at the individual agent level.

David Okafor

Perspective · Opinion · November 30, 2025 · 4 min read

On AGI Timelines and Why We Are Building Now

We do not know exactly when AGI will arrive. But the work required to make it safe takes years — and it needs to start before we need it.

James Mercer

Newsletter

New Posts. No Noise.

We publish infrequently and only when we have something worth saying. Subscribe to get new posts directly — no marketing, no digests, no noise.

Frequency: 2–4 posts / month

Format: Long-form only

Topics: Safety, research, engineering

Unsubscribe: One click, any time
