Interpretability Researcher

Make AGI reasoning transparent by developing tools that let humans inspect and understand what TAL models are actually doing inside.

About This Role

We believe AGI that cannot be inspected cannot be trusted. The Interpretability team builds the glass-box tools that make TAL Corp's models the most transparent in the industry.

Responsibilities

▸ Design mechanistic interpretability experiments on transformer-based models
▸ Build visualisation tools that expose attention patterns and internal activations
▸ Develop probing classifiers to understand what information is encoded at each layer
▸ Publish interpretability findings as open-source tools and peer-reviewed papers
▸ Collaborate with safety engineers to translate findings into deployed monitoring

Requirements

▸ PhD in Machine Learning, Cognitive Science, or Neuroscience
▸ Deep understanding of transformer internals: attention, residual streams, MLP layers
▸ Proficiency in Python, PyTorch, and scientific visualisation libraries
▸ Experience with activation patching, causal tracing, or circuit analysis
▸ Strong publication record or equivalent research output

Nice to Have

▸ Background in neuroscience or cognitive psychology
▸ Experience with sparse autoencoders or dictionary learning
▸ Familiarity with formal logic or program synthesis

TAL Corp is an equal opportunity employer. We believe the best team reflects the full diversity of humanity — because we are building for all of it.

Apply Through Training

At TAL Corp you don't just send a résumé; you prove yourself. Apply by joining our training program. Top performers who complete it are hired into this role.

1. Register & start your 7-day program
2. Train, build real skills, earn a credential
3. Top performers go straight into hiring
