AGI Research · Bangalore, India · Full-Time

Research Scientist — Multimodal AI

Build AI systems that understand the world through vision, language, audio, and beyond — the foundation of genuine general intelligence.

About This Role

AGI must perceive the world the way humans do — across multiple modalities simultaneously. This role pushes the state of the art in multimodal understanding and generation.

Responsibilities
  • Research and develop architectures that unify vision, language, and audio understanding
  • Design multimodal pretraining objectives and evaluation benchmarks
  • Run experiments on large-scale image-text and video-text datasets
  • Collaborate with the alignment team to ensure multimodal models remain safe
  • Publish research and contribute to the open-source multimodal community
Requirements
  • PhD in Computer Vision, NLP, or Machine Learning
  • Strong publication record in multimodal AI, vision-language models, or generative models
  • Expert knowledge of contrastive learning, diffusion models, or autoregressive vision
  • Proficiency with PyTorch and large-scale training infrastructure
Nice to Have
  • Experience with video understanding or audio-visual models
  • Background in human perception or psychophysics
  • Prior work on CLIP, DALL-E, Flamingo, or similar systems

TAL Corp is an equal opportunity employer. We believe the best team reflects the full diversity of humanity — because we are building for all of it.

Apply Through Training

At TAL Corp you don't just send a résumé; you prove yourself. Apply by joining our training program. Top performers who complete it are hired into this role.

  1. Register & start your 7-day program
  2. Train, build real skills, earn a credential
  3. Top performers → straight into hiring
