🧠 Khai Loong Aw

Khai Loong Aw

I am a Computer Science PhD student at Stanford University.

I am currently working on computer vision and developmental cognitive science in the NeuroAI Lab, so fortunate to be advised by Daniel Yamins.

Previously, I worked on identifying methods to train AI language models that improve their functional similarity to the language processing mechanisms in the human brain. In 2023, I worked with Antoine Bosselut and Martin Schrimpf at EPFL. In 2022, I worked with Mariya Toneva at the MPI for Software Systems.

Email / CV / Bio / Scholar / Twitter / GitHub

Research

Khai is a PhD student in the Computer Science Department at Stanford University. He is interested in building visual-cognitive world models capable of tackling the complex tasks that humans perform. By training these systems under developmentally realistic constraints, he aims to discover the mechanisms, structures, and inductive biases that enable human babies to acquire these abilities. Before joining Stanford, he explored ways to train AI language models to align them with language processing mechanisms in the human brain.

	Taming generative video models for zero-shot optical flow extraction Seungwoo Kim, Khai Loong Aw, Klemen Kotar, Cristobal Eyzaguirre, Wanhee Lee, Yunong Liu, Jared Watrous, Stefan Stojanov, Juan Carlos Niebles, Jiajun Wu, Daniel Yamins NeurIPS*, 2025 arXiv / Website / GitHub We design KL-tracing, a novel method that uses KL divergence of prediction logits for extracting state-of-the-art optical flow from autoregressive video generative models
	3D Scene Understanding Through Local Random Access Sequence Modeling Wanhee Lee, Klemen Kotar, Rahul Venkatesh, Jared Watrous, Honglin Chen, Khai Loong Aw, Daniel Yamins 2025 arXiv We propose Local Random Access Sequence (LRAS), an autoregressive generative model architecture. Using optical flow as an intermediate representation, LRAS achieves state-of-the-art novel view synthesis and 3D object manipulation.
	Instruction-tuning Aligns LLMs to the Human Brain Khai Loong Aw, Syrielle Montariol, Badr AlKhamissi, Martin Schrimpf, Antoine Bosselut COLM, 2024 arXiv We investigate how instruction-tuning affects language models from a neuroscientific perspective, revealing that it generally improves their alignment with human brain activity, with model size and world knowledge playing key roles.
	Training language models to summarize narratives improves brain alignment Khai Loong Aw, Mariya Toneva ICLR, 2023 (Spotlight) arXiv / GitHub We show that training language models to summarize narratives (i.e., deeper understanding of characters, emotions, and relationships) results in richer representations that are more aligned to human brain activity.
	Detecting False Alarms from Automatic Static Analysis Tools: How Far are We? Hong Jin Kang, Khai Loong Aw, David Lo ICSE, 2022 (Distinguished Paper Nominee) arXiv / Poster / Video