👋🇻🇳 Xin chào! / Hello! I’m Anh Dao, an undergraduate student in Computer Science, Mathematics, and Applied Mathematics in the Department of Computer Science and Engineering (CSE) and the Honors College at Michigan State University (MSU). I am an undergraduate research assistant in the Action Lab, advised by Prof. Yu Kong.

During my undergraduate studies, I’ve also had the privilege of working with Prof. Hy Son (Truong-Son Hy), Prof. Anh Nguyen, and Prof. Zijun Cui. I owe special thanks to Yifan Li 🙌, my first research mentor and a close collaborator who has shaped much of my early journey in computer vision and embodied AI.

My research spans Vision-Language Models (VLMs) and Vision-Language-Action (VLA) systems, with the goal of building agents that can understand physical environments, reason about humans and objects, and act intelligently in the real world.

To achieve this, my work tackles three fundamental challenges at the intersection of computer vision, multimodal models, and embodied AI:

  1. Understanding 🧭: How can we infuse VLMs with physical commonsense and rich 3D spatial understanding of their environments?
  2. Reasoning 🧠: How can we enable agents to solve complex, long-horizon tasks that require memory, planning, and causal reasoning?
  3. Efficiency ⚡️: How can we design efficient VLM/VLA models (e.g., via token reduction and optimized inference) that are fast and practical enough for real-world deployment?

I am always open to collaboration. If you find overlapping interests, feel free to reach out! 🤝