👋🇻🇳 Xin chào! / Hello! I’m Anh Dao, an undergraduate student in Computer Science, Mathematics, and Applied Mathematics at the Department of Computer Science and Engineering (CSE) of Michigan State University (MSU), Honors College. I am an undergraduate research assistant in the Action Lab, advised by Prof. Yu Kong.

During my undergraduate studies, I’ve also had the privilege of working with Prof. Hy Son (Truong-Son Hy), Prof. Anh Nguyen, and Prof. Zijun Cui. I owe special thanks to Yifan Li 🙌, my first research mentor and a close collaborator who has shaped much of my early journey in computer vision and embodied AI.

My research spans Vision-Language Models (VLMs) and Vision-Language-Action (VLA) systems, with the goal of building agents that can understand physical environments, reason about humans and objects, and act intelligently in the real world.

To achieve this, my work tackles three fundamental challenges at the intersection of computer vision, multimodal models, and embodied AI:

  1. Understanding 🧭: How can we infuse VLMs with physical commonsense and rich 3D spatial understanding of their environments?
  2. Reasoning 🧠: How can we enable agents to solve complex, long-horizon tasks that require memory, planning, and causal reasoning?
  3. Efficiency ⚡️: How can we design efficient VLM/VLA models (e.g., via token reduction and optimized inference) that are fast and practical enough for real-world deployment?

I am always open to collaboration. If you find overlapping interests, feel free to reach out! 🤝

🔥 News

Current News 📣 📣 📣
  • 2025.09: 🎉 Our paper IndustryEQA on embodied question answering in industrial scenarios has been accepted by NeurIPS 2025!
  • 2025.08: 🎉 Our team achieved 3rd place in the Visual Quality Comparison for LMMs challenge at ICCV 2025!
  • 2025.07: 🎉 Our paper VietMedKG on traditional Vietnamese medicine knowledge graphs has been accepted by ACM TALLIP 2025!
  • 2025.07: 🎉 Our team achieved 1st place in the MAI 2025 Camera Scene Detection Challenge at CVPR 2025 Workshop!
  • 2025.07: 🎉 Our team achieved 1st place in the Interactive Track at the IViSE Workshop (CVPR 2025)!
  • 2025.03: 🎉 I started my AI Research Engineer Intern role at ZenAI!
  • 2025.01: 🎉 We released our survey paper “Visual Large Language Models for Generalized and Specialized Applications” on arXiv!
Previous News 🗂️ 🗂️ 🗂️
  • 2024.08: 🎉 Our work on facial affective behavior analysis with instruction tuning was accepted by ECCV 2024.
  • 2023.08: 🎉 I received the Honors College Wielenga Research Scholars award at MSU.
  • 2023.03: 🎉 Our team won the Best Use of AI prize at Spartan Hackathon 8.
  • 2022.08: 🎉 I started my undergraduate journey at Michigan State University (Honors College).

📝 Publications

📷🤖 Computer Vision, Embodied AI & Visual LLMs

ArXiv 2025
Beyond Motion Pattern

Beyond Motion Pattern: An Empirical Study of Physical Forces for Human Motion Understanding
Anh Dao*, Manh Tran*, Yufei Zhang, Xiaoming Liu, Zijun Cui.
Under review.

Preprint (coming soon)

ArXiv 2025
IndustryNav

IndustryNav: Exploring Spatial Reasoning of Embodied Agents in Dynamic Industrial Navigation
Yifan Li*, Lichi Li*, Anh Dao*, Xinyu Zhou, Yicheng Qiao, Zheda Mai, Daeun Lee,
Zichen Chen, Zhen Tan, Mohit Bansal, Yu Kong.
Under review.

Paper · Project

NeurIPS 2025
IndustryEQA

IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios

Yifan Li*, Yuhang Chen*, Anh Dao*, Lichi Li, Zhongyi Cai, Zhen Tan, Tianlong Chen, Yu Kong

Paper · Project · Benchmark · Code

CVPRW 2025
Cycle Training

Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Huu-Phong Phan-Nguyen*, Anh Dao*, Tien-Huy Nguyen*, Tuan Quang, Huu-Loc Tran,
Tinh-Anh Nguyen-Nhu, Huy-Thach Pham, Quan Nguyen, Hoang M. Le, Quang-Vinh Dinh

Paper

ArXiv 2025
VLLM Survey

Visual Large Language Models for Generalized and Specialized Applications

Yifan Li, Zhixin Lai, Wentao Bao, Zhen Tan, Anh Dao, Kewei Sui, Jiayi Shen, Dong Liu, Huan Liu, Yu Kong

Paper · Project

ECCV 2024
EmoLA

Facial Affective Behavior Analysis with Instruction Tuning

Yifan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong

Paper · Project · Code

📚 Other Research Papers

  • Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
    Contributor for Vietnamese data curation and evaluation.
    Paper · Weights · Code

  • VietMedKG: Knowledge Graph and Benchmark for Traditional Vietnamese Medicine
    Tam Trinh*, Anh Dao*, Hy Thi Hong Nhung, Hy Truong Son.
    ACM TALLIP 2025.
    Paper · Code

🎖 Honors and Awards

  • Aug 2025 · 3rd Place, Visual Quality Comparison for LMMs Challenge (ICCV 2025) · Paper · Certificate
  • Mar 2025 · 1st Place, MAI 2025 Camera Scene Detection Challenge (CVPR 2025) · Paper · Certificate
  • Mar 2025 · 1st Place, Interactive Track, IViSE Workshop (CVPR 2025) · Paper
  • Aug 2023 · Honors College Wielenga Research Scholar, Michigan State University
  • May 2023 · Best Project, MSU AI Club (Spring 2023)
  • Mar 2023 · Best Use of AI Prize, Spartan Hackathon 8, Michigan State University · Project
  • Sep 2022 – Sep 2025 · Google TensorFlow Developer Certificate · Certificate
  • 2022 – Present · Dean’s List, all semesters, Michigan State University

💬 Academic Services

Conference Reviewer

  • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV): 2025

💻 Internships

AI Research Engineer Intern · ZenAI

2025.05 – 2025.09

  • Built production-ready image-generation pipelines in ComfyUI, fine-tuning Flux, Qwen-Image, and SDXL for client-specific styles; deployed SDXL to the Gradient (SN56) subnet.
  • Developed an AI software engineering agent, evaluated on SWE-bench Verified and Polyglot for the Ridges (SN62) subnet.

Machine Learning Engineer Intern · FPT Software AI Center

2023.05 – 2023.08

  • Applied LoRA/QLoRA to adapt LLMs to customer tasks and deployed high-throughput inference with NVIDIA Triton, TensorRT, and ONNX.

Machine Learning Engineer Fellow · Cinnamon AI

2023.07 – 2023.09

  • Built a full-stack academic paper recommender using Django + React, integrating an LLM assistant via LangChain and the OpenAI API.

📖 Education

  • 2022.08 – 2025.12 (expected): B.S. in Computer Science, Mathematics & Applied Mathematics, Minor in Computational Mathematics, Science and Engineering (CMSE), Michigan State University, United States.
    • Honors College · College of Engineering
    • GPA: 3.98 / 4.0

⚙️ Others

🧪 Selected Course Projects & Side Work

  • Critic-Based Self-Correction Using Rules of Thumb for Ethical Enhancement in LLMs: Technical Report

  • Gait All-Time: Infrared–Visible Gait Recognition: Technical Report

  • ViRAG: Vietnamese Retrieval-Augmented Generation System: Repository

🌱 Teaching

  • Teaching Assistant, MTH 103 College Algebra, Michigan State University, Fall 2024.
  • Teaching Assistant, CSE 102 Algorithmic Thinking and Programming, Michigan State University, Fall 2023.