Research
Four pillars of next-generation AI systems: Reasoning, Embodied, Agentic, Learnable. We study the fundamental capabilities that connect these pillars, not just each one in isolation.
Reasoning
We study how large models solve problems that require multi-step inference, from mathematical and code reasoning to long-chain and multimodal reasoning. Our goal is to make reasoning reliable, efficient, and verifiable, so that large models can think and decide as deeply as humans do, or more deeply.
- Long-chain and infinite-horizon reasoning
- Mathematical, code, and text-to-SQL reasoning
- Chain-of-thought tuning, process verification, and reward modeling
- Multimodal and spatial reasoning in vision-language models
- Social reasoning and theory of mind for LLMs
Agentic AI
We build AI agents that plan, use tools, and accomplish complex tasks autonomously. Our work covers GUI agents, code agents, tool agents, embodied agents, and multi-agent systems, moving from scripted workflows to genuinely autonomous behavior across both digital and physical environments.
- GUI agents and mobile / desktop automation
- Tool-use and task-automation agents
- Code agents and adversarial code/test co-evolution
- Multi-agent collaboration and policy-level reflection
- Retrieval-augmented and memory-equipped agents
Learnable AI
Deployment is not the end of training: it is where self-evolution begins. We study how models bootstrap their own capabilities from experience, through reinforcement learning, self-improvement, and feedback-driven tuning, so that intelligence compounds autonomously rather than plateauing after release.
- Reinforcement learning for reasoning and agents
- Self-evolving agents and self-improvement loops
- Generator/verifier co-evolution and policy/reward co-optimization
- LLM steering and efficient adaptation after deployment
Embodied AI
We study how AI systems perceive, reason about, and act in their environments, closing the loop from understanding to reasoning to action. Our current focus is on spatial intelligence, embodied reasoning, and benchmarks that evaluate agents in interactive physical scenes.
- Spatial reasoning and multi-perspective localization
- Embodied reasoning that synergizes search, planning, and action
- Benchmarks for agent reasoning in embodied tasks
- World models for physical and interactive environments