VLA & Multimodal Learning
LOCATION
Beijing / Shanghai / Shenzhen / Hong Kong
EMPLOYMENT TYPE
Full-time
The Missions
Build end-to-end data–model closed loops that quantify effective data gain in embodied learning systems, and develop next-generation human-centric data acquisition pipelines spanning hardware, simulation, and dexterous teleoperation.
You will design and push VLA models to state-of-the-art performance, evolving multimodal fusion architectures that unlock the value of force and tactile signals in policy generation.
The DNA
First-principles thinker with strong experimental instincts, hands-on experience in real-world data collection, and deep familiarity with vision, force, and tactile sensing systems.
A research background at top labs, or publications at top conferences (NeurIPS, ICLR, CVPR, CoRL, RSS), is highly valued.