10 Breakthroughs in Robotic Touch: How DAIMON Robotics Is Redefining Dexterous Manipulation

Robots have long struggled with one crucial human sense: touch. While vision and language models advance, the lack of tactile feedback leaves robots clumsy in real-world tasks. Enter DAIMON Robotics, a Hong Kong-based startup that released the Daimon-Infinity dataset in April 2025, the largest omni-modal robotic dataset published to date. It combines high-resolution tactile sensing with visual, language, and action data across hundreds of tasks. This article explores ten key insights from DAIMON's work, from its groundbreaking sensor technology to early real-world pilots in hotels and factories.

1. Daimon-Infinity: The Largest Omni-Modal Robotic Dataset

Daimon-Infinity is the world's most comprehensive robotic manipulation dataset. It includes million-hour-scale multimodal data—vision, tactile, language, and action—covering over 80 real-world scenarios and 2,000+ human skills. From folding laundry at home to precision assembly on factory lines, the dataset provides the vast, diverse training data needed for physical AI. Its sheer scale and tactile resolution (over 110,000 sensing points per fingertip) set a new standard for robot learning.
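
To make the "omni-modal" framing concrete, the sketch below shows what a single timestep of such an episode could look like when vision, tactile, language, and action streams are stored together. The field names and shapes are illustrative assumptions, not DAIMON's published format.

```python
# Hypothetical schema for one timestep of an omni-modal manipulation episode.
# Field names and array shapes are illustrative assumptions, not DAIMON's actual format.
from dataclasses import dataclass
import numpy as np

@dataclass
class OmniModalStep:
    rgb: np.ndarray        # camera frame, e.g. (480, 640, 3) uint8
    tactile: np.ndarray    # per-fingertip tactile images, e.g. (2, 240, 320) float32
    instruction: str       # natural-language task description, e.g. "fold the towel"
    action: np.ndarray     # commanded end-effector pose delta + gripper width, e.g. (7,)
    timestamp: float       # seconds since episode start

def stack_episode(steps: list[OmniModalStep]) -> dict:
    """Stack a list of timesteps into arrays suitable for model training."""
    return {
        "rgb": np.stack([s.rgb for s in steps]),
        "tactile": np.stack([s.tactile for s in steps]),
        "instruction": steps[0].instruction,   # one instruction per episode
        "action": np.stack([s.action for s in steps]),
    }
```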


2. A Fingertip-Sized Tactile Sensor with 110,000 Sensing Units

Central to DAIMON's technology is a monochromatic, vision-based tactile sensor packed into a fingertip-sized module. It captures high-resolution texture, pressure, and slip information through a camera inside the fingertip. With more than 110,000 effective sensing units, it rivals human fingertip sensitivity. This hardware is the foundation for generating the high-quality tactile data in the dataset, enabling robots to feel surfaces, edges, and forces during manipulation.
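
At its core, a vision-based tactile sensor is a tiny camera watching a deformable skin from the inside; contact shows up as local changes in the image. The minimal sketch below illustrates that general idea, assuming a monochrome frame and a no-contact reference image. The threshold values and the "pressure" proxy are made-up illustrations, not DAIMON's calibration or API.

```python
# Minimal sketch of reading out a camera-based tactile sensor.
# Thresholds and the pressure proxy are illustrative assumptions only.
import numpy as np

def contact_map(frame: np.ndarray, reference: np.ndarray, thresh: float = 12.0) -> np.ndarray:
    """Pixels whose brightness changed versus the no-contact reference image."""
    diff = np.abs(frame.astype(np.float32) - reference.astype(np.float32))
    return diff > thresh

def pressure_proxy(frame: np.ndarray, reference: np.ndarray, thresh: float = 12.0) -> float:
    """Crude scalar 'pressure' estimate: summed brightness change inside the contact area."""
    diff = np.abs(frame.astype(np.float32) - reference.astype(np.float32))
    mask = diff > thresh
    return float(diff[mask].sum()) if mask.any() else 0.0

def contact_centroid(frame: np.ndarray, reference: np.ndarray, thresh: float = 12.0):
    """Row/column centre of the contact region; its frame-to-frame motion is a slip cue."""
    mask = contact_map(frame, reference, thresh)
    return np.argwhere(mask).mean(axis=0) if mask.any() else None
```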

3. Distributed Data Collection: Millions of Hours Annually

DAIMON built a distributed out-of-lab collection network that can generate millions of hours of tactile-rich data per year. Instead of relying on a single lab setup, the system uses multiple remote stations performing tasks simultaneously. This approach dramatically scales data volume and diversity, capturing rare edge cases—like gripping a wet object or threading a needle—that are vital for robust AI training.

4. Open-Sourcing 10,000 Hours of Data

To accelerate real-world adoption of embodied AI, DAIMON open-sourced 10,000 hours of its dataset. This move allows researchers worldwide to access high-quality tactile data without starting from scratch. By lowering barriers, DAIMON hopes to foster innovation in tactile-driven manipulation, from pick-and-place in logistics to delicate surgical tasks. The open-source release also invites community validation and improvement.

5. Prof. Michael Yu Wang: The Tactile Visionary

Behind the strategy is Prof. Michael Yu Wang, co-founder and chief scientist. With a PhD from Carnegie Mellon (where he studied under manipulation pioneer Matt Mason) and a long tenure at the Hong Kong University of Science & Technology, he brings four decades of robotics expertise. His key insight: tactile sensing must be elevated alongside vision and language for true dexterity. He pioneered the Vision-Tactile-Language-Action (VTLA) architecture, making touch a first-class modality.

6. Solving the 'Insensitivity' Problem in Robot Manipulation

Current robots rely heavily on vision, but they lack tactile feedback—what Prof. Wang calls the 'insensitivity' problem. Without touch, robots cannot gauge grip force, detect surface texture, or sense when an object slips. This limits tasks like picking up a fragile egg or assembling parts with tight tolerances. DAIMON's tactile data aims to fill this gap, providing the missing sensory channel for safe, precise manipulation in unstructured environments.
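
A simple way to see why this matters is a grasp controller that tightens only when the fingertips report slip, and never beyond a fragile-object limit. The toy loop below sketches that idea; read_tactile and set_grip_force are hypothetical placeholders for a sensor and gripper interface, and the force numbers are arbitrary, not a real DAIMON API.

```python
# Toy closed-loop grasp controller illustrating why tactile feedback matters.
# read_tactile / set_grip_force are placeholder callbacks; values are illustrative.
def regulate_grip(read_tactile, set_grip_force, max_force=5.0, step=0.2, cycles=200):
    """Tighten the grasp just enough to stop slip, without crushing a fragile object."""
    force = 1.0                                   # gentle initial grasp (newtons, illustrative)
    for _ in range(cycles):
        reading = read_tactile()                  # expected: {"contact": bool, "slip": bool}
        if not reading["contact"]:
            break                                 # object released or lost
        if reading["slip"]:
            force = min(force + step, max_force)  # respond to slip with a small increase
        set_grip_force(force)
    return force
```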


7. The Vision-Tactile-Language-Action (VTLA) Architecture

VTLA treats tactile data as a core input modality, on par with vision and language. Unlike earlier models that tacked on touch as an afterthought, VTLA integrates high-resolution tactile signals directly into the decision loop—from perception to action. This architecture enables robots to learn manipulation skills that adapt to varying object properties, like hardness, friction, or temperature. The Daimon-Infinity dataset is built to train such multimodal models.
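
The sketch below illustrates the general shape of such a model: each modality is encoded to an embedding, the embeddings are fused, and an action comes out the other end. The layer sizes and encoders are arbitrary placeholders; this is not DAIMON's actual VTLA implementation, only a minimal illustration of touch entering the loop on equal footing with vision and language.

```python
# Minimal sketch of a vision-tactile-language-action policy (illustrative, not DAIMON's model).
import torch
import torch.nn as nn

class VTLAPolicy(nn.Module):
    def __init__(self, dim=256, action_dim=7):
        super().__init__()
        self.vision_enc = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.tactile_enc = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.text_enc = nn.Sequential(nn.LazyLinear(dim), nn.ReLU())  # pre-computed text features
        self.fuse = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=2)
        self.action_head = nn.Linear(dim, action_dim)

    def forward(self, rgb, tactile, text_feat):
        # Each modality becomes one token; touch is fused on equal footing with pixels and words.
        tokens = torch.stack(
            [self.vision_enc(rgb), self.tactile_enc(tactile), self.text_enc(text_feat)], dim=1)
        fused = self.fuse(tokens).mean(dim=1)
        return self.action_head(fused)            # e.g. end-effector delta + gripper command
```

In a full system the text features would come from a language model and the encoders would be far larger; the point of the sketch is simply that tactile frames enter the same fusion step as vision and language rather than being bolted on afterwards.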

8. Global Collaborations: Google DeepMind, Northwestern, NUS

The dataset initiative is a collaborative effort supported by partners in China and around the world, including Google DeepMind, Northwestern University, and the National University of Singapore. These partnerships bring diverse expertise in deep learning, materials science, and robotics. Together, they ensure the dataset captures a wide range of tasks and environmental conditions, from lab benchmarks to real-world settings like convenience stores and hotels.

9. Real-World Inroads: Hotels and Convenience Stores in China

Prof. Wang sees immediate applications in service robotics—specifically in Chinese hotels and convenience stores. Robots equipped with tactile sensors can handle check-in tasks, deliver items, or restock shelves with the same care as a human. Tactile feedback prevents them from crushing fragile items or dropping slippery bottles. Early pilots are exploring these scenarios, aiming for deployment within the next year.

10. From Laundry to Assembly: Broad Task Coverage

The Daimon-Infinity dataset spans tasks ranging from domestic chores (folding laundry, opening jars) to industrial operations (assembly line part insertion, torque control). This breadth is deliberate: tactile skills must generalize across very different force and texture regimes. By covering 2,000+ human skills, DAIMON equips robots to handle the messy variety of real-world manipulation—with touch as the unifying sense.

DAIMON Robotics' dataset marks a turning point for embodied AI. By open-sourcing a massive, high-resolution tactile dataset and pioneering the VTLA architecture, the company enables robots to finally 'feel' their way through complex tasks. As Prof. Wang puts it, 'Touch is the last frontier for dexterous manipulation.' With these advances, that frontier is now within reach.
