1CFCS, School of Computer Science, Peking University
2Galbot
3UC Berkeley
4Beijing Academy of Artificial Intelligence
* equal contributions
† corresponding author
Simulation set: DexGraspNet 2.0 contains 427M grasps (4 random grasps are visualized in each scene here for clarity).
Real-world Execution: the first row shows model-generated grasps conditioned on real-world single-view depth point clouds, the second row shows the top-ranked grasps, and the third row shows the real-world executions.
Grasping in cluttered scenes remains highly challenging for dexterous hands due to the scarcity of training data. To address this problem, we present a large-scale synthetic benchmark encompassing 1319 objects, 8270 scenes, and 427 million grasps. Beyond benchmarking, we propose a novel two-stage grasping method that learns efficiently from this data by using a diffusion model that conditions on local geometry. Our generative method outperforms all baselines in simulation experiments. Furthermore, with the aid of test-time depth restoration, our method demonstrates zero-shot sim-to-real transfer, attaining a 90.7% success rate for dexterous grasping in real-world cluttered scenes.
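To make the generative stage concrete, below is a minimal, illustrative PyTorch sketch of diffusion-based grasp sampling conditioned on a local-geometry feature. This is not the released DexGraspNet 2.0 implementation: the denoiser architecture, the pose parameterization (translation, 6D rotation, and joint angles), the feature and step counts, and the noise schedule are all assumptions, and the point-cloud encoder is stubbed with a random feature.

# A minimal sketch of the core idea: a diffusion model that denoises a
# dexterous grasp pose while conditioning on a local scene-geometry feature.
# All names and dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

POSE_DIM = 25   # assumption: 3 translation + 6 rotation (6D) + 16 joint angles
FEAT_DIM = 128  # assumption: dimensionality of the local-geometry feature
T = 100         # assumption: number of diffusion steps

class GraspDenoiser(nn.Module):
    """Predicts the noise added to a grasp pose, given the noisy pose,
    the timestep, and a local-geometry conditioning feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(POSE_DIM + FEAT_DIM + 1, 256), nn.SiLU(),
            nn.Linear(256, 256), nn.SiLU(),
            nn.Linear(256, POSE_DIM),
        )

    def forward(self, x_t, t, cond):
        t_emb = t.float().unsqueeze(-1) / T  # simple scalar time embedding
        return self.net(torch.cat([x_t, t_emb, cond], dim=-1))

@torch.no_grad()
def sample_grasp(model, cond, steps=T):
    """Standard DDPM ancestral sampling over the grasp-pose vector."""
    betas = torch.linspace(1e-4, 2e-2, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = torch.randn(cond.shape[0], POSE_DIM)  # start from pure noise
    for t in reversed(range(steps)):
        t_batch = torch.full((cond.shape[0],), t)
        eps = model(x, t_batch, cond)
        # Posterior mean of x_{t-1} given the predicted noise.
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x  # a grasp pose to be ranked and executed downstream

# Usage: condition on a (stubbed) local point-cloud feature for one scene point.
model = GraspDenoiser()
local_feature = torch.randn(1, FEAT_DIM)  # would come from a point-cloud encoder
grasp = sample_grasp(model, local_feature)
print(grasp.shape)  # torch.Size([1, 25])

In the full two-stage pipeline described above, a second stage would rank the sampled grasps before real-world execution; this sketch only covers the conditional sampler.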
Synthetic training scenes and grasp poses from DexGraspNet 2.0.
Dexterous grasp generation and real-world execution in a cluttered scene.
@inproceedings{zhangdexgraspnet,
  title={DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes},
  author={Zhang, Jialiang and Liu, Haoran and Li, Danshi and Yu, XinQiang and Geng, Haoran and Ding, Yufei and Chen, Jiayi and Wang, He},
  booktitle={8th Annual Conference on Robot Learning},
  year={2024}
}
If you have any questions, please feel free to contact Jialiang Zhang at zhangjialiang@stu.pku.edu.cn, Haoran Liu at lhrrhl0419@stu.pku.edu.cn, Danshi Li at danshi.li.academia@gmail.com, Xinqiang Yu at yuxinqiang@galbot.com, or He Wang at hewang@pku.edu.cn.