Before coming to UCSD, I received my bachelor's degree in Computer Science from Zhejiang University, with High Honors and a minor in Electronic Engineering.
Robotics, Computer Vision, Machine Learning
Address: Computer Science & Engineering, EBU3B Room 4128
9500 Gilman Drive, Mail Code 0404, La Jolla, CA, 92093-0404
Multi-View to Single View: Action Recognition with Transferable Virtual View Synthesis
Gao Peng*, Hao Zhu*, Yong-Lu Li*, Jiajun Tang, Jin Xia, Xiaolong Wang†, Cewu Lu† (*Equal contribution)
[Paper (Coming Soon)] [Code (Coming Soon)] [Meme]
We present a novel Virtual View Synthesis (VVS) framework that uses existing multi-view datasets to enhance single-view activity recognition via knowledge transfer. After training on a multi-view video dataset, VVS learns from multi-view consistencies how to better inspect an existing view from a virtual viewpoint, and then transfers that knowledge to the single-view setting.
SAPIEN: A SimulAted Part-based Interactive ENvironment
Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su
CVPR 2020, Oral Presentation
[Paper] [Project] [Video]
We build the SAPIEN simulator, an interaction-rich and physically realistic simulation environment that integrates the PhysX engine and a ROS control interface. I lead the team building the SAPIEN dataset, which contains more than 2K 3D articulated models with 14K movable parts. The dataset is richly annotated with kinematic part motions and dynamic interactive attributes to support robot interaction.
S4G: Amodal Single-view Single-Shot SE(3) Grasp Detection in Cluttered Scenes
Yuzhe Qin*, Rui Chen*, Hao Zhu, Meng Song, Jing Xu, Hao Su (*Equal contribution)
CoRL 2019, Spotlight Presentation
[Paper] [Project] [Video] [Presentation]
We studied the problem of 6-DoF grasping by a parallel gripper in a cluttered scene captured from a single viewpoint with a commodity depth sensor. Our learning-based approach, trained in a synthetic scene, works well in real-world scenarios, with improved speed and success rate compared with state-of-the-art methods.
CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark
Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, Cewu Lu
CVPR 2019, Oral Presentation
[Paper] [Code] [Dataset] [Video]
We collect CrowdPose, a new dataset of crowded human poses. We propose a joint-candidate single-person pose estimation (SPPE) and a global maximum joints association algorithm to tackle the problem of pose estimation in a crowd. Our method surpasses the state-of-the-art method by 5.2 mAP on CrowdPose, and replacing certain steps in the state-of-the-art method with our module brings a 0.8 mAP improvement on MSCOCO.