Fangzhou Hong is currently a third-year Ph.D. student in the School of Computer Science and Engineering at Nanyang Technological University (MMLab@NTU), supervised by Prof. Ziwei Liu. Previously, he received his B.Eng. degree in Software Engineering from Tsinghua University in 2020. His research interests lie in computer vision and deep learning. In particular, he is interested in 3D representation learning and its intersection with computer graphics.
One paper (AvatarCLIP) accepted to SIGGRAPH 2022 (journal track).
One paper (Garment4D) accepted to NeurIPS 2021.
I was awarded the Google PhD Fellowship 2021 (Machine Perception).
One paper (extended Cylinder3D) accepted to TPAMI.
Two papers (DS-Net and Cylinder3D) accepted to CVPR 2021.
Started my journey at MMLab@NTU!
EVA3D: Compositional 3D Human Generation from 2D Image Collections
International Conference on Learning Representations (ICLR), 2023 (Spotlight)
HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling
European Conference on Computer Vision (ECCV), 2022 (Oral)
A large-scale multi-modal (color images, point clouds, keypoints, SMPL parameters, and textured meshes) 4D human dataset with 1000 human subjects, 400k sequences and 60M frames.
AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars
ACM Transactions on Graphics (SIGGRAPH), 2022
AvatarCLIP empowers lay users to customize a 3D avatar with the desired shape and texture, and to drive the avatar with described motions, using natural language alone.
Versatile Multi-Modal Pre-Training for Human-Centric Perception
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)
The first to leverage the multi-modal nature of human data (e.g., RGB, depth, 2D keypoints) for effective human-centric representation learning.
Garment4D: Garment Reconstruction from Point Cloud Sequences
35th Conference on Neural Information Processing Systems (NeurIPS), 2021
The first attempt at separable and interpretable garment reconstruction from point cloud sequences, especially for challenging loose garments.
LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Ranked 1st on the public leaderboard of SemanticKITTI panoptic segmentation (2020-11-16). A learnable clustering module is designed to adapt kernel functions to complex point distributions.
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation / LiDAR-based Perception
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 (Oral)
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Ranked 1st on the public leaderboard of SemanticKITTI semantic segmentation (2020-11-16). Cylindrical 3D convolution is designed to explore the 3D geometric pattern of LiDAR point clouds. The extended version applies cylindrical convolution to more general LiDAR-based perception tasks.
LRC-Net: Learning Discriminative Features on Point Clouds by Encoding Local Region Contexts
Computer Aided Geometric Design, 2020, 79: 101859. (SCI, 2017 Impact factor: 1.421, CCF B)
Learns discriminative features on point clouds by simultaneously encoding the fine-grained contexts inside and among local regions.
SHERF: Generalizable Human NeRF from a Single Image
arXiv Preprint, 2023
Reconstructs a human NeRF from a single image in one forward pass!
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
arXiv Preprint, 2023
ReMoDiffuse is a diffusion-model-based motion generation framework that integrates a retrieval mechanism to refine the denoising process, enhancing generalizability and diversity.
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
arXiv Preprint, 2022
The first diffusion-model-based text-driven motion generation framework with probabilistic mapping, realistic synthesis and multi-level manipulation ability.
LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network
arXiv Preprint, 2022
Extension of the CVPR 2021 version. Extends DS-Net to 4D panoptic LiDAR segmentation via temporally unified instance clustering on aligned LiDAR frames.
Google PhD Fellowship 2021
Outstanding Undergraduate Thesis of Tsinghua University
Outstanding Graduate of Tsinghua University
Outstanding Graduate of Beijing
Outstanding Graduate of School of Software, Tsinghua University
ICBC Scholarship (Top 3%)
Huawei Scholarship (Top 1%)
Tung OOCL Scholarship (Top 5%)
Conference Reviewer: CVPR’21/23, ICCV’23, NeurIPS’22, ICML’23, SIGGRAPH’23, AAAI’21/23
Journal Reviewer: TPAMI, IJCV, TCSVT, JABES, PR
NTU CE/CZ1115 Introduction to Data Science and Artificial Intelligence (Teaching Assistant)
NTU CE2003 Digital System Design (Teaching Assistant)
NTU CE/CZ1115 Introduction to Data Science and Artificial Intelligence (Teaching Assistant)
NTU SC1013 Physics for Computing (Teaching Assistant)
EVA3D is a high-quality unconditional 3D human generative model that only requires 2D image collections for training.