Zhiwen Chen (陈志文)

I'm currently a Staff Algorithm Engineer at Alibaba Group, where I lead the PixelAI Algorithm Team of Tao Technology. Before 2017, I worked for several years as a Video Analytic Researcher at Trakomatic Pte. Ltd., Singapore.

I received my B.E. degree in computer science from SJTU in 2012, under the supervision of Prof. Fan Wu, and my M.E. degree in computer science from NUS in 2014.

Email  /  LinkedIn  /  CV


Research

I'm interested in computer vision, in particular human reconstruction, animatable avatars, and human-computer interaction. Below are some highlighted publications.

TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting
Jianchuan Chen, Jingchuan Hu, Gaige Wang, Zhonghua Jiang, Tiansong Zhou,
Zhiwen Chen, Chengfei Lv
CVPR 2025 Highlight
paper / project page

We introduce TaoAvatar, which generates photorealistic, topology-consistent 3D full-body avatars from multi-view sequences. It provides high-quality, real-time rendering with low storage requirements and runs on various mobile and AR devices such as the Apple Vision Pro.

SaMer: A Scenario-aware Multi-dimensional Evaluator for Large Language Models
Kehua Feng, Keyan Ding, Jing Yu, Yiwen Qu, Zhiwen Chen, Chengfei Lv, Gang Yu, Qiang Zhang, Huajun Chen
ICLR 2025
paper / project page

We introduce SaMer, a fine-grained, scenario-adaptive evaluator that dynamically adjusts evaluation dimensions based on query context.

GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting
Hongyun Yu, Zhan Qu, Qihang Yu, Jianchuan Chen, Zhonghua Jiang, Zhiwen Chen, Shengyu Zhang, Jimin Xu, Fei Wu, Chengfei Lv, Gang Yu
ACM MM 2024
paper / project page

We present GaussianTalker, a novel method for audio-driven talking head synthesis based on 3D Gaussian Splatting. It outperforms existing state-of-the-art methods in talking head synthesis, delivering precise lip synchronization and exceptional visual quality. It also renders at 130 FPS on an NVIDIA RTX 4090 GPU.

Multi-Level Pixel-Wise Correspondence Learning for 6DoF Face Pose Estimation
Miao Xu, Xiangyu Zhu, Yueying Kao, Zhiwen Chen, Jiangjing Lyu, Zhen Lei
TMM 2024
paper

We present a novel framework for 6DoF face pose estimation, where 2D features extracted from images and 3D features representing 3D shape interact with each other in a transformer architecture to learn the 2D-3D correspondence.

MVP-Human Dataset for 3D Clothed Human Avatar Reconstruction from Multiple Frames
Xiangyu Zhu, Tingting Liao, Xiaomei Zhang, Jiangjing Lyu, Zhiwen Chen, Yunfeng Wang, Kan Guo, Qiong Cao, Stan Z. Li, Zhen Lei
TBIOM 2023
paper / code

We present 3D Avatar Reconstruction in the wild (ARwild), which first reconstructs the implicit skinning fields in a multi-level manner.

Context Attention Network for Skeleton Extraction
Zixuan Huang, Yunfeng Wang, Zhiwen Chen, Xin Gao, Ruili Feng, Xiaobo Li
CVPR 2022 Workshop
paper

We propose an attention-based model called Context Attention Network (CANet), which integrates a context extraction module into a UNet architecture and effectively improves the network's ability to extract skeleton pixels. We ranked 1st in the CVPR DLGC Workshop and Challenge.

Simple Baseline for Single Human Motion Forecasting
Chenxi Wang, Yunfeng Wang, Zixuan Huang, Zhiwen Chen
ICCV 2021 Workshop
paper

We establish a simple but effective baseline for single human motion forecasting without visual and social information. We ranked 1st in the ICCV SoMoF Workshop and Challenge.


Achievements

These include workshops, challenges and awards.

1st International Workshop and Challenge on People Analysis: From Face, Body and Fashion to 3D Virtual Avatars
Zhiwen Chen (Challenge Main Organizer)
ECCV 2022 Workshop and Challenge
challenge

We contribute a large-scale dataset, MVP-Human (Multi-View and Multi-Pose 3D Human), which contains 250 subjects. Each subject is captured in 15 different poses, and each pose has 8-view RGB images.

The Fourth Workshop on Deep Learning for Geometric Computing
Zixuan Huang, Yunfeng Wang, Zhiwen Chen
CVPR 2022 Challenge
1st place winner of the Pixel SkelNetOn Track
workshop / challenge

1st Workshop, Benchmark and Challenge on Human Trajectory and Pose Dynamics Forecasting in the Wild
Chenxi Wang, Yunfeng Wang, Zixuan Huang, Zhiwen Chen
ICCV 2021 Challenge
1st place winner on the PoseTrack and 3DPW datasets
workshop / challenge


Template copied from Jon Barron and Matiur Rahman Minar.