StyleTalker: Stylized Talking Head Avatar from Monocular Video

Xueting Yang, Yukang Cao, Junli Deng, Zhaoxin Fan, Kai Han*, Kwan-Yee K. Wong*

Paper page

Please refer to our webpage for more visualizations.

Abstract

We introduce StyleTalker, a text-guided method for editing and animating dynamic 3D head avatars from a monocular video. Current 3D scene editing techniques face two main challenges in this task: 1) They typically require multi-view videos for accurate geometry reconstruction and are not suited to dynamic scenarios, making them ineffective for editing talking head avatars from a single-view video. 2) They struggle with fine-grained local edits, largely due to biases inherited from pre-trained 2D image diffusion models and limitations in detecting detailed facial landmarks. To overcome these challenges, we propose StyleTalker with two key innovations: 1) A mesh-enhanced 3D Gaussian reconstruction that combines 3D head priors with multi-view video diffusion, improving the accuracy and flexibility of the reconstruction process. 2) A landmark-driven talking head editing method that uses 3D facial landmarks to guide the editing process. By adjusting the strength of the edits based on the distance to these landmarks, our method preserves the avatar's original identity while achieving the requested edits. Our extensive experiments demonstrate that StyleTalker outperforms current state-of-the-art methods, delivering high-quality edits and enabling the animation of avatars with diverse facial expressions, all from a single source video.

Pipeline

StyleTalker takes as input a monocular video and text prompts to edit dynamic 3D scenes represented by 3D Gaussian splatting. (1) We first integrate a 3D head prior with a multi-view video diffusion model (V3D) to reconstruct a detailed 3D Gaussian head. (2) We propose a weighted loss that depends on the edit strength b and the 3D facial landmarks, effectively achieving local editing while maintaining the original character identity.
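The exact loss formulation is not released yet, but the idea of scaling edit strength by distance to the 3D facial landmarks can be sketched as follows. This is a minimal PyTorch illustration under assumed names and defaults (`edit_weights`, `sigma`, the Gaussian falloff, and the blending of per-Gaussian error terms are all our assumptions); only the edit strength b and the landmark-distance weighting come from the paper, and the released code may differ.

```python
import torch

def edit_weights(gaussian_xyz, landmark_xyz, b=1.0, sigma=0.05):
    """Per-Gaussian edit weight that decays with distance to the
    nearest 3D facial landmark. Names and defaults are illustrative
    assumptions, not the released implementation.

    gaussian_xyz:  (N, 3) centers of the 3D Gaussians
    landmark_xyz:  (L, 3) 3D facial landmarks of the edited region
    b:             global edit strength from the text prompt
    sigma:         spatial falloff scale (scene units)
    """
    # (N, L) pairwise distances, then nearest-landmark distance per Gaussian
    d = torch.cdist(gaussian_xyz, landmark_xyz).min(dim=1).values  # (N,)
    # Gaussian falloff: weight ~ b at a landmark, ~ 0 far away
    return b * torch.exp(-d.pow(2) / (2 * sigma ** 2))

def weighted_edit_loss(edit_err, identity_err, w):
    """Blend per-Gaussian error terms: the edit objective dominates
    near the edited landmarks, identity preservation elsewhere."""
    return (w * edit_err).mean() + ((1.0 - w) * identity_err).mean()

# Hypothetical usage with random stand-in tensors
xyz = torch.rand(10000, 3)   # 3D Gaussian centers
lms = torch.rand(68, 3)      # 3D facial landmarks
w = edit_weights(xyz, lms, b=0.8)
loss = weighted_edit_loss(torch.rand(10000), torch.rand(10000), w)
```

Under this reading, a large b pushes the edit hard inside the landmark region while the complementary weight (1 - w) keeps the rest of the head anchored to the source identity, which matches the paper's claim of local editing with identity preservation.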

Code

We are working on releasing the code... 🏗️ 🚧 🔨 Please stay tuned!

Misc.

If you want to cite our work, please use the following bib entry:

