Graphics, Vision & Video

Reconstructing Detailed Dynamic Face Geometry from Monocular Video

SIGGRAPH Asia 2013

Pablo Garrido¹, Levi Valgaerts¹, Chenglei Wu¹˒², Christian Theobalt¹
¹ MPI for Informatics, ² Intel Visual Computing Institute


Detailed facial performance geometry can be reconstructed using dense camera and light setups in controlled studios. However, a wide range of important applications cannot employ these approaches, including all movie productions shot from a single principal camera. For post-production, these require dynamic monocular face capture for appearance modification. We present a new method for capturing face geometry from monocular video. Our approach captures detailed, dynamic, spatio-temporally coherent 3D face geometry without the need for markers. It works under uncontrolled lighting, and it successfully reconstructs expressive motion including high-frequency face detail such as folds and laugh lines. After simple manual initialization, the capturing process is fully automatic, which makes it versatile, lightweight and easy-to-deploy. Our approach tracks accurate sparse 2D features between automatically selected key frames to animate a parametric blend shape model, which is further refined in pose, expression and shape by temporally coherent optical flow and photometric stereo. We demonstrate performance capture results for long and complex face sequences captured indoors and outdoors, and we exemplify the relevance of our approach as an enabling technology for model-based face editing in movies and video, such as adding new facial textures, as well as a step towards enabling everyone to do facial performance capture with a single affordable camera.
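The parametric blend shape model mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the generic delta blend shape formulation (neutral face plus a weighted sum of per-expression vertex offsets), and the function name `blend_shapes`, its parameters, and the toy two-vertex mesh are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]


def blend_shapes(
    neutral: Sequence[Vertex],
    deltas: Sequence[Sequence[Vertex]],
    weights: Sequence[float],
) -> List[Vertex]:
    """Evaluate a generic delta blend shape model.

    Each output vertex is neutral + sum_i weights[i] * deltas[i],
    where deltas[i] holds the per-vertex offsets of expression i.
    (Hypothetical helper for illustration, not the paper's code.)
    """
    if len(deltas) != len(weights):
        raise ValueError("need exactly one weight per blend shape")
    result = [list(v) for v in neutral]
    for delta, w in zip(deltas, weights):
        if len(delta) != len(neutral):
            raise ValueError("blend shape and neutral mesh differ in size")
        for vertex, offset in zip(result, delta):
            for k in range(3):
                vertex[k] += w * offset[k]
    return [(v[0], v[1], v[2]) for v in result]


# Toy example: a two-vertex "mesh" with two expression offsets.
neutral_mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
expression_deltas = [
    [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)],  # expression 0 moves vertex 0 along y
    [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)],  # expression 1 moves both vertices
]
animated = blend_shapes(neutral_mesh, expression_deltas, [0.5, 1.0])
```

In a tracking setting, the weights would be the unknowns estimated per frame (for example from sparse 2D features), while the neutral mesh and offsets stay fixed; the refinement by optical flow and photometric stereo described in the abstract is outside the scope of this sketch.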

pdf (3.4M) / (35.9M)
Supplementary Material
pdf (70.9M)
mp4 (126.5M)
pptx (400.6M)

Dataset available!


Supplementary video to the paper
Additional video


author = {Pablo Garrido and Levi Valgaerts and Chenglei Wu and Christian Theobalt},
title = {Reconstructing Detailed Dynamic Face Geometry from Monocular Video},
booktitle = {{ACM} Trans. Graph. (Proceedings of SIGGRAPH Asia 2013)},
volume = {32},
number = {6},
pages = {158:1--158:10},
month = {November},
year = {2013},
url = {},
doi = {10.1145/2508363.2508380}


Data description: We provide three sequences captured indoors with a Canon EOS 550D at 25 fps in HD quality (1920 x 1088 pixels) and one sequence captured outdoors with a GoPro Hero at 30 fps in HD quality (cropped to 1020 x 880 pixels). The sequences range from 562 to 1000 frames in length. Our tracked 3D geometry consists of 200K vertices, except for the outdoor sequence, which consists of 50K vertices. Note that each mesh was colored using the corresponding pixels in the image.

Terms of use: The data we provide is meant for research purposes only; any use of it for non-scientific purposes is not allowed. This includes publishing any scientific results obtained with our data in non-scientific literature, such as the tabloid press. We ask researchers to respect our actors and not to use the data for any distasteful manipulations (such as hideous deformations, exploding heads, or manipulations that might be culturally sensitive). Any dissemination of this data outside of your institution is forbidden; distribution within the affiliated institution is allowed.

If you use our data, you are required to cite the following paper (see bibtex above):
"Reconstructing Detailed Dynamic Face Geometry from Monocular Video".

Downloads: The data provided below is password protected. To request the data, send us an email stating the following:

  • Your name, title, and institution
  • Your intended use of the data
  • A statement saying that you accept our terms of use
Note: Only institutional emails will be accepted.

Compressed rar files: Subject1 Subject2 Subject3 Subject4_Outdoors
Browse: Link