Details in mesh animations are difficult to generate, but they have
a great impact on visual quality. In this work, we demonstrate a practical
software system for capturing such details from multi-view
video recordings. Given a stream of synchronized video images
that record a human performance from multiple viewpoints and an
articulated template of the performer, our system captures the motion
of both the skeleton and the shape. The output mesh animation
is enhanced with the details observed in the image silhouettes. For
example, a performance in casual, loose-fitting clothes will generate
mesh animations with flowing garment motions. We accomplish
this with a fast pose tracking method followed by nonrigid deformation
of the template to fit the silhouettes. The entire process
takes less than sixteen seconds per frame and requires no markers
or texture cues. Captured meshes are in full correspondence, making
them readily usable for editing operations including texturing,
deformation transfer, and deformation model learning.
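
To illustrate what full correspondence makes possible, the following is a minimal sketch, not the system's own code. It assumes each captured frame is a list of vertex positions that share one vertex ordering; the function names and the toy triangle are hypothetical. Because vertex i refers to the same surface point in every frame, per-vertex motion can be computed by index and re-applied. This is plain per-vertex displacement, a much simpler operation than a full deformation transfer method.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def vertex_displacements(frame_a: List[Vec3], frame_b: List[Vec3]) -> List[Vec3]:
    """Per-vertex motion from frame_a to frame_b, matched purely by index.

    This is only valid when both frames share the same vertex ordering,
    which is the "full correspondence" property (assumed here).
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames are not in correspondence")
    return [
        (bx - ax, by - ay, bz - az)
        for (ax, ay, az), (bx, by, bz) in zip(frame_a, frame_b)
    ]


def apply_displacements(
    vertices: List[Vec3], displacements: List[Vec3], scale: float = 1.0
) -> List[Vec3]:
    """Add scaled per-vertex displacements to a mesh with the same ordering."""
    if len(vertices) != len(displacements):
        raise ValueError("vertex and displacement counts differ")
    return [
        (x + scale * dx, y + scale * dy, z + scale * dz)
        for (x, y, z), (dx, dy, dz) in zip(vertices, displacements)
    ]


# Hypothetical toy data: one triangle captured at two frames.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame1 = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0)]

motion = vertex_displacements(frame0, frame1)
# Exaggerate the captured motion by a factor of two as a simple edit.
exaggerated = apply_displacements(frame0, motion, scale=2.0)
```

Without a shared vertex ordering, the index-based matching above would have no meaning, and a separate surface registration step would be needed before any such edit.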