Style and Pose Control for Image Synthesis of Humans from a Single Monocular View



Photo-realistic re-rendering of a human from a single image with explicit control over body pose, shape, and appearance enables a wide range of applications, such as human appearance transfer, virtual try-on, motion imitation, and novel view synthesis. While significant progress has been made in this direction using learning-based image generation tools, such as GANs, existing approaches yield noticeable artefacts such as blurring of fine details, unrealistic distortions of the body parts and garments, as well as severe changes of the textures. We therefore propose StylePoseGAN, a new method for synthesizing photo-realistic human images with explicit control over pose and part-based appearance, in which we extend a non-controllable generator to accept conditioning on pose and appearance separately. Our network can be trained in a fully supervised way with human images to disentangle pose, appearance, and body parts, and it significantly outperforms existing single-image re-rendering methods. Our disentangled representation opens up further applications such as garment transfer, motion transfer, virtual try-on, head (identity) swap, and appearance interpolation. StylePoseGAN achieves state-of-the-art image generation fidelity on common perceptual metrics compared to the current best-performing methods, and convinces in a comprehensive user study.
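The core idea of conditioning a generator on pose and appearance through separate pathways can be sketched as follows. This is a minimal illustration, not the authors' code: the module names, layer sizes, and the AdaIN-style modulation are simplifying assumptions standing in for the full StylePoseGAN architecture.

```python
# Illustrative sketch (NOT the authors' implementation): a generator that
# takes pose as a spatial input and appearance as a modulating style vector,
# so the two factors can be recombined freely at test time.
import torch
import torch.nn as nn

class PoseEncoder(nn.Module):
    """Maps a pose image (e.g. a dense-pose rendering) to a spatial feature map."""
    def __init__(self, in_ch=3, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, pose):
        return self.net(pose)

class AppearanceEncoder(nn.Module):
    """Maps a part-based appearance image to a single global style vector."""
    def __init__(self, in_ch=3, style_dim=512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, style_dim)

    def forward(self, tex):
        return self.fc(self.conv(tex).flatten(1))

class StyledGenerator(nn.Module):
    """Pose features fix the spatial layout; the appearance vector rescales
    and shifts feature channels (AdaIN-style), keeping the factors separate."""
    def __init__(self, ch=64, style_dim=512, out_ch=3):
        super().__init__()
        self.affine = nn.Linear(style_dim, 2 * ch)
        self.to_rgb = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(ch, out_ch, 3, padding=1),
        )

    def forward(self, pose_feat, style):
        scale, shift = self.affine(style).chunk(2, dim=1)
        h = pose_feat * (1 + scale[:, :, None, None]) + shift[:, :, None, None]
        return self.to_rgb(h)

# Re-rendering: pose from the TARGET frame, appearance from the SOURCE person.
pose_img = torch.randn(1, 3, 256, 256)
tex_img = torch.randn(1, 3, 256, 256)
enc_p, enc_a, gen = PoseEncoder(), AppearanceEncoder(), StyledGenerator()
out = gen(enc_p(pose_img), enc_a(tex_img))
print(out.shape)  # torch.Size([1, 3, 256, 256])
```

Because the appearance vector only modulates channels while the pose map sets the spatial structure, swapping either input independently yields the motion-transfer and garment-transfer applications described above.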



  • Paper


BibTeX, 1 KB

      title={Style and Pose Control for Image Synthesis of Humans from a Single Monocular View}, 
      author={Kripasindhu Sarkar and Vladislav Golyanik and Lingjie Liu and Christian Theobalt},


This work was supported by the ERC Consolidator Grant 4DReply (770784).


For questions and clarifications please get in touch with:
Kripasindhu Sarkar
