MotionDreamer

Exploring Semantic Video Diffusion features for Zero-Shot 3D Mesh Animation

Delft University of Technology

Animation techniques bring digital 3D worlds and characters to life. However, manual animation is tedious, and automated techniques are often specialized to narrow shape classes. In our work, we propose a technique for automatic re-animation of arbitrary 3D shapes based on a motion prior extracted from a video diffusion model. Unlike existing 4D generation methods, we focus solely on the motion, and we leverage an explicit mesh-based representation compatible with existing computer-graphics pipelines. Furthermore, our use of diffusion features enhances the accuracy of our motion fitting. We analyze the efficacy of these features for animation fitting, and we experimentally validate our approach for two different diffusion models and four animation models. Finally, we demonstrate in a user study that our time-efficient zero-shot method outperforms existing techniques when re-animating a diverse set of 3D shapes.


Try it out yourself!

Explore some looped examples of our method below. Press the L and R buttons to cycle through the objects and animations, V to change the camera view, and T to toggle the single-view texturing. Note that textures are available only for the single view used for optimization (see the paper for details).

Orc Head (NJF)

Model source: Jaka Ardian 3D art (Model from Indonesia)

The website is still under construction; more information will be added in the future.


How does it work?

First, we automatically texture the input mesh M to reduce the domain gap to the VDM (Video Diffusion Model) prior. Second, we condition the VDM on a rendered image Irgb to produce a latent video with motion, and we extract features for all L frames from its internal U-Net. Finally, we reproject the input frame features Â0 onto the mesh surface and optimize the mesh animation parameters p to match the reposed mesh features to the feature video.
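The final fitting step can be illustrated with a deliberately tiny toy example. This is only a sketch under strong assumptions, not the method's implementation: the diffusion features are replaced by a hypothetical analytic descriptor `phi`, the feature video is that descriptor field shifted by a known per-frame motion (`true_motion`), and the animation model is reduced to a single 2D translation per frame. All function and variable names are made up for illustration. What it does show is the idea of attaching frame-0 features to the vertices once and then optimizing the pose parameters so that the features sampled at the reposed vertices agree with them.

```python
import math

def phi(x, y):
    """Hypothetical smooth 3-D descriptor standing in for a diffusion feature."""
    return (math.sin(x), math.cos(y), x + y)

def feature_video(t, x, y, motion):
    """Toy feature video: frame t is the frame-0 feature field shifted by motion[t]."""
    dx, dy = motion[t]
    return phi(x - dx, y - dy)

def fit_translation(verts, vert_feats, t, motion, p_init, lr=0.1, steps=300):
    """Fit a per-frame translation p by gradient descent on the feature mismatch."""
    px, py = p_init
    dx, dy = motion[t]
    n = len(verts)
    for _ in range(steps):
        gx = gy = 0.0
        for (vx, vy), (a0, a1, a2) in zip(verts, vert_feats):
            # Sample the feature video at the reposed vertex position.
            f0, f1, f2 = feature_video(t, vx + px, vy + py, motion)
            r0, r1, r2 = f0 - a0, f1 - a1, f2 - a2
            # Analytic gradient of the squared residual with respect to (px, py).
            x, y = vx + px - dx, vy + py - dy
            gx += 2.0 * (r0 * math.cos(x) + r2)
            gy += 2.0 * (-r1 * math.sin(y) + r2)
        px -= lr * gx / n
        py -= lr * gy / n
    return px, py

# A 5x5 grid of 2D points stands in for the mesh vertices.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
verts = [(x, y) for x in grid for y in grid]

# Ground-truth motion of the toy video (unknown to the optimizer).
true_motion = [(0.1 * t, -0.05 * t) for t in range(4)]

# "Reprojection": each vertex stores the frame-0 feature found at its position.
vert_feats = [feature_video(0, x, y, true_motion) for x, y in verts]

# Fit frame by frame, warm-starting from the previous frame's parameters.
params = [(0.0, 0.0)]
for t in range(1, len(true_motion)):
    params.append(fit_translation(verts, vert_feats, t, true_motion, params[-1]))

for t, (px, py) in enumerate(params):
    print(f"frame {t}: p = ({px:.4f}, {py:.4f})")
```

In this toy setting the recovered translations should approach the entries of `true_motion`. A real pipeline would instead sample features from a rendered view of the deformed mesh and use an animation model with far more parameters than a single translation.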

BibTeX

@misc{uzolas2024motiondreamerexploringsemanticvideo,
      title={MotionDreamer: Exploring Semantic Video Diffusion features for Zero-Shot 3D Mesh Animation}, 
      author={Lukas Uzolas and Elmar Eisemann and Petr Kellnhofer},
      year={2024},
      eprint={2405.20155},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2405.20155}, 
}