
Video: https://youtu.be/NtyhIcRR3NU
In this video, we dive into Nvidia Cosmos, the Text-to-World and Video-to-World diffusion models announced at CES 2025. Discover how to run Nvidia Cosmos in ComfyUI for Text-to-Video, Image-to-Video, and Video-to-Video workflows, and explore the future of AI video generation.
What You'll Learn:
Nvidia Cosmos Overview: Understand the capabilities of the 7B and 14B parameter models for Text-to-Video, Image-to-Video, and Video-to-Video generation.
ComfyUI Integration: Step-by-step guide to setting up Nvidia Cosmos in ComfyUI, including downloading the model weights as SafeTensors files and configuring the workflows.
Workflow Demos: See Text-to-Video and Image-to-Video workflows in action, with tips on optimizing settings like Torch Compile, EDM Sampling, and RES Multi-Step sampling.
Advanced Features: Explore how Nvidia Cosmos supports Video-to-Video by default, accepting video input and producing video output without additional custom nodes.
Text encoder and VAE:
https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/tree/main
oldt5_xxl_fp8_e4m3fn_scaled.safetensors -> ComfyUI/models/text_encoders
cosmos_cv8x8x8_1.0.safetensors -> ComfyUI/models/vae
Note: oldt5_xxl is not the same as the t5xxl used in Flux and other models. oldt5_xxl is t5xxl 1.0, while the one used in Flux and others is t5xxl 1.1.
Diffusion models:
https://huggingface.co/mcmonkey/cosmos-1.0/tree/main
The diffusion model files from this repository go in: ComfyUI/models/diffusion_models
Note: "Text to World" means Text-to-Video, and "Video to World" means Image-to-Video or Video-to-Video.
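The file placement above can be double-checked with a small script. This is only a rough sketch, not part of ComfyUI: the function name check_cosmos_files and the comfy_root argument are made up for illustration, and because the diffusion model filenames are not listed here, it only checks that at least one .safetensors file is present in the diffusion_models folder.

```python
from pathlib import Path

# Filenames and target folders as listed above, relative to the ComfyUI root.
EXPECTED_FILES = {
    "oldt5_xxl_fp8_e4m3fn_scaled.safetensors": "models/text_encoders",
    "cosmos_cv8x8x8_1.0.safetensors": "models/vae",
}
# Diffusion model filenames are not listed above, so only the folder is checked.
DIFFUSION_DIR = "models/diffusion_models"


def check_cosmos_files(comfy_root):
    """Return a list of problems; an empty list means every file was found."""
    root = Path(comfy_root)
    problems = []
    for name, folder in EXPECTED_FILES.items():
        if not (root / folder / name).is_file():
            problems.append(f"missing {folder}/{name}")
    diffusion = root / DIFFUSION_DIR
    if not (diffusion.is_dir() and any(diffusion.glob("*.safetensors"))):
        problems.append(f"no .safetensors file in {DIFFUSION_DIR}")
    return problems


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; adjust to your own path.
    for problem in check_cosmos_files("ComfyUI"):
        print(problem)
```

If the script prints nothing, the text encoder, the VAE, and at least one diffusion model file are where ComfyUI expects them.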
If you want the original diffusion models in .pt format the official links are:
https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World
https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Text2World
https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Video2World
https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Video2World
Workflows
Future Updates:
Get a sneak peek at the latest ComfyUI updates, including support for Start Image and End Image in Image-to-Video workflows.

Resources:
https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6
https://gist.github.com/comfyanonymous/2f57adabe5a22b36a21ae024306daddb
https://github.com/comfyanonymous/ComfyUI/issues/6375
https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/tree/main/vae
https://huggingface.co/mcmonkey/cosmos-1.0/tree/main
https://huggingface.co/Kijai/Cosmos1_ComfyUI/tree/main
Why This Matters:
Cutting-Edge AI: Nvidia Cosmos is one of the newest AI video model families, covering text, image, and video inputs with 7B and 14B parameter variants.
Efficient Workflows: Learn how to integrate Nvidia Cosmos into ComfyUI for video generation, whether you're working with text, images, or videos.
Creative Potential: Unlock new possibilities in AI video creation with advanced features like Negative Prompts, Torch Compile, and RES Multi-Step sampling.
Perfect For:
AI video creators looking to explore the latest advancements in Text-to-Video, Image-to-Video, and Video-to-Video generation.
ComfyUI users interested in integrating Nvidia Cosmos into their workflows.
AI enthusiasts eager to stay ahead of the curve with cutting-edge AI video models.
Eastern Magus
2025-01-20 01:56:52 +0000 UTC