(Workflow & Tutorial) Hunyuan Image-to-Video Workflow with 2 Sampling Groups, and Why?
Added 2025-03-07 13:00:12 +0000 UTC
Related Video : https://youtu.be/XsEi02yH_hI
Public Post : https://www.patreon.com/posts/123793195/
Reward Lora : https://www.patreon.com/posts/122549388?utm_campaign=postshare_creator&utm_content=android_share
The Hunyuan Image-to-Video model has been making waves in the AI community, and for good reason. It allows users to generate high-quality videos from still images with relative ease. In this post, we’ll explore how to set up a workflow using two sampling groups, drawing on both practical experience and the official documentation available in ComfyUI’s Hunyuan Video guide.
Understanding the Basics
Before diving into the two-sampling-group workflow, let’s review the essentials:
Model Requirements: The Hunyuan Image-to-Video model requires specific files: a CLIP Vision model (llava_llama3_vision.safetensors), text encoders, a VAE, and the diffusion model. Each must be placed in its respective subfolder within your ComfyUI models directory.
Resolution: By default, the model supports 720p resolution, but higher resolutions are achievable depending on your hardware capabilities.
Prompting: The Hunyuan model works best with concise, clear instructions rather than essay-like prompts. A positive prompt alone is enough to generate the desired output.
Download the following models and place them in the locations specified below:
ComfyUI/models/
├── clip_vision/
│   └── llava_llama3_vision.safetensors
├── text_encoders/
│   ├── clip_l.safetensors
│   ├── llava_llama3_fp16.safetensors
│   └── llava_llama3_fp8_scaled.safetensors
├── vae/
│   └── hunyuan_video_vae_bf16.safetensors
└── diffusion_models/
    └── hunyuan_video_image_to_video_720p_bf16.safetensors
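Before loading the workflow, it can help to confirm that every file from the folder listing above is actually in place. The following is a minimal sketch, not part of ComfyUI itself: the function name missing_model_files and the install path ComfyUI/models are assumptions for illustration, and the check treats both llava_llama3 text encoder variants as expected even though a given workflow may load only one of them.

```python
from pathlib import Path

# Relative paths copied from the folder listing in this post.
EXPECTED_FILES = [
    "clip_vision/llava_llama3_vision.safetensors",
    "text_encoders/clip_l.safetensors",
    "text_encoders/llava_llama3_fp16.safetensors",
    "text_encoders/llava_llama3_fp8_scaled.safetensors",
    "vae/hunyuan_video_vae_bf16.safetensors",
    "diffusion_models/hunyuan_video_image_to_video_720p_bf16.safetensors",
]


def missing_model_files(models_dir):
    """Return the expected relative paths that are not present under models_dir."""
    root = Path(models_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: ComfyUI lives in ./ComfyUI; change the path to match your setup.
    for rel in missing_model_files("ComfyUI/models"):
        print("missing:", rel)
```

If the script prints nothing, all six files were found in the expected subfolders.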
Setting Up the Workflow
To maximize video quality and stability, incorporating two sampling groups is highly recommended. This approach refines the output by applying additional processing after an initial generation pass. Here's how it works:
Step 1: Initial Sampling Group
Input Preparation: Start by preparing your input image and resizing it appropriately. Use the Image Resize Node in ComfyUI to ensure consistent dimensions across all frames.
Custom Nodes: After updating ComfyUI, you’ll find new nodes specifically designed for Hunyuan Image-to-Video. The Hunyuan Image-to-Video node receives your start image and the frame count (e.g., 129 frames for a video of roughly 5 seconds).
Connections: Connect the resized image to the CLIP Vision Encode node, then link its output to the Text Encode Image-to-Video node. Make sure your positive prompt is passed on as conditioning.
Sampler Configuration: Configure the first sampler with moderate settings (around 30 steps) and select a scheduler. Optionally, enable TeaCache to speed up generation slightly.
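The input preparation in Step 1 comes down to a little arithmetic, sketched below. Everything numeric here is an assumption rather than something stated in this post: a 24 fps output rate, frame counts of the form 4*k + 1, a 1280-pixel longest side for 720p, and dimensions rounded down to multiples of 16. The helper names frame_count and fit_resolution are hypothetical and are not ComfyUI nodes.

```python
def frame_count(seconds, fps=24):
    """Smallest frame count of the form 4*k + 1 that covers `seconds` at `fps`.

    Assumption: the video latent compresses time by 4, so valid lengths are 4*k + 1.
    """
    raw = max(1, round(seconds * fps))
    k = -(-(raw - 1) // 4)  # ceiling division of (raw - 1) by 4
    return 4 * k + 1


def fit_resolution(width, height, max_side=1280, multiple=16):
    """Shrink (width, height) so the longer side fits max_side, keeping the
    aspect ratio, then round each side down to a multiple of `multiple`."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    new_w = max(multiple, width // multiple * multiple)
    new_h = max(multiple, height // multiple * multiple)
    return new_w, new_h


if __name__ == "__main__":
    print(frame_count(5))               # frames for a 5-second clip
    print(fit_resolution(1920, 1080))   # target size for a 1080p source image
```

The resulting values are what you would type into the resize node and the frame-count field by hand; adjust the defaults if your setup uses a different frame rate or size constraint.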
Step 2: Second Sampling Group
The second sampling group enhances the results further by refining latent data from the first pass:
Latent Data Transfer: Take the latent image data generated by the first sampler and feed it into a second sampler node.
Refinement Settings: Increase the step count or adjust other parameters in the second sampler to sharpen details and improve motion coherence. For instance, increasing the step count to 50 can yield better facial clarity and smoother transitions.
Comparison Output: Compare the outputs of both sampling groups. Typically, the second sampler produces sharper lines, richer colors, and more stable character movements.
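The chaining described in Step 2 can be pictured as a simple loop in which each pass receives the latent produced by the previous one. This is only a conceptual sketch under stated assumptions: SamplerPass, run_two_pass, and sample_fn are hypothetical names standing in for the sampler nodes in the workflow, the step counts (30 and 50) come from this post, and the denoise values are illustrative guesses that the post does not specify.

```python
from dataclasses import dataclass


@dataclass
class SamplerPass:
    name: str
    steps: int
    denoise: float  # 1.0 = generate from scratch; lower values only partly re-noise


# Step counts follow the post; denoise values are illustrative assumptions.
FIRST_PASS = SamplerPass("initial", steps=30, denoise=1.0)
SECOND_PASS = SamplerPass("refine", steps=50, denoise=0.5)


def run_two_pass(latent, sample_fn, passes=(FIRST_PASS, SECOND_PASS)):
    """Feed the latent from each pass into the next and keep every result,
    so the outputs of both sampling groups can be compared afterwards."""
    outputs = []
    for cfg in passes:
        latent = sample_fn(latent, cfg)  # sample_fn stands in for a sampler node
        outputs.append((cfg.name, latent))
    return outputs


if __name__ == "__main__":
    # A fake sampler that just records which pass touched the "latent".
    def fake_sampler(latent, cfg):
        return latent + [f"{cfg.name}:{cfg.steps}"]

    for name, result in run_two_pass([], fake_sampler):
        print(name, result)
```

Keeping both intermediate results mirrors the comparison step above: the first entry is the raw first-pass output and the second is the refined one.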
Benefits of Two Sampling Groups
Using two sampling groups offers several advantages:
Enhanced Quality: The second sampler polishes the raw output from the first, resulting in clearer visuals and reduced artifacts.
Better Motion Stability: Characters and objects exhibit more natural movements, avoiding issues like morphing or broken parts.
Flexibility: You can experiment with different settings in each sampler to achieve unique effects tailored to your project needs.
For more detailed guidance, visit the Hunyuan Video documentation and stay tuned for future updates!
The workflow file is attached below. Have fun :)
Comments
Which node did this occurred?
Benjamin Law
2025-03-09 20:18:24 +0000 UTC
i got an error: teacache_hunyuanvideo_forward() takes from 8 to 11 positional arguments but 12 were given
O S
2025-03-09 18:29:23 +0000 UTC
As I mentioned in the video, its Reward Lora , i did another post on here that is about this Lora https://www.patreon.com/posts/122549388?utm_campaign=postshare_creator&utm_content=android_share Thanks
Benjamin Law
2025-03-07 16:19:14 +0000 UTC
Where does one find the file: HYVrewardMPS_epoch40.safetensors ? It's referenced in the workflow but not in the tutorial.
Kevin Gregory
2025-03-07 16:17:01 +0000 UTC