NokiMo
Innovate Futures @ Benji


[For Patreon Supporters] Wan 2.2 SoundToVideo - I2V & V2V Workflow (Ver. 20250828)



Related Post : https://www.patreon.com/posts/wan-2-2-sound-to-137551273

Tutorial Video : https://youtu.be/MegoM8KSO_s

The two workflows demonstrated in the tutorial video are attached.




Resources:

Wan 2.2 S2V in ComfyUI

https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files

Models You Need to Get This Working

-----------------------------------

models/diffusion_models

-----------------------------------

wan2.2_s2v_14B_bf16 (For High VRAM):

https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_s2v_14B_bf16.safetensors

WanVideo_comfy_fp8_scaled/S2V (For Low VRAM):

https://huggingface.co/Kijai/WanVideo_comfy_fp8_scaled/tree/main/S2V

Wan2.2-S2V-14B-GGUF (For Low VRAM):

https://huggingface.co/QuantStack/Wan2.2-S2V-14B-GGUF

models/audio_encoders

-----------------------------------

wav2vec2_large_english_fp16 :

https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/audio_encoders/wav2vec2_large_english_fp16.safetensors

models/loras

-----------------------------------

wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors :

https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors

models/text_encoders

-----------------------------------

https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders

models/vae

-----------------------------------

https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/vae

https://github.com/benjiyaya/ComfyUI-Logic



Optional, for audio separation : https://huggingface.co/Kijai/MelBandRoFormer_comfy/tree/main
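The folder layout above can be sketched as a small Python helper that maps each file linked by name to its target folder. This is only an illustration, not part of the workflow: the `ComfyUI` root path and the function names are assumptions, and the LoRA folder is written as `loras`, ComfyUI's default folder name. The one fact it relies on is that a Hugging Face "blob" page URL becomes a direct download link when `/blob/` is replaced with `/resolve/`. The text encoder and VAE links point to folders rather than single files, so they are left out.

```python
from pathlib import Path
from urllib.parse import urlparse

REPO = "https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files"

# Only the files that the post links to by exact name, keyed by target subfolder.
MODEL_PAGES = {
    "diffusion_models": f"{REPO}/diffusion_models/wan2.2_s2v_14B_bf16.safetensors",
    "audio_encoders": f"{REPO}/audio_encoders/wav2vec2_large_english_fp16.safetensors",
    "loras": f"{REPO}/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors",
}


def to_direct_url(blob_url: str) -> str:
    """Turn a Hugging Face 'blob' page URL into a direct-download 'resolve' URL."""
    return blob_url.replace("/blob/", "/resolve/", 1)


def target_path(comfy_root: str, subfolder: str, url: str) -> Path:
    """Where the file should end up: <comfy_root>/models/<subfolder>/<filename>."""
    filename = Path(urlparse(url).path).name
    return Path(comfy_root) / "models" / subfolder / filename


def download_plan(comfy_root: str) -> list[tuple[str, Path]]:
    """Pair every direct-download URL with its destination path."""
    return [
        (to_direct_url(url), target_path(comfy_root, subfolder, url))
        for subfolder, url in MODEL_PAGES.items()
    ]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own install folder.
    for url, dest in download_plan("ComfyUI"):
        status = "found" if dest.exists() else "missing"
        print(f"[{status}] {dest}\n    from {url}")
```

Running it only prints the plan and whether each file is already in place; it does not download anything, since the model files are large and you may prefer the fp8 or GGUF variant instead of bf16.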

It's hard to explain everything in one video, so feel free to leave your questions in the comment section or discuss them in the Patreon Discord.

Comments

This question is too broad; a lot of factors can cause this.

Benjamin Law

When I set it to 720p, the video generation freezes when it gets to the KSampler. Any clue?

Nicolas Giarrusso

Installation-wise, yes: with the native nodes, everything is already packaged. Hardware-wise, no: Wan 2.2 requires a lot more VRAM for compute. Some YouTubers falsely claim you can run Wan 2.2 on 8 GB of VRAM. Really? Generating a little 3-5 second clip took him half an hour. InfiniteTalk requires less VRAM. For lip-sync quality at 480p, it's 50/50 between the two. At 720p, Wan 2.2 gives somewhat more detail, but not much.

Benjamin

I hate to ask, but overall, is Wan 2.2 S2V even worth it compared to InfiniteTalk? I ask because not even you sounded too impressed with it in the video. Are there any advantages to using Wan S2V over InfiniteTalk?

Russ Ader

