NokiMo
Innovate Futures @ Benji

Hunyuan Video - Installation Steps on a GPU Server

If you want speed like me, use either an H100 or an A100 (with 80GB VRAM) for new AI video models such as Hunyuan Video. GPU servers like these are available on Runpod and Modal.
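Before installing anything, it can help to confirm that the rented server really has an 80GB-class card. The following is a minimal sketch, not part of the official setup: it parses the output of nvidia-smi, and the 80,000 MiB threshold and the function names are my own assumptions.

```python
import subprocess

# Assumed threshold: an "80GB" card reports a bit over 80,000 MiB in nvidia-smi.
MIN_VRAM_MIB = 80_000


def parse_gpu_memory(csv_text):
    """Parse 'name, memory.total' CSV rows into (name, MiB) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # skip blank rows
        name, mem = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def query_gpus():
    """Ask nvidia-smi for each GPU's name and total memory in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


def big_enough(gpus, min_mib=MIN_VRAM_MIB):
    """Return only the GPUs whose total memory meets the threshold."""
    return [gpu for gpu in gpus if gpu[1] >= min_mib]
```

On the server, calling big_enough(query_gpus()) should return a non-empty list if at least one card is in the 80GB class.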

Here are the command-line steps to set up Hunyuan Video:

1 - Clone the project

git clone https://github.com/Tencent/HunyuanVideo && cd HunyuanVideo

2 - Set up a virtual environment

If you want to set up a virtual environment (optional):

bash Anaconda-latest-Linux-x86_64.sh

conda env create -f environment.yml

conda activate HunyuanVideo

conda install gcc_linux-64 gxx_linux-64 -y

conda install cuda -c nvidia -y

3 - Install libraries

python -m pip install -r requirements.txt

pip install packaging

pip uninstall -y ninja && pip install ninja

python -m pip install git+https://github.com/Dao-AILab/flash-attention.git@v2.5.9.post1
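A quick way to confirm that step 3 worked is to check that the key packages can be found from Python. This is only a sketch under my own assumptions: the list of import names is a guess at what matters most (flash-attention is imported as flash_attn), and missing_modules is a hypothetical helper, not something from the HunyuanVideo repo.

```python
import importlib.util

# Assumed import names for the packages installed above
# (the flash-attention package is imported as "flash_attn").
REQUIRED = ["torch", "packaging", "ninja", "flash_attn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Run missing_modules(REQUIRED) inside the activated environment. An empty list means all four packages were found. Anything listed needs to be reinstalled before moving on.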

4 - Download the models from Hugging Face

huggingface-cli login

(Get a Read token from huggingface.co and paste it when prompted.)

huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts

(You are already inside the HunyuanVideo folder from step 1, so go straight into ckpts.)

cd ckpts

huggingface-cli download xtuner/llava-llama-3-8b-v1_1-transformers --local-dir ./llava-llama-3-8b-v1_1-transformers

cd ../

python hyvideo/utils/preprocess_text_encoder_tokenizer_utils.py --input_dir ckpts/llava-llama-3-8b-v1_1-transformers --output_dir ckpts/text_encoder

cd ckpts

huggingface-cli download openai/clip-vit-large-patch14 --local-dir ./text_encoder_2
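The downloads are large, so it is worth checking that the folders exist before trying to generate anything. Here is a small sketch that checks only the three folder names used in the commands above. It does not validate the files inside them, and missing_folders is a hypothetical helper of my own.

```python
from pathlib import Path

# Folder names taken from the download and preprocess commands above.
EXPECTED = [
    "llava-llama-3-8b-v1_1-transformers",
    "text_encoder",
    "text_encoder_2",
]


def missing_folders(ckpts_dir, names=EXPECTED):
    """Return the expected sub-folders that are absent or empty under ckpts_dir."""
    root = Path(ckpts_dir)
    missing = []
    for name in names:
        folder = root / name
        # A folder counts as missing if it does not exist or has nothing in it.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing
```

From the main HunyuanVideo folder, missing_folders("ckpts") should return an empty list once all downloads and the preprocess step have finished.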

5 - After installing everything, you should be ready to get started.

Go back to the main HunyuanVideo folder (you should currently be in ckpts):

cd ..

Run the Text2Video script with your video generation parameters:

python3 sample_video.py --video-size 720 1280 --video-length 129 --infer-steps 30 --prompt "a cat is running, realistic." --flow-reverse --seed 0 --use-cpu-offload --save-path ./results

*** Replace the text prompt "a cat is running, realistic." with your own text prompt.
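If you plan to try many prompts or seeds, it may be easier to drive the same command from Python. This is a sketch only: it reuses exactly the flags from the command above, and build_command and generate are hypothetical helper names of my own, not part of the HunyuanVideo repo.

```python
import subprocess


def build_command(prompt, video_size=(720, 1280), video_length=129,
                  infer_steps=30, seed=0, save_path="./results"):
    """Build the sample_video.py argument list with the flags used in this post."""
    return [
        "python3", "sample_video.py",
        "--video-size", str(video_size[0]), str(video_size[1]),
        "--video-length", str(video_length),
        "--infer-steps", str(infer_steps),
        "--prompt", prompt,
        "--flow-reverse",
        "--seed", str(seed),
        "--use-cpu-offload",
        "--save-path", save_path,
    ]


def generate(prompt, **options):
    """Run one generation from the HunyuanVideo folder; raises if the script fails."""
    subprocess.run(build_command(prompt, **options), check=True)
```

For example, calling generate("a cat is running, realistic.", seed=1) from the main folder should repeat the run above with a different seed. Passing the prompt as a list element avoids shell quoting problems with commas and quotes in the text.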

Have Fun!

Comments

Every creation goes through its process and all I get is a black image.

2thecurve

For Runpod, beware that the PyTorch version must be higher than 11.8.

Benjamin Law

Thanks for your work. I am testing it today on Runpod to see whether we can get it running by following your article.

Bertrand Samimi

For commercial solutions, I like Runway and Kling the most. I still can't consider Hailuo commercial-ready in terms of the video features it has. It also always creates crazy actions for characters. For Tencent's video model, I think it sits between Runway and Kling in video quality. So let's wait for their img2vid model weights release to do a full evaluation.

Benjamin Law

These are exciting times for the community supporting and championing open source video AI. In your experimentation and use, what's your take on how it compares to the commercial solutions (Runway, Kling, Hailuo, Luma)? Would love to know. Thanks

Yu Yeh

I did it this way, which also follows the official site's method. You can try it on Windows, but Flash Attention is not available on Windows. Check out the video I just posted about it.

Benjamin Law

Does this mean we need to be running Linux?... "conda install gcc_linux-64 gxx_linux-64 -y conda install cuda -c nvidia -y" I see on the GitHub that the instruction instead says... "python -m pip install -r requirements.txt"

OhWow

