CosyVoice2-TTS One-Click Installer for Windows: Effortless AI Voice Synthesis for Beginners
- Download Link: Download from HuggingFace.co
Have you been amazed by Alibaba's open-source CosyVoice2 AI text-to-speech technology but put off by the complex and error-prone installation process?
Don't worry, this one-click installer is tailor-made for you!
With it, you don't need to install Python or struggle with various complicated errors. Just a few simple steps on your Windows 10 or Windows 11 system, and you can easily experience top-tier AI voice synthesis technology.
A Quick Look at the Power of CosyVoice2
CosyVoice2 is a powerful multilingual text-to-speech model that generates exceptionally accurate, stable, and naturally fluent speech.
- Multi-language Support: Includes Chinese, English, Japanese, and Korean, as well as various Chinese dialects like Cantonese, Sichuanese, and Shanghainese.
- Cross-Lingual Voice Cloning: You can use a Chinese voice sample to make it speak authentic English, and vice versa.
- Ultra-Low Latency: Responds incredibly fast, with generated audio available in as little as 150 milliseconds.
- More Accurate Pronunciation: Compared to its predecessor, the pronunciation error rate is reduced by 30%-50%.
- Highly Stable Timbre: Maintains voice consistency and stability no matter how it's used.
- Emotion and Accent Control: Supports finer control over emotions and adjustments to accents, making the voice more expressive.
🚀 Start Your AI Voice Journey in Just Three Steps
Step 1: Download the All-in-One Package
First, download the package file named cosyvoice2-win.7z. We provide two download channels; choose whichever is faster for you:
- Download Link: Download from HuggingFace.co
Special Note: This is a .7z format archive. If your computer cannot open it directly or you encounter errors during extraction, we recommend installing a free decompression tool such as 360 Zip or Bandizip before trying again.
Step 2: Unzip the File
After the download is complete, locate the archive file. Right-click on it and select "Extract Here" or "Extract to cosyvoice2". After extraction, you will get a new folder with the same name.
Step 3: Double-Click to Start!
Open the folder you just extracted and find a file named 双击启动.bat (which means "Double-click to start.bat").
Simply double-click it with your mouse, and the program will start running!
What Happens After Double-Clicking?
A black window will pop up (we call it the "Command Prompt"). Please do not close this window, as the program is handling everything for you in the background:
- Automatic Model Download: The program will first check if the required AI model files (several GB in size) are present. If any files are missing, it will automatically start downloading them. You will see the download progress in the window. This process can take a long time depending on your internet speed, so please be patient.
  Network Tip: If the download fails midway and you want to try again, first navigate to the pretrained_models folder, delete the incomplete model folder inside, and then run 双击启动.bat again.
- Start Core Service: Once the models are ready, the program will automatically launch the WebUI service. This is the user interface you will use for voice synthesis.
- See the Success Message: Continue to wait until you see a message similar to the one below in the black window. This means you're all set!
Running on local URL: http://127.0.0.1:8000
To create a public link, set `share=True` in `launch()`.
This indicates that CosyVoice2 is now running successfully on your computer!
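For readers comfortable with a little scripting, the Network Tip above (deleting an incomplete model folder before retrying) can be sketched in Python. This is only an illustration, not part of the package: the function name `remove_model_folder` is hypothetical, and the only thing taken from this guide is the `pretrained_models` folder name. Deleting the folder by hand in File Explorer works just as well.

```python
import shutil
from pathlib import Path


def remove_model_folder(package_dir: str, model_name: str) -> bool:
    """Delete pretrained_models/<model_name> under package_dir.

    Returns True if a folder was removed, False if it did not exist.
    (Hypothetical helper; the package itself does not ship this function.)
    """
    models_root = (Path(package_dir) / "pretrained_models").resolve()
    target = (models_root / model_name).resolve()
    # Safety check: only accept a folder that sits directly inside pretrained_models.
    if target.parent != models_root:
        raise ValueError(f"{model_name!r} is not a folder directly inside pretrained_models")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

Calling `remove_model_folder(r"D:\cosyvoice2", "CosyVoice2-0.5B")` (the drive path is just an example) would clear that model so the next run of 双击启动.bat downloads it again.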
💻 Start Your AI Voice Creation
Keep the black window open, then open your web browser (Chrome or Edge is recommended) and type the following into the address bar at the top:
http://127.0.0.1:8000
Press Enter, and you will see the clean and powerful user interface. Now you can explore freely, input text, upload voice samples, and generate unique AI voices!
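If the page does not load, it can help to check whether anything is answering at that address at all. The following is a minimal sketch using only the Python standard library; the function name `webui_is_up` is made up for this example, and the URL is the one shown in the success message above. It deliberately ignores system proxy settings so that the request goes straight to the local machine.

```python
import urllib.error
import urllib.request


def webui_is_up(url: str = "http://127.0.0.1:8000", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # An empty ProxyHandler disables any configured proxy for this request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # A server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False
```

A result of `False` while the black window shows the success message suggests that something on the machine is blocking local connections rather than that the program failed to start.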
How to close the program? It's very simple. When you are finished, just close the black command prompt window that has been running.
🔧 Advanced Usage: Switching Between Different Voice Models
This package comes with several models, each with different characteristics. The default is the most comprehensive one, the CosyVoice2-0.5B model. You can switch manually if you have specific needs.
- CosyVoice-300M-SFT: Must be used if you want to use the various built-in preset voices.
- CosyVoice-300M-Instruct: Must be used if you want to control the voice with text descriptions (e.g., "say it in a gentle tone").
- CosyVoice2-0.5B: The latest and most powerful model, with the best overall performance (default).
- CosyVoice-300M: A base model.
How to switch:
- In the folder, find the 双击启动.bat file, right-click on it, and select "Edit". (If you don't see "Edit", choose "Open with" -> "Notepad".)
- You will see the following lines of code:
  call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice2-0.5B
  rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M
  rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-Instruct
  rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-SFT
- Here, rem means "remark" or "comment," indicating that the line is currently inactive.
  - To disable the current model: Add rem followed by a space to the beginning of its line.
  - To enable the target model: Remove the rem from the beginning of its line.
- After making your changes, save and close Notepad, then run 双击启动.bat again (you must close any running instance first).
For example, to switch to the CosyVoice-300M-SFT model, you would change the file to look like this:
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice2-0.5B
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M
rem call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-Instruct
call %cd%/pybin/python webui.py --model_dir pretrained_models/CosyVoice-300M-SFT
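The same edit can be expressed as a small text transformation, which may be handy if you switch models often. The sketch below is an illustration only: `switch_model` and `LAUNCH_LINE` are hypothetical names, and it assumes the launch lines look like the ones shown above (a `call ... --model_dir pretrained_models/<model>` command, optionally prefixed with rem). It works on the text of the file and leaves reading and writing the file, in whatever encoding it uses, to you.

```python
import re

# Matches a launch line, optionally commented out with "rem",
# and captures the command (group 1) and the model name (group 2).
LAUNCH_LINE = re.compile(
    r"^\s*(?:rem\s+)?(call\s+.*--model_dir\s+pretrained_models/(\S+))\s*$",
    re.IGNORECASE,
)


def switch_model(bat_text: str, target: str) -> str:
    """Return bat_text with only the launch line for `target` left active."""
    out = []
    found = False
    for line in bat_text.splitlines():
        match = LAUNCH_LINE.match(line)
        if match is None:
            out.append(line)  # not a launch line: keep it unchanged
            continue
        command, model = match.group(1), match.group(2)
        if model == target:
            out.append(command)  # enable: no "rem" prefix
            found = True
        else:
            out.append("rem " + command)  # disable: prefix with "rem "
    if not found:
        raise ValueError(f"no launch line found for model {target!r}")
    return "\n".join(out) + "\n"
```

Passing the default file contents and `"CosyVoice-300M-SFT"` produces the four lines shown in the example above. Keep a backup copy of 双击启动.bat before overwriting it with the result.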
❓ Frequently Asked Questions (FAQ)
Q: The program flashes and closes immediately after starting, or I see a ValueError: When localhost is not accessible... error in the black window. What should I do?
A: This usually happens because a network proxy or VPN (such as some game accelerators) is enabled on your computer. Such software can interfere with connections to the local address (127.0.0.1) that the program needs to reach.
Solution: Please disable your VPN or network proxy software and then double-click the program to start it again.
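If you are unsure whether a proxy is active, one rough way to check is to ask Python which proxy settings it can see. This is a sketch under assumptions, not a complete diagnostic: `summarize_proxies` is a hypothetical helper, and `urllib.request.getproxies()` only reports proxies configured through environment variables or system settings, so VPN software that works at the network-adapter level may not show up here.

```python
import urllib.request


def summarize_proxies(proxies: dict) -> str:
    """Turn a scheme -> proxy address mapping into a short readable report."""
    if not proxies:
        return "No proxy settings detected."
    lines = [f"{scheme}: {address}" for scheme, address in sorted(proxies.items())]
    return "Proxy settings detected:\n" + "\n".join(lines)


if __name__ == "__main__":
    # getproxies() reads proxy environment variables and, on Windows, system settings.
    print(summarize_proxies(urllib.request.getproxies()))
```

If the report lists any proxies, turning that software off before starting the program is the first thing to try.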
For Advanced Users: API Integration
The package also includes a run-api.bat file. If you are a developer and wish to integrate CosyVoice2's speech synthesis capabilities into other programs (such as pyVideoTrans), you can double-click this file to start the API service.