Using the deepgram.com Speech Recognition API | pyVideoTrans - Open Source Video Translation Tool - pyvideotrans.com - github.com/jianchang512/pyvideotrans

Using the deepgram.com Speech Recognition API

Starting with v2.92, pyVideoTrans supports the deepgram.com speech recognition API. Deepgram is an overseas AI service that offers a $200 credit upon registration, which is enough to use the service for some time.

Open https://deepgram.com/, register or log in, and then go to the console at https://console.deepgram.com/

After logging in, click the large green "Create API Key" button in the console.

A dialog window will pop up after clicking.

In that window, enter any English letters in the first text box, and then click the button at the bottom. The secret key (SK) will then be displayed; remember to copy it.

In pyVideoTrans, open Menu -- Speech Recognition Settings -- Deepgram to bring up the Deepgram settings window.

API Key: Paste the key copied in the previous step into the API Key field.
Silence Duration: The default value is 200, i.e., 200 ms, and can usually be kept. If the speech in the video is fast, reduce it to around 150. If the speech is slower with longer pauses, increase it to 500 or 800.
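
To make the two settings above more concrete, here is a minimal sketch of how an API key and a silence duration could be turned into a request to Deepgram's pre-recorded transcription endpoint. It only builds the request and does not send it. The helper name `build_deepgram_request` is hypothetical, and the mapping from the Silence Duration setting (milliseconds) to Deepgram's `utt_split` query parameter (seconds) is an assumption made for illustration, not a description of what pyVideoTrans does internally.

```python
from urllib.parse import urlencode
from urllib.request import Request

DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_deepgram_request(api_key: str, audio: bytes,
                           silence_ms: int = 200,
                           language: str = "en") -> Request:
    """Build (but do not send) a transcription request for WAV audio.

    Hypothetical helper for illustration only.
    """
    query = urlencode({
        "language": language,
        "punctuate": "true",
        "utterances": "true",
        # Assumption: Silence Duration (ms) corresponds to utt_split (seconds).
        "utt_split": silence_ms / 1000,
    })
    return Request(
        f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=audio,
        method="POST",
        headers={
            # Deepgram expects the key in the form "Token <key>".
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the audio. Inside pyVideoTrans this is unnecessary, since the software handles the call once the key is saved in the settings window.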

Note: Deepgram's support for Chinese is limited. Both the subtitles returned directly by Deepgram and those re-segmented from character-level timestamps lack punctuation, which leads to poor subtitle segmentation. To improve this, the Ali Chinese Punctuation Restoration Model is used to re-segment the text. To enable it, select "Chinese Re-segmentation" in the software interface.
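
The sketch below illustrates why timestamp-only re-segmentation is limited. With no punctuation, the only available cue is the pause between consecutive words or characters, so the text is split wherever that pause reaches the silence threshold. The `Word` class and the `split_on_silence` function are hypothetical names used for illustration, and this is not pyVideoTrans's actual segmentation code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def split_on_silence(words: List[Word], silence_ms: int = 200,
                     joiner: str = " ") -> List[Dict]:
    """Group timestamped words into segments, splitting at long pauses."""
    groups: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        # Start a new segment when the pause since the previous word
        # is at least the silence threshold.
        if current and (word.start - current[-1].end) * 1000 >= silence_ms:
            groups.append(current)
            current = []
        current.append(word)
    if current:
        groups.append(current)
    return [
        {
            "start": g[0].start,
            "end": g[-1].end,
            "text": joiner.join(w.text for w in g),
        }
        for g in groups
    ]
```

For Chinese, `joiner=""` would be used, since characters are not separated by spaces. A lower threshold produces more, shorter subtitle lines, and a higher one produces fewer, longer lines. This matches the advice to lower the value for fast speech and raise it for slow speech.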