
Using the deepgram.com Speech Recognition API

Starting with v2.92, the software supports the deepgram.com speech recognition API. Deepgram is an overseas AI service that offers a $200 credit upon registration, which covers a fair amount of usage.

  1. Open https://deepgram.com/, register or log in, then go to the console at https://console.deepgram.com/

  2. After logging in, click the large green "Create API Key" button in the console.

A dialog pops up after you click it. Enter any name made of English letters in the first text box, then click the button at the bottom. The secret key (SK) is then displayed; be sure to copy it.

  3. Open the Deepgram window via Menu -- Speech Recognition Settings -- Deepgram and fill in the following fields:

  • API Key: Paste the key you copied in the previous step.

  • Silence Duration: The default value of 200 (i.e., 200 ms) usually works. If the speech in the video is fast, reduce it to around 150. If the speech is slower with longer pauses, increase it to 500 or 800.
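
For readers who want to see how these two settings might reach Deepgram, here is a minimal sketch that builds the URL and headers for a pre-recorded transcription request. It assumes the Silence Duration value is passed as Deepgram's `utt_split` query parameter (which is expressed in seconds); the source does not confirm that mapping. The helper name `build_deepgram_request`, the `nova-2` model choice, and the `audio/wav` content type are likewise illustrative assumptions, not the software's actual code.

```python
from urllib.parse import urlencode

# Deepgram's pre-recorded transcription endpoint.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_deepgram_request(api_key: str, silence_ms: int = 200, model: str = "nova-2"):
    """Return (url, headers) for a transcription request.

    Hypothetical helper: it assumes "Silence Duration" maps to the
    `utt_split` parameter, converted from milliseconds to seconds.
    """
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    if silence_ms <= 0:
        raise ValueError("silence_ms must be a positive number of milliseconds")

    params = {
        "model": model,                  # assumed model name
        "punctuate": "true",             # ask Deepgram to add punctuation
        "utterances": "true",            # return utterance-level segments
        "utt_split": silence_ms / 1000,  # pause length in seconds
    }
    url = f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"
    headers = {
        "Authorization": f"Token {key}",  # Deepgram API keys use the Token scheme
        "Content-Type": "audio/wav",      # assumed audio format
    }
    return url, headers


if __name__ == "__main__":
    url, headers = build_deepgram_request("YOUR_API_KEY", silence_ms=150)
    print(url)
```

The sketch only prepares the request; sending the audio bytes with an HTTP client is left out so that nothing here depends on a network call.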

  4. Note: Deepgram's support for Chinese is weak. Both the subtitles it returns directly and those re-segmented from character-level timestamps lack punctuation, which leads to poor subtitle segmentation. To compensate, the software re-segments the text with the Ali Chinese punctuation restoration model. Select "Chinese Re-segmentation" in the software interface to enable it.
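
To illustrate what re-segmenting from timestamps can look like, the following sketch groups timestamped tokens into subtitle lines wherever the pause between two tokens reaches the silence threshold. It is not the software's implementation and it does not involve the Ali punctuation model; the `Word` type and the `split_on_silence` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One recognized token with its start and end time in seconds."""
    text: str
    start: float
    end: float


def split_on_silence(
    words: List[Word], silence_ms: int = 200, joiner: str = " "
) -> List[Tuple[float, float, str]]:
    """Group tokens into (start, end, text) segments.

    A new segment starts whenever the gap between the end of the previous
    token and the start of the next one is at least `silence_ms`.
    Use joiner="" for Chinese text, which has no spaces between characters.
    """
    threshold = silence_ms / 1000  # convert milliseconds to seconds
    groups: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        if current and word.start - current[-1].end >= threshold:
            groups.append(current)
            current = []
        current.append(word)
    if current:
        groups.append(current)
    return [
        (group[0].start, group[-1].end, joiner.join(w.text for w in group))
        for group in groups
    ]


if __name__ == "__main__":
    demo = [
        Word("hello", 0.0, 0.4),
        Word("world", 0.45, 0.9),
        Word("next", 1.5, 1.9),
    ]
    for start, end, text in split_on_silence(demo, silence_ms=200):
        print(f"{start:.2f} --> {end:.2f}  {text}")
```

Pause-based grouping like this has no notion of sentence boundaries, which is why text without punctuation tends to segment poorly and why a punctuation restoration step helps for Chinese.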