High Altitude Oolong: Trying a Web Front End for Whisper

2024/11/19

Let's give it a try first.

Whisper-WebUI uses Gradio to provide a web front end for Whisper.
I used Python 3.11, then cloned the source and installed the requirements:

git clone https://github.com/jhj0517/Whisper-WebUI.git
cd Whisper-WebUI
pip install -r requirements.txt

Since I use conda, I commented out the venv/bin/activate line in start-webui.sh.
Then I started it as a public server:

./start-webui.sh --server_name=0.0.0.0 --inbrowser=false

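I dragged a video onto the web page, selected large-v2, chose srt as the output format, and started...
Judging by the console output, it begins by downloading the model, which is why the progress bar on the web page does not move.
Once the model download finished, the transcription started...
and then this error appeared: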

Unable to load any of {libcudnn_ops.so.9.1.0, libcudnn_ops.so.9.1, libcudnn_ops.so.9, libcudnn_ops.so}

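The fix was to pin ctranslate2 to 4.4.0:

pip install ctranslate2==4.4.0

After that, the error was gone. ctranslate2 4.4.0 targets cuDNN 8, whereas later releases require cuDNN 9, which was apparently not installed here.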
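
To see whether the libraries named in the error are visible to the dynamic loader at all, something like the following can help. It is only a minimal sketch: the helper name `first_loadable` is made up for illustration and is not part of Whisper-WebUI or ctranslate2, and the library names are simply copied from the error message above.

```python
import ctypes

# Library names copied verbatim from the error message above.
CUDNN_CANDIDATES = [
    "libcudnn_ops.so.9.1.0",
    "libcudnn_ops.so.9.1",
    "libcudnn_ops.so.9",
    "libcudnn_ops.so",
]


def first_loadable(names):
    """Return the first library name the dynamic loader can open, else None.

    Hypothetical helper for illustration only.
    """
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            # Not found (or not loadable) on this machine; try the next name.
            continue
        return name
    return None


if __name__ == "__main__":
    print("loadable cuDNN ops library:", first_loadable(CUDNN_CANDIDATES))
```

If this prints `None`, no cuDNN 9 ops library is on the loader path, which would match the error and explain why pinning ctranslate2 to a cuDNN 8 build helps.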

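After actually testing the various models, I found that bigger is not always better.
When transcribing a long video (2:30), the text from about the 1:00 mark onward goes wrong and keeps repeating.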
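
To find where the repetition starts without scrolling through the whole subtitle file, the srt output can be scanned for runs of identical consecutive cues. The following is a rough sketch, not something Whisper-WebUI provides: `parse_srt`, `find_repeats`, and the `min_run` threshold are names and defaults chosen for illustration, and it assumes a plain, well-formed srt file.

```python
import re


def parse_srt(text):
    """Parse SRT text into a list of (start_timestamp, subtitle_text) tuples."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # Locate the timing line, e.g. "00:01:00,000 --> 00:01:02,000".
        for i, ln in enumerate(lines):
            if "-->" in ln:
                start = ln.split("-->")[0].strip()
                cues.append((start, " ".join(lines[i + 1:])))
                break
    return cues


def find_repeats(cues, min_run=3):
    """Return (start_timestamp, text, run_length) for every run of at least
    min_run consecutive cues that carry identical, non-empty text."""
    runs = []
    i = 0
    while i < len(cues):
        j = i
        # Extend the run while the next cue has the same text.
        while j + 1 < len(cues) and cues[j + 1][1] == cues[i][1]:
            j += 1
        length = j - i + 1
        if length >= min_run and cues[i][1]:
            runs.append((cues[i][0], cues[i][1], length))
        i = j + 1
    return runs
```

Usage would be along the lines of `find_repeats(parse_srt(open("output.srt", encoding="utf-8").read()))`, where `output.srt` is a placeholder file name. The timestamp of the first reported run shows roughly where the transcription began to loop, which makes it easier to compare models on the same video.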
