音视频同步网络 syncnet
https://www.robots.ox.ac.uk/~vgg/software/lipsync/
在视频中消除音频和视觉流之间的时间延迟,以及
确定在视频中的多个面孔中谁在说话。
Dependencies
git clone https://github.com/joonson/syncnet_python.git
conda create -n syncnet_python python==3.10
conda activate syncnet_python
pip install -r requirements.txt
requirements.txt
numpy==1.26.4
opencv-contrib-python==4.9.0.80
opencv-python==4.9.0.80
torch==2.2.2
torchvision==0.17.2
scipy==1.15.2
scenedetect
python_speech_features
Download model
./download_model.sh
Run