panns_inference: audio tagging and sound event detection inference toolbox
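PANNs inference
panns_inference provides an easy-to-use Python interface for audio tagging and sound event detection. The audio tagging and sound event detection models are trained from PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition: https://github.com/qiuqiangkong/audioset_tagging_cnn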
Installation
PyTorch>=1.0 is required.
$ pip install panns-inference
Usage
$ python3 example.py
For example:
import librosa
import panns_inference
from panns_inference import AudioTagging, SoundEventDetection, labels

# Load the clip as mono audio resampled to 32 kHz
audio_path = 'examples/R9_ZSCveAHg_7s.wav'
(audio, _) = librosa.core.load(audio_path, sr=32000, mono=True)
audio = audio[None, :]  # (batch_size, segment_samples)

print('------ Audio tagging ------')
# checkpoint_path=None selects the default pretrained checkpoint
at = AudioTagging(checkpoint_path=None, device='cuda')
# clipwise_output: (batch_size, classes_num), embedding: (batch_size, embedding_size)
(clipwise_output, embedding) = at.inference(audio)

print('------ Sound event detection ------')
sed = SoundEventDetection(checkpoint_path=None, device='cuda')
# framewise_output: (batch_size, time_steps, classes_num)
framewise_output = sed.inference(audio)
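The snippet above returns raw class scores without showing how they map to the label names printed in the results. The following is a minimal sketch of that step, assuming `clipwise_output[0]` is one clip's row of per-class scores and `labels` is the matching sequence of class names. The helper name `top_k_labels` is hypothetical and is not part of panns_inference; the demo values are toy stand-ins, not model output.

```python
def top_k_labels(scores, label_names, k=10):
    """Return the k highest-scoring (label, score) pairs, best first.

    `scores` is one clip's row of class scores; `label_names` is the
    matching sequence of class names. Hypothetical helper for illustration.
    """
    if len(scores) != len(label_names):
        raise ValueError("scores and label_names must have the same length")
    # Sort class indexes by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(label_names[i], float(scores[i])) for i in order[:k]]


# Toy stand-in values, not real model output.
demo_labels = ["Speech", "Music", "Telephone", "Animal"]
demo_scores = [0.893, 0.092, 0.183, 0.009]
for name, score in top_k_labels(demo_scores, demo_labels, k=3):
    print("{}: {:.3f}".format(name, score))
```

With real model output, the call would presumably be `top_k_labels(clipwise_output[0], labels)`.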
Results
------ Audio tagging ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Speech: 0.893
Telephone bell ringing: 0.754
Inside, small room: 0.235
Telephone: 0.183
Music: 0.092
Ringtone: 0.047
Inside, large room or hall: 0.028
Alarm: 0.014
Animal: 0.009
Vehicle: 0.008
------ Sound event detection ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Save fig to results/sed_result.pdf
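Sound event detection produces a score per frame and per class, which is often more useful once converted into onset and offset times. Below is a rough sketch of one common post-processing approach (simple thresholding), not something panns_inference itself provides. The helper name `frames_to_events`, the 0.5 threshold, and the default of 100 frames per second are all assumptions; check the frame rate against the model configuration before relying on the times.

```python
def frames_to_events(frame_probs, threshold=0.5, frames_per_second=100):
    """Group consecutive frames scoring >= threshold into (onset, offset) pairs.

    Times are in seconds. `frames_per_second` is an assumed frame rate;
    hypothetical helper for illustration only.
    """
    events = []
    start = None  # index of the first frame of the currently open event
    for i, p in enumerate(frame_probs):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            events.append((start / frames_per_second, i / frames_per_second))
            start = None
    # Close an event that runs to the end of the clip.
    if start is not None:
        events.append((start / frames_per_second,
                       len(frame_probs) / frames_per_second))
    return events


# Toy per-frame scores for a single class, not real model output.
demo_probs = [0.1, 0.7, 0.8, 0.2, 0.1, 0.9, 0.95]
print(frames_to_events(demo_probs, threshold=0.5, frames_per_second=10))
```

For real output, one class's track could be passed as `framewise_output[0, :, class_index]`, where `class_index` is the position of the class name in `labels`.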
Sound event detection plot:
Cite
[1] Kong, Qiuqiang, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-scale pretrained audio neural networks for audio pattern recognition." arXiv preprint arXiv:1912.10211 (2019).
Project details
Hashes for panns_inference-0.0.7-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 74a527c83d285a3885dcf892d61bb20f82b15217dca8932933a9753d371cc347
MD5 | 0e77f2fbcdf632409c2e09ce37f89f55
BLAKE2b-256 | 36dd2a540d0a8c1300fa0e30f2655b4a17cb133b3ff987b1c488db466729abc0
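The published digests can be used to check a downloaded wheel before installing it. The sketch below shows one way to do that with the standard-library `hashlib` module. The function names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is whatever location the wheel was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are never fully loaded in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash(wheel_path, "74a527c83d285a3885dcf892d61bb20f82b15217dca8932933a9753d371cc347")` should return `True` for an unmodified copy of the 0.0.7 wheel, where `wheel_path` is the local download location.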