Real-time classification of audio - thousands of classes


I need to classify an audio stream in real time into one of several thousand predefined classes, potentially up to 5000.

What's the best machine learning algorithm for this task?

DTW? HMM? SVM? Some specific DSP algorithms?

Posted 2016-07-01T20:07:31.860

Anything that doesn't require extracting expensive features should be fine: something that works with FFTs and MFCCs, such as a modern neural network. Training complexity is not an issue in your case. – Emre – 2016-07-01T20:19:13.463

Could you give some examples of the classes? Is your problem similar to speech recognition? – William – 2016-07-01T20:24:42.323

Yes, it is similar to speech recognition, but the goal is to detect sounds such as "car engine", "scream", "human steps", "dog barks", etc. – Oleg Puzanov – 2016-07-01T20:32:07.553

SVM should work too. – n1tk – 2016-07-03T04:41:20.140
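
As a rough illustration of the pipeline suggested in the comments above (cheap spectral features per frame, then a classifier), here is a minimal Python sketch using only the standard library. It is not a definitive implementation: log band energies stand in for MFCCs, a naive DFT stands in for a proper FFT, and `score_fn` is a hypothetical placeholder for whatever trained model (neural network, SVM, ...) maps a feature vector to per-class scores. The names `log_band_energies` and `StreamingClassifier`, and all parameter values, are assumptions made for the example.

```python
import cmath
import math
from collections import deque


def log_band_energies(frame, n_bands=8):
    """Hann-window a frame, take a naive DFT, and return log energy per band."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    half = n // 2
    # Naive O(n^2) DFT for clarity; a real system would use an FFT library.
    power = []
    for k in range(half):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2)
    band = half // n_bands  # number of DFT bins per band
    return [math.log(sum(power[b * band:(b + 1) * band]) + 1e-10)
            for b in range(n_bands)]


class StreamingClassifier:
    """Buffers incoming samples and emits one smoothed label per hop."""

    def __init__(self, score_fn, frame_len=64, hop=32, alpha=0.5):
        self.score_fn = score_fn  # hypothetical: features -> {label: score}
        self.frame_len = frame_len
        self.hop = hop
        self.alpha = alpha        # weight given to the newest frame's scores
        self.buffer = deque(maxlen=frame_len)
        self.since_last = 0
        self.smoothed = {}

    def push(self, samples):
        """Consume a chunk of samples; return the labels emitted for it."""
        labels = []
        for s in samples:
            self.buffer.append(s)
            self.since_last += 1
            full = len(self.buffer) == self.frame_len
            if full and self.since_last >= self.hop:
                self.since_last = 0
                scores = self.score_fn(log_band_energies(list(self.buffer)))
                # Exponential smoothing of scores to reduce label flicker.
                for label, value in scores.items():
                    old = self.smoothed.get(label, value)
                    self.smoothed[label] = ((1 - self.alpha) * old
                                            + self.alpha * value)
                labels.append(max(self.smoothed, key=self.smoothed.get))
        return labels
```

Each call to `push` accepts whatever chunk size the audio source delivers and returns one label per hop once the first frame is full. The smoothing is just one simple way to stabilize the output, at the cost of some added latency; with thousands of classes, `score_fn` would be the part that dominates the per-frame cost.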

