In this post I describe the neural network architecture that I tried as a bird song classifier.
When I first came up with the idea of building a bird song classifier, I started googling for a training dataset.
I found xeno-canto.org, and the first thing that caught my attention was the spectrograms.
(A spectrogram is a visual representation of how a signal's spectrum evolves over time. The vertical axis represents frequency and the horizontal axis represents time. A bright pixel indicates that, at that particular time, the signal contains energy at that particular frequency.)
Well, spectrograms are ideal for visual pattern matching!
Why would I need to analyse the sound itself when the songs have such expressive visual patterns? Those were my thoughts.
I decided to train a neural net to classify spectrograms.
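
To make the spectrogram idea concrete, here is a minimal sketch of how one could be computed from raw audio samples. It is only an illustration, not necessarily how the xeno-canto images are produced: the use of SciPy, the helper name `compute_spectrogram`, the window size, the overlap, and the synthetic 3 kHz test tone are all my assumptions.

```python
import numpy as np
from scipy import signal


def compute_spectrogram(samples, sample_rate, window_size=512, overlap=256):
    """Return (freqs, times, log_power) for a 1-D audio signal.

    Hypothetical helper: window_size and overlap are illustrative defaults,
    not values taken from any particular dataset.
    """
    freqs, times, power = signal.spectrogram(
        samples, fs=sample_rate, nperseg=window_size, noverlap=overlap
    )
    # Convert power to decibels so quiet and loud parts are both visible;
    # the small constant avoids taking the log of zero.
    log_power = 10.0 * np.log10(power + 1e-10)
    return freqs, times, log_power


# Synthetic stand-in for a recording: one second of a pure 3 kHz tone.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 3000 * t)

freqs, times, spec = compute_spectrogram(tone, sample_rate)

# Rows are frequency bins (vertical axis), columns are time frames
# (horizontal axis). The brightest row should sit close to 3 kHz.
peak_freq = freqs[spec.mean(axis=1).argmax()]
```

The resulting `spec` array is a 2-D image in the sense described above, so it can be fed to an image classifier like any other single-channel picture.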