An Analytical Method for Recognizing Cat Meowing States Utilizing Short-Time Fourier Transform and Vision Transformers

Jian Huang, Yuxia Tang, Peixuan Zhang, Yanhua Liu

Abstract

Sound event detection (SED) is an active research topic in audio signal processing. Its goal is to identify the event categories present in an audio segment and to label the start and end time of each event. Applying SED to animal sound signals is important for understanding animal behavior patterns and monitoring animal status. To address the complex noise environments and low detection accuracy encountered in practical applications, this paper takes complex audio as the object of analysis and explores an animal sound event detection method that combines the short-time Fourier transform (STFT) with deep learning, as exploratory work toward practical animal sound recognition systems. The main contributions are: (1) extracting the characteristics of animal sound events by analyzing spectrogram imaging parameters, and (2) proposing a method for detecting animal-state sound events based on the STFT and deep learning.
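The spectrogram step mentioned in the abstract can be illustrated with a minimal sketch of a short-time Fourier transform. The abstract does not state the imaging parameters, so the frame length, hop size, Hann window, sample rate, and the synthetic test tone below are all assumptions chosen for illustration, and the function name `stft_spectrogram` is hypothetical. This is not the authors' implementation.

```python
import numpy as np


def stft_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    frame_len and hop are illustrative values, not parameters taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each one.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real-input FFT of every frame gives frame_len // 2 + 1 frequency bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10).T  # shape: (freq_bins, n_frames)


# Synthetic stand-in for a recording: one second of a 1 kHz tone at 8 kHz (assumed).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 256  # bin index times frequency resolution
print(spec.shape, peak_hz)
```

The resulting two-dimensional array can be treated as an image, which is the form an image classifier such as the Vision Transformer named in the title would take as input. How the spectrogram is scaled, cropped, or divided into patches is not described in the abstract.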
