Improving acoustic species identification using data augmentation within a deep learning framework

MacIsaac, Jennifer, Newson, Stuart, Ashton-Butt, Adham, Pearce, Huma and Milner, Ben (2024) Improving acoustic species identification using data augmentation within a deep learning framework. Ecological Informatics, 83. ISSN 1574-9541

PDF (MacIsaac_etal_2024_EcologicalInformatics) - Published Version
Available under License Creative Commons Attribution.


Abstract

Convolutional neural networks (CNNs) are effective tools for acoustic classification tasks such as species identification. Developing CNN classifiers requires large datasets of labelled recordings, which can be difficult to obtain, particularly if species are rare or vocalise infrequently. Additionally, data often require manual labelling, which can be time consuming and requires expert analysis. Artificially generating data using augmentation can address these challenges; however, the impact of data augmentation on CNN performance is poorly understood and often omitted in bioacoustic studies. Here, we empirically test the impact of CNN architecture and 20 data augmentation methods on classifier performance. We use acoustic identification of 18 small mammal species as a case study of a species group that can be effectively surveyed by acoustic monitoring, but for which recordings for training data are scarce and difficult to collect. The networks that achieved the highest accuracy across all sample sizes were a 10-layer CNN (Conv10; 96.43 %) and a pre-trained ResNet50 model (96.37 %). Overall, all augmentation effects improved ResNet50 model performance and 17 effects improved Conv10 performance, increasing relative change in accuracy (RCA) by 0.021–0.641. Three augmentation effects negatively impacted Conv10 RCA by −0.042 to −0.182. We also show that adding augmented data when the number of original samples is low has the greatest positive impact on accuracy, and that this effect was larger with ResNet50 models. Our work demonstrates that using data augmentation where few original samples are available can considerably improve model performance, and highlights the potential of augmentation in developing acoustic classifiers for species where data are limited or difficult to obtain.
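
As a rough illustration of what waveform-level augmentation can look like, the sketch below applies two generic effects (additive white noise at a target signal-to-noise ratio, and a circular time shift) to a synthetic tone. These two effects are illustrative only and are not confirmed to be among the 20 methods tested in the paper. The function names, the 192 kHz sample rate, and the 40 kHz test tone are all hypothetical. The `relative_change` helper uses one common definition of relative change, (augmented - baseline) / baseline, which is an assumption here because the abstract does not state how RCA is computed.

```python
import numpy as np


def add_noise(signal: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Add white Gaussian noise so the result has roughly the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def time_shift(signal: np.ndarray, shift: int) -> np.ndarray:
    """Circularly shift the waveform by `shift` samples (wraps around the end)."""
    return np.roll(signal, shift)


def relative_change(baseline_acc: float, augmented_acc: float) -> float:
    """Assumed definition: change in accuracy relative to the baseline accuracy."""
    return (augmented_acc - baseline_acc) / baseline_acc


# Hypothetical input: a 10 ms synthetic 40 kHz tone sampled at 192 kHz.
rng = np.random.default_rng(0)
sample_rate = 192_000
t = np.arange(sample_rate // 100) / sample_rate
call = np.sin(2 * np.pi * 40_000 * t)

# Each augmented copy could be added to the training set alongside the original.
noisy = add_noise(call, snr_db=10.0, rng=rng)
shifted = time_shift(call, shift=100)
```

In a workflow like the one the abstract describes, augmented copies of this kind would be generated from the scarce original recordings before training, and accuracy with and without them would then be compared.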

Item Type: Article
Additional Information: Data availability statement: The source code for this paper is available on GitHub: https://github.com/j-macisaac/bioacoustic-classification-and-data-augmentation.git. Acoustic datasets used during this study are available from the corresponding author on reasonable request. Acknowledgements: This work was supported by the Natural Environment Research Council and the ARIES Doctoral Training Partnership [grant number NE/S007334/1], the Endangered Landscapes and Seascapes Programme, managed by the Cambridge Conservation Initiative in partnership with Arcadia and Frankfurt Zoological Society. Data collection was facilitated by APB and Anton Kuzmickij in Belarus, Kaunas Tadas Ivanauskas Museum of Zoology in Lithuania and Roger Trout in the UK. The research presented in this paper was carried out on the High Performance Computing Cluster supported by the Research and Specialist Computing Support service at the University of East Anglia.
Faculty \ School: Faculty of Science > School of Computing Sciences
Faculty of Science > School of Environmental Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: LivePure Connector
Date Deposited: 17 Oct 2024 09:30
Last Modified: 01 Nov 2024 04:30
URI: https://ueaeprints.uea.ac.uk/id/eprint/97050
DOI: 10.1016/j.ecoinf.2024.102851
