
Christine Erbe • Jeanette A. Thomas
Editors

Exploring Animal Behavior Through Sound: Volume 1
Methods

Editors

Christine Erbe
Centre for Marine Science and Technology
Curtin University
Perth, WA, Australia

Jeanette A. Thomas (deceased)
Moline, IL, USA

ISBN 978-3-030-97538-8
ISBN 978-3-030-97540-1 (eBook)
https://doi.org/10.1007/978-3-030-97540-1

© Springer Nature Switzerland AG 2022. This book is an open access publication.


Jointly published with ASA Press
Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license and indicate if changes were made.
The images or other third party material in this book are included in the book's Creative Commons
license, unless indicated otherwise in a credit line to the material. If material is not included in the
book's Creative Commons license and your intended use is not permitted by statutory regulation
or exceeds the permitted use, you will need to obtain permission directly from the copyright
holder.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are
exempt from the relevant protective laws and regulations and therefore free for general use.
The publishers, the authors, and the editors are safe to assume that the advice and information in
this book are believed to be true and accurate at the date of publication. Neither the publishers nor
the authors or the editors give a warranty, express or implied, with respect to the material
contained herein or for any errors or omissions that may have been made. The publishers
remain neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover photo: Acoustic recording of an Adélie penguin colony at Brown Bluff, Antarctic Sound (© Ole Næsbye Larsen)

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
In loving memory of
Jeanette A. Thomas,
A pioneer of animal bioacoustics,
A role model, mentor, colleague,
And dear friend to many of us.
We miss you, Jeanette.
The ASA Press

ASA Press, which represents a collaboration between the Acoustical Society
of America and Springer Nature, is dedicated to encouraging the publication
of important new books as well as the distribution of classic titles in acoustics.
These titles, published under a dual ASA Press/Springer imprint, are intended
to reflect the full range of research in acoustics. ASA Press titles can include
all types of books that Springer publishes, and may appear in any appropriate
Springer book series.

Editorial Board
Mark F. Hamilton (Chair), University of Texas at Austin
James Cottingham, Coe College
Timothy F. Duda, Woods Hole Oceanographic Institution
Robin Glosemeyer Petrone, Threshold Acoustics
William M. Hartmann (Ex Officio), Michigan State University
Darlene R. Ketten, Boston University
James F. Lynch (Ex Officio), Woods Hole Oceanographic Institution
Philip L. Marston, Washington State University
Arthur N. Popper (Ex Officio), University of Maryland
Christine H. Shadle, Haskins Laboratories
G. Christopher Stecker, Boys Town National Research Hospital
Stephen C. Thompson, The Pennsylvania State University
Ning Xiang, Rensselaer Polytechnic Institute
Preface

The idea for this textbook on Animal Bioacoustics was Jeanette’s. She reached
out to bioacousticians working on the different animal taxa and received great
interest in this book. Experts from around the globe joined her effort, devel-
oping chapters on bioacoustic studies on the diverse animal taxa, from
invertebrates and insects, to amphibians, reptiles, fishes, birds, and mammals.
It soon became obvious that the developing chapters relied on common
background knowledge, techniques, and terminology. The need for a volume
on methods to precede the volume on taxon-specific bioacoustic studies was
identified and this is when I came onboard.
In this volume, Chapter 1 presents a brief history of bioacoustic recording
and equipment. Chapter 2 provides guidance on choosing and calibrating
equipment. Chapter 3 explains how to collect bioacoustic data in the field
and laboratory, and what metadata are important to document. Chapter 4
introduces basic acoustic concepts, standard terminology, quantities and units,
and basic signal processing methods. Chapter 5 delves into the source–path–
receiver model, applied to terrestrial bioacoustic studies, with a comprehen-
sive treatise of sound propagation in terrestrial environments. Chapter 6 is
devoted to the intricacies of sound propagation under water. Chapter 7
explores terrestrial and aquatic soundscapes and introduces basic analysis
tools. Chapter 8 gives an overview of software algorithms for automated
detection and classification of animal sounds. Chapter 9 unravels analytical
and statistical methods for analyzing bioacoustic data. Chapter 10 presents
behavioral and physiological methods for studying animal hearing. The final
three chapters apply the tools presented in the first ten chapters to taxon-
overarching topics. Chapter 11 explores animal acoustic and vibrational
communication. Chapter 12 provides an overview of echolocation in bats,
dolphins, birds, and shrews. And Chapter 13 gives examples of the effects of
noise on animals.
The intended audience includes students and researchers of animal ecology
and, specifically, animal behavior, who wish to add acoustics to their toolbox.
Environmental managers in industry and government, members of
non-governmental organizations concerned with animal conservation, and
regulators of noise might equally find the book useful. The book will
empower its readers to understand and apply the bioacoustic research litera-
ture, design their own studies in the field and laboratory, avoid common
pitfalls and mistakes, choose appropriate equipment, apply different data


analysis methods, correctly interpret their data, adequately archive data for
future applications, and apply their results to management and conservation.
I would like to thank Keith Attenborough, Jay Barlow, Ross Chapman,
Russ Charif, Kurt Fristrup, Karl-Heinz Frommolt, Bob Gisiner, Alan Grinnell,
Shane Guan, Shizuko Hiryu, Dorian Houser, Vincent Janik, Colleen LePrell,
Peter Narins, Eric Rexstad, James Simmons, Hans Slabbekoorn, and Meta
Virant-Doberlet for reviewing one or more chapters in this volume.
A special thank-you goes to Lars Koerner at Springer Verlag in Heidelberg
for his emotional, technical, and editorial support throughout the years, in
particular the final year.
Open access to this book was mostly funded by the Richard Lounsbery
Foundation, as a contribution to the International Quiet Ocean Experiment.
The remainder of fees was covered by the Centre for Marine Science and
Technology at Curtin University, the Cornell Lab of Ornithology, and
l’Université de Toulon. Thank you!
Jeanette A. Thomas was a pioneer of animal bioacoustics. She successfully
straddled both terrestrial and aquatic worlds, studying animals from the
tropics to the poles. This book is a testament to her legacy.

Perth, WA, September 2021
Christine Erbe

The original online version of this Frontmatter and backmatter was revised.
Contents

1 History of Sound Recording and Analysis Equipment . . . . . . 1


Gianni Pavan, Gregory Budney, Holger Klinck, Hervé Glotin,
Dena J. Clink, and Jeanette A. Thomas
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Advances in Recorders . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Advances in Microphones . . . . . . . . . . . . . . . . . . . . . . . 14
1.4 Advances in Hydrophones . . . . . . . . . . . . . . . . . . . . . . 20
1.5 Autonomous Mobile Systems . . . . . . . . . . . . . . . . . . . . 24
1.6 Advances in Sound Analysis Hard- and Software . . . . . . 27
1.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2 Choosing Equipment for Animal Bioacoustic Research . . . . . 37
Shyam Madhusudhana, Gianni Pavan, Lee A. Miller,
William L. Gannon, Anthony Hawkins, Christine Erbe,
Jennifer A. Hamel, and Jeanette A. Thomas
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.2 Basic Concepts of Sound Recording . . . . . . . . . . . . . . . 38
2.3 Instrumentation of Signal Chain Components . . . . . . . . . 40
2.4 Autonomous Recorders . . . . . . . . . . . . . . . . . . . . . . . . 63
2.5 Recording Directly to a Computer . . . . . . . . . . . . . . . . . 66
2.6 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.7 Other Gear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.9 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 81
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3 Collecting, Documenting, and Archiving Bioacoustical
Data and Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
William L. Gannon, Rebecca Dunlop, Anthony Hawkins,
and Jeanette A. Thomas
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.2 Ethical Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.3 Good Practices in Bioacoustical Studies . . . . . . . . . . . . . 89
3.4 Playback Methods and Controls . . . . . . . . . . . . . . . . . . 94
3.5 Considerations for Terrestrial Field Studies . . . . . . . . . . 99
3.6 Considerations for Aquatic Field Studies . . . . . . . . . . . . 99


3.7 Considerations for Studies on Captive Animals . . . . . . . 102


3.8 Digital File Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.9 Data Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
3.10 Archiving Recordings . . . . . . . . . . . . . . . . . . . . . . . . . . 105
3.11 Repositories of Bioacoustical Data . . . . . . . . . . . . . . . . 106
3.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
3.13 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 108
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4 Introduction to Acoustic Terminology and Signal
Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Christine Erbe, Alec Duncan, Lauren Hawkins,
John M. Terhune, and Jeanette A. Thomas
4.1 What Is Sound? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.2 Terms and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.3 Acoustic Signal Processing . . . . . . . . . . . . . . . . . . . . . . 134
4.4 Localization and Tracking . . . . . . . . . . . . . . . . . . . . . . 142
4.5 Symbols and Abbreviations (Table 4.10) . . . . . . . . . . . . 149
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5 Source-Path-Receiver Model for Airborne Sounds . . . . . . . . . 153
Ole Næsbye Larsen, William L. Gannon, Christine Erbe,
Gianni Pavan, and Jeanette A. Thomas
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.2 Sound Propagation in Terrestrial Environments . . . . . . . 155
5.3 The Source-Path-Receiver Model for Animal Acoustic
Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
5.5 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 179
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
6 Introduction to Sound Propagation Under Water . . . . . . . . . 185
Christine Erbe, Alec Duncan, and Kathleen J. Vigness-Raposa
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
6.2 The Sonar Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.3 The Layered Ocean . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
6.4 Propagation Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.5 Practical Acoustic Modeling Examples . . . . . . . . . . . . . 208
6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
6.7 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 214
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
7 Analysis of Soundscapes as an Ecological Tool . . . . . . . . . . . . 217
Renée P. Schoeman, Christine Erbe, Gianni Pavan,
Roberta Righini, and Jeanette A. Thomas
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
7.2 Terrestrial Soundscapes . . . . . . . . . . . . . . . . . . . . . . . . 219

7.3 Aquatic Soundscapes . . . . . . . . . . . . . . . . . . . . . . . . . . 227


7.4 Soundscape Changes Over Space and Time . . . . . . . . . . 235
7.5 How to Analyze Soundscapes . . . . . . . . . . . . . . . . . . . . 239
7.6 Applications of Soundscape Studies . . . . . . . . . . . . . . . 249
7.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
7.8 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 252
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
8 Detection and Classification Methods for Animal Sounds . . . . 269
Julie N. Oswald, Christine Erbe, William L. Gannon,
Shyam Madhusudhana, and Jeanette A. Thomas
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
8.2 Qualitative Naming and Classification of Animal
Sounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
8.3 Detection of Animal Sounds . . . . . . . . . . . . . . . . . . . . . 276
8.4 Quantitative Classification of Animal Sounds . . . . . . . . . 282
8.5 Challenges in Classifying Animal Sounds . . . . . . . . . . . 299
8.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
8.7 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 306
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
9 Fundamental Data Analysis Tools and Concepts for
Bioacoustical Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Chandra Salgado Kent, Tiago A. Marques, and Danielle Harris
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
9.2 Developing a Clear Research Question . . . . . . . . . . . . . 321
9.3 Designing the Study and Collecting Data . . . . . . . . . . . . 321
9.4 Data Types and Statistical Concepts . . . . . . . . . . . . . . . 325
9.5 Tackling Analyses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
9.6 Examples in Bioacoustics . . . . . . . . . . . . . . . . . . . . . . . 347
9.7 Software for Analyses . . . . . . . . . . . . . . . . . . . . . . . . . 351
9.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
10 Behavioral and Physiological Audiometric Methods for
Animals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Sandra L. McFadden, Andrea Megela Simmons,
Christine Erbe, and Jeanette A. Thomas
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
10.2 What Is an Audiogram? . . . . . . . . . . . . . . . . . . . . . . . . 356
10.3 Behavioral Methods for Audiometric Studies on Live
Animals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
10.4 Physiological Methods for Audiometric Studies on Live
Animals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
10.5 Other Audiometric Measurements . . . . . . . . . . . . . . . . . 379
10.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384

11 Vibrational and Acoustic Communication in Animals . . . . . . 389


Rebecca Dunlop, William L. Gannon,
Marthe Kiley-Worthington, Peggy S. M. Hill, Andreas Wessel,
and Jeanette A. Thomas
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
11.2 The Origins of Substrate-Borne Vibrational and
Acoustic Communication . . . . . . . . . . . . . . . . . . . . . . . 390
11.3 A Summary of Communication . . . . . . . . . . . . . . . . . . . 392
11.4 The Advantages and Disadvantages of Vibrational and
Acoustic Communication . . . . . . . . . . . . . . . . . . . . . . . 399
11.5 The Influence of the Environment on Acoustic and
Vibrational Communication . . . . . . . . . . . . . . . . . . . . . 400
11.6 Information Content or the Meaning of Signals . . . . . . . 403
11.7 Comparing Human Language to Nonhuman Auditory
Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
11.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores . . 419
Signe M. M. Brinkløv, Lasse Jakobsen, and Lee A. Miller
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
12.2 Characteristics of Echolocation Signals . . . . . . . . . . . . . 420
12.3 Differences in Echolocation Signals in Air and Water . . . 420
12.4 Echolocation in Bats . . . . . . . . . . . . . . . . . . . . . . . . . . 423
12.5 Echolocation in Odontocetes . . . . . . . . . . . . . . . . . . . . . 431
12.6 Echolocation in Birds . . . . . . . . . . . . . . . . . . . . . . . . . . 440
12.7 Orientation and Echolocation in Insectivores
and Rodents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
12.8 Are Echolocation Signals also Used for
Communication? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
12.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
12.10 Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . 450
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
13 The Effects of Noise on Animals . . . . . . . . . . . . . . . . . . . . . . . 459
Christine Erbe, Micheal L. Dent, William L. Gannon,
Robert D. McCauley, Heinrich Römer, Brandon L. Southall,
Amanda L. Stansbury, Angela S. Stoeger,
and Jeanette A. Thomas
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
13.2 Behavioral Options in a Noisy Environment . . . . . . . . . 465
13.3 Physiological Effects . . . . . . . . . . . . . . . . . . . . . . . . . . 467
13.4 Noise Effects on Marine Invertebrates . . . . . . . . . . . . . . 467
13.5 Noise Effects on Terrestrial Invertebrates . . . . . . . . . . . . 471
13.6 Noise Effects on Reptiles . . . . . . . . . . . . . . . . . . . . . . . 473
13.7 Noise Effects on Amphibians . . . . . . . . . . . . . . . . . . . . 474

13.8 Noise Effects on Fish . . . . . . . . . . . . . . . . . . . . . . . . . . 477


13.9 Noise Effects on Birds . . . . . . . . . . . . . . . . . . . . . . . . . 480
13.10 Noise Effects on Terrestrial Mammals . . . . . . . . . . . . . . 484
13.11 Noise Effects on Marine Mammals . . . . . . . . . . . . . . . . 488
13.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
About the Editors

Christine Erbe holds an M.Sc. degree in Physics (University of Dortmund,
Germany) and a Ph.D. in Geophysics (University of British Columbia,
Canada). She worked as a Research Scientist at Fisheries and Oceans
Canada, was Director of JASCO Applied Sciences Australia, and after a
brief stint in high-school education, returned to academia as Director of the
Centre for Marine Science and Technology at Curtin University (Perth, WA,
Australia). Christine’s interests are underwater sound (biotic, abiotic, and
anthropogenic), sound propagation, signal processing, and noise effects on
marine fauna. She is a Fellow of the Acoustical Society of America, former
Chair of the Animal Bioacoustics Technical Committee of the Acoustical
Society of America, and former Chair of the international conference series on
The Effects of Noise on Aquatic Life.

Jeanette A. Thomas (deceased) obtained her Ph.D. in Ecology and Evolu-
tionary Biology from the University of Minnesota (1979) on underwater
vocalizations of Weddell seals in the Antarctic. She was Director of the
Bioacoustics Laboratory at Hubbs-SeaWorld Research Institute (San Diego,
CA, USA), Senior Scientist at the Naval Ocean Systems Center (Kailua, HI,
USA), and Professor in Biology at Western Illinois University (WIU;
Macomb, IL, USA), where she helped establish a master’s degree program
in biology in collaboration with Shedd Aquarium (Chicago, IL, USA). In
2000, she developed the WIU Graduate Certificate in Zoo and Aquarium
Studies. Jeanette received several awards through WIU: Distinguished Fac-
ulty Lecturer, Outstanding Researcher, and Distinguished Alumni. Jeanette
was President of the Society for Marine Mammalogy (1994–1996) and Editor
for Aquatic Mammals (2000–2009).

1 History of Sound Recording and Analysis Equipment

Gianni Pavan, Gregory Budney, Holger Klinck, Hervé Glotin, Dena J. Clink, and Jeanette A. Thomas

Jeanette A. Thomas (deceased) contributed to this chapter while at the Department of Biological Sciences, Western Illinois University-Quad Cities, Moline, IL, USA

G. Pavan (*)
Interdisciplinary Center for Bioacoustics and Environmental Research, Dept. of Earth and Environment Sciences, University of Pavia, Pavia, Italy
e-mail: gianni.pavan@unipv.it

G. Budney
Macaulay Library, Cornell Laboratory of Ornithology, Cornell University, Ithaca, NY, USA
e-mail: gfb3@cornell.edu

H. Klinck · D. J. Clink
K. Lisa Yang Center for Conservation Bioacoustics, Cornell Laboratory of Ornithology, Cornell University, Ithaca, NY, USA
e-mail: holger.klinck@cornell.edu; dena.clink@cornell.edu

H. Glotin
Université de Toulon, Aix Marseille Univ, CNRS, LIS, DYNI, Marseille, France
e-mail: glotin@univ-tln.fr

© The Author(s) 2022
C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1, https://doi.org/10.1007/978-3-030-97540-1_1

1.1 Introduction

For centuries, scientists have recognized the importance of documenting human, animal, and environmental sounds. However, in recent decades, the field of bioacoustics has experienced an exceptional period of growth, primarily boosted by the rapid development of new technologies and methods to record and analyze acoustic signals. The most significant revolution in the field was the introduction of digital recording, data storage, and analysis technologies that reached the consumer market around 1980 with the introduction of the compact disc (CD). In the “analog days,” researchers had to carry bulky and heavy equipment and batteries to field locations; recording duration was often limited by excessive tape and battery consumption.

Researchers produced hardcopies of sound displays using a Kay Sona-Graph™ machine and spliced together sonograms to generate figures for publication. Initially, frequency and time measurements were taken from these hardcopies using a regular ruler, and signals or sound events of interest were identified manually by listening human observers. As a result, studies using bioacoustics-based approaches were sparse. Now, researchers struggle to keep up with the ever-increasing number of studies using bioacoustics made possible by the accessibility, affordability, and extended recording capabilities of current equipment.

This chapter is a compilation of the authors’ collective experiences in the field of bioacoustics, with each author having considerable experience studying the sounds of vocal animals across a myriad of terrestrial and aquatic environments. Even considering the drawbacks of the “good old days” of bioacoustics research, the authors concur they were incredibly fortunate to have a career studying fascinating animal sounds. As recording and analysis technologies improved, the types of information that could be extracted from recordings of animal sounds increased. Presently, species-level identification is possible in
most cases, and depending on the focal animal, the age, sex, reproductive status, behavior, activity patterns, and even health of an individual may be estimated from acoustic recordings. Acoustic data can be used to estimate the population density of vocal animals, and dialects can indicate the geographic boundaries of a population. However, density estimation by acoustics is still in its infancy, and will require further advancement in the spatial analysis of the acoustic environment by using multiple sensors to become reliable and widely applicable. At the community level, the entire acoustic environment or soundscape can be used to estimate species abundance and biodiversity. Changes in vocal behavior can be indicative of environmental stressors, such as anthropogenic noise or habitat degradation (Pavan 2017).

Originally, sounds of terrestrial animals were studied with equipment and methods developed for military needs, human speech analysis, and music processing (Koenig et al. 1946; Potter et al. 1947; Marler 1955). Later, scientists became interested in the sounds of aquatic animals, and underwater research was facilitated by technologies used by the navies to monitor the noise made by ships and submarines. Because of the frequency limitations of transducers (i.e., microphones and hydrophones), recorders, and analysis equipment, most initial bioacoustic research was conducted in the sonic range (i.e., the frequency range audible to humans: 20 Hz–20 kHz). Even in the early stages of the digital revolution, both recorders and analysis equipment were generally limited to audible frequencies.

A major hurdle for collecting field recordings was the large size and weight of early analog equipment, along with high power consumption, which resulted in limited recording time. The development of smaller, lightweight recording devices made the collection of acoustic data significantly easier. Currently, with the advent of small digital recorders with large solid-state memories, anyone including researchers, professionals, and amateurs can collect large amounts of high-quality acoustic data continuously over extended periods. However, when using handheld recorders, the potential influence of the human observer on the animals’ acoustic behavior is a concern. Through the development and use of autonomous recorders, video cameras, and acoustic animal tags, human observer effects can be minimized, and unsupervised data collection over extended periods (days to months) and in remote locations is now possible.

In this chapter, we describe the history of the development of transducers, recorders, and sound analyzers, along with the advances that these developments facilitated in the field of bioacoustics. Recording equipment can now capture a wide range of frequencies, from infrasounds to ultrasounds (sounds below and above the range of human hearing, respectively), and is used in a wide range of applications, from the study of individuals and populations to entire soundscapes. The digital revolution in sound recording and analysis allowed for significant advances in the field of bioacoustics (Obrist et al. 2010) and resulted in the development of new disciplines, such as computational bioacoustics (Frommolt et al. 2008), acoustic ecology, soundscape ecology (Pijanowski et al. 2011a, b; Farina 2014), and ecoacoustics (Farina and Gage 2017). An overview of acoustic principles and the evolution of sound recording systems for musical applications is given in Rumsey and McCormick (2009) and in Rossing (2007).

1.2 Advances in Recorders

The most significant advancement in recording technology was the switch from analog to digital devices. A reduction in size and weight of the recorder, extended battery life, rechargeable batteries, more stable and larger capacity storage media, broader frequency range, and accessibility of a computer interface accompanied this transition. Together, these advances provided bioacousticians with an adaptable system for recording a variety of species, greater field portability, and generally more affordable high-quality equipment.

To understand the basic differences between analog and digital recorders, a clear explanation of the terms is necessary. Humans perceive the world in analog; this means that everything is
seen and heard as a continuous flow of information. In contrast, digital information estimates analog data by taking samples at discrete intervals and describing the sample values as a finite number represented by binary coding (Pohlmann 1995). For instance, while a vinyl record player (phonograph) is analog, a CD player is digital. A phonograph converts groove modulation from a vinyl record into a continuous electrical signal, whereas a CD player reads a pit structure that is interpreted as a series of ones and zeros (bits) that is typical of binary coding. Likewise, a video cassette recorder (VCR) is analog, yet a digital videodisc (DVD) player is digital. A VCR reads audio and video data from a tape as a continuous variation of magnetic information, whereas a DVD player reads ones and zeros from a disc similar to a CD.

Digital devices can approximate analog audio or video signals with an accuracy level that is dependent on both sampling rate and bit depth (or the number of bits in each sample). The Shannon-Nyquist sampling theorem proves that, for a given frequency range, a sampling rate at least twice that of the highest frequency can capture all information in that frequency band, enabling perfect reconstruction of the analog waveform.

With proper sampling, analog signals can be transformed in the digital domain at a level that makes them indistinguishable from the original. A significant advantage of digital data is that it can be stored and manipulated more easily than analog recordings. With analog recorders, each copy produces a little degradation that accumulates through multiple successive copies. Analog tapes are also prone to degradation with time. Digital copies are a perfect duplication that is indistinguishable from the original, unless specific data codes are added to identify them. More importantly, digital recordings can be directly transferred to a computer for processing or transferred through the Internet to be shared among different laboratories. If researchers want to transfer audio or video files from old analog tapes so they can be recognized and processed by a computer, they must use a sound interface based on an analog-to-digital converter (AD-converter) to digitize the analog signal and transform it into a sequence of numbers.1 For playing back sounds from a computer, a sound interface with a digital-to-analog converter (DA-converter) is required. Next, we outline a brief history of the evolution of analog and digital recording devices. For more detail on digital recording technologies, see Pohlmann 1995.

1 Analog Definition and Meaning: www.webopedia.com/TERM/A/analog.html; accessed 24 Oct 2021.

1.2.1 Analog Recorders

The first purported sound recording was made by Édouard-Léon Scott de Martinville and dates back to 1860. The recording was just a few seconds in duration and was made using a phonautograph. The phonautograph has a vibrating stylus, which moves on soot-covered paper to draw the sound waveform.2 It was invented in 1857, and although it could record sounds, it never evolved to allow reproduction of the recorded sound.

2 The Phonautograms of Édouard-Léon Scott de Martinville: http://www.firstsounds.org/sounds/scott.php; accessed 24 Oct 2021.

In the 1870s, Thomas Edison invented the wax-cylinder recorder (Figs. 1.1 and 1.2), which had a vibrating diaphragm that was mechanically linked to a needle that sculpted grooves. It was initially recorded on aluminum foil and then on a wax layer covering the cylinder, as it was slowly rotated and translated on a screw axis. This device encoded the sound vibrations into modulations of the groove and then allowed playback of the recorded vibrations through the same needle-membrane system.

According to Ranft (2001), the first known recordings of animal sounds (a caged Indian bird, the Common Shama) were made in Germany in 1889 on an Edison wax-cylinder. One of the first known scientific studies of animal sounds occurred in 1892 when Richard Lynch Garner recorded primates on wax cylinders at a zoo in the USA (Garner 1892). Garner also
experimented with the playback of the recordings to observe the primates’ reactions.

Fig. 1.1 Thomas Alva Edison and his phonograph. Image source: https://commons.wikimedia.org/wiki/File:Edison_and_phonograph_edit2.jpg, by Levin C. Handy (per http://loc.gov/pictures/resource/cwpbh.04044/), public domain, Wikimedia Commons

The first flat disc was invented in the late 1870s, which provided an advantage over previous technology as the discs could be easily replicated. Then in 1887, Emile Berliner patented a variant of the phonograph, named the gramophone, which used flat discs instead of spinning cylinders (Fig. 1.3). Sounds were recorded on a disc as modulated grooves, with a system similar to the one developed by Edison for wax-cylinders. The first published recording of a bird sound was issued in 1910 in Germany, and the first radio broadcast of a singing bird was in Britain in 1927 (Ranft 2001).

Valdemar Poulsen, a Danish engineer, invented the telegraphone or wire recorder in 1898 (Poulsen 1900). Wire recorders were the first magnetic recording devices, and they utilized a thin metallic wire, which passed across an electromagnetic recording head. Each point along the wire was magnetized based on the intensity and polarity of the signal in the recording head. Wire recorders often had problems with kinks in the
wires, but editing was relatively easy as sections of wire could simply be cut out.

Fig. 1.2 Photographs of an Edison’s wax-cylinder player (left) and a wax-cylinder recording (right). Image sources: (left) https://commons.wikimedia.org/wiki/File:EdisonPhonograph.jpg, by Norman Bruderhofer, www.cylinder.de, CC BY-SA 3.0 http://creativecommons.org/licenses/by-sa/3.0/, via Wikimedia Commons; (right) https://commons.wikimedia.org/wiki/File:Bettini_1890s_brown_wax_cylinder.jpg, by Jalal Gerald Aro, CC BY-SA 2.0 https://creativecommons.org/licenses/by-sa/2.0, via Wikimedia Commons

In the early 1900s, RCA Victor developed the Victrola, which played records or albums that were readily available to the general public. Sounds were recorded as modulated grooves on a disc, and this disc was used to produce a master metallic plate where the grooves appeared as ridges. Albums were then produced for distribution by molding copies using the master plate and Bakelite (or synthetic plastic) material. In 1920, AT&T invented the Vitaphone, which recorded and reproduced sounds as optical soundtracks on photographic film; the film impression was made with a thin beam of light modulated by the sound.

Arthur Allen, the founder of Cornell University’s Laboratory of Ornithology, and Peter Kellogg made the first recordings of wild birds in 1929 at a city park in Ithaca, NY, USA. Albert R. Brand (a graduate student of Allen) and M. Peter Keane built the first equipment for recording in the field. Together, they recorded over 40 bird species within the first two years. With World War I parabola molds available from the Physics Department, Keane and True McLean (a professor in Electrical Engineering at Cornell) constructed a parabolic reflector to improve recording of bird songs in the field3 (Ranft 2001).

3 Macaulay Library: Early milestones (1920–1950): https://www.macaulaylibrary.org/about/history/early-milestones/; accessed 24 Oct 2021.

In those years, Theodore Case of Fox Case Corporation approached Arthur Allen to record singing wild birds and demonstrate the sound-synchronized film technology. Under the guidance of Allen, a Fox Case Corporation crew filmed and recorded the songs of wild birds in North America (Little 2003). Today, two of those recordings can be heard on the Macaulay Library website.4

4 Macaulay Library: listen to recordings of Rose-breasted Grosbeak https://macaulaylibrary.org/asset/16968 and a Song Sparrow https://macaulaylibrary.org/asset/16737; accessed 11 Oct 2021.

After a successful campaign with the Fox Case film crew, Allen and his colleague Peter Paul Kellogg recorded the sounds of wildlife for research and education purposes. The Library of Natural Sounds (now known as the Macaulay Library) began in 1930 at the Cornell Laboratory of Ornithology. In 1932, Allen and Kellogg used visual and audio recordings to demonstrate to the American Ornithological Union that the ruffed grouse (Bonasa umbellus) produced drumming sounds (Little 2003). In 1935, Cornell biologists
carried out an expedition to record the sounds of vanishing bird species, including the ivory-billed woodpecker (Campephilus principalis), for which they used a mule-drawn wagon to transport recording equipment into the field (Fig. 1.4).5

5 Macaulay Library: listen to the ivory-billed woodpecker recording made with an optical film recorder https://macaulaylibrary.org/asset/6784; accessed 11 Oct 2021.

Fig. 1.3 Emile Berliner with disc record gramophone, between 1910 and 1929. Image source: https://commons.wikimedia.org/wiki/File:Emile_Berliner_with_disc_record_gramophone_-_between_1910_and_1929.jpg, National Photo Company Collection (Library of Congress), public domain, via Wikimedia Commons

Fig. 1.4 Photograph of ornithologist Peter Paul Kellogg in 1935 in a mule-drawn wagon used to haul an amplifier (center) and optical film recorder (on the right) to capture the sounds of ivory-billed woodpeckers in the Singer Tract, Madison Parish, Louisiana. Image by Arthur A. Allen courtesy of the Cornell Laboratory of Ornithology

Even with limited space and harsh conditions, Alton Lindsay, in 1934, took a phonograph recorder on the Little America Expedition to Antarctica and made recordings of airborne sounds from Weddell seals (Leptonychotes weddellii), available today at the Smithsonian Institution.

In the late 1930s, a German company invented the Magnetaphone, which was based on the same principle as the magnetic wire recorder, but instead of wire, it had long, thin strips of paper impregnated with fine particles of iron oxide that were drawn across an electromagnetic head. After World War II, the American company Ampex perfected the German technology by replacing paper with a thin plastic film. For almost 50 years, reel-to-reel magnetic tape was the standard media for use on recorder/playback devices (Fig. 1.5). Reel-to-reel recorders (or open-reel recorders) used variable tape speeds to record different frequency ranges, with faster recording speeds providing higher-frequency recordings.

Fig. 1.5 Open-reel recorder made by AEG (1939). Image source: https://commons.wikimedia.org/wiki/File:AEG_Magnetophon_K4_1939.jpg, by Friedrich Engel, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

Another American company, a contemporary of Ampex, the Amplifier Corporation of America, was one of the first companies to develop a truly portable reel-to-reel recorder, the Magnemite 610, which was introduced in 1951 and was used by many pioneers in the field of bioacoustics. Figure 1.6 shows Peter Paul Kellogg using a 1950s Magnemite 610 recorder with a Western Electric 633 microphone mounted in a parabolic reflector.

Fig. 1.6 Photograph of an early 1950s field recording system. Peter Paul Kellogg with an Amplifier Corporation of America Magnemite 610 reel-to-reel tape recorder and a Western Electric 633 microphone mounted in a parabolic reflector. Courtesy of the Cornell Laboratory of Ornithology

Initially, tape recordings were mono recordings with one soundtrack on the tape. Stereo recording techniques (providing two record/playback channels) were developed in the 1960s. Initially, these recorders were bulky and not field portable. Then, portable open-reel recorders were developed for the rapidly developing outdoor recording needs of the radio, music, and film industries. Stereophonic recorders allowed the recording of two synchronous signals on parallel tracks onto one tape. In bioacoustics applications, often one track was used by the recordist for comments and the second track for recording animal sounds.

In the 1970s and 1980s, the most common reel-to-reel recorders used by bioacousticians were the Nagra III and IV series and the Uher 4000 series. They offered multiple recording and playback speeds (depending on the models, 3.75, 7.5, 15, or 30 inches per second), were relatively lightweight, ruggedized, and battery powered, which meant they were better suited for field studies. Eventually, recorders had even more channels (as many as 24 in some music-recording studios), which enabled scientists to record and playback signals simultaneously from more than one acoustic sensor.

Recorders were also developed to record a wide range of frequencies. Studies by Griffin (1944), Sales and Pye (1974), and Au (1993) provided evidence that animals (bats and dolphins) produce a wide range of ultrasonic signals. The first recordings of ultrasonic echolocation signals from bats and dolphins were made on expensive dedicated tape recorders at very fast tape speed (60 and 120 inches per second). Among them, the RACAL Store4DS recorder was used in the 1980s and 1990s, and it provided tape speed up to 60 inches per second to record frequencies up to 300 kHz. It was battery powered and reasonably portable. However, the limited data storage capacity of these magnetic reels meant that the recordings lasted only a few minutes.

In 1964, Philips introduced the compact cassette tape, which was comprised of a small plastic case holding two small reels with 1/8-inch wide
magnetic tape running at 4.75 cm/s (1.875 inches per second). In the 1970s, analog cassette recorders, which could easily record and playback sounds, became available at affordable prices, but were used primarily for music and human speech, and were thus limited in frequency to the human hearing range. These recorders (Fig. 1.7) were much smaller and less expensive than reel-to-reel devices. Cassette tapes could record up to one hour on each side of the cassette (typical total recording duration was either 60, 90, or 120 min), but tapes were very thin and fragile, which made them prone to print-through (the magnetic transfer of a recorded signal to adjacent layers of tape). In 1976, Sony introduced, with little success, the Elcaset, a bigger cassette with 1/4-inch tape running at 9.5 cm/s. Today, however, it is almost impossible to find new reel-to-reel or cassette tapes as there are very few manufacturers of these media.

Fig. 1.7 Left: Photograph of a semi-professional stereo cassette recorder Marantz CP430 used by nature recordists until the last decade of the twentieth century. Right: Photograph of a mono cassette recorder (Philips K7, 1968) with microphone and cassette inside. Image source: https://commons.wikimedia.org/wiki/File:Philips_EL3302.jpg, by mib18 at German Wikipedia, CC BY-SA 3.0 http://creativecommons.org/licenses/by-sa/3.0/, via Wikimedia Commons

One of the advantages of tape recording was the possibility to play back the tapes at a speed lower or higher than the original recording speed. This way it was possible to lower the frequency of recorded ultrasonic signals to the human hearing range, thus making them audible (and longer in duration); conversely, recordings of infrasounds were played at higher speed to make them audible (and shorter in duration). The same trick can now be done easily with digital systems.

Playbacks are a commonly used experimental approach in bioacoustics, wherein previously recorded sounds are broadcast to the animals of interest. Many playback studies used magnetic tape recordings containing animal sounds as the stimuli. Researchers could easily play the sound backward (by reversing the reading direction of a spliced tape) or insert a section of tape containing sounds of another species, individual, or noise as a control stimulus. Magnetic tape was also used to record live video images. The first practical video tape recorder (VTR) was built in 1956 by Ampex Corporation. The first VTRs were reel-to-reel recorders used in television studios, which made recording for television cheaper and easier.

VHS tape recorders, introduced in the 1970s, were the first compact analog devices to record both audio and video signals simultaneously on the same tape. Commercial video cameras quickly became available for home use. Battery power for cassette recorders and VHS cameras/recorders made this equipment popular for field studies of animal behavior and sounds.
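The tape-speed trick described above carries over directly to digital files: leaving the samples untouched and simply relabeling the sample rate plays a recording slower and proportionally lower in frequency. Below is a minimal sketch of this idea using only the Python standard library; the file names and the 384-kHz source rate are hypothetical examples, not taken from this chapter.

```python
# Digital equivalent of slowed-down tape playback: keep the PCM samples,
# rewrite only the sample-rate field of the WAV header.
import wave

SLOWDOWN = 10  # play back 10x slower -> all frequencies drop by a factor of 10

with wave.open("bat_call_384kHz.wav", "rb") as src:        # hypothetical input file
    params = src.getparams()                                # channels, sample width, rate, ...
    frames = src.readframes(params.nframes)                 # raw samples, left unchanged

with wave.open("bat_call_audible.wav", "wb") as dst:
    dst.setnchannels(params.nchannels)
    dst.setsampwidth(params.sampwidth)
    dst.setframerate(params.framerate // SLOWDOWN)          # only the header changes
    dst.writeframes(frames)
```

With a factor of 10, a 40-kHz bat call replays at 4 kHz, well within human hearing, and, as with tape, its duration is stretched by the same factor.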
Many magnetic analog recordings had problems because the media deteriorated when tapes were not stored under properly climate-controlled conditions. Unfortunately, some older analog recordings have been lost, or, in some cases, the players are not available to retrieve the recorded sounds. In the last decades, a great effort was made by major sound libraries to preserve old recordings (on wax-cylinders, discs, magnetic tapes, and cassettes) and to transfer them to safer digital storage (Ranft 1997, 2001, 2004). This was often not an easy task because magnetic tape recordings used a large variety of tape types, speeds, and track format arrangements. Unfortunately, many valuable tape recordings have yet to be converted to a digital format and archived. Without a long-term preservation strategy and support, it is possible that these media may be lost forever.

1.2.2 Digital Recorders

The introduction of the CD by the music industry in 1983 brought digital audio to the consumer market and started a new audio recording age (Pohlmann 1995). The ability to store sound in a digital format greatly improved acoustic data collection. It allowed easy and perfect replication of recordings, enabled accurate digital editing, and provided the means of more permanent data storage with direct access for processing and analysis by a computer.

In 1987, Rotary Digital Audio Tape (R-DAT or DAT) recorders were the first widely available digital recorders (Fig. 1.8). However, these devices still recorded on a thin magnetic tape encapsulated in a small cassette using a rotating helical-scanning magnetic head, which allowed for much faster head-tape speed and data density. Many R-DAT recorders allowed recording at different sampling rates of 32.0, 44.1, or 48.0 kHz and 16-bit resolution (the CD standard is 44.1 kHz, 16 bit) (Pohlmann 1995). The R-DAT format had little success in the consumer market because of the high cost but was used widely by professional recordists as a replacement for expensive and bulky open-reel recorders.

Fig. 1.8 (a) Photograph of a portable R-DAT recorder Sony TCD-D7 (1992) with a DAT cassette and the optical cable able to provide digital data transfer to a PC. (b) A MiniDisc recorder and disc (1997)

Some specialized R-DAT models allowed recording up to 100 kHz on a single channel (i.e., by using a 204.8 kHz sampling frequency and doubled tape speed). R-DAT offered recording quality that was comparable to open-reel recorders, however, the helical-scanning head proved problematic in humid conditions, and the thin tape used in R-DAT cassettes was easily damaged. An alternative to R-DAT was the digital compact cassette (DCC) introduced by Philips in 1992. DCC was compatible with the already existing analog cassette tapes but failed to gain commercial success.

Digital recorders with optical discs (CD-R and DVD-R) never gained popularity for field applications because the equipment had to remain stationary while recording. Also, at the same time, magnetic discs (hard drives) quickly became the state-of-the-art data storage media.
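To make the relationship between these sampling parameters and recording capacity concrete, the sketch below computes the Nyquist bandwidth and uncompressed storage for a few of the settings mentioned in this section. It is a back-of-the-envelope illustration rather than a method from the book; the chosen durations are arbitrary.

```python
# Rough relationship between recorder settings, usable bandwidth (Nyquist),
# and uncompressed PCM storage.
def pcm_budget(sample_rate_hz, bit_depth, channels, duration_s):
    nyquist_hz = sample_rate_hz / 2                            # highest recoverable frequency
    megabytes = sample_rate_hz * (bit_depth // 8) * channels * duration_s / 1e6
    return nyquist_hz, megabytes

# CD standard (44.1 kHz, 16 bit, stereo), 74 minutes:
print(pcm_budget(44_100, 16, 2, 74 * 60))   # -> (22050.0, ~783 MB uncompressed)

# Specialized R-DAT mode (204.8 kHz, 16 bit, mono), one minute:
print(pcm_budget(204_800, 16, 1, 60))       # -> (102400.0, ~24.6 MB), i.e. ~100 kHz bandwidth
```

The first case also shows why the MiniDisc discussed below relied on roughly 5:1 lossy compression to fit about 74 minutes of CD-quality audio onto a disc with a nominal capacity of 140 MB.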
In contrast, the MiniDisc (MD), a small optical disc developed and marketed by Sony in 1992, had more success among nature recordists, because the MD portable recorders were smaller, lighter weight, and much cheaper than DAT recorders. MD offered random access to the recordings (DAT and analog tape recorders allowed only sequential access), which made it much easier to find and listen to specific sections of a recording. These devices used the same sampling mode as the CD (44.1 kHz, 16 bit). The main disadvantage of the MD was the lossy signal compression based on Adaptive Transform Acoustic Coding (ATRAC), similar to the MP3 codec developed by the Moving Picture Expert Group (Budney and Grotke 1997). The compression fit 74 minutes of acoustic data onto a small digital disc with a nominal capacity of 140 megabytes (MB) with a compression rate of 5:1. The precision of some measurements of the acoustic structure of animal sounds can be significantly affected by lossy data compression schemes (Araya-Salas et al. 2017).

With hard drive recorders and the subsequent development of solid-state memory recorders, a new generation of high-quality equipment with unparalleled capacity became available in the early 2000s (Figs. 1.9 and 1.10). Solid-state memory recorders do not require mechanical moving parts for the storage and retrieval of digital information and instead use memory cards, such as Compact Flash (CF) or Secure Digital (SD and microSD) cards also used in the digital photography market.

The subsequent development of pocket digital recorders for the consumer market allowed scientists and amateurs to record many hours of sounds with high quality. Portability and storage space increased while cost decreased. Today, tape recorders have been completely replaced by solid-state digital recorders with either external (Fig. 1.9a) or built-in microphones (Fig. 1.9c). Attempts to develop portable digital recorders based on handheld portable computers or pocket PCs never gained much popularity because of the rapid development of pocket recorders. Professional and semi-professional recorders (Fig. 1.9a) provide phantom powering at 48 V (P48) for professional condenser microphones, have quiet microphone preamplifiers, several types of powering options, and can have up to 8 channels. Most pocket recorders lack the phantom powering required for professional microphones, but can power external microphones at low voltage (Plug-In-Power, or PIP; see Sect. 1.3.1).

Most digital recorders can sample at different sampling frequencies (e.g., 44.1, 48, 96, and 192 kHz) with either 16 or 24 bits of resolution, yielding very high sound quality. Some models can sample up to 192 kHz, but some of these have input electronics that limit the bandwidth to less than 60 kHz, well beyond human hearing limits, but not enough for recording animal ultrasounds. In the music industry, other standards have been developed to allow even higher acoustic quality (Melchior 2019), up to 384 kHz sampling with 32-bit depth, but they are not yet available in low-cost consumer recorders.

1.2.3 Recording to a Computer

In the 1990s, the first sound-acquisition boards for personal computers became available, which revolutionized the way scientists collect and analyze acoustic data. Once a sound was recorded in a digital format, recordings could easily and without degradation be transferred to a computer, stored, edited, copied, distributed, played, processed, and analyzed with different algorithms. Software (either freeware or commercial) that can be used on a laptop provides scientists with “a bioacoustics laboratory in a bag.” The consumer and professional market offer a large number of sound interfaces, to be connected by USB or other standards to a PC, which can offer very high audio quality and multiple input/output channels. Smaller versions of such a setup, or compact single-board computers costing few tens of US dollars, are being used in autonomous stationary and mobile recording systems, which allow data collection and real-time data processing in remote areas for months at a time (e.g., Klinck et al. 2012).
Fig. 1.9 (a) Photograph of a professional portable high-quality recorder (Sound Devices, SD722) with both hard disc and solid-state memory recording capabilities, connected to two low noise microphones (Rode NT1A) for soundscape recording. (b) Photograph of SONY TC-510 open-reel recorder (1982) and a SONY PCM-M10 digital recorder with its microSD memory card. (c) Photograph of five widely used digital recorders lined-up for comparative testing. From left: Sony PCM-M10, Sony PCM-D50, Olympus LS-3, Roland R05, and Zoom H1. They feature internal microphones, but also can connect to external Plug-In-Power (PIP) microphones or hydrophones. Courtesy of M Pesente (2016)

1.2.4 Autonomous Programmable Recorders

Researchers soon realized that their presence during recordings could influence the animal’s behavior, and that a remote system, which could be used in the absence of human observers, was needed. There was also an increasing interest in collecting samples of the acoustic environment over long periods of time. To address these new interests, off-the-shelf recorders were modified and connected to timers, enabling recording at a defined schedule. The use of portable computers also allowed scheduled recording in the field (Fig. 1.10). However, the main limitation was the need of external batteries, which allowed only a few days of operation. In addition, long-term recording required protection of the equipment in waterproof cases and additional batteries. Defense and research laboratories alike have
interesting stories to tell about the evolution of their autonomous recording equipment (e.g., McCauley et al. 2017).

Fig. 1.10 Left: Photograph of a portable digital recording and analysis system composed of a pair of microphones, an AD-converter with USB interface (Edirol UA25), a low-power notebook, and an additional battery (2004). Right: Photograph of an autonomous terrestrial recorder by Wildlife Acoustics (model SM3, 2014) with external battery deployed in a nature reserve in Italy

The first commercially available, programmable autonomous recorder, SongMeter 1 (SM1), was sold by Wildlife Acoustics in late 2007 and opened a rapidly developing market. Since then, new products have been proposed by companies and research groups, with increasing performances and autonomy. These can be programmed to record at defined intervals (e.g., every day across the dawn and dusk periods) or more regular sampling schedules (e.g., 1 minute every 10 minutes, or 10 minutes every half-hour) to sample temporal patterns of variation in a soundscape. This way, the acoustic behavior of animals of interest can be recorded without disturbance by the recordist and for extended periods, both day and night. These recorders need to be rugged and reliable to be deployed in harsh environments. The period of time that recorders can collect data depends on the combination of available battery power and memory. Depending on these factors, terrestrial recorders can operate for weeks to months. A grid of autonomous recorders can be used for monitoring biodiversity over a large area (e.g., entire countries; Obrist et al. 2010), even in the ultrasonic range. Figure 1.10b illustrates one type of autonomous recording system made by Wildlife Acoustics. A few different types of autonomous recorders are currently available. However, as interest in continuous, long-term acoustic monitoring of remote areas (Pavan et al. 2015; Righini and Pavan 2019) increases, new devices will continue to appear on the market and in the open-source arena. In some cases, audio recorders can be coupled with photo- and video traps to get images of the animals if they are at a close enough range.

Recent open-source autonomous recorders are built around the Raspberry Pi and similar small board computers. However, these devices often have inefficient power optimization and require large batteries to supply power over long periods. The Solo acoustic monitoring platform6 (consisting of Raspberry Pi plus external microphone) needs a 12-V car battery to record for 40 days. Autonomous recorders need to be low-power to allow for extended periods of recording time with a manageable battery supply.

6 Project website: https://solo-system.github.io/home.html; accessed 1 Oct 2021.
14 G. Pavan et al.

Fig. 1.11 The JASON


Qualilife also hosts a high
dynamic luxmeter in four
different wavelengths and
direct USB HDD or micro
SD storage

microphone (Hill et al. 2018). MEMS are very environmental information has driven the devel-
small and cheap and allow for production of opment of new multi-channel, multi-parametric
autonomous recording devices at very low cost. instrumentation. Multi-channel portable recorders
Autonomous recorders can also be built around a and computer interfaces developed primarily for
wireless interface to send raw or processed data in professional music recording can be used for bio-
real-time, in near real-time, or at scheduled acoustics applications, however, dedicated
intervals. However, data transmission requires recorders with very high sampling rates are also
power and the creation or use of a suitable wire- being developed for specific study systems.
less network (Sethi et al. 2018). The recently developed JASON Qualilife9 can
Smartphones with an external battery supply record up to 5 data channels, with the maximum
are another option used to explore animal sounds sampling frequency up to 800 kHz per channel,
and soundscapes. The Automated Remote Biodi- all featuring 16-bit resolution, a sharp filter to
versity Monitoring Network (RFCx ARBIMON) prevent aliasing, and an adjustable analog gain
can receive acoustic data from a remote recorder for a large range of uses (Fig. 1.11).
based on a cellphone that, if coverage is available, Although already designed for low-power con-
directly sends data to the central server with sumption (12 V, 100 mA), to further reduce
online access.8 This system, coupled with Artifi- power consumption and achieve extended long-
cial Intelligence recognition algorithms, can iden- term recording, an extension board (Qualilife
tify sound categories to generate alerts to prevent Wake-Up Detector; Fourniol et al. 2018; Glotin
poaching and deforestation. More information on et al. 2018), can be used to trigger the recorder
autonomous recorders is available in Chap. 2. when it receives a signal at a specified frequency.
This allows for a reduction in power consumption
and data storage, also reducing unnecessary post-
1.2.5 Multi-Channel Recorders processing work. Moreover, it includes a high
dynamic luxmeter (which works from sun zenith
Collecting multiple channels of acoustic data to lunar eclipse) that is synchronized with the
allows for acoustic localization of the sound acoustic recorder.
source. Multi-channel recordings can help miti-
gate the Lloyd’s mirror effect, a phenomenon in
which low-frequency sounds near the ground 1.3 Advances in Microphones
may not be recorded correctly because of the
interference of direct and surface reflected There were several early attempts in the mid- to
sound. Increased interest in collecting multiple late-1800s by Johann Philipp Reis and Elisha
channels of acoustic data coupled with

8 9
Project website: https://rfcx.org/ & https://arbimon.rfcx. Project website: https://www.univ-tln.fr/SMIoT.html;
org; accessed 1 Oct 2021. accessed 20 Jun 2022.
1 History of Sound Recording and Analysis Equipment 15

Fig. 1.12 Left: Drawing of a carbon-button microphone microphone used for bioacoustics research; https://
(1916). Image source: https://commons.wikimedia.org/ commons.wikimedia.org/wiki/File:Sennheiser_MKH416.
wiki/File:Carbon_button_microphone_1916.png; jpg by Galak76, CC BY-SA 3.0 http://creativecommons.
unknown author, public domain, via Wikimedia org/licenses/by-sa/3.0/, via Wikimedia Commons
Commons. Right: Sennheiser MKH416 directional

Gray to develop the precursor to a microphone. of carbonized anthracite coal, which were con-
Reis developed the sound transmitter, which fined between two electrodes. One electrode was
contained a metallic strip that rested on a mem- connected to an iron diaphragm. Edison’s trans-
brane that caused intermittent contact between a mitter was durable, efficient, simple, and cheap to
metal point on the strip and an electrical circuit build. His transmitter became the basis for
when it vibrated. Elisha Gray developed the liq- millions of telephone transmitters used around
uid transmitter, consisting of a diaphragm the world.
connected to a moveable conductive rod, which
was immersed in an acidic solution. In 1876,
Alexander Graham Bell invented the magnetic 1.3.1 Microphones Used
transmitter, and Edison and Berliner developed a in Bioacoustics Research
loosely-packed carbon granules microphone
(Fig. 1.12). David Edward Hughes coined the At the beginning of the twentieth century, most
term “microphone” in 1878 for his microphone microphones were carbon granule sensors. These
system based on carbon granules, which early microphones were noisy and had limited
performed poorly by today’s standards (due to sensitivity and frequency response. This meant
high self-noise and distortion). However, it was these early microphones were suited only for
an important step forward, enabling technology recording human voices. In those early stages,
for long-distance voice communication or tele- dynamic microphones based on a membrane
phony (for more details see Robjohns 2010)10 with a coil immersed in a magnetic field were
In 1886, Thomas Alva Edison refined the car- difficult to produce because they required small
bon granule microphone and developed the but strong magnets.
carbon-button transmitter. This transmitter In 1917, Edward Wente made a great stride
consisted of a compartment filled with granules forward by inventing the condenser microphone,
which is still used in a wide variety of
10 applications today. In the 1920s, with the signifi-
A Brief History of Microphones: http://microphone-
data.com/media/filestore/articles/History-10.pdf; accessed cant increase in broadcast radio, there was a high
11 Oct 2021. demand for better quality microphones. The
16 G. Pavan et al.

Fig. 1.13 Photograph of the PRIMO EM172 microphone capsule (left) used by many nature sound recordists for their
custom-made microphones (center and right). Courtesy of M Pesente

piezoelectric microphone was created based on The widely used condenser microphones are
piezoelectric crystals, which are sensitive to pres- fairly sensitive, compared with dynamic
sure changes and generate a voltage when com- microphones, and feature an extended frequency
pressed/decompressed; conversely, they vibrate response, but they require external power. Profes-
and produce sound waves if excited by an electric sional condenser microphones are often powered
signal. Originally, they used quartz or Rochelle through the signal cables with 48 V (phantom
salt crystals, but the sound quality was poor. With power, P48) provided by the recording device,
the development of strong magnets, dynamic by a preamplifier, or by a power unit. Consumer
microphones were then used for decades because microphones usually use electret condenser
of their simplicity and reliability. However, for capsules that require 3–5 Vdc powering (plug-in
bioacoustics studies, they were not sensitive power, PIP) provided by the recorder via the
enough, and their frequency response generally microphone plug. Microphones well-suited for
did not extend beyond the human hearing range. bioacoustics studies can be built with electret
Today, almost 90% of the microphones condenser capsules costing only a few US dollars
manufactured annually are electret condenser (Fig. 1.13). For a detailed discussion of features
microphones (Rossing 2007) because of their and operation of microphones, see Chap. 2, sec-
many advantages when compared with dynamic tion on selecting a microphone.
microphones, including higher sensitivity, higher Many animals including insects, frogs, bats,
fidelity, and wider frequency response. Piezoelec- and other terrestrial and marine mammals emit
tric transducers are now mainly used in ultrasonic sounds (Sales and Pye 1974). Studies
hydrophones that have specialized ceramics that of ultrasonic signals require a broadband micro-
provide high sound quality. Robjohns (2010) phone capable of responding to signals at very
provides a history of microphone evolution and high frequencies. In contrast, some animals, such
outlines how advances in broadcast radio, as elephants, produce very low-frequency sounds
telephones, television, and music industry, along and require infrasonic microphones capable of
with the need for directional and ultrasonic detecting signals at or below 20 Hz (Payne et al.
recordings, drove the design of several new 1986). Previously, ultrasonic and infrasonic
types of microphones (e.g., the condenser-, recording required very expensive and complex
dynamic-, ribbon-, and carbon-microphones). transducers, recorders, and analyzers. With the
1 History of Sound Recording and Analysis Equipment 17

advent of broadband AD-converters in laptops sounds) to minimize off-axis sounds (e.g., noise
and smartphones, ultrasonic and infrasonic ani- from the public and room reflections).
mal sounds can now be recorded at a reasonable Single microphone (i.e., monophonic)
cost. Ultrasonic microphones may use small elec- recordings cannot provide any spatial informa-
tret condenser capsules or MEMS, which are tion. These recordings are made with a single
primarily used in smartphones. MEMS are small microphone that can be an omnidirectional micro-
and inexpensive, feature an extended frequency phone to capture all sounds around or a direc-
response (including the ultrasonic frequency tional one to capture sounds from a specific
range), can include an AD-converter, and can be source or direction. However, microphones can
directly integrated into digital systems. Some be paired to record sounds in stereo to provide
microphones also incorporate a high-speed a spatial sound image wherein listeners can iden-
AD-converter and USB interface to be directly tify the perceived spatial location of the sound
connected to a computer, a smartphone, or a tablet source. Many different types of microphone
for recording and real-time display. The configurations have been developed, mainly
Dodotronic Ultramic series offers a range of for recording music, but also for recording
USB ultrasonic microphones with sampling soundscapes.
frequencies ranging from 192 kHz to 384 kHz A further development, mainly conceived for
(Buzzetti et al. 2020); the most advanced models cinema and videogames, is the surround system
also include the ability to record on an internal that is based on multi-microphone (i.e., micro-
microSD memory card.11 phone array) recordings and speakers placed
In cases where researchers want to separate around the listener to create a more immersive
sounds coming from different directions, or target acoustic experience (Streicher and Everest 1998;
an individual animal for recording, a directional Rayburn 2011). With 3D audio, a whole acoustic
microphone, a parabolic reflector, or a micro- space is recorded with a microphone array. From
phone array can be used. One of the first this, it is possible to extract sound information to
documented attempts was in 1932, when Peter build a stereophonic or binaural or surround pro-
Paul Kellogg and Arthur Allen used a micro- gram. Today 3D audio is mainly used for 3D
phone installed in the focus of a parabolic reflec- Virtual Reality, with either video game, cinema
tor to record bird sounds (Wahlstrom 1985; Ranft or scientific uses, that allows the user to be placed
2001). Parabolic reflectors have been widely used in a 3D audio and video environment (with spe-
to record animal sounds, capture distant speech, cial visors and headphones, or in special VR
and detect the noise of incoming vehicles and rooms) and to move inside it to look and listen
airplanes during the first and second world wars in any direction. The currently most used 3D
(i.e., before the invention of radar; see Chap. 2 for audio system is Ambisonics (Fig. 1.14) that is
a discussion of use and features of parabolic based on 4 (first order), 8 (second order),
reflectors). As an alternative to parabolic 16 (third order) or more channels (Zotter and
reflectors, ultra-directional microphones, or Frank 2019).
so-called shotgun microphones, were developed. Specific microphone array applications in bio-
The design of shotgun microphones is based on acoustics include localizing sound sources, either
the interference tube principle to attenuate off- static or moving, such as flying bats (Blumstein
axis sounds; these microphones were developed et al. 2011). Using specific algorithms, signals
to have a narrow angle of forward reception. The can be extracted from the microphone array, and
shotgun was initially designed for use in a studio the direction and intensity of sound sources can
setting (as opposed to recording long-distance be identified by superimposing a sound map on
top of an image taken by a video camera. This
type of application is called an acoustic camera
11
Dodotronic webpage: http://www.dodotronic.com; and is largely employed by the automotive indus-
accessed 20 Jun 2022. try to locate sources of noise in a vehicle.
18 G. Pavan et al.

generally used to characterize the acoustic


properties of a signal or of a location. Usually,
measurement microphones are condenser
microphones optimized for a specific frequency
range and used to characterize a sound field or a
sound level when connected to a sound level
meter (or phonometer); see Chap. 2 for a discus-
sion of measurement microphone features and
operation. This microphone technology has not
changed much over time; however, the measuring
equipment to which microphones are connected
has evolved within a few decades from bulky and
expensive analog devices to small, powerful, and
flexible digital devices also able to provide spec-
tral analysis.

1.3.3 Accelerometers

An accelerometer measures the acceleration (i.e.,


the rate of change of velocity) of an object. Sin-
gle- and multi-axis accelerometers can detect both
the magnitude and the direction of the accelera-
tion, as a vector quantity. They can thus measure
the movements of an animal (e.g., mounted in a
collar) or to sense the vibration of a body part.
Tiny accelerometers are used to detect vibrations
Fig. 1.14 Ambisonic recorder with 4 microphones (first generated by insects and other animals for com-
order) Zoom H3VR munication. The recently defined science of
biotremology uses accelerometers and laser
Acoustic cameras help visualize patterns of both vibrometers to study vibrational communication
indoor and outdoor noise (e.g., of a passing car, in insects and other zoological groups (Hill et al.
train, airplane, or around a wind turbine). Acous- 2019) by either detecting their movements or the
tic cameras have the potential to help in localizing vibrations transmitted through the substrate.
biotic sound sources; however, they are expen- MEMS accelerometers are now very tiny and
sive and have been rarely used for bioacoustics largely used in electronic devices, such as
studies; an example is given by Stoeger et al. smartphones and game controllers, to sense their
(2012) to identify the sound sources in elephants. movement in space.

1.3.4 Laser and Optical Microphones


1.3.2 Measurement Microphones
Laser microphones, also known as laser
Measurement microphones are a special class of interferometers, laser accelerometers or
microphones designed to make accurate ampli- vibrometers, are designed to detect vibrations on
tude measures of sounds, ranging from a surface without any contact with the sound
infrasound to ultrasound. Although measurement source. These microphones can detect vibrations
microphones can be used for recording, they are over large distances, from few centimeters to tens
1 History of Sound Recording and Analysis Equipment 19

Fig. 1.15 Left: Photograph of an early ultrasonic bat UltraMic250k, based on MEMS, developed by
detector from the laboratory of Donald Griffin. Image Dodotronic in 2010, connected to a tablet computer that
courtesy of the Cornell Laboratory of Ornithology. allows recording and display of ultrasounds in real-time
Right: Photograph of an ultrasonic USB microphone

and hundreds of meters. For example, laser was related to their hearing, it was not until the
microphones can measure the vibration of a development of ultrasonic recorders and
glass window to capture the sounds produced microphones in the early 1940s (Fig. 1.15) that
inside a room. These devices were developed for scientists were able to study the ultrasonic sounds
spying purposes and are now mostly used in produced by bats for echolocation (Griffin 1944).
industry to record vibration of machinery. In bio- Donald Griffin was working with piezoelectric
acoustics research, and biotremology studies in transducers connected to an oscilloscope when
particular (Hill et al. 2019), this technology is he observed high-frequency signals produced by
used to record the vibration of animal body parts bats flying outside his open laboratory window.
(e.g., wings or abdomen of insects producing This discovery opened an entirely new field of bat
sounds) or vibration of the substrates (e.g., plant echolocation research.
stem, tree trunk, spider-web, and burrow-wall), Early bat detectors were based on the hetero-
which could indicate the presence of an animal. dyne principle and on frequency-division
Current instruments are lightweight and easy to counters (Obrist et al. 2010), which produced
use; however, they require that the target being audible but highly distorted sounds when receiv-
recorded is not moving and on a stable platform. ing ultrasonic calls. Heterodyne detectors allowed
These devices should not be confused with opti- only a narrow frequency range up to a few kHz, to
cal microphones and hydrophones, which are be shifted down to the audible range. The user
being developed and have a completely optical then tuned the detector to the frequency of interest
chain, where the transducer directly produces an and listened to and recorded signals only around
optical signal to be sent on an optical fiber cable, the tuned frequency. Information outside that fre-
either analog or digital, from the transducer to the quency range was discarded.
recorder. Frequency division (or count-down) detectors
cover a broad frequency range. They are based on
zero-crossing detection. They count how many
times the signal waveform crosses zero pressure
1.3.5 Bat Detectors
and they produce a synthetic wave every
n incoming waves. The output signal frequency
In the eighteenth century, the Italian scientist
is a fraction of the original frequency (i.e., 1/n),
Lazzaro Spallanzani recognized that bats were
and advanced systems retain the amplitude enve-
capable of navigating and capturing their prey in
lope of the original signal. The frequency division
the dark. While Spallanzani hypothesized that this
20 G. Pavan et al.

method is much better than the heterodyne; how- timer to start at sunset and stop at sunrise. Some
ever, both produce a distorted signal often not also have analysis software that identifies the
useful for scientific investigation. The first digital species, of course with variable margin of error
models, called time-expansion detectors, digitally depending on the species (see Chap. 2, section on
recorded the incoming bat calls at a high sampling bat detectors). Given the computing and storage
rate, and played them back at a reduced sampling capabilities of current tablets and smartphones,
rate, which allowed for human observers to hear dedicated ultrasonic microphones with an
the calls and record them on a conventional integrated AD interface also are available to
recorder (Obrist et al. 2010). This method record bat calls and display their features on the
preserves all acoustic features so that recordings device screen (Fig. 1.15).
can be used for scientific analysis.
Digital bat detectors include a built-in ultra-
sonic microphone, onboard signal sampling and 1.4 Advances in Hydrophones
processing, memory for digital data storage, a
graphical display to show a spectrogram with In 1826, Jean-Daniel Colladon and Charles-
related settings, and a speaker for monitoring Francois Sturm made an experiment in Lake
incoming ultrasounds by either slowing down or Geneva, Switzerland, to determine the speed of
shifting them in frequency. Current models are sound in water (Colladon 1893). They used two
completely digital, they record and store data small boats on opposite sides of the lake, ~14 km
continuously, and can transpose ultrasounds into apart. On one boat, there was an underwater bell,
audible sounds in real-time by spectral shifting which was struck at the same time that gunpow-
(or spectral compression), using a Fast Fourier der was ignited, which resulted in a paired under-
Transform (FFT) algorithm (see Chap. 4 on signal water sound and above-water gunpowder flash.
processing). Some bat detectors can be used as The operator of the second boat used an under-
autonomous recorders which can selectively water listening horn to detect the sound of the bell
record ultrasounds from echolocating bats for (Fig. 1.16). The time difference between seeing
many consecutive nights, with a programmable the gunpowder flash and hearing the bell allowed

Fig. 1.16 Experimental


setup to determine the
speed of sound underwater.
Image Source: J. D.
Colladon, Souvenirs et
Memoires, Albert-
Schuchardt, Geneva, 1893
1 History of Sound Recording and Analysis Equipment 21

the scientists to compute the speed of sound in 1.4.1 Single Hydrophones


water. Their measurements were fairly accurate
and indicated that the speed of sound in water is Hydrophones are transducers used to receive
approximately five times greater than the speed of underwater sound; they are usually based on pie-
sound in air. zoelectric materials. Hydrophones are generally
Until the advent of hydrophones, it was built with a piezoelectric transducer that generates
assumed that oceans, rivers, and streams were a voltage when compressed/decompressed; con-
quiet environments. Much of hydrophone devel- versely, it can vibrate and produce sound waves if
opment was driven by military needs during excited by an electric signal. Piezoelectric
World Wars I and II, when the use of transducers can be operated either as a receiver
hydrophones and sonar projectors facilitated the or as a transmitter. In 1917, Paul Langevin
detection of enemy vessels, particularly obtained a large 10 cm  10 cm  1.6 cm slice
submarines, by listening to their sound (i.e., pas- of a natural quartz crystal and used this to develop
sive sonar) or by listening for the reflection of a transmitter capable of emitting sound so power-
emitted sound pulses (i.e., active sonar). Sonar ful it killed nearby fish. After World War II, other
operators were some of the earliest materials (potassium dihydrogen phosphate,
bioacousticians who were able to distinguish ammonium dihydrogen phosphate, and barium
sonar signals from marine animal sounds (Fish titanate) were used instead of quartz to build
and Mowbray 1970). Today, hydrophones are hydrophone transducers (Rossing 2007).
used in a large variety of biological research As the Navies of the world began to recognize
applications to monitor population dynamics and the utility of listening underwater, hydrophone
behavior of marine invertebrates, fish, and technology developed fairly rapidly, and also
mammals (Au and Hastings 2008; Tremblay was used for oceanographic and biological
et al. 2009). Hydrophones are also largely used research (Wenz 1962; Munk and Wunsch 1979;
to monitor the underwater noise produced by ship Urick 1983; Naramoto 2000). Most of the early
traffic and other invasive activities, such as seis- bioacoustics research on aquatic animals was
mic surveys with airguns and naval sonar (Pavan conducted using a battery-operated single hydro-
et al. 2004). phone (Fig. 1.17) suspended in the water from the

Fig. 1.17 Simple


piezoelectric hydrophone
(Aquarian Audio HC2a)
with PIP powering
connected to a digital
pocket recorder (SONY
PCM-M10)
22 G. Pavan et al.

shore, a small boat, or sea ice, and required the non-invasive and able to collect long-term data
presence of a researcher. from remote areas independently of weather and
Traditional hydrophones feature an analog light conditions (Mellinger et al. 2007; Lammers
output (voltage or current) and are available et al. 2008; Tremblay et al. 2009; Obrist et al.
with or without a front-end preamplifier. 2010; Sousa-Lima et al. 2013; Jacobson et al.
Hydrophones that feature an integrated 2016); see Chap. 2.
AD-converter and digitize the analog signal
directly at the sensor are now commercially avail-
able. Some digital hydrophones also integrate 1.4.4 Towed Hydrophone Arrays
signal processing and storage capabilities (e.g.,
real-time reporting of noise levels). Because of A towed array contains several hydrophones
the increased power consumption of digital housed in an oil-filled plastic sleeve, which are
hydrophones, these are primarily used in cabled pulled behind vessels of varying size. Towed
sensor networks, such as seafloor sensors or arrays of hydrophones allow beamforming (a
sub-surface towed arrays. processing technique that combines time-delayed
signals from multiple hydrophones to increase
gain in a given direction) to improve signal-to-
1.4.2 Sonobuoys noise ratio and estimate bearings to specific sound
sources. Consecutive bearing estimates allow the
Navies of the world recognized the need for a localization of a source and determining its range.
hydrophone that could operate remotely, was A towed array in effect provides a high-gain,
mobile, and could monitor sounds at different directional sensor that can be steered in different
water depths, which led to the development of directions either in real-time or in the post-
sonobuoys. Sonobuoys are individual canisters processing of recordings (see Chap. 2 for details
that float at the water surface and house a hydro- of towed hydrophone arrays). During World
phone, dampening cable, battery, recording/trans- War I, a towed sonar array (the first documented
mitting electronics, and a transmitting antenna. towed array) known as the Electric Eel was devel-
See Chap. 2 for details of features and operation oped by the US Navy physicist Harvey Hayes
of sonobuoys. Navies of the world used (Naramoto 2000). Bill Watkins and William
sonobuoys for underwater listening to detect Schevill at Woods Hole Oceanographic Institu-
submarines by deploying them from airplanes or tion were among the first bioacousticians to use
ships. A few labs were able to acquire military this technology to record and study the sounds of
sonobuoys and used them for receiving and marine mammals (e.g., Watkins and Schevill
recording marine animals. 1977; Watkins et al. 1987). The original towed
arrays focused on lower-frequency signals (i.e.,
frequencies typical of foreign vessel noise), but
1.4.3 Autonomous Underwater Schevill and Watkins developed new instruments
Acoustic Recorders to record the higher frequencies emitted by
dolphins. Their recordings are of high scientific
In recent years, a wide variety of stationary, value and are available online in digital format at
autonomous passive acoustic monitoring (PAM) the WHOI Watkins Sound Library.12
systems have been developed for the recording of In 1983, Thomas et al. (1986, 1987) worked
acoustic activity from naturally occurring with a geophysical company to build a modified
biological and geophysical sources, as well as towed array specifically for the study of marine
from anthropogenic sources in marine mammal sounds (Fig. 1.18), which was capable
environments (Figs. 1.19, 1.20, 1.21, and 1.22).
These systems have an advantage over systems 12
WHOI Library: http://cis.whoi.edu/science/B/
that rely on human observers as they are whalesounds/index.cfm; accessed 11 Oct 2021.
1 History of Sound Recording and Analysis Equipment 23

Fig. 1.18 Left: Photograph of the topside electronics Mary, to listen for underwater sounds of marine mammals
required to receive, record, and process data from a and fish in the Eastern Tropical Pacific. Photos by Jeanette
towed array in 1983. Right: Photograph of deploying a Thomas
towed array from the deck of a tuna seiner, the MV Queen

of capturing low- and medium-frequency under- the height of the Cold War, the US Navy
water sounds (20 Hz–15 kHz). Depth and temper- launched a classified project known as the
ature sensors on the array measured the SOund SUrveillance System (SOSUS). The
thermocline and sound propagation conditions in SOSUS large-aperture arrays allowed the Navy
the area. Self-noise from the moving ship was to detect signals at ranges of several hundred
present, but filtered out as much as possible. kilometers. SOSUS arrays were highly successful
Many species of marine mammals were heard, in detecting and tracking Soviet submarines of
which helped the fishermen find tuna as they that era. The sailors operating the early SOSUS
tend to associate with dolphin pods. arrays also detected numerous biological sounds
In recent years, lightweight towed arrays have of unknown origin. An unknown low-frequency
been developed to meet the requirements of sound was attributed to the “Jezebel Monster,”
studying marine mammal sounds from small yet later found to be from blue (Balaenoptera
platforms, such as sailboats (Pavan and Borsani musculus) and fin whales (Balaenoptera
1997). Deployment of the towed array from a physalus). After the end of the Cold War, the
sailboat minimizes recorded self-noise of the SOSUS system was made available to scientists
towing vessel. Current towed arrays can capture (Nishimura and Conlon 1994; Stafford et al.
sounds over a large geographic area and cover a 1998; Watkins et al. 2000), who monitored the
wide frequency range (from infrasound to presence of marine mammal sounds and tracked
ultrasound). their long-range seasonal movements across the
oceans. In one case, a blue whale was tracked for
80 days along the eastern seaboard of the USA
1.4.5 Seafloor Hydrophone Arrays using the 20-Hz signal the animal repeatedly
produced.
Arrays of bottom-mounted hydrophones were an At present, bottom-mounted arrays of
important naval asset for the surveillance of hydrophones are deployed across oceans world-
oceans for the presence and movements of wide, with some strictly dedicated to military
enemy vessels and submarines. In the 1950s, at applications, and others dedicated to monitoring
24 G. Pavan et al.

Fig. 1.19 The JASON Qualilife DAQ 3x600 kHz in the custom array by H Glotin, recording sperm whales in the near
field in 2018. Courtesy of V Sarano

earthquakes or nuclear explosions, such as the 1.4.6 Small Arrays


array operated by the Comprehensive Nuclear
Test Ban Treaty Organization (CTBTO). Over Novel hydrophone array configurations have
the last decade, multidisciplinary seafloor recently been developed for a team led by
networks were established: the North-East Pacific François Sarano to conduct a longitudinal study
Time-series Undersea Networked Experiments on the same group of sperm whales since 2013,
(NEPTUNE) and the Victoria Experimental Net- under the authority of the Marine Megafauna
work Under the Sea (VENUS) in Canada13; the Conservation Organization and as part of the
Controlled, Agile, and Novel Ocean Network global program Maubydick. In 2017 and 2018,
(CANON) run by MBARI in the USA; the the team collected a set of audio-visual recordings
European Multidisciplinary Seafloor Observatory using a custom acoustic antenna developed by the
(EMSO) run by Europe; the Submarine Multidis- University of Toulon with the JASON Qualilife
ciplinary Observatory (SMO) managed by Italy; DAQ (Data AcQuisition) to record the animals in
and the Neutrino Mediterranean Observatory the near field at very high frequency (600 kHz
(NEMO also known as KM3net) operated by the sampling frequency, Fig. 1.19). A similar antenna
Neutrino Mediterranean Observatory. Some of has been deployed in Amazonia allowing high-
these arrays are equipped with wideband definition 3D tracking and click analysis of the
hydrophones, which allow scientists to monitor Amazon river dolphin (Inia geoffrensis; Glotin
a variety of marine mammal species as well as et al. 2018).
ambient noise levels (Nosengo 2009; Favali et al.
2013; Caruso et al. 2015; Sciacca et al. 2015;
Viola et al. 2017). NEPTUNE and VENUS also
1.5 Autonomous Mobile Systems
provide online public access to recorded data. The
Listening Into the Deep Ocean (LIDO) project
1.5.1 Aerial Mobile Systems
provides real-time streaming of acoustic data
that is a gateway to several underwater data
Autonomous mobile monitoring systems were
acquisition systems (André et al. 2011).
developed for terrestrial applications, such as the
Autonomous Aerial Acoustic Recording Systems
(AAARS) developed at the University of
Tennessee (Buehler et al. 2014). This system is
13 based on an altitude-controlled weather balloon
Canada seafloor networks: http://www.oceannetworks.
ca; accessed 11 Oct 2021. with an acoustic recorder and a GPS unit with
1 History of Sound Recording and Analysis Equipment 25

radio transmitter. It moves quietly according to changes in buoyancy, in conjunction with


local winds and can be tracked by a radio wings, to convert vertical motion to horizontal
receiver. If ground anchored, this system allows motion, and thereby propel themselves forward
the recording of sounds in a given location. with very low-power consumption. Gliders
Mobile systems based on drones, on the contrary, slowly dive (~ 0.25 m/s horizontal speed) in a
can be stationary or can be programmed to survey saw-tooth pattern through the water. When
a given area, however, they are very noisy and surfacing after a dive, the glider communicates
this can severely affect animal behavior and both with an onshore base station to exchange data and
the quality and usability of the recordings. commands (e.g., send position, remaining battery
capacity, whale detections, and ambient noise
levels, and receive new waypoints). The maxi-
mum operating depth of current models is about
1.5.2 Underwater Mobile Systems
1000 m. Therefore, these instruments are well-
suited for monitoring of deep-diving odontocetes,
The high cost of visual and acoustic marine
such as beaked whales (Klinck et al. 2012).
surveys conducted from large research vessels
Other instruments in this category include
drove the development of new monitoring
deep-diving (Matsumoto et al. 2013) and surface
solutions using autonomous vehicles; either
drifters (Griffiths and Barlow 2015). These
moving on the surface (Unmanned Surface
instruments drift with the ocean current and can-
Vessels, USVs) or underwater (Autonomous
not be programmed to navigate along a defined
Underwater Vehicles, AUVs). These systems are
track-line. However, they are much cheaper than
remotely operated by an onshore pilot and can
gliders. Recent Autonomous Surface Vehicles
monitor offshore areas for weeks or months at a
(ASV) can perform surveys along a pre-defined
time (Klinck et al. 2012, 2015).
track; among these, the Sphyrna (Fig. 1.20) has
The most commonly used autonomous mobile
advanced algorithms to allow 3D passive acoustic
systems to monitor the marine acoustic environ-
tracking of deep divers with four hydrophones
ment are underwater gliders (Baumgartner et al.
fixed on the keel (Poupard et al. 2019).
2013). These instruments (Fig. 1.20) use small

Fig. 1.20 Left: Photograph of the passive acoustic The Sphyrna ASV allows 3D passive acoustic tracking of
seaglider™ developed by the Applied Physics Laboratory, diving cetaceans
University of Washington. Courtesy of G Shilling. Right:
26 G. Pavan et al.

Fig. 1.21 The evolution of the DTAG over fifteen years. 2000 (a) had 400 MB of memory and could record a single
Each design comprises electronics, batteries, suction cups, sound channel at 16 kHz sampling frequency for a few
floatation material, and a VHF transmitter for retrieval hours. The most recent version developed in 2009
when the tag is floating on the sea surface. The tags all (b) records stereo sound at up to 500 kHz sampling fre-
record sound, depth, and motion to solid-state memory. quency for almost two days. (c) is an intermediate version
However, the size, capabilities, and endurance have of the tag. Courtesy of P Tyack and M Johnson (2016)
changed over the years. The earliest version developed in

1.5.3 Animal Acoustic Tags sounds included rumination, which allowed the
researchers to document foraging activities.
A recent development for studying animals Video tags have been attached to whales,
in-situ is the animal-worn acoustic tag. Such dolphins, sirenians, and penguins, and to docu-
devices allow detailed observations of the move- ment the underwater life. Sophisticated acoustic
ment and acoustic behavior of tagged animals. tags provided an important step forward in marine
However, for some species, such as cetaceans, mammal bioacoustics. The development of these
developing a reliable, long-term instrument tags was primarily driven by the need to docu-
attachment has been problematic. ment and understand the reaction of cetaceans to
Recorders in collars, similar to those used for underwater sounds such as naval sonars, airguns,
radio tracking, have also been experimented to and pile drivers. The D-TAG (Johnson and Tyack
record sounds and activity of terrestrial animals 2003), A-Tag (Akamatsu et al. 2007), Acousonde
while moving freely, but with few applications. recorder (Burgess et al. 2011), and other similar
More successful was using the crittercam devel- instruments, feature a variety of animal move-
oped and used by National Geographic to primar- ment detectors (three-axial accelerometer, mag-
ily provide amazing video14 of wild animals netometer, depth-sensor, light sensor, etc.) and
either on land or in water. Lynch et al. (2013) acoustic sensors (hydrophones). These tags are
attached an inexpensive collar-mounted record- attached to the animals with non-invasive suction
ing device on ten wild mule deer (Odocoileus cups, and usually stay attached for a few hours,
hemionus) over two weeks in Colorado. Recorded but can stay on the animal for up to a few days.
Once detached, the tag floats to the surface and
14
https://www.nationalgeographic.org/education/ transmits a radio signal to aid recovery. This kind
crittercam-education/; accessed 11 Oct 2021. of technology (Fig. 1.21) has enabled important
1 History of Sound Recording and Analysis Equipment 27

research on sound usage and behavioral responses traced on paper by an oscillating pen (similar to
of animals to anthropogenic sounds, such as naval a seismometer).
sonars (Tyack 2009; Tyack et al. 2011). The Kay Electric Company (later to become
Often a variety of sensors can be attached to Kay Elemetrics) developed the Sona-Graph™
the animal to provide additional environmental machine, which was a completely analog instru-
or behavioral data to accompany acoustic ment and one of the first instruments to create an
recordings. Evans et al. (2004) attached a water- image of a sound known as a SonaGramTM.
proof video camera with a hydrophone, VHS Developed primarily for navy applications and
recorder, and depth-sensor to examine vocal initially called vibralyzer, this technology was
behavior during dives of Weddell seals in applied successfully to the study of human speech
Antarctica. Each time the seal vocalized, the and animal sounds (Koenig et al. 1946; Borror
depth and time of the sound were documented, and Reese 1953; Thorpe 1954; Marler 1955:
audio and video were recorded, and the call type Fig. 1.22). A SonaGram (sometimes called a
was later analyzed in the laboratory. Researchers sonogram by biologists) is a visual representation
had to retrieve the VHS tapes, but this species of the frequencies (on the y-axis) and intensity
remains close to a colony during the breeding (color or shades of gray as the z-axis) in a sound
season, hauls out on the ice daily, and is easily as they vary with time (on the x-axis). This type of
(re)captured for recovery of the tag and data. image visualization is also called spectrogram.
Current digital video equipment is highly The Sona-Graph™ was very expensive and capa-
miniaturized and allows new exciting options ble of analyzing a signal of only a few seconds in
for exploring the life of animals in the wild. duration up to 8 or 16 kHz. The device offered
two analysis settings, wideband (300 Hz) and
narrowband (45 Hz). The wideband setting
1.6 Advances in Sound Analysis provided better time resolution, while the narrow-
Hard- and Software band setting provided better frequency resolution
(Beecher 1988). The sound could be played back
The most important advancements in sound anal- from a reel-to-reel recorder and recorded on an
ysis equipment were the transition from analog- iron oxide magnetic track, which ran the circum-
to-digital systems, along with the transition from ference of a large internal turntable. A special
hardware to software signal processing. This thermo- sensitive paper was wrapped around a
provided lightweight, field portable, battery- drum mounted on top of the turntable. The drum
operated units with higher storage capacity, spun synchronously with the turntable as the sig-
more stable storage media, and broadband analy- nal was played back through a variable band-pass
sis, often at a more affordable price than before. filter or a filter bank, and a stylus burned the
Now, even a smartphone can produce a spectro- signal onto the paper on the rotating drum
gram in real-time. Another important break- according to the level of sound at the frequencies
through was the ability of scientists to share given by the filter (Fig. 1.23).
digital data using the internet and shared storage This was a smelly, smoky process, which
in the cloud. made the procedure unpleasant for researchers.
Initially, the basic analysis of acoustic signals To analyze a long sound recording, several short
was done using oscilloscopes. These instruments spectrogram sections had to be printed and taped
provided a visual representation of the waveform together. The resulting sheets of paper often
of acoustic signals known as oscillograms, which required a lot of wall or table space for review
are plots with amplitude on the y-axis and time on and further analysis. Because of the large size,
the x-axis. Originally, oscilloscopes were large, these spectrograms were also difficult to reduce in
heavy, expensive, AC powered, and used vacuum size and adapt for inclusion in a publication.
tubes. To obtain a hardcopy of the waveform, a In the 1970s, a camera using Kodak photo-
camera was used to capture an image from the graphic paper (the size of 35-mm film) was
display. In some cases, the waveforms were attached to the screen of an advanced
28 G. Pavan et al.

Fig. 1.22 Photograph of


L. Irby Davis using an early
Kay Electric Co. Sona-
Graph Sound Spectrograph
analyzer (the late 1950s).
Notice the sonogram on the
paper wrapped around the
drum on top of the analyzer.
Courtesy of the Cornell
Laboratory of Ornithology

Fig. 1.23 Two


spectrograms by Ken
Norris illustrating the wide-
band (top) and narrow-band
settings (bottom) of the Kay
Sona-Graph 6061A
spectrum analyzer. Note
that the values of the x- and
y-axes were not printed on
the output. The x-axis is
time in seconds and y-axis
is the frequency in hertz.
Courtesy of the Cornell
Laboratory of Ornithology

oscilloscope capable of performing real-time FFT frequency and time could be taken as the
spectrum analysis (Hopkins et al. 1974). As the spectrograms were displayed. The photographic
sound played, a spectrogram image appeared on paper had to be developed in a dark room and
the screen and the camera photographed the produced a roll of 35-mm paper about 4 m long.
resulting image in real-time. Measurements of One advantage of this system was the ability to
1 History of Sound Recording and Analysis Equipment 29

view the sounds in real-time, which allowed each one characterized by frequency, amplitude,
scientists to study patterns of sounds. This system and phase. This algorithm was successfully
produced long-lasting spectrograms that are still applied to the human voice and to animal sounds
usable 40 years later (see Thomas and Kuechle to produce spectrograms in different formats. The
1982 for samples of sonogram output). speed and data-handling capabilities of computers
Once thermal imaging paper (similar to the in subsequent years allowed for the implementa-
paper used in older fax machines) was developed, tion of more complex mathematical signal
Kay, Unigon, and other companies developed processing algorithms (see Chap. 4 on signal
real-time spectrogram imaging units, which had processing).
a continuous output using large rolls (8 inch A few years later, in 1980, a computer-based
wide) of thermal imaging paper. For further anal- digital spectrographic workstation was developed
ysis, segments had to be cut with scissors. How- at the University of Pavia (Italy) that produced
ever, these data were difficult to analyze, store, black-and-white spectrograms of animal sounds
and prepare for publication. Measurements of on a computer screen, with a moving cursor to
frequency and time could be taken as the images take measures. The workstation produced and
were displayed on the analyzer but were not printed a spectrogram of a 1-s signal in about
provided on the output itself. If exposed to light 40 minutes (Pavan 1983, 1985). The
or heat, the hardcopies gradually turned brown AD-converter allowed users to acquire and ana-
and were generally unusable after a few years. lyze sounds in the ranges of 5, 10, and 20 kHz
In the mid-1970s, the first attempts were made with a sampling frequency of 51.2 kHz.
to use general-purpose computers to analyze Hardcopies of displays were made on the
sounds, mainly for speech analysis. These computer’s printer and then joined together
attempts used the Fast Fourier Transform (Strong (Fig. 1.24).
and Palmer 1975), an algorithm that decomposes Around that same time, in 1984, a group of
a signal segment into a finite number of sinusoids, acousticians at The Rockefeller University and

Fig. 1.24 Black-and-white spectrogram of a 2.4-s bird axis is the frequency in hertz. Frequency range 0–5 kHz,
song (Thekla lark) produced in 1981 by joining three sampling frequency 20,480 Hz, and 12-bit resolution
printouts of 800 ms each; the spectrogram generation (72-dB dynamic range). From top: spectrogram, envelope,
required 2 hours. The x-axis is time in seconds and y- tracking of dominant frequency, and amplitude plot in dB
30 G. Pavan et al.

Fig. 1.25 Photograph of


an envelope-plot and color
spectrogram generated by
the digital signal processing
workstation based on
HP1000 mainframe in
1985. Recordings were of
calls of a Barbary partridge
(Alectoris barbara)

Engineering Design Inc. developed a software Intel 8086/8087 processors and a high-quality
program, called Signal. This software was devel- Audiologic Duetto sound board produced in
oped for computers and was able to control and Italy, with sampling frequency up to 48 kHz
communicate with the recording hardware. The with 16-bit resolution, and later with a widely
system was able to display spectrograms in real- available and cheap Sound Blaster sound card.
time, provide basic time-frequency information A mouse-driven cursor allowed to take accurate
of recorded signals, and store data digitally on measures directly on the computer screen, and
the computer’s hard disc. These developments printouts were possible in gray scales on standard
revolutionized bioacoustics sound analysis; how- matrix-dot printers or on thermal printers. By
ever, at the time, these units were expensive, storing the recordings in a digital format, it was
custom-made, and had very little storage capacity also possible to edit the recordings and to play
(the typical storage available in 1985 was 5 MB them back at a different speed or even backward
on a 15-inch magnetic disc). (e.g., to produce playback tapes for behavioral
In 1985, the spectrographic workstation experiments).
was upgraded to produce color spectrograms At the same time, other researchers started
(Fig. 1.25; Pavan 1992) on a mainframe computer experimenting with digital signal processing.
(HP 1000) interfaced to an AD-converter and to a Aubin (France) and Specht (Germany) developed
graphic workstation.15 Around this time, the first similar digital sound analysis systems that
personal computers (PC) appeared, and the soft- also included the synthesis of sounds for
ware was rewritten to produce real-time color playback experiments (Bremond and Aubin
spectrograms and signal envelopes using an 1989; Specht 1992; Aubin et al. 2000).
Specialized AD-converters appeared on the mar-
15
http://www.unipv.it/cibra/res_dspwstory_uk.html; ket to sample analog signals at high rates, which
accessed 29 Oct 2021. allowed digital recording and analysis of
1 History of Sound Recording and Analysis Equipment 31

Fig. 1.26 Photograph of the University of Pavia portable open-reel stereo recorder, cassette deck recorder,
bioacoustic laboratory equipment in 1989 with a Kay filter bank, speakers, and headphone
Sona-Graph DSP 5500, color monitor, thermal printer,

frequencies up to 100 kHz. However, specialized bioacoustics, it is worth to mention Canary,


processors (Digital Signal Processors, DSP) were developed for Macintosh computers at Cornell
required to process ultrasonic signals in real-time University, then replaced by Raven,16 a multi-
(Pavan 1992, 1994). platform software developed from the same uni-
In 1987, new commercially available digital versity. For an overview of computer-based bio-
instruments dedicated to sound analysis became acoustics sound analysis and related algorithms,
available, among them the Kay Sona-Graph DSP see Hopp et al. (1998), Zimmer (2011), and Sueur
5500 (Fig. 1.26). This very expensive unit was (2018). Many academic institutions and
able to analyze and display stereo signals in real- companies started to develop software programs
time up to 32 kHz. Either reel-to-reel or cassette for PC, Mac, and Linux computers.17
recordings could be used as an input, and the unit These software programs allowed for easy
had a thermal-paper printer for printing gray- recording, manipulation, analysis, and display of
shaded spectrograms. signals. Now, researchers are able to collect huge
Digital sound storage and analysis became acoustic datasets, and computational bioacoustics
widespread given the improvements in digital faces the Big Data problem. The latest software
computer technology and data storage, coupled programs, either commercial or open source, also
with the proliferation of personal computers, and enable the user to run sophisticated detection/
the development of dedicated sound analysis soft-
ware packages. These advances also fostered the
16
development of high-quality electro-acoustic and Accessed from the K. Lisa Yang Center for Conserva-
tion Bioacoustics https://ravensoundsoftware.com/soft
musical equipment (microphones, recorders, and
ware/raven-pro/; accessed 11 Oct 2021.
AD-converters) for a rapidly expanding consumer 17
List of available software: http://tcabasa.org/?page_
market of musicians and music enthusiasts. id¼2666; accessed 4 Oct 2021. https://github.com/rhine3/
Among the first analysis software dedicated to bioacoustics-software; accessed 20 Jun 2022.
32 G. Pavan et al.

classification algorithms over long-term data sets enabled by equipment developed for military
for automated detection of occurrences of a target use, professional music applications, human
sound (see Chap. 8 on detection and classification speech analysis, and for the radio, television,
methods). This saves much time and avoids hav- and film industries. Often an improvement in
ing to view and listen to the entire recording one type of equipment led to advancements in
manually. Scientists also can use readily available another. Analog devices, which stored data on
programming environments (including MATLAB, magnetic tape, were replaced by digital devices,
Octave, Python, R) to develop their own analyses, such as optical discs, hard drives and solid-state
often facilitated by libraries of procedures dedi- memory cards. Microphones and hydrophones
cated to sound processing and bioacoustic analy- are now used in arrays that allow long-term mon-
sis (e.g., Sueur et al. 2008; Sueur 2018; Ulloa itoring, localization of the sound-producing
et al. 2021). animals, and 3D acoustic recording. Towed
In the late 1990s, smartphone technology was hydrophone arrays allow mobile surveys of
developed, along with sound analysis software marine sounds, which can be coupled with animal
for these devices. Smartphones of the twenty- sightings and environmental data. Autonomous
first century have the same computing power as transducer/recorder units can be deployed for
a desktop PC. Sound recording and visualization long-term monitoring of biotic and abiotic sounds
applications were developed for both Android in both air and water in remote habitats. Recently,
and iPhone Operating System (iOS) platforms. smartphone applications have provided an afford-
In addition, the development of the Internet of able and portable bioacoustics laboratory for use
Things and low-cost computer platforms (e.g., by hobbyists, citizen scientists, and researchers
Arduino, Raspberry PI, and others) have allowed alike.
scientists to build web-enabled data recording and The digital revolution in sound recording and
analysis systems. These new technologies and analysis has facilitated significant advances in the
analytical methods can be applied not only to field of bioacoustics and enabled the development
audible sound but also to infrasonic and ultra- of ecoacoustics, which joins bioacoustics and ecol-
sonic signals. For example, ultrasonic echoloca- ogy, and computational bioacoustics. Acousticians
tion signals produced by bats can now easily be are now able to study the sounds from sound-
shifted into the human hearing range, visualized, producing species in a wide variety of locations,
and analyzed in real-time with handheld digital during day and night, year-round, and often
devices, with a smartphone equipped with an remotely. Many free and commercially available
ultrasonic microphone, or remotely monitored software packages for recording and analyzing
with web-connected recorders.18 acoustic data have been developed for computers,
tablets, and smartphones. Artificial Intelligence is
now being applied to big data problems and to
bioacoustic recordings to hopefully classify and
1.7 Summary
recognize sounds at species level. It has never
been easier or cheaper to study the acoustic world
Advances in electronic technology over the last
ranging from infrasounds to ultrasounds. How-
100 years, including the dramatic size reduction
ever, it is always important to know the intrinsic
of equipment, increased battery life, increased
limitations of each piece of equipment or software,
data storage capacity, the switch from analog-to-
the constraints given by the environmental context,
digital recorders, along with the transition from
and all their potential impact on the final results. It
analog-to-digital signal processing, have
is also worth considering that bioacoustics and
facilitated an explosion of research in the field
ecoacoustics are now being widely used to study
of bioacoustics. Many of these advances were
and monitor critical and endangered species and to
monitor entire ecosystems to understand climate
18
http://www.bat-pi.eu/; accessed 11 Oct 2021. change impacts.
1 History of Sound Recording and Analysis Equipment 33

References Biology of Marine Mammals, Tampa, Florida,


November–December 2011
Buzzetti F, Brizio C, Pavan G (2020) Beyond the audible:
Akamatsu T, Telemann J, Miller LA, Tougaard J, Dietz R,
wide band (0-125 kHz) field investigation on Italian
Wang D, Wang K, Siebert U, Naito Y (2007) Compar-
Orthoptera (Insecta) songs. Biodivers J 11(2):443–496
ison of echolocation behaviour between coastal and
Caruso F, Sciacca V, Bellia G, De Domenico E, Larosa G,
reverie porpoises. Deep-Sea Res II 54(3–4):290–297
Papale E, Pellegrino C, Pulvirenti S, Riccobene G,
André M, van der Schaar M, Zaugg S, Houégnigan L,
Simeone F, Speziale F, Viola S, Pavan G (2015) Size
Sánchez AM, Castell JV (2011) Listening to
distribution of sperm whales acoustically identified
the deep: live monitoring of ocean noise and cetacean
during long term deep-sea monitoring in the Ionian
acoustic signals. Mar Pollut Bull 63:18–26
Sea. PLoS One 10:e0144503. https://doi.org/10.1371/
Araya-Salas M, Smith-Vidaurre G, Webster M (2017)
journal.pone.0144503
Assessing the effect of sound file compression and
Colladon JD (1893) Souvenirs et memories. Albert-
background noise on measures of acoustic signal struc-
Schuchardt, Geneva
ture. Bioacoustics 28(1):57–73. https://doi.org/10.
Evans WE, Thomas JA, Davis R (2004) Sounds used
1080/09524622.2017.1396498
during underwater navigation by a Weddell seal
Au WWL (1993) The Sonar of Dolphins, Springer, Berlin,
(Leptonychotes weddellii). In: Thomas JA, Moss C,
278 pp
Vater V (eds) Echolocation in bats and dolphins.
Au WWL, Hastings MC (2008) Principles of marine bio-
Univ Chicago Press, pp 541–547
acoustics. Springer, Berlin
Farina A (2014) Soundscape ecology. Springer:315 pp
Aubin T, Rybak F, Moulin B (2000) A simple method for
Farina A, Gage SH (eds) (2017) Ecoacoustics: the ecolog-
recording low-amplitude sounds. Application to the
ical role of sounds. Wiley, Hoboken. 352 pp
study of the courtship song of the fruit fly Drosophila
Favali P, Chierici F, Marinaro G, Giovanetti G,
melanogaster. Bioacoustics 11:51–67
Azzarone A, Beranzoli L, De Santis A, Embriaco D,
Baumgartner MF, Fratantoni DM, Hurst TP, Brown MW,
Monna S, Lo Bue N, Sgroi T, Cianchini G, Badiali L,
Cole TVN, Van Parijs SM, Johnson M (2013) Real-
Qamili E, De Caro M. G, Falcone G, Montuori C.,
time reporting of baleen whale passive acoustic
Frugoni F., Riccobene G, Sedita M., Barbagallo G,
detections from ocean gliders. J Acoust Soc Am 134:
Cacopardo G, Calì C, Cocimano R, Coniglione R,
1814–1823
Costa M, D’Amico A, Del Tevere F., Distefano C,
Beecher MD (1988) Spectrographic analysis of animal
Ferrera F, Valentina Giordano V, Massimo Imbesi M,
vocalizations: implications of the “uncertainty princi-
Dario Lattuada D, Migneco E, Musumeci M,
ple”. Bioacoustics 1:187–208
Orlando A, Papaleo R, Piattelli P, Raia G, Rovelli A,
Blumstein DT, Mennill DJ, Clemins P, Girod L, Yao K,
Sapienza P, Speziale F, Trovato A, Viola S, Ameli F,
Patricelli G, Deppe JL, Krakauer AH, Clark CW,
Bonori M, Capone A, Masullo R, Simeone F,
Cortopassi KA, Hanser SF, McCowan B, Ali AM,
Pignagnoli L, Zitellini N, Bruni F, Gasparoni F,
Kirschel ANG (2011) Acoustic monitoring in terres-
Pavan G (2013) NEMO-SN1 abyssal cabled observa-
trial environments using microphone arrays:
tory in the Western Ionian Sea. IEEE J Ocean Engineer
applications, technological considerations and pro-
38(2): 358–374
spectus. J Appl Ecol 48:758–767
Fish MP, Mowbray WH (1970) Sounds of Western North
Borror DJ, Reese CR (1953) The analysis of bird songs by
Atlantic fishes. A reference file of biological underwa-
means of a vibralyzer. Wilson Bulletin 65(4):271–303
ter sounds. The John Hopkins Press, Baltimore
Bremond JC, Aubin T (1989) Choice and description of a
Fourniol M, Gies V, Barchasz V, Kussener E,
method of sound synthesis adapted to the study of bird
Barthelemy H, Vauche R, Glotin H (2018)
calls. Biol Behav 14:229–237
Low-power frequency trigger for environmental inter-
Budney GF, Grotke RW (1997) Techniques for audio
net of things, in IEEE/ASME Mechatronic & Embed-
recording vocalizations of tropical birds. In: Remsen
ded Systems. http://sabiod.org/bib
JV (ed) Ornithological Monographs 48: studies in neo-
Frommolt KH, Bardeli R, Clausen M, (editors), 2008)
tropical ornithology honoring Ted Parker. American
Computational bioacoustics for assessing biodiversity.
Ornithologists’ Union, Chicago, IL, pp 147–163
Proceedings of the International Expert meeting on
Buehler, DA, Hockman EV, Prevost EC, Wilkerson JB,
IT-based detection of bioacoustical patterns. Published
Smith DR, Fischer RA (2014) Demonstration and
by Federal Agency for Nature Conservation, Bonn,
implementation of autonomous aerial acoustic record-
Germany
ing systems (AAARS) to monitor bird populations in
Garner RL (1892) The speech of monkeys. William
inaccessible areas. Proceedings of AOUCOSSCO
Heinemeann, London
2014, Estes Park, Colorado, USA, p 17
Glotin H, Blakefield G, Trone M, Bonnett DE, Giraudet P,
Burgess WC, Oleson EM, Baird RW (2011) A hydrody-
Giès V, Barchasz V, Patris J, Malige F, Balestriero R
namic acoustic recording tag for small cetaceans and
(2018) 1 MHz SR high definition 3D tracking of
first results from a pantropical spotted dolphin. Poster
wild Amazon River dolphins with JASON tiny array,
presented at the 19th Biennial Conference on the
results and perspectives, in detection, classification,
34 G. Pavan et al.

localisation, & density estimation workshop. devices for studying animal behavior. Ecol Evol 3(7):
Sorbonne, Paris 2030–2037. https://doi.org/10.1002/ece3.608
Griffin DR (1944) Echolocation by blind men, bats and Marler P (1955) Characteristics of some animal calls.
radar. Science 29:589–590 Nature 176:6–8
Griffiths ET, Barlow J (2015) Design of the drifting acous- Matsumoto H, Jones C, Klinck H, Mellinger DK, Dziak
tic spar buoy recorder (DASBR). Proceedings of the RP, Meinig C (2013) Tracking beaked whales with a
7th international workshop on detection, classification, passive acoustic profiler float. J Acoust Soc Am 133:
localization, and density estimation of marine 731–740
mammals using passive acoustics, San Diego, CA, McCauley RD, Thomas F, Parsons MJG, Erbe C, Cato D,
USA, pp. 86 Duncan AJ, Gavrilov AN, Parnum IM, Salgado-Kent
Hill AP, Prince P, Covarrubias EP, Doncaster CP, C (2017) Developing an underwater sound recorder.
Snaddon JL, Rogers A (2018) AudioMoth: evaluation Acoust Aust 45(2):301–311. https://doi.org/10.1007/
of a smart open acoustic device for monitoring s40857-017-0113-8
biodiversity and the environment. Methods Melchior VR (2019) High resolution audio: a history and
Ecol Evol:1–13 perspective. J Audio Eng Soc 67(5):246–257
Hill P, Lakes-Harlan R, Mazzoni V, Narins PM, Virant- Mellinger DK, Stafford KM, Moore SE, Dziak RP,
Doberlet M, Wessel A (eds) (2019) Biotremology: Matsumoto H (2007) An overview of fixed passive
studying vibrational behavior. Springer Verlag, acoustic observation methods for cetaceans. Oceanog-
p 534. https://doi.org/10.1007/978-3-030-22293-2 raphy 20(4):36–45
Hopkins CD, Rosetto M, Lutjen A (1974) A continuous Munk W, Wunsch C (1979) Ocean acoustic tomography: a
sound Spectrum analyzer for animal sounds. Z scheme for large scale monitoring. Deep-Sea Res
Tierpsychol 34(3):313–320. https://doi.org/10.1111/j. 26A:123–161
1439-0310.1974.tb01804.x Naramoto MV (2000) A concise history of acoustics in
Hopp SL, Owren MJ, Evans CS (eds) (1998) Animal warfare. Appl Acoust 59:137–147
acoustic communication: sound analysis and research Nishimura CE, Conlon DM (1994) IUSS dual use: moni-
methods. Springer, p 421 toring whales and earthquakes using SOSUS, marine
Jacobson EK, Forney KA, Harvey JT (2016) Evaluation of tech. Soc J 27:13–21
a passive acoustic monitoring network for harbor por- Nosengo N (2009) The neutrino and the whale. Nature
poise in California. Tech rep no. CEC-500-2016-008. 462:560–561
https://doi.org/10.13140/RG.2.1.2282.9680 Obrist MK, Pavan G, Sueur J, Riede K, Llusia D, Marquez
Johnson MP, Tyack PL (2003) A digital acoustic R (2010) Bioacoustic approaches in biodiversity
recording tag for measuring the response of inventories. In: manual on field recording techniques
wild marine mammals to sound. IEEE J Ocean Eng and protocols for all taxa biodiversity inventories. Abc
28:3–12 Taxa 8:68–99
Klinck H, Mellinger DK, Klinck K, Bogue NL, Luby JC, Pavan G (1983) Ricerche di elettronica applicata alla
Jump WA, Shilling GS, Litchendorf T, Wood AS, biologia. 1. Analisi computerizzata del canto degli
Schorr GS, Baird RW (2012) Near-real-time acoustic uccelli. Pubbl Ist Entom Univ di Pavia 24:1–43
monitoring of beaked whales and other cetaceans using Pavan G (1985) Analisi con calcolatore delle emissioni
a seaglider™. PLoS One 7(5):e36128. https://doi.org/ acustiche degli uccelli. Annuario EST (Enciclopedia
10.1371/journal.pone.0036128 della Scienza e della Tecnica). Arnoldo Mondadori,
Klinck H, Fregosi S, Matsumoto H, Turrpin A, Mellinger Milano, pp 135–140
DK, Erofeev A, Barth JA, Shearman RK, Pavan G (1992) A portable DSP workstation for real-time
Jafarmadar K, Stelzer R (2015) Mobile autonomous analysis of cetacean sounds in the field. European
platforms for passive-acoustic monitoring of high- Research on Cetaceans 6:165–169
frequency cetaceans. Proceedings of the 8th Interna- Pavan G (1994) A digital signal processing workstation for
tional Robotic Sailing Conference, Mariehamn, Åland, bioacoustical research. Atti 6 conv. Ital. Ornit. Torino
Finland, August 2015: 29–38 1991:227–234
Koenig W, Dunn HK, Lacy LY (1946) The sound spectro- Pavan G (2017) Fundamentals of soundscape conserva-
graph. J Acoust Soc Am 18:19–49 tion. In: Farina A, Gage SH (eds) Ecoacoustics: the
Lammers MO, Brainard RE, Whitlow WWL, Mooney TA, ecological role of sound. Wiley, pp 235–258
Wong KB (2008) An ecological acoustic recorder Pavan G (web) http://www.unipv.it/cibra/res_dspwstory_
(EAR) for long term monitoring of biological and uk.html. Accessed 28 March 2018
anthropogenic sounds on coral reefs and other marine Pavan G, Borsani JF (1997) Bioacoustic research on
habitats. J Acoust Soc Am 123:1720–1728 cetaceans in the Mediterranean Sea. Mar Freshw
Little RS (2003) For the birds: the Laboratory of Ornithol- Behav Physiol 30:99–123
ogy and Sapsucker Woods at Cornell University. Pavan G, Fossati C, Manghi M, Priano M (2004) Passive
Basking Ridge, NJ acoustics tools for the implementation of acoustic risk
Lynch E, Angeloni L, Fristrup K, Joyce D, Wittemyer G mitigation policies. In: Evans PGH, Miller LA (eds)
(2013) The use of on-animal acoustical recording Proceedings of the workshop on active sonar and
1 History of Sound Recording and Analysis Equipment 35

cetaceans, 17th ECS conference, March 2003. whale (Balaenoptera physalus) offshore eastern Sicily,
European Cetacean Society Newsletter no. 42 – Spe- Central Mediterranean Sea, PLoS One, 10 (11):
cial Issue: 52-58 e0141838. https://doi.org/10.1371/journal.pone.
Pavan G, Favaretto A, Bovelacci B, Scaravelli D, 0141838
Macchio S, Glotin H (2015) Bioacoustics and Sethi SS, Ewers RM, Jones NS, Orme CDL, Picinali L
ecoacoustics applied to environmental monitoring and (2018) Robust, real-time and autonomous monitoring
management. Rivista Italiana di Acustica 39(2):68–74 of ecosystems with an open, low-cost, networked
Payne KB, Langbauer WR Jr, Thomas EM (1986) Infra- device. Methods Ecol Evol 9(12):2383–2387
sonic calls of the Asian elephant (Elephas maximus). Sousa-Lima R, Norris TF, Oswald JN, Fernandes DP
Behav Ecol Sociobiol 18:297–301 (2013) A review and inventory of fixed autonomous
Pijanowski BC, Farina A, Gage SH et al (2011a) What is recorders for passive acoustic monitoring of marine
soundscape ecology? An introduction and overview of mammals. Aquat Mamm 39:23–53
an emerging new science. Landsc Ecol 26:1213–1232. Specht R (1992) Sonagraphishce Analysen mit dem per-
https://doi.org/10.1007/s10980-011-9600-8 sonal computer. Der Falke 1992:281–283
Pijanowski BC, Villanueva-Rivera LJ, Dumyahn SL et al Stafford KM, Fox CG, Clark DS (1998) Long-range
(2011b) Soundscape ecology: the science of sound in acoustic detection and localization of blue whale calls
the landscape. Bioscience 61:203–216. https://doi.org/ in the Northeast Pacific Ocean. J Acoust Soc Am
10.1525/bio.2011.61.3.6 104(6):36161–33625
Pohlmann KC (1995) Principles of digital audio. McGraw- Stoeger AS, Heilmann G, Zeppelzauer M, Ganswindt A,
Hill, New York Hensman S, Charlton BD (2012) Visualizing sound
Potter RK, Kopp GA, Green HC (1947) The visible emission of elephant vocalizations: evidence for two
speech, Bell Laboratory, Cambridge, MA rumble production types. PLoS One 7(11):e48907.
Poulsen V (1900) Method of recording and reproducing https://doi.org/10.1371/journal.pone.0048907
sounds or signals. US Patent 661619A Streicher R, Everest FA (1998) The new stereo
Poupard M, Ferrari M, Schluter J, Marxer R, Giraudet P, Soundbook. AES Publishing, Pasadena, CA
Barchasz V, Gies V, Pavan G, Glotin H (2019) Real- Strong WJ, Palmer EP (1975) Computer-based sound
time passive acoustic 3D tracking of deep diving ceta- spectrograph system. J Acoust Soc Am 58:899–904
cean by small non-uniform Mobile surface antenna. Sueur J (2018) Sound analysis and synthesis with
IEEE ICASSP:8251–8255 R. Springer, New York
Ranft R (1997) The wildlife section of the British library Sueur J, Aubin T, Simonis C (2008) Seewave: a free
National Sound Archive. Bioacoustics 7:315–319 modular tool for sound analysis and synthesis. Bio-
Ranft R (2001) Capturing and preserving the sounds of acoustics 18:213–226
nature. In: Linehan A (ed) Aural history: essays on Thomas JA, Kuechle VB (1982) Quantitative analysis of
recorded sound. The British Library, London, pp Weddell seal (Leptonychotes weddellii) underwater
65–78 vocalizations at McMurdo Sound, Antarctica. J Acoust
Ranft R (2004) Natural sound archives: past, present and Soc Am 72(6):1730–1738
future. An Acad Bras Cienc 76(2):455–465 Thomas JA, Awbrey FT, SR Fisher (1986) Use of acoustic
Rayburn RA (2011) Eargle’s the microphone book: from techniques for studying whale behavior. Rep Int Whal
mono to stereo to surround - a guide to microphone Commn Special Issue 8, No. SC/A82/Bw10: 121–138
design and application, 3rd edn. Elsevier, Oxford Thomas JA, Fisher SR, Ferm LM, Holt RS (1987) Acous-
Righini R, Pavan G (2019) First assessment of the tic detection of cetaceans using a towed-array of
soundscape of the integral nature reserve “Sasso hydrophones. Rep Int Whal Commn Special Issue
Fratino” in the central Apennine, Italy. Biodivers J 8, No. SC/37/03: 139–148
21(1):4–14. https://doi.org/10.1080/14888386.2019. Thorpe WH (1954) The process of song-learning in the
1696229 chaffinch as studied by means of the sound spectro-
Robjohns H (2010) A brief history of microphones. http:// graph. Nature 173:465
microphone-data.com/media/filestore/articles/ Tremblay C, Calupca T, Clark CW, Robbins M,
History-10.pdf Spaulding E, Warde A, Kemp J, Newhall K (2009)
Rossing TD (ed) (2007) Springer handbook of acoustics. Autonomous seafloor recorders and auto detection
Springer, New York. 1167 pp with CD-ROM buoys to monitor whale activity for long-term and
Rumsey F, McCormick T (2009) Sound and recording. near-real-time applications. J Acoust Soc Am 125:
Focal Press, New York. 628 pp 2548
Sales G, Pye D (1974) Ultrasonic communication by Tyack PL (2009) Human-generated sound and marine
animals. Chapman and Hall, London mammals. Phys Today, November: 39–44
Sciacca V, Caruso F, Beranzoli L, Chierici F, De Tyack PL, Zimmer WMX, Moretti D, Southall BL,
Domenico E, Embriaco D, Favali P Giovanetti, Claridge DE, Claridge DE, Durban JW, Clark CW,
Larosa G, Marinaro G, Papale E, Pavan G, D’Amico A, DiMarzio N, Jarvis S, McCarthy E,
Pellegrino C, Pulvirenti S, Simeone F, Viola S, Morrissey R, Ward J, Boyd IL (2011) Beaked whales
Riccobene G (2015) Annual acoustic presence of fin respond to simulated and actual navy sonar. PLoS One
36 G. Pavan et al.

6(3):e17009. https://doi.org/10.1371/journal.pone. Wahlstrom S (1985) The parabolic reflector as an acousti-


0017009 cal amplifier. J Audio Eng Soc 33(6):418–430
Ulloa JS, Haupert S, Latorre JF, Aubin T, Sueur J (2021) Watkins WA, Schevill WE (1977) Sperm whale codas. J
Scikit-maad: an open-source and modular toolbox for Acoust Soc Am 62:1485–1490
quantitative soundscape analysis in Python. Methods Watkins WA, Tyack PL, Moore KE (1987) The 20 Hz
Ecol Evol 12(12):2334–2340. https://doi.org/10.1111/ signals of finback whales (Balaenoptera physalus). J
2041-210X.13711 Acoust Soc Am 82:1901–1912
Urick RJ (1983) Principles of underwater sound. McGraw- Watkins WA, George GE, Daher MA, Mullin K, Martin
Hill, New York DL, Haga SH, DiMarzio N (2000) Whale call data for
Viola S, Grammauta R, Sciacca V, Bellia G, Beranzoli L, the North Pacific November 1995 through July 1999.
Buscaino G, Caruso F, Chierici F, Cuttone G, Occurrence of calling whales and source locations
Embriaco D, Giovanetti G, Favali P, Pavan G, from SOSUS and other acoustic systems. WHOI Tech-
Pellegrino C, Pulvirenti S, Riccobene G, Simeone F nical Report 2000–02: 160 pp
(2017) Continuous monitoring of noise levels in the Wenz G (1962) Acoustic ambient noise in the ocean:
Gulf of Catania (Western Ionian Sea). Study of corre- spectra and sources. J Acoust Soc Am 34:1936–1956
lation with ship traffic. Mar Pollut Bull 121(1–2): Zimmer WMX (2011) Passive acoustic monitoring of
97–103. https://doi.org/10.1016/j.marpolbul.2017. cetaceans. Cambridge University Press, Cambridge
05.040 Zotter F, Frank M (2019) Ambisonics. Springer
Verlag, Cham

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
Choosing Equipment for Animal
Bioacoustic Research 2
Shyam Madhusudhana, Gianni Pavan, Lee A. Miller,
William L. Gannon, Anthony Hawkins, Christine Erbe,
Jennifer A. Hamel, and Jeanette A. Thomas

2.1 Introduction by available equipment. Over time, technological


advances and the availability of user-friendly anal-
Until a few decades ago, progress in bioacoustic ysis software have made bioacoustics research
and then ecoacoustic research was severely limited more commonplace. The advantage of passive
bioacoustic studies (in which sounds are often
remotely recorded) is that the methods are
Jeanette A. Thomas (deceased) contributed to this chapter
non-invasive and anyone with a minimal amount
while at the Department of Biological Sciences, Western
Illinois University-Quad Cities, Moline, IL, USA of equipment can record animal sounds. However,
this disadvantage diminishes if a researcher is not
S. Madhusudhana (*) knowledgeable about the characteristics and
K. Lisa Yang Center for Conservation Bioacoustics,
limitations of the equipment being used. Given
Cornell Lab of Ornithology, Cornell University, Ithaca,
NY, USA the rapid advances in digital technology,
e-mail: shyamm@cornell.edu bioacousticians are often challenged with keeping
G. Pavan up with these advances. Appropriate selection and
Department of Earth and Environment Sciences, usage of sensors, amplifiers, filters, and recorders,
University of Pavia, Pavia, Italy and proper usage of analysis software are key to
e-mail: gianni.pavan@unipv.it
valid studies on animal sounds. This chapter guides
L. A. Miller bioacoustics researchers in selecting appropriate
Institute of Biology, University of Southern Denmark,
gear for maximizing the outcomes of their research.
Odense M, Denmark
e-mail: lee@biology.sdu.dk To record, store, and play back sounds, there
are two types of devices: analog and digital. Ana-
W. L. Gannon
Department of Biology and Graduate Studies, Museum of log recording devices, such as cassette recorders
Southwestern Biology, University of New Mexico, and reel-to-reel tape recorders, are now obsolete
Albuquerque, NM, USA and almost completely replaced by digital record-
e-mail: wgannon@unm.edu
ing devices. However, many researchers over
A. Hawkins time have made phonograph, reel-to-reel, or cas-
The Aquatic Noise Trust, Kincraig, Blairs, Aberdeen, UK
sette recordings, which provide historical data.
C. Erbe So, when reading an older research article in
Centre for Marine Science and Technology, Curtin
bioacoustics, one may have to consider the poten-
University, Bentley, WA, Australia
e-mail: c.erbe@curtin.edu.au tial limitations of the specific equipment used at
the time and their ramifications on the reported
J. A. Hamel
Department of Biology, Elon University, Elon, NC, USA findings. Chapter 1 provides an overview of older
e-mail: jhamel2@elon.edu and historic equipment.
# The Author(s) 2022 37
C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1,
https://doi.org/10.1007/978-3-030-97540-1_2
38 S. Madhusudhana et al.

2.2 Basic Concepts of Sound which is ½ of the sampling frequency (see


Recording Chap. 4). Sampling frequency for the standard
CD is 44.1 kHz (i.e., high enough to match the
The acquisition, storage, and playback of sounds full human hearing range). An 8-kHz sampling
in digital systems involve the interoperation of a frequency suffices to understand the human
few independent components (Fig. 2.1). Bio- voice. Nowadays, digital recorders easily sample
acoustics researchers may choose to source the up to 192 kHz and higher, with the flexibility to
necessary components and assemble a setup choose lower sampling frequencies (32, 44.1,
themselves. The practical considerations for 48, 88.2, and 96 kHz are common). Instrumenta-
selecting these components will be covered in tion recorders can have sampling frequencies up
Sect. 2.3. Alternatively, researchers may opt for to 1 MHz.
pre-assembled equipment. The growing market Despite the available sampling frequencies,
has made available a wide variety of programma- the actual recording bandwidth of a recorder is
ble, and often customizable, autonomous dictated by the analog electronics before the
recorders. Section 2.4 discusses a few of the analog-to-digital (AD) converter. Because most
widely used terrestrial and underwater autono- commercial recorders are designed for the record-
mous recorders. Organizations developing auton- ing of music or human speech, the upper fre-
omous recorders often invest in the necessary quency is often limited to 20 kHz and the
trial-and-error experimentation for arriving at electronics do not have a flat frequency response
optimal combinations of components for different beyond this limit, even if selecting a high sam-
applications. The use of such pre-assembled pling frequency such as 192 kHz. For profes-
equipment allows bioacoustics researchers to cir- sional recorders, the real frequency response
cumvent the associated efforts (financial and (i.e., the output amplitude across frequencies as
labor). However, unique demands of specific a function of input amplitude) is usually stated in
studies may not always be addressed by existing the equipment specifications (e.g., flat to within
autonomous recorders. Before diving into details 3 dB between 10 Hz and 60 kHz). If the fre-
of each component, we provide a quick recap of quency response is not specified, it is important to
the overarching concepts and terminologies. make some tests using a frequency-generator as a
sound source. It is also important to consider that
the frequencies close to the Nyquist frequency
might be affected by artifacts such as aliasing.
2.2.1 Sampling Rate and Bandwidth

The sampling rate used when converting analog


electronic signals to digital signals limits the max- 2.2.2 Aliasing
imum frequency that can be recorded. The sam-
pling frequency is measured in hertz, and the According to sampling theory, to preserve all
sampling rate (which has the same value but information in an analog signal, a sampling fre-
different unit) is measured in samples/s. The fre- quency at least twice the highest frequency in the
quency range is limited by the Nyquist frequency, signal (including harmonics) should be used. A

Fig. 2.1 Signal chain of a


typical digital recording
setup in bioacoustics
studies showing the
different components
involved in the collection,
analysis, and transmission
of sounds
2 Choosing Equipment for Animal Bioacoustic Research 39

non-optimal sampling frequency can produce 1 mV/Pa (¼0.001 V/Pa) can be expressed as
misrepresentations of components in the original 60 dB re 1 V/Pa. Note that an rms sound pres-
waveform, which often manifest as artifacts in a sure of 1 Pa is equal to a sound pressure level
spectrographic display but are not actually pres- (SPL) of 94 dB re 20 μPa, because
ent in the original signal (see Chap. 4, section on
1 Pa ¼ 1,000,000 μPa ¼ 50,000  20 μPa;
aliasing). In a spectrogram, the alias is mostly in
the higher frequency region and appears as the apply 20 log 10 and get: 20 log 10 ð50,000Þ ¼ 94:
mirror-image of the actual signals beyond the
The most sensitive sensor is not necessarily the
Nyquist frequency (Fig. 2.2). In digital recording,
“best” sensor. When attempting to capture very
anti-aliasing filters (Sect. 2.3.2.2) are required
loud sound, less sensitive equipment should be
before the sampling stage to prevent aliasing
chosen to avoid signal distortion or, in extreme
from sounds that have components higher than
cases, damaging the equipment. If only a sensor
the Nyquist frequency.
of low sensitivity is available, then an amplifier
may be used in the recording chain, but self-noise
may become an issue. High sensitivity allows
2.2.3 Amplitude Sensitivity lower gain settings to promote a good recording.

Amplitude sensitivity, expressed as the ratio of


output voltage to input pressure, indicates how
2.2.4 Bit-Resolution and Dynamic
many volts are produced from a sound with a
Range
root-mean-square (rms) sound pressure of 1 Pa
in air and 1 μPa in water. More commonly, sensor
The dynamic range is the difference between the
sensitivity is given in decibel: dB re 1 V/Pa for
highest and lowest sound levels that can be
microphones and dB re 1 V/μPa for hydrophones.
recorded. Digital recorders usually operate with
To convert the linear sensitivity to dB, one needs
16- or 24-bit resolution; 16 bits guarantee a
to take 20 log10. So, a microphone sensitivity of

Fig. 2.2 Spectrogram (top) and oscillogram (bottom) of created with frequency ffN. As such, a 50-kHz input
an AD-converter with a sinusoidal frequency sweep from produces a 46-kHz alias and a 52-kHz input produces a
40 kHz to 100 kHz as input. Sampling frequency 96 kHz, 44-kHz alias, etc. The amplitude of the alias depends on
and thus Nyquist frequency 48 kHz. In an ideal system the attenuation of the anti-aliasing filter at the input fre-
with a sharp anti-aliasing filter, the spectrogram would quency. An attenuation of 10 dB at 50 kHz produces an
only go up to 48 kHz and show nothing once the signal alias at 46 kHz with a level of 10 dB relative to the input
frequency went beyond Nyquist. In this real-world exam- level. Spectrogram generated by SeaPro (http://www.
ple, however, as the signal frequency f exceeds the Nyquist unipv.it/cibra/seapro.html; accessed 15 Mar. 2021)
frequency fN, the alias (appearing as the downsweep) is software
40 S. Madhusudhana et al.

dynamic range of about 96 dB (unipolar, 90 dB can generate broadband background noise with
bipolar) and 24 bits theoretically produce a various spectral shapes (i.e., not necessarily flat
dynamic range of 144 dB (unipolar, 138 dB bipo- across the frequency band, like white noise, but
lar) thus encompassing the dynamic range of worse at higher frequencies). The level of this
human hearing. However, even the best analog noise is expressed in decibel (e.g., dB(A) after
circuits rarely exceed 110 dB of dynamic range. frequency weighting, dB re 20 μPa unweighted in
This means that of the available 24 bits, only air, or dB re 1 μPa unweighted in water) to indi-
20 bits are effectively used to encode the sound cate the equivalent sound level of noise as if
and the others are dominated by noise. In many generated by the environment. The self-noise of
conditions, the real dynamic range is limited to a sensor is almost always declared in its technical
70–80 dB by the noise of the sensor and pream- specifications; the same is true for professional
plifier. An accurate setting of the recording levels recorders. On the contrary, for many consumer
can allow effective use of 16-bit recorders, with- recorders, even of high quality, the self-noise
out wasting the extra storage space required for measures are rarely available. A useful compari-
24-bit recording. However, when incoming sound son of the self-noise of consumer recorders avail-
levels cannot be predicted, the 24-bit setting able on the market is presented on the website of
allows additional dynamic range for unpredict- Avisoft Bioacoustics.1
able sound events (e.g., high-intensity impulsive The noisiest component of the chain
noises such as from pile driving). The recorded determines the quality of the recording. This is
volume should be set at a particular level to particularly important when recording low-level
exploit the dynamic range of the recording sounds (Fig. 2.3). The input self-noise is
setup: high enough to rise above the equipment expressed as the Equivalent Input Noise (EIN)
self-noise during quiet times, but not too high to measured in an open or unloaded circuit and
cause clipping of loud sounds. Recently expressed in dBU (the “U” stands for
introduced recorders allow 32-bit floating-point “unloaded”). Very good values range from
recording by combining the output of two 24-bit 130 dBU to 120 dBU, and poor recorders
converters working with different signal gains. have a 100 dBU EIN.
This simplifies the setting of recording levels but
cannot yet overcome the dynamic range
limitations of the microphones and of associated 2.3 Instrumentation of Signal
preamplifiers. Chain Components

To ensure that proper equipment is used for


2.2.5 Self-Noise recording, analysis, and playback, researchers
must consult manuals for each piece of equipment
All components of the signal chain suffer from in the signal chain before conducting research. In
self-noise, which is additive across the signal some cases, laboratory tests may be required to
chain. Self-noise and dynamic range are the two verify the real performance or to calibrate equip-
critical specifications that affect amplitude ment (Sect. 2.6). While recording, researchers
response. For example, when recording in very must ensure that the frequency response (and, in
quiet locations or to pick up very low-level turn, bandwidth), self-noise, and dynamic range
sounds, the self-noise generated by the (in particular, the maximum recording level) of
components of a signal chain must be taken into the overall recording system do not end up delet-
consideration, along with dynamic range. Self- ing or significantly distorting a portion of the
noise limits the spatial range of bioacoustic sam- signal. Otherwise, a researcher can miss part of
pling. It may also be an issue in playback, when
self-noise is amplified and broadcast in addition 1
http://www.avisoft.com/recorder-tests/; accessed
to the intended signal. The circuits inside sensors 1 Feb. 2021.
2 Choosing Equipment for Animal Bioacoustic Research 41

Fig. 2.3 Spectrogram depicting high self-noise versus background. In the following sections, nosier systems
low self-noise output by three microphone/recorder were used; the sounds appear unclear and listening was
combinations. In the left section, a low-noise system was unpleasant
used and the signal clearly emerged from the environment

an animal’s sound that is outside the recording 2.3.1 Sensors


system’s sensitivity or frequency range. This
might especially happen, if the sound is above Microphones and hydrophones convert sound
or below the human hearing range. For example, pressure signals into electrical signals. The elec-
elephants communicate with conspecifics using trical signal, which is representative of the origi-
infrasounds (Payne et al. 1986), and rodents and nal sound waveform, can be amplified, filtered,
bats produce ultrasounds for communication and recorded, visualized, and further analyzed or
foraging (see Chap. 12 on echolocation). converted back to sound for playback or projec-
Other features to consider when purchasing tion. Speakers work in the reverse and convert the
equipment for fieldwork are the construction electrical signal into sound for broadcast. A trans-
quality, weather proofing, reliability, visibility of ducer converts a signal from one form (of energy)
the display, and ease of use in harsh conditions to another. So microphones, hydrophones, and
(see Chap. 3 on practical considerations). speakers are all transducers. Usually,
Powering the instruments might be a major issue microphones and hydrophones, as long as they
with regard to practicality, cost, and safety. For do not have a built-in preamplifier, can be used as
example, low-noise preamplifiers generally both sound sensors and sound projectors. But
require higher operating currents. Large-capacity their receiving and projecting amplitude
batteries increase the risk of fire. During long field sensitivities, frequency responses, and
trips, internal rechargeable batteries may be diffi- directionalities may differ.
cult to recharge; replaceable batteries may be Each microphone and hydrophone has a
easier to manage, and external powering options unique amplitude sensitivity, frequency response,
could become a necessity (e.g., to power a and directivity pattern. These are specified in the
recorder with a standard 5 V USB source or specification sheets of high-quality sound
with a 6- or 12-V battery pack). For extended sensors. A flat frequency response gives the
autonomous deployments, the cost of the power least distorted audio-signal; however, during sig-
source might end up exceeding the cost of the nal calibration, a non-flat response can be
recording equipment. accounted for. The sensor size influences ampli-
tude sensitivity, frequency response, and
42 S. Madhusudhana et al.

Fig. 2.4 Schematic of a dynamic microphone (left) and a Microphone schematic components: 1. vibrating dia-
condenser microphone (right) showing the conversion of phragm, 2. coil attached to the diaphragm, 3. magnet,
sound waves into electrical audio-signal outputs. 4. backplate, 5. battery, 6. resistor, 7. output

directionality. A sound sensor, to be omnidirec- in the condenser. Capacitance changes are then
tional, should be smaller than the minimum wave- converted to voltage. Condenser microphones
length of the signal to be received. Large sensors need a high voltage to polarize the condenser. In
are more sensitive but tend to limit responses at contrast, electret microphones are permanently
high frequencies. Large sensors become direc- polarized as their diaphragms are made of
tional at lower frequencies than small sensors do. metallic-coated, pre-polarized, plastic membrane.
Both condenser and electret microphones need
power for their integrated preamplifier, with con-
2.3.1.1 Microphones
denser microphones requiring additional power to
Microphones convert sound energy (from sound
polarize the condenser. This power may be sup-
waves) into an electrical audio-signal using a
plied by an internal 3–5 V battery, 48-V phantom
moving diaphragm or membrane. Two main
power (P48), or a Power-In-Plug (PIP) unit. P48
types of microphones are common: dynamic
is a standard means of feeding power to a con-
microphones and electrostatic microphones (con-
denser microphone with 48 Vdc and is commonly
denser and electret microphones) (Brüel and Kjær
used in professional recorders. Modern pocket
1982). Some microphones are sensitive to particle
digital recorders use PIP units for powering their
motion, as well as sound pressure, which results
microphones. The membranes in electrostatic
in them being very sensitive to sounds very close
microphones are delicate and sensitive to humid-
to the microphone (i.e., in the near-field). This
ity, which can be problematic in humid
often exaggerates the low-frequency components
environments. The lower mass of electrostatic
of the received sound.
elements generally yields superior high-
In dynamic microphones, a coil on the back of
frequency response. However, electrostatic
the diaphragm is immersed in a magnetic field
sensors may be noisier than dynamic sensors.
and generates a current by electromagnetic induc-
For studies involving low-frequency sounds,
tion when the membrane moves (Fig. 2.4). Such
dynamic sensors may be a better choice.
microphones do not require external power, but
A radio-frequency microphone is a special
they have limited sensitivity, making them most
type of condenser microphone, developed by
useful for loud signals or at close range to the
Sennheiser2 in its MKH series. With this type of
sound source. The delicate mechanical suspen-
microphone, variations of the capacitor modulate
sion in dynamic microphones may warrant gentle
the frequency of a radio-frequency oscillator, and
handling.
then a demodulator extracts the audio-signal to be
Electrostatic microphones are based on a con-
denser with a thin moving diaphragm (Fig. 2.4).
Movement of the diaphragm changes capacitance 2
http://www.sennheiser.com/; accessed 15 Mar. 2021.
2 Choosing Equipment for Animal Bioacoustic Research 43

transmitted over a cable. The radio-frequency subtracting the given SNR from 94. If properly
oscillator and the demodulator are both housed measured and reported, an SNR of 80 dB
inside the microphone, and these microphones are (A) means a self-noise of 14 dB(A), which is
less prone to problems of interference and pretty good. In other cases, the sensitivity, the
humidity. maximum allowed SPL, and the dynamic range
The more recently developed Micro-Electri- are presented. In this case, the self-noise can be
cal-Mechanical System (MEMS) microphones obtained by subtracting the dynamic range from
have pressure-sensitive elements integrated the maximum allowed SPL.
directly into a silicon chip (as found in most cell
phones) with similar fabrication technologies
Ultrasonic and Infrasonic Microphones
used to make semi-conductor devices. Some inte-
Microphones for ultrasounds are typically small,
grate an AD-converter to produce a digital output.
with a small membrane with very low inertia.
Their development resulted from the need for tiny
Ultrasonic microphones are usually condenser
microphones for cell phones. Because of the
microphones developed for measurement
small size and low inertia of their sensors,
purposes, not for recording music; however, the
MEMS microphones are sensitive to high
increasing interest in ultrasonic communication
frequencies and consequently are used in ultra-
and echolocation in animals (mainly bats and
sonic microphones, such as in bat detectors.
rodents, but also insects) has fostered the devel-
Because of their low cost, they are the perfect
opment of a wide range of sensors for
candidates for array applications, including
ultrasounds. Ultrasonic microphones for mea-
“acoustic cameras” that overlay the image taken
surement purpose need to have a flat frequency
by a video-camera with a map of the sound
response; usually they also have high self-noise
sources generated by a matrix of tens or hundreds
and are very expensive. If the flatness of the
of MEMS microphones.
frequency response is not a necessity, other,
Most condenser microphones have a self-noise
lower-cost microphones can be used instead
lower than 20 dB(A), which is sufficient to record
(e.g., low-cost small condenser microphones and
music or speech at a close distance, but not suited
tiny MEMS microphones). Considering that
to record faint animal sounds and noises in a quiet
ultrasonic microphones need high sampling
environment. The quietest studio microphones
rates, often beyond those available in consumer
have a self-noise below 10 dB(A); among these
digital recorders or AD-converters (see Sect.
microphones is the Rode NT1A, a cardioid micro-
2.3.4), ultrasonic sensors with integrated
phone that has an excellent self-noise of only
AD-converter and USB interface have been
5.5 dB(A). Even quieter microphones are avail-
developed. In bioacoustic studies, these are
able in the category of instrumentation
mainly used for detecting and recording bats
microphones, but few very expensive models are
(Sect. 2.3.5), insects (Buzzetti et al. 2020), and
available. Lynch et al. (2011) and Pavan (2017)
rodents either in the wild or in etho-
used very quiet instruments to show that noise in
pharmacological studies (Buck et al. 2014).
natural environments can be as low as 10 dB re
Infrasonic microphones are specially designed
20 μPa and even go below 0 dB re 20 μPa below
for low-frequency recording, down to 1 Hz or
1 kHz. Of course, a quiet microphone must be
even 0.1 Hz. Until a few decades ago, Sennheiser
connected to a quiet recorder!
produced the MKH 110, a condenser microphone
Sometimes, microphone specifications are dif-
with 12-V powering. Now discontinued, it is still
ficult to read or self-noise is not provided. One
appreciated in the used equipment market. These
must examine the parameters that are given, such
microphones have been widely used to record
as amplitude sensitivity and the signal-to-noise
elephant communication (Payne et al. 1986;
ratio (SNR). If not differently declared, the SNR
Poole et al. 1988). Currently, microphones
is relative to 94 dB re 20 μPa (i.e., 1 Pa) at 1 kHz
designed for infrasonic applications are largely
and thus the self-noise can be obtained by
44 S. Madhusudhana et al.

limited to measurement (instrumentation) Wireless microphones transmit the received


microphones. sound by a radio signal that can be either a stan-
dard AM- or FM-transmission or a digital format
to ensure signal quality and privacy. Wireless
Measurement and Specialty Microphones
microphones allow the cable-less transmission in
Measurement microphones (or, instrumentation
situations where cables are problematic. Wireless
microphones) are a special class of microphones
microphones connected to a multi-channel
designed to make accurate measurements of
receiver allow a wide area to be monitored. In
sound amplitude within a specified frequency
some cases, the wireless microphones used for
range, which could be infrasound to ultrasound,
television interviews can be used successfully
to accurately characterize a sound field or a sound
(e.g., by placing the microphone close to or inside
source. These microphones comply with specific
a nest and then recording from a distance). A
and rigid requirements. They need to have a well-
traditional microphone can also be equipped
defined and stable frequency response to sound
with a radio transmitter and a battery that powers
(ideally flat). They usually appear as cylinders
both. The limitations include powering the
with diameters ranging from 1/8 inch for very
transmitters (in particular, in field and long-term
high frequencies (but with low sensitivity) to
deployments), limited dynamic range,
2 inches for high sensitivity and low noise (but
compromised self-noise, and radio-frequency
limited extension to high frequencies). Normally
interference during transmission.
based on condenser sensors, these microphones
are often powered at 200 V. Measurement
microphones are usually connected to specific Microphone Directionality
digital recorders and analyzers, or integrated into Directionality is an important characteristic of a
a sound level meter (also known as phonometer). microphone. Omnidirectional microphones detect
Usually dedicated to noise measurement, these sound from all directions and can be appropri-
microphones are also used to calibrate other ately used for recording a soundscape (i.e., the
types of instruments (see Sect. 2.6) and to record combination of all sounds generated in an envi-
sounds for analysis and listening with great accu- ronment; see Chap. 7). Directional microphones
racy. Brüel & Kjær3 are well known for their are good for making recordings of a selected
measurement microphones; however, other animal in a specific direction (e.g., a particular
manufacturers exist as well, providing a wide individual in a colony) and for attenuating noise
range of sensors for applications of sound record- coming from directions other than the signal
ing, acoustic measurements, noise monitoring, direction (e.g., the noise of a nearby river or
building acoustics, cinema calibration, occupa- road). Directional microphones thus improve the
tional health, and live sound broadcasts. SNR by reducing background sounds and noise
Optical microphones are a very special cate- coming from other directions in the environment.
gory of measurement microphones. A laser beam In indoor applications, directional microphones
is reflected by a very tiny low-inertia sound-sens- are used to focus on a performer and to attenuate
ing membrane, and the reflected beam is then reverberation from the hall. Widely available
detected by an optical sensor to extract the modu- types of directional microphones include cardi-
lation given by the membrane moved by sound oid, hypercardioid, bidirectional, and unidirec-
waves. Their advantage is the direct optical out- tional (Fig. 2.5). Cardioid microphones exhibit a
put that is conducive for long-range transmission heart-shaped directivity (i.e., they are less sensi-
over optical cables and their insensitivity to elec- tive at 180 from the sound source) and they are
tric and electromagnetic fields. often used with parabolic reflectors. The
hypercardioid microphone is less sensitive at
120 from the direction to the sound source.
3 Bidirectional microphones pick up sound in a
http://www.bksv.com/en/; accessed 15 Mar. 2021.
2 Choosing Equipment for Animal Bioacoustic Research 45

0° 0°
a. b.
-5 dB -5 dB

-10 dB -10 dB

-15 dB -15 dB

-20 dB -20 dB

-25 dB -25 dB

270° 90° 270° 90°

180° 180°
0° 0°
c. d.
-5 dB

-10 dB

-15 dB

-20 dB

-25 dB

270° 90° 270° 90°


-25 dB
-20 dB
-15 dB
-10 dB
-5 dB

180° 180°

Fig. 2.5 Polar patterns of directionality of different plane. In the horizontal plane, these patterns are symmet-
microphones. With microphones facing the top of the rical (i.e., they rotate about the vertical axis). (a) omnidi-
page, these patterns extend from the axis of the rectional, (b) cardioid, (c) bidirectional (figure-of-8), and
microphones, and thus present directivity in the vertical (d) shotgun (lobar)

figure-of-8 pattern equally from two, opposite length of the interference tube and with the fre-
directions. quency of incoming signals, so that at high fre-
Shotgun microphones (Fig. 2.5d) are the most quency (> 4 kHz), the receiving lobe is quite
directional and commonly used for recording a narrow. For lower frequencies, the directivity
specific animal. Their use is desirable when it is decreases. This also means that off-axis sounds
necessary to improve the recording level of a are not only attenuated, but also have a modified
specific sound source, or to attenuate unwanted frequency spectrum, with high frequencies more
sound coming from other directions. The design attenuated than low frequencies. At wavelengths
of shotgun microphones (such as the Sennheiser longer than tube length, off-axis attenuation is
K6/ME66 or the MKH 8070) is based on the null. If interested in higher frequencies, such as
interference tube principle; usually a cardioid bird songs above 1 kHz, a high-pass filter to cut
condenser microphone is placed at the end of a off low frequencies (e.g., to attenuate wind noise
tube with slits on sides, canceling off-axis signals or traffic noise below 150 Hz) is available in high-
(Fig. 2.6). The directivity increases with the quality microphones.
46 S. Madhusudhana et al.

Fig. 2.6 Photograph (left) of a modular microphone ME66, shotgun ME67). Polar pattern (top-right) of the
(Sennheiser K6/ME66) with the preamplifier body that microphone at different frequencies and the frequency
hosts a battery to power the microphone in case the P48 response (bottom-right) on axis and at 90 from the
powering is not available; the sensing capsule is inter- sound. Reprinted with permission from Sennheiser
changeable (omni ME62, cardioid ME64, short shotgun

Monophonic and Stereophonic Recording In the binaural stereo recording configuration,


Monaural recordings are made with a single two omnidirectional microphones are placed
microphone. Stereo recordings are made with approximately the distance between the ears of a
two microphones and provide a sense of depth typical human head (16–18 cm spacing) through
or movement through space in recordings. Stereo the use of a mannequin head that simulates a
recording offers spatial information, which helps human head and ears. This presents the idea of
better discriminate sound sources in the three-dimensional (3D) sound experience as the
surrounding space. Three primary setups are listeners with headphones have the sensation “to
used for stereo recordings (Fig. 2.7): XY, binau- be there,” with their ears in the same position of
ral, and MS (middle-side). A common setup for the microphones. The microphones can also be
the XY stereo recording uses two cardioid or separated with nothing in-between, or with just a
super-cardioid microphones placed at 60 or 90 generic separation, such as a sphere of foam, or a
angles, nose-to-nose. The two microphones can Jecklin disk. Another special binaural configura-
be coincident or spaced. In some cases, the left tion is called the Stereo Ambient Sampling Sys-
microphone points in the left direction, in other tem (SASS) design that simulates a human head.
cases, the left microphone points in the right Compared with other techniques, with exception
direction and the right one in the left direction. of the true binaural, this type of recording
2 Choosing Equipment for Animal Bioacoustic Research 47

Fig. 2.7 XY recording configuration (left) using two in the middle and a bidirectional microphone taking the
cardioid microphones, and MS recording configuration sounds coming from the sides (figure-of-8 polar pattern)
(right) which typically combines a cardioid microphone

produces the best spatial image when heard sphere, surround-sound) acoustic environment,
through headphones. In some setups, cardioid capture sound not only in the horizontal plane,
microphones angled at 60 –90 , like in the XY but also above and below the listener. Surround-
configuration, are used to enhance left-right sound recording requires several microphones in
separation. a 3D configuration, whose signals (channels) are
In the MS microphone stereo recording setup, electronically or digitally combined to produce
a cardioid microphone is piggy-backed on top of both stereo and multi-channel surround-sound
a bidirectional microphone. The cardioid picks up experiences, or to create specific receiving
frontal information, whereas the bidirectional beams (e.g., to focus on a sub-space or on a
microphone gets sounds coming from the sides specific source). The Ambisonics system allows
only. This type of recording requires specific recording of sound pressure on 3 axes with
electronics, or signal processing to combine the 4 microphone capsules mounted as a small tetra-
signals to produce a traditional stereo image. In hedron (first order Ambisonics) (Zotter and Frank
essence, the signals from the left and right 2019). Higher-order Ambisonics microphones
capsules are summed out-of-phase before being can have up to 32 capsules on a small sphere to
combined with the mono-signal. This computa- achieve higher directional details and to simulate
tion allows the recordist to control the width of virtual directional microphones to be oriented in
the stereo spread and make other adjustments in any direction during post-processing.
post-processing. In the early stages of the sound
industry, this helped to maintain the compatibility Microphone Arrays
among mono and stereo recordings. Several Arrays of sound sensors are used to monitor
microphone arrangements have been developed animals across habitats, locate and track sound
for stereophonic recording; for a comprehensive sources (such as individual animals), and study
review, see Rayburn (2011) or Streicher and environmental noise. Arrays may be stationary
Everest (1998). (fixed in location), freely drifting (e.g., suspended
Latest developments, mainly driven by the from balloons), or towed. Ambisonic
film industry to produce an immersive 3D (full- microphones, are a special case of microphone
48 S. Madhusudhana et al.

arrays. The sensors in an array operate in tandem. Do-it-Yourself (DIY) Microphones


Their signals are combined in digital signal Microphones well-suited for bioacoustic studies
processing. A number of requirements need to can be built with microphone capsules costing
be met for successful array processing (e.g., to only a few US dollars. Examples are the omnidi-
track a bat by its biosonar). Sensor locations need rectional electret capsules from Primo
to be known accurately. Sensor directionality Microphones Inc. (EM models)4 or the PUI
needs to be known. Sensor spacing must be such Audio Inc. AOM-5024 L model.5 These capsules
that the target signal can be detected on multiple can be powered directly by PIP when connected
sensors. These sensors need to be matched and to a handheld digital recorder, or powered with a
their eccentricities need to be computed. Time battery and a simple electronic circuit. Adapters
differences of arrival (TDOA) need to be can be easily built to power PIP microphones with
computed between sensors. An overview of digi- the P48 powering provided by professional
tal signal processing algorithms to locate and recorders that do not provide PIP.6 DIY
track sound sources is given in Chap. 4. microphones can be easily assembled to experi-
While the complexity of meeting the above ment with different spatial configurations, even in
requirements has limited the application of micro- the focus of a parabolic reflector, or to have
phone arrays for animal localization and tracking low-cost expendable microphones for very spe-
in terrestrial environments, Mennill et al. (2012) cific field tasks.
successfully deployed an array of wireless
microphones with integrated Global Positioning
Deployment Considerations
System (GPS) time synchronization to make In open-field environments, wind can affect sig-
accurate measurements of the position of a
nal reception by a microphone by causing
sound source by computing TDOAs of the same
non-acoustic noise, which is an artifact of turbu-
sound at different microphones. They discuss lent pressure fluctuations at the external surface of
how this system may be implemented to monitor
the microphone. Such turbulent pressure
frogs, birds, and mammals. Jensen and Miller
fluctuations may be caused by the obstruction
(1999) used a 13.5-m vertical, linear microphone that the microphone itself presents. Turbulent air
array that allowed for simultaneous recordings
flow may also be caused elsewhere and produce
of bat signals at three different heights of vegeta-
noise artifacts in recordings as the perturbations
tion. With this design, they were able to calculate travel past the microphone. Even a light breeze
flight direction, altitude, and distance from the
can produce strong low-frequency noise artifacts,
array. which can overload the internal electronics or the
The literature sometimes presents arrays of recorder. Microphones can be fitted with a
sensors that do not operate in tandem. Rather,
windsock to reduce wind noise. A windsock can
sensors are widely spaced over a potentially be easily made with commercially available open-
large area, sampling independently without syn- cell foam, which limits air flow but allows sound
chronization. The applications are not to locate
waves to reach the microphone membrane. For
and track individual sound sources, but rather to severe wind conditions, a fur-like cover is prefer-
monitor a soundscape, compare animal presence/ able (Fig. 2.9).
absence across sites, or evaluate environmental
When aiming to record animals in a specific
noise over a large area. During digital signal direction (e.g., a bird calling from a tree), a direc-
processing, noise levels might be compared tional microphone should be used and pointed at
across sites and perhaps interpolated to produce
a noise map. For example, the Cornell Lab of 4
https://www.primomic.com/; accessed 15 Mar. 2021.
Ornithology uses an array of 30 recorders to 5
https://www.puiaudio.com/; accessed 13 Aug. 2021.
monitor animal habitat use on a wide spatial 6
http://tombenedict.wordpress.com/2016/03/05/diy-
scale and to assess anthropogenic impacts microphone-em172-capsule-and-xlr-plug/; accessed
(Fig. 2.8). 13 Aug. 2021.
2
Choosing Equipment for Animal Bioacoustic Research

Fig. 2.8 Noise pattern observed at Sapsucker Woods (Ithaca, NY, USA), caused by a The ambient noise levels are raised by about 15 dB at the frequencies of chorusing birds
jet plane taking off from a nearby runway. Receiver locations are denoted by white (2 to 4 kHz). Image courtesy of Dimitri Ponirakis, K. Lisa Yang Center for Conserva-
circles. Regions shaded red show high noise levels and follow parallel to the flight path. tion Bioacoustics, Cornell Lab of Ornithology
49
50 S. Madhusudhana et al.

the dish. The lowest frequency a parabola can


reflect, and thus focus on the microphone,
depends on the dish diameter (Wahlstrom 1985).
For a 1-kHz signal, a 30.5 cm diameter dish is
fine, and for a 500-Hz signal, a dish of 61 cm in
diameter is required. The very low frequency of a
lion roar (40–200 Hz) would require a dish about
10 m in diameter.
Compared to shotgun microphones, parabolic
reflectors intercept a much wider quantity (pro-
portional to the diameter and surface of the reflec-
tor) of acoustic energy and concentrate it on the
Fig. 2.9 Photograph of a microphone setup with pistol
grip and elastic suspension, foam windsock, and additional
microphone, thus providing a high gain. How-
furry windsock for maximum wind protection. Reprinted ever, this gain is proportional to the frequency
with permission from Sennheiser and the parabola diameter, thus producing a
recording with increased high-frequency levels
the bird. It will focus sound recording in the that requires equalization in post-processing
direction of the bird and limit background noise (some parabolas can have equalization built-in).
from other directions. An alternative to a highly As a rule of thumb, the more wavelengths are
directional shotgun microphone is a cardioid contained in the parabola diameter, the higher
microphone placed in the focus of a parabolic the gain and greater the directionality. Because
reflector (Fig. 2.10). The microphone is pointed of these features, parabolas, with the right choice
toward the parabolic reflector, facing into the of microphones, can provide excellent recordings
dish, not toward the animal. Ideally, the of very quiet, distant sources. For example, in a
microphone’s beam pattern would be matched to taxonomic and behavioral study of chipmunks
the solid angle subtended by the reflector. The (Neotamias spp.), Gannon and Lawlor (1989)
diameter of the parabolic reflector determines used a 51-cm parabolic reflector with a
which frequency range of incoming sounds will Sennheiser ME-20 omnidirectional microphone
be amplified (Fig. 2.11). To be reflected, the and K3U preamplifier. Chipmunk calls were in
wavelength of the incoming sound must fit inside the range of 4 kHz to 15 kHz, so this size dish was

Fig. 2.10 Diagram of a parabolic dish and microphone used to record a bird on a tree. The parabolic solution gives
added amplification and directivity, which helps in recording a single animal, a quiet animal, or animals at a distance
2 Choosing Equipment for Animal Bioacoustic Research 51

Fig. 2.11 Sketch of frequency response and gain of a show the theoretical gain of three parabolas of different
generic microphone placed in parabolas of different sizes. The gain is proportional to frequency and to the
diameters. The red lines show the frequency response of parabola diameter. Actual response may vary depending
an ideal microphone, with the option of a high-pass filter to on the shape and depth of the parabola and on the response
reduce low-frequency noise below 80 Hz. The blue lines and positioning of the microphone

adequate for detecting this range of usually is sealed in a resin package with a water-
mid-frequency calls. proof connector and needs to be handled with
To produce a more pleasant recording, it is care. After use in saltwater, a hydrophone should
possible to record in stereo by using two be rinsed with freshwater or else connections are
microphones in the focus, separated by a thin likely to corrode.
plate. This way, sounds coming from the frontal A piezoelectric transducer can be used as a
axis of the parabola reach both microphones with sensor or projector; however, when the transducer
the same level, while off-axis sounds are focused has a built-in preamplifier, it can no longer be
more on one side. Another option is to place an used as a projector, but only as a sensor.
MS microphone combination in the focus of the Hydrophones are much less sensitive, and a
parabola. Listening with headphones helps in great deal of power is needed (from an external
pointing the parabola on the source of interest amplifier) to drive a hydrophone as a projector.
and gives immediate feedback on the quality of As a sensor, a hydrophone can have a built-in
the sounds being recorded. When analyzing preamplifier that matches the frequency response,
recordings made with a parabola, it is important dynamic range, and high impedance of the trans-
to take into account that the frequency response is ducer. A few hydrophones on the market with
not flat as it increases with frequency (Fig. 2.11). built-in preamplifier (Fig. 2.12) can be powered
In some cases, slightly moving the microphone directly by a recorder, computer, or analysis sys-
out of focus reduces the high-frequency emphasis tem (e.g., either by P48 or by PIP at 2–5 Vdc).
and produces a more pleasant sound. Most preamplified hydrophones require powering
through dedicated cables and can require single or
2.3.1.2 Hydrophones dual powering (e.g., þ12 V, or 12 V and þ12 V)
A hydrophone is a piezoelectric transducer that to be provided by a battery box (Fig. 2.12). A
converts sound waves in water to electrical popular low-cost hydrophone is the H2c from
signals. Hydrophones can receive sound in air, Aquarian Audio,7 which allows PIP powering.
but the sound has to be of very high amplitude. The DolphinEar8 is an inexpensive, lightweight,
Because the acoustic impedances of the medium
and the sensor match much better in water than in 7
http://www.aquarianaudio.com/; accessed 15 Mar. 2021.
air, hydrophones have to be less sensitive, or they 8
http://www.dolphinearglobal.com/; accessed 19 Jun.
would easily overload. The underwater sensor 2022.
52 S. Madhusudhana et al.

Fig. 2.12 Photographs of an ITC 6050C hydrophone with built-in preamplifier and external battery power (left) and a
Cetacean Research Technology C57 hydrophone with cable and battery box (right; courtesy of J R Olson)

battery-operated hydrophone with an external amplitude sensitivity) to record impulsive pile


amplifier and headset that is good for ecotourism driving at ranges from 14 m to 1330 m.
or classroom use. Other relatively low-cost Hydrophones can vary considerably in their
hydrophones well suited for marine mammal frequency response; some are used specifically
studies are produced by Cetacean Research for low-frequency, mid-frequency, or high-
Technology.9 frequency reception. Typically, hydrophones are
To record underwater sound in open water smaller than the wavelengths that are being
from a distant source, a sensitive hydrophone is recorded. But, with the smaller sensor comes a
needed. Good sensitivity would be 160 dB re lower energy input. This results in lowered sensi-
1 V/μPa. Such a hydrophone produces 1 V when tivity. Generally, the smaller the piezoelectric
receiving 160 dB re 1 μPa of acoustic pressure element, the broader the frequency range, but
and 1 mV for a signal of 100 dB re 1 μPa. If used the lower the amplitude sensitivity. Lower sensi-
for recording a signal at 180 dB re 1 μPa, it will tivity can require higher amplification, and thus
produce a 10-V output and may overload the can produce higher electronic noise. Piezoelectric
connected electronics. To record underwater hydrophones usually have a resonance peak in the
sound at close distance (e.g., in front of an upper part of their bandwidth, so that optimum
echolocating dolphin which can produce pulses operation of the hydrophone is along the flat
with source levels above 220 dB re 1 μPa m portion of the frequency response curve below
pk-pk), a low-sensitivity hydrophone is needed resonance. Reception at other frequencies could
(e.g., one that has a sensitivity of 210 dB re be used, but the difference in response of the
1 V/μPa). Very likely, such a hydrophone cannot hydrophone needs to be accounted for during
be used for recording low-level sounds from a analyses. Some studies require the use of multiple
distant source because it requires high amplifica- hydrophones to cover the entire frequency range
tion and consequently produces high electronic of the animal’s sounds.
noise. However, using hydrophones with built-in
preamplifiers when powerful signals can occur
Hydrophone Directionality
risks overloading of the preamplifier, thus pro-
Hydrophones, much like microphones, have
ducing distorted signals. Erbe (2009) used four
directional receiving and transmitting
different hydrophone systems (differing in
characteristics, depending on the size and shape
of the transducer (Fig. 2.13). Spherical
9
http://www.cetaceanresearch.com/; accessed transducers receive and transmit signals uni-
15 Mar. 2021. formly in all directions. With a cylindrical
2 Choosing Equipment for Animal Bioacoustic Research 53

Fig. 2.13 Specifications and polar plot of directional ITC ITC (https://www.gavial.com/itc-products; accessed
3003D transducer (left) and omnidirectional ITC 1007 22 Aug. 2021)
transducer (right). Reprinted with permission from Gavial

transducer, sounds are received and projected sensor, a spherical hydrophone is typically omni-
uniformly in the horizontal plane, assuming the directional (receives sounds equally from all
transducer is suspended vertically. In the vertical directions) as shown by the right polar plot of
plane, the transducer will have a directivity pat- Fig. 2.13. Used as a projector, the directivity
tern. If the transducer has a planar shape, it will pattern of a hydrophone changes depending on
have two beams on its opposite faces as shown in the frequency being projected (directivity
the left polar plot in Fig. 2.13. When used as a increases with frequency).
54 S. Madhusudhana et al.

Sonobuoys Stationary Hydrophone Arrays


A sonobuoy is a canister housing a hydrophone, Stationary hydrophone array configurations
dampening cable, battery, recording/transmitting include moorings (with or without surface
electronics, and a transmitting antenna. Navies of buoy), seafloor packages, or cabled systems.
the world use sonobuoys for underwater listening Arrays of permanent, stationary hydrophones
by deploying them from aircraft or ships. These can be placed on the seafloor and connected via
devices also may be used for bioacoustic studies. cables, either electrical or electro-optical, to
Once a sonobuoy is deployed in saltwater, a bat- processing centers located on shore. Multi-
tery is activated, which triggers the inflation channel receivers allow listening or recording of
(CO2) of a flotation balloon and antenna. The sounds from multiple hydrophones. Typically,
hydrophone and associated dampening cables the array is optimized for long-range acoustic
can be set to drop to a pre-selected water depth reception by using very-low-frequency sensors.
(i.e., 30, 60, 120, or 300 m). During operation, the Some bottom-mounted arrays are equipped with
sonobuoy canister floats at the water surface with wideband hydrophones to allow scientists to
the antenna in the air and transmits acoustic data monitor a wide variety of marine species, as
in real-time to a receiver onboard a vessel or well as ambient noise levels (e.g., Caruso et al.
aircraft or to a receiver at a station onshore. 2015; Favali et al. 2013; Nosengo 2009; Sciacca
After a preset time (e.g., 1, 2, 4, or 8 h), a burn- et al. 2015). Usually, these arrays are installed and
wire penetrates the flotation balloon, and the maintained by navies, oceanographic
sonobuoy fills with water and sinks to the organizations, or research centers for many years
seafloor. (see Chap. 1 for a list of past and current bottom-
Analog sonobuoys (Fig. 2.14) are available in mounted hydrophone arrays deployed around the
two common configurations: omnidirectional world).
sonobuoys (with a frequency response of up to
20 kHz) and DIrectional Frequency Analysis and
Towed Hydrophone Arrays
Recording (DIFAR) sonobuoys, which provide
A towed array contains several hydrophones (not
bearing information on incoming signals. The
necessarily of the same type), commonly housed
latter type has been used to determine source
in an oil-filled sleeve (Fig. 2.15), where the oil
levels and calling rates in cetaceans (e.g., Miller
matches the acoustic impedance of sea water.
et al. 2015). The most recent generation of
Originally developed for navies and geophysical
sonobuoys features a digital recording system
survey companies, towed arrays were bulky and
and is equipped with GPS technology.
expensive, and mainly received low-frequency

Fig. 2.14 Photograph of a sonobuoy deployed from a


ship to monitor whale sounds in the Mediterranean Sea Fig. 2.15 Photograph of a towed array under water,
(SOLMAR Project, http://www.unipv.it/cibra/res_solmar_ developed by the University of Pavia (Italy), with the
uk.html) tow vessel in the background
2 Choosing Equipment for Animal Bioacoustic Research 55

sound (<15 kHz). In more recent years, light- non-acoustic mechanical vibration, which reduce
weight, wideband towed arrays sensitive up to the ability to capture low-frequency animal
100 kHz and more have been developed to meet sounds and which can cause an acoustic overload
the requirements of researchers aiming to study of the recording chain. To mitigate these issues,
marine mammals from small platforms, such as tow speed should usually not exceed 6 knots. A
sailboats (Pavan and Borsani 1997; Pavan et al. long cable with special elastic sections in the
2013). By simultaneously processing sound from array can dampen vibrations. Flow- and vessel-
more than one hydrophone (or group of noise can be mitigated with a smooth high-pass
hydrophones), the bearing (or even location) of filter (e.g., 500 Hz, 12 dB/octave; see Sect. 2.3.2.1).
the vocalizing animal maybe be determined (see
Chap. 4, section on sound localization). Towed
Deployment Considerations
arrays are used for line-transect surveys and to
To operate properly, hydrophones must have little
sample animals in their environment over a wide
vertical or horizontal movement. Water flow over
geographic range.
the surface of the hydrophone generates pressure
A straight-line array cannot resolve between
fluctuations, which appear as noise in
signals arriving from the port or starboard side
spectrograms but which are not due to an acoustic
without the vessel changing course or using mul-
wave. This flow noise is an artifact of deployment
tiple array deployments (Thode et al. 2010).
(see Chap. 3, section on flow noise). It is typically
Large arrays (sometimes hundreds of sensors,
of low to mid frequencies (see, for example,
possibly with different frequency sensitivities
the spectrogram in Fig. 3 in Erbe et al. (2015)
and bandwidths) allow tracking of multiple
showing flow noise in marine soundscape
sources simultaneously by selective beamforming
recordings) and thus can be filtered out with a
(Zimmer 2011). More complex towed systems
high-pass filter, but this limits the recording of
use a 3D hydrophone configuration called a volu-
low-frequency sounds. Large or rapid vertical or
metric array (Zimmer 2013) or vector sensors
horizontal movement of a hydrophone (e.g., if it
(Thode et al. 2010) to locate sound sources in
is deployed over the side of a boat) may cause the
three dimensions. Acoustic vector sensors are
system to be saturated with no useable recordings
sensitive to particle velocity rather than to pres-
collected. It is very difficult to make good
sure and hence sense the direction of incoming
recordings in the open ocean; a hydrophone
sound waves and resolve the directional
often needs to have its own flotation system,
ambiguities. Thode et al. (2010) attached a vector
rather than be suspended from a boat; otherwise,
sensor module to the end of an 800-m towed array
the movement of the boat will translate into
to detect sperm whale clicks and compute unam-
movement of the hydrophone. The horizontal
biguous bearing estimates of whales over time.
component of water flow past a hydrophone
Many towed arrays have a depth sensor, so the
may be minimized by deploying freely drifting
operator knows the tow-depth in relation to the
hydrophone systems (e.g., suspended from a
sound velocity profile in the water column. Such
freely drifting buoy). The vertical component of
information allows the user to position the array
water flow past a hydrophone may be minimized
either in a surface duct or below the thermocline
by dampening systems; for example, suspending
to listen to sounds coming from deep water (see
the recorder on a bungee with a movement-
Chap. 6 on sound propagation under water).
dampening drogue, or by using a catenary
Additionally, the depth information enables
floatation line (see Chap. 3 and Fig. 5 in Erbe
subsequent array processing to exploit the surface
et al. 2019). In towed arrays, long towing cables
effects on sound propagation to improve localiza-
and specifically designed hydrophones
tion accuracy.
(acceleration-compensated) are used to avoid sat-
Array performance is degraded (in particular
uration of the hydrophones from movement.
below ~1 kHz) by vessel self-noise, hydrody-
namic noise artifacts (flow noise), and
56 S. Madhusudhana et al.

2.3.2 Filters 2.3.2.2 Anti-Aliasing Filters


Digital recorders and audio interfaces have built-
Filters are used to minimize unwanted noise from in anti-aliasing filters with varied performances;
the environment (including other animals) or whereas instrumentation recorders and instru-
electronic self-noise. Filters can be used while mentation acquisition boards usually do not
recording or during post-processing. Filtering have built-in anti-aliasing filters and require a
during recording facilitates conserving recorder separate signal-conditioning device to perform
dynamic range for signals in the frequency band filtering and adjust the signal level. The avail-
of interest. A filter can be a stand-alone unit able filters have their specific shape and thus can
(some also have an amplifier) or filtering can be influence the frequency response of the
achieved using software, either in real-time or in recording.
post-processing. Note that filters are not a “magic AD-converters (Sect. 2.3.4) in recording
wand” to make a bad recording clean. While equipment (either stand-alone recorders or exter-
recording, filters can be used to suppress nal converters connected to a computer) have
unwanted noise without affecting the sounds of relatively smooth anti-aliasing filters that attenu-
interest only when the noise and the sounds do ate frequencies starting somewhat below the
not overlap in frequency. If noise and sounds do Nyquist frequency, but do not completely cut
overlap (in frequency, or in time, or both), it is out the signal at Nyquist. Attenuation at Nyquist
possible to perform some filtering or noise is often in the range of 6–12 dB, and the maxi-
removal in post-processing. However, the settings mum attenuation (the FZero of the filter) is
need to be carefully chosen. Some microphones located above the Nyquist frequency.
and digital recorders (Sect. 2.3.4) have built-in The anti-aliasing filter shape is rarely reported
selectable filters, often with selectable attenuation in equipment specifications; tests are required to
rates. evaluate the anti-aliasing performances of the
AD-converter, in particular if wideband signals
are to be recorded and analyzed. Concern for
2.3.2.1 Low- and High-Pass Filters
aliased components is required for any type of
Using a low-pass filter, the recordist can set a
signal possibly exceeding the Nyquist frequency,
frequency above which signals are attenuated. A
including external interferences captured by the
high-pass filter attenuates signals below a selected
electronics and cables, as well as higher
frequency. High-pass filters are often used to
harmonics of the signals to be recorded. A labo-
reduce low-frequency noise generated by wind
ratory test with a frequency-generator signal
and road traffic in terrestrial recordings and flow
sweeping across the whole frequency range of
noise in underwater recordings. For example, to
the recorder and beyond the Nyquist frequency
record a bird singing in the 2–5 kHz range, a high-
can reveal unexpected and unwanted performance
pass filter set at 1 kHz will suppress traffic noise
by the converter.
(which is typically below 500 Hz). A band-pass
filter combines low-pass and high-pass filters. All
filters have a transition bandwidth at the intersec-
tion of the pass band and the attenuation band, 2.3.3 Amplifiers
where there is a roll-off in the attenuation amount
(steepness), which is normally expressed in A preamplifier conditions the incoming signal
dB/octave (e.g., 6 dB/octave in a smooth filter, from a transducer and boosts the signal before it
or 24 dB/octave for a steeper filter). The greater is recorded. A preamplifier converts a weak elec-
the roll-off, the sharper the filter. However, trical signal into a stronger, noise-tolerant output
sharper filters have longer impulse responses signal for further processing. Without preampli-
and generate longer artifacts in the output fication, the recorded signal could be noisy or
waveforms. distorted. The preamplifier has a high input-
2 Choosing Equipment for Animal Bioacoustic Research 57

impedance (i.e., it requires only a small current to 2.3.4 Analog-to-Digital Converters


sense the input signal) and a low output- and Digital Recorders
impedance (so that when a current is drawn
from the output, the change in the output voltage Despite declared sampling frequencies and
is minimal). In other words, a preamplifier bit-resolution, AD-converters, either in a stand-
converts a high-impedance input signal from a alone recorder or in a computer audio-interface,
transducer to a low-impedance output signal. are based on diverse technologies and can affect
Besides lowering impedance, some preamplifiers the quality of a recording. For example, delta-
also provide amplification (typically 20 to 26 dB). sigma converters have high noise at high
This is not true for most preamplifiers and hence frequencies, beyond the human hearing limits,
they are typically paired with amplifiers. which becomes evident in wide-bandwidth
Preamplification should be constant across the power spectra and spectrograms. Another prob-
recording bandwidth so as not to distort the sig- lem is jitter from instability of the clock driving
nal. The frequency range and dynamic range the AD-converter and the digital stream. Exces-
specifications of the preamplifier and amplifier sive jitter can reduce the quality of recordings and
need to match other electronics in the recording can be seen easily by analyzing a clean test tone.
system. For recording faint animal sounds or Jitter can produce both random artifacts
quiet soundscapes, the quality of the preamplifier (Fig. 2.16) and periodic artifacts with well-
is often an issue and must be considered carefully defined frequencies. Jitter cannot be minimized
relative to the required use and the transducer to by the user because it is characteristic of a given
be connected. device. AD-converters can be divided into two
An amplifier increases the signal gain after it main categories: for musical use, generally lim-
is captured to drive the signal along a cable to the ited to the standard sampling frequencies of 44.1,
AD-converter without significantly degrading 48, 96, and 192 kHz, or for instrumental
the SNR. Amplifiers can boost hydrophone measures, with sampling frequencies ranging
signals as much as 60 dB (1000x). However, from 100 Hz to 1 MHz and more. Converters
amplifying a signal will also increase ambient for the consumer and prosumer musical market
background sounds and self-noise; very high have smooth anti-aliasing filters included, suit-
amplification could inadvertently make the able for musical signals, and a high-pass filter
noise level so high that desired signals cannot usually set below 10 Hz; instrumentation
be recorded with good fidelity. Amplifiers for converters do not have any filter on their inputs
microphones are battery-powered and have and will sample any signal starting from 0 Hz
high- and low-pass filters, which makes them (DC coupling). When using instrumentation
useful for fieldwork. converters, aliasing problems must be considered,
Speakers include power amplifiers that drive and external anti-aliasing filters must be included
a projector to generate high-amplitude acoustic in the recording chain (see Sect. 2.3.2.2).
signals in air or under water. The power ampli- An inexpensive and very portable
fier provides the higher current to drive the AD-converter unit is PoScope’s10 Mega1 sam-
speaker. Most power amplifiers used in high- pling at 500 kHz at 12 bit and recording directly
fidelity home-entertainment systems also can to a PC in PCM files via USB interface. However,
be used in bioacoustic research. However, in the PoScope, as most industrial data acquisition
some cases, more power and bandwidth are systems, including most National Instruments11
needed so that commercial broadcast power devices, has no anti-aliasing filter and the mea-
amplifiers must be used. No matter what class surement needs to be sampled at a rate much
of amplifier or preamplifier is used, one should
always consult the manufacturer’s manual.
Over-amplification can “blow” a loudspeaker 10
https://www.poscope.com/; accessed 15 Mar. 2021.
11
or underwater projector. http://www.ni.com/; accessed 22 Aug. 2021.
58 S. Madhusudhana et al.

Fig. 2.16 Spectrogram of a sinusoidal tone sampled at tone sampled at 44,100 Hz with a good AD-converter
44,100 Hz with a poor AD-converter (top panel). Note the (middle panel); the broad blue band is absent in this
low-intensity broadband noise (blue components) due to image. The bottom panel shows the constant amplitude
random jitter around the red line representing the tone’s of the signal waveform
central frequency. Spectrogram of the same sinusoidal

higher than the highest frequency contained in the higher frequencies of interest, so using a high-
input signals. If the upper-frequency content of pass filter at a selected low frequency while
the signal (including any possible noise or inter- recording is recommended.
ference such as those generated by video AD-converters are more commonly available
monitors, digital networks, and switching power in the consumer market as “digital recorders” that
supplies) is unknown, use a good-quality, also include the circuitry to save recorded data to
low-pass external filter at the known or presumed permanent storage (e.g., SD-cards or internal
upper cut-off frequency while recording and digi- memory) and an interface for powering the other
tally filter and down-sample the recorded file components (either from an external source or
thereafter. It is also important to consider that through internal batteries). Some digital recorders
strong low-frequency sounds below the desired also offer built-in selectable high-pass filters,
frequency range can limit the dynamic range at which can help reduce the low-frequency noises
2 Choosing Equipment for Animal Bioacoustic Research 59

produced by handling and suppress wind or flow form of ultrasonic microphones with integrated
noises. high-speed AD-converter and USB interface
The frequency response of the digital recorder (e.g., Dodotronic12 Ultramic family with sam-
should be matched to the frequency response of pling frequencies ranging from 200 kHz to
the sensor–preamplifier–amplifier system as close 384 kHz). Dodotronic microphones do not need
as possible and to the needs of the research. The specific drivers and can be used on Windows,
component with the narrowest frequency MacOS, and Linux, and also on Android
response is the limiting factor in the recording smartphones. Recent models include support for
chain. All AD-converters have a maximum volt- internal storage (miniSD card) and powering with
age range at the input that can be converted with- a USB battery box. The internal recorder can be
out overloading or clipping. The trick is to stay set by Bluetooth to record on trigger or on a time
below the clip-level and still have good dynamic schedule. Other similar devices are the Wildlife
range and SNRs. Other important features in Acoustics Echo Meter Touch and Petterson Ultra-
selecting the appropriate recorder are: the number sound Microphone. Another option for recording
of channels (e.g., 2, 4, 8, or more), durability, at very high sampling frequency is to use an
reliability for field-use, battery duration, flexibil- instrumentation AD-converter like the PoScope
ity and ease of use, maximum storage, integrated Mega1+.
sensors (unidirectional or directional), inputs for Many recorders are not suited for very-low-
external sensors, power options for the external frequency recording. Most have a lower limit of
sensors (P48 and/or PIP power), and the capabil- 10–20 Hz; others can record down to 7–10 Hz.
ity to connect a remote-control or a timer. Some Recording very-low-frequency animal signals is
recorders (especially many analog and digital tape complicated because this frequency range also
recorders and video-cameras) use Automatic contains environmental and electronic noise,
Gain Control (AGC) to keep the recorded volume which typically would be filtered out. For record-
within the same amplitude range. Other devices ing infrasounds (e.g., calls of elephants or baleen
have an Auto Level Control (ALC) setting or a whales), it is important to check the specifications
limiter function designed to avoid overloading or of the recorder and eventually make a bench-test
clipping. Some recorders indicate clipping either of the available frequency range using a signal
by a level-meter or with a flashing light. Any generator (a tone sweeping through a wide range
AGC, ALC, or limiter options should be disabled of frequencies is a good test signal). An option is
to perform comparisons among different sounds to use an instrumentation AD-converter with DC
or different recordings and if true sound level coupling.
measurements are needed. The gain level should
remain constant throughout a recording, and
2.3.4.2 Special Features of Digital
noted; ideally, the sampling rate and gain settings
Recorders
should remain the same among recordings, at
Pre-recording buffer memory allows the user to
least for the same subject or context.
save the few seconds of sound before pressing the
record button. Auto-start initiates the recording
2.3.4.1 Recording Ultrasounds
automatically when a certain input level is
and Infrasounds
exceeded. Double recording allows a lower-level
Ultrasonic recorders were developed mainly for
backup copy in case some parts of the primary
bat and dolphin studies; however, other animal
recording are overloaded. With this method, the
species also produce ultrasonic sounds (e.g.,
incoming sound is recorded twice, in two differ-
insects, frogs, and infant rodents). To record
ent files, the second stereo file is stored at some
ultrasound requires a sensor with suitable fre-
dB down from the first file. In terrestrial
quency extension and a recorder or an
AD-converter with a high enough sampling fre-
12
quency. An affordable solution is available in the http://www.dodotronic.com/; accessed 15 Mar. 2021.
60 S. Madhusudhana et al.

applications, a wired remote-control can be useful Elektronik13 D100) tuned to the 40–50 kHz
when it is required to hide or protect the recorder range, the call of a bat at 45 kHz (such as the
(e.g., from rain). A wireless remote-control, by Pipistrelli bat, Pipistrellus spp.) is multiplied
Bluetooth or by Wi-Fi (wireless fidelity), allows (heterodyned) by a frequency (43 kHz) generated
controlling the functions and levels by a by an internal oscillator. This produces sidebands
smartphone application, but this would consume at 88 kHz and 2 kHz (which are the sum and the
additional power and could impact energy difference of the two frequencies); the higher
budgets. File time-stamping inserts the date and frequency is eliminated with filters and the
time of the recording in the file name, rather than lower frequency is broadcast to the listener and
just a sequential number. This is extremely help- available for recording. This makes for a tunable,
ful when storing and cataloging the recordings. inexpensive bat detector that will quickly indicate
Some recorders have a computer audio-interface if bats are in the area. Heterodyning offers a
or the ability to connect a computer to record limited view of the ultrasonic spectrum but is
directly on a laptop or a tablet. This option allows still appreciated by many bat specialists.
the same recording quality while using special Frequency-division transforms the available
software for managing files (e.g., to tag files frequencies and replicates the bat call by
with a time-stamp and GPS position, or to auto- converting it into a square wave (sine wave also
matically start and stop the recording according to used) at its zero-crossing points. This wave is
received signals or according to a user-defined then divided by a preset factor (usually 10), cre-
schedule). ating another square (or sine) wave at a lower
frequency (e.g., a 40-kHz call is converted to
4 kHz). All sounds in the environment are
2.3.5 Equipment for Monitoring Bats converted in this way. As such, masking of bat
calls by noise, or overlapping of calls from differ-
Acoustic detection of ultrasonic bat calls has ent individuals, can produce results that could
emerged as the most commonly used method for become difficult to interpret. Many devices have
monitoring bat presence and activity (Collins and filters and ways to lower or otherwise adjust
Jones 2009; Gorresen et al. 2008; Weller and background noise. However, this recording
Baldwin 2012). Observing and recording bats, option is now obsolete because modern digital
other than for scientific research, is a very diffuse ultrasound recorders are capable of recording at
hobby and a common topic of citizen science. very high sampling frequencies (upward of
This results in a wide variety of bat detectors 200 kHz) and capture the full bandwidth.
produced by small companies or DIY bat detector Time-expansion bat detectors use an
kits. The common types of detectors are hetero- AD-converter to digitize sounds, convert them
dyne, frequency-division, time-expansion, zero- so that they are audible to the human operator,
crossing, and full-bandwidth digital recorders and store these digital signals to memory (usually
(Obrist et al. 2010). Some bat detectors have SD-card). Reduction of the recorded frequencies
their own specific software, either free or to be expands the sounds in time (hence the name).
purchased, for further processing of Some modern digital bat detectors do convert
recorded data. ultrasounds to audible sounds in real-time by
Heterodyning was the first developed system, means of FFT processing (Pavan et al. 2001).
completely analog, to shift one frequency (the However, there is a delay when the signals are
incoming signal) to another by multiplying it retrieved and played back at a slower speed
with a second frequency (set by the user). The (so that they can be heard with some delay). A
user can tune the detector (similar to tuning a high-frequency modulated call that sounds like a
radio) to select a frequency range accessing a
small portion of the available received frequency.
13
For example, with a bat detector (e.g., Pettersson http://www.batsound.com/; accessed 15 Mar. 2021.
2 Choosing Equipment for Animal Bioacoustic Research 61

quick click is heard as a descending note or whis- Some frequency-division detectors are com-
tle upon playback from time-expansion. bined with heterodyne and time-expansion
Zero-crossing is an algorithm for extracting capabilities into one unit. The Ciel CDB301
primary frequency information by tracking when combines both a heterodyne detector with a
the waveform crosses the zero-amplitude level at frequency-division detector, allowing the
certain rates. Zero-crossing bat detectors run con- researcher to tune into the frequency of a known
stantly, wake up when certain frequencies are bat call and identify a bat by both its sound
detected, and save information on zero-crossings contour and frequency. At the same time, the
in storage. Some advanced bat detectors also detector monitors the whole frequency band and
retain the amplitude envelope of the original checks if there are any bats in the vicinity. The
call; however, they only track the most intense Pettersson D240, like many of these dual bat
component of the call. Using zero-crossing, a bat detectors, provides heterodyning ability on one
detector documents the dominant frequency, so if, channel and time-expansion on another.
for some reason, a harmonic is dominant over the Connected to a voice-activated digital recorder,
fundamental or other signals overlap the funda- these detectors can be left in the field in monitor
mental of the call, only the most intense fre- mode and retrieved data can be analyzed on a PC
quency is recorded. The operator needs to using the product’s software (e.g., BatSound).
recognize this in order to represent the true nature The Anabat Walkabout (Fig. 2.17) records bat
of the bat’s signal. The recordings produced by signals using the zero-crossing technology and
zero-crossing detectors are usually small (e.g., also saves signals as full-spectrum WAV files
50 KB), whereas an equivalent recording of full- compatible with SonoBat software. The calls can
spectrum calls consumes considerable storage be heard and displayed at the same time and saved
space (e.g., 5 MB per call). to disk, making species identification instanta-
Full-spectrum digital bat detectors are digital neous. Units are compact, mobile, and well-suited
recorders with high sampling frequency that cap- for long-term monitoring. Solar-powered units
ture the full bandwidth of the call (Dannhof and with detachable solid-state hard drives allow for
Bruns 1991; Moir et al. 2013). In some detectors, greater periods of use.
it is also possible to hear sounds in time- For teaching or demonstration, any detector is
expansion while recording continuously. These useful, but one may consider heterodyne types of
bat detectors can record continuously or only detectors because of their low cost (i.e., every
when there are signals in a given frequency student could use one). An interesting and flexi-
band set by the user (triggered recording); this ble option is represented by ultrasonic
solution reduces the storage size and shortens microphones that incorporate a high-speed
the time needed to analyze the recordings as AD-converter that can be connected by USB to
only call series are recorded. Different trigger any computer platform (Windows, MacOS,
parameters allow selecting the frequency range Linux, iOS, Android, or Raspberry). The
to be recorded (spectral trigger) and the threshold Dodotronic Ultramic series, the Wildlife Acous-
level to activate the recorder. This technology is tics Echo Meter Touch, and the Petterson M500
available in handheld and autonomous recorders are great devices for classroom demonstration.
(see Sect. 2.4.1), and computer-based bat They allow to record ultrasounds continuously
detectors that use an external ultrasonic micro- or on trigger with a companion tablet or
phone. Some of the more advanced handheld smartphone, and provide full-spectrum recording
digital bat detectors incorporate a display to capability, audio feedback, and real-time visuali-
visualize detected calls, and also include zation. Some of these manufacturers also provide
frequency-division, time-expansion, or software for either basic operations, such as
frequency-shifting to provide acoustic feedback recording and display, or more advanced tasks
to the operator. such as bat species identification.
62 S. Madhusudhana et al.

Fig. 2.17 Some of the a. b.


detectors discussed in this
section. (a) Dodotronic
USB Ultramic 384BLE, (b)
Wildlife Acoustics (http://
www.wildlifeacoustics.
com/; accessed 15 Mar.
2021) Echo Meter Touch
2 Pro connected to an iPad
and to a smartphone, (c)
Anabat Walkabout (Titley
Scientific (http://www.
titley-scientific.com/;
accessed 15 Mar. 2021)),
and (d) D1000X bat
detector by Pettersson
Elektronik. Permission
given by the respective
manufacturers c. d.

2.3.6 Projectors output similar to the level an animal would


encounter. Generating sound in water requires
Playback studies to investigate animal behavior more energy than in air, because of the higher
have been used on many different taxa (see impedance and density of water.
Chap. 3, section on playback methods). The Among loudspeakers, some common names
projectors used for broadcasting in air and under are used to describe their general operational fre-
water also have, like the sensors, their character- quency range: a tweeter is a high-frequency
istic frequency response and operational fre- speaker typically small in diameter and a woofer
quency range. Equipment with suitable is a low to very low frequency speaker that is
characteristics should be chosen appropriately much larger in diameter than a tweeter. A system
based on the characteristics of the sounds to be with detachable loudspeakers can be convenient
transmitted. Usually, speakers are electrodynamic for placing speakers close to an animal or on
devices; however, for high frequencies, electro- opposing sides of an animal.
static speakers are also used. At high amplitudes, For underwater applications, there are two
projected sounds can distort. One must look in the types of projectors: electrodynamic devices and
manufacturer’s manual to check maximum ampli- transducers with piezoelectric elements. An elec-
tude output of the projector and select a unit trodynamic device functions like an in-air
sufficiently capable of producing amplitude speaker, but is watertight and can be used at
2 Choosing Equipment for Animal Bioacoustic Research 63

Fig. 2.18 Photograph of


JA Thomas lowering a
Lubell underwater speaker
into a melt hole to play back
underwater vocalizations to
Weddell seals
(Leptonychotes weddellii)
in the Antarctic

shallow depths. For example, a swimming pool long-term (months to years) data from remote
speaker (Lubell,14 Fig. 2.18) is an inexpensive areas and operate independent of weather and
electrodynamic device, but has a narrow fre- light conditions (e.g., Lammers et al. 2008;
quency range that is relatively flat. On the other McCauley et al. 2017; Obrist et al. 2010). Some
hand, piezoelectric projectors have projection recorders generate recordings in popular formats
sensitivity that varies with frequency. Note that (e.g., WAV files) that are compatible across sev-
many of the piezoelectric projectors are two-way eral analysis software packages, whereas others
or reciprocal devices that can also receive acous- generate a device-specific file format requiring
tic signals in water. The receiving sensitivity is the use of a specific software program for
fairly flat for a large portion of the operative analyses. Autonomous recorders eliminate the
frequency range; on the contrary, when working influence of an observer’s presence on the
as a projector, the amplitude of the generated animal’s behavior, are non-invasive, operate
signal typically increases with frequency. remotely, allow systematic periodic sampling,
and provide long-term recordings.

2.4 Autonomous Recorders


2.4.1 Terrestrial Recorders
Autonomous recorders combine the different
components of the signal chain (sound sensing, Autonomous recorders are used to study airborne
amplifying, filtering, and digitization) to offer a sounds from terrestrial animals on a long-term
packaged solution. A variety of autonomous pas- basis, during day and night, during any type of
sive acoustic monitoring (PAM) systems have weather, and in areas where the animals might not
been developed, which allow the documentation be visible because of vegetation. They are
of acoustic activity from animals and the environ- low-power, digital recorders with extended data
ment. Autonomous recorders (both terrestrial and storage capabilities enabling the recording of
aquatic) are programmable and can be set up to sounds for extended periods, continuously, or on
satisfy specific needs. These systems can obtain a pre-defined schedule (e.g., record x hours before
and after sunset or sunrise, or for x min every
14 y min). Important features of autonomous
http://www.lubell.com/; accessed 15 Mar. 2021.
64 S. Madhusudhana et al.

recorders in the field include: battery duration, certain frequency bands exceeds a preset thresh-
total recording time, recorder reliability, program- old, data are recorded. This can reduce the
ming capabilities, weatherproof construction, amount of data to be stored onboard. Recorded
tamper-proof setup, ease of data-retrieval, and data can be retrieved manually from the recorder
possible interface with video. The frequency or remotely via wireless methods. The more
response, dynamic range, and amplitude sensitiv- advanced units feature Wi-Fi, cellular network,
ity of the unit are determined by the sound sensor, or satellite communication interfaces for data
preamplifier, amplifier, and AD-converter used. transmission to a remote server. For instance,
By using a GPS or a highly precise internal clock, Pavan and team used autonomous recorders
individual recorders can be time-synchronized. (Wildlife Acoustics SM3 and SM4) to document
This allows measuring the TDOA of sounds airborne sounds for six years at three locations
among multiple recorders to triangulate and with 10-min samples every 30 min (Fig. 2.19)
locate a sound source (see Chap. 4, section on (Pavan et al. 2015; Righini and Pavan 2019).
localization). Another option is triggered Bat nocturnal activities were monitored via ultra-
recordings. For example, when the energy in sonic autonomous recorders (Wildlife Acoustics

Fig. 2.19 (a) Photograph


of autonomous acoustic
recorders placed in the
Sassofratino Nature
Reserve, Italy. In the
foreground, a Wildlife
Acoustics Song Meter
SM3. In the background, a
custom recorder developed
at the University of Pavia.
(b) Wildlife Acoustics Song
Meter SM4BAT-FS. (c)
Titley Scientific Anabat
Express. Permission to
reprint by the respective
manufacturers
2 Choosing Equipment for Animal Bioacoustic Research 65

EM3+ and SM4BAT-FS) and an ultrasonic USB expanding. Autonomous recorders with a variety
microphone (Dodotronic Ultramic 250 K) of features (such as operational longevity, high
connected to a PC-tablet. depth rating, onboard processing, and communi-
The increasing interest in acoustic monitoring cation capabilities) are produced by several com-
in the last few years has stimulated the develop- mercial organizations and academic entities.
ment of many autonomous recorders; among Examples of commercially available recorders
these, the Wildlife Acoustics series, the are the AMAR from JASCO Applied Sciences,19
Bioacoustic Audio Recorder (Frontier Labs,15 Snap from Loggerhead Instruments,20 AURAL
Brisbane, Queensland, Australia), the Swift from Multi-Électronique,21 icListen from
(Cornell Lab of Ornithology, Cornell University, Ocean Sonics,22 SoundTrap from
Ithaca, New York, USA), and the Anabat Express OceanInstrumentsNZ,23 EAR from Oceanwide
(Titley Scientific, Brendale, Queensland, Science Institute24 (Lammers et al. 2008), and
Australia). Some recent open-source examples RESEA from RTSYS.25 Academic recorders
are built around the Raspberry Pi and similar include the Rockhopper by Cornell Lab of Orni-
small-board computers. In some cases, the thology (upgraded variant of MARU; Klinck
projects are open access. However, these devices et al. 2020), USR by Curtin University
often require large batteries to sustain power over (McCauley et al. 2017), and HARP by Scripps
long periods. Examples include the Solo acoustic Institution of Oceanography (Wiggins and
monitoring platform16 (Whytock and Christie Hildebrand 2007). Selection of a particular type
2017), based on the Raspberry Pi and an external of autonomous recorder is driven by the needs
microphone; the Bat Pi 217 for monitoring bats; and limitations of the research project. Most of
and the AURITA system, which combines in a these modern recorders support recording at 16-
waterproof package the Solo recorder and a com- and 24-bit resolutions and offer flexibility to
mercially available bat recorder, the Peersonic record at different sampling frequencies and to
RPA2, to capture sounds from 60 Hz to program custom duty cycles. Some even offer
192 kHz (Beason et al. 2018). The AudioMoth,18 the flexibility to easily switch components (e.g.,
an open-source device, which also can be pur- choosing hydrophones with appropriate sensitiv-
chased and assembled, employs a low-power ity or frequency range). With the market for these
microcontroller and an onboard MEMS micro- recorders expanding, there are numerous options
phone (Hill et al. 2018) and has very basic available beyond the few products
capabilities but allows remote data acquisition at mentioned here.
very low cost on a single channel with sampling In very shallow waters, at depths reachable by
frequencies up to 384 kHz. a diver, deployment and recovery operations can
be relatively easy. At greater depths, specific
additional equipment is needed to allow the
recovery—typically, a ballast (to secure stability
2.4.2 Underwater Recorders
on the seafloor), an acoustic release, and floaters
to retrieve the recorder at the surface once the
Over the past few decades, interest in marine
bioacoustics and in underwater noise monitoring
have increased worldwide, and the market for 19
http://www.jasco.com/; accessed 15 Mar. 2021.
underwater autonomous recorders is rapidly 20
http://www.loggerhead.com/; accessed 15 Mar. 2021.
21
http://www.multi-electronique.com/; accessed
15
https://frontierlabs.com.au/; accessed 23 Aug. 2021. 23 Aug. 2021.
22
16
http://solo-system.github.io/home.html; accessed http://oceansonics.com/; accessed 15 Mar. 2021.
23
15 Mar. 2021. http://www.oceaninstruments.co.nz/; accessed
17
http://www.bat-pi.eu/; accessed 23 Aug. 2021. 15 Mar. 2021.
24
18
https://www.openacousticdevices.info/; accessed https://oceanwidescience.org/; accessed 23 Aug. 2021.
25
23 Aug. 2021. http://rtsys.eu/; accessed 15 Mar. 2021.
66 S. Madhusudhana et al.

built-in sound system of a computer is not good


enough, an external AD-converter can be easily
connected by USB, or, for special devices, by
other interface types. For fieldwork, it is prefera-
ble to choose converters with powering from the
computer USB. The quality of recordings
depends on the preamplifier noise and bandwidth,
sampling rate, and bit-resolution of the soundcard
or AD-converter. However, other features can
drive the choice: number of channels, features of
the AD-converter, the type of interface (USB,
Firewire, Thunderbolt, or proprietary), availabil-
ity of drivers for the computer, and power avail-
able for the sensors (P48 or PIP). For laptops used
in fieldwork, their size, weight, ruggedness,
power consumption, and reliability should be
considered. Most USB-based converters for
Fig. 2.20 Schematic of a mooring setup for the Rockhop- music recording are equipped with microphone
per autonomous passive acoustic recorder (Klinck et al. preamplifiers with P48 power and offer good
2020). The example includes a wide-bandwidth hydro- quality; some offer very high quality, comparable
phone from HighTech Inc. (http://www.hightechincusa. to the best digital recorder, with sampling
com/; accessed 15 Mar. 2021) (HTI-92-WB), but the
recorder offers flexibility with hydrophone choices frequencies up to 192 kHz with a number of
channels ranging from 2 to 8; some external
units provide up to 32 channels. Single-channel
releaser disconnects the recorder from the ballast
AD-converters are also available to be directly
(Fig. 2.20). Anchored units are sometimes also
connected to a P48 microphone, to transform the
diver-recovered or programmed to surface at a set
microphone into a USB microphone. However,
date and time. In ice-covered habitats, the equip-
because some quality parameters are rarely
ment can be secured to fast- or pack-ice with the
described in official specifications (e.g., the self-
hydrophone in the water.
noise, jitter-noise, and the anti-aliasing-filter
used), conducting laboratory or bench tests to
choose the best AD-converter can be necessary.
2.5 Recording Directly For specific applications, the use of instrumenta-
to a Computer tion AD-converters may be required.

Almost all computers, laptops, and tablets have


an audio input and built-in microphone. Digital 2.6 Calibration
recording of sounds is controlled by the onboard
soundcard. However, in most cases, the recording For quantitative animal bioacoustic studies,
quality of the built-in microphone is only condu- calibrated recording equipment needs to be used
cive for recording human voice or music and so that absolute sound pressure can be deter-
inadequate for animal sounds. For most animal mined. This section deals with two types of cali-
recordings, an external sound sensor (microphone bration: calibrating the recording equipment and
or hydrophone) connected to a high-quality audio calibrating the recording. To calibrate the record-
input must be used with the computer or laptop. ing, the calibration of the recording equipment is
The recordist should consult the computer applied to the recorded data.
specifications to know the frequency range and Calibrating the recording system implies deter-
dynamic range of the built-in soundcard. If the mining the frequency response and amplitude
2 Choosing Equipment for Animal Bioacoustic Research 67

Fig. 2.21 Waveform of a


sinusoidal signal (pressure
p as a function of time)
showing prms, ppk, and
ppk-pk

Calibrating the recording system implies determining the frequency response and amplitude sensitivity of the recording system. The recording system consists of several components (e.g., sensor, amplifier, and AD-converter), each with its own frequency response and amplitude sensitivity. The recording system may be calibrated as a whole by presenting a calibration signal of known amplitude and measuring the output. From the difference between output and input, the frequency response and amplitude sensitivity may be calculated. Or, each piece of equipment may be calibrated separately, and the frequency responses and amplitude sensitivities may be joined (i.e., multiplied in linear terms or summed in logarithmic terms).

The simplest calibration signal is a sine wave (i.e., a pure tone; Fig. 2.21). While the rms value is typically used in equipment calibration sheets, the peak (pk) or peak-to-peak (pk-pk) values are more easily read off signal displays on a computer or oscilloscope. For a sine wave, the conversion is:

  p_rms = p_pk / √2 ≈ 0.707 p_pk

or, in level terms,

  20 log10(p_rms / p0) = 20 log10(p_pk / p0) - 20 log10(√2) ≈ 20 log10(p_pk / p0) - 3 dB

The variable p denotes pressure. The reference pressure p0 is 20 μPa in air (i.e., for microphone calibration) and 1 μPa in water (i.e., for hydrophone calibration); also see Chap. 4 on an introduction to quantities and units. To add to the confusion, the dynamic range of analog electronics and AD-converters is given in pk-pk values. The simple equation is only valid for sinusoidal signals.

Using a sine wave yields an amplitude sensitivity at only one frequency. In order to measure the frequency response of the equipment, a series of sine waves at different frequencies needs to be presented. More commonly, white noise (i.e., a broadband signal of equal amplitude across frequency) is used and amplitude sensitivity is determined at all frequencies contained in the signal after Fourier transform of the output signal (see Chap. 4).
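As an illustration of the white-noise approach, the sketch below estimates the relative frequency response of a recording chain by comparing the power spectral density of the white noise fed into the system with that of the recorded output. It is a minimal example, not taken from this chapter: the file names, the use of SciPy's Welch estimator, and the assumptions of mono files with equal length and sampling rate are illustrative choices.

```python
# Hedged sketch: relative frequency response of a recording chain estimated
# from a white-noise calibration signal. File names are placeholders.
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

fs_in, x = wavfile.read("white_noise_reference.wav")   # signal fed into the system (mono)
fs_out, y = wavfile.read("white_noise_recorded.wav")   # signal recorded through the system (mono)
assert fs_in == fs_out, "reference and recording must share one sampling rate"

f, Pxx = welch(x.astype(float), fs=fs_in, nperseg=4096)    # PSD of the input
_, Pyy = welch(y.astype(float), fs=fs_out, nperseg=4096)   # PSD of the output

# Relative frequency response in dB (skip the 0-Hz bin to avoid dividing by ~0)
response_db = 10.0 * np.log10(Pyy[1:] / Pxx[1:])
for freq, level in zip(f[1::64], response_db[::64]):
    print(f"{freq:9.1f} Hz : {level:6.1f} dB")
```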

Fig. 2.22 Sketch of a generic recording system consisting of a sensor (i.e., microphone or hydrophone), amplifier, and AD-converter (e.g., a computer with soundcard). Each piece of equipment has its own sensitivity or gain (indicated by red letters). These sensitivities may be expressed in linear terms (small letters) or decibels (capital letters). The sensor converts the input pressure time series p(t) to a voltage time series V1(t), which is amplified to yield V2(t). The AD-converter produces a digital time series d(t).

A simple recording setup is shown in Fig. 2.22. A calibration signal p(t) (i.e., pure tone or white noise of known amplitude) is presented to the sensor (i.e., microphone or hydrophone). The sensor has a sensitivity s, which relates the voltage V at its output to the pressure p at its input; so s has the unit V/Pa. The sensitivity can also be expressed in dB re 1 V/Pa: S = 20 log10(s / (1 V/Pa)). The output voltage V of the sensor is typically passed to an amplifier. The amplifier gain g relates the voltage at its output to the voltage at its input and is thus unit-less: g = V2/V1. Expressed in dB, the amplifier gain is G = 20 log10(g). The output voltage of the amplifier is then passed to an AD-converter such as a soundcard on a computer. The AD-converter has a digitization gain c that relates the digital values d in the audio file to the voltage V at its input. The bit-depth of the AD-converter limits the maximum digital value (i.e., the full-scale value FS) that can be stored. The digitization gain is defined as the ratio of the full-scale value to the input voltage that produces the full-scale value: c = FS/Vmax. The digitization gain is expressed in dB re FS/V. The sensitivities (in linear terms) of each component in the recording system can be multiplied to yield the system sensitivity, which relates the digital values d in the audio file to the pressure p sensed by the sensor. In logarithmic terms, the overall system sensitivity is the sum of the sensitivities of each piece of equipment.

Once the recording system has been calibrated, it can be used to record animals or other sound sources. To determine the calibrated pressure time series p(t) from the stored data d(t), divide by all the sensitivities and gains: p(t) = d(t) / (c g s). Alternatively, using the level quantities (in dB) for each piece of equipment, the received level RL (e.g., rms sound pressure level) is determined by subtracting all sensitivities and gains from the rms amplitude level D: RL = D - C - G - S.

For example, somebody made a 10-minute recording of a singing bird. The microphone sensitivity was s = 50 mV/Pa, or S = 20 log10(0.05) = -26 dB re 1 V/Pa. The amplitude at the output of the microphone was amplified by, let's say, a factor g = 100, or G = 20 log10(100) = 40 dB. The soundcard produced a full-scale amplitude at 2 V input: c = FS/2 V, or C = 20 log10(1/2) = -6 dB re FS/V. A computer is used to process the data. If the data are read using the MATLAB (The MathWorks Inc., Natick, MA, USA) function audioread with the flag "native," then the raw digital values are presented. With the flag "double," the data are normalized by the full-scale value and so lie between -1 and +1. Computing the rms amplitude of the normalized digital time series yields a value of, let's say, 0.06. In logarithmic terms, the rms amplitude level of the stored normalized data is D = 20 log10(0.06) = -24 dB. What was the received sound pressure level of the bird song? Subtracting all the gains, the rms sound pressure level received at the microphone was -32 dB re 1 Pa (because -24 - (-6) - 40 - (-26) = -32). The standard reference pressure in air is, however, 20 μPa, which is equivalent to 20 log10(20/1,000,000) = -94 dB re 1 Pa. So, the rms sound pressure level recorded from the bird was -32 - (-94) = 62 dB re 20 μPa. The researcher might further want to compute calibrated sound spectrograms of the bird song, and so the question is how to convert the digital values to pressure values. Using the linear sensitivities and gains, p(t) = d(t) / (FS / 2 V) / 100 / (0.05 V/Pa) yields pressure samples in units of Pa.
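The arithmetic of this example can be scripted. The following Python sketch is a hypothetical illustration: the sensitivity, gain, and digitization values are simply those of the bird-song example above, the WAV file name is made up, and the soundfile package is assumed to return samples normalized to ±1.

```python
# Hedged sketch: apply component sensitivities to a normalized recording,
# following the worked bird-song example (all values are illustrative).
import numpy as np
import soundfile as sf

S = -26.0   # microphone sensitivity, dB re 1 V/Pa   (s = 50 mV/Pa)
G = 40.0    # amplifier gain, dB                      (g = 100)
C = -6.0    # digitization gain, dB re FS/V           (full scale reached at 2 V)

d, fs = sf.read("birdsong.wav")                         # normalized digital time series d(t)
p = d / (10**(C / 20) * 10**(G / 20) * 10**(S / 20))    # calibrated pressure in Pa

D = 20 * np.log10(np.sqrt(np.mean(d**2)))   # rms level of the stored data, dB re FS
RL = D - C - G - S                          # rms sound pressure level, dB re 1 Pa
print(f"SPL = {RL + 94:.1f} dB re 20 uPa")  # +94 dB converts re 1 Pa to re 20 uPa
```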
2.6.1 Microphone

To make accurate recordings of sound intensity in the laboratory or field, either from an animal or a different source, a researcher should always use a calibrated microphone.

Fig. 2.23 Specifications of a Brüel & Kjær 1/2-inch free-field microphone type 4191. (a) Photo. (b) Polar plot of receiving directionality from 16 kHz to 40 kHz. (c) Graph of frequency response. Permission to reprint from Brüel & Kjær.

A commercial microphone is calibrated when received from the manufacturer and comes with specification sheets containing amplitude sensitivity, frequency response, and reception directionality as a function of frequency in the horizontal and vertical planes. For example, the ½-inch microphone shown in Fig. 2.23a has an amplitude sensitivity of 12.5 mV/Pa or -38 dB re 1 V/Pa and a flat frequency response (to within 3 dB) from about 3 Hz to 40 kHz (Fig. 2.23c). Given its cylindrical symmetry, it is omnidirectional about its vertical axis (Fig. 2.23b). In the vertical plane, its receiving directionality is steered toward its axis; in other words, it is most sensitive in the forward (i.e., vertical in Fig. 2.23b) direction. The lower the frequency, the more receptive it becomes from other directions. To check that the microphone maintains its sensitivity over time, a bioacoustician should periodically use a calibrator. For example, the calibrator shown in Fig. 2.24 is very stable and emits a 1 kHz tone at 94 dB re 20 μPa.
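One practical use of such a calibrator tone is to verify (or derive) the end-to-end sensitivity of a recording chain. The sketch below is a hypothetical example: it assumes a WAV recording made while the calibrator (94 dB re 20 μPa, i.e., 1 Pa rms, at 1 kHz) was fitted on the microphone, and it reports the system sensitivity in dB re FS per Pa. The file name and band-pass limits are placeholders.

```python
# Hedged sketch: derive end-to-end system sensitivity from a recording of a
# 94 dB re 20 uPa (1 Pa rms) calibrator tone. Values are illustrative.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfilt

d, fs = sf.read("calibrator_tone.wav")          # normalized samples (full scale = +/-1)
sos = butter(4, [900, 1100], btype="bandpass", fs=fs, output="sos")
tone = sosfilt(sos, d)                          # isolate the 1-kHz calibration tone

rms = np.sqrt(np.mean(tone**2))                 # rms amplitude in full-scale units
sensitivity_db = 20 * np.log10(rms / 1.0)       # dB re FS/Pa (tone is 1 Pa rms)
print(f"System sensitivity: {sensitivity_db:.1f} dB re FS/Pa")
# A drift of more than ~1 dB from the previous check suggests recalibration.
```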
Fig. 2.24 A sound level calibrator (LUTRON, model SC-941) that generates 94 dB re 20 μPa at 1 kHz. The microphone to be calibrated must be inserted in the hole (1/4 inch diameter) on the left side. Adapters are available to fit other microphone diameters.

Provided there is a commercial, calibrated microphone available, a researcher can calibrate a microphone of unknown sensitivity by comparison with a calibrated microphone. Using a loudspeaker system to do this is a convenient option. Alternatively, signals of opportunity, like roadway or jet noise, may also be considered while ensuring that both microphones receive the same signals and levels. First, calibrate the sound field at the frequencies of interest with the calibrated microphone. Then, replace the calibrated microphone with the one of unknown

Fig. 2.25 Sketch of a setup to calibrate a microphone of unknown sensitivity with a microphone of known sensitivity in a constant sound field. Redrawn from a laboratory manual with permission from Lasse Jakobsen, Institute of Biology, University of Southern Denmark, Odense, Denmark.

sensitivity and record the output in the same frequency range. Do not place the two microphones side-by-side in the sound field since this could cause diffraction and distortion of the sound field. The sound field should not contain echoes, so choose an open space or an anechoic room for low frequencies. In the example of Fig. 2.25, the calibrated microphone has a sensitivity of 50 mV/Pa. In the given sound field, it produces an output signal with an amplitude of 0.3 voltage units. After the calibrated microphone has been removed and the to-be-calibrated microphone has been installed at exactly the same location, the latter produces an output signal of 0.7 voltage units. The sensitivity of the to-be-calibrated microphone is simply 0.7/0.3 × 50 mV/Pa ≈ 117 mV/Pa.
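A sketch of this substitution calculation, extended to several frequency bands, might look as follows. The band edges and voltage readings are made-up values; in practice the per-band rms voltages would come from band-pass-filtered recordings of the two microphones in the identical sound field.

```python
# Hedged sketch: substitution (comparison) calibration of an unknown microphone
# against a reference microphone of known sensitivity. Values are illustrative.
ref_sensitivity_mv_per_pa = 50.0          # known reference microphone sensitivity

# rms output of each microphone in the same sound field, per frequency band (V)
bands_hz = [(125, 250), (250, 500), (500, 1000), (1000, 2000)]
v_reference = [0.30, 0.31, 0.29, 0.30]
v_unknown = [0.70, 0.68, 0.66, 0.71]

for (f_lo, f_hi), v_ref, v_unk in zip(bands_hz, v_reference, v_unknown):
    s_unknown = v_unk / v_ref * ref_sensitivity_mv_per_pa
    print(f"{f_lo:5d}-{f_hi:<5d} Hz : {s_unknown:6.1f} mV/Pa")
```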
2.6.2 Hydrophone

High-quality commercial hydrophones are calibrated by the manufacturer with all pertinent information contained in the accompanying specification sheets. Many hydrophone types have built-in preamplifiers with amplification and impedance matching. Thus, these hydrophones come with a calibration sheet having one sensitivity value that includes the preamplifier. The sensitivity of a hydrophone is usually expressed in dB re 1 V/μPa, which is different from the expression for microphone sensitivity (dB re 1 V/Pa).

To use RESON hydrophones as examples, their most sensitive hydrophone (i.e., the one with the least negative sensitivity: TC4032; Fig. 2.26) has a sensitivity of -170 dB re 1 V/μPa (single ended). If the sound received by the hydrophone were 170 dB re 1 μPa rms, then the output from the hydrophone would be 1 V rms. To compare this to a microphone, add 120 dB, which is a factor of 10^6 in pressure (20 log10(10^6) = 120 and 10^6 μPa = 1 Pa). So, -170 dB + 120 dB yields -50 dB re 1 V/Pa. The most sensitive ½- or 1-inch microphone is -26 dB re 1 V/Pa, which is 24 dB (i.e., about 16 times, because 20 log10(16) ≈ 24) more sensitive than the TC4032 hydrophone.

Although most hydrophones are stable through time, it is wise to check the calibration periodically using a pistonphone. However, a pistonphone can determine the sensitivity of an uncalibrated hydrophone at only one frequency. The sound pressure of a pistonphone is extremely stable and is only affected by one factor: barometric pressure. For this reason, a special barometer is included with the pistonphone. For accurate calibrations, the barometric pressure should be checked, and sound pressure adjusted according to the scale on the barometer. For calibrations performed near sea level (as is often the case in marine bioacoustics), this error is negligible, but if one is working in an aquatic environment that is significantly above sea level, then this factor (approximately -2 dB at 2000 m altitude) should be included. For hydrophones to be deployed at

Fig. 2.26 Graph of amplitude sensitivity and frequency response for several RESON hydrophones with preamplifiers. The most sensitive is the TC4032; the least sensitive is the TC4035. Permission to reprint from RESON (http://www.teledyne-reson.com/; accessed 15 Mar. 2021).

great depth in the ocean, the amplitude sensitivity (and pressure resistance) should be measured in a pressure chamber.

The frequency response of an uncalibrated hydrophone (for frequencies up to a few kHz) can be measured in air by using the same method as described for a microphone (Fig. 2.25). However, for higher frequencies, this should be done in open water (e.g., a deep lake) and the method described for microphones can be used by simply substituting the microphone with a hydrophone of known sensitivity compared to one of unknown sensitivity. An appropriate amplifier and an underwater projector are needed, but a hydrophone without a built-in preamplifier also can be used as a projector. First, the environment (lake, pool, or tank) should be checked for echoes and reverberations (see Popper and Hawkins 2018 for details). The projected calibration sound must be a pulse that ends before the first echo arrives at the sensor. This necessity restricts the frequency range that can be used for calibration since the projected pulse must be ramped up and down to reduce high-frequency artifacts caused by the onset and end of the pulse.

The next step is to determine the received level of an underwater sound. For example, a dolphin click is recorded with a TC4035 hydrophone, which has a sensitivity of -215 dB re 1 V/μPa (Fig. 2.26). If the output is amplified by 60 dB (1000×) and the recorded signal is 1.2 V pk-pk, then the received level is: 20 log10(1.2) - 60 - (-215) = 1.58 - 60 + 215 ≈ 157 dB re 1 μPa pk-pk. Usually, the analog voltage signal is converted to a digital signal by an AD-converter, which has a digitization gain that also needs to be accounted for (see above).
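The sketch below repeats this received-level calculation for a transient recorded as a voltage time series. The numbers (hydrophone sensitivity and gain) are simply those of the example above, and the "click" is synthesized rather than read from a real recording.

```python
# Hedged sketch: peak-to-peak received level of a transient (e.g., a click)
# from the recorded voltage, amplifier gain, and hydrophone sensitivity.
import numpy as np

sensitivity_db = -215.0     # hydrophone sensitivity, dB re 1 V/uPa (TC4035 example)
gain_db = 60.0              # amplifier gain in dB (factor 1000)

# Illustrative stand-in for a recorded click: a short decaying sinusoid (volts)
t = np.arange(0, 0.001, 1 / 500_000)
v = 0.6 * np.sin(2 * np.pi * 100_000 * t) * np.exp(-t / 0.0002)

v_pk_pk = v.max() - v.min()                       # roughly 1.2 V pk-pk
rl_pk_pk = 20 * np.log10(v_pk_pk) - gain_db - sensitivity_db
print(f"Received level: {rl_pk_pk:.0f} dB re 1 uPa pk-pk")
```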
2.6.3 AD-Converter

A 16-bit AD-converter has a resolution of 2^16 levels, covering 65,536 counts peak-to-peak. Its full-scale value is 2^16 - 1 = 65,535 in unipolar mode,

where the digital amplitude values lie between 0 and 65,535, or ±2^15 = ±32,768 in bipolar mode, where the digital amplitude values are in the range -32,768, …, 0, …, 32,767. In decibels, the dynamic range of a 16-bit AD-converter in bipolar mode is 20 log10(32,768) ≈ 90 dB. Every bit gives ~6 dB of dynamic range in the digital domain. But a 90-dB dynamic range rarely can be realized since most electronics used before AD-conversion do not have such a large dynamic range. A 24-bit converter in bipolar mode offers a theoretical dynamic range of about 138 dB; however, only the most sophisticated electronics can provide up to 115–120 dB of dynamic range. This means that there cannot be more than 19–20 bits of real dynamic range and the remaining bits (least significant bits) are just filled by noise. AD-converter specification sheets rarely show this, thus there is a growing need to have more realistic AD-specifications to account for the intrinsic AD-converter noise and its artifacts showing as distortion and jitter. In some recording systems, the least significant bits are used to encode complementary information; however, this practice is not standard.

AD-converters thus carry an intrinsic digitization gain, which is the ratio of the full-scale value to the input voltage that leads to full-scale. The digitization gain is expressed in dB re FS/V. For example, an AD-converter with a digitization gain of -6 dB re FS/V reaches its FS value at a peak input voltage of 2 V, because 20 log10(FS/2 V) = -6 dB re FS/V. AD-converters may be calibrated with a voltage signal generator. The peak voltage of the input signal has to be less than the maximum voltage range specified in the specification sheet; otherwise, the AD-converter will be overloaded and the signal clipped.
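The rule of thumb of roughly 6 dB per bit, and a quick clipping check on recorded data, can be expressed in a few lines, as in the purely illustrative sketch below (the recording file name is made up).

```python
# Hedged sketch: theoretical dynamic range per bit depth, and a clipping check
# on a normalized recording. File name and thresholds are illustrative.
import numpy as np
import soundfile as sf

for bits in (16, 24):
    dr = 20 * np.log10(2 ** (bits - 1))
    print(f"{bits}-bit converter: ~{dr:.0f} dB theoretical dynamic range")

d, fs = sf.read("recording.wav")              # samples normalized to +/-1 full scale
clipped = np.mean(np.abs(d) >= 0.999) * 100   # percentage of samples at/near full scale
print(f"Samples at full scale: {clipped:.3f}% (values > 0 suggest clipping)")
```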
2.6.4 Autonomous Recorder

Off-the-shelf autonomous recorders are manufacturer-calibrated. The specification sheets typically give one overall amplitude sensitivity and frequency response for the entire system (including sensor, amplifier, and AD-converter). If the recorder allows variable gain settings, then the chosen gain will affect the amplitude sensitivity and needs to be accounted for. Some manuals (e.g., the SoundTrap User Guide[26]) provide guidance on how to calibrate the recorded data if read in by software packages such as MATLAB, PAMGuard, or Audacity.

[26] http://www.oceaninstruments.co.nz/wp-content/uploads/2018/03/ST500-User-Guide.pdf; accessed 5 Mar. 2021.
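Where a manufacturer quotes a single end-to-end sensitivity for such a recorder, converting samples to sound pressure is one line of arithmetic, as in the hedged sketch below. The sensitivity value and file name are invented for illustration and are not taken from any particular instrument's manual.

```python
# Hedged sketch: apply a single end-to-end sensitivity (dB re FS per uPa)
# to convert a normalized recording into pressure in uPa. Illustrative values.
import numpy as np
import soundfile as sf

end_to_end_db = -172.0                     # hypothetical system sensitivity, dB re FS/uPa
d, fs = sf.read("deployment_0001.wav")     # samples normalized to +/-1

p_upa = d / (10 ** (end_to_end_db / 20))   # calibrated pressure time series in uPa
spl = 20 * np.log10(np.sqrt(np.mean(p_upa ** 2)))
print(f"Broadband rms SPL: {spl:.1f} dB re 1 uPa")
```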


2.6.5 Measuring Self-Noise

When intending to record quiet sounds or ambient sound levels in the absence of nearby sound sources, it is important to first measure the system self-noise to avoid confounding electronic noise with environmental noise. For this, the system should record in a quiet room and the sound sensor should be in a sound- and vibration-proof box (Fig. 2.27). If using an autonomous recorder, the entire system should rest in a sound-proof box.

To record quiet sounds under water or to accurately quantify ambient sea noise, a sensitive hydrophone with a wide frequency range is needed (e.g., the TC4032, Fig. 2.26). All of the system components should have low self-noise. A "wet-ground" ground-wire from the input equipment to the water might be necessary to reduce system noise. The amplifier should have an adjustable band-pass filter to avoid aliasing during direct digital recording. The AD-converter needs sufficient bit-resolution and sampling rate to cover the frequency band of interest. The system frequency response shown in Fig. 2.27 goes up to about 100 kHz. If the full bandwidth is desired, then the sampling frequency should be at least 200 kHz. When reporting measured levels, provide the frequency range over which sound was measured and the bandwidth over which sound levels were computed (e.g., per Hz or in 1/3-octave bands).
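A common way to make that comparison is to compute power spectral density (per-Hz) levels for both the boxed self-noise recording and a field recording, as sketched below. The file names and the end-to-end sensitivity are placeholders.

```python
# Hedged sketch: compare system self-noise with ambient noise as power spectral
# density levels in dB re 1 uPa^2/Hz. File names and sensitivity are placeholders.
import numpy as np
import soundfile as sf
from scipy.signal import welch

end_to_end_db = -172.0   # hypothetical system sensitivity, dB re FS/uPa

def psd_db(path):
    d, fs = sf.read(path)
    p = d / (10 ** (end_to_end_db / 20))          # pressure in uPa
    f, pxx = welch(p, fs=fs, nperseg=fs)          # ~1-Hz resolution bins
    return f, 10 * np.log10(pxx)                  # dB re 1 uPa^2/Hz

f, self_noise = psd_db("selfnoise_box.wav")
_, ambient = psd_db("ambient_sea.wav")
usable = ambient > (self_noise + 6)               # ambient at least 6 dB above self-noise
print(f"Ambient exceeds self-noise by 6 dB in {usable.mean() * 100:.0f}% of frequency bins")
```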

Fig. 2.27 Diagram of equipment to measure underwater ambient noise. The RESON hydrophone with lowest self-noise is the TC4032. Prior to deployment, system self-noise may be determined by recording with the hydrophone in a sound- and vibration-proof box in the laboratory. Permission to reprint from RESON.

2.7 Other Gear

2.7.1 Sound Pressure Level Meter

SPL meters, also called phonometers, are used to measure ambient noise, including abiotic and biotic sounds. SPL meters have a variety of settings for transient vs. continuous sound, frequency range, amplitude range, and any weightings (Brüel and Kjær 2001). The microphone on an SPL meter is omnidirectional, can be covered with a windsock, and mounted on a tripod. The fast-setting is used for impulse or transient sounds. The slow-setting is used for continuous sounds. Most SPL meters have a selectable frequency range. The user can select a flat setting, which collects dB measurements equally over the desired bandwidth (i.e., without weightings). The A-weighting is selected when the user desires to place a filter over the sampled frequency range in an effort to account for the relative loudness perceived by the human ear (see Chap. 4, section on weighting curves). However, it is important to not underestimate the impact of infrasounds, which can be heard or perceived by animals. The C-weighting is selected when the user desires to measure the peak sound pressure level. Measurements with these filters are expressed as dB(lin), dB(A), or dB(C). To measure environmental noise over the whole spectrum (especially for species with unknown hearing curves), it is important to use the unweighted, flat setting. At low frequencies of anthropogenic noise, the type of weighting used can make a large difference in the amplitude measurement.

Out of the various measures an SPL meter may report, the most common one is perhaps the Equivalent Continuous Sound Level (Leq), which is a time-average: the equivalent constant SPL that would produce the same energy as the fluctuating sound level measured over a given time interval (e.g., 60 s).

Fig. 2.28 Recording and spectral analysis of noise in a residential area. Recording (top) of the overall sound level (A-weighted) with the LAeq level of the shown period (LAeq = 54.8 dB over the 10-min period from 16.48 to 16.58 on 20/06/2013). The unweighted spectrographic image (bottom), with frequency up to 20 kHz on a logarithmic scale, shows the spectral composition of the recorded period. At about 20 Hz is the noise generated by a truck engine. At about 16.53 occurs the noise of a passing airplane (50–1000 Hz). Bird songs appear at 1500–9000 Hz. Courtesy of Alberto Armani.

The duration of the measure must be declared as Leq,T (e.g., Leq,60s), where T is the time interval of the measurement. The level may be weighted (e.g., A or C weighting). LAeq is often used in the assessment of noise dose or sound exposure in humans (Fig. 2.28). For example, LAeq,1s = 73 dB or Leq,1s = 73 dB(A) is a measurement taken with an A-weighting filter over 1 s and LCeq,1s indicates a measurement taken with a C-weighting filter for 1 s.

Some SPL meters have a 60-s Leq setting used for short-term sampling. However, if the sound level varies randomly, calculating Leq is tricky, and so, Integrating Sound Level Meters are better (Fig. 2.29) as they determine Leq during a suitable time period. When more information on the statistics of sound levels is needed, in both time and frequency, noise-level analyzers are used (Fig. 2.29). They perform statistical analyses of sound levels over a specified period, either broadband or band-limited (e.g., in a 1-octave or 1/3-octave band). Most sophisticated, and expensive, noise measuring systems can produce spectra in narrower bands (as fine as 1-Hz bands) and calculate spectral percentiles to show the level variation statistics for each frequency band. In other words, the percentile analysis of a 1/3-octave spectrum shows what percentage of time each level is reached or exceeded within the measurement period (see Chap. 4, section on power spectral density percentiles).

All these devices need to be calibrated periodically with a known calibration tone. Calibrators are standardized at the factory and usually maintain calibration for a long time. Only specialized laboratories can certify calibrators. The calibrator signal is usually a 1-kHz sinusoidal tone at 94 dB re 20 μPa SPL rms (equivalent to a pressure of 1 Pa rms, 95.45 dB pk, or 1.41 Pa pk).
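For recordings made with a calibrated system, Leq over a given interval can also be computed directly from the pressure time series, as in the hedged sketch below. The file name, the conversion to pressure (which reuses the example sensitivities from Sect. 2.6), and the 60-s interval are illustrative; an A-weighting filter would be applied to the waveform first if LAeq were required.

```python
# Hedged sketch: equivalent continuous sound level (Leq,60s) from a calibrated
# pressure time series in air. File name and sensitivities are illustrative.
import numpy as np
import soundfile as sf

p0 = 20e-6                                  # reference pressure in air, Pa
d, fs = sf.read("street_recording.wav")     # normalized samples
p = d / 0.05 / 100 / 0.5                    # -> Pa, using example s, g, c values

block = 60 * fs                             # 60-s analysis interval
for i in range(0, len(p) - block + 1, block):
    seg = p[i:i + block]
    leq = 10 * np.log10(np.mean(seg ** 2) / p0 ** 2)
    print(f"Leq,60s starting at {i / fs:6.0f} s : {leq:5.1f} dB re 20 uPa")
```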

insects, and so most examples in this section deal


with invertebrate signalers and plant substrates.
Vibrational signals travel through various
kinds of substrates (e.g., rod-like, such as plant
stems; plate-like, such as leaf litter) as different
types of waves (e.g., bending, Rayleigh) that vary
in their direction of energy propagation (reviewed
in Elias and Mason 2014; Mortimer 2017). In
plant stems and leaves, substrate-borne vibrations
travel as bending waves (Michelsen et al. 1982)
and signal propagation is frequency-dispersive; in
other words, energy at higher frequencies
propagates faster than does energy at lower
frequencies (Michelsen et al. 1982). Furthermore,
each substrate acts as a unique filter, attenuating
some frequencies more than others (reviewed in
Elias and Mason 2014). Filtering varies among
different plant species (Bell 1980; McNett and
Cocroft 2008; Virant-Doberlet and Čokl 2004),
different parts of same plants (Čokl et al. 2005;
McNett and Cocroft 2008), and even among dif-
ferent parts of the same leaves (Čokl et al. 2004;
Magal et al. 2000).
Filtering is a key consideration for selecting a
Fig. 2.29 Photograph of Larson Davis SoundAdvisor sensor for recording or playback (Cocroft et al.
831C sound level meter with spectral analysis and sound 2014b). Importantly, the transmission and filter-
recording capabilities (left; permission to reprint from
Larson Davis (http://www.larsondavis.com/; accessed
ing properties of a given substrate can be affected
5 Mar. 2021)) and of a simple noise-level analyzer with by a sensor, if it loads on extra mass. If the aim is
calibrator (right; shown being calibrated using a 1 kHz to characterize signal parameters of a given spe-
tone with 94 dB SPL) cies, then to minimize filtering, one must choose a
sensor that adds as little mass as possible and
minimize the signal propagation distance between
2.7.2 Vibration Measurement the source and the receiver. For example, one
might affix a small and lightweight micro-
2.7.2.1 In Terrestrial Studies accelerometer to the substrate, close to the signal-
In addition to communicating through sound (i.e., ing animal. Alternatively, one might use a laser-
pressure waves propagating through air or liquid), Doppler vibrometer to detect and record signals
animals ranging from elephants to insects com- directly from the body of the signaling animal
municate by producing waves that travel through (Čokl et al. 2005).
solids (i.e., substrate-borne vibrations, also The output of a sensor is proportional to the
referred to as vibrational or seismic communica- quantity (displacement, velocity or acceleration)
tion in the literature) (Cocroft et al. 2014a; Hill that it detects – a sensor that detects displacement
2008; Hill et al. 2019; O’Connell-Rodwell 2010). will be most sensitive to low-frequency signals,
Of insects alone, an estimated ~195,000 species whereas a sensor that detects acceleration will be
communicate in part or whole via substrate-borne most sensitive to high-frequency signals. The
vibrations (Cocroft and Rodríguez 2005). Of consequence of this relationship between output
these, the most species-rich group is plant-living and quantity is that the type of sensor used

impacts the measurements that one makes of a Sensor Types Based on the Quantity
signal and how that signal is characterized. Measured
Some of the key considerations for selecting a Displacement: Phonocartridges and other piezo-
type of sensor include its sensitivity and power electric sensors have greatest sensitivity at low
needs (all sensors require power), the frequency frequencies. Phonocartridges can be quite good
and amplitude ranges of the signals, equipment for detecting low-frequency, low-amplitude
ruggedness and portability (if considered for signals in plant substrates, but placement of the
fieldwork), and cost (Table 2.1). Research photocartridge on the plant leaf or stem necessar-
questions can be framed around the signaler or ily loads the substrate and changes its transmis-
receiver, and the measurement of interest can vary sion properties (Fig. 2.30a). Additionally,
widely (e.g., number of signals produced, signal amplitude measurements made with
parameters, etc.). Different sensor types function phonocartridges are variable and not repeatable,
best in different frequency ranges, and the domi- because amplitude varies with the pressure with
nant frequency of a vibrational signal can vary which the stylus contacts the plant tissue.
widely, from <50 Hz for tremulating katydids Velocity: LDVs use the reflection of a laser
(De Souza et al. 2011; Morris 1980; Morris beam pointed at a reflective object or substrate
et al. 1994; Sarria-S et al. 2016), to between to detect the velocity of its movement. (If a sur-
50 and 200 Hz for tremulating stinkbugs face does not reflect enough of the laser for mea-
(reviewed in Čokl et al. 2014), to above 500 Hz surement, a small amount of reflective paint or
for diverse kinds of plant-feeding insects tape can be applied to the substrate.) LDVs are
(reviewed in Čokl et al. 2014). Vibrational signals highly sensitive and excellent for detecting and
can also be narrowband (McNett and Cocroft making measurements of low-amplitude signals
2008) or broadband, with energy distributed that also have energy concentrated in low
over several kHz (Cocroft 1996; Hamel and frequencies. They do not load any mass to a
Cocroft 2019). substrate, so they do not affect signal transmis-
The amplitudes of vibrational signals also vary sion in this way, and in fact, they can be used to
widely, even just within small arthropods. For characterize signals by recording from an animal
example, large neotropical katydids produce itself (Čokl et al. 2005). LDVs provide repeatable
substrate-borne vibrations by vertically measures of amplitude for vibrational signals.
oscillating their abdomens relative to the substrate Unfortunately, LDVs can be expensive. Although
(in other words, they bounce) and the amplitude they are fairly portable, they are still quite cum-
of these oscillations can be large enough to bersome compared with a micro-accelerometer.
observe with the naked eye (Belwood and Morris Additionally, because an LDV detects motion
1987; Morris et al. 1994; Rajaraman et al. 2015). perpendicular to the laser, the researcher must
In contrast, the amplitude of signals by tiny tree- decide which plane is of interest (e.g., identify
hopper nymphs can be so low as to be difficult to the major axis of motion). LDVs are not well-
detect without a very sensitive sensor, such as a suited for high-amplitude signals, as a moving
laser-Doppler vibrometer (LDV) (JH, pers. obs.). branch or stem will break the contact of the laser
The animal’s use of substrates is another key with the reflective surface and disrupt measurement.
factor to consider: some vibrationally signaling Acceleration: Accelerometers can be pur-
animals, such as small, plant-feeding insects, are chased in a wide variety of sensitivities, fre-
relatively sessile and signal from specific quency ranges, and sizes, and some models have
locations on plants of a single species (McNett the capacity for adjustable gain. For example, a
and Cocroft 2008), whereas other vibrationally commonly used micro-accelerometer in studies of
signaling animals are more motile and may signal small insects has a mass of 0.8 g and a frequency
on diverse substrate types (reviewed in Elias and range of 0.8 Hz–10 kHz. Accelerometers can
Mason 2010). generate repeatable measurements of amplitude,

Table 2.1 Examples of sensors that might be selected for vibrational communication studies, taking research aim, substrate type, signal frequency and amplitude ranges, ruggedness of equipment, and cost constraints into account

Organism: Small insect that mainly signals on herbaceous plant species (e.g., stinkbug). Aim: Characterize spectral and temporal features of signals. Substrate type: Plant stem/leaf. Dominant frequency and range: 100 Hz, most energy <200 Hz. Signal amplitude: Low. Other considerations: Study will be conducted in the lab and field; field work is in moderate environments. Cost constraints: Funding is provided, or an LDV is otherwise available. Decision: LDV is ideal for low-frequency, low-amplitude signals; will not load the substrate or affect signal transmission.

Organism: Small, sessile insect that signals on a specific part of a single plant species (e.g., treehopper). Aim: Behavioral study documenting signaling response (or lack thereof) in response to a stimulus. Substrate type: Tree branch. Dominant frequency and range: Broadband, energy distributed 100 Hz–5000 Hz. Signal amplitude: Low. Other considerations: Equipment needs to be sturdy; field study; branches sometimes move in breeze. Cost constraints: Funding is limited. Decision: Accelerometer is the best bet for funding constraints, field study, and the need to affix the sensor to the substrate; frequency range suggests that this approach will work.

Organism: Medium or large, motile arthropod (e.g., wolf spider). Aim: Characterize spectral and temporal features of signals. Substrate type: Leaf litter. Dominant frequency and range: Both high- and low-frequency signals. Signal amplitude: Medium; multiple signal elements. Other considerations: Animal moves around but is confined to an arena; signals on leaf litter. Cost constraints: Funding is provided, or an LDV is otherwise available. Decision: LDV is ideal for not loading the substrate and for characterizing low- and mid-frequency ranges.

Organism: Large, motile insect (e.g., katydid). Aim: Behavioral study documenting signaling response (or lack thereof) in response to a stimulus. Substrate type: Large plant stem/leaf. Dominant frequency and range: 20 Hz, most energy <100 Hz. Signal amplitude: High (can observe movement of plant stem as animal signals). Other considerations: Equipment needs to be sturdy and function in high heat and humidity environment. Cost constraints: Funding is limited. Decision: LDV is ideal for low-frequency signals, but high signal amplitude suggests that the laser would not remain in contact (with a moving substrate). Funding limitations and study setting suggest that an accelerometer is the place to start.
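Because the three motion quantities that these sensors report (displacement, velocity, and acceleration) are related by differentiation, a measurement made with one sensor type can be converted, for a narrowband signal, to the others using the standard sinusoidal relations v = a/(2πf) and d = a/(2πf)². The sketch below illustrates this with made-up numbers.

```python
# Hedged sketch: convert a measured acceleration amplitude to the equivalent
# velocity and displacement amplitudes for a narrowband (sinusoidal) signal.
import math

f = 150.0            # signal frequency, Hz (illustrative)
a = 0.02             # measured acceleration amplitude, m/s^2 (illustrative)

omega = 2 * math.pi * f
v = a / omega        # velocity amplitude, m/s
d = a / omega ** 2   # displacement amplitude, m

print(f"acceleration {a:.3g} m/s^2 -> velocity {v:.3g} m/s -> displacement {d:.3g} m")
```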

Fig. 2.30 Sensors that detect and measure substrate-borne vibrations. (a) A phonocartridge attached to lab-hands or a thin wooden dowel. (b) Accelerometer. (c) Piezo disc or contact microphone for detecting substrate-borne vibrations. (d–f) Accelerometers affixed to substrates with a small amount of accelerometer wax or dental wax. Lightweight supports such as twist-ties and thin hair clips are used to reduce the likelihood of the accelerometer shifting position or detaching from a substrate.

and because accelerometers are necessarily the particle motion from the sound pressure
attached to a substrate, they can measure high- measurements and the acoustic properties of the
amplitude signals that move the substrate itself. medium. This is relatively easy in an acoustic
Accelerometers are lightweight and small free-field (i.e., no nearby boundaries to sound
(Fig. 2.30b), can be rugged, and several com- propagation). However, near acoustic boundaries
monly used models can be powered by one or (like the seabed and the sea surface), the relation-
more 9-V batteries. Drawbacks of accelerometers ship between pressure and particle motion
are that attaching a sensor to a substrate loads becomes complex and so, particularly in shallow
mass to the substrate; to avoid altering of sub- waters that are inhabited by many fishes and
strate transmission properties, it is recommended invertebrates, measuring particle motion directly
to limit sensor mass to <5% of the mass of the is necessary. The result is a dearth of data on
substrate (Cocroft and Rodríguez 2005). Because particle motion and its importance to, and poten-
accelerometers detect acceleration, they are not as tial effects upon, animals. Although there are
sensitive at low frequencies as they are at higher excellent hydrophones for monitoring sound
frequencies, and they generally have lower pressure, there are far fewer devices for detecting
bandwidths than LDVs. and analyzing particle motion.
The study of animal vibrational communica- Popper and Hawkins (2018) described the
tion is rapidly growing. In order to withstand the many problems with measuring particle motion
rigor of peer-review, researchers must document in a tank and recommended that measurements be
the type, make, model, and sensitivity of the taken in the field, or at least in a specially
sensors used, and also document the factors likely designed sound exposure chamber to control the
to affect signal characteristics and propagation relative magnitudes of particle motion and sound
(e.g., substrate type and characteristics, position pressure. To make particle motion measurements,
of the animal). The relative position of the sensor it is necessary to mount three orthogonally
must be logical, consistent, and be informative for orientated vector sensors together to monitor the
the study. For sensors that attach to substrates three spatial components of particle motion. Any
(e.g., accelerometers), secure and even attach- sound can thus be resolved into its directional
ment will help achieve a good signal-to-noise components and the direction to the sound source
ratio and minimize impedance mismatch may be determined. Calibrated particle motion
(Fig. 2.30 a, d–f). measurement systems are commercially avail-
able, but expensive. An alternative approach is
2.7.2.2 In Underwater Studies to measure the sound pressure gradient in the
An important issue with respect to fishes and water to derive the particle motion in a particular
invertebrates is their sensitivity to particle motion direction.
that accompanies sound transmission, rather than Many studies have used custom-built particle
to sound pressure. Particle motion comprises par- motion sensors for studying the impacts of
ticle displacement, particle velocity, and particle anthropogenic activities on fish (e.g., Campbell
acceleration (ISO 18405 201727) and differs from et al. 2019; Solé et al. 2017; van der Knaap et al.
sound pressure in that it is a vector quantity. In 2021). GeoSpectrum Technologies Inc. offers a
contrast, sound pressure is a scalar quantity, act- few choices for off-the-shelf particle motion
ing in all directions. sensors in their M20 line of products. Each device
Popper and Hawkins (2018) reported that it is consists of an omnidirectional acoustic pressure
commonplace to characterize underwater sound sensor co-located with three (or two) dipole
by the sound pressure alone, because it is easily sensors that measure the amplitude and phase of
measured by a hydrophone, and then to estimate particle motion in the three (or two) orthogonal
directions. Being lightweight and having a small
27
https://www.iso.org/standard/62406.html; accessed form factor (e.g., the M20–040 has a 64 mm
8 Mar. 2021. diameter and is 179 mm tall; Fig. 2.31), they are

Fig. 2.31 Photograph (left) and receiving frequency response (right) of GeoSpectrum M20–040. Note that the units of the calibration curve are in terms of particle velocity level (PVL): dBV re 1 m/s. Permission to reprint from GeoSpectrum Technologies Inc. (http://www.GeoSpectrum.ca/; accessed 15 Mar. 2021).

preferred over traditional hydrophone arrays for provide the ability to select a recording time and
assessing directionality, especially for use on duration for long-term, remote monitoring of
small unmanned underwater vehicles (e.g., Stinco ambient and animal sounds.
et al. 2019). The M20 devices support direction-
ality assessments over a frequency range of 1 Hz
to 3 kHz, and the bearing uncertainty increases 2.8 Summary
with decreasing frequency and decreasing SNR.
Erbe et al. (2017) used a GeoSpectrum M20 to Technology used in bioacoustic research is
determine sound pressure, particle displacement, changing rapidly. This chapter describes cur-
particle velocity, and particle acceleration from rently used equipment in bioacoustic studies,
recreational swimmers, kayakers, and divers. along with references and websites. The chapter
starts with an introduction to the nomenclature
used in the industry, describing these as they
2.7.3 Smartphone Applications apply to animal bioacoustic research. An under-
standing of the terminology would assist a bioac-
Smartphone applications have put bioacoustic oustician with choosing appropriate equipment
research in the hands of hobbyists and citizen with characteristics suitable for a particular
scientists. Applications are inexpensive, rapidly study. Instruments that form a complete recording
evolving, and available on both Android based or playback setup are described in light of these
phones and iPhones. These applications are well- characteristics, along with mentions of a few of
suited for classroom and field demonstrations of the commonly used products available in the
bioacoustic research. The microphone and market. Considerations such as electronic noise,
soundcard in cellphones from different aliasing, sensitivity, resolution, and dynamic
manufacturers determine the frequency range range are discussed for both terrestrial and under-
and level of the sounds recorded and the type of water equipment. Autonomous recorders, that
analysis possible. A researcher needs to know the offer pre-packaged programmable solutions for
frequency range and amplitude sensitivity of the passive acoustic monitoring, are also discussed.
cellphone to ensure that the sounds of the target The discussions cover several indicative
animals can be appropriately captured. bioacoustic studies (targeting a wide variety of
Applications used in battery-operated cellphones fauna) that highlight the use of specific equipment

for different purposes and under different • Marco Pesente’s blog on getting started with
conditions. Other related types of equipment nature recording: http://www.naturesound.it/;
used in closely related fields (such as accessed 6 Sep. 2021.
biotremology, particle velocity measurement, • Useful instructions on how to build your own
etc.) are highlighted. DIY microphones can be found on the email
A priori knowledge of the target animal’s discussion lists naturerecordists
sounds is helpful in selecting appropriate equip- (naturerecordists@yahoogroups.com) and
ment. Sensing and recording equipment needs to micbuilders (micbuilders@yahoogroups.
be appropriate for the environmental conditions com).
being studied. This chapter summarizes how to • For biotremology, recent reviews that discuss
select and operate microphones and hydrophones, sensor possibilities as well as playback equip-
digital recorders, automated recording systems, ment include Wood and O’Connell-Rodwell
amplifiers, filters, sound pressure level meters, (2010) and Elias and Mason (2014). For a
and cellphone applications. Knowing the equip- thorough discussion of considerations for
ment specifications and selecting components to vibrational playback experiments, we suggest
match in frequency range and amplitude sensitivity Cocroft et al. (2014b). An email discussion list
is important. The dynamic range, amplitude sensi- of vibrational communication researchers can
tivity, and frequency response of each piece of be found at biotremology@googlegroups.
equipment in a recording setup must match and com.
suit the types of sound (i.e., their level and fre-
quency range) intended to be recorded. Periodic Smartphone applications:
calibrations of microphones and hydrophones are
necessary to ensure accurate measurements are • How to record birds for fun and science and
made, and the methods are described herein. With with a cellphone: https://www.allaboutbirds.
their wide availability and ease of use, smartphone org/news/how-to-record-bird-sounds-with-
driven approaches are gaining popularity lately. your-smartphone-our-tips/; accessed
The chapter aims to offer the reader a firm ground- 30 Jan. 2021.
ing with the concepts and available equipment
Acknowledgments SM thanks Holger Klinck, Director,
options in bioacoustics. Pointers to seek further
K. Lisa Yang Center for Conservation Bioacoustics,
understanding are provided along with information Cornell Lab of Ornithology, for his support and advice
about online resources that could offer more up-to- on some of the topics covered in the chapter. Thanks are
date information on the topic. extended by LAM to Lasse Jakobsen and Magnus
Wahlberg, Institute of Biology, University of Southern
Denmark, Odense, Denmark, and Jakob Tougaard and
Peter T. Madsen, Institute for Bioscience, Aarhus Univer-
sity, Aarhus, Denmark, for comments on this chapter.
2.9 Additional Resources WLG thanks Natalie Gannon and Mithriel for information
and photographs on marine acoustic programs in Puerto
Information about recording equipment: Rico. Michael O’Farrell provided current notes on Anabat
and other bat detector technology. Dean Julie Coonrod,
• Review by the Macaulay Library of the University of New Mexico, provided academic support for
Cornell Laboratory of Ornithology: https:// completion of this project. GP thanks Marco Pesente for
his contribution of material about DIY microphones.
www.macaulaylibrary.org/resources/audio-
recording-gear/; accessed 30 Jan. 2021.
• Introductory guide on instruments and
techniques for bioacoustics by the Interdisci- References
plinary Center for Bioacoustics and Environ-
mental Research, University of Pavia: http:// Beason RD, Rüdiger Riesch R, Koricheva J (2018)
www.unipv.it/cibra/edu_equipment_uk.html; AURITA: an affordable, autonomous recording device
for acoustic monitoring of audible and ultrasonic
accessed 30 Jan. 2021.

frequencies. Bioacoustics 28(4):381–396. https://doi. Čokl A, Zorović M, Kosi AŽ, Stritih N, Virant-Doberlet M
org/10.1080/09524622.2018.1463293 (2014) Communication through plants in a narrow
Bell PD (1980) Transmission of vibrations along plant frequency window. In: Janik EM, McGregor P (eds)
stems: implications for insect communication. Journal Studying vibrational communication. Springer, Berlin,
of the New York Entomological Society 88:210–216 pp 171–195. https://doi.org/10.1007/978-3-662-
Belwood JJ, Morris GK (1987) Bat predation and its 43607-3_10
influence on calling behavior in neotropical katydids. Collins J, Jones G (2009) Differences in bat activity in
Science 238:64–67. https://doi.org/10.1126/science. relation to bat detector height: implications for bat
238.4823.64 surveys at proposed windfarm sites. Acta
Brüel and Kjær (1982) Condenser-microphones. Brüel & Chiropterologica 11:343–350
Kjær, Denmark: 1–146. http://www.bkhome.com/doc/ Dannhof BJ, Bruns V (1991) The organ of Corti in the bat
be0089.pdf Hipposideros bicolor. Hear Res 53(2):253–268
Brüel and Kjær (2001) Environmental noise. Brüel & De Souza LR, Kasumovic MM, Judge KA (2011) Com-
Kjær, Denmark: 1–67. http://www.bkhome.com/doc/ municating male size by tremulatory vibration in a
br1626.pdf Columbian rainforest katydid, Gnathoclita sodalis
Buck CL, Malavar JC, George O, Koob GF, Vendruscolo (Orthoptera, Tettigoniidae). Behaviour 148:341–357.
LF (2014) Anticipatory 50 kHz ultrasonic https://doi.org/10.1163/000579511X559418
vocalizations are associated with escalated alcohol Elias DO, Mason AC (2010) Signaling in variable
intake in dependent rats. Behav Brain Res 271:171– environments: substrate-borne signaling mechanisms
176 and communication behavior in spiders. In:
Buzzetti F, Brizio C, Pavan G (2020) Beyond the audible: O’Connell-Rodwell CE (ed) The use of vibrations in
wide band (0–125 kHz) field investigation on Italian communication: properties, mechanisms and function
Orthoptera (Insecta) songs. Biodiversity Journal 11(2): across taxa. Research Signpost, Kerala
443–496 Elias DO, Mason AC (2014) The role of wave and sub-
Campbell J, Sabet SS, Slabbekoorn H (2019) Particle strate heterogeneity in vibratory communication: prac-
motion and sound pressure in fish tanks: a behavioural tical issues in studying the effect of vibratory
exploration of acoustic sensitivity in the zebrafish. environments in communication. In: Janik EM,
Behavioural Processes 164:38–47 McGregor P (eds) Studying vibrational communica-
Caruso F, Sciacca V, Bellia G, De Domenico E, Larosa G, tion. Springer, Heidelberg, pp 215–247. https://doi.
Papale E, Pellegrino C, Pulvirenti S, Riccobene G, org/10.1007/978-3-662-43607-3_12
Simeone F, Speziale F, Viola S, Pavan G (2015) Size Erbe C (2009) Underwater noise from pile driving in
distribution of sperm whales acoustically identified Moreton Bay, Qld. Acoustics Australia 37(3):87–92
during long term deep-sea monitoring in the Ionian Erbe C, Verma A, McCauley R, Gavrilov A, Parnum I
Sea. PLoS One 10(12):e.0144503. https://doi.org/10. (2015) The marine soundscape of the Perth Canyon.
1371/journal.pone.0144503 Progress in Oceanography 137:38–51. https://doi.org/
Cocroft RB (1996) Insect vibrational defence signals. 10.1016/j.pocean.2015.05.015
Nature 382:679–680 Erbe C, Parsons M, Duncan AJ, Lucke K, Gavrilov A,
Cocroft RB, Rodríguez RL (2005) The behavioral ecology Allen K (2017) Underwater particle motion (accelera-
of insect vibrational communication. BioScience 55: tion, velocity and displacement) from recreational
323. https://doi.org/10.1641/0006-3568(2005)055[ swimmers, divers, surfers and kayakers. Acoust Aust
0323:TBEOIV]2.0.CO;2 45:293–299. https://doi.org/10.1007/s40857-017-
Cocroft RB, Gogala M, Hill PSM, Wessel A (eds) (2014a) 0107-6
Studying vibrational communication. Springer-Verlag, Erbe, C., Marley, S., Schoeman, R., Smith, J. N., Trigg, L.,
Berlin, Heidelberg. https://doi.org/10.1007/978-3-662- and Embling, C. B. (2019). The effects of ship noise on
43607-3 marine mammals--A review. Front Mar Sci 6, 606.
Cocroft RB, Hamel JA, Su Q, Gibson J (2014b) Vibra- https://doi.org/10.3389/fmars.2019.00606.
tional playback experiments: challenges and Favali P, Chierici F, Marinaro G, Giovanetti G,
solutions. In: Janik EM, McGregor P (eds) Studying Azzarone A, Beranzoli L, De Santis A, Embriaco D,
vibrational communication. Springer, Heidelberg, pp Monna S, Lo Bue N, Sgroi T, Cianchini G, Badiali L,
249–274 Qamili E, De Caro MG, Falcone G, Montuori C,
Čokl A, Prešern J, Virant-Doberlet M, Bagwell GJ, Millar Frugoni F, Riccobene G, Sedita M, Barbagallo G,
JG (2004) Vibratory signals of the harlequin bug and Cacopardo G, Calì C, Cocimano R, Coniglione R,
their transmission through plants. Physiological Ento- Costa M, D’Amico A, Del Tevere F, Distefano C,
mology 29:372–380. https://doi.org/10.1111/j. Ferrera F, Valentina Giordano V, Massimo Imbesi M,
0307-6962.2004.00395.x Dario Lattuada D, Migneco X, Musumeci M,
Čokl A, Zorović M, Žunič A, Virant-Doberlet M (2005) Orlando A, Papaleo R, Piattelli P, Raia G, Rovelli X,
Tuning of host plants with vibratory songs of Nezara Sapienza P, Speziale F, Trovato A, Viola S, Ameli F,
viridula L (Heteroptera: Pentatomidae). J Exp Biol Bonori M, Capone A, Masullo R, Simeone F,
208:1481–1488. https://doi.org/10.1242/jeb.01557 Pignagnoli L, Zitellini N, Bruni F, Gasparoni F,

Pavan G (2013) NEMO-SN1 Abyssal Cabled observa- wireless microphone array for spatial monitoring of
tory in the Western Ionian Sea. IEEE J Ocean Engineer animal ecology and behaviour. Methods Ecol Evol
38(2):358–374 3(4):704–712
Gannon WL, Lawlor TL (1989) Variation in the chip Michelsen A, Fink F, Gogala M, Traue D (1982) Plants as
vocalization of three species of Townsend chipmunks transmission channels for insect vibrational songs.
(genus Eutamias). J Mammal 70:740–753 Behav Ecol Sociobiol 11:269–281
Gorresen PM, Miles AC, Todd CM, Bonaccorso FJ, Miller BS, Barlow J, Calderan S, Collins K, Leaper R,
Weller TJ (2008) Assessing bat detectability and occu- Olson P, Ensor P, Peel D, Donnelly D, Andrews-
pancy with multiple automated echolocation detectors. Goff V, Olavarria C, Owen K, Rekdahl M,
J Mamm 89:11–17 Schmitt N, Wadley V, Gedamke J, Gales N, Double
Hamel JA, Cocroft RB (2019) Maternal vibrational signals MC (2015) Validating the reliability of passive acous-
reduce the risk of attracting eavesdropping predators. tic localisation: a novel method for encountering rare
Front Ecol Evol 7:614. https://doi.org/10.3389/fevo. and remote Antarctic blue whales. Endang Species Res
2019.00204 26:257–269
Hill PSM (2008) Vibrational communication in animals. Moir HM, Jackson JC, Windmill JFC (2013) Extremely
Harvard University Press, Cambridge, MA high frequency sensitivity in a ‘simple’ ear. Biol Lett 9:
Hill AP, Prince P, Covarrubias EP, Doncaster CP, 20130241
Snaddon JL, Rogers A (2018) AudioMoth: Evaluation Morris GK (1980) Calling display and mating behaviour
of a smart open acoustic device for monitoring biodi- of Copiphora rhinoceros Pictet (Orthoptera:
versity and the environment. Meth Ecol Evol:1–13 Tettigoniidae). Anim Behav 28:42-IN1. https://doi.
Hill P, Lakes-Harlan R, Mazzoni V, Narins PM, Virant- org/10.1016/S0003-3472(80)80006-6
Doberlet M, and Wessel A. eds. (2019). Biotremology: Morris GK, Mason AC, Wall P, Belwood JJ (1994) High
studying vibrational behavior. Springer International ultrasonic and tremulation signals in neotropical
Publishing: Cham. https://doi.org/10.1007/978-3-030- katydids (Orthoptera: Tettigoniidae). J Zool 233:129–
22293-2 163. https://doi.org/10.1111/j.1469-7998.1994.
Jensen ME, Miller LA (1999) Echolocation signals of the tb05266.x
bat, Eptesicus serotinus, recorded using a vertical Mortimer B (2017) Biotremology: do physical constraints
microphone array: effect of flight altitude on searching limit the propagation of vibrational information? Anim
signals. Behav Ecol Sociobiol 47:60–69 Behav 130:165–174. https://doi.org/10.1016/j.
Klinck H, Winiarski D, Mack RC, Tessaglia-Hymes CT, anbehav.2017.06.015
Ponirakis DW, Dugan PJ, Jones C, Matsumoto H Nosengo N (2009) The Neutrino and the whale. Nature
(2020) The ROCKHOPPER: a compact and extensible 462:560–561
marine autonomous passive acoustic recording O’Connell-Rodwell CE (2010) The use of vibrations in
system. In: OCEANS 2020 MTS/IEEE Global. IEEE, communication: properties, mechanisms and function
pp 1–7 across taxa. Research Signpost, Kerala
Lammers MO, Brainard RE, Au WW, Mooney TA, Wong Obrist MK, Pavan G, Sueur J, Riede K, Llusia D, Márquez
KB (2008) An ecological acoustic recorder (EAR) for R (2010) Bioacoustic approaches in biodiversity
long-term monitoring of biological and anthropogenic inventories. In: Manual on field recording techniques
sounds on coral reefs and other marine habitats. J and protocols for all taxa biodiversity inventories. Abc
Acoust Soc Am 123(3):1720–1728 Taxa 8:68–99. ISSN 1784-1283 (hard copy) ISSN
Lynch E, Joyce D, Fristrup K (2011) An assessment of 1784-1291 (online pdf). Available at http://www.
noise audibility and sound levels in U.S. National abctaxa.be/volumes/volume-8-manual-atbi/Chapter-5/
Parks. Landscape Ecol 26:1297. https://doi.org/10. Pavan G (2017) Fundamentals of soundscape
1007/s10980-011-9643-x conservation. In: Farina A, Gage SH (eds)
Magal C, Schöller M, Tautz J, Casas J (2000) The role of Ecoacoustics: the ecological role of sound. Wiley,
leaf structure in vibration propagation. J Acoust Soc Hoboken, pp 235–258
Am 108:2412–2418. https://doi.org/10.1121/1. Pavan G, Borsani JF (1997) Bioacoustic research on
1286098 cetaceans in the Mediterranean Sea. Mar Freshw
McCauley RD, Thomas F, Parsons MJG, Erbe C, Cato D, Behav Physiol 30:99–123
Duncan AJ, Gavrilov AN, Parnum IM, Salgado-Kent Pavan G, Manghi M, Fossati C (2001) Software and hard-
C (2017) Developing an underwater sound recorder. ware sound analysis tools for field work. Proc. 2nd
Acoust Aust 45(2):301–311. https://doi.org/10.1007/ Symposium on Underwater Bio-sonar and Bioacoustic
s40857-017-0113-8 Systems. Proc. I.O.A. 23(part 4):175–183
McNett GD, Cocroft RB (2008) Host shifts favor vibra- Pavan G, Fossati C, Caltavuturo G (2013) Marine bio-
tional signal divergence in Enchenopa binotata acoustics and computational bioacoustics at the Uni-
treehoppers. Behav Ecol 19:650–656. https://doi.org/ versity of Pavia (Italy). In: Adam O, Samaran F (eds)
10.1093/beheco/arn017 Detection classification and localization of marine
Mennill DJ, Battiston M, Wilson DR, Foote JR, Doucet mammals using passive acoustics. 2003–2013:
SM (2012) Field test of an affordable, portable,

10 years of international research. DIRAC NGO, Paris, Stinco P, Tesei A, Biagini S, Micheli M, Ferrri G, LePage
pp 3–25. 1–298. ISBN 978-2-7466-6118-9 KD (2019, June). Source localization using an acoustic
Pavan G, Favaretto A, Bovelacci B, Scaravelli D, vector sensor hosted on a buoyancy glider. In
Macchio S, Glotin H (2015) Bioacoustics and OCEANS 2019-Marseille (pp. 1–5). IEEE. https://
ecoacoustics applied to environmental monitoring and doi.org/10.1109/OCEANSE.2019.8867452
management. Rivista Italiana di Acustica 39(2):68–74 Streicher R, Everest FA (1998) The new stereo sound
Payne K, Langbauer WR Jr, Thomas EM (1986) Infra- book. AES Publishing, Pasadena, CA
sonic calls of the Asian Elephant (Elephas maximus). Thode A, Skinner J, Scott P, Roswell J, Strailey J, Folkert
Behav Ecol Sociobiol 18:297–301 K (2010) Tracking sperm whales with a towed acoustic
Poole JH, Payne K, Langbauer WR, Moss CJ (1988) The vector sensor. J Acoust Soc Am 128(5):2681–2694
social context of some very low frequency calls of van der Knaap I, Reubens J, Thomas L, Ainslie MA,
African elephants. Behav Ecol Sociobiol 22:385–392 Winter HV, Hubert J, Martin B, Slabbekoorn H
Popper AN, Hawkins AD (2018) The importance of parti- (2021) Effects of a seismic survey on movement of
cle motion to fishes and invertebrates. J Acoust Soc free-ranging Atlantic cod. Curr Biol 31(7):1555–1562.
Am 143(1):470–488. https://doi.org/10.1121/1. e4. https://doi.org/10.1016/j.cub.2021.01.050
5021594 Virant-Doberlet M, Čokl A (2004) Vibrational communi-
Rajaraman K, Godthi V, Pratap R, Balakrishnan R (2015) cation in insects. Neotropical Entomology 33:121–
A novel acoustic-vibratory multimodal duet. J Exp 134. https://doi.org/10.1590/S1519-
Biol 218:3042–3050. https://doi.org/10.1242/jeb. 566X2004000200001
122911 Wahlstrom S (1985) The parabolic reflector. J Audio Eng
Rayburn RA (2011) Eargle’s the microphone book: from Soc 33(6):418–429
mono to stereo to surround: a guide to microphone Weller TJ, Baldwin JA (2012) Using echolocation
design and application, 3rd edn. Elsevier, Waltham, MA monitoring to model bat occupancy and inform
Righini R, Pavan G (2019) First assessment of the mitigations at wind energy facilities. J Wildl
soundscape of the Integral Nature Reserve “Sasso Manag 76:619–631
Fratino” in the Central Apennine, Italy. Biodiversity Whytock RC, Christie J (2017) Solo: an open source,
Journal 21(1):4–14. https://doi.org/10.1080/14888386. customizable and inexpensive audio recorder for
2019.1696229 bioacoustic research. Meth Ecol Evol 8:308–312
Sarria-S FA, Buxton K, Jonsson T, Montealegre ZF (2016) Wiggins SM, Hildebrand JA (2007) High-frequency
Wing mechanics, vibrational and acoustic communica- acoustic recording package (HARP) for broad-band,
tion in a new bush-cricket species of the genus long-term marine mammal monitoring. In: Interna-
Copiphora (Orthoptera: Tettigoniidae) from Colombia. tional Symposium on Underwater Technology and
Zoologischer Anzeiger - A Journal of Comparative Workshop on Scientific Use of Submarine Cables and
Zoology 263:55–65. https://doi.org/10.1016/j.jcz. Related Technologies 2007 (IEEE, Tokyo, Japan):
2016.04.008 551–557
Sciacca V, Caruso F, Beranzoli L, Chierici F, De Wood JD, O’Connell-Rodwell CE (2010) Studying vibra-
Domenico E, Embriaco D, Favali P, Giovanetti G, tional communication: equipment options, recording,
Larosa G, Marinaro G, Papale E, Pavan G, playback and analysis techniques. In: The use of
Pellegrino C, Pulvirenti S, Simeone F, Viola S, vibrations in communication: properties, mechanisms
Riccobene G (2015) Annual acoustic presence of fin and function across taxa. Research Signpost, Kerala,
whale (Balaenoptera physalus) offshore Eastern Sicily, pp 163–182
Central Mediterranean Sea. PLoS One 10(11): Zimmer WMX (2011) Passive acoustic monitoring of
e0141838. https://doi.org/10.1371/journal.pone. cetaceans. Cambridge University Press, Cambridge
0141838 Zimmer WMX (2013) Range estimation of cetaceans with
Solé M, Sigray P, Lenoir M, Van Der Schaar M, compact volumetric arrays. J Acoust Soc Am 134(3):
Lalander E, André M (2017) Offshore exposure 2610–2618
experiments on cuttlefish indicate received sound pres- Zotter F, Frank M (2019) Ambisonics. Springer Interna-
sure and particle motion levels associated with acoustic tional Publishing, Cham. ISBN 978-3-030-17206-0.
trauma. Sci Rep 7(1):1–13 https://doi.org/10.1007/978-3-030-17207-7
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
3 Collecting, Documenting, and Archiving Bioacoustical Data and Metadata

William L. Gannon, Rebecca Dunlop, Anthony Hawkins, and Jeanette A. Thomas

Jeanette A. Thomas (deceased) contributed to this chapter while at the Department of Biological Sciences, Western Illinois University-Quad Cities, Moline, IL, USA
W. L. Gannon (*), Department of Biology, Museum of Southwestern Biology and Graduate Studies, University of New Mexico, Albuquerque, NM, USA. e-mail: wgannon@unm.edu
R. Dunlop, School of Biological Sciences, University of Queensland, Brisbane, QLD, Australia. e-mail: r.dunlop@uq.edu.au
A. Hawkins, The Aquatic Noise Trust, Kincraig, Blairs, Aberdeen, UK

3.1 Introduction

Over the last 100 years, bioacoustical research has led to many important discoveries about the role of sounds in animal behavior. Over time, best practices have evolved in bioacoustical research, often through trial and error. In this chapter, these best practices, based on the literature and the co-authors' experiences and opinions, are summarized. We recommend methods to properly collect and conserve data, use appropriate equipment, save time, and perhaps even make a study more affordable. It is advised, of course, that researchers conduct a current literature review before beginning their work, as developments in technique and technology are moving at a fast pace.

Although methods in bioacoustical studies are typically non-invasive, research should be conducted in an ethical way and any necessary permits obtained. Bioacoustical research should be repeatable: another investigator should be able to understand the circumstances of the recordings, replicate and apply the results, and be reassured the methods were appropriate for the goals of the study. Detailed logs of recordings are important and should include names of researchers; date and time; location; ambient conditions; equipment specifications; species, age, and sex; and behavioral context of the animal during the recording. Details of data collection and signal analysis should accompany any results, such as frequency range, sampling rate, bit-resolution, analysis bandwidth and interval, amplitude range, and any filtering or weightings used.

Here, we also discuss special considerations, or adaptation of methods, for acoustic studies in aquatic versus terrestrial field environments, as well as considerations for studies on captive animals. The "playback" technique, where a sound is played back to an animal and the response noted, is a common method used in bioacoustical studies, and this chapter provides recommendations for designing a robust playback study. Finally, methods for data archival, and current repositories for bioacoustical data, are provided as a resource for those interested in examining existing data or preserving their own recordings.

3.2 Ethical Research

As with all scientific endeavors, bioacousticians work to answer questions and address hypotheses by observing or manipulating the natural world. There is an ethical obligation to document procedures and methods, so that reported results are understandable and reproducible by other researchers. A reliable way of understanding data, and how they were collected, is by documenting metadata associated with a recording. Metadata are the description of basic information collected at the time of the recording, such as the recordist; date and time; specific location (GPS coordinates); equipment and settings; water depth or altitude; water or air medium; water or air temperature/humidity; weather conditions; and species, sex, age, and behavior of the animals. Knowing the who, what, when, and where of acoustic recordings makes acoustic data more useful and allows a review of methods by other researchers to validate or supplement data.

Although bioacoustical studies are usually non-invasive, investigators need to consider and minimize any potential effects of their work on animals (e.g., avoid playbacks of extremely loud or injurious sounds that could disturb animals in critical breeding and feeding areas). In many cases, animal ethics permits and/or research permits are needed from the country, state, county, or any other political entity in which the study will be conducted. If the species is endangered, additional permits may be required. Most research institutions receiving funding from the USA government require investigators to submit an animal research protocol to an Institutional Animal Care and Use Committee (IACUC) for approval before conducting research involving any animals. Ethical conduct of research goes beyond satisfying the requirements of the IACUC and includes responsible data collection and management, appropriate statistical analyses, thorough presentation and archival of data, and a study that is reproducible. Additionally, research should be reported, peer-reviewed, and published ethically. This falls under research ethics principles and studies that are conducted with scientific integrity (Fig. 3.1). Most researchers consider their work with animals to be harmless and therefore ethical. However, the process of thinking through how animals could be affected, and of proposing research methods during the preparation of an IACUC protocol, can be very instructive. In some cases, preparing a protocol for review can save a project from mistakes (such as low statistical power, inadequate or illegal animal housing or handling methods, unnecessary duplication, unnecessary expense, or unrecognized alternative hypotheses). In fact, developing a research protocol can serve to make the research more robust.

Gannon (2014) provided two examples that illustrate a potentially unethical study and posed the question of whether a research permit was needed. In 1991, a rare migrant yellow-green vireo (Vireo flavoviridis) was spotted at protected parklands in Rattlesnake Springs, New Mexico, USA. The sighting was announced on the rare-bird hotline and a number of people went to the area to view the bird and to add it to their "life list." During this time, a PhD student was collecting goldfinches (Spinus tristis). Knowing that genetic material and voucher specimens are important to taxonomic and conservation research, he decided to collect the rare bird for a museum research collection. To entice the bird to an unprotected area for easy and legal collection, he recorded calls of the vireo and then played them back where he could legally collect the bird. The birding community became incredulous and angry. Was it ethical to record and use playbacks of this species' calls to lure the bird to an unprotected area for collection (see Gluck 1998)?

More recently, as characterized in Fig. 3.2, a smartphone birding application was used to lure a male common yellowthroat (Geothlypis trichas) into view. White (2013) described that broadcasting calls, using a smartphone application, generally elicits a quick response from a normally concealed bird. Possibly thinking the sounds were from another male of his species threatening his territory, the male yellowthroat swooped down right in front of a birding tour
and was photographed. Is it ethical to lure a bird to impress a tour group or does the playback burden the bird with unnecessary stress, perhaps reducing his fitness? Should acoustic luring be prohibited for all bird species or for only endangered animals? Conversely, should these techniques be encouraged in order to raise awareness of wild things to a public who are increasingly alienated from nature?

Ethical treatment of animals serves to make a research project rigorous and results stronger. Given the personnel time to design experiments, obtain permits, and conduct bioacoustical research, and given the expense and potential disturbance to animals, is the project worth doing? If it is worth doing, it is worth doing well.

Fig. 3.1 A collage of common reference materials and journals that are used to advise on the responsible conduct of research with animals. Considerations of the integrity of the scientific process and the ethics of how a study is conducted undoubtedly produce better science

3.3 Good Practices in Bioacoustical Studies

Once research questions have been developed and equipment has been selected (see Chap. 2 on equipment choices), recording can begin! Animals can be recorded in a controlled laboratory or in the field. Bioacousticians often need to be innovative when collecting acoustic data in field situations because additional equipment, AC-power, and access to repairs are not always available. Below is a summary of some recommendations for beginning bioacousticians. All suggestions are relevant to both terrestrial and aquatic environments unless identified otherwise.
Fig. 3.2 Caricature of an ornithologist luring a bird by playback of bird calls (with permission of the illustrator Rohan Chakravarty)

3.3.1 Recording Sounds

It is best to work toward making the cleanest recording possible for accurate acoustic analysis. Be sure that you have a solid understanding of the gain and level controls on your recorder. The gain and level meter work in concert and the person making the recording needs to be comfortable with these settings before serious acoustic research begins. Ideally the entire recording chain should be calibrated. Calibration generally refers to correlating the readings of an instrument with those of a standard for the purpose of checking the instrument's accuracy. When recording sound, a calibration signal (a pure tone) of known frequency and amplitude should be placed at the beginning of all recordings. Some recorders have a built-in calibration tone. The tone also can be used to mark an important section of the recording. Having a calibration tone on a recording allows measurement of absolute amplitude, rather than just relative amplitude. This step is necessary if the researcher wants to report source-levels of animal or environmental sounds. Calibrating recording equipment is referred to in Chap. 2 of this volume. Ideally the distance to the sound source (vocalizing animals in our case) should be known. A common "trick" is dropping a colored poker chip at the point where the recording is started and then, while moving toward the sound source, dropping additional chips until the point where the animal that had been calling has presumably run off. The distance can then easily be measured between chips. Absolute distance and calibration of the recording system are difficult in field studies.
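For recordists who prepare their own calibration tones, this step can be scripted. The following is a minimal sketch, assuming the Python packages numpy and soundfile are available; the 1-kHz frequency, relative amplitude, and file name are illustrative choices, not a standard.

```python
# Minimal sketch: generate a 10-s, 1-kHz calibration tone of known relative
# amplitude that can be played and recorded (or prepended) at the start of a
# session. Assumes numpy and soundfile are installed; values are illustrative.
import numpy as np
import soundfile as sf

fs = 48000          # sampling rate (Hz); match the recorder settings
duration = 10.0     # seconds
freq = 1000.0       # calibration tone frequency (Hz)
amplitude = 0.5     # peak amplitude relative to full scale (document this!)

t = np.arange(int(fs * duration)) / fs
tone = amplitude * np.sin(2 * np.pi * freq * t)

# An uncompressed 16-bit PCM WAV keeps the tone easy to re-measure later
sf.write("calibration_tone_1kHz.wav", tone.astype(np.float32), fs, subtype="PCM_16")
```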
If more than one channel is available on a recorder, use one channel to narrate metadata and the animals' behaviors with the second channel dedicated to recording animal sounds. This allows all details and conditions of the situation to be documented in real-time and synchronized with the animals' sounds and behaviors. After each session, the researcher should listen to the recordings to make sure signals were recorded and the equipment was working properly. We recommend making a copy of each recording and storing the backup and the original in different places.

When possible, use battery-power or direct-current (DC), rather than alternating-current (AC) wall- or shore-power. Using batteries eliminates background electronic noise and provides portability of the equipment. AC-power can create a 50-Hz (European power) or 60-Hz (North American power) hum or background noise on a recording. This frequency-specific noise is easy to recognize and filter-out, preferably during the recording. However, if the animal produces low-frequency signals (e.g., 20-Hz calls from some baleen whales, low-frequency knocks and grunts from fish, rumbles by elephants) the recordings should not be filtered. Note that in extremely cold locations, battery-life will be shorter and any type of mechanical components such as belts, gears, toggles, reels, or digital equipment can cease to operate correctly. We recommend that backup batteries be available or on-charge for quick battery exchange.
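If mains hum does need to be removed after the fact, and the study species produces no energy near 50 or 60 Hz, a narrow notch filter is one option. The sketch below assumes scipy and soundfile; the file name, notch frequency, and quality factor are illustrative only.

```python
# Minimal sketch: attenuate a 50-Hz (or 60-Hz) mains hum with a narrow notch
# filter. Only appropriate when the target signals do not overlap the notch.
import soundfile as sf
from scipy.signal import iirnotch, filtfilt

data, fs = sf.read("field_recording.wav")   # hypothetical file name
hum_freq = 50.0                             # use 60.0 for North American power
quality = 30.0                              # higher Q = narrower notch

b, a = iirnotch(hum_freq, quality, fs=fs)
cleaned = filtfilt(b, a, data, axis=0)      # zero-phase filtering

sf.write("field_recording_notch.wav", cleaned, fs)
```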
3.3.2 Environmental Conditions

Equipment should be selected based on environmental conditions at the field site including ambient temperature and humidity, prevalence of wind and waves, amount and type of precipitation, and frequency and amplitude of the target species (Fig. 3.3; see Chap. 2 on equipment choices). Before commencing field work, check the weather forecast. Recording animal sounds during precipitation, high wind, or a high sea-state often is futile because incoming signals will be masked. In addition, animals sometimes do not call during these conditions. In terrestrial environments, noise from wind, weather, moving vegetation, or other animal sounds can mask recordings of the target species (see Chap. 5 on the source-path-receiver model for airborne sound). In aquatic habitats, wind, sea-state, breaking waves, precipitation, and other animal sounds can create a noisy background. In both terrestrial and aquatic environments, anthropogenic noise (from vehicles and vessels, industrial operations, military activities, etc.) essentially is omnipresent (see Chap. 7 on soundscapes). If using a remote recording system, protect the unit from the weather and secure it as best possible. Be aware that even in remote locations, theft of field equipment occurs.

Fig. 3.3 Conditions in the field often contrast sharply from those in a controlled laboratory environment. Working to exclude bats (Townsend's big-eared bat, Corynorhinus townsendii) from gold mining operations in Nevada, USA (top left). Recording assures animals are excluded prior to destroying the tunnel system for mineral extraction. Mitigation sites are identified (top right) which are gated and protected for bats to inhabit safely. Occasional sampling is completed by live-capture (bottom left) and acoustic monitoring (bottom right). All photos by authors except bottom left (MNH field biologists collect bat specimens, by Florante A. Cruz; https://www.wikiwand.com/en/UPLB_Museum_of_Natural_History; licensed under CC BY-SA 4.0; https://creativecommons.org/licenses/by-sa/4.0/)
Fig. 3.4 Photographs of researchers in Antarctica recording a killer whale (Orcinus orca; left) and Weddell seal (Leptonychotes weddellii; right). Equipment is protected from being molested by the animal but is also not prominent, so as to not draw the subject's attention. Note the researcher on the right maintains a distance from the seal so as not to disturb it

Documenting the ambient temperature and humidity is especially important when studying ectothermic terrestrial animals, such as reptiles, frogs, toads, insects, or other invertebrates. At low ambient temperatures, ectothermic animals are less active and sounds are lower in frequency than during higher ambient temperatures. For example, studies by Kissner et al. (1997) demonstrated that sounds from ectothermic animals, such as rattlesnakes (Crotalus viridis), change with ambient temperature and humidity.

3.3.3 Animal Considerations

The transducer should be positioned so target-animal sounds are recorded but the animal does not damage the equipment. An aggressive or curious animal can quickly demolish a recording system (Fig. 3.4). Equipment used in playback studies can be particularly susceptible to an animal attack. The goal of recording is to document sounds from natural circumstances and not from a charging or frightened animal. Captive animals often are curious about a hydrophone or a microphone in their enclosure and may need time to habituate to equipment before undisturbed sounds are produced. Placing the transducer in a protected area or in a protective mesh cage may be necessary.

Researchers should not disturb animals while recording (Fig. 3.5). If possible, the recordist should hide in a blind spot or use an automated recording system with no observer present. Note that sometimes narrating observations of the animal's behavior during the recording is useful, which means that the researcher should decide between using a remote setup and a setup where they are nearby. To concurrently monitor animal behavior, a video camera on a tripod can be used, with minimal disturbance to the animal. However, the researcher should be aware that the audio track of a video camera has a limited frequency response and an auto-adaptive level control, meaning these sound recordings should not be relied upon for acoustical analysis. Closed Circuit Television (CCTV), synchronized with omnidirectional microphones on an ultrasonic detector, and coordinated using a mobile phone and speaking clock, has been used to document new vocalizations and activity patterns for barbastelle bats (Barbastella barbastellus; Young et al. 2018). With a little ingenuity, a researcher can create a robust recording system.

To save time and expense, it is important to know whether a species has a preferred time of day or season for producing sounds. Many species are most vocal during the breeding season. Some birds and amphibians are most soniferous at dawn and dusk whereas many chorusing
insects primarily produce sounds at dusk. For example, Thomas and DeMaster (1982) showed that Antarctic crabeater seals (Lobodon carcinophaga) preferred to call under water between 2100 h and 0500 h and were hauled-out on the ice at other times. If the number of vocalizations was used as a population count, a census of crabeater seals at 1200 h would have yielded a much lower population estimate than a census at 2400 h. Bats, obviously, are active at night. However, there is usually a notable peak of activity approximately 30 minutes after dusk (Kunz and Parsons 2009). Some species (many in the genus Myotis and Tadarida) are more likely to be recorded during the first four hours of night, while others emerge past midnight (Euderma, Artibeus). Some bats have multimodal activity patterns (Sherwin et al. 2000) and many sciurids (e.g., Marmota and Neotamias) actively vocalize in the morning and then again in late afternoon (Gannon 1999). Some species (e.g., prairie dogs, Cynomys, and pikas, Ochotona) are seasonally soniferous all day (Slobodchikoff et al. 1998; Smith et al. 2016).

It is important to know the effects of both time of day and month to interpret the behavioral context of a recording. For example, breeding data from the North American male rufous-sided towhee (Pipilo erythrophthalmus) showed that males reached breeding condition around mid-April. Testes were in regression by 20 July and had become inactive by mid- to late-September (Davis 1958). So, if a researcher desires to record sounds of this species associated with breeding, the study should be conducted from mid-April to mid-July. In addition, this species shifts its song to an earlier start time in relation to civil twilight. As day length increases between the spring equinox and the summer solstice, civil twilight occurs earlier in relation to sunrise, causing the dawn calling period to lengthen.

Fig. 3.5 What could go wrong? In the field, equipment failure is certain. Over-planning, backups, duplicate systems, checklists, and more will help avoid data collection failures

3.3.4 Documentation and Data Sheets

Documentation is very important. A logbook should accompany each recording to provide metadata on the recordist; the recording system and equipment settings (e.g., any filter or gain settings); the location, date and time; environmental conditions; types of sounds recorded; the animals' behavior (e.g., breeding, feeding, or socializing); a specific animal number (if marked); and any other circumstances which could be valuable for analysis.

Many devices may record some of the metadata automatically. For instance, the Echo Meter Touch 2 PRO Ultrasonic Module using
Kaleidoscope Pro software¹ (Wildlife Acoustics, Maynard, MA, USA) records calls to an iPhone or other device and collects metadata about each recording. Metadata can then be displayed with Kaleidoscope software or exported to a spreadsheet. Recording directly to a computer allows time-stamped (and often GPS-stamped) files.

¹ https://www.wildlifeacoustics.com/products/echo-meter-touch-2-pro-ios and https://www.wildlifeacoustics.com/products/kaleidoscope-pro; accessed 13 June 2022

If a datasheet (spreadsheet) is used, put metadata headers as the first column and fill the rows with your observations (Table 3.1). Each sound or bout of sounds should be assigned a unique number for easy reference later, and a variety of variables can then be noted for each sound (Table 3.2). Spreadsheets can be imported directly into a variety of statistical and graphing software products for analyses (see Chap. 9 on analytical approaches). Note that datasheets for playback studies usually include additional variables on animal behavior (Table 3.3).

Table 3.1 Sample logbook showing important metadata to be noted. Examples from author (JAT) notes for Weddell seal (Leptonychotes weddellii) and sea otter (Enhydra lutris)

Tape | Counter | Collector | Date | Time | Location | Subject | Quality | Comments
2 | 234 | JA Thomas | 23 March 2004 | 16:00 | McMurdo | Weddell seal | Poor | Underwater, adult male, 839W, wind 20 knts
13 | 22 | CM Smith | 18 Sept 2004 | 13:15 | Valdez, AK | Sea otter | Excellent | Airborne, mother and pup, unmarked, no wind
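A logbook such as Table 3.1 can also be kept in machine-readable form from the start, which simplifies later import into statistical software. The following is a minimal sketch in Python using only the standard library; the field names and example entry are illustrative, not a prescribed schema.

```python
# Minimal sketch of a machine-readable logbook mirroring the metadata fields
# recommended above. Field names and the example entry are illustrative.
import csv
import os

FIELDS = ["tape", "counter", "collector", "date", "time", "location",
          "subject", "quality", "behavior", "equipment", "gain_settings",
          "weather", "comments"]

def append_log_entry(path, entry):
    """Append one recording's metadata to a CSV logbook; write a header
    row first if the file is new or empty. Missing fields are left blank."""
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow(entry)

append_log_entry("logbook.csv", {
    "tape": 2, "counter": 234, "collector": "JA Thomas",
    "date": "2004-03-23", "time": "16:00", "location": "McMurdo",
    "subject": "Weddell seal", "quality": "Poor",
    "comments": "Underwater, adult male, wind 20 knots",
})
```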
3.3.5 Trouble-shooting Equipment Problems

Often field work is conducted in remote locations, sometimes without easy access to the Internet, electricity, or equipment repairs. Consider all possible equipment problems and always have backups—of everything. A good motto for field work is to "bring one to use and one to lose" (Fig. 3.5). Studies usually are costly and time-consuming—in particular in remote locations. There is nothing worse than a missed field opportunity caused by the lack of a cable or battery. Bring proper tools to the field site to make repairs: soldering iron, solder, electrical wire, heat-shrink tubing, electrical ties, electrical tape, extra cables and connectors, batteries (preferably rechargeable, with charger), multi-meter, etc. If possible, pack replacement equipment: anemometer, thermometer, laptop with extra charger, external speakers, software for data entry, backup hydrophone or microphone, headset, walkie-talkie, smartphone, microphone for narration onto a PC, and data storage devices (SD-cards, thumb-drive, external hard-drive). Why are duplicates necessary? If you cannot repair something, then use backups so the research effort is not wasted.

Moving or shipping equipment often creates problems with loose connections or fittings. If equipment is not operating properly, tighten fasteners on the equipment housing, make sure circuit boards are seated properly, check that batteries are fully charged, and make sure all cables are connected and working. To check for cable malfunction, use an ohm-meter to make sure the resistance of a cable is zero. If new equipment is used in a study, always unpack it and check its operation in the laboratory before going to the field. Bring manuals for all equipment to the field site or know where to reliably access them.

3.4 Playback Methods and Controls

Projections of sounds to animals (or playbacks) are common methods of study in bioacoustics (Fig. 3.6). Several authors have used playbacks to determine the function of a specific animal sound by measuring the animal's behavioral response (Morton and Morton 1998).
Table 3.2 Sample data sheet for acoustic measurements on airborne vocalizations of California sea lions (Zalophus californianus); na not applicable. Courtesy of Schwalm (2012)

Date | Sound | Dominant start frequency (Hz) | Dominant end frequency (Hz) | Dominant maximum frequency (Hz) | Dominant minimum frequency (Hz) | Harmonics | First harmonic interval (Hz) | First component duration (s) | Interval to second component (s) | Total duration (s) | Rate
3 July 2011 | 1 | 642 | 755 | 943 | 491 | Yes | 453 | 0.838 | na | 0.838 | Single
4 July 2011 | 2 | 566 | 566 | 717 | 415 | Yes | 377 | 0.253 | 0.148 | 5.211 | Even, series of 2
5 July 2011 | 3 | 640 | 534 | 720 | 294 | No | na | 0.139 | na | 0.139 | Single
9 July 2011 | 4 | 614 | 800 | 881 | 480 | Yes | 26 | 0.388 | 0.146 | 3.477 | Accelerate, series of 4
9 July 2011 | 5 | 587 | 667 | 747 | 427 | Yes | 400 | 0.165 | 0.146 | 5.57 | Irregular, series of 6

Table 3.3 Sample data sheet for playback study with white-cheeked gibbons (Nomascus leucogenys); na not applicable. Courtesy of Yegge (2012)

Session | Date | Time | Type of playback | Animal | Quadrant number | Mate in same quad | Number calls | Approach speaker | Move away from speaker | Stationary | Climb | Move | Groom | Out-of-sight
3 | 29 Aug | 16:00 | Control | CJ | 2 | Yes | 0 | Yes | No | No | Yes | Run | No | No
3 | 29 Aug | 16:00 | Control | Max | 2 | No | 0 | No | No | Yes | No | na | Yes | No
4 | 30 Aug | 16:30 | Own duet | CJ | 3 | Yes | 2 | No | Yes | No | No | Walk | No | No
4 | 30 Aug | 16:30 | Own duet | Max | 4 | Yes | 4 | No | No | Yes | No | na | Yes | No
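Variables such as those in Table 3.2 can be measured with standard signal-processing tools once individual calls have been cut out of a recording. The sketch below, assuming numpy and soundfile, shows one simple way to obtain a call's total duration and dominant (peak) frequency; the file name is hypothetical, and real analyses usually add noise checks and manual verification.

```python
# Minimal sketch: measure duration and dominant frequency of one call clip.
# Assumes numpy and soundfile; the clip has already been cut from a recording.
import numpy as np
import soundfile as sf

call, fs = sf.read("sea_lion_call_01.wav")   # hypothetical mono call clip
if call.ndim > 1:
    call = call[:, 0]                        # use the first channel only

duration_s = len(call) / fs                  # total duration (s)

# Dominant (peak) frequency from the magnitude spectrum of the whole call
spectrum = np.abs(np.fft.rfft(call * np.hanning(len(call))))
freqs = np.fft.rfftfreq(len(call), d=1.0 / fs)
dominant_hz = freqs[np.argmax(spectrum)]

print(f"duration = {duration_s:.3f} s, dominant frequency = {dominant_hz:.0f} Hz")
```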
Fig. 3.6 Playback studies are those by which an animal or group of animals is played their calls (or calls of their conspecifics) back to them and then their response is recorded. Research using playbacks has been used commonly in mammals (such as squirrels, prairie dogs, pika, carnivores, and primates), birds, reptiles, fish, and many others. Painting "His Master's Voice" by Francis Barraud (1856–1924). Source: Victor Talking Machine Company. Public domain; https://commons.wikimedia.org/wiki/File:His_Master%27s_Voice.jpg

Playback studies on fish have been used to determine species recognition from a particular sound, to classify different call types, to identify effects of sound on fish behavior, to study how a call was coded, and to measure acoustic parameters of the call relevant to communication (Zelick et al. 1999). For example, Myrberg and Riggio (1985), studying bicolor damselfish (Stegastes partitus), found that males produced sounds more often in response to playbacks of conspecific sounds than to sounds of other species', and responded more readily to sounds from non-resident fish than sounds from their nearest neighbor. Playbacks of male Lake Malawi cichlid fish (Pseudotropheus zebra) sounds to female cichlids caused them to lay eggs earlier than control female fish of another Lake Malawi cichlid species (Pseudotropheus emmiltos; Amorim et al. 2008). Simpson et al. (2011) played-back ambient sounds of different reefs to coral reef fish and showed that fish approached the sounds of their native coral reef versus sounds from a foreign reef. Hawkins et al. (2014) played back recordings of impulsive pile driving sounds, attracting European sprat (Sprattus sprattus) in mid-water in the sea (Fig. 3.7).

Many birds respond to playbacks of their own or other animal sounds by approaching the projector and sometimes even attacking the speaker (Fig. 3.8). Emlen (1972) investigated how information is encoded in bird song by altering components of Indigo bunting (Passerina cyanea) song and playing-back the modified songs to male territory holders. He quantified the intensity of responses to modified songs and thus inferred the importance of temporal, structural, and syntactical features for both individual- and species-recognition.

Beecher and Burt (2004) played-back territorial sounds from male song sparrows (Melospiza melodia) that were in neighboring territories versus distant territories. The males were slower and less likely to fly over and explore the sounds from a neighbor than calls from a distant male. When a song from a distant territorial male was played, the subject almost always matched or replicated the song and approached the speaker as if looking for an intruder.
Fig. 3.7 Responses of sprat (Sprattus sprattus) schools to sound exposure. Vertical lines indicate the beginning and end of each sound sequence. (a) Echogram of a medium-sized sprat school, cut off abruptly after the beginning of the sound, and reappearing a few seconds later as a denser school slightly closer to the seabed. (b) A medium-sized sprat school cut off at the onset of the sound and reappearing seconds later slightly closer to the seabed. (c) A large sprat school cut off at the onset of the sound and reappearing at a greater depth at lower density. (d) A small sprat school increasing in density in response to sound exposure. From Hawkins et al. 2014. © Acoustical Society of America, 2014. All rights reserved

Fig. 3.8 Diagram of a playback experiment with two different bird songs. The recording and the speakers should match the frequency range and levels of the original signals. Courtesy of G Pavan

neighbor male was played, 85% of the time the eat seals versus killer whales that eat fish; the
subject sang a different song, but one familiar to seals exhibited fearful responses when sounds
the neighbor. By responding with a different, but by the former were broadcast. Wild killer whales
shared song, the subject sparrow indicated it either approached or ignored playbacks of sounds
recognized that the sounds were from a neighbor. from another killer whale pod, but did not call in
Much of the work in determining the function response. However, when their own calls were
of alarm calls in ground squirrels and prairie dogs played, most killer whales approached the source
(Spermophilus and Cynomys, respectively) was and the entire pod started calling in response
determined or confirmed by playing-back previ- (Filatova et al. 2011). Clark and Clark (1980)
ously recorded calls to an attentive colony of described right whale (Balaena australis) behav-
these rodents in the field and observing their ior from playback experiments where right
responses (e.g., Slobodchikoff et al. 2009). Prat whales can differentiate between conspecific
et al. (2016) used playback techniques of calls sounds and other sounds. Playbacks of their own
recorded from the Egyptian fruit bat (Rousettus song or social sounds to wild humpback whales
aegyptiacus) to show that 16 sounds recorded and (Megaptera novaeangliae) resulted in some
played-back from this bat provided enough infor- animals approaching, some charging the source,
mation to identify who was calling, where they and others moving away (Mobley et al. 1988;
were calling from, what they were calling about, Tyack 1983).
and what sort of response the receiver made to the Before a playback session, the researcher
vocalization. should always check the projected sound near
Yegge (2012) and Thomas et al. (2016) the animal to make sure the sound is not distorted
reported using playbacks of duets to restore a and is of sufficient amplitude to mimic the
pair-bond in yellow-cheeked gibbons (Nomascus intended sound. Ideally, playback experiments
gabriellae). A breeding pair of captive gibbons should be carried out on wild animals that are
stopped duetting when construction occurred near free to move within their natural habitats. Captive
their exhibit lasting for about 6 months. After- animals often are de-sensitized to reoccurring
wards, the authors played-back sounds of the sounds, and confinement within a small space
pair’s previous duet, along with a silent- and can greatly alter their behaviors and
music-controls. The pair slowly resumed their vocalizations. It is especially important to ensure
duet, established a pair-bond, and continued to that playback experiments are carried out under
duet, some 5 years later. appropriate acoustic conditions, where the trans-
Playback experiments with marine mammals mitted sounds are free from distortion, and reflec-
are less common due to the logistical challenges tion and reverberation are minimal. This is a
of undertaking these experiments at sea. How- particular problem with playback experiments
ever, there are a few examples. Weddell seals on fish, where sounds can be greatly altered by
(Leptonychotes weddellii) produced geographi- the acoustic environment, especially in small
cally different vocal repertoires that has potential aquarium tanks (Parvulescu 1964; Grey et al.
for identifying discrete breeding stocks of Antarc- 2016; Rogers et al. 2016).
tic seals (Thomas et al. 1983). Charrier et al. Playback studies require controls to ensure the
(2013) used playback methods to confirm that animal is responding to the projected sound and
bearded seals (Erignathus barbatus) recognized not to the noise/hum of equipment or the novelty
vocalizations of their species from different of a new sound. Current sound analysis and
regions. Male harbor seals (Phoca vitulina) that sound-generation software allows the manipula-
are territorial, use roars given by intruding seals to tion of many sound characteristics that could be
locate and challenge those intruders (Hayes et al. used as a control. There are several types of
2004). Deecke (2003) used playbacks to examine controls used by investigators: 1) Merely turn on
whether captive harbor seals could distinguish the equipment to replicate the electronic/back-
sounds from killer whales (Orcinus orca) that ground noise. 2) Play the animal’s own sound,
but backwards. This projects the same frequency, amplitude, and time relationships of the actual sound, but in a different order. 3) Play the animal's sound at a higher or lower speed. This transforms the projected sound into a different frequency range. 4) Play a call with parts filtered out. 5) Play something totally novel to the animal, such as sounds from another species it has never encountered, music, machinery noise, or human speech. 6) Play sounds typical of the animal's natural environment.
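Several of these control stimuli can be prepared from the original recording with only a few lines of code. The following minimal sketch, assuming the soundfile package, generates controls 2 and 3 (a time-reversed call and a speed-shifted call); the file names are illustrative.

```python
# Minimal sketch: prepare two playback controls from a recorded call.
# Assumes the soundfile package; file names are hypothetical.
import soundfile as sf

call, fs = sf.read("original_call.wav")      # source recording

# Control 2: same frequencies, amplitudes, and durations, but reversed in time
sf.write("control_reversed.wav", call[::-1], fs)

# Control 3: writing the same samples with half the sampling rate plays the
# call at half speed and halves all frequencies (double fs for the opposite)
sf.write("control_half_speed.wav", call, fs // 2)
```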
3.5 Considerations for Terrestrial Field Studies

If recording on land, from a vehicle (such as during a truck survey for bat sounds), ground-generated noise can be a problem. In fact, Borkin et al. (2019) reported a negative relationship between bat activity and night-time traffic volume on New Zealand highways; when traffic increased, probability of detecting bats decreased. These researchers used stationary automatic bat detectors to avoid their own road noise. Some solutions include: stopping and turning the vehicle off and recording in silence; using a recently paved asphalt track rather than an older and noisier road or a dirt track; and carrying out vehicle transects using electric vehicles. Road surveys are valuable, but reducing non-biotic noise would make these transects even more valuable. Terrestrial recordings can be contaminated with nearby traffic noise. It is therefore advisable to make a sample recording, check it for ambient noise, and select an optimal quiet area.

Air temperature can be a problem. Thomas, Zinnel, and Ferm (1983), when recording Weddell seal breeding colonies, used water-activated chemical heat packs placed next to recording equipment and batteries in an insulated box to keep equipment warm in the Antarctic for 24-hour periods. In extremely warm locations with high humidity, moisture can collect on recorders or microphones. Placing recording equipment inside an insulated box with desiccants can minimize moisture problems. In rain forests, equipment must be totally waterproof. During periods of heavy rain, sounds from animals will either not be heard or will be masked by the rain.

A common problem in bioacoustical studies in terrestrial environments is the presence of acoustically-active non-target animals. If a non-target species calls in a specific frequency band, their sounds can perhaps be filtered out, but in many cases, this is not possible. Some analysis software allows the user to define the frequency and amplitude of a target species' calls and automatically identifies only those calls in a recording. However, in many cases, finding locations and times when only an individual animal is vocalizing provides the best opportunity to make quality recordings.

A good solution for animals such as bats is to use units which are self-contained and weather resistant (see Chap. 2, section on bat detectors). Each unit can include a receiving transducer, storage device, or laptop programmed to record at intervals and can be powered by rechargeable battery packs or solar panels. Data can be recovered daily, weekly, monthly, or even uploaded in the proximity of Wi-Fi for automated data retrieval. Arrays of bat detectors have been used to record ultrasonic calls of bats, as well as to sample the acoustic landscape, estimate biodiversity, and estimate species density (Carles et al. 2007; Sherwin et al. 2000).

3.6 Considerations for Aquatic Field Studies

Studies in freshwater are easier on the equipment than in saltwater environments; saltwater's corrosive properties require that underwater equipment be rinsed with freshwater after use and recorders and hydrophones be wiped down to remove saltwater deposited from the air. It is, of course, good practice to wipe down and dry all equipment, whether it was deployed in saltwater, in freshwater, or on land, after use to avoid any rusting or build-up of deposits.

Maintenance and calibration of equipment such as hydrophones has been shown to be important for long-term monitoring studies and data integrity. This includes considerations such as
the pressure rating on the hydrophone and the length of cable that is waterproofed; the longer the cable, the higher the impedance and the greater the signal attenuation. Some plastic-coated cables, if deployed for long periods, are vulnerable to damage by marine organisms, shark bites, and even sea urchins. Polytetrafluoroethylene (PTFE) coated cables are less susceptible to damage of this kind. In addition, acoustic-release mechanisms (to allow equipment to surface) can malfunction when encrusted by marine creatures. In a review of underwater soundscape ecology to monitor habitat health in general, and fish spawning in particular, Lindseth and Lobel (2018) summarized current recording and sampling methods including metrics commonly used in analyses of aquatic acoustic data. They point out that there have been significant technological advances in equipment, especially hydrophones.

In aquatic situations, there can be electronic interference from improper grounding on the vessel, depending on the types of electronic equipment running onboard (e.g., lights, radios, freezers, generators, winches, fans, air conditioners, or furnaces). A quick-fix to grounding problems on a ship is to drop a bare wire into the water with the other end attached to the recording equipment. However, a trial-and-error approach may be needed to resolve this.

Flow noise is a problem that causes artifacts in the recordings. Noise from water flow over the hydrophone and its mooring can create turbulence and small eddies (vortex shedding). These lead to fluctuating pressure around the hydrophone, which is sensed by the hydrophone and appears as noise in recordings. But this "noise" is not due to a traveling acoustic wave and hence not due to sound in the environment. It is an artifact. Flow noise is often a problem in rivers but also offshore (see flow noise marked in the spectrograms in Fig. 3.3 in Erbe et al. 2015). It can require the use of a shield or deflector, or placement of the hydrophone in a sheltered area.

Sound-recording acoustic tags are attached to marine animals to record their vocalizations and examine the effects of anthropogenic noise in the marine environment relative to animal generated sound. Flow noise (generated simply by water flowing around the tag) can be useful in this instance, as it can measure whale speed (von Benda-Beckmann et al. 2016; Fig. 3.9). However, interference by background noise is also a common problem. Unfortunately, survey vessels produce noise while operating. Therefore, to avoid unnecessary mechanical background noise during recordings, turn off any non-essential equipment (such as engines, pumps, filters, fans, generators, lights, refrigerators, winches, etc.). However, fishing, military, research, and whale-watching boat operators often are reluctant to do this. Alternatively, these vessel sounds can be filtered out during recording or analysis.

Fig. 3.9 Non-animal generated noise can affect aquatic recordings adversely unless the researcher has a system in place that accounts for noise versus animal generated calls. Simply attaching a hydrophone or tag to a marine mammal can cause flow noise from water rushing around the attached object
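When vessel or flow noise occupies a band well below the calls of interest, one option during analysis is a high-pass filter. The sketch below assumes scipy and soundfile; the 100-Hz cut-off and file names are illustrative and must be chosen so that no energy from the study species is removed.

```python
# Minimal sketch: attenuate low-frequency flow and vessel noise during
# analysis with a high-pass filter. The cut-off is an example only.
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

data, fs = sf.read("moored_recorder.wav")    # hypothetical recording
cutoff_hz = 100.0                            # must sit below the target calls

sos = butter(4, cutoff_hz, btype="highpass", fs=fs, output="sos")
filtered = sosfiltfilt(sos, data, axis=0)    # zero-phase filtering

sf.write("moored_recorder_highpass.wav", filtered, fs)
```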
In rivers or shallow coastal areas, currents and tides transport sediment which may create noise. It may come as quite a shock when an entire recording is ruined by nonstop sand swishing
back and forth over the hydrophone, creating noise between 10 Hz and 2 kHz (Erbe 2009). Perhaps more amusing shallow-water "mooring noise" occurred when a group of teenage girls swam over to the mooring, held on to the floats and sang ABBA songs for 20 minutes—very clearly recorded. The entire recording session had to be discarded (Erbe 2013).

Similarly, a hydrophone fixed to a ship, boat, buoy, or dock will bob up-and-down and produce spurious signals such as flow noise as the water passes the hydrophone and artifacts from hydrostatic pressure changes as the hydrophone changes its depth. The recording can be saturated with such signals. This noise can be reduced by suspending the hydrophone with a bungee cord, decoupling the floating hydrophone from the surface through a catenary line, or mounting the hydrophone on the seafloor (Fig. 3.10; also see Chap. 2, section on PAM systems). Another solution to reduce flow noise is to use a sonobuoy or an anti-heave buoy (see photograph in Chap. 4, section on sonobuoys). The long cable of the sonobuoy acts as a bungee cord to dampen vertical oscillations of the hydrophone. The sonobuoy is isolated from self-noise of the vessel, but will detect sounds from the vessel until it moves out of range.

Fig. 3.10 Mooring options to avoid noise artifacts: (a) recorder on the seafloor, (b) recorder suspended from a float via a bungee cord and drogue, and (c) recorder suspended via a catenary line (Erbe et al. 2019). © Erbe et al.; https://www.frontiersin.org/articles/10.3389/fmars.2019.00606/full. Published under a Creative Commons Attribution License (CC BY); https://creativecommons.org/licenses/by/4.0/

Local sound propagation conditions will affect the recording (see Chap. 6 on sound propagation under water). It is important to measure and understand the sound speed profile in the study area to know the propagation pattern and range of a signal, which influence the recorded sound. For years, navies of the world measured sound speed profiles using disposable, battery-operated CTD (conductivity, temperature, depth) units, which were tossed into the ocean and data sent back to the ship as the unit fell in the water and unspooled a long copper wire. The units were not retrieved. Today, retrievable, digital CTD units are used. The sound speed profile may change over the course of a day—within the upper few meters below the sea surface. Turl and Thomas (1992) documented that a false killer whale (Pseudorca crassidens) echolocating during target-detection distance experiments in Kaneohe Bay, Hawaii, USA, consistently performed better during the morning than afternoon; i.e., the whale could detect the target at a greater distance during the morning. After taking CTD measurements prior to the morning and afternoon sessions, the researchers realized the water column, and thus sound speed profile, were very different between the two periods because of prevailing midday rains.
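A CTD cast can be converted into an approximate sound speed profile with a simplified empirical formula. The sketch below uses a commonly cited approximation attributed to Medwin (1975) and illustrative cast values; dedicated oceanographic tools should be preferred for precise propagation modeling.

```python
# Minimal sketch: approximate sound speed in seawater from temperature,
# salinity, and depth using a simplified formula (Medwin 1975). Cast values
# are illustrative only.
def sound_speed(T, S, z):
    """Sound speed in seawater (m/s).
    T: temperature (deg C), S: salinity (ppt), z: depth (m)."""
    return (1449.2 + 4.6 * T - 0.055 * T**2 + 0.00029 * T**3
            + (1.34 - 0.010 * T) * (S - 35.0) + 0.016 * z)

# Example cast: (depth m, temperature C, salinity ppt)
cast = [(0, 24.1, 35.0), (10, 23.8, 35.1), (50, 19.5, 35.3), (100, 15.2, 35.4)]
for z, T, S in cast:
    print(f"{z:4.0f} m: {sound_speed(T, S, z):7.1f} m/s")
```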
Sound propagation is particularly complicated in shallow water because of the close proximity of boundaries formed by the sea surface and seabed (Rogers and Cox 1988). Sound is reflected,
scattered, and absorbed at these boundaries. There is far more attenuation of low-frequency sounds in shallow water compared to deep water. Rogers and Cox (1988) suggested that the lowest frequency that could propagate in water less than 1 m deep was about 300 Hz, but this was strongly dependent on the nature of the seabed (sand, rock, or mud).

Ambient noise is an omnipresent issue and may mask the signals desired for recording (see Chap. 7 on soundscapes). Wind and precipitation create noise underwater from coastal to offshore regions. In polar regions, ice popping and cracking may dominate the soundscape. When a hydrophone was dropped in the ice-covered water next to a group of Antarctic Weddell seals (JAT, personal observations), music was heard from the radio-station at the New Zealand Research Base in Antarctica about 2 km away! Organisms from tiny snapping shrimp to enormous singing whales may also mask recordings of a target species. Ship noise is almost omnipresent in the world's oceans, so it can be difficult to obtain recordings of a target species in a quiet aquatic environment.

3.7 Considerations for Studies on Captive Animals

Because there are regulations on the housing and care of captive animals, research permit and IACUC requirements can be more detailed for research on captive species. However, often those regulations were written for laboratory animals used in medical research (mostly Rattus and Mus) and are not specified or applicable for wild animal research. For example, one of us (WLG) had to convince the university veterinarian to allow kangaroo rats (Heteromyidae, Dipodomys) to be housed using sandy desert soils instead of rat bedding so that these wild animals could properly sand-bathe and tunnel.

Zoos and aquaria support bioacoustical studies on a wide variety of species, including endangered species. Some benefits of studying captive animals in a zoo are that their history is usually known (i.e., wild caught vs. captive born, sex, age, reproductive history, relatedness to other animals, and health). Care should be taken to study healthy animals, as opposed to ill or rehabilitating animals, to best represent the acoustic abilities of their wild counterparts. However, burgeoning research by Therrien et al. (2012) indicated that changes in vocal behavior of bottlenose dolphins (Tursiops truncatus) and California sea lions (Zalophus californianus) actually could be used to indicate a health problem (Schwalm 2012). Moreover, captive animals, especially those that have been hand-reared or raised in a hatchery (such as salmon or sea bass), can show some degree of genetic selection, de-sensitization, and habituation to the presence of high levels of ambient sound. They can be much less responsive to sounds than wild animals.

Most zoos have noise created by loudspeaker announcements, music, shows, rides, or facility vehicles. Key events, such as hearing music for a show, or a vehicle delivering food, may affect animal behavior; therefore, studies should not be conducted during those times. Reminiscent of Ivan Pavlov's experiments in the 1890s, in which dogs were behaviorally conditioned to drool in response to being fed at the sound of a bell (conditioned response), researchers need to be aware of regular triggers of animal behavior. Of course, a common source of noise in captive studies is from visitors, keepers, and maintenance workers. If at all possible, it is best to conduct research before or after humans are near the study location (i.e., before or after the zoo is open). If possible, operation of air conditioners, furnaces, air-filters, and lights should be stopped, or minimized, to reduce or eliminate background sounds in recordings. Some facilities isolate their mechanical equipment in a separate building from the animals' environment; this greatly reduces noise exposure for the animals. A preliminary survey of noise in the animals' enclosure, using a sound pressure level meter, helps identify any particularly noisy or quiet areas.

Sometimes, ultrasonic noise or underwater noise can be present unbeknownst to zoo or aquarium staff. One of us (JAT, personal observations) provided two examples. In an underwater hearing study on a Pacific white-
sided dolphin (Lagenorhynchus obliquidens) by Tremel et al. (1998), the test animal consistently reported hearing a 32-kHz signal at two different thresholds on different days. Spectrum analysis of the ambient noise in the pool revealed an intermittent noise near 32 kHz. So, on test days when the noise was present, the animal's threshold at this frequency was much lower than on test days when the noise was absent. Because the noise was ultrasonic, it was not known by staff or researchers. In another study by Therrien et al. (2012), 24-hour recordings of bottlenose dolphins detected an almost continuous banging noise in the water. Zoo staff were unaware of the noise and, upon a diver's inspection of the pool, found a metal gate hinge that was broken and causing the banging sound. In both these examples, staff did not know about the noise, which could have been annoying to the animals and could have disturbed bioacoustical research.

Researchers should understand the possible effects of the exhibit environment on the acoustic behavior of animals. For example, dolphins living in highly reverberant concrete pools echolocate less and at lower amplitudes than in the wild (Fig. 3.11) (Au 2000).

Fig. 3.11 Waveforms and spectra of echolocation clicks of bottlenose dolphins in open ocean (Kaneohe Bay, Hawaii, USA) and in a tank. The spectrum of the click from the tank had a lower frequency peak at 40 kHz and a lower source level of 170–185 dB re 1 μPa m. Reprinted by permission from Springer Nature. Hearing by Whales and Dolphins, edited by W. W. L. Au, A. N. Popper, and R. R. Fay, pp. 364–408, Echolocation in dolphins, W. W. L. Au; https://doi.org/10.1007/978-1-4612-1150-1_9. © Springer Nature, 2000. All rights reserved

Today, exhibit designers incorporate irregular wall and floor surfaces in pools, indoor enclosures, and outdoor exhibits to minimize reverberations. Projecting a signal into a regularly shaped (e.g., round or square) pool with a flat bottom (e.g., during a hearing test) can set up standing waves, which result in a sound-field that dramatically changes with receiver location and frequency. A resonant pool amplifies sound at its resonance frequencies and dampens others, essentially distorting the signal desired by the researcher. While concrete walls in a zoo or aquarium are easy to construct and clean, they provide a reflective surface that often causes annoying, cave-like reverberations.

Particular issues are encountered when trying to perform hearing tests and sound exposure experiments with fish or invertebrates in water-filled tanks that are only a few meters in dimensions, or even smaller. The complexities
of the sound-field in small tanks were first pointed out by Parvulescu (1964) and recently discussed by Duncan et al. (2016), Grey et al. (2016), Rogers et al. (2016), and Popper and Hawkins (2018). Even in quite large tanks, the sound-field generated by even a simple sound source is transformed by interactions with boundaries (i.e., walls, floor of pool, and water surface) and can vary rapidly as a function of both space and frequency. The resulting sound-field can be difficult to model, or even characterize, and the sound-level can be very different from the natural environment. In particular, the levels of the particle motion components of the sounds (to which fish are sensitive) can be very high. Attempts at dampening reverberation by adding materials such as "horse hair" or bubble-wrap can be effective at high frequencies, but have little effect at the low frequencies to which fish are sensitive and where the sound wavelength often exceeds the dimensions of the tank (Popper and Hawkins 2018). In contrast, experiments performed in deep and open water allow the establishment of a relatively simple, well-controlled, and predictable sound-field (Hawkins 2014).

Grey et al. (2016) measured the sound-field in several large laboratory tanks and came to the following conclusions: 1) Tanks, even large ones, are not appropriate surrogates for open-water environments. 2) Tank wall-thickness is largely irrelevant. Walls backed by air essentially present a low impedance, and walls in contact with a solid foundation or ground present finite (non-rigid) impedance defined by the substrate materials. 3) Resonance of the tank walls can dominate underwater sound-field characteristics. 4) Lining the walls of a tank with acoustic absorbent material is futile, because the thicknesses required at low frequencies would leave no room for the fish. 5) Both the sound pressure and the particle motion of a sound need to be measured and checked for mutual validation by calculating the particle motion from pressure gradients. Special hydrophone systems, based on seismic accelerometers, are required to measure particle motion (see Chap. 2).

3.8 Digital File Format

Several file formats are available to save digital recordings. Digital file extensions include WAV, PCM, MP3, au, ram, MIDI, ogg, as well as others. It is best to record using uncompressed formats, such as WAV or PCM (Pulse Code Modulation), for faithful spectrum analysis.

MP3 is a digital audio-encoding format which uses data compression to reduce file size. It is a common audio-format for consumer audio and a de facto standard of digital audio-compression used for the transfer and playback of music. However, MP3 files and other compression methods are poor for spectrum analysis because compression only retains signals in a frequency band up to 16 kHz (i.e., the human hearing range). As a result, spectrum analysis using MP3 files is not trustworthy above 16 kHz. The psychoacoustic-based compression algorithms, in addition to limiting frequencies to below 16 kHz (and even less at higher compression ratios), discard fine details that cannot be heard by humans. Cuts introduced by compression appear as unpleasant "holes" in the spectrogram and can destroy details that could have meaning. However, MP3 files can be valuable for ecological monitoring of temporal and spatial patterns of well-known sounds.

A few digital recorders offer the Free Lossless Audio Codec (FLAC) format, which has less compression and reduces the storage space up to 50% without loss of detail. In addition, a few digital recorders employ a Direct Stream Digital (DSD) format; a proprietary system of digitally recreating audible signals for the Super Audio CD, using delta-sigma 1-bit A/D-converters at 2.8 or 5.6 MHz. Because of the intrinsic properties of the delta-sigma conversion made by the 1-bit A/D-converter, these recorders have the potential to record frequencies well beyond 100 kHz, but with increased noise at high frequencies. Spectrum analysis of recordings made in the DSD format is appropriate.

Waveform sound files (WAV; created by Microsoft) are perhaps the simplest of the common formats for storing audio samples. Unlike
MPEG and other compressed formats, WAV files and their derivatives (like the Broadcast Wave File, BWF) store samples "in the raw" where no pre-processing is used, other than formatting of data. When there is a choice of a recording file format, the WAV (or BWF) format should be selected, rather than the MP3 format.

With continuous recording, WAV files can become quite large and subsequently be difficult to handle with sound analysis software. For example, WAV recordings sampling at 96 kHz and 24 bit for 1 hour will occupy approximately 1 GB of storage capacity (96,000 samples/s × 24 bits × 1 byte/8 bits × 60 minutes × 60 s/minute = 1.04 GB). If monitoring is required for long periods, it is therefore important to select the appropriate sampling rate to conserve storage space. For example, if mid-frequency fish sounds are the main features of interest, then it can be appropriately sampled at only 22 kHz, or at an even lower sampling frequency. Several possible sampling frequencies and sometimes a choice of bit depth (16 or 24 bit) are available, but not on all recorders. Some recorders enable a limit to be placed on the maximum size of each recorded file. Alternatively, a recording protocol can be adopted to limit the length of each recording.

3.9 Data Storage

All storage media should be carefully labeled with who, what, where, and when. Each recording period should have a unique number. Creating a master catalog of recording numbers allows researchers to cross-reference metadata from a logbook.

Magnetic media, including magnetic tape (e.g., reel-to-reel, cassette, or DAT tapes), and computer hard drives require storage in a dry, dark area away from any type of magnetic field. Exposure to a magnet could erase data. If tapes are not played often, the tightly packed tape could "bleed through" from one segment to another, thus contaminating data. Therefore, converting old recordings on magnetic tape to modern storage is becoming urgent so that data on historic soundscapes and animals are not lost.

When converting analog to digital formats, usually using an A/D-converter, the sampling frequency must be at least twice the highest frequency recorded and the recordist needs to make sure that the parameters of the storage medium are adequate for the task. There are a number of free software applications for conversion of analog to digital formats.

Storage of digital recordings can be done on hard drives, optical drives, solid-state memory, or an Internet cloud. Bluetooth (a wireless technology standard) provides reliable exchange of data between fixed and mobile devices over short distances. Bluetooth uses UHF radio waves that are effective at a short distance.

3.10 Archiving Recordings

Properly curated recordings are critically important for assessing changes in soundscapes, ambient noise, and animal presence/absence and acoustic behavior over time. For example, underwater recordings made by the US Navy off the coast of California indicated a steady increase in background noise levels in the ocean in the last 60 years (from the 1960s). Marie Poland Fish, an oceanographer and marine biologist, recorded and analyzed the sounds of more than 300 species of marine life, from mammals to mussels. Her work (described and spectrograms provided in Fish and Mowbray 1970) helped the US Navy to distinguish fish and other animal sounds from the sounds made by submarines and remains a primary source for analysis of marine fish sounds.

Recordings of humpback whale songs date back to the 1970s and continue to document annual changes in their song within different populations. Williams et al. (2013) studied the changing songs of male savannah sparrows (Passerculus sandwichensis) recorded over three decades (1980-2011) on Kent Island, New Brunswick, in the Bay of Fundy. Life-long recordings of songs of white-crowned sparrows (Zonotrichia leucophrys) found they memorize syllables they hear at 10-50 days of age and sing the same song throughout their life. In contrast, life-long recordings of northern
mockingbirds (Mimus polyglottos) found they ence program to create a map of the UK coastal
add elements to their songs throughout their soundscape in 2015.3 Other European online
lives. Only long-term archival data could be sound libraries include: Tierstimmen Archiv4
used for analysis of these trends. In this time of (approximately 120,000 sound recordings;
global warming and accelerated ice melts, Museum für Naturkunde, Berlin, Germany)
archived recordings from the polar regions Xeno-Canto5 (595,000 recordings from approxi-
might become instrumental in monitoring the mately 10,250 bird species Naturalis Biodiversity
rate of climate change (by quantifying Center, Leiden, Netherlands), and FonoZoo6
ice-cracking noise) and the effects on (11,657 recordings of 1621 animal species;
soundscapes and ecology (Obrist et al. 2010). Fonoteca Zoológica, Museo Nacional de Ciencias
The take-home message here is that good research Naturales (CSIC), Madrid, Spain).
practices with solid documentation and data In the USA, the Macaulay Library7 (Cornell
archiving allow for future knowledge generation. Lab of Ornithology, Ithaca, NY, USA) archived
older analog, digital, and video recordings. To
date, their holdings are approximately 24 million
3.11 Repositories photos, 915,000 audio and 192,000 video
of Bioacoustical Data recordings available for researchers. The K. Lisa
Yang Center for Conservation Bioacoustics8
Hafner et al. (1997) noted that collections of (Cornell Lab of Ornithology, Ithaca, NY, USA)
animal recordings with ancillary data are rich is everything “bird” including citizen science and
sources of reference material for bioacoustical masterful guides and information in ornithology
studies. Archiving analog data by converting to (including bird vocalization identification apps
a digital format has played an essential role in and bird cams). The Museum of Southwestern
preserving data for future use. Species-specific Biology9 (University of New Mexico,
sounds from a variety of regions and times, with Albuquerque, NM, USA) and Museum of Verte-
associated voucher specimens and metadata, are brate Zoology10 (University of California,
available for researchers at a number of Berkeley, CA, USA) have hundreds of thousands
organizations. All collections and their of cataloged natural history journals and voucher
corresponding links were valid as of specimens and began to associate avian
13 June 2022. vocalizations with voucher specimens in the
In Europe, there is a long tradition of recording 2000s. These museum collections have shown a
animal sounds, in particular bird songs, and many desire to include bat call libraries before 2023.
collections have been published on vinyl discs The Watkins Sound Library11 (Woods Hole
and CDs, mainly in France and the UK. In 1969, Oceanographic Institution, Woods Hole, MA,
the British Library of Wildlife Sounds2 USA) provides particularly good collections of
established holdings of more than 160,000 well- marine mammal sounds with a highlighted
documented field-recordings covering all classes “Best of” cuts section that contains 1694 sound
of sound-producing animals from many regions.
More than 10,000 species of invertebrates,
insects, amphibians, reptiles, fishes, birds, and 3
https://www.bl.uk/sounds-of-our-shores
4
mammals, including many rare and threatened http://www.tierstimmenarchiv.de/
5
species. A large number of these recordings https://www.xeno-canto.org/
6
were made for radio by the BBC Natural History http://www.fonozoo.com/index_eng.php
7
Unit. The British Library supported a citizen-sci- http://macaulaylibrary.org
8
https://www.birds.cornell.edu/ccb/
9
https://arctosdb.org/; http://www.msb.unm.edu/
10
2
https://www.bl.uk/collection-guides/wildlife-and-envi http://mvz.berkeley.edu/General_Information.html
11
ronmental-sounds; accessed 13 June 2022 https://cis.whoi.edu/science/B/whalesounds/index.cfm
Fig. 3.12 Commercial companies and others market photo by the authors; right photo, “Capturing the sounds
sounds of animals and soundscapes recorded by of the lake” by S. Shiller; https://www.flickr.com/photos/
researchers such as Bernie Krause. Recording and 12289718@N00/9454414945; licensed under CC BY 2.0;
analyzing natural sound is fulfilling and insightful, and https://creativecommons.org/licenses/by/2.0/
can be a profound source for generating knowledge. Left

cuts deemed to be of higher sound quality and


3.12 Summary
lower noise from 32 different marine mammal
species.
As with other areas of science, good practices for
Several commercial companies market LPs
bioacoustical research, as well as an awareness of
and CDs of nature sounds. Bernie Krause12
the ethical implications of that research, should be
(Wild Sanctuary, Glen Ellen, CA, USA;
employed. This chapter provides a list of
Fig. 3.12) is unique among researchers, commer-
considerations for terrestrial, aquatic, and captive
cial ventures, and artists. From the Wild Sanctu-
studies—a list that will doubtlessly be improved
ary website, “The Wild Sanctuary Audio Archive
as technology and access to the acoustic world
represents a vast and important collection of
improves. No longer is large, heavy, and expen-
whole-habitat field recordings and precise
sive equipment necessary to make high-quality,
metadata dating from the late 1960s. This unique
meaningful acoustic recordings. Acoustic data are
bioacoustic resource contains marine and terres-
important beyond the immediate scope of a proj-
trial soundscapes representing the voices of living
ect, but data must be well documented with
organisms from larvae to large mammals and the
metadata (including field notes and ancillary
numerous tropical, temperate and Arctic biomes
information) and stored in a way that they are
from which they come. The catalog currently
preserved and accessible for future research. The
contains over 4500 hours of wild soundscapes
importance of a well-designed data sheet for easy
and in excess of 15,000 identified life forms.”
data entry and analysis is also discussed along
The acoustic world is not only at our finger tips,
with special considerations for study design.
but the world is becoming available for all to hear.
Playbacks of sounds to animals are commonly
used by bioacousticians and procedures for
playbacks and controls are recommended.
Several sound libraries are publicly available
12
for research. These facilities have invested a great
http://www.wildsanctuary.com/
deal of time in transferring analog recordings to Charrier I, Mathevon H, Aubin T (2013) Bearded seal
digital formats for more permanent preservation. males perceive geographic variation in their trills.
Behav Ecol Sociobiol 67(10):1679–1689. https://doi.
CDs of animal and nature sounds are now com- org/10.1007/s00265-013-1578-6
mercially available. Archives are useful for edu- Clark CW, Clark JM (1980) Sound playback experiments
cation and research. As we evaluate current with southern right whales (Eubalaena australis). Sci-
hypotheses related to global warming, perhaps ence 20:663–665
Davis J (1958) Singing behavior and the gonad cycle of
we can hear the world change. the Rufous-sided towhee. Condor 60:308–336
Deecke V (2003) Seals are guided by voices. Discover
April p. 17.
Duncan AJ, Lucke K, Erbe C, McCauley RD (2016) Issues
3.13 Additional Resources associated with sound exposure experiments in tanks.
Proc Meet Acoust 27(1):070008. https://doi.org/10.
• Sound recording tips from eBird: https://www. 1121/2.0000280
macaulaylibrary.org/how-to/recording- Emlen ST (1972) An Experimental Analysis of the
Parameters of Bird Song Eliciting Species Recogni-
techniques/ tion. Anim Behav 41:130–171
• Bioacoustics equipment and field techniques, Erbe C (2009) Underwater noise from pile driving in
Centro Interdisciplinare di Bioacustica Moreton Bay, Qld. Acoust Aust 37(3):87–92
e Ricerche Ambientali, Università degli Studi Erbe C (2013) Underwater noise of small personal water-
craft (jet skis). J Acoust Soc Am 133(4):EL326–
di Pavia: http://www.unipv.it/cibra/edu_equip EL330. https://doi.org/10.1121/1.4795220
ment_uk.html Erbe C, Verma A, McCauley R, Gavrilov A, Parnum I
• Manual on Field Recording Techniques and (2015) The marine soundscape of the Perth Canyon.
Protocols for All Taxa Biodiversity Prog Oceanogr 137:38–51. https://doi.org/10.1016/j.
pocean.2015.05.015
Inventories and Monitoring (Eymann et al. Erbe C, Marley S, Schoeman R, Smith JN, Trigg L,
2010): https://issuu.com/ysamyn/docs/ Embling CB (2019) The effects of ship noise on marine
abctaxa_vol_8_part1_lr mammals--A review. Front Mar Sci 11. https://doi.org/
10.3389/fmars.2019.00606
Eymann J, Degreef J, Hauser C, Monje JC, Samyn Y,
All web resources were last accessed VanDerSpiegel D (2010) Manual on field recording
13 June 2022. techniques and protocols for all taxa biodiversity
inventories and monitoring. Abc Taxa series, The Bel-
gian Development Corporation, Brussels, Belgium
http://www.abctaxa.be ISSN 1784-1291.
References Filatova OA, Fedutin ID, Burdin AM (2011) Responses of
Kamchatkan fish-eating killer whales to playbacks of
Amorim MCP, Simões JM, Fonseca PJ, Turner GF (2008) conspecific calls. Mar Mam Sci 27(2):E26–E42
Species differences in courtship acoustic signals Fish MP, Mowbray WH (1970) Sounds of Western North
among five Lake Malawi cichlid species Atlantic fishes. A reference file of biological underwa-
(Pseudotropheus spp.). Fish Biol 72:1355–1368. ter sounds. The John Hopkins Press, Baltimore, MA,
https://doi.org/10.1111/j.1095-8649.2008.01802.x USA, 207 p
Au WWL (2000) Chapter 9: Echolocation in dolphins. In: Gannon WL (1999) Tamias siskiyou. In: Wilson DE, Ruff
WWL A, Popper AN, Fay RR (eds) Hearing by whales S (eds) Complete book of North American Mammals.
and dolphins. Springer-Verlag, New York, pp 364–408 Smithsonian Institution Press, Washington, DC
Beecher MD, Burt JM (2004) The Role of Social Interac- Gannon WL (2014) Integrating research ethics with grad-
tion in Bird Song Learning. Curr Dir Psychol Sci 13: uate education in Geography. J Geography High Ed
224–228. https://doi.org/10.1111/j.0963-7214.2004. 38:481–499. https://doi.org/10.1080/03098265.2014.
00313.x 958656
Borkin KM, Smith DHV, Shaw WB, McQueen JC (2019) Gluck JP (1998) The death of a vagrant bird. In: Orlans
More traffic, less bat activity: the relationship between BF, Beauchamp TL, Dresser R, Morton D, Gluck JP
overnight traffic volumes and Chalinolobus (eds) The human use of animals; Case studies in ethical
tuberculatus activity along New Zealand highways. choice. Oxford University Press, New York, pp
Acta Chiropterologica 21:321–329. https://doi.org/10. 191–208
3161/15081109ACC2019.21.2.007 Grey MD, Rogers PH, Popper AN, Hawkins AD, Fay RR
Carles F, Torre I, Arrizabalaga A (2007) Comparison of (2016) Large tank acoustics: How big is big
Sampling Methods for Inventory of Bat Communities. enough? In: Popper AN, Hawkins A (eds) The effects
J Mammal 88:526–533. https://doi.org/10.1644/06- of noise on aquatic life II, Advances in experimental
MAMM-A-135R1.1 Medicine and biology. Springer Science + Business
Media, New York. https://doi.org/10.1007/978-1- Prat Y, Taub M, Yovel Y (2016) Everyday bat
4939-2981-8_43 vocalizations contain information about emitter.
Hafner MS, Gannon WL, Salazar-Bravo J, Alvarez- Nature 39419(2016):1. https://doi.org/10.1038/
Castaneda ST (1997) Mammal collections of the West- srep39419
ern Hemisphere: a survey and directory of existing Rogers PH, Cox M (1988) Underwater sound as a
collections. In: American Society of Mammalogists. biological stimulus. In: Atema J, Fay RR, Popper
Allen Press, Lawrence KS, p 97. ISBN 0-89338-055- AN, Tavolga WN (eds) Sensory biology of aquatic
5. http://www.mammalsociety.org/uploads/commit animals. Springer-Verlag, New York, pp 131–149
tee_files/collsurvey.pdf Rogers PH, Hawkins AD, Popper AN, Fay RR, Grey MD
Hawkins AD (2014) Examining fish in the sea: a European (2016) Parvulescu revisited: small tank acoustics for
perspective on fish hearing experiments. In: Popper bioacousticians. In: Popper AN, Hawkins A (eds) The
AN, Fay RR (eds) Perspectives on auditory research. effects of noise on aquatic life II, Advances in experi-
Springer 247 Handbook of Auditory Research, p 50. mental medicine and biology. Springer Science + Busi-
https://doi.org/10.1007/978-1-4614-9102-6_14 ness Media, New York. https://doi.org/10.1007/978-1-
Hawkins AD, Roberts L, Cheesman S (2014) Responses 4939-2981-8_43
of free-living coastal pelagic fish to impulsive sounds. Schwalm A (2012) Analysis of aerial vocalizations of
J Acoust Soc Am 135(5):3101–3116 California sea lions in rehabilitation as an indicator of
Hayes SA, Kumar A, Costa DP, Mellinger DK, Harvey J health. PhD Dissertation, Western Illinois University.
(2004) Evaluating the function of the male harbour Sherwin RE, Gannon WL, Haymond S (2000) The effi-
seal, Phoca vitulina, roar through playback cacy of acoustic techniques to infer differential use of
experiments. Anim Behav 67:1133–1139. https://doi. habitat by bats. Acta Chiropterologica 2(2):145–153
org/10.1016/j.anbehav.2003.06.019 Simpson SD, Radford AN, Tickle EJ, Meekan MG, Jeffs
Kissner KJ, Forbes MR, Secoy DM (1997) Rattling behav- AG (2011) Adaptive avoidance of reef noise. PLoS
ior of prairie rattlesnakes (Crotalus viridis viridis, One 6(2):e16625. https://doi.org/10.1371/journal.
Viperidae) in relation to sex, reproductive Status, pone.0016625
Body Size, and Body Temperature. Ethology Slobodchikoff CN, Ackers SH, Van Ert M (1998) Geo-
103(12):1042–1050. https://doi.org/10.1111/j. graphical variation in alarm calls of Gunnison’s prairie
1439-0310.1997.tb00146.x dogs. J Mammal 79:1265–1272
Kunz TH, Parsons S (eds) (2009) Ecological and behav- Slobodchikoff CN, Perla BS, Verdolin JL (2009)
ioral methods for the study of bats, 2nd edn. The John Prairie dogs: Communication and community in an
Hopkins Press, Baltimore, MD, p 920 animal society. Harvard University Press, Boston, MA
Lindseth AV, Lobel PS (2018) Underwater soundscape Smith AT, Nagy JD, Millar CI (2016) Behavioral ecology
monitoring and fish bioacoustics: a review. Fishes 3: of American pikas (Ochotona princeps) at Mono
1–15. https://doi.org/10.3390/fishes3030036 Craters, California: Living on the edge. West N Am
Mobley JR Jr, Herman IM, Frankel AS (1988) Responses Nat 76(4):459–484
of wintering humpback whales (Megaptera Therrien SC, Thomas JA, Therrien RE, Stacey R (2012)
novaeangliae) to playback of recordings of winter Time of day and social change affects underwater
and summer vocalisations and of synthetic sound. sound production of bottlenose dolphins (Tursiops
Behav Ecol Sociobiol 23:211–223 truncatus) at the Brookfield Zoo. Aquat Mamm
Morton SL, Morton ES (1998) Sound playback studies. In: 38(1):65–75
Evans CS (ed) Animal acoustic communication. Thomas JA, DeMaster DP (1982) An acoustic technique
Springer-Verlag, Berlin, pp 323–352 for determining haulout pattern in leopard (Hydrurga
Myrberg AA Jr, Riggio RJ (1985) Acoustically mediated leptonyx) and crabeater (Lobodon carcinophagus)
individual recognition by a coral reef fish seals. Can J Zool 60(8):2028–2031
(Pomacentrus partitus). Anim Behav 33:411–416 Thomas JA, Zinnel KC, Ferm LM (1983) Analysis of
Obrist MK, Pavan G, Sueur J, Riede K, Llusia D, Márquez Weddell seal (Leptonychotes weddelli) vocalizations
R (2010) Bioacoustic approaches in biodiversity using underwater playbacks. Can J Zool 61:1448–1456
inventories. In: Manual on field recording techniques Thomas JA, Friel B, Yegge S (2016) Restoring duetting
and protocols for all taxa biodiversity inventories, Abc behavior in a mated pair of buffy cheeked gibbons after
Taxa series, vol 8, http://www.abctaxa.be, ISSN 1784- exposure to construction noise at a zoo through
1291 edn. The Belgian Development Corporation, playbacks of their own sounds ASA abstract Dec 2016.
Brussels, pp 68–99 Tremel DP, Thomas JA, Ramirez KT, Dye GS, Bachman
Parvulescu A (1964) Problems of propagation and WA, Orban AN, Grimm KK (1998) Underwater
processing. In: Tavolga WN (ed) Marine hearing sensitivity of a Pacific white-sided dolphin,
bio-acoustics. Pergamon Press, Oxford, pp 87–100 Lagenorhynchus obliquidens. Aquat Mamm 24(2):
Popper AN, Hawkins AD (2018) The importance of parti- 63–69
cle motion to fishes and invertebrates. J Acoust Soc Turl CW, Thomas JA (1992) Possible relationship
Am 143(1):470–488. https://doi.org/10.1121/1. between oceanographic conditions and long-range tar-
5021594 get detection by a false killer whale. In: Thomas JA,
Kastelein RA, Supin AY (eds) Marine Mammal Sen- evolution in Savannah Sparrow songs. Anim Behav
sory Systems. Plenum Press, New York, pp 421–432. 85(1):213. https://doi.org/10.1016/j.anbehav.2012.
773 pp. ISBN 9780306443510 10.028
Tyack P (1983) Differential responses of humpback Yegge SA (2012) Playbacks of conspecific vocalizations
whales, Megaptera novaeangliae, to playback of and music to Gabriella’s crested gibbons at Niabi Zoo
song or social sounds. Behav Ecol Sociobiol 13:49–55 to restore their duetting behavior Western Illinois Uni-
von Benda-Beckmann AM, Wensveen PJ, Samarra FIP, versity. PhD Dissertation.
Beerens SP, Miller PJO (2016) Separating underwater Young S, Carr A, Jones G (2018) CCTV enables the
ambient noise from flow noise recorded on stereo discovery of new barbastelle (Barbastella
acoustic tags attached to marine mammals. J Exp Biol barbastellus) vocalizations and activity pattern near a
219, 2271–2275. https://doi.org/10.1242/jeb.133116. roost. Acta Chiropterologica 20:262–272
White M (2013) The ethical flap over birdsong apps. Zelick R, Mann DA, Popper AN (1999) Acoustic commu-
National Geographic, https://www. nication in fishes and frogs. In: Fay RR, Popper AN
nationalgeographic.com/news/2013/6/130614-bird- (eds) Comparative hearing: fish and amphibians.
watching-birdsong-smartphone-app-ethics/ Springer-Verlag, New York, pp 363–411
Williams H, Levin II, Ryan-Norris D, Newman AEM,
Wheelwright NT (2013) Three decades of cultural

Introduction to Acoustic Terminology
and Signal Processing 4
Christine Erbe, Alec Duncan, Lauren Hawkins,
John M. Terhune, and Jeanette A. Thomas

4.1 What Is Sound? Not all sounds produce an auditory sensation


in humans. For example, ultrasound refers to
Most people think of sound as something they can sound at frequencies above 20 kHz, while
hear, such as speech, music, bird song, or noise infrasound refers to frequencies below 20 Hz.
from an overflying airplane. There has to be a These definitions are based on the human hearing
source of sound, such as another person, an ani- range of 20 Hz – 20 kHz (American National
mal, or a train. The sound then travels from the Standards Institute 2013). While sound outside
source through the air to our ears. Acoustics is the of the human hearing range is inaudible to
science of sound and includes the generation, humans, it may be audible to certain animals.
propagation, reception, and effects of sound. For example, dolphins hear well into high ultra-
The more scientific definition of sound refers to sonic frequencies above 100 kHz. Also, inaudible
an oscillation in pressure and particle displace- doesn’t mean that the sound cannot cause an
ment that propagates through an acoustic medium effect. For example, infrasound from wind
(American National Standards Institute 2013; turbines has been linked to nausea and other
International Organization for Standardization symptoms in humans (Tonin 2018). As well, the
2017). Sound can also be defined as an auditory effects of ultrasound on humans have been of
sensation that is evoked by such oscillation concern (Parrack 1966; Acton 1974; Leighton
(American National Standards Institute 2013), 2018).
however, more general definitions do not require Noise is also sound, but typically considered
a human listener, do allow for an animal receiver, unwanted. It therefore requires a listener and
or don’t require a receiver at all. includes an aspect of perception. Whether a
sound is perceived as noise depends on the lis-
tener, the situation, as well as acquired cognitive
Jeanette A. Thomas (deceased) contributed to this chapter and emotional experiences with that sound. Dif-
while at the Department of Biological Sciences, Western
ferent listeners might perceive sound differently
Illinois University-Quad Cities, Moline, IL, USA
and classify different sound as noise. One
C. Erbe (*) · A. Duncan · L. Hawkins person’s music is another person’s noise.
Centre for Marine Science and Technology, Curtin Noise could be the sound near an airport that
University, Perth, WA, Australia
has the potential to mask speech. It could be the
e-mail: c.erbe@curtin.edu.au; A.J.Duncan@curtin.edu.au
ambient noise at a recording site and encompass
J. M. Terhune
sound from a multitude of sources near and far.
Department of Biological Sciences, University of New
Brunswick, Saint John, NB, Canada It could be the recorder’s electric self-noise
e-mail: terhune@unb.ca (see also American National Standards

Institute 2013; International Organization for trains, ships, and construction sites. The distinc-
Standardization 2017). In contrast to noise, a sig- tion by source type is common in the study of
nal is wanted, because it conveys information. soundscapes. These comprise a geophony,
There are many ways to describe, quantify, biophony, and anthropophony.
and classify sounds. One way is to label sounds The following sections explain some of the phys-
according to the medium in which they have ical measurements by which sounds can be
traveled: air-borne, water-borne, or structure- characterized and quantified. The terminology is
borne (also called substrate-borne or ground- based on international standards (including, Interna-
borne). For example, scientists studying bat echo- tional Organization for Standardization 2007, 2017;
location work with air-borne sound. Those American National Standards Institute 2013).
looking at the effects of marine seismic survey
noise on baleen whales work with water-borne
sounds. Some of the sound may have traveled as 4.2 Terms and Definitions
a structural vibration through the ground and is
therefore referred to as structure-borne. Just as 4.2.1 Units
earthquakes can be felt on land, submarine
earthquakes can be sensed by benthic organisms A wide (and confusing) collection of units can be
on the seafloor. In both cases, the sound is found in early books and papers on acoustics, but
structure-borne (Dziak et al. 2004). Sound can the units now used for all scientific work are
cross from one medium into another. The sound based on the International System of Units, better
of airplanes is generated and heard in air but also known as the SI system (Taylor and Thompson
transmits into water where it may be detected by 2008). In this system, a unit is specified by a
aquatic fauna (e.g., Erbe et al. 2017b; Kuehne standard symbol representing the unit itself, and
et al. 2020). a multiplier prefix representing a power of
Another way of grouping sounds is by their 10 multiples of that unit. For example, the symbol
sources: geophysical, biological, or anthropo- μPa (pronounced micro pascal) is made up of the
genic. Geophysical sources of sound are wind, multiplier prefix μ (micro), representing a factor
rain, hail, breaking waves, polar ice, earthquakes, of 10^−6 (one one-millionth) and the symbol Pa
and volcanoes. Biological sounds are made by (pascal), which is the SI unit of pressure. So, a
animals on land, such as insects, birds, and bats, measured pressure given as 1.4 μPa corresponds
or by animals in water, such as invertebrates, to 1.4 times 10^−6 Pa or 0.0000014 Pa. The SI base
fishes, and whales. Anthropogenic sounds are units are listed in Table 4.1. Other quantities and
made by humans and stem from airplanes, cars, their units result from quantity equations that are

Table 4.1 SI base units (length, mass, time, electric current, temperature, luminous intensity, and amount of substance)
and example derived units (frequency, pressure, energy, and power)
Quantity Unit name Unit symbol Expressed in terms of base units
Length meter m
Mass kilogram kg
Time second s
Electric current ampere A
Temperature kelvin K
Luminous intensity candela cd
Amount of substance mole mol
Frequency hertz Hz 1/s
Pressure pascal Pa kg/(m s²)
Energy joule J kg m²/s²
Power watt W kg m²/s³
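The unit bookkeeping summarized in Tables 4.1 and 4.2 is easy to script. The following is a minimal sketch (the dictionary and function names are our own, not from any standard library) that converts a value given with an SI multiplier prefix into base units; "u" stands in for the prefix μ.

```python
# Minimal sketch: convert a value with an SI multiplier prefix into base units.
# The prefix factors follow Table 4.2; the names are illustrative only.

SI_PREFIX = {
    "p": 1e-12, "n": 1e-9, "u": 1e-6, "m": 1e-3, "c": 1e-2, "d": 1e-1,
    "": 1.0, "da": 1e1, "h": 1e2, "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12,
}

def to_base_units(value, prefix=""):
    """Return the value expressed in the base unit (e.g., uPa -> Pa)."""
    return value * SI_PREFIX[prefix]

if __name__ == "__main__":
    # 1.4 uPa expressed in pascal (cf. the example in the text):
    print(to_base_units(1.4, "u"))   # 1.4e-06 Pa
```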
Table 4.2 SI multiplier prefixes


Prefix Symbol Factor Prefix Symbol Factor
deci d 10⁻¹ deka da 10¹
centi c 10⁻² hecto h 10²
milli m 10⁻³ kilo k 10³
micro μ 10⁻⁶ mega M 10⁶
nano n 10⁻⁹ giga G 10⁹
pico p 10⁻¹² tera T 10¹²

based on these base quantities. The SI multiplier oscillate is parallel (or longitudinal) to the direc-
prefixes that go along with these units are listed in tion of propagation of the sound wave in the case
Table 4.2. Note that unit names are always written of longitudinal waves.
in lowercase. However, if the unit is named after a Rock is a solid medium and here, vibration
person, then the symbol is capitalized, otherwise travels as both longitudinal (also called pressure
the symbol is also lowercase. Examples for units or P-waves) and transverse waves (also called
named in honor of a person are kelvin [K], pascal shear or S-waves). In S-waves, the particles oscil-
[Pa], and hertz [Hz]. late perpendicular to the direction of propagation.
It is again because of the coupling of particles,
that the wave propagates. P-waves travel faster
than S-waves so that P-waves arrive before
4.2.2 Sound
S-waves. The P therefore also stands for “pri-
mary” and S for “secondary.”
Sound refers to a mechanical wave that creates a
local disturbance in pressure, stress, particle dis-
placement, and other quantities, and that
propagates through a compressible medium by 4.2.3 Frequency
oscillation of its particles. These particles are
acted upon by internal elastic forces. Air and Frequency refers to the rate of oscillation. Specif-
water are both fluid acoustic media and sound in ically, it is the rate of change of the phase of a sine
these media travels as longitudinal waves (also wave over time, divided by 2π. Here, phase refers
called pressure or P-waves). A common miscon- to the argument of a sine (or cosine) function.
ception is that the air or water particles travel with It denotes a particular point in the cycle of a
the sound wave from the source to a receiver. This waveform. Phase changes with time. Phase is
is not the case. Instead, individual particles oscil- measured as an angle in radians or degrees.
late back and forth about their equilibrium posi- Phase is a very important factor in the interaction
tion. These oscillations are coupled across of one wave with another. Phase is not normally
individual particles, which creates alternating an audible characteristic of a sound wave, though
regions of compressions and rarefactions and it can be in the case of very-low-frequency
which allows the sound wave to propagate sounds.
(Fig. 4.11). The line along which the particles A simpler concept of frequency of a sine wave,
as shown in Fig. 4.1, is the number of cycles per
1 second. A full cycle lasts from one positive peak
Dan Russell’s animations of particle motion during
acoustic wave propagation: https://www.acs.psu.edu/ to the next positive peak. To determine the fre-
drussell/Demos/waves-intro/waves-intro.html, of the quency, count how many full cycles and fractions
amplitude at a fixed location: https://www.acs.psu.edu/ thereof occur in 1 s. Note that pitch is an attribute
drussell/Demos/wave-x-t/wave-x-t.html, and of longitudi-
of auditory sensation and while it is related to
nal and transverse waves: https://www.acs.psu.edu/
drussell/Demos/waves/wavemotion.html; accessed frequency, it is used in human auditory perception
12 October 2020. as a means to order sounds on a musical scale. As
we know very little about auditory perception in animals, the term pitch is not normally used in animal bioacoustics.

The symbol for frequency is f and the unit is hertz [Hz] in honor of Heinrich Rudolf Hertz, a German physicist who proved the existence of electromagnetic waves. Expressed in SI units, 1 Hz = 1/s.

The fundamental frequency (symbol: f0; unit: Hz) of an oscillation is the reciprocal of the period. The period (symbol: τ; unit: s) is the duration of one cycle and is related to the fundamental frequency as (see Fig. 4.1):

τ = 1 / f0

The wavelength (symbol: λ; unit: m) of a sine wave measures the spatial distance between two successive "peaks" or other identifiable points on the wave.

A sound that consists of only one frequency is commonly called a pure tone. Very often, sounds contain not only the fundamental frequency but also harmonically related overtones. The frequencies of overtones are integer multiples of the fundamental: 2 f0, 3 f0, 4 f0, ... Beware that there are two schemes for naming these tones: f0 can be called either the fundamental or the first harmonic. In the former case, 2 f0 becomes the first overtone, 3 f0 the second overtone, etc. In the latter case, 2 f0 becomes the second harmonic, 3 f0 the third harmonic, etc.

Musical instruments produce harmonics, which determine the characteristic timbre of the sounds they produce. For example, it is the differences in harmonics that make a flute sound unmistakably different from a clarinet, even when they are playing the same note. Animal sounds also often have harmonics as they use similar basic mechanisms to musical instruments. Most mammals have string-like vocal cords and birds have string-like syrinxes. Fish have muscles that contract around a swim bladder to produce percussive-type sounds. Insects and invertebrates stridulate or rub body parts together to produce a percussive sound.

Fig. 4.1 A sinusoidal sound wave having a peak pressure of 1 Pa, a peak-to-peak pressure of 2 Pa, a root-mean-square pressure of 0.7 Pa, a period of 0.25 s, and a frequency of 4 Hz. The top plot indicates the motion of the particles of the medium; they undergo coupled oscillations back and forth, so that the sound wave propagates to the right. At regions of compression, the pressure is high; at regions of rarefaction, it is low. The bottom plot shows the change in pressure over time at a fixed location. While the plots are lined up, the horizontal axes of the top and bottom plots are space and time, respectively.
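As a short numerical sketch of these relations, the following Python snippet computes the period and wavelength of a tone and synthesizes a signal consisting of a fundamental plus overtones. The sound speeds (343 m/s in air, 1500 m/s in water) and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def period(f0):
    """Period (s) of a tone with fundamental frequency f0 (Hz): tau = 1/f0."""
    return 1.0 / f0

def wavelength(f, c):
    """Wavelength (m) of a tone of frequency f (Hz) in a medium with sound speed c (m/s)."""
    return c / f

def tone_with_overtones(f0, amps, fs=48000, duration=1.0):
    """Synthesize a tone whose k-th harmonic (k*f0) has amplitude amps[k-1]."""
    t = np.arange(0.0, duration, 1.0 / fs)
    p = sum(a * np.sin(2 * np.pi * k * f0 * t) for k, a in enumerate(amps, start=1))
    return p, t

if __name__ == "__main__":
    print(period(4.0))                 # 0.25 s, the example of Fig. 4.1
    print(wavelength(440.0, 343.0))    # ~0.78 m in air (assumed c = 343 m/s)
    print(wavelength(440.0, 1500.0))   # ~3.4 m in water (assumed c = 1500 m/s)
    p, t = tone_with_overtones(440.0, amps=[1.0, 0.5, 0.25])  # fundamental + 2 overtones
```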
Fig. 4.2 Spectrograms of (a) a jet ski recorded under water (Erbe 2013) and (b) a Carnaby's Cockatoo (Calyptorhynchus latirostris) whistle, both displaying frequency modulation.

The frequency or frequencies of a sound may change over time, so that frequency is a function of time: f(t). This is called frequency modulation (abbreviation: FM). If the frequency increases over time, the sound is called an upsweep. If the frequency decreases over time, the sound is called a downsweep. Sounds without frequency modulation are called continuous wave. The sound of jet skis under water is frequency-modulated due to frequent speed changes (Erbe 2013). Whistles of animals such as birds or dolphins (e.g., Ward et al. 2016) are commonly frequency-modulated and often exhibit overtones (Fig. 4.2).

The acoustic features of frequency-modulated sounds such as whistles can identify the species, population, and sometimes individual animal that made them (e.g., Caldwell and Caldwell 1965). Such characteristic features include the start frequency, end frequency, minimum frequency, maximum frequency, duration, number of local extrema, number of inflection points, and number of steps (e.g., Marley et al. 2017). The start frequency is the frequency at the beginning of the fundamental contour, the end frequency is the frequency at the end of the fundamental contour (Fig. 4.3). The minimum frequency is the lowest frequency of the fundamental contour and the maximum frequency is the highest. Duration measures how long the whistle lasts. Extrema are points of local minima or maxima in the contour. At a local minimum, the contour changes from downsweep to upsweep; at a local maximum, it changes from upsweep to downsweep. Mathematically, the first derivative of the whistle contour with respect to time is zero at a local extremum, and the second derivative is a positive number in the case of a minimum or a negative number in the case of a maximum. At an inflection point, the curvature of the contour changes from clockwise to counter-clockwise or vice versa. Mathematically, the first derivative of the whistle contour with respect to time exhibits a local extremum and the second derivative is zero at an inflection point. Steps in the contour are discontinuities in frequency. There is no temporal gap but the contour jumps in frequency. The frequency measurements are taken from the fundamental contour. The duration, number of local extrema, number of inflection points, and number of steps are the same in fundamental and overtones and can therefore be measured from any harmonic contour. This is beneficial if the fundamental is partly masked by noise.

Fig. 4.3 Spectrogram of a frequency-modulated sound, identifying characteristic features.
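Once a fundamental frequency contour has been traced from a spectrogram, the contour features described above can be measured automatically. The sketch below is a simplified, hypothetical example that assumes the contour is already available as a clean, noise-free array of frequency values at regular time steps; tracing and smoothing a contour from a real recording requires additional processing not shown here, and all names are ours.

```python
import numpy as np

def contour_features(freq, dt):
    """Basic features of a frequency-modulated contour.

    freq : 1-D array of fundamental frequency values (Hz), one per time step
    dt   : time step between contour points (s)
    """
    d1 = np.diff(freq)   # first difference ~ first derivative (slope of the contour)
    d2 = np.diff(d1)     # second difference ~ second derivative (curvature)
    # Local extrema: the slope changes sign (upsweep <-> downsweep)
    n_extrema = int(np.sum(np.sign(d1[:-1]) * np.sign(d1[1:]) < 0))
    # Inflection points: the curvature changes sign
    n_inflections = int(np.sum(np.sign(d2[:-1]) * np.sign(d2[1:]) < 0))
    return {
        "start_frequency_Hz": float(freq[0]),
        "end_frequency_Hz": float(freq[-1]),
        "min_frequency_Hz": float(freq.min()),
        "max_frequency_Hz": float(freq.max()),
        "duration_s": float(len(freq) * dt),
        "n_extrema": n_extrema,
        "n_inflection_points": n_inflections,
    }

if __name__ == "__main__":
    t = np.linspace(0.0, 0.5, 200)                      # a 0.5-s whistle
    contour = 8000 + 2000 * np.sin(2 * np.pi * 2 * t)   # made-up sinusoidal contour
    print(contour_features(contour, dt=t[1] - t[0]))
```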
4.2.4 Pressure

Atmospheric pressure is the static pressure at a specified height above ground and is due to the weight of the atmosphere above. Similarly, hydrostatic pressure is the static pressure at a specified depth below the sea surface and is due to the weight of the water above plus the weight of the atmosphere.

Sound pressure (or acoustic pressure) is caused by a sound wave. Sound pressure (symbol: p; unit: Pa) is dynamic pressure; it varies with time t (i.e., p is a function of t: p(t)). It is a deviation from the static pressure and defined as the difference between the instantaneous pressure and the static pressure. Air-borne sound pressure is measured with a microphone, water-borne sound pressure with a hydrophone. The unit of pressure is pascal [Pa] in honor of Blaise Pascal, a French mathematician and physicist. Some of the superseded units of pressure are bar and dynes per square centimeter, which can be converted to pascal: 1 bar = 10⁶ dyn/cm² = 10⁵ Pa. Mathematically, pressure is defined as force per area. Pascal in SI units is

1 Pa = 1 N/m² = 1 J/m³ = 1 kg/(m s²)

where N symbolizes newton, the unit of force, and J symbolizes joule, the unit of energy.

The pressure in Fig. 4.1 follows a sine wave: p(t) = A sin(2π f t), where A is the amplitude and f the frequency. In the example of Fig. 4.1, A = 1 Pa, f = 4 Hz. In general terms, the amplitude is the magnitude of the largest departure of a periodically varying quantity (such as sound pressure or particle velocity, see Sect. 4.2.8) from its equilibrium value. The magnitude is always positive and commonly symbolized by two vertical bars: |p(t)|. These are the same values as p(t), but without the sign (i.e., the magnitude is always positive). The amplitude may not always be a constant. When it changes as a function of time A(t), the signal undergoes amplitude modulation (abbreviation: AM).

The signal in Fig. 4.4 is both amplitude- and frequency-modulated:

p(t) = A(t) sin(2π f(t) t)

The amplitude function changes exponentially with time: A(t) = e^(−(t − t0)² / (2σ²)), where the peak occurs at t0 = 1 ms, and σ is the standard deviation of the Gaussian envelope. Such signals (sine waves that are amplitude-modulated by a Gaussian function) are called Gabor signals. Echolocation clicks are commonly of Gabor shape (e.g., Kamminga and Beitsma 1990; Holland et al. 2004). In several species of beaked whales, the sine wave is frequency-modulated (Baumann-Pickering et al. 2013) as in the example in Fig. 4.4, where the frequency changes linearly with time, sweeping up from 10 to 50 kHz.

Fig. 4.4 Gabor click similar to a beaked whale click. The signal is based on a sine wave; the amplitude is modulated by a Gaussian function, and the frequency is swept up with time. The corresponding spectrogram is shown in the bottom panel.

The peak-to-peak sound pressure (symbol: ppk-pk; unit: Pa) is the difference between the maximum pressure and the minimum pressure of a sound wave:
ppk-pk = max(p(t)) − min(p(t))

In other words, it is the sum of the greatest magnitude during compression and the greatest magnitude during rarefaction.

The peak sound pressure (symbol: ppk; unit: Pa) is also called zero-to-peak sound pressure and is the greatest deviation of the sound pressure from the static pressure; it is the greatest magnitude of p(t):

ppk = max(|p(t)|)

This can occur during compression and/or rarefaction. In other words, ppk is the greater of the greatest magnitude during compression and the greatest magnitude during rarefaction (Fig. 4.1).

The root-mean-square (rms) is a useful measure for signals (like sound pressure) that aren't simple oscillatory functions. The rms of any signal can be calculated, no matter how complicated it is. To do so, square each sample of the signal, average all the squared samples, and then take the square root of the result. It turns out that the rms of a sine wave is 0.707 times its amplitude, but this is only true for sinusoidal (sine or cosine) waves. The units for rms are the same as those for amplitude (e.g., Pa if the signal is pressure or m/s if the signal is particle velocity). The root-mean-square sound pressure (symbol: prms; unit: Pa) is computed as its name dictates, as the root of the mean over time of the squared pressure:

prms = √( ∫_{t1}^{t2} p²(t) dt / (t2 − t1) ), or in discrete form: prms = √( Σ_{i=1}^{N} pi² / N )   (4.1)

This computation is practically carried out over a time interval from t1 to t2.

The mean-square is the mean of the square of the signal values. The mean-square of a signal is always equal to the square of the signal's rms. Its units are the square of the corresponding amplitude units (e.g., Pa² if the signal is pressure or (m/s)² if the signal is particle velocity). The mean-square sound pressure formula is similar to (Eq. 4.1) but without the root.

The sound pressure level (abbreviation: SPL; symbol: Lp) is the level of the root-mean-square sound pressure and computed as

Lp = 20 log10 (prms / p0)

expressed in dB relative to (abbreviated: re) a reference value p0. The standard reference value is 20 μPa in air and 1 μPa in water.

The peak sound pressure level (also called zero-to-peak sound pressure level; abbreviation: SPLpk; symbol: Lp,pk) is the level of the peak sound pressure and computed as

Lp,pk = 20 log10 (ppk / p0)

It is expressed in dB relative to a reference value p0 (i.e., 20 μPa in air and 1 μPa in water). Similarly, the peak-to-peak sound pressure level is the level of the peak-to-peak sound pressure:

Lp,pk-pk = 20 log10 (ppk-pk / p0)

Example sound pressure levels in air and water are given in Tables 4.3 and 4.4. Sources can have a large range of levels and only one example is given for each source. Animal sounds and their levels may vary with species, sex, age, behavioral context, etc. Animals in captivity may produce lower levels than animals in the wild. Ship noise depends on the type of vessel, its propulsion system, speed, load, etc. The tables are intended to give an overview of the dynamic range of source levels across the different sources.

Loudness is an attribute of auditory sensation. While it is related to sound pressure, loudness measures how loud or soft a sound seems to us. Given that very little is known about auditory perception in animals, the term loudness is rarely used in animal bioacoustics.
Table 4.3 Examples of sound pressure levels in air. All levels are broadband; the hearing thresholds are single-frequency. Nominal ranges from the source are given in meters. Note that the different sources listed can have a range of levels and only one example is given
Pa dB re 20 μPa
Explosion at 1 m 63,246 190
Airplane take-off at 25 m 632 150
Human pain threshold at 1 kHz 200 140
Lion roar at 1 m 13 116
Human discomfort threshold at 1 kHz 10 114
Diesel lawn mower at 1 m 1 94
Truck at city speed at 20 m 0.2 80
Old vacuum cleaner at 1 m 0.1 70
Bird song at 1 m 0.02 60
Cricket chorus at 1 m 0.02 60
Human speech at 1 m 0.01 55
Buzzing mosquito 0.002 40
Human whisper at 1 m 0.001 30
Fluttering leaves 0.0002 20
Human breathing at 1 m 0.0001 10
Human hearing threshold at 1 kHz 0.00002 0

Table 4.4 Examples of sound pressure levels in water. All levels are broadband; the hearing thresholds are single-frequency. Nominal ranges from the source are given in meters. Note that the different sources listed can have a range of levels and only one example is given
Pa dB re 1 μPa
Subsea earthquake 316,228 230
Seismic survey airgun at 1 m 10,000 200
Container ship at 1 m 5623 195
Humpback whale song at 1 m 1778 185
Zodiac at high speed at 1 m 178 165
Dolphin whistle at 1 m 32 150
Geotechnical drilling at 1 m 18 145
Jet ski 10 140
Toadfish at 1 m 10 140
Damsel fish at 1 m 1 120
Open ocean ambient noise at sea state 4 0.1 100
Open ocean ambient noise at sea state 0.5 0.01 80
California Sea lion hearing threshold at 10 kHz 0.001 60
Killer whale hearing threshold at 20 kHz 0.0001 40

4.2.5 Sound Exposure

Sound exposure (symbol: Ep,T; unit: Pa²s) is the integral over time of the squared pressure:

Ep,T = ∫_{t1}^{t2} p²(t) dt

Sound exposure increases with time. The longer the sound lasts, the greater the exposure. The sound exposure level (abbreviation: SEL; symbol: LE,p) is computed as

LE,p = 10 log10 (Ep,T / Ep,0)

It is expressed in dB relative to Ep,0 = 400 μPa²s in air, and Ep,0 = 1 μPa²s in water. Sound exposure is proportional to the total energy of a sound wave.
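A minimal numerical sketch of these definitions: given a sound pressure time series sampled at rate fs, the sound exposure is approximated by summing the squared pressure samples times the sample interval, and the SEL follows from the reference value (1 μPa²s under water, 400 μPa²s in air). The impulsive test signal and all parameter values below are arbitrary assumptions for illustration.

```python
import numpy as np

def sound_exposure(p, fs):
    """Sound exposure E (Pa^2 s): time integral of the squared pressure."""
    return np.sum(p ** 2) / fs

def sel(p, fs, E_ref=1e-12):
    """Sound exposure level (dB re E_ref); default E_ref = 1 uPa^2 s (water)."""
    return 10 * np.log10(sound_exposure(p, fs) / E_ref)

if __name__ == "__main__":
    fs = 48_000
    t = np.arange(0.0, 0.1, 1.0 / fs)
    # Arbitrary decaying 1-kHz pulse with 100 Pa initial amplitude (illustration only)
    p = 100.0 * np.exp(-t / 0.01) * np.sin(2 * np.pi * 1000 * t)
    print(f"E   = {sound_exposure(p, fs):.3f} Pa^2 s")
    print(f"SEL = {sel(p, fs):.1f} dB re 1 uPa^2 s")          # underwater reference
    print(f"SEL = {sel(p, fs, E_ref=400e-12):.1f} dB re 400 uPa^2 s")  # in-air reference
```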
4.2.6 When to Use SPL and SEL?

Sound pressure and sound exposure are closely related, and in fact, the sound exposure level can be computed from the sound pressure level as:

LE,p = Lp + 10 log10 (t2 − t1)

Conceptually, the difference is that the SPL is a time-average and therefore useful for sounds that don't change significantly over time, or that last for a long time, or that, for the assessments of noise impacts, can be considered continuous. Examples are workplace noise or ship noise. The SEL, however, increases with time and critically depends on the time window over which it is computed. It is therefore most useful for short-duration, transient sounds, such as pulses from explosions, pile driving, or seismic surveys. The SEL is then computed over the duration of the pulse.

It can be difficult to determine the actual pulse length as the exact start and end points are often not clearly visible, in particular in background noise. Therefore, in praxis, SEL is commonly computed over the 90% energy signal duration. This is the time during which 90% of the sound exposure occurs. Sound exposure is computed symmetrically about the 50% mark; i.e., from the 5% to the 95% points on the cumulative squared-pressure curve. SEL becomes (Fig. 4.5):

LE,p = 10 log10 ( ∫_{t5%}^{t95%} p²(t) dt / Ep,0 )

In the presence of significant background noise pn(t), the noise exposure needs to be subtracted from the overall sound exposure in order to yield the sound exposure due to the signal alone. In praxis, the noise exposure is computed over an equally long time window (from t1 to t2) preceding or succeeding the signal of interest:

LE,p = 10 log10 ( ( ∫_{t5%}^{t95%} p²(t) dt − ∫_{t1}^{t2} pn²(t) dt ) / Ep,0 )

Fig. 4.5 Pressure pulse recorded from pile driving under water (top) and cumulative squared-pressure curve (bottom). The horizontal lines indicate the 5% and 95% cumulative squared-pressure points on the y-axis. The vertical lines identify the corresponding times on the x-axis. The time between the 5% and 95% marks is the 90% energy signal duration. Recording from Erbe 2009.

4.2.7 Acoustic Energy, Intensity, and Power

Apart from sound pressure and sound exposure, other physical quantities appear in the bioacoustics literature, but are often wrongly used. Acoustic energy refers to the total energy contained in an acoustic wave. This is the sum of kinetic energy (contained in the movement of the particles of the medium) and potential energy (i.e., work done by elastic forces in the medium). Acoustic energy E is proportional to squared pressure p and time interval Δt (i.e., to sound exposure) only in the case of a free plane wave or a spherical wave at a large distance from its source:

E = (S / Z) p² Δt
The proportionality constant is the ratio of surface area S through which the energy flows and acoustic impedance Z. Acoustic energy increases with time; i.e., the longer the sound lasts or the longer it is measured, the greater the transmitted energy. The unit of energy is joule [J] in honor of English physicist James Prescott Joule. In SI units:

1 J = 1 kg m²/s²

Acoustic power P is the amount of acoustic energy E radiated within a time interval Δt:

P = E / Δt

The unit of power is watt [W]. In SI units:

1 W = 1 J/s = 1 kg m²/s³

Acoustic intensity I is the amount of acoustic energy E flowing through a surface area S perpendicular to the direction of propagation, per time Δt:

I = E / (S Δt) = P / S

For a free plane wave or a spherical wave at a large distance from its source, this becomes:

I = p² / Z   (4.2)

The unit of intensity is W/m². A conceptually different definition equates the instantaneous acoustic intensity with the product of sound pressure and particle velocity u:

I(t) = p(t) u(t)

The two concepts are mathematically equivalent for free plane and spherical waves and the unit of intensity is always W/m².

The above quantities (energy, power, and intensity) are sometimes used interchangeably. That's wrong. They are not the same, but they are related. With E, P, I, S, and t denoting energy, power, intensity, surface area, and time, respectively:

P = E / Δt = I S

More information and definitions can be found in acoustic standards (including American National Standards Institute 2013; International Organization for Standardization 2017).

4.2.8 Particle Velocity

Particle velocity (symbol: u; unit: m/s) refers to the oscillatory movement of the particles of the acoustic medium (i.e., molecules in air and water, and atoms in the ground) as a wave passes through. In the example of Fig. 4.1, the particle velocity is a sine wave, just like the acoustic pressure. Each particle oscillates about its equilibrium position. At this point, its displacement is zero, but its velocity is greatest (i.e., either maximally positive or maximally negative, depending on the direction in which the particle is moving). At the two turning points, the displacement from the equilibrium position is maximum and the velocity passes through zero, changing sign (i.e., direction) from positive to negative, or vice versa. Velocity is a vector, which means it has both magnitude and direction. Particle displacement (unit: m) and particle acceleration (unit: m/s²) are also vector quantities. In fact, particle velocity is the first derivative of particle displacement with respect to time, and particle acceleration is the second derivative of particle displacement with respect to time. Measurements of particle displacement, velocity, and acceleration created by snorkeling are shown in Fig. 4.6.

Air molecules also move due to wind, and water molecules move due to waves and currents. But these types of movement are not due to sound. Wind velocity and current velocity are entirely different from the oscillatory particle velocity involved in the propagation of sound. It is equally important to understand that the speed at which the particles move when a sound wave passes through is not equal to the speed of sound at which the sound wave travels through the medium. The latter is not an oscillatory quantity.
Fig. 4.6 Spectrograms of mean-square sound pressure spectral density [dB re 1 μPa²/Hz], mean-square particle displacement spectral density [dB re 1 pm²/Hz], mean-square particle velocity spectral density [dB re 1 (nm/s)²/Hz], and mean-square particle acceleration spectral density [dB re 1 (μm/s²)²/Hz] recorded under water when a snorkeler swam above the recorder (Erbe et al. 2016b; Erbe et al. 2017a). The four panels (sound pressure, particle displacement magnitude, particle velocity magnitude, and particle acceleration magnitude) show frequency [Hz] versus time [s].

4.2.9 Speed of Sound

The speed at which sound travels through an acoustic medium is called the speed of sound (symbol: c; unit: m/s). It depends primarily on temperature and height above ground in air, and on temperature, salinity, and depth below the sea surface in water. The speed of sound is computed as the distance sound travels divided by time. It can also be computed from measurements of the waveform (i.e., wavelength, period, and frequency as in Fig. 4.1):

c = λ/τ = λ f

In solid media, such as rock, two types of waves are supported, P- and S-waves (see Sect. 4.2.2), and the speeds (cP and cS) at which they travel differ. Table 4.5 gives examples for the speed of sound in air and water, and for P- and S-waves in some Earth materials. Example sound speed profiles (i.e., line graphs of sound speed versus altitude or water depth) are given in Fig. 4.7.

4.2.10 Acoustic Impedance

Each acoustic medium has a characteristic impedance (symbol: Z). It is the product of the medium's density (symbol: ρ) and speed of sound: Z = ρc. In air at 0 °C with a density ρ = 1.3 kg/m³ and speed of sound c = 330 m/s, the characteristic impedance is Z = 429 kg/(m²s). In freshwater at 5 °C with a density of ρ = 1000 kg/m³ and a speed of sound c = 1427 m/s, the characteristic impedance is Z = 1,427,000 kg/(m²s). In sea water at 20 °C and 1 m depth with 3.4% salinity, a density of ρ = 1035 kg/m³, and a speed of sound of c = 1520 m/s, the characteristic impedance is Z = 1,573,200 kg/(m²s). The characteristic impedance relates the sound pressure to particle velocity via p = Z u for plane waves.
122 C. Erbe et al.

Table 4.5 P-wave and S-wave speeds of certain acoustic media


Medium cP [m/s] cS [m/s]
Air, 0  C 330
Air, 20  C 343
Freshwater, 5  C 1427
Freshwater, 20  C 1481
Salt water, 20  C, salinity 3.4%, 1 m depth 1520
Sand 800–2200
Clay 1000–2500
Sandstone 1400–4300 700–2800
Granite 5500–5900 2800–3000
Limestone 5500–6100 2800–3300

Fig. 4.7 Example profiles of the speed of sound in (a) air and the national programs that contribute to it; https://argo.
(data from The Engineering ToolBox; https://www. ucsd.edu, https://www.ocean-ops.org. The Argo Program
engineeringtoolbox.com/elevation-speed-sound-air-d_ is part of the Global Ocean Observing System. Argo float
1534.html; accessed 16 April 2021) and (b) water in polar data and metadata from Global Data Assembly Centre
and equatorial regions (These data were collected and (Argo GDAC); https://doi.org/10.17882/42182; accessed
made freely available by the International Argo Program 16 April 2021). See Chaps. 5 and 6

4.2.11 The Decibel into a manageable range of values. This is one of


the reasons why the decibel is so popular in
Acousticians may deal with very-high-amplitude acoustics. Another reason is that human percep-
signals and very-low-amplitude signals; e.g., the tion of the loudness of a sound is approximately
sound pressure near an explosion might be proportional to the logarithm of its amplitude.
60,000 Pa, while the sound pressure from When quantities such as sound pressure or
human breathing is only 0.0001 Pa. This means sound exposure are converted to logarithmic
that the dynamic range of quantities in acoustics scale, the word “level” is added to the name.
is large and, in fact, covers seven orders of mag- Sound pressure level and sound exposure
nitude (see Tables 4.3 and 4.4). Rather than level are much more commonly used than their
handling multiple zeros and decimals, using a linear counterparts, sound pressure and sound
logarithmic scale compresses the dynamic range exposure.
4 Introduction to Acoustic Terminology and Signal Processing 123

By definition, the level LQ of quantity Q is or P0). For example, an underwater tone at a level
proportional to the logarithm of the ratio of of 120 dB re 1 μPa rms has an rms pressure of
Q and a reference value Q0, which has the same 1 Pa. This is worked out as follows:
unit. In the case of a field quantity F, such as
sound pressure or particle velocity, or an electri- F ¼ 10120=20  1μPa ¼ 106 μPa ¼ 1 Pa
cal quantity such as voltage or current, the level
However, a tone of 120 dB re 20 μPa rms in air
LF is computed as
has an rms pressure of 20 Pa:
F
LF ¼ 20 log 10 F ¼ 10120=20  20 μPa ¼ 106  20 μPa ¼ 20 Pa
F0

In the case of a power quantity P, such as


mean-square sound pressure or energy, the level 4.2.11.2 Differences between Levels
LP is computed as of like Quantities
P A particular difference between two levels
LP ¼ 10 log 10 corresponds to particular ratios between their
P0
field and power quantities. The general
Both levels are expressed in decibels (dB). relationships are:
Note the different factors (20 versus 10) in the
F1
equations. It is critically important to always state LF1  LF2 ¼ 20 log 10
the reference value F0 or P0 when discussing F2
levels, because reference values differ between P1
air and water. LP1  LP2 ¼ 10 log 10
P2

¼ 10ð 20 Þ
F1 LF1 LF2
4.2.11.1 Conversion from Decibel
F2
to Field or Power Quantities
The relationships for calculating field and power
¼ 10ð 10 Þ
P1 LP1 LP2

quantities from their levels are, respectively: P2


LF LP
F ¼ 10 20 F 0 , and P ¼ 10 10 P0 ð4:3Þ Some common examples are given in
Table 4.6. Note the inverse relationship between
The units of the calculated quantities corre- ratios for corresponding positive and negative
spond to the units of the reference quantity (F0 level differences and also that each power

Table 4.6 Level differences and their corresponding field and power quantity ratios
Level difference Field quantity ratio (F1/F2); use for Power quantity ratio (P1/P2); use for power,
(LF1-LF2 or LP1-LP2) pressure, particle velocity, voltage, intensity, energy, sound exposure, mean-square
in dB current, etc. pressure, etc.
40 1/100 ¼ 0.01 1/10,000 ¼ 0.0001
20 1/10 ¼ 0.1 1/100 ¼ 0.01
pffiffiffiffiffi
10 1= 10  0:316 1/10 ¼ 0.1
6 1/2 ¼ 0.5 1/4 ¼ 0.25
pffiffiffi
3 1/ 2  0.707 1/2 ¼ 0.5
0 1 1
pffiffiffi
3 2  1.41 2
6 2 4
pffiffiffiffiffi
10 10  3.16 10
20 10 100
40 100 10,000
124 C. Erbe et al.

quantity ratio is the square of the corresponding analog-to-digital converters have a digitization
field quantity ratio. gain expressed in dB re FS/V, which specifies
For example, a tone at a level of 120 dB re the input voltage that leads to full scale (FS). If
1 μPa rms is 20 dB stronger than a tone at a the digitizer has a digitization gain ΔLDG ¼ 10 dB
level of 100 dB re 1 μPa rms, so from re FS/V, then 1010/20 FS/V ¼ 101/2 FS/V is the
Table 4.6, the ratio of the two rms pressures is relationship between FS and input voltage,
p1/p2 ¼ F1/F2 ¼ 10, and the ratio of their meaning that FS is reached when the input is
intensities is I1/I2 ¼ P1/P2 ¼ 100. 1/101/2 V ¼ 0.32 V. The actual value of FS
depends on the number of bits available. A
16-bit digitizer in bipolar mode (i.e., producing
4.2.11.3 Amplification of Signals
both positive and negative numbers) has a full-
The above formulae and Table 4.6 can also be
scale value of 216–1 ¼ 215 ¼ 32,768. And so the
used to calculate the effect of amplifying signals.
digital values v representing the acoustic pressure
For example, if an amplifier has a gain of 20 dB,
will lie between 32,768 and + 32,767 (with one
then the rms voltage at the output of the amplifier
of the possible numbers being 0). The final steps
will be 10 times the rms voltage at its input.
in relating these digital values to the recorded
Similarly, an amplifier with a 40 dB gain will
acoustic pressure entail dividing by FS,
increase the rms voltage by a factor of 100. If
converting to dB, and subtracting all the gains:
several amplifier stages are cascaded, then their
combined gain is the sum of the gains of the Lp ¼ 20 log 10 ðv=FSÞ  ΔLDG  ΔLG  N S
individual stages (in dB). ¼ 20 log 10 ðv=FSÞ þ 150 dB re 1 μPa
When calibrating acoustic recordings (see
Chap. 2), the gains of all components of the
recording systems have to be summed. An under-
water recording system (Fig. 4.8), for example,
4.2.11.4 Superposition of Field
contains a hydrophone that converts received
and Power Quantities
acoustic pressure to a time series of voltages at
If two tones of the same frequency and level
its output. The sensitivity of the hydrophone
arrive in phase at a listener, then the amplitude
specifies this relationship. For example, a hydro-
is doubled and the combined level is therefore
phone with a sensitivity NS ¼ 180 dB re
6 dB above the level of each tone (see
1 V/μPa produces 10–180/20 ¼ 109 Volts output
Table 4.6). If, on the other hand, there is a random
per 1 μPa input. A more sensitive hydrophone has
phase difference between the two tones then, on
a less negative sensitivity. The output voltage
average, the intensity of the two signals will sum.
might be passed to an amplifier with ΔLG ¼ 20 dB
In this case (again from Table 4.6) the combined
gain, after which it is digitized by a data acquisi-
intensity is 3 dB higher than the level of each
tion board, such as a computer’s soundcard. All
tone. For example, if each tone has a level of
120 dB re 1 μPa rms, then the two tones together
have a level of 126 dB re 1 μPa rms if they are in
phase. Their superposition has an average level of
123 dB re 1 μPa rms if they have a random phase
difference. Summing signals that have the same
amplifier phase, or a fixed phase difference, is known as
soundcard coherent summation, whereas performing an “on
average” summation of signals assuming a ran-
hydrophone
dom phase is called incoherent summation.
Fig. 4.8 Sketch of an example underwater recording The calculation is more complicated if the two
setup. A terrestrial setup would have a microphone instead tones have different levels. It is necessary to use
of a hydrophone Eq. (4.3) to convert both levels to corresponding
4 Introduction to Acoustic Terminology and Signal Processing 125

incoherent summation is less than 0.5 dB higher


than that of the higher of the two; and for many
practical applications, the lower-level signal can
be ignored.

4.2.11.5 Levels in Air Versus Water


Comparing sound levels in air and water is com-
plicated and has caused much confusion in the
past. For two sound sources of equal intensity Ia
and Iw in air and water, respectively, the sound
pressure level is 62 dB greater in water because of
two factors: the greater acoustic impedance of
water and the different reference pressures used
in the two media.
The effect of the acoustic impedance can be
Fig. 4.9 Line graphs of the effect on the higher-level seen as follows. Assuming Iw ¼ Ia, then from
signal of combining two signals by coherent summation (Eq. 4.2):
(assuming the signals are in phase or 180 out of phase)
and incoherent summation p2w p2a p2 Z
¼ , which is equivalent to w2 ¼ w :
Zw Za pa Z a

field (coherent summation) or power (incoherent This ratio of mean-square pressures in the two
summation) quantities, add these quantities, and media can be expressed in terms of the density
then convert the result back to a level. and speed of sound of the two media:
The outcome of this process is plotted in
Fig. 4.9 in terms of the increase in the combined p2w Z w ρw cw
¼ ¼ :
level from that of the higher-level signal as a p2a Z a ρa c a
function of the difference between the higher
Applying 10 log10() to these ratios, the differ-
and lower levels. Note that this increase never
ence between the mean-square sound pressure
exceeds 6 dB for a coherent summation or 3 dB
levels in water and air is:
for an incoherent summation. In the case of a
coherent summation, proper account has to be p2w p2
taken of the relative phases of the two tones Lpw2  Lpa2 ¼ 10 log 10 2
 10 log 10 a2
p0 p0
when adding the field quantities, and this can
have a very large effect. Figure 4.9 shows the p2w ρ c
¼ 10 log 10 ¼ 10 log 10 w w
extreme cases: The upper limit occurs when the p2a ρa c a
two signals are in phase, and the lower limit ¼ 36 dB
occurs when they have a phase difference of
180 (π radians). The latter case gives destructive The difference between the sound pressure
interference and the combined level is lower levels is, of course, also 36 dB:
than that of the highest individual signal. If pw p
the two individual signals have a 180 phase Lpw  Lpa ¼ 20 log 10  20 log 10 a
p0 p0
difference and the same amplitude, then the rffiffiffiffiffiffiffiffiffiffi
p ρw c w
destructive interference is complete, the two ¼ 20 log 10 w ¼ 20 log 10
pa ρa c a
signals cancel each other out, and the combined
level is 1! ¼ 36 dB
Another useful observation from Fig. 4.9 is
that when the difference in level between the In the above two equations, the same reference
two individual signals is greater than 10 dB, the pressure p0 is required. However, the convention
126 C. Erbe et al.

is to use pa0¼20 μPa in air and pw0¼1 μPa in Some sources are large in their physical
water. The difference in reference pressures adds dimensions and placing a recorder at short range
another 26 dB to the sound pressure level in (i.e., into the so-called near-field, see Sect. 4.2.13)
water, because: will not result in a level that captures the full
output of the source. Also, many sound sources
pa0 20 μPa
20 log 10 ¼ 20 log 10 ¼ 26 dB do not operate in a free-field but rather near a
pw0 1 μPa
boundary (e.g., air-ground, air-water, or water-
So, if two sound sources emit the same inten- seafloor). At such boundaries, reflection, scatter-
sity in air and water, then the sound pressure level ing, absorption, and phase changes may occur,
in water referenced to 1 μPa is 62 dB (i.e., affecting the recorded level. In praxis, a sound
36 dB + 26 dB) greater than the sound pressure source is recorded at some range in the far-field
level in air referenced to 20 μPa. and an appropriate (and sometimes sophisticated)
While this might be confusing, there would sound propagation model is utilized to account
hardly be a sensible reason to compare levels in for the effects of the environment in order to
air and water. Such comparisons have been compute a source level that is independent of
attempted in the past to give an analogy to levels the environment. Such source levels can then be
with which humans have experience in air. For applied to new situations and different
example, humans find 114 dB re 20 μPa annoying environments in order to predict received levels
and 140 dB re 20 μPa painful, so what would be a elsewhere. Like other levels, the source level is
similarly annoying level under water that might expressed in dB relative to a reference value. It is
disturb animals? further referenced to a nominal distance of 1 m
But animals perceive sound differently from from the source. The source level can be a sound
humans, hear sound at different frequencies and pressure level or a sound exposure level,
levels, and can have rather different auditory depending on the source and situation.
anatomy (see Chap. 10 on audiograms). As a The radiated noise level (abbreviation: RNL;
result, a signal easily heard by a human could be symbol LRN) is more easily determined. It is the
barely audible to some animals or much louder to level of the product of the sound pressure and the
others. Even for divers, sound reception under range r at which the sound pressure is recorded,
water is quite a different process from sound and it can be calculated as the received sound
reception in air, due to different acoustic imped- pressure level Lp plus a spherical propagation
ance ratios of the acoustic medium and human loss term:
tissues, and different sound propagation paths. prms ðr Þr r
Furthermore, the psychoacoustic effects (emo- LRN ¼ 20 log 10 ¼ Lp þ 20 log 10
p0 r 0 r0
tional impacts) of different types of noise on
animals have not been examined thoroughly. It is expressed in dB relative to a reference
Even in humans, for example, 110 dB re 20 μPa value of p0r0 ¼ 20 μPa m in air and p0r0 ¼ 1 μPa m
of rock music does not provide the same experi- in water. The radiated noise level is dependent
ence as 110 dB re 20 μPa of traffic noise. upon the environment and is therefore also called
affected source level. Note that it is very common
in the bioacoustic literature to report source levels
4.2.12 Source Level and radiated noise levels as dB re 20 μPa @ 1 m
in air and dB re 1 μPa @ 1 m in water. The ISO
The source level (abbreviation: SL; symbol: LS) is definition is mathematically different and the
meant to be characteristic of the sound source and notation excludes “@ 1 m” (International Organi-
independent of both the environment in which the zation for Standardization 2017).
source operates and the method by which the While the source level can be characteristic of
source level is determined. In praxis, the determi- the source, there are many factors that affect the
nation of the source level has numerous problems. source level. For example, larger ships typically
4 Introduction to Acoustic Terminology and Signal Processing 127

have a higher source level than smaller ships. out of phase either because sound from different
Cars going fast have a higher source level than parts of the source arrives at different times (This
cars going slowly. Animals can vary the ampli- is the case of an extended source.) or because the
tude of the same sound depending on the context curvature of the spherical wavefront from the
and their motivation. Different sound types can source is too great to be ignored (This is the case
have different source levels. Territorial defense or of a source small enough to be considered a point
aggressive sounds usually have the highest source source.). These two cases have different frequency
level in a species’ repertoire. Mother-offspring dependence with the near-field to far-field transi-
sounds often have the lowest source level in a tion distance increasing with increasing frequency
species’ repertoire, because mother and calf are for an extended source, and decreasing with
typically close together and want to avoid detec- increasing frequency for a small source. A single
tion by predators. source may behave as a small source at low
frequencies and as an extended source at high
frequencies, which implies that there is some
4.2.13 What Field? Free-Field, non-zero frequency at which it will have a mini-
Far-Field, Near-Field mum near-field to far-field transition distance.
This has resulted in much confusion.
While this might read like the opening of a When is a sound source small versus
Dr. Seuss book, it is quite important to understand extended? A sound source can be considered
these concepts. The free-field, or free sound field, small when its physical dimensions are small
exists around a sound source placed in a homoge- compared to the acoustic wavelength. A fin
neous and isotropic medium that is free of whale (Balaenoptera physalus) with a head size
boundaries. Homogenous means that the medium of perhaps 6 m produces a characteristic 20-Hz
is uniform in all of its parameters; isotropic means signal that has a wavelength of about 70 m and so
that the parameters do not depend on the direction the whale can be considered small.
of measurement. While the free-field assumption When studying the effects of noise on animals,
is commonly applied to estimates of particle however, the noise sources one deals with are
velocity from pressure measurements or estimates mostly extended sources. In the near-field, the
of propagation loss, sound sources and receivers amplitudes of field and power quantities are
are rarely in a free-field. More often, sound affected by the physical dimension of the sound
sources and receivers are near a boundary. This source. This is because the surface of an extended
is the case for sources such as trains or construc- sound source can be considered an array of sepa-
tion sites and for receivers such as humans, all of rate point sources. Each point source generates an
which are right at the air-ground boundary. This acoustic wave. At any location, the instantaneous
is also the case for sources such as ships at the pressure (as an example of a field quantity) is the
water surface and for receivers such as fishes in summation of the instantaneous pressures from
shallow water, where they are near two all of the point sources. In the near-field, the
boundaries: the air-water and the water-seafloor various sound waves have traveled various
boundaries. At boundaries, some of the sound is distances and arrive at various phases. Therefore,
transmitted into the other medium, some of it is the near-field consists of regions of destructive
reflected, some of it is scattered in various and constructive interference and the pressure
directions. For more detail on source-path- amplitude depends greatly on where exactly in
receiver models in air and water, see Chaps. 5 the near-field it is measured. There may be
and 6. regions close to a sound source where the pres-
The far-field is the region that is far enough sure amplitude is always zero. The interference
from the source so that the particle velocity and pattern depends on the frequency of the sound,
pressure are effectively in phase. The near-field is and the regions of destructive and constructive
the region closer to the source where they become interference will be different depending on the
128 C. Erbe et al.

170 higher than what would be measured with a


receiver in the near-field (blue solid line in
160 Fig. 4.10).
Near Field Far Field ... Radiated noise levels and source levels are
150 useful to estimate the received level at some
range in the far-field. They will always be higher
SPL [dB re 1 Pa]

140 than the levels that exist in the near-field. There


spherical spreading has been a lot of confusion about this in the
130
bioacoustics community, for example in the case
of marine seismic surveys. A seismic airgun array
(i.e., a number of separate seismic airguns
120
arranged in a 2-dimensional array) might have
physical dimensions of several tens of meters
110
and a source level (in terms of sound exposure)
of 220 dB re 1 μPa2s m (e.g., Erbe and King
100 2009). However, in situ measurements near the
0 10 20 30 40
Range [m] array may never exceed 190 dB re 1 μPa2s, except
in the immediate vicinity (<< 1 m) of an individ-
Fig. 4.10 Graph of sound pressure versus range, perpen- ual airgun. This is because the highest level that
dicular from a circular piston such as a loudspeaker with may be recorded is close to an individual airgun
radius 1 m, f ¼ 22 kHz, under water
in the array. The other airguns in the array are too
far away to significantly add to the level of any
frequency of the sound. In the far-field of the particular airgun (see Fig. 4.9). At short range
extended source, the sound waves from the sepa- from the array, the sound waves from some
rate point sources have traveled nearly the same airguns will add constructively and from others
distance and arrive in phase. The pressure ampli- destructively, so that the measured pressure
tude depends only on the range from the source amplitude is always less than the amplitude from
and decreases monotonically with increasing one airgun multiplied by the number of airguns in
range. The amplitudes of field quantities F and the array. Constructive superposition of sound
power quantities P decay with range r as: waves from all airguns only happens in the
far-field, where the pressure amplitude is reduced
1 1 due to propagation loss.
F ðr Þ  and Pðr Þ  2 in the far‐field:
r r
The range at which the field transitions from
near to far can be estimated as L2/ λ, where L is the 4.2.14 Frequency Weighting
largest dimension of the source and λ is the wave-
length of interest. (Fig. 4.10). Frequency weightings are mathematical functions
All sound sources have near- and far-fields. applied to sound measurements to compensate
The source level of a sound source is, in praxis, quantitatively for variations in the auditory sensi-
determined from measurements in the far-field by tivity of humans and non-human animals (see
correcting for propagation loss. In the example of Chap. 10 on audiometry). These functions
Fig. 4.10, the sound pressure level might be “weight” the contributions of different
measured as 126 dB re 1 μPa at 30 m range frequencies to the overall sound level,
from the source. A spherical propagation loss de-emphasizing frequencies where the subject’s
term (20 log 10 rr0 ¼ 30 dB ; red dashed line in auditory sensitivity is less and emphasizing
frequencies where it is greater. Frequency
Fig. 4.10) is then applied to estimate the radiated
weighting essentially applies a band-pass filter
noise level: 156 dB re 1 μPa m. This level is
to the sound. Weighting is applied before the
4 Introduction to Acoustic Terminology and Signal Processing 129

calculation of broadband SPLs or SELs. A num- 0


ber of weighting functions exist for different
purposes: for example, A, B, C, D, Z, FLAT, -20
and Linear frequency weightings to measure the

Gain [dB]
effect of noise on humans. However, at present,
-40
only weightings A, C, and Z are standardized
(International Electrotechnical Commission
2013). -60
A
C
-80 Z
4.2.14.1 A, C, and Z Frequency
Weightings 10 2 10 4
A, C, and Z frequency weightings are derived Frequency [Hz]
from standardized equal-loudness contours.
Fig. 4.11 Graph of A-, C-, and Z-weighting curves
These are curves which demonstrate SPL
variations over the frequency spectrum for
which constant loudness is perceived (Suzuki The function is tailored to the perception of
and Takeshima 2004). Loudness is the human low-level sounds and represents an idealized
perception of sound pressure. Loudness levels human 40-phon equal-loudness contour.
are measured in units of phons, determined from Measurements are noted as dB(A) or dBA.
referencing the equal-loudness contours. The The C-weighting function provides a better
number of phons n is equal in intensity to a representation of human auditory sensitivity to
1-kHz tone with an SPL of n dB. The equal- high-level sounds. This weighting is useful for
loudness contours were developed from human stipulating peak or impact noise levels and is
loudness perception studies (Fletcher used for the assessment of instrument and equip-
and Munson 1933; Robinson and Dadson 1956; ment noise.
Suzuki and Takeshima 2004) and are The Z-weighting function (also known as the
standardized (International Organization for zero-weighting function) covers a range of
Standardization 2003). Table 4.7 defines the A, frequencies from 8 Hz to 20 kHz (within 
C, and Z-weighting values at frequencies up to 1.5 dB), replacing the “FLAT” and “Linear”
16 kHz. Figure 4.11 displays the contours of the weighting functions. It adds no “weight” to
weightings. account for the auditory sensitivity of humans
A-weighting is the primary weighting function and is commonly used in octave-band
for environmental noise assessment. It covers a analysis to analyze the sound source rather than
broad range of frequencies from 20 Hz to 20 kHz. its effect.

Table 4.7 A, C, and Z-weighting values


Frequency [Hz] A-weighting [dB] C-weighting [dB] Z-weighting [dB]
63 26.2 0.8 0
125 16.1 0.2 0
250 8.6 0 0
500 3.2 0 0
1000 0 0 0
2000 1.2 0.2 0
4000 1 0.8 0
8000 1.1 3 0
16,000 6.6 8.5 0
130 C. Erbe et al.

4.2.14.2 Frequency Weightings evolve, reflecting the advancement in marine


for Non-human Animals mammal auditory sensitivity and response
Equal-loudness contours for non-human animals research, with the most recent modifications pro-
are very challenging to develop as it is difficult to posed by Southall et al. (2019), including a redef-
obtain the required data. Direct measurements of inition of marine mammal hearing groups,
equal loudness in non-human animals have only function assumptions, and parameters. The
been achieved for bottlenose dolphins (Tursiops updated functions are based on the following
truncatus; Finneran and Schlundt 2011); how- equation:
ever, equal-response-latency curves have been
generated from reaction-time studies and been W ð f Þ¼C
 2a
used as proxies for equal-loudness contours f
(Kastelein et al. 2011). Several functions applica- f1
þ10log 10 !
ble to the assessment of noise impact on marine  2 a  2 b
f f
mammals have also been developed similar to the 1þ f1 1þ f2
A-weighting function with adjustments for the
hearing sensitivity of different marine mammal ð4:4Þ
groups. Other weighting functions exist for other W( f ) is the weighting function amplitude
species. [dB] at frequency f [kHz]; f1 and f2 are the
low-frequency and high-frequency cut-off values
4.2.14.3 M-Weighting [kHz], respectively. Constants a and b are the
The M-weighting function was developed to low-frequency and high-frequency exponent
account for the auditory sensitivity of five func- values, defining the rate of decline of the
tional hearing groups of marine mammals weighting amplitude at low and high frequencies,
(Southall et al. 2007). Development of this func- and C defines the vertical position of the curve
tion was restricted by data availability and is (maximum weighting function amplitude is 0).
limited in its capacity to capture all complexities Table 4.8 lists the function constants for each
of marine mammal auditory responses (Tougaard marine mammal hearing group and Fig. 4.12
and Beedholm 2019). The function deemphasizes plots the weighting curves.
the frequencies near the upper and lower limits of
the auditory sensitivities of each hearing group,
emphasizing frequencies where exposure to high-
amplitude noise is more likely to affect the focal 4.2.15 Frequency Bands
species (Houser et al. 2017). M-weighted SEL is
calculated through energy integration over all Different sound sources emit sound at different
frequencies following the application of the frequencies and cover different frequency bands.
M-weighting function to the noise spectrum. The whistle of a bird is quite tonal, covering a
The M-weighting functions have continued to narrow band of frequencies. An echosounder

Table 4.8 Constants of Eq. 4.4 for the six functional hearing groups of marine mammals (Southall et al. 2019)
Marine mammal hearing group a b f1 [kHz] f2 [kHz] C [dB]
Low-frequency cetaceans (LF) 1 2 0.2 19 0.13
High-frequency cetaceans (HF) 1.6 2 8.8 110 1.20
Very-high-frequency cetaceans (VHF) 1.8 2 12 140 1.36
Sirenians (SI) 1.8 2 4.3 25 2.62
Phocid carnivores in water (PCW) 1 2 1.9 30 0.75
Phocid carnivores in air (PCA) 2 2 0.75 8.3 1.50
Other marine carnivores in water (OCW) 2 2 0.94 25 0.64
Other marine carnivores in air (OCA) 1.4 2 2 20 1.39
4 Introduction to Acoustic Terminology and Signal Processing 131

0
-3

-20
-10

-40 LF
Gain [dB]

Amplitude [dB]
HF
VHF
-60 SI
PCW
-80 PCA
OCW
OCA f10l f3l fp f3u f10u Frequency [Hz]
-100
Fig. 4.13 Illustration of the 3-dB and 10-dB bandwidths
10 -2 10 -1 10 0 10 1 10 2 of a signal; p: peak, l: lower, u: upper
Frequency [kHz]

Fig. 4.12 Weighting curves calculated from the function


power, and the rms bandwidth BWrms, which
W( f ) (Eq. 4.4) and constants (Table 4.8), for each marine measures the standard deviation about the center
mammal hearing group frequency. With H( f ) representing the Fourier
transform, these quantities are computed as
(Fig. 4.14):
emits a sharp tone, concentrating almost all
Z 1
acoustic energy in a narrow frequency band cen-
tered on one frequency. These are narrowband f jH ð f Þj2 df
1
sources, while a ship propeller is a broadband fc ¼ Z 1
source generating many octaves in frequency. jH ð f Þj2 df
1
The term frequency band refers to the band of vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
frequencies of a sound. The bandwidth is the uZ 1
u
u ð f  f c Þ2 jH ð f Þj2 df
difference between the highest and the lowest u 1
BW rms ¼u Z 1
frequency of a sound. The spectrum of a sound t
shows which frequencies are contained in the jH ð f Þj2 df
1
sound and the amplitude at each frequency.
Peak frequency and 3-dB bandwidth are often Broadband sounds are commonly analyzed in
used to describe the spectral characteristics of a specific frequency bands. In other words, the
signal. Peak frequency is the frequency of maxi- energy in a broadband sound can be split into a
mum power of the spectrum. The 3-dB bandwidth series of frequency bands. This splitting is done
is computed as the difference between the by a filter, which can be implemented in hardware
frequencies (on either side of the peak frequency), or software. A low-pass filter lets low frequencies
at which the spectrum has dropped 3 dB from its pass and reduces the amplitude of (i.e.,
maximum (Fig. 4.13). Remember that a drop of attenuates) signals above its cut-off frequency.
3 dB is equal to half power; and so the 3-dB A high-pass filter lets high frequencies pass and
bandwidth is the bandwidth at the half-power reduces the amplitude of signals below its cut-off
marks. Similarly, the 10-dB bandwidth is measured frequency. A band-pass filter passes signals
10 dB down from the maximum power (i.e., where within its characteristic pass-band (extending
the power has dropped to one tenth of its peak). from a lower edge frequency to an upper edge
For non-Gaussian spectra (e.g., bat or frequency) and attenuates signals outside of this
dolphin echolocation clicks), two other measures band. It is a common misconception that a filter
are useful: the center frequency fc, which splits removes all energy beyond its cut-off frequency.
the power spectrum into two halves of equal Instead, a filter progressively attenuates the
132 C. Erbe et al.

Fig. 4.14 Echolocation click from a harbor porpoise points at one tenth of the peak power (i.e., 10 dB below
(Phocoena phocoena); (a) waveform and amplitude enve- the maximum). Computation of the 90% energy signal
lope (determined by Hilbert transform), (b) cumulative duration was explained in Sect. 4.2.6. Three bandwidth
energy, and (c) spectrum. Three different duration measures are shown. The 3-dB and 10-dB bandwidths are
parameters (τ) are shown. The 3-dB duration is the differ- measured down from the maximum power, which occurs
ence in time between the two points at half power (i.e., at the peak frequency fp, and the rms bandwidth is
3 dB down from the maximum of the signal envelope). measured about the center frequency fc. Click recording
The 10-dB duration is the time difference between the courtesy of Whitlow Au

energy. At the cut-off frequency, the energy is Octave bands are exactly one octave wide,
typically reduced by 3 dB. Beyond the cut-off with an octave corresponding to a doubling of
frequency, the attenuation increases; how rapidly frequency. The upper edge frequency of an octave
depends on the order of the filter. band is twice the lower edge frequency of
Band-pass filtering is very common in the the band: fup ¼ 2 flow. Fractional octave bands
study of broadband sounds, in particular broad- are a fraction of an octave wide. One-third octave
band noise such as aircraft or ship noise. A num- bands are common. The center frequencies fc of
ber of band-pass filters are used that have adjacent adjacent 1/3 octave bands are calculated as
pass-bands such that the sound spectrum is split fc(n) ¼ 2n/3, where n counts the 1/3 octave
into adjacent frequency bands. If these bands all bands. The lower and upper frequencies of band
have the same width, then the filters are said to n are calculated as:
have constant bandwidth. In contrast, propor-
tional bandwidth filters split sound into adjacent ƒlow ðnÞ ¼ 21=6 f c ðnÞ and ƒup ðnÞ ¼ 21=6 f c ðnÞ
bands that have a constant ratio of upper to lower
Another example for proportional bands are
frequency. These bands become wider with
decidecades. Their center frequencies fc are
increasing frequency (e.g., octave bands).
4 Introduction to Acoustic Terminology and Signal Processing 133

Table 4.9 Center frequencies of adjacent 1/3 octave bands [Hz]. The table can be extended to lower and higher
frequencies by division and multiplication by 10, respectively
10 12.5 16 20 25 31.5 40 50 63 80
100 125 160 200 250 315 400 500 630 800
1000 1250 1600 2000 2500 3150 4000 5000 6300 8000
10,000 12,500 16,000 20,000 25,000 31,500 40,000 50,000 63,000 80,000

calculated as fc(n) ¼ 10n/10, where n counts the 4.2.17 Band Levels


decidecades. The lower and upper frequencies of
band n are calculated as: Band levels are computed over a specified fre-
quency band. Band levels can be computed from
ƒlow ðnÞ ¼ 101=20 f c ðnÞ spectral densities by integrating over frequency
ƒup ðnÞ ¼ 101=20 f c ðnÞ before converting to dB.
Consider the sketched mean-square sound
Decidecades are a little narrower than 1/3 pressure spectral density as a function of fre-
octaves by about 0.08%. Decidecades are often quency (Fig. 4.15). The band level Lp in the
erroneously called 1/3 octaves in the literature. band from flow to fup is the total mean-square
Given this confusion and inconsistencies in sound pressure in this band:
rounding, preferred center frequencies have been 0Z f 1
up
published (Table 4.9). 2
p f df C
B
B f C
Lp ¼ 10 log 10 B low2 C
@ pf f0 A
0
4.2.16 Power Spectral Density
 !
p2f f up  f low
The spectral density of a power quantity is the ¼ 10 log 10
p2f 0 f 0
average of that quantity within a specified fre-
!
quency band, divided by the bandwidth of that p2f
band. Spectral densities are typically computed ¼ 10 log 10
p2f 0
for mean-square sound pressure or sound expo-  
sure. Furthermore, spectral densities are most f up  f low
þ10 log 10
commonly computed in a series of adjacent f0
constant-bandwidth bands, where each band is
exactly 1 Hz wide. The spectral density then where the reference frequency f0 is 1 Hz. The
describes how the power quantity of a sound is band level of mean-square sound pressure is
distributed with frequency. The mean-square thus equal to the level of the average mean-square
sound pressure spectral density level is expressed sound pressure spectral density plus 10 log10 of
in dB: the bandwidth. The band level is expressed in dB
re 1 μPa2 in water. In the in-air literature, it is
!
p2f more common to take the square root and report
Lp;f ¼ 10 log 10 2 band levels in dB re 20 μPa. The frequency band
pf0
should always be reported as well.
The reference value p2f 0 is 1 μPa2/Hz in The wider the bands, the higher the band
levels, as illustrated for 1/12, 1/3, and 1 octave
water. In air, it is more common to take the square
bands in Fig. 4.16.
root and report spectral density in dB re
pffiffiffiffiffiffi
20 μPa= Hz.
134 C. Erbe et al.

90
octave band levels
80

1/3 octave band levels

Band Level [dB re 1 Pa 2 ]


70
1/12 octave band levels
60

50
spectral density levels
40 (1 Hz band levels)

30

Fig. 4.15 Graph of mean-square pressure spectral density


20
(blue) and its average p2f (red) in the frequency band from
10 2 10 3 10 4 10 5
flow to fup
Frequency [Hz]

Fig. 4.16 Illustration of band levels versus spectral den-


sity levels, for the example of wind-driven noise under
4.3 Acoustic Signal Processing water at Sea State 2. Band levels are at least as high as the
underlying spectral density levels. There are twelve 1/12-
4.3.1 Displays of Sounds octave bands in each octave, and three 1/3-octave bands.
The wider the band, the higher the level, because more
power gets integrated
A signal can be represented in the time domain
and displayed as a waveform, or in the frequency
domain and displayed as a spectrum. Waveform different spectrum to the previous repetitive
plots typically have time on the x-axis and ampli- signals, with a maximum at zero frequency and
tude on the y-axis. Waveform plots are useful decaying in a series of ripples (known as
for analysis of short pulses or clicks. Before sidelobes) that decrease in amplitude as frequency
the common use of desktop computers, acoustic increases. It turns out that the shorter the pulse is,
waveforms were commonly displayed by the wider is the initial spectral peak. Also, the
oscilloscopes (or oscillographs). The display of faster the rise and fall times are, the more pro-
the waveform was called an oscillogram. Power nounced the sidelobes are and the slower they
spectra are typically displayed with frequency on decay. Panel (d) shows the waveform and spec-
the x-axis and amplitude on the y-axis. trum of a 1-kHz sinusoidal signal that has been
A few examples of waveforms and their spec- amplitude-modulated by the pulse shown in (c).
tra are shown in Fig. 4.17.2 A constant-wave The effect of this is to shift the spectrum of the
sinusoid (a) has a spectrum consisting of a single pulse so that what was at zero frequency is now at
spike at the signal’s fundamental frequency, in the fundamental frequency of the sinusoid, and to
this case 1 kHz. The signal shown in (b) has the mirror it around that frequency. Another way of
same fundamental frequency of 1 kHz, but its thinking about this is that the effect of truncating
spectrum shows additional overtones at integer the sinusoid is to broaden its spectrum from the
multiples of the fundamental that are due to its spike shown in (a). The effect of changing the
more complicated shape. A pulse (c) has a quite frequency during the burst can be seen in (e). In
this case, the frequency has been swept from
500 Hz to 1500 Hz over the 10-ms burst duration.
2
Dan Russell’s animations of the Fourier compositions of This has the effect of broadening the spectrum
different waveforms: https://www.acs.psu.edu/drussell/ and smoothing out the sidelobes that were
Demos/Fourier/Fourier.html; accessed 12 October 2020.
4 Introduction to Acoustic Terminology and Signal Processing 135

Fig. 4.17 Examples of signal waveforms (left) and their 10-ms long tone burst with a center frequency of 1000 Hz
spectra (right). (a) A sine wave with a frequency of and 2-ms rise and fall times; (e) a 10-ms long FM sweep
1000 Hz; (b) a signal consisting of a sine wave with a from 500 Hz to 1500 Hz with 2-ms rise and fall times; and
fundamental frequency of 1000 Hz and five overtones; (c) (f) uncorrelated (white) random noise
a 10-ms long pulse with 2-ms rise and fall times; (d) a

apparent in (d). Finally, (f) shows a waveform information about what it will be at any other
consisting of uncorrelated noise and its spectrum. time instant. This type of noise is often called
In this context “uncorrelated” means that knowl- white noise because it has a flat spectrum (like
edge of the noise at one time instant gives no white light), but as can be seen in this example,
136 C. Erbe et al.

the spectrum of any particular white noise signal frequency in the original signal. H(f) is a complex
is itself quite noisy and it is only flat if one function and the argument contains the phase of
averages the spectra of many similar signals, or that frequency. The inverse Fourier transform
alternatively the spectra of many segments of the recreates the original signal from its Fourier
same signal. components. For a continuous function with
A spectrogram is a plot with, most commonly, t representing time and f representing frequency,
time on the x-axis and frequency on the y-axis. A the Fourier transform is (i is the imaginary unit):
quantity proportional to acoustic power is Z 1
displayed by different colors or gray levels. If Hð f Þ ¼ hðt Þe2πift dt
properly calibrated, a spectrogram will show 1

mean-square sound pressure spectral density. A and the inverse Fourier transform is:
spectrogram is computed as a succession of Z 1
Fourier transforms. A window is applied in the
hð t Þ ¼ H ð f Þe2πift df
time domain containing a fixed number of 1
samples of the digital time series. The Fourier
transform is computed over these samples. While a sound wave might be continuous,
Amplitudes are squared to yield power. The during digital recording or digitization of an ana-
power spectrum is then plotted as a vertical col- logue recording, its instantaneous pressure is
umn with frequency on the y-axis. The window in sampled at equally spaced times over a finite
the time domain is then moved forward in time window in time. This results in a finite and dis-
and the next samples of the digital time series are crete time series. The equations for the discrete
taken and Fourier-transformed. This second spec- Fourier transform are similar to the above, where
trum is then plotted next to the first spectrum, as the integrals are replaced by summations. The fast
the second vertical column in the spectrogram. Fourier transform (FFT) is the most common
The window in the time domain is moved again, mathematical algorithm for computing the dis-
the third Fourier transform is computed and crete Fourier transform. In animal bioacoustics,
plotted as the third column of the spectrogram, the FFT is the most commonly used algorithm to
and so forth (see examples in Fig. 4.2). The spec- compute the frequency spectrum of a sound. The
trogram, therefore, shows how the spectrum of a most common display of the frequency spectrum
sound changes over time. With modern signal is as a power spectrum. Here, the amplitudes H(f)
processing software, researchers are able to listen are squared and in this process, the phase infor-
to the sounds in real-time while viewing the spec- mation is lost and, therefore, the original time
tral patterns. series cannot be recreated. If sufficient care is
taken to properly preserve the phase information,
it is not only possible, but often very convenient,
4.3.2 Fourier Transform to transform a signal into the frequency domain
using the FFT, carry out processing (such as
It turns out that any signal can be broken down filtering) in this domain, and then use an inverse
into a sum of sine waves with different FFT to resynthesize the processed signal in the
amplitudes, frequencies, and phases. This is time domain.
done by the Fourier transform, named after
French mathematician and physicist Joseph
Fourier. While the original signal can be 4.3.3 Recording and FFT Settings
represented as a time series h(t) (e.g., sound pres-
sure p(t)) in the time domain, the Fourier trans- Sounds in the various displays can look rather
form transforms the signal into the frequency different depending on the recording and analysis
domain, where it is represented as a spectrum parameters. There is no set of parameters that will
H(f). The magnitude of H is the amount of that produce the best display for all sounds. Rather,
4 Introduction to Acoustic Terminology and Signal Processing 137

Pressure [Pa]
0.5

-0.5

-1
0 0.2 0.4 0.6 0.

Fig. 4.18 Waveforms of a 1-Hz sine wave (black) and a red samples fit either sine wave. In fact, there is an infinite
9-Hz sine wave (blue), both sampled 8 times per second number of signals that fit these samples
(i.e., fs ¼ 8 Hz) as indicated by the red circles. Note that the

the ideal parameters depend on the question being 4.3.3.2 Aliasing


asked, and it is important to have a thorough Aliasing is a phenomenon that occurs due to
understanding of each of the parameters or select- sampling. A continuous acoustic wave is digitally
able settings, and how they interact. recorded by sampling at a sampling frequency fs
and storing the data as a time series p(t). It turns
4.3.3.1 Sampling Rate out that different signals can produce the identical
Microphones and hydrophones produce continu- time series p(t) and are therefore called aliases of
ous voltages in response to sounds. The voltage each other. In Fig. 4.18, pblack(t) has a frequency
outputs are termed analogue in that they are direct fblack ¼ 1 Hz, while pblue(t) has a frequency
analogues of the acoustic signal. Analogue-to- fblue ¼ 9 Hz. A recorder that samples at fs ¼ 8 Hz
digital converters sample the voltages of the sig- would measure the pressure as indicated by the
nal and the level is expressed as a number (a digit) red circles from either the red or the blue time
for each of the samples. The sampling rate is the series. Based on the samples only, it is impossible
number of samples per second and its unit is to tell which was the original time series. In fact,
1/s. The inverse is called the sampling frequency there is an infinite number of signals that fit these
(symbol: fs; unit: Hz). Music on commercial CDs samples. If f0 is the lowest frequency that fits
is digitized at 44.1 kHz (i.e., there are 44,100 these samples, then the frequency of the nth alias
samples stored every second). At high sampling is fa(n), with n being an integer number:
rates, the digital sound file becomes very large for
f a ð nÞ f
long-duration sound. The rate at which sounds are ¼ 0þn
fs fs
sampled by a digital recorder is typically stored in
the header of the sound file. This file is a list of
numbers with each number being the sound pres- The most common problem of aliasing in
sure at that sample point. Digital sound files are animal bioacoustics occurs if a high-frequency
an incomplete record of the original signal; the animal sound is recorded at too low a sampling
intervals in the original signal between samples frequency. After FFT, the spectrum or spectro-
are lost during digitizing. The result is that there is gram displays a sound at an erroneously low
a maximum frequency (related to the sampling frequency. The Nyquist frequency (named after
rate) that can be resolved during Fourier analysis. Harry Nyquist, a Swedish-born electronic
Imagine a low-frequency sine wave. Only a few engineer) is the maximum frequency that
samples are needed to determine its frequency can be determined and is equal to half the
and amplitude and to recreate the full sine wave sampling frequency. This requires some a
(by interpolation) from its samples. Those few priori information of the sounds to be recorded
samples might not be enough if the frequency is before a recording system is put together. The
higher.
138 C. Erbe et al.

Fig. 4.19 Examples of folding (aliasing). Top: A killer upsweeps greater than the Nyquist frequency appear as
whale sound sampled at 96 kHz (a) and at 32 kHz (b) downsweeps. Bottom: Humpback whale (Megaptera
(Wellard et al. 2015). If no anti-aliasing filter is applied, novaeangliae) notes recorded with a sampling frequency
frequencies above the Nyquist frequency (i.e., 16 kHz in of 6 kHz, but without an anti-aliasing filter. Contours
the right panel) will appear reflected downwards; above 3 kHz appear mirrored about the 3-kHz edge

higher the sampling frequency is, the higher the 32 kHz. Without an anti-aliasing filter, energy is
maximum frequency that can be accurately mirror-inverted or reflected about the Nyquist
digitized. frequency of 16 kHz in the second case.
In praxis, in order to avoid higher frequencies Conceptually, energy is folded down about the
of animal sounds being erroneously displayed Nyquist frequency by as much as it was above the
and interpreted as lower frequencies, an anti- Nyquist frequency.
aliasing filter is employed in the recording
system. This is a low-pass filter with a cut-off 4.3.3.3 Bit Depth
frequency below the Nyquist frequency. When a digitizer samples a sound wave (or the
Frequencies higher than the Nyquist frequency voltage at the end of a microphone), it stores the
are thus attenuated, so that the effect of aliasing pressure measures with a limited accuracy. Bit
is diminished. depth is the number of bits of information in
An example of aliasing is given in Fig. 4.19. each sample. The more bits, the greater the reso-
Spectrograms of the same killer whale (Orcinus lution of that measure (i.e., the more accurate the
orca) call are shown sampled at 96 kHz and at pressure measure). Inexpensive sound digitizers
4 Introduction to Acoustic Terminology and Signal Processing 139

use 12 bits per sample. Commercially available a time series consisting of real (i.e., not complex)
CDs store each sample with 16 bits of storage, numbers, the same result is obtained by doubling
which allows greater accuracy in records of pres- the squared amplitudes of the positive frequencies
sure. Blue-ray discs typically use 24 bits per and discarding the negative frequencies. This
sample. The more bits per sample, the larger the means that NFFT samples in the time domain
sound file to be stored, but the larger the dynamic yield NFFT/2 measures in the frequency domain.
range (ratio of loudest to quietest) of sounds that The FFT values, and therefore the power spec-
can be captured. trum calculated from them, are output at a fre-
quency spacing:
4.3.3.4 Audio Coding
fs
Audio coding is used to compress large audio Δf ¼
NFFT
files to reduce storage needs. A common format
is MP3, which can achieve 75–95% file reduction For example, if a sound recording was sam-
compared to the original time series stored on a pled at 44.1 kHz and the FFT was computed over
CD or computer hard drive. Most audio coding NFFT ¼ 1024 samples, then the frequency
algorithms aim to reduce the file size while spacing would be 43.07 Hz and the power spec-
retaining reasonable quality for human listeners. trum would contain 512 frequencies: 43.07 Hz,
The MP3 compression algorithm is based on per- 86.14 Hz,. . ., 22,050 Hz. A different way of
ceptual coding, optimized for human perception, looking at this is that the FFT produces spectrum
ignoring features of sound that are beyond normal levels in frequency bands of constant bandwidth.
human auditory capabilities. Playing MP3 files And the center frequencies in this example are
back to animals might result in quite different 43.07 Hz, 86.14 Hz,. . ., 22,050 Hz. If there were
perception compared to the playback of the origi- two tones at 30 Hz and 50 Hz, then the combina-
nal time series. Unfortunately, this is very often tion of recording settings ( fs ¼ 44.1 kHz) and
ignored in animal bioacoustic experiments. analysis settings (NFFT ¼ 1024) would be unable
Lossless compression does exist (e.g., Free to separate these tones. Their power would be
Lossless Audio Codec, FLAC; see Chap. 2 on added and reported as the single level in the
recording equipment). For animal bioacoustics frequency band centered on 43.07 Hz. To sepa-
research, it is best to use lossless compression or rate these two tones, a frequency spacing of no
none at all. more than 20 Hz is required. This is achieved by
increasing NFFT. To yield a 1-Hz frequency
4.3.3.5 FFT Window Size (NFFT) spacing, 1 s of recording needs to be read into
During Fourier analysis of a digitized sound the FFT; i.e., NFFT ¼ fs  1 s.
recording, a fixed number of samples of the origi- As the NFFT increases, the frequency spacing
nal time series is read and the FFT is computed on decreases, but at the cost of the temporal resolution.
this window of samples. The number of samples This is because an increase in NFFT means that
is a parameter passed to the FFT algorithm and is more samples from the original time series are read
typically represented by the variable NFFT. If in order to compute one spectrum. More samples
NFFT samples are read from the original time implies that the time window over which the spec-
series, then the Fourier transform will produce trum is computed increases. In the above example,
amplitude and phase measures at NFFT with fs ¼ 44.1 kHz, NFFT ¼ 1024 samples corre-
frequencies. However, the FFT algorithm spond to a time window Δt of 0.023 s:
produces a two-sided spectrum that is symmetri-
NFFT 1
cal about 0 Hz and contains NFFT/2 positive Δt ¼ ¼
fs Δf
frequencies and NFFT/2–1 negative frequencies.
To compute the power spectrum, after FFT, the While 44,100 samples last 1 s, 1024 samples
amplitudes of all frequencies (positive and nega- only last 0.023 s. The spectrum is computed over
tive) are squared and summed. In the usual case of
140 C. Erbe et al.

a time window of 0.023 s length. If the recording (NFFT) should be optimized for the sounds of
contained dolphin clicks of 100 μs duration, then interest.
the spectrum would be averaging over multiple
clicks and ambient noise. To compute the spec-
4.3.3.6 FFT Window Function
trum of one click, a time window of 100 μs is
The computation of a discrete Fourier transform
desired and corresponds to NFFT ¼ fs 
over a finite window of samples produces spectral
100 μs ¼ 4. This is a very short window. The
leakage, where some power appears at
resulting frequency spacing would be impracti-
frequencies (called sidelobes) that are not part of
cally coarse:
the original time series but rather due to the length
fs 44, 100 Hz and shape of the window. If a window of samples
Δf ¼ ¼ ¼ 10, 000 Hz
NFFT 4 is read off the time series and passed straight into
the FFT, then the window is said to have rectan-
There is a trade-off between frequency
gular shape. The rectangular window function has
spacing and time resolution in Fourier spectrum
values of 1 over the length of the window and
analysis. This is often referred to as the Uncer-
values of 0 outside (i.e., before and after). The
tainty Principle (e.g., Beecher 1988): Δf Δt ¼ 1.
window function is multiplied sample by sample
In spectrograms, using a large NFFT will result in
with the original time series so that NFFT values
sounds looking stretched out in time, while a
of unaltered amplitude are passed to the FFT
small NFFT will result in sounds looking
algorithm. A rectangular window produces a
smudged in frequency. The combination of
large number of sidelobes (Fig. 4.20).
recording settings ( fs) and analysis settings

Fig. 4.20 Comparison of some window functions (left) and their Fourier transforms (right) for (a) rectangular, (b) Hann,
(c) Hamming, and (d) Blackman-Harris windows
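The difference between window shapes can be checked numerically. The sketch below (purely illustrative; the tone frequency, window length, and the frequency at which leakage is read off are arbitrary choices) computes the spectrum of a tone with a rectangular window and with a Hann window and reports how much power leaks to a frequency about 2 kHz away from the tone.

```python
import numpy as np

fs = 44100                 # sampling frequency (Hz)
nfft = 1024                # FFT window size (samples)
t = np.arange(nfft) / fs
x = np.sin(2 * np.pi * 1000.3 * t)   # a tone that does not fall exactly on an FFT bin

for name, w in (("rectangular", np.ones(nfft)), ("Hann", np.hanning(nfft))):
    spec = np.fft.rfft(x * w)              # one-sided spectrum of the windowed block
    p = np.abs(spec) ** 2
    p_db = 10 * np.log10(p / p.max())      # normalize to the spectral peak
    f = np.fft.rfftfreq(nfft, d=1 / fs)
    leak = p_db[np.argmin(np.abs(f - 3000))]   # leakage ~2 kHz away from the tone
    print(f"{name:12s}: level near 3 kHz = {leak:6.1f} dB below the peak")
```

The Hann-windowed spectrum shows far lower sidelobe levels, at the cost of a slightly wider main lobe, in line with the discussion above.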
Spectral leakage can be reduced by using non-rectangular windows such as Hann, Hamming, or Blackman-Harris windows. These have values of 1 in the center of the window, but then taper off toward the edges to values of 0. The amplitude of the original time series is thus weighted. The benefits are fewer and weaker sidelobes, which result in less spectral leakage.

The smallest difference in frequency between two tones that can be separated in the spectrum is called the frequency resolution and is determined by the width of the main lobe of the window function. There is therefore a trade-off between the reduction in sidelobes and a wider main lobe, which results in poorer frequency resolution.

In order to not miss a strong signal or strong amplitude at the edges of the window, where the amplitude is weighted by values close to 0, overlapping windows are used. Rather than reading samples in adjacent windows, windows commonly have 50% overlap. A spectrogram that was computed with 50% overlapping windows will have twice the number of spectrum columns and appear to have finer time resolution. Each spectrum column still has the same Δt as for a spectrogram without overlapping windows, but there will be twice as many spectrum columns making the spectrogram appear finer in time.

Zeros can be appended to each signal block (after windowing) to increase NFFT and therefore reduce the frequency spacing Δf. This so-called zero-padding produces a smoother spectrum but does not improve the frequency resolution, which is still determined by the shape of the window and the duration of the signal to which the window was applied.

4.3.4 Power Spectral Density Percentiles and Probability Density

When recording soundscapes on land or under water, sounds fade in and out, from a diversity of sources and locations. A soundscape is dynamic, changing on short to long time scales (see Chap. 7). The variability in sound levels can be expressed as power spectral density (PSD) percentiles. The nth percentile gives the level that is exceeded n% of the time (note: in engineering, the definition is commonly reversed). The 50th percentile corresponds to the median level. An example from the ocean off southern Australia is shown in Fig. 4.21.
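PSD percentiles of this kind can be computed from a long-term spectrogram. The following sketch is only an illustration of the idea (it fabricates a stand-in recording; in practice a calibrated pressure time series in μPa and its sampling frequency would be loaded from file), using scipy to compute PSDs in consecutive 1-s windows and taking percentiles across time for each frequency.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 48000                          # sampling frequency (Hz)
x = np.random.randn(60 * fs)        # stand-in for a calibrated recording (uPa)

# PSD in 1-s Hann windows with 50% overlap; Sxx has units of uPa^2/Hz
f, t, Sxx = spectrogram(x, fs=fs, window="hann", nperseg=fs, noverlap=fs // 2,
                        scaling="density", mode="psd")
Sxx_db = 10 * np.log10(Sxx)         # dB re 1 uPa^2/Hz

# Here the nth percentile is the level exceeded n% of the time,
# i.e., the (100 - n)th percentile in the usual numpy convention.
for n in (1, 5, 25, 50, 75, 95, 99):
    level = np.percentile(Sxx_db, 100 - n, axis=1)
    print(f"{n:2d}th percentile at {f[10]:.0f} Hz: {level[10]:.1f} dB re 1 uPa^2/Hz")
```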
Fig. 4.21 Percentiles of ambient noise power spectral densities measured off southern Australia over a year. Lines from top to bottom correspond to the following percentiles: 1, 5, 25, 50 (black), 75, 95, and 99

The median ambient noise level is represented by the thin black line and goes from about 90 dB re 1 μPa2/Hz at 20 Hz to 60 dB re 1 μPa2/Hz at 30 kHz. The lowest thin gray line corresponds to the 99th percentile. It gets quieter than this only 1% of the time. Levels at low frequencies (20–50 Hz) never drop below
75 dB re 1 μPa2/Hz because of the persistent noise from distant shipping.

These plots not only give the statistical level distribution over time, but can also identify the dominant sources in a soundscape based on the shapes of the percentile curves. The hump from 100 Hz to lower frequencies is characteristic of distant shipping. The more leveled curves at mid-frequencies (200–800 Hz) are characteristic of wind noise recorded under water. The median level of about 68 dB re 1 μPa2/Hz corresponds to a Sea State of 4. The hump at 1.2 kHz is characteristic of chorusing fishes. While there are likely other sounds in this soundscape at certain times (e.g., nearby boats or marine mammals), they do not occur often enough or at a high enough level to stand out in PSD percentile plots.

Probability density of PSD identifies the most common levels. In Fig. 4.21, at 100 Hz, the most common (probable) level was 75 dB re 1 μPa2/Hz. This was equal to the median level at this frequency. The red colors indicate that the median levels were also the most probable levels. At mid-to-high frequencies, the levels were more evenly distributed (i.e., only shades of blue and no red colors). The most probable levels are not necessarily equal to the median levels. A case where the most probable level (again from distant shipping) was below the median (due to strong pygmy blue whale, Balaenoptera musculus brevicauda, calling) is shown in Fig. 4.6, and a case where two different levels were equally likely (due to two seismic surveys at different ranges) is shown in Fig. 4.8, both of Erbe et al. 2016a.3 PSD percentile and probability density plots (as well as other graphs) can be created for both terrestrial and aquatic environments with the freely available software suite by Merchant et al. 2015.

3 https://www.acoustics.asn.au/conference_proceedings/AASNZ2016/papers/p14.pdf; accessed 13 October 2020.

4.4 Localization and Tracking

There are a few simple ways to gain information about the rough location and movement of a sound source. By listening in air with two ears, we can tell the direction to the sound source and whether it remains at a fixed location or approaches or departs. From recordings made over a period of time, the closest point of approach (CPA) is often taken as the point in time when mean-square pressure (or some other acoustic quantity like particle displacement, velocity, or acceleration) peaked (Fig. 4.22).

Whether a sound source is approaching or departing can also be told from the Doppler shift. As a car or a fire engine drives past and as an airplane flies overhead, the pitch drops. In fact, as each approaches, the frequency received by a listener or a recorder is higher than the emitted frequency, and as each departs, the received frequency is lower than the emitted frequency.4 At CPA, the received frequency equals the emitted frequency. The time of CPA can be identified in spectrograms as the point in time when the steepest slope in the decreasing frequency occurred as the sound source passed or as the point in time when the frequency had decreased half-way (Fig. 4.23). The Doppler shift Δf can easily be quantified as

Δf = (v / c) f0

where v is the speed of the source relative to a fixed receiver, c is the speed of sound, and f0 is the frequency emitted by the source (i.e., half-way between the approaching and the departing frequencies). From a spectrogram, not only the CPA, but also the speed of the sound source can be determined.

4 Doppler shift animations by Dan Russell: https://www.acs.psu.edu/drussell/Demos/doppler/doppler.html; accessed 13 October 2020.

In the example of Fig. 4.23, one of the engine harmonics dropped from 96 Hz to 64 Hz. So the emitted frequency was 80 Hz and the Doppler shift was 16 Hz. With a speed of sound in air of 343 m/s, the airplane flew at approximately 70 m/s (250 km/h).
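The arithmetic of this example takes only a few lines. The sketch below simply applies the Doppler relation given above; the 96-Hz and 64-Hz values are the frequencies read off the spectrogram of Fig. 4.23, and the speed of sound is the nominal in-air value.

```python
# Source speed from the approaching and departing frequencies of one harmonic.
c = 343.0            # speed of sound in air (m/s)
f_approach = 96.0    # received frequency while approaching (Hz)
f_depart = 64.0      # received frequency while departing (Hz)

f0 = (f_approach + f_depart) / 2   # emitted frequency (Hz)
delta_f = f_approach - f0          # Doppler shift (Hz)
v = c * delta_f / f0               # source speed (m/s)
print(f"f0 = {f0:.0f} Hz, Doppler shift = {delta_f:.0f} Hz, "
      f"speed = {v:.0f} m/s = {v * 3.6:.0f} km/h")
```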
Fig. 4.22 Graphs of (a) square pressure [dB re 1 μPa2], (b) square particle displacement [dB re 1 pm2], (c) square particle velocity [dB re 1 (nm/s)2], and (d) square particle acceleration [dB re 1 (μm/s2)2] as a swimmer swims over a hydrophone. The closest point of approach is identified as the time of peak levels (i.e., at 42 s) (Erbe et al. 2017a)

The interesting part of this example is that the recorder was actually resting on the riverbed, in 1 m of water, and hence in a different acoustic medium to the source. How this affects the results
depends on the depth of the hydrophone relative to the acoustic wavelength. In this particular instance, the hydrophone was a small fraction of an acoustic wavelength below the water surface and the signal reached it via the evanescent wave (see Chap. 6 on sound propagation). The evanescent wave traveled horizontally at the in-air sound speed, so it was the in-air sound speed that determined the Doppler shift. If the measurement had been carried out in deeper water with a deeper hydrophone, the signal would have been dominated by the air-to-water refracted wave, and the Doppler shift would have been determined by the in-water sound speed.

Fig. 4.23 Spectrogram of an airplane flying over the Swan River, Perth, Australia, into Perth Airport. Recordings were made in the river, under water. The closest point of approach occurred at about 18 s, when the frequencies of the engine tone and its overtones dropped fastest (Erbe et al. 2018)

To accurately locate a sound source in space, signals from multiple simultaneous acoustic receivers need to be analyzed. These receivers are placed in specific configurations, known as arrays. Methods of localization are dependent on the configuration of the receiver array, the acoustic environment, spectral characteristics of the sound, and behavior of the sound source. There are three broad classes of these methods: time difference of arrival, beamforming, and parametric array processing methods. The following sections provide a condensed overview of the three methods. For a comprehensive treatise, please refer to the following: Schmidt 1986; Van Veen and Buckley 1988; Krim and Viberg 1996; Au and Hastings 2008; Zimmer 2011; Chiariotti et al. 2019.

Tracking is a form of passive acoustic monitoring (PAM), where an estimation of the behavior of an active sound source is maintained over time. Passive acoustic tracking has many demonstrated applications in the underwater and terrestrial domains.
Fig. 4.24 Determining TDOA by cross-correlation. Top: Two 100-ms time series were recorded by two spatially separated receivers. A signal of interest arrived 20 ms into the recording at receiver 1 (red) and 40 ms into the recording at receiver 2 (blue). The dot product (i.e., correlation coefficient) is low. Bottom: The red time series is shifted sample by sample against the blue time series and the dot product computed over the overlapping samples. When the signals line up, the correlation coefficient is maximum. In this example, the TDOA was 20 ms
4.4.1 Time Difference of Arrival

Localization by Time Difference Of Arrival (TDOA) is a two-step process. The first step is to measure the difference in time between the arrivals of the same sound at any pair of acoustic receivers. The second step is to apply appropriate geometrical calculations to locate the sound source. TDOA methods work best for signals that contain a wide range of frequencies (i.e., have a wide bandwidth), which includes short pulses, FM sweeps, and noise-like signals.

4.4.1.1 Generalized Cross-Correlation
TDOAs are commonly determined by cross-correlation. The time series of recorded sound pressure by two spatially separated receivers are cross-correlated as a sliding dot product. This means that each sample from receiver 1 is multiplied with a corresponding sample from receiver 2, and the products are summed over the full length of the overlapping time series. This yields the first cross-correlation coefficient. Next, the time series from receiver 1 (red in Fig. 4.24) is shifted by 1 sample against the time series from receiver 2 (blue), and the dot product is computed again (over the overlapping samples), yielding the second cross-correlation coefficient. By sliding the two time series against each other (sample by sample) and computing the dot product, a time series of cross-correlation coefficients forms. A peak in cross-correlation occurs when the time series have been shifted such that the signal recorded by receiver 1 lines up with the signal recorded by receiver 2. The number of samples by which the time series were shifted, divided by the sampling frequency of the two receivers, is the TDOA.

Generalized cross-correlation is a common way of determining TDOA. It is suitable for localization in air and water in environments with high noise and reverberation and can be computed in either the time or frequency domains (Padois 2018).
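A minimal numerical version of this procedure is sketched below (illustrative only; the pulse shape, noise level, and arrival times are invented to mimic the situation of Fig. 4.24). It builds two noisy recordings of the same transient with a known offset and recovers the TDOA from the peak of the cross-correlation.

```python
import numpy as np
from scipy.signal import correlate, correlation_lags

fs = 96000                                    # sampling frequency (Hz)
t = np.arange(int(0.1 * fs)) / fs             # 100-ms recordings
pulse = np.exp(-((t - 0.001) / 0.0002) ** 2) * np.sin(2 * np.pi * 5000 * t)

arrival1, arrival2 = 0.020, 0.040             # arrivals ~20 ms and ~40 ms into the files
x1 = np.roll(pulse, int(arrival1 * fs)) + 0.05 * np.random.randn(t.size)  # receiver 1
x2 = np.roll(pulse, int(arrival2 * fs)) + 0.05 * np.random.randn(t.size)  # receiver 2

cc = correlate(x2, x1, mode="full")           # sliding dot product
lags = correlation_lags(x2.size, x1.size, mode="full")
tdoa = lags[np.argmax(cc)] / fs               # lag of the correlation peak, in seconds
print(f"estimated TDOA = {tdoa * 1e3:.2f} ms (expected {1e3*(arrival2-arrival1):.0f} ms)")
```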
Fig. 4.25 Graphs of localization hyperbolas with two receivers; (a) 3D hyperboloid and (b) 2D hyperbola (i.e., cross-section) in the x-z plane. A marks the animal’s position; R1 and R2 mark the receiver positions. R2 is hidden inside the hyperboloid in the 3D image

4.4.1.2 TDOA Hyperbolas
TDOAs are always computed between two receivers (from a pair of receivers). Figure 4.25 sketches the arrangement of an animal A (at point A) and two receivers (R1 and R2) in space. The
distances A-R1 (mathematically noted as a line connecting points A and R1 and then taking the magnitude of it: |AR1|), A-R2, and R1-R2 are shown as red lines. If A produces a sound that is recorded by both R1 and R2, then the arrival time at point R1 is equal to the distance A-R1, divided by the speed of sound c, and the arrival time at R2 is equal to the distance A-R2, divided by the speed of sound c. The TDOA is simply the difference between the two arrival times:

TDOA = ( |AR1| − |AR2| ) / c

It turns out mathematically that the animal can be anywhere on the hyperboloid and the TDOA will be the same. In other words, the TDOA defines a surface (in the shape of a hyperboloid) on which the animal may be located. With two receivers in the free-field, the animal’s position cannot be specified further. If there are boundaries near the animal and/or receivers (e.g., if a bird is tracked with receivers on the ground), then the possible location of the animal can be easily limited (i.e., the bird cannot fly underground, eliminating half of the space). Reflections off boundaries can also be used to refine the location estimate. Finally, if one deploys more than two receivers, TDOAs can be computed between all possible pairs of receivers, yielding multiple hyperboloids that will intersect at the location of the animal.

4.4.1.3 TDOA Localization in 2 Dimensions
Localization in 2D space is, of course, simpler than in 3D, though it might seem a little contrived. In Fig. 4.26, the airport arrival flight path goes straight over a home. TDOA is used to locate (and perhaps track) each airplane. Two receivers on the ground will yield the upper half of the hyperbola in Fig. 4.25b as possible airplane locations. We know the airplane cannot be underground, but in terms of its altitude and range, two receivers are unable to resolve these. A third receiver in line with R1 and R2 is needed. With three receivers in a line array, three TDOAs can be computed and three hyperbolas can be drawn.
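The geometry is easy to check numerically. The sketch below (all coordinates are invented purely for illustration) places a source above a three-microphone line array on the ground and computes the three pairwise TDOAs directly from the distance formula given above.

```python
import numpy as np
from itertools import combinations

c = 343.0                                                    # speed of sound in air (m/s)
mics = np.array([[-10.0, 0.0], [0.0, 0.0], [10.0, 0.0]])     # three mics on the ground (x, z)
source = np.array([25.0, 120.0])                             # hypothetical airplane position (x, z)

ranges = np.linalg.norm(source - mics, axis=1)               # distances source-to-microphone
for (i, ri), (j, rj) in combinations(enumerate(ranges), 2):
    tdoa = (ri - rj) / c                                     # TDOA for this receiver pair
    print(f"TDOA between mic {i + 1} and mic {j + 1}: {tdoa * 1e3:+.2f} ms")
```

Each pair of receivers yields one TDOA, and each TDOA constrains the source to one hyperbola; their intersection (here, together with the knowledge that the source is above ground) gives the location.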
Fig. 4.26 Sketches of a three-microphone line array (a) and a triangular array (b)
Fig. 4.27 Sketches of seafloor-mounted arrays with 4 (a) and 5 (b) hydrophones

Any two of these hyperbolas will intersect at two points: one above and one below the x-axis (i.e., above and below ground). Knowing that the airplane is above ground allows its position to be uniquely determined. If there were no boundary (i.e., ground in this case), an up-down ambiguity would remain; the plane could be at either of the two intersection points. Using more than three receivers in a line array (and thus adding more TDOAs and hyperbolas) will not improve the localization capability as all hyperbolas will intersect in the same two points: one above and one below the array. The up-down ambiguity can be resolved by using a 2D rather than 1D (i.e., line) arrangement. If one microphone is moved away from the line (as in Fig. 4.26b), the TDOA hyperbolas will intersect in just one point: the exact location of the airplane.

4.4.1.4 TDOA Localization in 3 Dimensions
The more common problem is to localize sound sources in 3D space, i.e., when the sound source and the receivers are not in the same plane. Here, a line array of at least three receivers will result in hyperboloids that intersect in a circle. No matter how many receivers are in the line array, all TDOA hyperboloids will intersect in the same circle. There is up-down and left-right, in fact, circular ambiguity about the line of receivers. This is a common situation with line arrays towed behind a ship in search of marine fauna. In order to improve localization, a fourth receiver is needed that is not in line with the others. With four receivers, three hyperboloids can be computed that will intersect in two points: one above the plane of receivers and one below, yielding another up-down ambiguity. If the receiver sits on the ground or seafloor, then one of the points can be eliminated and the sound source uniquely localized. Otherwise, a fifth hydrophone is needed that is not in the same plane as the other four, allowing general localization in 3D space (Fig. 4.27).

The dimensions of an acoustic array used for TDOA localization are determined by the expected distance to the sound source and the likely uncertainty in the TDOA measurements, which is inversely proportional to the bandwidth of the sounds being correlated. A rough estimate of the TDOA uncertainty, δt (s), is δt ≈ 1/BW, where BW is the signal bandwidth (Hz). The corresponding uncertainty in the difference in distances from the two hydrophones to the source is then δd = cδt, where c is the sound speed (m/s). When a sound source is far away from an array of receivers, the TDOAs can still be used to determine the direction of the sound source but any estimate of its distance will become inaccurate.

4.4.2 Beamforming

TDOA methods give poor results for sources that emit narrow-bandwidth signals such as continuous tones (e.g., some sub-species of blue whale) and can also be confounded in situations where there are many sources of similar signals in different directions from the array (e.g., a fish chorus). However, a properly designed array can be used to determine the direction of narrowband sources and can also determine the directional distribution of sound produced by multiple, simultaneously emitting sources using a processing method called beamforming. If two or more spatially separated arrays can be deployed, then the directional information they produce can be combined to obtain a spatial localization of the source. Alternatively, if the source is known to be stationary, or moving sufficiently slowly, localization can be achieved by moving a single array, for example by towing it behind a ship.

For the convenient, and hence commonly used, case of an array consisting of a line of equally spaced hydrophones, beamforming requires the hydrophone spacing to be less than half the acoustic wavelength of the sound being emitted by the source. Also, the accuracy of the bearing estimates improves as the length of the array increases. These two factors combined mean that a useful array for beamforming is likely to require at least eight hydrophones, and even that would give only modest bearing accuracy. Consequently, 16-element or even 24-element arrays are commonly deployed in practice. A straight-line array used for beamforming suffers from the same ambiguity as a TDOA array in which all the hydrophones are in a straight line. As in the TDOA case, this ambiguity can be countered by offsetting some of the hydrophones from the straight line; however, beamforming requires the relative positions of all the hydrophones to be accurately known, so this is not always easy to achieve in practice.

Beamforming itself is relatively simple conceptually, but there are many subtleties (for details, see Van Veen and Buckley 1988; Krim and Viberg 1996). As for TDOA methods, the starting point is that when sound from a distant source arrives at an array of hydrophones, it will arrive at each hydrophone at a slightly different time, with the time differences depending on the direction of the sound source. The simplest type of beamformer is the delay and sum beamformer in which the array is “steered” in a particular direction by calculating the arrival time differences corresponding to that direction, delaying the received signals by amounts that cancel out those time differences, and then adding them together. This has the effect of reinforcing signals coming from the desired direction, while signals from other directions tend to cancel out. This isn’t a perfect process and the array will still give some output for signals coming from other directions. The relative sensitivity of the beamformer output to signals coming from different directions can be calculated and gives the beam pattern of the array. The beam pattern of a line array depends on the steering direction, with the narrowest beams occurring when the array is steered at right-angles to the axis of the array (broadside), and the broadest beams when steered in the axial direction (end-fire). There are a number of other beamforming algorithms that can give improved performance in particular circumstances; see the above references for details.

4.4.3 Parametric Array Processing

The array requirements for parametric array processing methods are similar to those for beamforming, but these methods attempt to circumvent the direct dependence of the angular accuracy on the length of the array (in acoustic wavelengths) that is inherent to beamforming. A summary of these methods can be found in Krim and Viberg (1996).
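As a concrete illustration of the delay-and-sum idea described in Sect. 4.4.2 (this sketch is not taken from any particular system; the array spacing, frequency, arrival angle, and noise level are invented), the snippet below steers an eight-element line array toward a set of candidate directions and prints the beam output power. For a narrowband tone, the required time delays reduce to phase shifts, which is what the code applies.

```python
import numpy as np

c = 1500.0                         # sound speed in water (m/s)
f0 = 500.0                         # tone frequency (Hz); wavelength 3 m
d = 1.0                            # hydrophone spacing (m), less than half a wavelength
positions = np.arange(8) * d       # eight-element line array
theta_source = np.deg2rad(40.0)    # true arrival angle relative to broadside

# Narrowband snapshots: each hydrophone sees the same tone with a phase set by
# its position along the array, plus some complex noise.
rng = np.random.default_rng(0)
arrival_phase = 2 * np.pi * f0 * positions * np.sin(theta_source) / c
snapshots = np.exp(-1j * arrival_phase)[:, None] + 0.1 * (
    rng.standard_normal((8, 200)) + 1j * rng.standard_normal((8, 200)))

# Delay-and-sum (phase-and-sum for a narrowband signal): undo the hypothesized
# arrival phases for each steering direction and add the channels.
for steer_deg in range(-90, 91, 10):
    steer_phase = 2 * np.pi * f0 * positions * np.sin(np.deg2rad(steer_deg)) / c
    beam = np.exp(1j * steer_phase) @ snapshots      # weighted sum over hydrophones
    power = np.mean(np.abs(beam) ** 2)
    print(f"steering {steer_deg:+4d} deg: beam power = {power:7.1f}")
```

The output power peaks at the steering angle closest to the true arrival angle (40 degrees here), and the width of that peak corresponds to the beam pattern discussed above.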
One of the earliest and best known parametric methods is the multiple signal classification (MUSIC) algorithm proposed by Schmidt (1986). These methods can give more accurate localization than beamforming in situations where there is a high signal-to-noise ratio and a limited number of sources; however, they are significantly more complicated to implement and more time-consuming to compute. They also rely on more assumptions and are more sensitive to errors in hydrophone positions than beamforming.

4.4.4 Examples of Sound Localization in Air and Water

Passive acoustic localization in air poses logistical challenges with sound attenuating more rapidly in air than in water. This is an issue when localizing sound sources in open environments, as suitable recordings can only be collected if the microphone array is positioned closely around the source, with localization error increasing with distance.

Sound source localization in the terrestrial domain is generally undertaken using one of three methods. Firstly, TDOA is perhaps most commonly applied to wildlife monitoring, including birds (McGregor et al. 1997) and bats (e.g., Surlykke et al. 2009; Koblitz 2018). Secondly, beamforming is more often utilized in environmental noise measurement and management (e.g., Huang et al. 2012; Prime et al. 2014; Amaral et al. 2018). Thirdly, the perhaps less common MUSIC approach has been utilized in bird monitoring and localization in noisy environments (Chen et al. 2006).

Under water, both fixed and towed hydrophone arrays are common. TDOA is the most common approach in the case of localizing cetaceans (Watkins and Schevill 1972; Janik et al. 2000) and fishes (Parsons et al. 2009; Putland et al. 2018). Under specific conditions, one or two hydrophones may suffice to localize a sound source by TDOA.

Fig. 4.28 Sketch of localization in shallow water using a single hydrophone (Cato 1998)

Fig. 4.29 Sketch of two hydrophones localizing a fish in 3D space with circular ambiguity using TDOA and intensity differences (Cato 1998)

Multi-path propagation in shallow water may allow localization with just one hydrophone. TDOAs are computed between the surface-reflected, seafloor-reflected, and direct sound propagation paths, yielding both range and depth of the animal (Fig. 4.28), while not being able to resolve circular symmetry (Cato 1998; Mouy et al. 2012).

Using TDOAs in addition to differences in received intensity (when the source is located much closer to one of two receivers) may allow localization in free space to a circle between the two receivers and perpendicular to the line of two receivers (Cato 1998), see Fig. 4.29.
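To attach numbers to the array-design rule of thumb from Sect. 4.4.1.4 (δt ≈ 1/BW and δd = cδt), a quick calculation is shown below; the bandwidth values are chosen purely for illustration.

```python
# Rough TDOA and range-difference uncertainty for signals of different bandwidths.
c_air, c_water = 343.0, 1500.0        # nominal sound speeds (m/s)
for bw in (100.0, 1000.0, 10000.0):   # signal bandwidth (Hz)
    dt = 1.0 / bw                     # approximate TDOA uncertainty (s)
    print(f"BW = {bw:7.0f} Hz: dt = {dt * 1e3:6.2f} ms, "
          f"dd = {c_air * dt:6.2f} m in air, {c_water * dt:7.2f} m in water")
```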
Beamforming is an established method for localizing soniferous marine animals (Miller and Tyack 1998) and anthropogenic sound sources such as vessels (Zhu et al. 2018). A MUSIC approach to localization also has applications in the underwater domain, having previously been used for recovering acoustically-tagged artifacts by autonomous underwater vehicles (AUVs) (Vivek and Vadakkepat 2015).

Finally, target motion analysis involves marking the bearing to a sound source (from directional sensors or a narrow-aperture array) successively over time. If the animal calls frequently and moves slowly compared to the observation platform, successive bearings will intersect at the animal location (e.g., Norris et al. 2017).

4.4.5 Passive Acoustic Tracking

Passive acoustic tracking is the sequential localization of an acoustic source, useful for monitoring its behavior. Such behavior includes kinetic elements (e.g., swim path and speed) and acoustic elements (such as vocalization rate and type). In praxis, the process is a bit more complicated than just connecting TDOA locations over time. Animals will be arriving and departing; there may be more than one animal vocalizing; any one animal will have quiet times between vocalizations. So, TDOA locations need to be joined into tracks; tracks need to be continued; old tracks need to be terminated; new tracks need to be initiated; tracks may need to be merged or split. Different algorithms have been developed to aid this process, with Kalman filtering being common (Zimmer 2011; Zarchan and Musoff 2013).

While radio telemetry has historically been the primary approach to terrestrial animal tracking, passive acoustic telemetry has grown in popularity as more animals can be monitored non-invasively (e.g., McGregor et al. 1997; Matsuo et al. 2014). Passive acoustic tracking in water is a well-established method of monitoring the behavior of aquatic fauna, including their responses to environmental and anthropogenic stimuli (e.g., Thode 2005; Stanistreet et al. 2013). Both towed and moored arrays are used, with towed arrays providing greater spatial coverage in the form of line-transect surveys.

4.5 Symbols and Abbreviations (Table 4.10)

Table 4.10 Most common quantities and abbreviations in this chapter


Quantity Abbreviation Symbol Unit
Frequency f Hz
Sampling frequency fs Hz
Wavelength λ m
Speed of sound c m/s
Particle velocity u m/s
Period of oscillation τ s
Time variable t s
Sound pressure p(t) Pa
Peak sound pressure ppk Pa
Peak-to-peak sound pressure ppk-pk Pa
Root-mean-square sound pressure prms Pa
Sound pressure level SPL Lp dB re 1 μPa or 20 μPa
Peak sound pressure level SPLpk Lp,pk dB re 1 μPa or 20 μPa
Radiated noise level RNL LRN dB re 1 μPa m or 20 μPa m
Sound exposure level SEL LE,p dB re 1 μPa2s or 400 μPa2s
Source level SL LS dB re 1 or 20 μPa m
Number of Fourier components NFFT
Power spectral density level PSD Lp,f dB re 1 μPa2/Hz or 400 μPa2/Hz
Time difference of arrival TDOA s

4.6 Summary Chen C-E, Ali AM, Wang H (2006) Design and testing of
robust acoustic arrays for localization and enhance-
ment of several bird sources. Proceedings of the 5th
This chapter presented an introduction to acous- International Conference on Information Processing in
tics and explained the basic quantities and Sensor Networks:268–275
concepts relevant to terrestrial and aquatic animal Chiariotti P, Martarelli M, Castellini P (2019) Acoustic
beamforming for noise source localization – Reviews,
bioacoustics. Specific terminology that was methodology and applications. Mech Syst Signal Pro-
introduced includes sound pressure, sound expo- cess 120:422–448. https://doi.org/10.1016/j.ymssp.
sure, particle velocity, sound speed, longitudinal 2018.09.019
and transverse waves, frequency modulation, Dziak RP, Bohnenstiehl DR, Matsumoto H, Fox CG,
Smith DK, Tolstoy M, Lau T-K, Haxel JH, Fowler
amplitude modulation, decibel, source level, MJ (2004) P- and T-wave detection thresholds, Pn
near-field, far-field, frequency weighting, power velocity estimate, and detection of lower mantle and
spectral density, and one-third octave band level, core P-waves on ocean sound-channel hydrophones at
amongst others. The chapter further introduced the Mid-Atlantic Ridge. Bull Seismol Soc Am 94(2):
665–677. https://doi.org/10.1785/0120030156
basic signal sampling and processing concepts Erbe C (2009) Underwater noise from pile driving in
such as sampling frequency, Nyquist frequency, Moreton Bay, Qld. Acoust Aust 37(3):87–92
aliasing, windowing, and Fourier transform. The Erbe C (2013) Underwater noise of small personal water-
chapter concluded with an introductory treatise of craft (jet skis). J Acoust Soc Am 133(4):EL326–
EL330. https://doi.org/10.1121/1.4795220
sound localization and tracking, including time Erbe C, King AR (2009) Modelling cumulative sound
difference of arrival and beamforming. exposure around marine seismic surveys. J Acoust
Soc Am 125(4):2443–2451. https://doi.org/10.1121/1.
3089588
Erbe C, McCauley R, Gavrilov A, Madhusudhana S,
References Verma A (2016a). The underwater soundscape around
Australia. Proceedings of Acoustics 2016, 9–11
Acton WI (1974) The effects of industrial airborne ultra- November 2016, Brisbane, Australia.
sound on humans. Ultrasonics 12(3):124–128. https:// Erbe C, Parsons M, Duncan AJ, Allen K (2016b) Under-
doi.org/10.1016/0041-624X(74)90069-9 water acoustic signatures of recreational swimmers,
Amaral FR, Serrano Rico JC, Medeiros MAF (2018) divers, surfers and kayakers. Acoust Aust 44(2):
Design of microphone phased arrays for acoustic 333–341. https://doi.org/10.1007/s40857-016-0062-7
beamforming. J Braz Soc Mech Sci Eng 40(7):354. Erbe C, Parsons M, Duncan AJ, Lucke K, Gavrilov A,
https://doi.org/10.1007/s40430-018-1275-5 Allen K (2017a) Underwater particle motion (acceler-
American National Standards Institute (2013) Acoustical ation, velocity and displacement) from recreational
Terminology (ANSI/ASA S1.1-2013). Acoustical swimmers, divers, surfers and kayakers. Acoust Aust
Society of America, Melville, NY, USA 45:293–299. https://doi.org/10.1007/s40857-017-
Au WWL, Hastings M (2008) Principles of marine bio- 0107-6
acoustics. Springer Verlag, New York Erbe C, Parsons M, Duncan AJ, Osterrieder S, Allen K
Baumann-Pickering S, McDonald MA, Simonis AE, (2017b) Aerial and underwater sound of unmanned
Solsona Berga A, Merkens KPB, Oleson EM, Roch aerial vehicles (UAV, drones). J Unmanned Veh Syst
MA, Wiggins SM, Rankin S, Yack TM, Hildebrand JA 5(3):92–101. https://doi.org/10.1139/juvs-2016-0018
(2013) Species-specific beaked whale echolocation Erbe C, Williams R, Parsons M, Parsons SK, Hendrawan
signals. J Acoust Soc Am 134(3):2293–2301. https:// IG, Dewantama IMI (2018) Underwater noise from
doi.org/10.1121/1.4817832 airplanes: An overlooked source of ocean noise. Mar
Beecher MD (1988) Spectrographic analysis of animal Pollut Bull 137:656–661. https://doi.org/10.1016/j.
vocalizations: Implications of the “Uncertainty Princi- marpolbul.2018.10.064
ple”. Bioacoustics 1(2–3):187–208. https://doi.org/10. Finneran J, Schlundt C (2011) Subjective loudness level
1080/09524622.1988.9753091 measurements and equal loudness contours in a
Caldwell MC, Caldwell DK (1965) Individualized whistle bottlenose dolphin (Tursiops truncatus). J Acoust Soc
contours in bottlenosed dolphins (Tursiops truncatus). Am 130(5):3124–3136. https://doi.org/10.1121/1.
Nature 207(4995):434–435. https://doi.org/10.1038/ 3641449
207434a0 Fletcher H, Munson WA (1933) Loudness, its definition,
Cato DH (1998) Simple methods of estimating source measurement and calculation. J Acoust Soc Am 5(2):
levels and locations of marine animal sounds. J Acoust 82–108. https://doi.org/10.1121/1.1915637
Soc Am 104(3):1667–1678. https://doi.org/10.1121/1. Holland RA, Waters DA, Rayner JMV (2004) Echoloca-
424379 tion signal structure in the Megachiropteran bat

Rousettus aegyptiacus Geoffroy 1810. J Exp Biol environments using microphone arrays. J Acoust Soc
207(25):4361. https://doi.org/10.1242/jeb.01288 Am 135(4):2207–2207. https://doi.org/10.1121/1.
Houser DS, Yost W, Burkard R, Finneran JJ, 4877207
Reichmuth C, Mulsow J (2017) A review of the his- McGregor PK, Dabelsteen T, Clark CW, Bower JL, Hol-
tory, development and application of auditory land J (1997) Accuracy of a passive acoustic location
weighting functions in humans and marine mammals. system: empirical studies in terrestrial habitats. Ethol
J Acoust Soc Am 141(3):1371–1413. https://doi.org/ Ecol Evol 9(3):269–286. https://doi.org/10.1080/
10.1121/1.4976086 08927014.1997.9522887
Huang X, Bai L, Vinogradov I, Peers E (2012) Adaptive Merchant ND, Fristrup KM, Johnson MP, Tyack PL, Witt
beamforming for array signal processing in MJ, Blondel P, Parks SE (2015) Measuring acoustic
aeroacoustic measurements. J Acoust Soc Am 131(3): habitats. Methods Ecol Evol 6(3):257–265. https://doi.
2152–2161. https://doi.org/10.1121/1.3682041 org/10.1111/2041-210X.12330
International Electrotechnical Commission (2013) Electro- Miller PJ, Tyack PL (1998) A small towed beamforming
acoustics - Sound level meters - Part 1: Specifications array to identify vocalizing resident killer whales
(IEC 61672-1 Ed. 2.0 b:2013). New York. (Orcinus orca) concurrent with focal behavioral
International Organization for Standardization (2003) observations. Deep Sea Res Part II Top Stud Oceanogr
Acoustics—Normal equal-loudness-level contours 45(7):1389–1405. https://doi.org/10.1016/S0967-0645
(ISO 226:2003). (98)00028-9
International Organization for Standardization (2007) Mouy X, Hannay D, Zykov M, Martin B (2012) Tracking
Acoustics — Definitions of basic quantities and terms of Pacific walruses in the Chukchi Sea using a single
(ISO/TR 25417). Geneva, Switzerland. hydrophone. J Acoust Soc Am 131(2):1349–1358.
International Organization for Standardization (2017) https://doi.org/10.1121/1.3675008
Underwater acoustics—Terminology (ISO 18405). Norris TF, Dunleavy KJ, Yack TM, Ferguson EL (2017)
Geneva, Switzerland. Estimation of minke whale abundance from an acous-
Janik VM, Van Parijs SM, Thompson PM (2000) A tic line transect survey of the Mariana Islands. Mar
two-dimensional acoustic localization system for marine Mamm Sci. https://doi.org/10.1111/mms.12397
mammals (Note). Mar Mamm Sci 16(2):437–447. Padois T (2018) Acoustic source localization based on the
https://doi.org/10.1111/j.1748-7692.2000.tb00935.x generalized cross-correlation and the generalized mean
Kamminga C, Beitsma GR (1990) Investigations on ceta- with few microphones. J Acoust Soc Am 143(5):
cean sonar IX, remarks on dominant sonar frequencies EL393–EL398. https://doi.org/10.1121/1.5039416
from Tursiops truncatus. Aquat Mamm 16(1):14–20 Parrack HO (1966) Effect of Air-Borne Ultrasound on
Kastelein RA, Wensveen PJ, Terhune JM, de Jong CAF Humans. Int J Audiol 5(3):294–308. https://doi.org/
(2011) Near-threshold equal-loudness contours for har- 10.3109/05384916609074198
bour seals (Phoca vitulina) derived from reaction times Parsons MJ, McCauley RD, Mackie MC, Siwabessy P,
during underwater audiometry: a preliminary study. J Duncan AJ (2009) Localization of individual
Acoust Soc Am 129(1):488–495. https://doi.org/10. mulloway (Argyrosomus japonicus) within a spawning
1121/1.3518779 aggregation and their behaviour throughout a diel
Koblitz JC (2018) Arrayvolution: using microphone arrays spawning period. ICES J Mar Sci 66(6):1007–1014.
to study bats in the field. Can J Zool 96(9):933–938. https://doi.org/10.1093/icesjms/fsp016
https://doi.org/10.1139/cjz-2017-0187 Prime Z, Doolan C, Zajamsek B (2014) Beamforming
Krim H, Viberg M (1996) Two decades of array signal array optimisation and phase averaged sound source
processing research: the parametric approach. IEEE mapping on a model wind turbine. Inter-Noise and
Signal Process Mag 13(4):67–94. https://doi.org/10. Noise-Con Congress and Conference Proceedings
1109/79.526899 249(7):1078–1086
Kuehne LM, Erbe C, Ashe E, Bogaard LT, Collins MS, Putland RL, Mackiewicz AG, Mensinger AF (2018)
Williams R (2020) Above and below: Military aircraft Localizing individual soniferous fish using passive
noise in air and under water at Whidbey Island, acoustic monitoring. Ecol Inform 48:60–68. https://
Washington. J Mar Sci Eng 8(11):923. https://doi.org/ doi.org/10.1016/j.ecoinf.2018.08.004
10.3390/jmse8110923 Robinson DW, Dadson RS (1956) A re-determination of
Leighton TG (2018) Ultrasound in air—Guidelines, the equal-loudness relations for pure tones. Br J Appl
applications, public exposures, and claims of attacks Phys 7(5):166–181. https://doi.org/10.1088/0508-
in Cuba and China. J Acoust Soc Am 144(4): 3443/7/5/302
2473–2489. https://doi.org/10.1121/1.5063351 Schmidt R (1986) Multiple emitter location and signal
Marley SA, Erbe C, Salgado Kent CP (2017) Underwater parameter estimation. IEEE Trans Antennas Propag
recordings of the whistles of bottlenose dolphins in 34(3):276–280. https://doi.org/10.1109/TAP.1986.
Fremantle Inner Harbour, Western Australia. Sci Data 1143830
4(170):126. https://doi.org/10.1038/sdata.2017.126 Southall BL, Bowles AE, Ellison WT, Finneran JJ, Gentry
Matsuo I, Wheeler A, Kloepper L, Gaudette J, Simmons RL, Greene CRJ, Kastak D, Ketten DR, Miller JH,
JA (2014) Acoustic tracking of bats in clutter Nachtigall PE, Richardson WJ, Thomas JA, Tyack

PL (2007) Marine mammal noise exposure criteria: bioacoustics. Appl Acoust 145:137–143. https://doi.
Initial scientific recommendations. Aquat Mamm org/10.1016/j.apacoust.2018.09.022
33(4):411–521. https://doi.org/10.1080/09524622. Van Veen BD, Buckley KM (1988) Beamforming: a ver-
2008.9753846 satile approach to spatial filtering. IEEE ASSP Mag
Southall BL, Finneran JJ, Reichmuth C, Nachtigall PE, 5(2):4–24. https://doi.org/10.1109/53.665
Ketten DR, Bowles AE, Ellison WT, Nowacek DP, Vivek R, Vadakkepat P (2015) Multiple signal classifica-
Tyack PL (2019) Marine mammal noise exposure tion (MUSIC) based underwater acoustic localization
criteria: Updated scientific recommendations for resid- module (UALM) for AUV. Paper presented at the 2015
ual hearing effects. Aquat Mamm 45(2):125–232. IEEE Underwater Technology (UT) Conference.
https://doi.org/10.1578/AM.45.2.2019.125 https://doi.org/10.1109/UT.2015.7108288.
Stanistreet JE, Risch D, Van Parijs SM (2013) Passive Ward R, Parnum I, Erbe C, Salgado-Kent CP (2016)
acoustic tracking of singing humpback whales Whistle characteristics of Indo-Pacific bottlenose
(Megaptera novaeangliae) on a Northwest Atlantic dolphins (Tursiops aduncus) in the Fremantle Inner
feeding ground. PLoS One 8(4):e61263. https://doi. Harbour, Western Australia. Acoust Aust 44(1):
org/10.1371/journal.pone.0061263 159–169. https://doi.org/10.1007/s40857-015-0041-4
Surlykke A, Pedersen SB, Jakobsen L (2009) Watkins WA, Schevill WE (1972) Sound source location
Echolocating bats emit a highly directional sonar by arrival-times on a non-rigid three-dimensional
sound beam in the field. Proc R Soc B Biol Sci hydrophone array. Deep-Sea Res Oceanogr Abstr
276(1658):853–860. https://doi.org/10.1098/rspb. 19(10):691–706. https://doi.org/10.1016/0011-7471
2008.1505 (72)90061-7
Suzuki Y, Takeshima H (2004) Equal-loudness-level Wellard R, Erbe C, Fouda L, Blewitt M (2015)
contours for pure tones. J Acoust Soc Am 116(2): Vocalisations of killer whales (Orcinus orca) in the
918–933. https://doi.org/10.1121/1.1763601 Bremer Canyon, Western Australia. PLoS One 10(9):
Taylor B, Thompson A (eds) (2008) The International e0136535. https://doi.org/10.1371/journal.pone.
System of Units (SI). National Institute of Standards 0136535
and Technology, Gaithersburg, MD Zarchan P, Musoff H (2013) Fundamentals of Kalman
Thode A (2005) Three-dimensional passive acoustic track- filtering: a practical approach, 4th edn. American Insti-
ing of sperm whales (Physeter macrocephalus) in tute of Aeronautics and Astronautics, Inc., Reston, VA
ray-refracting environments. J Acoust Soc Am Zhu C, Garcia H, Kaplan A, Schinault M, Handegard NO,
118(6):3575–3584. https://doi.org/10.1121/1.2049068 Godø OR, Huang W, Ratilal P (2018) Detection, local-
Tonin R (2018) A review of wind turbine-generated ization and classification of multiple mechanized ocean
infrasound: Source, measurement and effect on health. vessels over continental-shelf scale regions with pas-
Acoust Aust 46(1):69–86. https://doi.org/10.1007/ sive ocean acoustic waveguide remote sensing.
s40857-017-0098-3 Remote Sens 10(11):1699
Tougaard J, Beedholm K (2019) Practical implementation Zimmer WMX (2011) Passive acoustic monitoring of
of auditory time and frequency weighting in marine cetaceans. Cambridge University Press, Cambridge

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
Source-Path-Receiver Model
for Airborne Sounds 5
Ole Næsbye Larsen, William L. Gannon, Christine Erbe,
Gianni Pavan, and Jeanette A. Thomas

5.1 Introduction laboratory, and the receiver is a pharmaceutical


worker. The SPRM guides the health and safety
The source-path-receiver model (SPRM) manager in minimizing the risk of exposure.1
provides a common framework for occupational Ideally, the source would be eliminated, but this
health and safety management. It is used for haz- might not be possible if this type of chemical is
ard control to minimize the risk of exposing required. Maybe it can be substituted by a less
workers to hazards. Such hazards may be volatile or toxic chemical? There may be engi-
chemicals (e.g., spilled compounds in a pharma- neering controls such as installing an isolation
ceutical laboratory), material (e.g., falling bricks chamber (or glove box) or exhaust hood. Engi-
on a construction site), or noise. neering controls may also be applied to the path
An example SPRM for chemical hazards is along which the chemical travels: installing
shown in Fig. 5.1a. The source is a poisonous ventilators, absorbing material, or mechanical
chemical, which leaks through the air inside a barriers, or simply extending the length of the
path to increase dilution. Finally, controls may
be applied at the receiver: proper training for
Jeanette A. Thomas (deceased) contributed to this chapter safe handling of the chemical, limiting work
while at the Department of Biological Sciences, Western
hours, rotating shifts, and wearing personal pro-
Illinois University-Quad Cities, Moline, IL, USA
tective equipment (PPE). In terms of reducing the
O. N. Larsen (*) risk of exposure, the measures rank from most to
Department of Biology, University of Southern Denmark, least effective (termed hierarchy of control): elim-
Odense M, Denmark
ination, engineering controls, procedural controls,
e-mail: onl@biology.sdu.dk
and finally, PPE.
W. L. Gannon
The SPRM applied to noise control helps
Department of Biology, Museum of Southwestern
Biology, and Graduate Studies, University of New break down the components of noise exposure
Mexico, Albuquerque, NM, USA that can be modified to reduce the risk of acoustic
e-mail: wgannon@unm.edu impacts. In the example of Fig. 5.1b, the source is
C. Erbe a busy downtown road. Noise from the cars
Centre for Marine Science and Technology, Curtin
University, Perth, WA, Australia
e-mail: c.erbe@curtin.edu.au
1
G. Pavan Example SPRM for hazard control. Canadian Centre for
Department of Earth and Environment Sciences, Occupational Health and Safety, Government of Canada;
University of Pavia, Pavia, Italy https://www.ccohs.ca/oshanswers/hsprograms/hazard_con
e-mail: gianni.pavan@unipv.it trol.html; accessed 4 December 2020.

# The Author(s) 2022 153


C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1,
https://doi.org/10.1007/978-3-030-97540-1_5

Fig. 5.1 Examples of the a) SOURCE PATH RECEIVER


source-path-receiver model
for (a) chemical hazard
control in a laboratory and
(b) traffic noise control in
a city Eliminaon Absorpon Training
Substuon Diluon Procedures
Modificaon Barriers PPE
Isolaon

b) SOURCE PATH RECEIVER

Eliminaon Absorpon Pracces


Substuon Diffracon Procedures
Modificaon Barriers
Isolaon

travels to surrounding residential buildings.2 The Even though the SPRM was originally devel-
source may be eliminated by relocating all traffic oped to manage hazards at the workplace, it is
to an inner-city bypass and banning all traffic much more broadly applicable to the day-to-day
downtown. Maybe private car traffic can be lives of humans—and animals. In fact, the SPRM
substituted by a quieter, electric city bus service. is fundamental. Without a receiver, there is no
Imposing a speed limit reduces noise. Some cities hazard. Without a listener, there is no noise.
enforce noise emission standards for cars. Long- Researchers of animal bioacoustics might want
term engineering solutions may include building to apply the SPRM to their project in order to
a tunnel, resurfacing the road with noise- identify parameters of the source, path, and
absorbing material, installing noise barrier walls receiver, that might influence the results. Other
along the road, or erecting earth bunds. Residen- chapters in this book either explicitly or implicitly
tial buildings may have noise-reduction (double- apply the SPRM. Chapter 13 on the effects of
glazed) windows and residents may set up their noise on animals provides examples where the
bedrooms at the opposite side of the building. The source is a highway, the path follows from the
specific implementation of the SPRM depends on highway into the surrounding bush, and the
the application. For example, residents in an receivers are birds, whose abundance might
apartment building would not want to wear decrease closer to the source as a result of habitat
earmuffs at home, but for workers in a noisy degradation by noise. Chapter 11 deals with
plant, such PPE is common practice. A poster acoustic communication between animals, and
showing the steps involved in workplace noise so the source may be a male frog, the path may
control is shown in Fig. 5.2. lead through a tropical rain forest, and the
receivers are nearby females of the same species.
2 Chapter 12 is about echolocation. Here, the
Example SPRM for traffic noise. Environmental Protec-
tion Department, The Government of the Hong Kong source and the receiver are the same individual
Special Administrative Region https://www.epd.gov.hk/ animal. A bat echolocates on a moth and the
epd/noise_education/young/eng_young_html/m3/m3. echolocation signal reflects off the moth,
html; accessed 4 December 2020.

Fig. 5.2 Poster by WorkSafe New Zealand illustrating www.worksafe.govt.nz/about-us/about-this-site/copy


the steps involved in noise control at the workplace. right/. A more elaborate animation is also available (Ani-
# WorkSafe, New Zealand Government, 2018; https:// mation of the SPRM by WorkSafe, New Zealand
www.worksafe.govt.nz/dmsdocument/3987-managing- Government; https://youtu.be/8Cq5UR5KssA; accessed
noise-risk-poster. Reproduced with permission; https:// 4 December 2020.)

informing the bat how far away its prey is. The store acoustic data for later analysis in the labora-
signal travels through the environment twice: tory. The following sections first explore the basic
from the bat to its prey and back. Chapter 10 concepts of sound propagation in air before
covers audiometry, where the sources are con- applying these to an example SPRM.
trolled and engineered signals (often pure tones)
that are played to animals over short distances or
through earphones, and the receivers are individ-
5.2 Sound Propagation
ual animals whose hearing is being measured.
in Terrestrial Environments
Chapter 7 explores soundscapes on land and
under water. The sources are grouped into
The environment through which a sound travels
geophony (e.g., wind, rain, and waves), biophony
alters its acoustic features such as its spectral
(i.e., animals), and anthropophony (e.g., airplanes
composition and level. The effects of the environ-
or ships). The paths go through the air over land,
ment on bioacoustic signals were well explored in
under water, and through the ground. The
the classic works of Chappuis (1971), Marten and
receivers in passive acoustic monitoring of
Marler (1977), Michelsen (1978), and Wiley and
soundscapes are recorders, which collect and
Richards (1978).

Fig. 5.3 Diagram of some of the factors affecting sound propagation in air. Figure donated by Sara Torres Ortiz

Airborne sound propagation (often called out- temperature, wind speed and direction, and
door sound propagation) is characterized by a humidity) vary throughout the day and among
number of phenomena. Sounds attenuate with seasons, and so sound propagation can be quite
distance from the sender due to geometrical atten- variable. Sound propagation models exist and can
uation (i.e., spreading) and absorption by the be used to predict the distance over which sounds
medium. High-frequency sounds (i.e., sounds travel, create noise maps, estimate changes to the
having short wavelengths; see Chap. 4 on acoustic (e.g., spectral) features of received
definitions of frequency and wavelength) propa- sounds, and identify factors that could hinder or
gate over shorter distances than low-frequency enhance animal communication (see Lohr et al.
sounds (i.e., sounds having long wavelengths). 2003; Jensen et al. 2008). Bioacousticians should
Environmental and structural factors such as sub- consider the characteristics of sound propagation,
strate composition; terrain profile; obstacles along which could explain variability in the receiver’s
the path; amount of vegetative cover; wind speed behavioral response or the effectiveness of acous-
and direction; vertical gradients (i.e., increases or tic communication.
decreases) in wind speed, air temperature, and
humidity; air turbulence; and, to a small degree,
altitude (i.e., atmospheric pressure) affect sound 5.2.1 Ray Traces
propagation in air (Fig. 5.3). The propagation
paths, along which sounds travel, are rarely Sound propagation is accurately described by the
straight lines, but rather bend (i.e., refract or dif- acoustic wave equation. This is a four-
fract), reflect, and scatter. The same sound dimensional (4-d: three spatial coordinates and
traveling along different propagation paths may time) differential equation of the second order.
interfere with itself constructively or destruc- For an “easy” derivation of the acoustic wave
tively. The received sound is a weaker and often equation, see Larsen and Radford (2018). How-
distorted version of the sent sound (Wahlberg and ever, in the simplest situation of symmetric geom-
Larsen 2017). etry (i.e., omnidirectional signal in a
This section explains the basic concepts of homogeneous medium with no reverberation),
sound propagation in air and provides some the equation can be simplified and described by
insights into environmental effects on propaga- one variable: the range to the source (Wahlberg
tion. Some environmental factors (e.g., air and Larsen 2017). Even then, solving the wave

a) b)

t1 t2 t3 t4 t3 t4

Fig. 5.4 (a) Sketch of a rooster sitting on a branch. When propagation. (b) Illustration of Huygens’ principle. Each
the bird crows, sound is emitted in all directions (marked point on the wavefront at time t4 can be considered itself a
by a few example black arrows). The green concentric (secondary) source; nine example points are marked by
circles represent the wavefronts of the outgoing sound at suns. The wavefronts of the secondary sources (shown as
times t1  t4. The wave rays are perpendicular to the black circle segments) superpose to yield the new primary
wavefronts and point in the direction of sound wavefront, drawn at time t4

equation under the various and variable waves cancel out in some places but at the farthest
conditions encountered in common sound propa- range from the rooster in the center, the secondary
gation scenarios is quite a task. Fortunately, there wavefronts line up to yield the new primary
are much simpler, conceptual principles of sound wavefront at time t4.
propagation, which can yield satisfactory results. As the expanding wavefront encounters
One such concept is ray propagation or ray features of the environment (e.g., vegetation or
tracing. gradients in sound speed), its shape changes and
Let us consider an omnidirectional source, the directions of the wave rays change. The laws
which emits sound equally in all directions. An of physics and principles of sound propagation
example is the crowing rooster in Fig. 5.4a can be applied to trace the propagation paths. This
(although it is only omnidirectional at the lower is called ray tracing. For an easy introduction to
frequencies of its crow and it might not typically ray tracing, see Heller (2013). Wahlberg and
crow while roosting, but for the sake of Larsen (2017) suggested visualizing a ray as a
science. . .; Larsen and Dabelsteen 1990). Wave “small acoustic particle travelling along a narrow
rays point in the direction of sound propagation beam or ray in discrete steps and bouncing-off or
and are perpendicular to the wavefronts of the being refracted through surfaces.” This type of
propagating sound. The wavefronts are spheres sound field visualization, first introduced in
in 3D space (circles in 2D). Huygens’ principle 1967 (Krokstad et al. 2015), has been used exten-
(named after Christiaan Huygens, a Dutch physi- sively in linear acoustics to model phenomena in
cist) states that every point on a wavefront can be outdoor sound propagation with the computa-
considered a source of a new (secondary) wave. tional tools now available with computers
And all of the secondary wavefronts superpose (Attenborough et al. 1995).
to build the next (in time) primary wavefront. An example of ray tracing is shown in Fig. 5.5.
The wavefront at time t3 in Fig. 5.4a is also The omnidirectional source is located in the lower
shown in Fig. 5.4b. Nine example points on this left corner, 5 m above ground at range 0, and it
wavefront are “randomly” illustrated (as small emits a 10-Hz tone. The wave rays are shown and
suns). These each create their own set of concen- follow the sound propagation paths. Sound that is
tric wavefronts, drawn at time t4. The secondary initially emitted in an upwards direction bends

Fig. 5.5 Top: Ray traces modeling the propagation of an airborne 10-Hz tone from a point source located 5 m off the ground (lower left corner). The model suggests that sound is bent downwards (downward refraction, typical for nighttime) where it bounces off the ground several times depending on the initial direction from the source. Note the scales: These effects occur at distances much longer than typical animal sound communication distances, which normally are up to only a few hundred meters. Bottom: Contour plot of propagation loss, PL (i.e., attenuation) of the 10-Hz sound. Modified from Attenborough et al. (1995). © Acoustical Society of America, 1995. All rights reserved

downward at a certain altitude (depending on its initial angle of emission). This is typical for nighttime sound propagation. Once rays hit the ground, they are reflected upwards again. The sound field (i.e., the received level at every location in space) is computed by summing sound pressure over all rays. Regions where rays travel close together have high received levels (little propagation loss) and regions that only a few rays enter have low received levels (high propagation loss).
For example, Ottemöller and Evers (2008) used ray tracing to describe the sound propagation of a massive vapor cloud explosion at Buncefield fuel depot near Hemel Hempstead, UK, on the morning of 11 December 2005. The storage tank overflowed and released over

300 tons of fuel. An explosion was triggered after a vapor cloud formed and spread over a very large area (80,000 m² or about 20 acres) before igniting. The explosion was huge, caused extensive damage, injured 43 people, and was detected by seismograph stations in the UK and the Netherlands. The data provided significant information on the ray trajectories of this explosion.

5.2.2 Geometrical Sound Spreading

Sound from an omnidirectional source in the free-field spreads out evenly in a spherical pattern (i.e., equally in all directions). The free-field is homogeneous (i.e., has no temperature or humidity gradients) and unimpeded by buildings or vegetation. At any receiver location in space, only a small proportion of the emitted sound arrives, and so the received sound is attenuated compared to the sound energy emitted at the source. The total attenuation or loss of sound energy from the source to a receiver is known as propagation loss (PL; formerly transmission loss). The sound pressure level at the source (defined as 1 m from a point source; see Chap. 4) is called the source level (SL), whereas the sound pressure level at the receiver at a distance (i.e., range r) from the source is called the received level (RL). The relation between these two levels is given by Eq. 5.1:

RL = SL − PL        (5.1)

Propagation loss in the free-field is termed spherical spreading loss, which can be computed as PLsph = 20 log10(r) (for derivation of this expression, see Wahlberg and Larsen 2017). It is independent of signal frequency and only depends on the geometry of the source and sound field. So, Eq. 5.1 may be reformulated:

RL = SL − 20 log10(r)        (5.2)

As a first approximation, spherical spreading is a good model for the propagation of terrestrial animal sounds produced in large open-air regions, such as grassland. Generally, if a bird sings on the ground up to about 10 m from a microphone, only spherical spreading needs to be considered. If the receiver is at a greater distance from the bird, then ground and atmospheric effects also must be considered. If the bird is flying overhead, then spherical spreading and atmospheric effects need to be considered when determining propagation characteristics.
If other sources of attenuation are negligible, then Eq. 5.2 can be used to calculate the source levels of a vocalizing animal located at distance r from the receiver. For instance, if a bioacoustician measured RL = 65 dB re 20 μPa at a distance of 10 m from a singing bird, then SL (at 1 m from the bird) becomes 65 dB re 20 μPa + 20 log10(10) dB re 1 m = 85 dB re 20 μPa m (e.g., Dabelsteen 1981). Similarly, if somebody played back a sound at a known source level of 85 dB re 20 μPa m, then the predicted RL at 1 km (= 10³ m) range would be 25 dB re 20 μPa, as 20 log10(10³) = 60.
In some environments, and for some sources (i.e., line sources rather than point sources), airborne sound propagation can be better described as cylindrical spreading. For an infinitely long line source, the propagation loss as a function of range becomes PLcyl = 10 log10(r) and so Eq. 5.1 becomes:

RL = SL − 10 log10(r)        (5.3)

Most biological line sources, however, are finite, such as a row of vocalizing birds on a power line. (Please be aware that this example is not a line source in the strict acoustic sense.) This means that geometrical spreading loss is somewhere between that of spherical and cylindrical spreading loss (Fig. 5.6). When the receiver distance from the finite line source is much less than the length of the finite line source, then the attenuation is close to that of an infinite line source (i.e., 10 log10(r)), whereas at distances comparable to or larger than the length of the finite line source, the latter acts more like a point source and attenuation develops as 20 log10(r). At sufficiently long distances, all sources can be regarded as point sources.
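For quick estimates, Eqs. 5.1–5.3 can be evaluated directly. The short Python sketch below is our own illustration (function names are not from the chapter); it reproduces the numerical example above, back-calculating a source level from a received level under spherical spreading and predicting a received level for a playback:

import math

def received_level(source_level_db, range_m, spreading="spherical"):
    """Predict RL from SL using geometrical spreading only (Eqs. 5.2 and 5.3)."""
    exponent = 20.0 if spreading == "spherical" else 10.0  # dB per tenfold increase in range
    return source_level_db - exponent * math.log10(range_m)

def source_level(received_level_db, range_m):
    """Back-calculate SL (at 1 m) from RL measured at range_m, assuming spherical spreading."""
    return received_level_db + 20.0 * math.log10(range_m)

# Worked example from the text: RL = 65 dB re 20 uPa measured 10 m from a singing bird
print(source_level(65.0, 10.0))        # -> 85.0 dB re 20 uPa m
# Playback at SL = 85 dB re 20 uPa m; predicted RL at 1 km under spherical spreading
print(received_level(85.0, 1000.0))    # -> 25.0 dB re 20 uPa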

Fig. 5.6 Propagation loss due to geometrical spreading in air from a finite length line source with distance r relative to the length L of the finite line source. At distances from the source shorter than L, the attenuation is close to 3 dB/dd (cylindrical attenuation), whereas at distances equal to or longer than L, the attenuation becomes 6 dB/dd (spherical attenuation); dd: distance doubled

The propagation loss, however, includes much more than geometrical spreading loss, since beyond some distance from the source, RL mostly becomes smaller with distance than predicted by Eqs. 5.2 or 5.3. To account for this extra attenuation, Marten and Marler (1977) introduced the term excess attenuation (EA). This includes a number of other effects such as atmospheric absorption, reflection and scattering, the ground effect, attenuation by vegetative cover, refraction by air temperature and wind gradients, and attenuation due to turbulence—and often there still is a rest attenuation not accounted for by these mechanisms (Wahlberg and Larsen 2017). While geometrical spreading is frequency-independent, most of the effects contributing to EA are frequency-dependent and thus alter the spectrum of the emitted sound.
In most bioacoustic scenarios, spherical attenuation applies, and Eq. 5.2 can be reformulated to:

RL = SL − 20 log10(r) − EA        (5.4)

The following sections investigate each of these components of EA.

5.2.3 Sound Absorption in Air

An important and predictable component of EA is attenuation by absorption in air. Absorption refers to the conversion of acoustic energy into heat, mostly due to molecular relaxation of air molecules and the air’s shear viscosity. Absorption loss EAabs is directly proportional to the distance r from the source:

EAabs = α r        (5.5)

The absorption coefficient α (measured in dB/m) is a complex function of sound frequency, air temperature, relative humidity, and (to a lesser degree) atmospheric pressure (or altitude), in addition to characteristics of oxygen and nitrogen molecules (Attenborough 2007).
For instance, a 2-kHz signal propagating at standard atmospheric pressure (1 atm) and 20 °C is attenuated by about 0.9 dB/100 m, if the relative humidity (r.h.) is 60%, but by about 4.5 dB/100 m at 10% r.h. (Fig. 5.7). Generally, sound attenuation is greater in drier air than in damp, humid air. The effect is especially important at frequencies above 2 kHz. In other words, air acts as a low-pass filter enabling only low-frequency sound to travel over long distances from the source (Attenborough 2007; Wahlberg and Larsen 2017; Larsen and Radford 2018). Consequently, bats use high source levels to overcome the attenuation in air at high frequencies when they echolocate on targets at long distances. This low-pass filter effect is especially visible in the field for broadband sound signals produced by orthopterans and other insects (Römer 1998).
Sound absorption in air varies with time of day and season, mainly due to variations in the relative humidity, which usually peaks in the afternoon (see Larsson 2000; Attenborough 2007). So, if precise values of air absorption are needed in a field experiment, the relative humidity, atmospheric pressure, and air temperature must be measured over time and used in subsequent calculations (Wahlberg and Larsen 2017). However, at the short distances (<100 m) where most acoustic communication between

Fig. 5.7 Sound absorption coefficients α in air (dB/100 m) at 20 °C versus frequency at four different relative humidities (r.h. %). Based on ISO 9613-1:1993 (International Organization for Standardization. ISO 9613-1:1993, Acoustics—Attenuation of sound during propagation outdoors—Part 1: Calculation of the absorption of sound by the atmosphere. International Organization for Standardization; https://www.iso.org/standard/17426.html; accessed 9 January 2021)
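Figure 5.7 gives the absorption coefficients that enter Eq. 5.5. If geometrical spreading and air absorption are the only losses considered, Eq. 5.4 reduces to RL = SL − 20 log10(r) − αr, which the following minimal Python sketch evaluates. The function name is our own, and the α values are the approximate 2-kHz figures quoted in the text and read off Fig. 5.7; treat them as ballpark numbers only.

import math

def received_level_with_absorption(source_level_db, range_m, alpha_db_per_100m):
    """RL = SL - 20*log10(r) - alpha*r (Eq. 5.4 with EA reduced to air absorption, Eq. 5.5)."""
    spreading_loss = 20.0 * math.log10(range_m)
    absorption_loss = alpha_db_per_100m * range_m / 100.0
    return source_level_db - spreading_loss - absorption_loss

# Illustrative 2-kHz absorption coefficients (dB per 100 m) at 20 degrees C,
# approximated from the text and Fig. 5.7.
ALPHA_2KHZ = {"60% r.h.": 0.9, "10% r.h.": 4.5}

for rh, alpha in ALPHA_2KHZ.items():
    rl = received_level_with_absorption(85.0, 200.0, alpha)
    print(rh, round(rl, 1))   # drier air gives the lower received level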

animals takes place and at frequencies below 10 kHz, the role of absorption in overall propagation loss is likely insignificant compared to other environmental factors. Garcia et al. (2012), for example, described the 40-Hz wing beat signals of drumming ruffed grouse (Bonasa umbellus). Theoretically, these sound signals would be reduced by 6 dB due to air absorption at a distance of 187 km from the drumming bird, whereas spherical spreading loss alone would have reduced the signal amplitudes to a level far below auditory threshold of most animals at a distance of 1 km already (PLsph = 60 dB re 1 m).

5.2.4 Reflection, Scattering, and Diffraction

A second and less predictable component of EA is the attenuation caused by reflection, scattering, and diffraction. As a sound wave hits a hard surface, it is reflected. Reflection can be explained with Huygens’ principle. In Fig. 5.8a, the rooster from Fig. 5.4a is very far away such that the wavefronts at any location appear planar (rather than circular) and the wave rays are parallel (rather than radial). Three incident rays are drawn, hitting the surface (e.g., a road) at times t1, t2, and t3. By Huygens’ principle, each point on the surface that is hit acts as the source of a secondary wave. Two secondary wavefronts are shown at time t3. From the time t1, when the first ray hits, to the time t3, the first wavefront has expanded quite a bit. The second wavefront was started at time t2, when the second ray hit, and has expanded less by time t3. The third ray is just starting its secondary wave at time t3, with its secondary wavefront not yet visible. The tangent to the secondary wavefronts at time t3 gives the new wavefront of the reflected wave. The angle of incidence (measured from the normal) is equal to the angle of reflection (also measured from the normal). This is referred to as the law of reflection. It applies to the so-called specular reflection (as from a mirror).
Reflection is not always specular but might instead be diffuse. In diffuse reflection, sound is scattered from the surface in all sorts of directions including the specular direction (Fig. 5.8b). This happens when the surface is not smooth but rough. Scattering depends on the ratio of the wavelength of sound to the size of the scatterer. When the sound wavelength is long (i.e., frequency is low) relative to the roughness of the surface, all the sound energy is reflected in the specular direction. When the wavelength is short (i.e., frequency is high) and less than the magnitude of the unevenness of the surface, then sound is scattered in other, non-specular directions. A gravel road, for instance, produces specular

Fig. 5.8 (a) Sketch of specular reflection of a plane wave (originating from a far-away rooster) off a hard surface. Wave fronts are shown as green lines; they are perpendicular to the wave rays, shown as black arrows. The three incident rays hit at times t1–t3 at the locations marked by small suns. Each of these points creates a secondary wave by Huygens’ principle. The secondary wavefronts superpose to yield the new wavefront of the reflected wave, shown at time t3, when the third ray just hits, the second ray has started to grow a secondary wavefront, and the first ray has grown the largest wavefront. The angles of incidence θi are equal to the angles of reflection θr. (b) Sketch of diffuse reflection off a rough surface where the unevenness is great compared to the wavelength of incident sound. While there is a reflected ray in the specular direction, too (indicated by a blue arrow), there are many other directions in which the incident sound is scattered (indicated by red arrows)

reflection at frequencies below 15–20 kHz, but at (Holland et al. 2001). Consequently, leading
higher frequencies, where the gravel roughness is edges of sound segments are relatively well-
large relative to the wavelength, sound is preserved, whereas ending edges are lost in rever-
scattered in different directions (Michelsen and berant environments.
Larsen 1983). Diffraction occurs when a sound wave is par-
Reverberation is a result of multiple reflections tially obstructed. In Fig. 5.10a, a plane wave
and refers to the phenomenon of sound persisting (perhaps again from a far-away rooster) hits a
even if the source is turned off. In canyons, caves, wall with an opening in the center. The rays that
or other enclosures, sound bounces off the hit the wall are reflected (not drawn). The rays
boundaries again and again. The reverberant that hit the opening pass straight through. By
sound field is the space that is dominated by Huygens’ principle, each point of the opening
reflected sound (as opposed to the field near the acts as a source of secondary waves. As the sec-
source where the direct sound dominates). Once ondary wavefronts expand, they superpose to
the source is switched off, the reverberant field form new wavefronts that appear to bend behind
will continue to exist for some time, yet decay due the wall. This is termed diffraction. It also occurs
to absorption by the medium, boundaries (e.g., when the obstruction is finite (Fig. 5.10b).
the walls of a music room), and absorbers in the If the object that is in the path of a propagating
room (e.g., furniture and people). The more reflec- sound wave becomes much smaller than a wall
tive the boundaries, the greater the reverberation. (e.g., a bush or maybe just an insect in the air), to
Reverberation severely alters the structure of the point where the wavelength is much greater
the received sound and is one of the least wanted (at least by a factor 10) than the size of the object,
effects in analysis of recorded animal sounds then the sound wave “ignores” the object and
(Fig. 5.9). This type of signal degradation with propagates without obstruction. The sound effec-
propagation distance can be quantified by mea- tively cannot “see” the object; it is too small. In
suring the blur-ratio (see e.g., Dabelsteen et al. laboratory experiments, bioacousticians should
1993). The received sound appears longer in therefore make sure that objects in the sound
duration than the emitted sound, with the delayed path from loudspeaker to experimental animal
echoes forming a resulting “tail.” This reverbera- are at least 10 times smaller than the wavelength
tion tail can be quantified as the tail-to-signal ratio of the stimulus sound (Larsen 1995). When the

Fig. 5.9 Spectrogram and envelope of a series of simple blackbird (Turdus merula) calls recorded at two different distances (amplitudes normalized and realigned in time). The spectrogram on top shows higher reverberation due to longer distance from the source than the bottom one. The color scale from white to black is 96 dB in 6-dB bins

wavelength is of the same order of magnitude as as a paved road, ice sheet, cave wall, canyon,
the object, or somewhat greater, then diffractive subterranean tunnel, burrow wall, or wall of a
scattering occurs (Bradbury and Vehrencamp captive animal’s exhibit) reflects more and
2011). As the name suggests, this is a combina- absorbs less acoustic energy than a porous, soft
tion of diffraction and scattering, whereby some surface (such as tree leaves, grassy pastures, or
sound bends around the object and some sound forest canopy). Whether a surface or object is
scatters in all directions, leading to a complicated considered rough or smooth and hard or soft
sound field. depends on the wavelength of the sound. In a
Different surfaces or materials exhibit different mixed deciduous forest, reverberations for
degrees of sound reflection, absorption, and trans- frequencies above 4 kHz are stronger with leaves
mission. A hard, compact, smooth surface (such on the trees than without leaves (Wiley and


Fig. 5.10 (a) Sketch of diffraction as a sound wave passes through an aperture. Wave rays are indicated by black arrows; wavefronts are indicated by green lines. As the plane wave from a distant rooster hits a wall, each point in the opening acts as a source (indicated by suns) of secondary waves. The secondary waves combine to create the new wavefronts shown at three successive instances in time. The wavefronts appear to bend behind the aperture. (b) Sketch of diffraction as a sound wave passes by a finite obstruction

Richards 1982). Reverberations essentially are the sound propagating along PD and PG. The
absent in an open field on a calm day. interference pattern has regions of enhanced
received level (due to constructive interference)
and of attenuated received level (due to destruc-
5.2.5 Ground Effect tive interference) at the position of R (Fig. 5.11b).
The received sound signal is a distorted version of
Another component of EA is the so-called ground the emitted signal. It is said to be comb-filtered, as
effect, which is always present in terrestrial sound the destructive interference creates the “comb
propagation. The sound signal from a sender teeth” attenuating some frequencies in the signal,
(S) located at some height above ground (e.g., a whereas the constructive interference enhances
bird at 4 m) will reach a receiver (R; e.g., a other frequencies of the signal. The magnitude
recordist’s microphone at 1.5 m) first by the direct of the ground effect depends on sound frequency,
path (PD) and a moment later by the indirect and on geometry of the sender-receiver separation
longer path when the signal has been reflected distance and height above ground, on the rough-
from the ground (PG) (Fig. 5.11a). This results ness and softness of the ground, and on atmo-
in a range-dependent interference pattern between spheric pressure, ambient temperature, relative

Fig. 5.11 Predicted ground effect. (a) Sender 4 m above ground, Receiver 1.5 m above ground, horizontal separation distance 50 m (not to scale). The direct wave PD and the reflected wave PG superpose at R. (b) For frequencies whose wavelengths are in phase, superposition results in level enhancement up to 6 dB; at frequencies with wavelengths out of phase at R, levels are attenuated up to 20–30 dB. Black curve: The curve represents the predicted decibel values that need to be added to the geometric attenuation loss. The ground was modeled as a grass-covered field (flow resistivity 100 kPa s m⁻², porosity 30%, layer depth 0.01 m). Red curve: As in the black curve, but more realistic air absorption (at 20 °C, 75% relative humidity, standard atmospheric pressure) and moderate turbulence (mean-squared refractive index of 10⁻⁵) were added. Effects of temperature and wind-induced refraction were excluded in the model, which was developed by Keith Attenborough and Shahram Taherzadeh and improved by Kenneth Kragh Jensen
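The comb-filter frequencies seen in Fig. 5.11 follow from the path-length difference between the direct and ground-reflected waves. The sketch below is our own, highly idealized illustration using the geometry stated in the caption; it assumes a perfectly reflecting, phase-preserving ground, whereas the model behind Fig. 5.11 includes finite ground impedance, absorption, and turbulence.

import math

C = 343.0                                       # nominal speed of sound in air (m/s)
H_SENDER, H_RECEIVER, RANGE = 4.0, 1.5, 50.0    # geometry from Fig. 5.11a

# Direct and ground-reflected path lengths (image-source construction)
p_direct = math.hypot(RANGE, H_SENDER - H_RECEIVER)
p_ground = math.hypot(RANGE, H_SENDER + H_RECEIVER)
delta = p_ground - p_direct                     # path-length difference (m)

# With an idealized, phase-preserving reflection, destructive interference falls
# where the path difference is an odd number of half wavelengths.
notches = [(2 * n + 1) * C / (2 * delta) for n in range(3)]
print(f"path difference = {delta:.3f} m")
print("first comb-filter notches (Hz):", [round(f) for f in notches])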

humidity, and turbulence (see Attenborough et al. 24 m varied from about 5 dB at 2 kHz to 10 dB
2007). Acoustically hard ground surfaces (such as at 4 kHz, which is the range of dominant
rock or consolidated sand) produce comb-filter frequencies in many songbird songs. This foliage
effects over a wide frequency range extending to attenuation is less than, but needs to be added to,
relatively high frequencies, whereas acoustically the 28-dB attenuation caused by spherical spread-
soft surfaces (such as grasslands, forest floors, or ing over the same distance (Eq. 5.2).
unpacked snow) mainly generate the ground Some research on sound propagation through
effect at low frequencies. Recordists may reduce vegetation was motivated by a desire to attenuate
the ground effect by placing microphones as high anthropogenic noise such as road noise, but gen-
as practically possible above soft ground. For a erally and most surprisingly dense foliage only
general introduction to the phenomenon, see accounts for a small amount of attenuation.
Michelsen and Larsen (1983) or Wahlberg and Martínez-Sala et al. (2006) concluded that a
Larsen (2017). For a comparison between ground 15-m wide patch of regularly spaced trees could
effect models and outdoor recordings, see Jensen attenuate car noise by at least 6 dB. The effect was
et al. (2008). similar for more traditional noise barriers.
Defrance et al. (2002), for instance, found that a
100-m wide forest strip was effective at providing
5.2.6 Attenuation by Vegetative an acoustical barrier to noise, such as shown in
Cover Fig. 5.12, where octave-band sound was broad-
cast through dense foliage and recorded at differ-
Absorption of sound by vegetation is a compo- ent distances in the forest.
nent of EA that can further dissipate airborne At present, vegetation attenuation is not well
sounds over distance as acoustic energy is understood. A much larger database is needed
converted to heat in the plant material by viscous before it is possible to accurately predict the effect
friction. The absorption of sound in vegetation of different kinds of vegetation on sound propa-
depends on the material composition and hard- gation (see Attenborough et al. 2007).
ness of the surfaces including the soft ground
often found especially in woodland. Leaves
absorb more sound energy than a tree trunk;
5.2.7 Speed of Sound in Still Air
whereas a tree trunk reflects more sound than
leaves do. All of this is frequency-dependent.
The speed of sound in still air is affected only by
This component of EA obeys no simple rules
the ambient air temperature and, to a minimal
and needs to be measured by propagation
extent, air pressure (or altitude). If the sound
experiments in the field (e.g., Dabelsteen et al.
propagates under windy conditions, however,
1993). Aylor (1972a, b) measured sound propa-
the effective speed of sound will be modified by
gation loss through various crops, bushes, and
the wind velocity such that the wind velocity of a
trees by broadcasting from a loudspeaker and
tailwind will add to the speed of sound and the
recording at some distance with a microphone.
wind velocity of a headwind will subtract from
He found foliage enhanced absorption and scat-
the speed of sound.
tering. Price et al. (1988) modeled and measured
The speed of sound determines the arrival time
attenuation by vegetation in different forest
of a signal from the sender to the receiver and
environments and documented scattering from
bends a propagating sound wave away from
tree trunks, enhanced ground effect in the pres-
higher air temperature and towards lower air tem-
ence of mature forest litter, and attenuation by
perature (or from higher wind velocity towards
foliage. Foliage attenuation had the greatest effect
lower wind velocity). The speed of sound in air at
above 1 kHz and increased almost linearly with
21  C is 344 m/s. At freezing point, 0  C, the
the logarithm of frequency. Through mixed conif-
speed of sound in air is 331 m/s. A good
erous forest, for instance, the attenuation over

Fig. 5.12 Attenuation of octave bands of noise (63 Hz to 8000 Hz) after propagating three distances through dense foliage. Data from ISO 9613-2:1996 (International Organization for Standardization. ISO 9613-2:1996, Acoustics—Attenuation of sound during propagation outdoors—Part 2: General method of calculation. International Organization for Standardization; https://www.iso.org/standard/20649.html; accessed 9 January 2021)

approximation of the speed of sound c in dry air with 0.04% CO2 and temperature Tc (in °C) is:

c = (331.45 + 0.607 Tc) m/s        (5.6)

5.2.8 Refraction by Air Temperature Gradients in Still Air

Refraction is the change of the direction of sound propagation due to changes in the speed of sound. In the example of Fig. 5.13a, a plane wave in medium 1 hits an interface with medium 2. Some of the acoustic energy might be reflected (as in Fig. 5.8a, not drawn in Fig. 5.13a), and some of the energy is transmitted. The transmitted wave is refracted, because the speeds of sound differ in the two media. If c1 > c2, then the transmitted wave bends towards the normal (i.e., away from the interface; Fig. 5.13a); if c1 < c2, then the transmitted wave bends away from the normal (i.e., towards the interface; Fig. 5.13b). The angles of incidence and refraction (transmission) are related via Snell’s law (named after Dutch astronomer and mathematician Willebrord Snell):

sin θi / sin θt = c1 / c2        (5.7)

Note that, while the frequency of the sound does not change during transmission, the wavelength does change. With c = λf (see Chap. 4, section on the speed of sound), the wavelength is smaller in the medium with lower sound speed.
Refraction of sound waves in air is a common phenomenon due to vertical gradients of air temperature and/or wind velocity. A gradual change in sound speed is illustrated in Fig. 5.13b, where the rays bend more and more upwards as the sound speed increases. In terrestrial environments, the sound source is typically located close to the ground. A sound speed profile that has the speed of sound increase with altitude is downward refracting, while a sound speed profile that has the speed of sound decrease with altitude is upward refracting. Bent propagation paths have the effect that sound appears to arrive from a non-intuitive (i.e., not straight-line) direction. This phenomenon is like an acoustic mirage in analogy to optical mirages, which produce displaced images of far-away objects and which are also caused by refraction (of light).
The EA from refraction may be positive or negative, and so RL may be smaller or greater


Fig. 5.13 (a) Sketch of refraction at a boundary between medium 1 (high sound speed) and medium 2 (low sound speed). Three rays (black arrows) are shown, hitting the interface at times t1–t3. Each gives rise to secondary waves (by Huygens’ principle) starting at the points marked with small suns. At time t3, the third ray just meets the interface, the second ray has produced a small secondary wave, and the first ray’s secondary wave has grown quite a bit. Drawing the tangent to the secondary waves at time t3 yields the new wavefront (green line) in the second medium. With rays, by definition, being perpendicular to the wavefronts, it can be seen that the rays bend towards the normal in the second medium (θt < θi). Successive wavefronts are drawn to show that they are spaced farther apart in the medium with higher sound speed, and so the wavelength λ is greater in the medium with higher sound speed. (b) Sketch of gradual refraction by a vertical gradient in sound speed. In the illustrated example, c1 < c2 < c3 < c4 < c5
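Equations 5.6 and 5.7, together with the layered situation sketched in Fig. 5.13, are easy to evaluate numerically. The following minimal Python sketch is our own illustration; the function names and example temperatures are assumptions, not values from the chapter.

import math

def sound_speed(temp_c):
    """Speed of sound in dry air (m/s) as a function of temperature in degrees C (Eq. 5.6)."""
    return 331.45 + 0.607 * temp_c

def refraction_angle(theta_i_deg, c1, c2):
    """Transmission angle from Snell's law, sin(theta_i)/sin(theta_t) = c1/c2 (Eq. 5.7)."""
    s = math.sin(math.radians(theta_i_deg)) * c2 / c1
    if abs(s) > 1.0:
        return None     # beyond the critical angle: no transmitted ray
    return math.degrees(math.asin(s))

c_warm = sound_speed(30.0)   # e.g., warm air near the ground
c_cold = sound_speed(10.0)   # cooler air in the layer above
print(round(c_warm, 1), round(c_cold, 1))                 # ~349.7 and ~337.5 m/s
print(round(refraction_angle(40.0, c_warm, c_cold), 1))   # smaller angle: ray bends towards the normal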

than predicted without a refracting atmosphere. Air temperature varies throughout the day and creates varying temperature gradients. So, recording at the same location at a different time of day can produce different results. Therefore, taking periodic measurements of the ambient temperature at different heights above the ground can provide the researcher with a notion of whether sound propagation is changing and at what pace.
In still air during daytime, the air is both warmer and more humid close to the ground and a stable air temperature gradient can be established with warmer air near the ground, because of sunlight heating the ground, which warms up much faster than the overlaying air. At higher elevations, the air temperature decreases by 0.01 °C/m (Fig. 5.14a). Sound waves consequently bend away from locations near the ground where the temperature is higher and upwards towards locations with lower temperatures (Fig. 5.14b). Horizontal rays will be directed upwards as will downwards directed rays after bouncing from the ground. Therefore, a certain limiting ray exists that defines a shadow zone around the sound source, where the sound level decreases way faster than predicted from distance alone (Fig. 5.14b). While the shadow zone cannot be reached by a direct path, it may be ensonified by reflection off houses (or other reflectors) in the vicinity and by paths passing through turbulence, and the shadow zone is thus not totally quiet.
For example, on a sunny day with little wind, the air temperature can be 30 °C at the ground (c = 351 m/s), but at 2–3 m above ground, the temperature may be only 25 °C (c = 347 m/s). This decrease continues up through the atmosphere by 1 °C/100 m, the so-called temperature lapse. With such an air temperature gradient, the sound rays from a sound source located a few meters above ground will bend upwards, because part of the wave closest to the warmer ground will travel the fastest. In a carefully conducted experiment, a combination of upward refraction, strong upwind propagation, and air absorption was measured to reduce the level of propagating sound at a distance of 640 m by up to 20 dB more than predicted from Eq. 5.2 (Attenborough

Fig. 5.14 Sketch of the effects of upward refracting Reprinted by permission from Springer Nature. Acoustic
sound speed gradients on outdoor sound propagation. (a) Conditions Affecting Sound Communication in Air and
Temperature profile: Air temperature and consequently Underwater, Larsen and Radford (2018), Fig. 5.5.4. In: H
sound speed increases towards the ground in still air. (b) Slabbekoorn, RJ Dooling, AN Popper and RR Fay (eds).
Ray traces: Sounds from a source (filled circle, here 5 m Effects of Anthropogenic Noise on Animals, Springer
above ground) are refracted upwards, creating a circular Handbook of Acoustic Research 66, Springer Science
shadow zone close to the ground around the source. and Business Media, LLC, part of Springer Nature:
Dashed line indicates a sound ray bouncing off the ground. New York, Heidelberg, Dordrecht, London.
(c) Wind velocity profile: Similar upward refraction is pp. 109–144. https://doi.org/10.1007/978-1-4939-8574-
created upwind. Arrows indicate wind direction towards 6_5. # Springer Nature, 2018. All rights reserved
the source (“headwind”) and their length wind speed.

2007). Perhaps for this reason, birds do not com- infrasonic elephant call during the middle of the
monly sing in open environments near the ground day would travel no more than 1 km (i.e., be heard
on sunny days. Rather, they sing in flight well over an area of 3 km2), but an elephant call at
above ground, or from a perch (Wiley 2009). night might be heard over an area of 300 km2 (see
On calm nights, the opposite air temperature also, Garstang et al. 1995; Larom et al. 1997).
gradient can occur close to ground (called tem- Elephants might adjust timing and abundance of
perature inversion) as it cools faster than the their low-frequency calls and apply them specifi-
overlaying air. Air temperatures increase up to cally for long-distance communication according
50–100 m above ground before decreasing again to atmospheric conditions.
with altitude. Therefore, sound rays bend down- An air temperature gradient can arise in other
wards and hit the ground (Fig. 5.15). A tempera- locations than just close to ground. Geiger (1965)
ture inversion favors long-distance sound found the air in and above the forest canopy begin-
propagation as it leads to higher received levels ning to warm immediately after sunrise, whereas
than predicted by spherical spreading. For this the air below the canopy was slower to respond.
reason, nocturnal communication distances of This creates a bilinear sound speed profile with an
low-frequency African savanna elephant upward refracting gradient above the canopy and a
(Loxodonta africana) sound doubled on the downward refracting gradient below the canopy.
savanna to as much as 10 km (Garstang et al. So, for a short period after sunrise, vocalizing birds
1995). In these conditions, sound energy is and, for instance, howler monkeys (Alouatta sp.)
channeled making spreading losses effectively located below the canopy can increase the range of
cylindrical, rather than spherical within the sur- their vocalizations relative to later in the day (Wiley
face layer. Garstang (2010) suggested that a loud and Richards 1978; Wiley 2009).

Fig. 5.15 Sketch of the effects of downward refracting the source (“tailwind”) and their length wind speed.
sound speed gradients on outdoor sound propagation. (a) Reprinted by permission from Springer Nature. Acoustic
Temperature profile: On calm nights, air temperature and Conditions Affecting Sound Communication in Air and
consequently sound speed may increase with height above Underwater, Larsen and Radford (2018), Fig. 5.5.5. In: H
ground until temperature lapse starts. (b) Ray traces: Slabbekoorn, RJ Dooling, AN Popper and RR Fay (eds).
Sounds from a source (filled circle, here 5–10 m above Effects of Anthropogenic Noise on Animals, Springer
ground) are refracted downwards, creating higher sound Handbook of Acoustic Research 66, Springer Science
levels with distance than predicted from spherical spread- and Business Media, LLC, part of Springer Nature:
ing. (c) Wind velocity profile: Similar downward refrac- New York, Heidelberg, Dordrecht, London.
tion with increased sound levels may be created pp. 109–144. https://doi.org/10.1007/978-1-4939-8574-
downwind. Arrows indicate wind direction away from 6_5. # Springer Nature, 2018. All rights reserved

5.2.9 Refraction by Gradients of Wind Velocity

Strong air temperature gradients cannot exist during strong wind conditions, so the effects of wind velocity on sound propagation in open environments are more influential than air temperature gradients (Attenborough 2007). Wind may cause a shift in sound direction such that the appearance from where the sound is generated differs from where it is actually sent (acoustic mirage). Wind velocity gradients can enhance or impede sound propagation, leading to negative or positive EA. The actual speed of sound is the sum of the air temperature-generated speed of sound and the net wind velocity.
Attenborough et al. (2007) reported the general relationship between the sound speed profile c(z), the air temperature profile T(z), and the wind velocity profile u(z), where z is the height above ground, when the wind blows in the direction of sound propagation (when the wind blows against propagation, u(z) is subtracted):

c(z) = c(0) √[(T(z) + 273.15) / 273.15] + u(z)        (5.8)

Wind velocity is lowest at the ground and increases with altitude (Figs. 5.14c, 5.15c). Sound traveling upwind refracts upwards and sound traveling downwind refracts downward (Fig. 5.14b, Fig. 5.15b). As with temperature gradients, this creates a shadow zone upwind (Fig. 5.14b), where the sound is not heard. Downwind, sounds propagate in a channeled way (Fig. 5.15b) with less loss. Sound attenuates more against the wind than with the wind. Despite this

Fig. 5.16 Noise map showing the received levels 50 cm above ground of a gunshot fired towards east at a location (small red circle in dark blue area upper left corner) close to a lake (lake contour lines indicated by thin black curves) with varied topography. The color coding indicates iso-dB-curves in 5-dB steps. The dark arrow indicates wind direction and its length corresponds to 300 m on the ground. Note how the wind attenuates the gunshot upwind and enhances it downwind. Noise map calculated by DELTA—a part of FORCE Technology, Hørsholm, Denmark, using Nord2000 software (https://eng.mst.dk/air-noise-waste/noise/traffic-noise/nord2000-nordic-noise-prediction-method/; accessed 23 December 2020). Figure donated by Jesper Madsen, Aarhus University
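Effective sound speed profiles such as Eq. 5.8 underlie maps like Fig. 5.16: whether rays bend up (upwind) or down (downwind) depends on how c(z) changes with height. The sketch below is our own reading of Eq. 5.8, taking c(0) as the speed of sound at 0 °C (consistent with Eq. 5.6); the temperature lapse and logarithmic wind profile are purely illustrative assumptions.

import math

def effective_sound_speed(z, temp_profile, wind_profile, c0=331.45, downwind=True):
    """Effective sound speed at height z (Eq. 5.8): temperature term plus or minus wind speed."""
    c_temp = c0 * math.sqrt((temp_profile(z) + 273.15) / 273.15)
    return c_temp + wind_profile(z) if downwind else c_temp - wind_profile(z)

def temp(z): return 20.0 - 0.01 * z        # hypothetical lapse: 0.01 deg C/m from 20 deg C at the ground
def wind(z): return 4.0 * math.log1p(z)    # hypothetical wind speed increasing with height (m/s)

for z in (0.0, 2.0, 10.0, 50.0):
    cd = effective_sound_speed(z, temp, wind, downwind=True)
    cu = effective_sound_speed(z, temp, wind, downwind=False)
    print(f"z = {z:5.1f} m   downwind c = {cd:6.1f} m/s   upwind c = {cu:6.1f} m/s")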

common phenomenon, Wiley (2009) commented pattern radiating from the sound source is typi-
that there are no documented cases of animals cally irregular in shape (rather than concentric)
selectively communicating downwind. But refrac- and helps identify environmental conditions that
tion by gradients of wind velocity played a signifi- impede or promote sound propagation. Sound
cant role in Civil War battles in the rolling hills of mapping tools can commonly utilize data on
the eastern U.S. There was no radio communica- topography and ground absorption, air tempera-
tion in the nineteenth century, so commanders ture, and wind direction and speed. The example
often depended on what they heard of the battle in Fig. 5.16 shows how wind attenuated noise
in front of them to make decisions about troop from a gunshot upwind but enhanced received
movements. An acoustic shadow zone existed dur- levels downwind.
ing the Battle of Gettysburg and commanders
could not hear the sounds of battle just 10 miles
away, whereas people 150 miles away in
5.2.10 Attenuation from Air
Pittsburgh clearly heard the skirmish (Ross 2000).
Turbulence
Sound maps portray the attenuation of sound
over distance from a source. The maps take a
Turbulence refers to unsteady and irregular
bird’s-eye view, showing attenuation in 360
motion of the air. It is very difficult to model
about a sound source. Such maps can be produced
and predict. It may be mechanically or thermally
at a specific receiver altitude, or commonly show
induced. Mechanical turbulence is caused by fric-
maximum received levels over a range of
tion, for example, when air moves over rough
altitudes with the intent of yielding “conserva-
ground or past obstacles such as houses and
tive” estimates of received level. The attenuation
trees. Friction causes eddies and thus turbulence.

This turbulence is stronger in higher wind speeds Fig. 5.17, two gentoo penguins (Pygoscelis
and rougher terrain. Turbulence is particularly papua) are communicating within their nesting
great during fall winds, which shoot down the colony in Antarctica. The sender (i.e., the source)
slope of a mountain. Thermal turbulence is cre- emits a penguin display call. The call spreads
ated when the sun heats the ground unevenly. For through the habitat, experiencing various forms
example, bare ground warms up faster than fields of attenuation. The receiver is another gentoo
with vegetative cover or bodies of water. Convec- penguin. It might respond acoustically and thus
tive air currents are established with warm and become the next sender. Whether this two-way
less dense air rising and cold and denser air sink- acoustic communication is successful, depends
ing. These currents, in turn, may generate eddies. on a number of parameters.
Eddies may extend from the ground to a few The locations of sender and receiver matter;
hundred meters height. They can be of various the closer together they are, the better the com-
sizes (height and diameter) and larger eddies may munication—most likely. If the source emission
break up into smaller ones. Because of air tem- pattern is directional rather than omnidirectional
perature, gradients and wind, air is always in (i.e., the call can be emitted in a specific direc-
motion and this motion may always generate tion), then the orientation of the sender towards
turbulence. the receiver matters. Similarly, if the receiver’s
Turbulence causes EA, which increases with hearing is directional, then the receiver’s orienta-
distance from the source, with the level of turbu- tion affects communication success. A stronger
lence, and with sound frequency (see red curve in source level will increase the likelihood of suc-
Fig. 5.11b). EA is typically highest during day- cessful reception, unless the environment is
time and on hot sunny days. A characteristic of highly reverberant, in which case the echoes
turbulence on sound propagation is that received would also be louder and potentially interfere
levels at a fixed location quickly fluctuate with with communication success. The frequency con-
time and, at some range, this fluctuation stabilizes tent of the call matters, because different
at a standard deviation of about 6 dB (Daigle et al. frequencies propagate differently, and the hearing
1983). Van Staaden and Römer (1997), for abilities of the receiver are frequency-dependent.
instance, reported that at night, the sound pressure Along the path, some of the call energy is lost
level of the song of an African bladder grasshop- due to geometrical spreading and some is
per (Bullacris intermedia) over open grassland absorbed by the air, snow, and soil. The direction
was reduced with distance very close to the of propagation changes due to reflection and scat-
expected 6-dB per doubling of distance of spheri- tering off rocks, and due to refraction by sound
cal attenuation. However, during daytime, the speed gradients in air. Diffraction around
attenuation was much larger and more variable mountains might play a role over longer ranges.
due to air turbulence. Ambient noise in the environment does not affect
For more in-depth reading on outdoor sound sound propagation; i.e., it neither leads to attenu-
propagation, please see Attenborough et al. ation nor changes the direction of propagation.
(2007), Attenborough et al. (2007), Larsen and Ambient noise in the environment affects
Wahlberg (2017), Wahlberg and Larsen (2017), whether the call is received and correctly
or Larsen and Radford (2018). interpreted. Ambient noise can be of abiotic,
biotic, or anthropogenic origin. Wind causes
noise, as do waves and breaking ice. The other
5.3 The Source-Path-Receiver penguins in the colony create ambient noise with
Model for Animal Acoustic their own acoustic communications. Human pres-
Communication ence (e.g., chatting tourists stomping through the
snow towards the penguin colony) might add to
The SPRM can be used to examine acoustic com- the ambient noise. Ambient noise at the location
munication among animals. In the example of of the receiver lowers the signal-to-noise ratio

Fig. 5.17 (diagram content) Source parameters: location, orientation, emission directivity, source level, source spectrum, redundancy. Path effects: geometrical spreading, absorption, reflection and scattering, diffraction and refraction; ambient noise: abiotic, biotic, anthropogenic. Receiver parameters: location, orientation, reception directivity, audiogram, critical ratios, masking release.

Fig. 5.17 Example of the SPRM for animal acoustic communication. The source is a gentoo penguin emitting its display call within its nesting colony in Antarctica. The sound propagation path takes the call through the local habitat. The receiver is another gentoo penguin in a neighboring colony who might respond acoustically, thereby becoming the next source. The parameters that affect successful communication are listed below the source and the receiver. Along the path, the call experiences various propagation effects leading to attenuation. Ambient noise in the habitat stems from waves, wind, and ice (abiotic), other penguins (biotic), and perhaps humans (anthropogenic). Ambient noise at the receiver reduces the signal-to-noise ratio and hence the detectability of the call. Ambient noise at the source may lead to increases in source level and repetition (redundancy) and shifts in spectral content (Lombard effect)
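The path and noise terms summarized in Fig. 5.17 can be combined into a simple first-pass check of whether a call is detectable at the receiver: predict RL from the source level and propagation loss (Eq. 5.4), and compare the resulting signal-to-noise ratio against the receiver's critical ratio (see Chap. 10). The sketch below is a minimal illustration under those assumptions; all numbers are hypothetical, not measured penguin values.

import math

def is_call_detectable(source_level, range_m, noise_level, critical_ratio,
                       excess_attenuation_db=0.0):
    """First-pass audibility check: RL = SL - 20*log10(r) - EA (Eq. 5.4);
    the call is taken to be detectable if RL minus the noise level exceeds the critical ratio."""
    received = source_level - 20.0 * math.log10(range_m) - excess_attenuation_db
    snr = received - noise_level
    return snr >= critical_ratio, received, snr

# Hypothetical numbers: a 95 dB re 20 uPa m call, colony noise of 45 dB in the
# relevant band, and a 20-dB critical ratio for the receiver.
for r in (10, 100, 500):
    ok, rl, snr = is_call_detectable(95.0, r, 45.0, 20.0)
    print(f"range {r:4d} m: RL = {rl:5.1f} dB, SNR = {snr:5.1f} dB, detectable: {ok}")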

(SNR) at which the call is received. The critical 5.3.1 The Sender
ratios (specific to the receiver’s auditory system;
see Chap. 10) dictate, below which SNR the call In animal acoustic communication, the signal that
is masked by the ambient noise and thus not is being sent depends on the sender’s species,
detected. At intermediate SNRs, the call might demographic parameters, behavioral state, and
be detected, but not correctly interpreted. many other factors. Obviously, different taxo-
Masking-release processes (also specific to the nomic groups produce different sounds, ranging
receiver’s auditory system) include comodulation from infrasonic rumbles of elephants to ultrasonic
masking release and spatial release from masking clicks of bats (see Chap. 8 on classifying animal
(e.g., Erbe et al. 2016) and aid signal detection sounds). But even closely-related species may be
and interpretation. Ambient noise at the sender told apart acoustically. For example, Gerhardt
may lead to the Lombard effect (Lombard 1911), (1991) found that the number of pulses in the
whereby the sender raises the source level of its advertisement call in male Eastern gray treefrogs
call, actively changes the spectral characteristics (Dryophytes versicolor) and Cope’s gray
to move sound energy out of the frequency band treefrogs (Dryophytes chrysoscelis) is the major
most at risk from masking, and repeats the call to cue distinguishing sympatric males who are simi-
increase the likelihood of reception. Finally, lar in size and color. While species-specific calls
ambient noise may instill anti-masking strategies of bats have been recognized for decades
in both sender and receiver whereby they change (Balcombe and Fenton 1988; Fenton and Bell
their location and orientation (both towards each 1981; O’Farrell et al. 1999), more recently,
other) to foster communication success. acoustic differences have been noted in bat

species that are difficult to tell apart morphologi-


cally (Gannon et al. 2001; Gannon et al. 2003;
Gannon and Racz 2006). The more we record and
document species’ repertoires, the more success-
ful bioacousticians will become at identifying the
sender’s species.
Within the same species, populations living
in different geographic regions and habitats
may exhibit differences in their sounds, as
demonstrated for Italian vs. English tawny owls
(Strix aluco; Galeotti et al. 1996), pikas
(Ochotona spp.; Trefry and Hik 2010), and
chimpanzees (Pan troglodytes schweinfurthii;
Mitani et al. 1992). Animals can tell conspecifics
from a different region or population apart. Audi-
tory neighbor-stranger discrimination has been
demonstrated, for instance, in concave-eared tor-
rent frogs (Odorrana tormota; Feng et al. 2009)
and alder flycatchers (Empidonax alnorum;
Lovell and Lein 2004), where territory holders Fig. 5.18 Spectrograms of close calls of three banded
respond less aggressively towards played-back mongoose (two females and one male; top to bottom)
during a. digging, b. searching, and c. moving between
neighbor songs than to those of strangers, the
foraging sites. Black arrows point to the individually stable
“dear enemy effect.” foundation of each call. Dashed arrows point to the har-
Not just population identity, but even individ- monic extension, the duration of which was correlated with
ual identity may be encoded in the outgoing behavior (Jansen et al. 2012). # Jansen et al.; https://link.
springer.com/article/10.1186/1741-7007-10-97. Published
signal; for example, in oilbirds (Steatornis
under a Creative Commons Attribution License; https://
caripensis; Suthers 1994), banded mongoose creativecommons.org/licenses/by/2.0/
(Mungos mungo; Fig. 5.18; Jansen et al. 2012),
and in fallow deer (Dama dama; Vannoni and animals, such as penguins, gulls, pinnipeds, and
McEligott 2007). Galeotti and Pavan (1991) stud- bats especially rely on individual acoustic recog-
ied an urban population of non-songbirds, tawny nition between a mother and offspring. These
owls, in Pavia, Italy, and demonstrated that the mothers often leave their young in a colony
males’ territorial hoots have a clear species- while they forage, so proper recognition of their
specific structure with individual variations own young upon return is important to fitness.
mainly in the final note of the call. Bats use Especially in birds without nests and physical
individualized calls as they aggregate. For exam- landmarks such as king penguins (Aptenodytes
ple, Melendez and Feng (2010) determined that patagonicus), acoustic recognition between
communication calls of little brown bats (Myotis parents and chicks becomes critical (Aubin and
lucifugus) were individually distinct in minimum Jouventin 2002; Searby et al. 2004).
and maximum frequency, and call duration. Indi- As organisms grow, their physical dimensions
vidual pallid bats (Antrozous pallidus) emitted and size of their sound-producing organs become
unique calls below the frequency of their echolo- larger. Generally, emitted sounds transition from
cation clicks and in the presence of other bats high-frequency, low-amplitude sounds to
(Arnold and Wilkinson 2011). Wilkinson and low-frequency, high-amplitude sounds (Hardouin
Boughman (1998) provided evidence that the et al. 2014). It is partly a consequence of the
greater spear-nosed bat (Phyllostomus hastatus) simple physiology that animals cannot efficiently
used individual social calls to coordinate feeding emit sounds with wavelengths longer than the
on clumped nectar and fruit resources. Colonial dimensions of their sound-emitting organs (e.g.,

see Michelsen 1992; Genevois and Bretagnolle Context further determines acoustic signaling.
1994; Fletcher 2004, and Larsen and Wahlberg For example, predators often hunt quietly, and
2017). For instance, Charlton et al. (2011) prey remain silent when it is aware of being
reported that increased body size in male koalas stalked. A classic case where (prey) moths
(Phascolarctos cinereus) was reflected in the attempt to jam (predator) bat echolocation signals
closer spacing of vocalization formants. with a counter signal to confuse the approaching
(Formants refer to a concentration of acoustic predator has developed another twist. Ter
energy around particular frequencies caused by Hofstede and Ratcliffe (2016) found that, “spe-
resonances in the vocal tract.) Stoeger-Horwath cific predator counter-adaptations include calling
et al. (2007) reported age-dependent variations in at frequencies outside the sensitivity range of
the grunt and trumpet calls of African savanna most eared prey, changing the pattern and fre-
elephants. The grunts were only recorded in quency of echolocation calls during prey pursuit,
individuals less than 2 months of age and infants and quiet, or ‘stealth,’ echolocation.” Acoustic
never produced trumpet calls until they were interactions between a parent and offspring are
3 months old. The authors also reported often brief and relatively quiet to conceal and
age-dependent variations in the low-frequency protect the young. In contrast, messages with a
rumble; older individuals rumbled at a lower fun- high reproductive value, such as mating calls or
damental frequency than younger individuals, territorial defense calls, and calls with high sur-
and there also was a tendency for rumble duration vival value, such as infant distress calls or adult
to increase slightly with age. Weddell seal alarm calls, are produced loudly and repeatedly.
(Leptonychotes weddellii) pups on rookeries To this point, it has been shown that distress calls
emit high-frequency calls that transition into of three species of pipistrelle bats (Pipistrellus
low-frequency adult calls used exclusively while nathusii, P. pipistrellus, and P. pygmaeus) were
hauled-out on the ice (Thomas and Kuechle structurally convergent, “consisting of a series of
1982). Reby and McComb (2003) reported that downward-sweeping, frequency-modulated
lower-frequency male roars in red deer (Cervus elements of short duration and high intensity
elaphus) stags were associated with greater age with a relatively strong harmonic content” (Russ
and weight, so provided “honest” cues about et al. 2004). The study suggested that it was not as
reproductive condition. important to have species-specific signals as it
In many species, sex-specific differences in the was to have some device that produced a mob-
acoustic repertoires are employed to insure proper bing by bats of the predator regardless of species
mate selection (Hardouin et al. 2014). The of bat.
sender’s reproductive state and drive for mating Ambient noise at the location of the sender
often is represented in its acoustic signals. In may also affect signal emission level, repetition,
songbirds and many orthopteran insects, only and spectral shifts (collectively called the Lom-
males sing (Miller et al. 2007; Riede et al. bard effect; Brumm and Zollinger 2011). For
2010). Songs are under the influence of reproduc- instance, male túngara frogs (Engystomops
tive hormones associated with courtship, and pustulosus) increased the level, repetition, and
songbird songs are long, complex, and repeated complexity of their calls when noise overlapped
in a typical and recognizable sequence of sounds. with their normal frequency band of calling but
In species in which males compete acoustically to not when noise was higher and non-overlapping
attract a female mate, a substandard mating call in frequency (Halfwerk et al. 2016). Brumm
could indicate immaturity, agedness, or poor (2004) and Brumm and Todt (2003) noted that
health of the caller. For example, Hardouin et al. birds in a noisy environment called louder and
(2007) examined hoots by 17 male scops owls more often, and repositioned themselves, possi-
(Otus scops) on the Isle of Oléron, France. bly to increase the likelihood of the sound being
Heavier male owls made lower-frequency hoots, received. Similarly, greater horseshoe bats
which could give them a competitive mating (Rhinolophus ferrumequinum) increased their
advantage over lighter weight males. call level and shifted frequency in noisy

environments (Hage et al. 2013). Eliades and wind, there may be noise from branches creaking
Wang (2012) examined the neural processes and breaking in the heat or noise from rustling
underlying the Lombard effect in marmoset leaves in the understory as animals walk through.
monkeys (Callithrix jacchus) and found that Wind also drives waves; surf noise or noise from
increased vocal intensity was accompanied by a breaking waves is typical for coastal areas. Even
change in auditory cortex activity toward neural without wind, moving water, such as waterfalls,
response patterns observed during vocalizations can be noisy. Precipitation (i.e., rain, hail, thun-
under normal feedback conditions. der, and lightning) creates noise. Geological
Many animal communication calls are close to events such as earthquakes, seismic rumblings,
being omnidirectional, radiating equally in all and volcanic eruptions contribute noise to the
directions—at least at their lower frequencies terrestrial soundscape. In polar regions, melting
(Larsen and Dabelsteen 1990). However, some ice and calving glaciers contribute to ambient
bird species (e.g., juncos, warblers, and finches) noise.
showed an ability to focus their calls in the direc- Biotic ambient noise comes from animals in
tion of an owl to warn-off the predator. Yorzinski the environment. These can be of the same or
and Patricelli (2009) examined the acoustic direc- different species from the target species. Several
tionality of antipredator calls of 10 species of taxa call in large numbers at certain times of day
passerines and found that some birds would and season, significantly raising ambient noise
“call out of the side of their beaks” with their levels (e.g., chorusing cicadas, katydids, or
head pointed away from conspecifics in an appar- frogs). Biologists typically think of soniferous
ent attempt at ventriloquist behavior. Whether animals as calling with specialized anatomies for
terrestrial animals can actively change the sound sound production (i.e., syringes in birds and vocal
emission directivity in response to noise (in order cords in mammals). However, most animals also
to enhance acoustic communication) needs to be can produce mechanical sounds using external
investigated. anatomies, such as wing-stridulation by a locust,
abdomen vibration by a spider, beak-pecking by a
woodpecker, teeth-chattering by a squirrel, foot-
5.3.2 The Path and the Acoustic thumping by a rabbit, etc. In addition, animals can
Environment produce unintentional sounds, such as noise
associated with rustling leaves as an animal
As the signal leaves the sender and travels walks through a forest, respiration noise, flight
through the environment, it is subjected to various noise, feeding sounds, etc., not intended for com-
forms of attenuation (as detailed above) and so munication with a conspecific. Example
the level at the receiver location is less than the spectrograms for many of these sounds are
source level. In addition, ambient noise at the found in Chap. 7 on soundscapes as well as
receiver location reduces the SNR, making it Chap. 8 on detecting and classifying animal
harder for the receiver to detect the signal. Ambi- sounds.
ent noise may be classed according to its sources: Anthropogenic ambient noise is due to aircraft,
abiotic, biotic, or anthropogenic. Chapter 7 road traffic, trains, ships, military activities, con-
provides a detailed overview of ambient noise struction activities, etc. Increasing encroachment
with example spectrograms. of human activities on animal habitats results in
In terms of abiotic ambient noise, wind is a increased noise exposure for all taxa of animals
major contributor and its noise level increases (see Chap. 13 on noise impacts).
with wind speed. In addition, remember that the Ambient noise varies with time on scales of
direction of wind (i.e., upwind or downwind) hours, days, lunar phase, season, and year. The
affects the distance that sounds propagate. Wind reason is a combination of sound propagation
drives other types of noise, such as noise from effects and source behavior. The time of day and
vegetation moving in the wind. Even without season of year affect sound propagation. As
176 O. N. Larsen et al.

explained above, sounds can be heard from far- In American mink (Neovison vison), for
ther away during the night; for example, a train instance, hearing-sensitivity and frequency range
can be heard in the distance at night, but not changed markedly with postnatal age. Pups up to
during the day. Walking in the woods during the 32 days old were almost deaf, whereas three
winter, the listener can hear sounds over much weeks later, their audiogram started to resemble
greater distances than during the summer with that of an adult (in shape), but they remained less
thick vegetation. In many animals, sound- sensitive than adults, especially below 10 kHz
production rates are highest during the breeding (Brandt et al. 2013). There might be good reasons
season. Chorusing insects, amphibians, and birds why hearing in young is immature. For example,
precisely time the commencement of their a male fruit fly (Drosophila melanogaster) cannot
cacophonies to a breeding season each year. hear the female’s flight tone until he is physically
Amphibians stop calling when they go into winter mature enough to mate (Eberl and Kernan 2011).
hibernation, so chorusing can stop abruptly in late This ensures the female fruit fly that any pursuing
autumn. Some birds migrate, so their songs are male is mature. Hearing capabilities further
missing from the winter soundscape. Many change over an adult’s life. Natural deterioration
migrating birds are soniferous and their flight with age due to anatomical and physiological
calls can temporarily dominate the soundscape aging is a process called presbycusis. Hearing
as they pass through an area during a spring loss can also be caused by acute noise exposure
migration (e.g., a honking flock of migrating at strong levels and chronic exposure to moderate
geese or a chirping flock of starlings). Yet, other noise (see Chap. 13). Hearing loss likely affects
species of birds remain in temperate areas over the ability of a receiver to hear and interpret a
winter and produce sounds all year long (e.g., sender’s message. For example, a hearing-
cardinals, sparrows, and snow juncos). Tropical impaired moth, which typically avoids a bat pred-
insects, frogs, and birds can reproduce multiple ator through an evasive flight pattern, will be
times per year, they do not migrate or hibernate, easier to capture if the bat’s echolocation signals
and so are soniferous throughout the year. Diurnal are not heard.
cycles exist in all animals with birds calling in the The receiver’s sex rarely influences its hearing
morning, insects in the afternoon, frogs in the capabilities; however, Narins and Capranica
evening, and nocturnal animals in the middle of (1976, 1980) provided an example of sex
the night. differences in the auditory reception system of a
Puerto Rican treefrog, the coquina frog (Eleuther-
odactylus coqui). Male and female treefrogs
responded to different notes of the male’s
5.3.3 The Receiver
two-note, co-qui call. Females were attracted to
the qui-part of the call. Males paid most attention
The same factors that can affect the sender also
to the co-part of the call, which was important in
could affect the receiver’s ability to detect and
male–male aggressive interactions. The authors
interpret a signal (i.e., species, population, indi-
found that the inner ear basilar papilla was tuned
vidual traits, age, sex, context, and ambient
differently in males and females; males had fewer
noise). On the species level, different species
fibers tuned to the qui-part of the call and females
typically hear sound at different frequencies and
had fewer fibers tuned to the co-part of the call.
levels. In other words, audiograms are species-
These differences also occurred in higher-order
specific (Fig. 5.19). Fortunately, data on hearing
neurons in the brain, where response decisions
abilities of invertebrates, insects, reptiles,
take place. Later studies (Mason et al. 2003)
amphibians, fish, birds, and mammals continue
showed similar sexual differences in the middle
to accumulate (see Volume 2). Nonetheless,
ear of bullfrogs (Lithobates catesbeianus).
there is some intra-species and individual
Ambient noise is a ubiquitous factor
variability in hearing (see Chap. 10).
influencing signal reception and interpretation.
5 Source-Path-Receiver Model for Airborne Sounds 177

10 Hz 100 Hz 1 kHz 10 kHz 100 kHz 1 MHz


Tuna 50 Hz-1.1 kHz (4.5 8va)
Chicken 125 Hz-2 kHz (4.0 8va)
Goldfish 20 Hz-3 kHz (7.2 8va)
Bullfrog 100 Hz-3 kHz (4.9 8va)
Catfish 50 Hz-4 kHz (6.3 8va)
Tree frog 50 Hz-4 kHz (6.3 8va)
Canary 250 Hz-8 kHz (5.0 8va)
Cockatiel 250 Hz-8 kHz (5.0 8va)
Parakeet 200 Hz-8.5 kHz (5.4 8va)
Elephant 17 Hz-10.5 kHz (9.3 8va)
Owl 200 Hz-12 kHz (5.9 8va)
Human 31 Hz-19 kHz (9.3 8va)
Chinchilla 52 Hz-33 kHz (9.3 8va)
Horse 55 Hz-33.5 kHz (9.3 8va)
Cow 23 Hz-35 kHz (10.6 8va)
Raccoon 100 Hz-40 kHz (8.6 8va)
Sheep 125 Hz-42.5 kHz (8.4 8va)
Dog 64 Hz-44 kHz (9.4 8va)
Ferret 16 Hz-44 kHz (11.4 8va)
Hedgehog 250 Hz-45 kHz (7.5 8va)
Guinea pig 47 Hz-49 kHz (10.0 8va)
Rabbit 96 Hz-49 kHz (9.0 8va)
Sea lion 200 Hz-50 kHz (8.0 8va)
Gerbil 56 Hz-60 kHz (10.1 8va)
Opossum 500 Hz-64 kHz (7.0 8va)
Albino rat 390 Hz-72 kHz (7.5 8va)
Hooded rat 530 Hz-75 kHz (7.1 8va)
Cat 55 Hz-77 kHz (10.5 8va)
Mouse 900 Hz-79 kHz (6.4 8va)
Little brown bat 10.3 kHz-115 kHz (3.5 8va)
Beluga whale 1 kHz-123 kHz (6.9 8va)
Bottlenose dolphin 150 Hz-150 kHz (10.0 8va)
Porpoise 75 Hz-150 kHz (11.0 8va)
C C C C C C C C C C C C C C C C C
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Fig. 5.19 Hearing ranges of some animals and humans. (1994), Heffner (1983), Heffner and Heffner (2007),
Bars represent the approximate hearing frequency range, Lipman and Grassi (1942), Warfield (1973), and West
ordered after increasing upper frequency cut-off; blue: (1985), previously compiled by Vanderbilt University
fish, gray: bird, green: frog, orange: terrestrial mammal, and Louisiana State University (http://lsu.edu/deafness/
violet: human, and brown: marine mammal. The red verti- HearingRange.html; accessed 6 January 2021), and plot-
cal lines are the frequencies of musical notes C0–C16, for ted by Wikimedia Commons author Cmglee. https://
comparison. There is one octave between successive commons.wikimedia.org/wiki/File:Animal_hearing_fre
C-notes. Middle-C on a piano is C4. A full-sized piano quency_range.svg. Figure licensed under the Creative
will only range from just under C1 to C8, with tones >C11 Commons Attribution-Share Alike 3.0 Unported license;
being ultrasound. Data from Fay (1988), Fay and Popper https://creativecommons.org/licenses/by-sa/3.0/deed.en

Having experienced various forms of attenuation et al. 2003; Dooling et al. 2009; Dooling and
along its path, a signal will be audible if its Blumenrath 2013; Dooling and Leek 2018).
amplitude remains above the power spectral den- Some birds take advantage of these limitations
sity level of the ambient noise plus the critical by producing both high-amplitude broadcast
ratio of the receiver. The critical ratio is essen- sounds and low-amplitude soft sounds. The for-
tially a minimum SNR needed for signal detec- mer become public since they cover a large active
tion (see Chap. 10 for more information on the space with many potential receivers whereas the
critical ratio). An even higher SNR is needed for latter become private as they cover a very small
signal discrimination, recognition, and finally, active space with only few receivers (Larsen
comfortable communication (Fig. 5.20; Lohr 2020).
178 O. N. Larsen et al.

Zone of … Detection

Discrimination

Recognition

Comfortable
Communication
distance
between
100
10
00 18
180
8 0 210
21 2 45 birds [m]
1 0 245

Fig. 5.20 Sketch of the radii about a calling bird over louder ambient noise, the ranges will be even less. For
which a broadcast public call might be detected, animals with soft private calls or greater critical ratios, the
discriminated, and recognized. Detection (i.e., signal pres- radii will also be less (Erbe et al. 2016). # Erbe et al.;
ence/absence) is possible over the longest ranges (i.e., https://doi.org/10.1016/j.marpolbul.2015.12.007.
lowest SNR). A higher SNR is needed for signal discrimi- Licensed under CC BY 4.0; https://creativecommons.org/
nation, then signal recognition, and finally, comfortable licenses/by/4.0/
communication, yielding progressively shorter ranges. In

The auditory systems of some animals have noise from frequencies outside of the signal fre-
built-in masking-release processes to reduce the quency to filter the noise within the frequency
impact of ambient noise. A spatial release from band of the signal. A comodulation masking
masking results from the directional hearing release has been demonstrated in gray treefrogs
capabilities of the animal. If the signal arrives (Bee and Vélez 2018), European starling (Sturnus
from a direction in which the receiver is more vulgaris; Klump and Langemann 1995), and
sensitive and if the noise arrives from a direction house mice (Mus musculus; Klink et al. 2010).
in which the receiver is less sensitive, then Addionally, animals have a host of behavioral
the reception directivity improves the SNR and adaptations to optimize sound reception. For
the signal can be detected in higher ambient example, an animal may improve the SNR for
noise. A spatial release from masking has sound arriving at its ears by approaching the
been demonstrated in several taxa including source, tilting its head, adjusting its pinnae
tropical crickets (Paroecanthus podagrosus and (in the case of mammals), or moving to another
Diatrypa sp.; Schmidt and Römer 2011), gray location away from a noise source (Nelson and
treefrogs (Bee 2008), budgerigars (Melopsittacus Suthers 2004).
undulatus; Dent et al. 1997), and pigmented
Guinea pigs (Cavia porcellus; Greene et al.
2018). A comodulation masking release is possi-
5.4 Summary
ble if the noise is broadband and amplitude-
modulated coherently across its frequencies. The
The Source-Path-Receiver Model (SPRM) is used
animal might then utilize information about the
widely in technical noise control and illustrates
5 Source-Path-Receiver Model for Airborne Sounds 179

the importance of exploring a signal at all points 5.5 Additional Resources


between the source and receiver and of under-
standing factors that affect the observations. The following sites were last accessed 3 February
This chapter developed the SPRM for the exam- 2021.
ple of animal acoustic communication (also see
Chap. 11). The influences of the sender’s and • NoiseModelling is a free software package
receiver’s species, age, sex, individual identity, developed by the French Government’s Centre
and behavioral status were discussed. The receiv- National de la Recherche Scientifique and the
ing animal’s hearing ability is a major factor for Université Gustave Eiffel to produce
communication success. sound maps: https://noise-planet.org/
Terminology related to sound propagation noisemodelling.html
(or the path) was defined and basic concepts of • Dan Russell’s Acoustics and Vibration
outdoor sound propagation were developed, Animations: https://www.acs.psu.edu/
supported with simple equations. Several factors drussell/demos.html
play an important role in sound propagation: dis-
Acknowledgement We wish to thank Prof. Keith
tance between sender and receiver, air tempera-
Attenborough for his constructive review of this chapter.
ture, wind (direction and speed), obstacles along
the path, and ground cover. The concepts of
source level, received level, sound absorption,
reflection, scattering, reverberation, diffraction, References
refraction, acoustic shadows, acoustic mirages,
air temperature gradients, and wind speed Arnold B, Wilkinson G (2011) Individual specific contact
calls of pallid bats Antrozous pallidus attract
gradients were illustrated. Two types of geomet- conspecifics. Behav Ecol Sociobiol 65:1581–1593.
ric spreading (i.e., spherical and cylindrical) https://doi.org/10.1007/s00265-011-1168-4
were applied. Examples for ray tracing were Attenborough K (2007) Sound propagation in the
provided. Ambient noise (including its abiotic, atmosphere. In: Rossing TD (ed) Springer handbook
of acoustics. Springer, New York, pp 113–147
biotic, and anthropogenic sources) in terrestrial Attenborough K, Taherzadeh S, Bass HE, Di X, Raspet R,
environments and its influence on both sender Becker GR, Güdesen A, Chrestman A, Daigle GA,
and receiver was discussed. L’Espérance A, Gabillet Y, Gilbert KE, Li YL, White
The SPRM may be applied to many other MJ, Naz P, Noble JM, van Hoof HAJM (1995) Bench-
mark cases for outdoor sound propagation models. J
bioacoustic scenarios or studies such as animal Acoust Soc Am 97(1):173–191. https://doi.org/10.
biosonar (where the sender and receiver are the 1121/1.412302
same individual; see Chap. 12) or the effects of Attenborough K, Li KM, Horoshenkov K (2007)
noise on animals (where the source might be a Predicting outdoor sound. Taylor & Francis, London
Aubin T, Jouventin P (2002) How to vocally identify kin
highway; see Chap. 13). It would also be useful to in a crowd: the penguin model. Adv Study Behav 31:
consider passive acoustic monitoring (of animals 243–277
or soundscapes) within the framework of the Aylor D (1972a) Noise reduction by vegetation and
SPRM to understand the sound sources recorded, ground. J Acoust Soc Am 51(2):197–205
Aylor D (1972b) Sound transmission through vegetation
the way the environment affects the recorded in relation to leaf area density, leaf width, and breadth
soundscape, and the effects (and potential of canopy. J Acoust Soc Am 51(1):411–414
artifacts) of the recording system (i.e., the Balcombe JP, Fenton MB (1988) The communication role
receiver; see Chaps. 2 and 7). The SPRM might of echolocation calls in vespertilionid bats. In:
Nachtigall PE, Moore PWB (eds) Animal Sonar,
also guide the bioacoustician in setting up audio- NATO ASI science (Series A: Life Sciences), vol
metric experiments (where the source is an 156. Springer, Boston, MA
engineered signal; see Chap. 10). The SPRM is Bee MA (2008) Finding a mate at a cocktail party: spatial
a fundamental concept helpful in bioacoustic release from masking improves acoustic mate recogni-
tion in grey treefrogs. Anim Behav 75(5):1781–1791.
study design and interpretation. https://doi.org/10.1016/j.anbehav.2007.10.032
180 O. N. Larsen et al.

Bee MA, Vélez A (2018) Masking release in temporally Dooling RJ, West EW, Leek MR (2009) Conceptual and
fluctuating noise depends on comodulation and overall computational models of the effects of anthropogenic
level in Cope’s gray treefrog. J Acoust Soc Am 144(4): noise on birds. Proc Inst Acoust 31(1):1
2354–2362. https://doi.org/10.1121/1.5064362 Eberl DF, Kernan MJ (2011) Recording sound-evoked
Bradbury JW, Vehrencamp SL (2011) Principles of animal potentials from the Drosophila antennal nerve. Cold
communication, 2nd edn. Sinauer Associates, Spring Harb Protoc 2011:prot5576
Sunderland, MA Eliades SJ, Wang X (2012) Neural correlates of the Lom-
Brandt C, Malmkvist J, Nielsen RL, Brande-Lavridsen N, bard effect in primate auditory cortex. J Neurosci
Surlykke A (2013) Development of vocalization and 32(31):10737–10748. https://doi.org/10.1523/
hearing in American mink (Neovison vison). J Exp Biol JNEUROSCI.3448-11.2012
216:3542–3550 Erbe C, Reichmuth C, Cunningham KC, Lucke K,
Brumm H (2004) The impact of environmental noise on Dooling RJ (2016) Communication masking in marine
song amplitude in a territorial bird. J Anim Ecol 73: mammals: a review and research strategy. Mar Pollut
434–440 Bull 103:15–38. https://doi.org/10.1016/j.marpolbul.
Brumm H, Todt D (2003) Facing the rival: directional 2015.12.007
singing behaviour in nightingales. Behaviour 140(1): Fay RR (1988) Hearing in vertebrates: a psychophysics
43–53 databook. Hill-Fay Associates, Winnetka IL
Brumm H, Zollinger SA (2011) The evolution of the Fay RR, Popper AN (1994) Comparative hearing:
Lombard effect: 100 years of psychoacoustic research. mammals. Springer handbook of auditory research
Behaviour 148(11–13):1173–1198. https://doi.org/10. series. Springer-Verlag, New York
1163/000579511X605759 Feng AS, Arch VS, Yu Z, Yu X-J, Xu Z-M, Shen J-X
Chappuis C (1971) Un exemple de l’influence du milieu (2009) Neighbor–stranger discrimination in concave-
sur les émissions vocales des oiseaux: L’évolution des eared torrent frogs, Odorrana tormota. Ethology
chants en fôret équatoriale. Terre Vie 118:183–202 115(9):851–856
Charlton BD, Ellis WA, McKinnon AJ, Cowin GJ, Fenton MB, Bell G (1981) Recognition of species of
Brumm J, Nilsson K, Fitch WT (2011) Cues to body insectivorous bats by their echolocation calls. J Mam-
size in the formant spacing of male koala mal 62:233–243
(Phascolarctos cinereus) bellows: honesty in an Fletcher NH (2004) A simple frequency-scaling rule for
exaggerated trait. J Exp Biol 214:3414–3422 animal communication. J Acoust Soc Am 115:2334–
Dabelsteen T (1981) The sound pressure level in the dawn 2338
song of the blackbird Turdus merula and a method for Galeotti P, Pavan G (1991) Individual recognition of male
adjusting the level in experimental song to the level in tawny owls (Strix aluco) using spectrograms of their
natural song. Z Tierpsychol 56(2):137–149 territorial calls. Ethol Ecol Evol 3:113–126
Dabelsteen T, Larsen ON, Pedersen SB (1993) Habitat- Galeotti P, Appleby BM, Redpath SM (1996) Macro and
induced degradation of sound signals: Quantifying the microgeographical variations in the “hoot” of Italian
effects of communication sounds and bird location on and English Tawny Owls (Strix aluco). Ital J Zool 63:
blur ratio, excess attenuation, and signal-to-noise ratio 57–64
in blackbird song. J Acoust Soc Am 93(4):2206–2220 Gannon W, Racz GR (2006) Character displacement and
Daigle GA, Piercy JE, Embleton T (1983) Line-of-sight ecomorphological analysis of two long-eared Myotis
propagation through atmospheric turbulence near the (M. auriculus and M. evotis). J Mammal 87(1):
ground. J Acoust Soc Am 74(5):1505–1513 171–179
Defrance J, Barriere N, Premat E (2002) Forest as a mete- Gannon WL, Sherwin RE, de Carvalho TN, O’Farrell MJ
orological screen for traffic noise. Proc. 9th ICSV, (2001) Pinnae and echolocation call differences
Orlando, FL, USA. between Myotis californicus and M. ciliolabrum
Dent ML, Larsen ON, Dooling RJ (1997) Free-field bin- (Chiroptera: Vespertilionidae). Acta Chiropterologica
aural unmasking in budgerigars (Melopsittacus 3(1):77–91
undulatus). Behav Neurosci 111:590–598 Gannon WL, O’Farrell MJ, Corben C, Bedrick EJ (2003)
Dooling RJ, Blumenrath SH (2013) Avian sound percep- Call character lexicon and analysis of field recorded bat
tion in noise. In: Brumm H (ed) Animal communica- echolocation calls. In: Thomas JA, Moss CF, Vater M
tion and noise, Animal signals and communication, vol (eds) Echolocation in bats and dolphins. Univ Chicago
2. Springer-Verlag, Heidelberg, pp 229–250 Press, Chicago, IL, pp 478–484
Dooling RJ, Leek MR (2018) Communication masking by Garcia M, Charrier I, Rendall D, Iwaniuk AN (2012)
man-made noise. In: Slabbekoorn H, Dooling RJ, Pop- Temporal and spectral analyses reveal individual vari-
per AN, Fay RR (eds) Effects of anthropogenic noise ation in a non-vocal acoustic display: the drumming
on animals, Springer handbook of auditory research, display of the ruffed grouse (Bonasa umbellus, L.).
vol 66, New York, pp 23–46 Ethology 118(3):292–301
5 Source-Path-Receiver Model for Airborne Sounds 181

Garstang M (2010) Elephant infrasounds: Long-range Klink KB, Dierker H, Beutelmann R et al (2010)
communication. In: Brudzynski SM (ed) Handbook Comodulation masking release determined in the
of mammalian vocalization: an integrative neurosci- mouse (Mus musculus) using a flanking-band para-
ence approach. Elsevier BV, Oxford digm. JARO 11:79–88. https://doi.org/10.1007/
Garstang M, Larom D, Raspe R, Lindeque M (1995) s10162-009-0186-7
Atmospheric controls on elephant communication. J Klump GM, Langemann U (1995) Comodulation masking
Exp Biol 198:939–951 release in a songbird. Hear Res 87(1):157–164. https://
Geiger R (1965) The climate near the ground. Harvard doi.org/10.1016/0378-5955(95)00087-K
University Press, Cambridge, MA Krokstad A, Svensson UP, Strøm S (2015) The early history
Genevois F, Bretagnolle V (1994) Male blue petrels reveal of ray tracing in acoustics. In: Xiang N, Sessler G (eds)
their body mass when calling. Ethol Ecol Evol 6:377– Acoustics, information, and communication. Modern
383 acoustics and signal processing. Springer, Cham
Gerhardt HC (1991) Female mate choice in treefrogs: Larom DL, Garstang M, Payne RR, Lindeque M (1997)
static and dynamic acoustic criteria. Anim Behav The influence of surface atmospheric conditions on the
42(4):615–635 range and area reached by animal vocalizations. J Exp
Greene NT, Anbuhl KL, Ferber AT, DeGuzman M, Allen Biol 200:421–431
PD, Tollin DJ (2018) Spatial hearing ability of the Larsen ON (1995) Acoustic equipment and sound field
pigmented Guinea pig (Cavia porcellus): minimum calibration. In: Klump GM, Dooling RJ, Fay RR,
audible angle and spatial release from masking in azi- Stebbins WC (eds) Methods in comparative psycho-
muth. Hear Res 365:62–76. https://doi.org/10.1016/j. acoustics. Birkhäuser Verlag, Basel, pp 31–45
heares.2018.04.011 Larsen ON (2020) To shout or to whisper? Strategies for
Hage SR, Jiang T, Berquist SW, Feng J, Metzner W encoding public and private information in sound
(2013) Lombard effect in horseshoe bats. Proc Natl signals. In: Aubin T, Mathevon N (eds) Coding
Acad Sci 110(10):4063–4068. https://doi.org/10. strategies in vertebrate acoustic communication, Ani-
1073/pnas.1211533110 mal signals and communication, vol 7. Springer Nature
Halfwerk W, Lea AM, Guerra MA, Page RA, Ryan MJ Switzerland AG, Cham, pp 11–44
(2016) Vocal responses to noise reveal the presence of Larsen ON, Dabelsteen T (1990) Directionality of black-
the Lombard effect in a frog. Behav Ecol 27(2): bird vocalization. Implications for vocal communica-
669–676. https://doi.org/10.1093/beheco/arv204 tion. Ornis Scand 21:37–45
Hardouin LA, Reby D, Bavoux C, Burneleau G, Larsen ON, Radford C (2018) Acoustic conditions affect-
Bretagnolle V (2007) Communication of male quality ing sound communication in air and underwater. In:
in owl hoots. Am Nat 169(4):552–562 Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds)
Hardouin LA, Thompson R, Stenning M, Reby D (2014) Effects of anthropogenic noise on animals. Springer
Anatomical bases of sex- and size-related acoustic handbook of acoustic research. Springer, New York,
variation in herring gull alarm calls. J Avian Biol 45: pp 109–144
157–166 Larsen ON, Wahlberg M (2017) Sound and sound
Heffner HH (1983) Hearing in large and small dogs: abso- sources. In: Brown CH, Riede T (eds) Comparative
lute thresholds and size of the tympanic membrane. bioacoustics: an overview. Bentham Science
Behav Neurosci 97:310–318 Publishers, Sharjah, pp 3–62
Heffner HE, Heffner RS (2007) Hearing ranges of labora- Larsson C (2000) Weather effects on outdoor sound prop-
tory animals. J Am Assoc Lab Anim Sci 46(1):20–22 agation. Int J Acoust Vib 5(1):33–36
Heller EJ (2013) Why you hear what you hear. Princeton Lipman EA, Grassi JR (1942) Comparative auditory sen-
University Press, Princeton, NJ sitivity of man and dog. Amer J Psychol 55:84–89
Holland J, Dabelsteen T, Pedersen SB, Paris AL (2001) Lohr B, Wright TF, Dooling RJ (2003) Detection and
Potential ranging cues contained within the energetic discrimination of natural calls in masking noise by
pauses of transmitted wren song. Bioacoustics 12(1): birds: estimating the active space of a signal. Anim
3–20 Behav 65:763–777
Jansen DA, Cant MA, Manser MB (2012) Segmental Lombard É (1911) Le signe de l’élévation de la voix.
concatenation of individual signatures and context Annales des Maladies de L’Oreille et du Larynx
cues in banded mongoose (Mungos mungo) close XXXVII(2):101–109
calls. BMC Biol 10(1):97. https://doi.org/10.1186/ Lovell SF, Lein MR (2004) Neighbor-stranger discrimina-
1741-7007-10-97 tion by song in a suboscine bird, the alder flycatcher,
Jensen KK, Larsen ON, Attenborough K (2008) Empidonax alnorum. Behav Ecol 15(5):799–804
Measurements and predictions of hooded crow (Cor- Marten K, Marler P (1977) Sound transmission and its
vus corone cornix) call propagation over open field significance for animal vocalization. I. Temperate
habitats. J Acoust Soc Am 123(1):507–518 habitats. Behav Ecol Sociobiol 2(3):271–290
182 O. N. Larsen et al.

Martínez-Sala R, Rubio C, García-Raffi LM, Sánchez- (eds) Comparative hearing: insects. Springer handbook
Pérez JV, Sánchez-Pérez EA, Llinares J (2006) Control of auditory research. Springer-Verlag, New York, pp
of noise by trees arranged like sonic crystals. J Sound 63–96
Vib 291:100–106 Ross CD (2000) Outdoor sound propagation in the US
Mason MJ, Lin CC, Narins PM (2003) Sex differences in Civil War. Appl Acoust 59:137–147
the middle ear of the bullfrog (Rana catesbeiana). Russ J, Jones G, Mackie I, Racey P (2004) Interspecific
Brain Behav Evol 61(2):91–101 responses to distress calls in bats (Chiroptera: Vesperti-
Melendez KV, Feng AS (2010) Communication calls of lionidae): A function for convergence in call design?
little brown bats display individual-specific Anim Behav 67:1005–1014. https://doi.org/10.1016/j.
characteristics. J Acoust Soc Am 128:919–923 anbehav.2003.09.003
Michelsen A (1978) Sound reception in different Schmidt AKD, Römer H (2011) Solutions to the cocktail
environments. In: Ali MA (ed) Sensory ecology, party problem in insects: selective filters, spatial
NATO Adv Study Inst Ser, vol 18. Plenum Press, release from masking and gain control in tropical
London, pp 345–373 crickets. PLoS One 6(12):e28593. https://doi.org/10.
Michelsen A (1992) Hearing and sound communication in 1371/journal.pone.0028593
small animals: evolutionary adaptations to the laws of Searby A, Jouventin P, Aubin T (2004) Acoustic recogni-
physics. In: Webster DB, Fay RR, Popper AN (eds) tion in macaroni penguins: an original signature sys-
The evolutionary biology of hearing. Springer-Verlag, tem. Anim Behav 67:615–625
New York, pp 61–77 Stoeger-Horwath AS, Stoeger S, Schwammer HM,
Michelsen A, Larsen ON (1983) Strategies for acoustic Kratochvil H (2007) Call repertoire of infant African
communication in complex environments. In: Huber F, elephants: first insights into the early vocal ontogeny. J
Markl H (eds) Neuroethology and behavioral physiol- Acoust Soc Am 121(6):3922–3931
ogy. Springer-Verlag, Berlin, pp 321–331 Suthers RA (1994) Variable asymmetry and resonance in
Miller EH, Williams J, Jamieson SE, Gilchrist HG, the avian vocal tract: a structural basis for individually
Mallory ML (2007) Allometry, bilateral asymmetry, distinct vocalizations. J Comp Physiol A 175:457–466
and sexual differences in the vocal tract of common ter Hofstede HM, Ratcliffe JM (2016) Evolutionary esca-
eiders Somateria mollissima and king eiders lation: the bat-moth arms race. J Exp Biol 219(11):
S. spectabilis. J Avian Biol 38:224–233 1589–1602. https://doi.org/10.1242/jeb.086686
Mitani JC, Hasegawa T, Gros-Louis J, Marler P, Byrne R Thomas JA, Kuechle V (1982) Quantitative analysis of the
(1992) Dialects in wild chimpanzees? Am J Primatol underwater repertoire of the Weddell seal
27:233–243. https://doi.org/10.1002/ajp.1350270402 (Leptonychotes weddellii). J Acoust Soc Am 72:
Narins PM, Capranica RR (1976) Sexual differences in the 1730–1738
auditory system of the tree frog Eleutherodactylus Trefry SA, Hik DS (2010) Variation in pika (Ochotona
coqui. Science 192(4237):378–380 collaris, O. princeps) vocalizations within and between
Narins PM, Capranica RR (1980) Neural adaptations for populations. Ecography 33:784–795
processing the two-note call of the Puerto Rican tree Van Staaden M, Römer H (1997) Sexual signalling in
frog, Eleutherodactylus coqui. Brain Behav Evol 17: bladder grasshoppers: tactical design for maximizing
48–66 calling range. J Exp Biol 200:2597–2608
Nelson BS, Suthers RA (2004) Sound localization in a Vannoni E, McEligott AG (2007) Individual acoustic var-
small passerine bird: discrimination of azimuth as a iation in fallow deer (Dama dama) common and harsh
function of head orientation and sound frequency. J groans: a source-filter theory perspective. Ethology
Exp Biol 207:4121–4133 113:223–234
O’Farrell MJ, Miller BW, Gannon WL (1999) Qualitative Wahlberg M, Larsen ON (2017) Propagation of sound. In:
identification of free-flying bats using the Anabat Brown CH, Riede T (eds) Comparative bioacoustics:
detector. J Mammal 80(1):11–23 an overview. Bentham Science Publishers, Sharjah, pp
Ottemöller L, Evers LG (2008) Seismo-acoustic analysis 63–121
of the Buncefield oil depot explosion in the UK, Warfield D (1973) The study of hearing in animals. In:
2005 December 11. Geophys J Int 172(3):1123–1134 Gay W (ed) Methods of animal experimentation,
Price MA, Attenborough K, Heap NW (1988) Sound IV. Academic Press, London, pp 43–143
attenuation through trees: measurements and models. West CD (1985) The relationship of the spiral turns of the
J Acoust Soc Am 84:1836–1844 cochlea and the length of the basilar membrane to the
Reby D, McComb K (2003) Anatomical constraints gen- range of audible frequencies in ground dwelling
erate honesty: acoustic cues to age and weight in the mammals. J Acoust Soc Am 77:1091–1101
roars of red deer stags. Anim Behav 65:519–530 Wiley RH (2009) Signal transmission in natural
Riede T, Fisher JH, Goller F (2010) Sexual dimorphism of environments. In: Squire LR (ed) Encyclopedia of neu-
the zebra finch syrinx indicates adaptation for high roscience, vol 8. Academic Press, Oxford, pp 827–832
fundamental frequencies in males. PLoS One 5:e11368 Wiley RH, Richards DG (1978) Physical constraints on
Römer H (1998) The sensory ecology of acoustic commu- acoustic communication in the atmosphere:
nication in insects. In: Hoy RR, Popper AN, Fay RR
5 Source-Path-Receiver Model for Airborne Sounds 183

Implications for the evolution of animal vocalizations. Wilkinson GS, Boughman JW (1998) Social calls coordi-
Behav Ecol Sociobiol 3:69–94 nate foraging in greater spear-nosed bats. Anim Behav
Wiley RH, Richards DG (1982) Adaptations for acoustic 55:337–350
communication in birds: sound transmission and signal Yorzinski JL, Patricelli GL (2009) Birds adjust acoustic
detection. In: Kroodsma DE, Miller EH, Quellet H directionality to beam their antipredator calls to
(eds) Acoustic communication in birds. Academic predators and conspecifics. Proc R Soc B 277:923–932
Press, New York, pp 131–181

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
Introduction to Sound Propagation
Under Water 6
Christine Erbe, Alec Duncan, and Kathleen J. Vigness-Raposa

6.1 Introduction Furthermore, sound can be detected from all


directions, providing omnidirectional alerting of
It is imperative that bioacousticians who work in activities happening in the environment.
aquatic environments have a basic understanding Given that sound may propagate over very long
of sound propagation under water. Whether the ranges with little loss, a myriad of sounds is com-
topic is the function of humpback whale song, monly heard at any one place. These sounds may
echolocation in wild bottlenose dolphins, the be grouped by origin: abiotic, biotic, and anthro-
masking of grey whale sounds by ship noise, the pogenic. Natural, geophysical, abiotic sound
role of chorusing in fish spawning behavior, the sources include wind blowing over the ocean sur-
effects of seismic surveying on benthic face, rain falling onto the ocean surface, waves
organisms, or the capability of an echosounder breaking on the beach, polar ice breaking under
to track a school of fish, the way in which sound pressure and temperature influences, subsea
propagates through the ocean affects how we can volcanoes erupting, subsea earthquakes rumbling
use sound to study animals, how sound we pro- along the seafloor, etc. Biotic sound sources
duce impacts animals, and how animals use include singing whales, chorusing fishes, feeding
sound. urchins, and crackling crustaceans. Anthropogenic
Aquatic fauna has evolved to use sound for sources of sound include ships, boats, fish-finding
environmental sensing, navigation, and communi- echosounders, oil rigs, gas wells, subsea mines,
cation. This is because water conducts sound very dredgers, trenchers, pile drivers, naval sonar, seis-
well (i.e., fast and far), while light propagates mic surveys, underwater explosions, etc.
poorly under water. Visual sensing based on sun- As these sounds travel from their source
or moonlight is limited to the upper few meters through the environment, they may follow multi-
of water. And while water transports chemicals, ple propagation paths. Sounds may be reflected at
chemoreception is most effective over short the sea surface and seafloor. Some sound may
ranges, where chemical concentration is high. travel through the seafloor and radiate back into
the water some distance away. Sound is scattered
by scatterers in the water (such as gas bubbles or
C. Erbe (*) · A. Duncan fish swim bladders). Sound bends as the ocean is
Centre for Marine Science and Technology, Curtin
layered with pressure, temperature, and salinity
University, Perth, WA, Australia
e-mail: c.erbe@curtin.edu.au; A.J.Duncan@curtin.edu.au changing as a function of depth, and with fresh-
water inputs. All of these phenomena depend on
K. J. Vigness-Raposa
INSPIRE Environmental, Newport, RI, USA the frequency of sound. The spectrum of broad-
e-mail: kathy@INSPIREenvironmental.com band sound changes, too, as acoustic energy at
# The Author(s) 2022 185
C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1,
https://doi.org/10.1007/978-3-030-97540-1_6
186 C. Erbe et al.

high frequencies is more readily scattered and 6.2.1 Propagation Loss Form
absorbed than energy at low frequencies. The
receiver of sound can thus infer information not As sound propagates through the ocean, it loses
just about the source of sound but also about the energy, termed propagation loss (PL2). A simple
environment’s complexity. form of the sonar equation equates PL to the
Understanding the physics of sound in water is difference between the source level (SL) and the
an important step in studies of aquatic animal received level (RL) of sound (Urick 1983):
sound usage and perception, whether these are
conspecific social sounds, predator sounds, prey PL ¼ SL  RL ðpropagation loss formÞ ð6:1Þ
sounds, navigational clues, environmental SL was defined by Urick as 10log10 of the ratio
sounds, or anthropogenic sounds. It is also critical of source intensity to reference intensity (see
for the study of impacts of sound on aquatic Chap. 4). RL was equal to 10log10 of the ratio of
fauna, and for using passive or active acoustic received intensity to reference intensity. PL was
tools for monitoring aquatic fauna and mapping computed as 10log10 of the ratio of source inten-
biodiversity. The goal of this chapter is to intro- sity to received intensity.
duce the basic concepts of sound propagation For example, a whale-watching boat might
under water. have SL ¼ 160 dB re 1 μPa2 (in terms of mean-
square pressure, which is proportional to inten-
sity; see Chap. 4) and be located 100 m from a
6.2 The Sonar Equation group of whales. If PL in this environment and
over this range is 40 dB, then RL at the whales is
The sonar equation was developed by the US 120 dB re 1 μPa2 (Erbe 2002; Erbe et al. 2016a).
Navy to assess the performance of naval sonar
systems. These sonar systems were designed to
detect foreign submarines. The sonar emits an 6.2.2 Signal-to-Noise Ratio Form
acoustic signal under water and listens to
returning echoes. The time of arrival and acoustic Another simple form of the sonar equation relates
features of the echo may determine not only from the RL of a signal to the background noise level
what target the signal reflected, but also the range (NL ¼ 10log10 of the ratio of noise intensity to
and speed of the target. The term “sonar” stands reference intensity):
for “SOund Navigation And Ranging.”
There are numerous forms of the sonar equa- SNR ¼ RL  NL ðsignal‐to‐noise ratio formÞ
tion. What they all have in common is that ð6:2Þ
(1) they each represent an equation of energy
conservation, meaning that the total acoustic SNR is the level of the signal-to-noise ratio,
energy on either side of the equation is the expressed in dB. For example, a call from a whale
same; and (2) all of the terms in the equation are might have a received level RL ¼ 105 dB re
expressed in decibel (dB). The sonar equation 1 μPa2 at another whale; however, background
with its original terms as defined in Urick noise at the time might be NL ¼ 115 dB re 1 μPa2
(1983) allows an easy conceptual exploration of over the frequency band of the call. The SNR is
various scenarios encountered in bioacoustics. 10 dB. Can the whale still hear the other one or
The definitions and notations of some of the does the noise mask the call?
terms are more mathematically specific in the Because the SNR is a negative number in this
recent underwater acoustics terminology standard example, if one was just considering the relative
(ISO 18405)1. levels of signal and noise, the animals would not

1 2
International Organization for Standardization. (2017). In this chapter, we italicize variables, but keep
Underwater acoustics—Terminology (ISO 18405). abbreviations as regular font; so PL is an abbreviation
Geneva, Switzerland. while PL is a variable.
6 Introduction to Sound Propagation Under Water 187

be able to hear one another because the back- not just the range between the two animals, but
ground noise level is much greater than the also at which depth each happens to be located. If
received signal level. However, animals (and the two animals are oriented towards each other,
sonar systems) can take advantage of spectral directional emission and reception capabilities
and temporal characteristics of a received sound, will enhance signal detection. The environment
as is explained below. Therefore, in the example changes the level and spectral characteristics of
of beluga whales (Delphinapterus leucas) trying the signal by reflection, refraction, scattering,
to communicate in icebreaker noise, the listening absorption, and spreading losses. The detection
whale can indeed detect the call, because of the capabilities of the receiver can be quantified by
different spectral and temporal structures of call the detection threshold, critical ratio, and other
and noise (Erbe and Farmer 1998). factors. Ambient noise in the environment can
initiate anti-masking strategies at both the sender
(e.g., increasing the source level) and receiver
(e.g., orienting towards the signal). A sonar equa-
6.2.3 Forms to Assess
tion can be constructed to investigate each of
Communication Masking
these factors, as outlined in the following
sections.
Acoustic communication under water remains an
The basic sonar relation for the communica-
area of active research. In the conceptual model of
tion scenario in Fig. 6.1 is:
Fig. 6.1, one animal (the sender) emits a signal,
which travels through the habitat to the location SL  PL  NL > DT ðbasic signal detection formÞ,
of the receiver. Whether the receiver can hear the
message depends on a number of factors that where DT is the detection threshold of the
relate to the sender, the habitat, and the receiver. receiver, expressed in dB. A sound is deemed
The level and spectral features of the signal will detectable if the expression on the left side
affect how far it propagates and how well it can be exceeds the detection threshold. In the absence
detected above the ambient noise in the environ- of noise, DT equals the audiogram. Audiograms
ment. The locations of sender and receiver matter, are measured by exposing an animal to pure-tone

Sender Receiver

Habitat
Effects:
• Propagation loss (PL)
• Absorption (αR)

Ambient noise (NL)


Relevant variables: Relevant variables:
• Location of sender • Location of receiver
• Source level (SL) • Audiogram (DT)
• Spectral characteristics of signal (TBP) • Critical ratio (CR)
• Emission directionality (DIs) • Directional hearing (DIr)

Fig. 6.1 Sketch of the factors related to acoustic commu- level (NL), and receiver detection threshold (DT), critical
nication in natural (not just aquatic) environments and ratio (CR), and directivity index (DIr). Modified from Erbe
their corresponding terms in the sonar equation: source et al. (2016c); # Erbe et al. (2016); https://www.
level (SL), time-bandwidth product (TBP), sender direc- sciencedirect.com/science/article/pii/
tivity index (DIs), propagation loss (PL), absorption S0025326X15302125. Published under CC BY 4.0;
(absorption coefficient α multiplied by range R), noise https://creativecommons.org/licenses/by/4.0/
188 C. Erbe et al.

signals of varying levels. The RL that is just when the tone is just audible (American National
detectable defines the audiogram at that fre- Standards Institute 2015). Conceptually, the CR
quency (see Chap. 10 for a more thorough defini- quantifies the ability of the auditory system to
tion of audiogram): focus on a narrowband (tonal) signal. It captures
how many of the noise frequencies surrounding
RL ¼ DT ðaudiogram formÞ
the tone frequency are effective at masking the
The mammalian auditory system acts as a bank tone, and the resulting band of frequencies has
of overlapping bandpass filters and the listener been termed the Fletcher critical band (American
focuses on the auditory band that receives the National Standards Institute 2015). A narrowband
highest SNR (Moore 2013). Under the equal- signal is thus detectable, if
power assumption (Fletcher 1940), a signal is RL  CR > NLf ðcritical ratio formÞ ð6:4Þ
detected if its power is greater than the noise
power in any of the auditory bands. So, for any RL is the tone level in dB re 1 μPa2, NLf is the
auditory band, noise mean-square pressure spectral density level
in dB re 1 μPa2/Hz, and CR is measured in dB re
RL  NL > 0 ðwithin an auditory bandÞ ð6:3Þ
1 Hz (see p. 29 in Erbe et al. 2016c).
Communication signals of many species, In the above-mentioned study with beluga
including birds and marine mammals (Erbe et al. whales communicating amidst icebreaker noise,
2017a), are commonly tonal, while noise is com- the beluga whale call consisted of a sequence of
monly broadband. In order to assess the risk of six tones with overtones from 800 to 1800 Hz,
communication masking, the critical ratio (CR) is and the icebreaker’s bubbler system noise was
a useful quantity that has been measured in broadband and relatively unstructured in fre-
humans and animals. The CR is the level differ- quency and time (Fig. 6.2) (Erbe and Farmer
ence between the mean-square sound pressure 1998). The bandwidth of the call, expressed in
level (SPL) of a tone and the mean-square sound dB, was 10log10(1800–800) ¼ 30 dB re 1 Hz (see
pressure spectral density level of broadband noise Chap. 4 for definitions and formulae). Given

Fig. 6.2 Spectrograms of


the lower two harmonics of
a beluga whale call (top
panel) and an icebreaker’s
bubbler system noise
(bottom panel). Colorbar in
dB re 1 μPa2/Hz. The
broadband levels are
RL ¼ 105 dB re 1 μPa2 for
the call and NL ¼ 115 dB re
1 μPa2 for the noise
6 Introduction to Sound Propagation Under Water 189

NL ¼ 115 dB re 1 μPa2 over the bandwidth of the 6.2.4 Form for Biomass Surveying
call, NLf was equal to NL (115 dB re 1 μPa2)
minus the bandwidth (30 dB re 1 Hz): NLf ¼ 85 dB Surveys for animals ranging from zooplankton to
re 1 μPa2/Hz. Beluga whales have a CR of fish and sharks may use an echosounder, fish
approximately 15 dB re 1 Hz at 800 Hz, therefore, finder, or sonar (e.g., Parsons et al. 2014; Kloser
the call with RL ¼ 105 dB re 1 μPa2 was audible, et al. 2013). In this scenario, the echosounder
because Eq. (6.4) was satisfied (Erbe 2008; Erbe emits a signal, which travels to the fish, where
and Farmer 1998): 105–15 > 85. some of it is reflected. How much of the signal is
In studies on critical ratios and in the beluga reflected is expressed by the target strength (TS),
whale experiments (Erbe and Farmer 1998; Erbe defined as 10log10 of the ratio of echo intensity to
2000), signal and noise were broadcast by the incident intensity (Urick 1983). The reflected sig-
same loudspeaker and thus arrived at the listener nal travels to the receiver, which has a specific DT
from the same direction. If the caller and the noise and DIr. The receiver is typically co-located with
are spatially separated, then there is an additional the source, so that the signal travels the same path
processing gain in the sonar equation: the twice and thus experiences twice the PL. The fish
receiver’s directivity index DIr: is detected if the following sonar equation is
satisfied:
RL  CR þ DIr  NLf > 0
ðcritical ratio form with directivity indexÞ SL  2 PL þ TS  NL > DT  DIr
ðtwo  way sonar surveying formÞ
The DIr is defined as 10log10 of the ratio of the
intensity measured by an omnidirectional receiver Target strength will vary for each type of ani-
to that of a directional receiver. Directivity mal, as well as with the number of animals in the
indices increase with frequency and values up to group and their orientation relative to the
19 dB have been measured for communication echosounder. Figure 6.4 shows reflected signals
sounds in marine mammals. The associated spa- received on a REMUS autonomous underwater
tial release from masking should be considered in vehicle. Individual animals are observed in two
environmental impact assessments of underwater aggregations, with two dolphins swimming
noise (Erbe 2015). Directivity indices are even within one of the aggregations. Researchers are
greater at higher frequencies used by dolphins using cameras on the same platforms to better
during echolocation (Fig. 6.3). understand the information contained in reflected

Fig. 6.3 Sketches of the receiving directivity pattern of a bottlenose dolphin (Tursiops truncatus) in the vertical (a) and
horizontal (b) planes. Courtesy of Chong Wei after data in (Au and Moore 1984)
190 C. Erbe et al.

Fig. 6.4 Echosounder image of marine fauna in two their high reflectivity (Benoit-Bird et al. 2017). # Benoit-
aggregations, with two dolphins being in the aggregation Bird et al. 2017; https://aslopubs.onlinelibrary.wiley.com/
on the left. Colors represent acoustic target strength and doi/full/10.1002/lno.10606. Published under CC BY 4.0;
the shapes of the two dolphins can easily be recognized by https://creativecommons.org/licenses/by/4.0/

signals and ultimately convert that information are therefore very useful for understanding how
into species classifications and estimates of bio- sound will propagate in different geographical
mass (Benoit-Bird and Waluk 2020). regions.

6.3 The Layered Ocean 6.3.1 Temperature and Salinity


Profiles
The speed of sound in sea water increases with
increasing temperature T [ C], salinity In non-polar regions (red curves in Fig. 6.6), the
S (measured in practical salinity units [psu]) and main source of heat entering the ocean is solar.
hydrostatic pressure, which in the ocean is pro- The sun heats the near-surface water, making it
portional to depth D [m]. The approximate less dense and suppressing convection. A surface
change in the speed of sound c [m/s] with a mixed layer with nearly constant temperature and
change in each property is: salinity is formed by mechanical mixing due to
surface waves and is typically 20–100 m thick.
• Temperature changes by 1  C ! c changes by
Below that, the temperature drops rapidly in a
4.0 m/s
region known as the thermocline, before becom-
• Salinity changes by 1 psu ! c changes by
ing almost constant at a temperature of about 2  C
1.4 m/s
in the deep isothermal layer that extends from a
• Depth (pressure) changes by 1 km ! c
depth of about 1000 m to the ocean floor.
changes by 17 m/s
Seasonal changes in solar radiation together
Maps of sea surface temperature and salinity with the ocean’s considerable thermal lag (due
for the northern hemisphere summer show to its great heat capacity) can complicate this
considerable variation (Fig. 6.5). However, tem- simple picture, but most of these changes only
perature and salinity vary much more rapidly with affect the top few hundred meters of the water
depth than they do in the horizontal plane, so the column, changing the detailed structure of the
ocean can often be thought of as a stack of hori- mixed layer and the upper part of the thermocline.
zontal layers, with each layer having different In polar regions (blue curves in Fig. 6.6), the
properties. Vertical profiles of these quantities situation is quite different. There is a net loss of
6 Introduction to Sound Propagation Under Water 191

Fig. 6.5 Maps of sea


surface temperature (top)
and salinity (bottom) for the
northern hemisphere
summer, averaged over the
period 2005 to 2017. Data
were taken from the World
Ocean Atlas (Locarnini
et al. 2018; Zweng et al.
2018)

Fig. 6.6 Depth profiles of 0


temperature, salinity, and
sound speed from the open
ocean based on the World 500
Ocean Atlas (Locarnini
et al. 2018; Zweng et al.
2018) seasonal decadal 1000
Depth (m)

average data for the austral


winter (solid) and austral
summer (dotted). Red 1500
curves are for 30.5 S,
74.5 E and are
representative of non-polar 2000
ocean profiles. Blue curves
are for 60.5 S, 74.5 E and
are representative of polar
2500
ocean profiles
3000
0 10 20 30 33 34 35 36 1460 1500 1540
Temperature (oC) Salinity (psu) Sound speed (m/s)
192 C. Erbe et al.

heat from the sea surface, which results in a heating and cooling effects can eliminate or
temperature profile in the upper part of the enhance this effect. As explained later in this
ocean that increases with increasing depth from chapter, whether or not there is a distinct increase
a minimum of about 2  C at or (in summer) in sound speed with depth in the mixed layer
slightly below the surface. determines whether there is a surface duct,
Salinity typically changes by only a small which has a considerable impact on acoustic
amount with depth, and in most parts of the propagation from near-surface sound sources
ocean is between 34 and 36 psu. As a result, the and to near-surface receivers.
sound speed is usually determined by temperature Below the mixed layer, the rapid reduction in
and depth, however, salinity can have an impor- temperature with depth (i.e., in the thermocline)
tant effect on sound speed in situations where it results in sound speed also reducing until, at a
changes abruptly. Examples include locations depth of about 1000 m, the temperature becomes
where there is a large freshwater outflow into nearly constant. In the deeper isothermal layer,
the ocean from a river, or in estuaries where it is the increasing pressure results in the sound speed
common to have a wedge of dense, saline water starting to increase with depth. There is therefore
underlying a surface layer of freshwater. In polar a minimum in the sound speed in non-polar
regions, the salinity of near-surface water can waters at a depth of approximately 1000 m,
vary considerably depending on whether sea ice which, as will be seen later, is important for
is forming, a process that excludes salt and there- long-range sound propagation.
fore increases salinity in the water below the ice. In polar waters, the temperature and pressure
When sea ice melts, freshwater is released, reduc- both increase with increasing depth, so the sound
ing near-surface salinity. speed also increases, which results in a strong
surface duct. However, in the Arctic Ocean, the
existence of water masses with different
properties entering from the Pacific and Atlantic
6.3.2 Sound Speed Profiles
oceans can lead to more complicated sound speed
profiles.
The following equation is one of a number of
Temperature and salinity profiles for the
equations of varying complexity that can be
world’s oceans can be found in the World
found in the literature relating the speed of
Ocean Atlas3 (Locarnini et al. 2018; Zweng
sound to temperature, salinity, and depth
et al. 2018). These are based on averages of a
(Mackenzie 1981). It is valid for temperatures
large amount of measured data and are very use-
from 2 to 30  C, salinities of 30 to 40 psu, and
ful for calculating estimated sound speed profiles
depths from 0 to 8000 m.
for particular locations for particular months or
c ¼ 1448:96 þ 4:591 T  5:304  102 T 2 seasons of the year. The real ocean is, however,
highly variable; particularly the upper thermo-
þ 2:374  104 T 3 þ 1:340 ðS  35Þ cline and mixed layer, which can change on
þ 1:630  102 D þ 1:675  107 D2 time scales of hours, and in some extreme cases,
tens of minutes, so there is no substitute for in situ
 1:025  102 T ðS  35Þ  7:139
measurements of temperature and salinity profiles
 1013 TD3 ½m=s to support acoustic work.

Sound speed profiles computed from the typi-


cal temperature and salinity profiles are also plot-
ted in Fig. 6.6.
In non-polar waters, the sound speed may
increase slightly with depth in the mixed layer 3
World Ocean Atlas https://www.nodc.noaa.gov/OC5/
due to its pressure dependence, however, diurnal woa18/; accessed 30 September 2020.
6 Introduction to Sound Propagation Under Water 193

6.4 Propagation Loss

The apparent simplicity of the propagation loss term (i.e., PL) in the various sonar equations hides a great deal of complexity. There are a few special situations in which PL can be calculated quite accurately using simple formulae, and a few more in which it might be possible to obtain a reasonable estimate using a more complicated equation, but for everything else, these simple approaches can lead to large errors, and it is necessary to resort to numerical modeling. To further complicate matters, there are a number of different types of numerical models used for propagation loss calculations, each with its own assumptions and limitations, and it is important to be familiar with these so that the most appropriate model can be used for a given task.

6.4.1 Geometric Spreading Loss

The most basic concept of propagation loss is that of geometric spreading, which accounts for the fact that the same sound power is spread over a larger surface area as the sound propagates further from the source. The intensity is the sound power per unit area (see Chap. 4), so the increase in surface area results in a reduction in intensity.

The simplest case is when the source is small compared to the distances involved, the sound speed is constant, and the boundaries (i.e., sea surface, seabed, and anything else that might reflect sound) are sufficiently far away that reflected energy can be ignored. In this situation, the acoustic wavefront forms the surface of a sphere. As the wavefront propagates outward, the radius r of the sphere increases, the surface area of the sphere increases in proportion to r², and therefore the intensity decreases inversely proportional to r². This leads to the well-known spherical spreading equation for PL:

PL = 20 log10(r / 1 m)   (6.5)

Equation (6.5) is also applicable to calculating geometric spreading loss for sound radiated by a directional source, such as an echosounder transducer, or a dolphin's biosonar, providing the range is sufficiently large (i.e., the receiver is in the acoustic far-field of the source; see Chap. 4), and the above assumptions are all met.

Another situation in which spreading loss can be calculated analytically is when the sound is constrained in one dimension by reflection and/or refraction, so it can only spread in the other two dimensions. In underwater acoustics, this most commonly happens when the sound is constrained in the vertical direction by the sea surface or seafloor, but can still spread in the horizontal plane. The result is that the acoustic wavefront forms the surface of a cylinder, the area of which is proportional to the range. The intensity is therefore inversely proportional to the range, and the PL is given by the cylindrical spreading equation:

PL = 10 log10(r / 1 m)   (6.6)

Some situations in which cylindrical spreading can occur are discussed later in this chapter, but it should be noted that Eq. (6.6), strictly speaking, only applies at all ranges from the source in the highly unusual case that the source is a vertical line source that spans the entire depth interval into which the sound is constrained, and that no sound is lost into either the upper or lower layers.

For the much more common case of a small source, the sound will undergo spherical spreading at short ranges where the boundaries have no effect, followed by cylindrical spreading at long ranges where the fact that the source has a small vertical extent is of little consequence. In between, there will be a transition region in which neither formula is accurate. This situation can be approximated by assuming a sudden transition from spherical to cylindrical spreading at a "transition range" rt. Equation (6.7) applies only to ranges r ≥ rt and still makes the assumption that there are no losses at the boundaries.

PL = 20 log10(rt / 1 m) + 10 log10(r / rt) = 10 log10(rt / 1 m) + 10 log10(r / 1 m)   (6.7)
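The three spreading laws are simple enough to capture in a few lines of code. The Python sketch below is an illustration added here (not from the original text; the function names and the example numbers are assumptions). It evaluates Eqs. (6.5) to (6.7), applying spherical spreading inside the transition range and cylindrical spreading beyond it:

import numpy as np

def pl_spherical(r):
    """Spherical spreading loss, dB re 1 m (Eq. 6.5)."""
    return 20.0 * np.log10(np.asarray(r, dtype=float))

def pl_cylindrical(r):
    """Cylindrical spreading loss, dB re 1 m (Eq. 6.6)."""
    return 10.0 * np.log10(np.asarray(r, dtype=float))

def pl_mixed(r, r_t):
    """Spherical spreading out to the transition range r_t, cylindrical beyond (Eq. 6.7)."""
    r = np.asarray(r, dtype=float)
    return np.where(r < r_t, pl_spherical(r), pl_spherical(r_t) + pl_cylindrical(r / r_t))

# Example: propagation loss at 10 km range, with a 100-m transition range (illustrative values)
print(pl_spherical(10e3), pl_cylindrical(10e3), pl_mixed(10e3, 100.0))

Plotted against range, these three curves reproduce the geometric-spreading curves shown later in Fig. 6.8.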

In shallow-water situations, some authors recommend using a transition range equal to the water depth; however, while useful for very rough PL estimates, this approach should be adopted with caution as the best choice will depend on the characteristics of the seabed. The only way to accurately determine rt for a given situation is to carry out numerical propagation modeling, in which case you might as well use that to directly determine the propagation loss, removing the need for Eq. (6.7) and its inherent inaccuracies.

6.4.2 Absorption Loss

When a sound wave propagates through water, it results in a periodic motion of the molecules present in the water, and the slight friction within and between them converts some of the sound energy into heat, reducing the intensity of the sound wave. This is called absorption loss and results in a propagation loss that is proportional to the range traveled:

PL = α rkm   (6.8)

where rkm is the range in kilometers and α is the absorption coefficient in dB/km. The propagation loss due to absorption must be added to the propagation loss due to geometrical spreading described in Sect. 6.4.1.

A commonly used formula for α is:

α = 0.106 (f1 f² / (f1² + f²)) e^((pH − 8)/0.56) + 0.52 (1 + T/43) (S/35) (f2 f² / (f2² + f²)) e^(−z/6) + 4.9 × 10⁻⁴ f² e^(−(T/27 + z/17))   (6.9)

with f1 = 0.78 (S/35)^(1/2) e^(T/26) and f2 = 42 e^(T/17); f [kHz], α [dB/km]

valid for −6 < T < 35 °C (S = 35 psu, pH = 8, z = 0), 7.7 < pH < 8.3 (T = 10 °C, S = 35 psu, z = 0), 5 < S < 50 psu (T = 10 °C, pH = 8, z = 0), and 0 < z < 7 km (T = 10 °C, S = 35 psu, pH = 8) (François and Garrison 1982a, b; Ainslie and McColm 1998).

The absorption coefficient increases with frequency (Fig. 6.7). At low frequencies, it is dominated by molecular relaxation of two minor constituents of seawater: B(OH)3 and MgSO4, whereas above a few hundred kHz, it is primarily due to the water's viscosity.

In summary, Fig. 6.8 compares how propagation loss increases with range for spherical spreading (Eq. 6.5), cylindrical spreading (Eq. 6.6), and combined spherical/cylindrical spreading with a transition range of 100 m (Eq. 6.7). The effect of absorption (Eq. 6.8) in addition to spherical spreading is also shown for frequencies of 1, 10, and 100 kHz.
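Equation (6.9) is also simple to evaluate numerically. The Python sketch below is an added illustration (it is not from the original chapter, and the default argument values simply mirror the reference conditions quoted above). It returns the absorption coefficient in dB/km, with frequency in kHz and depth in km:

import numpy as np

def absorption_db_per_km(f_khz, T=10.0, S=35.0, z_km=0.0, pH=8.0):
    """Seawater absorption coefficient [dB/km], Eq. (6.9) (Ainslie and McColm 1998)."""
    f = np.asarray(f_khz, dtype=float)
    f1 = 0.78 * np.sqrt(S / 35.0) * np.exp(T / 26.0)   # B(OH)3 relaxation frequency [kHz]
    f2 = 42.0 * np.exp(T / 17.0)                       # MgSO4 relaxation frequency [kHz]
    boric = 0.106 * f1 * f**2 / (f1**2 + f**2) * np.exp((pH - 8.0) / 0.56)
    magnesium_sulfate = (0.52 * (1.0 + T / 43.0) * (S / 35.0)
                         * f2 * f**2 / (f2**2 + f**2) * np.exp(-z_km / 6.0))
    viscosity = 4.9e-4 * f**2 * np.exp(-(T / 27.0 + z_km / 17.0))
    return boric + magnesium_sulfate + viscosity

# Example: absorption at 1, 10, and 100 kHz under the conditions of Fig. 6.7
print(np.round(absorption_db_per_km(np.array([1.0, 10.0, 100.0])), 3))

Multiplying the result by the range in kilometers gives the absorption term of Eq. (6.8), which is added to the geometric spreading loss.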

Fig. 6.7 Graph of absorption loss dominated by B(OH)3 for f < 5 kHz, by MgSO4 for 5 kHz < f < 500 kHz, and by viscosity above. T = 10 °C, S = 35 psu, z = 0 m, pH = 8

Fig. 6.8 Plot of propagation loss versus range assuming spherical spreading (Eq. 6.5), cylindrical spreading (Eq. 6.6), and mixed spherical/cylindrical spreading (Eq. 6.7) for a transition range of 100 m. Propagation loss is also shown for spherical spreading with the addition of absorption (Eq. 6.8) corresponding to frequencies of 1, 10, and 100 kHz. Note that in the literature, the y-axis is sometimes flipped

⁴ Dan Russell's animations of waves being reflected from hard and soft boundaries, and being transmitted: https://www.acs.psu.edu/drussell/Demos/reflect/reflect.html; accessed 12 October 2020.

6.4.3 Additional Losses

6.4.3.1 The Air–Water Interface

Reflection and Transmission Coefficients
In animal bioacoustics as well as noise research, one typically deals with sounds in one medium (i.e., either air or water) and then sticks to this medium, only modeling propagation within this medium and only considering receivers in this medium. However, sound does cross into other media, and so a fish might be able to hear an airplane flying overhead, and a bird flying directly overhead might be able to hear a submarine's sonar (Fig. 6.9).

As sound hits an interface, the incident wave, in most situations, gives rise to a reflected wave and a transmitted wave⁴ (also see Chap. 5, where reflection is explained based on Huygens' principle). The energy of the reflected wave remains within the medium of the incident sound, but the energy of the transmitted wave is lost from the medium of the incident sound and transmitted into the adjacent medium. The amplitudes of the reflected and transmitted (plane) waves are given by the reflection and transmission coefficients R and T (Medwin and Clay 1998):

R = (Z2 sin θ1 − Z1 sin θ2) / (Z2 sin θ1 + Z1 sin θ2)   (6.10)

T = 2 Z2 sin θ1 / (Z2 sin θ1 + Z1 sin θ2)

where θ1 is the grazing angle of the incident wave, measured from the interface, and θ2 is the grazing angle of the transmitted (refracted) wave, also measured from the interface. The angle of incidence is measured from the normal (i.e., perpendicular to the interface); the angle of incidence and the grazing angle of the incident wave always add to 90°. The acoustic impedance Z is the

product of density and sound speed: Z = ρc. In air at 0 °C, Z = 1.3 kg/m³ × 330 m/s = 429 kg/(m²s). In freshwater at 5 °C, Z = 1000 kg/m³ × 1427 m/s = 1,427,000 kg/(m²s). In sea water at 20 °C and 1 m depth with 34 psu salinity, Z = 1035 kg/m³ × 1520 m/s = 1,573,200 kg/(m²s) (see Chap. 4). So, Zair << Zwater, whether it is freshwater or saltwater.

Fig. 6.9 Sketches of a sound source in the air (helicopter; left) and water (submarine; right), and the incident pi, reflected pr, and transmitted pt rays (i.e., vectors pointing in the direction of travel, perpendicular to the wavefront), with corresponding grazing angles θ1 and θ2. In the left panel, medium 1 corresponds to air with sound speed c1, and medium 2 corresponds to water with sound speed c2. The situation is reversed in the right panel, where medium 1 is water, and medium 2 is air

Snell's law (Fig. 6.9, Eq. 6.11)⁵ relates the angles of the incident and refracted waves (θ1 and θ2) at the interface. Rays bend towards the interface, if the speed of sound in medium 2 is greater than that in medium 1 (c2 > c1) and away from the interface, if c1 > c2. While Snell's law typically relates the sines of the angles measured from the normal, it may also be expressed in terms of the cosines of the grazing angles (Etter 2018):

cos θ1 / cos θ2 = c1 / c2   (6.11)

⁵ Dan Russell's animation of refraction and Snell's law: https://www.acs.psu.edu/drussell/Demos/refract/refract.html; accessed 12 October 2020.

For normal incidence, all of the angles in Eq. (6.10) are 90°, and so all of the sines are 1, hence

R = (Z2 − Z1) / (Z2 + Z1) and T = 2 Z2 / (Z2 + Z1)

For a sound source in air, Z1 << Z2 => R → 1 and T → 2, at normal incidence. Almost all of the sound is reflected, but the pressure in the water increases by a factor 2. The air–water boundary, for sound arriving from air, is considered "hard." The value of T is the reason why even weak aerial sources (such as drones hovering over whales) can be detected in water, below the source, at several meters depth (Erbe et al. 2017b), and commercial airplanes can be recorded in coastal waters, lakes, and rivers even if flying at hundreds of meters in altitude (Erbe et al. 2018). Received levels under water from airplanes may exceed behavioral response thresholds for underwater sound sources (Kuehne et al. 2020). For non-normal incidence, with c2 > c1, there exists a critical angle, beyond which the transmitted wave disappears. This situation is called total internal reflection. The only sound in the water is an evanescent field that decays exponentially in amplitude below the sea surface. The evanescent field is only important if the depth of the receiver is smaller than the in-water acoustic wavelength.

For a sound wave meeting the water–air interface from below, Z1 >> Z2 therefore R → −1 and T → 0. Almost all sound is reflected, albeit at
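The reflection and transmission coefficients of Eq. (6.10), combined with Snell's law (Eq. 6.11), are easily evaluated for an arbitrary grazing angle. The Python sketch below is an illustration added here (not from the original chapter); the function name and the round-number air and water properties are assumptions:

import numpy as np

def reflection_transmission(grazing_deg, c1, rho1, c2, rho2):
    """Plane-wave pressure reflection and transmission coefficients (Eq. 6.10).
    grazing_deg: grazing angle of the incident wave in medium 1 [degrees].
    Returns (R, T); R becomes complex with |R| = 1 beyond the critical angle."""
    theta1 = np.radians(grazing_deg)
    Z1, Z2 = rho1 * c1, rho2 * c2
    # Snell's law in grazing-angle form (Eq. 6.11): cos(theta1)/c1 = cos(theta2)/c2
    cos_theta2 = np.cos(theta1) * c2 / c1
    sin_theta2 = np.sqrt(np.asarray(1.0 - cos_theta2**2, dtype=complex))
    denom = Z2 * np.sin(theta1) + Z1 * sin_theta2
    return (Z2 * np.sin(theta1) - Z1 * sin_theta2) / denom, 2.0 * Z2 * np.sin(theta1) / denom

# Normal incidence (grazing angle 90 degrees), source in air above water:
print(reflection_transmission(90.0, 330.0, 1.3, 1480.0, 1000.0))   # R near +1, T near +2
# Normal incidence, source in water below air:
print(reflection_transmission(90.0, 1480.0, 1000.0, 330.0, 1.3))   # R near -1, T near 0

For air-to-water incidence at small grazing angles, cos θ2 in this sketch exceeds 1, sin θ2 becomes imaginary, and |R| = 1, which is the total internal reflection described above.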

negative amplitude, which means that the incident and reflected pressures cancel each other out. This is why the water–air interface is called a pressure-release boundary (or "soft" boundary) for sound incident from below. For non-normal incidence, R and T need to be computed with Eq. (6.10). Also, as a sound source is moved to shallower depth (i.e., closer to the sea surface), the proportion of transmitted sound increases. This is because of the evanescent (i.e., exponentially decaying) field, which is ignored by Eq. (6.10), but that might still have enough amplitude at the sea surface for shallow sources (Godin 2008).

Lloyd's Mirror
While not resulting in a loss of sound energy, the Lloyd's mirror effect is a result of reflection from the water–air interface from shallow sound sources. An omnidirectional source (i.e., one that emits sound in all directions) close to the sea surface (such as a ship's propeller) emits some of its sound in an upwards direction, and this sound reflects off the sea surface. At any receiver location, sound that traveled along the surface-reflected path overlaps with sound that traveled along the direct path from the source to the receiver. The reflected ray's amplitude is opposite in sign to the incident ray's amplitude (R = −1); conceptually, this ray emerged from an image source (also called virtual source) with negative amplitude on the other side of the interface. The direct ray does not experience a flip in amplitude. Depending on the relative path lengths, the surface-reflected sound will add constructively to the sound that traveled along the direct path, or they will cancel each other out. This creates a pattern of constructive and destructive interference about the sound source, called the Lloyd's mirror effect. As a ship passes a moored recorder, the spectrogram shows the characteristic U-shaped interference pattern as successive peaks and troughs in amplitude at any one frequency over time (Fig. 6.10). Additional images of the Lloyd's mirror interference pattern can be found in (Parsons et al. 2020) for small electric ferries and in (Erbe et al. 2016b) for recreational swimmers and boogie boarders.

Fig. 6.10 Spectrogram of the recording of a ship passing by a moored recorder, showing the pattern of constructive and destructive interference called the Lloyd's mirror effect. The closest point of approach occurred at about 200 s. Modified from (Erbe et al. 2016c); © Erbe et al. 2016; https://www.sciencedirect.com/science/article/pii/S0025326X15302125. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
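The image-source picture above translates directly into a small numerical model. The Python sketch below is an added illustration (not from the original chapter; the geometry and the 200-Hz tone are hypothetical values chosen only to show the effect). It sums a direct path and a surface-reflected path with R = −1, each with spherical spreading, and returns the relative received level:

import numpy as np

def lloyds_mirror_level(ranges_m, freq_hz, source_depth, receiver_depth, c=1500.0):
    """Relative received level [dB] for a tone from a shallow source: direct path
    plus a surface-reflected path modeled as an image source with R = -1."""
    k = 2.0 * np.pi * freq_hz / c                                   # acoustic wavenumber
    r_direct = np.hypot(ranges_m, receiver_depth - source_depth)    # direct path length
    r_image = np.hypot(ranges_m, receiver_depth + source_depth)     # path via the image source
    p = np.exp(1j * k * r_direct) / r_direct - np.exp(1j * k * r_image) / r_image
    return 20.0 * np.log10(np.abs(p))

# Hypothetical example: 200-Hz tone, source at 3 m depth, receiver at 20 m depth
ranges = np.linspace(50.0, 2000.0, 500)
level = lloyds_mirror_level(ranges, 200.0, 3.0, 20.0)

Evaluating this over a band of frequencies as the range changes (e.g., during a ship's pass) produces the U-shaped pattern of peaks and nulls seen in Fig. 6.10.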

Scattering at the Sea Surface
If the sea surface is not flat, then some of the reflected energy is scattered away from the geometric reflection direction, reducing the amplitude of the geometrically reflected wave. This is called surface scattering loss, which increases as the roughness of the sea surface increases, the acoustic wavelength decreases (i.e., acoustic frequency increases), and the grazing angle between the direction of the incident wave and the plane of the sea surface increases. This relationship is quantified by the Rayleigh roughness parameter (Jensen et al. 2011):

γ = 4π (h / λ) sin θ   (6.12)

where h is the root-mean-square (rms) roughness of the surface (i.e., approximately ¼ of the significant wave height), λ is the acoustic wavelength, and θ is the grazing angle. The larger the value of γ is, the larger is the apparent roughness of the surface. The corresponding effective pressure reflection coefficient of the sea surface is then given by:

R′ = e^(−0.5 γ²)   (6.13)

which corresponds to an additional propagation loss of −20 log10 |R′| = 4.34 γ² dB each time the sound reflects off the surface (Fig. 6.11). Note, however, that these formulae are only valid for surfaces that are not too rough, which, in this case, means γ < 2, corresponding to a scattering loss < 17 dB per bounce.

Strictly speaking, the effective pressure reflection coefficient (Eq. 6.13, Fig. 6.11) applies to the coherent component of the acoustic field, which can be thought of as the component that does not change as the rough sea surface moves. There will also be a scattered component that does change, and in some situations, this is an important contributor to the received signal. This component is ignored by Eq. (6.13), which can therefore be considered to provide an upper limit on the propagation loss per bounce.

Fig. 6.11 Graphs of additional propagation loss per bounce as a function of grazing angle for reflection from rough surfaces with various ratios of rms roughness to acoustic wavelength
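Equations (6.12) and (6.13) are easy to combine into a quick estimate of the extra loss incurred on each surface bounce. The Python sketch below is an added illustration (not from the original chapter); the wave height, frequency, and grazing angle in the example are arbitrary assumptions:

import numpy as np

def rayleigh_roughness(h_rms, wavelength, grazing_deg):
    """Rayleigh roughness parameter gamma (Eq. 6.12)."""
    return 4.0 * np.pi * (h_rms / wavelength) * np.sin(np.radians(grazing_deg))

def surface_loss_per_bounce_db(h_rms, wavelength, grazing_deg):
    """Additional loss per surface reflection, -20*log10|R'| = 4.34*gamma^2 (from Eq. 6.13).
    Only meaningful for gamma < about 2 (i.e., losses below roughly 17 dB per bounce)."""
    gamma = rayleigh_roughness(h_rms, wavelength, grazing_deg)
    return 4.34 * gamma**2

# Example: 0.5-m rms roughness, 500-Hz sound (wavelength ~3 m), 30-degree grazing angle
print(round(surface_loss_per_bounce_db(0.5, 3.0, 30.0), 1))   # a few dB per bounce

The same roughness factor e^(−0.5 γ²), applied to the flat-interface reflection coefficient, is used for a rough seafloor in Eq. (6.14) below.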

6.4.3.2 The Seafloor Interface
The interaction of sound with the seafloor is more complicated. The acoustic properties of the seabed are often similar to those of the water, so a significant amount of sound can penetrate the seabed. The lower the frequency is, the deeper the sound can penetrate. At frequencies below a few kHz, it is common for a significant amount of acoustic energy to be reflected back into the water column from geological layering within the seabed. Seismic survey companies searching for oil and gas reserves are taking advantage of this.

Some of this complexity is illustrated in Fig. 6.12, which plots the pressure reflection coefficient as a function of grazing angle for four different seabed types: silt, sand, limestone, and basalt. Silt and sand layers are unconsolidated, which means that shear waves have a low speed and attenuate rapidly. (Shear waves are waves in which the particles oscillate at right angles to the direction of sound propagation; see Chap. 4.) Acoustically, they can often be well approximated by a fluid (which does not support shear waves at all) with an increased attenuation to account for the shear wave losses.

Fig. 6.12 Curves of pressure reflection coefficient versus grazing angle for four different seabed types, calculated with parameters from Jensen et al. (2011)

Unconsolidated sediments become more reflective as the sediment grain size increases from silt to sand. Limestone and basalt are consolidated rocks, which allow both compressional waves and shear waves to propagate, and are thus referred to as solid elastic seabeds. Basalt is a hard rock and highly reflective at all grazing angles. The reflection coefficient of limestone, however, is perhaps surprising. While it is also a rock, it has the lowest reflectivity of the four seabeds at small grazing angles. This is because the shear wave speed in limestone is very similar to the sound speed in water, which allows energy to pass easily from sound waves in the water to shear waves in the seabed.

Curves of reflection coefficients versus grazing angle are even more complicated for layered seabeds due to interference between waves reflecting from different layers, and in this case, the reflectivity becomes frequency dependent. Despite the complexity, there are computer programs available, based on techniques described in Jensen et al. (2011), that can numerically calculate the reflection coefficient curve for any arbitrarily layered seabed. A good example is BOUNCE, which is part of the Acoustics Toolbox.⁶ A much bigger problem is the common lack of information on the geoacoustic properties of the seabed, to be able to provide these programs with accurate input data.

⁶ Acoustics Toolbox: https://oalib-acoustics.org/models-and-software/acoustics-toolbox/; accessed 30 September 2020.

Seafloor roughness can further reduce the apparent acoustic reflectivity, although if the rms roughness is known, this can be dealt with (at least approximately) by using Eq. (6.12) to calculate the associated Rayleigh roughness parameter γ as a function of grazing angle. The effective seabed reflection coefficient is then:

R′ = R e^(−0.5 γ²)   (6.14)

where R is the pressure reflection coefficient for the flat seafloor (Eq. 6.10). All terms in this equation depend on grazing angle. The propagation loss per bounce is given by −20 log10 |R′|.

6.4.3.3 Scattering Within the Water Column
Sound can be scattered within the water column by anything that causes sharp changes in sound speed, density, or both (i.e., acoustic impedance, which is the product of sound speed and density; see Chap. 4). This includes gas bubbles, biological organisms (in particular those with gas-filled organs like lungs or swim bladders), and suspended sediment particles. Water column scattering is utilized in active sonar systems, which rely on the backscattered signal to detect and/or characterize objects within the water column. However, clouds of air bubbles formed by breaking waves can cause an appreciable increase in propagation loss in some circumstances.

Air bubbles are essentially small, resonant cavities within the water column, which can both scatter and absorb sound and, when found in large numbers, can change the effective density, and hence sound speed, of the water. When a wave breaks, it entrains a large amount of air down to depths of several meters, forming a cloud of bubbles of a range of sizes. The large bubbles rise to the surface quite quickly, but the smaller bubbles can remain at depth for many minutes. This can increase the propagation loss for sound traveling close to the surface (Ainslie 2005; Hall 1989).

6.4.4 Numerical Propagation Models

6.4.4.1 The Wave Equation and Solution Approaches
The ocean is a complicated environment for sound propagation, and the simple approaches to estimating propagation loss described above are very limited in their applicability. As a result, a great deal of effort has gone into developing numerical propagation models that can calculate acoustic propagation loss for realistic situations. What follows is a brief introduction to the topic. The interested reader is referred to Etter (2018)

and Jensen et al. (2011) for a more comprehen- horizontal distance from the source to the
sive treatise. receiver, z is the receiver depth below the sea
Fundamentally, all numerical propagation surface, and ϕ is the horizontal plane azimuth
models solve the acoustic wave equation, which angle of the receiver relative to some direction
is a differential equation that relates the way the reference.
pressure changes over time to how it changes Many modeling approaches start by assuming
spatially as a wave propagates: that the solution has a harmonic time dependence
∇²Φ = (1/c²) ∂²Φ/∂t²   (6.15)
so that p(r, z, ϕ, t) = pω(r, z, ϕ) e^(−iωt) where ω = 2πf is the angular frequency and i = √(−1). Substituting this solution form into the wave
where ∇2 is the Laplace operator, ∂ indicates the equation (Eq. 6.15) leads to another differential
partial derivative, c is the speed of sound, equation called the Helmholtz equation, which
t represents time, and Φ is the solution to the can be solved at a specified ω to give pω(r, z, ϕ).
wave equation. The computational advantage of this is that the
The wave equation itself is well understood Helmholtz equation can be solved independently
and straightforward to solve in simple cases; how- for each required frequency, converting a coupled
ever, there are two issues that make it difficult to four-dimensional (4D) problem into a number of
solve numerically for typical underwater acous- independent 3D problems. Models that use this
tics problems: approach are known as frequency domain
models, whereas models that directly solve the
1. Solutions are usually desired over domains wave equation are known as time domain models.
that are orders of magnitude larger than the If required, the time domain solution can be
acoustic wavelength. Direct solution methods, reconstructed from multiple frequency domain
such as finite differences or finite elements, solutions using Fourier synthesis (see Jensen
require meshing the solution domain at a reso- et al. 2011, Chap. 8, for details).
lution of a small fraction of a wavelength, so The azimuth angle dependence can be dealt
the size of the required domain makes these with by two different approaches. Modeling in
approaches impractical for most propagation 3D retains the full azimuth dependence of the
problems, even with modern computing environment, whereas N  2D modeling assumes
hardware. that changes in the environment due to small
2. The boundaries of the domain, particularly the changes in ϕ have negligible effect on sound
seabed, are complicated, but very important to propagation, so that modeling can be carried out
model accurately as they have a strong influ- independently along each azimuth of interest. The
ence on sound propagation. majority of numerical models use the N  2D
Getting around these difficulties requires approach, because there is again a substantial
making approximations that lead to equations computational saving, this time by reducing a
that are practical to solve for the problems of coupled 3D problem, solving for pω(r, z, ϕ), to
interest, with different approximations leading to a number of independent 2D problems, each solv-
different methods suitable for different situations. ing for pω, ϕ(r, z) using only environmental infor-
In general, the solution of the acoustic wave mation for the corresponding azimuth.
equation is a function of three spatial dimensions The inherent assumption of the N  2D
and time. In Cartesian coordinates, the acoustic method provides a good approximation to the
pressure can be written as: p(x, y, z, t). In most sound field in many propagation modeling
cases, we are interested in the field generated by a situations where horizontal sound speed gradients
small source, which can be approximated as a are much smaller than vertical sound speed
single point in space. It is more convenient to gradients, the seabed slopes are small, and the
work in cylindrical coordinates centered on the ranges are not large enough for the remaining
source location, p(r, z, ϕ, t), where r is the out-of-plane effects to have an appreciable effect

on the sound field. However, there are cases transition will be smoother. Both of these problems
where full 3D modeling may be required; for are a result of a high-frequency approximation
example, around steep-sided submarine canyons, inherent in ray theory, which cannot deal with
in the presence of nonlinear internal waves that diffraction (i.e., the phenomenon of waves bending
can produce strong horizontal sound speed around obstacles or spreading out after passing
gradients, or for very-long-range propagation through a narrow gap; see Chap. 5 on sound prop-
across ocean basins. agation examples in the terrestrial world).
Some propagation models further simplify An alternative approach to calculating the
their calculations by assuming that the environ- amplitude of the acoustic field is to treat each
ment (but not the sound field) is independent of ray as the center of a beam with a specified
range, which means that the sound speed profile is (usually Gaussian) amplitude profile. The field
a function of depth only, and the water depth and at a particular location is then obtained by sum-
seabed properties are the same at all ranges (i.e., ming the contributions from all the beams that
the seafloor is flat). These are called range-inde- overlap at that location. The main challenge with
pendent (RI) propagation models, whereas prop- this approach is determining how the amplitude
agation models that allow the sound speed profile and width of the beam should change along the
and/or the water depth and/or the seabed ray, but algorithms have been developed to do
properties to vary with range are known as this (see Jensen et al. 2011, Sect. 3.5, for details).
range-dependent (RD) models. One of the best-known propagation codes of this
Acoustic propagation models are usually type is Bellhop (Porter and Bucker 1987), a fully
characterized by the numerical approach adopted, range-dependent, Gaussian beam tracing program
and the following sections described some of the suitable for N  2D modeling that is available as
most common. Guidance on which propagation part of the Acoustics Toolbox. The toolbox also
model to use in various scenarios follows this includes a fully 3D variant called Bellhop3D.
section. Although Gaussian beam tracing is an
improvement to conventional ray tracing and
6.4.4.2 Ray and Beam Tracing reduces the effects of the high-frequency assump-
A ray is a vector, normal to the wavefront, and tion inherent in ray theory, it does not completely
shows the direction of sound propagation. Ray eliminate them. Its treatment of shadow zones and
models trace rays by repeatedly applying Snell’s caustics produces realistic, but not necessarily
law (Eq. 6.11). For layered media (such as layers accurate results and, importantly, it does not pre-
of ocean water with differing properties), Snell’s dict waveguide cutoff effects.
law relates the angles of incidence θ1 and refrac- In underwater acoustics, the term waveguide
tion θ2 at every layer boundary. Rays bend or duct is used to describe any situation in which
towards the horizontal, if c2 > c1, and away sound is constrained to a particular span of
from the horizontal if c1 > c2. depths by reflection, refraction, or some combi-
There are several approaches to calculating the nation of the two. Common examples include
amplitude of the acoustic field. The simplest, (Fig. 6.13):
known as conventional ray tracing, is to use the
1. A shallow-water duct in which sound is
distance between initially adjacent rays to deter-
constrained by reflection from both the sea
mine the area over which the sound power has
surface and the seabed.
spread and calculate the intensity as the power
2. A surface duct, in which the sound speed near
per unit area. Unfortunately, this method results
the sea surface increases with increasing depth.
in unphysical predictions of infinite sound ampli-
This results in sound that is initially heading
tude at locations called caustics, where initially
downward being refracted upwards towards
adjacent rays cross and therefore have zero separa-
the sea surface, where it is reflected back down-
tion. It also predicts sharp transitions to zero sound
ward again, and so on. It is therefore
intensity in shadow zones, which are regions
constrained by reflection at the top and by
where rays do not enter, whereas in reality, the

Fig. 6.13 Sound speed profiles (left) and ray trace plots described in the text. The source depth was 10 m for all
computed using Bellhop (Porter and Bucker 1987, right) except the deep sound channel example, which had a
illustrating the common underwater acoustic ducts source depth of 1200 m

refraction at the bottom. Weak surface ducts the minimum in the sound speed (i.e., towards
are often found in the mixed layer due to sound the waveguide axis). The waveguide axis
speed increasing with increasing pressure, and occurs at a depth of about 1000 m in much of
strong surface ducts are ubiquitous in polar the world’s ocean. The sound is constrained by
oceans because both pressure and temperature refraction both above and below the axis of the
increase with increasing depth. Sea ice can, waveguide. However, these are not sharp
however, reduce the acoustic reflectivity of boundaries, and the steeper the angle of prop-
the sea surface and therefore increase the atten- agation is, the larger are the excursions of the
uation of sound traveling in the duct. ray paths away from the axis.
3. The Deep Sound Channel (DSC), also known 4. Convergence zone propagation in which
as the sound fixing and ranging (SOFAR) sound is constrained by reflection from the
channel, in which sound is refracted towards sea surface and refraction from the increase

of sound speed with increasing depth that to zero, which requires that an incident sound
occurs below the axis of the DSC. wave is inverted on reflection. Conversely, the
seafloor is a hard boundary, which requires that
In all cases, the waveguide will only trap rays
the incident and reflected waves sum to a maxi-
leaving the source within a certain span of angles
mum pressure; so the amplitudes of the incident
from the horizontal. In the case of the shallow
and reflected waves must have the same sign.
water waveguide, this is because the seabed
Both of these boundary conditions have to be
reflectivity reduces as the grazing angle increases
satisfied simultaneously. The water depth is fixed,
(Fig. 6.12), so more energy is lost on each bottom
and normal modes consider one frequency at a
bounce at steeper angles. In the other waveguide
time, so the wavelength is fixed. The only vari-
cases, it is because the refraction is not
able that can change to satisfy the requirements is
strong enough to turn the ray around before it
the angle from the horizontal at which the wave
either reaches a depth where the sound speed
propagates. There are certain, discrete propaga-
gradient is refracting it away from the waveguide
tion angles that allow the surface and seafloor
(surface duct) or it hits the seabed (DSC and
boundary conditions to be met simultaneously,
convergence zone).
corresponding to the normal modes. Each normal
According to ray theory, rays can be launched
mode consists of a pair of plane waves, one
at any angle, irrespective of the frequency, and so
propagating upward and the other downward, at
it should always be possible to find rays that will
the same angle to the horizontal (Fig. 6.14). The
be trapped in the waveguide, provided the source
mode that corresponds to the pair of waves
is at a suitable depth. However, this is not actually
propagating closest to the horizontal is called the
the case at low frequencies, where the acoustic
lowest-order mode (mode 1), and the mode order
wavelength becomes an appreciable fraction of
increases as the propagation angle gets steeper.
the thickness of the waveguide. It turns out that
Note that the waves can never propagate exactly
if the frequency is sufficiently low, no energy will
horizontally, because that does not meet the
be trapped in the waveguide, and the waveguide
boundary conditions.
is said to be cut off. Understanding why this is the
A receiver in the water column will receive the
case requires an understanding of normal modes,
sum of the pressures from the upward and down-
which is the topic of the next section.
ward traveling waves. The amplitude of that com-
bined signal can be plotted as a function of depth
6.4.4.3 Normal Modes and range for each mode, yielding a series of
Most people find the concept of normal modes to mode shape curves (Fig. 6.15). Note that there is
be less intuitive than that of rays, but it is very always a null in pressure (i.e., a node) at the sea
useful for understanding low-frequency sound surface and a maximum in pressure magnitude
propagation in the ocean and forms the basis for (i.e., þ1 or 1; an antinode) at the hard seafloor.
a class of acoustic propagation models called The mode shapes are reminiscent of standing
normal-mode models. waves on a guitar string, which are also
Normal modes are best understood by first normal modes. However, on a guitar string,
considering an ideal shallow-water waveguide different modes correspond to different
with a constant depth (i.e., flat seafloor), constant frequencies of vibration, whereas in a waveguide,
sound speed, and perfectly reflecting seafloor. different modes correspond to sound of the same
Solving the Helmholtz equation for this situation frequency propagating at different angles to the
requires that two so-called boundary conditions horizontal.
be met: one at the sea surface and one at the For any waveguide thickness, the propagation
seafloor. The sea surface is a soft boundary as angles for a particular mode increase as frequency
far as underwater sound is concerned, so the is reduced. The ideal waveguide considered so far
boundary condition here is that the acoustic pres- has no limit to how steep the propagation angles
sure due to the incident and reflected waves sums can be, but that is not the case for real ocean

Fig. 6.14 Depth-range


plots showing how the
normal modes of an ideal
shallow-water waveguide
(lower panel) result from a
pair of upward (upper
panel) and downward
(middle panel) propagating
plane waves. Left-hand
panels are for mode 1, right-
hand panels are for mode
2. Arrows show the
direction of propagation.
The water depth is 50 m and
the acoustic wavelength is
20 m

as frequency is reduced, it will become too steep


to be constrained by the waveguide and will no
longer be able to propagate. As frequency is
reduced further, the same will happen to the
next-highest-order mode, and so on until the
lowest-order mode is unable to propagate, at
which point the waveguide is said to be cut off.
In real ocean waveguides, the sound speed
varies with depth, which causes the propagation
angle of each mode to also be a function of depth.
This changes the mode shapes, but you can still
consider a mode to consist of a pair of upward and
downward going waves, propagating at the same
angle to the horizontal at any given depth.
Fig. 6.15 Mode shapes for the first four normal modes of
a 50-m deep ideal shallow-water waveguide with a rigid The starting point for the mathematical deriva-
seabed tion of normal-mode models is the depth-
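To make the ideal-waveguide picture concrete, the short Python sketch below computes the propagating modes, their cutoff frequencies, and their propagation angles for exactly this idealized case (isovelocity water, pressure-release surface, rigid seabed). It is an added illustration using assumed round numbers (1500 m/s sound speed, and the 50-m depth and 20-m wavelength of Fig. 6.14); it is not a substitute for a normal-mode model such as KRAKEN:

import numpy as np

def ideal_waveguide_modes(freq, depth, c=1500.0, max_order=100):
    """Modes of an isovelocity waveguide with a pressure-release surface and rigid seabed.
    Returns the cutoff frequencies of the first max_order modes and the propagation
    angles (degrees from the horizontal) of the modes that propagate at 'freq'."""
    k = 2.0 * np.pi * freq / c
    m = np.arange(1, max_order + 1)
    kz = (m - 0.5) * np.pi / depth            # vertical wavenumbers meeting both boundary conditions
    f_cutoff = (m - 0.5) * c / (2.0 * depth)  # mode m only propagates above this frequency
    angles = np.degrees(np.arcsin(kz[kz < k] / k))
    return f_cutoff, angles

def mode_shape(z, order, depth):
    """Pressure mode shape: a node at the surface, an antinode at the rigid seabed (Fig. 6.15)."""
    return np.sin((order - 0.5) * np.pi * z / depth)

# 50-m deep waveguide, 20-m wavelength (75 Hz at 1500 m/s), as in Fig. 6.14
f_cut, angles = ideal_waveguide_modes(75.0, 50.0)
print(len(angles), "propagating modes; waveguide cutoff at", f_cut[0], "Hz")

For this example, the lowest-order mode ceases to propagate below 7.5 Hz, which is the frequency at which this idealized waveguide is cut off.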
separated Helmholtz equation, which is valid for
range-independent problems and is obtained by
waveguides which, as discussed in the previous
assuming that the acoustic field can be
section, all have limits on the angular range of the
represented by the product of a function of
energy they can trap. The highest-order mode
depth and a function of range:
corresponds to the steepest propagation angle, so

pω,ϕ(r, z) = F(z)G(r)
One limitation of normal-mode models such as


KRAKEN is that they only include the component
Substituting this into the Helmholtz equation of the acoustic field that is fully trapped in the
results in a one-dimensional differential equation waveguide, so they tend to be inaccurate at short
for F(z) in terms of a separation constant kr. The ranges where the component of the field that is
solution of this differential equation has poles losing energy out of the waveguide can be signif-
(infinities) at certain values of kr, which corre- icant. This problem can be addressed by includ-
spond to the normal modes. Normal-mode codes ing so-called leaky modes in the solution.
search for these values of kr, calculate the However, reliably finding leaky modes turns out
corresponding mode shapes, and then compute to be a very challenging numerical task. The most
pω,ϕ(r, z) by a mathematical technique called the successful normal-mode model to-date in this
“method of residues,” which involves summing respect is ORCA (Westwood et al. 1996), which
the contributions of all the poles, which in this is accurate at short range and can also deal with
case, corresponds to summing the contributions seabeds that support shear waves. ORCA was
of the individual modes. It turns out that kr has a written as a range-independent model, but there
geometric interpretation. It is called the horizontal have been several attempts to adapt it to range-
wavenumber and is related to the modal dependent problems using the adiabatic mode
propagation angle θ (relative to the horizontal) method (Hall 2004; Koessler 2016).
by kr ¼ ω cos(θ)/c.
Normal-mode codes are computationally very 6.4.4.4 Wavenumber Integration
fast for range-independent problems, because the The mathematical derivation of the wavenumber
modes only have to be found once, after which integration method also starts with the depth-
the field can be calculated at any desired range separated Helmholtz equation, but in this
with very little additional computational effort. case, F(z) is calculated by direct numerical solu-
Dealing with range-dependent problems tion of the one-dimensional differential equation
involves approximating the environment as a over a range of kr values, giving the so-called
series of range-independent sections, calculating wavenumber spectrum. The acoustic field
the modes for each of these sections, and then pω,ϕ(r, z) is then obtained by an integral trans-
calculating how the energy present in the modes form of the wavenumber spectrum that involves a
in one section transmits across the boundary to Hankel function. A numerical approximation to
the modes in the next section. There are two the Hankel function that is valid except at ranges
approaches: smaller than the acoustic wavelength can be used
1. The adiabatic mode method assumes that all to convert this integral transform into a Fourier
the energy in mode 1 stays in mode 1, all the transform, which can then be evaluated using the
energy in mode 2 stays in mode 2, etc. This is very efficient Fast Fourier Transform algorithm.
relatively simple to implement and fast to Wavenumber integration codes that use this
compute, but is only accurate for environments method of evaluating the integral transform are
that change relatively slowly with range. known as fast-field programs. Common examples
2. The coupled-mode method allows energy to are SAFARI, OASES, and SCOOTER (Porter
transition between modes, and so can deal with 1990; Schmidt and Glattetre 1985). OASES is a
environments that change more rapidly. But development of SAFARI and has largely
this method is much more computationally superseded it, whereas SCOOTER, which is part
demanding. of the Acoustics Toolbox (Footnote 5), is a
separate, but largely equivalent, development.
A good example of a normal-mode model is These programs are very accurate for acoustic
KRAKEN (Porter and Reiss 1984), which can be propagation calculations at ranges close enough
used for both range-independent and range- to the source that the environment can be consid-
dependent modeling (both adiabatic and coupled) ered range-independent, and can deal with
and is part of the Acoustics Toolbox (Footnote 5). arbitrarily complicated, layered seabeds. For

most applications, the short-range limitation so-called high-angle PE models greatly relaxed
introduced by the Hankel function approximation this approximation. The way in which the solu-
is of little consequence, but, if necessary, it can be tion marches out in range makes it straightfor-
removed (at additional computational cost) by ward to include range-dependent water depth,
directly evaluating the integral transform. sound speed profiles, and seabed properties, and
It has proved difficult to extend the as a result, high-angle PE models have become
wavenumber integration method to range- the method of choice for solving range-dependent
dependent problems in a way that results in an propagation problems.
efficient propagation model, although the full Perhaps the most widely used PE model is
(paid) version of OASES7 does have this capabil- RAM (Collins 1993), which allows the user a
ity. The theoretical background of this model is trade-off between the valid angular range and
described in Goh and Schmidt (1996). computational efficiency by specifying the num-
ber of terms to be used in a Padé approximation,
which is central to the wide-angle algorithm. The
6.4.4.5 Parabolic Equation
more terms that are used in the Padé approxima-
Inserting a solution of the form pω,ϕ(r, z) =
tion, the wider is the valid angular range. Even
f(r, z) H0^(1)(k0 r) into the Helmholtz equation though this allows the paraxial approximation to
yields parabolic-equation (PE) models. Here, be greatly relaxed, it cannot be completely
H0^(1) represents an outgoing cylindrical wave eliminated, and so PE models should always be
with wavenumber k0 = 2πf/c0 where c0 is an used with care when acoustic energy propagating
assumed sound speed. Technically, H0^(1) is a at steep angles is significant.
Hankel function of the first kind of zero order. Another consideration when running RAM or
The aim of PE models is to solve for f(r, z), which similar PE models is that they use a finite compu-
represents the way in which the true field varies tational grid in the depth direction, and energy
from that produced by the ideal outgoing will be artificially reflected by the sudden trunca-
cylindrical wave. tion at the bottom of the grid. This is usually dealt
If the sound is assumed to be propagating with by including an extra attenuation layer
predominantly in the range direction (the underneath the layer representing the physical
so-called paraxial approximation), then an effi- seabed. The attenuation layer has the same
cient numerical algorithm can be employed. density and sound speed as the seabed but an
Given f(r, z), a small range step dr is added to artificially high attenuation coefficient so that
calculate f(r + dr, z), a little bit farther from the little energy reaches the bottom of the grid,
source. This calculation can then be repeated as and any energy that does reflect is further
many times as desired to march the solution out in attenuated before reappearing in the water col-
range. The sound field at one range is thus used to umn. A sudden change in attenuation can also
calculate the sound field at the next range and so lead to reflections, so in critical situations, it is
on, without explicitly solving the depth-separated advisable to ramp the attenuation up smoothly
Helmholtz equation, making this a fundamentally from its seabed value to a high value, rather than
different approach to the normal mode and having a step change.
wavenumber integration methods discussed There are several variants of RAM intended for
previously. different purposes (Table 6.1). The only one that
Initially, the paraxial approximation was very can deal with elastic seabeds is RAMS, but it
restrictive and severely limited the utility of PE requires careful tuning of parameters to avoid
models for solving underwater acoustics instability, and in some cases involving layered
problems. The more recent development of seabeds, it is impossible to obtain a stable solu-
tion. More recent PE models have been devel-
7
OASES code https://oceanai.mit.edu/lamss/pmwiki/ oped that overcome these limitations (Collis
pmwiki.php?n¼Site.Oases; accessed 1 October 2020. et al. 2008) yet are research codes not readily

Table 6.1 Summary of variants of the RAM parabolic-equation codes


Program Seabed layering Seabed type Sea surface
RAM Specified relative to the sea surface. Bathymetry cuts through the stack Fluid only Flat
of layers.
RAMSurf As for RAM. Fluid only Specified profile
RAMGeo Specified relative to the seabed. Layering follows bathymetry. Fluid only Flat
RAMS As for RAM. Elastic Flat

available. The majority of PE codes are intended the greater practicality of the PE range-marching
for N  2D modeling. However, research-level algorithm.
3D PE codes have been developed (see Jensen Range-dependent modeling with layered elas-
et al. 2011, Sect. 6.8, for details). tic seabeds remains a difficult computational task.
One commonly resorts to work-around strategies,
such as replacing the true seabed with an “equiv-
alent” fluid seabed that has a similar reflection
6.4.5 Choosing the Most Appropriate
coefficient versus grazing angle dependence at
Model
low grazing angles. This allows a standard PE
code to be used for the modeling but is only
If the frequency is high enough that the acoustic
accurate at ranges large enough that there is no
wavelength is less than a small fraction of the
high-angle energy reaching the receiver.
smallest significant feature in the sound speed
profile (e.g., mixed layer thickness, water
depth), then use a ray tracing or beam model
(e.g., Bellhop), otherwise use one of the 6.4.6 Accessing Acoustic Propagation
low-frequency models. A rule of thumb for the Models
‘small fraction’ is 1/100. However, accurately
modeling sound propagation in a weak duct may Many of the models described in this chapter
require the use of a low-frequency model up to a are freely available for download from the
higher frequency than this rule would suggest. If Ocean Acoustics Library8 (OALIB). OALIB
in doubt, run some tests using both types of includes Michael D. Porter’s Acoustics Toolbox,
models to determine the frequency at which the which incorporates a Gaussian beam tracing
two models start to agree. model (Bellhop), wavenumber integration code
When choosing a low-frequency model, if the (SCOOTER), normal-mode model (KRAKEN),
range is short enough that the environment can be as well as several other useful programs including
considered range-independent, then pick a one for calculating seabed reflectivity as a func-
wavenumber integration model (e.g., OASES or tion of grazing angle for arbitrarily complicated,
SCOOTER), otherwise use a PE model (e.g., layered seabeds (BOUNCE). These all use similar
RAM). The benefit of wavenumber integration input and output file formats, have been regularly
for range-independent modeling is its greater updated until at least 2020, and are well
accuracy at short range compared to either a documented. A number of MATLAB (The
normal-mode model (which only considers MathWorks Inc., Natick, MA, USA) routines for
trapped energy) or a PE model (which has high- dealing with the input and output are also
angle limitations). Wavenumber integration can provided. Also available on OALIB is the free
also deal accurately with elastic seabed effects, version of the wavenumber integration code
which tend to be most important at short range.
PE codes have largely replaced normal-mode
codes for range-dependent modeling because of 8
Ocean Acoustics Library https://oalib-acoustics.org/;
accessed 17 June 2020.

OASES and a number of different PE codes, sound emission characteristics, and are
including the RAM family. conceptually based on re-arranging the passive
Unfortunately, downloading a particular code sonar equation (Eq. 6.1) to solve for the received
is often just the start of a journey that may level RL:
include compiling it for the particular operating
RL = SL − PL   (6.16)
system you are using, deciphering the documen-
tation to determine what input files are required The tasks are:
and how they need to be formatted, and then
working out how to read and plot the output 1. Calculate RL as a function of range and depth
data. There are usually a number of adjustable in a given direction from a tonal (i.e., single-
parameters that affect how the program operates, frequency) source.
and it is necessary to have an understanding of 2. Calculate RL as a function of range and depth
the underlying numerical methods in order to set in a given direction from a broadband source.
these appropriately. Inappropriate parameter 3. Calculate RL as a function of geographical
selection will often lead to meaningless results, position and depth for an omnidirectional
so whenever you start using a different propaga- source in a directional environment.
tion model, you should run a series of tests on 4. Calculate RL as a function of geographical
simple problems (to which the answer is known) position and depth for a directional source in
in order to make sure you are getting the correct a directional environment.
results. The standard of documentation varies Indicative execution times are given for
considerably between the different models that calculations that were carried out on a desktop
are available from OALIB and is minimal computer with an Intel i7–7700 CPU, a clock
for some. speed of 3.6 GHz, and 64 GB of RAM. The
AcTUP9 is a MATLAB GUI to earlier (2005) processor had 4 physical cores but the models
versions of the Acoustics Toolbox and several of used here were single-threaded so only used one
the RAM family of PE codes. AcTUP comes core. The computer was running a 64-bit
packaged with the required Windows Windows 10 operating system.
executables. This provides a convenient entry
point for those new to acoustic propagation
modeling as it allows different codes to be run
on the same problem with minimal changes. 6.5.1 Received Level Versus Range
However, careful parameter selection is still and Depth from a Tonal Source
required in order to get meaningful results; put
garbage in, get garbage out. For this case, it is only necessary to specify the
acoustic environment (i.e., bathymetry profile,
sound speed profile, and seabed properties)
6.5 Practical Acoustic Modeling along a single azimuth from the source. The
Examples propagation loss PL is only required at the source
transmission frequency, and can be obtained
Having worked through the theory and concepts, using a single run of an appropriate propagation
this section finally puts all of the above into action model. The received level RL can then be
and provides examples of some practical acoustic obtained using Eq. (6.16).
propagation modeling tasks of increasing com- The example of a fin whale (Balaenoptera
plexity. These all involve the estimation of physalus) located about 40 km off the coast of
received levels due to a source with known southwestern Australia, at a depth of 50 m, while
emitting a 20-Hz tone at a source level of 189 dB
9 re 1 μPa m (Sirovic et al. 2007) is depicted in
AcTUP http://cmst.curtin.edu.au/products/underwater/
download/; accessed 1 October 2020. Fig. 6.16. The modeled direction of propagation

A) B)

Fig. 6.16 (a) Sound speed profile used for the modeling a 20-Hz tone with a source level of 189 dB re 1 μPa m. The
examples. (b) Modeled received SPL as a function of magenta line is the seafloor
range and depth for a fin whale at a depth of 50 m emitting

was due west from the source, and the bathymetry has traveled from the source to the receiver via
profile (i.e., magenta line in Fig. 6.16b) was different paths. This is typical of the sound fields
interpolated from the Geosciences Australia produced by tonal sources. The overall reduction
0.150 resolution bathymetry database.10 The in received level with increasing range is quite
sound speed profile (Fig. 6.16a) was calculated slow, particularly beyond 70 km, due to the sound
from salinity and temperature data obtained from becoming constrained by refraction in the deep
the World Ocean Atlas (Locarnini et al. 2018; sound channel. This is typical of downslope prop-
Zweng et al. 2018). The seabed was modeled as agation from a near-surface source situated over
a fine sand half-space with parameters from the continental slope into deep water.
Jensen et al. (2011). Propagation loss modeling
was carried out with RAMGeo in AcTUP, which
is very efficient at such a low frequency, taking
6.5.2 Received Level Versus Range
only a few seconds. A simple program was writ-
and Depth from a Broadband
ten in MATLAB to read the propagation loss file
Source
produced by RAMGeo, calculate the received
levels using Eq. (6.16), and plot the results.
Many sources of underwater sound are broad-
Note that AcTUP can be used to plot propagation
band, which means that they produce significant
loss, but not received level.
acoustic output over a wide range of frequencies.
The sound field has a complicated structure of
Ships, pile driving, and the airgun arrays used for
peaks and nulls that is the result of constructive
seismic surveying all produce broadband noise,
and destructive interference between sound that
and modeling the resulting sound fields is of
importance when assessing the potential impacts
of these sources on marine animals.
10
A common way to carry out broadband
Whiteway, T., Australian Bathymetry and Topography
Grid, June 2009, https://ecat.ga.gov.au/geonetwork/srv/
modeling for continuous sound such as ship
eng/catalog.search#/metadata/67703; accessed noise is:
6 November 2020.

1. Break the required frequency span into a series of frequency bands (e.g., 1/3 octave bands are commonly used; see Chap. 4).
2. Use a propagation model to estimate a typical propagation loss for each band. This can either be done by running the propagation loss model at the center frequency of each band or by running it at a number of frequencies within the band and then averaging the results. The latter is preferred as it smooths out the interference field to some extent, but if the source emits a wide range of frequencies that span many bands, then the two methods will yield very similar results for the total field.
3. Integrate the source power spectral density over each band and convert to a source level.
4. Use Eq. (6.16) to obtain the received level in each band.
5. Sum the corresponding mean-square pressures across the bands to obtain an overall mean-square pressure that can then be converted to an overall received sound pressure level (SPL, see Chap. 4).

The use of mean-square pressure as a metric is problematic for impulsive sources such as airguns or pile driving, because the results become very sensitive to the duration of the signal, which is often hard to determine. Source and received levels for impulsive sources are therefore usually characterized in terms of sound exposure, and its logarithmic measure, the sound exposure level (SEL, see Chap. 4).

Computing the received levels for impulsive sources follows the same steps as for broadband, continuous sources, except that in step 3, the source spectrum needs to be specified as an energy density spectrum instead of a power density spectrum, and in step 5, it is sound exposures that are summed across the bands to obtain the overall sound exposure, which is then converted to a sound exposure level.

As an example, the modeled received sound exposure levels due to a single 3.3-l (200-cui) airgun are plotted as a function of range and depth in Fig. 6.17. The airgun (i.e., a cylindrical tube filled with compressed air, which is suddenly released into the water) is located at the geographical location that was used for the fin whale example, but at a depth of 6 m, which is typical of seismic survey source depths. The scenario is otherwise the same as previously described. The airgun's source waveform was modeled using the Cagam airgun array model (Duncan and Gavrilov 2019). The airgun array model also calculated the signal's energy density spectrum, which was then used in step 3 of the broadband modeling procedure outlined above. Once again, AcTUP was used to run RAMGeo to carry out the propagation modeling, but this time at 1/3 octave band center frequencies from 7.9 Hz to 1 kHz, which took about 5 minutes. A separate MATLAB program was written to carry out the post-processing steps and to plot the results.
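The post-processing in steps 3 to 5 is straightforward to code. The MATLAB sketch below illustrates those steps for a single receiver position; it is only a minimal illustration, not the program used for the examples in this chapter. The band center frequencies, source spectrum, and propagation losses are placeholder values, and Eq. (6.16) is applied in the form received level = source level − propagation loss.

```matlab
% Minimal sketch of the broadband band-summing procedure (steps 3-5)
% for one receiver position. All numerical inputs are placeholders.
fc = 10.^(0.1*(10:30));              % nominal 1/3-octave centers, 10 Hz - 1 kHz
bw = fc*(2^(1/6) - 2^(-1/6));        % approximate 1/3-octave bandwidths (Hz)
S  = 170 - 20*log10(fc/10);          % hypothetical source PSD (dB re 1 uPa^2/Hz at 1 m)
PL = 60 + 10*log10(fc/fc(1));        % hypothetical propagation losses (dB) at this receiver

% Step 3: integrate the PSD over each band (assumed roughly flat in band).
SL_band = S + 10*log10(bw);          % band source levels
% Step 4: Eq. (6.16), received level = source level - propagation loss.
RL_band = SL_band - PL;
% Step 5: sum mean-square pressures across bands and convert back to dB.
SPL = 10*log10(sum(10.^(RL_band/10)));

% For an impulsive source, replace S by an energy density spectrum;
% the same summation then yields a sound exposure level (SEL) instead.
fprintf('Broadband received SPL: %.1f dB re 1 uPa\n', SPL);
```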

Fig. 6.17 Received SEL from a 3.3-l (200-cui) airgun at a depth of 6 m as a function of range and depth. The magenta line is the seafloor
Comparing Fig. 6.17 with Fig. 6.16, it can be seen that the broad range of frequencies emitted by the airgun has the effect of smoothing out the fluctuations in the sound field caused by interfering paths. The color scales on these two figures are not directly comparable because Fig. 6.16 gives SPL in dB re 1 μPa whereas Fig. 6.17 presents SEL in dB re 1 μPa²·s. The two are related through:

SEL = SPL + 10 log10(T)     (6.17)

where T is the duration of the received signal in seconds, conventionally defined as the duration of the time interval containing 90% of the signal's energy (90% energy signal duration; see Chap. 4).
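As a simple illustration of Eq. (6.17), a received impulse with a 90% energy signal duration of 0.1 s has an SEL that is 10 dB below its SPL computed over that duration, while for a 1-s signal the two values coincide. The MATLAB sketch below, using a purely illustrative synthetic waveform, computes the 90% energy signal duration, the corresponding SPL, and the SEL; by construction the three quantities satisfy Eq. (6.17).

```matlab
% Sketch: 90% energy duration, SPL over that window, and SEL for a
% received pressure waveform p (in uPa) sampled at fs (in Hz).
fs = 10e3;                              % sampling rate (Hz)
t  = (0:1/fs:1)';                       % 1 s of time
p  = 1e6*exp(-((t-0.2)/0.02).^2);       % illustrative impulse (uPa)

E  = cumsum(p.^2)/fs;                   % cumulative energy (uPa^2 s)
i1 = find(E >= 0.05*E(end), 1, 'first');
i2 = find(E >= 0.95*E(end), 1, 'first');
T90 = (i2 - i1)/fs;                     % 90% energy signal duration (s)

SEL = 10*log10(0.9*E(end));             % dB re 1 uPa^2 s (90% energy)
SPL = 10*log10(0.9*E(end)/T90);         % mean-square pressure over T90, dB re 1 uPa^2
% Consistent with Eq. (6.17): SEL = SPL + 10*log10(T90)
fprintf('T90 = %.3f s, SPL = %.1f dB, SEL = %.1f dB\n', T90, SPL, SEL);
```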
6.5.3 Received Level as a Function of Geographical Position and Depth

The geographical distribution of received sound levels can be modeled by repeating the tonal source modeling procedure (Sect. 6.5.1) or broadband source modeling procedure (Sect. 6.5.2) using bathymetry profiles appropriate for different directions from the source. For long-range modeling, it may also be necessary to make the sound speed profile a function of range and direction. This is called N × 2D modeling and is adequate in most circumstances, but is less accurate than running a fully 3D propagation model in situations involving sound propagating across steeply sloping seabeds, or in some special situations in which horizontal sound speed gradients become significant.

The result is a 3D grid of the received level as a function of range, depth, and azimuth (i.e., direction in the horizontal plane). To create a 2D map of the sound field, it is necessary to extract some measure of the sound field in the vertical dimension and then interpolate that in the horizontal plane, with the appropriate measure depending on the purpose of the modeling. For example, in environmental impact assessments, it is common to use the maximum level at any depth in the water column, or the maximum level in a depth range corresponding to the diving range of an animal of interest.

Here we illustrate N × 2D modeling using the previous two examples, but this time carrying out the propagation modeling with bathymetry appropriate for each of the 37 tracks shown in Fig. 6.18. These were set at 10° increments in azimuth, with some adjustment and an extra track inserted in the inshore direction to improve the definition of the received field in the vicinity of the two capes. MATLAB programs were written to automate the various steps of the process.

Results are plotted in Fig. 6.19 for the fin whale and the airgun. In both cases, the plots are of the maximum received level over depth, but once again, they are not directly comparable because SPL was plotted for the fin whale, whereas SEL was plotted for the airgun.
Fig. 6.18 Map showing the bathymetry off the southwest coast of Australia. The lines radiating from the chosen source location show the tracks along which propagation was modeled
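A minimal sketch of how such an N × 2D run might be automated in MATLAB is given below: one 2D propagation run per track, extraction of the maximum level over depth, and conversion of the azimuth-range grid to map coordinates for plotting. The functions load_bathymetry_profile and run_propagation_model are hypothetical placeholders for the user's own bathymetry reader and propagation-model wrapper; they are not functions of AcTUP or the Acoustics Toolbox.

```matlab
% Sketch of an N x 2D workflow: one 2D model run per azimuth, then a
% horizontal map of the maximum received level over depth.
% load_bathymetry_profile() and run_propagation_model() are hypothetical
% placeholders for the user's own database reader and model wrapper.
azimuths = 0:10:350;                 % deg; the chapter's example added an extra inshore track
ranges   = 0:0.1:60;                 % km along each track
maxRL    = zeros(numel(azimuths), numel(ranges));

for k = 1:numel(azimuths)
    profile  = load_bathymetry_profile(azimuths(k));   % range-depth profile for this track
    RL       = run_propagation_model(profile, ranges); % received levels, depth x range
    maxRL(k,:) = max(RL, [], 1);                       % maximum over depth
end

% Convert the azimuth-range grid to x-y coordinates and plot the map.
[rg, az] = meshgrid(ranges, azimuths*pi/180);
x = rg.*sin(az);  y = rg.*cos(az);
figure; pcolor(x, y, maxRL); shading interp; axis equal; colorbar;
```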
Fig. 6.19 (a) Map of maximum SPL over depth as a function of geographical position due to a fin whale calling at a depth of 50 m off the southwest coast of Australia. (b) Map of maximum SEL over depth due to a single firing of an airgun of volume 3.3 l (200 cui) at a depth of 6 m
6.5.4 Received Level as a Function of Geographical Position and Depth for a Directional Source

Another level of complexity occurs when the source emits sound differently in different directions. We illustrate this for an airgun array typical of those used for offshore seismic surveys. In this case, the array consists of 30 individual airguns of different sizes arranged in a 21-m wide by 15-m long rectangular array, with all airguns at the same depth of 6 m. The total volume of the compressed air released when the airguns fire is 55.7 l (3400 cui), and the tow direction is towards the North. The Cagam airgun array model was used to calculate a representative source spectrum corresponding to the direction of each of the propagation tracks shown in Fig. 6.18. Apart from using a different source spectrum for each direction, the procedure for calculating the received levels was identical to that described in the previous section for the single airgun.

The maximum received SEL at any depth is plotted in Fig. 6.20a, which uses the same color scale as Fig. 6.19b. The array produced higher levels overall, and the sound field was more directional, with distinct maxima east, west, and to a lesser extent, north and south from the source. Figure 6.20b combines range-depth plots for the 90° and 270° azimuths in a single plot, which illustrates the contrasting sound attenuation rates in the upslope and downslope directions.

6.5.5 Modeling Limitations and Practicalities

Provided the chosen propagation modeling approach is appropriate for the task, the largest uncertainties in the results are likely due to a lack of information on the environment, which includes the bathymetry, seabed composition, and water column sound speed profile. Bathymetry and water column sound speed profiles are often straightforward to measure or can be obtained from databases, but knowledge of the acoustic properties of the seabed is often poor (i.e., unavailable, patchy, and uncertain) and the parameters that contribute to the geoacoustics (e.g., sediment composition, density, and thickness) vary over space and not coherently (Erbe et al. 2021). Moreover, seabed properties tens or even hundreds of meters below the seafloor may be important when modeling low-frequency propagation (Etter 2018).
Fig. 6.20 (a) Map of maximum SEL over depth as a function of geographical position due to a single firing of a typical airgun array off the southwest coast of Australia. The total volume of the airguns in the array was 55.7 l (3400 cui), and the array was at a depth of 6 m. The tow direction of the array was northwards. (b) Received SEL from the same airgun array as a function of range and depth. The source was at 0-km range, negative ranges correspond to the 270° azimuth (i.e., west of the source) and positive ranges correspond to the 90° azimuth (i.e., east of the source). The magenta line is the seafloor. Colorbar applies to both panels
As a result, it is often prudent to carry out modeling with several different sets of seabed properties in order to obtain an estimate of the uncertainty in the results.

The use of N × 2D rather than fully 3D modeling in the above examples may introduce some inaccuracies for cross-slope propagation paths, which in this case are to the north and south of the source. The effect of the sloping bathymetry would be to deflect the sound towards the downslope direction, slightly increasing levels downslope and decreasing them upslope.

The modeling methods described above treat the source as an ideal point source, which is a good approximation provided the receiver is much farther away from the source than the dimensions of the source. Modeling received levels close to a large source such as an airgun array requires a different and more computationally intensive approach in which the individual airguns in the array are treated as separate sources, and their signals are combined, taking account of their relative phases at the receiver locations. The same approach accounts for the full 3D directivity of the source, rather than just the horizontal directivity, as was the case for the example in Sect. 6.5.4. Combining this approach with a process called Fourier synthesis (Jensen et al. 2011) allows the received waveforms to be simulated, which allows other signal measures such as peak sound pressure levels (SPLpk) to be calculated. Calculating SPLpk by this means works well at short ranges but tends to overestimate levels at longer ranges because the propagation models do not properly account for seabed and sea surface scattering effects that broaden the peaks and reduce their amplitudes.

Simple propagation modeling tasks such as those described in Sects. 6.5.1 and 6.5.2 can be carried out using free propagation modeling tools such as the Acoustics Toolbox and AcTUP, with the addition of some relatively straightforward post-processing coded in any convenient programming language. However, when N × 2D modeling in multiple directions is required, it becomes desirable to automate the process of interpolating bathymetry profiles from databases, generating sound speed profile files, initiating multiple runs of the propagation model, calculating received levels, interpolating and plotting results, etc. Most organizations that routinely carry out this type of modeling have written their own proprietary software for these tasks. To the authors' knowledge, there is no freely available software package with all of these capabilities, although there is at least one commercially available package.
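The Fourier-synthesis step itself can be sketched quite compactly: the complex transfer function of the channel, obtained from propagation-model runs at a set of frequencies, is multiplied by the spectrum of the source waveform, and an inverse FFT returns the received waveform, from which SPLpk and SEL can be read off. The MATLAB sketch below is a schematic illustration only, not the authors' implementation; the source signature and the single-sided transfer function H are purely illustrative placeholders.

```matlab
% Schematic Fourier synthesis: received waveform = IFFT of the source
% spectrum multiplied by the (modelled) complex channel transfer function.
% s = source waveform at 1 m (uPa); H = illustrative single-sided transfer
% function at the FFT bin frequencies (in practice from a propagation model).
fs = 8e3;  N = 4096;
t  = (0:N-1)'/fs;
s  = 1e6*sin(2*pi*80*t).*exp(-t/0.05);     % illustrative source signature
H  = 0.01*exp(-1i*2*pi*(0:N/2)'/N*200);    % illustrative transfer function (200-sample delay)

S  = fft(s);                               % source spectrum
S(1:N/2+1)   = S(1:N/2+1).*H;              % apply channel to positive frequencies
S(N/2+2:end) = conj(flipud(S(2:N/2)));     % enforce conjugate symmetry
r  = real(ifft(S));                        % received waveform (uPa)

SPLpk = 20*log10(max(abs(r)));             % peak sound pressure level, dB re 1 uPa
SEL   = 10*log10(sum(r.^2)/fs);            % sound exposure level, dB re 1 uPa^2 s
fprintf('SPLpk = %.1f dB re 1 uPa, SEL = %.1f dB re 1 uPa^2 s\n', SPLpk, SEL);
```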
6.6 Summary

Sound propagation under water is a complex process. Sound does not propagate along straight-line transmission paths. Rather, it reflects, refracts, and diffracts. It scatters off rough surfaces (such as the sea surface and the seafloor) and off reflectors within the water column (e.g., gas bubbles, fish swim bladders, and suspended particles). It is transmitted into the seafloor and partially lost from the water. It is converted into heat by exciting molecular vibrations. There are common misconceptions about sound propagation in water, such as "low-frequency sound does not propagate in shallow water," "over hard seafloors, all sound is reflected, leading to cylindrical spreading," and "over soft seafloors, sound propagates spherically." This chapter aimed to remove common misconceptions and empower the reader to comprehend sound propagation phenomena in a range of environments and appreciate the limitations of widely used sound propagation models. The chapter began by deriving the sonar equation for a number of scenarios including animal acoustic communication, communication masking by noise, and acoustic surveying of animals. It introduced the concept of the layered ocean, presenting temperature, salinity, and resulting sound speed profiles. These were needed to develop the most common concepts of sound propagation under water: ray tracing and normal modes. The chapter computed Snell's law, reflection and transmission coefficients, and Lloyd's mirror. It provided an overview of publicly available sound propagation software (including wavenumber integration and parabolic equation models). It concluded with a few practical examples of modeling propagation loss for whale song and a seismic airgun array.

6.7 Additional Resources

• Dan Russell's Acoustics and Vibration Animations: https://www.acs.psu.edu/drussell/demos.html
• The Discovery of Sound in the Sea (DOSITS; https://dosits.org/) website has over 400 pages of content in three major sections including the science of underwater sound and how people and marine animals use underwater sound to conduct activities for which light is used in air. The website has been the foundational resource of the DOSITS Project, providing information at a beginner and advanced level, based on peer-reviewed science (Vigness-Raposa et al. 2016, 2019). The web structure has been transformed into structured tutorials that provide a streamlined, progressive development of knowledge. The tutorial layout allows a user to proceed from one topic to the next in sequence or jump to a specific topic of interest. The three tutorials focus on the science of underwater sound, the potential effects of underwater sound on marine animals, and the ecological risk assessment process for determining possible effects from a specific sound source. Additional resources have been developed to provide the underwater acoustics content in different formats, including instructional videos and webinars. Finally, there are print publications (an educational booklet and a trifold brochure) available in hard copy or PDF format and two eBooks available for free on the iBooks Store, including Book I: Importance of Sound in the Sea and Book II: Science of Underwater Sound.

References

Ainslie MA (2005) Effect of wind-generated bubbles on fixed range acoustic attenuation in shallow water at 1–4 kHz. J Acoust Soc Am 118(6):3513–3523. https://doi.org/10.1121/1.2114527
Ainslie MA, McColm JG (1998) A simplified formula for viscous and chemical absorption in sea water. J Acoust Soc Am 103(3):1671–1672. https://doi.org/10.1121/1.421258
American National Standards Institute (2015) Bioacoustical Terminology (ANSI S3.20-2015, R 2020). Acoustical Society of America, New York
Au WWL, Moore PWB (1984) Receiving beam patterns and directivity indices of the Atlantic bottlenose dolphin Tursiops truncatus. J Acoust Soc Am 75(1):255–262. https://doi.org/10.1121/1.390403
Benoit-Bird KJ, Waluk CM (2020) Exploring the promise of broadband fisheries echosounders for species discrimination with quantitative assessment of data processing effects. J Acoust Soc Am 147(1):411–427. https://doi.org/10.1121/10.0000594
Benoit-Bird KJ, Moline MA, Southall BL (2017) Prey in oceanic sound scattering layers organize to get a little help from their friends. Limnol Oceanogr 62(6):2788–2798. https://doi.org/10.1002/lno.10606
Collins MD (1993) An energy-conserving parabolic equation for elastic media. J Acoust Soc Am 94(2):975–982
Collis JM, Siegmann WL, Jensen FB, Zampolli M, Küsel ET, Collins MD (2008) Parabolic equation solution of seismo-acoustics problems involving variations in bathymetry and sediment thickness. J Acoust Soc Am 123(1):51–55. https://doi.org/10.1121/1.2799932
Duncan AJ, Gavrilov AN (2019) The CMST Airgun Array model—a simple approach to modeling the underwater sound output from seismic airgun arrays. IEEE J Ocean Eng 44(3):589–597. https://doi.org/10.1109/JOE.2019.2899134
Erbe C (2000) Detection of whale calls in noise: performance comparison between a beluga whale, human listeners and a neural network. J Acoust Soc Am 108(1):297–303. https://doi.org/10.1121/1.429465
Erbe C (2002) Underwater noise of whale-watching boats and its effects on killer whales (Orcinus orca). Mar Mamm Sci 18(2):394–418. https://doi.org/10.1111/j.1748-7692.2002.tb01045.x
Erbe C (2008) Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration. J Acoust Soc Am 124(4):2216–2223. https://doi.org/10.1121/1.2970094
Erbe C (2015) The maskogram: a tool to illustrate zones of masking. Aquat Mamm 41(4):434–443. https://doi.org/10.1578/AM.41.4.2015.434
Erbe C, Farmer DM (1998) Masked hearing thresholds of a beluga whale (Delphinapterus leucas) in icebreaker noise. Deep Sea Res II 45(7):1373–1388. https://doi.org/10.1016/S0967-0645(98)00027-7
Erbe C, Liong S, Koessler MW, Duncan AJ, Gourlay T (2016a) Underwater sound of rigid-hulled inflatable boats. J Acoust Soc Am 139(6):EL223–EL227. https://doi.org/10.1121/1.4954411
Erbe C, Parsons M, Duncan AJ, Allen K (2016b) Underwater acoustic signatures of recreational swimmers, divers, surfers and kayakers. Acoust Aust 44(2):333–341. https://doi.org/10.1007/s40857-016-0062-7
Erbe C, Reichmuth C, Cunningham KC, Lucke K, Dooling RJ (2016c) Communication masking in marine mammals: a review and research strategy. Mar Pollut Bull 103:15–38. https://doi.org/10.1016/j.marpolbul.2015.12.007
Erbe C, Dunlop R, Jenner KCS, Jenner M-NM, McCauley RD, Parnum I, Parsons M, Rogers T, Salgado-Kent C (2017a) Review of underwater and in-air sounds emitted by Australian and Antarctic marine mammals. Acoust Aust 45:179–241. https://doi.org/10.1007/s40857-017-0101-z
Erbe C, Parsons M, Duncan AJ, Osterrieder S, Allen K (2017b) Aerial and underwater sound of unmanned aerial vehicles (UAV, drones). J Unmanned Veh Syst 5(3):92–101. https://doi.org/10.1139/juvs-2016-0018
Erbe C, Williams R, Parsons M, Parsons SK, Hendrawan IG, Dewantama IMI (2018) Underwater noise from airplanes: an overlooked source of ocean noise. Mar Pollut Bull 137:656–661. https://doi.org/10.1016/j.marpolbul.2018.10.064
Erbe C, Peel D, Smith JN, Schoeman RP (2021) Marine acoustic zones of Australia. J Mar Sci Eng 9(3):340. https://doi.org/10.3390/jmse9030340
Etter PC (2018) Underwater acoustic modeling and simulation, 5th edn. CRC Press, Boca Raton, FL. https://doi.org/10.1201/9781315166346
Fletcher H (1940) Auditory patterns. Rev Mod Phys 12:47–65
François RE, Garrison GR (1982a) Sound absorption based on ocean measurements: part I: pure water and magnesium sulphate contributions. J Acoust Soc Am 72(3):896–907
François RE, Garrison GR (1982b) Sound absorption based on ocean measurements: part II: boric acid contribution and equation for total absorption. J Acoust Soc Am 72(6):1879–1890
Godin OA (2008) Sound transmission through water–air interfaces: new insights into an old problem. Contemp Phys 49(2):105–123. https://doi.org/10.1080/00107510802090415
Goh JT, Schmidt H (1996) A hybrid coupled wavenumber integration approach to range-dependent seismoacoustic modeling. J Acoust Soc Am 100(3):1409–1420. https://doi.org/10.1121/1.415988
Hall MV (1989) A comprehensive model of wind-generated bubbles in the ocean and predictions of the effects on sound propagation at frequencies up to 40 kHz. J Acoust Soc Am 86(3):1103–1117. https://doi.org/10.1121/1.398102
Hall MV (2004) Preliminary analysis of the applicability of adiabatic modes to inverting synthetic acoustic data in shallow water over a sloping sea floor. IEEE J Ocean Eng 29(1):51–58. https://doi.org/10.1109/JOE.2003.823315
Jensen FB, Kuperman WA, Porter MB, Schmidt H (2011) Computational ocean acoustics, 2nd edn. Springer, New York
Kloser RJ, Macaulay GJ, Ryan TE, Lewis M (2013) Identification and target strength of orange roughy (Hoplostethus atlanticus) measured in situ. J Acoust Soc Am 134(1):97–108
Koessler MW (2016) Modelling of underwater acoustic propagation over elastic, range-dependent seabeds. Ph.D. Thesis, Curtin University, Perth, WA, Australia
Kuehne LM, Erbe C, Ashe E, Bogaard LT, Collins MS, Williams R (2020) Above and below: military aircraft noise in air and under water at Whidbey Island, Washington. J Mar Sci Eng 8(11):923. https://doi.org/10.3390/jmse8110923
Locarnini RA, Mishonov AV, Baranova OK, Boyer TP, Zweng MM, Garcia HE, Reagan JR, Seidov D, Weathers K, Paver CR, Smolyar I (2018) World Ocean atlas 2018, volume 1: temperature. National Oceanic and Atmospheric Administration
Mackenzie KV (1981) Nine-term equation for sound speed in the oceans. J Acoust Soc Am 70:807–812
Medwin H, Clay CS (1998) Chapter 2 - sound propagation. In: Medwin H, Clay CS (eds) Fundamentals of acoustical oceanography. Academic Press, San Diego, pp 17–69. https://doi.org/10.1016/B978-012487570-8/50004-0
Moore BCJ (2013) An introduction to the psychology of hearing. Brill, Leiden, The Netherlands
Parsons MJG, Parnum IM, Allen K, McCauley RD, Erbe C (2014) Detection of sharks with the Gemini imaging sonar. Acoust Australia 42(3):185–189
Parsons MJG, Duncan AJ, Parsons SK, Erbe C (2020) Reducing vessel noise: an example of a solar-electric passenger ferry. J Acoust Soc Am 147(5):3575–3583. https://doi.org/10.1121/10.0001264
Porter MB (1990) The time-marched fast-field program (FFP) for modeling acoustic pulse propagation. J Acoust Soc Am 87(5):2013–2023. https://doi.org/10.1121/1.399329
Porter MB, Bucker HP (1987) Gaussian beam tracing for computing ocean acoustic fields. J Acoust Soc Am 82(4):1349–1359. https://doi.org/10.1121/1.395269
Porter M, Reiss EL (1984) A numerical method for ocean-acoustic normal modes. J Acoust Soc Am 76(1):244–252. https://doi.org/10.1121/1.391101
Schmidt H, Glattetre J (1985) A fast field model for three-dimensional wave propagation in stratified environments based on the global matrix method. J Acoust Soc Am 78(6):2105–2114. https://doi.org/10.1121/1.392670
Sirovic A, Hildebrand JA, Wiggins SM (2007) Blue and fin whale call source levels and propagation range in the Southern Ocean. J Acoust Soc Am 122(2):1208–1215. https://doi.org/10.1121/1.2749452
Urick RJ (1983) Principles of underwater sound, 3rd edn. McGraw Hill, New York
Vigness-Raposa KJ, Scowcroft G, Morin H, Knowlton C, Miller JH, Ketten DR, Popper AN (2016) Discovery of sound in the sea: resources for decision makers. Proc Meet Acoust 27(1):010008. https://doi.org/10.1121/2.0000257
Vigness-Raposa KJ, Scowcroft G, Morin H, Knowlton C, Miller JH, Ketten DR, Popper AN (2019) Discovery of sound in the sea: communicating underwater acoustics research to decision makers. Proc Meet Acoust 37(1):025001. https://doi.org/10.1121/2.0001204
Westwood EK, Tindle CT, Chapman NR (1996) A normal mode model for acousto-elastic ocean environments. J Acoust Soc Am 100(6):3631–3645. https://doi.org/10.1121/1.417226
Zweng MM, Reagan JR, Seidov D, Boyer TP, Locarnini RA, Garcia HE, Mishonov AV, Baranova OK, Weathers K, Paver CR, Smolyar I (2018) World Ocean atlas 2018, volume 2: salinity. National Oceanic and Atmospheric Administration
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
7 Analysis of Soundscapes as an Ecological Tool

Renée P. Schoeman, Christine Erbe, Gianni Pavan, Roberta Righini, and Jeanette A. Thomas
7.1 Introduction

Whether listening in a forest or on an open plain, by the side of a river or in the ocean, at the outskirts of suburbia or right downtown, the Earth abounds with sounds. The use of the term "soundscape" in the literature has increased rapidly since 2000 (Fig. 7.1) and can be traced back to Southworth's (1969) article on the sonic environment of Boston, MA, USA. The Canadian music composer and researcher Schafer later defined soundscapes as "the auditory properties of landscapes" (Schafer 1977). Schafer was a pioneer in highlighting the need for soundscape research and management. In his book, The New Soundscape, Schafer and his students documented rapid changes in soundscapes over the course of human civilization (Schafer 1969). Common settings of primitive cultures surrounded by an abundance of natural sounds (i.e., wind, water, animals, etc.) rapidly changed after the Industrial Revolution to cities dominated by sounds from machinery. Schafer further noticed that most people had ceased to listen to the sounds of the environment and actively tried to ignore unpleasant sound (i.e., noise). With the goals of studying and archiving soundscapes, creating public awareness of noise pollution, and creating healthy soundscapes through acoustic design, Schafer founded the World Soundscape Project (WSP 1972–1979; Torigoe 1982).

Soundscape studies by the WSP were human-centered, focusing on the acoustic composition of cities and villages, studying only humans as receivers of acoustic information, and emphasizing the negative effects of noise on humans (Truax 1984, 1996). Krause (1987, 1993) adopted an animal-centered approach to the study of soundscapes. He recorded and archived sounds of different animal species as well as of entire ecosystems. According to Krause, acoustic sampling of an area over a period of time and under different conditions allows us to study, and ultimately predict, how human-induced changes might affect ecosystems (Krause 1987).

While the term "soundscape" has different uses in the literature, the International Organization for Standardization officially defined "soundscape" as "an acoustic environment as perceived or experienced and/or understood by a person or people, in context" and "acoustic environment" as the "sound at the receiver from all

Jeanette A. Thomas (deceased) contributed to this chapter while at the Department of Biological Sciences, Western Illinois University-Quad Cities, Moline, IL, USA

R. P. Schoeman (*) · C. Erbe
Centre for Marine Science and Technology, Curtin University, Perth, WA, Australia
e-mail: renee.koper@postgrad.curtin.edu.au; c.erbe@curtin.edu.au

G. Pavan · R. Righini
Centro Interdisciplinare di Bioacustica e Ricerche Ambientali, University of Pavia, Pavia, Italy
e-mail: gianni.pavan@unipv.it
e-mail: gianni.pavan@unipv.it ronment” as the “sound at the receiver from all
# The Author(s) 2022 217
C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1,
https://doi.org/10.1007/978-3-030-97540-1_7
218 R. P. Schoeman et al.

600 make up the geophony, and sounds produced by


human activities or machinery are referred to as
anthropophony (Fig. 7.2; Krause 2008). Sounds
500
created by machinery (including power
generators, motors, etc.) are sometimes grouped
as technophony (Mullet et al. 2016), which is the
component of anthropophony typically associated
400
Number of Articles

with noise pollution. The identification of


soundscape components is a key element in the
research field of ecoacoustics, which investigates
300
the relationship of natural and anthropogenic
sounds with the environment on a range of scales
in space and time (Farina and Gage 2017). The
200
research field of soundscape ecology investigates
the interaction of organisms with their environ-
ment, mediated through sound (Pijanowski et al.
100 2011a, b). For example, sound sources distributed
within an environment provide acoustic cues (i.e.,
soundmarks), by which animals can orientate,
0 navigate, and make habitat choices (Slabbekoorn
1990 2000 2010 2020
and Bouton 2008). Under the Acoustic Habitat
Year of Publication
Hypothesis, the habitats that sound-dependent
Fig. 7.1 Number of articles with “soundscape” in the species select and occupy exhibit acoustic
abstract, listed by Scopus, versus publication year; characteristics that suit a species’ functional
retrieved 10 June 2022 needs and match its sound production and recep-
tion capabilities (Mullet et al. 2017a). Acoustic
sound sources as modified by the environment” habitat specialists are species whose acoustic hab-
(International Organization for Standardization itat is unique and vital to its functional needs,
[ISO] 2014). A soundscape is thus a perceptual while acoustic habitat generalists occupy acoustic
construct that requires a human listener, while the habitats that are less than unique but still impor-
acoustic environment is a physical phenomenon, tant to the species’ functional needs (Mullet et al.
extending in frequency beyond the human 2017a). Under the Acoustic Adaptation Hypothe-
hearing limits, including infrasounds and sis, the sounds of soniferous animals evolved to
ultrasounds. In the field of underwater acoustics, optimize propagation within the animals’ habitat
however, a soundscape is the “characterization of (Morton 1975), characterized by its soundscape
the ambient sound in terms of its spatial, temporal and sound propagation conditions. Under the
and frequency attributes, and the types of sources Acoustic Niche Hypothesis, animals evolved
contributing to the sound field” (International species-specific sounds in certain frequency
Organization for Standardization [ISO] 2017). bands and temporal patterns to minimize compe-
“Soundscape” in underwater acoustics thus does tition (i.e., masking) with sounds from other
not require a listener. In essence, the usage of the animals and the environment (Krause 1993). An
term “soundscape” in the literature is variable and interesting and related question is how animal
perhaps related to specific research objectives (and human) listeners make sense of the myriad
(Scarpelli et al. 2020). of sounds received from all directions,
The components of a soundscape may be overlapping in frequency and time, and thus
grouped by their origin. Sounds produced by masking each other. A listener must separate the
animals are grouped as biophony, sounds pro- parts belonging to different sources and merge the
duced by atmospheric or geophysical events parts belonging to the same source to make sense
7 Analysis of Soundscapes as an Ecological Tool 219

2012; Yip et al. 2017; Priyadarshani et al.


2018). While some soundscapes might have
been studied more than others (Scarpelli et al.
2020), there often are key sounds (i.e., sounds
characteristic for an ecosystem) by which an eco-
system may be identified. For example, a listener
may identify the terrestrial soundscape of a near-
shore ecosystem off central California, USA, by
the barks of California sea lions (Zalophus
californianus), the squawks of sea gulls (Larus
Fig. 7.2 Sketch of the sound sources within soundscapes
ranging from wilderness to countryside, to city. Biophony
californicus), and the tapping sounds made by sea
decreases and anthropophony increases while the otters (Enhydra lutris) that use a rock to crack-
geophony might vary comparatively little. Example spe- open shellfish.
cies are sketched along the way with decreasing density
and biodiversity. Acoustic habitat generalists occur in
multiple, different soundscapes, while acoustic habitat
specialists only occur in quite specific soundscapes 7.2.1 Biophony
(Mullet et al. 2017a)
The terrestrial biophony includes sounds pro-
duced by insects (e.g., Brady 1974; Römer and
of the acoustic scene. This is called auditory scene Lewald 1992; Polidori et al. 2013), anurans (e.g.,
analysis (Bregman 1990; Lewicki et al. 2014). Cunnington and Fahrig 2010; Zhang et al. 2017),
Natural soundscapes are appreciated for their reptiles (e.g., Crowley and Pietruszka 1983;
esthetic and recreational value (e.g., Davies et al. Galeotti et al. 2005), birds (e.g., Lengagne
2013; Francis et al. 2017; Franco et al. 2017) and et al. 1999; Charrier et al. 2001; Catchpole
also have a significant ecological and scientific and Slater 2008), bats (e.g., Gadziola et al.
value. Soundscapes should, therefore, be consid- 2012; Prat et al. 2016), and other mammals
ered a natural resource, worthy of study, manage- (such as dogs and seals; e.g., van Opzeeland
ment, and conservation (National Park Service et al. 2010; Mumm and Knörnschild 2014;
[NPS] 2000; Farina and Gage 2017; Pavan Bowling et al. 2017). Typically, multiple (vocal)
2017). How many undisturbed soundscapes taxa occur in the same environment and so, evi-
remain in this world of decreasing biodiversity, dence for the Acoustic Niche Hypothesis has
changes in land-use, and rising anthropogenic been demonstrated in various ecosystems among
noise? Can the soundscape of a pristine habitat insects (Sueur 2002), anurans (Villanueva-Rivera
function as a model to restore a degraded habitat 2014), birds (Azar and Bell 2016), and a combi-
(Pavan 2017; Gordon et al. 2019; Righini and nation of species (Hart et al. 2015).
Pavan 2020)? This chapter gives an overview of Terrestrial soundscape ecology studies have
terrestrial and aquatic soundscapes, outlines how been dominated by research on birds (Ferreira
soundscapes may change or have changed over et al. 2018). Most bird species are diurnal
time, provides tools for analyzing and quantifying vocalizers, with peak activity at dawn and dusk.
soundscapes, and discusses how passive acoustic Birds may emit single calls as well as sounds
monitoring applies to soundscape ecology arranged into long and complex songs (Fig. 7.3).
research, management, and conservation. Calls have a variety of functions and are, for
example, produced to raise alarm (Gill and
Bierema 2013), contact conspecifics (Bond and
7.2 Terrestrial Soundscapes Diamond 2005), or beg for food (Klenova 2015).
While bird song was long thought to be an exclu-
Terrestrial soundscapes may vary widely within sive male trait used for territorial defense and
as well as between ecosystems (e.g., Krause female attraction, there is mounting evidence
220 R. P. Schoeman et al.

be intense and potentially affect the timing and


frequency of other species’ vocalizations. Hart
et al. (2015), for example, found that birds in a
Costa Rican tropical rainforest either ceased
vocalizing or changed their call frequency to
avoid acoustic overlap with cicada choruses
(Fig. 7.4). As do birds, insects produce sounds
in flight, with dominant frequencies between
140 and 250 Hz (Fig. 7.5; Kawakita and Ichikawa
2019).
Social wasps, honeybees, bumble bees, and
some hoverflies produce sounds with dominant
frequencies between 152 and 317 Hz when
attacked by predators, potentially as a warning
Fig. 7.3 Soundscape of a temperate forest at dusk signal (Rashed et al. 2009). Smaller velvet ants
showing song of the chiffchaff (Phylloscopus collybita), (family of wasps) also produce distress calls but
squawks of a mallard duck (Anas platyrhynchos), and calls
from a marsh frog (Pelophylax ridibundus) at higher frequencies between 4 and 17 kHz
(Polidori et al. 2013). Ants produce distress calls
extending in frequency above 70 kHz (Pavan
that female bird song is globally widespread and et al. 1997).
used for territorial and reproductive purposes In many anuran species, males aggregate and
(Odom et al. 2014). Terrestrial birds primarily produce evening choruses of varying complexity
communicate within the frequency range of to advertise for females (i.e., courtship
human hearing, with recorded fundamental vocalizations; Grafe 2005). Most male anuran
frequencies (see Chap. 4) as low as 23 Hz for species cycle air through a vocal sac to produce
southern cassowary (Casuarius casuarius; Mack calls with main energy between 400 Hz and
and Jones 2003) and as high as 13 kHz for the 10 kHz (Fig. 7.5c; Cunnington and Fahrig 2010;
Ecuadorian hillstar hummingbird (Oreotrochilus Narins and Meenderink 2014; Villanueva-Rivera
chimborazo; Duque et al. 2018). Marine birds that 2014), although some species produce sounds
are heard within terrestrial soundscapes produce that extend into the ultrasonic range (i.e.,
calls with fundamental frequencies <2 kHz (e.g., >20 kHz; Feng et al. 2006; Arch et al. 2008).
Charrier et al. 2001; Bourgeois et al. 2007; Cure White-lipped frogs (Leptodactylus albilabris)
et al. 2009; Mulard et al. 2009; Dentressangle also thump their vocal sac on the underlying
et al. 2012). Lesser-known sounds of birds are substrate while vocalizing, thereby creating a
those produced by wings while in flight and while seismic signal, which potentially plays a role in
perched (Clark 2021). Because these sounds may seismic communication with conspecifics (Narins
be audible to the animal itself, conspecifics, and 1990).
other species (e.g., predators and prey), Clark Courtship vocalizations have also been
(2021) suggested that these sounds may be recorded for at least 35 species of tortoises. Call
selected to evolve from by-product to communi- characteristics of 11 tortoise species were studied
cation signal. in detail by Galeotti et al. (2005), revealing domi-
Insects are another common source of nant frequencies between 110 and 600 Hz and
biophony, with seasonal and diurnal choruses energy between 100 Hz and 3 kHz. Snakes may
produced by cicadas and crickets at dominant produce a broadband hiss (3–13 kHz; Young
frequencies between 2 and 50 kHz (Bennet- 1991), rattle (2–23 kHz; Young and Brown
Clark 1970; Robillard et al. 2013; Hart et al. 1993), or rasping sound (200 Hz–11 kHz;
2015; Buzzetti et al. 2020). These typically male Young 2003) when threatened. Crocodiles pro-
insect choruses, produced to attract females, can duce sounds with main energy <2 kHz (e.g.,
7 Analysis of Soundscapes as an Ecological Tool 221

Fig. 7.4 A comparison of the soundscapes at two differ- Catharus aurantiirostris, Arremon aurantiirostris,
ent moments of the morning in a secondary wet forest at Phaeothlypis fulvicauda, and Formicarius analis). Bottom
Las Cruces Biological Station, Costa Rica. Top spectro- spectrogram recorded at the same location just after the
gram recorded minutes prior to the onset of Zammara onset of cicada morning choruses. # Hart et al. (2015);
smaragdina cicada morning choruses, displaying https://academic.oup.com/view-large/figure/79529274/
vocalizations from seven bird species (Arremon beheco_arv018_f0001.jpeg. Published under CC BY 3.0;
aurantiirostris, Picumnus olivaceus, Arremon torquatus, https://creativecommons.org/licenses/by/3.0/

Vergne et al. 2009, 2011; Reber et al. 2017). Mammalian species vocalize at frequencies
Crocodile hatchlings emit calls before, during, that, for some taxa, are inversely related to their
and after hatching, which function to synchronize body size (Bowling et al. 2017). African
hatching, alert the mother to their due arrival, and elephants (Loxodonta africana) and Asian
stay in contact (Vergne et al. 2011; Chabert et al. elephants (Elephas maximus), for example, vocal-
2015). Adult crocodiles produce calls during ize within the infrasonic range (i.e., <20 Hz;
courtship, during territorial defense, and to main- fundamental frequency as low as 14 Hz). These
tain group cohesion with offspring (Fig. 7.6; low-frequency calls function to coordinate move-
Vergne et al. 2009; Reber et al. 2017). ment and to advertise an individual’s
222 R. P. Schoeman et al.

Fig. 7.5 Spectrograms of the flight sound produced by tonals and overtones and the European tree frog (Hyla
the European honeybee (Apis mellifera; a) and the Japa- arborea) with higher-pitched, broadband sounds starting
nese yellow hornet (Vespa simillima xanthoptera; b). at around 5 s and increasing in intensity and bandwidth
Sound files from Kawakita and Ichikawa (2019). Spectro- from 13 s onwards (c). Recording courtesy of Marco
gram of chorusing frogs in a pond in Colli Euganei, Italy. Pesente
Yellow-bellied toad (Bombina variegata) with 500-Hz

Fig. 7.6 Male (a) and female (b) American alligator (2017); https://www.nature.com/articles/s41598-017-
(Alligator mississippiensis) bellows that may be produced 01948-1/figures/2. Published under CC BY 4.0; https://
during courtship and territorial defense (Vergne et al. creativecommons.org/licenses/by/4.0/
2009). Modified from Reber et al. (2017). # Reber et al.

reproductive status over distances as far as 2.5 km also characterized by harmonics that extend well
(Soltis 2010). Elephants also produce vibrations into the ultrasonic range (Fig. 7.7; Behr and van
that propagate through the substrate and so pro- Helversen 2004; Lattenkamp et al. 2019).
vide additional cues to listening conspecifics Primate vocalizations cover a wide frequency
(Payne et al. 1986; O’Connell-Rodwell et al. range from approximately 100 Hz in western
2000). The majority of aerial feeding bats, at the gorillas (Gorilla gorilla; Salmi et al. 2013) to
opposite end of the body-size scale, produce short 16 kHz in pygmy marmosets (Cebuella pygmaea;
echolocation calls (biosonar) in the ultrasonic Pola and Snowdon 1975). Primate vocalizations
range (15–110 kHz), for navigation and hunting play an important role in intergroup communica-
(Fenton et al. 1998). Bat social calls, potentially tion, predominantly facilitating social interactions
related to agonistic encounters and courtship, are and group movement (Cheney and Seyfarth 1996,
7 Analysis of Soundscapes as an Ecological Tool 223

Fig. 7.7 Common social


calls with ultrasonic
components emitted by the
pale spear-nosed bat
(Phyllostomus discolor).
Modified figure.
# Lattenkamp et al.
(2019); https://www.
frontiersin.org/files/
Articles/447704/fevo-07-
00116-HTML/image_m/
fevo-07-00116-g002.jpg.
Published under CC BY
4.0; https://
creativecommons.org/
licenses/by/4.0/

2018). Primates are also known to use various bears frequently emit a low-intensity, repetitive,
alarm calls, which were previously suggested to pulsed sound when initiating or continuing body
be functionally referential signals (e.g., Cheney contact with their cub (20 Hz–2 kHz; Wemmer
and Seyfarth 1996). However, recent studies have et al. 1976). Pinnipeds produce in-air sounds with
shown that primates often use general alarm calls main energy <9 kHz (Fig. 7.8). Mother and pup
and infer meaning from previous experiences or recognize each other by individually unique calls
contextual information (Fichtel 2020). that help them to reunite amidst all other
Marine mammals, such as polar bears (Ursus individuals of the colony (Insley et al. 2010),
maritimus), pinnipeds (i.e., seals, sea lions, and while males produce individually unique calls
walruses), and sea otters (Enhydra lutris nereis) during agonistic behavior (e.g., Fernández-Juricic
also produce in-air sounds. Nursing female polar et al. 1999; Van Parijs and Kovacs 2002). Female

Fig. 7.8 In-air vocalizations produced by (a) a https://doi.org/10.1007/s40857-017-0101-z. Published under


New Zealand fur seal (Arctocephalus forsteri) and (b) an CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
Australian sea lion (Neophoca cinerea). # Erbe et al. (2017);
224 R. P. Schoeman et al.

Fig. 7.9 Example spectrograms of dog barks (a) and bleating sheep (b). Sheep bleats were produced by an ewe (solid
box), her lamb (dashed box), and a distant lamb (dotted box)

and pup sea otters produce individually distinct New Zealand significantly decreased with
calls with main energy <5 kHz, which also seem increasing wind speeds from calm (<4 km/h) to
to function as contact calls between separated windy (>15 km/h) conditions (Priyadarshani
individuals (McShane et al. 1995). et al. 2018). Precipitation also creates sound
Urbanized areas may be characterized by the (Fig. 7.10). Rain increased sound levels within a
sounds of domesticated animals (i.e., pets and deciduous forest (Ardennes, France) within the
livestock). Dogs bark to greet conspecifics and frequency band of 100 Hz to 10 kHz (Lengagne
humans, during play (i.e., excitement), when rais- and Slater 2002). The increase in sound levels
ing alarm, or when seeking attention (Yin and resulted in a reduction of acoustic communication
McCowan 2004), sometimes to the nuisance of space (i.e., area over which an individual can
the neighborhood (Flint et al. 2014). Barks are communicate with conspecifics) for tawny owls
short acoustic signals with main energy between
300 Hz and 2.5 kHz (Fig. 7.9), often repeated in
bouts (Yin and McCowan 2004). Ewes and their
lamb recognize each other by unique calls with
main energy <5 kHz (Sèbe et al. 2008), resulting
in a cacophony of bleats in lambing season.

7.2.2 Geophony

The prevailing geophonic source of sound is


wind. Wind acts on vegetation, thereby
contributing to sound levels <1 kHz in leafless
trees, <4 kHz in leafed trees, and <10 kHz in
open grasslands, with a positive correlation
between wind speed and sound intensity Fig. 7.10 Spectrogram of a thunderstorm recorded in the
(Boersma 1997; Bolin 2009). Wind noise may Netherlands, depicting high-frequency (i.e., >8 kHz)
sound from raindrops falling nearby, constant high-
affect the audible range of biological sounds. frequency (i.e., 9–12 kHz) rain in the background, and
The detection of bird song in open grasslands in low-frequency (i.e., <1 kHz) sound from thunder
7 Analysis of Soundscapes as an Ecological Tool 225

(Strix aluco) to 1/69th of the space without rain, Low-frequency sound, mostly generated
with a simultaneous marked decrease in vocal by engines, propagates over large distances
activity. Thunder is the most common loud natu- and appears to be the most invasive and pervasive
ral sound with a peak frequency near 100 Hz, sound related to transportation infrastructures.
although sounds extend into the infrasonic and Sound from cars and heavy trucks caused by
mid-frequency range (250 Hz–4 kHz; Fig. 7.10). tire-pavement interaction, aerodynamic sources,
Other sources of terrestrial geophony are rivers, and engines peaks around 100 Hz (Rochat and
waterfalls, earthquakes, and volcanic eruptions. Reiter 2016), but may reach as high as 10 kHz
Infrasonic monitoring of soundscapes can iden- when measured close to the source (Fig. 7.11a).
tify the location of continuous geophonic sound Both birds (e.g., Halfwerk and Slabbekoorn
sources, such as waterfalls and seismic activity, as 2009) and anurans (e.g., Cunnington and Fahrig
well as transient (i.e., short-duration) sound 2010; Caorsi et al. 2017) have been found to
sources, such as thunder, up to distances of change vocal behavior in response to traffic
10 km (Johnson et al. 2006). noise (see Chap. 13). Conventional railway
sound (i.e., electrified railway with a service
speed <200 km/h) has a broad peak between
7.2.3 Anthropophony 10 Hz and 2 kHz, whereas high-speed railway
sound (i.e., electrified railway with a service
Anthropophony identifies the presence and speed >200 km/h) peaks <100 Hz (Di et al.
activities of human beings. Some of these sounds 2014).
give cues about local culture, tradition, language, Sound from aircrafts, especially near airports,
working habits, and religion (e.g., voices, music, is perceived by humans as a source of disturbance
cow and sheep bells, church bells, etc.) and can and may have negative effects on children’s
enrich a soundscape (Stack et al. 2011, Pavan learning, human sleep, and human health (Basner
2017). However, with the industrial revolution, et al. 2017). In addition, sound during take-off
new sound sources have emerged at an unprece- and landing overlaps with biophony resulting in
dented level and spatial extension, with conse- acoustic and behavioral responses (Fig. 7.11b;
quent impacts on natural soundscapes and Sáncez-Pérez et al. 2013; Vidović et al. 2017).
human health. Birds near international airports in Spain, for
Terrestrial anthropophony includes sounds example, were found to advance their dawn cho-
from transportation (e.g., road vehicles, trains, rus to reduce overlap with aircraft sound (Gil
snowmobiles, ships, and airplanes; Ernstes and et al. 2015), which is a common response to
Quinn 2016; Mullet et al. 2017b; White et al. noise for urban species (Bermúdez-Cuamatzin
2017; Duarte et al. 2019), recreational boats et al. 2020). However, common chiffchaffs
(Kariel 1990; Bernardini et al. 2019), machinery (Phylloscopus collybita) near airports in the UK
(e.g., excavation devices, drilling devices, and the Netherlands were found to sing songs
generators, and chain saws; Potočnik and Poje with a lower maximum and peak frequency than
2010; Deichmann et al. 2017), gunshots (Wrege conspecifics in nearby control areas, thus
et al. 2017), fireworks (Kukulski et al. 2018), and resulting in an increased overlap with aircraft
outdoor events (Greta et al. 2019; Kaiser and sound (Wolfenden et al. 2019). In addition, air-
Rohde 2013). The intensity of anthropophony port populations sang at a slower rate and
correlates with the degree of urbanization (Joo responded more aggressively to song playbacks.
et al. 2011; Kuehne et al. 2013) and is considered In South Africa, the critically endangered
noise pollution with an impact on both human Pickersgill’s reed frog (Hyperolius pickersgilli)
(European Environment Agency [EEA] 2014) called more frequently and at higher frequencies
and animal health (Barber et al. 2010; Shannon during and after aircraft overflights than before
et al. 2016), potentially affecting entire (Kruger and Du Preez 2016). Even in wild remote
ecosystems (Pavan 2017). areas, aircrafts flying at ~8000 m altitude may
226 R. P. Schoeman et al.

Fig. 7.11 (a) Spectrogram of a passing car at 2-m and a approach (at ~12 s) and the bird vocalizations between
truck at 5-m distance. (b) Spectrogram of a commercial 7 and 9 kHz. (c) Spectrogram of a 3-m recreational power
passenger airplane flying overhead at an altitude of ~300 boat with a 3-hp 2-stroke engine, passing at 5-m distance;
m after take-off. Note the Doppler shift from high to low bird vocalizations within the gray dashed boxes. (d) Spec-
frequency (from 2.8 to 2 kHz) around the time of closest trogram of a jackhammer breaking tar

produce noise below 500 Hz at 60 dB re 20 μPa et al. 2016). Small recreational power boats on
(unweighted) at ground level (Pavan 2017; Farina lakes, on rivers, and near shore also increase
et al. 2021). It is also essential to consider that in-air sound levels, predominantly below 1 kHz
take-off and landing corridors, where the noise (Fig. 7.11c), with potential negative effects on
levels are much higher, may cross more rural bird species and hauled-out sea lions (York
lands where airplane sound creates a stark con- 1994; Tripovich et al. 2012).
trast with ambient sound levels. Construction equipment may generate strong
Smaller transport vehicles, such as powered sounds that are audible over long ranges. Pneu-
two wheelers and snowmobiles, also contribute matic tools, for example, generate repetitive,
to the soundscape (Paviotti and Vogiatzis 2012; broadband sound (Fig. 7.11d). Heavy and station-
Mullet et al. 2017b). Mullet et al. (2017b) found ary equipment, such as earth-moving machinery
that snowmobile noise, with main energy and air-compressors, generate sounds at
<2 kHz, affected 39% of the Alaskan wilderness frequencies <2 kHz (e.g., Berglund et al. 1996;
open to snowmobiles and may mask Roberts 2009). Although one may associate con-
vocalizations from common winter bird species. struction sounds with urban areas, there are many
In-air ship noise from machinery and ventilation examples in rural and remote areas, too. In the
systems may propagate to areas near channels, western Amazon (Peru), sounds from the con-
ports, and coasts (Badino et al. 2012; Borelli struction and operation of a natural gas-well and
7 Analysis of Soundscapes as an Ecological Tool 227

pipeline (i.e., generators, helicopters, and pneu- The air may be layered, with layers at different
matic tools) were audible up to 250 m from the altitudes having different acoustic properties.
source (Deichmann et al. 2017). Anthropogenic Higher temperature and higher humidity increase
sources in rural areas include farming machinery the speed of sound. By Snell’s law of refraction,
dominating <500 Hz (Gulyas et al. 2002), sound bends toward the horizontal when the
chainsaws recorded in forests with main energy speed of sound increases and away from the hori-
between 100 Hz and 9 kHz (Potočnik and Poje zontal when the speed of sound decreases. During
2010), and transient, broadband gunshots (Prince the day, temperature typically decreases with
et al. 2019), which can provide valuable informa- increasing altitude, leading to an upward
tion on illegal hunting, in particular in remote refracting environment that exhibits so-called
areas that are difficult to patrol. In urban settings, shadow zones that have reduced sound levels. In
additional sources of anthropophony originate the morning or in winter, the air near the ground is
from outdoor events, such as (music) festivals often relatively cold, while there might be a
(Greta et al. 2019), fun parks (Kaiser and Rohde warmer layer of air at higher altitude; this situa-
2013), and Formula 1 races (Payne et al. 2012). tion is called a temperature inversion. Sound is
downward refracted and channeled close to the
ground. Hence, in winter, sound might travel very
7.2.4 Sound Propagation far at low altitude (see Chap. 5).
in Terrestrial Environments Vegetation attenuates sound, so in temperate
areas with high vegetation, the same sound during
The propagation of sound, from its source summer propagates over shorter distances than
through an environment, affects the local during winter (Aylor 1972). Areas or seasons of
soundscape. In environments with good sound full vegetative cover have soundscapes different
propagation conditions, sources from far away from those bare in vegetation (Attenborough et al.
contribute to the local soundscape; whereas in 2012). Both temperature and humidity near the
environments with poor sound propagation ground may change quickly; therefore, sound
conditions, only nearby sources contribute. propagation conditions, soundscapes, and the
Sound propagation is affected by air temperature, communication space of terrestrial animals can
humidity, ground cover (bare rock versus vary within a few hours.
grasslands or bush), wind, turbulence, and the
presence of sound absorbers (e.g., snow),
scatterers (e.g., trees), and reflectors (e.g., cliffs 7.3 Aquatic Soundscapes
or buildings; see Chap. 5).
As sound spreads, it is transmitted into and The vast majority of aquatic soundscape studies
through different media, absorbed, reflected, have focused on marine and estuarine
scattered, and diffracted. Many of these effects environments, where soundscapes vary among
depend on frequency; meaning that sound geographic regions from the northern marginal
propagates differently at different frequencies ice-zone via equatorial regions to Antarctic
and that the environment changes the spectral waters (Haver et al. 2017), from the deep ocean
characteristics of the sound. If the wavelength of (e.g., Dziak et al. 2017) to shallow coastal waters
sound is smaller in size than features of the envi- (e.g., McWilliam and Hawkins 2013), and from
ronment (e.g., rocks), then sound will reflect. The urban rivers (e.g., Marley et al. 2016) to estuarine
wavelength can be computed as the ratio of sound reserves (e.g., Ricci et al. 2016). Soundscape
speed (about 330 m/s in air) and frequency (e.g., a studies in freshwater are less common but have
100-Hz tone has a wavelength of 3 m in air; see covered a variety of settings from frozen lakes in
Chap. 4). At wavelengths much greater than Canada (Martin and Cott 2016) to urbanized lakes
features in the environment, sound will travel in the UK (Bolgan et al. 2016, 2018b), from
unhindered. pristine swamps in Costa Rica (Gottesman et al.
228 R. P. Schoeman et al.

2020) to urbanized lowlands in the Netherlands 7.3.1 Biophony


(van der Lee et al. 2020), and from litttle streams
in the USA (Holt and Johnston 2015) to the busy Aquatic species are well adapted to produce,
Ganges river in India (Dey et al. 2019). As in the sense, and use sounds in water (e.g., Schmitz
terrestrial environment, each soundscape is 2002; Ladich and Winkler 2017). The aquatic
characterized by a unique composition of biophony includes sounds produced by
biophony, geophony, and anthropophony. invertebrates (e.g., Iversen et al. 1963; Coquereau
Ambient sound encompasses all of the sounds et al. 2016; Gottesman et al. 2020), frogs
at a given location and time, except for any spe- (Brunetti et al. 2017), turtles (e.g., Giles et al.
cific signal of interest (International Organization 2009), fish (e.g., Kasumyan 2008; Bolgan et al.
for Standardization [ISO] 2017). Fig. 7.12 gives 2018b), birds (Thiebault et al. 2019), and
the spectra of characteristic ambient sounds in the mammals (e.g., Klinck et al. 2012; Erbe et al.
ocean, as originally compiled by Wenz (1962), 2017; Dey et al. 2019). The freshwater biophony
with updates from Cato (2008). Below 100 Hz, is not well described and so, sounds frequently
ambient sound is dominated by distant shipping, cannot be linked to specific species (Rountree
and, in shallow water, wind. Above 100 Hz, et al. 2019; Gottesman et al. 2020; Putland and
ambient sound is mostly wind driven. The Mensinger 2020). This lack of knowledge cur-
prevailing limits of ambient sound decrease with rently impedes the full utilization of freshwater
increasing frequency from a maximum of 140 dB soundscape studies as an ecological tool (Linke
re 1 μPa2/Hz at 1 Hz to a minimum of 15 dB re et al. 2020).
1 μPa2/Hz at 30 kHz. Above 30 kHz, molecular With regards to marine biophony, snapping
agitation limits the spectra of recorded ambient shrimps are well-known contributors, producing
sound. broadband sounds from a few hundreds of hertz

Intermittent + Local Sound:


precipitation
polar ice
ships, industrial activities
biophony
earthquakes, explosions
140
earthquakes
Power Spectral Density Level [dB re 1 μPa2/Hz]

120 wi
nd
in
sh
all upper limit of prevailing sound
ow
100 wa
ter
heavy rain
heavy ship traffic
80
c
affi
ter tr ffic
60 a tra sea state
-w
eep ater 6
d -w
ow 4
h all
40 s 2
1
lower limit of prevailing sound 0.5
20
r
ula
lec
mo tation
0 Frequency [Hz] agi
1 10 100 1k 10k 100k
Prevailing Sound: seismic background
turbulent pressure fluctuations
surface waves
ship traffic
bubbles, spray

Fig. 7.12 Spectra of prevailing and local underwater sound sources between 1 Hz and 100 kHz (after Wenz 1962; Cato
2008)
7 Analysis of Soundscapes as an Ecological Tool 229

Fig. 7.13 Spectrograms of (a) snapping shrimp, (b) a Nature. Coquereau L, Grall J, Chauvaud L, et al. Sound
swimming great scallop (Pecten maximus), and (c) a feed- production and associated behaviours of benthic
ing spider crab (Maja brachydactyla). Spectrograms b and invertebrates from a coastal habitat in the north-east Atlan-
c were created from supplementary material in Coquereau tic. Mar Biol 163: 127; https://doi.org/10.1007/200227-
et al. (2016). Reprinted by permission from Springer 016-2902-2. # Springer Nature, 2020. All rights reserved

up to 200 kHz (Fig. 7.13a; Knowlton and activities are displayed in Fig. 7.13b, c
Moulton 1963; Au and Banks 1998). This short, (Coquereau et al. 2016).
intense, repetitive sound is a byproduct of many Over 1200 fish species were estimated to pro-
shrimps rapidly closing their snapper claw, which duce sounds by Kaatz (2011), of which 800 were
creates a jet stream used in agonistic encounters confirmed soniferous species (Kaatz 2002;
and to stun prey (Herberholz and Schmitz 1999). Rountree et al. 2006). Fish produce sounds in a
As snapping shrimps predominantly live in large variety of behavioral contexts, such as courtship
aggregations (Duffy 1996; Duffy and Macdonald (Amorim et al. 2015), agonistic interactions
1999), their sounds can be heard as a constant (Ladich 1997), and when in distress (Knight and
‘crackling’ chorus with temporal and spatial Ladich 2014). It is therefore not surprising that
variations in intensity (e.g., Bohnenstiehl et al. fish are common contributors to aquatic
2016; Lillis et al. 2017). Other well-known soundscapes, most noticeably when large num-
sound-producing invertebrates are lobsters and bers vocalize in chorus (e.g., Rice et al. 2017;
sea urchins. Lobsters produce broadband pulse Pagniello et al. 2019). Parsons et al. (2016)
trains when facing predators or competing summarized fish chorus patterns over a 2-year
conspecifics (Staaterman et al. 2010; Jézéquel period in Darwin Harbour, Australia. Nine differ-
et al. 2019). Jézéquel et al. (2019) characterized ent chorus types were detected (Fig. 7.14),
pulse trains of the European spiny lobster dominating the frequency band from 50 Hz to
(Palinurus elephas) as signals with a mean band- 3 kHz and displaying cycles on several temporal
width of 5–23 kHz. Sea urchins scrape algae from scales (i.e., diurnal, lunar, seasonal, and annual).
rocks. This foraging strategy causes the fluid Fish chorusing was also associated with environ-
inside the sea urchin to resonate, producing mental parameters, including water temperature,
sound at frequencies between 700 Hz and 2 kHz depth, salinity, and tidal cycle.
(Radford et al. 2008). In New Zealand, groups of Marine mammal sounds range from
foraging endemic Kina sea urchins (Evechinus infrasounds of mysticetes (baleen whales; e.g.,
chloroticus) increase sound levels between Mellinger and Clark 2003) to ultrasounds of
18:00 and 20:00 compared to mid-day levels odontocetes (toothed whales; e.g., Hiley et al.
(Radford et al. 2008). Further examples of sounds 2017). Calls may function as contact or warning
from invertebrate movement and foraging signals. For example, northern right (Eubalaena
230 R. P. Schoeman et al.

Fig. 7.14 Spectrograms of the fish calls making up nine Salgado-Kent CP, Marley SA, et al., Characterizing diver-
fish choruses (50 Hz–3 kHz) in Darwin Harbour, sity and variation in fish choruses in Darwin Harbour.
Australia. The middle panel shows the chorus levels over ICES J Mar Sci 73:2058–2074; https://doi.org/10.1093/
time, in hours relative to sunrise and sunset. There is a icesjms/fsw037. # International Council for the Explora-
peak in chorusing activity shortly after sunset. tion of the Sea, 2016; https://global.oup.com/academic/
Figure created from material in Parsons et al. (2016), by rights/permissions/. All rights reserved. Reuse requires
permission from Oxford University Press. Parsons MJG, permission from OUP

glacialis) and southern right (E. australis) whale which may serve as an advertisement call and/or
upsweeps (i.e., upcalls; 50–235 Hz) seem to be agonistic call produced by male individuals
used as a contact call (Fig. 7.15a; Clark 1982; (Parks et al. 2006). However, female right whales
Parks et al. 2007). Another characteristic call of sometimes also produce this sound (Gerstein et al.
this species is a strong, brief, broadband pulse 2014). Foraging humpback whales (Megaptera
with energy up to 16 kHz (called gunshot), novaeangliae) produce a characteristic tonal call
7 Analysis of Soundscapes as an Ecological Tool 231

Fig. 7.15 Spectrograms of marine mammal sounds. (a) leptonyx) and (f) Ross seal (Ommatophoca rossii), both
Southern right whale upcall. (b) Humpback whale song. under water. # Erbe et al. (2017); https://doi.org/10.1007/
(c) Common dolphin (Delphinus delphis) whistles and (d) s40857-017-0101-z. Published under CC BY 4.0; https://
clicks and burst-pulse sounds. (e) Leopard seal (Hydrurga creativecommons.org/licenses/by/4.0/

with a fundamental frequency between 400 Hz Blue whales (Balaenoptera musculus),


and 1 kHz (Cerchio and Dahlheim 2001), which bowhead whales (Balaena mysticetus), fin whales
may function to herd prey, coordinate group (Balaenoptera physalus), and others arrange calls
movement, or recruit individuals into a feeding into patterned song, which may last from hours to
group (Cerchio and Dahlheim 2001; Fournet et al. days. Humpback whale song is particularly com-
2018). plex in structure, consisting of a variety of units
232 R. P. Schoeman et al.

that have peak frequencies between 20 Hz and produce non-vocal surface-generated sounds
6 kHz (Fig. 7.15b; Payne and McVay 1971). The through breaching, pectoral fin slapping, and tail
functions of whale song may include female slapping (e.g., Dunlop et al. 2007).
attraction, male-male interactions, and long-
range sonar (Herman 2017; Mercado 2018).
Odontocete echolocation clicks with peak energy 7.3.2 Geophony
between ~10 and ~150 kHz are used for naviga-
tion and prey capture (Au 1993). Odontocete The aquatic geophony comprises sounds from
tonal calls (i.e., whistles) with fundamental wind acting on the water surface (e.g., Knudsen
frequencies between ~1 and ~50 kHz and broad- et al. 1948); precipitation (e.g., Nystuen 1986);
band burst-pulse sounds are used for communica- ice movement, pressure cracking, and melting
tion (Fig. 7.15c, d; Herzing 1996). Some (e.g., Mikhalevsky 2001; Martin and Cott 2016);
odontocete species also communicate with clicks subsea volcanoes and earthquakes (e.g., Fox et al.
(e.g., sperm whales, Physeter macrocephalus, 2001; Dziak and Fox 2002); and sediment dis-
and porpoises, Phocoenidae; Weilgart and White- placement (e.g., Lorang and Tonolla 2014).
head 1993; Clausen et al. 2010). Delphinids may Geophony can be nearly continuous and domi-
arrange their whistles and burst-pulse sounds into nate the soundscape in certain regions at certain
patterned sequences (e.g., killer whales, Orcinus times (e.g., wind noise in southern Australia; Erbe
orca, Wellard et al. 2020; and pilot whales, et al. 2021). Wind-driven sound lies between
Globicephala melas, Courts et al. 2020). Seals, 100 Hz and 20 kHz (typical peak at 500 Hz;
sea lions, and walruses use underwater Wenz 1962). Rainfall can contribute to the under-
vocalizations particularly during the breeding water soundscape over frequencies between
season and in social interactions (Schusterman 500 Hz and 50 kHz depending on drop size,
et al. 1966; Stirling et al. 1987; Van Parijs and rainfall rate, and impact angle related to wind
Kovacs 2002). The majority of pinniped under- speed (Ma et al. 2005). In the Perth Canyon,
water vocalizations fall within the frequency Australia, rainfall is often accompanied by strong
range between 10 Hz and 6 kHz (Fig. 7.15e, f), wind. Consequently, the weather-related sound
although Weddell seals (Leptonychotes weddellii) spectrum shows two peaks: one dominated by
were found to produce calls containing energy up wind at 300–600 Hz and another dominated by
to 13 kHz (Thomas and Kuechle 1982). rain at about 3 kHz (Fig. 7.16a; Erbe et al. 2015).
Mysticetes, odontocetes, and pinnipeds also In polar regions and underneath frozen lakes,

140
A 2000 B
1000
PSD [dB re 1 μPa2/Hz]

120
Frequency [Hz]

500
200
100
100
50
80
20
10
60
0 100 200 300
Time [s]
Fig. 7.16 Sources of aquatic geophony. (a) Underwater (dB re 1 μPa2/Hz). Note the logarithmic frequency axes.
power spectral density (PSD) levels illustrating an increase Both figures were modified; # Erbe et al. (2015); https://
in levels under increased wind speeds (m/s) and rain fall doi.org/10.1016/j.pocean.2015.05.015. Published under
rates (mm/h). (b) Spectrogram of an earthquake recorded CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
in the Perth Canyon, Australia. Colors indicate PSD level
7 Analysis of Soundscapes as an Ecological Tool 233

sounds of colliding, oscillating, breaking, and shark nets (e.g., Erbe and McPherson 2012),
melting ice range from <10 Hz to 8 kHz snowmobiles and vehicles on ice-covered lakes
(Talandier et al. 2006; Martin and Cott 2016). (Martin and Cott 2016), bridge traffic (Holt and
Sound from polar ice can be detected thousands Johnston 2015; Martin and Popper 2016), augers
of kilometers away at tropical latitudes (i.e., ice drills; Putland and Mensinger 2020),
(Matsumoto et al. 2014). Underwater volcanic airplanes (e.g., Martin and Cott 2016; Erbe et al.
eruptions generate impulsive sounds as well as 2018), and activities alongside, rather than on,
harmonic tremors <100 Hz, which can travel the water (Kuehne et al. 2013). Lesser-known
over distances greater than 12,000 km through anthropophony originates from unpowered recre-
the Sound Fixing And Ranging (SOFAR) channel ational activities (e.g., scuba diving and swim-
(Tepp et al. 2019). Similarly, earthquakes can be ming; Erbe et al. 2016c).
detected at thousands of kilometers in distance as Sound from ship traffic is the most pervasive
low-frequency (<100 Hz) rumbles, lasting sev- anthropogenic sound in the ocean (e.g., Sertlek
eral minutes (Fig. 7.16b; Erbe et al. 2015). Sedi- et al. 2019). The level of sound emitted depends
ment flow may generate sound in rivers and on ship type, size, speed, and operational mode
streams, creating acoustic cues for freshwater (e.g., reversing, idling, carrying, or towing load;
species (Tonolla et al. 2010, 2011). Depending MacGillivray and de Jong 2021). In water <300
on grain size and flow velocity, the spectrum may m deep, large ships (>300 t) can temporarily
range from tens of hertz to kilohertz. increase sound levels up to 125 kHz within
500 m from shipping routes (Hermannsen et al.
2014; Veirs et al. 2016). In deep water,
7.3.3 Anthropophony low-frequency sound from ships can travel far-
ther, especially when entering the SOFAR chan-
In the last century, human activities began to nel (Fig. 7.17; Erbe et al. 2019). The number of
contribute significantly to underwater sound small, recreational boats that occupy coastal
levels. The anthropophony has grown ambient waters is on the rise in many places and these
sound levels rapidly compared to evolutionary vessels may raise sound levels between 100 Hz
time scales, making it hard for animals to adapt and 20 kHz in coastal and estuarine habitats,
(see Chap. 13). Anthropogenic sound may be depending on boat type, hull type, length, propul-
present in aquatic soundscapes far away from sion system, operational mode, and speed
human activities, owing to the long-range propa- (Parsons et al. 2021).
gation of low-frequency sound in water (see Another common anthropogenic sound that
Chap. 6). The aquatic anthropophony includes has received much concern over its potential
personal watercrafts (e.g., jetskis; Erbe 2013), impacts on marine life (see Chap. 13) is produced
small boats (e.g., Erbe et al. 2016a; Dey et al. by seismic surveys, used for seabed profiling and
2019), electric ferries (Parsons et al. 2020), mer- hydrocarbon exploration. Surveys are done with a
chant ships (e.g., Ross 1976; Hatch et al. 2008; vessel towing an array of airguns. Airguns are
McKenna et al. 2012), offshore hydrocarbon metal chambers storing compressed air, which is
exploration and production (e.g., marine seismic rapidly released, producing an acoustic pulse with
surveys and drilling; Wyatt 2008; Erbe and King energy up to at least 10 kHz (Dragoset 2000;
2009; Erbe et al. 2013), near-shore construction Hermannsen et al. 2015). Airguns exist with dif-
including geotechnical work and pile-driving ferent operating volumes and firing pressures,
(e.g., Erbe 2009; Dahl et al. 2015; Erbe and affecting the spectrum and level of the acoustic
McPherson 2017), windfarms (e.g., Koschinski pulses (Fig. 7.18a; Erbe and King 2009;
et al. 2003; Tougaard et al. 2009), dredging Hermannsen et al. 2015). Airgun arrays can be
(e.g., Reine et al. 2014), explosions (e.g., tuned to focus acoustic emission down into the
Soloway and Dahl 2014), military sonars (e.g., seabed, yet some sound ends up traveling hori-
Ainslie 2010), acoustic alarms on fishing gear or zontally through the water. Hence, sounds from
234 R. P. Schoeman et al.

Fig. 7.17 Sketch of the propagation of sound from a left panel and a hard, dense, limestone seafloor. Colors
156-m ship (at 0 km range) sailing at a speed of 15 knots represent received level (RL). # Erbe et al. 2019; https://
above the continental slope in the absence of ambient www.frontiersin.org/files/Articles/476898/fmars-06-
sound. Propagation modeled with RAMGeo in AcTUP 00606-HTML/image_m/fmars-06-00606-g001.jpg.
V2.8 (https://cmst.curtin.edu.au/products/underwater/) Published under CC BY 4.0; https://creativecommons.org/
with an equatorial sound speed profile as indicated in the licenses/by/4.0/

22 A B
20
18
16
14
Frequency [kHz]

12
10
8
6
4
2

0 2 Time [s] 4 6 8 0 2 Time [s] 4 6

Fig. 7.18 Spectrograms of impulsive sound sources. (a) Seismic airgun pulses recorded off Western Australia (Erbe
et al. 2021). (b) Pile driving recorded in Moreton Bay, Queensland, Australia (Erbe 2009)

seismic surveys may affect marine life at both Pile driving for windfarm construction and
short and long ranges (Gordon et al. 2003; detonations of World War II ammunition are reg-
Slabbekoorn et al. 2019). A typical seismic sur- ular sources of sound within European waters
vey may last several weeks, during which the (Bailey et al. 2010; von Benda-Beckmann et al.
airgun array is discharged every few seconds. 2015). Impact pile driving generates high-
Other common sounds of concern are emitted intensity pulses with energy exceeding 40 kHz
by pile driving, explosions, and acoustic alarms. at close range (Fig. 7.18b). Acoustic alarms are
7 Analysis of Soundscapes as an Ecological Tool 235

devices that purposefully emit sound between a polar oceans, the speed of sound is the smallest at
few hundred hertz and tens of kilohertz to deter the surface. This leads to a surface duct, in which
marine animals from potential hazards, such as sound travels by repeated reflection off the sea
pile driving sites, aquaculture farms, or bather surface and refraction at depth.
protection nets (e.g., Jacobs and Terhune 2002; Snell’s law creates additional interesting phe-
Erbe and McPherson 2012), yet their efficacy nomena such as shadow zones and convergence
remains controversial (e.g., see Erbe et al. zones. Sound does not distribute evenly through-
2016d). Acoustic alarms differ widely in their out the oceans. There are patterns of shadow
signal type, frequency, and source level (Findlay zones (into which sound cannot travel by direct
et al. 2018). paths, and which receive little to no sound) and
convergence zones (where received levels are
enhanced; Fig. 7.17). These zones will be in dif-
7.3.4 Sound Propagation in Aquatic ferent places for different source locations. In
Environments addition, sound at low frequencies does not travel
far in shallow water. The waveguide concept and
Underwater, the propagation of sound is affected normal modes nicely explain this (see Chap. 6).
by water temperature, salinity, hydrostatic pres- The water depth can be too small to “fit” sound of
sure (i.e., depth below the sea surface), sea sur- large wavelength. As a result, ship noise may be
face roughness, potential ice cover, bathymetry, attenuated quickly in coastal water and the spec-
seafloor roughness, upper seafloor geology (i.e., tral hump of distant shipping is characteristic only
sediment type and thickness), depth and type of in offshore water (see Sect. 7.5.3.2). Ergo,
the underlying bedrock, and the presence of soundscapes may differ with location and depth,
sound absorbers, scatterers, and reflectors (e.g., merely because of sound propagation.
aquatic fauna, bubble clouds, or suspended sedi-
ment; see Chap. 6).
The speed of sound in water changes gradually 7.4 Soundscape Changes Over
with depth. As a result, sound does not travel in Space and Time
straight lines. Instead, sound paths are bent by
refraction. By Snell’s law, paths bend toward Soundscapes may vary on a range of spatial
local minima in sound speed. The most pro- scales, exhibit temporal cycles (e.g., because of
nounced local minimum occurs in all non-polar diurnal animal behaviors, periodic animal pres-
oceans at a depth of about 1000 m below the sea ence, or seasonal weather events; Erbe et al. 2015;
surface. Sound reaching this depth at not too steep Caruso et al. 2017; McWilliam et al. 2017), or
angles can get trapped in the so-called SOFAR gradually change over longer periods of time.
channel by being repeatedly refracted toward the Such changes may be natural or, directly or indi-
channel axis. This is how sound can traverse rectly, related to human activity. Understanding
entire oceans, with sound sources contributing to natural variability is important for using
soundscapes thousands of kilometers away (e.g., soundscapes (1) as an ecological tool to study
Gavrilov 2018). The SOFAR channel does not animal behavior and (2) as a management tool
only trap sounds from deep-water sound sources of the potential effects of human activity. Our
(e.g., submarines or diving megafauna) located understanding of the function of animal calls
within the channel, but also from sources near and natural or anthropogenic interferences is
the sea surface (e.g., ships or whales) because based on limited observational data (Slabbekoorn
sound can radiate into the SOFAR channel with et al. 2018) and so interpreting changes in sounds
just one reflection off a downward sloping sea- is even more difficult. Gavrilov et al. (2012), for
floor (Fig. 7.17). The minimum in sound speed example, recorded the underwater soundscape
(and so the axis of the SOFAR channel) rises to between 21 and 27 May in 2002, 2006, and
shallower depths in polar waters. In fact, in the 2010 off Cape Leeuwin, Australia. Between
236 R. P. Schoeman et al.

fin whales Antarctic blue whales ambient noise may result in spatial differences in
98 vocalizations (Slabbekoorn and Smith 2002). If
PSD [dB re 1 μPa2/Hz]

96 ambient noise differs consistently across a spe-


94 cies’ habitat, acoustic adaptation might result in
acoustic divergence between populations of the
92
same species (Dingle et al. 2008). If the calls of
90 2010 these populations diverge so much that they are
2006
88 2002
no longer recognized by all populations, sexual
selection may lead to the segregation into distinct
86
15 20 25 30 (sub)species (Dingle et al. 2010; Burbidge et al.
Frequency [Hz]
2015). For research on soundscapes and acoustic
Fig. 7.19 Power spectral density (PSD) of the ecology, spatial replication in sampling is
soundscape off Cape Leeuwin, Australia, showing paramount.
increases in level and decreases in frequency of the fin
and Antarctic blue whale characteristic sounds over eight
years. Figure courtesy of Sasha Gavrilov, Curtin Univer-
sity, Perth, Australia 7.4.2 Natural Cycles

Soundscapes vary naturally with diurnal, lunar,


years, an increase in sound levels at the
seasonal, or annual cycles because of temporal
frequencies characteristic of fin whales and Ant-
patterns in animal presence and behavior (e.g.,
arctic blue whales (Balaenoptera musculus
night-time foraging, lunar spawning, seasonal
intermedia) was seen (Fig. 7.19). This could be
hibernation, and annual migration) as well as
due to an increase in whale population sizes or
weather (e.g., annual monsoon). In Alaska, ambi-
changes in migration routes (i.e., closer to the
ent sound increased rapidly in early spring due to
recorder). The authors further noted that the fre-
an influx of migratory bird species and the awak-
quency of Antarctic blue whale calls decreased
ening of species from dormancy and hibernation
for unknown reasons.
(Mullet et al. 2016). Gage and Axel (2014) stud-
ied the diurnal and seasonal patterns in ambient
sound within 1-kHz frequency bands at Michigan
7.4.1 Spatial Patterns Lake, USA, from 2009 to 2012. At 2–3 kHz,
power levels were highest in early spring with
Soundscapes vary naturally over large and small the presence of spring peepers (Pseudacris cruci-
spatial scales, abruptly or gradually, resulting in fer, Hylidae). Levels dropped progressively
different soundscapes between and within toward early fall when spring peepers
habitats. Slabbekoorn (2004) sampled multiple disappeared and increased again in late fall
sites within a contiguous rainforest and an adja- because of chorusing insects. In contrast, at
cent ecotone forest in Cameroon. He found spatial 4–5 kHz, levels were low in early spring but
differences in ambient noise, which were due to increased in late spring with the presence of
differences in wind and species vocalizations breeding birds. Levels subsequently dropped yet
(insects, frogs, and birds). Over time, ambient increased again in late summer and early fall
noise can affect the vocal characteristics of because of insects. Diurnal changes in ambient
individuals, populations, and species (see sound were related to ecological activity. Within
Chap. 13). Consistent ambient noise may drive the 2–4 kHz frequency band, for example, spring
the features of a species’ vocalizations, so that peepers dominated the soundscape at night until
call transmission is optimized within the acoustic singing birds took over at dawn. Under water, in
environment (Acoustic Adaptation Hypothesis). the Ionian Sea, echolocation activity of dolphins
Just as temporal changes in ambient noise may occurred at nighttime and crepuscular hours
result in vocalization changes, spatial changes in (Caruso et al. 2017). In contrast, communication
7 Analysis of Soundscapes as an Ecological Tool 237

Fig. 7.20 Seasonal timing of pygmy blue whale migra- of pygmy blue whale singers as 24-h means. The red
tion along the west and south coasts of Australia based on horizontal lines indicate when the recorders were
passive acoustic monitoring. The chart shows the locations operating (Erbe et al. 2016b)
of sound recordings (red dots). The diagram shows counts

signals (i.e., whistles) were mostly produced dur- degradation by humans as a root cause. Humans
ing the day. Seasonal variation, with a peak num- add sound to soundscapes, change biodiversity
ber of clicks in August, was also evident, but no through land-use, and directly remove animals
effect of lunar cycle was observed. Off Western from habitats (e.g., by hunting). Humans also
Australia, pygmy blue whales (Balaenoptera contribute to climate change, with greenhouse
musculus brevicauda) are a seasonally dominant gas emissions resulting in environmental changes,
contributor to the marine soundscape and simply which can have direct and indirect effects on
by listening, their seasonal migration can be ecosystems and related soundscapes. The conser-
traced along the coast (Fig. 7.20; Erbe et al. vation of soundscapes is important not only for
2016b). scientific and ecological reasons but also for tour-
istic interests and human welfare (Pavan 2017).

7.4.3 Human Activities 7.4.3.1 Anthropophony


Humans alter soundscapes by growing
In many habitats, soundscapes have changed sig- anthropophony through an increase in transpor-
nificantly over the last century, with habitat tation, construction, mineral and hydrocarbon
238 R. P. Schoeman et al.

exploration and production, military exercises, primates, squirrels, tree-shrews, and bats between
recreational activities, etc. These activities undisturbed, logged, and transformed patches of
produce sounds over a wide range of forest (i.e., to rubber and oil palm plantations) in
frequencies and at a variety of intensities (see eastern Sumatra, Indonesia. Logging changed the
Sects. 7.2.3 and 7.3.3). While some activities composition of bird species, revealing a decrease
are temporary, others result in sustained in the number of specialized insectivorous species
increases in ambient sound levels over time. and an increase in insectivore-frugivore generalist
For example, underwater sound from shipping species. The species richness of bats also
has increased ambient sound levels between decreased with a concomitant increase in abun-
10 and 100 Hz in large parts of the world’s dance of the most dominant bat species. How-
oceans by up to 3 dB per decade (e.g., Andrew ever, logging impacts differed between
et al. 2011; Chapman and Price 2011; Miksis- geographical regions and management strategies
Olds et al. 2013). (e.g., conventional selective, salvage, or reduced-
Seismic surveys produce intense sound over a impact logging; Chaudhary et al. 2016; LaManna
few weeks at a time to explore a specified area; and Martin 2017). Land transformation to
yet, Nieukirk et al. (2004, 2012) detected airgun plantations resulted in a dramatic decrease in
pulses along the Mid-Atlantic ridge from seismic biodiversity with the disappearance of primates,
survey vessels located 3000–4000 km away. In squirrels, and tree-shrews as well as a reduction in
1999, airgun signals were routinely detected for bird and bat species richness by 90–95% and
more than 80% of the days in a month, which 75–87%, respectively.
increased to 95% in 2005. Finally, anthropogenic
sounds may affect animal behavior (i.e., physical 7.4.3.3 Direct Takes
or acoustic, Slabbekoorn et al. 2018; see Accidental, illegal, or over-harvesting of animal
Chap. 13), which can further alter soundscapes. species occurs in both terrestrial and aquatic
habitats (e.g., Challender and MacMillan 2014;
7.4.3.2 Land Use Anderson et al. 2020), resulting in population
Humans transform natural landscapes to increase declines and species extinctions (Hoffmann
agricultural land coverage, to build infrastructure et al. 2011; Dulvy et al. 2014). Perhaps one of
(e.g., roads, buildings, and power supply the greatest examples is the removal of millions
systems), or to extract resources (e.g., tree log- of whales during the nineteenth and twentieth
ging and mining). These activities generate centuries (Rocha Jr. et al. 2014), which unequiv-
sound and affect animal density and biodiversity, ocally changed marine soundscapes world-wide.
ultimately changing soundscapes (Phillips A modern example is the threat of dissapearing
et al. 2017). In 1962, ecologist Rachel Carson Gulf corvina (Cynoscion othonopterus) choruses
expressed her concern about the use of chemicals in the Colorado River delta because of
and pesticides in agriculture, killing not only soil overfishing (Erisman and Rowell 2017).
micro-fauna but also macro-fauna (Carson 1962). Overfishing can also result in excessive growth
She foresaw a silent natural world without the of algae, ultimately changing soundscapes.
songs of insects, frogs, and birds, if they were Freeman et al. (2018), for example, found a posi-
lost due to urbanization or chemical pollution. tive correlation between sound levels and
She was one of the first to consider animal sounds macroalgae coverage on Hawaiian coral reefs,
as an expression of ecosystem integrity and qual- attributable to ringing bubbles emitted during
ity. Kerr and Cihlar (2004) found a correlation photosynthesis.
between high-intensity, high-biomass agriculture
and high numbers of endangered species on both 7.4.3.4 Climate Change
national and regional levels in Canada. The Earth is experiencing rapid climate change,
Danielsen and Heegaard (1995) compared the affecting soundscapes in a variety of ways. The
species richness and abundance of birds, geophony is affected by changing weather
7 Analysis of Soundscapes as an Ecological Tool 239

patterns (i.e., wind, precipitation, and storms; sea-ice freeze-up (Hauser et al. 2016). These
Sueur et al. 2019). Rising temperatures reduce examples stress the importance of collecting envi-
sea- and land ice, which is changing polar ronmental data together with acoustic data, to
soundscapes (Intergovernmental Panel on Cli- correlate changes in animal distribution patterns
mate Change [IPCC] 2014). Climate change fur- and behavior with environmental change
ther modifies the acoustic properties of the (Kloepper and Simmons 2014).
environment with direct effects on sound propa-
gation and thus the audible distances of sounds.
Larom et al. (1997) calculated that the effective 7.5 How to Analyze Soundscapes
communication range for African elephant calls
varied between 2 and 10 km with temperature and Soundscape analysis may involve various, some-
windspeed. Ocean acidification, as a result of times sequential, methods ranging from listening
climate change, results in less absorption of to recordings, via visual inspection of
low-frequency sounds (Gazioğlu et al. 2015). spectrograms, to automated detection of target
Thus, low-frequency sound sources, such as signals, and computation of several acoustic
ships and whales, may become more prominent metrics. Often, the larger the acoustic monitoring
in future marine soundscapes. project, the more automated the tools, as long-term
Climate change may also directly affect a spe- projects, which might compare multiple recording
cies’ vocal behavior, distribution pattern, or sites, might gather terabytes of data, which are
timing of behavioral events, such as migration virtually impossible to analyze by hand.
and mating (Krause and Farina 2016; Sueur
et al. 2019). Narins and Meenderink (2014)
found that Puorto Rican coqui frogs (Eleuthero- 7.5.1 Standard Soundscape
dactylus coqui), over a period of 23 years, moved Measurements
to higher altitudes, while their calls increased in
pitch and decreased in duration. These changes in Initial assessments of soundscapes typically
distribution and call characteristics corresponded involve the computation of spectrograms and
to an overall increase in temperature of 0.37  C, some general statistics, such as the broadband
with a concomitant decrease in body size. A dif- root-mean-square (rms) Sound Pressure Level
ferent response was seen by four frog species near (SPLrms) in either dB re 20 μPa or dB re 1 μPa
Ithaca, NY, USA, who advanced the start of their in air and water, respectively (see Chap. 4). This
breeding season by 14 days between 1900–1912 allows an initial quality-check of the recordings
and 1990–1999, as evident from recordings of and the identification of potential spatial or
mating calls (Gibbs and Breisch 2001). During temporal patterns in overall sound levels,
this time, temperatures increased on average highlighting areas or temporal events of interest
0.7–1.7  C. Insects also depend on air tempera- for further investigation (e.g., very quiet or very
ture for the expression of their behavior, includ- noisy areas or times of day, Fig. 7.21). However,
ing sound emission (Ciceran et al. 1994). Rossi broadband SPLrms levels are strongly influenced
et al. (2016a, b) found that snapping shrimp by the noisiest events and cannot identify
(family Alpheidae) reduced their snap rate (i.e., the myriad of soundscape components and
snaps per minute) and intensity under increased contributors to spatial and temporal differences.
levels of CO2. This might affect the behavior of As sound sources are often known to cover
species that rely on acoustic cues from snapping certain frequency bands, it is beneficial to com-
shrimp for navigation (Rossi et al. 2016b). pute SPLs within purposefully chosen frequency
The eastern Chukchi Sea beluga whale bands or standard octave or 1/3 octave bands.
(Delphinapterus leucas) population delayed Buscaino et al. (2016) used Octave Band Levels
timing of migration from foraging habitats by (OBLs) at center frequencies from 62.5 Hz to
2–4 weeks, corresponding to a delay in regional 64 kHz to study temporal patterns in the
240 R. P. Schoeman et al.

Fig. 7.21 Spectrograms (top) and time series (bottom) of Reprinted by permission from Springer Nature.
broadband (20 Hz–22 kHz) sound pressure levels of a 24-h Bertucci F, Guerra AS, Sturny V, et al., A preliminary
recording period at three sites around Bora Bora Island, acoustic evaluation of three sites in the lagoon of Bora
French Polynesia. Recording schedule was set at 60 s Bora, French Polynesia. Environ Biol Fishes 103:891–
every 10 min. Note the increase in sound levels at night 902; https://doi.org/10.1007/s10641-020-01000-8.
(shaded areas) as well as the strong fluctuation in sound # Springer Nature, 2020. All rights reserved
levels between 60-s segments (Bertucci et al. 2020).

soundscape of a shallow-water Marine Protected 7.5.2 Identification of Sound Sources


Area in the Mediterranean Sea. Seasonal patterns
were seen within the lower (63 Hz–1 kHz) and Soundscape ecology involves the identification of
higher (4–64 kHz) OBLs due to increases in wind sound sources and whether they are part of the
in winter and snapping shrimp activity in sum- biophony, geophony, or anthropophony. Most
mer, respectively. In contrast, sound levels within sources have a unique sound signature (see
the 2-kHz octave band remained stable as sound examples earlier in this chapter), which can be
from both wind and snapping shrimp entered this identified from power spectra. Knowing to which
frequency band, thus attenuating seasonal soundscape component a sound belongs helps to
fluctuations. Sound levels in the 1/3 octave evaluate how pristine an environment is and pin-
bands centered at 63 and 125 Hz were set as point possible impacts from human activities.
indicators of ship noise by the European Com- Choruses by insects (Brown et al. 2019), anurans
mission Joint Research Centre (Tasker et al. (Nityananda and Bee 2011), birds (Baker 2009),
2010). Ship noise studies in shallow water, how- marine invertebrates (Radford et al. 2008), and
ever, highlight that natural sound sources (i.e., fish (Parsons et al. 2016) are so distinct that they
wind) and propagation characteristics may render are easily identified as biophony. Knowledge on
these indicators less useful in coastal areas and species-specific vocalizations helps to monitor
that bandlevels at 200 and 315 Hz should be species behavior and species-specific responses
included, particularly in areas frequented by to environmental stressors (such as noise) as
smaller recreational vessels (Garrett et al. 2016; demonstrated with insects (e.g., Walker and
Picciulin et al. 2016). Cade 2003), amphibians (e.g., Gibbs and Breisch
7 Analysis of Soundscapes as an Ecological Tool 241

Fig. 7.22 Spectrograms highlighting the difference in evolution of vocal displays in Traupidae (tanagers), the
vocalizations between 14 different tanager species, which largest family of songbirds. Biol J Linn Soc 114:538–551;
can be used to monitor behavior and response to environ- https://doi.org/10.1111/bij.12455. # The Linnean Society
mental change (Mason and Burns 2015). Reprinted by of London, 2015; https://global.oup.com/academic/rights/
permission from Oxford University Press. Mason NA, permissions/. All rights reserved. Reuse requires permis-
Burns KJ, The effect of habitat and body size on the sion from OUP

2001), birds (Fig. 7.22; e.g., Jahn et al. 2017), and inspection of sound files is labor intensive;
mammals (e.g., Nijman 2001; Parks et al. 2007). and so, some studies make use of automatic
Similarly, the sounds of the geophony and detection and classification software (see
anthropophony have characteristic spectral Chap. 8).
features by which they can be identified.
Studies differ, however, in their methodology
to identify sound sources. By listening to sounds 7.5.3 Visual Displays of Soundscapes
while observing their spectrograms in real-time
(see Sect. 7.5.3.1), experts can employ their per- 7.5.3.1 Spectrograms
sonal experience to separate biotic and abiotic A spectrogram displays acoustic power density as
sounds and to identify species. Alternatively, a function of time and frequency. Each column in
sounds can be compared to labeled recordings the spectrogram is a result of Fourier-
in sound libraries (see URLs at the end of this transforming a section of the recorded time series
chapter) and spectrograms can be compared to of sound pressure. The frequency and time
those found in the literature. However, manual resolutions of the spectrogram are affected by
242 R. P. Schoeman et al.

the window length and type of window function identifying the sound source. Spectrograms that
used (see Chap. 4). Techniques such as zero- contain the vocalizations of multiple sound
padding (i.e., expanding a time window with sources can provide information on species
zeros) and overlapping time windows may vocal dynamics, acoustic niches, and how
enhance the apparent resolution in frequency animals may be affected by acoustic changes in
and time. Each pixel (or cell) of the spectrogram their surroundings. For example, mixed anuran
eventually represents an average sound power, species’ breeding choruses in Minnesota, USA,
averaged into time and frequency bins. revealed acoustic niche partitioning within the
Spectrograms are a useful tool to examine the frequency domain (Fig. 7.23), while fin whale
time, frequency, and amplitude details of a vocalizations were masked by ship noise in Italy
sound at different time scales, potentially (Fig. 7.24).

Fig. 7.23 Anuran choruses recorded in Minnesota com- image; # Nityananda and Bee (2011); https://journals.
prising calls of four species. Note the occupation of differ- plos.org/plosone/article?id¼10.1371/journal.pone.
ent frequency bands by these species, suggesting acoustic 0021191. Published under CC BY 4.0; https://
niche partitioning within the frequency domain. Modified creativecommons.org/licenses/by/4.0/

125
A
100
Frequency [Hz]

75

50

25
0
125
B
100
Frequency [Hz]

75

50

25

0
0 30 60 Time [s] 90 120 150

Fig. 7.24 Spectrograms of (a) 20-Hz fin whale vocalizations off Sicily, Italy, and (b) a passing ship, which masked the
fin whale sounds
7 Analysis of Soundscapes as an Ecological Tool 243

Long-term monitoring programs typically noisy periods, or correlations between acoustic


make use of long-term spectral averages patterns and environmental factors. Fig. 7.25
(LTSAs), which are spectrograms that were aver- shows a 3-week LTSA, in which dominant events
aged into observation windows much longer than were marked (e.g., nightly fish chorus, whale
the underlying FFT windows. Observation choruses, stormy days, and passing ships).
windows may range from tens of seconds, to Break-out spectrograms show specific signals on
one minute, to several hours, to the length of a finer temporal scale (Erbe et al. 2016b). Alter-
one recording within a duty cycle (e.g., Gavrilov natively, long-term spectrograms may display
and Parsons 2014). LTSAs highlight persistent minimum (LTSmin), maximum (LTSmax),
soundscape contributors (e.g., shipping or median (LTSmed), or other percentile levels
storms), repetitive soundscape contributors (e.g., (e.g., LTS75), computed within each frequency
night-time choruses), and dominant events (e.g., bin over some time window (Righini and Pavan
an earthquake). They can be used to identify 2020). The minima will track the quietest baseline
specific days or hours rich in sounds, quiet versus and the maxima can highlight strong but brief

Fig. 7.25 Spectrograms of the marine soundscape in the surrounding panels display short-term spectrograms of
Perth Canyon, Australia. Middle panel shows a 3-week example sounds (Erbe et al. 2016b)
LTSA, computed with a 10-min observation window. The
244 R. P. Schoeman et al.

Fig. 7.26 LTSmax spectrograms from the same location decreased in August. LTSmax produced with SeaPro soft-
(Sasso Fratino Integral Nature Reserve, Italy) on three ware by combining 48 frames of 10 min each, recorded
different dates and under different weather conditions. every 30 min (Righini and Pavan 2020)
Biophony is concentrated between 1.5 and 9 kHz and

events, which would otherwise be averaged and might be a need to quantify this variability.
potentially missed in LTSAs. Fig. 7.26 shows Power spectral density (PSD) percentile plots
three 24-h LTSmax of an Italian soundscape on quantify the spectrum variability over the dura-
different dates and under different weather tion of a temporal analysis window. PSD is plot-
conditions (Righini and Pavan 2020). The images ted against frequency. At each frequency, several
show sound sources present from midnight to percentile levels are shown, commonly the
midnight: (top) one day in June 2015 with some median (50th percentile) and the quartiles (25th
bursts of rain, (middle) one day with good and 75th percentiles), but perhaps also additional
weather and a clear image of the biophony percentiles (e.g., 1st, 5th, 95th, and 99th). The nth
concentrated between dawn and dusk in the fre- percentile gives the levels that were exceeded n%
quency range 1.5–9 kHz, and (bottom) one day of the time. There is no standard for the length of
recorded in August, with a less dense biophony the temporal analysis window, and selection
during daylight hours but Orthopteran choruses in depends on the specific study questions. Tempo-
the night. In August, a short period of light rain is ral analysis windows of 24 h, one season, or one
also shown on the left side. In addition, the stream full year are common. Dominant contributors to
noise below 1 kHz in August was lower than in the soundscape can then be identified by the
June. The faint band between 12 and 18 kHz shape and levels of the curves. Additional infor-
present in all 3 panels was due to the intrinsic mation is provided by plotting the Spectral Prob-
noise of the recorder. ability Density (SPD) as background colors that
represent the probability of levels being reached
7.5.3.2 Power Spectral Density based on normalized histograms of sound levels
Percentile Plots within each frequency bin (Fig. 7.27; Merchant
While spectrograms (including LTSAs) show et al. 2013). Merchant et al. (2015) gave detailed
how the sound spectrum changes over time information on how to compute PSDs and SPDs
(from one FFT window to the next or from one with their publicly available software PAMGuide.
LTSA observation window to the next), there Also see Chap. 4.
7 Analysis of Soundscapes as an Ecological Tool 245

Pygmy blue whales


110
Near traffic 0.14

Power Spectral Density [dB re 1 μPa2/Hz] 100


Humpback 0.12
whales
90
0.1

Probability density
80 Fish 0.08

70 1% 0.06
5%
25%
60 50% 0.04
75%
50 0.02
Fish ? 95%
99%
40 0
101 102 Frequency [Hz] 10
3

Fig. 7.27 Plot of power spectral density percentiles and humpback whales at 300 Hz, and fishes at 2 kHz, whereas
probability density for the annual soundscape of the Perth the most common sound sources were distant shipping at
Canyon, Australia. The strongest sound sources were 10–100 Hz and wind at 300 Hz–3 kHz (Erbe et al. 2016b)
pygmy blue whales and nearby ships at 10–200 Hz,

7.5.3.3 Soundscape Maps identify areas of long-term risk to humans or


Soundscape maps literally show sound levels on a animals from noise exposure. Erbe et al. (2012)
map. Such maps are mostly produced by computed a map of average sound levels from
modeling sound propagation from multiple annual ship tracks to highlight areas along the
sources, distributed over the area. Model results Canadian coast where ship noise exceeded the
may be validated by point measurements (i.e., European criterion of 100 dB re 1 μPa rms
recordings at selected places; Erbe et al. 2014, (Fig. 7.29). The same concept was later used to
2021; Schoeman et al. 2022). Sound maps may identify areas where (a) strong sound levels
be produced for specific frequencies of interest overlapped with high animal density (identifying
(e.g., relevant to human audiology; Bozkurt and areas of risk; Fig. 7.30; Erbe et al. 2014), and
Demirkale 2017) or for a specified receiver height (b) low sound levels overlapped with high animal
or depth (e.g., migrating whales below the sea density (identifying areas of opportunity for con-
surface; Tennessen and Parks 2016; Bagočius servation management; Fig. 7.30; Williams et al.
and Narščius 2018). Sound propagation maps 2015).
typically focus on specific sound sources (e.g.,
highways or railways; Fig. 7.28; Aletta and
Kang 2015; Drozdova et al. 2019). 7.5.4 Acoustic Indices
Maglio et al. (2015) developed a near real-time
model that shows the propagation of sound from Apart from sound level statistics (such as SPL
individual ships in the Ligurian Sea. However, measures, PSD percentiles, and SPD), additional
focus can also be placed on cumulative or average metrics, such as acoustic indices, exist, which
sound levels over a specified time frame to may quantify soundscapes as a whole or quantify
246 R. P. Schoeman et al.

Fig. 7.28 Noise-map of a roadway in an urban area. Red com/journals/jat/2018/7031418/fig4/. Published under CC
indicates highest noise levels and green represents the BY 4.0; https://creativecommons.org/licenses/by/4.0/
quietest areas. # Cai et al. 2018; https://www.hindawi.

the biophony, geophony, and anthropophony sep- score the structure and distribution of acoustic
arately or in comparison. Acoustic indices can be power over frequency and/or time, reflecting a
used as a tool to assess the quality of soundscapes correlation with species presence and distribution
and the underlying ecosystem. Historically, (e.g., Towsey et al. 2014). While traditionally
researchers assessed the number of species (i.e., developed for terrestrial communities, acoustic
species richness) and number of individuals indices are now also increasingly applied to the
belonging to each species (i.e., species evenness) aquatic environment (e.g., Parks et al. 2014;
by counting the number of acoustic identifications Harris et al. 2016; Bolgan et al. 2018a). In partic-
while walking along survey transects or listening ular when the same instruments and protocols are
to recordings (Obrist et al. 2010). However, this used, acoustic indices allow for comparisons of
approach is inefficient, subjective, and limited to soundscapes between multiple sites recorded over
brief observation times. In contrast, a transect or the same period or an evaluation of the changes of
grid of automated recording systems allows a soundscape over time (Righini and Pavan 2020;
acoustic surveys in remote areas, over extended Farina et al. 2021).
periods, and in most field conditions (Acevedo Examples of acoustic indices include:
and Villanueva-Rivera 2006).
1. Bioacoustic Index (BI): Aims to quantify
To support the analyses and interpretation of
biophonic activity by thresholding spectral
consequent large datasets, researchers have been
power in biophony-specific frequency bands
developing acoustic indices that summarize and
(Fig. 7.31; Boelman et al. 2007),
7 Analysis of Soundscapes as an Ecological Tool 247

Fig. 7.29 Illustration of


the conversion of
cumulative hours of ship
traffic along the Canadian
coast to cumulative noise
levels (a) to identify areas
where annual average
received levels exceeded
the European criterion for
low-frequency ambient
noise of 100 dB re 1 μPa
rms (b; Erbe et al. 2012).
# Acoustical Society of
America 2012. All rights
reserved

2. Entropy Index (H): Equals the product of two and applies the Shannon entropy to these bins
sub-indices, spectral (Hf) and temporal (Villanueva-Rivera et al. 2011),
entropy (Ht), computed on the average fre- 4. Acoustic Evenness Index (AEI): Divides the
quency spectrum and on the Hilbert amplitude spectrum into specific frequency bins, selects
envelope of the raw bioacoustic signal, respec- the bins surpassing a preset power threshold,
tively (Sueur et al. 2008b), and considers the distribution of strong fre-
3. Acoustic Diversity Index (ADI): Divides the quency bins by computing the Gini coefficient
spectrum into specific frequency bins, selects (Villanueva-Rivera et al. 2011),
the bins surpassing a preset power threshold,
248 R. P. Schoeman et al.

Fig. 7.30 Maps of (a) harbor porpoise (Phocoena density and low noise) in British Columbia, Canada.
phocoena) density, (b) audiogram-weighted ship noise, # Williams et al. 2015; https://doi.org/10.1016/j.
(c) areas of risk (i.e., high animal density and high marpolbul.2015.09.012. Licensed under CC BY-NC-ND
noise), and (d) areas of opportunity (i.e., high animal 4.0; https://creativecommons.org/licenses/by-nc-nd/4.0/

5. Acoustic Complexity Index (ACI): Measures frequency power (indicative of biophony) to


the temporal variation in acoustic power by capture the level of anthropogenic disturbance
calculating sequential power differences (Kasten et al. 2012).
(from one FFT window to the next), in all
These and other indices are coded in shareware
frequency bands separately, then sums over
R packages, such as seewave (Sueur et al. 2008a;
frequency (Fig. 7.31; Pieretti et al. 2011), and
Sueur 2018), soundecology (Villanueva-Rivera
6. Normalized Difference Soundscape Index
and Pijanowski 2018), and bioacoustics (March-
(NDSI): Equals the ratio of low-frequency
al et al. 2020). However, the analysis of long-term
(indicative of anthropophony) to high-
recordings can also aim at recognizing individual
7 Analysis of Soundscapes as an Ecological Tool 249

Fig. 7.31 Bioacoustic Index (BI) and Acoustic Complex- peak at sunrise, followed by a gradual decline with a
ity Index (ACI) for three Italian locations in the Integral second peak at sunset
Nature Reserve of Sasso Fratino, Italy, showing a strong

species’ signatures by listening, by observing resolutions used in the computation of the various
spectrograms, and by using sound recognition quantities and is affected by temporal (and spa-
tools to identify the presence and recurrence of tial) patterns as well as local (and temporally
defined sound models. The R package monitoR variable) sound propagation conditions (Mooney
(Katz et al. 2016) can be used to identify user- et al. 2020). As a result, acoustic indices are
defined sound models. sometimes tuned for specific environments, limit-
It should be noted that acoustic indices applied ing comparability across environments and time.
in two different environments can produce
confounding results and so the robustness of
these indices to environmental change and to 7.6 Applications of Soundscape
different soundscape compositions has been Studies
questioned (Harris et al. 2016; Bolgan et al.
2018a). Soundscape studies can reveal information on
Parks et al. (2014) found that seismic airgun animal distribution, abundance, and behavior;
pulses interfered with the Entropy Index and species diversity; and changes of all of these
therefore did not accurately reflect species rich- over time under environmental and human
ness within the Atlantic Ocean where seismic influences. Hence, soundscape analyses can be
surveys were commonly detected. Bolgan et al. used as ecological tools to understand, conserve,
(2018a) assessed the robustness of the Acoustic and restore soundscapes as part of conservation
Complexity Index to fine variations in fish sound management plans (Pavan 2017).
abundance (i.e., number of sounds) and diversity
(i.e., number of different calls); both changed
index values. Hence, it would be difficult to
infer whether a change in this index resulted 7.6.1 Conservation of Natural
from a change in fish abundance or fish species Soundscapes
diversity. Biophony and anthropophony can over-
lap in frequency and time as well as vary with 7.6.1.1 Management
frequency and time. Acoustic index performance Documenting, analyzing, and understanding a
depends greatly on the frequency and time soundscape can provide important information
for wildlife and habitat managers on species
250 R. P. Schoeman et al.

richness, animal behavior patterns, effects of highlighting the potential use of soundscape stud-
anthropogenic sounds, land-use, and climate ies to monitor for illegal human activities and to
change. Documenting relatively pristine assess the effectiveness of conservation efforts.
soundscapes before they disappear (Righini and Pavan 2020; Farina et al. 2021) can aid re-establishment of degraded acoustic habitats through habitat restoration, animal relocation, elimination of invasive species, or restrictions of activities that generate anthropogenic sound and affect animal behavior. The success of soundscape restoration can then be demonstrated through acoustic monitoring and analysis (Pavan 2017).

Development and implementation of a comprehensive acoustic monitoring program can aid management of a protected area in several ways. Firstly, storage of quantitative data about the acoustic environment can be used to create pivotal repositories for immediate or future analyses of spatial and temporal patterns and differences at large scales. LTSA spectrograms, for example, provide a summary of day-by-day acoustic settings and the possibility to display information not only on the diversity of acoustic species (as in a census) but also on the density and richness of the biophonic components. The study of an Integral Nature Reserve (Sasso Fratino, Casentinesi Forests National Park, Italy) demonstrated that the biophony dominated both geophony and anthropophony, with undisturbed daily cycles (Righini and Pavan 2020; Farina et al. 2021).

Secondly, monitoring soundscapes can help managers detect unwanted and unlawful activities in protected areas. Human voices can be used to identify trespassers, gunshots to locate hunters and poachers, humming chainsaws to find illegal logging, vehicle sounds to document unauthorized vehicle use, and sounds from livestock to pinpoint unlawful grazing. Wrege et al. (2017) found that gunshot sounds within a closed-canopy forest of the Congo could be detected over a 7–10 km² area, depending on the gun used and orientation to the acoustic receiver. Eight years of acoustic monitoring did not reveal a correlation between illegal hunting of forest elephants (Loxodonta cyclotis) and time of day or season. However, hunting intensity seemingly decreased after initiating patrols in 2009.
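The day-by-day LTSA summaries mentioned above can be assembled with freely available tools. The following minimal sketch, in R with the tuneR and seewave packages (both listed among the software resources later in this chapter), stacks the mean spectrum of each recording in a folder into a simple long-term display. The folder and file layout are hypothetical, and the result is in relative (uncalibrated) decibels; an operational monitoring program would add calibration and finer time resolution.

## Minimal sketch: LTSA-style summary from a folder of consecutive WAV files.
## The folder name is hypothetical; levels are relative (uncalibrated).
library(tuneR)    # readWave()
library(seewave)  # meanspec()

files <- sort(list.files("reserve_recordings", pattern = "\\.wav$", full.names = TRUE))

ltsa <- sapply(files, function(f) {
  w <- readWave(f)
  s <- meanspec(w, f = w@samp.rate, wl = 512, dB = "max0", plot = FALSE)
  s[, 2]  # amplitude column; s[, 1] holds the frequency axis (kHz)
})

## Rows of t(ltsa) = consecutive recordings (time), columns = frequency bins
image(t(ltsa), xlab = "Recording index (time)", ylab = "Relative frequency",
      main = "Long-term spectral average (relative dB)")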

Investigation of underwater soundscapes can also aid in the detection of foreign vessels by the military, unauthorized commercial fishing vessels, unlawful vessels in restricted areas (i.e., no-go zones or marine protected areas; Kline et al. 2020), and illegal fishing activities with explosives (Xu et al. 2020).

7.6.1.2 Education

The rates of biodiversity loss, habitat loss, invasion of alien species, and species extinctions are high (Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services [IPBES] 2019). Helping citizens and stakeholders appreciate biodiversity is necessary to establish a general willingness to address anthropogenic causes of ecosystem demise. In this context, animal sound and soundscape recordings not only serve science but have the potential to trigger people's curiosity to learn more about the importance of ecosystems and their preservation, which will lead to conservation efforts. Such transfer of science, via education, to conservation has been demonstrated in several case studies (e.g., Padua 1994; Macharia et al. 2010; Pavan 2017; Barthel et al. 2018). Exhibits and educational programs on the sounds from nature in museums, zoos, park visitor centers, and websites can stimulate interest in and care about the acoustic environment. An example is Bernie Krause's Great Animal Orchestra exhibition (https://thevinylfactory.com/features/bernie-krause-great-animal-orchestra/; accessed 27 September 2020). Alternatively, listening to animal sounds during a guided nature walk can generate an appreciation for soniferous animals, which can result in long-term public engagement and commitment to conservation by citizen scientists. Soundscape studies can help to create publicly available sound libraries and help to identify areas within a park for visitors to experience songbirds, calling frogs, chorusing insects, waterfalls, rushing streams, etc. One example of integrating soundscape monitoring and education is the Natural Sound Program, established in 2000 by the U.S. National Park Service (National Park Service [NPS] 2000). This program aims to manage the acoustic environment while providing for educational and inspirational visitor experiences.

7.6.2 Monitoring the Health of Agroecosystems

High productivity from agricultural fields can be maintained through insecticides, pesticides, and fertilizers, but the use of these products may result in chemical pollution with consequent loss of plant and animal biodiversity (e.g., Carson 1962; Boatman et al. 2004; Kerr and Cihlar 2004; Kleijn et al. 2009). Hence, habitats connected to agricultural lands might exhibit poorer soundscapes. In contrast, organic farmers strive to maintain productivity through natural agroecosystems, ensuring environmental quality and ecological balance. Bird, insect, amphibian, and bat communities serve as indicators of ecosystem health, and an agroecosystem should have a balance of mixed species that provide natural pest control. The ecological quality of an agroecosystem can therefore be evaluated by the species richness of its soundscape (e.g., Hole et al. 2005; Kleijn et al. 2011; Pavan 2017). Doohan et al. (2019) identified bird and bat species-specific or guild-specific bioindicators as successful biomonitoring tools for agricultural industries. Systematic monitoring of biological sounds can provide an accurate and practical assessment tool for farmers, policymakers, researchers, and others interested in maintaining or restoring farmland ecosystems, and can ultimately encourage the adoption of beneficial and sustainable farming practices.
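As one concrete illustration of such an assessment, the sketch below uses the soundecology package (listed later in this chapter) to compare two common acoustic indices between a recording from an organic field and one from a conventionally farmed field. The file names are hypothetical placeholders, and a real survey would average many recordings per site and season.

## Minimal sketch: acoustic indices for two sites (file names are hypothetical).
library(tuneR)         # readWave()
library(soundecology)  # ndsi(), acoustic_complexity(), multiple_sounds()

organic      <- readWave("organic_field_dawn.wav")
conventional <- readWave("conventional_field_dawn.wav")

## Normalized Difference Soundscape Index: balance of biophony vs. anthropophony
ndsi(organic)
ndsi(conventional)

## Acoustic Complexity Index: proxy for the variability of biotic sound
acoustic_complexity(organic)
acoustic_complexity(conventional)

## multiple_sounds() batch-processes whole folders and writes the results to file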

7.6.3 Improving Captive Animal Welfare

Noise may be omnipresent for captive animals in livestock operations, zoos, aquaculture, and aquaria. While wind and rain contribute naturally to ambient sound in outdoor animal enclosures (Wiseman et al. 2014), anthropogenic sound from mechanical devices (e.g., Wysocki et al. 2007; Scheifele et al. 2012b), background music (Scheifele et al. 2012a), and visitors (e.g., Quadros et al. 2014; Sherwen and Hemsworth 2019) is characteristic of many indoor, outdoor, and underwater animal holding facilities. O'Neal (1998), for example, found that underwater sound pressure levels were 25 dB (20–6400 Hz) louder in exhibits inside the Monterey Bay Aquarium than in a nearby natural offshore environment, predominantly due to sound from machinery. Similarly, Scheifele et al. (2012b) detected an increase in sound pressure levels of 10–20 dB (20 Hz–1 kHz) when air pumps were switched on within the Georgia Aquarium. These increases in sound levels can have adverse effects on animal welfare because of physiological and behavioral changes (e.g., Owen et al. 2004).

Sound sources that may impact animals might not be audible to humans, and so animal keepers might not be aware of acoustic disturbance to kept animals. For example, laboratory mice are sensitive to ultrasound, above the human hearing range. Laboratory equipment (e.g., air conditioners and lighting) may emit ultrasound and, unknown to humans, stress animals within these facilities (Sales et al. 1988). Identifying such sources is necessary to improve acoustic conditions and increase captive animal welfare (De Queiroz 2018). Sound can further be exacerbated by hard reflective surfaces and the geometry of an exhibit; hence, some noise problems can be solved by improving exhibit design (Wark 2015; De Queiroz 2018). Restricting visitor group sizes, reducing operating hours, limiting the number of shows, and reducing the level of background music can also mitigate negative impacts of noise on captive animals.
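Level differences of the kind reported by O'Neal (1998) and Scheifele et al. (2012b) can be estimated from paired recordings made under the two conditions. The sketch below, in R with tuneR and seewave, compares band-limited (20 Hz–1 kHz) levels with pumps on versus off. The file names are hypothetical, and absolute levels would require a calibrated recording chain; the on/off difference, however, is informative even without calibration, provided the recorder settings are identical.

## Minimal sketch: band-limited level difference between two recording conditions.
library(tuneR)    # readWave()
library(seewave)  # bwfilter(), rms()

band_level <- function(file, from = 20, to = 1000) {
  w <- readWave(file)
  x <- bwfilter(w, f = w@samp.rate, from = from, to = to,
                bandpass = TRUE, output = "Wave")
  20 * log10(rms(x@left))  # level in dB relative to full scale
}

level_on  <- band_level("exhibit_pumps_on.wav")   # hypothetical file names
level_off <- band_level("exhibit_pumps_off.wav")
level_on - level_off  # increase in dB attributable to the pumps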
7.7 Conclusion

Soundscapes are composed of a myriad of sounds that can be grouped into biophony, geophony, and anthropophony based on their origin.
Natural soundscapes have ecological value, and modifying these natural assets could lead to changes in ecosystem functioning and biodiversity. At present, natural soundscapes are disappearing at an unprecedented rate because of human interference. Human activities create sound, change land-use patterns, directly remove animals from their habitat through overharvesting and illegal hunting, and lead to climate change, thereby directly and indirectly affecting both geophony and biophony. Soundscape studies can be used as an ecological tool to study animal distribution, behavior, biodiversity, and the effects of environmental stressors (such as anthropogenic noise or climate change). Soundscape studies can subsequently inform conservation management and assess the effectiveness of management and conservation efforts.

7.8 Additional Resources

Below is a selection of free, online resources; last accessed 20 June 2022.

7.8.1 Sound Libraries

Sound libraries can serve as a reference during the identification of sound sources. They are also an educational tool to create awareness of the myriad of sounds that may contribute to a soundscape.

• The Macaulay Library from the Cornell Lab of Ornithology contains a large collection of biophony: https://search.macaulaylibrary.org/catalog?view=List&searchField=animals
• The Discovery Of Sound In The Sea (DOSITS) website, developed by the University of Rhode Island Graduate School of Oceanography in partnership with Marine Acoustics Inc., contains an underwater sound library as well as a collection of easy-to-read scientific information on sound in the ocean: https://dosits.org
• The sounds of Australian and Antarctic marine mammals, Curtin University: https://cmst.curtin.edu.au/research/marine-mammal-bioacoustics/
• A collection of biophony, geophony, and various soundscape recordings from all over the world, the British Library: https://sounds.bl.uk/Environment
• Sounds recorded by National Park Service researchers in U.S. National Parks, such as Yellowstone National Park and Rocky Mountain National Park: https://www.nps.gov/subjects/sound/gallery.htm
• A collection of biophony (i.e., invertebrates, amphibians, fishes, reptiles, birds, and mammals), Museum für Naturkunde. Note that some sound descriptions are in German: https://www.museumfuernaturkunde.berlin/en/science/animal-sound-archive
• A collection of biophony, SeaWorld Parks and Entertainment: https://seaworld.org/animals/sounds/
• A collection of marine biophony, geophony, and anthropophony, Ocean Conservation Research: https://ocr.org/sound-library/
• The Xeno-Canto collection of animal recordings provided by scientists and amateur recordists: https://www.xeno-canto.org/
• Web pages of the University of Pavia about bioacoustics and ecoacoustics, including samples of sounds: http://www.unipv.it/cibra

7.8.2 Ocean Acoustic Observatories

Ocean acoustic observatories provide a continuous stream of acoustic data, either in real-time or archived:

• Australia's Integrated Marine Observing System (IMOS): https://imos.org.au/facilities/nationalmooringnetwork/acousticobservatories
• Indian Ocean Acoustic Observatory OHASISBIO: https://www-iuem.univ-brest.fr/lgo/les-chantiers/ohasisbio/?lang=en
• Listening to the Deep Ocean (LIDO): http://www.listentothedeep.net/
• Monterey Bay Aquarium Research Institute (MBARI): https://www.mbari.org/soundscape-listening-room/

7.8.3 Software for Soundscape Analysis

• Characterization Of Recorded Underwater Sound (CHORUS), a MATLAB (The MathWorks Inc., Natick, MA, USA) graphic user interface developed by Curtin University: https://cmst.curtin.edu.au/products/chorus-software/ (Gavrilov and Parsons 2014).
• PAMGuard for passive acoustic monitoring: http://www.pamguard.org/download.php?id=108
• Triton Software Package, a MATLAB graphic user interface developed at Scripps Institution of Oceanography: http://www.cetus.ucsd.edu/technologies_triton.html
• OSPREY, a MATLAB graphic user interface developed by Oregon State University: https://www.mobysound.org/software.html
• R package seewave available for download from within RStudio: https://cran.r-project.org/web/packages/seewave/index.html
• R package soundecology available for download from within RStudio: https://cran.r-project.org/web/packages/soundecology/index.html
• R package bioacoustics available for download from within RStudio: https://cran.r-project.org/web/packages/bioacoustics/index.html
• SoundRuler for measuring acoustic signals: http://soundruler.sourceforge.net/main/
• Sound Analysis Pro for analysis of biophony: http://soundanalysispro.com
• SeaPro and SeaWave for recording, analysis, and real-time display of bioacoustic signals and biophony: http://www.unipv.it/cibra/seapro.html
• SoX, a command-line tool for sound file manipulation and analysis: https://sourceforge.net/projects/sox/
• Raven Lite to record, save, and visualize sounds as spectrograms and waveforms: https://ravensoundsoftware.com/software/raven-lite/
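As a quick orientation to the R packages listed above, the following minimal sketch reads a recording with tuneR and inspects it with seewave. The file name is a hypothetical placeholder, and the analysis parameters are illustrative only.

## Minimal sketch: load a recording and inspect it with seewave.
library(tuneR)
library(seewave)

w <- readWave("survey_site01.wav")                   # hypothetical file name
spectro(w, f = w@samp.rate, wl = 1024, ovlp = 50)    # spectrogram
meanspec(w, f = w@samp.rate, wl = 1024)              # average spectrum of the file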
7.8.4 Software for Sound Propagation Modeling

• The Acoustic Toolbox User interface and Post processor (AcTUP), written in MATLAB, for modeling range-independent and range-dependent environments: http://cmst.curtin.edu.au/products/underwater/ (Duncan and Maggi 2006).
• Graphical user interface i-Simpa, suitable for 3D indoor sound propagation modeling as well as for modeling of environmental noise: https://i-simpa.ifsttar.fr/download/download0/
• Software tool created by the openPSTD project to aid sound propagation modeling in urban environments: http://www.openpstd.org/Download%20openPSTD.html
• The NoiseModelling tool designed to create environmental noise maps of large urban areas: https://noise-planet.org/noisemodelling.html
• The ArcGIS toolbox SPreAD-GIS for modeling engine noise propagation in natural areas, incorporating atmospheric, wind, vegetation, and terrain effects (Reed et al. 2010).
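Before running one of the dedicated propagation models above, a rough expectation of received levels can be useful. The sketch below applies simple spherical spreading with an optional linear absorption term; all numbers are illustrative assumptions, and the listed packages account for refraction, boundaries, vegetation, and terrain, which this back-of-envelope formula ignores.

## Minimal sketch: received level under spherical spreading plus linear absorption.
## RL = SL - 20 log10(range / 1 m) - alpha * range; all inputs are illustrative.
received_level <- function(source_level_db, range_m, alpha_db_per_km = 0) {
  source_level_db - 20 * log10(range_m) - alpha_db_per_km * range_m / 1000
}

received_level(170, c(50, 2000))  # e.g., a 170-dB source at 50 m and at 2 km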
7.8.5 Software for Automatic Signal Detection

Some of the software packages for soundscape analysis include signal detectors:

• CHORUS includes detectors for pygmy blue whale song, fin whale 20-Hz downsweeps, and an unidentified spot-call.
• PAMGuard includes detectors for odontocete and mysticete vocalizations.

Other automatic signal detection resources:

• R package monitoR available for download from: https://cran.r-project.org/web/packages/monitoR/index.html
• Ishmael: http://bioacoustics.us/ishmael.html
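A typical detection workflow with the monitoR package listed above (Katz et al. 2016) builds a spectrogram cross-correlation template from an annotated example call and then scores a longer survey recording against it. The sketch below is hypothetical: the file names, time limits, and frequency limits are placeholders to be replaced with values from a real annotated call.

## Minimal sketch: template-based detection with monitoR (placeholders throughout).
library(monitoR)

tmpl <- makeCorTemplate("example_call.wav", t.lim = c(0.5, 2.0),
                        frq.lim = c(1, 4), name = "target_call")

scores <- corMatch(survey = "survey_site01.wav", templates = tmpl)
peaks  <- findPeaks(scores)
getDetections(peaks)  # table of detection times and correlation scores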

References

Baker MC (2009) Information content in chorus songs of


the group-living Australian magpie (Cracticus tibicen
dorsalis) in Western Australia. Ethology 115:227–238.
Acevedo MA, Villanueva-Rivera LJ (2006) Using digital
https://doi.org/10.1111/j.1439-0310.2008.01606.x
recording systems as effective tools for the monitoring
Barber JR, Crooks KR, Fristrup KM (2010) The costs of
of birds and amphibians. Wildl Soc Bull 34:211–214.
chronic noise exposure for terrestrial organisms.
https://doi.org/10.2193/0091-7648(2006)34[211:
Trends Ecol Evol 25:180–189. https://doi.org/10.
UADRSA]2.0.CO;2
1016/j.tree.2009.08.002
Ainslie M (2010) Principles of sonar performance
Barthel S, Belton S, Raymond CM, Giusti M (2018)
modelling. Springer, Heidelberg
Fostering children’s connection to nature through
Aletta F, Kang J (2015) Soundscape approach integrating
authentic situations: the case of saving salamanders at
noise mapping techniques: a case study in Brighton,
school. Front Psychol 9:928. https://doi.org/10.3389/
UK. Noise Mapp 1:1–12. https://doi.org/10.1515/
fpsyg.2018.00928
noise-2015-0001
Basner M, Clark C, Hansell A et al (2017) Aviation noise
Amorim MCP, Vasconcelos RO, Fonseca PJ (2015) Fish
impacts: state of the science. Noise Health 19:41–50.
sounds and mate choice. In: Ladich F (ed) Sound com-
https://doi.org/10.4103/nah.NAH_104_16
munication in fishes. Springer, Vienna, pp 1–33
Behr O, van Helversen O (2004) Bat serenades - complex
Anderson RC, Herrera M, Ilangakoon AD et al (2020)
courtship songs of the sac-winged bat (Saccopteryx
Cetacean bycatch in Indian Ocean tuna gillnet
bilineata). Behav Ecol Sociobiol 56:106–115. https://
fisheries. Endanger Species Res 41:39–53. https://doi.
doi.org/10.1007/s00265-004-0768-7
org/10.3354/esr01008
Bennet-Clark HC (1970) The mechanism and efficiency of
Andrew RK, Howe BM, Mercer JA (2011) Long-term
sound production in mole crickets. J Exp Biol 52:619–
trends in ship traffic noise for four sites off the north
652. https://doi.org/10.1242/jeb.52.3.619
American west coast. J Acoust Soc Am 129:642–651.
Berglund B, Hassmén P, Job RFS (1996) Sources and
https://doi.org/10.1121/1.3518770
effects of low-frequency noise. J Acoust Soc Am 99:
Arch VS, Grafe TU, Narins PM (2008) Ultrasonic signal-
2985–3002. https://doi.org/10.1121/1.414863
ling by a Bornean frog. Biol Lett 4:19–22. https://doi.
Bermúdez-Cuamatzin E, Delamore Z, Verbeek L et al
org/10.1098/rsbl.2007.0494
(2020) Variation in diurnal patterns of singing activity
Attenborough K, Bashir I, Hill TJ, Taherzadeh S (2012)
between urban and rural great tits. Front Ecol Evol 8:
Noise control by roughness-induced ground effects
24. https://doi.org/10.3389/fevo.2020.00246
and vegetative cover. In: Environmental noise propa-
Bernardini M, Fredianelli L, Fidecaro F et al (2019) Noise
gation 2012: definitions, measuring and control
assessment of small vessels for action planning in canal
aspects. Institute of Acoustics, London, pp 16–26
cities. Environments 6:31. https://doi.org/10.3390/
Au WWL (1993) The sonar of dolphins. Springer,
environments6030031
New York
Bertucci F, Guerra AS, Sturny V et al (2020) A prelimi-
Au WWL, Banks K (1998) The acoustics of the snapping
nary acoustic evaluation of three sites in the lagoon of
shrimp Synalpheus parneomeris in Kaneohe Bay. J
Bora Bora, French Polynesia. Environ Biol Fishes 103:
Acoust Soc Am 103:41–47. https://doi.org/10.1121/1.
891–902. https://doi.org/10.1007/s10641-020-01000-8
423234
Boatman ND, Brickle NW, Hart JD et al (2004) Evidence
Aylor D (1972) Sound transmission through vegetation in
for the indirect effects of pesticides on farmland birds.
relation to leaf area density, leaf width, and breadth of
Ibis (Lond 1859) 146:131–143. https://doi.org/10.
canopy. J Acoust Soc Am 51:411–413. https://doi.org/
1111/j.1474-919X.2004.00347.x
10.1121/1.1912852
Boelman NT, Asner GP, Hart PJ, Martin RE (2007) Multi-
Azar JF, Bell D (2016) Acoustic features within a forest
trophic invasion resistance in Hawaii: bioacoustics,
community of native and introduced species in
field surveys, and airborne remote sensing. Ecol Appl
New Zealand. Emu 116:22–31. https://doi.org/10.
17:2137–2144. https://doi.org/10.1890/07-0004.1
1071/MU14095
Boersma HF (1997) Characterization of the natural ambi-
Badino A, Borelli D, Gaggero T et al (2012) Noise emitted
ent sound environment: measurements in open agricul-
from ships: impact inside and outside the vessels.
tural grassland. J Acoust Soc Am 101:2104–2110.
Procedia Soc Behav Sci 48:868–879. https://doi.org/
https://doi.org/10.1121/1.418141
10.1016/j.sbspro.2012.06.1064
Bohnenstiehl DR, Lillis A, Eggleston DB (2016) The
Bagočius D, Narščius A (2018) Method for the simplistic
curious acoustic behavior of estuarine snapping
modelling of the acoustic footprint of the vessels in the
shrimp: temporal patterns of snapping shrimp sound
shallow marine area. MethodsX 5:1010–1016. https://
in sub-tidal oyster reef habitat. PLoS One 11:
doi.org/10.1016/j.mex.2018.08.011
e0143691. https://doi.org/10.1371/journal.pone.
Bailey H, Senior B, Simmons D et al (2010) Assessing
0143691
underwater noise levels during pile-driving at an off-
Bolgan M, O’Brien J, Winfield IJ, Gammell M (2016) An
shore windfarm and its potential effects on marine
investigation of inland water soundscapes: which sonic
mammals. Mar Pollut Bull 60:888–897. https://doi.
org/10.1016/j.marpolbul.2010.01.003

sources influence acoustic levels? Proc Meet Acoust Buzzetti F, Brizio C, Pavan G (2020) Beyond the audible:
27:070004. https://doi.org/10.1121/2.0000260 wide band (0-125 kHz) field investigation on Italian
Bolgan M, Amorim MCP, Fonseca PJ et al (2018a) Acous- Orthoptera (Insecta) songs. Biodivers J 11:443–496.
tic complexity of vocal fish communities: a field and https://doi.org/10.31396/Biodiv.Jour.2020.11.2.443.
controlled validation. Sci Rep 8:10559. https://doi.org/ 496
10.1038/s41598-018-28771-6 Cai M, Yao Y, Wang H (2018) Urban traffic noise maps
Bolgan M, O’Brien J, Chorazyczewska E et al (2018b) under 3D complex building environments on a super-
The soundscape of the Arctic Charr spawning grounds computer. J Adv Transp 2018:7031418. https://doi.
in lotic and lentic environments: can passive acoustic org/10.1155/2018/7031418
monitoring be used to detect spawning activities? Bio- Caorsi VZ, Both C, Cechin S et al (2017) Effects of traffic
acoustics 27:57–85. https://doi.org/10.1080/09524622. noise on the calling behavior of two Neotropical hylid
2017.1286262 frogs. PLoS One 12:e0183342. https://doi.org/10.
Bolin K (2009) Prediction method for wind-induced vege- 1371/journal.pone.0183342
tation noise. Acta Acust United Acust 95:607–619. Carson R (1962) Silent Spring. Houghton Mifflin Com-
https://doi.org/10.3813/AAA.918189 pany, Boston
Bond AB, Diamond J (2005) Geographic and ontogenetic Caruso F, Alonge G, Bellia G et al (2017) Long-term
variation in the contact calls of the kea (Nestor monitoring of dolphin biosonar activity in deep pelagic
notabilis). Behaviour 142:1–20. https://doi.org/10. waters of the Mediterranean Sea. Sci Rep 7:4321.
1163/1568539053627721 https://doi.org/10.1038/s41598-017-04608-6
Borelli D, Gaggero T, Rizzuto E, Schenone C (2016) Catchpole CK, Slater PJR (2008) Bird song: biological
Holistic control of ship noise emissions. Noise Mapp themes and variations. Cambridge University Press,
3:107–119. https://doi.org/10.1515/noise-2016-0008 Cambridge
Bourgeois K, Curé C, Legrand J et al (2007) Morphologi- Cato DH (2008) Ocean ambient noise: its measurement
cal versus acoustic analysis: what is the most efficient and its significance to marine animals. In: Proceedings
method for sexing yelkouan shearwaters Puffinus of the Institute of Acoustics. Institute of Acoustics,
yelkouan? J Ornithol 148:261–269. https://doi.org/10. Southampton, pp 1–9
1007/s10336-007-0127-3 Cerchio S, Dahlheim M (2001) Variation in feeding
Bowling DL, Garcia M, Dunn JC et al (2017) Body size vocalizations of humpback whales Megaptera
and vocalization in primates and carnivores. Sci Rep 7: novaeangliae from southeast Alaska. Bioacoustics
41070. https://doi.org/10.1038/srep41070 11:277–295. https://doi.org/10.1080/09524622.2001.
Bozkurt TS, Demirkale SY (2017) The field study and 9753468
numerical simulation of industrial noise mapping. J Chabert T, Colin A, Aubin T et al (2015) Size does matter:
Build Eng 9:60–75. https://doi.org/10.1016/j.jobe. crocodile mothers react more to the voice of smaller
2016.11.007 offspring. Sci Rep 5:15547. https://doi.org/10.1038/
Brady J (1974) The physiology of insect circadian srep15547
rhythms. Adv Insect Phys 10:1–115. https://doi.org/ Challender DWS, MacMillan DC (2014) Poaching is more
10.1016/S0065-2806(08)60129-0 than an enforcement problem. Conserv Lett 7:484–
Bregman AS (1990) Auditory scene analysis: the percep- 494. https://doi.org/10.1111/conl.12082
tual organization of sound. The MIT Press, Cambridge Chapman NR, Price A (2011) Low frequency deep ocean
Brown A, Garg S, Montgomery J (2019) Automatic rain ambient noise trend in the northeast Pacific Ocean. J
and cicada chorus filtering of bird acoustic data. Appl Acoust Soc Am 129:EL161–EL165. https://doi.org/10.
Soft Comput 81:105501. https://doi.org/10.1016/j. 1121/1.3567084
asoc.2019.105501 Charrier I, Mathevon N, Jouventin P, Aubin T (2001)
Brunetti AE, Saravia AM, Barrionuevo JS, Reichle S Acoustic communication in a black-headed gull col-
(2017) Silent sounds in the Andes: underwater ony: how do chicks identify their parents? Ethology
vocalizations of three frog species with reduced tym- 107:961–974. https://doi.org/10.1046/j.1439-0310.
panic middle ears (Anura: Telmatobiidae: 2001.00748.x
Telmatobius). Can J Zool 95:335–343. https://doi.org/ Chaudhary A, Burivalova Z, Koh LP, Hellweg S (2016)
10.1139/cjz-2016-0177 Impact of forest management on species richness:
Burbidge T, Parson T, Caycedo-Rosales PC et al (2015) global meta-analysis and economic trade-offs. Sci
Playbacks revisited: asymmetry in behavioural Rep 6:23954. https://doi.org/10.1038/srep23954
response across an acoustic boundary between two Cheney DL, Seyfarth RM (1996) Function and intention in
parapatric bird species. Behaviour 152:1933–1951. the calls of non-human primates. In: Runciman WG,
https://doi.org/10.1163/1568539X-00003309 Smith JM, Dunbar RIM (eds) Proceedings of the Brit-
Buscaino G, Ceraulo M, Pieretti N et al (2016) Temporal ish Academy, Evolution of social behaviour patterns in
patterns in the soundscape of the shallow waters of a primates and man, vol 88. Oxford University Press,
Mediterranean marine protected area. Sci Rep 6:34230. Oxford, pp 59–76
https://doi.org/10.1038/srep34230 Cheney DL, Seyfarth RM (2018) Flexible usage and social
function in primate vocalizations. Proc Natl Acad Sci

USA 115:1974–1979. https://doi.org/10.1073/pnas. De Queiroz MB (2018) How does the zoo soundscape
1717572115 affect the zoo experience for animals and visitors?
Ciceran M, Murray AM, Rowell G (1994) Natural varia- University of Salford, Manchester
tion in the temporal patterning of calling song structure Deichmann JL, Hernández-Serna A, Delagado CJA et al
in the field cricket Gryllus pennsylvanicus: effects of (2017) Soundscape analysis and acoustic monitoring
temperature, age, mass, time of day, and nearest neigh- document impacts of natural gas exploration on biodi-
bor. Can J Zool 72:38–42. https://doi.org/10.1139/ versity in a tropical forest. Ecol Indic 74:39–48. https://
z94-006 doi.org/10.1016/j.ecolind.2016.11.002
Clark CW (1982) The acoustic repertoire of the southern Dentressangle F, Aubin T, Mathevon N (2012) Males use
right whale, a quantitative analysis. Anim Behav 30: time whereas females prefer harmony: individual call
1060–1071. https://doi.org/10.1016/S0003-3472(82) recognition in the dimorphic blue-footed booby. Anim
80196-6 Behav 84:413–420. https://doi.org/10.1016/j.anbehav.
Clark CJ (2021) Ways that animal wings produce sound. 2012.05.012
Integr Comp Biol 61:696–709. https://doi.org/10. Dey M, Krishnaswamy J, Morisaka T, Kelkar N (2019)
1093/icb/icab008 Interacting effects of vessel noise and shallow river
Clausen KT, Wahlberg M, Beedholm K et al (2010) Click depth elevate metabolic stress in Ganges river
communication in harbour porpoises Phocoena dolphins. Sci Rep 9:15426. https://doi.org/10.1038/
phocoena. Bioacoustics 20:1–28. https://doi.org/10. s41598-019-51664-1
1080/09524622.2011.9753630 Di GQ, Lin QL, Li ZG, Kang J (2014) Annoyance and
Coquereau L, Grall J, Chauvaud L et al (2016) Sound activity disturbance induced by high-speed railway and
production and associated behaviours of benthic conventional railway noise: a contrastive case study.
invertebrates from a coastal habitat in the north-east Environ Health 13:12. https://doi.org/10.1186/1476-
Atlantic. Mar Biol 163:127. https://doi.org/10.1007/ 069X-13-12
s00227-016-2902-2 Dingle C, Halfwerk W, Slabbekoorn H (2008) Habitat-
Courts R, Erbe C, Wellard R et al (2020) Australian long- dependent song divergence at subspecies level in the
finned pilot whales (Globicephala melas) emit stereo- grey-breasted wood-wren. J Evol Biol 21:1079–1089.
typical, variable, biphonic, multi-component, and https://doi.org/10.1111/j.1420-9101.2008.01536.x
sequenced vocalisations, similar to those recorded in Dingle C, Poelstra JW, Halfwerk W et al (2010) Asym-
the northern hemisphere. Sci Rep 10: 20609. https:// metric response patterns to subspecies-specific song
doi.org/10.1038/s41598-020-74111-y differences in allopatry and parapatry in the gray-
Crowley SR, Pietruszka RD (1983) Aggressiveness and breasted wood-wren. Evolution 64:3537–3548.
vocalization in the leopard lizard (Gambelia https://doi.org/10.1111/j.1558-5646.2010.01089.x
wislizennii): the influence of temperature. Anim Doohan B, Fuller S, Parsons S, Peterson EE (2019) The
Behav 31:1055–1060. https://doi.org/10.1016/S0003- sound of management: acoustic monitoring for agricul-
3472(83)80012-8 tural industries. Ecol Indic 96:739–746. https://doi.org/
Cunnington GM, Fahrig L (2010) Plasticity in the 10.1016/j.ecolind.2018.09.029
vocalizations of anurans in response to traffic noise. Dragoset B (2000) Introduction to air guns and air-gun
Acta Oecologica 36:463–470. https://doi.org/10.1016/ arrays. Lead Edge 19:892–897. https://doi.org/10.
j.actao.2010.06.002 1190/1.1438741
Cure C, Aubin T, Mathevon N (2009) Acoustic conver- Drozdova L, Butorina M, Kuklin D (2019) Evaluation and
gence and divergence in two sympatric burrowing reduction of the common effect of road and rail
nocturnal seabirds. Biol J Linn Soc 96:115–134. noise. In: Proceedings of the 26th International Con-
https://doi.org/10.1111/j.1095-8312.2008.01104.x gress on Sound and Vibration. International Institute of
Dahl PH, Dall’Osto DR, Farrell DM (2015) The underwa- Acoustics and Vibration, Montreal
ter sound field from vibratory pile driving. J Acoust Duarte MHL, Caliari EP, Scarpelli MDA et al (2019)
Soc Am 137:3544–3554. https://doi.org/10.1121/1. Effects of mining truck traffic on cricket calling activ-
4921288 ity. J Acoust Soc Am 146:656–664. https://doi.org/10.
Danielsen F, Heegaard M (1995) Impact of logging and 1121/1.5119125
plantation development on species diversity: a case Duffy JE (1996) Eusociality in coral-reef shrimp. Nature
study from Sumatra. In: Sandbukt Ø 381:512–514. https://doi.org/10.1038/381512a0
(ed) Management of tropical forests: towards an Duffy JE, Macdonald KS (1999) Colony structure of the
integrated perspective. Centre for Development and social snapping shrimp Synalpheus filidigitus in Belize.
the Environment, University of Oslo, Oslo, pp 73–92 J Crustac Biol 19:283–292. https://doi.org/10.1163/
Davies WJ, Adams MD, Bruce NS et al (2013) Perception 193724099X00097
of soundscapes: an interdisciplinary approach. Appl Dulvy NK, Fowler SL, Musick JA et al (2014) Extinction
Acoust 74:224–231. https://doi.org/10.1016/j. risk and conservation of the world’s sharks and rays.
apacoust.2012.05.010 Elife 3:e00590. https://doi.org/10.7554/eLife.00590
Duncan AJ, Maggi AL (2006) A consistent, user friendly
interface for running a variety of underwater acoustic

propagation codes. In: Proceedings of Acoustics. Erbe C, Parsons M, Duncan AJ, Allen K (2016c) Under-
Christchurch, 20–22 November 2006 water acoustic signatures of recreational swimmers,
Dunlop RA, Noad MJ, Cato DH, Stokes D (2007) The divers, surfers and kayakers. Acoust Aust 44:333–
social vocalization repertoire of east Australian migrat- 341. https://doi.org/10.1007/s40857-016-0062-7
ing humpback whales (Megaptera novaeangliae). J Erbe C, Wintner S, Dudley SFJ, Plön S (2016d) Revisiting
Acoust Soc Am 122:2893–2905. https://doi.org/10. acoustic deterrence devices: long-term bycatch data
1121/1.2783115 from South Africa’s bather protection nets. Proc Meet
Duque FG, Rodríguez-Saltos CA, Wilczynski W (2018) Acoust 27:010025. https://doi.org/10.1121/2.0000306
High-frequency vocalizations in Andean Erbe C, Dunlop R, Jenner KCS et al (2017) Review of
hummingbirds. Curr Biol 28:927–928. https://doi.org/ underwater and in-air sounds emitted by Australian
10.1016/j.cub.2018.07.058 and Antarctic marine mammals. Acoust Aust 45:179–
Dziak RP, Fox CG (2002) Evidence of harmonic tremor 241. https://doi.org/10.1007/s40857-017-0101-z
from a submarine volcano detected across the Pacific Erbe C, Williams R, Parsons M et al (2018) Underwater
Ocean basin. J Geophys Res 107:1–11. https://doi.org/ noise from airplanes: an overlooked source of ocean
10.1029/2001JB000177 noise. Mar Pollut Bull 137:656–661. https://doi.org/
Dziak RP, Haxel JH, Matsumoto H et al (2017) Ambient 10.1016/j.marpolbul.2018.10.064
sound at challenger deep, mariana trench. Oceanogra- Erbe C, Marley SA, Schoeman RP et al (2019) The effects
phy 30:186–197. https://doi.org/10.5670/oceanog. of ship noise on marine mammals: a review. Front Mar
2017.240 Sci 6:606. https://doi.org/10.3389/fmars.2019.00606
Erbe C (2009) Underwater noise from pile driving in Erbe C, Schoeman RP, Peel D, Smith JN (2021) It often
Moreton Bay, Qld. Acoust Aust 37:87–92 howls more than it chugs: wind versus ship noise under
Erbe C (2013) Underwater noise of small personal water- water in Australia’s maritime regions. J Mar Sci Eng 9:
craft (jet skis). J Acoust Soc Am 133:EL326–EL330. 1–27. https://doi.org/10.3390/jmse9050472
https://doi.org/10.1121/1.4795220 Erisman BE, Rowell TJ (2017) A sound worth saving:
Erbe C, King AR (2009) Modelling cumulative sound acoustic characteristics of a massive fish spawning
exposure around marine seismic surveys. J Acoust aggregation. Biol Lett 13:20170656. https://doi.org/
Soc Am 125:2443–2451. https://doi.org/10.1121/1. 10.1098/rsbl.2017.0656
3089588 Ernstes R, Quinn JE (2016) Variation in bird vocalizations
Erbe C, McPherson C (2012) Acoustic characterisation of across a gradient of traffic noise as a measure of an
bycatch mitigation pingers on shark control nets in altered urban soundscape. Cities Environ 8:7
Queensland, Australia. Endanger Species Res 19: European Environment Agency [EEA] (2014) Noise in
109–121. https://doi.org/10.3354/esr00467 Europe. Publications Office of the European Union,
Erbe C, McPherson C (2017) Underwater noise from Luxembourg
geotechnical drilling and standard penetration testing. Farina A, Gage SH (2017) Ecoacoustics: the ecological
J Acoust Soc Am 142:EL281–EL285. https://doi.org/ role of sounds. Wiley, Hoboken
10.1121/1.5003328 Farina A, Righini R, Fuller S et al (2021) Acoustic com-
Erbe C, MacGillivray A, Williams R (2012) Mapping plexity indices reveal the acoustic communities of the
cumulative noise from shipping to inform marine spa- old-growth Mediterranean forest of Sasso Fratino Inte-
tial planning. J Acoust Soc Am 132:EL423–EL428. gral Natural Reserve (Central Italy). Ecol Indic 120:
https://doi.org/10.1121/1.4758779 106927. https://doi.org/10.1016/j.ecolind.2020.
Erbe C, McCauley RD, McPherson C, Gavrilov A (2013) 106927
Underwater noise from offshore oil production vessels. Feng AS, Narins PM, Xu CH et al (2006) Ultrasonic
J Acoust Soc Am 133:EL465–EL470. https://doi.org/ communication in frogs. Nature 440:333–336. https://
10.1121/1.4802183 doi.org/10.1038/nature04416
Erbe C, Williams R, Sandilands D, Ashe E (2014) Fenton MB, Portfors CV, Rautenback IL, Waterman JM
Identifying modeled ship noise hotspots for marine (1998) Compromises: sound frequencies used in echo-
mammals of Canada’s Pacific region. PLoS One 9: location by aerial-feeding bats. Can J Zool 76:1174–
e89820. https://doi.org/10.1371/journal.pone.0089820 1182. https://doi.org/10.1139/z98-043
Erbe C, Verma A, McCauley R et al (2015) The marine Fernández-Juricic E, Campagna C, Enriquez V, Ortiz CL
soundscape of the Perth Canyon. Prog Oceanogr 137: (1999) Vocal communication and individual variation
38–51. https://doi.org/10.1016/j.pocean.2015.05.015 in breeding south American sea lions. Behaviour 136:
Erbe C, Liong S, Koessler MW et al (2016a) Underwater 495–517. https://doi.org/10.1163/156853999501441
sound of rigid-hulled inflatable boats. J Acoust Soc Am Ferreira LM, Oliveira EG, Lopes LC et al (2018) What do
139:EL223–EL227. https://doi.org/10.1121/1. insects, anurans, birds, and mammals have to say about
4954411 soundscape indices in a tropical savanna. J Ecoacoust
Erbe C, McCauley R, Gavrilov A et al (2016b) The under- 2: #PVH6YZ. https://doi.org/10.22261/jea.pvh6yz
water soundscape around Australia. In: Proceedings of Fichtel C (2020) Monkey alarm calling: it ain’t all referen-
Acoustics. Brisbane, 9–11 November 2016. tial, or is it? Anim Behav Cogn 7:101–107. https://doi.
org/10.26451/abc.07.02.04.2020

Findlay CR, Ripple HD, Coomber F et al (2018) Mapping Gazioğlu C, Müftüoğlu AE, Demir V et al (2015) Connec-
widespread and increasing underwater noise pollution tion between ocean acidification and sound propaga-
from acoustic deterrent devices. Mar Pollut Bull 135: tion. Int J Environ Geoinformatics 2:16–26. https://doi.
1042–1050. https://doi.org/10.1016/j.marpolbul.2018. org/10.30897/ijegeo.303538
08.042 Gerstein ER, Trygonis V, McCulloch S et al (2014)
Flint EL, Minot EO, Perry PE, Stafford KJ (2014) A Female north Atlantic right whales produce gunshot
survey of public attitudes towards barking dogs in sounds. J Acoust Soc Am 135:2369. https://doi.org/10.
New Zealand. N Z Vet J 62:321–327. https://doi.org/ 1121/1.4877814
10.1080/00480169.2014.921852 Gibbs JP, Breisch AR (2001) Climate warming and calling
Fournet MEH, Gabriele CM, Sharpe F et al (2018) Feed- phenology of frogs near Ithaca, New York, 1900-1999.
ing calls produced by solitary humpback whales. Mar Conserv Biol 15:1175–1178. https://doi.org/10.1046/j.
Mamm Sci 34:851–865. https://doi.org/10.1111/mms. 1523-1739.2001.0150041175.x
12485 Gil D, Honarmand M, Pascual J et al (2015) Birds living
Fox CG, Matsumoto H, Lau TKA (2001) Monitoring near airports advance their dawn chorus and reduce
Pacific Ocean seismicity from an autonomous hydro- overlap with aircraft noise. Behav Ecol 26:435–443.
phone array. J Geophys Res 106:4183–4206. https:// https://doi.org/10.1093/beheco/aru207
doi.org/10.1029/2000JB900404 Giles JC, Davis JA, McCauley RD, Kuchling G (2009)
Francis CD, Newman P, Taff BD et al (2017) Acoustic Voice of the turtle: the underwater acoustic repertoire
environments matter: synergistic benefits to humans of the long-necked freshwater turtle, Chelodina
and ecological communities. J Environ Manag 203: oblonga. J Acoust Soc Am 126:434–443. https://doi.
245–254. https://doi.org/10.1016/j.jenvman.2017. org/10.1121/1.3148209
07.041 Gill SA, Bierema AMK (2013) On the meaning of alarm
Franco LS, Shanahan DF, Fuller RA (2017) A review of calls: a review of functional reference in avian alarm
the benefits of nature experiences: more than meets the calling. Ethology 119:449–461. https://doi.org/10.
eye. Int J Environ Res Public Health 14:864. https:// 1111/eth.12097
doi.org/10.3390/ijerph14080864 Gordon J, Gillespie D, Potter J et al (2003) A review of the
Freeman SE, Freeman LA, Giorli G, Haas AF (2018) effects of seismic surveys on marine mammals. Mar
Photosynthesis by marine algae produces sound, Technol Soc J 37:16–34. https://doi.org/10.4031/
contributing to the daytime soundscape on coral reefs. 002533203787536998
PLoS One 13:e0201766. https://doi.org/10.1371/jour Gordon TAC, Radford AN, Davidson IK et al (2019)
nal.pone.0201766 Acoustic enrichment can enhance fish community
Gadziola MA, Grimsley JMS, Faure PA, Wenstrup JJ development on degraded coral reef habitat. Nat
(2012) Social vocalizations of big brown bats vary Commun 10:5414. https://doi.org/10.1038/s41467-
with behavioral context. PLoS One 7:e44550. https:// 019-13186-2
doi.org/10.1371/journal.pone.0044550 Gottesman BL, Francomano D, Zhao Z et al (2020)
Gage SH, Axel AC (2014) Visualization of temporal Acoustic monitoring reveals diversity and surprising
change in soundscape power of a Michigan lake habitat dynamics in tropical freshwater soundscapes. Freshw
over a 4-year period. Ecol Inform 21:100–109. https:// Biol 65:117–132. https://doi.org/10.1111/fwb.13096
doi.org/10.1016/j.ecoinf.2013.11.004 Grafe TU (2005) Anuran choruses as communication
Galeotti P, Sacchi R, Fasola M, Ballasina D (2005) Do networks. In: McGregor G (ed) Animal communica-
mounting vocalisations in tortoises have a communi- tion networks. Cambridge University Press,
cation function? A comparative analysis. Herpetol J Cambridge, pp 277–299
15:61–71 Greta M, Louena S, Arianna A, et al (2019) Prediction of
Garrett JK, Blondel P, Godley BJ et al (2016) Long-term off-site noise levels reduction in open-air music events
underwater sound measurements in the shipping noise within densely populated urban areas. In: INTER-
indicator bands 63 Hz and 125 Hz from the port of NOISE 2019 MADRID - 48th International Congress
Falmouth Bay, UK. Mar Pollut Bull 110:438–448. and Exhibition on Noise Control Engineering. Interna-
https://doi.org/10.1016/j.marpolbul.2016.06.021 tional Institute of Noise Control Engineering, Madrid,
Gavrilov A (2018) Propagation of underwater noise from 16–19 June 2019
an offshore seismic survey in Australia to Antarctica: Gulyas K, Pinte G, Augusztinovicz F, et al (2002) Active
measurements and modelling. Acoust Aust 46:143– noise control in agricultural machines. In: Proceedings
149. https://doi.org/10.1007/s40857-018-0131-1 of the 2002 International Conference on Noise and
Gavrilov AN, Parsons MJG (2014) A MATLAB tool for Vibration Engineering, ISMA, Leuven, 16–18 Septem-
the characterisation of recorded underwater sounds ber 2002
(CHORUS). Acoust Aust 42:190–196 Halfwerk W, Slabbekoorn H (2009) A behavioural mech-
Gavrilov AN, McCauley RD, Gedamke J (2012) Steady anism explaining noise-dependent frequency use in
inter and intra-annual decrease in the vocalization fre- urban birdsong. Anim Behav 78:1301–1307. https://
quency of Antarctic blue whales. J Acoust Soc Am doi.org/10.1016/j.anbehav.2009.09.015
131:4476–4480. https://doi.org/10.1121/1.4707425

Harris SA, Shears NT, Radford CA (2016) Ecoacoustic Hole DG, Perkins AJ, Wilson JD et al (2005) Does organic
indices as proxies for biodiversity on temperate reefs. farming benefit biodiversity? Biol Conserv 122:113–
Methods Ecol Evol 7:713–724. https://doi.org/10. 130. https://doi.org/10.1016/j.biocon.2004.07.018
1111/2041-210X.12527 Holt DE, Johnston CE (2015) Traffic noise masks acoustic
Hart PJ, Hall R, Ray W et al (2015) Cicadas impact bird signals of freshwater stream fish. Biol Conserv 187:
communication in a noisy tropical rainforest. Behav 27–33. https://doi.org/10.1016/j.biocon.2015.04.004
Ecol 26:839–842. https://doi.org/10.1093/beheco/ Insley SJ, Phillips AV, Charrier I (2010) A review of
arv018 social recognition in pinnipeds. Aquat Mamm 29:
Hatch L, Clark C, Merrick R et al (2008) Characterizing 181–201. https://doi.org/10.1578/
the relative contributions of large vessels to total ocean 016754203101024149
noise fields: a case study using the Gerry E. Studds Intergovernmental Panel on Climate Change [IPCC]
Stellwagen Bank National Marine Sanctuary. Environ (2014) Climate change 2014 synthesis report. Contri-
Manag 42:735–752. https://doi.org/10.1007/s00267- bution of working groups I, II and III on the fifth
008-9169-4 assessment report of the intergovernmental panel on
Hauser DDW, Laidre KL, Stafford KM et al (2016) climate change. IPCC, Geneva
Decadal shifts in autumn migration timing by Pacific Intergovernmental Science-Policy Platform on Biodiver-
Arctic beluga whales are related to delayed annual sea sity and Ecosystem Services [IPBES] (2019) Summary
ice formation. Glob Change Biol 23:2206–2217. for policymakers of the global assessment report on
https://doi.org/10.1111/gcb.13564 biodiversity and ecosystem services of the intergovern-
Haver SM, Klinck H, Nieukirk SL et al (2017) The not-so- mental science-policy platform on biodiversity and
silent world: measuring Arctic, Equatorial and Antarc- ecosystem services. PBES Secretariat, Bonn
tic soundscapes in the Atlantic Ocean. Deep Res I International Organization for Standardization [ISO]
Oceanogr Res Pap 122:95–104. https://doi.org/10. (2014) International Standard 12913-1 acoustics -
1016/j.dsr.2017.03.002 soundscape - Part 1: definition and conceptual frame-
Herberholz J, Schmitz B (1999) Flow visualisation and work. International Organization for Standardization,
high speed video analysis of water jets in snapping Geneva
shrimp (Alpheus heterochaelis). J Comp Physiol A International Organization for Standardization [ISO]
185:41–49. https://doi.org/10.1007/s003590050364 (2017) International Standard 18405 underwater
Herman LM (2017) The multiple functions of male song acoustics - terminology. International Organization
within the humpback whale (Megaptera novaeangliae) for Standardization, Geneva
mating system: review, evaluation, and synthesis. Biol Iversen RTS, Perkins PJ, Dionne RD (1963) An indication
Rev 92:1795–1818. https://doi.org/10.1111/brv.12309 of underwater sound production by squid. Nature 199:
Hermannsen L, Beedholm K, Tougaard J, Madsen PT 250–251. https://doi.org/10.1038/199250a0
(2014) High frequency components of ship noise in Jacobs SR, Terhune JM (2002) The effectiveness of acous-
shallow water with a discussion of implications for tic harassment devices in the Bay of Fundy, Canada:
harbor porpoises (Phocoena phocoena). J Acoust Soc seal reactions and a noise exposure model. Aquat
Am 136:1640–1653. https://doi.org/10.1121/1. Mamm 28:147–158
4893908 Jahn O, Ganchev TD, Marques MI, Schuchmann KL
Hermannsen L, Tougaard J, Beedholm K et al (2015) (2017) Automated sound recognition provides insights
Characteristics and propagation of airgun pulses in into the behavioral ecology of a tropical bird. PLoS
shallow water with implications for effects on small One 12:e0169041. https://doi.org/10.1371/journal.
marine mammals. PLoS One 10:e0133436. https://doi. pone.0169041
org/10.1371/journal.pone.0133436 Jézéquel Y, Bonnel J, Coston-Guarini J, Chauvaud L
Herzing DL (1996) Vocalizations and associated underwa- (2019) Revisiting the bioacoustics of European spiny
ter behavior of free-ranging Atlantic spotted dolphins, lobsters Palinurus elephas: comparison of antennal
Stenella frontalis and bottlenose dolphins, Tursiops rasps in tanks and in situ. Mar Ecol Prog Ser 615:
truncatus. Aquat Mamm 22:61–79. https://doi.org/10. 143–157. https://doi.org/10.3354/meps12935
12966/abc.02.02.2015 Johnson JB, Lees JM, Yepes H (2006) Volcanic eruptions,
Hiley HM, Perry S, Hartley S, King SL (2017) What’s lightning, and a waterfall: differentiating the menagerie
occurring? Ultrasonic signature whistle use in Welsh of infrasound in the Ecuadorian jungle. Geophys Res
bottlenose dolphins (Tursiops truncatus). Bioacoustics Lett 33:L06308. https://doi.org/10.1029/
26:25–35. https://doi.org/10.1080/09524622.2016. 2005GL025515
1174885 Joo W, Gage SH, Kasten EP (2011) Analysis and interpre-
Hoffmann M, Belant JL, Chanson JS et al (2011) The tation of variability in soundscapes along an urban–
changing fates of the world’s mammals. Philos Trans rural gradient. Landsc Urban Plan 103:259–276.
R Soc B 366:2598–2610. https://doi.org/10.1098/rstb. https://doi.org/10.1016/j.landurbplan.2011.08.001
2011.0116 Kaatz IM (2002) Multiple sound-producing mechanisms
in teleost fishes and hypotheses regarding their

behavioural significance. Bioacoustics 12:230–233. Knight L, Ladich F (2014) Distress sounds of thorny
https://doi.org/10.1080/09524622.2002.9753705 catfishes emitted underwater and in air: characteristics
Kaatz IM (2011) How fishes use sound: quiet to loud and and potential significance. J Exp Biol 217:4068–4078.
simple to complex signalling. In: Farrell AP https://doi.org/10.1242/jeb.110957
(ed) Encyclopedia of fish physiology. Elsevier, San Knowlton RE, Moulton JM (1963) Sound production in
Diego, pp 684–691 the snapping shrimps Alpheus (Crangon) and
Kaiser F, Rohde T (2013) Orlando theme park acoustics - a Synalpheus. Biol Bull 125:311–331. https://doi.org/
soundscape analysis. In: Internoise. International Con- 10.2307/1539406
gress and Exposition on Noise Control Engineering, Knudsen VO, Alford RS, Emling JW (1948) Underwater
15–18 September 2013, Austrian Noise Abatement ambient noise. J Mar Res 7:410–429
Association, Innsbruck Koschinski S, Culik BM, Henriksen OD et al (2003)
Kariel HG (1990) Factors affecting response to noise in Behavioural reactions of free-ranging propoises and
outdoor recreational environments. Can Geogr 34: seals to the noise of a simulated 2 MW windpower
142–149. https://doi.org/10.1111/j.1541-0064.1990. generator. Mar Ecol Prog Ser 265:263–273. https://doi.
tb01259.x org/10.3354/meps265263
Kasten EP, Gage SH, Fox J, Joo W (2012) The remote Krause BL (1987) Bio-acoustics: habitat ambience & eco-
environmental assessment laboratory’s acoustic logical balance. Signal 57:14–16
library: an archive for studying soundscape ecology. Krause BL (1993) The niche hypothesis: a virtual
Ecol Inform 12:50–67. https://doi.org/10.1016/j. symphony of animal sounds, the origins of musical
ecoinf.2012.08.001 expression and the health of habitats. Soundscape
Kasumyan AO (2008) Sounds and sound production in Newsl 6:6–10
fishes. J Ichthyol 48:981–1030. https://doi.org/10. Krause B (2008) Anatomy of the soundscape: evolving
1134/S0032945208110039 perspectives. J Audio Eng Soc 56:73–80
Katz J, Hafner SD, Donovan T (2016) Tools for automated Krause B (2012) The great animal orchestra. Little, Brown
acoustic monitorig with the R package monitoR. Bio- and Company, Boston
acoustics 25:197–210. https://doi.org/10.1080/ Krause B, Farina A (2016) Using ecoacoustic methods to
09524622.2016.1138415 survey the impacts of climate change on biodiversity.
Kawakita S, Ichikawa K (2019) Automated classification Biol Conserv 195:245–254. https://doi.org/10.1016/j.
of bees and hornet using acoustic analysis of their flight biocon.2016.01.013
sounds. Apidologie 50:71–79. https://doi.org/10.1007/ Kruger DJD, Du Preez LH (2016) The effect of airplane
s13592-018-0619-6 noise on frogs: a case study on the critically
Kerr JT, Cihlar J (2004) Patterns and causes of species endangered Pickersgill’s reed frog (Hyperolius
endangerment in Canada. Ecol Appl 14:743–753. pickersgilli). Ecol Res 31:393–405. https://doi.org/10.
https://doi.org/10.1890/02-5117 1007/s11284-016-1349-8
Kleijn D, Kohler F, Báldi A et al (2009) On the relation- Kuehne LM, Padgham BL, Olden JD (2013) The
ship between farmland biodiversity and land-use inten- soundscapes of lakes across an urbanization gradient.
sity in Europe. Proc R Soc B 276:903–909. https://doi. PLoS One 8:e55661. https://doi.org/10.1371/journal.
org/10.1098/rspb.2008.1509 pone.0055661
Kleijn D, Rundlöf M, Scheper J et al (2011) Does conser- Kukulski B, Wszolek T, Mleczko D (2018) The impact of
vation on farmland contribute to halting the biodiver- fireworks noise on the acoustic climate in urban areas.
sity decline? Trends Ecol Evol 26:474–481. https://doi. Arch Acoust 43:697–705. https://doi.org/10.24425/
org/10.1016/j.tree.2011.05.009 aoa.2018.125163
Klenova AV (2015) Chick begging calls reflect degree of Ladich F (1997) Agonistic behaviour and significance of
hunger in three Auk species (Charadriiformes: sounds in vocalizing fish. Mar Freshw Behav Physiol
Alcidae). PLoS One 10:e0140151. https://doi.org/10. 29:87–108. https://doi.org/10.1080/
1371/journal.pone.0140151 10236249709379002
Klinck H, Nieukirk SL, Mellinger DK et al (2012) Sea- Ladich F, Winkler H (2017) Acoustic communication in
sonal presence of cetaceans and ambient noise levels in terrestrial and aquatic vertebrates. J Exp Biol 220:
polar waters of the north Atlantic. J Acoust Soc Am 2306–2317. https://doi.org/10.1242/jeb.132944
132:EL176–EL181. https://doi.org/10.1121/1. LaManna JA, Martin TE (2017) Logging impacts on avian
4740226 species richness and composition differ across latitudes
Kline LR, DeAngelis AI, McBride C et al (2020) and foraging and breeding habitat preferences. Biol
Sleuthing with sound: understanding vessel activity in Rev 92:1657–1674. https://doi.org/10.1111/brv.12300
marine protected areas using passive acoustic monitor- Larom D, Garstang M, Payne K et al (1997) The influence
ing. Mar Policy 120:104138. https://doi.org/10.1016/j. of surface atmospheric conditions on the range and
marpol.2020.104138 area reached by animal vocalizations. J Exp Biol 200:
Kloepper LN, Simmons AM (2014) Bioacoustic monitor- 421–431
ing contributes to an understanding of climate change. Lattenkamp EZ, Shields SM, Schutte M et al (2019) The
Acoust Tod 10:8–15 vocal repertoire of pale spear-nosed bats in a social

roosting context. Front Ecol Evol 7:116. https://doi. Martin SB, Cott PA (2016) The under-ice soundscape in
org/10.3389/fevo.2019.00116 Great Slave Lake near the city of Yellowknife, North-
Lengagne T, Slater PJB (2002) The effects of rain on west Territories, Canada. J Great Lakes Res 42:248–
acoustic communication: Tawny owls have good rea- 255. https://doi.org/10.1016/j.jglr.2015.09.012
son for calling less in wet weather. Proc R Soc London Martin SB, Popper AN (2016) Short- and long-term moni-
B 269:2121–2125. https://doi.org/10.1098/rspb.2002. toring of underwater sound levels in the Hudson River
2115 (New York, USA). J Acoust Soc Am 139:1886–1897.
Lengagne T, Jouventin P, Aubin T (1999) Finding one’s https://doi.org/10.1121/1.4944876
mate in a king penguin colony: efficiency of acoustic Mason NA, Burns KJ (2015) The effect of habitat and
communication. Behaviour 136:833–846. https://doi. body size on the evolution of vocal displays in
org/10.1163/156853999501595 Traupidae (tanagers), the largest family of songbirds.
Lewicki MS, Olshausen BA, Surlykke A, Moss CF (2014) Biol J Linn Soc 114:538–551. https://doi.org/10.1111/
Scene analysis in the natural environment. Front bij.12455
Psychol 5:199. https://doi.org/10.3389/fpsyg.2014. Matsumoto H, Bohnenstiel DR, Tournadre J et al (2014)
00199 Antarctic icebergs: a significant natural ocean sound
Lillis A, Perelman JN, Panyi A, Mooney TA (2017) Sound source in the southern hemisphere. Geochem Geophys
production patterns of big-clawed snapping shrimp Geosyst 15:3448–3458. https://doi.org/10.1002/
(Alpheus spp.) are influenced by time-of-day and social 2014GC005454
context. J Acoust Soc Am 142:3311–3320. https://doi. McKenna MF, Ross D, Wiggins SM, Hildebrand JA
org/10.1121/1.5012751 (2012) Underwater radiated noise from modern com-
Linke S, Gifford T, Desjonquères C (2020) Six steps mercial ships. J Acoust Soc Am 131:92–103. https://
towards operationalising freshwater ecoacoustic moni- doi.org/10.1121/1.3664100
toring. Freshw Biol 65:1–6. https://doi.org/10.1111/ McShane LJ, Estes JA, Riedman ML, Staedler MM (1995)
fwb.13426 Repertoire, structure, and individual variation of
Lorang MS, Tonolla D (2014) Combining active and pas- vocalizations in the sea otter. J Mammol 76:414–427.
sive hydroacoustic techniques during flood events for https://doi.org/10.2307/1382352
rapid spatial mapping of bedload transport patterns in McWilliam JN, Hawkins AD (2013) A comparison of
gravel-bed rivers. Fundam Appl Limnol 184:231–246. inshore marine soundscapes. J Exp Mar Bio Ecol
https://doi.org/10.1127/1863-9135/2014/0552 446:166–176. https://doi.org/10.1016/j.jembe.2013.
Ma BB, Nystuen JA, Lien RC (2005) Prediction of under- 05.012
water sound levels from rain and wind. J Acoust Soc McWilliam JN, McCauley RD, Erbe C, Parsons MJG
Am 117:3555–3565. https://doi.org/10.1121/1. (2017) Patterns of biophonic periodicity on coral
1910283 reefs in the Great Barrier Reef. Sci Rep 7:17459.
MacGillivray A, de Jong C (2021) A reference spectrum https://doi.org/10.1038/s41598-017-15838-z
model for estimating source levels of marine shipping Mellinger DK, Clark CW (2003) Blue whale
based on Automated Identification System data. J Mar (Balaenoptera musculus) sounds from the north Atlan-
Sci Eng 9:369. https://doi.org/10.3390/jmse9040369 tic. J Acoust Soc Am 114:1108–1119. https://doi.org/
Macharia JM, Thenya T, Ndiritu GG (2010) Management 10.1121/1.1593066
of highland wetlands in central Kenya: the importance Mercado IIIE (2018) The sonar model for humpback
of community education, awareness and eco-tourism in whale song revised. Front Psychol 9:1156. https://doi.
biodiversity conservation. Biodiversity 11:85–90. org/10.3389/fpsyg.2018.01156
https://doi.org/10.1080/14888386.2010.9712652 Merchant ND, Barton TR, Thompson PM et al (2013)
Mack AL, Jones J (2003) Low-frequency vocalizations by Spectral probability density as a tool for ambient
cassowaries (Casuarius spp.). Auk 120:1062–1068. noise analysis. J Acoust Soc Am 133:EL262–EL267.
https://doi.org/10.1093/auk/120.4.1062 https://doi.org/10.1121/1.4794934
Maglio A, Soares C, Bouzidi M et al (2015) Mapping Merchant ND, Fristrup KM, Johnson MP et al (2015)
shipping noise in the Pelagos Sanctuary (French part) Measuring acoustic habitats. Methods Ecol Evol 6:
through acoustic modelling to assess potential impacts 257–265. https://doi.org/10.1111/2041-210X.12330
on marine mammals. Sci Rep Port Cros Natl Park 29: Mikhalevsky PN (2001) Acoustics, Arctic. In: Steele JH,
167–185 Thorpe SA, Turekian KK (eds) Encyclopedia of ocean
Marchal J, Fabianek F, Scott C et al (2020) “bioacoustics”: sciences, vol 1. Elsevier, San Diego, pp 53–61
analyse audio recordings and automatically extract ani- Miksis-Olds JL, Bradley DL, Maggie Niu X (2013)
mal vocalizations. R package version 0.2.8 Decadal trends in Indian Ocean ambient sound. J
Marley SA, Erbe C, Salgado-Kent CP (2016) Underwater Acoust Soc Am 134:3464–3475. https://doi.org/10.
sound in an urban estuarine river: sound sources, 1121/1.4821537
soundscape contribution, and temporal variability. Mooney TA, Di Iorio L, Lammers M et al (2020) Listening
Acoust Aust 44:171–186. https://doi.org/10.1007/ forward: approaching marine biodiversity assessments
s40857-015-0038-z using acoustic methods. R Soc Open Sci 7:201287.
https://doi.org/10.1098/rsos.201287

Morton ES (1975) Ecological sources of selection on avian thesis, Naval Postgraduate School, Monterey. Avail-
sounds. Am Nat 109:17–34 able from https://apps.dtic.mil/sti/pdfs/ADA350428.
Mulard H, Aubin T, White JF et al (2009) Voice variance pdf (accessed on 21 June 2022)
may signify ongoing divergence among black-legged Obrist MK, Pavan G, Sueur J et al (2010) Bioacoustics
kittiwake populations. Biol J Linn Soc 97:289–297. approaches in biodiversity inventories. In: Eymann J,
https://doi.org/10.1111/j.1095-8312.2009.01198.x Degreef J, Häuser C et al (eds) Manual on field record-
Mullet TC, Gage SH, Morton JM, Huettmann F (2016) ing techniques and protocols for all taxa biodiverity
Temporal and spatial variation of a winter soundscape inventories. Abc Taxa, Brussels, pp 68–99
in south-central Alaska. Landsc Ecol 31:1117–1137. O'Connell-Rodwell CE, Arnason BT, Hart LA (2000)
https://doi.org/10.1007/s10980-015-0323-0 Seismic properties of Asian elephant (Elephas
Mullet TC, Farina A, Gage SH (2017a) The acoustic maximus) vocalizations and locomotion. J Acoust Soc
habitat hypothesis: an ecoacoustics perspective on spe- Am 108:3066–3072. https://doi.org/10.1121/1.
cies habitat selection. Biosemiotics 10:319–336. 1323460
https://doi.org/10.1007/s12304-017-9288-5 Odom KJ, Hall ML, Riebel K et al (2014) Female song is
Mullet TC, Morton JM, Gage SH, Huettmann F (2017b) widespread and ancestral in songbirds. Nat Commun 5:
Acoustic footprint of snowmobile noise and natural 3379. https://doi.org/10.1038/ncomms4379
quiet refugia in an Alaskan wilderness. Nat Areas J Owen MA, Swaisgood RR, Czekala NM et al (2004)
37:332–349. https://doi.org/10.3375/043.037.0308 Monitoring stress in captive giant pandas (Ailuropoda
Mumm CAS, Knörnschild M (2014) The vocal repertoire melanoleuca): behavioral and hormonal response to
of adult and neonate giant otters (Pteronura ambient noise. Zoo Biol 23:147–164. https://doi.org/
brasiliensis). PLoS One 9:e112562. https://doi.org/ 10.1002/zoo.10124
10.1371/journal.pone.0112562 Padua SM (1994) Conservationn awareness through an
Narins PM (1990) Seismic communication in anuran environmental education programme in the Atlantic
amphibians. Bioscience 40:268–274. https://doi.org/ forest of Brazil. Environ Conserv 21:145–151. https://
10.2307/1311263 doi.org/10.1017/S0376892900024577
Narins PM, Meenderink WF (2014) Climate change and Pagniello CMLS, Cimino MA, Terrill E (2019) Mapping
frog calls: long-term correlations along a tropical alti- fish chorus distributions in southern California using
tudinal gradient. Proc R Soc B 281:20140401. https:// an autonomous wave glider. Front Mar Sci 6:526.
doi.org/10.1098/rspb.2014.0401 https://doi.org/10.3389/fmars.2019.00526
National Park Service [NPS] (2000) Director’s order #47: Parks SE, Hamilton PK, Kraus SD, Tyack PL (2006) The
soundscape preservation and noise management. gunshot sound produced by male north Atlantic right
https://www.nps.gov/policy/DOrders/DOrder47.html. whales (Eubalaena glacialis) and its potential function
Accessed 5 Feb 2020 in reproductive advertisement. Mar Mamm Sci 21:
Nieukirk SL, Stafford KM, Mellinger DK et al (2004) 458–475. https://doi.org/10.1111/j.1748-7692.2005.
Low-frequency whale and seismic airgun sounds tb01244.x
recorded in the mid-Atlantic Ocean. J Acoust Soc Am Parks SE, Clark CW, Tyack PL (2007) Short- and long-
115:1832–1843. https://doi.org/10.1121/1.1675816 term changes in right whale calling behavior: the
Nieukirk SL, Mellinger DK, Moore SE et al (2012) potential effects of noise on acoustic communication.
Sounds from airguns and fin whales recorded in the J Acoust Soc Am 122:3725–3731. https://doi.org/10.
mid-Atlantic Ocean, 1999-2009. J Acoust Soc Am 1121/1.2799904
131:1102–1112. https://doi.org/10.1121/1.3672648 Parks SE, Miksis-Olds JL, Denes SL (2014) Assessing
Nijman V (2001) Effect of behavioural changes due to marine ecosystem acoustic diversity across ocean
habitat disturbance on density estimation of rain forest basins. Ecol Inform 21:81–88. https://doi.org/10.
vertebrates, as illustrated by gibbons (Primates: 1016/j.ecoinf.2013.11.003
Hylobatidae). In: Hillegers PJM, de Iongh HH (eds) Parsons MJG, Salgado-Kent CP, Marley SA et al (2016)
The balance between biodiversity conservation and Characterizing diversity and variation in fish choruses
sustainable use of tropical rain forests. Tropenbos in Darwin Harbour. ICES J Mar Sci 73:2058–2074.
Foundation, Wageningen, pp 217–226 https://doi.org/10.1093/icesjms/fsw037
Nityananda V, Bee MA (2011) Finding your mate at a Parsons MJG, Duncan AJ, Parsons SK, Erbe C (2020)
cocktail party: frequency separation promotes auditory Reducing vessel noise: an example of a solar-electric
stream segregation of concurrent voices in multi- passenger ferry. J Acoust Soc Am 147:3575–3583.
species frog choruses. PLoS One 6:e21191. https:// https://doi.org/10.1121/10.0001264
doi.org/10.1371/journal.pone.0021191 Parsons MJG, Erbe C, Meekan MG, Parsons SK (2021) A
Nystuen JA (1986) Rainfall measurements using underwa- review and meta-analysis of underwater noise radiated
ter ambient noise. J Acoust Soc Am 79:972–982. by small (<25 m length ) vessels. J Mar Sci Eng 9:827.
https://doi.org/10.1121/1.393695 https://doi.org/10.3390/jmse9080827
O’Neal D (1998) Comparison of the underwater ambient Pavan G (2017) Fundamentals of soundscape
noise measured in three large exhibits at the Monterey conservation. In: Farina A, Gage SH (eds)
Bay aquarium and in the inner Monterey Bay. M.Sc.
7 Analysis of Soundscapes as an Ecological Tool 263

Ecoacoustics. The ecological role of sound. Wiley, addressee, context, and behavior. Sci Rep 6:39419.
Hoboken, pp 235–258 https://doi.org/10.1038/srep39419
Pavan G, Priano M, De Carli P et al (1997) Stridulatory Prince P, Hill A, Covarrubias EP et al (2019) Deploying
organ and ultrasonic emission in certain species of acoustic detection algorithms on low-cost, open-source
Ponerine ants (Genus Ectatomma and Pachycondyla, sensors for environmental monitoring. Sensors 19:553.
Hymenoptera, Formicidae). Bioacoustics 8:209–221. https://doi.org/10.3390/s19030553
https://doi.org/10.1080/09524622.1997.9753363 Priyadarshani N, Castro I, Marsland S (2018) The impact
Paviotti M, Vogiatzis K (2012) On the outdoor annoyance of environmental factors in birdsong acquisition using
from scooter and motorbike noise in the urban envi- automated recorders. Ecol Evol 8:5016–5033. https://
ronment. Sci Total Environ 430:223–230. https://doi. doi.org/10.1002/ece3.3889
org/10.1016/j.scitotenv.2012.05.010 Putland RL, Mensinger AF (2020) Exploring the
Payne RS, McVay S (1971) Songs of humpback whales. soundscape of small freshwater lakes. Ecol Inform
Science 173:585–597. https://doi.org/10.1126/science. 55:101018. https://doi.org/10.1016/j.ecoinf.2019.
173.3997.585 101018
Payne KB, Langbauer WR Jr, Thomas EM (1986) Infra- Quadros S, Goulart VDL, Passos L et al (2014) Zoo visitor
sonic calls of the Asian elephant (Elephas maximus). effect on mammal behaviour: does noise matter? Appl
Behav Ecol Sociobiol 18:297–301. https://doi.org/10. Anim Behav Sci 156:78–84. https://doi.org/10.1016/j.
1007/BF00300007 applanim.2014.04.002
Payne CJ, Jessop TS, Guay PJ et al (2012) Population, Radford C, Jeffs A, Tindle C, Montgomery JC (2008)
behavioural and physiological responses of an urban Resonating sea urchin skeletons create coastal
population of black swans to an intense annual noise choruses. Mar Ecol Prog Ser 362:37–43. https://doi.
event. PLoS One 7:e45014. https://doi.org/10.1371/ org/10.3354/meps07444
journal.pone.0045014 Rashed A, Khan MI, Dawson JW et al (2009) Do
Phillips HRP, Newbold T, Purvis A (2017) Land-use hoverflies (Diptera: Syrphidae) sound like the
effects on local biodiversity in tropical forests vary Hymenoptera they morphologically resemble? Behav
between continents. Biodivers Conserv 26:2251– Ecol 20:396–402. https://doi.org/10.1093/beheco/
2270. https://doi.org/10.1007/s10531-017-1356-2 arn148
Picciulin M, Sebastianutto L, Fortuna CM et al (2016) Reber SA, Janisch J, Torregrosa K et al (2017) Formants
Are the 1/3-octave band 63- and 125 Hz noise levels provide honest acoustic cues to body size in American
predictive of vessel activity? The case in the Cres- alligators. Sci Rep 7:1816. https://doi.org/10.1038/
Lošinj Archipelago (northern Adreatic Sea, s41598-017-01948-1
Croatia). In: Popper AN, Hawkins A (eds) The effects Reed SE, Boggs JL, Mann JP (2010) SPreAD-GIS: an
of noise on aquatic life II. Springer, New York, pp ArcGIS toolbox for modeling the propagation of
821–828 engine noise in a wildland setting. Version 2.0. The
Pieretti N, Farina A, Morri D (2011) A new methodology Wilderness Society, San Franscisco, CA
to infer the singing activity of an avian community: the Reine KJ, Clarke DG, Dickerson C (2014) Characteriza-
Acoustic Complexity Index (ACI). Ecol Indic 11:868– tion of underwater sounds produced by hydraulic and
873. https://doi.org/10.1016/j.ecolind.2010.11.005 mechanical dredging operations. J Acoust Soc Am
Pijanowski BC, Farina A, Gage SH et al (2011a) What is 135:3280–3294. https://doi.org/10.1121/1.4875712
soundscape ecology? An introduction and overview of Ricci SW, Eggleston DB, Bohnenstiehl DR, Lillis A
an emerging new science. Landsc Ecol 26:1213–1232. (2016) Temporal soundscape patterns and processes
https://doi.org/10.1007/s10980-011-9600-8 in an estuarine reserve. Mar Ecol Prog Ser 550:25–
Pijanowski BC, Villanueva-Rivera LJ, Dumyahn SL et al 38. https://doi.org/10.3354/meps11724
(2011b) Soundscape ecology: the science of sound in Rice AN, Soldevilla MS, Quinlan JA (2017) Nocturnal
the landscape. Bioscience 61:203–216. https://doi.org/ patterns in fish chorusing off the coasts of Georgia and
10.1525/bio.2011.61.3.6 eastern Florida. Bull Mar Sci 93:455–474. https://doi.
Pola YV, Snowdon CT (1975) The vocalizations of pygmy org/10.5343/bms.2016.1043
marmosets (Cebuella pygmaea). Anim Behav 23:826– Righini R, Pavan G (2020) First assessment of the
842. https://doi.org/10.1016/0003-3472(75)90108-6 soundscape of the Integral Nature Reserve “Sasso
Polidori C, Pavan G, Ruffato G et al (2013) Common Fratino” in the Central Apennine, Italy. Biodiversity
features and species-specific differences in stridulatory 21:4–14. https://doi.org/10.1080/14888386.2019.
organs and stridulation patterns of velvet ants 1696229
(Hymenoptera: Mutillidae). Zool Anz 252:457–468. Roberts C (2009) Construction noise and vibration impact
https://doi.org/10.1016/j.jcz.2013.01.003 on sensitive premises. In: Proceedings of Acoustics
Potočnik I, Poje A (2010) Noise pollution in forest envi- 2009. Australian Acoustical Society, Adelaide, 23–25
ronment due to forest operations. Croat J For Eng 31: November 2009
137–148 Robillard T, Montealegre-Z F, Desutter-Grandcolas L et al
Prat Y, Taub M, Yovel Y (2016) Everyday bat (2013) Mechanisms of high-frequency song generation
vocalizations contain information about emitter, in brachypterous crickets and the role of ghost
264 R. P. Schoeman et al.

frequencies. J Exp Biol 216:2001–2011. https://doi. exhibit. Adv Acoust Vib 2012:402130. https://doi.
org/10.1242/jeb.083964 org/10.1155/2012/402130
Rocha RC Jr, Clapham PJ, Ivashchenko YV (2014) Emp- Scheifele PM, Johnson MT, Kretschmer L et al (2012b)
tying the oceans: a summary of industrial whaling Ambient habitat noise and vibration at the Georgia
catches in the 20th century. Mar Fish Rev 76:37–48. aquarium. J Acoust Soc Am 132:EL88–EL94. https://
https://doi.org/10.7755/MFR.76.4.3 doi.org/10.1121/1.4734387
Rochat JL, Reiter D (2016) Highway traffic noise. Acoust Schmitz B (2002) Sound production in crustacea with
Today 12:38–47 special reference to the Alpheidae. In: Wiese K
Römer H, Lewald J (1992) High-frequency sound trans- (ed) The crustacean nervous system. Springer, Berlin,
mission in natural habitats: implications for the evolu- pp 536–547
tion of insect acoustic communication. Behav Ecol Schoeman RP, Erbe C, Plön S (2022) Underwater chatter
Sociobiol 29:437–444. https://doi.org/10.1007/ for the win: a first assessment of underwater
BF00170174 soundscapes in two bays along the Eastern Cape
Ross D (1976) Mechanics of underwater noise. Pergamon coast of South Africa. J Mar Sci Eng 10:746. https://
Press, Oxford doi.org/10.3390/jmse10060746
Rossi T, Connell SD, Nagelkerken I (2016a) Silent oceans: Schusterman RJ, Gentry R, Schmook J (1966) Underwater
ocean acidification impoverishes natural soundscapes vocalizations by sea lions: social and mirror stimuli.
by altering sound production of the world’s noisiest Science 154:540–542. https://doi.org/10.1126/science.
marine invertebrate. Proc R Soc B 283:20153046. 154.3748.540
https://doi.org/10.1098/rspb.2015.3046 Sèbe F, Aubin T, Boué A, Poindron P (2008) Mother-
Rossi T, Nagelkerken I, Pistevos JCA, Connell SD young vocal communication and acoustic recognition
(2016b) Lost at sea: ocean acidification undermines promote preferential nursing in sheep. J Exp Biol 211:
larval fish orientation via altered hearing and marine 3554–3562. https://doi.org/10.1242/jeb.016055
soundscape modification. Biol Lett 12:20150937. Sertlek HÖ, Slabbekoorn H, ten Cate C, Ainslie MA
https://doi.org/10.1098/rsbl.2015.0937 (2019) Source specific sound mapping: spatial, tempo-
Rountree RA, Gilmore RG, Goudey CA et al (2006) ral and spectral distribution of sound in the Dutch
Listening to fish: applications of passive acoustics to North Sea. Environ Pollut 247:1143–1157. https://
fisheries science. Fisheries 31:433–446. https://doi. doi.org/10.1016/j.envpol.2019.01.119
org/10.1577/1548-8446(2006)31[433:LTF]2.0.CO;2 Shannon G, McKenna MF, Angeloni LM et al (2016) A
Rountree RA, Bolgan M, Juanes F (2019) How can we synthesis of two decades of research documenting the
understand freshwater soundscapes without fish sound effects of noise on wildlife. Biol Rev 91:982–1005.
descriptions? Fisheries 44:137–143. https://doi.org/10. https://doi.org/10.1111/brv.12207
1002/fsh.10190 Sherwen SL, Hemsworth PH (2019) The visitor effect on
Sales GD, Wilson KJ, Spencer KEV (1988) Environmen- zoo animals: implications and opportunities for zoo
tal ultrasound in laboratories and animal houses: a animal welfare. Animals 9:366. https://doi.org/10.
possible cause for concern in the welfare and use of 3390/ani9060366
laboratory animals. Lab Anim 22:369–375. https://doi. Slabbekoorn H (2004) Habitat-dependent ambient noise:
org/10.1258/002367788780746188 consistent spectral profiles in two African forest types.
Salmi R, Hammerschmidt K, Doran-Sheehy DM (2013) J Acoust Soc Am 116:3727–3733. https://doi.org/10.
Western gorilla vocal repertoire and contextual use of 1121/1.1811121
vocalizations. Ethology 119:831–847. https://doi.org/ Slabbekoorn H, Bouton N (2008) Soundscape orientation:
10.1111/eth.12122 a new field in need of sound investigation. Anim Behav
Sáncez-Pérez LA, Sánchez-Fernández LP, Suárez- 76:5–8. https://doi.org/10.1016/j.anbehav.2008.06.010
Guerra S, Carbajal-Hernández JJ (2013) Aircraft class Slabbekoorn H, Smith TB (2002) Habitat-dependent song
identification based on take-off noise signal segmenta- divergence in the little greenbul: an analysis of envi-
tion in time. Expert Syst Appl 40:5148–5159. https:// ronmental selection pressures on acoustic signals. Evo-
doi.org/10.1016/j.eswa.2013.03.017 lution 56:1849–1858. https://doi.org/10.1111/j.0014-
Scarpelli MDA, Ribeiro MC, Teixeira FZ et al (2020) 3820.2002.tb00199.x
Gaps in terrestrial soundscape research: it’s time to Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (2018)
focus on tropical wildlife. Sci Total Environ 707: Effects of anthropogenic noise on animals. Springer,
135403. https://doi.org/10.1016/j.scitotenv.2019. New York
135403 Slabbekoorn H, Dalen J, de Haan D et al (2019)
Schafer RM (1969) The new soundscape: a handbook for Population-level consequences of seismic surveys on
the modern music teacher. Associated Music fishes: an interdisciplinary challenge. Fish Fish 20:
Publishers, New York 653–685. https://doi.org/10.1111/faf.12367
Schafer RM (1977) The tuning of the world. Knopf, Soloway AG, Dahl PH (2014) Peak sound pressure and
New York sound exposure level from underwater explosions in
Scheifele PM, Clark JG, Sonstrom K et al (2012a) Ball- shallow water. J Acoust Soc Am 136:218–223. https://
room music spillover into a beluga whale aquarium doi.org/10.1121/1.4892668
7 Analysis of Soundscapes as an Ecological Tool 265

Soltis J (2010) Vocal communication in African elephants Thiebault A, Charrier I, Aubin T et al (2019) First evi-
(Loxodonta africana). Zoo Biol 29:192–209. https:// dence of underwater vocalisations in hunting penguins.
doi.org/10.1002/zoo.20251 PeerJ 7:e8340. https://doi.org/10.7717/peerj.8240
Southworth M (1969) The sonic environment of cities. Thomas JA, Kuechle VB (1982) Quantitative analysis of
Environ Behav 1:49–70. https://doi.org/10.1177/ Weddell seal (Leptonychotes weddelli) underwater
001391656900100104 vocalizations at McMurdo Sound, Antarctica. J Acoust
Staaterman ER, Claverie T, Patek SN (2010) Soc Am 72:1730–1738. https://doi.org/10.1121/1.
Disentangling defense: the function of spiny lobster 388667
sounds. Behaviour 147:235–258. https://doi.org/10. Tonolla D, Acuña V, Lorang MS et al (2010) A field-based
1163/000579509X12523919243428 investigation to examine underwater soundscapes of
Stack DW, Peter N, Manning RE, Fristrup KM (2011) five common river habitats. Hydrol Process 24:3146–
Reducing visitor noise levels at Muir Woods National 3156. https://doi.org/10.1002/hyp.7730
Momument using experimental management. J Acoust Tonolla D, Lorang MS, Heutschi K et al (2011) Character-
Soc Am 129:1375–1380. https://doi.org/10.1121/1. ization of spatial heterogeneity in underwater
3531803 soundscapes at the river segment scale. Limnol
Stirling I, Calvert W, Spencer C (1987) Evidence of ste- Oceanogr 56:2319–2333. https://doi.org/10.4319/lo.
reotyped underwater vocalizations of male Atlantic 2011.56.6.2319
walruses (Odobenus rosmarus rosmarus). Can J Zool Torigoe K (1982) A study of the world soundscape project.
65:2311–2321. https://doi.org/10.1139/z87-348 York University, Toronto, Ontario
Sueur J (2002) Cicada acoustic communication: potential Tougaard J, Henriksen OD, Miller LA (2009) Underwater
sound partitioning in a multispecies community from noise from three types of offshore wind turbines: esti-
Mexico (Hemiptera: Cicadomorpha: Cicadidae). Biol J mation of impact zones for harbor porpoises and harbor
Linn Soc 75:379–394. https://doi.org/10.1046/j.1095- seals. J Acoust Soc Am 125:3766–3773. https://doi.
8312.2002.00030.x org/10.1121/1.3117444
Sueur J (2018) Sound analysis and synthesis with Towsey M, Wimmer J, Williamson I, Roe P (2014) The
R. Springer, Cham use of acoustic indices to determine avian species
Sueur J, Aubin T, Simonis C (2008a) “Seewave”: a free richness in audio-recordings of the environment. Ecol
modular tool for sound analysis and synthesis. Bio- Inform 21:110–119. https://doi.org/10.1016/j.ecoinf.
acoustics 18:213–226. https://doi.org/10.1080/ 2013.11.007
09524622.2008.9753600 Tripovich JS, Hall-Aspland S, Charrier I, Arnould JPY
Sueur J, Pavoine S, Hamerlynck O, Duvail S (2008b) (2012) The behavioural response of Australian fur
Rapid acoustic survey for biodiversity appraisal. seals to motor boat noise. PLoS One 7:e37228.
PLoS One 3:e4065. https://doi.org/10.1371/journal. https://doi.org/10.1371/journal.pone.0037228
pone.0004065 Truax B (1984) Acoustic communication. Ablex Publish-
Sueur J, Krause B, Farina A (2019) Climate change is ing Corporation, Norwood
breaking earth’s beat. Trends Ecol Evol 34:971–973. Truax B (1996) Soundscape, acoustic communication and
https://doi.org/10.1016/j.tree.2019.07.014 environmental sound composition. Contemp Music
Talandier J, Hyvernaud O, Reymond D, Okal EA (2006) Rev 15:49–65
Hydroacoustic signals generated by parked and drifting van der Lee GH, Desjonquères C, Sueur J, Kraak MHS,
icebergs in the southern Indian and Pacific oceans. Verdonschot PFM (2020) Freshwater ecoacoustics:
Geophys J Int 165:817–834. https://doi.org/10.1111/j. Listening to the ecological status of multi-stressed
1365-246X.2006.02911.x lowland waters. Ecol Indic 113:106252. https://doi.
Tasker ML, Amundin M, Andre M, et al (2010) Marine org/10.1016/j.ecolind.2020.106252
Strategy Framework Directive - Task Group 11: under- van Opzeeland I, van Parijs S, Bornemann H et al (2010)
water noise and other forms of energy. JRC Scientific Acoustic ecology of Antarctic pinnipeds. Mar Ecol
and Technical Report EUR 24341 EN - 2010. Office Prog Ser 414:267–291. https://doi.org/10.3354/
for Official Publications of the European Communities, meps08683
Luxembourg Van Parijs SM, Kovacs KM (2002) In-air and underwater
Tennessen JB, Parks SE (2016) Acoustic propagation vocalizations of eastern Canadian harbour seals, Phoca
modeling indicates vocal compensation in noise vitulina. Can J Zool 80:1173–1179. https://doi.org/10.
improves communication range for North Atlantic 1139/z02-088
right whales. Endanger Species Res 30:225–237. Veirs S, Veirs V, Wood JD (2016) Ship noise extends to
https://doi.org/10.3354/esr00738 frequencies used for echolocation by endangered killer
Tepp G, Chadwick WW Jr, Haney MM et al (2019) whales. PeerJ 4:e1657. https://doi.org/10.7717/peerj.
Hydroacoustic, seismic, and bathymetric observations 1657
of the 2014 submarine eruption at Ahyi seamount, Vergne AL, Pritz MB, Mathevon N (2009) Acoustic com-
Mariana arc. Geochem Geophys Geosyst 20:3608– munication in crocodilians: from behaviour to brain.
3627. https://doi.org/10.1029/2019GC008311 Biol Rev 84:391–411. https://doi.org/10.1111/j.
1469-185X.2009.00079.x
266 R. P. Schoeman et al.

Vergne AL, Aubin T, Taylor P, Mathevon N (2011) Wiseman S, Wilson PS, Sepulveda F (2014) Measuring a
Acoustic signals of baby black caimans. Zoology soundscape of the captive southern white rhinoceros
114:313–320. https://doi.org/10.1016/j.zool.2011. (Ceratotherium simum simum). In: Davy J, Don C,
07.003 McMinn T et al (eds) Proceedings of 43rd International
Vidović A, Štimac I, Zečević-Tadić R (2017) Aircraft Congress on Noise Control Engineering. The
noise monitoring in function of flight safety and air- Australian Acoustical Society, Melbourne, 16–19
craft model determination. J Adv Transp 2017: November 2014
2850860. https://doi.org/10.1155/2017/2850860 Wolfenden AD, Slabbekoorn H, Kluk K, de Kort SR
Villanueva-Rivera LJ (2014) Eleutherodactylus frogs (2019) Aircraft sound exposure leads to song fre-
show frequency but no temporal partitioning: quency decline and elevated aggression in wild
implications for the acoustic niche hypothesis. PeerJ chiffchaffs. J Anim Ecol 88:1720–1731. https://doi.
2:e496. https://doi.org/10.7717/peerj.496 org/10.1111/1365-2656.13059
Villanueva-Rivera LJ, Pijanowski BC (2018) Wrege PH, Rowland ED, Keen S, Shiu Y (2017) Acoustic
“Soundecology”: soundscape ecology. R package ver- monitoring for conservation in tropical forests:
sion 1.3.3 examples from forest elephants. Methods Ecol Evol
Villanueva-Rivera LJ, Pijanowski BC, Doucette J, Pekin B 8:1292–1301. https://doi.org/10.1111/2041-210X.
(2011) A primer of acoustic analysis for landscape 12730
ecologists. Landsc Ecol 26:1233–1246. https://doi. Wyatt R (2008) Review of existing data on underwater
org/10.1007/s10980-011-9636-9 sounds produced by the oil and gas industry. Joint
von Benda-Beckmann AM, Aarts G, Sertlek HO et al Industry Programme on Sound and Aquatic Life,
(2015) Assessing the impact of underwater clearance London
of unexploded ordnance on harbour porpoises Wysocki LE, Davidson JW III, Smith ME et al (2007)
(Phocoena phocoena) in the southern North Sea. Effects of aquaculture production noise on hearing,
Aquat Mamm 41:503–523. https://doi.org/10.1578/ growth, and disease resistance of rainbow trout
AM.41.4.2015.503 Oncorhynchus mykiss. Aquaculture 272:687–697.
Walker SE, Cade WH (2003) The effects of temperature https://doi.org/10.1016/j.aquaculture.2007.07.225
and age on calling song in a field cricket with a com- Xu W, Dong L, Caruso F, Gong Z, Li S (2020) Long-term
plex calling song, Teleogryllus oceanicus (Orthoptera: and large-scale spatiotemporal patterns of soundscape
Gryllidae). Can J Zool 81:1414–1420. https://doi.org/ in a tropical habitat of the Indo-Pacific humpback
10.1139/z03-106 dolphin (Sousa chinensis). PLoS One 15:e0236938.
Wark JD (2015) The influence of the sound environment https://doi.org/10.1371/journal.pone.0236938
on the welfare of zoo-housed callitrichine monkeys. Yin S, McCowan B (2004) Barking in domestic dogs:
Case Western Reserve University, Cleveland context specificity and individual identification. Anim
Weilgart L, Whitehead H (1993) Coda communication by Behav 68:343–355. https://doi.org/10.1016/j.anbehav.
sperm whales (Physeter macrocephalus) off the 2003.07.016
Galápagos Islands. Can J Zool 71:744–752. https:// Yip DA, Bayne EM, Sólymos P et al (2017) Sound atten-
doi.org/10.1139/z93-098 uation in forest and roadside environments:
Wellard R, Pitman RL, Durban J, Erbe C (2020) Cold call: implications for avian point-count surveys. Condor
the acoustic repertoire of Ross Sea killer whales Ornithol Appl 119:73–84. https://doi.org/10.1650/
(Orcinus orca, Type C) in McMurdo Sound, CONDOR-16-93.1
Antarctica. R Soc Open Sci 7:191228. https://doi.org/ York D (1994) Recreational-boating disturbances of natu-
10.1098/rsos.191228 ral communities and wildlife: an annotated bibliogra-
Wemmer C, von Ebers M, Scow K (1976) An analysis of phy. Biological Report 22. U.S. Department of the
the chuffing vocalization in the polar bear (Ursus Interior, Washington
maritimus). J Zool 180:425–439. https://doi.org/10. Young BA (1991) Morphological basis of “growling” in
1111/j.1469-7998.1976.tb04686.x the king cobra, Ophiophagus hannah. J Exp Zool 260:
Wenz GM (1962) Acoustic ambient noise in the ocean: 275–287. https://doi.org/10.1002/jez.1402600302
spectra and sources. J Acoust Soc Am 34:1936–1956. Young BA (2003) Snake bioacoustics: toward a richer
https://doi.org/10.1121/1.1909155 understanding of the behavioral ecology of snakes. Q
White K, Arntzen M, Walker F et al (2017) Noise annoy- Rev Biol 78:303–325. https://doi.org/10.1086/377052
ance caused by continuous descent approaches com- Young BA, Brown IP (1993) On the acoustic profile of the
pared to regular descent procedures. Appl Acoust 125: rattlesnake rattle. Amphibia Reptilia 14:373–380.
194–198. https://doi.org/10.1016/j.apacoust.2017. https://doi.org/10.1163/156853893X00066
04.008 Zhang F, Zhao J, Feng AS (2017) Vocalizations of female
Williams R, Erbe C, Ashe E, Clark CW (2015) frogs contain nonlinear characteristics and individual
Quiet(er) marine protected areas. Mar Pollut Bull signatures. PLoS One 12:e0174815. https://doi.org/10.
100:154–161. https://doi.org/10.1016/j.marpolbul. 1371/journal.pone.0174815
2015.09.012
7 Analysis of Soundscapes as an Ecological Tool 267

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
8 Detection and Classification Methods for Animal Sounds

Julie N. Oswald, Christine Erbe, William L. Gannon, Shyam Madhusudhana, and Jeanette A. Thomas

J. N. Oswald (*), Scottish Oceans Institute, University of St Andrews, St Andrews, Fife, UK. e-mail: jno@st-andrews.ac.uk
C. Erbe, Centre for Marine Science & Technology, Curtin University, Perth, WA, Australia. e-mail: c.erbe@curtin.edu.au
W. L. Gannon, Department of Biology and Graduate Studies, Museum of Southwestern Biology, University of New Mexico, Albuquerque, NM, USA. e-mail: wgannon@unm.edu
S. Madhusudhana, K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, USA. e-mail: shyamm@cornell.edu
Jeanette A. Thomas (deceased) contributed to this chapter while at the Department of Biological Sciences, Western Illinois University-Quad Cities, Moline, IL, USA.

8.1 Introduction

Researchers have a natural tendency to classify biological systems into categories. For example, organisms can be classified based on biome, ecosystem, taxon, phylogeny, niche, demographic class, behavior type, etc., and this allows complex systems to be organized. Categorization also can make recognition of patterns easier and assist in understanding the ways in which biological systems work. Classification provides a convenient method for comparing features, making systematic measurements, testing hypotheses, and performing statistical analyses.

Bioacousticians have categorized sounds produced by animals for decades, and new methods for classification continue to be developed (Horn and Falls 1996; Beeman 1998). Animals produce many different types of sounds that span orders of magnitude along the dimensions of time, frequency, and amplitude. For example, the repertoire of marine mammal acoustic signals includes broadband echolocation clicks as short as 10 μs in duration and with energy up to 200 kHz, as well as narrowband tonal sounds as low as 10–20 Hz, lasting more than 10 s. Song birds and some species of baleen whales arrange individual sounds into patterns called song and repeat these patterns for hours or days. Some mammal species produce distinctive, stereotyped sounds (e.g., chipmunks, dogs, and blue whales), while others produce signals with high variability (e.g., mimicking birds, primates, and dolphins).
Because animals produce so many different types of sounds, developing algorithms to detect, recognize, and classify a wide range of acoustic signals can be challenging. In the past, detection and classification tasks were performed by an experienced bioacoustician who listened to the sounds and visually reviewed spectrographic displays (e.g., for birds by Baptista and Gaunt 1997; chipmunks by Gannon and Lawlor 1989; baleen whales by Stafford et al. 1999; and delphinids by Oswald et al. 2003). Before the advent of digital signal-analysis, data were analyzed while enduring the acrid smell of etched Kay Sona-Graph paper and piles of 8-s printouts removed from a spinning recording drum littering laboratory tables and floors. Output from a long-duration sound had to be spliced together (see Chap. 1). Many bioacoustic studies generated an enormous amount of data, which made this manual review process at best inefficient, and at worst impossible to accomplish.

For decades, scientists have worked to automate the process of detecting and classifying sounds into categories or types. Automated classification involves three main steps: (1) detection of potential sounds of interest, (2) extraction of relevant acoustic characteristics (or, features) from these sounds, and (3) classification of these sounds as produced by a particular species, sex, age, or individual. Methods for the automated detection of sounds have progressed quickly with technological advances in digital recording (see Chap. 2). Likewise, the extraction of sound variables useful in analysis has expanded with an increasing amount of information provided by new technology. For instance, where features such as maximum frequency or time between sounds originally were measured manually off sonagraph paper, devices today allow for measuring these, and many more variables, automatically or semi-automatically using computer software. Now, derived variables, such as time difference between individual signal elements, frequency modulation, running averages of sound frequency, and harmonic structure can be easily obtained for classifying the sounds in a repertoire.

Some of the earliest methods used for automated detection and classification included energy threshold detectors (e.g., Clark 1980) and matched filters (e.g., Freitag and Tyack 1993; Stafford et al. 1998; Dang et al. 2008; Mankin et al. 2008). These methods were used to detect and classify simple, stereotypical sounds produced by species such as the Asian longhorn beetle (Anoplophora glabripennis), cane toads (Rhinella marina), blue whales (Balaenoptera spp.), and fin whales (Balaenoptera physalus). Once sounds are detected, they can be organized into groups, or classified, based on selected acoustic characteristics. For example, development of methods for detection and automated signal processing of bat sounds led to a variety of automated, off-the-shelf, ready-to-deploy bat detectors that detect and classify sounds by species (Fenton and Jacobson 1973; Gannon et al. 2004). These detectors can be very useful in addressing biological or management issues in ecology, evolution, and impact mitigation. While the accuracy and robustness of automated approaches are always a matter of concern (Herr et al. 1997; Parsons et al. 2000), modern techniques promise much improved recognition performances that could rival manual analyses (e.g., Brown and Smaragdis 2009).

Multivariate statistical methods can be powerful for classification of sounds produced by species with variable vocal repertoires because they can identify complex relationships among many acoustic features (see Chap. 9). With the advent of powerful personal computers in the 1980s and 1990s, the use of multivariate techniques became popular for classifying bird sounds (e.g., Sparling and Williams 1978; Martindale 1980a, b). Since then, enormous effort has been expended to develop these and other automatic methods for the detection of sounds produced by many taxa and their classification into discrete categories, such as species, population, sex, or individual. These days, there are applications (apps) for smartphones that use advanced algorithms to automatically detect and recognize sounds. For example, the BirdNET app detects and classifies bird song—similar to the Shazam app for music—and provides a listing of the top-ranked matching species. It includes almost 1000 of the most common species of North America and Europe. A similar app, Song Sleuth, recognizes songs of nearly 200 bird species likely to be heard in North America and also provides references for species identification, such as the David Sibley Bird Reference (Sibley 2000), allowing the user to "dig into" the bird's biology and conservation needs.
In this chapter, we present an overview of methods for detection and classification of sounds along with examples from different taxa. No single method is appropriate for every research project and so the strengths and weaknesses of each method are summarized to help guide decisions on which methods are better suited for particular research scenarios. Because algorithms for statistical analyses, automated detection, and computer classification of animal sounds are advancing rapidly, this is not a comprehensive overview of methods, but rather a starting point to stimulate further investigations.

8.2 Qualitative Naming and Classification of Animal Sounds

Prior to computer-assisted detection and classification of animal sounds, bioacousticians used various qualitative methods to categorize sounds.

8.2.1 Onomatopoeic Names

Frequently, researchers describe and name animal sounds based on their perception of the sound and thus based on their own language. This approach has been common in the study of terrestrial animals (in particular, birds) and marine mammals (in particular, pinnipeds and mysticetes). Researchers also have given onomatopoeic names to sounds. These are names that phonetically resemble the sound they describe. For example, the sounds of squirrels and chipmunks have been described as barks, chatters, chirps, and growls. The primate literature is also rich in these sorts of sound descriptions (e.g., the hack sequences and boom-hack sequences described for Campbell's monkeys, Cercopithecus campbelli; Ouattara et al. 2009). Bioacousticians studying humpback whales (Megaptera novaeangliae) have described a repertoire of sounds including barks, bellows, chirps, cries, croaks, groans, growls, grumbles, horns, moans, purrs, screams, shrieks, sighs, sirens, snorts, squeaks, thwops, trumpets, violins, wops, and yaps (Dunlop et al. 2007, 2013). While it is potentially convenient for researchers within a group to discuss sounds this way, it is more difficult for others, and perhaps impossible for foreign-language speakers, to recognize the sound type. An example of this difficulty in describing a sound is the ubiquitous rooster crow, which can be described by a US citizen as "cock-a-doodle-doo" and by a German citizen as "kikeriki". Roosters make the same sound, no matter in which country they live, yet their single sound has been named so differently, as has the bark of dogs (Fig. 8.1). Of course, onomatopoeic naming of sounds also fails when the sounds are outside of the human hearing range.

If the above was not confusing enough, bird calls have been described using onomatopoeic phrases. For example, the song of a white-throated sparrow (Zonotrichia albicollis) has been described in Canada as sounding like "O sweet Canada Canada Canada" and in New England, USA, as "Old Sam Peabody Peabody Peabody." Another example is the barred owl (Strix varia), which hoots "Who cooks for you? Who cooks for you all?".

8.2.2 Naming Sounds Based on Animal Behavior

Researchers sometimes name sounds based on observed and interpreted animal behavior. For example, the various echolocation signals described for insectivorous bats have been named "search clicks" (i.e., slow and regular clicks) while pursuing insect prey and "terminal feeding buzz" (i.e., accelerated click trains) during prey capture (Griffin et al. 1960). The bird and mammal literature is replete with sounds named for a behavior, such as the begging call of nestling chicks (Briskie et al. 1999; Leonard and Horn 2001), the contact call for isolated young (Kondo and Watanabe 2009), and the alarm call warning of a nearby predator (Zuberbuhler et al. 1999; Gill and Bierema 2013). In some cases, the function of sounds has been studied in detail, which justifies using their function in the name. Examples are feeding buzzes in echolocation or alarm calls in primates. However, naming sounds according to behavior can be misleading because a sound can be associated with several contexts. Names based on the associated behavior should really only be used after detailed studies of context-specificity of the calls in question.
Fig. 8.1 Dogs speak out. Labels used for dog barks in different countries (labels such as ouaf ouaf, guau guau, gahf gahf, bow wow, wan wan, guk guk, wai wai, and au au for countries including France, Spain, Russia, North America, Japan, Indonesia, Nigeria, and Brazil). Try it: say all the words out loud. Which words do you think sound most like a dog barking?
8.2.3 Naming Sounds Based on Mechanism of Sound Production

Some bioacousticians identify and classify sounds based on the mechanism of sound production. For example, one syllable in insect song corresponds to a single to- and fro-movement of a stridulatory anatomy or one cycle of a forewing opening and closing in the field cricket (Gryllus spp.). McLister et al. (1995) defined a note in chorusing frogs as the sound unit produced during a single expiration. Classifying sound types by their mode of production perhaps is less ambiguous and equivocal, but there are limited data on the mechanisms of sound production in many animals.

8.2.4 Naming Sounds Based on Spectro-Temporal Features

An alternative, but not necessarily better, way of naming sounds is based on their spectro-temporal features. For instance, in distinguishing two morphologically similar species of bats, Myotis californicus is referred to as a "50-kHz bat" and M. ciliolabrum as a "40-kHz bat," which describes the terminal frequency of the downsweep of their ultrasonic echolocation signals (Gannon et al. 2001). Under water, the most common sound recorded from southern right whales (Eubalaena australis) is a 1–2 s frequency-modulated (FM) upsweep from about 50–200 Hz, commonly recorded with overtones, and referred to in the literature as the upcall (Fig. 8.2; Clark 1982). Antarctic blue whales (Balaenoptera musculus intermedia) produce a Z-call, which consists of a 10-s constant frequency (also called constant-wave, CW) sound at 28 Hz, followed by a rapid FM downsweep to 18 Hz, where the sound continues for another 15-s CW component (Rankin et al. 2005).

While the measurement of features from spectrograms and waveforms can be expected to be more objective than onomatopoeic or functional naming, the appearance of a spectrogram, and thus the measurements made, depend on characteristics of the recording system, the time and frequency settings of the analysis, and the analysis algorithm used. This can make sounds look rather different at various scales and therefore lead to inconsistent classification.
Fig. 8.2 Spectrograms of southern right whale "upcall" (left; sampling frequency fs = 12 kHz, Fourier window length NFFT = 1200, 50% overlap, Hann window) and Antarctic blue whale "Z-call" (right; fs = 6 kHz, NFFT = 16384, 50% overlap, Hann window) recorded off southern Australia (Erbe et al. 2017)

An example of the confusion that can arise from different representations of sound is the boing sound made by minke whales (Balaenoptera acutorostrata), which was given an onomatopoeic name. In spectrograms, the boing might look like an FM sound (Fig. 8.3a), however, it is actually a series of rapid pulses (Rankin and Barlow 2005), similar to burst-pulse sounds produced by odontocetes (e.g., Wellard et al. 2015). As another example, the bioduck sound made by Antarctic minke whales (Balaenoptera bonaerensis) got its name because it resembles a duck's quack to human listeners (Risch et al. 2014). A spectrogram of the bioduck sound appears as a series of pulses; however, each pulse actually is a 0.3-s FM downswept tone from 300 to 100 Hz (Fig. 8.3b). As if this was not enough in terms of interesting sounds and odd names, dwarf minke whales produce the so-called star-wars sound, which is composed of a series of pulses with varying pulse rates (Gedamke et al. 2001). The different pulse rates make this sound appear as a mixture of broadband pulses and FM sounds in spectrograms, depending on the spectrogram settings (Fig. 8.3c). The sound name presumes the reader is familiar with the soundtrack of an American movie from the 1970s.

Fig. 8.3 Spectrograms of the dwarf minke whale boing (a: fs = 16 kHz, NFFT = 1024, 50% overlap, Hann window), the Antarctic minke whale bioduck sound (b: fs = 96 kHz, NFFT = 8192, 50% overlap, Hann window), and the dwarf minke whale star-wars sound (c: fs = 44 kHz, NFFT = 4096, 50% overlap, Hann window). Recordings a and b from Erbe et al. (2017), c from Gedamke et al. (2001)
8.2.5 Naming Sounds Based on Human Communication Patterns

The term "song" is perhaps the best-known example of using human communication labels in the description of animal sounds. The word "song" may be used to simply indicate long-duration displays of a specific structure. Songs of insects and frogs are relatively simple sequences, consisting of the same sound repeated over long periods of time. The New River tree frog (Trachycephalus hadroceps), for example, produces nearly 38,000 calls in a single night (Starnberger et al. 2014). Many frogs use trilling notes in mate attraction, which has been described as song, but switch to a different vocal pattern in aggressive territorial displays (Wells 2007). In some frog songs, different notes serve different purposes, with one type of note warding off competing males, and another attracting females. In birds and mammals, songs are often more complex, consisting of several successive sounds in a recognizable pattern. They appear to be used primarily for territorial defense or mate attraction (Bradbury and Vehrencamp 2011). Our statements in this chapter show one way to describe calls and songs in animals; however, it is important to note that borrowing terminology from human communication when studying animals can lead to confusion. The terms we discuss here are not well defined and are used differently by different authors. Make sure to pay close attention to these definitions when reading literature about animal communication.

Some ornithologists have used human-language properties further to describe the structure of bird song. Song may be broken down into phrases (also called motifs). Each phrase is composed of syllables, which consist of notes (or elements, the smallest building blocks; Catchpole and Slater 2008). Notes, syllables, and phrases are identified and defined based on their repeated occurrence. An entire taxon of birds (songbirds, Order Passeriformes) has been designated by ornithologists because of their use of these elaborate sounds for territorial defense and/or mate attraction. Birds of this taxon usually use sets of sounds that are repeated in an organized structure. In many species, males produce such songs continuously for several hours each day, producing thousands of songs in each performance. In the bird song literature, songs are distinguished from calls by their more complex and sustained nature, species-typical patterns, or syntax that governs their combination of syllables and notes into a song. Songs are under the influence of reproductive hormones and associated with courtship (Bradbury and Vehrencamp 2011). Bird song can vary geographically and over time (e.g., Fig. 8.4; Camacho-Alpizar et al. 2018). In contrast, calls are typically acoustically simple and serve non-reproductive, maintenance functions, such as coordination of parental duties, foraging, responding to threats of predation, or keeping members of a group in contact (Marler 2004).

Several terrestrial mammals have been reported to sing. For instance, adult male rock hyraxes (Procavia capensis) engage throughout most of the year in rich and complex vocalization behavior that is termed singing (Koren et al. 2008). These songs are complex signals and are composed of multiple elements (chucks, snorts, squeaks, tweets, and wails) that encode the identity, age, body mass, size, social rank, and hormonal status of the singer (Koren and Geffen 2009, 2011). Holy and Guo (2005) described ultrasonic sounds from male laboratory mice (Mus musculus) as song. Von Muggenthaler et al. (2003) reported that Sumatran rhinoceros (Dicerorhinus sumatrensis) produce a song composed of three sound types: eeps (simple short signals, 70 Hz–4 kHz), humpback whale-like sounds (100 Hz–3.2 kHz, varying in length, only produced by females), and whistle blows (loud, 17 Hz–8 kHz vocalizations followed by a burst of air with strong infrasonic content). Clarke et al. (2006) described the syntax and meaning of wild white-handed gibbon (Hylobates lar) songs.
Fig. 8.4 Geographic variation in birdsong. These spectrograms show a portion of song from Timberline wrens (Thryorchilus browni) recorded at four locations in Costa Rica (CBV = Cerro Buena Vista, CV = Cerro Vueltas, CCH = Cerro Chirripó, IV = Irazú Volcano) (Camacho-Alpizar et al. 2018). © Camacho-Alpizar et al.; https://doi.org/10.1371/journal.pone.0209508. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
Among marine mammals, blue, bowhead (Balaena mysticetus), fin, humpback, minke, and right whales, Weddell seals (Leptonychotes weddellii), harbor seals (Phoca vitulina), and walrus (Odobenus rosmarus) have all been reported to sing (Payne and Payne 1985; Sjare et al. 2003; McDonald et al. 2006; Stafford et al. 2008; Oleson et al. 2014; Crance et al. 2019). The songs of blue, bowhead, fin, minke, and right whales are simple compared to those of the humpback whale and little is known about the behavioral context of song in any marine mammal species besides the humpback whale. Humpback whales are well-known for their long, elaborate songs. These songs are composed of themes consisting of repetitions of phrases made up of patterns of units similar to syllables in bird song (Fig. 8.5; Payne and Payne 1985; Helweg et al. 1998). Winn and Winn (1978) suggested that only male baleen whales sing, as a means of reproductive display. Sjare et al. (2003) reported that Atlantic walrus produce two main songs: the coda song and the diving vocalization song that differ by their pattern of knocks, taps, and bell sounds.

Fig. 8.5 Spectrogram of the song structure of humpback whales, with sounds organized by theme, phrases, and units (Garland et al. 2017). © Acoustical Society of America, 2017. All rights reserved

Song production does not exclude the emission of non-song sounds and most singing species likely emit both. The non-song sounds of humpback and pygmy blue whales (Balaenoptera musculus brevicauda), for example, have been cataloged (e.g., Recalde-Salas et al. 2014, 2020). Some song units may resemble non-song sounds. Whether sounds are part of song or not, their detection and classification can be challenging when repertoires are large and possibly variable across time and space. Humpback whale songs, for example, vary by region and year (Cerchio et al. 2001; Payne and Payne 1985). Characterizing and describing the structure of song can be a difficult task even for the experienced bioacoustician. With the assistance of computer analysis tools, sound detection and classification may be more efficient.

8.3 Detection of Animal Sounds

The problem to be solved may seem simple. For example, a bioacoustician deployed an autonomous recorder in the field for a month, and after recovery of the gear, downloaded all data in the laboratory and now wants to pick all frog calls recorded in order to study the mating behavior of this species. Listening to the first few minutes of recording, the bioacoustician can easily hear the target species, but there are calls every few seconds—too many to pick by hand. So, the scientist looks for software tools to help detect all frog signals, and potentially sort them based on their acoustic features. The first step, signal detection, is discussed in Sect. 8.3; the second step, signal classification, is discussed in Sect. 8.4.

Automated signal detectors work by common principles. The raw input data are the ideally calibrated time series of pressure recorded with a microphone in air or hydrophone in water. There might be one or more pre-processing steps to filter or Fourier transform the data in successive time windows (see Chap. 4). The pre-processed time series is then fed into the detector, which computes a specific quantity from the acoustic data. This may be instantaneous energy, energy within a specified time window, entropy, or a correlation coefficient, as a few examples. Then, a detection threshold is applied. If the quantity exceeds the threshold, the signal is deemed present, otherwise not.

The threshold is commonly computed the following way:

E_th = Ē + γ σ_E

where E symbolizes the chosen quantity (e.g., energy), Ē is its mean value computed over a long time window (e.g., an entire file), σ_E is the standard deviation, and γ is a multiplier (integer or real). Setting a high threshold will result in only the strongest signals being detected and weaker ones being missed. Setting a low threshold will result in many false alarms, which are not signals. By varying γ, the ideal threshold may be found and the performance of the detector may be assessed (see Sect. 8.3.6).
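To make the threshold rule concrete, the short Python sketch below applies E_th = Ē + γ σ_E to a generic per-frame detection metric. It is only an illustration of the rule, not code from any published detector; the use of root-mean-square energy as the metric, the 100-ms frame length, and γ = 3 are assumptions chosen for the synthetic example.

```python
import numpy as np

def frame_metric(x, frame_len):
    """Split a signal into consecutive frames and return the RMS energy of each frame."""
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt(np.mean(frames ** 2, axis=1))

def threshold_detections(metric, gamma=3.0):
    """Flag frames whose metric exceeds E_th = mean + gamma * std, computed over the whole file."""
    e_th = metric.mean() + gamma * metric.std()
    return metric > e_th, e_th

# Synthetic example: 10 s of background noise with two louder 0.2-s "calls"
fs = 1000                                   # sampling rate in Hz (assumed)
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.1, 10 * fs)
for start in (2 * fs, 7 * fs):              # insert two tonal bursts
    x[start:start + 200] += np.sin(2 * np.pi * 100 * np.arange(200) / fs)

metric = frame_metric(x, frame_len=fs // 10)        # 100-ms frames
hits, e_th = threshold_detections(metric, gamma=3.0)
print(f"threshold = {e_th:.3f}; detections in frames: {np.flatnonzero(hits)}")
```

Varying gamma in this sketch reproduces the trade-off described above: a larger multiplier keeps only the strongest frames, a smaller one admits more false alarms.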
8.3.1 Energy Threshold Detector

One of the most common methods for detecting animal sounds from recordings is to measure the energy, or amplitude, of the incoming signal in a specified frequency band and to determine whether it exceeds a user-defined threshold. If the threshold within the frequency band is exceeded, the sound is scored as being present. The threshold value typically is set relative to the ambient noise in the frequency band of interest (e.g., Mellinger 2008; Ou et al. 2012). A simple energy threshold detector does not perform well when signals have low signal-to-noise ratio (SNR) or when sounds overlap. A number of techniques have been devised to overcome these problems, including spectrogram equalization (e.g., Esfahanian et al. 2017) to reduce background noise, time-varying (adaptive) detection thresholds (e.g., Morrissey et al. 2006), and using concurrent, but different, detection thresholds for different frequency bands (e.g., Brandes 2008; Ward et al. 2008). Apart from finding individual animal sounds, energy threshold detectors also have been successfully applied to the detection of animal choruses, such as those produced by spawning fish, migrating whales (Erbe et al. 2015), and chorusing insects or amphibians. These choruses are composed of many sounds from large and often distant groups of animals and so individual signals often are not detectable in them. Choruses can last for hours and significantly raise ambient levels in a species-specific frequency band (Fig. 8.6).

Fig. 8.6 Spectrogram showing three weeks of choruses by fish, fin whales, and blue whales in the Perth Canyon, Australia (modified from Erbe et al. 2015). Fish raised ambient levels by 20 dB in the 1800–2500 Hz band every night. Fin whales raised ambient levels by 20 dB in the 15–35 Hz band over two days. Antarctic blue whales were the cause of ongoing tones at 18 and 28 Hz for weeks at a time. Colors represent power spectral density (PSD). Black arrows point to strong noise from passing ships. © Erbe et al.; https://doi.org/10.1016/j.pocean.2015.05.015. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
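A minimal band-limited energy detector of the kind described above can be sketched as follows: bandpass-filter the recording, compute frame energies, and score frames that rise a chosen number of decibels above the ambient level. The band edges, the 100-ms frame length, and the 10-dB criterion are arbitrary assumptions for this example, not recommended settings for any particular species.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def band_energy_detector(x, fs, f_low, f_high, frame_s=0.1, excess_db=10.0):
    """Return frame start times whose in-band energy exceeds the median ambient level by excess_db."""
    sos = butter(4, [f_low, f_high], btype="bandpass", output="sos", fs=fs)
    y = sosfiltfilt(sos, x)                                # band-limited signal
    n = int(frame_s * fs)
    n_frames = len(y) // n
    frames = y[: n_frames * n].reshape(n_frames, n)
    energy_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    threshold_db = np.median(energy_db) + excess_db        # threshold set relative to ambient noise
    times = np.arange(n_frames) * frame_s
    return times[energy_db > threshold_db]

# Hypothetical usage, e.g., for a fish-chorus band of 1800-2500 Hz:
# detection_times = band_energy_detector(x, fs, 1800, 2500)
```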
8.3.2 Spectrogram Cross-Correlation

Spectrogram cross-correlation is a well-known technique to detect sounds produced by many species, such as rockfish (genus Sebastes; Širović et al. 2009), African elephants (Loxodonta africana; Venter and Hanekom 2010), maned wolves (Chrysocyon brachyurus; Rocha et al. 2015), minke whales (Oswald et al. 2011), and sei whales (Balaenoptera borealis; Baumgartner and Fratantoni 2008). In this method, spectrograms of reference sounds from the species of interest are converted into reference coefficients, or kernels, with one kernel for each sound type (Fig. 8.7). These reference kernels then are cross-correlated with the incoming spectrogram on a frame-by-frame basis. Kernels can be a statistical representation of spectrograms of known sound types, or they can be created empirically by trial-and-error from previously analyzed recordings.

Fig. 8.7 Spectrogram of the kernel for Omura's whales' (Balaenoptera omurai) doublet calls, computed as the average of over 800 hand-picked calls (Madhusudhana et al. 2020)

Proper selection of reference signals is critical to the performance of the detector and thus this method is only suited for detection of stereotypical sounds. Seasonal and annual variability in call structure can significantly impact performance of these detectors and so an analysis of the variability in call structure is vital when applying spectrogram cross-correlation to detect calls in long-term datasets (Širović 2016). Another drawback to this method is that it can be prohibitively processor-intensive. To speed up the calculations, Harland (2008) first employed an energy threshold detector (as described above) to detect times of potential signal presence and then used spectrogram cross-correlation to detect individual signals within the flagged time periods.
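The core operation, sliding a reference kernel along the incoming spectrogram and computing a correlation score at each offset, can be sketched in a few lines of Python. The kernel is assumed to have been prepared beforehand (for example, by averaging spectrograms of hand-picked calls, as for Fig. 8.7); the dB conversion, the normalization, and the 0.6 detection threshold are illustrative assumptions, not settings from any published detector.

```python
import numpy as np
from scipy.signal import spectrogram

def xcorr_detector(x, fs, kernel, nperseg=1024, threshold=0.6):
    """Slide a spectrographic kernel along the spectrogram of x and return detection times and scores."""
    f, t, sxx = spectrogram(x, fs=fs, nperseg=nperseg, noverlap=nperseg // 2)
    sxx = 10 * np.log10(sxx + 1e-12)                 # work in dB
    n_f, n_k = kernel.shape                          # kernel must use the same frequency resolution
    k = (kernel - kernel.mean()) / (kernel.std() + 1e-12)
    scores = np.zeros(sxx.shape[1] - n_k + 1)
    for i in range(len(scores)):
        frame = sxx[:n_f, i:i + n_k]
        frame = (frame - frame.mean()) / (frame.std() + 1e-12)
        scores[i] = np.mean(frame * k)               # normalized cross-correlation coefficient
    return t[np.flatnonzero(scores > threshold)], scores

# Hypothetical usage: detection_times, scores = xcorr_detector(x, fs, kernel)
```

The frame-by-frame loop makes the processing cost obvious, which is why pre-screening with a cheap energy detector, as in Harland (2008), is attractive for long recordings.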
8.3.3 Matched Filter

The matched filter approach for sound classification is similar to spectrogram cross-correlation but is performed in the time-domain. This means that the waveforms (i.e., sound pressure levels as a function of time) are correlated instead of the spectrogram. A kernel of the waveform of the sound to be detected is produced, often empirically using a high-quality recording, and then cross-correlated with the incoming signal (i.e., the time series of sound pressure). Matched filters are efficient at detecting signals in Gaussian noise (white noise), but colored noise (typical in many natural environments) poses more of a problem. As with spectrogram cross-correlation, the selection of kernels is critical to the performance of the detector. Matched filters are only appropriate for detection of well-known, stereotyped acoustic features, such as sounds produced by cane toads (Dang et al. 2008), blue whales (Stafford et al. 1998; Bouffaut et al. 2018), and beaked whales (Hamilton and Cleary 2010). Their performance suffers in the presence of even a small amount of sound variation compared to the kernel.
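In the time domain, the same idea reduces to correlating a reference waveform with the recording. The sketch below is a generic illustration of that operation, not the implementation used in the studies cited above; the local energy normalization and any detection threshold applied to the output are assumptions.

```python
import numpy as np
from scipy.signal import correlate

def matched_filter(x, kernel):
    """Normalized time-domain cross-correlation of a reference waveform (kernel) with signal x.
    Values near 1 indicate a close match between the local waveform and the kernel."""
    k = kernel - kernel.mean()
    k /= np.linalg.norm(k) + 1e-12                                  # unit-energy kernel
    num = correlate(x, k, mode="valid")                             # sliding dot product
    den = np.sqrt(correlate(x ** 2, np.ones(len(k)), mode="valid")) + 1e-12
    return num / den                                                # normalize by local signal energy

# Hypothetical usage: flag samples where the score exceeds, say, 0.7
# score = matched_filter(x, kernel); hits = np.flatnonzero(score > 0.7)
```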
8.3.4 Spectral Entropy Detector

In general, entropy measures the disorder or uncertainty of a system. Applied to communication theory, the information entropy (also called Shannon entropy; Shannon and Weaver 1998) measures the amount of information contained in a data stream. Entropy is computed as the negative product of a probability distribution and its logarithm. Therefore, a strongly peaked probability distribution has low entropy, while a broad probability distribution has high entropy. If applied to an acoustic power spectral density distribution, entropy measures the peakedness of the power spectra and detects narrowband signals in broadband noise (Fig. 8.8). Spectral entropy has successfully been applied to animal sounds; for example, from birds, beluga whales (Delphinapterus leucas), bowhead whales, and walruses (Erbe and King 2008; Mellinger and Bradbury 2007; Valente et al. 2007).

Fig. 8.8 Spectrogram of marine mammal tonal sounds with negative entropy (black curve) overlain. Negative entropy is high when the power spectral density is concentrated in a few narrow frequency bands (Erbe and King 2008)
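A minimal spectral-entropy detector follows directly from the definition above: normalize each power spectrum so it sums to one, treat it as a probability distribution, and compute the negative sum of p log p. The spectrogram settings below and the sign convention (returning the negative of the entropy so that peaky, tonal frames score high, as in Fig. 8.8) are assumptions for this sketch.

```python
import numpy as np
from scipy.signal import spectrogram

def negative_spectral_entropy(x, fs, nperseg=1024):
    """Per-frame negative Shannon entropy of the normalized power spectrum.
    High values indicate power concentrated in few frequency bins (narrowband, tonal sounds)."""
    f, t, sxx = spectrogram(x, fs=fs, nperseg=nperseg, noverlap=nperseg // 2)
    p = sxx / (sxx.sum(axis=0, keepdims=True) + 1e-20)      # each column becomes a probability distribution
    entropy = -np.sum(p * np.log2(p + 1e-20), axis=0)        # Shannon entropy per frame, in bits
    return t, -entropy                                        # negative entropy peaks at narrowband signals

# Hypothetical usage: apply the E_th = mean + gamma*std rule from Sect. 8.3 to the returned metric
```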
8.3.5 Teager–Kaiser Energy Operator

The Teager–Kaiser energy operator (TKEO) is a nonlinear operator that tracks the energy of a data stream (Fig. 8.9). Operating on a time series, at any one instance, the TKEO computes the square of the sample and subtracts the product of the previous and next sample. The output is therefore high for very brief signals. The TKEO has successfully been applied to the detection of clicks, such as bat or odontocete biosonar sounds (Kandia and Stylianou 2006; Klinck and Mellinger 2011). Many biosonar signals are of Gabor type (i.e., a sinusoid modulated by a Gaussian envelope). The TKEO output of the signals is a simple Gaussian, which can be detected with simple tools, such as energy threshold detection or matched filtering (Madhusudhana et al. 2015).

Fig. 8.9 Waveforms of odontocete clicks and their Gabor fit (top) and TKEO outputs and Gaussian fit (bottom) (Madhusudhana et al. 2015)
8.3.5 Teager–Kaiser Energy Operator situation, a low threshold can be chosen that
minimizes the number of missed detections; how-
The Teager–Kaiser energy operator (TKEO) is a ever, this can result in many false alarms. Quantifi-
nonlinear operator that tracks the energy of a data cation of these two errors is a useful way to
stream (Fig. 8.9). Operating on a time series, at any evaluate the performance of an automated detector.
one instance, the TKEO computes the square of the
sample and subtracts the product of the previous 8.3.6.1 Confusion Matrices
and next sample. The output is therefore high for One of the simplest and most common methods
very brief signals. The TKEO has successfully for conveying the performance of a detector (or a
been applied to the detection of clicks, such as classifier) is a confusion matrix (i.e., a type of
bat or odontocete biosonar sounds (Kandia and contingency table). A confusion matrix
Stylianou 2006; Klinck and Mellinger 2011). (Fig. 8.10) gives the number of true positives
Many biosonar signals are of Gabor type (i.e., a (i.e., correctly classified sounds, also called cor-
sinusoid modulated by a Gaussian envelope). The rect detections), false positives (i.e., false alarms),
TKEO output of the signals is a simple Gaussian, true negatives (i.e., correct rejections), and false
which can be detected with simple tools, such as negatives (i.e., missed detections).
energy threshold detection or matched filtering
(Madhusudhana et al. 2015).
8.3.6.2 Receiver Operating
Characteristic (ROC) Curve
The performance of detectors can be visualized
8.3.6 Evaluating the Performance using the receiver operating characteristic (ROC)
of Automated Detectors curve. A ROC curve is a graph that depicts the
trade-offs between true positives and false
Automated detectors can produce two types of positives (Egan 1975; Swets et al. 2000). The
errors: missed detections (i.e., missing a sound false positive rate (i.e., FP/(FP+TN)) is plotted on
that exists) and false alarms (i.e., incorrectly the x-axis, while the true positive rate (i.e., TP/(TP
reporting a sound that does not exist or reporting +FN)) is plotted on the y-axis (Fig. 8.11). A curve
a sound that is not the target signal). There is an is generated by plotting these values for the detec-
inevitable trade-off when choosing the acceptable tor at different threshold values. The (0|1) point on
rate of each. Most detectors allow the user to adjust the graph represents perfect performance: 100%
a threshold, and depending on where this threshold true positives and no false positives.
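As a minimal sketch of how such a curve is generated (hypothetical detector scores and ground-truth labels, not data from the studies cited in this section), the decision threshold can be swept over the range of the detector's output scores:

```python
import numpy as np

def roc_points(scores, truth, thresholds):
    """Return (false positive rate, true positive rate) at each threshold."""
    scores, truth = np.asarray(scores), np.asarray(truth, dtype=bool)
    fpr, tpr = [], []
    for t in thresholds:
        detected = scores >= t
        tp = np.sum(detected & truth)        # correct detections
        fp = np.sum(detected & ~truth)       # false alarms
        fn = np.sum(~detected & truth)       # missed detections
        tn = np.sum(~detected & ~truth)      # correct rejections
        fpr.append(fp / (fp + tn))
        tpr.append(tp / (tp + fn))
    return np.array(fpr), np.array(tpr)

# Hypothetical example: 200 events, ~30% of which contain the target signal
rng = np.random.default_rng(0)
truth = rng.random(200) < 0.3
scores = truth * rng.normal(1.0, 0.5, 200) + rng.normal(0.0, 0.5, 200)
fpr, tpr = roc_points(scores, truth, thresholds=np.linspace(-1, 2, 50))
```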
Fig. 8.11 (a) Generalized receiver operating characteristic (ROC) plot, in which the probability of true positives is plotted against the probability of false positives. Areas in this graph that correspond to a liberal bias, conservative bias, and deliberate mistakes are indicated. (b) Example ROC curves computed during the development of automated detectors for marine mammal calls in the Arctic. The spectral entropy detector outperformed others (Erbe and King 2008)

The major diagonal in Fig. 8.11a represents performance at chance, where the probabilities of TP and FP are equal. Responses falling below the line would indicate deliberate mistakes. The minor diagonal represents neutral bias, and splits responses into conservative versus liberal. A conservative response strategy yields decreased correct detection and false alarm probabilities; a liberal response strategy yields increased correct detection and false alarm probabilities. An example ROC curve is given in Fig. 8.11b, comparing the performances of three detectors (operating on underwater acoustic recordings from the Arctic and trying to detect marine mammal calls) based on: (1) spectral entropy, (2) bandpassed energy, and (3) waveform (i.e., broadband) energy. The performance of the entropy detector surpassed that of the other two.

8.3.6.3 Precision and Recall
The performance of a detector can be overestimated using a ROC curve when there is a large difference between the numbers of TPs and TNs. In addition, estimation of the number of TNs requires discrete sampling units. The duration of the discrete sampling units is often somewhat arbitrary and can lead to unrealistic differences between the numbers of TPs and TNs. In these situations, precision and recall (P-R) can provide a more accurate representation of detector performance because this representation does not rely on determining the number of true negatives (Davis and Goadrich 2006). In the P-R framework, events are scored only as TPs, FPs, and FNs.

Precision is a measure of accuracy and is the proportion of automated detections that are true detections.

Precision = TP / (TP + FP)

Recall is a measure of completeness and is the proportion of true events that are detected. This is the same as the true positive rate defined in the ROC framework.

Recall = TP / (TP + FN)

Detectors can be evaluated by plotting precision against recall (Fig. 8.12). An ideal detector would have both scores approaching a value of 1. In other words, the curve would approach the upper right-hand corner of the graph (Davis and Goadrich 2006). Precision and recall also can be represented by an F-score, which is the harmonic mean of these values. The F-score can be weighted to emphasize either precision or recall when optimizing detector performance (Jacobson et al. 2013).
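As a minimal sketch (with hypothetical counts), precision, recall, and the weighted F-score follow directly from the confusion-matrix entries:

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Precision, recall, and the weighted F-score from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F_beta is the weighted harmonic mean; beta > 1 favors recall, beta < 1 favors precision
    f_beta = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
    return precision, recall, f_beta

# Hypothetical detector output: 80 correct detections, 20 false alarms, 40 missed calls
print(precision_recall_f(tp=80, fp=20, fn=40))   # -> (0.8, 0.667, 0.727)
```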
Fig. 8.12 Precision-Recall curves for three types of detectors: (1) spectrogram cross-correlation, (2) blob detection, and (3) spectral entropy for Omura's whale calls (Madhusudhana et al. 2020)

8.4 Quantitative Classification of Animal Sounds

Quantitative classification of animal sounds is based on measured features of sounds, no matter whether these are used to manually or automatically group sounds with the aid of software algorithms. These features can be measured from different representations of sounds, such as waveforms, power spectra, spectrograms, and others. A large variety of classification methods have been applied to animal sounds, many drawing from human speech analysis.

8.4.1 Feature Selection

The acoustic features selected and the consistency with which the measurements are taken have a significant influence on the success (or failure) of a classification algorithm. Feature sets (also called feature vectors) should provide as much information as sensible about the sounds. With today's software tools and computing power, a limitless number of features can easily be measured that would allow distinction between sounds even of the same type. Such over-parameterization can make it difficult to group like sounds, which can be just as important as distinguishing between different sounds. The challenge is to find the trade-off and produce a set of representative features for each sound type. Once the features have been selected, automating the extraction and subsequent analysis of these features reduces the time required to analyze large datasets. Some commonly used feature vectors are described below.

8.4.1.1 Spectrographic Features
Perhaps the most commonly used feature vectors are those consisting of values measured from spectrograms. These measurements include, but are not limited to, frequency variables (e.g., frequency at the beginning of the sound, frequency at the end of the sound, minimum frequency, maximum frequency, frequency of peak energy, bandwidth, and presence/absence of harmonics or sidebands; Fig. 8.13; also see Chap. 4, Sect. 4.2.3), and time variables (e.g., signal duration, phrase and song length, inter-signal intervals, and repetition rate). More complex features, such as those describing the spectrographic shape of a sound (e.g., upsweep, downsweep, chevron, U-loop, inverted U-loop, or warble), slopes, and numbers and relative positions of local extrema and inflection points (places where the contour changes from positive to negative slope or vice versa) also have been used in classification. These measurements often are taken manually from spectrographic displays (e.g., by a technician using a mouse-controlled cursor). Automated techniques for extracting spectrographic measurements can be less subjective and less time-consuming, but are sometimes not as accurate as manual methods. Examples are available in the bird literature (e.g., Tchernichovski et al. 2000), bat literature (Gannon et al. 2004; O'Farrell et al. 1999), and marine mammal

literature (e.g., Mellinger et al. 2011; Roch et al. 2011; Gillespie et al. 2013; Kershenbaum et al. 2016). Spectrographic measurements of bat calls, for example, can be extracted using Analook (Titley Scientific, Columbia, MO, USA), SonoBat (Joe Szewczak, Department of Biology, Humboldt State University, Arcata, CA, USA), or Kaleidoscope Pro (Wildlife Acoustics, Inc., Maynard, MA, USA), exported to an Excel spreadsheet (XML, CSV, and other formats), classified using machine learning algorithms, and compared to a reference library for identification.

Fig. 8.13 Spectrogram of a pilot whale (Globicephala melas) whistle showing the following features: Start frequency (Start f), End frequency (End f), Maximum frequency (Max f), Minimum frequency (Min f), locations of two local maxima and one local minimum in the fundamental contour, four inflection points (where the curvature changes from clockwise to counter-clockwise, or vice versa), and one overtone (Courts et al. 2020). © Courts et al.; https://www.nature.com/articles/s41598-020-74111-y/figures/5. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

8.4.1.2 Cepstral Features
Cepstral coefficients are spectral features of bioacoustic signals commonly used in human speech processing (Davis and Mermelstein 1980). These features are based on the source-filter model of human speech analysis, which has been applied to many different animal species (Fitch 2003). Cepstral coefficients are well-suited for statistical pattern-recognition models because they tend to be uncorrelated (Clemins et al. 2005), which significantly reduces the number of parameters that must be estimated (Picone 1993). Cepstral coefficients are calculated by computing the Fourier transform in successive time windows over the recorded pressure time series of a sound (see Chap. 4). The frequency axis then is warped by multiplying the spectrum with a series of n filter functions at appropriately spaced frequencies. This is done because there is evidence that many animals perceive frequencies on a logarithmic scale, in a similar fashion to humans (Clemins et al. 2005). The output of the frequency band filters is then used as input to a discrete cosine transform, which results in an n-dimensional cepstral feature vector (Picone 1993; Clemins et al. 2005; Roch et al. 2007, 2008). Using cepstral feature space allows the timbre of sounds to be captured, a quality that is lost when extracting parameters from spectrograms. Roch et al. (2007) developed an automated classification system based on cepstral feature vectors extracted for whistles, burst-pulse sounds, and clicks produced by short- and long-beaked common dolphins (Delphinus spp.), Pacific white-sided dolphins (Lagenorhynchus obliquidens), and bottlenose dolphins (Tursiops truncatus). The system did not rely on specific sound types and had no requirement for separating individual sounds. The system performed relatively well, with correct classification scores of 65–75%, depending on the partitioning of the training- and test-data.
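As a rough sketch of this kind of workflow (not the exact pipeline of the studies cited here), mel-frequency cepstral coefficients can be extracted with the librosa Python library; the file name, clip handling, and number of coefficients are placeholder choices.

```python
import librosa
import numpy as np

# Load a (hypothetical) recording of a single call; sr=None keeps the native sampling rate
y, sr = librosa.load("call_clip.wav", sr=None)

# 20-ms analysis windows with 50% overlap; 13 cepstral coefficients per frame
n_fft = int(0.02 * sr)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=n_fft, hop_length=n_fft // 2)

# Summarize the clip as one feature vector (mean and standard deviation of each coefficient)
feature_vector = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
```

Such per-clip feature vectors can then be fed to any of the classifiers described in the remainder of this section.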
Cepstral feature vectors also have been used as input to classifiers for many other animal species, including groupers (Epinephelus guttatus, E. striatus, Mycteroperca venenosa, M. bonaci; Ibrahim et al. 2018), frogs (Gingras and Fitch 2013), song birds (Somervuo et al. 2006), African elephants (Zeppelzauer et al. 2015), and beluga, bowhead, gray (Eschrichtius robustus), humpback, and killer (Orcinus orca) whales, and walrus (Mouy et al. 2008). Cepstral features appear to be a promising alternative to the traditional time- and frequency-parameters measured from spectrograms as input to classification algorithms. However, cepstral features are relatively sensitive to the SNR, the signal's phase, and modeling order (Ghosh et al. 1992).

Noda et al. (2016) used mel-frequency cepstral coefficients and random forest analyses to classify sounds produced by 102 species of fish and compared the performance of three classifiers: k-nearest neighbors, random forest, and support vector machines (SVMs). The mel-frequency cepstrum (or cepstrogram) is a form of acoustic power spectrum (or spectrogram) that is computed as a linear cosine transform of a log-power spectrum that is presented on a nonlinear mel-scale of frequency. The mel-scale resembles the human auditory system better than the linearly-spaced frequency bands of the normal cepstrum. All three classifiers performed similarly, with average classification accuracy ranging between 93% and 95%.

8.4.2 Statistical Classification of Animal Sounds

For some sounds, qualitative classification is sufficient. Janik (1999) reported that humans were able to identify dolphin signature whistles more reliably than computer methods. A problem with qualitative classification of sounds in a repertoire (and taxonomy in general), however, is that some listeners are "splitters" and other listeners are "lumpers." So, even researchers on the same project could classify an animal's sound repertoire differently. One way to avoid individual researcher differences in classification is to use graphical, statistical, and computer-automated methods that objectively sort and compare measured variables that describe the sounds. A variety of statistical methods can be employed to classify animal sounds into categories (Frommolt et al. 2007). Below are brief descriptions of some of the statistical methods that are commonly used for classification of animal sounds.

8.4.2.1 Parametric Clustering
Parametric cluster analysis produces a dendrogram (i.e., classification tree) that organizes similar sounds into branches of a tree. A distance matrix also is generated, which gives correlation coefficients between all variables in the dataset. The resulting distance index ranges from 0 (very similar sounds) to 1 (totally dissimilar sounds). The matrix can then be joined by rows or columns to examine relationships. The type of linkage and type of distance measurement can be selected to find the best fit for a particular dataset (Zar 2009).

Cluster analysis has been used to classify sound types in several species, including owls (Nagy and Rockwell 2012), mice (Hammerschmidt et al. 2012), rats (Rattus norvegicus, Takahashi et al. 2010), African elephants (Wood et al. 2005), and primates (Hammerschmidt and Fischer 1998). In a study of six populations of the neotropical frog (Proceratophrys moratoi) in Brazil, Forti et al. (2016) measured spectrographic variables from calls produced by males and performed cluster analysis to examine similarities in acoustic traits (based on the Bray–Curtis index of acoustic similarity) across the six locations (Fig. 8.14).

Fig. 8.14 Dendrogram from a hierarchical cluster analysis of the call similarities between 15 male Proceratophrys moratoi from different sites and two other Odontophrynidae species (Forti et al. 2016). © Forti et al.; https://peerj.com/articles/2014/. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Baptista and Gaunt (1997) used hierarchical cluster analysis of correlation coefficients of several acoustic parameters to categorize sounds of the sparkling violet-eared hummingbird (Colibri
coruscans), which is found in two neighboring assemblages in their study area. A matrix of sound similarity values obtained from spectral cross-correlation of these birds' songs indicated similar sound types from the two areas. Yang et al. (2007) used cluster analysis to examine syllable sharing between individuals of Anna's hummingbird (Calypte anna). They identified 38 syllable types in songs of 44 males, which clustered into five basic syllable categories: "Bzz," "bzz," "chur," "ZWEE," and "dz!". Also, microgeographic song variation patterns were found in that nearest neighbors sang more similar songs than non-neighbors.
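A minimal sketch of this type of analysis in Python (using SciPy, with a hypothetical feature table rather than data from the studies above) might look like this:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical feature matrix: one row per call, columns = measured variables
# (e.g., duration, minimum frequency, maximum frequency, peak frequency)
features = np.loadtxt("call_features.csv", delimiter=",", skiprows=1)

# Pairwise distances between calls (Bray-Curtis is one option; many others exist)
distances = pdist(features, metric="braycurtis")

# Agglomerative clustering with average linkage; the dendrogram can be drawn
# with scipy.cluster.hierarchy.dendrogram(tree)
tree = linkage(distances, method="average")

# Cut the tree into a chosen number of groups (here 5, an arbitrary choice)
labels = fcluster(tree, t=5, criterion="maxclust")
```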

Pozzi et al. (2010) used several acoustic variables to group black lemur (Eulemur macaco macaco) sounds into categories, including the frequencies of the fundamental and of the first three harmonic overtones (measured at the start, middle, and end of each call), and the total duration. The agreement of this analysis with manual classification was high (>88.4%) for six of eight categories.

8.4.2.2 Principal Component Analysis
Principal component analysis (PCA) is a multivariate statistical method that examines a set of measurements such as the feature vectors discussed earlier in Sect. 8.4. These features may well be correlated. For example, bandwidth is sometimes correlated with maximum frequency, or the number of inflection points can be correlated with signal duration (Ward et al. 2016). PCA performs an orthogonal transformation that converts the potentially correlated variables (i.e., the features) into a set of linearly uncorrelated variables (i.e., the principal components; Hotelling 1933; Zar 2009). The principal components are linear combinations of the original variables (features). Plotting the principal components against each other shows how the measurements cluster.

For example, by examining bat biosonar signals in multivariate space, bat species that are very similar in external appearance can be distinguished. Using PCA, Gannon et al. (2001) found ear height and characteristic frequency were correlated, along with duration of the signal (Fig. 8.15).

Fig. 8.15 Plot showing the results of principal component analysis, in which two cryptic species of myotis bats (California myotis, Myotis californicus, MYCA, black squares; western small-footed bat, M. ciliolabrum, MYCI, hollow circles) were distinguished by differences in ear height and characteristic frequency of their echolocation signals. Plotted is characteristic frequency versus signal duration for these species recorded from field sites in New Mexico and Arizona, USA

As another example, Briefer et al. (2015) categorized emotional states associated with variation in whinnies from 20 domestic horses (Equus ferus) using PCA. They designed four situations to elicit different levels of emotional arousal that were likely to stimulate whinnies: separation (negative situation) and reunion (positive situation) with either all group members (high
emotional arousal) or only one group member (moderate emotional arousal). The authors measured 21 acoustic features from whinnies (Fig. 8.16). PCA transformed the feature vectors into six principal components that accounted for 83% of the variance in the original dataset.

Fig. 8.16 Spectrograms and oscillograms of horse whinnies in negative (a, c) and positive (b, d) situations emitted by two different horses. Red arrows point to fundamental frequencies (F0, G0) and first overtones (H1). Negative whinnies (a, c) are longer in duration and have higher G0 fundamentals than positive whinnies (b, d; Briefer et al. 2015). © Briefer et al.; https://www.nature.com/articles/srep09989/figures/3. Licensed under CC BY 4.0; http://creativecommons.org/licenses/by/4.0/

8.4.2.3 Discriminant Function Analysis
In discriminant function analysis (DFA), canonical discriminant functions are calculated using variables measured from a training dataset. One canonical discriminant function is produced for each sound type in the dataset. Variables measured from sounds in the test dataset are then substituted into each function and each sound type is classified according to the function that produced the highest value. Because DFA is a parametric technique, it is assumed that input data have a multivariate normal distribution with the same covariance matrix (Afifi and Clark 1996;

Zar 2009). Violations of these assumptions can create problems with some datasets. One of the main weaknesses of DFA for animal sound classification is that it assumes classes are linearly separable. Because a linear combination of variables takes place in this analysis, the feature space can only be separated in certain, restricted ways that are not appropriate for all animal sounds. Figure 8.17 shows the DFA separation of California chipmunk (genus Neotamias) taxa that are morphologically similar but acoustically different, using six variables measured from their sounds.

Fig. 8.17 Plot resulting from discriminant function analysis. Four species of Townsend-group chipmunks (Townsend's chipmunk, Neotamias townsendii; Siskiyou chipmunk, N. siskiyou; Allen's chipmunk, N. senex; and yellow-cheeked chipmunk, N. ochrogenys) in northern California, USA, produced discernibly different sounds. Discriminant function 1 was dominated by differences in maximum frequency of the signal and discriminant function 2 was most influenced by temporal features including total signal length and the number of signals emitted by a chipmunk during a signaling bout

8.4.2.4 Classification Trees
Classification tree analysis is a non-parametric statistical technique that recursively partitions data into groups known as "nodes" through a series of binary splits of the dataset (Clark and Pregibon 1992; Breiman et al. 1984). Each split is based on a value for a single variable and the criteria for making splits are known as primary splitting rules. The goal for each split is to divide the data into two nodes, each as homogeneous as possible. As the tree is grown, results are split into successively purer nodes. This continues until each node contains perfectly homogeneous data (Gillespie and Caillat 2008). Once this maximal tree has been generated, it is pruned by removing nodes and examining the error rates of these smaller trees. The smallest tree with the highest predictive accuracy is the optimal tree (Oswald et al. 2003).

Tree-based analysis provides several advantages over some of the other classification techniques. It is a non-parametric technique; therefore, data do not need to be normally distributed as required for other methods, such as DFA. In addition, tree-based analysis is a simple and naturally intuitive way for humans to classify sounds. It is essentially a series of true/false questions, which makes the classification process transparent. This allows easy examination of which variables are most important in the classification process. Tree-based analysis also accommodates for a high degree of diversity within classes. For example, if a species produces two or more distinct sound types, a tree-based analysis can create two different nodes. In other classification techniques, different sound types within a species simply act to increase variability and make classification more difficult. Finally, surrogate splitters are provided at each node (Oswald et al. 2003). Surrogate splitters closely follow primary splitting rules and can be used in cases when the primary splitting variable is missing. Therefore, sounds can be classified even if data for some variables are missing due to noise or other factors.
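As an illustrative sketch (hypothetical file and column names, not the S-PLUS analysis described below), a classification tree can be grown and pruned from a table of call measurements with scikit-learn:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical table: one row per call, measurement columns plus a species label
calls = pd.read_csv("bat_call_measurements.csv")
X = calls[["duration", "f_min", "f_max", "f_characteristic", "slope"]]
y = calls["species"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# ccp_alpha > 0 applies cost-complexity pruning (analogous to pruning the maximal tree)
tree = DecisionTreeClassifier(criterion="gini", ccp_alpha=0.01, random_state=0)
tree.fit(X_train, y_train)
print("correct classification rate:", tree.score(X_test, y_test))
```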
To address some controversy as to whether closely related species of myotis bats could be differentiated by their sounds, Gannon et al. (2004) completed an analysis of echolocation pulses from free-flying, wild bats. Fig. 8.18 is a classification tree grown from nearly 1400 calls using at least seven variables measured from each call. The tree produced terminal nodes identified to species (MYVO is Myotis volans, MYCA M. californicus, etc.). In this study, recordings were made under field conditions where sounds were affected by the environment, Doppler shift, and diversity of equipment. Still, classification trees worked well to predict group membership and additional techniques, such as DFA, were able to distinguish five Myotis species acoustically with greater than 75% accuracy (greater than 90% in most instances).

Fig. 8.18 Classification tree grown using Splus computer software (version S-PLUS 6.2 2003, TIBCO Software Inc., Palo Alto, CA, USA) from 1369 bat calls. The pruned tree used variables measured from each bat call: duration (DUR), minimum frequency (Fmin), characteristic frequency (Fc; i.e., frequency at the flattest part of the call), frequency at the "knee" of the call (Fk), time of Fc, time at Fk, and slope (S1). Along the tangents between boxes are values for variables used to split the nodes (for instance, Fmin is minimum frequency). The fraction below each box is the misclassification rate (e.g., 1/5 = 20% misclassification rate). The tree has 12 terminal nodes defining the branches, resulting in a classification designation for each species (Gannon et al. 2004)

Classification trees have been applied to marine mammal sounds by several researchers,
with promising results. Fristrup and Watkins (1993) used tree-based analysis to classify the sounds of 53 species of marine mammal (including mysticetes, odontocetes, pinnipeds, and manatees). Their correct classification score of 66% was 16% higher than the score obtained when applying DFA to the same dataset. The whistles of nine delphinid species were correctly classified 53% of the time by Oswald et al. (2003) using tree-based analysis. Oswald et al. (2007) subsequently applied classification tree analysis to the whistles of seven species and one genus of marine mammal, resulting in a correct classification score of 41%. This score was improved slightly, to 46%, when classification decisions were based on a combination of classification tree and DFA results. Gannier et al. (2010) used classification trees to identify the whistles of five delphinid species recorded in the Mediterranean, with a correct classification score of 63%. Finally, Gillespie and Caillat (2008) classified the clicks of Blainville's beaked whales (Mesoplodon densirostris), short-finned pilot whales (Globicephala macrorhynchus), and Risso's dolphins (Grampus griseus). Their tree-based analysis classified 80% of clicks to the correct species.

8.4.2.5 Nonlinear Dimensionality Reduction
Clustering techniques described above require that certain features or measurements, as appropriate for the problem domain, be available beforehand. They are gathered from sound recordings either manually (e.g., number of inflection points in whistle contours, number of harmonics) or using signal processing tools (e.g., peak frequency, energy), or both. Manual extraction of features is usually time-consuming and often inefficient, especially when dealing with recordings covering large spatial and temporal scales. Automated extraction of measurements improves efficiency and eliminates the risk of human biases. However, when recordings contain a lot of confounding sounds or have extreme noise variations, reliability and accuracy of the measurements can become questionable and can have adverse effects on clustering outcomes. Regardless of whether manual or automated approaches were employed, the resulting limited set of chosen features or measurements are essentially representations of the underlying data in a reduced space. Such dimensionality reduction is typically aimed at making the downstream task of clustering (with PCA, DFA, etc.) computationally tractable.

In recent years, nonlinear dimensionality reduction methods have gained widespread popularity, specifically in applications for exploring and visualizing very high-dimensional data. Originally popular for processing image-like data in the field of machine learning, these methods bring about dimensionality reduction without requiring one to explicitly choose and extract features. The methods can be easily adapted for processing bioacoustic recordings wherein the qualitative cluster structure (i.e., similarities in the visually identifiable information) in spectrogram-like data (e.g., mel-spectrogram or cepstrogram) containing hundreds or thousands of time-frequency points is effectively captured in an equivalent 2- or 3-dimensional space (e.g., Sainburg et al. 2019; Kollmorgen et al. 2020).

One of the earlier methods for capturing nonlinear structure, the t-distributed stochastic neighbor embedding (t-SNE; van der Maaten and Hinton 2008), is based on non-convex optimization. It computes a similarity measure between pairs of points (data samples) in the original high-dimensional space and in the reduced space, then minimizes the Kullback–Leibler divergence between the two sets of similarity measures. t-SNE tries to preserve distances in a neighborhood whereby points close together in the high-dimensional space have a high probability of staying close in the reduced space. The Bird Sounds project (Tan and McDonald 2017) presents an excellent demonstration of using t-SNE for organizing thousands of bird sound spectrograms in a 2-dimensional similarity grid.

Some of the shortcomings of t-SNE were addressed in a newer method called uniform manifold approximation and projection (UMAP; McInnes et al. 2018). UMAP is backed with a strong theoretical framework. While effectively capturing local structures like t-SNE, UMAP also offers a better promise for preserving some global structures (inter-cluster relationships). UMAP processes data faster and is capable of handling very large dimensional data. Fig. 8.19 is a demonstration of the use of UMAP for clustering sounds of five species of katydids (Tettigoniidae) from Panamanian rainforest recordings (Madhusudhana et al. 2019). Inputs to UMAP clustering comprised spectrograms (dimensions 216h x 469w) computed from 1-s clips containing katydid call(s). The inputs often contained confounding sounds and varying noise levels. The clustering results, however, demonstrate the utility of UMAP as a quick means to effective clustering. UMAP has also been used, in combination with a pre-trained neural network, for assessing habitat quality and biodiversity variations from soundscape recordings across different ecosystems (Sethi et al. 2020).
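The following is a minimal sketch (not the code used in the study above) of embedding spectrogram clips with the umap-learn package; the file name and parameter values are placeholder assumptions.

```python
import numpy as np
import umap  # umap-learn package

# Hypothetical input: an array of flattened spectrograms, one row per 1-s clip
spectrograms = np.load("clip_spectrograms.npy")   # shape (n_clips, n_freq * n_time)

# Reduce each clip to a 2-D point; neighbor count and min_dist are tunable choices
embedding = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2,
                      random_state=42).fit_transform(spectrograms)

# The 2-D embedding can then be plotted, or grouped with any clustering
# algorithm (e.g., k-means) to assign cluster labels to the clips.
```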
Fig. 8.19 Demonstration of clustering katydid sounds using UMAP. Randomly chosen samples of call spectrograms of the five species considered are shown on the left, and clustering outcomes are shown on the right. The clustering activity has successfully captured both inter-species and intra-species variations. The five species shown are Acantheremus major (n = 57), Docidocercus gigliotosi (n = 201), Pristonotus tuberosus (n = 43), Scopiorinus fragilis (n = 220), and Thamnobates subfalcata (n = 220)

We have presented here two popular methods that are currently trending in this field of research. There are, however, other alternatives available including earlier methods such as isomap (Tenenbaum et al. 2000) and diffusion map (Coifman et al. 2005), newer variants of t-SNE (e.g., Maaten 2014; Linderman et al. 2017), and modern variants of variational autoencoders (Kingma and Welling 2013).

8.4.3 Model Based Classification

8.4.3.1 Artificial Neural Networks
Artificial neural networks (ANNs) were developed by modeling biological systems of information-processing (Rosenblatt 1958) and became very popular in the areas of word recognition in human speech studies (e.g., Waibel et al. 1989; Gemello and Mana 1991) and character or image-recognition (e.g., Fukushima and Wake 1990; Van Allen et al. 1990; Belliustin et al. 1991) in the 1980s. Since that time, ANNs have been used successfully to classify a number of complex signal types, including quail crows (Coturnix spp., Deregnaucourt et al. 2001), alarm sounds of Gunnison's prairie dogs (Cynomys gunnisoni, Placer and Slobodchikoff 2000), stress sounds by domestic pigs (Sus scrofa domesticus, Schon et al. 2001), and dolphin echolocation clicks (Roitblat et al. 1989; Au and Nachtigall 1995).
In their primitive forms, there are 20 or more basic architectures of ANNs (see Lippman 1989 for a review). Each ANN approach results in trade-offs in computer memory and computation requirements, training complexity, and time and ease of implementation and adaptation (Lippman 1989). The choice of ANN depends on the type of problem to be solved, size and complexity of the dataset, and the computational resources available. All ANNs are composed of units called neurons and connections among them. They typically consist of three or more neuron layers: one input layer, one output layer, and one or more hidden layers (Fig. 8.20). The input layer consists of n neurons that code for n features in the feature vector representing the signal (X1 . . . Xn). The output layer consists of k neurons representing the k classes. The number of hidden layers between the input and output layers, as well as the number of neurons per layer, is empirically chosen by the researcher. Each connection among neurons in the network is associated with a weight-value, which is modified by successive iterations during the training of the network.

Fig. 8.20 Diagram of the structure of an artificial neural network

ANNs are promising for automatic signal classification for several reasons. First, the input to an ANN can range from feature vectors of measurements taken from spectrograms or waveforms, to frequency contours, to complete spectrograms. Second, ANNs serve as adaptive classifiers which learn through examples. As a result, it is not necessary to develop a good mathematical model for the underlying signal characteristics before analysis begins (Ghosh et al. 1992). In addition, ANNs are nonlinear estimators that are well-suited for problems involving arbitrary distributions and noisy input (Ghosh et al. 1992; Potter et al. 1994).

Dawson et al. (2006) used artificial neural networks as a means to classify the chick-a-dee-dee-dee call of the black-capped chickadee (Poecile atricapillus), which contains four note types carrying important functional roles in this species. In their study, an ANN first was trained to identify the note type based on several acoustic variables and then correctly classified recordings of the notes with 98% accuracy. The performance of the network was compared with classification using DFA, which also achieved a high level of correct classification (95%). The authors concluded that "there is little reason to prefer one technique over another. Either method would perform extremely well as a note-classification tool in a research laboratory" (Dawson et al. 2006).

Placer and Slobodchikoff (2000) used artificial neural networks to classify alarm sounds of Gunnison's prairie dogs (Cynomys gunnisoni) to predator species with a classification accuracy of 78.6 to 96.3%. The ANN identified unique signals for four different species of predators: red-tailed hawk (Buteo jamaicensis), domestic dog (Canis familiaris), coyote (Canis latrans), and humans (Homo sapiens).

Deecke et al. (1999) used artificial neural networks to examine dialects in underwater sounds of killer whale pods. The neural network extracted the frequency contours of one sound type shared by nine social groups of killer whales and created a neural network similarity index. Results were compared to the sound similarity judged by three humans in pair-wise classification tasks. Similarity ratings of the neural network mostly agreed with those of the humans, and were significantly correlated with the killer whale group, indicating that the similarity indices were biologically meaningful. According to the authors, "an index based on neural network analysis therefore represents an objective and repeatable means of measuring acoustic similarity, and allows comparison of results across studies, species, and time" (Deecke et al. 1999).
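As a minimal sketch (hypothetical feature table and column names, not the networks used in the studies above), a feed-forward network with one hidden layer can be trained on measured call features using scikit-learn:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

calls = pd.read_csv("call_features.csv")          # one row per call, "label" = class
X, y = calls.drop(columns="label"), calls["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Inputs are standardized; one hidden layer of 20 neurons; weights fitted by backpropagation
net = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0))
net.fit(X_train, y_train)
print("correct classification rate:", net.score(X_test, y_test))
```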
The greater potential of ANNs remained largely untapped for many years, in part due to prevailing limitations in computational capabilities. In the mid-1980s, backpropagation paved a way for efficiently training multi-layer ANNs (Rumelhart et al. 1986). Backpropagation, an algorithm for supervised learning of the weights in an ANN using gradient descent, greatly facilitated development of deeper networks (having many hidden layers). Many classes of deep neural networks (DNNs; LeCun et al. 2015) such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) became easier to train. While the aforementioned ANN approaches often require hand-picked features or measurements as inputs, DNNs trained with backpropagation demonstrated the ability to learn good internal representations from raw data (i.e., the hidden layers captured non-trivial representations effectively). In their landmark work on using CNNs for the automatic recognition of handwritten digits, LeCun et al. (1989a, b) used backpropagation to learn convolutional kernel coefficients directly from images. Over the past two decades, advances in computing technology, especially the wider availability of graphics processing units (GPUs), have considerably accelerated machine learning (ML) research in many disciplines such as computer vision, speech processing, natural language processing, recommendation systems, etc. Shift invariance is an attractive characteristic of CNNs, which makes them suitable for analyzing visual imagery (LeCun et al. 1989a, b, 1998). CNN-based solutions have consistently dominated many of the large-scale visual recognition challenges. As such, several competing architectures of CNNs have been developed: AlexNet (Krizhevsky et al. 2017), ResNet (He et al. 2016), DenseNet (Huang et al. 2017), etc. Some of these architectures have become the state-of-the-art in computer vision applications such as face recognition, emotion detection, object extraction, scene classification, and also in conservation applications (e.g., species identification in camera trap data, land-use monitoring in aerial surveys). Given the image-like nature of time-frequency representations of acoustic signals (e.g., spectrogram), many of the successes of CNNs in computer vision have been replicated in the field of animal bioacoustics. In contrast to CNNs, RNNs are better suited for processing sequence inputs. RNNs contain internal states (memory) that allow them to "learn" temporal patterns. However, their utility is limited by the "vanishing gradient problem," wherein the gradients (from the gradient descent algorithm) of the network's output with respect to the weights in the early layers become extremely small. The problem is overcome in modern flavors of RNNs such as long short-term memory (LSTM; Hochreiter and Schmidhuber 1997) networks and gated recurrent unit (GRU; Cho et al. 2014) networks.

These types of ML solutions are heavily data-driven and often require large quantities of training samples. Typically, the training samples are time-frequency representations (e.g., spectrogram or mel-spectrogram) of short clips of recordings (e.g., Stowell et al. 2016; Shiu et al. 2020). Robustness of the resulting models is improved by ensuring that the inputs adequately cover possible variations of the target signals and of the ambient background conditions. Data scientists employ a variety of data augmentation techniques to overcome data shortage. Some examples include introducing synthetic variations such as infusion of Gaussian noise, shifting in time (horizontal shift) and frequency content (vertical shift) (Jaitly and Hinton 2013; Ko et al. 2015; Park et al. 2019). The training process, which involves successively lowering a loss function iteratively using the backpropagation algorithm, is usually computationally intensive and is often sped up with the use of GPUs.

DNNs have been used in the automatic recognition of vocalizations of insects (e.g., Madhusudhana et al. 2019), fish (e.g., Malfante et al. 2018), birds (e.g., Stowell et al. 2016; Goëau et al. 2016), bats (e.g., Mac Aodha et al. 2018), marsupials (e.g., Himawan et al. 2018), primates (e.g., Zhang et al. 2018), and marine mammals (e.g., Bergler et al. 2019). CNNs have been used in the recognition of social calls, song calls, and whistles (e.g., Jiang et al. 2019; Thomas et al. 2019). While typical 2-dimensional CNNs have
been successfully used in the detection of echolocation clicks (e.g., Bermant et al. 2019), 1-dimensional CNNs (with waveforms as inputs) have been attempted as well (e.g., Luo et al. 2019). CNNs and LSTM networks have been compared in an application for classifying grouper species (Ibrahim et al. 2018) where the authors observed similar performances between the two models. Shiu et al. (2020) attempted combining a CNN with a GRU network for detecting North Atlantic right whale (Eubalaena glacialis) calls. Madhusudhana et al. (2021) incorporated long-term temporal context by combining independently trained CNNs and LSTM networks and achieved notable improvements in recognition performance. An attractive approach for developing recognition models is the use of the transfer learning technique (Torrey and Shavlik 2010), where components of an already trained model are reused. Typically, weights of the early layers of a pre-trained network are frozen (no longer trainable) and the model is adapted to the target domain by training only the leaf nodes with data from the target domain. Zhong et al. (2020) used transfer learning to produce a CNN model for classifying the calls of a few species of frogs and birds.

8.4.3.2 Random Forest Analysis
A random forest is a collection of many (hundreds or thousands) individual classification trees, which are grown without pruning. Each tree is different from every other tree in the forest because at each node, the variable to be used as a splitter is chosen from a random subset of the variables (Breiman 2001). Each tree in the forest produces a predicted category for the sound to be classified as, and the sound is ultimately classified as the category that was predicted by the majority of trees. Random forests are often more accurate than single classification trees because they are robust to over-fitting and stable to small perturbations in the data, correlations between predictor variables, and noisy predictor variables. Random forests perform well on polymorphic categories such as the variety of flight calls produced by many bird species (e.g., Liaw and Wiener 2002; Cutler et al. 2007; Armitage and Ober 2010; Ross and Allen 2014).

One of the advantages of a random forest analysis is that it provides information on the degree to which each one of the input variables contributes to the final species classification. This information is given by the Gini index and is known as the Gini variable importance. The Gini index is calculated based on the "purity" of each node in each of the classification trees, where purity is a measure of the number of whistles from different species in a given node (Breiman et al. 1984). Smaller Gini indices represent higher purity. When a random forest analysis is run, the algorithm assigns splitting variables so that the Gini index is minimized at each node (Oh et al. 2003). When a forest has been grown, the Gini importance value is calculated for each variable by summing the decreases in Gini index from one node to the next each time the variable is used. Variables are ranked according to their Gini importance values; those with the highest values contribute the most to the random forest model predictions. Random forests also produce a proximity measure, which is the fraction of trees in which particular observations end up in the same terminal nodes. This measure provides information about the similarity of individual observations because similar observations should end up in the same terminal nodes more often than dissimilar observations (Liaw and Wiener 2002).

Armitage and Ober (2010) compared the classification performance of random forests, support vector machines (SVMs), artificial neural networks, and DFA for bat echolocation signals and found that, with the exception of DFA, which had the lowest classification accuracy, all classifiers performed similarly. Keen et al. (2014) compared the performance of four classification algorithms using spectrographic measurements (spectrographic cross-correlation, dynamic time-warping, Euclidean distance, and random forest) for flight calls from four warbler species. In this study, random forests produced the most accurate results, correctly classifying 68% of calls.
Oswald et al. (2013) compared classifiers generated using DFA versus random forest classifiers for whistles produced by eight delphinid species recorded in the tropical Pacific Ocean and found that random forests resulted in the highest overall correct classification score. Rankin et al. (2016) trained a random forest classifier for five delphinid species in the California Current ecosystem. This classifier used information from whistles, clicks, and burst-pulse sounds and correctly classified 84% of acoustic encounters. Both Oswald et al. (2013) and Rankin et al. (2016) used spectrographic measurements as input variables for their classifiers.

8.4.3.3 Gaussian Mixture Models
Gaussian Mixture Models (GMMs) are used commonly to model arbitrary distributions as linear combinations of parametric variables. They are appropriate for species identification when there are no expectations, such as the sequence of sounds (Roch et al. 2007). To create a GMM, a set of n normal distributions with separate means and diagonal covariance matrices is scaled by weight-factors ci (1 ≤ i ≤ n). The sum over all ci must be 1 to ensure that the GMM represents a probability distribution (Huang et al. 2001; Roch et al. 2007, 2008). The number of mixtures in the GMM is chosen empirically and its parameters are estimated using an iterative algorithm, such as the Expectation Maximization algorithm (Moon 1996). Once a GMM has been trained, likelihood is computed for each sound type and a log-likelihood-ratio test is used to decide the species (Roch et al. 2008).

Gingras and Fitch (2013) used GMMs to classify male advertisement songs of four genera of anurans (Bufo, Hyla, Leptodactylus, Rana) based on spectral features and mel-frequency cepstral coefficients. The GMM based on spectral features resulted in 60% true positives and 13% false positives, and the GMM based on mel-frequency cepstral coefficients resulted in 41% true positives and 20% false positives. Somervuo et al. (2006) correctly classified 55–71% of song fragments from 14 different species of birds based on mel-frequency cepstral coefficients. The correct classification score depended on the number of cepstral coefficients and the number of Gaussian mixtures in the model. Lee et al. (2013) used GMMs to classify song segments of 28 species of birds based on image-shape features instead of traditional spectrographic features. This approach resulted in 86% or 95% classification accuracy for 3- or 5-s birdsong segments, respectively.

Roch et al. (2008) classified clicks produced by Blainville's beaked whales, pilot whales, and Risso's dolphins using a GMM. Correct classification scores for these three species were 96.7%, 83.2%, and 99.9%, respectively. Brown and Smaragdis (2008, 2009) used GMMs to classify sounds of killer whales, resulting in up to 92% agreement with 75 perceptually created categories of sound types, depending on the number of cepstral coefficients and Gaussians in the estimate of the probability density function. GMMs were used to classify the A and B type sounds produced by blue whales in the Northeast Pacific (McLaughlin et al. 2008), and six marine mammal species (Mouy et al. 2008) recorded in the Chukchi Sea: bowhead whales, humpback whales, gray whales, beluga whales, killer whales, and walruses. Both studies reported that their classifiers worked very well, but correct classification scores were not provided.

8.4.3.4 Support Vector Machines
Support vector machines (SVMs) are a rich family of learning algorithms based on Vapnik's (1998) statistical learning theory. An SVM works by mapping features measured from sounds into a high-dimensional feature space. The SVM then finds the optimal hyperplane (function) that maximizes the separation among classes with the lowest number of parameters and the lowest risk of error. This approach attempts to meet the goal of minimizing both the training error and the complexity of the classifier (Mazhar et al. 2007). The best hyperplane is one that maximizes the distance between the hyperplane and the nearest data points belonging to different classes. The support vectors are the data points that determine the position of the hyperplane, and the distance between the hyperplane and the support vectors is called the margin (Fig. 8.21). The
optimal classifier maximizes the margin on both sides of the hyperplane. Because the hyperplane can be defined by only a few of the training samples, SVMs tend to be generalized and robust (Cortes and Vapnik 1995; Duda et al. 2001). When classes cannot be separated linearly, SVMs can map features onto a higher dimensional space where the samples become linearly separable (see Fig. 8.26 in Zeppelzauer et al. 2015).

Fig. 8.21 Examples of support vector machine hyperplanes. (a) The margin of the hyperplane is not optimal, (b) a hyperplane with a maximized margin. The support vectors are circled

SVMs originally were designed for binary classification, but a number of methods have been developed for applying them to multi-class problems. The three most common methods are: (1) form k binary "one-against-the-rest" classifiers, where k is the number of classes and the class whose decision-function is maximized is chosen (Vapnik 1998), (2) form all k(k − 1)/2 pair-wise binary classifiers, and choose the class whose pair-wise decision-functions are maximized (Li et al. 2002), and (3) reformulate the objective function of SVM for the multi-class case so decision boundaries for all classes are optimized jointly (Guemeur et al. 2000).

Gingras and Fitch (2013) used four different algorithms (SVM, k-nearest neighbor, multivariate Gaussian distribution classifier, and GMM) to classify advertisement calls from four genera of anurans and obtained comparable accuracy levels from all three models. Fagerlund (2007) used SVMs to classify bird sounds produced by several species using decision trees with binary SVM classifiers at each node. The two datasets used by Fagerlund (2007) contained six and eight bird species and correct classification scores were 78–88% and 96–98% for the two datasets, respectively, depending on which variables were used in the classifiers.

Zeppelzauer et al. (2015) and Stoeger et al. (2012) both used SVM to identify African elephant rumbles. Zeppelzauer et al. (2015) used cepstral feature vectors and an SVM to distinguish African elephant rumbles from background noise. This SVM resulted in an 88% correct detection rate and a 14% false alarm rate. In addition to SVM, Stoeger et al. (2012) also used linear discriminant analysis (LDA) and nearest neighbor classification algorithms to categorize two types of rumbles produced by five captive African elephants based on spectral representations of the sounds. They obtained a classification accuracy of greater than 97% for all three classification methods.

Jarvis et al. (2006) developed a new type of multi-class SVM, called the class-specific SVM (CS-SVM). In this method, k binary SVMs are created, where each SVM discriminates between one of the k classes of interest and a common reference-class. The class whose decision-function is maximized with respect to the reference-class is selected. If all decision-functions are negative, the reference-class is selected. The advantage of this method is that noise in recordings is treated as the reference-class.
Jarvis et al. (2006) used their CS-SVM to discriminate clicks produced by Blainville's beaked whales from ambient noise and obtained a correct classification score of 98.5%. They also created a multi-class CS-SVM that classified clicks produced by Blainville's beaked whales, spotted dolphins (Stenella attenuata), and human-made sonar pings. This CS-SVM resulted in 98% correct classification for Blainville's beaked whale clicks, 88% correct classification for spotted dolphin clicks, and 95% correct classification for sonar pings. It is important to note that the training data were included in their test data, which likely resulted in inflated correct classification scores.

8.4.3.5 Dynamic Time-Warping
Dynamic time-warping (DTW) is a class of algorithms originally developed for automated human speech recognition (Myers et al. 1980). DTW is used to quantitatively compare time-frequency contours of different durations using variable extension and compression of the time axis (Deecke and Janik 2006; Roch et al. 2007). There are different DTW techniques (e.g., Itakura 1975; Sakoe and Chiba 1978; Kruskal and Sankoff 1983), but all are based on comparing a reference sound to a test sound. The test sound is stretched and compressed along its contour to minimize the difference between the shapes of the two contours. Restrictions can be placed on the amount of time-warping that takes place. For example, Buck and Tyack (1993) did not time-warp contours that differed by a factor of more than 2 in duration and assigned those contours a similarity score of zero. Deecke and Janik (2006) stated that contours could only be stretched or compressed up to a factor of 3 to fit the reference contour. In a DTW analysis, all individual contours are compared to all other contours and a similarity matrix is constructed. Sounds are clustered into categories based on the similarity matrix using methods such as k-nearest neighbor cluster analysis or ANNs (Deecke and Janik 2006; Brown and Miller 2007).
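The core recursion can be sketched in a few lines of Python with NumPy. This is an illustrative implementation operating on extracted frequency contours, not the exact algorithm of any of the studies cited above; the duration restriction and all names are ours.

    import numpy as np

    def dtw_distance(ref, test, max_duration_ratio=3.0):
        """Dynamic time-warping distance between two frequency contours (Hz)."""
        n, m = len(ref), len(test)
        if max(n, m) > max_duration_ratio * min(n, m):
            return np.inf                      # too different in duration to compare
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = abs(ref[i - 1] - test[j - 1])
                # advance the reference, the test, or both contours by one step
                cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
        return cost[n, m] / (n + m)            # normalize by warping-path length

    # Pairwise distances over all contours form the similarity matrix that is then
    # passed to a clustering method (e.g., k-nearest neighbor) to define categories.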
DTW has been used to classify bird sounds. forest analysis showed that whistles in species-
Anderson et al. (1996) applied DTW to recognize specific categories could be classified to species
individual song syllables for two species of with significantly higher accuracy than whistles
in shared categories. This suggests that not every whistle carries species information, and that specific whistle types play an important role in dolphin species identification.

8.4.3.6 Hidden Markov Models
Hidden Markov model (HMM) theory was developed in the late 1960s by Baum and Eagon (1967) and now is used commonly for human speech recognition (Rabiner et al. 1983, 1996; Levinson 1985; Rabiner 1989). To create an HMM, a vector of features is extracted from a signal at discrete time steps. The temporal evolution of these features from one state to the next is modeled by creating a transition matrix M, where Mij is the probability of transition from state i to state j, and an emission matrix E, where Eis is the probability of observing signal s in state i (Rickwood and Taylor 2008). A different HMM is created for each species in the dataset and a sound is classified by determining which of the HMMs has the highest likelihood of producing that particular set of signal states. Training HMMs requires significant amounts of computing, and proper estimation of the transition and output probabilities is of crucial importance (Makhoul and Schwarz 1995). Excellent tutorials on HMMs can be found in Rabiner and Juang (1986) and Rabiner (1989).
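The classification step (not the training step) can be sketched with the forward algorithm in Python/NumPy. This is a minimal, illustrative implementation for discrete observations; the variable names are ours, not from the references above.

    import numpy as np

    def forward_log_likelihood(obs, start_p, trans_p, emit_p):
        """Log-likelihood of an observation sequence under one HMM.

        obs     : sequence of observed symbol indices
        start_p : initial state probabilities, shape (n_states,)
        trans_p : transition matrix M, where M[i, j] = P(state j | state i)
        emit_p  : emission matrix E, where E[i, s] = P(symbol s | state i)
        """
        alpha = start_p * emit_p[:, obs[0]]
        scale = alpha.sum()
        alpha /= scale
        loglik = np.log(scale)
        for s in obs[1:]:
            alpha = (alpha @ trans_p) * emit_p[:, s]   # propagate states, then emit
            scale = alpha.sum()                        # rescale to avoid underflow
            alpha /= scale
            loglik += np.log(scale)
        return loglik

    # A sound is assigned to the species whose model scores it highest, e.g.:
    # best_species = max(models, key=lambda sp: forward_log_likelihood(obs, *models[sp]))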
A significant advantage inherent to HMMs is their ability to model time and spectral variability simultaneously (Makhoul and Schwarz 1995). They are able to model time series that have subtle temporal structure and are efficient for modeling signals with varying durations by performing non-linear, temporal alignment during both the training and classification processes (Clemins et al. 2005; Roch et al. 2007; Trifa et al. 2008). Using HMMs, complex models can be built to deal with complicated biological signals (Rickwood and Taylor 2008), but care must be taken when choosing training samples to obtain a high generalization ability. The performance of an HMM is influenced by the size of the training set, the feature extraction method, and the number of states in the model (Trifa et al. 2008). Recognition performance is also affected by noise (Trifa et al. 2008).

In addition to being successfully implemented in human speech recognition, HMMs have been used to classify the sounds produced by birds (Kogan and Margoliash 1998; Trawicki et al. 2005; Trifa et al. 2008; Adi et al. 2010), red deer (Cervus elaphus; Reby et al. 2006), African elephants (Clemins et al. 2005), common dolphins (Sturtivant and Datta 1997; Datta and Sturtivant 2002), killer whales (Brown and Smaragdis 2008, 2009), beluga whales (Clemins and Johnson 2005; Leblanc et al. 2008), bowhead whales (Mellinger and Clark 2000), and humpback whales (Suzuki et al. 2006). HMMs perform as well as, or better than, both GMMs and DTW (Weisburn et al. 1993; Kogan and Margoliash 1998) and are becoming more common in animal classification studies.

Adi et al. (2010) also used HMMs to examine individually distinct acoustic features in songs produced by ortolan buntings (Emberiza hortulana). They represented each song syllable using a 15-state HMM (Fig. 8.22). These HMMs then were connected to represent song types. The 14 most common song types were included in the analysis and correct classification ranged from 50% to 99%, depending on the song type. Overall, 90% of songs were correctly classified. Adi et al. (2010) used these results to illustrate the feasibility of using acoustic data to assess population sizes for these birds.

Reby et al. (2006) used HMMs to examine whether common roars uttered by red deer during the rutting season can be used for individual recognition. They recorded roar bouts from seven captive red deer and used HMMs to model roar bouts as successions of silences and roars. Each roar in the analysis was modeled as a succession of states of frequency components measured from the roars. Overall, the HMM correctly identified 85% of roar bouts to the individual deer, showing that roars were individually specific. Reby et al. (2006) also used HMMs to examine stability in this individuality over the rutting season. They did this by training an HMM using roar bouts recorded at the beginning of the rutting season and testing the model using roar bouts recorded later in the rutting season. Overall, 58% of roar bouts were classified correctly, suggesting that individual identification cues in roar bouts varied over time.
Fig. 8.22 Example of a 15-state hidden Markov model representation of the waveform of a song syllable produced by an ortolan bunting to capture the temporal pattern of the syllable (Adi et al. 2010). © Acoustical Society of America, 2010. All rights reserved

8.5 Challenges in Classifying Animal Sounds

Placing sounds into categories is not always straightforward. Sounds produced by a particular species often contain a great deal of variability caused by different factors (e.g., location, date, age, sex, and individuality), which can make it difficult to define categories. In addition, sound categories are not always sharply demarcated, but instead grade or gradually transition from one form to another. It is important to be aware of the challenges in a particular dataset. Below are some types of variation that can be encountered in the classification of animal sounds.

8.5.1 Recording Artifacts

Bioacousticians need to be aware that recorded animal sounds are affected by the frequency and sensitivity specifications of the recording system used. An inappropriate recording system can result in distorted or partial sounds, which complicates their classification. For example, sounds can be misrepresented in recordings if the frequency response of the recording system is not linear, if the sampling frequency is too low, if sounds exist below or above the functional frequency range of the recording system, or if aliasing occurs (see Chap. 4). Ideally, recording systems should be carefully assembled and calibrated for the specific application. If the effects of the recording system could always be removed completely from recordings, sound classification would be more consistent and comparable. However, sounds published in the literature are sometimes received sounds that were affected by the recorder and/or the sound propagation environment.

One of the most common problems in underwater acoustic recordings is mooring noise. If hydrophones are held over the side of a boat, the recordings will contain sound from waves splashing against the boat or the hydrophone cable rubbing against the boat. Recorders built into mooring lines can record cable strum or clanking chains. If multiple oceanographic sensors are moored together, sounds from other
instruments (e.g., wipers on a turbidity sensor) may be recorded. Recorders resting on soft seafloor in coastal water may record the sound of sand swishing over the mooring. In addition, hydrostatic pressure fluctuations from the recorder bouncing in the water column or vortices at the hydrophone if deployed in strong currents will cause flow noise. All of these artifacts can last from seconds to minutes and appear in spectrograms as power from a few hertz to high kilohertz. Minimization of mooring noise and identification of recording artifacts is an art (also see Chaps. 2 and 3).

Similarly, artifacts can be recorded during airborne recordings. Wind is a primary artifact; however, moving vegetation and precipitation can also add noise to a recording. Any disturbance to the microphone can generate unwanted tapping or static on a recording. Recording systems in terrestrial environments need to be secured to minimize such noises.

8.5.2 Sound Propagation Effects

Environmental features of air or water can change the way sound propagates and thus the acoustic characteristics of a recorded sound. Bioacousticians need to understand environmental effects on the features of received sound to avoid classification of a signal variant as a new type, rather than as a particular sound type affected by propagation conditions. The sound propagation environment can affect both the spectral and temporal features of sound as it propagates from the animal to the recorder (see Chaps. 5 and 6). For example, energy at high frequencies is lost (attenuates) very quickly due to scattering and absorption, and therefore high-frequency harmonics do not propagate over long ranges. Acoustic energy at low frequencies (i.e., long wavelengths) does not travel well in narrow waveguides (e.g., shallow water). Because different frequencies within a sound can attenuate at different rates, the same sound can appear differently on a spectrogram, depending on the distance at which it was recorded.
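The effect can be illustrated with a schematic calculation in Python/NumPy. The absorption law below is a deliberately simplified toy (attenuation rising with frequency), not the real air or seawater absorption formulas discussed in Chaps. 5 and 6, and the numbers are placeholders.

    import numpy as np

    freq_khz = np.array([1, 5, 10, 20, 40, 80])    # frequency bands of a broadband call
    source_db = np.full(freq_khz.shape, 100.0)     # flat source spectrum (dB)
    alpha_db_per_km = 0.03 * freq_khz**1.5         # toy absorption law: loss grows with frequency

    for range_km in (0.1, 1.0, 5.0):
        received_db = source_db - alpha_db_per_km * range_km
        print(f"{range_km:4.1f} km:", np.round(received_db, 1))
    # At short range the spectrum is nearly flat; at long range the high-frequency
    # bands have dropped far below the low-frequency bands of the same call.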
Differential attenuation of frequencies in air is shown in Fig. 8.23. Signals produced by a big brown bat (Eptesicus fuscus) flying toward a microphone contain more ultrasonic components than signals recorded from a bat flying away from the microphone. The signal with the longest frequency modulation (from 100 to 50 kHz) is received when the bat is closest to the microphone. Variations in this spectrogram show how one sound type could be categorized differently simply because of distance between the animal and recorder, orientation to the microphone, and the gain setting.

Other sound propagation effects include reverberation (which leads to the temporal spreading of brief, pulsed sounds) and frequency dispersion. Frequency dispersion is a result of energy at different frequencies traveling at different speeds. This leads to sounds being spread out in time and, specifically in some underwater environments, can cause pulsed sounds to become frequency-modulated sounds (either up- or downsweeps; Fig. 8.24).

Finally, ambient noise (i.e., geophysical noise, anthropogenic noise, and non-target biological noise) superimposes with animal sounds, and at some distances and frequencies, parts of the animal sound spectrum will begin to drop below the levels of ambient noise. As a result, the same animal sound in a different environment and at a different distance from the animal can look quite different on a spectrogram and cause it to be misclassified as two different sound types.

8.5.3 Angular Aspects of Sound Emission

The orientation of an animal relative to the receiver (microphone or hydrophone) can change the acoustic features of the recorded sound. This complicates classification, and off-axis variations of a sound need to be known so they can be categorized as just a variant of a particular sound type, rather than as a new sound type.

Not all sounds emitted by animals are omnidirectional (i.e., propagate equally in all angles relative to the animal). Au et al. (2012) studied the directionality of bottlenose dolphin echolocation clicks by measuring the horizontal and vertical emission beam patterns of these sounds. The angle at which an echolocation click was

Fig. 8.23 Spectrogram of big brown bat (Eptesicus fuscus) circling a recording device while searching and pursuing aerial prey. As the bat approaches the microphone, more of the ultrasonic signal is received (calls reach up to 70 kHz). As the bat moves away, the signal is attenuated. Time between calls shortens notably as the bat pursues an insect prey for capture. Notice that the bat emits "search" calls at 25–40 kHz, approach calls at 30–70 kHz when it is in pursuit or trying to navigate flight through complex space, and finally terminal calls at 30–55 kHz

recorded relative to the transducer (or echolocating animal) not only affected its received level, but also the waveform and frequency spectrum (Fig. 8.25). Sperm whale (Physeter macrocephalus) echolocation clicks, when recorded off-axis (i.e., away from the center of its emission beam), consisted of multiple complex pulses that were likely due to internal reflections within the sperm whale's head (Møhl et al. 2003; also see Chap. 12).

8.5.4 Geographic Variation

Geographic variation, or differences in the sounds produced by populations of the same species living in different regions, has been documented for many terrestrial and aquatic animals, including Hawaiian crickets (Mendelson and Shaw 2003), Túngara frogs (Engystomops pustulosus, Pröhl et al. 2006), bats (Law et al. 2002; Aspetsberger et al. 2003; Russo et al. 2007; Yoshino et al. 2008), pikas (Borisova et al. 2008), sciurid rodents (Gannon and Lawlor 1989; Slobodchikoff et al. 1998; Yamamoto et al. 2001; Eiler and Banack 2004), singing mice (Scotinomys spp., Campbell et al. 2010), primates (Mitani et al. 1992; Delgado 2007; Wich et al. 2008), cetaceans (Helweg et al. 1998; McDonald et al. 2006; Delarue et al. 2009; Papale et al. 2013, 2014), and elephant seals (Mirounga spp., Le Boeuf and Peterson
1969). When developing classifiers, it is important to understand the degree of geographic variation in a sound repertoire and the range over which this occurs. If geographic variation exists, then a classifier trained using data collected in one location may not work well when applied to data collected in another location.

Fig. 8.24 Spectrograms of marine seismic airgun signals recorded at three different ranges: 1.5 km (top), 80 km over soft seabed (middle), and 40 km over a hard seabed (bottom). The top and bottom spectrograms are of the same seismic survey. Pulses were brief and broadband near the source, but became frequency-modulated and narrowband some distance away due to dispersion (Erbe et al. 2016). © Erbe et al.; https://ars.els-cdn.com/content/image/1-s2.0-S0025326X15302125-gr9_lrg.jpg. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

One of the underlying causes of geographic variation may be reproductive isolation of a population. Keighley et al. (2017) used DFA with stepwise variable selection to determine geographic variation in sounds from six major populations of palm cockatoos (Probosciger aterrimus) in Australia. Palm cockatoos from the east coast (Iron Range National Park) had unique contact sounds and produced fewer sound types than at other locations. The authors speculated that this large difference was due to long-term isolation at this site and noted that documentation of geographic variation in sounds provided important conservation information for determining connectivity of these six populations.

Thomas and Golladay (1995) employed PCA to classify nine underwater vocalization types produced by leopard seals (Hydrurga leptonyx) at three study sites near Palmer Peninsula, Antarctica. The PCA successfully separated vocalizations from the three study areas and provided information about what features of the sounds were driving the differences among locations. For example, the first principal component was influenced by maximum, minimum, start, and end frequencies, the second principal component was influenced by the presence or absence of overtones, and the third principal component was predominantly related to time relationships, such as duration and time between successive sounds. Note that some sound types were absent at some locations.
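The general workflow behind such an analysis can be sketched in a few lines of Python with scikit-learn. The data and feature list below are placeholders, not the measurements of Thomas and Golladay (1995).

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # Placeholder data: rows = individual calls, columns = measurements such as
    # minimum, maximum, start, and end frequency, duration, and overtone count.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(90, 6))

    scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(features))
    # scores[:, 0] ... scores[:, 2] are the call coordinates on the first three
    # principal components; plotting them by study site (and inspecting the
    # component loadings) shows which measurements drive any separation.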
8.5.5 Graded Sounds

Some animals produce sound types that grade or gradually transition from one type to another. Researchers should not neglect the potential existence of vocal intermediates in classification. For example, Schassburger (1993) described sounds produced by timber wolves (Canis lupus) as barks, growl-moans, growls, howls, moans, snarls, whimpers, whine-moans, whines, woofs, and yelps. Wolves combine these 11 principal sounds to create mixed-sounds that often grade from one type into another.

Click trains, burst-pulse sounds, and whistles produced by delphinids are typically considered as three distinct categories of sound. Click trains and burst-pulse sounds are composed of short, exponentially damped sine waves separated by periods of silence, while whistles are generally thought of as continuous tonal sounds, often
Fig. 8.25 Waveforms and spectra of a bottlenose dolphin echolocation click in the horizontal (a) and vertical (b) planes (Au et al. 2012). © Acoustical Society of America, 2012. All rights reserved

sweeping in frequency. While these sounds appear quite different from one another on spectrograms, closer inspection of their waveforms reveals that some sounds that look like whistles on a spectrogram actually contain a high degree of amplitude modulation. In other words, some sounds that are considered to be whistles are made up of pulses with inter-pulse intervals that are too short to hear or be resolved by the analysis window of the spectrogram (Fig. 8.26). As an example of this, Murray et al. (1998) used self-organizing neural networks to analyze the vocal repertoires of two captive false killer whales (Pseudorca crassidens) based on measurements taken from waveforms. They found that rather than organizing sounds into distinct categories, the vocal repertoire was more accurately represented by a graded continuum, with exponentially damped sinusoidal pulses on one end and continuous sinusoidal signals at the other. Beluga whales also have been shown to have a graded vocal repertoire (Karlsen et al. 2002; Garland et al. 2015). Whistles with a high degree of amplitude modulation have been recorded from Atlantic spotted and spinner (Stenella longirostris) dolphins (Lammers et al. 2003), suggesting that this graded continuum model is applicable to these species as well.

Fig. 8.26 Spectrogram and waveform of a false killer whale vocalization. The vocalization appears to be a whistle in the spectrogram, but the waveform reveals discrete pulses between 61 and 67 ms (Murray et al. 1998). © Acoustical Society of America, 1998. All rights reserved

8.5.6 Repertoire Changes Over Time

Some animal sound repertoires change over time, which complicates their classification. For example, humpback whale song slowly changes over the course of a breeding season as new units are introduced and old ones discarded (Noad et al. 2000). Song also changes from one season to the next, and in one instance, eastern Australian humpback whales changed to the song of the western Australian population within 1 year (Noad et al. 2000).

Antarctic blue whales can be heard off southwestern Australia from February to October every year. The upper frequency of their Z-call decreases over the season by about 0.4–0.5 Hz. At the beginning of the next season, the Z-call jumps in frequency to about the mean of the Z frequency of the previous season, and then decreases again, leading to an average decrease in the frequency of the upper part of the Z-call by 0.135 ± 0.003 Hz/year (Fig. 8.27; Gavrilov et al. 2012). A similar decrease (albeit at different rates at different locations) has been observed for the "spot call," of which the animal source remains elusive (Fig. 8.27; Ward et al. 2017). The reasons for these shifts are unknown.
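Long-term shifts like these are typically quantified by fitting a linear trend to the measured call frequencies. A minimal sketch in Python/NumPy follows; the numbers are placeholders, not the data or analysis of Gavrilov et al. (2012) or Ward et al. (2017).

    import numpy as np

    # Placeholder measurements: decimal year and the measured call frequency (Hz)
    year = np.array([2003.2, 2004.3, 2005.1, 2006.4, 2007.2, 2008.3, 2009.1])
    freq_hz = np.array([27.6, 27.5, 27.3, 27.2, 27.1, 26.9, 26.8])

    slope, intercept = np.polyfit(year, freq_hz, 1)   # least-squares linear trend
    print(f"estimated trend: {slope:.3f} Hz/year")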
8.6 Summary

Animals, whether they are in air, on land, or under water, produce sound in support of their various life functions. Cicadas join in chorus to repel predatory birds (Simmons et al. 1971); male fishes chorus on spawning grounds to attract females (Amorim et al. 2015); frogs call to attract mates and to mark out their territory (Narins et al. 2006); birds, too, sing for territorial and reproductive reasons (Catchpole and Slater 2008); bats emit clicks for echolocation during hunting and navigating, as do dolphins (Madsen and Surlykke 2013). In order to study animals by listening to their sounds, sounds need to be classified to species, to behavior, etc. In the early days, this was done without measurements or with only the simplest measuring tools. Scientists listened to the

[Fig. 8.27 plots frequency (Hz) against year (2002–2017) for the Antarctic blue whale (ABW) Z-call at Cape Leeuwin and other sites, and for spot calls at Cape Leeuwin, Perth Canyon, Portland, the Great Australian Bight, Bremer Bay, and Kangaroo Island]

Fig. 8.27 Weekly means of the upper part of the Antarctic blue whale Z-call over several years, as well as of the spot call, which remains to be identified to species. All locations are off Australia (GAB: Great Australian Bight). Data updated from Gavrilov et al. (2012) and Ward et al. (2017). Courtesy of Sasha Gavrilov

sounds in the field, often while visually observing animals. Scientists recorded sounds in the field and analyzed the recordings in the laboratory by listening, looking at oscillograms or spectrograms, and manually sorting sounds into types. Nowadays, with the affordability of autonomous recording equipment, bioacousticians collect vast amounts of data, which can no longer be analyzed without the aid of automated data processing, data reduction, and data analysis tools. Given simultaneous advances in computer hard- and software, datasets may be analyzed more efficiently, and with the added advantage of reducing opportunities for human subjective biases.

In this chapter, we presented software tools for automatically detecting animal sounds in acoustic recordings, and for classifying those sounds. The detectors we discussed compute a specific quantity of the sound (such as its instantaneous energy or entropy) and then apply a threshold above which the sound is deemed detected. The specific detectors were based on acoustic energy, Teager–Kaiser energy, entropy, matched filtering, and spectrogram cross-correlation. Setting the detection threshold critically affects how many signals are detected and how many are missed. We presented two ways of finding the best threshold and assessing detector performance: receiver operating characteristics and precision-recall curves.
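Both curves can be produced directly from a detector's scores and the corresponding ground-truth labels; a short sketch in Python with scikit-learn follows (the scores and labels are hypothetical).

    import numpy as np
    from sklearn.metrics import precision_recall_curve, roc_curve

    scores = np.array([0.10, 0.40, 0.35, 0.80, 0.65, 0.20])  # detector score per candidate event
    labels = np.array([0, 0, 1, 1, 1, 0])                    # 1 = true call, 0 = noise (manual review)

    precision, recall, pr_thresholds = precision_recall_curve(labels, scores)
    fpr, tpr, roc_thresholds = roc_curve(labels, scores)
    # Each candidate threshold corresponds to one point on each curve; the operating
    # threshold is chosen by weighing missed calls against false alarms.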
Once signals have been detected, they can be classified. A common pre-processing step immediately prior to classification includes the measurement of sound features such as minimum and maximum frequency, duration, or cepstral features. The software tools we presented for classification included parametric clustering, principal component analysis, discriminant function analysis, classification trees, and machine learning algorithms. No single tool outperforms all others; rather, the best tool suited for the specific task needs to be employed. We discussed advantages and limitations of the various tools and provided numerous examples from the literature. Finally, challenges resulting from recording artifacts, the environment affecting sound features, and changes in sound features over time and space were explored.

It is important to remember that human perception of a sound likely is not the same as an animal's perception of the sound and yet
bioacousticians commonly describe or classify animal sounds in human terms. Classification of the acoustic repertoire of an animal into sound types provides a convenient framework for comparing and contrasting sounds, taking systematic measurements from portions of the repertoire, and performing statistical analyses. However, categories determined based on human perception may have little or no relevance to the animals and so human categorizations can be biologically meaningless. For example, humans have limited low-frequency and high-frequency hearing abilities compared to many other species, and so aural classification of sound types is sometimes based on only a portion of a sound audible to the human listener. Whether sound types determined by humans are meaningful classes to the animals is mostly unknown. While categorizing sounds based on function is an attractive approach for the behavioral zoologist, establishing the functions of these sounds is often challenging. In our review of classification methods, it was clear that methods developed for human speech could be applied to animal sounds. Some fascinating questions lie ahead for bioacousticians as they attempt to extend understanding of the perception experienced by other animals.

Even with the above caveats, detection and classification of animal sounds is useful for research and conservation. It allows populations to be monitored, their distribution and abundance to be determined, and impacts (e.g., from human presence or climate change) to be assessed. It can also be useful for conservation of a species (i.e., to create taxonomy, identify geographic variation in populations, examine ecological connectivity among populations, and detect changes in the biological uses of sounds due to the advent and growth of anthropogenic noise). Classification of animal sounds is important for understanding behavioral ecology and social systems of animals and can be used to identify individuals, social groups, and populations. The ability to study these types of topics will ultimately lead to a deeper understanding of the evolutionary forces that shape animal bioacoustics.

With a goal to foster wider participation in research on bioacoustic pattern recognition, a number of global competitions are held regularly. The annual Detection and Classification of Acoustic Scenes and Events (DCASE) workshops and BirdCLEF challenges (part of Cross Language Evaluation Forum) attract hundreds of data scientists for developing machine learning solutions for recognizing bird sounds in soundscape recordings. The marine mammal community organizes the biennial Detection, Classification, Localization, and Density Estimation (DCLDE) workshops. These challenges put out large training datasets for researchers to develop detection and classification systems, assess the performance of submitted solutions with "held out" datasets, and reward the top-ranked submissions. The datasets from these challenges are often made available for use by the research community after the competitions, while some workshops make available the submitted solutions as well.

8.7 Additional Resources

• PAMGuard is an open-source software package for acoustic detection, classification, and localization of cetacean sounds: https://www.pamguard.org/
• Ishmael is a free software package for acoustic detection, classification, and localization of cetacean sounds: http://www.bioacoustics.us/ishmael.html
• Koe is a free, web-based software for annotation, measurement, and classification of bioacoustics signals: https://koe.io.ac.nz/# (Fukuzawa et al. 2020)
• Praat is free software originally designed for human speech analysis, but used by many bioacousticians: https://www.fon.hum.uva.nl/praat/
• Characterization Of Recorded Underwater Sound (CHORUS) is a MATLAB graphic user interface developed by Curtin University, Perth, WA, Australia, with built-in automatic detectors for pygmy blue and fin whales
(Gavrilov and Parsons 2014): https://cmst. aquatic mammals. De Spil Publishers, Woerden, The
curtin.edu.au/products/chorus-software/ Netherlands, pp 183–199
Au WWL, Branstetter B, Moore P, Finneran J (2012) The
• Detection, Classification, Localization, and biosonar field around an Atlantic bottlenose dolphin
Density Estimation of Marine Mammals (Tursiops truncatus). J Acoust Soc Am 131(1):
using Passive Acoustics meeting websites: 569–576. https://doi.org/10.1121/1.3662077
– Mount Hood, Oregon, USA, 2011: http:// Baptista LF, Gaunt SSL (1997) Social interaction and
vocal development in birds. In: Snowden CT,
www.bioacoustics.us/dcl.html Hausberger M (eds) Social influences on vocal devel-
– St Andrews, Scotland, UK, 2013: https:// opment. Cambridge Univ Press, Cambridge, pp 23–40
soi.st-andrews.ac.uk/dclde2013/ Baum LE, Eagon JA (1967) An inequality with
– San Diego, California, USA, 2015: http:// applications to statistical estimation for probabilistic
functions of Markov processes and to a model for
www.cetus.ucsd.edu/dclde/index.html ecology. Bull Am Math Soc 73:360–363
– Paris, France, 2018: http://sabiod.univ-tln. Baumgartner MF, Fratantoni DM (2008) Diel periodicity
fr/DCLDE/ in both Sei whale vocalization rates and the vertical
– Hawaii, USA, 2022: http://www.soest. migration of their copepod prey observed from ocean
gliders. Limnol Oceanogr 53:2197–2209. https://doi.
hawaii.edu/ore/dclde/ org/10.4319/lo.2008.53.5_part_2.2197
• Bird sound recognition challenges: http:// Beeman K (1998) Digital signal analysis, editing and
dcase.community/ (DCASE), https://www. synthesis. In: Hopp SL, Owren MJ, Evans CS (eds)
imageclef.org/BirdCLEF2020 (BirdCLEF) Animal acoustic communication: sound analysis and
research methods. Springer, Berlin, pp 59–103
• BirdNET is an Android app for birdsong rec- Belliustin NS, Kuznetsov SO, Nuidel IV, Yakhno VG
ognition: https://birdnet.cornell.edu/ (1991) Neural networks with close nonlocal coupling
• SongSleuth is an Apple or Android app for for analyzing composite image. Neurocomputing 3:
birdsong recognition: https://www. 231–246. https://doi.org/10.1016/0925-2312(91)
90005-V
songsleuth.com/#/ Bergler C, Schröter H, Cheng RX, Barth V, Weber M,
• All accessed 5 Aug 2022. Nöth E, Hofer H, Maier A (2019) ORCA-SPOT: an
automatic killer whale sound detection toolkit using
deep learning. Sci Rep 9(1):1–7. https://doi.org/10.
1038/s41598-019-47335-w
References Bermant PC, Bronstein MM, Wood RJ, Gero S, Gruber
DF (2019) Deep machine learning techniques for the
detection and classification of sperm whale bioacous-
Adi K, Johnson MT, Osiejuk TS (2010) Acoustic
tics. Sci Rep 9(1):1–10. https://doi.org/10.1038/
censusing using automatic vocalization classification
s41598-019-48909-4
and identity recognition. J Acoust Soc Am 127:874–
Borisova NG, Rudneva LV, Starkov AI (2008) Interpopu-
883. https://doi.org/10.1121/1.3273887
lation variability of vocalizations in the Daurian pika
Afifi AA, Clark V (1996) Computer-aided multivariate
(Ochotona daurica). Zool Zh 87:850–861
analysis, 3rd edn. Chapman and Hall/CRC, New York
Bouffaut L, Dréo R, Labat V, Boudraa AO, Barruol G
Amorim MC, Vasconcelos RO, Fonseca PJ (2015) Fish
(2018) Passive stochastic matched filter for Antarctic
sounds and mate choice. In: Ladich F (ed) Sound com-
blue whale call detection. J Acoust Soc Am 144(2):
munication in fishes. Springer, Vienna, pp 1–33
955–965. https://doi.org/10.1121/1.5050520
Anderson SE, Dave AS, Margoliash D (1996) Template-
Bradbury JW, Vehrencamp SL (2011) Principles of animal
based automatic recognition of birdsong syllables from
communication, 2nd edn. Sinauer Associates,
continuous recordings. J Acoust Soc Am 100:1209–
New York
1219. https://doi.org/10.1121/1.415968
Brandes TS (2008) Feature-vector selection and use with
Armitage DW, Ober HK (2010) A comparison of
Hidden Markov Models to identify frequency-
supervised learning techniques in the classification of
modulated bioacoustic signals amidst noise. IEEE
bat echolocation calls. Ecol Inform 5:465–473. https://
Trans Speech Lang Process 16:1173–1180. https://
doi.org/10.1016/j.ecoinf.2010.08.001
doi.org/10.1109/TASL.2008.925872
Aspetsberger F, Brandsen D, Jacobs DS (2003) Geo-
Breiman L (2001) Random forests. Mach Learn 45:5–32
graphic variation in the morphology, echolocation
Breiman L, Friedman J, Olshen R, Stone C (1984) Classi-
and diet of the little free-tailed bat, Chaerephon
fication and regression trees. Wadsworth, Pacific
pumilus (Molossidae). Afr Zool 38:245–254. https://
Grove, CA
doi.org/10.1080/15627020.2003.11407278
Briefer EF, Maigrot A-L, Roi T, Mandel R, Briefer
Au WWL, Nachtigall PE (1995) Artificial neural network
Freymond S, Bachmann I, Hillmann E (2015) Segre-
modeling of dolphin echolocation. In: Kastelein RA,
gation of information about emotional arousal and
Thomas JA, Nachtigall PE (eds) Sensory systems of
valence in horse whinnies. Sci Rep 5(1):1–11. https:// Clark LA, Pregibon D (1992) Statistical models. In:
doi.org/10.1038/srep09989 Chambers SJM, Hastie TJ (eds) Statistical models in
Briskie JV, Martin PR, Martin TE (1999) Nest predation S. Wadsworth and Brooks/Cole, Pacific Grove, CA
and the evolution of nestling begging calls. Proc R Soc Clarke E, Reichard UH, Zuberbühler K (2006) The syntax
Lond B 266:2153–2159. https://doi.org/10.1098/rspb. and meaning of wild gibbon songs. PLoS One 1(1):
1999.0902 E73. https://doi.org/10.1371/journal.pone.0000073
Brown JC, Miller PJO (2007) Automatic classification of Clemins PJ, Johnson MT (2005) Unsupervised classifica-
killer whale vocalizations using dynamic time warping. tion of beluga whale vocalizations. J Acoust Soc Am
J Acoust Soc Am 122:1201–1207. https://doi.org/10. 117:2470. https://doi.org/10.1121/1.4809461
1121/1.2747198 Clemins PJ, Johnson MT, Leong KM, Savage A (2005)
Brown JC, Smaragdis P (2008) Automatic classification of Automatic classification and speaker identification of
vocalizations with Gaussian mixture models and African elephant (Loxodonta africana) vocalizations. J
Hidden Markov Models. J Acoust Soc Am 123:3345. Acoust Soc Am 117:956–963. https://doi.org/10.1121/
https://doi.org/10.1121/1.2933896 1.1847850
Brown JC, Smaragdis P (2009) Hidden Markov and Coifman RR, Lafon S, Lee AB, Maggioni M, Nadler B,
Gaussian mixture models for automatic sound classifi- Warner F, Zucker SW (2005) Geometric diffusions as
cation. J Acoust Soc Am 125:EL221–EL224. https:// a tool for harmonic analysis and structure definition
doi.org/10.1121/1.3124659 of data: diffusion maps. Proc Natl Acad Sci 102(21):
Brown JC, Hodgins-Davis A, Miller PJO (2006) Classifi- 7426–7431. https://doi.org/10.1073/pnas.0500334102
cation of vocalizations of killer whales using dynamic Cortes C, Vapnik V (1995) Support-vector networks.
time warping. J Acoust Soc Am 119:EL34–EL40. Mach Learn 20:273–297
https://doi.org/10.1121/1.2166949 Courts R, Erbe C, Wellard R, Boisseau O, Jenner KC,
Buck JR, Tyack PL (1993) A quantitative measure of Jenner M-N (2020) Australian long-finned pilot whales
similarity for Tursiops truncatus signature whistles. J (Globicephala melas) emit stereotypical, variable,
Acoust Soc Am 94:2497–2506. https://doi.org/10. biphonic, multi-component, and sequenced
1121/1.407385 vocalisations, similar to those recorded in the northern
Camacho-Alpízar A, Fuchs EJ, Barrantes G (2018) Effect hemisphere. Sci Rep 10(1):20609. https://doi.org/10.
of barriers and distance on song, genetic, and morpho- 1038/s41598-020-74111-y
logical divergence in the highland endemic Timberline Crance JL, Berchok CL, Wright DL, Brewer AM,
Wren (Thryorchilus browni, Troglodytidae). PLoS Woodrich DF (2019) Song production by the North
One 13(12):e0209508. https://doi.org/10.1371/jour Pacific right whale, Eubalaena japonica. J Acoust Soc
nal.pone.0209508 Am 145(6):3467–3479. https://doi.org/10.1121/1.
Campbell P, Pasch B, Pino JL, Crino OL, Phillips M, 5111338
Phelps SM (2010) Geographic variation in the songs Cutler DR, Edwards TC Jr, Beard KH, Cutler A, Hess KT,
of neotropical singing mice: testing the relative impor- Gibson J, Lawler JJ (2007) Random forests for classi-
tance of drift and local adaptation. Evolution 64(7): fication in ecology. Ecology 88:2783–2792. https://
1955–1972. https://doi.org/10.1111/j.1558-5646. doi.org/10.1890/07-0539.1
2010.00962.x Dang T, Bulusu N, Hu W (2008) Lightweight acoustic
Catchpole CK, Slater PJB (2008) Bird song: biological classification for cane toad monitoring. In: 42nd
themes and variations, 2nd edn. Cambridge University Asilomar Conference on Signals, Systems and
Press, Cambridge Computers. IEEE, New York, pp 1601–1605
Cerchio S, Jacobsen JK, Norris TF (2001) Temporal and Datta S, Sturtivant C (2002) Dolphin whistle classification
geographical variation in songs of humpback whales, for determining group identities. Sig Process 82(2):
Megaptera novaeangliae: synchronous change in 251–258. https://doi.org/10.1016/S0165-1684(01)
Hawaiian and Mexican breeding assemblages. Anim 00184-0
Behav 62(2):313–329. https://doi.org/10.1006/anbe. Davis J, Goadrich M (2006) The relationship between
2001.1747 precision-recall and ROC curves. In: Proceedings of
Cho K, Van Merriënboer B, Bahdanau D, Bengio Y the 23rd International Conference on Machine
(2014) On the properties of neural machine translation: Learning, Pittsburgh, PA
encoder-decoder approaches. arXiv:1409.1259 Davis SB, Mermelstein P (1980) Comparison of
Clark CW (1980) A real-time direction-finding device for parametric representations for monosyllabic word rec-
determining the bearing to the underwater sounds of ognition in continuously spoken sentences. IEEE Trans
southern right whales, Eubalaena australis. J Acoust Acoust Speech Sig Process 28:357–366. https://doi.
Soc Am 68:508–511. https://doi.org/10.1121/1. org/10.1109/TASSP.1980.1163420
384762 Dawson MRW, Charrier I, Sturdy CB (2006) Using an
Clark CW (1982) The acoustic repertoire of the southern Artificial Neural Network to classify black-capped
right whale, a quantitative analysis. Anim Behav 30(4): chickadee (Poecile atricapillus) sound note types. J
1060–1071. https://doi.org/10.1016/S0003-3472(82) Acoust Soc Am 119(5):3161–3172. https://doi.org/
80196-6 10.1121/1.2189028
Deecke VB, Janik VM (2006) Automated categorization learning algorithms. Appl Acoust 120:158–166.
of bioacoustic signals: avoiding perceptual pitfalls. J https://doi.org/10.1016/j.apacoust.2017.01.025
Acoust Soc Am 119:645–653. https://doi.org/10.1121/ Fagerlund S (2007) Bird species recognition using support
1.2139067 vector machines. EURASIP J Appl Sig Proc 2007(1):
Deecke VB, Ford JKB, Spong P (1999) Quantifying com- 1–8. https://doi.org/10.1155/2007/38637
plex patterns of bioacoustic variation: use of a neural Fenton MB, Jacobson SL (1973) An automatic ultrasonic
network to compare killer whale (Orcinus orca) sensing system for monitoring the activity of some
dialects. J Acoust Soc Am 105:2499–2507. https:// bats. Can J Zool 51:291–299. https://doi.org/10.1139/
doi.org/10.1121/1.426853 z73-041
Delarue J, Todd SK, Van Parijs SM, Di Iorio L (2009) Fitch WT (2003) Mammalian vocal production: themes
Geographic variation in Northwest Atlantic fin whale and variation. In: Proceedings of the 1st International
(Balaenoptera physalus) song: implications for stock Conference on Acoustic Communication by Animals,
structure assessment. J Acoust Soc Am 125:1774– 27–30 July, pp 81–82
1782. https://doi.org/10.1121/1.3068454 Forti LR, Costa WP, Martins LB, Nunes-de-Almeida CH,
Delgado RA (2007) Geographic variation in the long Toledo LF (2016) Advertisement call and genetic
sounds of male orangutans (Pongo spp.). Ethology structure conservatism: good news for an endangered
113:487–498. https://doi.org/10.1111/j.1439-0310. Neotropical frog. PeerJ 4:e2014. https://doi.org/10.
2007.01345.x 7717/peerj.2014
Deregnaucourt S, Guyomarch JC, Richard V (2001) Clas- Freitag LE, Tyack PL (1993) Passive acoustic localization
sification of hybrid crows in quail using artificial neural of the Atlantic bottlenose dolphin using whistles and
networks. Behav Process 56:103–112. https://doi.org/ echolocation clicks. J Acoust Soc Am 93:2197–2205.
10.1016/S0376-6357(01)00188-7 https://doi.org/10.1121/1.406681
Duda R, Hart P, Stork D (2001) Pattern classification, 2nd Fristrup KM, Watkins WA (1993) Marine animal sound
edn. Wiley, Hoboken, NJ classification. Woods Hole Oceanographic Institution
Dunlop RA, Noad MJ, Cato DH, Stokes D (2007) The Technical Report WHOI-94-13, p 29
social vocalization repertoire of east Australian migrat- Frommolt K-H, Bardeli R, Clausen M (eds) (2007)
ing humpback whales (Megaptera novaeangliae). J Computational bioacoustics for assessing biodiversity.
Acoust Soc Am 122(5):2893–2905. https://doi.org/ Proceed Internat Expert meeting on IT-based detection
10.1121/1.2783115 of bioacoustical patterns, 7–10 December 2007 at the
Dunlop RA, Cato DH, Noad MJ, Stokes DM (2013) International Academy for Nature Conservation (INA)
Source levels of social sounds in migrating humpback Isle of Vilm, Germany. BfN - Skripten Federal Agency
whales (Megaptera novaeangliae). J Acoust Soc Am for Nature Conservation, p 234
134(1):706–714. https://doi.org/10.1121/1.4807828 Fukushima K, Wake N (1990) Alphanumeric character
Egan JP (1975) Signal detection theory and ROC analysis. recognition by neocognitron. In: Miller RE
Academic Press, New York (ed) Advanced neural computers. Elsevier Science,
Eiler KC, Banack SA (2004) Variability in the alarm call Amsterdam, pp 263–270
of golden-mantled ground squirrels (Spermophilus Fukuzawa Y, Webb WH, Pawley MD, Roper MM,
lateralis and S. saturatus). J Mammal 85:43–50. Marsland S, Brunton DH, Gilman A (2020) Koe:
https://doi.org/10.1644/1545-1542(2004)085<0043: web-based software to classify acoustic units and ana-
VITACO>2.0.CO;2 lyse sequence structure in animal vocalizations.
Erbe C, King AR (2008) Automatic detection of marine Methods Ecol Evol 11:431–441. https://doi.org/10.
mammals using information entropy. J Acoust Soc Am 1111/2041-210X.13336
124(5):2833–2840. https://doi.org/10.1121/1.2982368 Gannier A, Fuchs S, Quebre P, Oswald JN (2010) Perfor-
Erbe C, Verma A, McCauley R, Gavrilov A, Parnum I mance of a contour-based classification method for
(2015) The marine soundscape of the Perth Canyon. whistles of Mediterranean dolphins. Appl Acoust 7:
Prog Oceanogr 137:38–51. https://doi.org/10.1016/j. 1063–1069. https://doi.org/10.1016/j.apacoust.2010.
pocean.2015.05.015 05.019
Erbe C, Reichmuth C, Cunningham K, Lucke K, Dooling R Gannon WL, Lawlor TE (1989) Variation in the chip
(2016) Communication masking in marine mammals: a vocalization of three species of Townsend’s
review and research strategy. Mar Pollut Bull 103:15– chipmunks (genus Eutamias). J Mammal 70:740–753
38. https://doi.org/10.1016/j.marpolbul.2015.12.007 Gannon WL, Sherwin RE, deCarvalho TN, O’Farrell MJ
Erbe C, Dunlop R, Jenner KCS, Jenner M-NM, McCauley (2001) Pinnae and echolocation call differences
RD, Parnum I, Parsons M, Rogers T, Salgado-Kent C between Myotis californicus and M. ciliolabrum
(2017) Review of underwater and in-air sounds emitted (Chiroptera: Vespertilionidae). Acta Chiropterol 3(1):
by Australian and Antarctic marine mammals. Acoust 77–91
Aust 45:179–241. https://doi.org/10.1007/s40857- Gannon WL, O’Farrell MJ, Corben C, Bedrick EJ (2004)
017-0101-z Call character lexicon and analysis of field recorded bat
Esfahanian M, Erdol N, Gerstein E, Zhuang H (2017) echolocation calls. In: Thomas J, Moss C, Vater M
Two-stage detection of north Atlantic right whale (eds) Echolocation in bats and dolphins. The Univer-
upcalls using local binary patterns and machine sity of Chicago Press, Chicago, pp 478–484
Garland EC, Castellote M, Berchok CL (2015) Beluga Hamilton LJ, Cleary J (2010) Automatic discrimination of
whale (Delphinapterus leucas) vocalizations and call beaked whale clicks in noisy acoustic time series. In:
classification from the eastern Beaufort Sea population. OCEANS’10 IEEE Sydney, pp 1–5
J Acoust Soc Am 137:3054–3067. https://doi.org/10. Hammerschmidt K, Fischer J (1998) The vocal repertoire
1121/1.4919338 of Barbary macaques: a quantitative analysis of a
Garland EC, Rendell L, Lilley MS, Poole MM, Allen J, graded signal system. Ethology 104(3):203–216.
Noad MJ (2017) The devil is in the detail: quantifying https://doi.org/10.1111/j.1439-0310.1998.tb00063.x
vocal variation in a complex, multi-levelled, and rap- Hammerschmidt K, Reisinger E, Westekemper K,
idly evolving display. J Acoust Soc Am 142(1): Ehrenreich L, Strenzke N, Fischer J (2012) Mice do
460–472. https://doi.org/10.1121/1.4991320 not require auditory input for the normal development
Gavrilov AN, Parsons MJG (2014) A MATLAB tool for of their ultrasonic vocalizations. BMC Neurosci 13:40
the characterization of recorded underwater sound Harland E (2008) Processing the workshop datasets using
(CHORUS). Acoust Aust 42(3):190–196 the TRUD algorithm. Can Acoust 36:27–33
Gavrilov A, McCauley R, Gedamke J (2012) Steady inter He K, Zhang X, Ren S, Sun J (2016) Deep residual
and intra-annual decrease in the vocalization frequency learning for image recognition. Proc IEEE Conf
of Antarctic blue whales. J Acoust Soc Am 131(6): Comput Vis Pattern Recogn 2016:770–778
4476–4480. https://doi.org/10.1121/1.4707425 Helweg DA, Cato ADH, Jenkins PF, Garrigue D,
Gedamke J, Costa DP, Dunstan A (2001) Localization and McCauley RD (1998) Geographic variation in South
visual verification of a complex minke whale vocaliza- Pacific humpback whale songs. Behaviour 135:1–27
tion. J Acoust Soc Am 109(6):3038–3047. https://doi. Herr, A, Klomp, NL, Atkinson, JS (1997) Identification of
org/10.1121/1.1371763 bat echolocation calls using decision tree classification
Gemello R, Mana F (1991) A neural approach to speaker system Complexity International. https://www.
independent isolated word recognition in an uncon- researchgate.net/publication/293134471_Identifica
trolled environment. In: Proceedings of the Interna- tion_of_bat_echolocation_calls_using_a_decision_
tional Neural Networks Conference, Paris 9–13 July tree_classification_system. Accessed 17 July 2017
1990, vol 1. Kluwer Academic Publishers, Dordrecht, Himawan I, Towsey M, Law B, Roe P (2018). Deep
pp 83–86 learning techniques for Koala Activity detection. In:
Ghosh J, Deuser LM, Beck SD (1992) A neural network INTERSPEECH, pp. 2107–2111
based hybrid system for detection, characterization, Hochreiter S, Schmidhuber J (1997) Long short-term
and classification of short-duration oceanic signals. memory. Neural Comput 9(8):1735–1780
IEEE J Ocean Eng 17:351–363. https://doi.org/10. Holy TE, Guo Z (2005) Ultrasonic songs of male mice.
1109/48.180304 PLoS One Biol 3(12):e386. https://doi.org/10.1371/
Gill SA, Bierema AM-K (2013) On the meaning of alarm journal.pbio.0030386
calls: a review of functional reference in avian alarm Horn AG, Falls JB (1996) Categorization and the design of
calling. Ethology 119:449–461. https://doi.org/10. signals: the case of song repertoires. In: Kroodsma DE,
1111/eth.12097 Miller EH (eds) Ecology and evolution of acoustic
Gillespie D, Caillat M (2008) Statistical classification of communication in birds. Comstock Publishing
odontocete clicks. Can Acoust 36:20–26 Associates, Ithaca, pp 121–135
Gillespie D, Caillat M, Gordon J (2013) Automatic detec- Hotelling H (1933) Analysis of a complex of statistical
tion and classification of odontocete whistles. J Acoust variables into principal components. J Edu Psychol 24:
Soc Am 134:2427–2437. https://doi.org/10.1121/1. 417–441
4816555 Huang X, Acero A, Hon H-W (2001) Spoken language
Gingras G, Fitch WT (2013) A three-parameter model for processing. Prentice Hall, Upper Saddle River, NJ
classifying anurans into four genera based on adver- Huang G, Liu Z, Van Der Maaten L, Weinberger KQ
tisement calls. J Acoust Soc Am 133:547–559. https:// (2017) Densely connected convolutional networks.
doi.org/10.1121/1.4768878 Proc IEEE Conf Comput Vis Pattern Recogn 2017:
Goëau H, Glotin H, Vellinga WP, Planqué R, Joly A 4700–4708
(2016) LifeCLEF bird identification task 2016: the Ibrahim AK, Chérubin LM, Zhuang H, Schärer Umpierre
arrival of deep learning. CLEF 1609:440–449 MT, Dalgleish F, Erdol N, Ouyang B, Dalgleish A
Griffin DR, Webster FA, Michael CR (1960) The echolo- (2018) An approach for automatic classification of
cation of flying insects by bats. Anim Behav 8:141– grouper vocalizations with passive acoustic monitor-
154 ing. J Acoust Soc Am 143:666–676. https://doi.org/10.
Guemeur Y, Elisseeff A, Paugam-Moisey H (2000) A new 1121/1.5022281
multi-class SVM based on a uniform convergence Itakura F (1975) Minimum prediction residual principle
result. Proceedings of the IEEE-INNS-ENNS Interna- applied to speech recognition. IEEE Trans Acoust
tional Joint Conference on Neural Networks. IJCNN Speech Sig Process 23:57–72
2000. Neural Computing: New Challenges and Jacobson EK, Yack TM, Barlow J (2013) Evaluation of an
Perspectives for the New Millennium 4:183–188 automated acoustic beaked whale detection algorithm
using multiple validation and assessment methods. In: Ko T, Peddinti V, Povey D, Khudanpur S (2015) Audio
NOAA Technical Memorandum NOAA-TM-NMFS- augmentation for speech recognition. In: Sixteenth
SWFSC-509 Annual Conference of the International Speech Com-
Jaitly N, Hinton GE (2013) Vocal tract length perturbation munication Association
(VTLP) improves speech recognition. In: Proceedings Kogan J, Margoliash D (1998) Automated recognition of
of ICML Workshop on Deep Learning for Audio, bird song elements from continuous recordings using
Speech and Language, vol 117 dynamic time warping and hidden Markov models: a
Janik VM (1999) Pitfalls in the categorization of behavior: comparative study. J Acoust Soc Am 103:2185–2196.
a comparison of dolphin whistle classification https://doi.org/10.1121/1.421364
methods. Anim Behav 57:133–143. https://doi.org/10. Kollmorgen S, Hahnloser RH, Mante V (2020) Nearest
1006/anbe.1998.0923 neighbours reveal fast and slow components of motor
Jarvis S, Dimarzio N, Morrissey R, Moretti D (2006) learning. Nature 577(7791):526–530. https://doi.org/
Automated classification of beaked whales and other 10.1038/s41586-019-1892-x
small odontocetes in the Tongue of the Ocean, Kondo N, Watanabe S (2009) Contact calls: information
Bahamas. Oceans 2006:1–6. https://doi.org/10.1109/ and social function. Jpn Psych Res 51:197–208.
OCEANS.2006.307124 https://doi.org/10.1111/j.1468-5884.2009.00399.x
Jiang JJ, Bu LR, Duan FJ, Wang XQ, Liu W, Sun ZB, Li Koren L, Geffen E (2009) Complex call in male rock
CY (2019) Whistle detection and classification for hyrax (Procavia capensis): a multi-information
whales based on convolutional neural networks. Appl distributing channel. Behav Ecol Sociobiol 63(4):
Acoust 150:169–178. https://doi.org/10.1016/j. 581–590. https://doi.org/10.1007/s00265-008-0693-2
apacoust.2019.02.007 Koren L, Geffen E (2011) Individual identity is
Kandia V, Stylianou Y (2006) Detection of sperm whale communicated through multiple pathways in male
clicks based on the Teager–Kaiser energy operator. rock hyrax (Procavia capensis) songs. Behav Ecol
Appl Acoust 67(11):1144–1163. https://doi.org/10. Sociobiol 65(4):675–684. https://doi.org/10.1007/
1016/j.apacoust.2006.05.007 s00265-010-1069-y
Karlsen JD, Bisther A, Lyndersen C, Haug T, Kovacs KM Koren L, Mokady O, Geffen E (2008) Social status and
(2002) Summer vocalizations of adult male white cortisol levels in singing rock hyraxes. Horm Behav
whales (Delphinapterus leucas) in Svalbard, Norway. 54:212–216
Polar Biol 25:808–817. https://doi.org/10.1007/ Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet
s00300-002-0415-6 classification with deep convolutional neural networks.
Keen S, Ross JC, Griffiths ET, Lanzone M, Farnsworth A Commun ACM 60(6):84–90
(2014) A comparison of similarity-based approaches Kruskal J, Sankoff D (1983) An anthology of algorithms
in the classification of flight calls of four species of and concepts for sequence comparison. In: Sankoff D,
North American wood-warblers (Parulidae). Ecol Inf Kruskal J (eds) Time warps, string edits and
21:25–33. https://doi.org/10.1016/j.ecoinf.2014.01. macromolecules: the theory and practice of string com-
001 parison. Addison-Wesley, Reading, MA, pp 265–310
Keighley MV, Langmore NE, Zdenek CN, Heinsohn R Lammers MO, Au WWL, Herzing DL (2003) The broad-
(2017) Geographic variation in the vocalizations of band social acoustic signaling behavior of spinner and
Australian palm cockatoos (Probosciger aterrimus). spotted dolphins. J Acoust Soc Am 114:1629–1639.
Bioacoustics 26(1):91–108. https://doi.org/10.1080/ https://doi.org/10.1121/1.1596173
09524622.2016.1201778 Law BS, Reinhold L, Pennay M (2002) Geographic varia-
Kershenbaum A, Blumstein DT, Roch MA, Akcay C, tion in the echolocation sounds of Vespadelus spp.
Backus G, Bee MA, Bohn K, Cao Y, Carter G, (Vespertilionidae) from New South Wales and
Cäsar C, Coen M, DeRuiter SL, Doyle L, Edelman S, Queensland, Australia. Acta Chiropt 4:201–215.
Ferrer-i-Cancho R, Freeberg TM, Garland EC, https://doi.org/10.3161/001.004.0208
Gustison M, Harley HE, Huetz C, Hughes M, Bruno Le Boeuf BJ, Peterson RS (1969) Dialects in elephant
JH, Ilany A, Jin DZ, Johnson M, Ju C, Karnowski J, seals. Science 166(3913):1654–1656. https://doi.org/
Lohr B, Manser MB, McCowan B, Mercado E, Narins 10.1126/science.166.3913.1654
PM, Piel A, Rice M, Salmi R, Sasahara K, Sayigh L, Leblanc E, Bahoura M, Simard Y (2008) Comparison of
Shiu Y, Taylor C, Vallejo EE, Waller S, Zamora- automatic classification methods for beluga whale
Gutierrez V (2016) Acoustic sequences in non-human vocalizations. J Acoust Soc Am 123:3772
animals: a tutorial review and prospectus. Biol Rev 91: LeCun Y, Boser B, Denker JS, Henderson D, Howard RE,
13–52 Hubbard W, Jackel LD (1989a) Backpropagation
Kingma DP, Welling M (2013) Auto-encoding variational applied to handwritten zip code recognition. Neural
bayes. arXiv preprint arXiv:1312.6114 Comput 1(4):541–551. https://doi.org/10.1162/neco.
Klinck H, Mellinger DK (2011) The energy ratio mapping 1989.1.4.541
algorithm: a tool to improve the energy-based detection LeCun Y, Boser B, Denker JS, Henderson D, Howard RE,
of odontocete echolocation clicks. J Acoust Soc Am Hubbard W, Jackel LD (1989b) Handwritten digit rec-
129(4):1807–1812. https://doi.org/10.1121/1.3531924 ognition with a back-propagation network. In:
Proceedings of the 2nd International Conference on comparison. J Acoust Soc Am 147(5):3078–3090.


Neural Information Processing Systems, pp 396–404 https://doi.org/10.1121/10.0001108
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient- Madhusudhana S, Shiu Y, Klinck H, Fleishman E, Liu X,
based learning applied to document recognition. Proc Nosal EM, Helble T, Cholewiak D, Gillespie D,
IEEE 86(11):2278–2324. https://doi.org/10.1109/5. Širović A, Roch MA (2021) Improve automatic detec-
726791 tion of animal call sequences with temporal context. J
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. R Soc Interface 18:20210297. https://doi.org/10.1098/
Nature 521(7553):436–444. https://doi.org/10.1038/ rsif.2021.0297
nature14539 Madsen PT, Surlykke A (2013) Functional convergence in
Lee C-H, Hsu S-B, Shih J-L, Chou C-H (2013) Continu- bat and toothed whale biosonars. Physiology 28(5):
ous birdsong recognition using Gaussian mixture 276–283. https://doi.org/10.1152/physiol.00008.2013
modeling of image shape features. IEEE Trans Multi- Makhoul J, Schwarz R (1995) State of the art in continu-
media 15:454–464. https://doi.org/10.1109/TMM. ous speech recognition. Proc Nat Acad Sci USA 92:
2012.2229969 9956–9963. https://doi.org/10.1073/pnas.92.22.9956
Leonard ML, Horn AG (2001) Begging calls and parental Malfante M, Mohammed O, Gervaise C, Dalla Mura M,
feeding decisions in tree swallows (Tachycineta Mars JI (2018) Use of deep features for the automatic
bicolor). Behav Ecol Sociobiol 49:170–175. https:// classification of fish sounds. In: 2018 OCEANS-MTS/
doi.org/10.1007/s002650000290 IEEE Kobe Techno-Oceans (OTO), pp 1–5. https://doi.
Levinson S (1985) Structural methods in automatic speech org/10.1109/OCEANSKOBE.2018.8559276
recognition. Proc IEEE 73:1625–1648. https://doi.org/ Mankin RW, Smith T, Tropp JM, Atkinson EB, Young DY
10.1109/PROC.1985.13344 (2008) Detection of Anoplophora glabripennis (Coleop-
Li Z, Tang S, Yan S (2002) Multi-class SVM classifier tera: Cerambycidae) larvae in different host trees and
based on pair wise coupling. In: Proceedings of the tissues by automated analysis of sound-impulse fre-
First International Workshop, SVM 2002, Niagara quency and temporal patterns. J Econ Entomol 101(3):
Falls, Canada, p 321 838–849. https://doi.org/10.1093/jee/101.3.838
Liaw A, Wiener M (2002) Classification and regression by Marler P (2004) Bird calls: a cornucopia for
Random Forest. R News 2:18–22 communication. In: Marler P, Slabbekoorn H (eds)
Linderman GC, Rachh M, Hoskins JG, Steinerberger S, Nature’s music: the science of birdsong. Elsevier,
Kluger Y (2017) Efficient algorithms for t-distributed Amsterdam, pp 132–177
stochastic neighborhood embedding. arXiv preprint Martindale S (1980a) On the multivariate analysis of avian
arXiv:1712.09005 vocalizations. J Theor Biol 83:107–110. https://doi.
Lippman R (1989) Pattern classification using neural org/10.1016/0022-5193(80)90374-4
networks. IEEE Commun Mag 1989:47–64 Martindale S (1980b) A numeric approach to the analysis
Luo W, Yang W, Zhang Y (2019) Convolutional neural of solitary vireo songs. Condor 82:199–211. https://
network for detecting odontocete echolocation clicks. J doi.org/10.2307/1367478
Acoust Soc Am 145(1):EL7–EL12. https://doi.org/10. Mazhar S, Ura T, Bahl R (2007) Vocalization based indi-
1121/1.5085647 vidual classification of humpback whales using sup-
Maaten LV (2014) Accelerating t-SNE using tree-based port-vector-machine. Oceans 2007:1–9. https://doi.
algorithms. J Mach Learn Res 15(1):3221–3245 org/10.1109/OCEANS.2007.4449356
Maaten LV, Hinton G (2008) Visualizing data using McDonald MA, Mesnick SL, Hildebrand JA (2006) Bio-
t-SNE. J Mach Learn Res 9:2579–2605 geographic characterisation of blue whale song world-
Mac Aodha O, Gibb R, Barlow KE, Browning E, wide: using song to identify populations. J Cetacean
Firman M, Freeman R, Harder B, Kinsey L, Mead Res Manag 8(1):55–65
GR, Newson SE, Pandourski I (2018) Bat detective— McInnes L, Healy J, Melville J (2018) UMAP: uniform
deep learning tools for bat acoustic signal detection. manifold approximation and projection for dimension
PLoS Comput Biol 14(3):e1005995. https://doi.org/10. reduction. arXiv preprint arXiv:1802.03426
1371/journal.pcbi.1005995 McLaughlin J, Josso N, Ioana C (2008) Detection and
Madhusudhana S, Gavrilov AN, Erbe C (2015) Automatic classification of sound types in the vocalizations of
detection of echolocation clicks based on a Gabor north-east pacific blue whales. J Acoust Soc Am 123:
model of their waveform. J Acoust Soc Am 137(6): 3102
3077–3086. https://doi.org/10.1121/1.4921609 McLister D, Stevens ED, Bogart JP (1995) Comparative
Madhusudhana S, Symes LB, Klinck H (2019) A deep contractile dynamics of calling and locomotor muscles
convolutional neural network based classifier for pas- in three hylid frogs. J Exp Biol 198(7):1527–1538.
sive acoustic monitoring of neotropical katydids. J https://doi.org/10.1242/jeb.198.7.1527
Acoust Soc Am 146(4):2982–2982. https://doi.org/ Mellinger DK (2008) A neural network for classifying
10.1121/1.5137323 clicks of Blainville’s beaked whales (Mesoplodon
Madhusudhana S, Murray A, Erbe C (2020) Automatic densirostris). Can Acoust 36:55–59
detectors for low-frequency vocalizations of Omura’s Mellinger DK, Bradbury JW (2007) Acoustic measure-
whales, Balaenoptera omurai: a performance ment of marine mammal sounds in noisy
8 Detection and Classification Methods for Animal Sounds 313

environments. In: Proceedings of the 2nd International their acoustic signals. Appl Sci 6(12):443. https://doi.
Conference on Underwater Acoustic Measurements: org/10.3390/app6120443
Technologies and Results, Heraklion, Greece, O’Farrell MJ, Miller BW, Gannon WL (1999) Qualitative
25–29 June 2007 identification of free-flying bats using Anabat detector.
Mellinger DK, Clark CW (2000) Recognizing transient J Mammal 80:11–23
low-frequency whale sounds by spectrogram correla- Oh J, Laubach M, Luczak A (2003) Estimating neuronal
tion. J Acoust Soc Am 107(6):3518–3529. https://doi. variable importance with random forest. Proc IEEE
org/10.1121/1.429434 Bioeng Conf:33–34. https://doi.org/10.1109/NEBC.
Mellinger DK, Martin SW, Morrissey RP, Thomas L, 2003.1215978
Yosco JJ (2011) A method for detecting whistles, Oleson EM, Širović A, Bayless AR, Hildebrand JA (2014)
moans and other frequency contour sounds. J Acoust Synchronous seasonal change in fin whale song in the
Soc Am 129:4055–4061. https://doi.org/10.1121/1. North Pacific. PLoS One 9(12):e115678. https://doi.
3531926 org/10.1371/journal.pone.0115678
Mendelson TC, Shaw KL (2003) Rapid speciation in an Oswald JN, Barlow J, Norris TF (2003) Acoustic identifi-
arthropod. Nature 433:375–376. https://doi.org/10. cation of nine delphinid species in the eastern tropical
1038/433375a Pacific Ocean. Mar Mamm Sci 19:20–37. https://doi.
Mitani JC, Hasegawa T, Groslouis J, Marler P, Byrne R org/10.1111/j.1748-7692.2003.tb01090.x
(1992) Dialects in wild chimpanzees. Am J Primatol Oswald JN, Rankin S, Barlow J, Lammers MO (2007) A
27:233–243 tool for real-time acoustic species identification of
Møhl B, Wahlberg M, Madsen PT, Heerford A, Lund A delphinid whistles. J Acoust Soc Am 122:587–595.
(2003) The monopulsed nature of sperm whale sonar https://doi.org/10.1121/1.2743157
clicks. J Acoust Soc Am 114(2):1143–1154. https:// Oswald JN, Au WWL, Duennebier F (2011) Minke whale
doi.org/10.1121/1.1586258 (Balaenoptera acutorostrata) boings detected at the Sta-
Moon TK (1996) The expectation-maximization algo- tion ALOHA cabled observatory. J Acoust Soc Am 129:
rithm. IEEE Sig Process Mag 13:47–60. https://doi. 3353–3360. https://doi.org/10.1121/1.3575555
org/10.1109/79.543975 Oswald JN, Rankin S, Barlow J, Oswald M (2013) Real-
Morrissey RP, Ward J, DiMarzio N, Jarvis S, Moretti DJ time odontocete call classification algorithm: software
(2006) Passive acoustic detection and localization of for species identification of delphinid whistles. In:
sperm whales (Physeter macrocephalus) in the tongue Adam O, Samaran F (eds) Detection, classification
of the ocean. Appl Acoust 67:1091–1105. https://doi. and localization of marine mammals using passive
org/10.1016/j.apacoust.2006.05.014 acoustics, 2003-2013: 10 years of international
Mouy X, Leary D, Martin B, Laurinolli M (2008) A research. DIRAC NGO, Paris, France
comparison of methods for the automatic classification Oswald JN, Walmsley SF, Casey C, Fregosi S, Southall B,
of marine mammal vocalizations in the Arctic. In: Janik VM (2021) Species information in whistle fre-
Proceedings of the PASSIVE’08 Workshop on New quency modulation patterns of common dolphins.
Trends for Environmental Monitoring using Passive Philos Trans R Soc B 376:20210046. https://doi.org/
Systems, Hyeres, France, 14–17 October 2008 10.1098/rstb.2021.0046
Murray SO, Mercado E, Roitblat HL (1998) Ou H, Au WWL, Oswald JN (2012) A non-spectrogram-
Characterizing the graded structure of false killer correlation method of automatically detecting minke
whale (Pseudorca crassidens) vocalizations. J Acoust whale boings. J Acoust Soc Am 132:EL317–EL322
Soc Am 104:1679–1687. https://doi.org/10.1121/1. Ouattara K, Lemasson A, Zuberbunter K (2009)
424380 Campbell’s monkeys concatenate vocalizations into
Myers C, Rabiner LR, Rosenberg AE (1980) Performance context-specific call sequences. Proc Natl Acad Sci
tradeoffs in dynamic time warping algorithms for USA 106(51):22026
isolated word recognition. IEEE Trans Acoust Speech Papale E, Azzolin M, Cascao I, Gannier A, Lammers MO,
Sig Process 28:623–635. https://doi.org/10.1109/ Martin VM, Oswald JN, Perez-Gil M, Prieto R, Silva
TASSP.1980.1163491 MA, Giacoma C (2013) Geographic variability in the
Nagy CM, Rockwell RF (2012) Identification of individ- acoustic parameters of striped dolphin’s (Stenella
ual eastern screech-owls (Megascops asio) via vocali- coeruleoalba) whistles. J Acoust Soc Am 133:1126–
zation analysis. Bioacoustics 21:127–140. https://doi. 1134. https://doi.org/10.1121/1.4774274
org/10.1080/09524622.2011.651829 Papale E, Azzolin M, Cascao I, Gannier A, Lammers MO,
Narins PM, Feng AS, Fay RR (eds) (2006) Hearing Martin VM, Oswald J, Perez-Gil M, Prieto R, Silva
and sound communication in amphibians. Springer, MA, Giacoma C (2014) Macro- and micro- geographic
New York variation of short-beaked common dolphin’s whistles
Noad MJ, Cato DH, Bryden MM, Jenner MN, Jenner KCS in the Mediterranean Sea and Atlantic Ocean. Ethol
(2000) Cultural revolution in whale songs. Nature 408: Ecol Evol 26:392–404. https://doi.org/10.1080/
537. https://doi.org/10.1038/35046199 03949370.2013.851122
Noda JJ, Travieso CM, Sánchez-Rodríguez D (2016) Park DS, Chan W, Zhang Y, Chiu C, Zoph B, Cubuk ED,
Automatic taxonomic classification of fish based on Le QV (2019) SpecAugment: a simple data
314 J. N. Oswald et al.

augmentation method for automatic speech recogni- clicks and burst-pulses. Mar Mamm Sci 33:520–540.
tion. Proc Interspeech 2019:2613–2617. https://doi. https://doi.org/10.1111/mms.12381
org/10.21437/Interspeech.2019-2680 Reby D, André-Obrecht R, Galinier A, Farinas J,
Parsons S, Boonman AM, Obrist MK (2000) Advantages Cargnelutti B (2006) Cepstral coefficients and hidden
and disadvantages of techniques for transforming and Markov models reveal idiosyncratic voice
analyzing chiropteran echolocation calls. J Mammal characteristics in red deer (Cervus elaphus) stags. J
81:927–938. https://doi.org/10.1644/1545-1542(2000) Acoust Soc Am 120:4080–4089. https://doi.org/10.
081<0927:AADOTF>2.0.CO;2 1121/1.2358006
Payne K, Payne R (1985) Large scale changes over Recalde-Salas A, Salgado Kent CP, Parsons MJG, Marley
19 years in songs of humpback whales in Bermuda. Z SA, McCauley RD (2014) Non-song vocalizations of
Tierpsychol 68:89–114. https://doi.org/10.1111/j. pygmy blue whales in Geographe Bay, Western
1439-0310.1985.tb00118.x Australia. J Acoust Soc Am 135(5):EL213–EL218.
Picone JW (1993) Signal modeling techniques in speech https://doi.org/10.1121/1.4871581
recognition. Proc IEEE 81:1215–1247. https://doi.org/ Recalde-Salas A, Erbe C, Salgado Kent C, Parsons M
10.1109/5.237532 (2020) Non-song vocalizations of humpback whales
Placer J, Slobodchikoff CN (2000) A fuzzy-neural system in Western Australia. Front Mar Sci 7:141. https://
for identification of species-specific alarm sounds of doi.org/10.3389/fmars.2020.00141
Gunnison’s prairie dogs. Behav Process 52:1–9. Rickwood P, Taylor A (2008) Methods for automatically
https://doi.org/10.1016/S0376-6357(00)00105-4 analyzing humpback song units. J Acoust Soc Am 123:
Potter JR, Mellinger DK, Clark CW (1994) Marine mam- 1763–1772. https://doi.org/10.1121/1.2836748
mal sound discrimination using artificial neural Risch D, Gales NJ, Gedamke J, Kindermann L, Nowacek
networks. J Acoust Soc Am 96:1255–1262. https:// DP, Read AJ, Siebert U, Van Opzeeland IC, Van Parijs
doi.org/10.1121/1.410274 SM, Friedlander AS (2014) Mysterious bio-duck
Pozzi L, Gamba M, Giacoma C (2010) The use of Artifi- sound attributed to the Antarctic minke whale
cial Neural Networks to classify primate vocalizations: (Balaenoptera bonaerensis). Biol Lett 10:20140175.
a pilot study on black lemurs. Am J Primatol 72(4): https://doi.org/10.1098/rsbl.2014.0175
337–348. https://doi.org/10.1002/ajp.20786 Roch MA, Soldevilla MS, Burtenshaw JC, Henderson EE,
Prӧhl H, Koshy RA, Mueller U, Rand AS, Ryan MJ Hildebrand JA (2007) Gaussian mixture model classi-
(2006) Geographic variation of genetic and behavioral fication of odontocetes in the Southern California
traits in northern and southern Túngara frogs. Evol 60: Bight and the Gulf of California. J Acoust Soc Am
1669–1679. https://doi.org/10.1111/j.0014-3820.2006. 121:1737–1748. https://doi.org/10.1121/1.2400663
tb00511.x Roch MA, Soldevilla MS, Hoenigman R, Wiggins SM,
Rabiner LR (1989) A tutorial on hidden Markov models Hildebrand JA (2008) Comparison of machine-
and selected applications in speech recognition. Proc learning techniques for the classification of echoloca-
IEEE 77:257–285 tion clicks from three species of odontocetes. Can
Rabiner LR, Juang BH (1986) An introduction to Hidden Acoust 36:41–47
Markov Models. IEEE ASSP Mag 1986:4–16 Roch MA, Brandes TS, Patel B, Barkley Y, Baumann-
Rabiner LR, Levinson S, Sondhi M (1983) On the appli- Pickering S, Soldevilla MS (2011) Automated
cation of vector quantization and hidden Markov extraction of odontocete whistle contours. J Acoust
models to speaker-independent, isolated word recogni- Soc Am 130:2212–2223. https://doi.org/10.1121/1.
tion. Bell Syst Tech J 62:1075–1106. https://doi.org/ 3624821
10.1002/j.1538-7305.1983.tb03115.x Rocha HS, Ferreira LS, Paula BC, Rodrigues HG, Sousa-
Rabiner LR, Juang B, Lee C (1996) An overview of Lima RS (2015) An evaluation of manual and
automatic speech recognition. In: Lee C, Soong F, automated methods for detecting sounds of mane
Paliwal K (eds) Automatic speech and speaker recog- wolves (Chrysocyon brachyurus Illiger 1815). Bio-
nition. Kluwer Academic, New York, pp 1–30 acoustics 24:185–198. https://doi.org/10.1080/
Rankin S, Barlow J (2005) Source of the North Pacific 09524622.2015.1019361
‘boing’ sound attributed to minke whales. J Acoust Soc Roitblat HL, Moore PWB, Nachtigall PE, Penner RH, Au
Am 118(5):3346–3351. https://doi.org/10.1121/1. WWL (1989) Natural echolocation with an artificial
2046747 neural network. Int J Neural Syst 1:239–247
Rankin S, Ljungblad D, Clark CW, Kato H (2005) Rosenblatt F (1958) The perceptron: a probabilistic model
Vocalisations of Antarctic blue whales, Balaenoptera for information storage and organization in the brain.
musculus intermedia, recorded during the 2001/2002 Psychol Rev 65:386–408. https://doi.org/10.1037/
and 2002/2003 IWC/SOWER circumpolar cruises, h0042519
Area V, Antarctica. J Cet Res Manag 7(1):13–20 Ross JC, Allen PE (2014) Random forest for improved
Rankin S, Archer F, Keating JL, Oswald JN, Oswald M, analysis efficiency in passive acoustic monitoring. Ecol
Curtis A, Barlow J (2016) Acoustic classification of Inform 21:34–39. https://doi.org/10.1016/j.ecoinf.
dolphins in the California Current using whistles, 2013.12.002
8 Detection and Classification Methods for Animal Sounds 315

Rumelhart DE, Hinton GE, Williams RJ (1986) Learning Slobodchikoff CN, Ackers SH, Van Ert M (1998) Geo-
representations by back-propagating errors. Nature graphic variation in alarm calls of Gunnison’s prairie
323(6088):533–536. https://doi.org/10.1038/323533a0 dogs. J Mammal 79(4):1265–1272. https://doi.org/10.
Russo D, Mucedda M, Bello M, Biscardi S, 2307/1383018
Pidinchedda E, Jones G (2007) Divergent echolocation Somervuo P, Härmä A, Fagerlund S (2006) Parametric
sound frequencies in insular rhinolophids (Chiroptera): representations of bird sounds for automatic species
a case of character displacement? J Bioeng 34:2129– recognition. IEEE Trans Audio Speech Lang Process
2138. https://doi.org/10.1111/j.1365-2699.2007. 14:2252–2263. https://doi.org/10.1109/TASL.2006.
01762.x 872624
Sainburg T, Theilman B, Thielk M, Gentner TQ (2019) Sparling DW, Williams JD (1978) Multivariate analysis of
Parallels in the sequential organization of birdsong and avian vocalizations. J Theor Biol 74:83–107. https://
human speech. Nat Commun 10:3636. https://doi.org/ doi.org/10.1016/0022-5193(78)90291-6
10.1038/s41467-019-11605-y Stafford KM, Fox CG, Clark DS (1998) Long-range
Sakoe H, Chiba S (1978) Dynamic programming optimi- acoustic detection and localization of blue whale
zation for spoken word recognition. IEEE Trans sounds in the northeast Pacific Ocean. J Acoust Soc
Acoust Speech Sig Process 26:43–49. https://doi.org/ Am 104(6):3616–3625. https://doi.org/10.1121/1.
10.1109/TASSP.1978.1163055 423944
Schassburger RM (1993) Vocal communication in the Stafford KM, Nieukirk SL, Fox CG (1999)
timber wolf, Canis lupus, Linnaeus: structure, motiva- Low-frequency whale sounds recorded on
tion, and ontogeny. Parey Scientific Publication, hydrophones moored in the eastern tropical Pacific. J
New York Acoust Soc Am 106:3687–3698. https://doi.org/10.
Schon PC, Puppe B, Manteauffel G (2001) Linear predic- 1121/1.428220
tion coding analysis and self-organizing feature map as Stafford KM, Moore SE, Laidre KL, Heide-Jørgensen MP
tools to classify stress sounds of domestic pigs (Sus (2008) Bowhead whale springtime song off West
scrofa). J Acoust Soc Am 110:1425–1431. https://doi. Greenland. J Acoust Soc Am 124(5):3315–3323.
org/10.1121/1.1388003 https://doi.org/10.1121/1.2980443
Sethi SS, Jones NS, Fulcher BD, Picinali L, Clink DJ, Starnberger I, Preininger D, Hödl W (2014) The anuran
Klinck H, Orme CD, Wrege PH, Ewers RM (2020) vocal sac: a tool for multimodal signalling. Anim
Characterizing soundscapes across diverse ecosystems Behav 97:281–288. https://doi.org/10.1016/j.anbehav.
using a universal acoustic feature set. Proc Natl Acad 2014.07.027
Sci 117(29):17049–17055. https://doi.org/10.1073/ Stoeger AS, Heilmann G, Zeppelzauer M, Ganswindt A,
pnas.2004702117 Hensman S, Charlton BD (2012) Visualizing sound
Shannon CE, Weaver W (1998) The mathematical theory emission of elephant vocalizations: evidence for two
of communication. University of Illinois Press, rumble production types. PLoS One 7:1–8. https://doi.
Champaign org/10.1371/journal.pone.0048907
Shiu Y, Palmer KJ, Roch MA, Fleishman E, Liu X, Nosal Stowell D, Wood M, Stylianou Y, Glotin H (2016). Bird
EM, Helble T, Cholewiak D, Gillespie D, Klinck H detection in audio: a survey and a challenge. In: 2016
(2020) Deep neural networks for automated detection IEEE 26th International Workshop on Machine
of marine mammal species. Sci Rep 10(1):1–12. Learning for Signal Processing (MLSP), pp 1–6.
https://doi.org/10.1038/s41598-020-57549-y https://doi.org/10.1109/MLSP.2016.7738875
Sibley DA (2000) The Sibley field guide to birds. Knopf, Sturtivant C, Datta S (1997) Automatic dolphin whistle
New York detection, extraction, encoding, and classification. Proc
Simmons JA, Wever EG, Pylka JM (1971) Periodical Inst Acoust 19:259–266
cicada: sound production and hearing. Science Suzuki R, Buck J, Tyack P (2006) Information entropy
171(3967):212–213. https://doi.org/10.1126/science. of humpback whale songs. J Acoust Soc Am 119:
171.3967.212 1849–1866. https://doi.org/10.1121/1.2161827
Širović A (2016) Variability in the performance of the Swets JA, Dawes RM, Monahan J (2000) Better decisions
spectrogram correlation detector for north-east Pacific through science. Sci Am 283:82–87
blue whale calls. Bioacoustics 25(2):145–160. https:// Takahashi N, Kashino M, Hironaka N (2010) Structure of
doi.org/10.1080/09524622.2015.1124248 rat ultrasonic vocalizations and its relevance to behav-
Širović A, Cutter GR, Butler JL, Demer DA (2009) ior. PLoS One 5(11):e14115. https://doi.org/10.1371/
Rockfish sounds and their potential use for population journal.pone.0014115
monitoring in the Southern California Bight. ICES J Tan M, McDonald K (2017) Bird sounds | Experiments
Mar Sci 66:981–990. https://doi.org/10.1093/icesjms/ with Google [online]. https://experiments.withgoogle.
fsp064 com/bird-sounds
Sjare B, Stirling I, Spencer C (2003) Seasonal and longer- Tchernichovski O, Nottebohm F, Ho CE, Pesaran B, Mitra
term variability in the songs of Atlantic walruses PP (2000) A procedure for an automated measurement
breeding in the Canadian High Arctic. Aquat Mamm of song similarity. Anim Behav 59:1167–1176. https://
29(2):297–318 doi.org/10.1006/anbe.1999.1416
316 J. N. Oswald et al.

Tenenbaum JB, De Silva V, Langford JC (2000) A global conjunction with a digital tag (DTag) recording. Can
geometric framework for nonlinear dimensionality Acoust 36:60–66
reduction. Science 290(5500):2319–2323. https://doi. Ward R, Parnum I, Erbe C, Salgado-Kent CP (2016)
org/10.1126/science.290.5500.2319 Whistle characteristics of Indo-Pacific bottlenose
Thomas JA, Golladay CL (1995) Analysis of underwater dolphins (Tursiops aduncus) in the Fremantle Inner
vocalizations of leopard seals (Hydrurga leptonyx). In: Harbour, Western Australia. Acoust Aust 44(1):
Kastelein RA, Thomas JA, Nachtigall PE (eds) Sen- 159–169. https://doi.org/10.1007/s40857-015-0041-4
sory systems of aquatic mammals. De Spil Publishers, Ward R, Gavrilov AN, McCauley RD (2017) “Spot” call:
Amsterdam, pp 201–221 A common sound from an unidentified great whale in
Thomas M, Martin B, Kowarski K, Gaudet B, Matwin S Australian temperate waters. J Acoust Soc Am 142(2):
(2019) Marine mammal species classification using EL231–EL236. https://doi.org/10.1121/1.4998608
convolutional neural networks and a novel acoustic Weisburn BA, Mitchell SG, Clark CW, Parks TW (1993)
representation. In: Joint European Conference on Isolating biological acoustic transient signals. Proc
Machine Learning and Knowledge Discovery in IEEE Int Conf Acoust Speech Sig Process 1:269–
Databases, pp 290–305 272. https://doi.org/10.1109/ICASSP.1993.319107
Torrey L, Shavlik J (2010) Transfer learning. In: Hand- Wellard R, Erbe C, Fouda L, Blewitt M (2015)
book of research on machine learning applications and Vocalisations of killer whales (Orcinus orca) in the
trends: algorithms, methods, and techniques. IGI Bremer Canyon, Western Australia. PLoS One 10(9):
Global, New York, pp 242–264 e0136535. https://doi.org/10.1371/journal.pone.
Trawicki MB, Johnson MT, Osiejuk TS (2005) Automatic 0136535
song-type classification and speaker identification of Wells KD (2007) The ecology and behaviour of
Norwegian ortolan bunting. IEEE Int Conf Mach Learn amphibians. University of Chicago Press, Chicago, IL
Sig Process (MLSP) 2005:277–282. https://doi.org/10. Wich SA, Schel AM, De Vries H (2008) Geographic
1109/MLSP.2005.1532913 variation in Thomas langur (Presbytis thomasi) loud
Trifa VM, Kirschel ANG, Taylor CE (2008) Automated sounds. Am J Primatol 70:566–574. https://doi.org/10.
species recognition of antbirds in a Mexican rainforest 1002/ajp.20527
using hidden Markov Models. J Acoust Soc Am 123: Winn HE, Winn LK (1978) The song of the humpback
2424–2431. https://doi.org/10.1121/1.2839017 whale Megaptera novaeangliae in the West Indies.
Valente D, Wang H, Andrews P, Mitra PP, Saar S, Mar Biol 47:97–114. https://doi.org/10.1007/
Tchernichovski O, Golani I, Benjamini Y (2007) BF00395631
Characterizing animal behavior through audio and Wood JD, McCowan B, Langbauer WR, Viljoen JJ,
video signal processing. IEEE Multimedia 14:32–41. Hart LA (2005) Classification of African elephant
https://doi.org/10.1109/MMUL.2007.71 Loxodonta africana rumbles using acoustic
Van Allen E, Menon MM, Dicaprio N (1990) A modular parameters and cluster analysis. Bioacoustics 15:
architecture for object recognition using neural 143–161. https://doi.org/10.1080/09524622.2005.
networks. In: Proceedings of International Neural 9753544
Networks Conference, Paris, vol 1, pp 35–379, Yamamoto O, Moore B, Brand L (2001) Variation in the
13 July 1990. Kluwer Academic Publishers, Dordrecht bark sound of the red squirrel (Tamiasciurus
Vapnik VN (1998) Statistical learning theory. Wiley, hudsonicus). West N Am Nat 61:395–402
New York Yang X-J, Lei F-M, Wang G, Jesse AJ (2007) Syllable
Venter PJ, Hanekom JJ (2010) Automatic detection of sharing and inter-individual syllable variation in
African elephant (Loxodonta africana) infrasonic Anna’s hummingbird Calypte anna songs, in San
vocalizations from recordings. Biosyst Eng 106:286– Francisco, California. Folia Zool 56:307–318
294. https://doi.org/10.1016/j.biosystemseng.2010.04. Yoshino H, Armstrong KN, Izawa M, Yokoyama J,
001 Kawata M (2008) Genetic and acoustic population
Von Muggenthaler E, Reinhart P, Lympany B, Craft RB structuring in the Okinawa least horseshoe bat: are
(2003) Songlike vocalizations from the Sumatran rhi- intercolony acoustic differences maintained by vertical
noceros (Dicerorhinus sumatrensis). Acoust Res Lett maternal transmission? Mol Ecol 17:4978–4991.
4(3):83–88. https://doi.org/10.1121/1.1588271 https://doi.org/10.1111/j.1365-294X.2008.03975.x
Waibel A, Hanazawa T, Hinton G, Shikano K, Lang KL Zar JH (2009) Biostatistical analysis, 5th edn. Pearson,
(1989) Phoneme recognition using time-delay neural New York, p 960
networks. IEEE Trans Acoust Speech Signal Proc 37: Zeppelzauer M, Hensman S, Stoeger AS (2015) Towards
328–339. https://doi.org/10.1109/29.21701 an automated acoustic detection system for free-
Ward J, Morrissey R, Moretti D, DiMarzio N, Jarvis S, ranging elephants. Bioacoustics 24:13–29. https://doi.
Johnson M, Tyack PL, White C (2008) Passive acous- org/10.1080/09524622.2014.906321
tic detection and localization of Mesoplodon Zhang YJ, Huang JF, Gong N, Ling ZH, Hu Y (2018)
densirostris (Blainville’s beaked whale) vocalizations Automatic detection and classification of marmoset
using distributed bottom-mounted hydrophones in vocalizations using deep and recurrent neural
8 Detection and Classification Methods for Animal Sounds 317

networks. J Acoust Soc Am 144(1):478–487. https:// labeling. Appl Acoust 166:107375. https://doi.org/10.
doi.org/10.1121/1.5047743 1016/j.apacoust.2020.107375
Zhong M, LeBien J, Campos-Cerqueira M, Dodhia R, Zuberbuhler K, Jenny D, Bshary R (1999) The predator
Ferres JL, Velev JP, Aide TM (2020) Multispecies deterrence function of primate alarm calls. Ethology
bioacoustic classification using transfer learning of 105:477–490. https://doi.org/10.1046/j.1439-0310.
deep convolutional neural networks with pseudo- 1999.00396.x

9 Fundamental Data Analysis Tools and Concepts for Bioacoustical Research

Chandra Salgado Kent, Tiago A. Marques, and Danielle Harris
C. Salgado Kent (*)
Centre for Marine Science and Technology, Curtin University, Perth, WA, Australia
Oceans Blueprint, Perth, WA, Australia
Centre for Marine Ecosystems Research, School of Science, Edith Cowan University, Perth, WA, Australia
e-mail: c.salgadokent@ecu.edu.au

T. A. Marques
Centre for Research into Ecological and Environmental Modelling, University of St Andrews, St Andrews, Fife, UK
Departamento de Biologia Animal, Centro de Estatística e Aplicações, Faculdade de Ciências da Universidade de Lisboa, Lisbon, Portugal

D. Harris
Centre for Research into Ecological and Environmental Modelling, University of St Andrews, St Andrews, Fife, UK

9.1 Introduction

Bioacoustics has emerged as a prominent, non-invasive, and innovative approach to obtaining scientific knowledge about animal behavior and ecology. As a consequence, bioacousticians play an important role in today's societies, often informing decision-makers in governments, industries, and communities. As an example, bioacousticians are often asked whether a species, a population, a community, or individual animals will sustain impacts from noise—or any other impact, of course, but noise is particularly relevant to the running theme of the book—generated from particular human activities. Sometimes, government regulators require “yes” or “no” answers to these questions. A knowledgeable bioacoustician, any scientist in fact, will know that usually it is difficult to provide simple ‘yes’ or ‘no’ answers. This is because the magnitude of impact that is biologically significant is usually not known. For instance, imagine the question relates to whether loud construction works will result in a decline of a local population of animals. The observed impact is that animals reduce the time spent feeding. Therefore, the required reduction in time feeding that will lead to a population decline must be known to be able to provide a “yes” or “no” answer. Consequently, the bioacoustician's question is not whether there is simply a statistically significant effect, which by itself may be meaningless and even misleading (e.g., Wasserstein et al. 2019), but whether the magnitude of the effect is biologically important. That is a much more difficult question to answer, and hence why it is often ignored albeit inadvertently. By ensuring that research questions have biological relevance, bioacousticians can design studies that can draw meaningful conclusions about animals and their populations.

Once the biologically relevant question has been identified, the bioacoustician can determine what study design is required and whether it is possible to carry it out. All too commonly, constraints occur in available budgets and time allocated to undertake the research. This often results in sub-optimal study designs and sample sizes (e.g., reduced numbers of surveys, available
acoustic instruments, and/or surveyed animals). The reality is that for a bioacoustician to be able to confidently answer research questions, budgets must allow for robust experimental designs and sufficient time to collect sample sizes representative of the study population. Even when budgets and time allow for carefully designed experiments, however, environmental conditions and study animals often cannot be controlled, particularly when studied in their natural environment. Moreover, many studies occur opportunistically and are not the result of an experimental design developed specifically for the study aims. They are observational in nature and can take advantage of large, long-term existing datasets or unexpected opportunities to collect field data. In fact, data collected opportunistically are prevalent in bioacoustical studies, as many researchers take recording systems into the field during other work to use when time permits.

The challenges described above, from ensuring that the research questions have biological relevance, to evaluating the achievability of a study and reliability of its outcomes, are only a few of many challenges faced by bioacousticians. To overcome these challenges, bioacousticians must have solid foundational knowledge about the quantitative aspects of their research: from how to formulate quantitative research questions, to designing robust studies and undertaking suitable analyses. Only by having these skills can reliable conclusions and scientific claims be made.

Today, not only are there a wide range of analytical tools available to select from, but this ever-increasing number has been evolving quickly over recent decades due to the dramatic improvement in computer capacity. Moreover, ongoing research in statistics continually updates our knowledge on the suitability of commonly used methods (Wilcox 2010). In some instances, methods previously used over a wide range of applications may now only be acceptably applied to certain scenarios, with new methods superseding old ones. Having said this, while a new method may be considered the ‘Rolls Royce’ of analyses, sometimes an older, simpler approach may still do the job well. Consequently, not only is it important for researchers to have a solid foundation in long-established analytical approaches, but they must keep up to date with new developments. In general, a researcher should understand the fundamentals involving randomness, variability, and statistical modeling discussed in this chapter, and be able to adapt them to their specific context—this understanding is arguably more valuable than a book of recipes that tells a researcher which method to use and when.

A consequence of the many advancements over recent years and the large range of analytical approaches available today is that selecting the right tool can be an overwhelming task. In fact, the right tool might not exist for a specific setting. In such cases, collaboration with an applied statistician may be fundamental. This chapter aims to give general guidance on considerations that bioacousticians should make when tasked with undertaking research resulting in what are often complex and messy bioacoustical datasets. The information presented in this chapter is by no means meant to provide a menu of analytical tools, their mathematical basis, or conditions of use. There are a large number of widely available textbooks that do just that, and many are referenced here. Bioacousticians should consult the relevant textbooks for in-depth knowledge of approaches, their applications, limitations, and assumptions about the characteristics of the data that must be met. Rather, the focus of this chapter is to provide practical guidance on: (1) the development of meaningful research questions, (2) data exploration and experimental design considerations (also see Chap. 3), and (3) common analytical approaches used today. The approach taken in this chapter is to define basic terms and concepts as they appear in the text, so that readers new to the subject can also understand the more complex concepts discussed, regardless of their prior statistical knowledge.

Note that this chapter has been written from the perspective of a biologist faced with the challenges common to bioacoustical research. If, from this chapter, the reader gains an appreciation of limitations in their data, considerations they
should make when selecting analytical approaches, and the biological relevance of their analytical outputs, then this chapter has achieved its purpose. Entire books could be written about how a bioacoustician, in fact, any ecologist, might become more quantitative. A good example of such a book is suitably named How to be a quantitative ecologist (Matthiopoulos 2010), which we wholeheartedly recommend as good reading after this chapter.

9.2 Developing a Clear Research Question

At the concept stage of any study, the purpose and specific research aim must be clearly defined. The research aim should be novel (i.e., not already answered in previous research). Once the general aim has been defined, the specific analytical research question can be developed. While developing the question may seem to be a simple, self-evident task, it requires careful consideration. The structure of the question drives the experimental design and selection of analytical tools, thus its accurate development is essential. To frame a question in clear, concise analytical terms, it is useful to identify the type of study involved. There are many types of studies conducted for a wide range of purposes. Depending upon the discipline, groupings that describe types of studies and their definitions vary. Here, we have adopted five of the six groupings referred to by Leek and Peng (2015) as common in bioacoustics. These study types include descriptive, exploratory, inferential, explanatory (called ‘causal’ in Leek and Peng 2015), and predictive studies. Definitions we give here have been framed within the context of common bioacoustical questions, and thus are adapted from more broad definitions.

Of the study types, descriptive studies are the simplest, aiming to summarize datasets collected. Exploratory studies take a step beyond and explore relationships, trends, and patterns in datasets. Neither of these types of studies attempts to infer beyond the dataset collected to the wider population. These types of studies are commonly used during preliminary data exploration before undertaking inferential, explanatory, or predictive studies (see Sect. 9.3.3). Indeed, descriptive and exploratory surveys are often used to develop the more complex inferential, explanatory, and predictive study type questions. Inferential studies build on descriptive and exploratory studies by quantifying whether findings are likely to be true for a broader population and hence can be generalized. For example, inferential studies are commonly used to make decisions about whether there is sufficient evidence regarding observed patterns or relationships in sample data to believe that they have not arisen from the population by pure chance alone. Explanatory studies aim to identify associated conditions (e.g., species, age, sex of an animal, date, time of day, season, and environmental factors such as temperature, noise, etc.) influencing or explaining an outcome (e.g., the rate at which animals produce their calls). These studies seek to determine the magnitude and direction of relationships (Leek and Peng 2015). Predictive studies aim to predict future outcomes in given conditions or scenarios (but may not necessarily explain conditions leading to an observed outcome). By identifying which of the study types your research aim falls into, the general structure of the analytical question can be formed. Some examples of the different study types and corresponding analytical questions are given in Table 9.1.

9.3 Designing the Study and Collecting Data

Once the analytical question has been formulated based on the study type, novelty, and whether it truly addresses the research question, the feasibility of collecting the required data will need to be assessed. Practical considerations, for instance, include identifying any hindrances to study site accessibility or timely ethics approvals and animal experimentation permits. Below (Fig. 9.1) is a checklist of some preliminary considerations before committing to developing, designing, and executing a study.
Table 9.1 Examples of study types and their corresponding objectives and questions

Descriptive
Purpose: Studies conducted to describe phenomena and conditions measured during a study.
Example objective: Describe the characteristics of sound produced by sea turtle hatchlings recorded during a study.
Example questions:
• What is the frequency range of sounds produced?
• What are the source levels of sounds produced?
• What is the rate of sound production by sea turtle hatchlings?

Exploratory
Purpose: Studies exploring relationships, trends, and patterns in datasets (not in a broader population).
Example objective: Establish how observed hatchling sea turtles' sound production varied during a survey.
Example questions:
• How does observed hatchling sea turtles' sound production vary during a given survey?

Inferential
Purpose: Studies aiming to estimate population parameters or test hypotheses about a broader population.
Example objective: Determine the average expected sound production rate of a population of hatchling sea turtles.
Example questions:
• What is the average expected sound production rate of a population of hatchling sea turtles?

Explanatory
Purpose: Studies that aim to understand the underlying cause(s) of a behavior, state, or phenomenon.
Example objective: Identify what influences sound production in sea turtle hatchlings.
Example questions:
• Are communications influenced by the presence of other sea turtles, environmental conditions, or human/predator threats?

Predictive
Purpose: Studies that aim to predict an outcome (such as animal behaviors) in response to a stimulus or condition.
Example objective: Predict hatchling sea turtle sound production rate when threatened by humans.
Example questions:
• What will be the expected sound production rate of hatchling sea turtles when exposed to human threats?
• Has the question already been answered in past research?
• Does the analytical question address the research aims?
• Will there be any logistical or ethical constraints that will affect the execution of the study?

Fig. 9.1 Checklist of some considerations to be made before committing to a study
9.3.1 Experimental Design

The ideal situation is to formulate the analytical question before data are collected (i.e., a priori) so that experiments can be designed to maximize the chance that, based on the observations, they produce precise (i.e., close to one another) and accurate (i.e., proximal to true values) estimates of the parameters of interest, and so that there is a high probability of detecting relevant effects (i.e., that there is sufficient statistical power) when they are present. In some cases, however, formulation of the analytical questions occurs after data have been collected (i.e., a posteriori). This may occur as a result of poor planning or of new and unforeseen research opportunities. A scenario in which this often occurs is when data already collected for another primary study are used to answer a new research question. In these cases, the methods and experiment are not necessarily designed according to the analytical requirements of the new research question. Bioacoustical studies using pre-existing opportunistic data often do so because collecting new data can be prohibitively expensive (e.g., if the field site is remote or if specialized equipment is required). Since the methods and experimental design may be sub-optimal for the current study questions, the data must be meticulously evaluated to check that newly formulated analytical questions can indeed be answered. Studies attempting to answer specific research questions using sub-optimal or poor-quality data cannot always be salvaged, even with sophisticated analyses.
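A planned design can be stress-tested against the statistical power requirement mentioned above before any data are collected, simply by simulating the experiment. The sketch below is illustrative only: the call rates, the assumed reduction, the normal error, and the two-sample t-test are hypothetical stand-ins for whatever effect size and analysis a real study would specify.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical scenario: 10 calls/h under ambient conditions, 8 calls/h during
# construction (a 20% reduction), standard deviation of 4 calls/h in both
# groups. All values are invented for illustration.
mean_control, mean_impact, sd = 10.0, 8.0, 4.0
alpha = 0.05          # significance level
n_simulations = 2000  # Monte Carlo replicates per sample size

for n in (10, 20, 40, 80):            # animals (or sites) sampled per group
    rejections = 0
    for _ in range(n_simulations):
        control = rng.normal(mean_control, sd, n)
        impact = rng.normal(mean_impact, sd, n)
        _, p = stats.ttest_ind(control, impact)
        rejections += p < alpha
    print(f"n = {n:3d} per group -> estimated power = {rejections / n_simulations:.2f}")
```

Running such a simulation before fieldwork indicates roughly how many sampling units are needed before the planned analysis has a reasonable chance of detecting an effect of the assumed size.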
The prominent twentieth century biostatistician, Sir Ronald Fisher, illustrated this problem with the following quote: “To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of” (Fisher 1959). This message cannot be overstated. It is critical, wherever possible, to consider the question carefully a priori, so that the study is able to answer the question (Cochran 1977). If you think you might need to consult with a statistician, do so before collecting the data.

For analyses to answer ecological research questions, the experimental design must yield sufficient information about the question of interest. Often, ecological questions involve sets of sampling units taken from a larger group (i.e., the statistical population, hereafter referred to as a population unless otherwise stated). For a given study species, or set of species, sampling units could be defined as individuals, groups, cohorts, communities, or local populations of the species of interest—it depends on the research question. Usually, due to logistical and time constraints, it is not possible nor desirable to make measurements over all objects or the whole population. In these cases, a sample is taken and data collected from the sample are considered to be representative of the population. It is key that the process used to draw the sample is well understood and is ideally random in design. The process of drawing conclusions regarding a population based on a sample from it is called statistical inference.

To make meaningful inferences about the properties of a population, the sampling protocol must yield a sample size that is sufficiently large to represent the population. In addition, the sampling protocol should either eliminate or control significant sources of error including random and systematic error (Cochran 1977; Panzeri et al. 2008). Random error is caused by unknown and unpredictable changes, such as in the environment, in instruments taking measurements, or as a result of the inability of an observer to take the exact same measurement in the same way. Statistical methods typically quantify this error and, in fact, build on it to draw inferences. In some sense, if there was no error then there would be no need for statistics. Of course, the performance of the analytical methods is affected by the amount of error in the data, in that the statistical power to detect significant effects decreases with increasing error, but if there was no error, by definition there would be no questions left to answer and statistics would have no role to play. Systematic error (i.e., bias) is consistent error that is repeatable if the data are recorded again. It can arise from many causes, such as a person consistently making the same erroneous observation (i.e., biased observation; e.g., incorrectly recording male birds as female birds) or an incorrectly calibrated instrument. In behavioral studies, biases in collected data can also be introduced by the presence of the researchers themselves (e.g., through human disturbance in a study on supposedly undisturbed animal vocal behavior). The introduction of bias can be further illustrated in the example of a bioacoustician estimating acoustic cue production rate (i.e., number of cues, such as calls, produced per unit time) for a population. In this example, the researcher obtains samples of animals by locating the animals producing acoustic cues. It is highly likely, however, that the sample collected will be only from animals that are in a sound-producing state (as silent animals will go undetected), hence acoustic cue rate might be inadvertently overestimated. Furthermore, animals may respond to the presence of the researcher by altering their cue production rates, thereby introducing further error to cue rate estimation. Such studies should be designed to remove or control biases. If controls cannot be integrated into the experimental design, then these may be able to be applied at the analytical stage (statistical controls; see Dytham 2011) and estimation of, and adjustments for, unavoidable biases may be made during the analysis. For topics on experimental design (e.g., systematic, stratified-random, and random-block) that aim to reduce biases and increase inferential power, the reader is referred to textbooks such as Lawson (2014), Manly and Alberto (2014), Cohen (2013), Underwood (1997), and Cochran (1977), among many others.
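The cue-counting bias described above is easy to demonstrate with a small simulation. The following sketch assumes an invented population in which each animal has some probability of being vocally active when surveyed; averaging cue rates over only the acoustically detected (i.e., calling) animals then overestimates the population-level rate. All parameter values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population: each animal spends only part of its time vocally
# active; silent animals produce no cues and cannot be found acoustically.
n_animals = 10_000
p_active = 0.3                                                    # probability of calling when surveyed
rate_when_active = rng.gamma(shape=5, scale=2, size=n_animals)    # calls/h while active

active = rng.random(n_animals) < p_active
observed_rate = np.where(active, rate_when_active, 0.0)           # calls/h at survey time (0 if silent)

true_mean = observed_rate.mean()              # mean over the whole (snapshot) population
biased_mean = observed_rate[active].mean()    # mean over acoustically detected animals only

print(f"Population cue rate:             {true_mean:5.2f} calls/h")
print(f"Estimate from detected animals:  {biased_mean:5.2f} calls/h (overestimate)")
```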
 Does the scope of the experimental design match those of the quesons?

 Is the sample size large enough given the effect size (see Secon 9.5.1.2 for discussion on
effect size) being invesgated?

 Are the resources (e.g., me, money, and trained personnel) available for the project
sufficient to carry out the study?

 Will data be reliable (i.e., accurate and precise) enough to answer the quesons?

 Will causes of biases in data collected be able to be idenfied and removed or addressed
adequately?

Fig. 9.2 Checklist of some considerations to determine whether a research question can be answered

It is critical that researchers carefully consider and identify the most suitable sampling design for their research questions.

Despite all attempts to obtain reasonable sample sizes, minimize biases, and carefully select an appropriate experimental design, data quality is frequently sub-optimal due to logistical or practical constraints. Often unexpected restrictive weather conditions and/or failure of instruments limit data collection during fieldwork. Good planning can mitigate unexpected data limitations, thus wherever possible, there should be contingency plans in place to deal with the unexpected (e.g., budgeting for a reasonable number of poor-weather days or redundancy in instrumentation). Even with careful design and contingencies implemented, data limitations can still occur and may need to be dealt with at the analysis stage. However, as noted before, sophisticated analyses to deal with these are always a second-best option over implementing data collection methods and survey design that are robust to potential limitations. Figure 9.2 gives a list of some considerations to be made for assessing whether research questions can be answered before data are collected.

9.3.2 Instruments and Measurements

Instruments must be able to measure subject behavior and conditions of interest in the study such that estimates derived from the observations have sufficient accuracy and precision to detect the effect(s) of interest. The accuracy of an estimate is its proximity to the true value, while precision refers to the variability of successive estimates of the same quantity. Naturally, to be able to derive accurate and precise estimates, measurements must also be accurate and precise. Accuracy and precision of measurements are evaluated through calibration and testing of the instruments. Some instruments may simply not have the capacity or range required for the study. For example, a low-frequency acoustic recorder will not have the capacity to measure the acoustic behavior of bats, which produce high-frequency echolocation signals. While careful consideration must be made in selecting instrumentation, considerable advances in their capacities have been made over recent decades. Instrumentation in bioacoustical studies is discussed in detail in Chap. 2. Below is a checklist for evaluating whether the selected instrumentation will collect the required data for a project (Fig. 9.3).
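For instrument checks, the accuracy and precision definitions given above can be expressed as two simple numbers computed from repeated measurements of a calibration signal of known level. The example below is a sketch with invented readings from two hypothetical recorders: one is accurate and precise, the other precise but biased.

```python
import numpy as np

# Hypothetical repeated measurements (in dB) of a calibration tone whose
# true received level is known. All values are invented for illustration.
true_level = 94.0
recorder_a = np.array([93.9, 94.1, 94.0, 93.8, 94.2, 94.1])   # close to truth, little scatter
recorder_b = np.array([96.8, 97.4, 96.5, 97.1, 97.6, 96.9])   # consistent, but offset (biased)

for name, x in (("Recorder A", recorder_a), ("Recorder B", recorder_b)):
    bias = x.mean() - true_level   # accuracy: proximity of the mean reading to the true value
    spread = x.std(ddof=1)         # precision: variability of the repeated readings
    print(f"{name}: mean = {x.mean():.2f} dB, bias = {bias:+.2f} dB, sd = {spread:.2f} dB")
```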
• Do the instruments have the sensitivity (i.e., sufficiently low noise floor and thus sufficiently low amplitude that can be recorded), dynamic range (i.e., range of amplitudes that can be recorded), frequency range (for sound recorders), and field robustness required for the study?
• Do the instruments obtain sufficiently accurate and precise measures?
• Is there a quality-control process to ensure that instrument accuracy and precision can be measured over time (e.g., systematic calibration and testing)?
• Are the instruments reliable in that they will not result in significant sets of missing or biased data?

Fig. 9.3 Checklist of example considerations for selecting instrumentation for a bioacoustical study

9.3.3 Preliminary Data Exploration

Data quality resulting from the experimental design, selected instrumentation, and measurements must be checked through data exploration and visualization (e.g., graphics, spectrograms) before embarking on planned analyses. It can be said that it is never early enough to explore data, nor can there be too many graphs involved in doing so. In fact, a preliminary exploration of data should always be conducted at the beginning of data collection to allow the structure of the data to be investigated, including the presence of anomalous data points, missing values, and potential biases. By identifying these early in the study, unforeseen design, sampling, or instrumentation issues can be rectified. Preliminary exploration of data, after data collection has been completed, will allow for any remaining anomalies and biases to be identified and planned analyses refined. Suspicious observations can be introduced at different stages of the research, for instance through: (1) data entry error, (2) changes in the measurement methods, (3) experimental error, or (4) some unexpected, but real variation. For the first three cases, the anomalous value(s) might be removed before analysis. In the last case, there could be some biologically important reason for the observed unexpected values. Sometimes the word “outlier” is used to refer to these suspicious observations, but we prefer to avoid the term. An outlier implies something that was unexpected, but only after defining what would be expected can we decide what the word “outlier” means. Often “outliers” are very informative and can even lead to new research questions. Consequently, it is important to understand how anomalies have occurred and to ascertain whether they should be removed or not. A good and honest approach, with little added cost, is to present and discuss the results of an analysis with and without those observations. This approach provides useful information about the practical consequences of the presence of anomalous observations.

If sufficiently large gaps in information from missing values occur, the data may not be representative of the larger population, especially since it might be hard to determine after the survey whether the data were missing at random. Similarly, if measurements were collected under certain conditions (e.g., poor weather or noise), the data cannot typically be used to make inferences outside this range of conditions (which would be referred to as extrapolation). Finally, data of very poor quality may not be salvageable, and—as mentioned before—it is far preferable to get the data right in the first place than to trust analytical solutions to deal with problems introduced at the data collection stage. Data exploration and visualization are further discussed in Sects. 9.4 and 9.5.
that need to be understood before embarking on
anomalies have occurred and to ascertain whether
analyses.
they should be removed or not. A good and
9.4 Data Types and Statistical Concepts

Regardless of the analytical approaches used, there are some fundamental terms and concepts that need to be understood before embarking on analyses.

9.4.1 Variable Types and Their Distributions

Measures of observations or conditions of interest in a study can be called variables. For instance, variables can be measurable properties of animals, their behaviors, or their environment. In a study of the acoustic characteristics of elephant vocalizations recorded at different ranges from the animal, relevant variables might include the range between the microphone and the elephant, the subject (i.e., which animal it is), the sound type, the received sound level, the spectral characteristics of the sound at the receiver locations, and the acoustic characteristics of the environment between the elephant and the receiver. In general, a researcher will have a good idea about the plausible values for the variables of interest, and hence what range of values to expect, but not know the exact values before the observations are made. Variables of known expected range but whose exact values are unknown until observed are random variables by definition. The notion of “outlier” is related to this expectation, as “unexpected” values might be considered suspicious. Within a regression context (see Sect. 9.4.3 for more detail), the variables that represent the outcome of interest are called dependent variables or response variables. When they represent the conditions that influence the outcome, they are called independent variables or explanatory variables, sometimes known as predictors or covariates. Hereafter we use all terms to discuss variables, choosing each time the definition we feel will help to make the meaning of a concept most intuitive.

Variables can be of two types: (1) categorical, which can be further subdivided into nominal or ordinal (if there is an order), and (2) numerical, which could be discrete or continuous. Categorical variables are often called factors and are qualitative. For example, if the variable was a sound type produced by a bird categorized as either song or chirp, then sound type would be a nominal factor with two levels, also called a binary variable. If the bird species was known to produce three different sound types, then the corresponding factor would have three levels. Numerical variables are quantitative, and can be discrete (e.g., integers such as counts) or continuous (where, by definition, an infinite number of values are possible between any two values). Examples of continuous variables are the height and weight of an individual or pressure and temperature, while the number of sounds or the number of individuals are examples of discrete variables. A summary of variable classification and metrics is given in Table 9.2.

Properties of these variables, such as central tendency measures like the mean, mode, and median, or measures of spread like variance and standard deviation, are statistics that can be used to describe a sample of values. When these refer to the values that these quantities have in the population (as distinct from a sample of that population), these properties are called parameters.

Often, additional variables are collected that are not necessarily of interest in explaining a research question but could influence the response variables. For example, while a bioacoustician might be interested in measuring the rate of vocalization of chicks as a function of the parents' presence, the frequency of predator visitation could also influence vocalization rates. In this example, collecting information on the main independent variable (parent presence) and the variable not of direct interest (predator presence) would be considered important to capture all variables influencing vocalization rate. Some of these variables might be of direct interest, but some might just be included in a study because they can affect the response, and if ignored, would confound the results. For this reason, they might sometimes be referred to as confounding factors or confounding effects. Note that these terms and their definitions vary with discipline (e.g., there is some discussion about the exact definition of a covariate; see Salkind 2010) and analytical software, and sometimes are used interchangeably. Therefore, the reader should make sure that, when reading a source or when reporting their own results, the context provides the required clarity for the wording chosen.
Not only are variables described according to the properties they measure and whether they are independent or dependent variables, but in the context of some analytical methods (e.g., linear regression models and their extensions) they are also described by whether they represent a specific or random set of values. Generally, in statistics, a variable with a value that is not known before it is observed (e.g., peak frequency of a call or number of animals in a group), but of which the range of possible values is known (e.g., a positive continuous number like the amplitude of a lion's roar), is known as a random variable, as described above. Its range of possible values is referred to as the domain of the random variable.

Table 9.2 Variable classification and metrics
- Categorical, nominal. Description: non-ordered categories. Example: sound type (e.g., downsweep, upsweep, constant tone).
- Categorical, ordinal. Description: ordered categories. Example: vocal activity on a scale ranging from not vocally active (0) to highly vocally active (5).
- Numerical, discrete. Description: variables in which the data can take on only certain values (i.e., values that have non-infinitesimal gaps between them containing no values). Example: acoustic cue counts.
- Numerical, continuous. Description: variables in which the data can take on real values and all infinitesimal real values between them. Example: received sound exposure level (in dB).

A random variable can be characterized by its probability distribution, which describes the probability of observing values in a given range of the domain of the variable. An infinite number of distributions exist, but some, given their useful properties, are widely used. These distributions are given names so that we can easily refer to them. Arguably, the most widely used are the Gaussian distribution (perhaps more often known as the normal distribution, but since there is nothing normal about it and it induces practitioners to think there might be, we avoid the term here), gamma distribution, and beta distribution, used to model continuous data; while the Poisson distribution, negative binomial distribution, and binomial distribution are useful when modeling discrete values. The uniform distribution is one in which all values in the domain are equally likely and can be either continuous or discrete. These distributions are typically defined by their parameters. As an example, the normal distribution is defined by the mean and the standard deviation, and for the case of the Poisson, it is defined by the mean only. Given the parameter values that define a random variable, all the characteristics of the random variable are unambiguously defined.

Values of a discrete variable are characterized by a probability mass function (pmf). A pmf is a function that gives the probability that a single realization of the variable takes on a specific discrete value. The number of vocalizing individuals detected in an area might be approximated by a Poisson random variable, characterized by its mean (such as 3.7 individuals). The Poisson distribution is special in that its variance is equal to its mean, a restriction that means that often it does not fit biological data well, where larger variance than the mean is the norm.

In contrast, continuous variables can be characterized by a probability density function (pdf). In the instance of a variable such as the change in duration of song, the pdf might be represented by a Gaussian distribution, a bell-shaped curve characterized by its mean and standard deviation. For example, the variable "change in song duration" could have a true mean change in duration of 240 s and a true standard deviation of 12 s. These true values are generally unobserved, but we would like to estimate them. A single measurement of change in song duration by a researcher could produce a value of 228 or 271 s. These single values are referred to as realizations of the random variable. Pdf functions provide information about how the values are distributed before they are observed.
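To make these ideas concrete, the short Python sketch below (not part of the original chapter; the numbers are the hypothetical ones used in the text) evaluates a Poisson pmf and a Gaussian pdf and draws a few simulated realizations of each using SciPy.

    # Sketch only: pmf/pdf evaluation and simulated realizations with SciPy
    from scipy import stats

    # Discrete variable: number of vocalizing individuals, Poisson with mean 3.7
    counts = stats.poisson(mu=3.7)
    print(counts.pmf(2))         # probability of observing exactly 2 individuals
    print(counts.rvs(size=5))    # five simulated realizations

    # Continuous variable: change in song duration, Gaussian (mean 240 s, sd 12 s)
    duration = stats.norm(loc=240, scale=12)
    print(duration.pdf(228))     # density at a single observed value of 228 s
    print(duration.rvs(size=5))  # five simulated realizations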
Further examples of distributions are given in Fig. 9.4. The reader is referred to Quinn and Keough (2002) for a good introduction to useful probability distributions in biostatistics.

Fig. 9.4 Examples of samples taken from different distributions. The Gaussian, gamma (defined by its shape parameter k and scale parameter θ) and beta (defined by shape parameters α and β) are continuous distributions, represented with histograms. The Poisson (defined by its mean) and binomial (defined by n independent experiments and outcome success probability p), represented with barplots, are discrete distributions. Note some distributions can be special cases of others. As an example, the beta distribution with shape parameters α = 1, β = 1 is shown, illustrating the fact that it is equivalent to a uniform distribution.

9.4.2 Estimators and Their Variance

In this section, we introduce estimators and related concepts because we will need them later, but we note that we do so very briefly, just so that the terms do not come as a surprise. The reader is referred to Casella and Berger (2002) for further details on statistical inference, estimators and their variance.

As discussed previously, a parameter is a quantity relating to the population of interest. When performing statistical inference, we want to estimate the parameters in the population (e.g., the mean cue production for a species of whale) using samples (e.g., a sample of acoustic tags put on whales).
To estimate parameters, we use estimators. An estimator is a formula that we can use to compute a parameter based on a sample. In the case of estimating the population mean, the estimator is, not surprisingly, the well-known formula for the sample mean. Estimators are therefore based on random variables, in the sense that each time we collect a sample we would get a new observed value (i.e., a new estimate). Thus, an estimator can also be thought of as a sample statistic that estimates the population parameter such as the mean. If we collected infinite samples and computed the estimator each time, we would get the estimator sampling distribution, from which we could evaluate the bias and the variance of an estimator. However, collecting infinite samples is not possible, but by understanding the properties of the estimator and the design used to collect the data, we can also quantify the variability associated with an estimator, based on a single sample. Variability is a key attribute of an estimator, and the resulting estimate from the single sample (known as the point estimate) is not enough to provide a full representation of it. For example, it is very different to say that we estimate a cue production rate to be 7.2 sounds per hour, than to provide the additional information that it could vary from 7.1 to 7.2, or that it could vary from 1.2 to 27.7. In the first example we have a small variance, and in the latter we have such a large variance that the estimator itself is borderline useless. To compute an estimator's variance, there are two main approaches. If the estimator and the process by which we collect the sample is simple enough, we have standard formulae for the variance. That is the case for the sample mean from a simple random sample. However, often in practice, that is not the case, say because the sampling procedure is convoluted, there is a hierarchy in the process, or the estimator is composed of several random components, possibly not independent among themselves. A good example is an animal density estimator from Passive Acoustic Monitoring (PAM), where different random components like encounter rate, detection probability, cue rate, and false-positives might be at play (see Sect. 9.6.2 for a PAM density estimation example). In such cases, resampling techniques like the bootstrap might be considered. The rationale behind the bootstrap is that one can resample with replacement from the original sample, and the variability of the estimates computed over the resamples is an estimate of the estimator variability. The reader is referred to Manly (2007) for further details about these procedures. While variance is commonly reported, when comparing variances of quantities that have different means, the coefficient of variation (CV), which is the standard deviation divided by the mean, can be useful. The CV is typically reported as a percentage (%CV = standard deviation/mean × 100).
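As a minimal illustration of the bootstrap idea (a sketch with made-up cue rates, not an analysis from this chapter), the Python code below resamples a small sample with replacement and uses the spread of the resampled means as an estimate of the estimator's variability; the coefficient of variation is also computed.

    # Sketch only: nonparametric bootstrap of a mean cue rate, plus %CV
    import numpy as np

    rng = np.random.default_rng(42)
    cue_rates = np.array([5.1, 7.2, 6.8, 9.0, 4.3, 8.1, 7.7, 6.2])  # hypothetical sounds/hour

    n_boot = 10_000
    boot_means = np.empty(n_boot)
    for i in range(n_boot):
        resample = rng.choice(cue_rates, size=cue_rates.size, replace=True)  # resample with replacement
        boot_means[i] = resample.mean()

    point_estimate = cue_rates.mean()
    boot_se = boot_means.std(ddof=1)                              # bootstrap estimate of variability
    cv_percent = 100 * cue_rates.std(ddof=1) / point_estimate     # %CV = sd / mean * 100
    print(point_estimate, boot_se, cv_percent)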
9.4.3 Modeling

In its most simplistic form, a model is a mathematical generalization of the relationship among processes (Ford 2000). Models are by necessity a simplification of reality. Extending a quote popularized by George P. Box (1976), all models are strictly wrong, in that they are always oversimplifications of reality, but many models are useful, in that they provide useful explanations or predictions of reality. Models can either be empirical or theoretic. A common example of a theoretical model in acoustics is the piston model used to represent the beam pattern in a directional sound source like the dolphin biosonar system (Zimmer et al. 2005). While theoretical models are based on theory, empirical models are based on observations. Here we will focus discussion on empirical models as observed data are commonly used to fit models to describe bioacoustical processes. Models describing the relationships between whale vocalization rates and season or location (Warren et al. 2017) or dolphin occupancy and pile driving noise (Paiva et al. 2015) are examples of empirical models. Another example is a mathematical equation that describes the number of bird calls recorded within a given period as a function of the number of birds present. By identifying the mathematical relationship between variables, past events can be explained and future scenarios predicted.
However, finding such an association requires careful interpretation, especially in observational studies. Finding an association between two or a set of variables does not necessarily imply causation. This could be either a spurious association, or an observation induced by a variable that was not recorded. It is a statistical capital sin to confuse correlation with causation. For example, on hot days, the consumption of ice creams increases, and so does the number of fires. But you can eat an ice cream guilt-free as you will not cause a fire!

9.4.3.1 Introduction to Regression: The Cornerstone of Statistical Ecology

Arguably, the most common and most useful class of statistical models are regression models. The simplest regression model (i.e., the Gaussian linear regression model) has three basic components: (1) a dependent variable that is to be modeled (i.e., described or explained), and (2) independent variables that are thought to influence the dependent variable. The third component, the random error, distinguishes statistical models from deterministic mathematical models. The random error captures how the model differs from the actual observations. In other words, it measures how well, or badly, our model describes reality. Written as a mathematical expression, the simple regression model looks like this:

Y = α + Xβ + ε,  (9.1)

where Y is the response variable, α is the intercept (a constant), X is the fixed independent variable, β is the regression coefficient for the fixed independent variable that describes the rate of change of the response variable as a function of the independent variable, and ε is the random error. In general, the parameters α and β are not known and must be estimated based on data.
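For readers who want to see Eq. (9.1) in action, the following Python sketch (hypothetical data and variable names of our own choosing) estimates α and β by ordinary least squares.

    # Sketch only: ordinary least squares fit of Y = alpha + X*beta + error
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])      # e.g., age of animal (years)
    y = np.array([3.1, 4.9, 7.2, 9.1, 10.8, 13.2])    # e.g., vocalizations per minute

    X = np.column_stack([np.ones_like(x), x])          # design matrix: intercept column + predictor
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    alpha_hat, beta_hat = coef
    residuals = y - (alpha_hat + beta_hat * x)          # realizations of the random error term
    print(alpha_hat, beta_hat, residuals.var(ddof=2))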
Most variables, particularly in ecology, are influenced by many covariates, and hence models can include multiple independent variables. For instance, in a study on whether the vocalization rate of sea lions differs with sex and age, vocalization rate (i.e., number of vocalizations per unit time) would be the response (dependent) variable and sex and age the explanatory (independent) variables. In addition to having these two explanatory variables of direct interest, other variables may also be relevant to include in models, because they might a priori be expected to also influence the response variable. Variables that may affect vocalization rate may include time, season, social context, or location. Studies in which multiple explanatory variables influence the outcome might have interactions between the explanatory variables that are important to consider. For instance, vocalization rate may differ between male and female sea lions, but only for sub-adults and adults and not for pups and juveniles.

In a regression model, a distribution is typically assumed for the response variable. This will induce a distribution for the random errors. Historically, regression models considered the errors of the dependent variable to be Gaussian distributed, and much of regression theory was developed under this assumption. Note that a model assuming a Gaussian error distribution in the dependent variable is commonly simply referred to as a linear model. Nowadays many generalizations to linear models exist (as described below and see Zuur et al. 2009 for common examples in ecology; see Generalized Linear Models in Sect. 9.5.3 below). Arguably, as noted above for random variables, the more commonly used distributions in regression models are Gaussian and gamma for continuous data, Poisson and negative binomial for counts, binomial for binary data, and beta for proportions (or probabilities), but many others exist. As for linear models, generalizations assuming other distributions associated with the response variable and associated error structure are commonly referred to by their distributions. For example, a Poisson distributed response variable with associated error structure of counts of animals is commonly referred to simply as a Poisson model. A gamma model might be used to model continuous positive values resulting from measurements of duration of a recorded song. Values representing the probability of producing a sound (between 0 and 1), however, might be modeled assuming a beta distribution.
Regardless of the error distribution of a model, classical regression models assume that observations are independent of each other (i.e., the value that one observation takes on is not influenced by another). The easiest way to ensure this happens is by design, and all efforts should be made to enforce it. In the biological world, the assumption is very often violated, and almost as often ignored. This can lead to errors in inferences made, the severity of which depends upon the degree and type of non-independence between observations. A few obvious sources of lack of independence (i.e., dependency) are observations collected within groups that share a characteristic (e.g., a litter or a pod of animals), or observations collected over space (where two observations are more likely to be similar the closer they are in space) and over time (where two successive observations are more likely to be less independent than two observations separated by a longer period of time). Researchers often mistakenly analyze data collected without proper consideration of whether observations are independent.

By exploring and accounting for dependencies, or even purposefully including them in an experimental design, the power of an analysis may be enhanced. As an example, in a repeated measures study of bird vocalization rate as a function of time of day, repeated measurements of the same individuals during the day and night could be undertaken by design (instead of randomly sampling birds at each time period). Another example is that of a chorusing group of insects, in which sounds can be produced for hours. A researcher may be interested in measuring whether the insects chorus in a given 5-min period. At any point of time within a chorusing bout, the probability that insects will be chorusing in a 5-min time window will be expected to be high if they were chorusing during the previous 5 min. This leads to what are called autocorrelated observations. In such cases, the autocorrelation structure can be incorporated into the model. If evaluating the effect of time was not of specific interest in this study, an alternative and simpler solution would be for the model to use subsampled data to include only times at which insect sound production can be considered independent. However, by explicitly accounting for the autocorrelation structure in the model, more efficient inferences are bound to be obtained as there is no loss of information. Model implementation does become a bit more complex, however. Studies that purposefully measure subjects or populations repeatedly over time to create a time series of data are called longitudinal studies. Because time-series measurements, such as those from longitudinal studies, usually cannot be considered independent from one another (e.g., an animal's current behavior is likely dependent on its behavior during the previous sample time), a wide range of models have been purposefully developed to account for non-independence (see Sect. 9.5.3). Researchers should carefully consider and plan for potential sources of dependency in the design of their studies and data collection protocols.
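A quick way to screen for the kind of serial dependence described above is to look at the lag-1 correlation of successive observations. The sketch below (hypothetical chorusing data, not from the chapter) does this in Python.

    # Sketch only: lag-1 autocorrelation of a series of 5-min chorusing indicators
    import numpy as np

    chorus = np.array([0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1])  # 1 = chorusing, 0 = silent
    x, y = chorus[:-1], chorus[1:]                    # series and its one-step lag
    lag1_r = np.corrcoef(x, y)[0, 1]                  # correlation between successive observations
    print(f"lag-1 autocorrelation: {lag1_r:.2f}")     # values well above 0 suggest dependence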
A checklist of some considerations for describing and defining variables in your study, including whether they are autocorrelated or not, is illustrated in Fig. 9.5. These considerations should be made as part of the experimental design and analytical planning process prior to data collection and will need to be reassessed post data collection.

9.5 Tackling Analyses

In this section, common analytical approaches used in descriptive and exploratory studies are presented first, followed by those used in inferential, explanatory, and predictive studies. It is important to note that analyses relevant to inferential, explanatory, and predictive questions require preliminary data exploration (see Sect. 9.3.3), thus requiring descriptive and exploratory analyses first. In these cases, preliminary exploration of data attributes may refine previously planned analytical approaches. This is particularly relevant since sufficient data quality and specific distributions are required for empirical model assumptions to be met and these features can be assessed via initial data exploration. Analytical approaches described in this section are examples only of a wider range available.
- Have the variables and variable types been identified?
- If there are multiple independent variables, are there interactions of interest and/or are any of the variables highly correlated?
- Are data for variables likely to be independent or autocorrelated?

Fig. 9.5 Checklist of some considerations for defining variables in your study
The purpose is, by way of examples, to provide a taste of the explosion of tools developed over the past few decades, the lively discussion that has arisen from their varied and inherent limitations, and the resulting developments in statistical approaches. The reader is directed to the wide range of available statistical textbooks and scientific papers to gain an in-depth understanding of the full range of approaches, their underlying concepts, and their correct use, limitations, and interpretation of outputs.

9.5.1 Descriptive and Exploratory Research Questions

Having defined the question (Sect. 9.2) and identified the variable types and some of their attributes (Sect. 9.4), tackling the analyses is the natural next step. For descriptive and exploratory questions and preliminary data exploration, summary statistics and graphical visualizations provide information about the attributes of variable measures and patterns and relationships in data. The information relates only to the properties of the observed data. Analyses that aim to generalize a sample to a population require inferential, explanatory, and predictive type analyses (discussed in Sects. 9.5.2 and 9.5.3).

9.5.1.1 Univariate Summary Statistics and Graphical Visualization

Exploration and visualization in their simplest forms are undertaken by evaluating each variable on its own (Fig. 9.6). Analyses of single variables are called univariate analyses and are used for representing and summarizing the characteristics of the variable in question. For example, univariate exploratory statistics describe a variable's properties such as statistics for central tendency including the mean (note that there are different types of means; e.g., arithmetic, geometric, and harmonic), median, or mode, and spread of data including the range (maximum and minimum), variance, standard deviation, skewness (degree of asymmetry), kurtosis (i.e., how peaked a distribution is), or interquartile range (see Table 9.3). Data corresponding to a single variable can be summarized and explored using a range of graphing tools, such as histograms, box plots, bar charts, or scatterplots. Additionally, geographical data can be explored on maps and marine charts, and acoustic spectral characteristics on spectrograms (representing signal strength over different frequencies over time). As noted previously, it is (arguably) almost impossible to produce too many graphs at an exploratory stage: the more that you can learn about your data, the better. The reader is referred to standard statistical textbooks for information on the large range of summary statistics and graphical visualizations available (e.g., Zuur et al. 2007; Zuur 2015; Rahlf 2019 for examples in R).
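The summary statistics listed above are available in most software. As a sketch only (hypothetical peak-frequency measurements, not data from the chapter), the following Python lines compute several of them with NumPy and SciPy.

    # Sketch only: univariate summaries of one variable
    import numpy as np
    from scipy import stats

    peak_freq_khz = np.array([2.1, 2.4, 2.2, 2.9, 3.1, 2.5, 2.6, 2.3, 4.8, 2.2])

    print(np.mean(peak_freq_khz), np.median(peak_freq_khz))                # central tendency
    print(np.var(peak_freq_khz, ddof=1), np.std(peak_freq_khz, ddof=1))    # spread
    print(stats.skew(peak_freq_khz), stats.kurtosis(peak_freq_khz))        # asymmetry and peakedness
    print(np.percentile(peak_freq_khz, [25, 75]))                          # interquartile range bounds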
9.5.1.2 Bivariate and Multivariate Descriptive Statistics

The analyses of two variables together are called bivariate analyses. For instance, exploration and visualization of a given variable as a function of another variable to investigate possible correlation is a bivariate analysis (see Fig. 9.7). A practical example of a bivariate visualization is the use of box plots to visualize the distribution of call types (one variable) as a function of age class (a second variable), or a scatterplot of a recorded acoustic cue rate as a function of time of day.
Fig. 9.6 Example of univariate data visualizations of dolphin sounds detected: (left) scatterplot and (right) line chart. Data source: WAMSI as part of Project 1.2.4 (Brown et al. 2017)

Table 9.3 Description of example univariate analytical and visualization tools
- Location and central tendency. Statistics: mean (arithmetic, geometric, harmonic), median, mode. Visualization tools: point, line and bar charts, histogram, boxplot. Common purposes: describe the central tendency of values in a variable.
- Spread. Statistics: range (maximum and minimum), variance, standard deviation, skewness, kurtosis, standard error, interquartile range. Visualization tools: scatter plots, box plots, interquartile range, point, line and bar charts with standard error bars. Common purposes: describe the spread of measures in a variable and identify patterns and data gaps.
Following this logic, multivariate analyses naturally consist of the joint analysis of multiple variables. Visualization tools and summary statistics can also be applied to multivariate analyses. For instance, two and three-dimensional scatterplots, bar charts, stacked bar charts, and multiple line graphs can display statistics and spread of data as a function of multiple variables on the same figure.

When bi- or multivariate analyses aim to explore associations and patterns, the magnitude of the association can sometimes be quantified. For example, in a bivariate analysis, the magnitude of the linear relationship between two variables can be quantified using a statistic called Pearson's correlation coefficient (r). The magnitude of an association such as this one is often referred to as an effect size. For example, Pearson's correlation coefficient is a standardized metric ranging from -1 to 1, with a perfect negative association yielding a value of -1, no association 0, and a perfect positive association a value of 1.
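As a brief illustration (hypothetical data, not from the chapter), Pearson's r for two variables can be computed as follows in Python.

    # Sketch only: quantifying a bivariate association with Pearson's r
    import numpy as np
    from scipy import stats

    call_rate = np.array([12, 15, 9, 20, 18, 7, 14, 22])   # calls per hour
    group_size = np.array([3, 4, 2, 6, 5, 2, 4, 7])         # animals present

    r, p_value = stats.pearsonr(group_size, call_rate)
    print(f"r = {r:.2f} (effect size), p = {p_value:.3f}")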
In some disciplines, conventional criteria have been suggested to classify effects as small, medium, and large (see Cohen 1988). What may be considered a large effect in one study (say, r > 0.6), however, may not necessarily be in another study (where, say, r > 0.8 might be considered large). Consequently, evaluating what is a meaningful effect size that a study aims to detect should always guide the design of a study and interpretation of its outcomes.
(Figure 9.7 panels plot the number of dolphin sounds per minute, and the mean number of sounds per minute, against date.)

Fig. 9.7 Example of bivariate data visualizations of dolphin sounds detected during July 2014: (left) scatterplot, (middle) box plot, and (right) bar chart with standard error bars. Data source: WAMSI as part of Project 1.2.4 (Brown et al. 2017)

It is a question that the researcher should answer based on their biological knowledge and is not related to statistical considerations.

When a study's goal is to explore associations and patterns among many variables, analyses become more complex. Multivariate approaches are commonly used to reduce many variables to a few key ones. This is known as dimension reduction. Multivariate approaches are also used to explore relationships and clustering, and to classify objects based on common multiple variable attributes. A good source for additional details on multivariate methods is Borcard et al. (2011).

One of the most common analyses used for dimension reduction is principal components analysis (PCA). The name of the method is derived from the fact that new variables, known as principal components, are obtained from the set of original variables. For example, a researcher may be interested in exploring whether populations of a social insect, such as a species of ant, can be determined based solely on acoustic signals (e.g., stridulations) its individuals produce for communication. In this case, a range of variables might be measured, such as pulse duration, bandwidth, minimum and maximum frequency, and intensity, to name a few. In acoustics, a large number of variables might be measured to capture the full range of characteristics of acoustic signals. Consequently, using a data reduction method to capture the most variance explained by these variables by creating just one or two new variables (called principal components in PCA) makes the exploration of patterns in sound characteristics easier. The first principal component retains most of the original variance, followed by the second component, and so forth. These principal components are sometimes called factors. Factor 1 and 2 can be plotted against each other, and distinct groupings of plotted values for different populations would be suggestive of differing characteristics in stridulations among populations. To statistically test differences, PCA might be used to generate factor scores as inputs into inferential, explanatory, and predictive analyses (e.g., a regression analysis). Note that there are many dimensionality reduction approaches (see Van der Maaten et al. 2007), and researchers planning on using these tools should acquaint themselves with the wide range available today, their conditions of use, and their limitations. While one approach may be suitable given the attributes of one dataset, another may be required for a different dataset.
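As a sketch of the workflow described above (with made-up stridulation measurements and our own variable names), a PCA reducing four acoustic variables to two principal components might look like this in Python, using scikit-learn.

    # Sketch only: PCA for dimension reduction of acoustic measurements
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # rows = stridulation recordings; columns = duration (s), bandwidth (kHz),
    # minimum frequency (kHz), maximum frequency (kHz)
    X = np.array([
        [0.12, 3.1, 1.0, 4.1],
        [0.15, 3.4, 1.2, 4.6],
        [0.11, 2.9, 0.9, 3.8],
        [0.22, 5.0, 1.8, 6.8],
        [0.24, 5.3, 1.9, 7.2],
        [0.21, 4.8, 1.7, 6.5],
    ])

    scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
    print(scores)   # component 1 and 2 scores; plot these to look for groupings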
Clustering and classification analyses assign objects into groups based on measured attributes (variables). Cluster analyses form groups (McGarigal et al. 2000; Zuur et al. 2009) using
"unsupervised learning," where you do not "train" the procedure by labeling "training" data with group membership as you might in other methods. A range of cluster analysis algorithms are available including common approaches such as k-means and hierarchical clustering (see Borcard et al. 2011). Clustering and classification are used commonly for pattern recognition and are described further in Chap. 8.
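For illustration only (hypothetical two-variable sound measurements), a k-means clustering of sounds into two groups could be run as follows; note that no group labels are supplied, i.e., the learning is unsupervised.

    # Sketch only: grouping sounds by two measured attributes with k-means
    import numpy as np
    from sklearn.cluster import KMeans

    features = np.array([[0.8, 2.1], [0.9, 2.3], [0.7, 1.9],   # duration (s), peak frequency (kHz)
                         [2.1, 5.2], [2.3, 5.0], [2.0, 5.5]])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
    print(labels)   # cluster membership for each sound (no training labels used)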
Many other multivariate analytical approaches are available, ranging in their assumptions, strengths, and limitations, and the variable attributes for which they are most suitable. For example, correspondence analysis (CA) is similar to PCA, but can better cope with categorical data. The reader is referred to the many textbooks on the subject, such as Everitt and Hothorn (2011) on some of the more commonly used multivariate methods and their practical application in the software R.

As in the univariate case, we reiterate that associations identified in exploratory multivariate analyses do not indicate causation. Researchers interpreting exploratory analysis results should take care to never conclude that the results are evidence of causation. A brief checklist has been provided below with examples of the types of data considerations required for selecting analyses suitable for descriptive or exploratory questions (Fig. 9.8). The checklist is not exhaustive, rather it is indicative of the kinds of considerations required.

9.5.2 Inferential Studies

Statistical inference is used to infer properties of a population (e.g., estimate parameters) or test hypotheses. There are two widely used distinct frameworks for making statistical inferences: the frequentist and the Bayesian paradigms. Classical frequentist inference has a long history and has dominated past animal behavior and ecology research, while Bayesian inference is becoming increasingly popular. Both approaches can provide insightful information, however, they represent different interpretations of probability.

In frequentist probability, the probability of an outcome occurring is based on the relative frequency of occurrence based on a large number of observations taken. For example, the probability of bird vocalizations being recorded at a study site might be based on many sample recordings taken under the same conditions at the site. If vocalizations occurred 48% of the time, the probability of the outcome of birds vocalizing would be interpreted as 0.48. As the sample size increases, the proportion of occurrences approaches the true (unknown) proportion. If the sample size is small, the calculated proportion may not be a reliable representation of the true probability.

In the Bayesian interpretation, the probability is the degree of belief of the likelihood of the outcome. For example, it may be that a researcher believed that vocalization in nesting birds is related to predator presence. The researcher had visited the site and rarely heard birds vocalizing when predators were absent but noticed them vocalizing more often when predators were present. Maybe the researcher had even made a few recordings when predators were present and absent and found that birds were vocalizing 5 out of the 10 times she recorded in the presence of predators and 1 out of 10 times in their absence. In this example, these observations would constitute the prior belief. The researcher then undertakes a study designed for the purpose of collecting an unbiased set of observations to be used in analyses (sampling in the presence and absence of predators). Using Bayes' Theorem, the prior knowledge can be used to calculate the probability of vocalization that accounts for knowledge before and after collecting evidence (sampling). If the number of samples is large, the resulting probability estimate may not change much from that obtained in a frequentist framework. However, if the sample size is small, the prior knowledge may significantly affect the estimate of probability. Therefore, the lower the sample size (i.e., in general, the less information coming from the data), the more the prior becomes important.
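One simple way to carry out the kind of updating described above is with a conjugate Beta-Binomial model. The sketch below is our own illustration, not the chapter's analysis: it assumes the informal prior observations are encoded as a Beta(5, 5) distribution (one of many defensible choices) and that a designed survey then yields 30 vocalization events in 60 recordings with predators present.

    # Sketch only: Beta-Binomial updating of a prior belief about a probability
    from scipy import stats

    prior_a, prior_b = 5, 5                 # assumed prior: roughly "5 of 10" informal observations
    successes, trials = 30, 60              # hypothetical designed survey
    posterior = stats.beta(prior_a + successes, prior_b + (trials - successes))

    print(posterior.mean())                 # posterior estimate of the vocalization probability
    print(posterior.interval(0.95))         # 95% credible interval

With a small survey the prior pulls the estimate noticeably; with many recordings the data dominate, mirroring the point made in the text.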
Many professional statisticians fall firmly in the frequentist or Bayesian camp. This often follows directly from their training, or just by convenience and actually not having thought much about the philosophical ramifications of their choice. Sometimes they are rather inflexible in their beliefs (be it in one or the other camp). We recommend a more pragmatic approach in practice. Depending upon the problem at hand, one or the other framework might be more suited to the question, easier to implement, or more sensible for incorporating all available information (Nuzzo 2014; Ortega and Navarrete 2017). Consequently, we believe that the modern bioacoustician should have a basic understanding of the differences between frequentist and Bayesian approaches, and suggest that rather than only being frequentist or Bayesian, a pragmatic approach be taken. Below, we provide a very brief introduction to statistical inference applied to parameter estimation and hypothesis testing.

- Do I require description, exploration, and visualization of individual variables, either for answering the main study question or for checking the quality of the data and assumptions of analyses planned for inferential, explanatory, or predictive studies? The answer to this is always YES. Data always need to be checked for quality and attributes, and if the question requires inference or empirical models, the validity of assumptions needs to be checked (see Section 9.3.3 and 9.4)!
- What types of variables do I have?
- Does the study question involve single or multiple variables? If multiple variables, are there a large number of variables that I need to reduce, explore their association, or investigate clustering or classification of groups characterised by them?

Fig. 9.8 Checklist of some considerations for identifying approaches for descriptive and exploratory questions

9.5.2.1 Parameter Estimation

There are a range of approaches to estimate population parameters, such as the population mean or variance, or a shape or scale parameter of a distribution, from a sample. In the context of ecological modeling, the frequentist approach to estimating parameters typically uses maximum-likelihood (Hilborn and Mangel 1997). In Maximum Likelihood Estimation (MLE), parameter values of a distribution are estimated by maximizing the likelihood function so that the MLE estimates are the values of the parameters that are most likely given the sample data. An alternative method is Least-Squares Estimation (LSE), where a solution that minimizes the sum of the squares of the residuals (the difference between the observed values and those obtained using the fitted model) is obtained. For a Gaussian-distributed response variable, and several other simple examples, the LSE solution is equivalent to the MLE. Nowadays LSE are mostly introduced for teaching purposes, and most implementations use maximum likelihood.
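As an illustration of MLE (hypothetical song durations; not an analysis from the chapter), the Gaussian mean and standard deviation can be estimated by minimizing the negative log-likelihood.

    # Sketch only: maximum-likelihood estimates of a Gaussian mean and sd
    import numpy as np
    from scipy import optimize, stats

    song_durations = np.array([231.0, 252.5, 247.8, 238.2, 259.1, 244.9, 236.4])

    def neg_log_lik(params):
        mu, sigma = params
        if sigma <= 0:
            return np.inf
        return -np.sum(stats.norm.logpdf(song_durations, loc=mu, scale=sigma))

    result = optimize.minimize(neg_log_lik, x0=[200.0, 10.0], method="Nelder-Mead")
    print(result.x)   # MLE of (mean, standard deviation)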
As indicated above, the Bayesian framework combines information on the likelihood of an outcome using observed data with prior information on the distribution of the unknown parameter being estimated. The prior distribution can be an assumption based on the researcher's understanding and experience of the parameter before the study began or it can be based on the results from a pilot or previous study. Often the prior distribution simply reflects a lack of knowledge and may be uniform over all the possible values the parameter of interest might take (i.e., the parameter space).
A posterior distribution (i.e., updated understanding) is attained by multiplying the prior distribution function with the likelihood function and scaling the result to provide a probability distribution function. All the inferences are then based on this posterior distribution. The posterior distribution thus can be seen as a compromise between the prior information and the information contained in the data, expressed via the likelihood function. There are various resources available for further reading on the Bayesian framework. Ellison (2004) provides an excellent and gentle introduction to the use of Bayesian methods in ecology, while McCarthy (2007) provides a more thorough overview. Stauffer (2007) gives an in-depth introduction to Bayesian and frequentist statistical research methods and Gelman et al. (2013) discuss Bayesian data analysis. Statistical Rethinking by McElreath (2020) is a comprehensive treatment for a reader wanting to become fully versed in the Bayesian philosophy, including R code to explore all the key concepts.

When inferential methods, such as those introduced above, are used to estimate parameters from sample data, the inferences we draw from them are uncertain. Confidence intervals (CIs; a frequentist approach) and credible intervals (CrIs; Bayesian counterparts) are tools for expressing our uncertainty about parameter estimates. Confidence intervals, although more widely used, are arguably more difficult to interpret than credible intervals. Confidence intervals give information based on our sample estimate, and by definition, if we repeated the procedure many times, 95% would include the true parameter value. Note a 95% CI does not mean that 95% of the observations lie within the interval, nor that the probability of the true value of the parameter being in the estimated interval is 0.95. After you estimate the confidence interval, the true parameter value either is, or is not, in the interval, even if we do not know which it is. In contrast, 95% CrIs would represent a range of values for which there is a 0.95 probability that the parameter falls in that range. Ironically, what this means is that while most people use frequentist confidence intervals, they often interpret them, incorrectly, as credible intervals. Although credible intervals are intuitively easier to understand, they can be more difficult to calculate than confidence intervals.
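As a sketch (hypothetical cue rates, not from the chapter), a 95% confidence interval for a mean can be obtained from the t distribution as follows.

    # Sketch only: 95% confidence interval for a mean cue rate
    import numpy as np
    from scipy import stats

    cue_rate = np.array([6.9, 7.4, 7.1, 7.6, 6.8, 7.3, 7.2, 7.0])   # sounds per hour
    mean = cue_rate.mean()
    sem = stats.sem(cue_rate)                                        # standard error of the mean
    low, high = stats.t.interval(0.95, df=cue_rate.size - 1, loc=mean, scale=sem)
    print(f"{mean:.2f} sounds/hour, 95% CI [{low:.2f}, {high:.2f}]")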
9.5.2.2 Hypothesis Testing

While hypothesis testing has been traditionally undertaken using a frequentist approach (called null hypothesis significance testing, NHST), equivalent Bayesian approaches are increasingly applied. This section focuses on providing a brief introduction to NHST as a foundation and provides references for further reading on Bayesian approaches. These basic concepts are introduced here with examples of their application to test statistics (i.e., statistics values used to reject or support a null hypothesis), however, they are also an integral part of modeling and model selection in explanatory and predictive questions (discussed in Sect. 9.5.3).

NHST constitutes a widespread paradigm under which research has been conducted (NHST, Fisher 1959), however, it is often not used sensibly, and frequently blindly used and abused. In some of these cases, pressure on researchers to find statistically significant effects has resulted in poor research practices (see Nuzzo 2014; Beninger et al. 2012 for detailed discussions on the topic). Applying NHST to reasonable hypotheses and qualifying results according to the limitations and assumptions of NHST, however, can produce important new knowledge. To achieve this, an understanding of how NHST works is required. Here we provide insight into the framework by way of example.

Under the NHST framework, researchers put forward a hypothesis (i.e., proposed explanation) about the phenomena being studied based on a study question. Let us say the researchers' question is "Do seal pup call rates differ between night and day?" The null hypothesis (H0) is that call rates do not differ between night and day, and the corresponding alternative hypothesis (HA) is that pup call rates do differ between night and day. Note that this hypothesis implies a two-tailed test, one for which the null hypothesis is rejected if a positive or a negative effect (i.e., a large or small value of the test statistic) is found. In contrast, a one-tailed test would be used by a researcher interested only in the difference between groups in a specific direction (e.g., "Are call rates greater during the day than at night?").
Fig. 9.9 Binomial probability mass function with parameters n = 100 trials and p = 0.5, with the quantiles 2.5% and 97.5% represented by vertical dashed lines. Under H0 only 5% of the observations would be more extreme than those quantile values
In this example, the researchers cannot measure the call rates of all animals in the population, so they collect a random sample, say of 100 animals. Sampling at random is key to collecting data that represent the broad population, thereby avoiding biases in the parameter estimates. In this example, on a given day, for each animal, the researchers record the number of calls produced during daylight hours and during the night. Let us call the event, in which for a given animal there are more calls during the day than at night, a "success." If we assume animals operate independently, then the number of successes in the 100 animals provides information about the null hypothesis: the further from the expected number if there were no differences between night and day, the larger the evidence against H0. We also assume that the probability of a success is constant and independent across trials and animals. Under H0 we assume the probability of a success is p = 0.5. Under H0, the number of successes has a binomial distribution with parameters n (the sample size) and p. The corresponding probability mass function with n = 100 and p = 0.5 is illustrated in Fig. 9.9.

To test the null hypothesis, the researchers use the number of successes as a test statistic. The test statistic has information about the null hypothesis, and under the null hypothesis, we know the distribution of the test statistic. If call rates are on average the same during the night and day (i.e., H0 is true), then we would expect that animals have a probability of 0.5 of producing more calls during the day than at night, and on average T (number of successes) would equal 50 (T = 50). Now imagine that the researchers observe T = 46. From Fig. 9.9, T = 46 is consistent with the null hypothesis, which we would not reject for the usual levels of statistical significance (see below for a more in-depth discussion of significance levels). On the contrary, consider the case of T = 11. This result would have been extremely unlikely under the null hypothesis, and we would be tempted to reject the null hypothesis, implying that differences between night and day might occur.

The example given here illustrates the rationale under NHST, the steps of which are: (1) define the hypothesis, (2) collect the data, (3) calculate a test statistic, with known distribution under H0, (4) evaluate how likely (or unlikely) the data would be under the null hypothesis, and (5) if very unlikely, then reject the null hypothesis, but if not unlikely, do not reject it. Consequently, the trick is to put forward a null hypothesis under which the distribution of the test statistic can be evaluated to assess how likely the data are under the null hypothesis. Given the sampling uncertainty (i.e., not observing the entire population), we can make mistakes when making decisions about whether to reject the null hypothesis or not. The confusion matrix in Table 9.4 illustrates the possible outcomes of a decision.
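The binomial test in this example can be reproduced directly; the Python lines below (our illustration, not part of the chapter) use SciPy's exact binomial test for the two observed values of T.

    # Sketch only: the seal-pup example as a two-tailed binomial test
    from scipy import stats

    for t_observed in (46, 11):
        result = stats.binomtest(t_observed, n=100, p=0.5, alternative="two-sided")
        print(t_observed, round(result.pvalue, 3))   # roughly 0.48 for T = 46, ~0 for T = 11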
Table 9.4 Confusion matrix showing the possible outcomes of a null hypothesis decision: correct decisions and Type I and Type II errors. Statistical tests usually require a significance level (i.e., Type I error rate), which defines the probability of being wrong if the null hypothesis is true
- Reality H0 true: decision "do not reject" is a correct decision; decision "reject" is a Type I error.
- Reality H0 false: decision "do not reject" is a Type II error; decision "reject" is a correct decision.

The two wrong decisions we can make are to reject the null hypothesis when it is in fact true or to not reject it when it is false. The former is known as a Type I error (i.e., an incorrect rejection, sometimes referred to as a false-positive) and the latter a Type II error (i.e., failing to find a real effect, sometimes referred to as a false-negative). In general, it is believed that Type I error is what we should guard against, with the logic illustrated here as analogous to the legal system: It is better to have a guilty defendant not convicted than to have an innocent defendant sent to death. We note, however, that depending on the problem at hand, a Type II error could have a greater consequence than a Type I error. To illustrate this, imagine that you are testing whether the size of a population has decreased below a critical threshold that requires an action for it to not go extinct. If you do not reject the null hypothesis (i.e., that the population size has not changed) but it is false, you might miss the opportunity to take action and prevent the population's extinction. Alternatively, if you mistakenly take action to protect the population while it is in fact above the minimum threshold, you might waste money but any risk of detrimental population consequences is eliminated. So, while many textbooks may allude to the importance of safeguarding against Type I error, the error type that should be of most concern is likely to be study-specific. The usual advice applies: Do not use cookbook recipes, rather think about your study. The allowable Type I error can typically be specified with a critical significance level value (defined below). Estimation of Type II errors typically requires another step, called a power analysis (see Ellis 2010 for a textbook on power analyses).

In practice, the amount of evidence against the null hypothesis required in a study is given by setting a threshold based on how unlikely the observed data would have to be under the null hypothesis before it is rejected. Alternatively, we can compute the probability of, given the null hypothesis is true, observing a value for the test statistic that is as or even more extreme than the observed value. This probability value is commonly referred to as the p-value. In the above example, assuming a two-tailed test, the p-value associated with T = 46 or T = 11 would be 0.484 and ~0, respectively. This would lead us not to reject the null hypothesis in the first case, but to reject it in the second case. Note that a common error is to confuse the p-value with the probability of the null hypothesis being true or the alternative being false. Researchers should take care in their interpretation of p-values to ensure they are accurate.

The predefined probability threshold below which we are willing to reject the null hypothesis is called the significance level (typically designated as α). A typical value for the significance level is 5%, with tests having p-values lower than 0.05 often being reported as statistically significant. This value has become widely used; however, it should be noted explicitly that there is nothing special about a 5% significance level. While using this threshold has been extremely useful in practice, there is arguably no other concept in statistics that has received more criticism. The abuse of the 5% significance level by blindly using it is among the most common criticisms of the p-value and hypothesis testing (Nuzzo 2014; Yoccoz 1991; Beninger et al. 2012). Using common sense is fundamental in selecting significance levels. It is intuitively sensible that it cannot be sound science to blindly claim a result to be significant if p = 0.049 but not significant if p = 0.051. Ultimately, researchers need to think carefully about the cost of errors they can incur and define suitable significance levels accordingly.
The focus should arguably be on reporting confidence intervals and assessing the biological importance of reported effects, not on claims of statistical significance that are often not more than statements about sample size. Given a large enough sample size, even the smallest difference will become statistically significant. Therefore, it is perhaps not surprising that a common pitfall for researchers, and equally as or arguably more important than evaluating statistical significance, is failure to consider a result's biological significance. Imagine two populations of a whale species that produce the same stereotyped calls. Let us say animals in population A produced calls at a mean rate of 22.7 per hour and in population B at 22.6 calls per hour, and that these are significantly different statistically. Is this result meaningful biologically? In other words, is the effect size of a magnitude that we care about? In most cases, almost certainly not. Therefore, a researcher should have a good understanding a priori of the magnitude of the effect that is biologically relevant. Researchers undertaking studies with large sample sizes having the power to detect very small effect sizes can fall into the trap of reporting results as important based on statistical significance instead of on effect size and significance together. Conversely, studies having a large probability of incurring Type II errors (also known as low power, i.e., having a low probability of correctly rejecting the null hypothesis when it is false) due to a small sample size may only be able to detect very large effect sizes and miss smaller ones that are biologically important. The effect size that is meaningful in a study, thus, needs to inform the experimental design to ensure a sufficiently large sample is collected before the study commences.

While NHST and p-values can provide valuable tools to bioacousticians, it is not amiss for researchers to be well aware of the lively discussion on their misuse, drawbacks, and limitations. Nuzzo (2014) provides an introduction to this discussion, Yoccoz (1991) provides a classical critical review regarding their use in biology and ecology, and Beninger et al. (2012) frame the problem in the wider context of statistics in (marine) ecology. An entire Forum section in the journal Ecology has been dedicated to the topic in recent years, and Ellison et al. (2014) show that while having been discussed and revisited many times in recent years, the discussion about their use is alive and kicking!

Having said this, a wide range of NHSTs have been developed over many decades to accommodate a range of questions and data types. Traditionally, many of these have been described as either "parametric tests" or "non-parametric tests," with parametric tests often assuming samples arise from Gaussian distributions and non-parametric tests are often used for categorical or continuous data that do not fit assumptions of parametric tests. While we urge the reader to be cautious about blindly using such tests and be aware of their limitations, we feel we must discuss them since this is how statistics is presented in most undergraduate and postgraduate courses aimed at the applied sciences, biology and ecology included. As examples, tests commonly referred to as parametric include the z-test (for testing a sample mean), t-test (for comparing the means of two groups), and analysis of variance or ANOVA (used for comparing two or more groups). Common non-parametric alternatives to the t-test and the (one-way) ANOVA are the Mann–Whitney U and Kruskal–Wallis tests, respectively.
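For illustration (hypothetical day and night call rates, not data from the chapter), the two-group comparison could be run in Python as follows.

    # Sketch only: a t-test and its common non-parametric alternative
    import numpy as np
    from scipy import stats

    day = np.array([14.1, 15.3, 13.8, 16.0, 15.1, 14.7])
    night = np.array([12.2, 11.8, 13.0, 12.5, 11.4, 12.9])

    print(stats.ttest_ind(day, night))        # independent-samples t-test
    print(stats.mannwhitneyu(day, night))     # Mann-Whitney U alternative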
The tests referred to here are only a few of the vast range available, and readers will not find it difficult to find a plethora of textbooks describing them. Note that these tests have been used widely in past decades and continue to be used in current research. Today, however, with improved knowledge of limitations of these tests, they are losing their appeal (see e.g., Touchon and McCoy 2016). In general, they are no longer the standard go-to for particular types of problems as they have been superseded by more robust approaches. With advances in statistics, a wide range of readily available modeling approaches has been developed that more than accommodate data that would have traditionally been analyzed using non-parametric tests (see Sect. 9.5.3 for an overview). Note that while many disciplines are guided by traditional "parametric" and
"non-parametric" classifications, where parametric would often be associated exclusively with the Gaussian distribution, modern approaches in statistical ecology using regression models are generally not said to be parametric or non-parametric; rather, they tend to be referred to based on the data distributions for which they are suited, such as a Poisson or gamma regression (see below for more on these).

9.5.3 Explanatory and Predictive Research Questions

Explanatory and predictive studies have questions requiring a response variable to be described as a function of a set of independent variables. Arguably, the majority of the models used by ecologists to answer this type of question are some kind of regression model. However, these models come in many forms. This section aims to introduce the reader to different types of regression models. We note upfront that model selection and validation, and inference from selected models, are fundamental aspects of these analyses and are only very briefly mentioned in Sect. 9.5.3.1. Relevant yet accessible books with plenty of practical examples addressing these steps include Zuur et al. (2007) and Zuur et al. (2009).

Historically, linear regression models (in which the errors are assumed to follow a Gaussian distribution) were the only tools available to answer this type of question. When the only tool you have is a hammer, all your problems begin to look like nails. With a Gaussian error distribution assumption, the only analytical options are simple linear regression models of the type given in Eq. (9.1) or linear regression models with several predictors (i.e., multiple regression). There are many special cases of such linear normal regression models including the independent sample t-test, ANOVA (i.e., analysis of variance for multiple sample mean comparison), ANCOVA (i.e., analysis of covariance for regressing a continuous response variable on a factor and a continuous covariate), and MANOVA or MANCOVA (i.e., multivariate extensions of the former methods). Note that these approaches have additional assumptions, such as that of homogeneity of variances. Homogeneity of variance means that the variance for a response variable is assumed to be constant across values of the independent variable. Many datasets have been forced through these methods even when they were clearly not the right tool for the job. This included, for example, transforming the response variable (e.g., by applying a log function to it) until Gaussian distributional assumptions were met to a reasonable extent. But even then, often a method's assumptions were not met. For instance, there is no transformation that will turn a discrete count into a continuous variable. For an interesting presentation about why not to log-transform data, see O'Hara and Kotze (2010). Nonetheless, sometimes processes might have properties that make a log-transformation of the data sensible and useful (e.g., Kerkhoff and Enquist 2009). While transforming data to fulfill methods' assumptions has been acceptable in the past given a lack of accessible alternative methods, this is often no longer the case, and successful ecologists need to have a few additional tools in their toolbox. The rule is one that practitioners do not enjoy: There is not a single rule that fits all questions and problems, we need to understand the problem to know how to model it. Sometimes it is even said that modeling is as much an art as it is a science. But like any good artist, you must master the techniques to use them correctly.

The next level of sophistication in regression models came with the advent of Generalized Linear Models (GLMs). GLMs allow for different types of response variable and some degree of non-linearity in the relationship between the response and explanatory variables. The relationship will still be linear at some level, but it might not be at the response level, it might only be linear at the level of the link function. What is the link function? It is a fundamental component of a GLM and is what allows responses to be constrained to a specific range of values. The link function, as its name implies, links the linear predictor and the response variable so that the model equation looks like:
g(E(Y)) = α + Xβ,     (9.2)

where g is the link function, E(Y) is the expected value of the response variable, and as in simple linear regression (see Eq. 9.1), α is the intercept (a constant), X is the predictor variable, and β is the regression coefficient. For a vector of n observations, the equation is in matrix form, where β is a vector of parameters and X is a matrix of predictor observations. The presence of a link function in Eq. (9.2) means that to obtain a prediction from this model, we need to apply the inverse of the link function to the linear predictors. As an example, consider a model with a log-link function. The inverse of the log is the exponent. This means that we need to exponentiate linear predictors to obtain the predicted value of Y for the corresponding values. But then, this also means that, irrespective of the covariate values and the coefficients estimated, the prediction will be positive (because the exponent of any number is positive). Some link functions allow values predicted for the response variable to be constrained (limited) to between 0 and 1, further increasing the range of modeling possibilities to include binary responses (e.g., presence/absence) or proportions. For instance, binary response variables like presence/absence are modeled using a binomial GLM, with logistic regression being a special case of a binomial GLM, where the link function is the logit function. Count data can be modeled using a Poisson GLM. The Poisson distribution is quite inflexible, however, because as noted above, it assumes that the mean and the variance are the same. Quite often, biological data are overdispersed, meaning that the variance is greater than the mean. For such count data, a quasi-Poisson or negative binomial response is often a second natural choice as it allows the variance to be greater than the mean. Finally, we could also consider other less commonly used, but equally useful, GLMs: (1) multinomial regression when the response can take one of several categorical outcomes, (2) gamma regression where the response is strictly positive, and (3) beta regression when the response is a probability or a proportion.
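To make the role of the link function concrete, here is a minimal R sketch (with invented data and variable names, not taken from any study cited in this chapter) that fits a Poisson GLM to call counts and back-transforms predictions with the inverse link:

# Hypothetical data: calls detected per hour and water depth (m)
dat <- data.frame(calls = c(0, 2, 5, 1, 8, 3, 0, 6),
                  depth = c(10, 20, 35, 15, 50, 30, 5, 45))
fit <- glm(calls ~ depth, family = poisson(link = "log"), data = dat)
summary(fit)
# Predictions on the link (log) scale ...
eta <- predict(fit, newdata = data.frame(depth = 25), type = "link")
# ... are exponentiated (the inverse of the log link) to obtain the expected
# count, which is therefore always positive
exp(eta)
# Shortcut: type = "response" in predict() applies the inverse link for you.
# A binomial GLM (logistic regression) would instead use a 0/1 response and
# family = binomial(link = "logit").

The same glm() call accommodates the other families mentioned above (e.g., Gamma for strictly positive responses) simply by changing the family argument.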
While GLMs allow added flexibility to standard linear regression as a result of the link function, if the relationship between the response and the predictors is highly non-linear (i.e., cannot be assumed linear even on the link function scale), then a GLM will not be adequate. This is where we need to bring non-linear functions into play, and perhaps the most widely used non-linear approach is the Generalized Additive Model (GAM). GAMs also consider a link function to allow different distributions for the response variable (as in GLMs), but we now have the response being a function of smooth functions of the predictors. In a univariate case, the model equation looks like:

g(E(Y)) = α + f(x),     (9.3)

where g is the link function, E(Y) is the expected value of the response variable, α is the intercept, x is the predictor variable, and f is a function such as a polynomial or spline. The polynomial or spline applies a smooth, curved-type function to the variable.
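As a brief sketch only (simulated data; the choice of smooth terms and family is problem-specific), a GAM of the form of Eq. (9.3) can be fitted in R with the mgcv package:

library(mgcv)  # provides gam() and the smooth constructor s()
# Simulated example: hourly call counts and time of day (h)
dat <- data.frame(hour = runif(200, 0, 24))
dat$calls <- rpois(200, lambda = exp(1 + sin(dat$hour / 24 * 2 * pi)))
# Poisson GAM: log link, with the predictor entering as a smooth function f(hour)
fit <- gam(calls ~ s(hour), family = poisson(link = "log"), data = dat)
summary(fit)
plot(fit)  # inspect the estimated smooth f(hour)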
All the models described so far, be it a simple linear model (LM), a GLM, or a GAM, include only independent variables that are considered to be fixed effects. However, sometimes the inclusion of random effects might be necessary. A random effect is useful when we have observed a (random) subset of a larger population of possible values for a covariate. For example, a study may be interested in identifying responses of bats from a certain population before, during, and after exposure to high-frequency sound. The individual bats, whose responses were measured before, during, and after exposure, are a random effect. Random effects can be incorporated into a range of linear regression type models. For instance, Generalized Linear Mixed Models (GLMM) and Generalized Additive Mixed Models (GAMM) are GLMs and GAMs that incorporate both fixed and random effects. The reader is referred to Harrison et al. (2018) for an overview of mixed models in ecology, Pedersen et al. (2019) for non-linear models including mixed effects, and Nakagawa and Schielzeth (2010) for a review of the general issue of dealing with repeated measurements sharing a correlation structure in biological studies.
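A minimal sketch of such a mixed model in R, assuming a hypothetical data frame bat_data with invented column names (calls per trial, exposure phase, and individual identity), might look like:

library(lme4)
# Individual identity enters as a random intercept; phase is a fixed effect
fit_glmm <- glmer(calls ~ phase + (1 | bat_id),
                  family = poisson(link = "log"), data = bat_data)
summary(fit_glmm)

# A GAMM-style equivalent in mgcv, with a smooth effect of time since exposure
# and a random intercept per bat (bat_id must be a factor for bs = "re"):
library(mgcv)
fit_gamm <- gam(calls ~ phase + s(time) + s(bat_id, bs = "re"),
                family = poisson, data = bat_data)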
Despite these advances, some data still do not fit the distributional requirements of GLMs and GAMs. Generalized Estimating Equations (GEEs) have been introduced recently, and hence they might still be considered in their infancy, but they are showing promising results. GEEs generalize GLMs and GAMs even further by not requiring that the response variable come from a particular family of distributions. GEEs simply impose a relationship between the mean and variance of the response. These models also allow a wide range of correlation structures to be imposed on the data, making them quite appealing when there are many observations clustered inside a few individuals. GEEs are marginal models in that the focus of inference is on the population average, and we are not so interested in the responses at the individual level. GEEs are quite specialized, and the reader is referred to Zuur et al. (2009, Chap. 12) for an introduction.
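In R, a GEE can be fitted with, for example, the geepack package; the sketch below (hypothetical data frame and column names) specifies the mean-variance relationship via the family argument and a working correlation structure for observations clustered within individuals:

library(geepack)
# calls: count response; exposure: covariate; id: individual (cluster) identifier
fit_gee <- geeglm(calls ~ exposure, family = poisson,
                  id = id, corstr = "exchangeable", data = gee_data)
summary(fit_gee)  # coefficients describe population-average (marginal) effects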
In addition to the somewhat "general" regression models above, there is a range of specialized regression models that are worth considering in certain biological questions. For instance, we have mentioned the problem of overdispersion. Often with biological data, we have very special cases of overdispersion in which there is an excess of zeroes. For example, consider you are trying to model the number of echolocation clicks a sperm whale produces per second as a function of depth, time of day, and sex. There are (at least) two reasons for there being zero clicks in a given second. A whale is in a silent state when recorded and many zeroes occur in successive seconds, or the whale is in a click-producing state but does not produce a click in the given second recorded. The regression models discussed above will likely fail to produce reasonable answers because the excess zeroes from the silent periods (potentially not explained by the covariates; i.e., not dependent on sex, depth, or time of day) cannot be accommodated. Under such a scenario, hurdle models or zero-inflated models might come in handy. While these are advanced methods and more difficult to implement and evaluate, they are worth knowing about. The reader is referred to Martin et al. (2005) for a gentle introduction to the topic with ecological examples.
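For instance, the sperm whale click example could be sketched in R with the pscl package (hypothetical data and column names; the formula part after the "|" describes the excess-zero, i.e., silence, process):

library(pscl)
# clicks: clicks per second; depth, hour, sex: covariates
fit_zinb   <- zeroinfl(clicks ~ depth + hour + sex | depth + hour,
                       dist = "negbin", data = whale_data)
fit_hurdle <- hurdle(clicks ~ depth + hour + sex | depth + hour,
                     dist = "negbin", data = whale_data)
summary(fit_zinb)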
Truncated regression is another special case of regression under which some values of the response variable cannot be observed. An example is modeling animal group sizes as a function of their acoustic footprint (e.g., the number of sounds produced by a group that are detected per minute). Now that you know about GLMs, your first thought might be to consider a Poisson or negative binomial GLM, with group size as the response variable and numbers of sounds detected as the predictor. However, in modeling this, you soon face a problem: You fit your model and make some predictions, one of which is a group size of zero! What does this mean? Nothing really, it is what we call an inadmissible estimate and a clear sign that something is not adequate. Under such a case, you might want to try a zero-truncated regression, which is essentially a GLM for which zeroes cannot be observed. Chapter 11 in Zuur et al. (2009) explores both zero-inflated and zero-truncated models.

Survival models are regression techniques that deal with a special type of response variable: the time up to an event. While these types of models were developed to model survival of animals, plants, and people, they can be used in any scenario where observations might be censored. Censored data result when we do not know the real value of the response variable but know it is at least above or below some limit or within some interval; say because we observe an animal is dead at a given time, and/or we know it was alive at a different time. For example in a bioacoustic study, a researcher may wish to model the time animals take to produce their first acoustic cue, and animals are observed for 5 min each. However, we do not know when an animal produced a cue before observations began (i.e., left censoring). In addition, an animal might not produce any cues during the 5 min, or the animal might leave the study area before the 5 min elapse (i.e., right censoring). Finally, if we recorded only which minute, but not the actual second a sound was produced, we would only know that the event occurred sometime within the interval of that minute. These are interval censored data. While a somewhat contrived example, this allows us to introduce the different kinds of censoring that are common in survival analysis.
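In R, censored times-to-event are handled by the survival package. A rough sketch for the time-to-first-cue example (hypothetical data; animals that never called within the 5-min watch are treated as right-censored) might be:

library(survival)
# time_to_cue: seconds until the first cue (300 if none was heard);
# called: 1 if a cue was produced within the watch, 0 if right-censored
fit <- survreg(Surv(time_to_cue, called) ~ group, data = obs, dist = "weibull")
summary(fit)
# Interval-censored observations (only the minute of the cue known) can be
# encoded with Surv(time = lower, time2 = upper, type = "interval2")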
Generalized Least Squares (GLS) is a regression approach that might be used when we want to relax the usual assumption of homogeneous residual variance by modeling the variance as a function of covariates. Zuur et al. (2009, Chap. 4) provide examples of the use of GLS and Reyier et al. (2014) give an acoustics application of GLS. Another perhaps more specialized use of such a regression technique is when we want to consider a general non-linear model with a specific form to relate a response variable with covariates. Then we might still want to find the parameters of the model that best fit the data. A way to do so is, akin to what might happen if one considers a straight line, to find the parameter values that minimize the sum of the squares of the residuals (i.e., the difference between the observations and the model). In a simple regression context, the model produces the fitted line, while in a generalized least squares context, the model is any function in which we might be interested. For example, if you want to determine the propagation loss (PL) for a sound that has traveled from the source to the receiver, and you expect it is proportional to log(r), where r is the range, then your model is PL = K log(r). Based on measurements of received levels of sounds with known source level, you may apply a GLS regression to estimate the value of K that best fits your data. If K is close to 10, then your environment supports cylindrical spreading, if it is close to 20, then sound is predicted to spread spherically (see Chaps. 5 and 6 on sound propagation in air and under water, respectively).
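A sketch of this fit in R (simulated measurements; the starting value K = 15 is arbitrary) simply minimizes the residual sum of squares for the assumed functional form:

# r: range (m); PL: measured propagation loss (dB), e.g., source level minus received level
df <- data.frame(r  = c(100, 200, 500, 1000, 2000),
                 PL = c(21, 24, 28, 31, 34))
fit <- nls(PL ~ K * log10(r), data = df, start = list(K = 15))
coef(fit)  # estimated K: close to 10 suggests cylindrical, close to 20 spherical spreading
# nlme::gls() can be used instead when the residual variance itself needs to be
# modeled as a function of covariates.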
All the models described so far do not consider predictor variables that are in hierarchies. Hierarchical data occur when variables are nested within each other (i.e., organized into levels). For example, individuals from different resident populations can be said to be nested within subpopulations. In turn, subpopulations can be nested within populations. Hierarchical modeling (also known as multilevel modeling) is used when inferences need to be drawn for population means at specified levels and is useful for fitting models to data obtained from complex, multilevel survey designs. For example, a study may evaluate vocal complexity of elephants at the population, sub-population, and resident population levels. Here, we do not discuss these methods further. Rather, we refer the reader to Cressie et al. (2009) and Royle and Dorazio (2008) for descriptions of these methods, including their strengths and limitations.

Given the large range of models available (a taste of which has been described above), what should aspiring ecologists today have in their statistical regression toolbox? We propose that a bare minimum is an understanding of the structure, implementation, outputs, and interpretation of GLMs, GLMMs, GAMs, and GAMMs (Table 9.5). Parameter estimates and significance tests resulting in p-values are common outputs of software capable of fitting GLMs, GLMMs, GAMs, GAMMs, and GEEs. For a practical guide to applying these in behavioral and ecological studies, see Zuur et al. (2009). O'Hara (2009) and Bolker et al. (2009) provide good introductions to GLMMs for ecologists, and the books by Zuur et al. (2007, 2009) provide information to implement and interpret GLMMs. For GAMs, the book by Wood (2006) is a standard reference, and Zuur et al. (2009) has worked-out examples in the software R.

Table 9.5 Description of some commonly used models to test the association between multiple explanatory variables and a response variable
- Generalized Linear Modeling (GLM): allows different distributions for the response variable and some degree of non-linearity in the relationship between response and explanatory variables
- Generalized Linear Mixed Effects Modeling (GLMM): an extension of GLM for use with random effects (e.g., repeated measures of subjects)
- Generalized Additive Modeling (GAM): allows different distributions for the response variable (as in GLMs), modeled as a function of smoothed predictors
- Generalized Additive Mixed Effects Modeling (GAMM): an extension of GAM for use with random effects (e.g., repeated measures of subjects)
- Generalized Estimating Equations (GEE): do not require the response variable to come from a particular family of distributions, and allow correlation structures in the data to be accounted for

Most of the models described in this section can be implemented in a frequentist framework, for instance using maximum likelihood or restricted maximum likelihood estimation. Nonetheless, for more complex models such as those including (often complex) spatial and temporal covariates (i.e., spatio-temporal models), Bayesian implementations are gaining ground. For instance, GLMs and GLMMs are fitted via maximum likelihood, or Markov Chain Monte Carlo (MCMC). MCMCs are Bayesian iterative solutions and are described in Gamerman (1997), Brémaud (1999), Draper (2000), and Link (2002). With advances of widely available implementations, users might even be using Bayesian approaches without realizing it. An example is the Integrated Nested Laplace Approximation (INLA) implemented via R-INLA (www.r-inla.org) and its derivatives that allow fitting complex spatio-temporal models without the Bayesian framework being obvious (by not requiring priors to be explicitly defined). The philosophical nuances of which framework might be more adequate under given settings, however, are beyond what we hope to discuss in this chapter.
9.5.3.1 Model Validation, Selection, and Averaging
Depending upon whether modeling is undertaken for explanatory or predictive purposes, approaches for model validation and selection may differ (Shmueli 2010). Validation means that the model has been demonstrated to have satisfactory accuracy for its intended use (Rykiel Jr 1996). Validation in explanatory modeling commonly takes the form of goodness-of-fit and residual diagnostics. Goodness-of-fit tests evaluate how well observed values agree with those expected under the statistical model (Maydeu-Olivares and Garcia-Forero 2010), while residual diagnostics determine whether residuals fit the assumption of being effectively random (see Zuur et al. 2009 for common examples in ecology). Checking for multi-collinearity (i.e., collinearity between two or more covariates) is also standard for explanatory modeling, while it is close to irrelevant for predictive modeling (see Shmueli 2010 for detailed discussion). In contrast to explanatory modeling, model validation in predictive modeling is focused on evaluating the model's ability to generalize and predict new data. Validation commonly is undertaken using approaches such as cross-validation. In cross-validation, the model's ability to accurately predict a new data set is assessed after calibrating it with a training dataset (Shmueli 2010; Cawley and Talbot 2010).

Once a set of models has been validated, the best candidate model is selected (though model validation and selection can often be an iterative process). Approaches to model selection, again, depend upon whether modeling has an explanatory or predictive goal. In explanatory modeling, the explanatory power of nested candidate models is commonly compared with a step-wise approach using significance testing (e.g., using an F-test). Here a nested model refers to one composed of subsets of covariates of another candidate model. Caution should be taken, however, as researchers may be inclined to remove covariates that are not significant, even when there is a strong theoretical justification for retaining them: such covariates are relevant in the models regardless of whether they are significant or not (Shmueli 2010). For example, a covariate representing the age class of a sparrow in a study assessing the influence of predator presence on sparrow vocal behavior may be of theoretical importance in the model. Model selection in predictive modeling commonly involves a priori specification of candidate models and selecting the best model based on the smallest possible number of parameters that adequately represent the data (i.e., the principle of parsimony). The simpler a model is, the more it can be generalized, while more complex models (containing more parameters) are more specific to the data used to fit the model. Consequently, criteria for model selection have been developed that essentially maximize the likelihood while penalizing for the number of parameters included. The Akaike's Information Criterion (AIC; see Akaike 1974) and Bayesian Information Criterion (BIC) currently are the most commonly used, among a range of others available. They are widely used for comparing nested and non-nested models (Burnham and Anderson 2002), although there is some discussion around suitability for use in non-nested models (see Ripley 2004). Resulting criteria such as AIC or BIC values for candidate models are then compared and the model yielding the lowest value is generally deemed to be preferred. Note that there is active research on the circumstances under which AIC, BIC, and the many other criteria available perform best, and whether they should be used together to inform model selection (Kuha 2004). An important take-home message is that model selection criteria such as AIC and BIC can only suggest a preferred model from those compared, even if they all perform poorly at the validation stage. In other words, the preferred model may still be a poorly fitting model, and therefore, selection criteria are only relative measures of model goodness-of-fit.
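In R, these criteria (and the Akaike weights used for model averaging, discussed next) can be computed directly. The sketch below assumes three already fitted candidate models m1, m2, and m3 (hypothetical objects):

aic   <- AIC(m1, m2, m3)          # table of AIC values; BIC() works the same way
delta <- aic$AIC - min(aic$AIC)   # AIC differences relative to the best model
w     <- exp(-delta / 2) / sum(exp(-delta / 2))  # Akaike weights
cbind(aic, delta, weight = round(w, 3))
# The lowest AIC/BIC only identifies the preferred model among those compared;
# it says nothing about absolute goodness-of-fit.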
In predictive modeling, averaging over a range of plausible models has become widely used to reduce prediction error and account for model selection uncertainty. This is undertaken, for example, by computing a measure that ranks the set of plausible models according to their support by the data (e.g., Akaike weights), applying the weights to predictions from each model, and then computing the average. This provides weighted averaged predictions, with weights dependent on how much each model is supported by the data. There are many other methods for undertaking model averaging. Model averaging performance depends on each model's predictive bias and variance and covariance between models, among other things (see McElroy 2016 for complete discussion). In recent work, model averaging has been shown to be particularly useful when predictive errors of contributing model predictions are dominated by variance, and when covariance between models is low (McElroy 2016).

While a highly simplified overview of some tools available on the topic of model validation, selection, and averaging has been provided here, researchers should be familiar with them and access the latest literature to identify the appropriate approaches for their study.

9.5.4 The Future of Bioacoustical Analytical Approaches

In this chapter, we have only provided a flavor of common approaches used today and have not delved into the wide range of new developments being introduced into the discipline. Interdisciplinary research linking the fields of biology, ecology, and statistics has a long tradition of providing fertile ground for innovative statistical methods, with many methods having been developed when existing methods were not adequate to cope with new problems (Olivier et al. 2014). The current revolution in data acquisition systems (see Chap. 2), such as high-resolution sensors in animal-borne tags and increasing numbers of long-term passive acoustic deployments that lead to big data, is also likely to influence the next generation of statistical methods suited for ecological and acoustical analysis. Analysis of big data through increased computational capacity has already provided a range of new powerful tools to science.

As an example of such approaches, machine learning is rapidly gaining in popularity as it increasingly improves pattern recognition accuracy (Christin et al. 2019). Such methods can improve processing capacity in large datasets resulting from acoustic instrumentation. An example of more sophisticated analytical approaches is the growing use of hierarchical, state-space, and hidden process methods (e.g., Auger-Méthé et al. 2020 for an introduction to their application in ecology) that model underlying processes while accounting for biases and uncertainty. Advances in these approaches may improve our ability to predict future scenarios and implement intervention before a potentially undesirable future scenario unfolds (see Cressie et al. 2009 for discussion).

We also suggest readers become acquainted with the growing work being conducted in the area of statistical decision theory, which is concerned with making decisions by accounting for uncertainties involved in the decision process using statistical knowledge resulting from data collected. Rather than attempting to provide a general review of the large field of decision theory here, we refer the reader to an introduction to its application to ecology by Williams and Hooten (2016), which will introduce the reader to a range of other resources on the topic. Because these and many other methods are continually evolving, researchers are encouraged to keep well-informed of current developments appearing in methods-based scientific journals, such as Methods in Ecology and Evolution.
9.6 Examples in Bioacoustics

The wide range of quantitative approaches introduced above can be used to analyze bioacoustical data to answer research questions ranging from understanding natural vocal behavior to activity patterns, community and conservation ecology, habitat use, species diversity, distribution, occupancy, density and abundance, and anthropogenic impacts (among many others). Faunal groups that have been the subject of bioacoustics research include invertebrates, anurans (i.e., frogs and toads), fish, birds, bats, other terrestrial mammals, and marine mammals, but many others could be considered. As long as sound is produced, it could be used as a source of information. A recent review documented 460 peer-reviewed published papers on passive acoustic monitoring in terrestrial habitats alone, with bats (50% of papers) and activity patterns (24%) dominating (Moreira Sugai et al. 2018). Marine mammals feature prominently in bioacoustic research as water is a highly conducive medium for sound to travel through, and visual observations can prove comparatively expensive for limited returns on detections.

Rather than reviewing analytical approaches across the hundreds of existing bioacoustics studies, we have selected two recent studies as examples, and discuss the rationale for the particular analytical approaches taken. The research topics in the example studies are exploring temporal changes in call frequency and using acoustic data for abundance and density estimation.

9.6.1 Temporal Changes in Call Frequency

As indicated previously, due to ever-increasing computing power and storage and technological advances in acoustic equipment, acoustic studies can provide extremely long-term datasets. These datasets allow us to explore changes to calling behavior on a scale that, until recently, would have been very difficult. A recent example is illustrated in Miksis-Olds et al. (2018) where the frequency content of a type of blue whale song recorded primarily in the Indian Ocean was investigated. The song type is attributed to a pygmy blue whale subspecies (Balaenoptera musculus indica, Committee on Taxonomy 2021) that appears to be resident in the northern Indian Ocean. The song type has three distinct units, and this analysis focused on the ~60-Hz component of Unit 2, a frequency-modulated upsweep, and Unit 3, a ~100-Hz tonal downsweep. A decade of data from the Indian Ocean Comprehensive Nuclear-Test-Ban Treaty International Monitoring Station (CTBTO IMS) at Diego Garcia was analyzed (2002–2013). Ambient noise was also analyzed, but we do not focus on that part of the study here.

Power spectral densities (PSD) were computed for 2-h sections of data, which could be used to detect peaks in the frequency bands of interest (approximately 56–63 Hz for the 60-Hz component of Unit 2, and 100–107 Hz for Unit 3), using a 3-dB signal-to-noise threshold. The paper shows a figure of the number of hours with vocal presence detected each week, for each year (Fig. 9.3 in Miksis-Olds et al. 2018), highlighting the importance of producing exploratory plots; in this case, the variability in the data is made clear.

The average over each week, across years, was used to identify weeks with peak average vocal presence. Weeks 21 and 22 were those with peak average vocal presence and data from these weeks were investigated further. The frequency peaks from the PSDs from these weeks across all years were measured. A linear regression model was fitted to the week 21 and 22 frequency peak measurements from all years. The response variable was frequency, and year and song unit were explanatory variables. Song unit was included in the model as a factor variable. An interaction was also included between year and song unit, which was used to investigate whether the rate of any frequency change over time differed between the two song units. Model assumptions (linearity, constant error variance, error independence, and normality) were all assessed using diagnostic plots and relevant hypothesis tests, and all model assumptions were met.
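The structure of such a model is straightforward to reproduce in R; the sketch below uses invented object and column names (not the original data of Miksis-Olds et al. 2018):

# peaks: data frame with columns freq (Hz), year, and unit (factor: "Unit2", "Unit3")
fit <- lm(freq ~ year * unit, data = peaks)  # interaction allows unit-specific trends
summary(fit)    # slopes give the estimated frequency change per year for each unit
plot(fit)       # diagnostic plots for the linearity, variance, and normality checks
anova(lm(freq ~ year + unit, data = peaks), fit)  # F-test for the interaction term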
The linear model results are depicted in Fig. 9.10. The figure shows all weekly data plotted (blue dots) with the modeled 21–22 week data highlighted in red for both song units. Again, the utility of plotting data is clear here: the decline in frequency is evident, with an apparent difference in rate of decline between the two units. The linear model results confirmed the frequency decline; the frequency of the ~60-Hz Unit 2 decreased at a rate of 0.18 Hz/year, while the frequency of Unit 3 decreased at 0.54 Hz/year. The interaction term was selected during model selection (using an F-test), which confirmed that the rates of frequency decline were indeed different between the two units.

Fig. 9.10 Peak frequency of Sri Lankan whale vocalizations determined from weekly PSD sound averages. The blue circles are the weekly peaks measured throughout the season when whales were vocally present. The trend line is related to the red circles that are peak frequency from weeks 21 and 22 of each year. The greyed regions designate the 95% confidence intervals for the trend. Reprinted with permission from Miksis-Olds et al. (2018). © Acoustical Society of America, 2018. All rights reserved

This analysis shows that simple regression analyses can be very effective in confirming patterns observed in exploratory data plots. We note here that the regression analysis in the paper focused on data from weeks 21 and 22 to be comparable with methods from a similar study (Gavrilov et al. 2012). However, frequency measurements were taken across all weeks of each year (as shown in Fig. 9.10), which could also be used in a regression model. In addition, it is common for bioacoustical analyses to have several natural extensions. In this case, relaxing the Gaussian assumption could be considered via a Generalized Linear Model, or non-linear patterns in the frequency decline could be explored using a Generalized Additive Model.

9.6.2 Abundance and Density Estimation

The estimation of animal population size (abundance) and the number of animals in a given area (density) are metrics that are very informative for management and conservation actions. There are several abundance and density estimation methods available (e.g., Borchers et al. 2002); popular methods include mark-recapture and distance sampling. Such methods are known as absolute abundance or density estimation methods, as the methods estimate the total number of animals (in a defined area, for density estimates), including animals missed by a survey. Common reasons why animals are not detected during a survey are that they may be too far away, and/or detection is made difficult by environmental conditions (e.g., rough seas may prevent marine mammal sightings at sea unless the animals are very close, or windy conditions may mask the sounds of singing birds in recordings). The probability of detecting an animal is a key parameter in absolute abundance and density estimation methods, and accounts (in part) for undetected animals during a survey.

Acoustic data are increasingly being used for absolute abundance and density estimation, both in terrestrial and marine environments (e.g., Marques et al. 2013; Stevenson et al. 2015). Here we discuss a density estimation analysis for Blainville's beaked whales (Mesoplodon densirostris) from seafloor-moored hydrophone data recorded in the Bahamas (Marques et al. 2009). The analysis involved several of the concepts we have discussed throughout the chapter, which we highlight here.
The paper begins by introducing the density estimation equation (i.e., the estimator; see Sect. 9.4.2). The equation contains several parameters to be estimated, including the probability of detecting a beaked whale echolocation click on one of the seafloor-moored hydrophones. Survey design and variance estimation of the parameters (including confidence intervals) are also discussed. A summary of methods to estimate the detection probability is given. Mark-recapture and distance sampling methods are commonly used approaches to estimate the detection probability, but Marques et al. (2009) needed an alternative method, given that the hydrophone recordings were not suitable for either mark-recapture or distance sampling-based methods. Therefore, a trial-based detection probability estimation method was used. The specific trial-based method used in this study relied on auxiliary data from animals tagged with acoustic tags, which swam near the moored hydrophones. Clicks produced by the animals and recorded on the tags created "trials"; a successful trial was achieved if the same clicks recorded on tags of the tagged animal were detected on the moored hydrophones. In addition, the tag data provided the slant distance of each tagged animal from the moored hydrophones, as well as the animal's orientation toward, or away from, a given moored hydrophone. These data allowed detection probability to be modeled as a function of a whale's orientation and distance from the moored hydrophones using regression modeling. Specifically, a Generalized Additive Model (GAM) was used due to its flexibility in allowing non-linear relationships between the response and explanatory variables. The response variable was defined as the detection, or non-detection, of each click produced by the tagged animal on the moored hydrophones. The explanatory variables, or covariates, were (a) the horizontal off-axis angle (hoa) and (b) vertical off-axis angle (voa) of the tagged whale, with respect to a given moored hydrophone, and (c) the distance of the tagged whale from the hydrophone. A binomial distribution was assumed for the response variable due to the binary nature of the trial data (i.e., detected, or not detected) and a logistic link function was used in the GAM. Finally, to estimate the average detection probability (i.e., a single parameter value for the estimator), a Monte Carlo simulation was implemented where the dive profiles from the tags were randomly placed around virtual moored hydrophones. In the simulation, the slant range and orientation of the clicks from the dive profiles from the moored hydrophones could be calculated, and then these values could be used along with the GAM to predict the detection probability for each click in the simulation. The average of these predicted detection probabilities was used in the estimator. Two other parameters required for the estimator, the false-positive proportion and cue production rate, are discussed in the paper in detail, on which we do not focus here.
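In outline (hypothetical object and column names, not the authors' code), the detection-probability model and the averaging step could look as follows in R:

library(mgcv)
# trials: one row per click, with detected (0/1), slant distance dist (m),
# and horizontal/vertical off-axis angles hoa and voa
dfit <- gam(detected ~ s(dist) + s(hoa, voa),
            family = binomial(link = "logit"), data = trials)
# Predict detection probability for simulated clicks placed around virtual
# hydrophones, then average to obtain the single value used in the estimator
p     <- predict(dfit, newdata = sim_clicks, type = "response")
p_bar <- mean(p)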
The results of the GAM are shown in Fig. 9.11. The modeled relationships between (a) detection probability and slant range, (b) vertical and horizontal off-axis angle and detection probability, (c) horizontal off-axis angle and slant range, and (d) vertical off-axis angle and slant range are all depicted. The average detection probability of a beaked whale click within 8 km of a moored hydrophone was estimated to be 0.03 (i.e., if a beaked whale click was produced within 8 km of a moored hydrophone, the study estimated that there was, on average, a 3% chance of detecting that same click). The variance around the average was estimated using the bootstrap and presented as a coefficient of variation (CV, defined in Sect. 9.4.2) and was estimated to be 0.16, or 16% when expressed as a percentage. Finally, the estimator was used to estimate beaked whale density in the study area of either 25.3 (CV: 19.5%) or 22.5 (19.6%) animals per 1000 km², depending on the false-positive proportion used (two estimates were produced using differing methods).

Fig. 9.11 The estimated detection function. Plots (on the response scale) of the fitted smooths for a binomial GAM model with slant distance and a 2D smooth of hoa and voa. For the top left plot, the off-axis angles are fixed at 0, 45, and 90 degrees (respectively the solid, dashed, and dotted lines). Remaining plots are two-dimensional representations of the smooths, where black and white represent respectively an estimated probability of detection of 0 and 1. Distance (top right panel) and angle not shown (bottom panels) are fixed respectively at 0 m and 0 degrees. Reprinted with permission from Marques et al. (2009). © Acoustical Society of America, 2009. All rights reserved

9.7 Software for Analyses

There are many standard, relatively easy-to-use software packages that require no (or very little) coding skills to carry out statistical analyses, including SPSS (IBM Corp., Armonk, NY, USA), Statistica (TIBCO Software, CA, USA), Stata (StataCorp, College Station, TX, USA), Minitab (Minitab Inc., State College, PA, USA), Xlstat (Addinsoft, Ile-de-France, France), and SAS (SAS Institute, Cary, NC, USA), among others. In the field of bioacoustics, it is common for acoustic data to be processed in MATLAB (The MathWorks Inc., Natick, MA, USA) due to its powerful signal processing package. MATLAB users may find that their workflow is streamlined by undertaking statistical analyses in the same software if all required tools are available.

For those planning, however, on undertaking analyses that draw from the most recent up-to-date developments in statistical ecology and require a highly flexible environment to do so, a free open-source software environment like R is recommended (R Core Team 2020). R is primarily used for statistical computing and production of graphics (though R's GIS, and even signal processing capabilities, are expanding). The software benefits from a large number of base and contributed packages that can easily be downloaded and an environment in which users may develop their own algorithms and packages. There are now many sources of instructional manuals and books guiding users on how to create high-quality data representations and run analyses in R, including Crawley (2013), Kerns (2010), Zuur et al. (2009), Bolker (2008), Lawson (2014), among many others. The CRAN Task View: Analysis of Ecological and Environmental Data¹ maintained by Gavin Simpson is an excellent resource for locating suitable packages for statistical analysis of biological data. R can be accessed and downloaded through a web browser² and for most users, we recommend a user-friendly GUI like RStudio (RStudio Team 2020³). RStudio is an integrated development environment for R that includes a console, an editor for code development and execution, and tools for plotting, debugging, tracking history, and managing the workspace. An interesting feature of R integrated with RStudio is the ability to adhere in a straightforward way to the concept of reproducible research via dynamic reports in RMarkdown. If the reader is new to the topic, we recommend the book by Xie et al. (2020).⁴

¹ CRAN Task View: https://CRAN.R-project.org/view=Environmetrics; accessed 9 November 2020.
² R Core Team is accessible at https://www.r-project.org/; accessed 1 January 2020.
³ RStudio is accessible at https://www.rstudio.com/products/RStudio/; accessed 9 November 2020.
⁴ RMarkdown: The Definitive Guide by Xie Y, Allaire JJ, Grolemund G: https://bookdown.org/yihui/rmarkdown/; accessed 9 November 2020.

9.8 Summary

A key outcome of bioacoustics research is the production of new knowledge that informs conservation management. The knowledge produced needs to be reliable and easily understood, which is no trivial task given the complicated nature of animal behavior. The reality is that the phenomena from which we want to derive inferences are multifaceted, with many interconnecting attributes, and patterns and signals obscured by statistical noise (i.e., variability not associated with the conditions under investigation). Consequently, underlying mechanisms that explain the patterns we observe are not easily revealed.

Not only are animal behaviors occurring in a highly complex environment, but many challenges are presented in conducting the research itself. For instance, as researchers we are not easily able to avoid or reduce the statistical noise in the environment by controlling field conditions; and when we undertake experiments of animals in captivity to reduce noise in a laboratory, we cannot be sure that results are transferable to the wild.
In addition, we introduce biases in our observations through our own subjective, non-random filters. Only by understanding these filters can we either eliminate or adjust biases to make reliable inferences about nature. Quantitative skills, including survey design considerations, are therefore an essential part of a bioacoustician's toolkit and should be viewed just as essential as field skills and signal processing methods. These statistical methods are tools that enable the researcher to ask difficult but often important and exciting questions about their research topic.

However, given the complexity in nature, research design challenges, and the multidisciplinary nature of studying animal behavior through acoustics, it is not realistic to expect specialists in one field to become experts across multiple fields (i.e., behavior, ecology, bioacoustics, and statistics). What behaviorists and bioacousticians can aim for is to understand foundational statistical concepts, have a broad knowledge of the range of existing techniques available, and be able to identify critical pitfalls in survey design and data analyses. In addition, practitioners should be able to conduct a range of current standard analyses and know when to seek support for more sophisticated approaches.

It is our hope that through the introduction of basic statistical concepts in this chapter, readers can more confidently avoid design and analysis pitfalls and make the necessary considerations to select the most suitable approaches to successfully answer their research questions. We would like researchers to feel empowered to critically evaluate the transferability of standard practices across broader spectra of questions and identify inadequacies where they occur. Finally, and foremost, we hope that at the conclusion of this chapter, readers feel inspired to place greater focus on the biological significance of research outputs, using quantitative methods as a tool to support their conclusions.

We close this chapter by providing you, the reader, with our culinary rendition of the meaning of statistics: It is the science that uses data as its main ingredient, uncertainty as a key seasoning driving the final flavor of a meal, and guides the collection and mixing of the ingredients, through sampling, experimentation, and analysis. Taken together, hopefully, delicious scientific meals will result, by drawing meaningful and reliable inferences from data. Statistics is paramount for science in general, and bioacoustics is in that regard no exception.

Acknowledgement We thank Steve Buckland and Jay Barlow for their helpful comments prior to Springer's peer-review.
References

Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
Auger-Méthé M, Newman K, Cole D, Empacher F, Gryba R, King AA, Leos-Barajas V, Flemming JM, Nielson A, Petris G (2020) An introduction to state-space modeling of ecological time series. arXiv preprint arXiv:2002.02001
Beninger PG, Boldina I, Katsanevakis S (2012) Strengthening statistical usage in marine ecology. J Exp Mar Biol Ecol 426:97–108
Bolker BM (2008) Ecological models and data in R. Princeton University Press, Princeton
Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen JR, Stevens MHH, White JSS (2009) Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol Evol 24(3):127–135
Borcard D, Gillet F, Legendre P (2011) Numerical ecology with R. Springer, New York
Borchers DL, Buckland ST, Zucchini W (2002) Estimating animal abundance. Springer, New York
Box GE (1976) Science and statistics. J Am Stat Assoc 71(356):791–799
Brémaud P (1999) Markov chains: Gibbs fields and Monte Carlo simulation. Springer, New York, pp 253–322
Brown A, Smith J, Salgado Kent C, Marley S, Allen S, Thiele BL, Erbe C, Chabanne D (2017) Relative abundance, population genetic structure and passive acoustic monitoring of Australian snubfin and humpback dolphins in regions within the Kimberley. https://doi.org/10.13140/RG.2.2.17354.06082
Burnham K, Anderson D (2002) Model selection and multimodel inference: a practical information-theoretic approach, 2nd edn. Springer, New York
Casella G, Berger RL (2002) Statistical inference. Duxbury, Belmont, CA
Cawley GC, Talbot NL (2010) On over-fitting in model selection and subsequent selection bias in performance evaluation. J Mach Learn Res 11:2079–2107
Christin S, Hervet É, Lecomte N (2019) Applications for deep learning in ecology. Methods Ecol Evol 10(10):1632–1644
Cochran WG (1977) Sampling techniques, 3rd edn. Wiley, New York
Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. L. Erlbaum Associates, Hillsdale, NJ
Cohen J (2013) Statistical power analysis for the behavioral sciences, 2nd edn. Routledge, New York
Committee on Taxonomy (2021) List of marine mammal species and subspecies. Society for Marine Mammalogy. www.marinemammalscience.org. Accessed 2 Sep 2021
Crawley MJ (2013) The R book, 2nd edn. Wiley, Hoboken, NJ
Cressie N, Calder CA, Clark JS, Hoef JMV, Wikle CK (2009) Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling. Ecol Appl 19(3):553–570
Dytham C (2011) Choosing and using statistics: a biologist's guide, 3rd edn. Wiley, Boca Raton, FL
Ellis PD (2010) The essential guide to effect sizes: statistical power, meta-analysis, and the interpretation of research results. Cambridge University Press, Cambridge
Ellison AM (2004) Bayesian inference in ecology. Ecol Lett 7(6):509–520
Ellison AM, Gotelli NJ, Inouye BD, Strong DR (2014) P values, hypothesis testing, and model selection: it's déjà vu all over again. Ecology 95(3):609–610
Everitt B, Hothorn T (2011) An introduction to applied multivariate analysis with R. Springer, New York
Fisher RA (1959) Statistical methods and scientific inference, 2nd edn. Oliver and Boyd, Edinburgh, UK
Ford ED (2000) Scientific method for ecological research. Cambridge University Press, Cambridge
Gamerman D (1997) Sampling from the posterior distribution in generalized linear mixed models. Stat Comput 7(1):57–68
Gavrilov AN, McCauley RD, Gedamke J (2012) Steady inter and intra-annual decreases in the vocalization frequency of Antarctic blue whales. J Acoust Soc Am 131:4476–4480. https://doi.org/10.1121/1.4707425
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. CRC Press, Boca Raton, FL
Harrison XA, Donaldson L, Correa-Cano ME, Evans J, Fisher DN, Goodwin CE, Robinson BS, Hodgson DJ, Inger R (2018) A brief introduction to mixed effects modelling and multi-model inference in ecology. PeerJ 6:e4794
Hilborn R, Mangel M (1997) The ecological detective: confronting models with data, vol 28. Princeton University Press, Princeton
Kerkhoff AJ, Enquist BJ (2009) Multiplicative by nature: why logarithmic transformation is necessary in allometry. J Theor Biol 257(3):519–521
Kerns GJ (2010) Introduction to probability and statistics using R, 1st edn. G. Jay Kerns, Youngstown
Kuha J (2004) AIC and BIC: comparisons of assumptions and performance. Sociol Methods Res 33(2):188–229
Lawson J (2014) Design and analysis of experiments with R, vol 115. CRC Press, Boca Raton, FL
Leek JT, Peng RD (2015) What is the question? Science 347(6228):1314–1315
Link RF (2002) Principal applications of Bayesian methods in actuarial science: a perspective. North Am Actuarial J 6(2):129
Manly BFJ (2007) Randomization, bootstrap and Monte Carlo methods in biology. CRC Press, Boca Raton, FL
Manly BF, Alberto JAN (2014) Introduction to ecological sampling. CRC Press, Boca Raton, FL
Marques TA, Thomas L, Ward J, Dimarzio N, Tyack PL (2009) Estimating cetacean population density using fixed passive acoustic sensors: an example with Blainville's beaked whales. J Acoust Soc Am 125(4):1982–1994. https://doi.org/10.1121/1.3089590
Marques TA, Thomas L, Martin SW, Mellinger DK, Ward JA, Moretti DJ, Harris D, Tyack PL (2013) Estimating animal population density using passive acoustics. Biol Rev 88(2):287–309
Martin TG, Wintle BA, Rhodes JR, Kuhnert PM, Field SA, Low-Choy SJ, Tyre AJ, Possingham HP (2005) Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecol Lett 8(11):1235–1246
Matthiopoulos J (2010) How to be a quantitative ecologist. Wiley, Hoboken, NJ
Maydeu-Olivares A, Garcia-Forero C (2010) Goodness-of-fit testing. Int Encycl Educ 7(1):190–196
McCarthy MA (2007) Bayesian methods for ecology. Cambridge University Press, Cambridge
McElreath R (2020) Statistical rethinking: a Bayesian course with examples in R and Stan. CRC Press, Boca Raton, FL
McElroy TS (2016) Nonnested model comparisons for time series. Biometrika 103(4):905–914
McGarigal K, Cushman SA, Stafford S (2000) Multivariate statistics for wildlife and ecology research, 1st edn. Springer, New York
Miksis-Olds JL, Nieukirk SL, Harris DV (2018) Two unit analysis of Sri Lankan pygmy blue whale song over a decade. J Acoust Soc Am 144(6):3618–3626
Moreira Sugai L, Freire Silva T, Wagner Ribeiro J, Llusia D (2018) Terrestrial passive acoustic monitoring: review and perspectives. Bioscience 69. https://doi.org/10.1093/biosci/biy147
Nakagawa S, Schielzeth H (2010) Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biol Rev 85:935–956
Nuzzo R (2014) Scientific method: statistical errors. Nat News 506(7487):150
O'Hara RB (2009) How to make models add up - a primer on GLMMs. Annales Zoologici Fennici, BioOne, pp 124–137
O'Hara R, Kotze J (2010) Do not log-transform count data. Nat Precedings 1:118–122
Ortega A, Navarrete G (2017) Bayesian hypothesis testing: an alternative to null hypothesis significance testing (NHST). In: Psychology and social sciences. IntechOpen
Paiva EG, Salgado Kent CP, Gagnon MM, McCauley R, Finn H (2015) Reduced detection of Indo-Pacific bottlenose dolphins (Tursiops aduncus) in an inner harbour channel during pile driving activities. Aquat Mamm 41(4):455–468
Panzeri S, Magri C, Carraro L (2008) Sampling bias. Scholarpedia 3(9):4258
Pedersen EJ, Miller DL, Simpson GL, Ross N (2019) Hierarchical generalized additive models in ecology: an introduction with mgcv. PeerJ 7:e6876
Quinn GP, Keough MJ (2002) Experimental design and data analysis for biologists. Cambridge University Press, Cambridge
Rahlf T (2019) Data visualisation with R: 111 examples. Springer Nature, New York
Reyier EA, Franks BR, Chapman DD, Scheidt DM, Stolen ED, Gruber SH (2014) Regional-scale migrations and habitat use of juvenile lemon sharks (Negaprion brevirostris) in the US South Atlantic. PLoS One 9(2):e88470
Ripley BD (2004) Selecting amongst large classes of models. In: Methods and models in statistics, in honour of Professor John Nelder, FRS. World Scientific, New York, pp 155–170
Royle JA, Dorazio RM (2008) Hierarchical modeling and inference in ecology: the analysis of data from populations, metapopulations and communities. Elsevier, New York
Rykiel EJ Jr (1996) Testing ecological models: the meaning of validation. Ecol Model 90:229–244
Salkind NJ (2010) Encyclopedia of research design, vol 1. Sage, Thousand Oaks, CA
Shmueli G (2010) To explain or to predict? Stat Sci 25(3):289–310
Stauffer HB (2007) Contemporary Bayesian and frequentist statistical research methods for natural resource scientists. Wiley, Hoboken, NJ
Stevenson BC, Borchers DL, Altwegg R, Swift RJ, Gillespie DM, Measey GJ (2015) A general framework for animal density estimation from acoustic detections across a fixed microphone array. Methods Ecol Evol 6(1):38–48. https://doi.org/10.1111/2041-210X.12291
RStudio Team (2020) RStudio: integrated development for R. RStudio, PBC, Boston, MA. http://www.rstudio.com/
R Core Team (2020) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/
Touchon JC, McCoy MW (2016) The mismatch between current statistical practice and doctoral training in ecology. Ecosphere 7(8):e01394
Underwood AJ (1997) Experiments in ecology: their logical design and interpretation using analysis of variance. Cambridge University Press, Cambridge
Van Der Maaten L, Postma E, Van den Herik J (2007) Dimensionality reduction: a comparative review. J Mach Learn Res 10(66–71):13
Warren VE, Marques TA, Harris D, Thomas L, Tyack PL, Aguilar de Soto N, Hickmott LS, Johnson MP (2017) Spatio-temporal variation in click production rates of beaked whales: implications for passive acoustic density estimation. J Acoust Soc Am 141(3):1962–1974
Wasserstein RL, Schirm AL, Lazar NA (2019) Moving to a world beyond "p<0.05". Taylor & Francis, New York
Wilcox RR (2010) Fundamentals of modern statistical methods: substantially improving power and accuracy. Springer, New York
Williams PJ, Hooten MB (2016) Combining statistical inference and decisions in ecology. Ecol Appl 26:1930–1942
Xie Y, Allaire JJ, Grolemund G (2020) R markdown: the definitive guide. CRC Press, Boca Raton, FL
Yoccoz NG (1991) Use, overuse, and misuse of significance tests in evolutionary biology and ecology. Bull Ecol Soc Am 72(2):106–111
Zimmer WM, Johnson MP, Madsen PT, Tyack PL (2005) Echolocation clicks of free-ranging Cuvier's beaked whales (Ziphius cavirostris). J Acoust Soc Am 117(6):3919–3927
Zuur A, Ieno EN, Smith GM (2007) Analyzing ecological data. Springer, New York
Zuur A, Ieno EN, Walker N, Saveliev AA, Smith GM (2009) Mixed effects models and extensions in ecology with R. Springer, New York

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license
and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder.
10 Behavioral and Physiological Audiometric Methods for Animals

Sandra L. McFadden, Andrea Megela Simmons, Christine Erbe, and Jeanette A. Thomas

Jeanette A. Thomas (deceased) contributed to this chapter while at the Department of Biological Sciences, Western Illinois University-Quad Cities, Moline, IL, USA

S. L. McFadden (*)
Department of Psychology, Western Illinois University, Macomb, IL, USA
e-mail: sl-mcfadden@wiu.edu

A. M. Simmons
Department of Cognitive, Linguistic, & Psychological Sciences, Brown University, Providence, RI, USA
e-mail: andrea_simmons@brown.edu

C. Erbe
Centre for Marine Science and Technology, Curtin University, Bentley, WA, Australia
e-mail: c.erbe@curtin.edu.au

© The Author(s) 2022
C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1, https://doi.org/10.1007/978-3-030-97540-1_10

10.1 Introduction

Audiometric studies, using behavioral or physiological methods, describe and quantify the hearing capabilities of animals. Audiometric studies using behavioral methods test hearing directly, by requiring an animal to make an observable response when it hears a target sound. The required response can be a natural, untrained response to sound, or the response can be one the animal is trained to make using classical or operant conditioning procedures. Physiological audiometric data, which do not require training, are more easily obtained than are behavioral data based on conditioning procedures. However, physiological methods can assess the perceptual process of hearing only indirectly. If it is shown that an animal's auditory system is capable of responding to sounds, the ability to hear may be inferred but is not guaranteed. For this reason, behavioral methods are considered the "gold standard" for audiometric assessment.

Animals hear sounds across a range of frequencies, and their sensitivity to audible sounds varies with frequency. By employing behavioral or physiological methods, researchers can determine the range of sound frequencies that animals hear, the amount of energy needed for the detection of sounds at each frequency, and the particular sound frequencies to which animals are most sensitive. Determining what sounds animals hear provides information about their acoustic environment and insight into the evolution of hearing among taxa. For example, toothed whales, microchiropteran bats, some shrews, and oil birds have evolved hearing abilities adapted for echolocation (see Chap. 12 on echolocation and the taxon-specific chapters in upcoming Volume 2), and some insect and fish prey have evolved keen hearing to detect their echolocating predators. Sounds to which animals are most sensitive are the ones most relevant to intraspecies communication and survival (because they provide information about mating partners or about predators and other sources of danger) and therefore are of particular interest.

In addition to providing information about normal hearing capabilities of animals, audiometric studies can show how hearing changes as a function of aging, environmental challenges, and experimental manipulations. Like humans, animals can experience presbycusis (i.e., loss of hearing with age; Willott 1991; McFadden et al. 1997) and they can develop hearing loss if exposed to ototoxic drugs, such as aminoglycoside antibiotics or platinum-based anti-cancer medications (Henderson et al. 1999). Hearing loss in wildlife due to noise exposure is of increasing concern because of widespread noise sources associated with anthropogenic activities in the ocean and on land (see Chap. 13 on the effects of noise). Audiometric studies of animals can also contribute to the understanding and treatment of human hearing and hearing disorders. For example, the study of the genetic and biological bases of hearing disorders often involves audiometric testing of animals with induced genetic conditions (e.g., knockin and knockout mice in which an existing gene is replaced or disrupted with an artificial piece of DNA, thereby altering or eliminating its function) and the investigation of pharmacological influences on human hearing is studied in labora-

calling methods, which do not require training but which likely underestimate the animals' true hearing sensitivity. Understanding the auditory capabilities of non-traditional species provides insight into how hearing has become adapted to the challenges that animals face in a variety of natural environments. Unfortunately, for the vast majority of species, and even major taxa, there are no audiometric data available.

10.2 What Is an Audiogram?

An audiogram is a graph of hearing threshold as a function of frequency (ANSI/ASA S3.20-2015; ISO 18405: 2017). Frequency refers to the sinusoidal vibration in cycles/s of a pure tone (sine wave). The hearing threshold of a listener is defined as the minimum stimulus level that evokes an auditory sensation in a specified frac-
tion of trials at a given frequency. On an audio-
tory animals.
gram (Fig. 10.1), low threshold values correspond
Audiometric studies have been conducted on
to high sensitivity to sound at that frequency and
many aquatic and terrestrial species, with the
vice versa. The stimulus level is often a root-
choice of species guided by availability and the
mean-square sound pressure level (SPL)
particular questions (biological, medical, or evo-
expressed in dB with a reference of 20 μPa
lutionary) that the experimenter poses. Hearing
when testing in air or 1 μPa when testing under
abilities have been studied extensively in tradi-
water; see Chap. 4, Introduction to Acoustics. The
tional laboratory mammals (Fig. 10.1) including
stimulus level may also be a root-mean-square
the house mouse (Mus musculus), chinchilla
sound particle velocity level (e.g., in the case of
(Chinchilla lanigera), Mongolian gerbil
some fish audiograms) specified in dB re 1 nm/s.
(Meriones unguiculatus), guinea pig (Cavia
Because audiograms may be measured with
porcellus), and laboratory rat (Rattus norvegicus).
signals other than pure tones (e.g., tone pips or
These species are easy to obtain, easily bred in the
clicks), signal type, threshold level, and reference
laboratory, and readily trained in conditioning
value should be reported, along with the
procedures, and so have long served as models
measured ambient noise levels. If the ambient
for both normal and impaired human hearing.
noise is negligible, the hearing threshold is
Audiometric studies have been conducted with
referred to as an unmasked threshold. If the ambi-
many non-mammal species, including insects,
ent noise is high enough to raise the hearing
amphibians, reptiles, fishes, and birds (see Vol-
threshold above its unmasked level, the hearing
ume 2). Many species are challenging to obtain,
threshold is called a masked threshold (ISO
to house, and to train in a laboratory environment.
18405: 2017).
For these reasons, behavioral audiograms are
sometimes based on data from only one or
very few animals, which limits the generaliz- 1
Acoustical Society of America, Standard Acoustical &
ability of the results. Further, hearing in some Bioacoustical Terminology Database: https://asastandards.
species is estimated by phonotaxis and evoked org/asa-standard-term-database/; accessed 5 January 2021.
10 Behavioral and Physiological Audiometric Methods for Animals 357

Fig. 10.1 Left: Behavioral audiograms of rodents com- averaged thresholds based on 50% correct detection.
monly used as laboratory animal models for hearing. Data were collected by Heffner and Heffner (1991, from
Tones were presented through loudspeakers, and the three chinchillas); Koay et al. (2002, from two domestic
animals’ conditioned responses measured. All of the mice); Heffner et al. (1994, from four Norway rats); and
audiograms are U-shaped, with frequencies of best sensi- Heffner et al. (1971, from four Mongolian gerbils). Right:
tivity (tip of the audiogram, at the lowest sound pressure The photo of a mouse participating in a behavioral hearing
level) within the range of 4–16 kHz. These species differ test is courtesy of Micheal Dent, University at Buffalo,
considerably in the low-frequency limit of hearing, with The State University of New York (Screven and Dent
the chinchilla being more sensitive to a broader range of 2019)
low frequencies than the domestic mouse. Plots are

There are two general approaches to assessing the non-acoustic self-noise arising from myo-
the auditory thresholds of live animals: behav- genic and neurogenic sources plus any artifact
ioral and physiological. The behavioral hearing due to non-biological electrical interference.
threshold is the lowest level that evokes a behav- Electrophysiological hearing threshold estimates
iorally measurable auditory sensation in a can be determined from different physiological
specified fraction of trials (ISO 18405: 2017). processes (e.g., microphonic potentials, auditory
The pure-tone behavioral hearing threshold mea- brainstem response, cortical evoked responses),
surement procedure (prescribed in ANSI/ASA which characterize auditory processing at differ-
S3.21-2004) recommends that the behavioral ent levels of the auditory system. Various thresh-
hearing threshold be defined as the lowest input old estimation procedures also exist; each carries
level at which responses occur in at least 50% of a with it associated errors and assumptions, so the
series of ascending trials (i.e., trials in which method for threshold estimation should be
signal level is systematically increased). The specified.
behavioral hearing threshold provides an Electrophysiological methods are not equiva-
integrated, whole-organism response to signal lent to behavioral procedures, and electrophysio-
detection. logical hearing thresholds can differ from
An electrophysiological hearing threshold is behavioral hearing thresholds (even for the same
the lowest level that evokes a detectable and test animal). Within each of these two
reproducible electrophysiological response (ISO approaches, several methods can be employed,
18405:2017). Both the ambient noise and the depending on the species being tested and the
background electrophysiological noise levels goals of the researcher. Behavioral techniques
should be reported. Electrophysiological noise is can be based on either unconditioned responses
358 S. L. McFadden et al.

that the animal makes spontaneously and as part other directions being attenuated. The external
of its natural repertoire, or conditioned responses auditory meatus is an acoustic resonator that
that the animal is trained to make. Common phys- boosts the amplitude of received frequencies at
iological techniques measure otoacoustic and near its resonant frequency. The resonant
emissions (OAEs; i.e., sounds generated by frequency of the ear canal is inversely propor-
outer hair cells in the inner ear and measured tional to its length, so animals with short ear
using a very sensitive microphone) and auditory canals, such as mice, have their best hearing sen-
evoked potentials (AEPs; i.e., summed electrical sitivity at high frequencies, whereas animals with
responses of hair cells and auditory neurons long ear canals, such as elephants, have their best
recorded from electrodes). Results from behav- hearing sensitivity at low frequencies. The reso-
ioral and AEP experiments in the same species or nant characteristics of the external auditory mea-
even in the same animal can produce audiograms tus, coupled with the sound transfer properties of
that are similar in shape and frequency range but the middle ear, help determine the acoustic
may differ in absolute thresholds (see energy levels reaching the inner ear.
Sect. 10.4.3). Often, audiograms are incorrectly interpreted
Audiograms in most species are typically as illustrating hard thresholds to sounds, assum-
U-shaped, but not symmetrical (Fig. 10.1). The ing that sounds at amplitudes just below the
frequency region of best sensitivity encompasses published audiogram are inaudible and sounds
those sound frequencies at the trough of the just above the audiogram are always audible.
U-shaped curve, where thresholds are lowest. That is not the case. The faintest sound that an
The animal’s best hearing sensitivity (or lowest animal can hear depends on many factors, includ-
threshold) corresponds to the threshold range at ing stimulus characteristics (e.g., duration, repeti-
the frequency region of best sensitivity. The range tion rate), environmental factors (e.g., ambient
of hearing specifies the sound frequencies that are noise level, testing context such as anechoic
audible to an animal at some specified level (e.g., chamber versus natural environment), and indi-
60 dB) above the lowest threshold. The range of vidual factors (e.g., health, response bias, atten-
hearing for sounds at high sound levels is wider tion, age). A given animal may show a loss of
than the range of hearing for sounds at low sound sensitivity due to aging, noise exposure, or expo-
levels because the audiogram is broad and sure to ototoxic drugs, and even due to repeated
U-shaped. The range of hearing should be or prolonged exposure to the stimulus during
expressed as between X Hz and Y Hz at Z dB testing that leads to sensory adaptation and/or
above the best hearing sensitivity. Unfortunately, cognitive habituation. At high ambient noise
many publications do not include the number of levels or when additional sounds are present, an
decibels above the best hearing sensitivity when animal might lose the ability to hear a sound it
reporting the range of hearing for an animal or previously heard in a quiet environment. This is
species, and they may not indicate whether the because of masking, in which the presence of
highest and lowest frequencies shown in an non-target sounds or noise decreases the detect-
audiogram reflect the limits of testing or the limits ability of the sound of interest.
of the animal’s hearing capabilities. Within a species, there can be significant indi-
In terrestrial mammals, the main contributors vidual differences in hearing sensitivity, which
to the U-shape of the audiogram and the location can reflect differences in attention to the task,
of the frequency of best sensitivity are the acous- age, health, and history of exposure to sounds,
tic properties of the auditory periphery: the pin- among other factors. Because there can be con-
nae, external auditory meatus, and middle ear siderable variability among animals of a given
(Tonndorf 1976; Hellström 1995). The pinna species, it is important to test many animals
serves to funnel sounds into the external auditory when possible. Also, it is important to know
meatus (i.e., the ear canal), with sounds from when examining an audiogram whether the
some directions being amplified and those from
10 Behavioral and Physiological Audiometric Methods for Animals 359

Fig. 10.2 Left: Underwater behavioral audiograms of averaged data from the same male and female and an
three beluga whales obtained at two different times additional juvenile male, obtained by Awbrey et al.
10 years apart. Data were obtained using an ascending (1988). The gray squares show the ambient noise level in
Method of Limits (described in Sect. 10.3.3). The whales the test pool, which was close to the measured thresholds
were trained to leave a station when they heard a tone and at 4 and 8 kHz, indicating that the whales’ actual
swim to the trainer for a food reward. Thresholds were thresholds at these frequencies were likely lower than
defined as the tone level at which the whales detected the indicated on this graph. The gray dashed line is 60 dB
signal 50% of the time. The red triangles show the mean above the lowest threshold at 30 kHz, where the range of
audiogram from one male and one female beluga whale hearing was measured. Right: Photo of two beluga whales
reported by White et al. (1978). The arrow shows the most at Vancouver Aquarium
sensitive frequency at 30 kHz. The blue circles show

curve is based on a single animal or a group of and conditioned response techniques. Uncondi-
animals. tioned response techniques are based on
Audiograms from three beluga whales behaviors that the animal naturally makes to
(Delphinapterus leucas) are shown in Fig. 10.2. sound and are readily employed in the animal’s
From this graph, it can be seen that testing was natural habitat. Animals must be trained to make
conducted in water because the dB reference is conditioned responses, and this training should be
1 μPa, rather than 20 μPa for sounds presented in based on the species’ typical behavioral reper-
air (as in Fig. 10.1). In belugas, hearing sensitivity toire. Klump et al. (1995) provide a full discus-
increased from low frequencies around 250 Hz to sion of different methods used to study hearing
the best frequency range around 30 kHz (thresh- sensitivity in animals.
old around 37 dB re 1 μPa), and then decreased For both techniques, establishing stimulus
toward higher frequencies up to 120 kHz; this control over an animal’s behavior is crucial. A
results in a U-shaped hearing curve. The range pure tone is typically the test signal, although
of hearing at 60 dB above lowest threshold broadband clicks, and noises of varying
extends from about 1–110 kHz. bandwidths can be used, depending on the
research question. How signals are generated
and presented is extremely important to control
and monitor. The sound may be delivered via a
10.3 Behavioral Methods
loudspeaker to animals ranging freely, being con-
for Audiometric Studies on Live
fined to the experimental chamber, or trained to
Animals
hold station (e.g., at a bite plate or in a hoop), or
delivered via tubes, insert earphones, or
Behavioral approaches can be divided into two
headphones (Fig. 10.3). Stimuli can be presented
general types, unconditioned response techniques
360 S. L. McFadden et al.

Fig. 10.3 Photos of a budgerigar (Melopsittacus a reward during a frequency discrimination experiment
undulatus) wearing headphones during a sound localiza- (right; Dent et al. 2000). Courtesy of Micheal Dent, Uni-
tion experiment (left; Welch and Dent 2011) and receiving versity at Buffalo, The State University of New York

using several different protocols, each of which hearing abilities but are not good measures for
has its own assumptions and limitations. Ambient determining absolute thresholds of hearing.
noise can influence thresholds and so must also be The Preyer reflex has been described as an
controlled. Ambient noise can be minimized if the orientation or attentional reflex (Jero et al.
animal is tested in an anechoic chamber or a 2001). In mammalian species that are able to
sound-attenuating chamber (Fig. 10.4). If animals move their pinnae, it involves a quick retraction
are tested in their natural environments where of the ears, a rapid twitch of the ears, or a change
ambient noise levels cannot be controlled, in orientation of the pinnae toward the source of
researchers must take periodic measurements of the sound. In species with immobile pinnae, turn-
the amount of ambient noise present during ing of the head toward the sound source (which
hearing tests. brings the source of the sound into the animal’s
line of vision) is the measure of orientation. In
some studies, a trained observer simply rates the
10.3.1 Behavioral Methods Using Preyer reflex as present or absent. The reflex also
Unconditioned Behaviors can be monitored using a motion-tracking camera
system and reflective markers attached to each of
10.3.1.1 Preyer Reflex and Acoustic the animal’s pinnae, as described in a study using
Startle Response the guinea pig (Berger et al. 2013). The magni-
The Preyer reflex and the acoustic startle response tude and latency of the Preyer reflex can then be
(ASR) are behaviors triggered automatically by determined by measuring pinnae displacement
unexpected, high-amplitude sounds. These are during sound presentation.
reflexive responses to sound that require no train- The ASR is a whole-body response to unex-
ing of the animal and thus are relatively easy to pected sounds presented at very high amplitudes
implement. On the other hand, animals can habit- (typically above 90 dB re 20 μPa) and has been
uate to repeated presentations of high-amplitude interpreted as a protective or alarm reflex. It can
sounds that best evoke these reflexes. Thus, be elicited in a wide range of adults and develop-
sound-evoked reflexes can be useful as fast and ing vertebrates, including fishes and most
easy screening tests for bracketing an animal’s mammals, and typically is quantified in terms of
10 Behavioral and Physiological Audiometric Methods for Animals 361

Fig. 10.4 A sound


attenuating chamber set up
for acoustic startle reflex
(ASR) testing in small
animals such as mice and
rats. The animal is placed in
a plastic tube or a wire
restraining device on an
accelerometer platform.
Voltages produced by the
movement of the animal on
the platform are recorded
and quantified. Typical
ASR measures are peak
amplitude and response
latency

response amplitude and response latency. In tele- plates filled with water and mounted on top of a
ost fish, the ASR is called the tail-flip reflex or vibration device that produces particle motion
C-start response, and it involves an initial full stimulation. A high-speed video camera is needed
flexion of the body followed by a weaker flexion to visualize the C-start response (Bhandiwad and
in the opposite direction, so that the animal bends Sisneros 2016).
and swims away from the source of the stimulus. In small mammals such as rodents, the ASR
The response is mediated by the Mauthner cells, a consists of hunching of the shoulders,
pair of giant neurons located at the level of the dorsiflexion of the neck, and rapid extension
auditory-vestibular nerve in the hindbrain. The then flexion of the limbs. ASR in rodents is typi-
Mauthner cells receive input from the auditory cally measured by placing the animal on a plat-
nerve and then send signals to motor neurons on form that measures displacement and force or
the opposite side of the body, which then produce acceleration caused by limb extension
the behavioral response. The ASR in fishes can be (Fig. 10.4). In primates, the ASR involves the
measured by placing the animals in small acrylic reflex contraction of striate skeletal muscles,
362 S. L. McFadden et al.

primarily muscles of the face, neck, shoulders, 50% or more of the fish in a school reacted to the
and arms (Braff et al. 2001). sound stimulus by increasing swimming speed
An animal that twitches its ears or startles and making tight turns. Disadvantages of using
repeatedly (e.g., in at least two out of three startle responses are that they require presentation
presentations) in response to finger snaps, hand of high amplitude stimuli and they habituate
claps or pure tones at different frequencies has quickly.
demonstrated an ability to hear. At the same time,
however, the presence of a startle response does 10.3.1.2 Prepulse Inhibition (PPI)
not mean the animal has normal hearing. This was and Reflex Modification
demonstrated clearly in a study of the sensitivity Although the ASR is a reflex that is not typically
and specificity of the Preyer reflex by Jero et al. under voluntary control, it is sensitive to and can
(2001). The researchers used hand claps or the be modified by ongoing behaviors and attentional
metallic sound of two hammers hitting together to status of an animal. The ASR can be potentiated
elicit startle responses from young adult albino under some circumstances and attenuated or
laboratory mice of the FVB strain. They found inhibited under others. Animals typically show
that the reflex test was effective for identifying larger ASRs when they are afraid or anxious
profound hearing loss, but was insensitive for than when they are not, so fear-potentiated startle
identifying less severe hearing losses. paradigms commonly are used to study fear and
Reflex responses to sound can be used to show anxiety states in animals. When an animal is
differences between groups of animals as a func- processing another stimulus, such as a brief
tion of age or experimental treatment. Bhandiwad low-level sound or a puff of air or a flash of
and Sisneros (2016) examined the development light, it will startle less to a sudden, loud sound
of hearing in two species of larval fishes, the than when it is not otherwise engaged. The ability
three-spined stickleback (Gasterosteus aculeatus) of an auditory, tactile, or visual prepulse stimulus
and the zebrafish (Danio rerio), by quantifying to reduce the amplitude of the ASR is termed
the probability of a startle reflex in response to prepulse inhibition (PPI).
sounds of different frequencies at different ages Even an auditory prepulse stimulus near the
post-fertilization. McFadden et al. (2010) showed hearing threshold of an animal can attenuate the
declines in the amplitude and increases in the ASR, and this makes the PPI paradigm suitable
latency of the ASR with age in laboratory rats. for testing threshold levels of sound and deter-
Age-related changes in one or more of the mining subtle effects of treatments on auditory
components of the ASR circuit or to brain regions function. PPI has been used to study the auditory
providing inhibitory input to this circuit can sensitivity of fishes, frogs, and mammals
account for ASR changes observed in older (Fig. 10.5). In larval zebrafish, the probability of
animals and humans. an ASR to a high-amplitude tone was reduced
Startle responses also can be useful for deter- when the tone was preceded by other tones at
mining the range of frequencies that an animal sub-startle levels (Bhandiwad and Sisneros
can hear. Bowles and Francine (1993) determined 2016). Thresholds obtained by PPI in this species
that kit foxes (Vulpes macrotis) have a functional were lower than thresholds obtained by using the
hearing range from 1 to 20 kHz by observing ASR alone.
startle responses of four wild-caught kit foxes to Reflexes other than acoustic startle responses
playbacks of tones of different frequencies. An can be modified by the prior presentation of a
additional advantage of startle reflex testing is sound; these paradigms are termed reflex
that a group of animals can be tested simulta- modifications (Hoffman and Ison 1980).
neously. Kastelein et al. (2008) determined the Simmons and Moss (1995) adapted this paradigm
frequency range of hearing for eight species of to obtain audiograms for two species of frogs, the
marine fish by noting the frequencies at which American bullfrog (Lithobates catesbeianus) and
10 Behavioral and Physiological Audiometric Methods for Animals 363

10.3.1.3 Phonotaxis
Some animals have a natural tendency to
approach sound (positive phonotaxis) or make
evasive movements away from sound (negative
phonotaxis). Sounds that elicit positive
phonotaxis include species advertisement calls
(i.e., mating calls), while sounds that elicit nega-
tive phonotaxis include sounds made by
predators. These natural behavioral responses to
sound can be exploited to estimate hearing sensi-
tivity in those species for which training
procedures based on conditioned responses are
extremely difficult to implement. Phonotaxis
experiments are readily conducted in the animal’s
Fig. 10.5 Schematic drawing of a setup used to study habitat and so can provide crucial information on
prepulse inhibition of the ASR in Mongolian gerbils. The the acoustic features animals use to recognize
top drawing shows a gerbil placed into an acrylic tube
10 cm in front of a loudspeaker. The force sensor under conspecific (own species) vocal signals such as
the acrylic tube monitors the gerbil’s movements. The C advertisement and aggressive calls. These kinds
label shows the position of the stimulation/recording com- of field studies are particularly important for
puter. Center drawing shows the timing of acoustic stimu- identifying the impact of the entire soundscape
lation (dB) with the pre-stimulus (lower amplitude trace)
preceding the startle-producing stimulus (higher amplitude on sound detection and discrimination, and for
trace). Bottom drawing shows the response measured by assessing the effects of environmental variables,
the force sensor. Here, the response occurs only to the such as air temperature and humidity, on acoustic
stimulus and not to the pre-stimulus. After repeated communication.
pairings of the pre-stimulus and stimulus, the response to
the stimulus declines (Walter et al. 2012). # Walter et al. Phonotaxis has been especially useful for
2012; https://www.scirp.org/journal/paperinformation. studying auditory capabilities of female orthop-
aspx?paperid¼17796. Licensed under CC BY 4.0; teran insects, frogs, and songbirds, because these
https://creativecommons.org/licenses/by/4.0/ animals naturally approach stationary calling
males in order to mate with them. For example,
the green treefrog (Dryophytes cinereus). Frogs gravid female frogs readily approach
were constrained inside a small dish (1–2 cm in loudspeakers broadcasting sounds (tone bursts,
diameter larger than the animal), which was then amplitude-modulated tones, or frequency-
placed on top of a stabilimeter that picked up the modulated tones) which they recognize as
frog’s movements within the dish. Two copper components of the advertisement calls of males
strips cemented to the side of the dish produced a of their own species, or even a synthetic version
mild electric shock that evoked small reflex of these conspecific calls (Gerhardt 1995). The
contractions of the frog’s hind limbs. The reflex sensitivity of females to these sounds is measured
evoked by the electric shock was modified in in experiments in which sounds of different
strength by prepulses of pure tones, with the levels, frequencies, or temporal patterning are
extent of modification varying with prepulse broadcast from a loudspeaker, and the female’s
amplitude. At any given tone frequency, the approach to the loudspeaker is quantified. Sounds
amplitude of the prepulse producing 10% inhibi- can be broadcast from one source (one-speaker
tion of the reflex response was defined as the design) to estimate sound detection or from two
threshold to that frequency. The magnitude of sources (choice or two-speaker design) to esti-
the reflex modification effect varied with the mate sound discrimination. The researcher can
amplitude of the prepulse, but only when stimula- obtain an estimate of the female’s relative sensi-
tion was spaced at intervals wide enough to avoid tivity to sounds (if sound frequency is varied) or
habituation. her ability to distinguish sounds of two intensities
364 S. L. McFadden et al.

(if sound level is varied). Responses are Because most species of insects and frogs call
quantified in terms of the nearness and the path at night, visualizing their movements in a
of the phonotactic approach, the latency of the phonotaxis experiment can be challenging. Fig-
response, and the presence of orientation ure 10.6 shows a new technique designed to
movements, such as head-turning toward the monitor phonotactic movements of frogs in both
sound source. Data are typically presented as the the laboratory and the natural environment
proportion of females responding to a particular (Aihara et al. 2017). In this technique, a female
stimulus as a function of whatever parameter is Australian orange-eyed treefrog (Ranoidea
being varied, with the 50% correct point on the chloris) wears a miniature LED backpack. A
resulting function defined as the threshold in a video camera records the energy emitted from
one-choice experiment and the 75% correct point the LEDs, thus allowing researchers to track the
(midway between chance and perfect perfor- frog’s movements. Sounds are broadcast through
mance) defined as the threshold in a two-choice multiple loudspeakers, and monitored by separate
experiment (see Volume 2, Chap. 3 on LED sound indication devices, each of which has
amphibians). a different pattern of illumination. In this way,

Fig. 10.6 (a) An image of a sound indication device that the middle of the arena. The lights emitted by the sound
consists of a miniature microphone and a light-emitting indication device and the LED backpack are recorded by a
diode (LED). The LED is illuminated when detecting video camera. (d) Natural habitat of the orange-eyed
sounds. (b) Photo of an orange-eyed female treefrog wear- treefrog. The position of the sound-indication device is
ing a LED backpack. (c) Arena playback experiment. Two shown (Aihara et al. 2017). # Aihara et al. 2017; https://
loudspeakers at each end of the arena present sounds. A www.nature.com/articles/s41598-017-11150-y. Licensed
sound indication device is placed in front of each loud- under CC BY 4.0; https://creativecommons.org/licenses/
speaker. The female wearing the backpack is released from by/4.0/
10 Behavioral and Physiological Audiometric Methods for Animals 365

researchers can not only track the female’s what acoustic features of communication signals
movements but also which of several are most important for mediating behavioral
loudspeakers is playing the preferred sound. responses. Despite their limitations, phonotaxis
There are limitations to the use and interpreta- and evoked calling techniques are useful because
tion of phonotaxis data. Although phonotaxis they provide insight into what sounds animals pay
experiments can tell us which sounds animals attention to in their natural environment and thus
prefer and how sensitive they are to these sounds, into perceptual decision-making in a biologically
they are not suitable for the compilation of entire relevant context.
audiograms or estimates of an animal’s entire
range of hearing. When a female fails to approach
a sound source, it may be because she does not 10.3.2 Behavioral Methods Using
hear it or because she does not recognize it as an Conditioned Behaviors
advertisement call. Moreover, females of many
species will show phonotaxis only when they are 10.3.2.1 Classical Conditioning
gravid. This limits the timespan during which Classical conditioning techniques have been used
experiments can be conducted, although to train several species of animals for audiometric
phonotaxis can be induced by hormone injections studies. In classical conditioning, an uncondi-
(Gerhardt 1995). Male insects and frogs typically tioned stimulus that naturally elicits an uncondi-
exhibit phonotaxis only in response to a high tioned response is paired with a conditioned
amplitude sound resembling an advertisement stimulus. After a number of pairings of the
call or an aggressive call from a rival male. conditioned stimulus with the unconditioned
Males treat aggressive calls from rivals as threats stimulus, presentation of the conditioned stimulus
and respond aggressively, by approaching the alone elicits a conditioned response that is the
source and attempting to engage it physically. same as or similar to the unconditioned response.
Because males are less likely than females to Fay (1995) described the use of classical respi-
approach sound sources, descriptions of their ratory conditioning to estimate auditory
hearing sensitivity based on phonotaxis are not thresholds in the goldfish (Carassius auratus).
reliable. The goldfish was restrained in a cloth bag and
submerged in a small tank. An underwater loud-
10.3.1.4 Evoked Calling speaker was placed on the bottom of the tank. A
Evoked calling is another method based on tone of a particular frequency was presented
unconditioned responses that can be used to esti- shortly before a brief electric shock (uncondi-
mate hearing sensitivity and acoustic preferences. tioned stimulus) that produced an unconditioned
Males of some species (orthopteran insects, frogs, suppression of the fish’s respiration. Changes in
songbirds) vocalize in response to playbacks of the amplitude and rate of fish’s respiration were
signals resembling conspecific advertisement or measured by a thermister placed in front of the
aggressive calls. The male’s sensitivity to these fish’s mouth. After multiple pairings of the tone
playbacks can be estimated by lowering the and shock, presentation of the tone alone pro-
amplitude of the signal until the male no longer duced a conditioned suppression of respiration.
vocalizes back. Varying the acoustic features (fre- By determining the amplitude level of the tone
quency, temporal patterning) of the signal can that no longer produced a conditioned response,
provide estimates of sensitivity to these particular the fish’s sensitivity to that tone frequency could
features (Fay and Simmons 1999). Evoked call- be determined.
ing experiments, like phonotaxis experiments, Ehret and Romand (1981) used both uncondi-
can be implemented either in the laboratory or in tioned and classically conditioned pinnae
the field. As with the phonotaxis technique, the movements and eye-blink responses to track the
evoked calling technique does not measure audi- postnatal development of auditory thresholds in
bility per se but can be useful for determining domestic kittens (Felis catus). Unconditioned
366 S. L. McFadden et al.

movements of the pinnae and/or facial muscles in various frequencies and amplitudes of sound to
response to high-intensity tone bursts were determine the audiogram. Sometimes animals
observed in one group of kittens up to 12 days mistakenly respond when there is no signal pres-
of age. A second group of kittens (aged 10 days to ent; this is a false alarm. Some animals are more
1 month) was trained with tone-shock pairs to inclined to make false alarms than others. To
make conditioned movements of their eyelids assess this bias, “catch trials” (i.e., control trials
and pinnae when they heard a sound. Ehret and in which no signal is presented) are interspersed
Romand’s results showed that some kittens as at random in the stimulus series. Some
young as 1–2 days of age were able to respond researchers desire to assess the animal’s attentive-
to some frequencies, and that sensitivity to low, ness to a hearing task before collecting data, such
mid, and high frequencies developed at as by conducting a set of easily heard “warm-up
different ages. trials” at the beginning of a session, and a set of
easily heard “cool-down trials” at the end of a
10.3.2.2 Operant Conditioning session. Criteria can be set such that if the
There are many responses animals can make to animal’s performance does not reach a certain
indicate when sounds are heard (or not heard), percent of correct responses during either the
such as touching a response paddle, pressing a warm-up or the cool-down trials (e.g., 80%), test-
lever with a nose or paw, lifting a paw, licking a ing is discontinued for that session or data from
tube from a water bottle, swimming across a that session are eliminated.
barrier, or vocalizing. It is important to choose a In conditioned suppression/avoidance
response that is based on an animal’s natural paradigms, an animal learns to suppress an ongo-
behaviors and thus is easy to learn. Once the ing behavior when it detects a sound that signals
response is chosen, there are several behavioral shock (Heffner and Heffner 2001). The shock
methods that can be used to train animals to make levels used in these studies are kept low so that
the response when a sound is detected or refrain the animals do not become agitated or develop a
from the response when no stimulus is presented. fear of the test apparatus that would impair their
These different paradigms have been performance. Heffner et al. (2014) used the
implemented successfully with a large number conditioned suppression procedure to determine
of species, with modifications that take into behavioral audiograms and sound localization
account species-typical behaviors and habitats. abilities of three young male alpacas (Vicugna
Operant conditioning techniques can use posi- pacos). Thirsty alpacas were trained to break con-
tive or negative reinforcement procedures for tact with a water spout when they heard a tone or
training or “shaping” a conditioned response. noise signal (a conditioned stimulus) that warned
Positive reinforcement methods establish the of impending shock (unconditioned stimulus) and
behavior by providing a reward, such as food, to resume drinking water following a safety sig-
water, or even verbal praise or tactile stimulation nal. The safety signal for tone threshold testing
whenever the animal makes the appropriate was a shock indicator light that turned off when
response. Negative reinforcement methods shock was terminated. Hit rates (measuring the
remove an unpleasant or aversive stimulus (usu- percentage of correct detections of sound,
ally mild electric shock) whenever the animal indicated by breaking contact with the water
makes the appropriate response. Methods can bowl when the tone signal was present) and
also be used to decrease unwanted or incorrect false alarm rates (measuring the percentage of
responses; these are termed punishment false alarms, indicated by breaking contact with
procedures. For example, a time-out period the water bowl when no tone was present) were
might be imposed (positive punishment) when determined for each stimulus intensity. The pure-
an animal makes an incorrect response. After the tone thresholds of the three alpacas showed little
desired behavior has been established through an variability among individuals. Indeed, Heffner
appropriate schedule of reinforcement during a and Heffner (2001) argued that individual varia-
training phase, the animal is then tested using tion among animals is less when using
10 Behavioral and Physiological Audiometric Methods for Animals 367

Fig. 10.7 Photo of a


beluga whale holding
station in front of an
underwater loudspeaker
during behavioral training
for later audiogram
measurements at
Vancouver Aquarium.
During the actual
experiment, the computer
operator moved behind the
rock wall, out of sight of
trainers and whale

conditioned suppression compared to methods can wane if there are changes in the social envi-
based on positive reinforcement. ronment, routine, or the animal’s health.
Another common technique based on positive Because behavioral audiograms require a long
reinforcement, used in many species of aquatic period to train and test the animal, and since the
(Fig. 10.7) and terrestrial species, is a go/no-go number of individuals in captivity is limited for
response paradigm. Thomas et al. (1990) used this many species, in some marine mammals, hearing
technique to measure the audiogram of a subadult data are available for only a single animal. Hall
male Hawaiian monk seal (Neomonachus and Johnson (1972) conducted a behavioral
schauinslandi). At the start of each trial, a trainer audiogram on a captive killer whale (Orcinus
sent the seal, using a hand cue, to station under orca) and reported that this species had much
water with its chin resting on a headstand. If a tone worse high-frequency hearing than other toothed
was heard, the seal was expected to leave the whales tested to that date. Later, Bain et al. (1993)
station, touch a response paddle, and swim to the conducted behavioral audiograms on five killer
trainer for a fish reward (go response). If no tone whales and found their hearing was very typical
was heard (either a control trial or an inaudible of other toothed whales. Upon investigation, the
signal), the seal was supposed to stay at the station, researchers found that the original test subject had
wait for the trainer to give a release whistle, and been given high dosages of an ototoxic antibiotic.
then swim back to the trainer for a reward (no-go So, the first killer whale tested was likely hearing
response). Half the trials were signal-present and impaired as a result of antibiotic-induced death of
half were signal-absent controls; the order of pre- hair cells in the high-frequency region of the
sentation of the trial types was pseudorandomized cochlea. By now, another eight individuals have
throughout a session so that the animal would been tested confirming more typical delphinid
adopt a neutral response bias. The trainer then audiograms in killer whales (Branstetter et al.
called the seal back to the initial station with a 2017).
whistle and the next trial commenced.
There are several drawbacks of behavioral
audiometric studies based on conditioning 10.3.3 Signal Presentation Paradigms
procedures. Most notably, weeks or months may for Behavioral Audiograms
be required to train the animal to respond reliably.
It is important to maintain the animal’s motivation There are three classic paradigms commonly used
to respond and attention to the task, both of which for signal presentation in behavioral audiogram
368 S. L. McFadden et al.

tests with animals (Levitt 1970; Klump et al. level is determined (often by interpolation) as
1995): the Method of Constant Stimuli, the the level at which the animal indicated it heard
Method of Limits, and the Up/Down Staircase the signal on 50% of the trials.
method (also called “adaptive tracking method”). The stimulus presentation levels cover a wide
One important factor to keep in mind when range that bracket the animal’s threshold, so addi-
choosing a signal presentation paradigm is the tional points on the psychometric function can be
time available for measuring thresholds, as there estimated. Randomized presentation of stimuli
is a trade-off between the number of trials and the prevents the animal from anticipating the stimulus
accuracy and reliability of hearing-threshold level on the next trial. Many of the stimulus levels
measurements. are well above threshold, so the animal is not
required to make difficult detections on every
10.3.3.1 Method of Constant Stimuli trial. On the other hand, the method is time-
The Method of Constant Stimuli provides the consuming, and the choice of stimulus levels to
greatest accuracy and reliability for threshold present requires some prior knowledge of likely
measurements. In this paradigm, the animal is thresholds at a specific frequency.
tested at one frequency in a session with blocks
of trials having an equal number of different 10.3.3.2 Method of Limits
signal levels ranging from very low to very high The Method of Limits involves the presentation
amplitude (i.e., no silent controls), presented in of stimuli in small steps (typically 2 to 5 dB) over
random order. The animal makes a response when a fixed range of stimulus levels. At each level, the
a signal is heard, and the results for each signal experimenter records whether the animal
presentation (“Yes” the tone was heard or “No” responded to the test tone or not (Fig. 10.9).
the tone was not heard) are tallied by amplitude Stimuli may be presented in an ascending series,
levels (Fig. 10.8 left panel). After all responses from the lowest amplitude to the highest, or in a
are tallied, a psychometric function (i.e., a plot of descending series, from the highest amplitude to
the animal’s responses, typically the percentage the lowest. Multiple runs are conducted, and for
of “Yes” responses) versus amplitude level each run, the crossover level (i.e., the level half-
(Fig. 10.8 right panel) is made. The threshold way between the stimulus level not heard and the

Fig. 10.8 Illustration of the Method of Constant Stimuli. the highest stimulus levels, the subject reported detection
Left panel: Fifty stimuli were presented at each of nine on all 50 trials (100%). Right panel: Data from the tallies
stimulus levels (450 trials total). The number of times the chart were used to plot a psychometric function, showing
subject indicated that the stimulus was heard at each level performance as a function of stimulus level. Threshold,
was tallied in the Number column and converted to a defined as the stimulus level at which the subject made a
percentage in the Percent column. At stimulus levels detection response on 50% of the trials, was interpolated to
below threshold, the subject rarely responded, whereas at be 5.2 in this example
10 Behavioral and Physiological Audiometric Methods for Animals 369

The Method of Limits is often preferred over


the Method of Constant Stimuli because of its
greater efficiency in bracketing thresholds; i.e.,
fewer trials are needed for a reliable estimate of
threshold. In the example shown in Fig. 10.9,
responses to test tones at six stimulus levels
were recorded across five runs; this required
30 trials total. If the Method of Constant Stimuli
had been used, with 50 signals presented at each
of the six stimulus levels, a total of 300 trials
would have been presented.

10.3.3.3 Up/Down Staircase Method


The Up/Down Staircase method, or adaptive
Fig. 10.9 Illustration of the Method of Limits. Five series tracking signal presentation paradigm, is a varia-
of trials (runs) were used, with test tones at six stimulus
tion of the Method of Limits that was developed
levels (15–45 dB re 20 μPa) presented in each run. Stimuli
were presented from the highest level to the lowest (i.e., in by von Békésy (1960) as a way of efficiently
descending order) on the first, third, and fifth runs, and determining thresholds (Fig. 10.10). This method
from the lowest level to the highest (i.e., in ascending is also referred to as a Modified Method of Limits.
order) on the second and fourth runs. The crossover level
The test begins with the presentation of a high-
was recorded for each run, then crossover levels were
averaged to estimate threshold. In this example, a total of amplitude signal that is likely to be easily heard.
30 trials were conducted across five runs, and the threshold Then, the amplitude is reduced in 2- to 10-dB
was estimated to be 24.5 dB re 20 μPa steps until the animal does not respond to the
signal. When the animal signifies it can no longer
hear the signal, the dB level is immediately
next level heard, e.g., 22.5 dB for run 1 and increased (in 1- to 5-dB steps) until the animal
27.5 dB for run 2 in Fig. 10.9) is determined. reports it again hears the sound. At that level, the
The mean threshold is estimated by averaging direction is reversed and the procedure is
all of the crossover levels for that frequency. repeated. Thus, this method includes both
Presenting all runs in either descending order descending and ascending staircases, with
or solely in ascending order may produce a strong reversals triggered by a change in the animal’s
response bias that influences threshold estimates. response. The hearing threshold can be estimated
When trials are presented using the descending by taking the average of the signal levels at a
Method of Limits, the animal can become accus- designated number of reversals or by noting the
tomed to reporting that it perceives a stimulus and lowest level with a criterion number of “Yes”
can continue reporting hearing the signal below responses on ascending trials. Catch trials or
the threshold; this is known as the error of habit- silent control trials controls in which all electron-
uation. Alternatively, in the ascending Method of ics are switched on, but no test signal is projected
Limits, the animal can anticipate that the stimulus may be used to control for response bias (see
is about to become detectable and make an error example audiometric study of a Hawaiian monk
in responding in the absence of the signal; this is seal, Sect. 10.3.2.2). In addition, the time interval
known as the error of anticipation. The bias between signal presentations can be varied, so
introduced by signal predictability is a drawback that the subject does not develop a pattern of
of using the Method of Limits. The influence of responding based on predictable timing.
habituation and anticipation errors can be partly The Up/Down Staircase procedure can be dif-
overcome by using an equal number of ascending ficult for an animal, because many trials are
and descending runs alternately on the same presented at near-threshold levels. This could
subject. affect an animal’s motivation to respond.
370 S. L. McFadden et al.

Fig. 10.10 Example of “bracketing” a hearing threshold immediate reversal. Signals were presented at random
using the Up/Down Staircase method (Modified Method intervals to prevent the subject from developing a response
of Limits). The first signal was presented at a level that the bias based on timing. In this example, the predetermined
subject easily heard (“Yes” at 40 dB re 20 μPa). Signal criterion for threshold was the lowest signal level with
level was then decreased in 5-dB steps until the subject no three “Yes” responses on ascending trials (circled
longer signaled detection (“No” at 25 dB re 20 μPa). The responses), so 30 dB re 20 μPa was the threshold for this
change of response from “Yes” to “No” triggered the first frequency. Testing at this frequency terminated when the
reversal, from a descending series to an ascending one. criterion for threshold was met
Thereafter, each change of response triggered an

However, receiving a reward for both correct broadcast), (3) false alarm (i.e., responding that a
responses to signal and silent control trials helps signal is present when it is not, or indicating “yes”
reduce negative effects. The major advantage of before the signal is broadcast), and (4) missed
the adaptive tracking method over the Method of detection or miss (i.e., responding that a signal
Constant Stimuli and the Method of Limits is that is absent when a signal is broadcast or failing to
fewer trials need to be conducted, resulting in a respond). The four response choices of an animal
shorter test session for both the researcher and the in a behavioral hearing test are illustrated in
animal subject. Fig. 10.11.
Response bias can be disentangled from sen-
sory capabilities by constructing a Receiver
10.3.4 Receiver Operating Operating Characteristic (ROC) curve (Green
Characteristic (ROC) Curves and Swets 1966). Upon signal presentation, the

Animals, like humans, can have a bias toward a


more conservative or liberal response during a
hearing test (Klump et al. 1995), which could
lead to underestimating or overestimating the
hearing threshold, respectively. Procedures have
been developed to separate response bias from
actual behavioral sensitivity in psychophysical
experiments. In a yes/no (audible/inaudible sig-
nal) detection task, there are four possible
outcomes of each trial: (1) correct detection or Fig. 10.11 A two-by-two decision matrix relating the
hit (i.e., responding that a signal is present when it signal condition (signal presence versus signal absence)
is broadcast), (2) correct rejection (i.e., responding that a signal is absent when it is not ... to the animal's possible responses (indicating signal presence versus signal absence) during audiometric tests. The animal can respond either “yes” or “no,” and so the probability of correct detection, P(CD), and the probability of missed detection, P(MD), add to 1: P(CD) + P(MD) = 1. Similarly, in the case of no signal presented, the probabilities of false alarm, P(FA), and correct rejection, P(CR), add to 1: P(FA) + P(CR) = 1. In other words, the probabilities computed from the animal responses in Fig. 10.11 are not all independent. In the ROC plot, therefore, two independent probabilities are plotted against each other: P(CD) versus P(FA). As illustrated in Fig. 10.12a, the major diagonal line marks all the points at which P(CD) = P(FA), which would be expected if the subject were making random choices or simply guessing. Below this line, the animal would perform worse than by chance; i.e., the animal would be making deliberate mistakes. The minor diagonal corresponds to P(CD) + P(FA) = 1 and so represents neutral response bias, with responses falling to the left of the line indicating a conservative response bias (i.e., low false alarm probability) and to the right a liberal response bias (i.e., high false alarm probability). The best possible performance is at the point (0|1), where the animal detects all signals and does not report any false alarms. Actual results from a beluga whale (Fig. 10.12b) detecting played-back beluga calls in icebreaker noise are shown in Fig. 10.12c. At decreasing signal-to-noise ratio (from 0 to -30 dB), the animal's hit rate decreased (i.e., decreasing P(CD)). False alarms were only made at low signal-to-noise ratio (-24 dB), indicating an overall conservative response bias. Data are based on the study by Erbe and Farmer (1998); see Fig. 10.7 for a photo of the training setup.

Fig. 10.12 (a) Receiver Operating Characteristic (ROC) plot showing the lines and areas relating the probability of correct detection, P(CD), and the probability of false alarm, P(FA). (b) Photo of a beluga whale at Vancouver Aquarium. (c) ROC plot of this animal's performance when presented with a beluga call mixed into icebreaker noise at signal-to-noise ratios of 0, -6, -12, -18, -24, and -30 dB. The animal was trained to indicate whenever it heard the call in the noise. The animal's performance decreased with decreasing signal-to-noise ratio. The animal adopted a very conservative response bias (Erbe and Farmer 1998)
The bias of the animal in these hearing tests can be manipulated by changing the reinforcement regimen. If the possible responses from Fig. 10.11 are differently rewarded (e.g., positive reinforcement for the two correct responses and negative reinforcement for the two false responses), then the animal will aim to maximize the percentage of correct responses. If the four responses are all differently rewarded, then the perceived values and risks will influence the animal's response. For example, in a study with an Arctic fox (Vulpes lagopus; Stansbury et al. 2014), correct detections and correct rejections were rewarded with 3–4 pieces of kibble. When the animal missed a signal, it was rewarded with 1 piece of kibble. False alarms resulted in a 2–3 s time-out, after which the animal was restationed for the next trial. By rewarding misses (i.e., one of the two false responses) and with only false alarms receiving no food but instead a time-out, the animal was conditioned to avoid false alarms but accept misses. The reinforcement regimen directly influenced the animal's conservative bias. Similar conditioning likely happened with the beluga whale (Erbe and Farmer 1998). After the animal stationed, a sound was played randomly within a 30-s period. The animal indicated a detection (of the beluga call mixed into icebreaker noise) by breaking from the station. If the animal did not detect a call, it held station for the full 30 s. Correct detections were rewarded with fish within 2 s. False alarms received a time-out. A “no” response received a delayed (by up to 30 s) fish reward; these would have been correct rejections (i.e., signal-absent trials) and missed detections (i.e., signal-present trials, but under the assumption that the signal was too quiet to be detected). Effectively, the animal thus also received a reward (albeit delayed) for missed detections, even if the signal was above threshold on some of the trials. Not knowing in advance what the animal's hearing threshold is, it is impossible to tell whether the animal truly did not hear the signal when it indicated “no” to a low-level signal-present trial.

An even greater benefit of ROC analysis is realized by measuring actual ROC curves (rather than settling for scatter plots of data as in Fig. 10.12c). To do that, the animal's bias needs to be actively manipulated using reinforcement. For example, the beluga experiment could be redone with the same animal, but instead of rewarding both correct responses with one fish, the animal might be given 3 fishes for a correct detection and only 1 fish for a correct rejection. The animal might begin to favor the “yes” response, exhibiting a more liberal response bias. So, rather than having just one data point at say -12 dB signal-to-noise ratio, we would get a curve for -12 dB, with the points along the curve corresponding to the same sensitivity (hence also called isosensitivity curve) but to different biases, which were driven by the different reinforcement regimen. This is exactly what was done by Schusterman et al. (1975) with a California sea lion (Zalophus californianus) and a bottlenose dolphin (Tursiops truncatus), yielding actual ROC curves. Other ways of actively changing the bias include changing the percentage of catch trials (whereby fewer catch trials render the animal more liberal; Schusterman and Johnson 1975) or even changing the probability of handing out a reward (i.e., not all correct trials are rewarded all the time; Schusterman 1976). The resulting ROC curves then allow the separation of the animal's actual sensitivity from its bias (Green and Swets 1966; Au 1993), but much more experimental time is needed to collect all these data.

10.4 Physiological Methods for Audiometric Studies on Live Animals

Behavioral tests of hearing can be too time-consuming to conduct, too difficult to employ because of animals' limitations in learning or performing a behavioral task, or impractical for some other reason such as animal health, disposition, or developmental status. Physiological methods offer a practical, complementary approach because they do not require training the animal and they can be completed in a relatively shorter period of time. However, because physiological methods do not require a behavioral response from the animal that indicates the sound was perceived, they are considered to be tests of “auditory function” rather than “hearing” per se. The relationship between behavioral and physiological measures of hearing is discussed later in this chapter.

As in behavioral studies, physiological studies test responses to different kinds of acoustic stimulation and must take into account ambient noise that can affect thresholds. Other factors to consider in physiological studies are body temperature and whether or not the animal is anesthetized, because these factors can affect neural thresholds, amplitudes, and latencies. Anesthesia is commonly used in physiological studies because it is difficult to keep an unanesthetized animal in a fixed position in a sound field during testing and physical restraint can be stressful. However, anesthesia can affect brain activity and severely diminish or abolish neural responses to sound (Cui et al. 2017; Kiebel et al. 2012; McFadden and Kiebel 2013; Fig. 10.13). Anesthesia can also
impair thermoregulation, resulting in changes in body temperature that can be countered by placing the animal on a heating pad during testing. When brain responses must be obtained from awake animals (see Fig. 10.13), electrical artifacts created by movements during exploration or grooming can be problematic, and many trials may be required to achieve acceptable signal-to-noise ratios.

Fig. 10.13 Top: Testing apparatus devised by Kiebel et al. (2012) for recording auditory evoked potentials from awake mice. The mice were placed on a platform (i.e., an inverted jar about 3″ in diameter) in a plastic tub containing warm water in a recording chamber. Mice were acclimated to the apparatus in daily 10-min sessions for 1–2 days prior to the first recording session. Typically, a mouse placed on the platform for the first time would enter the water and after a brief period of swimming, would climb back on the platform and remain there until removed by the researcher. In subsequent sessions, the mouse typically remained on the platform for the entire testing session (30–45 min). Stimuli were delivered from a headphone speaker placed 7″ above the animal's head. A computer-controlled camera was used to monitor the mouse, and recording was manually paused when the animal groomed or became active. Bottom: Auditory evoked responses recorded from a mouse while it was awake and then again after it had been anesthetized. The waveforms are responses to 12 kHz tones at 90 dB re 20 μPa, averaged across 100 artifact-free trials in each condition

10.4.1 Otoacoustic Emission Methods

Otoacoustic emissions (OAEs) are sounds generated by hair cells in the inner ear, either in the absence of acoustic stimulation (spontaneous otoacoustic emissions) or in response to acoustic stimulation (transient otoacoustic emissions, TOAEs, elicited by a single tone or click; and distortion product otoacoustic emissions, DPOAEs, elicited by two primary tones, f1 and f2). OAEs reflect nonlinear processing in the inner ear and occur due to the action of a “cochlear amplifier,” which functions to increase sensitivity to low-level sounds. Moreover, they are frequency-specific and so will emerge at those frequencies where hearing is near normal (Kemp 2002). DPOAE testing has become popular as a rapid, non-invasive way to assess the functional integrity of hair cells in a wide variety of species,
including frogs, lizards, birds, and mammals (Manley 2001). DPOAEs are abolished by loss or dysfunction of outer hair cells, and also by middle ear dysfunction that prevents retrograde transmission of acoustic energy from the cochlea to the ear canal. It is important to recognize, however, that the absence of OAEs is not necessarily evidence of outer hair cell dysfunction, because OAEs are not recordable from all normal ears. The technique is not very useful for pinnipeds because their stapedial reflex shuts down the auditory meatus as an adaptation for diving.

DPOAE tests in mammals typically use a probe assembly that is inserted into the external auditory meatus to form a closed acoustic system. For animals lacking ear canals (e.g., fishes, frogs, reptiles, and birds), the probe tip is placed inside a plastic tube that is then coupled to the animal's ear using silicone grease or Vaseline to seal any gaps (Bergevin et al. 2008). The probe tip contains a very sensitive external microphone and tubes from two external sound sources (Fig. 10.14). Two primary test tones, f1 and a higher frequency tone f2, are generated by separate channels of a sound-generating system and presented through the sound tubes, and the sound in the ear canal is sampled by the microphone for a fixed period of time. The output of the microphone is filtered, digitized, averaged over a number of trials, and then analyzed using a computerized signal-analysis system. A normal inner ear will generate several nonlinear distortion products that will be propagated in a reverse direction back through the middle ear and into the ear canal (when present). When this occurs, spectrum analysis of the sound recorded by the microphone will show not only the original f1 and f2 tones that were delivered to the ear, but also several new tones that were generated as nonlinear distortion products. The largest distortion product is the cubic DPOAE, with a frequency equal to 2f1 - f2. For example, if f1 = 1000 Hz and f2 = 1200 Hz, then the cochlea will generate a cubic DPOAE at 800 Hz. Because 2f1 - f2 is the largest DPOAE produced (typically 30–40 dB re 20 μPa below the level of the primary tones) and is less variable than other distortion products, it is typically the only one reported in animal studies. The frequency ratio f2:f1 of the primary tones, the level of the higher-frequency primary tone L2, and the difference between the levels of the two primary tones L1 - L2 are selected to maximize the amplitude of the cubic DPOAE in the ear canal. These parameters are species-specific and must be determined empirically. For all combinations of stimulus parameters (f2:f1, L2 and L1 - L2), the amplitude of the cubic DPOAE increases as the level of the primary tones increases until it saturates. DPOAEs can be difficult to measure at low frequencies due to masking by low-frequency ambient sounds in the ear canal (i.e., high noise-floor levels occur at low frequencies). But it is possible to measure low-frequency DPOAEs if great care is taken to ensure deep insertion and a good seal of the probe assembly in the ear canal.

Fig. 10.14 A commercially available low-noise microphone with two external sound sources. The probe tip containing the microphone and sound tubes is covered with a foam or plastic ear tip and inserted into the ear canal to form a closed acoustic system. For animals without ear canals, the probe can be inserted into a plastic tube that is then sealed in place against the ear of the animal

Shaffer and Long (2004) measured low-frequency DPOAEs in two species of kangaroo rats to test the hypothesis that a large foot-drumming species (Dipodomys spectabilis) has better low-frequency sensitivity than a small foot-drumming species (D. merriami). In both species, DPOAEs were generated at low
frequencies between 225 and 900 Hz. DPOAE amplitudes were greater in the larger kangaroo rat species compared to the smaller species. Additionally, the authors found good correspondence between DPOAE amplitudes, behavioral hearing thresholds, and electrophysiological hearing thresholds in D. merriami. This suggests that DPOAE amplitudes are good estimates of hearing sensitivity.

10.4.2 Auditory Evoked-Potential and Auditory Brainstem Response Methods

Auditory evoked-potential (AEP) methods record stimulus-evoked electrical activity at various levels of the auditory nervous system. Hair cells and neurons in the auditory system function by generating electrical potentials in response to sounds, and measurements of these stimulus-evoked potentials can provide information about the functional state of the inner ear, auditory nerve, central auditory nuclei, and their fiber pathways (Salvi et al. 2000; McFadden 2007).

There are many ways of classifying AEPs. Common classifications are based on: (1) the region involved in the generation of the response (e.g., cochlea, brainstem, thalamus, or cortex), (2) the latency of the response (i.e., short-, middle-, and long-latency potentials reflecting generation by neural elements at progressively higher regions of the auditory system), (3) electrode placement (invasive near-field recordings made with an electrode inserted into an auditory nucleus versus noninvasive far-field recordings made from electrodes placed on the scalp), (4) the type of electrode used (high-impedance microelectrodes for recording potentials from individual cells versus low-impedance surface or needle electrodes for recording activity from large groups of neurons from the scalp), and (5) the size of the cellular population contributing to the response (e.g., local field potentials reflecting the extracellular electrical activity of a discrete group of neurons versus gross potentials generated by large populations of cells such as those recorded from scalp electrodes).

Electrical potentials generated by the cochlea and auditory nerve include the cochlear microphonic potential (CM potential) generated by outer hair cells, the summating potential (SP) generated primarily by inner hair cells, and the compound action potential (CAP) generated by the synchronous depolarization of auditory nerve fibers. AEPs generated by the auditory nerve and neurons in the auditory brainstem (i.e., cochlear nucleus, superior olive, lateral lemniscus, and inferior colliculus) contribute to the short-latency scalp-recorded auditory brainstem response (ABR). AEPs recorded from electrodes implanted into the auditory midbrain of mammals are referred to as inferior colliculus evoked potentials (IC-EVPs). AEPs generated by forebrain regions (thalamus and cortex) include long-latency potentials recorded from electrodes implanted into the brain or from surface electrodes.

AEP methods share a number of common procedures. Stimuli can be presented using the same paradigms discussed in Sect. 10.3.3 (Method of Constant Stimuli, Method of Limits, Up/Down Staircase method) with the criterion for threshold being an electrophysiological, rather than a behavioral, response. Responses are recorded and averaged over a number of trials (e.g., 50–2000 trials); the number of trials depends on the size of the response relative to background electrical noise (i.e., the signal-to-noise ratio). They are typically quantified in terms of response amplitude (e.g., peak-to-peak voltage or peak voltage relative to a baseline voltage level) and latency (i.e., the lag-time between the onset of the stimulus and a defined portion of the response). Threshold is variously defined as the lowest stimulus level that elicits a detectable physiological response, the lowest level at which a peak replicates, the midpoint between the level at which a response replicates and the next lower level at which it does not, or the sound pressure level at which the amplitude of a particular peak reaches a criterion voltage level. Other parameters that are commonly measured from AEP waveforms include peak amplitudes, peak latencies, and in the case of the ABR, inter-peak intervals (i.e., time between different peaks, reflecting neural conduction time). Results are summarized as input-output functions that show response magnitude or latency as a function of stimulus level, or as an audiogram, showing threshold as a function of stimulus frequency.
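The following minimal sketch illustrates one of the threshold rules just listed, the lowest level at which the response replicates, by splitting the recorded trials for each stimulus level into two sub-averages and testing whether the replicates agree. It is a simplified illustration under assumed data structures (a dictionary of epochs per level) and an assumed correlation criterion; real AEP systems add artifact rejection and more formal response-detection statistics.

```python
import numpy as np

def average_evoked_response(epochs):
    """Average stimulus-locked epochs (trials x samples) to pull the small
    evoked potential out of the background electrical noise."""
    return np.mean(epochs, axis=0)

def replicable_threshold(epochs_by_level, min_correlation=0.8):
    """Return the lowest stimulus level (dB) judged to have a repeatable
    response, or None. epochs_by_level maps level -> array (trials x samples)."""
    detected = []
    for level, epochs in epochs_by_level.items():
        half = len(epochs) // 2
        rep1 = average_evoked_response(epochs[:half])   # first sub-average
        rep2 = average_evoked_response(epochs[half:])   # second sub-average
        r = np.corrcoef(rep1, rep2)[0, 1]               # similarity of replicates
        if r >= min_correlation:
            detected.append(level)
    return min(detected) if detected else None
```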
Because the ABR is an onset response that requires synchronous activity of an ensemble of neural elements, stimuli with very short rise/fall times are most effective. Clicks, which are brief (e.g., 5–100 μs) and therefore spectrally broad, often are used as stimuli, particularly for screening of auditory function. Pure tones with a rapid onset are preferred when more frequency-specific information is required, as for testing the frequency range of hearing. Sinusoidal amplitude modulated tones provide even greater frequency specificity.

At high stimulus levels that are clearly audible to an animal, several characteristic peaks are typically present in the response waveform, with latencies that correspond to their progressively higher anatomical sites of generation. ABRs from mammals typically have five prominent peaks (Fig. 10.15). The first peak of the waveform has a cochlear origin, reflecting the summed synchronous neural activity from the peripheral portion of the auditory nerve, and the second peak most likely reflects neural activity from the central portion of the auditory nerve at the level of the cochlear nucleus. Subsequent peaks are generated by brainstem regions between the cochlear nucleus and the lateral lemniscus or inferior colliculus. In all species studied, peak amplitudes of the ABR increase and latencies decrease as the stimulus level increases (Fig. 10.15). The rate of stimulus presentation can influence response amplitudes and thresholds. Data acquisition time is shortened by using a rapid signal presentation rate, but there is a cost in terms of response size, with high signal rates resulting in decreased peak amplitudes in the response waveform and increased response latencies.

Fig. 10.15 Left: Photo of a squirrelfish (Sargocentron sp.) with subcutaneous electrodes about to undergo ABR testing. Photo courtesy of Rob McCauley, Centre for Marine Science and Technology, Curtin University. Right: ABR waveforms obtained from an anesthetized C57BL/6J mouse. Needle electrodes (pictured at top left) were inserted under the skin at the top of the head (active), behind the right ear (reference), and at the base of the tail (ground). Two waveforms were collected at each stimulus level, in 5-dB steps from 90 to 55 dB re 20 μPa. Threshold, defined as the lowest level with a repeatable response, was 65 dB re 20 μPa for this frequency. The first two peaks of the ABR (short bracket) show activity from the auditory nerve, whereas the subsequent peaks (long bracket) arise from successively more rostral regions of the central auditory nervous system. Note the decrease in peak amplitude and increase in peak latency with decreasing stimulus level, typical of ABR waveforms
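To quantify the input-output behavior just described, a peak's amplitude and latency can be read off each averaged waveform, for example with a small helper like the hypothetical one sketched below. The search window, sampling rate, and baseline convention are assumptions for illustration, not values taken from the studies shown in Fig. 10.15.

```python
import numpy as np

def peak_amplitude_and_latency(waveform, fs, window=(0.001, 0.010)):
    """Measure one ABR peak within a search window (seconds after stimulus onset):
    amplitude relative to the pre-window baseline, and latency of the peak.
    Assumes the waveform starts at stimulus onset; fs is the sampling rate in Hz."""
    t = np.arange(len(waveform)) / fs
    in_window = (t >= window[0]) & (t <= window[1])
    segment = waveform[in_window]
    peak_i = np.argmax(segment)
    baseline = np.mean(waveform[t < window[0]])
    return segment[peak_i] - baseline, t[in_window][peak_i]

# Input-output function (amplitude should grow and latency shrink with level):
# io = {lvl: peak_amplitude_and_latency(w, fs=20000)
#       for lvl, w in sorted(hypothetical_responses_by_level.items())}
```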
Preparation of animals for ABR testing is minimal. Typically, the animal is restrained or sedated or anesthetized to keep it still during the recording session. Aquatic animals under human care can be trained to remain still at a station (e.g., in a hoop) and are maintained at a good ambient water temperature in a pool. Terrestrial animals are placed on a heating pad to maintain normal body temperature. Electrodes for recording electrical activity are then applied. For most animals, the electrodes are low-impedance needle electrodes that are inserted under the skin; however, other types of electrodes, such as surface electrodes and suction-cup electrodes that attach to the surface of the head (Fig. 10.16) are suitable as well. One electrode, termed the active, non-inverting, or positive electrode, is placed at the vertex (upper surface of the head, along the midline, and between the ears) and another, termed the reference, inverting, or negative electrode, is placed behind the pinna or in another relatively neutral region of the head. A third electrode, which serves as a ground, is placed in the pool water or in a non-neural site on the animal (e.g., beneath the skin of the neck, back, or leg).

Fig. 10.16 Photo of a harbor porpoise (Phocoena phocoena) stationing during an ABR test of its hearing at Fjord & Bælt Denmark. The recording electrodes, attached to the animal's head and back using suction cups, measure small electrical voltages produced by the brain in response to acoustic stimulation. Photo courtesy of Solvin Zankl, Fjord & Bælt and the Marine Biological Research Center, University of Southern Denmark, Kerteminde, Denmark

One advantage of ABRs is that it requires less time to collect a complete set of data (often 1 h or less to obtain a complete audiogram from an anesthetized animal), as compared to the weeks or months needed to train an animal for compiling behavioral audiograms. In addition, ABR testing is practical to use in studies requiring many animals and multiple measurements (e.g., before and after a treatment is applied), and for testing young animals in developmental studies. For example, McFadden et al. (1996) used ABR methods to study the ontogeny of auditory function in the Mongolian gerbil and identified three phases of development based on frequency-threshold curves. ABRs were elicited by intense stimuli in the low- and mid-frequency range as early as 10 post-natal days (pnd) in a small proportion of animals. By 16 pnd, all gerbils were responding reliably to tones between 125 Hz and 32 kHz, similar to adult animals.

ABR testing has become the AEP method of choice for audiometric testing in a wide range of species. In particular, ABRs are useful for estimating hearing capabilities of animals that are difficult to test using other methods. For example, Hu et al. (2009) used ABR recordings to determine hearing of cephalopods: the oval squid (Sepiotheuthis lessoniana) and the common octopus (Octopus vulgaris). Each cephalopod
was anesthetized and then transferred to a holder inside a plastic tub filled with seawater. Teflon-coated silver needle electrodes were inserted on the head between the eyes (non-inverting) and on the mantle (inverting) and a wire was placed in the tub to serve as the ground. In both cephalopods, the ABR had only one prominent peak. The resulting ABR audiogram showed that the squid responded to a wider frequency range (400–1500 Hz vs. 400–1000 Hz) and had significantly lower thresholds at 600 Hz (its frequency of best sensitivity) compared to the octopus.

Comparisons of ABR audiograms can show the effects of factors such as age, noise exposure, drug treatment, and genetic mutations. The ABR audiograms shown in Fig. 10.17, for example, show the effects of an induced genetic mutation of the gene that codes for the copper-zinc form of superoxide dismutase (SOD1) on auditory sensitivity in mice. SOD1, an enzyme found in the cytosol of all cells, serves as a first line of defense against oxidative damage and has been implicated in numerous degenerative disorders and age-related hearing loss (McFadden et al. 2001a, b). For example, hearing thresholds of aged (13-month-old), wild type (WT) mice with normal levels of SOD1 are lower at all four tested frequencies than those of SOD1-deficient littermates. SOD1 deficiency had a greater effect on thresholds at 16 and 32 kHz than at lower frequencies (8 and 4 kHz).

Fig. 10.17 Average ABR thresholds (dB re 20 μPa) from aged mice with normal levels of SOD1 enzyme (WT) compared to thresholds from littermates missing 50% (HET) or 100% (KO) of SOD1 due to genetic manipulation of the copper-zinc superoxide dismutase gene. WT = wildtype mice (with two normal gene alleles and normal levels of SOD1); HET = heterozygous knockout mice (with one abnormal allele, resulting in 50% reduction of SOD1); KO = homozygous knockout mice (with two abnormal alleles, resulting in complete elimination of SOD1)

10.4.3 Comparison of Behavioral and Physiological Audiograms

It is important to compare data obtained from physiological and behavioral methods to determine their reliability and validity. Even in the same species, experiments might use different stimulus presentation paradigms and different threshold criteria, making direct comparisons of results difficult. Although ABR and behavioral audiograms in the same species can have the same overall shape and similar frequencies of best hearing sensitivity, actual thresholds may differ considerably (Fig. 10.18). Some authors argue that these audiograms should not be considered equivalent (Sisneros et al. 2016). Ladich and Fay (2013) compiled AEP and behavioral audiograms of goldfish collected in different studies in different laboratories. They found that, at frequencies below 1000 Hz, median ABR thresholds were about 10 dB higher than behavioral thresholds, while at higher frequencies, ABR thresholds were lower than behavioral thresholds.

Schlundt et al. (2007) quantified differences in audiograms recorded from bottlenose dolphins in a variety of underwater test conditions (in a quiet pool and in a noisy bay). AEPs were recorded using a transducer embedded in a suction cup on the jawbone. In behavioral tests, the dolphins were conditioned by the trainer's whistle to respond when the same tone was heard. Thresholds measured using the two techniques were very similar, although there was less variability in behavioral data.
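Because ABR and behavioral audiograms are rarely measured at identical frequencies, a comparison such as the one by Ladich and Fay (2013) first requires interpolating both threshold curves onto a common (logarithmic) frequency grid. The sketch below shows only that bookkeeping step; it assumes the audiograms are supplied as sorted frequency and threshold arrays and is not the analysis code of the cited studies.

```python
import numpy as np

def threshold_difference(freqs_a, thresh_a, freqs_b, thresh_b, num=20):
    """Interpolate two audiograms onto the frequency range they share (log axis)
    and return (common frequencies, threshold difference A - B in dB).
    Frequencies must be sorted ascending; positive differences mean method A
    needed a higher level, i.e., appeared less sensitive."""
    f_lo = max(min(freqs_a), min(freqs_b))
    f_hi = min(max(freqs_a), max(freqs_b))
    f_common = np.logspace(np.log10(f_lo), np.log10(f_hi), num=num)
    a = np.interp(np.log10(f_common), np.log10(freqs_a), thresh_a)
    b = np.interp(np.log10(f_common), np.log10(freqs_b), thresh_b)
    return f_common, a - b
```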
[Figure 10.18: scatter plot of thresholds, SPL (dB re 1 μPa) versus frequency (Hz).]
Fig. 10.18 Comparison of underwater hearing thresholds of individual bottlenose dolphins collected by behavioral (black) versus ABR (red) methods. Data from Johnson (1966), Popov and Supin (1990), Brill et al. (2001), Houser and Finneran (2006), Finneran et al. (2008), Finneran et al. (2011)

10.5 Other Audiometric Measurements

Other crucial aspects of hearing can be examined using variations on the basic audiometric methods outlined above. These include frequency discrimination, intensity discrimination, equal-loudness functions, frequency selectivity (e.g., critical ratios, critical bandwidths, and psychophysical tuning curves), masking (i.e., forward, backward, and simultaneous), duration discrimination, stimulus generalization, and directional hearing (i.e., sound localization). All of these aspects of hearing have been studied in a wide range of vertebrate species. Fay (1988) compiled results of behavioral experiments from a large number of different species. Klump et al. (1995) provided complete descriptions of behavioral methods that have been developed for these kinds of experiments. Selected examples of these experiments are discussed briefly below. It is important to note that physiological techniques can also be used to obtain information on these other aspects of hearing, but that again, estimates of sensitivity may differ.

10.5.1 Frequency and Intensity Discrimination

Frequency and intensity discrimination experiments measure the smallest difference in frequency or intensity that an animal can detect—called the just noticeable difference (jnd) or the difference limen (DL). To measure a frequency DL using behavioral methods, the animal is trained to detect a frequency difference (ΔF) between two test tones. In a typical paradigm, the animal is presented with a constant stimulus (i.e., a tone burst of one frequency) that sometimes changes in frequency, and the animal is trained to respond when it perceives a frequency change. The smallest frequency difference that the animal can perceive reliably, according to some set criterion, is the jnd or DL. Because the animal is discriminating between two frequencies, a common criterion for threshold is 75% correct, which is midway between chance and perfect performance.

Heffner and Heffner (1982) measured frequency DLs in an Indian elephant (Elephas maximus indicus) housed in a zoo. The elephant was trained to press one of two response buttons on a panel with its trunk upon hearing a sound. When she heard a train of tone pulses with all the same frequencies, then the correct response was to press the left button. When she heard a train of tone pulses that alternated between two different frequencies, then the correct response was to press the right button. Correct responses were rewarded with a fruit-flavored sugar solution. The DL was determined by reducing the frequency difference between the tones in the two
Fig. 10.19 Psychometric function at a tone frequency of 1000 Hz (left) and a graph of the Weber fraction across frequency (middle) collected from an Indian elephant (right). Left: A psychometric function showing percent correct detection of a frequency difference between two tones. The base frequency is 1000 Hz, and frequency differences range from 20 to 100 Hz. The solid gray line shows the elephant's performance and the dashed gray line shows the 75% correct criterion for the frequency DL. At 1000 Hz, the frequency difference limen is 30 Hz. Middle: The Weber fraction (ΔF/F) increases with frequency. The Weber fraction is low at frequencies of 250 and 500 Hz, indicating good ability to discriminate frequency differences, and increases at higher frequencies, indicating poorer acuity. Data collected by Heffner and Heffner (1982). Image of the elephant from Evelyn Fuchs, University of Vienna

types of pulse trains, until the animal no longer detected the difference reliably. A psychometric function for a tone frequency of 1000 Hz, a frequency of best sensitivity for the elephant, is plotted in Fig. 10.19. The 75% correct discrimination threshold is at 1030 Hz, giving a DL of 30 Hz. The DLs calculated from psychometric functions at different tone frequencies are plotted in Fig. 10.19 as the Weber fraction (ΔF/F), the ratio of the DL to the test frequency. The Weber fraction increases with frequency, showing that the ability to discriminate differences in tone frequency becomes absolutely worse with increases in frequency. Changes in the Weber fraction with tone frequency have implications for understanding how frequency is coded in the nervous system across different species.

The psychometric function illustrated in Fig. 10.19 is based on actual data points. Some investigators use a statistical procedure called Probit Analysis to find the best-fitting regression line through the data points, and then base the estimate of the DL from that regression (Levitt 1970). The center of the best-fitting regression line can then be taken as the most probable threshold value. Probit analysis is useful because it provides a standard error for the hearing threshold values.
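As a simple illustration of how a DL and Weber fraction are read off a psychometric function, the snippet below interpolates the 75% correct point from percent-correct data. A probit or other sigmoid fit (Levitt 1970) would normally replace the linear interpolation used here, and the example numbers are hypothetical, merely shaped like the 30-Hz DL at 1000 Hz reported for the elephant.

```python
import numpy as np

def frequency_dl(delta_f, percent_correct, criterion=75.0):
    """Frequency difference limen: the frequency difference (Hz) at which the
    psychometric function crosses the criterion (75% correct by default).
    delta_f and percent_correct must be in increasing order."""
    return float(np.interp(criterion, percent_correct, delta_f))

# Hypothetical psychometric data at a base frequency of 1000 Hz:
delta_f = np.array([20, 30, 40, 60, 80, 100])   # tested frequency differences (Hz)
pc = np.array([55, 75, 85, 92, 96, 98])         # percent correct at each difference
dl = frequency_dl(delta_f, pc)                  # -> 30.0 Hz
weber_fraction = dl / 1000.0                    # ΔF/F = 0.03
```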
Intensity DLs are estimated using similar procedures as used for estimating frequency DLs, except that tone frequency is kept constant while tone intensity is varied. Difference limens are also commonly measured for noise. These measurements are useful for estimating a species' dynamic range of hearing, the intensity range over which changes in sound levels can be perceived. Determining an animal's sensitivity to the depth of amplitude modulation in a sound and the ability to detect a short, silent gap between two sounds is also a problem of intensity discrimination.

10.5.2 Frequency Selectivity

Frequency selectivity refers to the perceptual ability to discriminate two simultaneous signals of different frequency (e.g., a signal against noise). Behavioral measures of frequency selectivity are used to estimate the width of internal auditory filters (i.e., the physical space including number of hair cells and portion of the sensory epithelia) devoted to a particular frequency or frequency range along the basilar membrane or sensory surface in the inner ear. Thus, behavioral measures of frequency selectivity provide an estimate of the resolving power of the ear. Physiological techniques are used to provide a more direct measurement. Auditory filters are often thought of as a series of contiguous bands of frequency in which the auditory system analyses incoming sound, and sounds of different frequencies are processed in different filters (i.e., independently of one another) without mutual interference. For ease of modeling, auditory filters often are assumed to be rectangular in shape. For very sharp frequency selectivity, hence good ability to separate signals from noise, auditory filters should be narrow. Wide auditory filters are susceptible to greater masking. Different measures of frequency selectivity exist (e.g., Fletcher critical bands, critical bandwidths, equivalent rectangular bandwidths, etc.; Fig. 10.20).

[Figure 10.20: bandwidth (Hz) versus frequency (Hz) for Tursiops truncatus, Delphinapterus leucas, Neophocaena phocaenoides, Phocoena phocoena, Mirounga angustirostris, Phoca vitulina, and Zalophus californianus, with reference lines at 1, 1/3, 1/6, and 1/12 octave.]
Fig. 10.20 Graph of frequency selectivity in marine mammals. *: Critical bandwidths. ★: Equivalent rectangular bandwidths. +: 3-dB bandwidths. O: 10-dB bandwidths. Some of these data were collected behaviorally, others electrophysiologically. For pinnipeds, both in-air and underwater measurements are shown (Erbe et al. 2016). © Erbe et al. 2016; https://www.sciencedirect.com/science/article/pii/S0025326X15302125. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

10.5.2.1 Critical Ratio

The critical ratio (CR) can be thought of as the minimum signal-to-noise ratio for detecting a tone against a background of broadband masking noise. It is defined as the mean-square sound pressure of a narrowband signal (e.g., a tone) divided by the mean-square sound pressure spectral density of the masking noise at a level, where the signal is just detectable (ISO 18405:2017). 'Just detectable' again refers to a specified fraction of trials in behavioral experiments. The CR is typically expressed as a level-quantity in dB with a reference value of 1 Hz. Therefore, the CR can also be computed as the difference between the sound pressure level of the signal and the power spectral density level of the noise—at detection threshold. To measure the CR, the levels of signal (or noise) are changed. As with measuring audiograms, the CR can be measured behaviorally using the Method of Constant Stimuli, the Method of Limits, or the Up/Down Staircase
[Figure 10.21: two panels, "Critical Ratios of Cetaceans & Sirenians underwater" (Delphinapterus leucas, Phocoena phocoena, Pseudorca crassidens, Tursiops truncatus, Trichechus manatus) and "Critical Ratios of Pinnipeds underwater" (Callorhinus ursinus, Mirounga angustirostris, Phoca largha, Phoca vitulina, Pusa hispida, Zalophus californianus); axes CR (dB) versus frequency (Hz), with 1/3-, 1/6-, and 1/12-octave lines for comparison.]
Fig. 10.21 Graphs of critical ratios in dB re 1 Hz of marine mammals under water (Erbe et al. 2016). Fractional octave lines are shown for comparison. © Erbe et al. 2016; https://www.sciencedirect.com/science/article/pii/S0025326X15302125. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

method. The CR can also be measured electrophysiologically.

CR measurements are relatively easy to obtain and are thus available for a number of species. In the horseshoe bat (Rhinolophus ferrumequinum) and in the green treefrog, for example, CRs are lowest, implying sharper filters, at the spectral peaks within these species' echolocation and advertisement calls, respectively (Long 1977; Moss and Simmons 1986). In many other species, CRs gradually increase with tone frequency (e.g., Fay 1988; Erbe et al. 2016). In the absence of CR data, 1/3 octave bands are often used (in particular in the noise impact assessment literature). While this is a good approximation in birds (e.g., Dooling and Blumenrath 2013), in several species, 1/3 octave bands overestimate CRs at some frequencies (Fig. 10.21).

The CR is often taken as an estimate of the width of the auditory filters. In this case, it should be referred to as the Fletcher critical band (ANSI/ASA S3.20-2015).² If CR is in dB re 1 Hz, then the Fletcher critical band is computed as 10^(CR/10). The Fletcher critical band is an indirect estimate of the size of the auditory filter. It is a good approximation in some bird species (Langemann et al. 1995) but in many other species differs from a more direct measure, the critical bandwidth.

² Acoustical Society of America, Standard Acoustical & Bioacoustical Terminology Database: https://asastandards.org/working-groups-portal/asa-standard-term-database/; accessed 7 January 2021.

10.5.2.2 Critical Bandwidth

The critical bandwidth (CB) refers to a band of frequencies within which sound at any frequency can interfere with sound at the center frequency (ANSI/ASA S3.20-2015; ISO 18405: 2017). The critical bandwidth is typically measured in noise-widening experiments. The listener tries to detect a tone at the center of a band of masking noise. As the noise band is widened, the level of the tone has to increase for it to remain audible. There comes a bandwidth, at which the width of the masking noise band no longer affects the level of the tone at detection threshold. This is the critical bandwidth. The difference between a CR and a CB experiment thus is that the listener has to detect a tone in broadband masking noise in the former and in noise of variable (increasing) bandwidth in the latter. CBs are time-consuming to collect, because they require determining masked thresholds at each tone frequency at many different noise bandwidths. For this reason, measurements of CB are available for fewer species than are measurements of CR.
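The two quantities defined above translate directly into code. The sketch below computes the critical ratio as the difference between the tone's sound pressure level at masked detection threshold and the power spectral density level of the noise, and converts it to the Fletcher critical band via 10^(CR/10). The example levels are hypothetical and are chosen only to show the arithmetic.

```python
def critical_ratio(signal_spl_at_threshold, noise_spectral_density_level):
    """Critical ratio in dB re 1 Hz: tone SPL at masked detection threshold
    minus the power spectral density level of the masking noise."""
    return signal_spl_at_threshold - noise_spectral_density_level

def fletcher_critical_band(cr_db):
    """Fletcher critical band (Hz) implied by a critical ratio in dB re 1 Hz."""
    return 10 ** (cr_db / 10.0)

# Hypothetical example: a tone detected at 85 dB against noise with a spectral
# density level of 65 dB re 1 Hz gives CR = 20 dB and an implied filter width
# of 10**(20/10) = 100 Hz.
cr = critical_ratio(85.0, 65.0)         # 20.0 dB
bandwidth = fletcher_critical_band(cr)  # 100.0 Hz
```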
10.5.2.3 Psychophysical Tuning Curves

Psychophysical tuning curves are another measure of behavioral frequency selectivity. In these experiments, a tone is fixed in frequency and amplitude just above (typically, 10 dB) its absolute threshold. The animal is trained to detect the tone in the presence of a masker (either other tones or narrowband noise). The masker can be presented simultaneously with the tone (simultaneous masking), or prior to the tone (forward masking). Psychophysical tuning curves are typically V-shaped, so that as the frequency separation between the tone and the masker increases, the level of the masker required to mask the tone increases (Fig. 10.22). They are similar in shape to tuning curves of auditory nerve fibers, and so can provide non-invasive estimates of neural frequency selectivity (Serafin et al. 1982). The drawback of this technique is that it is time-consuming to conduct, so that data are available for only a few animal species.

Fig. 10.22 Psychophysical tuning curves (left) for the Pig-tailed macaque monkey (Macaca nemestrina; right), measured in a forward masking paradigm. Animals were trained to detect tones using positive reinforcement. Tones were presented via earphones, and the animals were seated inside a sound-attenuating chamber. Masked thresholds to probe tones (0.5, 2, and 8 kHz; blue, dark red, dark gray, respectively; x-axis) were determined using an adaptive tracking procedure and defined as the mean of eight reversal points at each frequency. Probe tones (25-ms duration) were presented at a level of 10 dB above absolute threshold. Masker tones (130-ms duration, with frequencies varying around that of the probe tone) were presented 2 ms before the onset of the probe tone. The blue, dark red, and dark gray curves show the psychophysical tuning curves plotting the level of the masker (y-axis) needed to just mask the probe tone at each masker frequency. The black dashed line shows the animals' absolute thresholds (audiogram). Data collected by Serafin et al. (1982). © Stauss, 2006; https://commons.wikimedia.org/w/index.php?curid=1733069. Licensed under CC BY-SA 3.0; https://creativecommons.org/licenses/by-sa/3.0/

10.6 Summary

Describing and quantifying the hearing capabilities of different animals is essential in bioacoustical studies. Basic features of hearing, such as the range of audibility, thresholds of hearing as a function of frequency, and the frequency range of best hearing, are easily shown on an audiogram. Hearing sensitivities are best in young, healthy animals and may decline in some animals as they age or if they are exposed to ototoxic antibiotics. Acute exposure to high-amplitude noise or long-term exposure to lower levels of noise also can temporarily or permanently reduce hearing sensitivity.

A variety of behavioral and physiological methods can be used to test hearing in live animals. The aims of a study and the characteristics of the animals should be considered carefully when selecting the appropriate audiometric methods to use. This chapter
described common behavioral and physiological methods, along with some of their strengths and weaknesses. Testing hearing abilities in animals is not as easy as in humans because animal subjects cannot verbally report to the researcher when a test signal is heard. Instead, animals indicate that they heard a sound by making unlearned or learned responses in behavioral studies. Thresholds based on conditioned responses are the most accurate and reliable, but conditioning procedures are not suitable for all animals or research questions. Some animals are not trainable or are unable to participate in a behavioral study due to age, health, or some other factor. Physiological methods, especially auditory brainstem response testing, can be particularly helpful in these situations. While ABR and other physiological methods provide useful information about auditory function, it is important to recognize that the results they provide are not equivalent to those from behavioral studies that assess hearing directly; thresholds obtained using physiological methods may under- or over-estimate behavioral thresholds in an unpredictable manner.

Research on hearing abilities in animals has advanced beyond documenting the basic audiogram of a species. Data on frequency and intensity discrimination, sound localization, and the effects of noise on hearing in animals are current topics of study for many animal species. Information on hearing and an animal's abilities to adapt to noise can have important applications for the conservation of species in areas of high anthropogenic noise.

References

Aihara I, Bishop PJ, Ohmer MEB, Awano H, Mizumoto T, Okuno HG, Narins PM, Hero JM (2017) Visualizing phonotactic behavior of female frogs in darkness. Sci Rep 7:10539. https://doi.org/10.1038/s41598-017-11150
American National Standards Institute (2004) Methods for manual pure-tone threshold audiometry (ANSI S3.21-2004). Acoustical Society of America, New York
American National Standards Institute (2015) Bioacoustical terminology (ANSI S3.20-2015, R 2020). Acoustical Society of America, New York
Au WWL (1993) The sonar of dolphins. Springer, New York
Awbrey FT, Thomas JA, Kastelein RA (1988) Low-frequency underwater hearing sensitivity in belugas, Delphinapterus leucas. J Acoust Soc Am 84(6):2273–2275
Bain DE, Kriete B, Dahlheim M (1993) Hearing abilities of killer whales (Orcinus orca). J Acoust Soc Am 94(3):1829–1829
Berger JI, Coomber B, Shackleton TM, Palmer AR, Wallace MN (2013) A novel behavioural approach to detecting tinnitus in the Guinea pig. J Neurosci Methods 213(2):188–195
Bergevin C, Freeman DM, Saunders JC, Shera CA (2008) Otoacoustic emissions in humans, birds, lizards, and frogs: evidence for multiple generation mechanisms. J Comp Physiol A 194(7):665–683
Bhandiwad AA, Sisneros JA (2016) Revisiting psychoacoustic methods for the assessment of fish hearing. In: Sisneros JA (ed) Fish hearing and bioacoustics: an anthology in honor of Arthur N. Popper and Richard R. Fay, vol 877. Springer, New York, pp 157–184
Bowles AE, Francine JK (1993) Effects of simulated aircraft noise on hearing, food detection, and predator avoidance behavior of the kit fox, Vulpes macrotis. J Acoust Soc Am 93:2378–2378
Braff DL, Geyer MA, Swerdlow NR (2001) Human studies of prepulse inhibition of startle: normal subjects, patient groups, and pharmacological studies. Psychopharmacology 156(2–3):234–258
Branstetter BK, Leger JS, Acton D, Stewart J, Houser D, Finneran JJ, Jenkins K (2017) Killer whale (Orcinus orca) behavioral audiograms. J Acoust Soc Am 141(4):2387–2398. https://doi.org/10.1121/1.4979116
Brill RL, Moore PWB, Dankiewicz LA (2001) Assessment of dolphin (Tursiops truncatus) auditory sensitivity and hearing loss using jawphones. J Acoust Soc Am 109(4):1717–1722
Cui J, Zhu B, Fang G, Smith E, Brauth SE, Tang Y (2017) Effect of the level of anesthesia on the auditory brainstem response in the Emei Music Frog (Babina daunchina). PLoS One 12(1):e0169449. https://doi.org/10.1371/journal.pone.0169449
Dent ML, Dooling RJ, Pierce AS (2000) Frequency discrimination in budgerigars (Melopsittacus undulatus): effects of tone duration and tonal context. J Acoust Soc Am 107(5):2657–2664. https://doi.org/10.1121/1.428651
Dooling RJ, Blumenrath SH (2013) Avian sound perception in noise. In: Brumm H (ed) Animal communication in noise. Springer, Heidelberg, pp 229–250. https://doi.org/10.1007/978-3-642-41494-7_8
Ehret G, Romand R (1981) Postnatal development of absolute auditory thresholds in kittens. J Comp Physiol Psychol 95(2):304–311
Erbe C, Farmer DM (1998) Masked hearing thresholds of a beluga whale (Delphinapterus leucas) in icebreaker noise. Deep Sea Res II Top Stud Oceanogr 45(7):1373–1388. https://doi.org/10.1016/S0967-0645(98)00027-7
Erbe C, Reichmuth C, Cunningham KC, Lucke K, Henderson D, Salvi RJ, Quaranta A, McFadden SL,
Dooling RJ (2016) Communication masking in marine Burkard RF (eds) (1999) Ototoxicity: basic science
mammals: a review and research strategy. Mar Pollut and clinical applications, Annals of the New York
Bull 103:15–38. https://doi.org/10.1016/j.marpolbul. Academy of Sciences, vol 884. The New York Acad-
2015.12.007 emy of Sciences, New York
Fay RR (1988) Hearing in vertebrates: a psychophysics Hoffman H, Ison JR (1980) Reflex modification in the
databook. Hill-Fay Associates, Winnetka, IL domain of startle: I. Some empirical findings and
Fay RR (1995) Psychoacoustical studies of the sense of their implications for how the nervous system pro-
hearing in goldfish using conditioned respiratory cesses sensory input. Psychol Rev 87:175–189
suppression. In: Klump GM, Dooling RJ, Fay RR, Houser DS, Finneran JJ (2006) A comparison of underwa-
Stebbins WC (eds) Methods in comparative psycho- ter hearing sensitivity in bottlenose dolphins (Tursiops
acoustics. Birkhauser, Basel, pp 249–261 truncatus) determined by electrophysiological and
Fay RR, Simmons AM (1999) The sense of hearing in behavioral methods. J Acoust Soc Am 120:1713–1722
fishes and amphibians. In: Popper AN, Fay RR (eds) Hu MY, Yan HY, Chung W-S, Shiao J-C, Hwant PP
Comparative hearing: fish and amphibians. Springer, (2009) Acoustically evoked-potentials in two
New York, pp 269–318 cephalopods inferred using the auditory brainstem
Finneran JJ, Houser DS, Blasko D, Hicks C, Hudson J, response (ABR) approach. Comp Biochem Physiol A
Osborn M (2008) Estimating bottlenose dolphin 153(3):278–283
(Tursiops truncatus) hearing thresholds from single International Organization for Standardization (2017)
and multiple simultaneous auditory evoked potentials. Underwater acoustics—terminology (ISO 18405).
J Acoust Soc Am 123(1):542–551 Switzerland, Geneva
Finneran JJ, Mulsow J, Schlundt CE, Houser DS (2011) Jero J, Coling DE, Lalwani AK (2001) The use of Preyer’s
Dolphin and sea lion auditory evoked potentials in reflex in evaluation of hearing in mice. Acta
response to single and multiple swept amplitude Otolaryngol 121(5):585–589
tones. J Acoust Soc Am 130(2):1038–1048. https:// Johnson CS (1966) Auditory thresholds of the bottlenose
doi.org/10.1121/1.3608117 porpoise (Tursiops truncatus, Montagu). U.S. Naval
Gerhardt HC (1995) Phonotaxis in female frogs and toads: Ordnance Test Station. Tech Publ 4178:1–28
execution and design of experiments. In: Klump GM, Kastelein RA, Heu S, van der Verboom W, Jennings N,
Dooling RJ, Fay RR, Stebbins WC (eds) Methods in Veen J, Vander J, de Haan D (2008) Startle response of
comparative psychoacoustics. Birkhauser, Basel, pp captive North Sea fish species to underwater tones
209–220 between 0.1 and 64 kHz. Mar Environ Res 65(5):
Green D, Swets J (1966) Signal detection theory and 369–377
psychophysics. Wiley, New York. Reprinted 1974 by Kemp DT (2002) Otoacoustic emissions, their origin in
Krieger, Huntington, New York cochlear function, and use. Br Med Bull 63(1):
Hall JD, Johnson CS (1972) Auditory thresholds of a killer 223–241
whale, Orcinus orca Linnaeus. J Acoust Soc Am 51: Kiebel EM, Sunderman MG, Leonhard JR, McFadden SL
515–517 (2012) Measurement of cortical auditory event-related
Heffner RS, Heffner HE (1982) Hearing in the elephant potentials in conscious CBA/CaJ mice. Association for
(Elephas maximus): absolute sensitivity, frequency Psychological Science conference, Chicago
discrimination, and sound localization. J Comp Physiol Klump GM, Dooling RJ, Fay RR, Stebbins WC (eds)
Psychol 96:926–944 (1995) Methods in comparative psychoacoustics.
Heffner RS, Heffner HE (1991) Behavioral hearing range Birkhauser, Basel
of the chinchilla. Hear Res 52:13–16 Koay G, Heffner RS, Heffner HE (2002) Behavioral
Heffner HE, Heffner RS (2001) Behavioral assessment of audiograms of homozygous medJ mutant mice with
hearing in mice. In: Willott JF (ed) Handbook of sodium channel deficiency and unaffected controls.
mouse auditory research: from behavior to molecular Hear Res 171:111–118
biology. CRC Press, Boca Raton, FL, pp 19–29 Ladich F, Fay RR (2013) Auditory evoked-potential audi-
Heffner RS, Heffner HE, Masterton B (1971) Behavioral ometry in fish. Rev Fish Biol Fish 23:317–364
measurements of absolute and frequency-difference Langemann U, Klump GM, Dooling RJ (1995) Critical
thresholds in Guinea pig. J Acoust Soc Am bands and critical-ratio bandwidth in the European
49(6B):1888–1895 starling. Hear Res 84(1–2):167–176
Heffner HE, Heffner RS, Contos C, Ott T (1994) Audio- Levitt H (1970) Transformed up-down methods in psy-
gram of the hooded Norway rat. Hear Res 73:244–248 choacoustics. J Acoust Soc Am 49:467–477
Heffner RS, Koay G, Heffner HE (2014) Hearing in Long GR (1977) Masked auditory thresholds from the bat,
alpacas (Vicugna pacos): audiogram, localization acu- Rhinolophus ferrumequinum. J Comp Physiol 116:
ity, and use of binaural locus cues. J Acoust Soc Am 247–255
135(2):778–788 Manley GA (2001) Evidence for an active process and a
Hellström P-A (1995) The relationship between sound cochlear amplifier in nonmammals. J Neurophysiol
transfer functions and hearing levels. Hear Res 86(2):541–549
88(1–2):54–60 McFadden SL (2007) Biochemical bases of hearing. In:
Campbell K (ed) Pharmacology and ototoxicity for
audiologists. Thomson Delmar Learning, New York, Acoust Soc Am 57(6):1526–1532. https://doi.org/10.
pp 86–123 1121/1.380595
McFadden SL, Kiebel EM (2013) A parametric study of Screven LA, Dent ML (2019) Perception of ultrasonic
auditory event-related potentials recorded from cortex vocalizations by socially housed and isolated mice.
of CBA/CaJ mice. Association for Psychological Sci- eNeuro 6(5). https://doi.org/10.1523/ENEURO.
ence Conference, Washington, DC 0049-19.2019
McFadden SL, Walsh EJ, McGee J (1996) Onset and Serafin JV, Moody DB, Stebbins WC (1982) Frequency-
development of auditory brainstem response selectivity of the monkey’s auditory system: psycho-
thresholds in the Mongolian gerbil (Meriones physical tuning-curves. J Acoust Soc Am 71(6):
unguiculatus). Hear Res 100:68–79 1513–1518
McFadden SL, Campo P, Quaranta N, Henderson D Shaffer LA, Long GR (2004) Low-frequency distortion
(1997) Age-related decline of auditory function in the product otoacoustic emissions in two species of
chinchilla (Chinchilla laniger). Hear Res 111:114–126 kangaroo rats: implications for auditory sensitivity. J
McFadden SL, Ohlemiller KK, Ding DL, Salvi RJ (2001a) Comp Physiol A 190(1):55–60
The role of superoxide dismutase in age-related and Simmons AM, Moss CF (1995) Reflex modification: a tool
noise-induced hearing loss: clues from Sod1 knockout for assessing basic auditory function in anuran
mice. In: Willott JF (ed) Handbook of mouse auditory amphibians. In: Dooling R, Fay R, Klump G, Stebbins
research: from behavior to molecular biology. CRC W (eds) Methods in comparative psychoacoustics.
Press, Boca Raton, FL, pp 489–504 Birkhauser, Basel, pp 197–208
McFadden SL, Ohlemiller KK, Ding DL, Salvi RJ (2001b) Sisneros JA, Popper AN, Hawkins AD, Fay RR (2016)
The influence of superoxide dismutase and glutathione Auditory evoked-potential audiograms compared to
peroxidase deficiencies on noise-induced hearing loss behavioral audiograms in aquatic animals. In: Popper
in mice. In: Henderson D, Prasher D, Kopke R, AN, Hawkins AD (eds) Effects of noise on aquatic life
Salvi R, Hamernik R (eds) Noise-induced hearing II, vol 875. Springer, New York, pp 1049–1056
loss: basic mechanisms, prevention and control. NRN Stansbury AL, Thomas JA, Stalf CE, Murphy LD,
Publications, London, pp 3–18 Lombardi D, Carpenter J, Mueller T (2014) Behavioral
McFadden SL, Zulas AL, Morgan RE (2010) audiogram of two Arctic fox (Alopex lagopus). Polar
Age-dependent effects of modafinil on acoustic startle Biol 37:417–422. https://doi.org/10.1007/s00300-014-
and prepulse inhibition in rats. Behav Brain Res 1446-5
208(1):118–123 Thomas JA, Moore PWB, Withrow R, Stoermer M (1990)
Moss CF, Simmons AM (1986) Frequency selectivity of Underwater audiogram of a Hawaiian monk seal
hearing in the green treefrog, Hyla cinerea. J Comp (Monachus schauinslandii). J Acoust Soc Am 87(1):
Physiol A 159:257–266 417–420
Popov VV, Supin AY (1990) Electrophysiological studies Tonndorf J (1976) Relationship between the transmission
on hearing in some cetaceans and a manatee. In: characteristics of conductive system and noise-induced
Thomas JA, Kastelein RA (eds) Sensory abilities of hearing-loss. In: Henderson D, Hamernik RP, Dosanjh
cetaceans: laboratory and field evidence. Plenum Press, DS, Mills JH (eds) Effects of noise on hearing. Raven
New York, pp 405–415 Press, New York, pp 159–178
Salvi RJ, McFadden SL, Wang J (2000) Anatomy and von Békésy G (1960) Experiments in hearing. McGraw
physiology of the peripheral auditory system. In: Hill, New York
Roeser RJ, Valente M, Hosford-Dunn H (eds) Audiol- Walter M, Tziridis K, Ahlf S, Schulze H (2012) Context
ogy diagnosis. Thieme, New York, pp 19–43 dependent auditory thresholds determined by
Schlundt CE, Dear RL, Green L, Houser DS, Finneran JJ brainstem audiometry and prepulse inhibition in Mon-
(2007) Simultaneously measured behavioral and electro- golian gerbils. Open J Acoust 2:34–49. https://doi.org/
physiological hearing thresholds in a bottlenose dolphin 10.4236/oja.2012.21004
(Tursiops truncatus). J Acoust Soc Am 122:615–622 Welch TE, Dent ML (2011) Lateralization of acoustic
Schusterman RJ (1976) California Sea lion underwater signals by dichotically listening budgerigars
auditory detection and variation of reinforcement (Melopsittacus undulatus). J Acoust Soc Am 130(4):
schedules. J Acoust Soc Am 59(4):997–1000. https:// 2293–2301
doi.org/10.1121/1.380928 White JR, Norris MJH, Ljungblad DK, Barton K, di
Schusterman RJ, Johnson BW (1975) Signal probability Sciarra GN (1978) Auditory thresholds of two beluga
and response bias in California Sea lions. Psych Rec whales, Delphinapterus leucas. HSWRI Tech Rep No
25(1):39–45. https://doi.org/10.1007/BF03394287 78-109 Sea World Research Institute, San Diego, CA
Schusterman RJ, Barrett B, Moore P (1975) Detection of Willott JF (1991) Aging and the auditory system: anat-
underwater signals by a California Sea lion and a omy, physiology, and psychophysics. Singular Publi-
bottlenose porpoise: variation in the payoff matrix. J cation Group, San Diego, CA
11 Vibrational and Acoustic Communication in Animals

Rebecca Dunlop, William L. Gannon, Marthe Kiley-Worthington, Peggy S. M. Hill, Andreas Wessel, and Jeanette A. Thomas

Jeanette A. Thomas (deceased) contributed to this chapter while at the Department of Biological Sciences, Western Illinois University-Quad Cities, Moline, IL, USA.
R. Dunlop (*), School of Biological Sciences, University of Queensland, Brisbane, QLD, Australia; e-mail: r.dunlop@uq.edu.au
W. L. Gannon, Department of Biology, Museum of Southwestern Biology, and Graduate Studies, University of New Mexico, Albuquerque, NM, USA; e-mail: wgannon@unm.edu
M. Kiley-Worthington, Centre of Eco-Etho Research & Education, Cranscombe Cleave, Brendon, UK
P. S. M. Hill, Department of Biological Science, The University of Tulsa, Tulsa, OK, USA; e-mail: peggy-hill@utulsa.edu
A. Wessel, Center for Integrative Biodiversity Discovery, Museum für Naturkunde Berlin, Berlin, Germany; e-mail: andreas.wessel@mfn.berlin

11.1 Introduction

The study of animal communication, which is sometimes called zoosemiotics (as opposed to anthroposemiotics, the study of human communication), is fundamental to the areas of ethology, evolutionary biology, and animal cognition. Here, we are not so emboldened as to claim that humans are separate from other "animals." In fact, we are ordinary mammals. Therefore, other than a brief discussion of human language at the end of the chapter, we will not discuss anthroposemiotics. Instead, we highlight and discuss what much of the rest of the Kingdom does.

In Acoustic behavior of animals (edited by Busnel 1963, p. 751), Tembrock stated that, "the production of sounds is not a fancy of Nature, but an expression of biological needs." Moles (also in Busnel 1963), in what are believed to be the main lines of acoustic communication in animals, included a code that is received and acted upon (p. 112). Groundbreaking as this volume was, knowledge of acoustic communication in animals has come a long way since. Just 20 years later, Kroodsma (1982) published Acoustic communication in birds. The first volume of this multivolume publication discussed the significant advances made in recording animal signals, as well as the advancement in knowledge of the anatomy of neural and auditory structures, the physical characters of signal transmission, signaler motivation and coding, species-specific signaling, and the use of signals in behaviors such as spacing and mating (Morton 1982). The second volume (Kroodsma and Miller 1982) discussed issues of signal ontogeny, mimicry, vocal learning, and the ecological, behavioral, and genetic implications of variations within vocalizations. Other early compendiums, such as Sebeok (1977), provided an extensive summary of high-quality research studies from an expanding discipline of behavior and animal communication.

Fig. 11.1 Biotremology


examines mechanical
communication such as that
produced by many insects,
including planthoppers
(Apache degeeri; common
in places such as North
Carolina, USA). Photo “9
Apache degeeri
(planthopper)” by
Wildreturn; https://
wordpress.org/openverse/
image/4323324f-25c8-
408f-9b88-8c5b3ae93655/.
Licensed under CC BY 2.0;
https://creativecommons.
org/licenses/by/2.0/

Bioacoustics is defined as the study of mechanical communication by acoustic (sound) waves. It is a widely used term when referring to animal communication. Biotremology is a relatively recent term. It was conceived to refer to communication signals that comprise substrate-borne vibrations, and which are detected as surface vibrations by specialized perception organs such as slit-sense organs in spiders, subgenual organs in insects, hair receptors, or Pacinian and Herbst corpuscles in vertebrates (Hill and Wessel 2016). Substrate-borne vibrations are sensed via ". . . pressure waves traveling through . . . solid matter . . . detected via the surface vibrations they elicit or the airborne waves (sound) they induce" (Hill and Wessel 2016). Bioacoustical (sound) communication refers to signals that are encoded in acoustic waves and are detected using the ear. Vibrational communication has been recognized as evolutionarily older than bioacoustic communication and is much more prevalent among some animal groups (e.g., arthropods; Fig. 11.1). Therefore, researchers are also interested in how these mechanical vibrations affect behavior.

Both areas of study use similar equipment to record and analyze communication signals. However, scientists in the field of biotremology also use devices such as laser Doppler vibrometers and wavelet analysis. These function to detect faint vibrational emissions made by animals. In addition, electromagnetic transducers produce signals and, when in contact with the substrate, serve as vibration generators for artificial playback experiments.

Now, nearly 60 years beyond Busnel's (1963) paradigm of bioacoustics, tremendous changes in recording technology and analysis have occurred. Acoustic identification of anything from birds to bats can be carried out using an iPhone, an acoustic detection application, and a Bluetooth speaker or microphone!

11.2 The Origins of Substrate-Borne Vibrational and Acoustic Communication

Communication is the transfer of information from one animal (sender) to another animal (receiver) that can affect the current or future behavior of the receiver. In other words, communication conveys information. It is adaptive, in that a successful communication exchange enhances the survival of one or both participants. Vibrational communication has been suggested to have evolved, along with chemical
communication, concurrently with evolution of the Metazoa (all animals; Endler 2014). We know that any movement of an animal, whether in water or at the boundary between air and any type of substrate, creates vibrations that can be detected by any other organism with receptors capable of receiving and translating them. Increasing evidence also suggests that invertebrate hearing organs evolved from vibrational precursors millions of years ago (Stumpner and von Helversen 2001; Lakes-Harlan and Strauss 2014). Therefore, the discussion of origins of communication in this section is restricted to the more recently evolved acoustic communication.

The origins of acoustic communication are likely to be in nonverbal sounds made by chance as the animal moves through the environment. These sounds could be scraping, a stick breaking, footfalls, opening or flapping of wings, or scratching. They are the result of environmental disturbance, which in turn makes a sound through the air, earth, or water. By just being made, these sounds convey to others the presence of the animal, and something about what it might be doing. It is then a simple developmental step for a particular sound to become associated with a particular situation and thus carry a particular message to the recipient. Examples of nonverbal sounds are sounds from an elephant breaking sticks as it moves through the environment, a sigh, a cough, or a sneeze. Originally, these sounds may not have been made to communicate. However, sounds that provide an advantage for an individual, or a population, will be perpetuated if they enhance the fitness of the species. This, ultimately, gives them an evolutionary advantage that would reinforce further refinement of this new sensory mode.

This origin likely gave the evolutionary opening to develop specialized body parts that could produce auditory signals, in tandem with sophisticated sensory capabilities to receive them (Narins et al. 2009). One such specialized body part is the respiratory tract. Once a respiratory tract had developed in vertebrates, sounds associated with breathing could convey information to others, and so the necessary adaptations for sound generation began to develop. For example, holding the breath and then letting it out as a sigh or a cough produces various sounds. These sounds are then associated with situations being experienced by the sender, meaning this information is available to all who hear it. Presumably, it was this evolutionary process that gave rise to sound-making organs in the respiratory tract to the point where vocal communication now involves a larynx.

Ritualization is the evolutionary process by which a pattern of behavior changes to become more effective as a signal (Huxley 1966; Morris 1957). The behavior is performed in a consistent way and is either stereotyped or incomplete. Incomplete behaviors may be used for activities such as courtship. For example, a drake mallard (Anas platyrhynchos), when preening and displaying to a female, acts as if he is addressing a skin irritation (Morris 1956), but he may not even touch his feathers during the display. In other words, the behavior seems to be a preening behavior, but is in fact a courtship behavior. To increase the effectiveness of the ritualized signal, anatomical modifications may also have evolved. A classic example of this is the elaborate colors of the Mandarin drake (Aix galericulata). During the courtship of a female, the male will highlight these colors by pointing to them during incomplete, exaggerated, and stereotypical preening.

Exaggerated signal ritualization is characterized by a clear signaling behavior, such as the ears of a horse (Equus caballus) flattening back as a precursor signal to biting. This exaggerated ear movement has a clearer meaning than just putting the ears back. Ritualistic behavior is usually no longer tied to its original role because it has become more important for the signaler's fitness to communicate, rather than being used for its original purpose. Therefore, the signal has evolved to produce a clear message.

Signals can also evolve to become more effective by redundancy, or by emulation of another's acoustic or vibrational expression. Redundancy in animal acoustic communication is the repeated use of a signal. Vocal signals, for example, can be repeated for long periods of time, such as the continuous chorusing of frogs advertising during mating sessions. Redundancy reduces the risk
that a signal will be missed or misinterpreted and assures that the signal is heard even when environmental conditions are poor (e.g., when there are masking sounds from the environment and/or human sources). This continual production of sounds in chorus can also sustain the state of arousal or excitement, which may be necessary for completion of the behavior.

Signal emulation is when other members of a group join in when a signal is given. An example of this is when a group of domestic dogs (Canis lupus familiaris) hear the high-pitched siren of an emergency vehicle. One may start to howl, and others soon join in (Fig. 11.2). When one individual calls, this often stimulates others to make the same call. Other examples include the greeting calls (trumpeting) between mother and offspring elephant (Elephas maximus), or the "see-saw" vocalized inspiration and expiration call and reply signals of bull cattle (Bos taurus). Sound emulation is also common in humans. The vocalization is copied and repeated by a recipient and can cause increased arousal in both the sender and the recipient (Kiley 1972). Animals copying new sounds, which often happens by emulation, requires vocal learning (Janik and Slater 2000).

Fig. 11.2 Emulative acoustic behavior is seen when a domestic dog (Canis lupus familiaris) hears a siren or other high-pitched signal. Photo "Howling white husky" by Tambako the Jaguar; https://wordpress.org/openverse/image/7d77b8d9-3dc4-4f3d-9c04-318833d1759e/. Licensed under CC BY-ND 2.0; https://creativecommons.org/licenses/by-nd/2.0/

A more complex version of this is antiphonal singing, which is an acoustic exchange between animals where they call at the same time to produce a chorus. There are benefits to this emulative calling behavior. Males that chorus, such as frogs and toads (Anura), cicadas (Cicadoidea), and humpback whales (Megaptera novaeangliae), may attract more females to a localized area. For example, millions of cicadas gather to mate in a forest in the eastern US, where the singing males produce loud, pure-tone sounds above 90 dB SPL (Fig. 11.3; Bennet-Clark 1998, 2000). Prairie mole cricket (Gryllotalpa major) males in the south-central US sing in choruses from burrows in the soil that individuals construct in aggregations. At 20 cm from the burrow entrance, the males' loud harmonic songs average 96 dB SPL (Hill 1998).
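
To put such chorus levels in more familiar units, sound pressure level in air is referenced to 20 µPa, so the quoted values can be converted to pressure amplitudes. A minimal sketch (the function name and the small conversion script are ours, for illustration only, and are not taken from Hill 1998 or Bennet-Clark 1998):

    P_REF_AIR = 20e-6  # standard in-air reference pressure, 20 micropascals

    def spl_to_pressure(spl_db):
        """Convert a sound pressure level (dB re 20 uPa) to RMS pressure in pascals."""
        return P_REF_AIR * 10 ** (spl_db / 20.0)

    print(spl_to_pressure(90))  # cicada chorus level quoted in the text: ~0.63 Pa
    print(spl_to_pressure(96))  # prairie mole cricket song at 20 cm: ~1.26 Pa
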
The larynx and various resonating cavities in the respiratory tract (throat, mouth, and nasal cavities that can be specialized into trunks or elongated noses) are collectively responsible for an enormous range of vocal sounds made by different species. Vocal signals have evolved to convey a great variety of messages, encompassing many meanings that can be interpreted by the recipients. The development of this messaging system becomes intricate with human language. Whether the degree of development of the young at birth (which could relate to cognitive development; Scheiber et al. 2017; Wilson-Henjum et al. 2019) influences the complexity of vocalizations and other displays is yet to be determined.

11.3 A Summary of Communication

Communication occurs when a signaler encodes a message in a signal, which passes through some medium (air, water, soil, plant organs, etc.), and is received, decoded, and acted upon by the receiver. The receiver's response benefits the
fitness of the signaler, and perhaps itself. It is a common misconception that communication always consists of a simple signal that is reciprocated with a single response. In fact, communication often uses multimodal sensory combinations of visual, olfactory, tactile, gustatory, electrical (as in electric fish or the duck-billed platypus, Ornithorhynchus anatinus), substrate-borne vibrational, and acoustical modes. The use of multimodal signals helps ensure that the message is unmistakable. For example, a cat can swish her tail, pull back her ears, swipe with her claws, and hiss to give an aggressive signal of potential attack, whereas just hissing or swishing her tail is a less clear message.

Fig. 11.3 17-year cicada (Magicicada sp.). Photo by the U.S. Department of Agriculture; https://www.flickr.com/photos/usdagov/8672057401/in/photostream/. Licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/

The focus of this chapter is substrate-borne (vibrational) and acoustic (sound) communication. A signal, for the purposes of this chapter, contains substrate-borne or acoustic information that is broadcast by an individual and is available to be received by another individual. The receiver may be the intended target of the signal or an unintended eavesdropper. Any individual in the environment with the appropriate receptor can receive the signal (Wiley 1983). The receiver of a signal may recognize it as containing information beyond that of just sensing the signal and the presence of the signaler.

11.3.1 Communication Concepts

Marler (1961) recognized four functions of signals: identifiers, designators, prescribers, and appraisers. For example, a male seal swims into the territory of another seal and the territory holder sends out a warning call. This call identifies the place and time of the territory holder (identifier), reports that he is the territory holder (designator), warns that the intruder (prescriber) should stop approaching, and allows the intruder to react to his call (appraiser). Smith (1969) expanded this into 12 generalized categories for vertebrates. Since then, with technological and analytical advancements, signal functions have been expanded to include complex displays, either vocal or nonvocal, and the other categories explored below.

Displays are behaviors that use one or several signals. These signals have evolved and become specialized to convey specific information. A classic example of a display behavior is the chest-beating of a mountain gorilla (Gorilla beringei), made famous by King Kong movies. This signal is given only by the dominant silverback male when he encounters a threat, such as another gorilla male, though the display can be practiced or mocked by the young (Fig. 11.4). The chest-beating forms part of a complex threat
display, which involves nine steps, and includes both visual and acoustic modalities (Schaller 1964). In other words, the threat display can encompass several different signals.

Fig. 11.4 Displays such as shown by this young gorilla (Gorilla beringei) often accompany both vibrational and acoustic communication. Photo "Gorilla Holding Baby Sister and Beating Her Chest" by Eric Kilby; https://www.flickr.com/photos/ekilby/36360289044. Licensed under CC BY-SA 2.0; https://creativecommons.org/licenses/by-sa/2.0/

A similar threatening display is produced by a dog (Canis lupus familiaris), drawing back its lips and exposing its teeth (visual), as well as growling (acoustic) (Fig. 11.5). Again, this is a complex display involving multiple steps and multiple modalities. However, displays can be simpler, such as a grasshopper (Orthoptera) scraping its wings as an acoustic signal to indicate location and readiness to mate.

Fig. 11.5 Yellow Labrador retriever growls at a border collie, while using a mix of visual displays and vocalizations; the collie responds. "Growl" by smerikal is licensed under CC BY-SA 2.0; https://commons.wikimedia.org/wiki/File:Labrador_Growl.jpg

Much of the communication in insects, other invertebrates, and nonmammalian vertebrates such as fish and amphibians, involves stereotyped signals. That is, the signal is produced in a constant form and the response is evoked only by that signal. As a result, this signal/response relationship becomes characteristic of that species. In this way, stereotyped signals can be important in evolution. For example, if a signal influences mate selection, then a slight alteration in the signal could lead to failure to reproduce, or if mating is successful, it might give rise to a new species.

11.3.2 Biotremology

Vibrational behavior in animals has gained momentum in general awareness and research in the last few decades (Narins 1990; Hill 2008; Cocroft et al. 2014; Hill et al. 2019). Any sort of motion of a living organism produces vibrations in the various media around them, including the soil, air, plants, water surface, or spider webs. Some vibrations can be signals, while others are incidental cues not produced purposefully, or to benefit the sender. The rather new branch of behavioral biology studying vibrational communication is called "biotremology" and is concerned with substrate-borne mechanical waves used as a communication channel (Hill and Wessel 2016). In contrast to airborne sound, which consists of pressure waves only (see Chap. 4, Sect. 4.2.2), in solid substrates mechanical energy can travel in several waveforms, especially at the surface (i.e., the boundary between two distinct media; Fig. 11.6). Surface-borne waves are of special interest as most animals that make use of vibrational communication receive the signals by detector organs. These organs are in contact with a substrate surface, be it the ground, the surface of plant stems and leaves, or the water surface.

In addition to pressure waves (P-waves) and shear waves (S-waves) traveling inside the body of a solid (see Chap. 4), we have at the substrate surface Rayleigh waves (R-waves) and Love waves (L-waves). Both R- and L-waves show particle oscillation perpendicular to the direction of the wave, but different propagation characteristics. P- and L-waves, for example, both have a higher propagation velocity than R-waves. Animals that can detect those waveforms separately could localize the source of these waves, be it a communication partner, a predator, or prey.

Fig. 11.6 Mechanical wave forms produced by a signaling plant-dwelling insect. A planthopper is one of the small relatives of the cicadas. It has a tymbal organ to produce vibrations, which are transferred through its legs, then the thin air layer between its body and the plant surface, to the plant on which it is sucking fluids. By doing this, the planthoppers produce a very faint sound, which can be propagated through the air or soil. The planthopper tymbal organ is homologous to the "drumming organ" of the large singing cicadas. Tens of thousands of these smaller hemipteran bugs use tymbal organs to produce "silent songs." Reprinted by permission from Elsevier. Hill PSM, Wessel A (2016). Biotremology. Current Biology 26, R181–R191; https://doi.org/10.1016/j.cub.2016.01.054. © Elsevier, 2016. All rights reserved

In 1979, Brownell and Farley showed that scorpions localize their prey by using differences in the propagation velocity of P- and R-waves (150 m/s vs. 50 m/s), which they perceive using
different sensory organs (tarsal hair receptors v. basitarsal slit sensilla). That was a significant discovery on the path to biotremology. Until then, the substrate the scorpions use, loose sand, was considered as not fitting for the transmission of vibrational signals, nor for the differential detection of different waveforms. Since the establishment of the view that a host of natural substrates are suitable for vibrational communication, a great number of (apparently) well-known behaviors are now seen in a new perspective, and new discoveries are made for almost all animal groups with increasing frequency (Hill et al. 2022).
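
Because the two propagation speeds are given (150 m/s for P-waves and 50 m/s for R-waves in loose sand), the arrival-time difference sensed by the scorpion translates directly into a distance to the source. A minimal sketch of that geometry (our illustration; the function and variable names are not from Brownell and Farley):

    # Both wave types leave the prey at the same instant but travel at different
    # speeds, so the lag between their arrivals at the scorpion encodes distance.
    V_P = 150.0  # P-wave speed in loose sand, m/s (value from the text)
    V_R = 50.0   # Rayleigh-wave speed in loose sand, m/s (value from the text)

    def distance_from_lag(delta_t_s):
        """Source distance (m) from the P- vs. R-wave arrival-time difference (s)."""
        return delta_t_s * V_P * V_R / (V_P - V_R)

    print(distance_from_lag(0.004))  # a 4-ms lag corresponds to ~0.3 m
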
The production of vibrational signals or cues can be accomplished through different forms: drumming (any sort of percussion event where a body part impacts the substrate of soil or a plant or water, etc.), tremulation (a body shaking/trembling that does not strike the substrate as the signal travels through the signaler's legs to the surface on which they are standing), stridulation (rubbing together a specialized file and scraper, which may be found on a variety of body parts), buckling of tymbal organs in animals that have them, vocalizations, and perhaps others, such as scraping a surface while signaling, or even scratching against a tree, or rolling on the ground. Some of these signal production mechanisms, such as drumming, stridulation, and vocalization, always produce both a substrate-borne (vibrational) and an airborne (acoustic) component with a single action, even if only one of the potential signals is capable of eliciting a response in a receiver.

Arthropods, and especially insects, show the greatest variety of specialized organs to produce vibrational signals. All mentioned means of vibration production, except for vocalization, are present in several groups of arthropods and may have evolved several times, independently. For a subgroup of the insect order Hemiptera, the Tymbalia or tymbal bugs, comprising tens of thousands of species including plant- and leafhoppers, cicadas, and true bugs (Heteroptera), vibrational communication is known to be evolutionarily old and ubiquitous (Hoch et al. 2006; Wessel et al. 2014).

In mammals, most vibrational signals are produced by drumming or vocalization. Curiously, the vibrational communication of the largest land animal, the African savanna elephant (Loxodonta africana), was discovered by O'Connell-Rodwell in the 1990s, when she noticed peculiar behaviors. A freezing behavior in the elephant and change in orientation, without an apparent cause, nevertheless reminded her of the behaviors of the tiny planthoppers whose vibrational communication she had studied earlier (Fig. 11.7). O'Connell-Rodwell and colleagues demonstrated that the signals the elephants generate with low-frequency "rumbles" (about 20 Hz) could be very useful for intraspecific long-distance communication (O'Connell-Rodwell et al. 1997, 2000).

Also, drumming is a type of long-range vibrational signal production. For instance, drumming by prairie chickens (Tympanuchus cupido) can be detected up to 5 km away from the source (Jackson and DeArment 1963). Kangaroo rats (Dipodomys deserti, D. ingens, and D. spectabilis) drum the soil surface (seismic communication) with their feet to communicate such things as territorial ownership, their competitiveness, and their presence and location to other kangaroo rats (Fig. 11.8; Randall 1984; Randall and Lewis 1997; Cooper and Randall 2007). Many species of marsupial kangaroos (Macropodidae) are known to produce a foot thump when confronted by predators. The intended recipient of the vibration is not known and could be either a predator or other kangaroos (Narins et al. 2009). Sheep and many other ungulates stamp their feet when frightened or aroused in other ways.

As every movement of an animal causes particles in the surrounding media to oscillate and evokes all possible sorts of mechanical waves, it is the mechanism of reception of mechanical signals or cues that defines acoustic vs. vibrational communication. It also follows that every act of communication establishes, at least potentially, a complex communicational network in the realm of the "acousto-vibro-active-space," whereby the active space for vibrational signals can be surprisingly wide, even bridging air gaps (Fig. 11.9; Virant-Doberlet et al. 2014;
Mazzoni et al. 2014; Gordon et al. 2019). On an ecosystems level, we have begun to think of, and to study, a whole complex multilevel vibroscape (Šturm et al. 2021).

Fig. 11.7 Elephant vibration detection posture. (a) To detect a signal, an elephant appears to focus solely on somatosensory detection via receptors in the trunk. Its ears are relaxed, suggesting no airborne assessment for signals. (b) Elephant vibration detection posture, where it appears to be using its toenails and trunk to assess a ground-borne signal. Again, its ears are not fully extended. This suggests it uses both bone conduction through the toenails and a somatosensory pathway through Pacinian corpuscles in the trunk for signal detection. Elephants may also lean forward on their front legs with ears flat, sometimes lifting one of the front feet off the ground (possibly for triangulation or better coupling). If focused on an acoustic signal, an elephant will hold its ears out and scan its head back and forth in the general direction of the sound. Reprinted by permission from Springer Nature. Biotremology: Studying vibrational behavior, edited by P. S. M. Hill, R. Lakes-Harlan, V. Mazzoni, P. M. Narins, M. Virant-Doberlet and A. Wessel, pp. 259–276, Vibrational communication in elephants: A case for bone conduction, C. O'Connell-Rodwell, X. Guan and S. Puria; https://link.springer.com/chapter/10.1007/978-3-030-22293-2_13. © Springer Nature, 2019. All rights reserved

Fig. 11.8 Kangaroo rats (genus Dipodomys) produce seismic signals by drumming the soil surface with their large hind feet. (left) Photo of "Kangaroo Rat by Stuart Wilson" by cameraclub231 is licensed under CC BY 2.0 (https://www.flickr.com/photos/135081788@N03/49936422922). (right) Ord's kangaroo rat (Dipodomys ordii). Photo of "Two Ord's Kangaroo rats, Alberta" by Andy Teucher licensed under CC BY-NC 2.0; https://www.flickr.com/photos/63265212@N03/8736679123

Fig. 11.9 Types of communication acts by a vibrational signaler. The signaling lycosid wolf spider establishes vibrational communication with a conspecific receiver, even one that is not on the same substrate as the sender. Likewise, a vibrationally communicating prey (e.g., a planthopper) and an acoustically orienting parasite (e.g., a braconid wasp) are eavesdropping on the spider, thereby establishing a complex communication network. Reprinted by permission from Elsevier. Hill PSM, Wessel A (2016). Biotremology. Current Biology 26, R181–R191; https://doi.org/10.1016/j.cub.2016.01.054. © Elsevier, 2016. All rights reserved

Despite the importance of reception mechanisms for the study of vibrational communication, they are, for now, the least understood aspect in biotremology. Arthropods have in their bauplan (in every body segment and at every joint of their legs) mechanosensitive stretch organs (chordotonal organs) that are responsible for body and movement control, but could also pick up environmental vibrations. In some groups, such as grasshoppers, crickets, and cicadas, chordotonal organs have evolved into ears with a tympanum attached to one end of the stretch organ. It is hypothesized that in every such case these hearing organs transformed through an evolutionary intermediate stage of vibration receptors, i.e., vibrational reception is evolutionarily older than hearing.

A recent breakthrough was the demonstration of the complete pathway, from signaling through reception, to perception, and response behavior, of the vibrational component of the courtship of the fruit fly Drosophila melanogaster. It is the vibrational signaling of the male that triggers the female to freeze at the end of the courtship, facilitating copulation (McKelvey et al. 2021). The male's vibrational signals are transmitted through the common courtship floor (overripe fruits) and were picked up by a subset of neurons of the female's femoral chordotonal organ. By genetic knockout experiments of several mechanotransducer ion channels, McKelvey et al. also identified an involved protein known to be responsible for gentle touch sensitivity in vertebrates, suggesting a deep evolutionary origin of vibrational communication.

In several cases, we need to consider a bimodal acousto-vibrational communication on the signal production as well as on the reception side that results in a complex perception of the environment outside of the experience of human beings. Elephants, for example, produce low-frequency signals by vocal "rumbles" and "foot stomps" that produce airborne vibrations (sound) as well as seismic waves (O'Connell-Rodwell et al. 2000). New findings point to a simultaneous monitoring of the signaling by three reception pathways: sound hearing by the ear's tympanum, bone conduction hearing, and somatosensory detection via receptors in the trunk (Fig. 11.7; O'Connell-Rodwell et al. 2019). In this way, the
overall chance of detecting a signal at all in a heterogeneous environment is improved, and the animals could also make use of the different propagation velocities for assessing the distance to the source of the signal.

11.3.3 Diversity in Communication

Recent evidence indicates that many messages may be conveyed auditorily in nonhuman primates when the larynx is not used. These commonly take the form of rumbling of the stomach, farting, breaking sticks, swishing of grass, sounds during digging or flying, and others. In fact, many sounds made by an individual can carry information to those who hear, but the question is whether they are used for communication. These sounds could just be the result of physiological or environmental adjustments that the sender may or may not be able to control, or that are not recognized as significant in communication. One example is surface behavior in humpback whales. Humpback whales can launch their body out of the water, turn, and splash down on their side or back (breach), slap the water with their pectoral fins, tail flukes, and even their head. These produce loud "bang" sounds, thought to be used as communication signals during periods of high underwater noise when vocal signals are not as effective (Dunlop et al. 2010).

In general, the use of these sounds for communication has not been given much research time to date, except for cases where they have been ritualized to carry information to others. For example, we do know, from centuries of hunters' anecdotal evidence, that a hunted antelope, elephant, or even a rhino, will move much more carefully to not make a sound when it is being hunted, compared to when traveling or grazing in a group (e.g., Baze 1950). If this is the case, the individual must recognize that the sound will carry a message (Heyes and Dickinson 1990).

In invertebrates and non-primate vertebrate animals, ascertaining whether or not these signals are being used for communication is more of a challenge. Each movement of an animal's body creates vibrations that propagate through the environment, and production of these vibrations cannot be eliminated by the individual, even if walking more softly does lower the amplitude. Therefore, we can be certain that in both vertebrate and invertebrate predators, a substrate-borne vibration or sound that alerts potential prey of the presence and direction of movement of the predator is not communication. In animal communication, we refer to this class of unintended information as a cue. On the other hand, we may also be familiar with a hunting dog moving through a meadow and flushing birds on the ground into flight with the result that the hunter can shoot them. We simply do not know if this sort of behavior exists in a more natural, less domesticated setting.

11.4 The Advantages and Disadvantages of Vibrational and Acoustic Communication

Substrate-borne vibrational and acoustic signals are used in communication by almost all invertebrates and vertebrates. Sometimes each type of signal is used by a single species but in different contexts. There are many examples of the two being used across animal taxa in the same basic context. Some major groups of animals have evolved a heavier dependence on one than the other. For example, only as recently as 2015 did we observe the first described substrate-borne signaling in mating birds (Ota and Soma 2022) and in the very well-studied fruit fly Drosophila melanogaster (McKelvey et al. 2021), both of which were well known for acoustic and visual signaling. These signals are essential for many species to find a mate, keep in contact (such as between mother and young), maintain territory, warn conspecifics of predators, link food location, reinforce social living, communicate emotional state, and many other types of information (Bradbury and Vehrencamp 1998). For any animal, being out in the world advertising your presence has many advantages, but it also has its disadvantages. The advantages of using vibrational and acoustic communication signals are
essentially the same. There is no need for light, so signals can be detected at night. Sound can flow around obstacles, so acoustic signals can be heard anywhere and anytime, and even though the substrate filters vibrational signals and cues in ways that are difficult to predict, they still can be detected without respect to time. Compared with other signals, most vibrational and acoustic signals do not need a great deal of energy to produce. Because of the physics of signal propagation, vibrational and acoustic signals can travel over long distances. For instance, in primates, the roaring of howler monkeys (genus Alouatta) can travel up to 1 km.
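
A rough sense of how level falls off with range can be had from idealized spherical spreading, in which received level drops by 20 log10 of the range ratio; real propagation adds absorption, ground and vegetation effects, and refraction, so this is only a hedged back-of-the-envelope sketch (the 100-dB source level below is a made-up illustration, not a measured howler monkey value):

    import math

    def received_level(source_level_db, range_m, ref_range_m=1.0):
        """Received level (dB) under idealized spherical spreading from a 1-m reference."""
        return source_level_db - 20.0 * math.log10(range_m / ref_range_m)

    # A call at a hypothetical 100 dB re 20 uPa at 1 m would fall to ~40 dB at 1 km.
    print(received_level(100.0, 1000.0))
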
However, there are disadvantages to vibrational and acoustic communication. These include energetic and developmental costs, such as requiring special structures for signal production and reception. Being able to produce a loud signal often requires new, and possibly elaborate, structures, such as the larynx of vertebrates and the melon of sperm whales (Physeter macrocephalus). Invertebrates have also evolved specialized structures, such as the stridulatory apparatus in insects, which requires a receptor such as the subgenual organ (for substrate-borne vibrations) and the ear (for sound) to pick up the messages. Many animals have evolved specialized receptors to detect substrate-borne vibration signals (Pacinian corpuscles, Meissner's corpuscles, Eimer's organ; Narins and Lewis 1984; Narins et al. 2009).

The disadvantages of signaling can, however, be subtle, such as a wasted broadcast when there is no one to receive it, or alerting others and then being overcome by a predator. "Blurting out" who and where one is means others can find you. By listening in, these others, or unintended receivers, which could be predators, prey, or even eavesdropping conspecifics, can obtain valuable information about the signaler. This may come at a cost to the signaler. If the unintended receiver is a predator, the cost is obvious: by listening in on the sound signals, the predator can recognize the signaler as prey and locate it. Conversely, prey can be alerted to, and identify, a signaling predator and its location, thus making it easier for prey to avoid predation. A conspecific eavesdropper can gain important information about the signaler/receiver relationship without having to directly take part in the interaction. Siamese fighting fish (Betta splendens), for example, eavesdrop on fighting males to gain information about their strength, which they then use in future interactions (Oliveira et al. 1998; Peake and McGregor 2004). To add further complexity, the presence of an eavesdropper audience can affect communicative interactions and force signalers to change their signaling behavior according to who else may be listening in. This is known as the audience effect and was first documented in a study of domestic chickens (Gallus gallus; Evans and Marler 1991, 1994).

Despite these and other disadvantages, it is obvious that substrate-borne vibrational and acoustic communication and all that they entail have provided extraordinary benefits in competing, surviving, and propagating the next generation. The stories of the development of vibrational and acoustic communication are ongoing and much knowledge about the mechanisms, meanings, and extent of these systems is yet to be discovered.

11.5 The Influence of the Environment on Acoustic and Vibrational Communication

For the most part, animals do not sit in a studio, acoustic lab, or anechoic chamber when signaling acoustically or with substrate-borne vibrations. They are usually in a natural environment subject to atmospheric and other conditions. Signals may be affected by spatial separation, movement of the caller, and they may even vary spatially or geographically. Environmental noise is a significant factor influencing animal signaling behavior. While few studies to date have addressed vibrational environmental noise, this topic is the focus of a recent review of both terrestrial and marine anthropogenic noise topics and literature, including previously unpublished case studies that can
be used as guides for future work (Roberts and Howard 2022).

11.5.1 Atmospheric Conditions

Atmospheric conditions, which include changes in temperature and wind, exert powerful and predictable influences on animal sounds. These influences can cause the ability to detect a signal to change rapidly. The transmission of a signal may be prolonged or modulated by topography, regional weather, seasonality, and climate. Mammalian carnivores, such as coyotes (Canis latrans) and wolves (Canis lupus), live in areas with lower nocturnal temperatures (David Mech and Boitani 2003). These animals show crepuscular calling to maximize their chances of being heard over the longest possible distances. Vibrations in the soil or other substrates due to wind or rain can also interfere with normal signal production and reception, to the extent that individuals will stop courtship displays under windy or rainy conditions.

11.5.2 Masking Sounds

Masking sounds are environmental sounds, such as a stream, wind moving through the trees, and sounds from other animals, which cover, or dilute, the signal. In birds and other animals, spatially separating a signal from a masking sound is one way to improve signal detectability. If the signal and masking sound are separated spatially, the receiver can focus efforts to hear the signal. This "spatial release from masking" has been demonstrated in the behavior and physiology of the northern leopard frog (Lithobates pipiens) (Ratnam and Feng 1998). Bee (2007) showed that female Cope's gray treefrogs (Dryophytes chrysoscelis) approached a target signal more readily when it was spatially separated by 90° from a masking sound, implying this spatial separation aided signal reception. Spatial release from masking has also been shown to occur in budgerigars (Melopsittacus undulatus; Dent et al. 1997) and killer whales (Orcinus orca; Bain and Dahlheim 1994).

A similar mechanism to spatial release from masking is known as the cocktail party effect. Here, the receiver focuses its attention on the signaler, while selectively filtering out other stimuli such as other sounds. At a party, humans can "tune in" to one conversation when many are taking place. Many frogs and songbirds have also been shown to successfully communicate in noisy, party-like situations. Frogs can recognize, localize, and respond to signals within a cacophony of chorusing (Gerhardt and Bee 2006; Wells and Schwartz 2006). Songbirds are able to recognize conspecific song and songs from other species within a dawn chorus (Benney and Braaten 2000; Hulse et al. 1997). Offspring and parents also successfully reunite within noisy penguin colonies (Aubin and Jouventin 1998).

The above mechanisms demonstrate how the receiver overcomes masking sounds to improve signal detectability. Another way to improve signal detectability is for a signaler to change the way it calls. For example, a signaler could increase its call amplitude, call duration, and/or call at a different frequency. These changes are collectively known as the "Lombard effect." The Lombard effect has been demonstrated in species such as the Japanese quail (Coturnix japonica; Potash 1972), budgerigars (Manabe et al. 1998), chickens (Gallus gallus domesticus; Brumm et al. 2009), nightingales (Luscinia megarhynchos; Brumm and Todt 2002), white-rumped munia (Lonchura striata; Brumm and Zollinger 2011), and zebra finches (Taeniopygia guttata; Cynx et al. 1998), and even in large whales such as the humpback whale (Dunlop et al. 2014).

11.5.3 Geographic Variation and Dialects

Changes in the environment may lead to geographic variation, and this variation can eventually separate animals within a species into different populations. It should be noted that geographic variation is not necessarily due to changes in the environment. While this is occurring, geographic separation can lead to the
formation of dialects. A dialect can evolve where species dispersal is occurring and their acoustic contact with each other becomes limited (Slater 1986, 1989). As a result, individuals within a species population may exhibit similar sounds to each other, but these sounds may be quite different in structure to other separated and more distant populations (Catchpole and Slater 2008; Gannon and Lawlor 1989). This results in within-species vocal variation.

Dialects are also known from biotremology studies. For example, the well-known southern green stink bug (Nezara viridula) has spread throughout the world (except for the Arctic and Antarctic) from its native Ethiopia in the past 100 years. Geographically isolated populations (e.g., California and Florida in the United States, the French Antilles, Australia, Japan, Slovenia, and France) have distinct differences in duration and repetition time of male and female signals. Individuals appear to be able to recognize adults from other populations but prefer to mate with those of their own dialect/population (Virant-Doberlet and Čokl 2004).

The study of population dialects offers a means to explore the causes and the functions of signal variation and change (Henry et al. 2015). Geographic variation in acoustic signals can reflect historical evolutionary changes within species. Not only can these signals be used to assess links between geographic variations and population connectivity, but they can be used to provide important information for the conservation of a species. For example, geographic variation in calls could indicate how birds disperse through a fragmented habitat, meaning the study of dialects can be used as a noninvasive tool to assess population connectivity (Kroodsma and Miller 1982; Amos et al. 2014).

The formation of dialects can occur through several mechanisms: as a result of a side-effect or "epiphenomenon" of learning via incorporating copying errors (such as adding or omitting parts of the call), due to structural changes to call elements through drift, or as a possible indicator of the level of behavioral or genetic variation in a population (Baptista and Gaunt 1997; Catchpole and Slater 2008; Podos and Warren 2007; Keighley et al. 2017). Another mechanism that helps maintain variable acoustic dialects is social adaptation. Social adaptation refers to the ability to adjust behavior to a prevailing pattern in a population. Migrating birds, for example, learn calls quickly (Salinas-Melgoza and Wright 2012), which provides reproductive benefits due to acoustic familiarity by potential mates (Catchpole and Slater 2008; Farabaugh and Dooling 1996). In this way, newly arriving immigrants fit in quickly and do not insert changes to bird songs of the residents, thereby maintaining the local dialect.

Vocal dialects can act as precursors to genetic isolation (e.g., in coastal US chipmunks, genus Neotamias). Dialects can also be maintained over time if the populations are separated and have little acoustic contact. This separation can be reinforced by geographic boundaries, or other isolation mechanisms, that reduce breeding chances (Gannon and Lawlor 1989). Examples include the pika (Ochotona), grasshopper mice (Onychomys), white-crowned sparrows (Zonotrichia), prairie dogs (Cynomys), and bats (Myotis evotis), which have all been shown to exhibit dialects due to geographic variation. Several species of birds, such as the chaffinch (Fringilla coelebs), have been identified as having song dialects and therefore are described as having distinct "cultures" (Slater 1981). One of the most striking examples of cultural influences is the rapid spread of new humpback whale songs across the South Pacific basin. All male humpback whales within a population generally conform to the same song pattern, making it a cultural trait. These song types move eastward across the South Pacific basin in a series of cultural waves at a geographic scale unparalleled in the animal kingdom (Garland et al. 2011).

Behavioral repertoires are malleable; that is, they are affected by the environment, learning, and interactions within a population. Variants in signal characteristics are no exception (Brumm et al. 2009). Thus, signal characteristics can act as precursors to variants in other genetic characteristics, and eventually, speciation.

Notably, O'Farrell et al. (2000) examined nearly 2500 calls from 43 sites in Hawaii and the mainland United States for the hoary bat (Lasiurus cinereus; Fig. 11.10). They found some geographic variation within the calls, but the variation could not be explained by isolation (a distance of about 2300 miles (3800 km) between San Francisco, CA, USA and Honolulu, Oahu, Hawaii, USA). They were unable to exclude the effects of context, behavior, or in some cases low sample size. Bats of this species, regardless of where they were recorded, could be identified as L. cinereus. In other words, these bats were showing variations in call structure and behavior but had not yet evolved into different species.

Fig. 11.10 Hoary bat (Lasiurus cinereus). "Hoary bat" (https://www.flickr.com/photos/33247428@N08/48546621027) by Oregon State University is licensed under CC BY-SA 2.0; https://creativecommons.org/licenses/by-sa/2.0/

There are instances in which different species have evolved. Several studies in mammals have found that research into the geographic variation of acoustic signals is important taxonomically by discovering cryptic species. Chipmunks (Neotamias) occurring mostly along the US coasts of California, Oregon, and Washington were thought to be one species (Eutamias townsendii) with several subspecies. The species was characterized mostly by cranial and pelage features. It was not until localities throughout the range of the four subspecies within E. townsendii were sampled acoustically, and examined statistically, that variation of the calls was shown to be dramatic enough to warrant elevation to four distinct species. Originally based on acoustic data, this was confirmed by genitalia and genetic information (Gannon and Lawlor 1989; Sutton and Nadler 1974; Sullivan et al. 2014).

11.6 Information Content or the Meaning of Signals

Vocal signals can be used to provide (a) static information about the species, including the size and shape of the vocal apparatus, or (b) dynamic information, that is, the motivational state of the sender. Vocal signals can be context-dependent, where the same call can mean different things in different situations, or context-independent, where the call has a specific meaning whatever the context. Species recognize one another from their vocalizations, and produce signals related to various situations such as alarm calls in the presence of a predator, distress calls when separated from a parent, singing and chorusing to attract or deter conspecifics, or calls that reflect behavioral changes. The question then arises: how does the recipient know what the caller means in that situation? The answer is, at least in birds and mammals, that the receiver assesses call meaning by observing the sender and the context in which the signal is sent.

11.6.1 Static Information

The anatomy of the vocal apparatus in mammals determines features of its sounds, and these features correlate with the animal's body size (Fitch 1997 in rhesus macaques, Macaca mulatta). Larger lungs can produce longer vocalizations. Vocal folds that are longer and thicker produce sounds at lower fundamental frequencies (for example, pika, Ochotona alpina; Volodin et al. 2018). A longer vocal tract concentrates the energy in the lower frequencies (Ey et al. 2007). Thus, correlations have been found between an animal's vocal tract length,
body mass, and formant dispersion (e.g., domestic dog, Canis lupus familiaris, Riede and Fitch 1999; southern elephant seals, Mirounga leonina, Sanvito et al. 2007).
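
The link between vocal tract length and formant dispersion can be made concrete with the usual uniform-tube idealization (our sketch, not a model taken from the studies cited above): a tract of length L closed at the glottis has formants near odd multiples of c/4L, so the average spacing between formants is roughly c/2L and shrinks as the tract gets longer.

    C_TRACT = 350.0  # approximate speed of sound in the warm, humid vocal tract, m/s

    def formant_dispersion(tract_length_m):
        """Approximate formant spacing (Hz) for a uniform tube of the given length (m)."""
        return C_TRACT / (2.0 * tract_length_m)

    print(formant_dispersion(0.10))  # ~1750 Hz for a 10-cm tract (smaller animal)
    print(formant_dispersion(0.30))  # ~580 Hz for a 30-cm tract (larger animal)
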
As a result, information about the sender’s Affiliative calls can indicate a welcoming, or
body size, sex, age, and sometimes rank can be “I am fond of you” context. For example, familiar
acquired from their vocalizations. Sounds from elephants meeting each other after a long separa-
small or young animals are typically higher in tion may trumpet for pleasure/joy (a high state of
frequency than those of larger or older animals arousal). They also murmur to a friend, infant, or
(see Riondato et al. 2021 for an exception). Some- person they like who has been close, indicating a
times rank information is used by females low level of arousal but a similar emotion (Kiley-
selecting males. For example, the “roar” of the Worthington 2017).
male Red deer (Cervus elaphus) contains infor- Aggressive calls include territorial calls and
mation on its sex and size. The larger the animal, calls used as threats, and like affiliative calls, the
the lower the frequency of the roar. Females agnostic call structure can change because of
chose mates based on their roar and have been arousal. A highly aroused bull (Bos taurus), for
found to prefer the roars of larger males (Charlton example, will give visual signals: pawing, lower-
et al. 2007). The signaler’s dominance rank can ing his head withdrawing his chin and rubbing his
also be signaled using size-related formants (e.g., horns in the earth, at the same time as roaring. At
male fallow deer, Dama dama, Vannoni and the highest level of threat, the roar has a vocalized
McElligott 2008; and baboons, Papio ursinus, inspiration as well as a vocal expiration known as
Fischer et al. 2004). As the sender’s features do a “see saw” call (Kiley 1972).
not change (e.g., their sex), or change slowly over
time (e.g., their size or age), it is known as static
information. 11.6.3 Context-Dependent Meanings

Context-dependent communication is where the


11.6.2 Dynamic Information same signal may be used in different contexts but
has different meanings. For example, a male east-
A second type of information is known as ern kingbird (Tyrannus tyrannus) emits a “kitter”
dynamic. This information relates to the sender’s call-in three different contexts: (1) when the bird
motivation or arousal. Dynamic, or context- is indecisive or concerned about attempting to
dependent calls, follow a motivational code approach some object (to perch, mate, or toward
(Morton 1977). A loud or long sound, for exam- another bird), (2) when lone males fly from perch
ple, is associated with the signaler experiencing to perch in a new delimited territory, or (3) as an
high arousal that may be due to aggression, fear, appeasement signal by the male when
frustration, distress, or pain. Signalers in hostile approaching his mate. Another example is the
contexts tend to emit longer, lower-frequency familiar roar of a lion (Panthera leo) that—from
“harsh” (broadband) sounds which can signify the viewpoint of a human—is a spectacular vocal
signaler size. These sounds function to mediate display during aggressive interactions. However,
aggressive interactions between it and the the call also helps individuals belonging to the
receiver. High tonal sounds, that mimic infant same pride find, and identify, each other and can
sounds, are more likely to be emitted in appeasing serve as a bonding signal for members of a pride
(fearful) contexts given they potentially have an to gather. It can also separate neighboring pride.
“appeasing” effect on the receiver. Distress calls Affiliative calls can also be food calls (Kondo
(often “scream” or “whistle-like” vocalizations) and Watanabe 2009). Food calls can be context-
are used when “fear” and “aggression” are dependent given these signals are directed at other
conflicting motivations. A short quiet signal is conspecifics and can indicate the presence of
food. The variation in these food calls can indicate food quality and quantity. For example, spider monkeys (genus Ateles) are known to produce a higher call rate in response to greater quantities and quality of food. Acoustic signals can attract group members to food locations and these calls can also be used to protect the food resource from others (Clay et al. 2012). These authors examined food-associated calls made by some birds and mammals (see page 326, Table 11.1 in Clay et al. 2012) and found that most species did not produce unique calls for different foods. More commonly, signalers varied their calling rate to advertise food quality or abundance.

Therefore, context-dependent vocalizations may not necessarily convey information about the type of situation but can act as an analogue system to inform the recipient about the general level of arousal of the sender, and consequently, how (or if) to respond. In some species, calls are graded, meaning that there are intermediates between one call and another. Humpback whales, for example, use a repertoire of graded signals and the use of these signals is likely related to the motivation and arousal of the signaler (Dunlop 2017). "Grumbles" and "snorts" are used by females and their calf while migrating by themselves and presumably in a low-arousal context. Female–calf pairs can be joined by male escorts and form a competitive group, where males are fighting for access to a breeding female. In these groups, where arousal level is much higher, "grumbles" turn into harsh-sounding "roars" and "purrs," and become more modulated to sound more like "groans" and "moans."

Different levels of graded calls can be given in one situation. For example, cattle may give a low "mmmmm" call when in close contact with other cattle. On opening its mouth, the sound has an added syllable: "en" to "mmen." When it is sufficiently aroused, a "hh" syllable is added, which is the result of letting the remaining air out of her respiratory tract. This can change even further with higher excitement or arousal by being repeated. Finally, at the highest level of arousal, the inspiratory phase of the call is also vocalized (Table 11.1). This is a very different type of auditory communication from context-independent calls such as human language, where auditory communication can reflect either or both internal and environmental contexts, or come from some thought or idea generated by cognition.

Table 11.1 The variety of situations that give rise to the major call types of Bos taurus (reproduced from Kiley 1969)
Situation/call mm men menh (m)enENh SeeSaw A (no inspir) SeeSaw B (+inspir)
Confident greeting + + +
Greeting equals + + + +
Defensive threat + + +
Aggressive threat + + + +
Fear + +
Close contact retain +
Tactile stimulation +
Isolation + + + + +
Startle
Pain/fear + + +
Frustration + + + + + +
Anticipation pleasant + + + + + +
Anticipation unpleasant + + + + + +
Disturbance + + + + + +
406 R. Dunlop et al.

predominate frequency, call length, duration, relate to the context, also emit a specific “warning
amplitude or loudness, and the repetition rate bark”—a context-independent short sharp call
found in both acoustic and vibrational signals. that is difficult to locate as an alarm call (Kiley
These characteristics, combined with the presence 1972). This alarm call works to conceal the posi-
of harmonics, form patterns that are often charac- tion of the signaler but conveys that a disturbing
teristic of a species or individual. As a result, object has been sighted.
other animals are likely to be able to identify The importance of altruism (or lack of it) when
individuals from their calls, as we can with vocalizing has been investigated within the con-
human voices. For example, many species of text of emitting alarm calls and food calls. For
vespertilionid bat can be identified by time and example, studies have shown that, even those
frequency characters measured from their echolo- calls that are difficult to locate (ventriloquial
cation calls (Gannon et al. 2003). Individual rec- calls), will increase the chances of being detected
ognition is also evident in bats. Playback by a predator (Fig. 11.11). However, studies on
responses in common vampire bats (Desmodus kinship and altruism have yet to relate the ease of
rotundus) suggested they vocally recognized locating an alarm call by a predator to the rate of
individual bats, given they were biased toward vocalizations and to actual predation (Reznikova
callers that had fed them more (food sharing), 2019). Still, it seems that coterie members of
but not biased toward kin (Carter and Wilkinson prairie dogs (Cynomys ludovicianus) alert others
2016). Crickets (Teleogryllus spp.) can be to the presence of potential predators using alarm
differentiated based on the amplitude and repeti- calls, and that these alarms significantly reduce
tion of their call, not just their call “note” (that is, predation (Wilson-Henjum et al. 2019).
the fundamental). The mean frequency of this Functionally referential signals are those that
signal is approximately 4 kHz, but the pattern provide very specific information. They are struc-
and call rate increase as the cricket’s motivation turally distinct and reflect a stimulus-specific
changes from “calling” to “encountering” to meaning used only in a very specific set of
“fighting” to “courtship” and finally “copulating.” circumstances. Most alarm calls are nonspecific,
but the vervet monkey (Chlorocebus
pygerythrus), uses a lexicon of four or five sounds
11.6.5 Context-Independent to identify the type of intruder. When a major bird
Meanings or mammal predator is nearby, the vervet
produces a “chirp” and “bark” (Strusaker 1966).
Some calls in animals, like human language, have When a snake is nearby it evokes a special
a specific meaning, whatever the context. These “chutter” call, a minor bird or mammalian preda-
calls often include alarm calls used to alert a tor is indicated by an abrupt “uh” or “nyow”
group to danger of an approaching predator, terri- sounding signal, and a major bird predator elicits
torial invader, or other “alarm” in the caller’s a “rraup.”
environment. The alarm call may elicit a response Distress calls can be context independent, such
by recipients to retreat, freeze in place, or conduct as the calls used by young to attract adults to their
defensive behavior. Slobodchikoff et al. (2009) location. African wild dog (Lycaon pictus) pups,
discussed the complexity of alarm calls in prairie for example, emit a “lamenting call” when they
dogs (Cynomys gunnisoni) in the southwestern are deserted by their parents. Precocial birds, such
United States. He and his students have found as domestic fowl, ducks, or geese, “pipe” in the
that prairie dogs are precise in their signaling same way as when they are cold or hungry.
and can communicate a description of the preda- Young, collared lemmings (Dicrostonyx
tor, its size, its speed, and even its color. Wild groenlandicus) emit ultrasonic chirps when they
boars (Sus scrofa) use context-dependent calls, are abandoned, cold, or feel as if they are in
such as “grunts” and “screams,” whose meanings danger (Sales and Pye 1974). Young primates,
11 Vibrational and Acoustic Communication in Animals 407

Fig. 11.11 Young prairie


dogs (Cynomys
ludovicianus) at Rocky
Mountain Arsenal National
Wildlife Refuge,
Commerce, CO, USA. One
pup giving a yipping call.
US Fish and Wildlife
Service Photo Credit: Rich
Keen at RMA; https://
commons.wikimedia.org/
wiki/File:Yipping_Prairie_
Dog_Pups.jpg. Licensed
under CC BY 2.0; https://
creativecommons.org/
licenses/by/2.0/

including humans, shriek or scream when


threatened or abandoned.

11.6.6 Songs

Songs are composed of call notes that have been


elaborated in structure and length. The main func-
tion of song is to identify the singer as a member
of a species, sexually mature, on a territory, prone
to territorial defense, and ready for courtship.
Song refers to the melodic quality (with
harmonics) of songs, as opposed to broadband
“noise,” and bird song is often analyzed into
themes and phrases, where researchers try to
interpret the meaning or function of the different Fig. 11.12 Male indigo bunting (Passerina cyanea)
phrases. Marler and Tamura (1964) and Marler produces a song where certain elements of the song pro-
vide meaning to the listener. Photo “IndigoBuntin-
and Doupe (2000) believe that certain parts of the gonPlant.jpg” by Kevin Bolton; https://wordpress.org/
song contain certain types of information and that openverse/image/15bcd71f-0728-4bda-8122-
birds decode the songs. Emlen (1972) experimen- 38fcf4a82ce6/. Licensed under CC BY 2.0; https://
tally modified the songs of male indigo buntings creativecommons.org/licenses/by/2.0/
(Passerina cyanea), and based on responses to
playbacks, could identify the meaning of certain song in other populations. Within each popula-
elements in the song (Fig. 11.12). tion, the song structure changes gradually over
The male humpback whale is a well known the mating season and between years. A call unit
marine singer. Males within each population of can drop out of the repertoire, be replaced with
whales sing the same song, but each population of another unit, or units can be added. These
whales has its own unique song (rather like a changes are known as song evolutions, as the
dialect), which can sound different from the song structure evolves gradually within a
408 R. Dunlop et al.

population over time. Songs can also completely Duetting can be especially important within
change between 1 year and the next, known as a environments, such as in dense vegetation,
song revolution. This is thought to be due to the where birds cannot see each other. By duetting,
influx of males from a different population, car- pairs keep close to each other, and in synchrony,
rying with them their own song. Males from the so when conditions in a variable environment
original population then pick up and learn this become right, mating can be achieved quickly
new song causing the song within that population and efficiently. In most gibbon species (family
to completely change (Noad et al. 2000). Hylobatidae), males, and some females, sing
A duet is an exchange of sounds or substrate- solos that function to attract mates and advertise
borne vibrations between a pair of animals often their territory. If a male and female like one
produced in rapid succession (Fig. 11.13). The another’s song, they will find each other and
duet may be so rapid, that it is difficult to distin- conduct a short mating dance followed by a long
guish which animal is producing the various vigorous mating ritual. The song dialect is used to
parts. It functions as a contact-maintaining signal identify the singing gibbon’s species and the area
and individual mated pairs within a species can it is from. Therefore, duetting also reduces
develop their unique duet helping them to main- hybridization with closely related species (Mitani
tain contact with their partner. Duets are especially and Marler 1989).
common in frogs, birds (cranes, sea eagles, geese,
quail, grebes, woodpeckers, barbets, megapode
scrub hens, kingfishers, ravens, cuckoo-shrikes, 11.6.7 From Chorusing to Copulation
and honey-eaters), tree shrews (mammalian order
Scandentia), and siamang (Symphalangus Males that chorus (e.g., frogs, toads, and insects
syndactylus), as well as being common in major such as locusts (order Orthoptera) and cicadas
groups of insects that communicate via substrate- (order Hemiptera)), attract females to a localized
borne vibrations. Species that perform duets often area. A classic example of this are the periodical
are monogamous (such as siamangs) and the two cicadas (Magicicada sp.). Millions of 17-year
sexes resemble each other in appearance (that is, cycle cicada gather to mate in forests in the east-
they are not dimorphic). ern United States. Males aggregate into chorus
Duets are used when mated pairs are required centers and attract mates by producing high-
to remain in touch over long periods of time. intensity sounds (Fig. 11.13). The desert locust

Fig. 11.13 A duet of


ravens (Corvus corax).
Photo “Ravens’ Duet” by
Ron Mead; https://www.
flickr.com/photos/
14093853@N04/
2678807340 . Licensed
under CC BY 2.0; https://
creativecommons.org/
licenses/by/2.0/
11 Vibrational and Acoustic Communication in Animals 409

Fig. 11.14 Desert locusts


(Acrididae) emerge and go
into flight en masse. Photo
“Locust” by [nivs]; https://
www.flickr.com/photos/
42805979@N00/
34263361. Licensed with
CC BY-SA 2.0; https://
creativecommons.org/
licenses/by-sa/2.0/

(Schistocerca gregaria) forms one of the most


intense swarms (Fig. 11.14), and can be found
in countries such as Kenya, Somalia, India, and
Saudi Arabia. Their loud chorusing is a means of
sexual advertisement. BBC News reported on the
“biblical locust plagues of 2020”, when these
insects swarmed in large numbers in East Africa
(BBC News 2020).
The gecko Ptenopus garrulus produces loud
continuous chirruping during a dusk chorus
(Walker, 1998). These calls strengthen social
bonding during sexual and courtship activities
and are often produced together with visual and
tactile behaviors.
An example of a more spatially contained
event used by male sage grouse (Centrocercus
urophasianus) to attract mates acoustically and
visually is leks. Male sage grouse form large
courtship leks in a social arena to produce elabo- Fig. 11.15 Male Greater Sage-Grouse (Centrocercus
urophasianus) by USFWS Pacific Southwest Region;
rate visual displays with their gular pouches and https://www.flickr.com/photos/54430347@N04/
the accompanying sounds of “swish-swish-coo- 6928668188. Licensed under CC 2.0; https://
oo-poink” (Fig. 11.15; Bush et al. 2010). This creativecommons.org/licenses/by/2.0/
study (p. 343) found that despite lekking behav-
ior, male–male competition was spread out spa-
tially and females often covered the entire social inside a burrow he constructs in the soil, produc-
arena before copulating. Leks also are increas- ing an airborne (sound) component that signals to
ingly being recognized in invertebrates that com- fly females as a sexual advertisement. The same
municate through substrate-borne vibrations, stridulation event has a substrate-borne compo-
such as the prairie mole cricket (Gryllotalpa nent (vibration) that is used by nearby males to
major). In this species, a male stridulates from aid in spacing their burrows (Hill 1999).
410 R. Dunlop et al.

After mate attraction, comes copulation. Ovu- chimp Pan troglodytes with their son and treated
lation in female alpacas (Vicugna pacos) is her similarly. At the end of several years,
thought to be simulated during copulation, although their son was talking, the chimp found
where the male produces a loud “orrgle” for great difficulty making human sounds, and man-
30 to 45 minutes while mounting the female aged only “mama.” The conclusion was that the
(Abba et al. 2013). Even after copulation, calling chimp’s inability to learn language implied that
may continue, where the tree frog Phyllomedusa chimps have lower intelligence than humans.
(Hylidae) gives a separate call after oviposition. However, later it was discovered that the reason
for her difficulty in making speech sounds was
not a mental/cognitive lapse, it was physiological.
11.7 Comparing Human Language She did not have the necessary muscles to control
to Nonhuman Auditory the sophisticated movements of the tongue, lar-
Communication ynx, buccal and nasal cavities in order to make the
different sounds (Lyn 2012). More recently, Fitch
Despite the phenomenal array of different types (2011) has argued that humans have what he
of auditory communication in the different spe- called a “language ready brain.” However,
cies, what are the defining characteristics of Savage-Rumbaugh et al. (2009) argue strongly
human language? Human language involves the that human language may not be any more
use of vocal sounds that are symbolic of sophisticated than ape languages. This is
meanings, and therefore context independent. supported by the recognition of the many mental
Thus, human language can be understood in the homologies between humans and other mammals
total absence of the communicator, such as when (e.g., Kiley-Worthington 2017).
written, or when heard on the telephone. Since the middle of the twentieth century, the
There is a vast literature on human language, distinguishing features found in human language
and a whole field of study: linguistics. Many have been widely discussed, and the synopsis
scientists believe that the development of human developed by Hockett (1960) is still widely
language was the most important evolutionary adhered to. The first question is to what degree
step in distinguishing humans biologically. It is these defining features are found in other species
also widely maintained that development of (Table 11.2).
human language was responsible for the further This list has been elaborated, extended, and
cognitive development of humans. Interestingly, modified, to include tactile, visual, taste, and
nonhumans respond to general sounds and olfactory communication (e.g., Christin 1999).
emotions in human language. More recent work The vocal repertoire of many species has been
has shown that some primates, dogs, marine shown to fulfill most of these characteristics, and
mammals, horses, and elephants comprehend a list of some of the most pertinent studies is
individual words and phrases. In fact, with expe- given here (e.g., Fitch 2011; Herman et al. 1984;
rience, they understand a great deal more human Schusterman and Kastak 1998; Nehaniv and
language than we previously assumed (e.g., de Dautenhahn 2002; Rendell and Whitehead 2001;
Waal 2016; Kiley-Worthington 2017). Young Christiansen and Kirby 2003).
human or nonhuman mammals do not only learn To simplify the differences between human
the meaning of words by conditioning as the spoken language, and communication attributes
behaviorists believed (Skinner 1957), but they of other species, there are two human
also learn by observing others, imitation, and specializations. The first is that the human spoken
learning about cause and effect. language, unlike auditory communication of
One of the first experiments to test if many other species (although not all), is mainly
nonhumans could learn to speak a human lan- (but not exclusively) context independent. That
guage was the Kelloggs’ studies (Kellogg and is, the same word means the same thing in any
Kellogg 1933). This family raised a young context. Humans have developed this
11 Vibrational and Acoustic Communication in Animals 411

Table 11.2 Design features of human language and whether they have been recorded in other species. The species listed
here are only examples, since there are others for which better evidence exists
Design features Humans Chimpanzees Horses Elephants
PRODUCTIVITY + + + +
Different components together at different times
ARBITRARINESS + + + +
Different responses to same display
INTERCHANGEABILITY + + + ?
One display triggers another
SPECIALIZATION + + + +
Not directly related to consequences
DISPLACEMENT + + + +
Key features not related to antecedents
CULTURAL TRANSMISSION + + + +
Differences between populations as a result of learning
DUALITY + + ? ?
Symbols form sentences; components of expression contribute to
whole interpretation

characteristic much further than other species,


and as a result, the meaning of what they are
saying can be assessed whatever the situation,
whether it be on the telephone, read, or written.
However, it is true that many words can have
multiple meanings or are used in specific
contexts. Furthermore, using the same word in
different communication contexts can change its
meaning. Meanwhile, primate alarm calls seem to
share a lot of features of words. The other impor-
tant characteristic is that human language is
highly symbolic. Again, this is not a unique char-
acteristic of human language. For example,
movements such as a horse swishing his tail, Fig. 11.16 Mandarin ducks (Aix galericulata) perform a
specialized courtship routine. The males shake and bob
which may mean he will kick you, and ritualized their heads, as well as mocking drinking and preening,
displays, such as the courtship preening of Man- while raising their crest and orange sail feathers to “show
darin ducks (Aix galericulata; Fig. 11.16) are also off.” They also incorporate sound into their courtship in
highly symbolic. However, humans have taken the form of a whistling call. “Mandarin duck” by Tambako
the Jaguar; https://www.flickr.com/photos/
symbolism further so that symbols can be built 8070463@N03/853400195. Licensed under CC BY-ND
on top of each other. For example, one dog can be 2.0; https://creativecommons.org/licenses/by-nd/2.0/
seen to be a dog and only one, but it can also be
represented by a 1. Another 1 can be added,
which is represented as 2. This led to the emer- language. This includes teaching chimpanzees
gence of mathematics, and to further symbolic sign languages, and more recently, to use com-
links in formulae culminating in our explanations puter symbols. Interestingly Washoe, one of the
of gravity or electricity and other phenomena in first chimps, was taught American Sign Lan-
the world. guage. This chimp eventually managed to com-
Some research has concentrated on teaching bine symbols to produce new meanings. For
apes and marine mammals to develop and use a example, when asked what a duck was when
language that has features characteristic of human swimming in the water, she signed it was a
412 R. Dunlop et al.

“water bird” (Gardner and Gardner 1984). Gluck investigate this before it is too late and many
(2016), in his account of grappling with central species have become extinct due to our actions,
philosophical problems in animal ethics, most of which are the consequences of human
recollects one of his weekly lab meetings language.
(he was part of a research lab known for numer-
ous breakthroughs in psychology and animal
behavior) where the graduate students would dis-
11.8 Summary
cuss their research and topics of the day; signing
chimps was a hot topic at the time. He noted that
With modern technological aids and further stud-
one of the students, a bit of a maverick, inquired
ies, the study of acoustic and substrate-borne
whether the chimp ever asked “Can I go home
vibrational communication has advanced consid-
now?” or “Can I leave?” Gluck and the other
erably since Busnel’s (1963) seminal work. The
students dismissed this as foolhardy and would
origins of acoustic communication are likely to be
spend the next two decades exploring how pri-
from sounds associated with moving about in the
mate models could inform human biomedical and
environment and breathing in and out through
behavioral science. But that is still the question of
respiratory passages. These sounds have become
our time. If a captive animal could, would they
specialized for communication. Likewise, as
ask to be released? Would they ask “Why are you
animals move, regardless of how quietly, the
doing this to me?” These animal-intensive tests
motions lead to vibrations through the substrate
came under extreme criticism from other
that can be detected by others of the same or
scientists (Terrace 1985). Since then, a gorilla,
different species. Responses to these vibrations
bonobos (Pan paniscus), and other chimps, have
by others are reinforced or are lethal to the
learned to use computer symbols as a human-type
receiver, but likely also inform the sender. The
language (Hopkins and Savage-Rumbaugh
first step is for the sounds or vibrations to become
1991). Kenneally explored the origin of the first
ritualized, leading to displays. The development
word, and speculated on which great apes might
of the necessary sending and receiving structures,
have been capable of speaking the first word.
such as the larynx or the insect tymbal, and a
Among other things, she said that such a speaker
sensory apparatus such as the ear or subgenual
would have to have the anatomical and physio-
organ, facilitated the evolution of an extremely
logical capacity for speech, but they would also
diverse range of auditory and vibratory signals
have to have something to say. In her view, this
and cues, of which only some are described here.
probably eliminated chimps, which she thought
Auditory and vibratory communication each
were immature and lacking in focus, rather than
has advantages and disadvantages. Though a sig-
cognitively limited (Kenneally 2007).
nal can travel through substrates, meaning the
Thomas Nagel’s (1974) thought-provoking
signaler does not have to be in visual range, it
question “What is it like to be a bat?” argues
can be overheard by others. Atmospheric
that humans might imagine what it is like to be
conditions can influence the signal and other
another being but can never know the conscious
sounds/vibrations can mask it. Geographic sepa-
mental state to be that species, or even another
ration of animals within a population can cause
human. We can look at systems, patterns, and
auditory and vibrational signals to evolve over
responses, but each species and every human
time into different dialects and cultural waves.
retain their own secrets and have their own
This variation can eventually separate animals
experiences. That does not mean we should not
within a species into different populations. One
try to understand nonhuman auditory and vibra-
thing that is becoming increasingly clear is that
tional communication signals. These different
there is not much time to uncover more about the
world views, or knowledge of the world, lead us
complexities of auditory and substrate-borne
to a study of the epistemology of different spe-
vibrational communication in nonhumans before
cies. Let us hope that we begin seriously to
11 Vibrational and Acoustic Communication in Animals 413

the behavior of our species, as human language Brumm H, Schmidt R, Schrader L (2009) Noise-
users, has led to the extinction of many species. dependent vocal plasticity in domestic fowl. Anim
Behav 78:741–746
Bush K, Aldridge CL, Carpenter JE, Paszkowski CA,
BoyceM CDW (2010) Birds of a feather do not always
Lek together: genetic diversity and kinship structure of
References greater sage-grouse (Centrocercus urophasianus) in
Alberta. Auk 127(2):343–353. https://doi.org/10.
Abba MA, Bianchi C, Cavilla V (2013) Chapter 15— 1525/auk.2009.09035
South American Camelids. In: Tynes VV (ed) The Busnel RG (ed) (1963) Acoustic behavior of animals.
behavior of exotic pets. Wiley-Blackwell, New York Elsevier, New York
Amos JN, Harrisson JA, Radford JQ, White M, Newell G, Carter GG, Wilkinson GS (2016) Common vampire bat
Nally NM, Sunnucks P, Pavlova A (2014) Species-and contact calls attract past food-sharing partners. Anim
sex-specific connectivity effects of habitat fragmenta- Behav 116:45–51. https://doi.org/10.1016/j.anbehav.
tion in a suite of woodland birds. Ecology 95(6): 2016.03.005
1556–1568. https://doi.org/10.1890/13-1328.1 Catchpole CK, Slater PJB (2008) Bird song. Biological
Aubin T, Jouventin P (1998) Cocktail-party effect in king themes and variations, 2nd edn. Cambridge University
penguin colonies. Proc R Soc B 265:1665–1673 Press, Cambridge
Bain DE, Dahlheimm ME (1994) Effects of masking noise Charlton BD, Reby D, McCombe K (2007) Female per-
on detection thresholds of killer whales. In: Bain DE, ception of size-related formant shifts in red deer
Dahlheim ME (eds) Marine mammals and the Exxon (Cervus elaphus). Anim Behav 74:707–714
Valdez. Elsevier, New York Christiansen MH, Kirby S (2003) Language evolution.
Baptista LF, Gaunt SSL (1997) Social interaction and Oxford University Press, New York
vocal development in birds. In: Snowden CT, Christin AM (1999) Les origines de l’écriture: Image,
Hausberger M (eds) Social influences on vocal devel- signe, trace. Le Débat 106(4):28. https://doi.org/10.
opment. Cambridge University Press, Cambridge 3917/deba.106.0028
Baze W (1950) Just elephants. Corgi Books, London Clay Z, Smith CL, Blumstein DT (2012) Food-associated
Bee MA (2007) Sound source segregation in grey vocalizations in mammals and birds: what do these
treefrogs: spatial release from masking by the sound really mean? Anim Behav 83:323–330
of a chorus. Anim Behav 74(3):549–558. https://doi. Cocroft R, Gogala M, Hill PSM, Wessel A (2014) Study-
org/10.1016/j.anbehav.2006.12.012 ing vibrational communication. Springer, Berlin
Bennet-Clark H (1998) How cicadas make their noise. Sci Cooper LD, Randall JA (2007) Seasonal changes in home
Am 278(5):58–61. http://www.jstor.org/stable/ ranges of the giant kangaroo rat (Dipodomys ingens): a
26057783. Retrieved 7 Feb 2021 study of flexible social structure. J Mammol 88:1000–
Bennet-Clark HC (2000) Resonators in insect sound pro- 1008. https://doi.org/10.1644/06-MAMM-A-197R1.1
duction: how insects produce loud pure-tone songs. J Cynx J, Lewis R, Tavel B, Tse H (1998) Amplitude
Exp Biol 202:3347–3357 regulation of vocalizations in noise by a songbird,
Benney KS, Braaten RF (2000) Auditory scene analysis in Taeniopygia guttata. Anim Behav 56(1):107–113.
estrildid finches (Taeniopygia guttata and Lonchura https://doi.org/10.1006/anbe.1998.0746
striata domestica): a species advantage for detection David Mech LH, Boitani L (eds) (2003) Wolves: behavior,
of conspecific song. J Comp Psychol 114:174–182 ecology, and conservation. University of Chicago
Bradbury JW, Vehrencamp SL (1998) Principles of animal Press, Chicago, p 472
communication. Sinauer Associates, Sunderland, MA De Waal F (2016) Are we smart enough to know how
British Broadcasting Corporation (BBC) News (2020) The smart animals are? W W Norton & Company,
biblical locust plagues of 2020. https://www.bbc.com/ New York
future/article/20200806-the-biblical-east-african- Dent ML, Larsen ON, Dooling RJ (1997) Free-field bin-
locust-plagues-of-2020 aural unmasking in budgerigars (Melopsittacus
Brownell P, Farley RD (1979) Orientation to vibrations in undulatus). Behav Neurol 111(3):590–598. https://
sand by the nocturnal scorpion Paruroctonus doi.org/10.1037/0735-7044.111.3.590
mesaensis: mechanisms of target localization. J Comp Dunlop RA (2017) Potential motivational information
Physiol A 131:31–38 encoded within humpback whale non-song vocal
Brumm H, Todt D (2002) Noise-dependent song ampli- sounds. J Acoust Soc Am 141(3):2204–2213. https://
tude regulation in a territorial songbird. Anim Behav doi.org/10.1121/1.4978615
63(5):891–897. https://doi.org/10.1006/anbe.2001. Dunlop RA, Cato DH, Noad MJ (2010) Your attention
1968 please: increasing ambient noise levels elicits a change
Brumm HB, Zollinger SA (2011) The evolution of the in communication behaviour in humpback whales
Lombard effect: 100 years of psychoacoustic research. (Megaptera novaeangliae). Proc R Soc B 277(1693):
Behav 148(11–13):1173–1198. https://doi.org/10. 2521–2529. https://doi.org/10.1098/rspb.2009.2319
1163/000579511X605759
414 R. Dunlop et al.

Dunlop RA, Cato DH, Noad MJ (2014) Evidence of a Popper AN (eds) Hearing and sound communication in
Lombard response in migrating humpback whales amphibians, vol 28. Springer, New York, pp 113–146
(Megaptera novaeangliae). J Acoust Soc Am 135(1): Gluck JP (2016) Voracious science and vulnerable
430–437 animals. A primate scientist’s ethical journey. Univer-
Emlen ST (1972) An experimental analysis of the sity of Chicago Press, Chicago, IL
parameters of bird song eliciting species recognition. Gordon SD, Tiller B, Windmill JFC, Krugner R, Narins
Behaviour 41(1/2):130–171 PM (2019) Transmission of the frequency components
Endler JA (2014) The emerging field of tremology. In: of the vibrational signal of the glassy-winged sharp-
Cocroft R, Gogala M, Hill PSM, Wessel A (eds) shooter, Homalodisca vitripennis, in and between
Studying vibrational communication. Springer, Berlin, grapevines. J Comp Physiol 205:783–791
pp vii–vix. https://doi.org/10.1007/978-3-662-43607-3 Henry L, Barbu S, Lemasson A, Hausberger M (2015)
Evans CS, Marler P (1991) On the use of video images as Dialects in animals: evidence, development and poten-
social stimuli in birds – audience effects on alarm tial functions. Anim Behav Cogn 2(2):132–155.
calling. Anim Behav 41:17–26 https://doi.org/10.12966/abc.05.03.2015
Evans CS, Marler P (1994) Food calling and audience Herman LM, Richards DG, Wolz JP (1984) Comprehen-
effects in male chickens, Gallus gallus – their sion of sentences by bottlenosed dolphins. Cognition
relationships to food availability, courtship and social 16(2):129–219. https://doi.org/10.1016/0010-0277
facilitation. Anim Behav 47(5):1159–1170 (84)90003-9
Ey E, Pfefferle D, Fischer J (2007) Do age- and sex-related Heyes C, Dickinson A (1990) The intentionality of animal
variations reliably reflect body size in non-human pri- action. Mind Lang 5(1):87–104
mate vocalizations? A review. Primates 48:253–267 Hill PSM (1998) Environmental and social influences on
Farabaugh SM, Dooling RJ (1996) Acoustic communica- calling effort in the prairie mole cricket (Gryllotalpa
tion in parrots: laboratory and field studies of major). Behav Ecol 9(1):101–108. https://doi.org/10.
budgerigars, Melopsittacus undulatus. In: Kroodsma 1093/beheco/9.1.101
DE, Miller EH (eds) Ecology and evolution of acoustic Hill PSM (1999) Lekking in Gryllotalpa major, the prairie
communication in birds. Cornell University Press, mole cricket (Insecta: Gryllotalpidae). Ethology 105:
New York, pp 97–117. https://doi.org/10.7591/ 531–545
9781501736957 Hill PSM (2008) Vibrational communication in animals,
Fischer J, Kitchen D, Seyfarth RM, Cheney DL (2004) 1st edn. Harvard University Press, London
Baboon loud calls advertise male quality: acoustic Hill PSM, Wessel A (2016) Biotremology. Curr Biol 26:
features and relation to rank, age, and exhaustion. R181–R191
Behav Ecol Sociobiol 56:140–148 Hill PSM, Lakes-Harland R, Mazzoni V, Narins PM,
Fitch WT (1997) Vocal tract length and formant frequency Virant-Doberlet M, Wessel A (2019) Biotremology:
dispersion correlate with body size in rhesus macaques. studying vibrational behavior. Springer Nature,
J Acoust Soc Am 102:1213. https://doi.org/10.1121/1. Cham, Switzerland
421048 Hill PSM, Mazzoni V, Stritih Peljhan N, Virant-Doberlet
Fitch WT (2011) Unity and diversity in human language. M, Wessel A (2022) Biotremology: physiology, ecol-
Philos Trans R Soc Lond B Biol Sci 366(1563): ogy and evolution. Springer Nature, Cham,
376–388. https://doi.org/10.1098/rstb.2010.0223 Switzerland
Gannon WL, Lawlor TE (1989) Variation of the Chip Hoch H, Deckert J, Wessel A (2006) Vibrational signal-
vocalization of three species of Townsend chipmunks ling in a Gondwanan relict insect (Hemiptera:
(Genus Eutamias). J Mammal 70(4):740–753. https:// Coleorrhyncha: Peloridiidae). Biol Lett 2:222–224
doi.org/10.2307/1381708 Hockett CF (1960) The origin of speech. Sci Am 203:88–
Gannon WL, O’Farrell MJ, Corben C, Bedrick EJ (2003) 111
Call character lexicon and analysis of field recorded bat Hopkins WD, Savage-Rumbaugh ES (1991) Vocal com-
echolocation calls. In: Thomas J, Moss C, Vater M munication as a function of differential rearing
(eds) Echolocation in bats and dolphins. University of experiences in Pan paniscus: a preliminary report. Int
Chicago Press, Chicago, IL, pp 478–484 J Primatol 12(6):559–583
Gardner RA, Gardner BT (1984) A vocabulary test for Hulse SH, MacDougall-Shackleton SA, Wisniewski AB
chimpanzees (Pan troglodytes). J Comp Psychol 98(4): (1997) Auditory scene analysis by songbirds: stream
381–404. https://doi.org/10.1037/0735-7036.98.4.381 segregation of birdsong by European starlings (Sturnus
Garland EC, Goldizen AW, Rekdahl ML, Constantine R, vulgaris). J Comp Psychol 111:3–13
Garrigue C, Daeschler Hauser N, Poole MM, Huxley JS (1966) A discussion of ritualization of behavior
Robbins J, Noad MJ (2011) Dynamic horizontal cul- in animals and man. Philos Trans R Soc B 251:247–
tural transmission of humpback whale song at the 271
ocean Basin Scale. Curr Biol 21(8):687–691. https:// Jackson A, DeArment R (1963) The lesser prairie chicken
doi.org/10.1016/j.cub.2011.03.019 in the Texas panhandle. J Wildl Manag 27(4):733–737.
Gerhardt HC, Bee MA (2006) Recognition and localization https://doi.org/10.2307/379848
of acoustic signals. In: Narins PM, Feng AS, Fay RR,
11 Vibrational and Acoustic Communication in Animals 415

Janik VM, Slater PJB (2000) The different roles of social amplitude in plant-borne vibrational
learning in vocal communication. Anim Behav 60(1): communication. In: Cocroft RB, Gogala M, Hill
1–11 PSM, Wessel A (eds) Studying vibrational communi-
Keighley MV, Langmore NE, Zdenek CN, Heinsohn R cation. Springer, Berlin, pp 125–145
(2017) Geographic variation in the vocalizations of McKelvey EGZ, Gyles JP, Michie K, Barquín
Australian palm cockatoos (Probosciger aterrimus). Pancorbo V, Sober L, Kruszewski LE, Chan A, Fabre
Bioacoustics 26(1):91–108. https://doi.org/10.1080/ CCG (2021) Drosophila females receive male
09524622.2016.1201778 substrate-borne signals through specific leg neurons
Kellogg WN, Kellogg LA (1933) The ape and the child: a during courtship. Curr Biol. https://doi.org/10.1016/j.
comparative study of the environmental influence upon cub.2021.06.002
early behavior. Hafner Publishing, New York and Mitani JC, Marler P (1989) A phonological analysis of
London male gibbon singing behavior. Behaviour 109(1/2):
Kenneally C (2007) The first word: the search for the 20–45
origins of language. Viking, New York Morris D (1956) The function and cause of courtship
Kiley M (1969) The origin and evolution of some displays ceremonies in L’instinct en le comportement des
in canids, felids and ungulates with particular reference animeaux et des hommes. Fondation Singer-Polignac,
to causation. vol 1. Vocalisations, vol 2 Tail and ear Colloque International Sur L’instinct, Paris, pp
movements. D. Phil University of Sussex 261–266
Kiley M (1972) The vocalisations of ungulates canids and Morris D (1957) Typical intensity and its relationship to
felids with particular reference to their origin, causa- the problem of ritualization. Behaviour 11:12
tion and function. Zeit fur Tierpsychol 31:71–222 Morton ES (1977) On the occurrence and significance of
Kiley-Worthington M (2017) The mental homologies of motivational-structural rules in some bird and mammal
mammals. Towards an understanding of another sounds. Am Nat 111:855–869
mammals world view. Animals 7(12):87. Morton ES (1982) Grading, discreteness, redundancy and
Kondo W, Watanabe S (2009) Contact calls: information motivational-structural rules. In: Kroodsma DE, Miller
and social function. Jpn Psychol Res 51(3):197–208. EH (eds) Evolution and ecology of acoustical communi-
https://doi.org/10.1111/j.1468-5884.2009.00399.x cation in birds. Academic Press, New York, pp 183–212
Kroodsma DE (ed) (1982) Acoustic communication in Nagel T (1974) What is it like to be a bat? Philos Rev
birds. Production, perception and design features of 83(4):435–450
sounds, vol 1. Academic Press, New York Narins PM (1990) Seismic communication in anuran
Kroodsma DE, Miller EH (eds) (1982) Acoustic commu- amphibians. Bioscience 40:268–274
nication in birds. Song learning and its consequence, Narins PM, Lewis ER (1984) The vertebrate ear as an
vol 2. Academic Press, New York exquisite seismic sensor. J Acoust Soc Am 76:1384–
Lakes-Harlan R, Strauss J (2014) Functional morphology 1387
and evolutionary diversity of vibration receptors in Narins PM, Losin N, O’Connell-Rodwell CE (2009) Seis-
insects. In: Cocroft RB, Gogala M, Hill PSM, Wessel mic and vibrational signals in animals. In: Squire LR
A (eds) Studying vibrational communication. Springer, (ed) Encyclopedia of neuroscience. Elsevier,
Berlin, pp 277–302 Amsterdam, pp 555–559
Lyn H (2012) Apes and the evolution of language: taking Nehaniv C, Dautenhahn K (2002) Imitation in animals and
stock of 40 years of research. In: Vonk J, Todd K (eds) artifacts. MIT Press, Cambridge, MA
Shackelford Oxford handbook of comparative evolu- Noad MJ, Cato DH, Bryden MM, Jenner MN, Jenner KCS
tionary psychology, Chapter 19. Oxford University (2000) Cultural revolution in whale songs. Nature 408:
Press, Oxford. https://doi.org/10.1093/oxfordhb/ 537. https://doi.org/10.1038/35046199
9780199738182.013.0019 O’Connell-Rodwell CE, Arnason BT, Hart LA (1997)
Manabe K, Sadr EL, Dooling RJ (1998) Control of vocal Seismic transmission of elephant vocalizations and
intensity in budgerigars (Melopsittacus undulatus): movement. J Acoust Soc Am 102:3124
Differential reinforcement of vocal intensity and the O’Connell-Rodwell CE, Arnason BT, Hart LA (2000)
Lombard effect. J Acoust Soc Am 103:1190. https:// Seismic properties of Asian elephant (Elephas
doi.org/10.1121/1.421227 maximus) vocalizations and locomotion. J Acoust Soc
Marler P (1961) Logical analysis of animal communica- Am 108:3066–3072
tion. J Theor Biol 1:295–317. https://doi.org/10.1016/ O’Connell-Rodwell C, Guan X, Puria S (2019) Vibra-
0022-5193(61)90032-7 tional communication in elephants: a case for bone
Marler P, Doupe AJ (2000) Singing in the brain. Proc Natl conduction. In: Hill PSM, Lakes-Harlan R,
Acad Sci 97(7):2965–2967. https://doi.org/10.1073/ Mazzoni V, Narins PM, Virant-Doberlet M, Wessel
pnas.97.7.2965 A (eds) Biotremology: studying vibrational behavior.
Marler P, Tamura M (1964) Culturally transmitted patterns Springer Nature, Cham, Switzerland, pp 259–276
of vocal behavior in sparrow. Science 146:1483–1486 O’Farrell MJ, Corben C, Gannon WL (2000) Geographic
Mazzoni V, Eriksson A, Anfora G, Lucchi A, Virant- variation in the echolocation call of the hoary bat
Doberlet M (2014) Active space and the role of (Lasiurus cinereus). Acta Chiropterol 2(2):185–196
416 R. Dunlop et al.

Oliveira RF, McGregor PK, Latruffe C (1998) Know thine Savage-Rumbaugh S, Rumbaugh D, Fields W (2009)
enemy: fighting fish gather information from observing Empirical kanzi: The ape language controversy
conspecific interactions. Proc R Soc B 265(1401): revisited. Skeptic 15(1):25–33
1045–1049 Schaller GB (1964) The year of the gorilla. University of
Ota N, Soma M (2022) Vibrational signals in multimodal Chicago Press, Chicago, p 304
courtship displays of birds. In: Hill PSM, Mazzoni V, Scheiber IBR, Weiß BM, Kingma SA, Komdeur J (2017)
Stritih-Peljhan N, Virant-Doberlet M, Wessel A (eds) The importance of the altricial – precocial spectrum for
Biotremology: physiology, ecology and evolution. social complexity in mammals and birds – a review.
Springer Nature, Cham Front Zool 14:3. https://doi.org/10.1186/s12983-016-
Peake TM, McGregor PK (2004) Information and aggres- 0185-6
sion in fishes. Anim Learn Behav 32(1):114–121 Schusterman RJ, Kastak D (1998) Functional equivalence
Podos J, Warren PS (2007) The evolution of geographic in a California Sea lion: relevance to animal social and
variation in birdsong. In: Advances in the study of communicative interactions. Anim Behav 55(5):
behavior. Elsevier, New York 1087–1095. https://doi.org/10.1006/anbe.1997.0654
Potash LM (1972) Noise induced changes in calls of the Sebeok T (1977) How animals communicate. Indiana
Japanese quail. Psychonomic Sci 26(5):252–254 University Press, Bloomington
Randall JA (1984) Territorial defense and advertisement Skinner BF (1957) Century psychology series. In: Verbal
by footdrumming in bannertail kangaroo rats behavior. Appleton-Century-Crofts, New York.
(Dipodomys spectabilis) at high and low population https://doi.org/10.1037/11256-000
densities. Behav Ecol Sociobiol 16:11–20. https://doi. Slater PJB (1981) Cultural evolution in chaffinch song:
org/10.1007/BF00293099 process inferred from micro and macro geographical.
Randall JA, Lewis ER (1997) Seismic communication Biol J Linnaean Soc 42:135–147
between the burrows of kangaroo rats, Dipodomys Slater PJB (1986) The cultural transmission of bird song.
spectabilis. J Comp Physiol 181(5):525–531. https:// Trends Ecol Evol 1(4):94–97
doi.org/10.1007/s003590050136 Slater PJB (1989) Bird song learning: causes and
Ratnam R, Feng A (1998) Detection of auditory signals by consequences. Ethol Ecol Evol 1(1):19–46. https://
frog inferior collicular neurons in the presence of spa- doi.org/10.1080/08927014.1989.9525529
tially separated noise. J Neurophysiol 80:2848–2859. Slobodchikoff CN, Perla BS, Verdolin JL (2009)
https://doi.org/10.1152/jn.1998.80.6.2848 Prairie dogs: communication and community in an
Rendell L, Whitehead H (2001) Culture in whales and animal society. Harvard University Press,
dolphins. Behav Brain Sci 24(2):309–382. https://doi. Cambridge, MA
org/10.1017/S0140525X0100396X Smith WJ (1969) Messages of vertebrate communication.
Reznikova Z (2019) Evolutionary and behavioural aspects Science 165(3889):145–150
of altruism in animal communities: is there room for Strusaker TT (1966) Auditory communication among ver-
intelligence? In: Evolution: cosmic, biological, and vet moneys (Cercopithecus aethiops). In: Alternment
social. Almanac, Dublin SA (ed) Social communication among primates.
Riede T, Fitch WT (1999) Vocal tract length and acoustics Chicago University Press, Chicago, pp 281–384
of vocalization in the domestic dog, Canis familiaris. J Stumpner A, von Helversen D (2001) Evolution and func-
Exp Biol 202:2859–2867 tion of auditory systems in insects. Naturwis-
Riondato I, Gamba M, Tan CL, Niu K, Narins PM, senschaften 88:159–170
Yang Y, Giacoma C (2021) Allometric escape and Šturm R, Rexhepi B, López Díez JJ, Blejec A, Polajnar J,
acoustic signal features facilitate high-frequency com- Sueur J, Virant-Doberlet M (2021) Vibroscape – an
munication in an endemic Chinese primate. J Comp overlooked world of vibrational communication.
Physiol 207:327–336 iScience, revision in review
Roberts L, Howard DR (2022) Substrate-borne vibrational Sullivan J, Demboski JR, Bell KC, Hird S, Sarver B,
noise in the Anthropocene: From land to sea. In: Hill Reid N, Good JM (2014) Divergence with gene flow
PSM, Mazzoni V, Stritih-Peljhan N, Virant-Doberlet within the recent chipmunk radiation (Tamias). Hered-
M, Wessel A (eds) Biotremology: physiology, ecology ity 11:185–194. https://doi.org/10.1038/hdy.2014.27
and evolution. Springer Nature, Cham Sutton D, Nadler C (1974) Systematic revision of three
Sales G, Pye D (1974) Ultrasonic communication in Townsend Chipmunks (Eutamias townsendii). South-
animals. Chapman & Hall, London west Nat 19(2):199–211. https://doi.org/10.2307/
Salinas-Melgoza A, Wright TF (2012) Evidence for vocal 3670280
learning and limited dispersal as dual mechanisms for Terrace HS (1985) Animal cognition: thinking without
dialect maintenance in a parrot. PLoS One. https://doi. language. Philos Trans R Soc Lond B 308(1135):
org/10.1371/journal.pone.0048667 113–128. https://doi.org/10.1098/rstb.1985.0014
Sanvito S, Galimberti F, Miller EH (2007) Vocal signaling Vannoni E, McElligott AG (2008) Low frequency groans
of male southern elephant seals is honest but imprecise. indicate larger and more dominant fallow deer (Dama
Anim Behav 73:287–299 dama) males. PLoS One 3:e3113
11 Vibrational and Acoustic Communication in Animals 417

Virant-Doberlet M, Čokl A (2004) Vibrational communi- RR, Popper AN (eds) Hearing and sound communica-
cation in insects. Neotrop Entomol 33:121–134 tion in amphibians, vol 28. Springer, New York, pp
Virant-Doberlet M, Mazzoni V, de Groot M, Polajnar J, 44–86
Lucchi A, Symondson WOC, Čokl A (2014) Vibra- Wessel A, Mühlethaler R, Hartung V, Kuštor V, Gogala M
tional communication networks: eavesdropping and (2014) The tymbal: evolution of a complex vibration-
biotic noise. In: Cocroft R, Gogala M, Hill PSM, producing organ in the Tymbalia (Hemiptera excl.
Wessel A (eds) Studying vibrational communication. Sternorrhyncha). In: Cocroft R, Gogala M, Hill PSM,
Springer, Berlin, pp 93–123 Wessel A (eds) Studying vibrational communication.
Volodin IA, Matrosova VA, Frey R et al (2018) Altai pika Springer, Berlin, pp 395–444
(Ochotona alpina) alarm calls: individual acoustic var- Wiley RH (1983) The evolution of communication: Infor-
iation and the phenomenon of call-synchronous ear mation and manipulation. In: Halliday TR, PJB S (eds)
folding behavior. Sci Nat 105:40. https://doi.org/10. Animal behavior, volume 2, communication, vol 225.
1007/s00114-018-1567-8 W. H. Freeman, New York, pp 156–189
Walker SF (1998) Animal communication. In: Mey JL Wilson-Henjum GE, Job JR, McKenna MF et al (2019)
(ed) Concise encyclopedia of pragmatics. Elsevier, Alarm call modification by prairie dogs in the presence
Amsterdam, pp 26–35 of juveniles. J Ethol 37:167–174. https://doi.org/10.
Wells KD, Schwartz JJ (2006) The behavioral ecology of 1007/s10164-018-0582-8
anuran communication. In: Narins PM, Feng AS, Fay

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
Echolocation in Bats, Odontocetes,
Birds, and Insectivores 12
Signe M. M. Brinkløv, Lasse Jakobsen, and Lee A. Miller

12.1 Introduction to collect information about their surroundings


and concluded that “echolocation is an
Echolocation, a term coined by Griffin (1944, eye-opening discovery about animal behavior.”
1958), is an active sensory system. Echolocating Demonstrating echolocation behavior means
animals emit sound signals and perceive their showing that the animal uses echoes of their out-
surroundings by way of the returned echoes. going sounds to locate and identify objects in
Using this approach, echolocators can determine their path. Several robust protocols exist for
the direction and distance to an object, the type of assessing echolocation ability and capacity in ter-
object, and whether it is moving or stationary. restrial and marine animals (Griffin 1958; Norris
Echolocation (also known as biosonar) is used et al. 1961). Echolocation and ultrasound are not
by most bats, odontocetes (toothed whales), inherently linked. Many animals echolocate by
oilbirds, and some swiftlets to negotiate, respec- signals fully or partly composed of frequencies
tively, night skies, deep waters, or dark caves. In readily audible to humans, such as the clicks
addition, soft-furred tree mice use echolocation in of some odontocetes, certain bat species, and
darkness for orientation (He et al. 2021). These birds. Conversely, many non-echolocating
are all habitats characterized by limited visibility, animals use ultrasonic sounds for intraspecific
likely a key evolutionary driver for echolocation. communication.
Echo feedback may also provide functional sen- A primary advantage of echolocation is that it
sory abilities in shrews and tenrecs. allows animals to operate and orient in uncertain
The discovery of echolocation traces back to lighting conditions. At the same time, information
Lazzaro Spallanzani’s suggestion in 1794 that leakage is a primary disadvantage of echoloca-
bats could “see” with their ears. Griffin (1944, tion. The signals used in echolocation are audible
1958) verified this idea much later when he to many other animals, such as competing
demonstrated that bats produce ultrasonic sounds conspecifics, predators, and prey. The evolution-
ary arms race between echolocating bats and sev-
eral families of insects sensitive to ultrasound is a
S. M. M. Brinkløv classic example of predator–prey co-evolution
Department of Ecoscience – Wildlife Ecology, University (Miller 1983; Miller and Surlykke 2001). Some
of Aarhus, Aarhus C, Denmark
fishes (Alosinae) hear high-frequency sounds
e-mail: brinklov@ecos.au.dk
(Mann et al. 1997; Wilson et al. 2008), which
L. Jakobsen · L. A. Miller (*)
could suggest similarly co-evolving sensory
Department of Biology, University of Southern Denmark,
Odense, Denmark abilities between odontocetes and their fish prey
e-mail: lasse@biology.sdu.dk; Lee@biology.sdu.dk (Wilson et al. 2013).
# The Author(s) 2022 419
C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1,
https://doi.org/10.1007/978-3-030-97540-1_12
420 S. M. M. Brinkløv et al.

In this chapter, we review basic concepts about source levels are referenced to a distance of 1 m
echolocation, the variety of animals known to in front of the animal. Source levels of bats are
echolocate, the main types of echolocation signals variable, but generally higher in aerial-feeding bats
they use, and how they produce and receive those that fly and search for prey in the open sky (typi-
signals. The topic of perception by echolocating cally 100–130 dB re 20 μPa at 0.1 m). Bats that fly
animals is beyond the scope of this chapter. and forage in vegetation use lower-amplitude
signals. Among these, the so-called “whispering
bats” (e.g., slit-faced bats (Nycteridae), false vam-
12.2 Characteristics of Echolocation pire bats (Megadermatidae), and many New
Signals World leaf-nosed bats (Phyllostomidae)), emit
echolocation sounds at about 65–70 dB re
Echolocating animals use two broad classes of 20 μPa at 0.1 m (Jakobsen et al. 2013a). The
sounds. Toothed whales, rousette bats, and birds source level of a dolphin’s echolocation signal is
generate broadband clicks produced at varying several orders of magnitude greater than that of a
rates. The vast majority of bats, however, use bat’s signal, primarily owing to the different
tonal echolocation signals, characterized by lon- properties of the two media (see next section)
ger duration and either a constant frequency or, (Madsen and Surlykke 2014). Echolocation clicks
more commonly, frequency modulation (FM; i.e., of bottlenose dolphins (Tursiops truncatus) can
sweeping across several frequencies over time). reach source levels of 225 dB re 1 μPa at 1 m
With the exception of certain bat species, peak-to-peak (Au 1993, p. 78). Source levels of
echolocating animals time their outgoing pulses oilbirds (Steatornis caripensis) are around 100 dB
so the echo from a previous pulse does not over- re 20 μPa root-mean-square (rms) at 1 m (Brinkløv
lap with the next outgoing signal, especially dur- et al. 2017), corresponding to roughly 120 dB re
ing general orientation and searching for prey. 20 μPa at 0.1 m, which is comparable to estimates
This separation ensures that the strong outgoing from many bat species. Little has been
signal does not mask the fainter returning echoes documented about the source levels of swiftlets,
from the previous signal (Jen and Suga 1976; tenrecs, and shrews.
Kalko and Schnitzler 1989; Verfuss et al. 2009). Bats and toothed whales both emit the acoustic
Bats and odontocetes both show characteristic signal energy in a focused beam, with specific
changes in echolocation behavior as they vertical and horizontal transmission patterns,
approach objects. Notably, most species in both akin to an “acoustic flashlight” focused on a cer-
groups adjust the sound emission rate to the dis- tain search area. The open mouth of a bat, or the
tance of the target. The click rate increases as they nose in nasal-emitting bats, shapes the transmitted
approach objects and numerous species emit a beam (Hartley and Suthers 1987, 1989), which is
terminal buzz (i.e., a series of pulses or clicks in much broader than that of dolphins (Madsen and
rapid succession) during prey capture (Fig. 12.1). Surlykke 2014). The dolphin’s melon transmits
In bats, these temporal changes are accompanied the outgoing echolocation signals with a slightly
by a change from narrow to wider bandwidths elevated vertical beam above the rostrum
and lower to higher frequencies as they move (Au 1993). There is no information on signal
from an open to a cluttered aerial environment directionality from oilbirds or swiftlets.
or detect an airborne insect prey. Such pro-
nounced, systematic changes have not been
documented in oilbirds or swiftlets. 12.3 Differences in Echolocation
Echolocation signals are often much higher in Signals in Air and Water
amplitude than other sounds produced by animals.
Amplitudes of bat echolocation signals are typi- Only a few of the 71 known species of toothed
cally given at a reference distance of 0.1 m in front whales are proven to use echolocation, but by
of the mouth or nostril. For whales and birds, inference probably all of them do (Culik 2011),
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 421

Only a few of the 71 known species of toothed whales are proven to use echolocation, but by inference probably all of them do (Culik 2011), as do presumably more than 1000 species of bats.

Fig. 12.1 Echolocation sequence from a harbor porpoise (Phocoena phocoena) and a Daubenton's bat (Myotis daubentonii) as they approach and capture prey. Both species increase the rate of sound emission as they approach prey and emit a terminal buzz immediately before prey capture

For echolocators, there are three important differences between sound in air and sound in water: (1) density of the medium, (2) reflectivity of targets, and (3) maneuverability of the target (Madsen and Surlykke 2014). These differences severely influence the way echolocation has evolved in the two media (Au and Simmons 2007).

First, water is about 770 times denser than air: 1000 and 1.3 kg/m³, respectively, partly explaining why sound travels about 4.4 times faster in water than in air (1520 m/s versus 344 m/s). For the same frequency of sound, the wavelength in water is about 4.4 times longer than in air. Longer wavelengths limit detection to larger targets because reflection depends on the relationship between the wavelength of the impinging sound and the size of the reflecting object (Urick 1983; also see Chap. 5, section on reflection). Sound at a given frequency reflects more effectively from smaller objects in air than in water. For example, the wavelength of a 100-kHz signal is 3.4 mm in air, and 15 mm in water. Thus, a sphere with a circumference greater than 3.4 mm strongly reflects the 100-kHz sound in air, while in water, the sphere's circumference must be larger than 15 mm.

The absorption coefficient (see Chaps. 5 and 6 on sound propagation) of the medium is a function of several factors, but frequency is the most important for echolocators. In seawater, the absorption coefficient for sound at 100 kHz is about 0.038 dB/m, while in air at the same frequency, it is much larger: 3.3 dB/m. In addition, sound pressure is lost through geometric spreading in both air and water. For spherical spreading, each time the distance is doubled, the sound pressure of the emitted signal is halved (i.e., the level is reduced by 6 dB). Taken together, sound absorption and geometric spreading mean that an echolocating dolphin can detect an object at much longer distances than can an echolocating bat (Madsen and Surlykke 2014).
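To see how strongly these two loss terms differ between the media, here is a minimal sketch (Python; our own illustration using the spherical-spreading law and the absorption coefficients quoted above):

```python
import math

def one_way_loss_db(range_m, absorption_db_per_m, ref_m=1.0):
    """Spherical spreading (20*log10) plus frequency-dependent absorption."""
    return 20 * math.log10(range_m / ref_m) + absorption_db_per_m * range_m

# A 100-kHz signal over 100 m: modest loss in seawater, prohibitive in air.
print(round(one_way_loss_db(100, 0.038)), "dB one-way in water")
print(round(one_way_loss_db(100, 3.3)), "dB one-way in air")
```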
Investigators often want to get a relative notion of the difference in amplitude of bat and dolphin echolocation signals. However, such a comparison should be done cautiously because of the different physical properties of air and water and the two different reference pressures. To compare
a sound intensity level measured in dB in water to a reading in air, subtract 36 dB to compensate for the differences in acoustic impedance (i.e., density × sound speed; see Chap. 4, introduction to acoustics) between the two media. For the same source intensity, sound pressure in water is 60 times greater than in air (i.e., ~36 dB).

I_water / I_air = (p²/ρc)_water / (p²/ρc)_air = 1/3570

10 log10 (1/3570) = -36 dB

where p is sound pressure, I is intensity, ρ is density, c is the speed of sound, and ρc is acoustic impedance. Then, subtract 26 dB (20 log10 (20/1) = 26 dB) to correct for the different reference pressures used for the decibel scales of sound in air and in water; i.e., 1 μPa in water and 20 μPa in air (Fig. 12.2). For example, if the sound pressure level of a dolphin click were 220 dB re 1 μPa (Au 1993), then a source with the same power would produce a click of 158 dB re 20 μPa in air (220 - 36 - 26 = 158 dB re 20 μPa), which is a very high sound pressure in air and well above the maximum sound pressure levels achieved by bats.

Fig. 12.2 For sound sources of the same power or intensity, the sound pressure levels in air and water differ by 62 dB
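The bookkeeping of this conversion is easy to get wrong, so a short sketch may help (Python; the rounded 36-dB and 26-dB corrections from the text, with a function name of our own choosing):

```python
def water_to_air_db(level_water_db_re_1upa):
    """In-air level (dB re 20 uPa) produced by a source with the same power
    as one producing level_water_db_re_1upa (dB re 1 uPa) under water.
    Uses the rounded corrections from the text."""
    impedance_correction = 36.0   # ~10*log10(3570)
    reference_correction = 26.0   # 20*log10(20/1)
    return level_water_db_re_1upa - impedance_correction - reference_correction

# The example from the text: a 220 dB re 1 uPa dolphin click.
print(water_to_air_db(220.0))   # -> 158.0 dB re 20 uPa
```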
In air, there is a considerable difference in acoustic impedance between the medium and bat food, such as flying insects. There is, however, little impedance difference between seawater and toothed whale prey, such as fish or squid (Madsen et al. 2007). Accordingly, most sound from an echolocating toothed whale goes right through a fish or squid, producing low echo levels and making it difficult for the animal to detect its prey. In contrast, the air-filled swim bladders of some fish and hard features, such as the pen and beak of squid, reflect sound well, resulting in strong echoes.

In spite of substantial differences in the impedance and reflectivity of prey in air and in water, echo levels from airborne and aquatic prey are about the same. The target strength (TS) is the difference between the echo level (EL) measured 1 m from the target and the incident sound (IS) at the target: TS = EL - IS, where EL and IS are measured in dB re 20 μPa in air and 1 μPa in water, and TS is in dB as the reference levels cancel out. Maximum target strength depends on the frequency of the echolocation signal and the reflectivity, size, and orientation of the prey with respect to incident sound. For cod, haddock, and saithe (400 to 500 mm long), the TS (at 30 kHz) is -32 to -40 dB. For a moth (Arctia caja) with a 25–35 mm wingspan, TS (at 20–50 kHz) is -42 dB; for the stonefly (Plecoptera sp.) with a wingspan of ~15 mm, TS (at 10–37 kHz) is -47 dB (Miller 1983; Rydell et al. 1999). Despite more than an order of magnitude difference in size, the target strengths of fish and insect prey are similar because of a combination of the differences in acoustic impedance of the medium and reflectivity of the prey.
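Because the reference pressures cancel in TS, the relation can be applied directly in either medium. A small sketch (ours; the incident level of 150 dB is an arbitrary placeholder, not a measurement) compares the fish and insect examples above:

```python
def echo_level_db(incident_level_db, target_strength_db):
    """Echo level 1 m from the target, rearranged from TS = EL - IS."""
    return incident_level_db + target_strength_db

# With an arbitrary incident level of 150 dB at the target, a 0.4-0.5 m
# gadoid fish (TS ~ -35 dB) and a moth (TS ~ -42 dB) differ by only ~7 dB.
print(echo_level_db(150, -35), echo_level_db(150, -42))
```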
Viscosity differences between air and water make toothed whales much less agile than bats. Toothed whales swim at about 2 m/s when capturing prey while bats fly at 2–10 m/s. After detection, a bat arrives at its prey much sooner than the toothed whale. A bat catching prey moves quickly because it is hardly hindered by friction from air. Bats typically take about a second to capture prey, while porpoises and dolphins need several seconds because the higher viscosity of water hinders their mobility. These differences occur despite similar ratios between body length
of predator and prey; a 3-m long dolphin is 6–15 times larger than its fish prey (20 to 50 cm long) and a 3–8 cm long bat is 5–10 times bigger than its insect prey. Bats often use their wing and tail membranes and even their feet to catch and manipulate insects. Toothed whales are streamlined with only pectoral and dorsal fins and flukes as appendages; they must catch and manipulate prey with their teeth and mouths (Miller 2010).

Despite very different selective pressures placed on bats and toothed whales, most of which are founded in the density and viscosity differences between air and water, they operate their biosonar in very similar ways. This similarity of the biosonar systems of bats and toothed whales (Fig. 12.5a) is a wonderful example of convergent evolution (Madsen and Surlykke 2014; Wilson et al. 2013).

12.4 Echolocation in Bats

Bats are the second-most species-rich order of mammals, currently comprising almost 1400 species (Burgin et al. 2018), and they play several trophic roles. Echolocating bats eat a diverse range of food including animals (insects, vertebrates), plant materials (leaves, fruit, nectar, and pollen), and even blood. The non-echolocating pteropodid bats all eat mainly plant materials. Traditionally, bats were arrayed in two suborders separating them into the echolocating Microchiroptera and the non-echolocating Megachiroptera, but recent phylogenetic studies do not support this division. Bats are now divided into Yinpterochiroptera and Yangochiroptera (Teeling 2009; Teeling et al. 2005). The non-echolocating pteropodid bats are found in the Yinpterochiroptera. This new division is intriguing because it creates two alternatives for the evolution of bat echolocation, either as a single event resulting in the loss of echolocation by the pteropodids or as two separate events. The current consensus favors a single origin of echolocation and subsequent loss in the pteropodids (Thiagavel et al. 2018; Wang et al. 2017).

12.4.1 Sound Production and Signal Characteristics

With the exception of the tongue-clicking Rousettus bats (10 species belonging to the pteropodid family), all ~1200 species of echolocating bats produce their echolocation signals in the larynx (Suthers and Hector 1988). The larynges and associated structures in bats are specialized to varying degrees from the basic mammalian pattern; notably, the entire structure ossifies much earlier during development than in most mammals, and for many species the vocal tract and nasal passages are modified to filter frequencies used for echolocation (Au and Suthers 2014). Most echolocating bats emit sound through the open mouth, but bats in several families emit sound through the nostrils (Pedersen 1993). Bats emitting sound through the mouth generally have plain faces, while the bats emitting sound through the nose typically have elaborate structures surrounding the nostrils such as a nose-leaf that aids in sound radiation (Fig. 12.3).

The vast majority of echolocating bats are insectivorous. Most insectivorous bats hunt flying insects and typically vary the structure of their echolocation calls as they progress from searching to approaching and capturing prey. Traditionally, prey capture is divided into three phases (Fig. 12.4): a search, an approach, and a terminal phase (Griffin 1958; Griffin et al. 1960). In the search phase, bats emit long-duration, lower-frequency, narrowband signals (search calls) at a low repetition rate. After an object of interest is detected, the bats gradually reduce the duration and intensity of the signals, while they increase the rate and the bandwidth as they approach objects (approach calls). In the terminal phase, immediately before prey capture, the repetition rates may exceed 150 calls per second (the terminal buzz). Several reasons underlie these progressive changes in call emission. The search calls facilitate a long detection range as lower frequencies are attenuated much less than are higher frequencies (Lawrence and Simmons 1982b) and the long duration and narrow
bandwidth focus the energy of the call in a narrow range of the sensory system. These calls are, however, not ideal for accurate localization and object classification. Short-duration, broadband, high-frequency calls are much better suited for these tasks (Simmons et al. 1975). The switch from long-duration, narrowband, low-frequency calls in the search phase to short-duration, broadband, higher-frequency calls in the approach phase is a clear indication of object detection and it has been used to estimate detection distance in echolocating bats. However, it is important to note that this is a minimum measure as the bat may well have detected the object before adjusting its call parameters (Kalko and Schnitzler 1989, 1993).

Fig. 12.3 Variation in bat facial morphology. (a) Nyctalus noctula, (b) Murina cyclotis, (c) Plecotus auritus, (d) Mimon crenulatum, (e) Rhinolophus rouxii, (f) Hipposideros lankadiva. Bats a and b are mouth emitting echolocators while c–f are nose emitters. Note that c does not have the associated nasal structures common in nose emitters. Photos by S. Brinkløv

Most echolocating bats, like toothed whales, emit an echolocation call and wait for echoes from objects of interest before emitting the next call (Madsen and Surlykke 2014). While this avoids perceptual errors associated with potentially assigning echoes to the wrong calls, it also means that the distance between the bat and objects of interest limits the call emission rate. As the bats approach an object, echoes return with progressively shorter delays and the bat can emit the calls at a higher rate, up to over 200 calls/s during the terminal buzz (Simmons et al. 1979, Fig. 12.4). While this is an impressively high call rate, the echoes are still received well before the next call is emitted. At the short distances between the bat and the prey when the buzz is emitted, the bat could theoretically increase the call rate to 1000 calls/s and still avoid call-echo ambiguity. Instead, the call rate is limited by the maximum speed of the superfast muscles that control each call emission (Elemans et al. 2011). Concurrent with the increase in call rate, the call duration decreases as distance to the object decreases. This is likely to prevent overlap
between the emitted call and the returning echo since the much louder call emission will mask the quieter returning echo if the two overlap (Kalko and Schnitzler 1989, 1993). Hence, echoes from objects of interest are received in a clearly defined window between the end of call emission and the beginning of the next call. For example, a bat emitting calls of 8 ms duration at a call rate of 10 calls/s can resolve echoes from objects between 1.4 and 17 m distance without masking the returning echo during call emission and without the risk of call-echo ambiguity (Fig. 12.5).
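The arithmetic behind this overlap-free window is simply two-way travel time at the speed of sound; a minimal sketch (Python; ours) reproduces the 8-ms, 10-calls/s example:

```python
SPEED_OF_SOUND_AIR = 344.0  # m/s

def overlap_free_window(call_duration_s, call_rate_per_s, c=SPEED_OF_SOUND_AIR):
    """Range window in which an echo returns after the call has ended
    but before the next call is emitted (two-way travel)."""
    inter_pulse_interval = 1.0 / call_rate_per_s
    min_range = c * call_duration_s / 2.0       # echo must outlast the call
    max_range = c * inter_pulse_interval / 2.0  # echo must beat the next call
    return min_range, max_range

print(overlap_free_window(0.008, 10))  # -> (~1.4 m, ~17.2 m)
```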
Fig. 12.4 Echolocation call sequence emitted by a foraging soprano pipistrelle (Pipistrellus pygmaeus), illustrating the progressive change in call characteristics and emission rate as the bat searches for, approaches, and captures insect prey

While call rate and call duration define an overlap-free window, it is the energy and frequency of the emitted call together with the bat's hearing threshold and the nature of the echo-generating object that determine the range of the echolocation system. Echoes have to return with enough energy to be detected by the bat. Emitting more energy, either by increasing the intensity or duration of the call, increases the detection distance. Emitting lower frequencies also increases the detection distance because acoustic attenuation is less for lower frequencies. On the reflection side, small objects return quieter echoes and will therefore always be detectable at shorter ranges than large objects (Fig. 12.6). The structure and texture of the object also affect the level of the returning echo. Hard objects reflect more sound than soft objects and the same is true for plane or convex surfaces compared to concave surfaces (Urick 1983; also see Chap. 5, section on reflection). Additionally, the relationship between the wavelength of the sound impinging on the object and the size of the object affects how efficiently the sound is reflected. If the wavelength becomes too long (i.e., the frequency too low) relative to the size of the object, very little sound is reflected (Fig. 12.6). This means that prey size imposes a lower frequency limit on bat echolocation (Houston et al. 2004; Pye 1993).

Bats are limited both physically and physiologically in how high a sound pressure they can produce. Supposedly, the main reason why they emit long-duration calls in the search phase is to increase the energy of the call. Emitting sound
directionally also increases the source level, that is the sound level measured directly in front of the animal. All bats studied to date emit directional echolocation calls. Most bats increase their source level by 10 dB or more purely by focusing the sound as opposed to radiating sound equally in all directions (Jakobsen et al. 2013a). The highest source levels measured from bats are around 140 dB re 20 μPa rms at 0.1 m for the greater bulldog bat (Noctilio leporinus), but most reports of open-space aerial hawking bats are around 130 dB re 20 μPa rms at 0.1 m (Holderied et al. 2005; Hulgard et al. 2016; Surlykke and Kalko 2008).

Fig. 12.5 Schematic illustration of why most echolocating bats adjust call duration and call emission rate relative to target distance. Echoes received during call emission are masked by the louder call and echoes received after emission of the next call may create ranging ambiguity if assigned to the incorrect call. IPI: inter-pulse interval

Combining knowledge of source level, signal frequency, hearing threshold, and the echo-generating object, the detection distance is relatively easy to estimate using a variation of the sonar equation (Urick 1983) (also see Chap. 6, section on the sonar equation):

RL = SL - 2 × PL + TS

PL = 20 × log10 (distance/0.1 m) + α × (distance - 0.1 m)

Here, RL is the received level, SL is the source level emitted by the bat, PL is the propagation (formerly, transmission) loss, α is the frequency-dependent attenuation in air, and TS is the target strength, a measure of how much sound is reflected from the object at 0.1 m relative to the sound impinging on the target. For an object to be detected by the bat, RL simply has to be above the bat's hearing threshold. The maximum distance that satisfies this requirement is the maximum detection distance.
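As a hypothetical illustration of how these terms combine, the following sketch (ours) searches for the largest distance at which RL still clears an assumed hearing threshold. All numbers are placeholders chosen for illustration, not measured values:

```python
import math

def received_level(d, source_level_db, target_strength_db, alpha_db_per_m):
    """RL = SL - 2*PL + TS, with PL referenced to 0.1 m as in the text."""
    pl = 20 * math.log10(d / 0.1) + alpha_db_per_m * (d - 0.1)
    return source_level_db - 2 * pl + target_strength_db

def max_detection_distance(source_level_db, target_strength_db,
                           alpha_db_per_m, hearing_threshold_db):
    """Largest distance (m) at which the echo still exceeds the threshold."""
    d = 0.1
    while received_level(d, source_level_db, target_strength_db,
                         alpha_db_per_m) >= hearing_threshold_db:
        d += 0.01
    return d - 0.01

# Placeholder values: SL 130 dB re 20 uPa at 0.1 m, TS -40 dB, attenuation
# 1 dB/m, hearing threshold 0 dB re 20 uPa -> a detection range below 10 m.
print(round(max_detection_distance(130, -40, 1.0, 0), 2))
```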
Fig. 12.6 Target strength of three types of insect as a function of echolocation frequency illustrating how reflection depends on the relationship between object size and frequency. Smaller insects have lower target strength and require higher frequencies for efficient reflection. Indicated sizes are wing length. Based on data from Houston et al. (2004)

Estimated detection distances vary greatly between species, but it is clear that bat echolocation is a short-range system; the furthest estimates for large insect prey are around 10 m with most estimates below 5 m (Kalko and
Schnitzler 1989, 1993; Nørum et al. 2012; Surlykke and Kalko 2008; Stilz and Schnitzler 2012).

The directional echolocation calls of bats allow an increased detection distance ahead of the bat while reducing the sound levels off to the sides and the back. This reduction in off-axis sound level offers an additional benefit as it reduces echoes from objects in these directions that are likely of little interest to the bats. Echoes from irrelevant objects are known as clutter echoes and reducing them simplifies the acoustic scene that the bats experience. The obvious disadvantage in emitting directional echolocation calls is the loss of echoes from relevant off-axis objects. The degree to which the benefits outweigh the costs of emitting a very directional echolocation call varies with the environment and the behavioral context. The directionality of the echolocation call is determined by the emitted frequency and the shape and size of the sound emitter. For mouth-emitting bats, this is the shape and size of the open mouth, and for nose-emitting bats, the shape and size of the nostrils and the nose-leaf (Hartley and Suthers 1987, 1989; Strother and Mogus 1970). Higher frequencies and larger emitters produce higher directionality (Fig. 12.7). Varying the frequency, shape, and size of the emitter allows the bats to adjust the directionality of the emitted call to suit their environment (Kounitsky et al. 2015; Surlykke et al. 2009b). During the final buzz of prey pursuit, bats can broaden their echolocation beam to increase peripheral echo levels and better track the prey (Jakobsen et al. 2015; Jakobsen and Surlykke 2010; Matsuta et al. 2013; Motoi et al. 2017). This is achieved in several species by a sudden drop in call frequency by nearly an octave (as illustrated in Figs. 12.4, 12.7, and 12.8) and is often referred to as the buzz II phase.
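A rough, standard way to express the trend shown in Fig. 12.7 is the product of wavenumber and emitter radius (ka): the larger ka, the more directional the beam. The sketch below is our own illustration with made-up emitter sizes, not data from the studies cited:

```python
import math

def ka(frequency_hz, emitter_radius_m, sound_speed_m_s=344.0):
    """Wavenumber times emitter radius; larger ka means a narrower beam."""
    wavelength = sound_speed_m_s / frequency_hz
    return 2 * math.pi * emitter_radius_m / wavelength

# Hypothetical emitters: a ~4-mm nostril aperture versus a ~12-mm open mouth.
for f in (30e3, 60e3, 120e3):
    print(int(f / 1e3), "kHz:", round(ka(f, 0.004), 1), round(ka(f, 0.012), 1))
```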
The majority of echolocating bats, and the focus of our description so far, hunt flying insects (aerial hawking bats) using relatively short-duration echolocation calls (also known as low duty-cycle calls, with duty cycle being the duration of the call divided by the time period, i.e., from the start of one call to the start of the next call). There are, however, many species that forage and echolocate differently. About 150 species, including the Old World horseshoe bats and hipposiderid bats (i.e., Pteronotus parnellii and closely related species in the family Mormoopidae from the New World), also feed on flying insects. These bats are so-called high duty-cycle echolocators and are able to broadcast and receive sound at the same time.
Fig. 12.7 Echolocation call directionality as a function of emitter size and frequency. Directionality increases with increasing frequency and increasing size. Reprinted by permission from Springer Nature. Jakobsen L, Ratcliffe JM, Surlykke A. Convergent acoustic field of view in echolocating bats. Nature 493 (7430):93–96. https://www.nature.com/articles/nature11664. © Springer Nature, 2013b. All rights reserved
Fig. 12.8 Echolocation calls emitted by a low duty-cycle bat (Myotis daubentonii) with strongly frequency-modulated calls (left) and a high duty-cycle bat (Rhinolophus formosae) with mostly constant frequency calls (right)
While low duty-cycle bats maintain a clear time separation between the emitted call and returning echo, high duty-cycle bats separate call and echo by frequency. They all emit much longer duration, constant-frequency echolocation calls with short intervals to navigate and forage (Fig. 12.8, Fenton et al. 2012). When an echo-generating object, such as a moth, moves relative to the bat, the echo returns to the bat at a slightly different frequency than the emitted call because of the Doppler shift. The classical example used to explain the Doppler shift phenomenon is the moving ambulance. When an ambulance moves toward a nearby listener, the siren appears to be higher in frequency than the one heard by someone riding in the ambulance, which does not change. The effect of Doppler shift is apparent when the ambulance passes and moves away from the listener. Now, the frequency abruptly changes from higher to lower in pitch. Doppler shift occurs because the speed of the moving ambulance is added to, or subtracted from, the speed of sound, raising or lowering the perceived pitch of the siren. The amount of the Doppler shift is doubled for echolocating animals, as the frequencies of both outgoing and returning signals are shifted. The Doppler shift experienced by an echolocating animal may be computed as:

Δf = 2 × (v1 + v2) × f × cos θ / c

Here, Δf is the amount of Doppler shift in Hz, v1 is the speed of the echolocating animal in m/s, v2 is the speed of the target in m/s (+ indicates movement away from the echolocator; - would be movement toward the echolocator), f is the emitted frequency in Hz, θ is the angle in degrees between the echolocator and the target, and c is the speed of sound in the medium (about 344 m/s in air and 1500 m/s in water).
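A direct transcription of this equation (Python; the example speed and frequency are illustrative):

```python
import math

def doppler_shift_hz(v1, v2, emitted_freq_hz, angle_deg=0.0, c=344.0):
    """Two-way Doppler shift (Hz), transcribing the equation in the text:
    delta_f = 2 * (v1 + v2) * f * cos(theta) / c."""
    return 2.0 * (v1 + v2) * emitted_freq_hz * math.cos(math.radians(angle_deg)) / c

# Magnitude of the shift for a bat closing on a stationary moth at 5 m/s
# while emitting an 82-kHz constant-frequency call: about 2.4 kHz.
print(round(doppler_shift_hz(5.0, 0.0, 82_000.0)))
```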
Perception of a Doppler shift by an echolocator is facilitated by emitting long signals tuned to one frequency (narrowband or constant frequency) and by having acute hearing in the frequency band of the Doppler-shifted echo. Specifically, Doppler-shifted echoes are dominated by different frequencies than those dominating outgoing pulses (Fenton et al. 2012) and bats using this strategy are therefore not sensitive to overlap of the two.

Greater horseshoe bats (Rhinolophus ferrumequinum) detect the frequency and amplitude modulations of the Doppler-shifted echo from an insect to within a few Hz of the ~82 kHz carrier-frequencies of their echolocation calls (Neuweiler 2000). The bats that use Doppler-shifted echoes readily detect the wing beats of a fluttering insect and distinguish the prey from the background. Flutter-detection is a recurring theme among bats that exploit Doppler shifts (Goldman and Henson 1977; Schnitzler and Flieger 1983; Lazure and Fenton 2011).

Bats that exploit Doppler-shifted echoes are Doppler-shift compensators (DSC; Hiryu et al. 2016) because they continuously adjust the outgoing signal to ensure that the Doppler-shifted
echoes remain at the frequencies to which their acoustic foveae are tuned (Schuller and Pollack 1979, Schnitzler 1968; Schnitzler and Flieger 1983; Hiryu et al. 2016).

There is no current evidence that toothed whales or other echolocators using broadband clicks are capable of Doppler-shift compensation. However, the small harbor porpoise would be a good species to test for Doppler-shift sensitivity, as they have narrow auditory filters (Popov et al. 2006) and use relatively long clicks (100 μs) and narrowband echolocation signals centered around 130 kHz.

High duty-cycle bats, in general, have highly specialized hearing to facilitate this type of echolocation and they modify their emitted echolocation calls such that the frequency of the returning echoes always falls within a very narrow frequency range for which their hearing is optimized (Fig. 12.8 and Sect. 12.4.2) (Schnitzler 1973; Schuller 1977). In spite of the large differences between high and low duty-cycle bats, the overall call emission pattern when catching flying insects is still remarkably similar. High duty-cycle bats still emit calls that correspond to the three phases of search, approach, and buzz when they pursue flying insects, including similar call-structure changes to those in the low duty-cycle bats: gradual source-level reduction, duration shortening, increasing repetition rate (Ratcliffe et al. 2013), and broadening of the echolocation beam during the terminal buzz (Matsuta et al. 2013).

Bats that do not forage for flying insects generally search for more conspicuous food. Many species hunt non-flying insects in dense vegetation, a strategy known as gleaning. Gleaning bats, in general, emit very short low-intensity calls that sweep over a broad range of frequencies (Denzinger and Schnitzler 2013). As noted earlier, such calls provide excellent localization and classification and the low intensities greatly weaken clutter echoes, which is particularly important when flying in dense vegetation. Fruit and nectar eating can be considered variations on the gleaning strategy, and the echolocation behavior of fruit-eating and nectar-drinking bats very closely resembles that of insect-gleaning bats (Denzinger and Schnitzler 2013). Notably, while these species often cluster their calls in groups with increased repetition rates when faced with increasing acoustic complexity, they do not emit the terminal buzz characteristic of bats that target flying insect prey (Gonzalez-Terrazas et al. 2016). In addition, they often rely on additional sensory input, such as olfactory cues (Gonzalez-Terrazas et al. 2016), or, in the special case of vampire bats, thermoreception (Kürten and Schmidt 1982).

12.4.2 Hearing Anatomy and Echolocation Abilities

The hearing of echolocating bats is based on standard mammalian hearing anatomy, including recognizable pinnae, tragus, ear canal, tympanic membrane, three middle ear bones, and a coiled cochlea. With few exceptions, they even have the same hearing threshold as most other mammals, measured at their best frequencies: 0 dB re 20 μPa (Fay 1988; Fig. 12.9). There are, however, notable specializations that relate to echolocation where bats differ from most mammals. It is clear that most bats have a larger than average pinna and tragus, but there is considerable variation across species in size and shape that likely relates to the bat's echolocation signals and foraging ecology (Coles et al. 1989; Obrist et al. 1993) (Fig. 12.3). In general, bats that complement their echolocation by passive listening for prey-generated sounds have larger pinnae than bats that rely solely on echolocation (Obrist et al. 1993). The pinna provides substantial directionality and acoustic gain depending on the relationship between pinna size and sound frequency. The pinnae of gleaning bats commonly amplify sound well below the bats' echolocation frequencies (Coles et al. 1989; Guppy and Coles 1988; Obrist et al. 1993; Schmidt et al. 1983). The acoustic gain provided by the large pinnae affords some bats extremely low hearing thresholds, such as the impressive -20 dB re 20 μPa hearing threshold found in the brown long-eared bat (Plecotus auritus) and the Indian false vampire bat (Megaderma lyra) (Coles et al. 1989; Schmidt et al. 1983). While pinna structure plays a crucial
role in bat echolocation, large external ears have a disadvantage during flight. Large ears create substantial drag, and it is likely that the ears of fast-flying bats are shaped as much by the aerodynamics of flight as by echolocation (Gardiner et al. 2008; Johansson et al. 2016; Vanderelst et al. 2015).

Fig. 12.9 Audiograms of three echolocating bats and two echolocating bird species. A non-echolocating bird is shown for comparison. Bat thresholds are based on behavioral experiments, bird thresholds are derived from neurophysiological experiments. Green: big brown bat (Eptesicus fuscus, from Dalland 1965); light blue: Egyptian fruit bat (Rousettus aegyptiacus, from Koay et al. 1998); purple: greater horseshoe bat (Rhinolophus ferrumequinum, from Long and Schnitzler 1975); dark blue: oilbird (Steatornis caripensis, from Konishi and Knudsen 1979); red: swiftlet (Aerodramus spodiopygia, from Coles et al. 1987); yellow: black-capped chickadee (non-echolocating, from Wong and Gall 2015). Thresholds are not directly comparable between species due to differences in experimental conditions

As mentioned above, bats decrease their emitted intensity progressively as they approach objects. This is primarily believed to function as gain control for the auditory system, a phenomenon also seen in echolocating odontocetes (see Sect. 12.5.2). If the bats kept their output level constant, the echo level would increase progressively by many orders of magnitude as the bat approached an object. Considering small insects as point sources, this increase would be 40 × log10(r), or 12 dB per halving of distance r. So, the output call level generally decreases by 6 dB per distance halved (Boonman and Jones 2002; Brinkløv et al. 2013; Hartley 1992a, b; Lewanzik and Goerlitz 2018). Such a reduction results in a constant intensity at the object/prey, but a progressive increase in echo strength at the bat by +6 dB per halving of distance. However, the bat's auditory system reduces its sensitivity by an additional 6 dB per halving of distance, because as the bat vocalizes, the middle ear muscles contract to avoid self-deafening, increasing the bat's hearing threshold. This time-dependent change in hearing threshold corresponds almost perfectly to the missing 6 dB per halving of distance and presumably provides a constant perceived echo level for the bat (Hartley 1992a, b; Henson 1965; Suga and Jen 1975). The gradual relaxation of the middle ear muscles progressively decreases the bat's hearing threshold back to resting level. It is worth noting that this is under very predictable laboratory conditions and that in a real-life field scenario, the bats encounter much more unpredictable conditions and prey behavior.
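The 6-dB bookkeeping in this paragraph can be made explicit with a small sketch (ours; spherical spreading only, absorption ignored):

```python
import math

def echo_level_change_db(r, r_ref, output_drop_db_per_halving=6.0):
    """Change in received echo level when closing from r_ref to r, assuming
    two-way spherical spreading (40*log10) and a call output that drops by
    a fixed number of dB for every halving of distance."""
    halvings = math.log2(r_ref / r)
    geometric_gain = 40.0 * math.log10(r_ref / r)   # +12 dB per halving
    return geometric_gain - output_drop_db_per_halving * halvings

# From 2 m to 1 m: +12 dB if the output stayed constant, +6 dB with the
# typical 6-dB output reduction; middle-ear contraction removes the rest.
print(echo_level_change_db(1.0, 2.0, 0.0), echo_level_change_db(1.0, 2.0, 6.0))
```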
Recordings of prey capture in the field reveal that intensity reduction is much more variable and commonly exceeds 6 dB per halving of distance (Nørum et al. 2012). This subject is also
discussed below for harbor porpoises and dolphins.

Bat hearing is certainly specialized for echolocation and for high frequencies (Fig. 12.9). Other small mammals such as mice and rats have similar high-frequency hearing. Bats are, however, much more sensitive up to their high-frequency limit and have very high sensitivity over a much wider range of frequencies. Comparing echolocating to non-echolocating bats, the cochlea is significantly larger relative to skull size, and the basilar membrane, where frequency coding occurs, is longer for echolocating bats compared to all other mammals (Kössl and Vater 1995). High duty-cycle bats have the longest basilar membranes containing an acoustic fovea, which is a large region of the membrane dedicated to a very narrow frequency range. The acoustic fovea provides the crucial frequency resolution and sharp tuning that allows high duty-cycle bats to separate call and echo by frequency instead of time (Bruns and Schmieszek 1980).

Bats use the time delay between their outgoing call and the returning echo to determine the distance to a target. They determine the horizontal direction to the object by comparing the input on the two ears. For bats, interaural intensity differences likely provide the main cues (Pollak 1988). The vertical direction is mainly coded by frequency-dependent reflections from the pinna and tragus (Lawrence and Simmons 1982a). Bats have excellent spatial resolution and accuracy. They consistently aim their echolocation beam to within less than 5° of their target both horizontally and vertically (Ghose and Moss 2003; Jakobsen and Surlykke 2010; Masters et al. 1985; Surlykke et al. 2009a) and can discriminate between two objects in the horizontal plane if they are more than 1.5° apart (Simmons et al. 1983) and, in the vertical plane, if they are more than 3° apart (Lawrence and Simmons 1982a).
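The ranging computation itself is simply half the two-way travel time multiplied by the sound speed; a one-line sketch (ours):

```python
def target_range_m(echo_delay_s, sound_speed_m_s=344.0):
    """Target distance from the delay between call emission and echo return."""
    return sound_speed_m_s * echo_delay_s / 2.0

# A 29-ms delay corresponds to a target roughly 5 m away in air.
print(round(target_range_m(0.029), 2))
```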
Aerial hawking bats can easily be tricked into catching small pebbles thrown in the air. This is not because bats cannot distinguish pebbles from insects, but likely because most airborne items of a given size are edible to bats. Classification of small objects is based on temporal and spectral features of the echo generated by one or more reflections from the objects (Schmidt 1988; Simmons et al. 1990; Weissenbacher and Wiegrebe 2003), while the classification of large objects such as trees is more complex (Grunwald et al. 2004). The bat's resolution of a target depends on both the frequency of the emitted call (higher frequencies reflect more efficiently off smaller structures than do lower frequencies; Fig. 12.6 and Urick 1983) and the bat's ability to perceive these reflections. Bats are capable of distinguishing similar-sized objects with very minute textural differences. They can clearly distinguish small disks from mealworms when both are thrown in the air and smooth hanging beads from textured beads with the same overall echo-strength (Falk et al. 2011; Griffin et al. 1965).

Our account of bat echolocation only contains broad strokes. With around 1200 species of echolocating bats, the variation in echolocation design is vast, and while most follow the outline given here, there are many deviations and many bat species that utilize their echolocation in puzzling ways that are as yet unexplained.

12.5 Echolocation in Odontocetes

Among cetaceans, only species in the suborder Odontoceti (toothed whales) are known to echolocate (Au 1993). Bioacoustical research has focused on bottlenose dolphins, belugas, false killer whales, and killer whales (all in the families Monodontidae and Delphinidae) as well as porpoises (Phocoenidae), sperm whales (Physeteridae), and a few species of beaked whales (Ziphiidae).

Odontocetes use echolocation to orient in the aquatic environment, to detect, chase, and capture prey, and to socialize (Thomas et al. 2004; Thomas and Turl 1990). They have broadband hearing and a good ability to discriminate a signal in noise. Their echolocation signals have narrow beam patterns that can be modified, as can the amplitude and frequency content of outgoing clicks.

The bottlenose dolphin has been the "laboratory rat" of odontocete biosonar studies. A series
of experiments by US Navy researchers examined the ability of captive bottlenose dolphins (Tursiops truncatus) to detect subtle differences in human-made objects for military reconnaissance purposes (Au 1993, 2015; Moore and Popper 2019). They showed that dolphins wearing eyecups (so they could not see their targets) and using only echolocation could: (1) distinguish objects of the same shape, but of different materials (e.g., cylinders of glass, metal, or rock), (2) distinguish objects of the same material but different shapes (e.g., PVC cylinders, plates, squares, and tubes), (3) detect a 3-inch hollow metal sphere at about 115 m distance and a sphere of a few millimeters at a distance of about 50 m, (4) feed normally if blind, but if hearing-impaired become disoriented, (5) discriminate metal cylinder targets with different wall-thickness (difference as little as 0.001 mm), and (6) control the amplitude and frequency of their outgoing pulses, such that in areas of high ambient noise, they produced louder and higher-frequency pulses.

12.5.1 Sound Production and Signal Characteristics

Most dolphins emit whistles and burst-pulse sounds for intraspecific communication and brief broadband clicks for echolocation. Figure 12.10 shows four echolocation clicks from a false killer whale (Pseudorca crassidens). Each click generally has four to eight cycles and a duration of 15–70 μs. Peak-to-peak source levels can be very high, from 210 to over 225 dB re 1 μPa at 1 m. High-intensity signals from dolphins generally are broadband and can contain frequencies beyond 100 kHz. The frequencies of dolphin clicks vary almost linearly with the signal intensity, such that, as the peak frequency of echolocation signals increases, the intensity of clicks increases (Au and Suthers 2014).

All odontocetes studied thus far produce echolocation signals using one or two pairs of phonic lips located in the nasal passages. The lips contain bursae, which are rod-like fatty structures situated just below the blowhole (AB, PB in Fig. 12.11b). The phonic lips produce both echolocation clicks and communication whistles (Cranford et al. 1996).

Amundin (1991) and Huggenberger et al. (2009) studied click-production in the harbor porpoise, which can serve as a general example for odontocetes other than sperm whales. Figure 12.11 shows an overview and details of the harbor porpoise sound-producing apparatus (Huggenberger et al. 2009).
[Figure 12.10 shows the waveforms of four false killer whale clicks (I–IV) with averaged peak-to-peak source levels of 202 ± 5, 205 ± 5, 209 ± 4, and 213 ± 3 dB (time scale 100 µs), and their relative amplitude spectra from 0 to 200 kHz.]
Fig. 12.10 Left: Waveform of false killer whale biosonar signals with increasing averaged peak-to-peak source level in dB re 1 μPa (relative amplitudes are drawn). Right: Spectra of the corresponding signal type showing increasing peak-frequency with increasing signal amplitude. Adapted by permission from Springer Nature. Au WWL, Suthers RA. Production of Biosonar Signals: Structure and Form, pp. 61–105, in Surlykke A, Nachtigall PE, Fay RR, Popper AN (eds) Biosonar. Springer, New York, NY, USA; https://link.springer.com/chapter/10.1007/978-1-4614-9146-0_3. © Springer Nature, 2014. All rights reserved
Fig. 12.11 Schematic sagittal reconstruction of the head of an adult harbor porpoise showing the nasal structures and the position of the larynx (LA). (a) Overview. (b) Detail of boxed area in (a). Blue: air spaces of the upper respiratory tract; gray: digestive system; light gray: cartilage and bone of the skull; yellow: fat bodies. AB: rostral bursa cantantis; AL: rostral phonic lip; AN: anterior nasofrontal sac; AS: angle of nasofrontal sac; BC: brain cavity; BH: blowhole; BL: blowhole ligament; BM: blowhole ligament septum; C: caudal; CS: caudal sac; DI: diagonal membrane; DP: low density pathway; IV: inferior vestibulum; LA: larynx; MA: mandible; ME: melon; MT: melon terminus; NA: nasal passage; NP: nasal plug; NS: nasofrontal septum; PB: caudal bursa cantantis; PE: premaxillary eminence; PN: posterior nasofrontal sac; PS: premaxillary sac; PX: pharynx; RO: rostrum; sm: sphincter muscle of larynx; TO: tongue; TR: trachea; TT: connective tissue theca; V: ventral; VE: vertex of skull; VP: vestibulum of nasal passage; VS: vestibular sac; VV: folded ventral wall of vestibular sac. Reprinted with permission from John Wiley and Sons. Huggenberger S, Rauschmann MA, Vogl TJ, Oelschläger HHA. Functional Morphology of the Nasal Complex in the Harbor Porpoise (Phocoena phocoena L.). The Anatomical Record 292:902–920; https://anatomypubs.onlinelibrary.wiley.com/doi/full/10.1002/ar.20854. © John Wiley and Sons, 2009. All rights reserved

Air passages are shown in blue, fat in yellow, bone in white, and other tissues in red. Air in the bony nares (NA) is pressurized by the nasopharyngeal pouch and the sphincter muscle of the larynx (sm), possibly with help of the piston-like action of the rostral end of the larynx (LA) and epiglottis (Ridgway and Carter 1988). The nasal plug (NP) and the blowhole ligament septum (BM) control the flow of pressurized air past the phonic lip pair (AL: Anterior Lip/PL: Posterior Lip) in each naris, resulting in a click-like vibration in the bursae (Anterior Bursa, AB and Posterior Bursa, PB), primarily on the right side. Each click projects from the bursae through a low-density pathway (DP) to the melon (ME) and from there to the water. This low-density pathway (DP) is characteristic for the families Phocoenidae (porpoises)
and Cephalorhynchinae (small dolphins). In the bottlenose dolphin, and most other delphinids, the anterior bursa (AB) directly abuts the melon. The small amount of air needed to produce a single click ends up in the vestibular air sac (VS) and eventually is re-cycled to the nasal cavity (NA), rather than exhaled through the blow hole (BH) (Norris et al. 1971; Dormer 1979). This process appears to be the same in all odontocetes.

Dormer (1979) showed that in three delphinids, the right pair of phonic lips produces high-frequency clicks, the left pair produces whistles. Whistles, like clicks, are also transmitted to the melon and into the water but are much less directional due to their lower frequencies. There is conflicting evidence for click-production by the left pair of phonic lips (Madsen et al. 2013; Cranford et al. 2011, 2015). Critically designed experiments and field recordings are needed to elucidate the full function of the left pair of phonic lips, particularly in species such as porpoises that do not whistle.

In dolphins, porpoises, and river dolphins, the melon (ME in Fig. 12.11) and associated tissues are the primary structures for transmitting echolocation clicks from the phonic lips to the water (Cranford et al. 1996). In the bottlenose dolphin melon, fat is not homogeneous; rather it is composed of varying amounts of triglycerides and wax esters that differentially affect the sound transmission velocity through the melon (Au 1993, 2015). The same is true for the harbor porpoise (Au et al. 2006; Madsen et al. 2010), where the melon contains mainly triglycerides, probably of many different types (chain lengths and degree of saturation) producing different densities (acoustical impedances). The lowest density is near the low-density pathway (DP in Fig. 12.11), while the highest density approximates that of seawater and occurs in the dorsal part of the melon about four centimeters caudal to the upper lip of the harbor porpoise (Kuroda et al. 2015).

The density of muscle and connective tissue above and lateral to the melon (TT in Fig. 12.11) is greater than the density of the melon tissue and keeps sound from leaking out of the melon. In dolphins and the harbor porpoise, a vestibular air sac (VS) is associated with the melon and also acts like a shield to prevent sound leakage. New results indicate that the melon of the harbor porpoise functions as an acoustic waveguide (Wei et al. 2017, 2018).

The foreheads of beaked whales (Ziphiidae) and the two pygmy sperm whales (family Kogiidae) are quite different. Here, the anterior bursae lie against a spermaceti organ filled with wax esters (Cranford et al. 1996). The spermaceti organ abuts the melon, so an echolocation click first passes through the spermaceti organ into the melon and out into the sea. Beaked whales have an extensive sheet of thick, dense, connective tissue rather than air sacs above the spermaceti organ and melon (Cranford et al. 2008). Beaked whales dive deep and hunt at depths of more than 1000 m (Johnson et al. 2006). At such extreme pressures, air sacs would collapse, but the structural adaptation of the forehead would still protect against acoustic leakage from the melon. Song et al. (2015) measured the acoustical properties of the melon in the pygmy sperm whale (Kogia breviceps). The density of the melon tissue, and the velocity and impedance of sound, are highest in the center of the melon. These physical characteristics keep sound from leaking through connective and muscular tissue surrounding the melon. In addition, air sacs above the spermaceti organ of Kogia keep sound in the spermaceti organ. It is unknown how deep Kogia dives, but the presence of air sacs above the spermaceti organ suggests that it does not dive as deeply as beaked whales. Kogia has extreme right-sided asymmetry of the skull bones, the function of which remains unclear.

The bioacoustical system of the sperm whale differs from all other odontocetes (Cranford et al. 1996). Sperm whales (Physeter macrocephalus) have only the right pair of phonic lips, which projects to the tip of the giant rostrum (Fig. 12.12). Click-production is essentially like that of other odontocetes. Air is pressurized in the right naris (Rn) causing a click from the right pair of phonic lips (Mo). A very small amount of sound energy escapes through the distal air sac (Di) at click-production (P0 in Fig. 12.12b). The major portion of sound energy projects back
through the spermaceti organ (So, heavy dashed line), hits the frontal air sac (Fr) and is reflected through the "junk" (Jo, heavy dashed line) into the water as a powerful and broadband click (P1 in Fig. 12.12b). The sperm whale P1 click is the most powerful biological sound known (with maximum source levels of 236 dB re 1 μPa rms at 1 m, Møhl et al. 2003), and is probably used as a long-distance biosonar probe signal (see Fig. 12.13b). But it has been proposed that these powerful clicks could stun prey. Norris and Møhl (1983) suggested a "big bang theory" for bottlenose dolphins and sperm whales that produce especially loud, single pulses (or bangs). These pulses could debilitate prey for easy capture, but this has never been proven. In fact, a new study using D-tags on sperm whales recorded no "big bangs," but normal odontocete prey capture behavior (Fais et al. 2016).

Fig. 12.12 A schematic drawing of a sperm whale head. Bl Blow hole; Di Distal air sac; Fr Frontal air sac; Jo Junk organ; Ln Left naris; Mo Monkey lips (museau de singe); Rn Right naris; So Spermaceti organ. (a) communication or coda clicks and (b) echolocation clicks, p1 being the strongest. According to the bent horn model, the production of an intense echolocation click (the solid black dashed lines and p1 in b) generates multiple weaker pulses (p2, p3, p4 in b) owing to reverberation of the initial sound (p1) between Di and Fr (the thin dashed lines). The whale can modify click generation to produce coda, or weaker communication clicks (the red solid line). This indicates that the whale can somehow control where the click, generated by the monkey lips (Mo), reflects off the frontal air sac (Fr), thus exiting near the distal air sac (Di). Modified from Caruso et al. (2015). © Caruso et al. 2015; https://doi.org/10.1371/journal.pone.0144503. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

A fraction of P1 energy reflects from the distal air sac causing a P2 click to be emitted at a delay consistent with the length of the head (spermaceti organ). The reverberation continues (P1 to P4 in Figs. 12.12b and 12.13a), resulting in a multi-pulse structure. Cranford et al. (1996) proposed that the spermaceti organ and the junk are homologous with the posterior and anterior bursae in the dolphin, respectively.

Although the sound-generating apparatus is basically similar in odontocetes, the outgoing sound from the melon can differ substantially among species. Initially, the action of the phonic lips, controlled by pneumatic pressure, influences the intensity of the click. Stronger hammer-action of a phonic lip pair means the transmission of more intense and higher-frequency clicks (Finneran et al. 2014; Fig. 12.10).

During orientation, most delphinids produce short, broadband echolocation clicks (Au 1993) often of high intensity. They produce less intense, but rapidly repeated clicks, analogous to a bat's buzz, when approaching objects or prey (see Fig. 12.1). A single click of a wild white-beaked dolphin lasts about 15 μs and has energy from about 30 kHz to over 200 kHz (Rasmussen and Miller 2002). The sperm whale also fits into this category (Møhl et al. 2003) with a broadband P1 click (Fig. 12.13b).
[Figure 12.13 shows (a) the multi-pulse waveform (p0–p4) on a 10-ms time scale and (b) the relative intensity (dB) of the pulses as a function of frequency from 0 to 20 kHz.]
Fig. 12.13 Multi-pulse structure of a sperm whale click. The P1 click is the most intense and broadest in frequency. It is the most powerful biological sound known. The following clicks of decreasing amplitude (P2–P4) are caused by reverberations in the nose of the whale (see also Fig. 12.12). From Møhl et al. (2003). © Acoustical Society of America, 2003. All rights reserved
At present, it seems that the modulation of clicks in the harbor porpoise occurs in the whale's forehead and that the basic echolocation signals entering the forehead are short-duration, broadband clicks. Madsen et al. (2010) used contact hydrophones to show that a harbor porpoise click recorded near the right (or left) phonic lip pair is broadband. The same click recorded on the melon, along the midline of the animal near the exit point of the sound, has the typical polycyclic narrowband structure. The narrowband high-frequency click (Fig. 12.14) somehow results from the melon and associated tissues, but the details of this mechanism are unknown.

Beaked whales regularly use frequency-modulated up-swept clicks for orientation and when searching for prey. These are relatively broadband and about 200 μs long (Fig. 12.15). Clicks used during prey capture in the buzz are less than 100 μs long, slightly more broadband than the regular clicks and similar to dolphin clicks. It is unknown how the upsweep of the regular click is generated, but by analogy to the porpoise, the basic signal is likely a broadband click somehow shaped in the forehead of the whale.
Fig. 12.14 (a) Echolocation click from a harbor porpoise. (b) Spectrum of a harbor porpoise click. The harbor porpoise is one of several smaller toothed whales that use a high-frequency narrowband echolocation click (Galatius et al. 2019). From Fig. 12.1 in Miller and Wahlberg (2013); © Miller and Wahlberg 2013; https://doi.org/10.3389/fphys.2013.00052. Licensed under CC BY 3.0; https://creativecommons.org/licenses/by/3.0/
Fig. 12.15 Beaked whale click waveform (a), spectrogram (b Hann window, 40-point FFT, 98% overlap), and spectrum (c Hann window, 256-point FFT; dashed line shows ambient noise). Baumann-Pickering et al. (2010). © Acoustical Society of America, 2010. All rights reserved

The directionality of the echolocation sound beam in odontocetes has been studied for many years (Au 1993, 2015; Au et al. 1985, 1986, 1999; Kloepper et al. 2012; Koblitz et al. 2012). Recent work reveals that odontocetes control the shape and direction of the beam (Moore et al. 2008; Wisniewska et al. 2015). A bottlenose dolphin with its head stationary and its mouth on a biteplate moved its sound beam by 26° to the left and 21° to the right when echolocating a movable sphere 9 m away (Moore et al. 2008). Wisniewska et al. (2015) used two-dimensional hydrophone arrays to verify that harbor porpoises approaching a target (a dead fish) voluntarily change the diameter of their echolocation beam to increase the ensonified area by 100–200%, while reducing the interval between clicks in the buzz phase just before prey capture (Fig. 12.16). These changes are analogous to what a bat will do when capturing an insect (Jakobsen et al. 2015). Wild Amazon river dolphins (Inia geoffrensis) also increase the beam width during prey capture (Ladegaard et al. 2017). Increasing the beam width helps the porpoise (or bat) track moving prey at close proximity. Presumably, the musculature around the melon helps control the beam width and direction in porpoises and dolphins (Moore et al. 2008), but this needs verification.
Fig. 12.16 The harbor porpoise can increase the ensonified area by nearly 200% during the buzz phase with short inter-click intervals (ICI in b, blue). The large diameter circle (solid in a) illustrates the beam width for clicks with short intervals. The small diameter circle (dashed in a) shows the beam width of clicks with longer intervals emitted in the search phase at longer distances (ICI in b, red). © Wisniewska et al. 2015; https://elifesciences.org/articles/05651. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/. All rights reserved
Fig. 12.17 Short broadband artificial clicks generated between the phonic lips (right lip: solid arrow and curve; left lip: dashed arrow and curve) of a cadaver harbor porpoise. With air in the vestibular air sacs (right image), the clicks emerge at the midline. Without air in the vestibular air sacs (left image), the clicks emerge on either side of the midline depending on where the artificial click was generated (clicks generated between the right pair of phonic lips emerge to the left and vice versa). Adapted with permission from Miller LA (2010); Prey Capture by Harbor Porpoises (Phocoena phocoena): A Comparison Between Echolocators in the Field and in Captivity; J Marine Acoust Soc Jpn 37 (3):156–168. © The Marine Acoustics Society of Japan, 2010
The direction of the sound beam from the head of a porpoise carcass can be changed by artificially inflating the vestibular air sacs (Miller 2010). With no air in the vestibular air sacs, a broadband click generated by a small hydrophone between the right pair of phonic lips projects left of the midline and vice versa with an artificial click generated between the left phonic lip pair. With air in the vestibular air sacs, the artificial clicks project out the midline (Fig. 12.17; see also Starkhammar et al. 2011; Cranford et al. 2014). Incidentally, the exiting click remained broadband in these experiments, indicating that the living melon and associated tissues are necessary for producing a high-frequency, narrowband click typical for the harbor porpoise (Madsen et al. 2010).

The primordial odontocete echolocation signal was probably a short, broadband click similar to the clicks used by most living dolphins and the sperm whale (Fig. 12.10, left). In contrast, the La Plata dolphin (Pontoporia blainvillei), six small dolphins (family Delphinidae), all porpoises (family Phocoenidae, six species with four documented), and the pygmy and dwarf sperm whales (family Kogiidae) use narrowband, high-frequency (NBHF) echolocation clicks (see Fig. 12.14). The change from broadband to NBHF echolocation clicks could reflect predation pressure by killer whales (and their ancestors), as well as environmental factors (Andersen and Amundin 1976; Madsen et al. 2005; Morisaka and Connor 2007; Miller and Wahlberg 2013; Galatius et al. 2019). NBHF clicks appear to be generated in the melon and associated tissues (Madsen et al. 2010). It is assumed that all odontocetes can control the amplitude of echolocation clicks, steer the sound beam, and manipulate its width (Moore et al. 2008; Wisniewska et al. 2015). These features are of obvious advantage for detecting and tracking prey. There are rich possibilities in future research of sound production and the use of echolocation by odontocete whales.

12.5.2 Hearing Anatomy and Echolocation Abilities

We refer to Vol. 2 Chap. 9 on aquatic mammals for more detail on hearing anatomy and abilities. Here, we focus on the hearing abilities of odontocetes as they relate to the tasks of obstacle and prey detection by echolocation.

Experimental studies show that the bottlenose dolphin (Li et al. 2011), the false killer whale (Nachtigall and Supin 2008), and the harbor porpoise (Linnenschmidt et al. 2012, 2013) have voluntary control over the level of the emitted click and of their auditory sensitivity during echolocation tasks. The results from the harbor porpoise clearly illustrate active hearing during the echolocation of targets: the porpoise maintains a constant level of auditory perception independent of target distance. If the distance to a target is doubled, the level of a click impinging on the target is halved (-6 dB). To compensate for this, the porpoise doubles the level of the outgoing click (+6 dB), keeping the level of the incident sound on the target constant and independent of distance (within a certain range). However, the returning echo is halved (-6 dB) at double the distance. Linnenschmidt et al. (2012) showed that there is an "automatic gain control" in the auditory system of the porpoise such that its hearing increases in sensitivity by about +6 dB to compensate for the loss in the echo level over double the distance. Without compensating for the level of the outgoing click and the gain control in the auditory system, the echo level would drop to 1/4 (-12 dB) per doubling of distance to the target, making echolocation more difficult for the whale.
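A small worked sketch (ours) of the levels involved when the target distance doubles may help keep these ±6-dB adjustments straight:

```python
import math

def perceived_echo_change_db(distance_ratio, click_boost_db, hearing_gain_db):
    """Change in the perceived echo level when the target distance is
    multiplied by distance_ratio, for an animal that raises its click level
    and its hearing sensitivity as described in the text (two-way spreading)."""
    two_way_loss = -40.0 * math.log10(distance_ratio)   # -12 dB per doubling
    return two_way_loss + click_boost_db + hearing_gain_db

# Doubling the distance: -12 dB with no compensation; ~0 dB when the click is
# boosted by +6 dB and auditory sensitivity increases by another +6 dB.
print(perceived_echo_change_db(2.0, 0.0, 0.0), perceived_echo_change_db(2.0, 6.0, 6.0))
```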
Toothed whales obviously find their prey using echolocation, but how they discriminate between prey species is not known and, to our knowledge, has not been studied experimentally. Probably the most spectacular use of echolocation to find prey is shown by bottlenose dolphins in the Grand Bahamas. The dolphins often find fish under the sand using their echolocation and stick their proboscis down in the sand, sometimes to the pectoral fins, and come up with a fish in their mouths (Rossbach and Herzing 1997). What echo information they use for this unusual behavior is unknown. Harbor porpoises can discriminate between identical spheres of different materials (Wisniewska et al. 2012). Three harbor porpoises were easily able to distinguish between an aluminum sphere and spheres of plexiglas, PVC, and brass. Two of the three had problems differentiating aluminum from steel spheres. The spectra of these two spheres were very similar, so we assume the harbor porpoises were using spectral information to detect the differences among
440 S. M. M. Brinkløv et al.

Fig. 12.18 Underwater audiograms of four (Grampus griseus) auditory evoked response audiogram
odontocetes. Blue: Harbor porpoise behavioral audiogram using a 20-ms sinusoidal amplitude-modulated stimulus
using a 50-ms sound stimulus (Kastelein et al. 2010). (Nachtigall et al. 2005). Yellow: Killer whale average
Orange: White-beaked dolphin auditory evoked response behavioral audiogram of two animals using a 2-s tone
audiogram using a 1-s sinusoidal amplitude-modulated (Szymanski et al. 1999)
stimulus (Nachtigall et al. 2008). Purple: Risso’s dolphin

the spheres. Perhaps they also use spectral infor- Price et al. 2004). Neither seem to use echolocation
mation together with target strength to distinguish to find food, but rather for crude orientation in dark
between different fish species. caves or tunnels where they roost and nest. Argu-
All echolocating toothed whales have a ably, bird echolocation systems are not a highly
U-shaped audiogram (Fig. 12.18) and a broad evolved sensory specialization in the same sense as
range of hearing extending up to 200 kHz. In in bats and odontocetes.
general, the hearing of odontocetes is most sensi- Disregarding nesting habits, oilbirds and
tive at the frequencies used for echolocation. For swiftlets have very different ecologies. Oilbirds
example, the harbor porpoise, a narrow-band are nocturnal fruit-eaters from the tropical part of
high-frequency species, is most sensitive at South America (Chantler et al. 1999). Swiftlets
around 130 kHz, the peak frequency of its narrow occur across the Indo-Pacific and use vision to
band signal. The killer whale uses lower locate insect prey during the day. There are
frequencies in its echolocation signals and its records of swiftlets hunting at dusk, but it is
best hearing is accordingly lower (Fig. 12.18). unclear if they use echolocation during this activ-
ity (Price et al. 2004; Fullard et al. 1993).

12.6 Echolocation in Birds


12.6.1 Sound Production and Signal
The oilbird (Steatornis caripensis, family Characteristics
Steatornithidae), and a subset of the swiftlets, fam-
ily Apodidae (about 16 of 27 species, currently Like other birds, oilbirds and swiftlets produce
including Aerodramus spp and Collocalia sounds, including their biosonar signals, by
troglodytes) are the only birds known to echolocate inducing vibrations in air passed by membranous
(Griffin 1958; Novick 1959; Chantler et al. 1999; structures in their syrinx (see Vol. 2, Chap. 6).
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 441

Fig. 12.19 Schematic of syrinx anatomy in the oilbird into the two bronchi. Note the lack of intrinsic syringeal
(based on Suthers and Hector 1988, Fig. 12.2) and the muscles (mm. broncholateralis) in the swiftlet. Note also
Australian grey swiftlet (Aerodramus (formerly the asymmetry of the bronchial oilbird syrinx with a more
Collocalia) spodiopygia; based on Suthers and Hector cranial placement of the right semi-syrinx. Adapted by
1982, Fig. 12.2), showing the trachea and its bifurcation S. Brinkløv

Suthers and Hector (1982, 1985) revealed distinct lack intrinsic syringeal muscles (Fig. 12.19) and
differences in the syringeal morphology of instead contract extrinsic tracheolateralis muscles
oilbirds and swiftlets (Fig. 12.19) but proposed to terminate their echolocation clicks (Suthers and
similar sound production mechanisms in both. Hector 1982).
Oilbirds have a bronchial syrinx located caudal Bird biosonar signals are relatively broadband
to the tracheal bifurcation. The two half-syringes and without structured frequency changes over
are placed with bilateral asymmetry in the two time (Pye 1980). In this sense, they resemble the
bronchi (Suthers and Hector 1985). The swiftlet tongue-clicks of rousettes bats more than the
syrinx is tracheobronchial (i.e., located where the signals produced by other echolocators, but with
trachea splits into the two bronchi; Suthers and a narrower frequency range, longer duration, and
Hector 1982). lacking similarly well-defined on- and offsets
Suthers and Hector suggested that biosonar (Fig. 12.20).
signals in both oilbirds and swiftlets are produced In the wild, oilbirds emit click-bursts of two or
as a contraction of the extrinsic sternotrachealis more single clicks in rapid succession
muscles pulls the trachea caudal. This reduces (Fig. 12.20). Their clicks and click intervals are
tension across the syrinx and causes the syringeal stereotyped within such a burst, with click
membranes to fold into the syrinx lumen, where durations of 0.5–1 ms and click intervals of
they induce vibrations of the expiratory airflow. ~2.5 ms. Clicks recorded from oilbirds in the
Contrary to their other vocalizations, oilbirds and wild have the most energy around 10–15 kHz
swiftlets actively terminate their echolocation but extend from 7 to 23 kHz measured at 6 dB
clicks but do so by using different sets of muscles. from the peak frequency (Brinkløv et al. 2017).
In oilbirds, termination is controlled by contrac- The intervals between click-bursts are more vari-
tion of the broncholateralis muscles intrinsic to able, but often around 200 ms (Griffin 1953).
the syrinx (Suthers and Hector 1985). Swiftlets Each click-burst is perceived by human ears as
442 S. M. M. Brinkløv et al.

Fig. 12.20 Waveform and spectrogram displays of bird to its nest in a Sri Lankan railway tunnel. The overall
echolocation click sequences. Top panel: oilbird timescale is 1 s, frequency scale is from 0 to 20 kHz.
(Steatornis caripensis) exiting cave roost, recorded at Spectrogram settings: FFT size 256, Hann window, 98%
Dunstan’s Cave, Asa Wright Nature Centre, Trinidad. overlap. Both recordings are high-pass filtered at 1 kHz
Bottom panel: swiftlet (Aerodramus unicolor) returning (second order Butterworth filter)

one coherent sound (Konishi and Knudsen 1979). could be affected by reverberant confines or the
It is unresolved whether the number of individual stress of handling/being restrained.
clicks in a burst has functional meaning to the Swiftlets emit biosonar signals either as single
oilbird, but recent studies indicate that oilbirds or double clicks (two single clicks in rapid suc-
may add click subunits to a burst as a means to cession, Thomassen et al. 2004; Fig. 12.20). As in
increase overall burst energy and, as a result, the oilbirds, it is unclear if the difference between
echolocation range (Brinkløv et al. 2017). Click- single and double clicks has functional meaning
bursts typically have source levels of around to the swiftlets or is merely an artifact of the
100 dB re 20 μPa rms at 1 m (Brinkløv et al. sound production mechanism (Suthers and Hec-
2017). tor 1982). Of 12 swiftlet species studied, only the
Data from captive oilbirds differ somewhat Atui swiftlet (Aerodramus sawtelli) appears to
from field recordings. Konishi and Knudsen consistently produce single clicks (Fullard et al.
(1979) reported that oilbird signals had most 1993), while the rest emit both single and, more
energy around 2 kHz and described each click often, double-clicks. Each click of a pair is
as a pulse-like sound burst of 20 ms or more. 1–8 ms long, with the second often of higher
Suthers and Hector (1985) described a large sig- amplitude and slightly longer duration (Griffin
nal variation including continuous pulsed signals and Suthers 1970; Suthers and Hector 1982;
of 40–80 ms and shorter single or double pulses. Coles et al. 1987). Clicks within a pair have
This difference between field and captive data intervals of 1–25 ms and click-pairs are emitted
possibly indicates that the sounds of captive at intervals of 50–350 ms. Swiftlet clicks have
birds do not accurately reflect the echolocation most energy below 10 kHz (see spectrogram in
behavior of birds in the wild since vocalization Fig. 12.20).
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 443

12.6.2 Hearing Anatomy nuclei magnocellularis and nuclei laminaris com-


and Echolocation Abilities pared to non-echolocating swiftlets, structures
that are both involved in temporal coding of audi-
While the auditory systems of echolocating bats tory stimuli. The nucleus angularis appears to be
and odontocetes include specializations that con- enlarged in oilbirds (Kubke et al. 2004) and is
fer increased acuity and sensitivity, only a few known to process intensity information in barn
such morphological or neurological owls (Tyto alba). Iwaniuk et al. (2006) concluded
specializations have been found in echolocating that oilbirds and swiftlets may have enlarged
birds. Tomassen et al. (2007) used three- MLds (nucleus mesencephalicus lateralis, pars
dimensional, micro-CT scans to model the middle dorsalis), a structure homologous to the mamma-
ear function of a range of swiftlet species. They lian inferior colliculus. However, this enlarge-
found no morphological adaptations in the middle ment was only apparent compared to closely
ear single bone-lever system of the birds related non-echolocating species, not to
(Fig. 12.21) to improve impedance-matching in non-echolocating birds in general.
echolocating compared to non-echolocating spe- The hearing abilities of both oilbirds and
cies. Both had low tympanum-to-oval-window swiftlets have been tested using neurophysiologi-
ratios relative to bird auditory specialists such as cal approaches and indirectly through obstacle
owls. Birds have a straight, rather than coiled avoidance experiments. Measurements of
cochlea (Fig. 12.21) and generally do not hear cochlear and evoked potentials from the forebrain
much above 10 kHz (Fig. 12.9, also see Manley nucleus of anesthetized oilbirds empirically sup-
1990, p. 238). port the absence of inner ear specializations for
While peripheral auditory adaptations for echolocation. Oilbirds appear to be more or less
echolocation seem absent in birds, there is some insensitive to frequencies above 6 kHz and their
evidence that certain of the brain nuclei involved best auditory sensitivity is at ~2 kHz (Fig. 12.9,
in auditory processing are enlarged in and Konishi and Knudsen 1979). Single neuron
echolocating bird species. Thomassen (2005) recordings from the midbrain auditory nucleus of
found that echolocating swiftlets have larger the echolocating Australian grey swiftlet showed

Fig. 12.21 Overview of avian and mammalian middle Springer Nature. Manley GA, Peripheral hearing
and inner ear anatomy. Left: Birds have a single middle ear mechanisms in reptiles and birds; https://www.springer.
bone (columella) and a straight cochlea. Right: Mammals com/gp/book/9783642836176. # Springer Nature, 1990.
have three middle ear bones (malleus, incus, and stapes) All rights reserved
and a coiled cochlea. Adapted by permission from
444 S. M. M. Brinkløv et al.

best thresholds at 1–5 kHz (Fig. 12.9 and Coles relative to nights with more ambient light. The
et al. 1987). Hence, both oilbirds and swiftlets higher intensity of click-bursts emitted on darker
appear to have the ‘standard’ bird hearing range, nights resulted both from an increase in the ampli-
with lowest thresholds between 2 and 4 kHz and tude of individual clicks and an increase in the
poor sensitivity above 10 kHz (Dooling 1980). number of individual clicks per click-burst. Sev-
Curiously, it appears that oilbirds in the wild emit eral studies have noted that swiftlets increase
echolocation clicks that are not well-aligned to click repetition rate as they approach obstacles
their best area of hearing. The lack of external ear (Griffin and Suthers 1970; Coles et al. 1987)
structures in oilbirds and swiftlets means that and Atiu swiftlets emit signals at higher repetition
directional cues occur at frequencies predicted rate when they enter than when they emerge from
by head size. their cave roost (Fullard et al. 1993).
With echolocation signals matching their most Nesting in dark places, such as caves, mines,
sensitive area of hearing, oilbirds and swiftlets tunnels, and other places where the lighting is
should detect objects down to at least 17 cm in uncertain, is a common feature of the ecology of
diameter, equal to the wavelength of the signal at oilbirds and echolocating swiftlets. Both start
2 kHz. For Oilbirds, this prediction is supported clicking as they cross a threshold from light to
by obstacle-avoidance experiments, suggesting dark (Fenton 1975; Thomassen 2005; Brinkløv
that they detect discs 20 cm in diameter et al. 2017). Neither have been shown to use
suspended from the ceiling of their cave roost echolocation for foraging, although oilbirds may
(Konishi and Knudsen 1979). However, detection be able to detect some of the larger fruits they eat
thresholds between 0.6 and 2 cm have been found (palm fruits up to 6 cm) by echolocation (Snow
for swiftlets (Griffin and Suthers 1970; Fenton 1961, 1962; Bosque et al. 1995).
1975; Griffin and Thompson 1982; Smyth and
Roberts 1983), indicating that they may somehow
extract echo information from the upper, albeit 12.7 Orientation and Echolocation
weaker, frequency range of their signals. in Insectivores and Rodents
Like bats and odontocetes, oilbirds and
swiftlets detect obstacles in dark spaces using 12.7.1 Echo-Based Orientation
echolocation. Unlike bats and odontocetes, in Insectivores: Tenrecs
echolocating birds, even the nocturnal oilbird, and Shrews
are also vision specialists and presumably do not
forage by echolocation. The importance of vision Tenrecs and shrews are small insectivorous
in oilbirds is reflected in their specialized retinal mammals that forage in dense vegetation or
morphology with multiple layers of under leaf-litter (Fig. 12.22). Tenrecs are largely
photoreceptors (Martin et al. 2004). Initial behav- endemic to Madagascar, but shrews have a wide
ioral experiments revealed that oilbirds flying in distribution across Eurasia and North America.
darkness consistently produced sounds but could Both have tiny eyes and a presumably well-
not avoid obstacles if their ears were blocked. developed olfactory sense and emit a variety of
With the lights on, the birds, in contrast, produced sounds. The use of sounds by shrews and tenrecs,
fewer or no sounds and negotiated obstacles also as they approach and explore unfamiliar objects
with their ears blocked (Griffin 1953). in their surroundings, led to initial suggestions
Biosonar signals of birds are generally stereo- that they may use echolocation. However, few
typed (Thomassen and Povel 2006) and there is studies have successfully tested this hypothesis
no indication that birds have similar adaptive directly. The current consensus is that shrews
control over signal frequency as most and tenrecs may use a simple echo-based orienta-
echolocating bats. However, Brinkløv et al. tion system to obtain rough acoustic input about
(2017) recently found that the intensity of oilbird their surroundings at short range beyond their
echolocation signals increased on darker nights snout and vibrissae. As stated by Siemers et al.
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 445

Fig. 12.22 Photographs (from left) of lowland streaked by Wilfried Berns, 2006, https://en.wikipedia.org/wiki/
tenrec (Hemicentetes semispinosus), lesser hedgehog ten- Lesser_hedgehog_tenrec#/media/File:Kleiner-igeltanrek-
rec (Echinops telfairi), and northern short-tailed shrew a.jpg. Photo of northern short-tailed shrew by Giles
(Blarina brevicauda). Photo of lowland streaked tenrec Gonthier, 2007, https://en.wikipedia.org/wiki/Northern_
by Frank Vassen, 2010, https://commons.wikimedia.org/ short-tailed_shrew#/media/File:Blarina_brevicauda.jpg.
wiki/File:Lowland_Streaked_Tenrec,_Mantadia,_ All photos licensed under CC BY 2.0; https://
Madagascar.jpg#filelinks. Photo of lesser hedgehog tenrec creativecommons.org/licenses/by/2.0/deed.en

(2009): “Except for large and thus strongly series of tongue clicks, each less than 2 ms long
reflecting objects, such as a big stone or tree with most energy between 10 and 16 kHz. The
trunk, shrews probably are not able to disentangle clicks were produced as singles, doubles, or in
echo scenes, but rather derive information on triplets. Streaked tenrecs (Hemicentetes
habitat type from the overall call reverberations. semispinosus) emitted clicks of low intensity;
This might be comparable to human hearing while those of Nesogale dobsoni were audible to
whether one calls into a forest or into a reverber- humans at 7 m.
ant cave.” Gould et al. (1964) found that, contrary to the
Gould et al. (1964) and Gould (1965) provided audible pulses of tenrecs, shrews (Sorex vagrans,
the most direct evidence for echo-based orienta- S. cinereus, S. palustris, and Blarina brevicauda)
tion in several species of shrews and tenrecs. searching for the platform emitted ultrasonic
After unsuccessful attempts to use an obstacle- pulses with most energy between 30 and
avoidance set-up, the animals were instead tested 60 kHz. The pulses were about 5 ms in duration
using a so-called disc-platform apparatus. They with inter-pulse intervals of about 20 ms. Sanchez
were trained to find and jump onto a platform et al. (2019) recorded five Sorex unguiculatus in
suspended at a vertical distance below a disc three different experimental setups, including soft
with an area of partial overlap. The location of and hard barrier obstacles. Under all three
the overlap was varied at random between trials. conditions, the shrews emitted a variety of calls,
Both tenrecs and shrews emitted sounds during including clicks and several tonal pulse types
this task in the dark, but animals with their ears ranging in frequency between 5 and 45 kHz
blocked were less successful in finding and land- with durations of 3–40 ms. While several studies
ing on the platform than control animals. The have shown that shrews and tenrecs do show
control experiments included two tenrecs that context-dependent changes in vocalization rate,
were blindfolded. there is little direct evidence for echolocation by
Gould (1965) recorded the sound pulses emit- these animals (Buchler 1976; Tomasi 1979;
ted by captive tenrecs (Echinops telfairi, Forsman and Malmquist 1988; Siemers et al.
Hemicentetes semispinosus, and Nesogale (for- 2009; Sanchez et al. 2019).
merly Microgale) dobsoni) as they explored the No morphological adaptations for echoloca-
disk-platform apparatus. The tenrecs emitted tion have been found in the auditory systems of
446 S. M. M. Brinkløv et al.

tenrecs or shrews. The limited data on hearing in Supplementing the behavioral part of their
these animals indicate that at least tenrecs hear study, He et al. (2021) also conducted anatomical
well across the frequency range of their tongue- scans to reveal that the stylohyal bone of soft-
clicks. Sales and Pye (1974) reported that the furred tree mice is fused with the tympanic bone,
hearing of streaked tenrecs is most sensitive which is characteristic of echolocating bats.
from 2 to 60 kHz. Drexl et al. (2003) used Lastly, they used genetic analyses to document a
otoacoustic emissions and auditory evoked strong convergence of hearing-related genes with
potentials from the inferior colliculus and the those of other echolocating mammal groups,
auditory cortex to determine that the auditory including the prestin gene associated with echo-
range of lesser hedgehog tenrecs (Echinops location in bats and toothed whales (Liu et al.
telfairi) extends from 5–50 kHz at 40 dB SPL, 2014). All four species of soft-furred tree mice
with a lowest threshold at 16 kHz. Siemers et al. emit similar short (~2 ms) ultrasonic pulses rang-
(2009) report a best hearing range of shrews ing from 65 to 140 kHz (He et al. 2021).
between 2 and 20 kHz.

12.8 Are Echolocation Signals also


12.7.2 Echolocation in Rodents Used for Communication?

One important test for echolocation is to blind the Studies on the role of echolocation signals for
echolocator. This was done by Griffin (1958) for intraspecific communication have included
bats and by Norris et al. (1961) for dolphins. observations and recordings, playback
Although such a “blinding test” was not experiments, and combinations of these
performed, a multifaceted study by He et al. approaches. Echolocation signals elicited territo-
(2021) convincingly suggests soft-furred tree rial behavior in foraging spotted bats, served in
mice (Typhlomys) must be added to the list of individual recognition, and assisted in
echolocating animals. Through behavioral maintaining group adhesion among foraging
experiments in total darkness, filmed with an molossids (Fenton 1995). Furthermore, bats use
infrared video camera, they showed that all four buzzes (high pulse repetition rates) not only when
species of soft-furred tree mouse emitted acoustic attacking prey, but also during landing, drinking
pulses at higher rate and grouped pulses more in and by several species in social settings (e.g.,
complex space than open space and during obsta- Schwartz et al. 2007). Many bat species roost in
cle avoidance. Further, three species (T. cinereus, large groups in caves and emerge at dusk as a
T. daloushanensis, and T. nanus) were tested in a group to forage. Several toothed whale species
disk-platform setup similar to that used by Gould forage in large numbers. Echolocation in bats and
et al. (1964) for shrews and tenrecs. The tree mice odontocetes likely plays a role in maintaining
spent increased time emitting higher pulse rates spacing among group members during foraging
on the sector of the disk above the platform before or during large group movements. However, there
dropping down onto the platform. This preference has been little research on whether all or only
was lost when their ears were blocked but specific animals echolocate while foraging as a
regained when the ears were unplugged or fitted group. The benefits of eavesdropping on each
with hollow tubes. The study also used laboratory other’s echolocation signals need to be studied.
house mice (Mus musculus) as a control to dem- Groups of flying bats and swimming toothed
onstrate absence of any location preference or whales surely eavesdrop on each other’s echolo-
sound emission during the disk-platform test. cation signals to gain general information about
Myriad tests and field studies document the func- prey location. The energetic cost of sound pro-
tional use of echolocation by bats and toothed duction for flying bats and for clicking dolphins is
whales, but such studies are not available for negligible (Speakman and Racey 1991; Noren
insectivores and rodents. et al. 2017).
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 447

Evidence suggests that toothed whales use greater than that of silent dolphins indicating
their echolocation clicks as communication that echolocation is not energetically costly
signals. These comprise repeated patterns of (Noren et al. 2017).
rising, falling, or constant click repetition rates Several free-ranging species of dolphins
up to near 1000 clicks/s. Clicks used for commu- (Tursiops truncatus, Stenella attenuata,
nication by dolphins and porpoises have the same S. longirostris, S. frontalis, Orcinus orca, and
spectral properties as those used for echolocation, Cephalorhynchus hectori) use pulse-bursts
but this does not hold true for the coda-clicks of mostly during affiliative and aggressive behavior
sperm whales, as explained below. (Dawson 1991; Herzing 2000; Lammers et al.
In toothed whales, most is known about the 2004). Rasmussen et al. (2016) played back arti-
communication role of echolocation clicks from ficial pulse-burst signals (repeated at 300 clicks/
studies of captive harbor porpoises, captive s for 2 s) to 21 free-ranging white-beaked
bottlenose dolphins, and wild sperm whales. dolphins. Rather than responding with aggressive
Porpoises and dolphins communicate with chang- behavior, the dolphins showed mostly a change in
ing click repetition rates, rather like Morse code, swimming direction and swam around the projec-
without changing the temporal and spectral tion equipment, mirroring the retreat of individual
properties of the clicks (Rasmussen and Miller captive harbor porpoises receiving an ‘aggres-
2002; Clausen et al. 2010). These “pulse-bursts” sive’ pulse-burst. The pulse-bursts, or rasps, of
(or burst-pulse sounds) of high repetition rate Blainville’s beaked whale are only emitted at
clicks with narrow sound beams are especially depths below 200 m and composed of a series
good for close range and directed communication of short, FM clicks similar to its FM echolocation
(Clausen et al. 2010). clicks, except with a lower peak-frequency. The
Figure 12.23 shows click rates used in five communication context is not known (Arranz
behavioral contexts between a mother harbor por- et al. 2011).
poise and her calf. The porpoises used the highest Sperm whales are social and form social units
click rates in aggressive encounters, the lowest in in subtropical and tropical waters worldwide. Up
grooming and echelon swimming (Clausen et al. to 12 females with young of both sexes gather in
2010). The mother may be aggressive toward her long-term stable social units. Sperm whales in all
calf and toward males. Aggressive signals were ocean basins communicate using rhythmic
usually higher in intensity and repetition rates and “coda” clicks (see Fig. 12.12), which are a unique
always resulted in the other animal moving away specialization among toothed whales (Watkins
from the emitter. Both mother and calf emitted and Schevill 1977) and may even signify individ-
approach signals, but only the calf emitted contact ual identity. The composition of codas can have
signals and only the mother emitted grooming many repetitive patterns, such as one click + a
signals. Wild harbor porpoises also use rapid group of three clicks: 1 + 3, or 2 + 1 + 1 + 1,
click rates for communication (Sørensen et al. 1 + 1 + 3, etc. The coda patterns are not stereo-
2018). typed; click intervals within a coda can vary and
Bottlenose dolphins use both echolocation seem to contain information for the receiver. One
clicks and whistles as communication signals. stable social unit of five adult females, a juvenile
Blomkvist and Amundin (2004) studied two cap- male, and a calf in the waters off Dominica used
tive female bottlenose dolphins that used high- 15 different codas. All individuals in the unit used
frequency, high repetition rate pulse-bursts dur- several codas and one individual used 11 of the
ing aggressive behavior. The pulse-bursts lasted 15 codas (Antunes et al. 2011). A recent study
up to 900 ms with click repetition rates from (Oliveira et al. 2016) confirmed and extended
100 to 940 clicks/s. Like the echolocation clicks those of Antunes et al. (2011). Using digital data
used for orientation and foraging, the pulses were acquisition tags (D-tags) attached to five individ-
between 60 and 150 kHz. The metabolic rate of ual sperm whales near the Azores, Oliveira et al.
dolphins producing clicks was only slightly (2016) strongly indicated that codas from these
448 S. M. M. Brinkløv et al.

Fig. 12.23 Use of echolocation click rates by harbor Beedholm K, Dereuiter S, Madsen PT, Click communica-
porpoise as communication signals. Five different acoustic tion in harbor porpoises (Phocoena phocoena). Bioacous-
behaviors with seven events in each are shown. Note the tics 20:1–28; https://www.tandfonline.com/doi/abs/10.
very rapid increase in click repetition rate up to 1000 1080/09524622.2011.9753630. # Taylor & Francis,
clicks/s during aggressive encounters. Reprinted with per- 2011. All rights reserved
mission from Taylor & Francis. Clausen KT, Wahlberg M,
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 449

sperm whales contained individual identification 12.9 Summary


information. Some of the patterns can be distinct
from one area to another while others, like the To date, highly specialized echolocation systems
five-click coda, occurred in geographically wide- have evolved in many bat species and in toothed
spread social units. We have yet to reach a whales. Oilbirds and swiftlets also make use of a
detailed understanding of the use of codas by cruder type of echolocation, independent of obvi-
sperm whales, but codas may carry specific ous auditory specializations, for orientation when
behavioral information from individual sperm their visual abilities become insufficient. A more
whales. complete understanding of echolocation by birds
Sperm whale coda-clicks resemble biosonar- awaits future studies. A form of echo-based ori-
clicks (Fig. 12.12) and the same basic mechanism entation may be present in shrews and tenrecs, but
likely underlies the production of both. However, the exact extent of its function still needs proper
whereas the biosonar-click largely bypasses the documentation.
distal air sac, reducing the strength of back Most echolocators use ultrasonic signals,
reflections (P1 etc. in Fig. 12.12), the (Po) of the either broadband clicks (including most toothed
coda-click seems to exit the rostrum more dor- whales, rousette bats, oilbirds and swiftlets) or, as
sally (see Fig. 12.12). It thus hits a larger portion in most bats, tonal echolocation calls of constant
of the distal air sac and reflects to a larger extent frequency, frequency-modulated sweeps, or a
back to the frontal air sac producing the P1. This combination of these call types. Generally, echo-
difference is indicated by the smaller dB differ- location signals have high amplitude to promote
ence between the Po and P1 components for coda long-range transmission. Bats and dolphins emit
clicks relative to biosonar clicks (Fig. 12.12). The echolocation signals in a narrow beam, a sort of
large muscle and tendon layer between the dorsal acoustic flashlight, to focus their search. In both
edges of the cranium to the tip of the rostrum bats and dolphins, the repetition rate of signals
could play a role in directing the click. The initial increases as they approach a target. Bats and
coda click (Po) is lower in frequency and intensity dolphins can adjust the frequency and amplitude
than the biosonar click (Fig. 12.12, relative ampli- of their biosonar signals to adapt to noisy ambient
tude values). The intervals between repetitions of conditions. Most echolocators do not broadcast
a coda click match those of a biosonar click from and receive echolocation signals at the same time
the same animal (Fig. 12.12b) and reflect the but separate the outgoing pulse from the echo in
distance between the distal (Di) and frontal time to minimize the masking of faint echoes by
(Fr) air sacs (see Fig. 12.12). The properties of the next outgoing signal. However, some families
the coda clicks make them more suited for close- of bats are overlap-tolerant and emit long echolo-
range and less directional communication than cation signals of constant frequency while listen-
the more intense, higher frequency biosonar ing for Doppler-shifted echoes returned by prey
clicks (Fig. 12.13). items.
Whether echolocation signals serve a role for Hearing anatomy, physiology, and abilities in
intraspecific communication in birds and bats and dolphins have been well-studied. Bats
insectivores has, to our knowledge, not been stud- have a tragus and grooves in their pinnae that aid
ied, but Suthers and Hector (1988) hypothesized in signal reception and directional hearing. In
that individual differences of the syrinx anatomy, contrast, dolphins do not have pinnae but have
specifically the position of the syringeal evolved asymmetrical skull bones that aid in
membranes, would allow oilbirds to distinguish directional hearing. Some bats emit echolocation
own from conspecific signals by differences in signals through their nose and have elaborate
the spectral characteristics of their clicks. nose-leafs while others are open-mouth
450 S. M. M. Brinkløv et al.

echolocators. Bats produce their echolocation the Dark. While now more than 60 years old, the
sounds in the larynx. Dolphins emit echolocation original observations and insights detailed by
sounds through the melon within their forehead Griffin (1958) are still very much to the point
and from here into the water. They have phonic and relevant today. The Springer Handbook of
lips in their nasal passage to produce their echo- Auditory Research volumes Hearing by Bats,
location clicks and communication whistles. Bat Bioacoustics, Hearing by Whales and
A primary advantage of echolocation is Dolphins, and Biosonar are also highly
allowing animals to operate and orient in recommended as they hold much more detail
situations where light is uncertain, unpredictable, than the present description. Finally, Thomas,
or plain absent. But as with other sensory Moss, and Vater edited a book on Echolocation
capacities, echolocation often does not stand in Bats and Dolphins in 2002.
alone. The cross-modal sensory interactions
between echolocation and sensory abilities such Acknowledgments We dedicate this chapter to
as touch, olfaction, and vision, is an area awaiting Dr. Annemarie Surlykke, who made substantial
contributions to the field of bioacoustics in insects and in
further exploration.
echolocating bats. She was one of the first women
Information leakage is a primary disadvantage scientists to concentrate her research in the area of bio-
of echolocation. The signals used in echolocation acoustics, which requires a multi-disciplinary understand-
are audible to many other animals, such as com- ing of biology, acoustics, physics, animal behavior, and
electrical engineering.
peting conspecifics, predators, and prey. The evo-
We appreciate the careful reviews of sections 5 and 8 by
lutionary arms race between echolocating bats Mats Amundin, Senior Advisor Kolmårdens Djurpark and
and some insect prey is a classic example of Guest Prof. Linkoping University, Sweden; Professor
predator–prey co-evolution. Signals used in echo- Peter T. Madsen, Department of Bioscience, Aarhus Uni-
versity, Denmark; and Associate Professor Magnus
location also can function in communication, as
Wahlberg, Institute of Biology, University of Southern
shown in echolocating bats and toothed whales. Denmark, Odense, Denmark. We acknowledge and appre-
Both bats and odontocetes are affected by ciate the initial outline of this chapter by now deceased
anthropogenic activities, as exemplified by the Jeanette Thomas.
high mortality experienced by some bat species
from wind turbines and incidents of drowning, for
example, in porpoises accidentally entangled in References
stationary gillnets. Anthropogenic sound sources
like road or shipping noise may interfere with Amundin M (1991) Sound production in odontocetes with
efficient foraging in bats and toothed whales and emphasis on the harbour porpoise Phocoena
seismic explosions used for offshore oil explora- phocoena. Stockholm University, Stockholm
Andersen SH, Amundin M (1976) Possible predator-
tion can affect the behavior of toothed whales and related adaption of sound production and hearing in
other marine mammals. Echolocating birds are the harbour porpoise (Phoconea phocoena). Aquat
also affected by humans, for example, from Mamm 4(2):56–57
poaching or nest collecting and habitat- Antunes R, Schulz T, Gero S, Whitehead H, Gordon J,
Rendell L (2011) Individual distinctive acoustic
destructive mining activity. Gaining an increased features in sperm whale codas. Anim Behav 81(4):
understanding of echolocation behavior in these 723–730
animals could have important implications for Arranz P, Aguilar de Soto N, Madsen PT, Brito A, Bordes
such issues and for wildlife management in F, Johnson MP (2011) Following a foraging fish-
finder: diel habitat use of Blainville’s beaked whales
general. revealed by echolocation. PLoS One 6(12). https://doi.
org/10.1371/journal.pone.0028353
Au WWL (1993) The sonar of dolphins. Springer,
12.10 Additional Resources New York
Au WWL (2015) History of dolphin biosonar research.
Acoust Tod 11(4):10–17
For a more in-depth view of bat echolocation, we Au WWL, Simmons JA (2007) Echolocation in dolphins
strongly recommend Griffin’s book Listening in and bats. Phys Tod 2007:40–45
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 451

Au WWL, Suthers RA (2014) Production of biosonar Handbook of the birds of the world, barn owls to
signals: structure and form. In: Surlykke A, Nachtigall hummingbirds, vol 5. Lynx, Barceloa, pp 388–457
PE, Fay RR, Popper AN (eds) Biosonar. Springer, Clausen KT, Wahlberg M, Beedholm K, Dereuiter S,
New York, pp 61–105. https://doi.org/10.1007/978-1- Madsen PT (2010) Click communication in harbour
4614-9146-0_3 porpoises (Phocoena phocoena). Bioacoustics 20:1–
Au WWL, Charder DA, Penner RH, Scronce BL (1985) 28
Demonstration of adaptation in beluga whale echolo- Coles RB, Konishi M, Pettigrew JD (1987) Hearing and
cation signals. J Acoust Soc Am 77:726–730 echolocation in the Australian Grey swiftlet,
Au WWL, Moore PWB, Pawloski D (1986) Echolocation Collocalia spodiopygia. J Exp Biol 129:365–371
transmitting beam of the Atlantic bottlenose dolphin. J Coles RB, Guppy A, Anderson ME, Schlegel P (1989)
Acoust Soc Am 80:688–691 Frequency sensitivity and directional hearing in the
Au WWL, Kastelein RA, Rippe T, Schooneman NM gleaning bat, Plecotus auritus (Linnaeus 1758). J
(1999) Transmission beam pattern and echolocation Comp Physiol A 165:269–280
signals of a harbor porpoise (Phocoena phocoena). J Cranford TW, Amundin M, Norris KS (1996) Functional
Acoust Soc Am 106:3699–3705 morphology and homology in the Odontocete nasal
Au WWL, Kastelein RA, Benoit-Bird KJ, Cranford TW, complex: implications for sound generation. J Morphol
McKenna MF (2006) Acoustic radiation from the head 228:223–285
of echolocating harbor porpoises (Phocoena Cranford TW, McKenna MF, Soldevilla MS, Wiggins
phocoena). J Exp Biol 209:2726–2733 SM, Goldbogen JA, Shadwick RE, Krysl P, Leger
Baumann-Pickering S, Wiggins SM, Roth EH, Roch MA, JA, Hildebrand JA (2008) Anatomic geometry of
Schnitzler HU, Hildebrand JA (2010) Echolocation sound transmission and reception in Cuvier’s beaked
signals of a beaked whale at Palmyra atoll. J Acoust whale (Ziphius cavirostris). Anat Rec 291:353–378
Soc Am 127(6):3790–3799. https://doi.org/10.1121/1. Cranford TW, Elsberry WR, Van Bonn WG, Jeffress JA,
3409478 Chaplin MS, Blackwood DJ, Carder DA,
Blomkvist C, Amundin M (2004) High-frequency burst- Kamolnick T, Todd MA, Ridgway SH (2011) Obser-
pulse sounds in agonistic/aggressive interactions in vation and analysis of sonar signal generation in the
bottlenose dolphins, Tursiops truncatus. In: Thomas bottlenose dolphin (Tursiops truncatus): evidence for
JA, Moss CF, Vater M (eds) Echolocation in bats and two sonar sources. J Exp Mar Biol Ecol 407(1):81–96
dolphins. University of Chicago Press, Chicago, pp Cranford TW, Trijoulet V, Smith CR, Krysl P (2014) Vali-
425–431 dation of a vibroacoustic finite element model using
Boonman AM, Jones G (2002) Intensity control during bottlenose dolphin simulations: the dolphin biosonar
target approach in echolocating bats; stereotypical beam is focused in stages. Bioacoustics 23(2):161–194
sensori-motor behaviour in Daubenton’s bats, Myotis Cranford TW, Amundin M, Krysl P (2015) Sound produc-
daubentonii. J Exp Biol 205:2865–2874 tion and sound reception in Delphinoids. In: Johnson
Bosque C, Ramirez R, Rodriguez D (1995) The diet of the CM, Herzing DL (eds) Dolphin communication and
oilbird in Venezuela. Ornitol Neotrop 6:67–80 cognition. Past, present, and future. MIT Press, Boston,
Brinkløv S, Fenton MB, Ratcliffe JM (2013) Echolocation MA, pp 19–48
in oilbirds and swiftlets. Front Physiol 4:123 Culik BM (2011) Odontocetes - the toothed whales, CMS
Brinkløv S, Elemans CPH, Ratcliffe JM (2017) Oilbirds Technical Series No. 24, vol 24. United Nations Envi-
produce echolocation signals beyond their best hearing ronmental Program, Bonn, Germany
range and adjust signal design to natural light Dalland JI (1965) Hearing sensitivity in bats. Science 150:
conditions. R Soc Open Sci 4(5):17025 1185–1186
Bruns V, Schmieszek E (1980) Cochlear innervation in the Dawson SM (1991) Clicks and communication: the
greater horseshoe bat - demonstration of an acoustic behavioural and social contexts of Hector’s dolphin
fovea. Hear Res 3(1):27–43. https://doi.org/10.1016/ vocalizations. Ethology 88:265–276
0378-5955(80)90006-4 Denzinger A, Schnitzler HU (2013) Bat guilds, a concept
Buchler ER (1976) The use of echolocation by the wan- to classify the highly diverse foraging and echolocation
dering shrew (Sorex vagrans). Anim Behav 24:858– behaviors of microchiropteran bats. Front Physiol
873 4. https://doi.org/10.3389/fphys.2013.00164
Burgin CJ, Colella JP, Kahn PL, Upham NS (2018) How Dooling RJ (1980) Behavior and psychophysics of hearing
many species of mammals are there? J Mammal 99(1): in birds. In: Popper AN, Fay RR (eds) Comparative
1–14. https://doi.org/10.1093/jmammal/gyx147 studies of hearing in vertebrates. Springer, New York,
Caruso F, Sciacca V, Bellia G, De Domenico E, Larosa G, pp 261–288
Papale E et al (2015) Size distribution of sperm whales Dormer KJ (1979) Mechanism of sound production and air
acoustically identified during long term deep-sea mon- recycling in delphinids: cineradiographic evidence. J
itoring in the Ionian Sea. PLoS One 10(12):e0144503. Acoust Soc Am 65(1):229–239
https://doi.org/10.1371/journal.pone.0144503 Drexl M, Faulstich MH, Von Stebut B, Radtke-Schuller S,
Chantler P, Wells DR, Schuchmann KL (1999) Family Kössl M (2003) Distortion product otoacoustic
Apodidae (swifts). In: Hoyo D, Elliott S (eds) emissions and auditory evoked potentials in the
452 S. M. M. Brinkløv et al.

hedgehog tenrec, Echinops telfairi. J Assoc Res Gould E (1965) Evidence for echolocation in the
Otolaryngol 4:555–564 Tenrecidae of Madagascar. Proc Am Philos Soc 109:
Elemans CPH, Mead AF, Jakobsen L, Ratcliffe JM (2011) 352–360
Superfast muscles set maximum call rate in Gould E, Negus NC, Novick A (1964) Evidence for echo-
echolocating bats. Science 333:1885–1888 location in shrews. J Exp Zool 156:19–37
Fais M, Johnson M, Wilson M, Aguilar Soto N, Madsen Griffin DR (1944) Echolocation by blind men, bats and
PT (2016) Sperm whale predator-prey interactions radar. Science 100:589–590
involve chasing and buzzing, but no acoustic stunning. Griffin DR (1953) Acoustic orientation in the oilbird,
Sci Rep 6:28562:1–13. https://doi.org/10.1038/ Steatornis. Proc Natl Acad Sci USA 39:884–893
srep28562 Griffin DR (1958) Listening in the dark, 2nd edn. Cornell
Falk B, Williams T, Aytekin M, Moss CF (2011) Adaptive University, New York
behavior for texture discrimination by the free-flying Griffin DR, Suthers RA (1970) Sensitivity of echolocation
big brown bat, Eptesicus fuscus. J Comp Physiol A in cave swiftlets. Biol Bull 139:365–371
197(5):491–503 Griffin DR, Thompson T (1982) Echolocation by cave
Fay RR (1988) Hearing in vertebrates: a psychophysics swiftlets. Behav Ecol Sociobiol 10:119123
databook. Hill-Fay Associates, Winnetka, IL Griffin DR, Webster FA, Michael CR (1960) The echolo-
Fenton MB (1975) Acuity of echolocation in Collocalia cation of flying insects by bats. Anim Behav 8:141–
hirundinacea (Aves: Apodidae), with comments on the 154
distributions of echolocating swiftlets and molossid Griffin DR, Friend JH, Webster FA (1965) Target discrim-
bats. Biotropica 7:1–7 ination by the echolocation of bats. J Exp Zool 158:
Fenton MB (1995) Natural history and biosonar 155–168
signals. In: Popper AN, Fay RR (eds) Hearing by Grunwald JE, Schornich S, Wiegrebe L (2004) Classifica-
bats, Springer handbook of auditory research, vol tion of natural textures in echolocation. Proc Natl Acad
5. Springer, New York, pp 37–86 Sci USA 101(15):5670–5674. https://doi.org/10.1073/
Fenton MB, Faure PA, Ratcliffe JM (2012) Evolution of pnas.0308029101
high duty cycle echolocation in bats. J Exp Biol Guppy A, Coles RB (1988) Acoustical and neural aspects
215(17):2935–2944 of hearing in the Australian gleaning bats,
Finneran JJ, Branstetter BK, Houser DS, Moore PW, Macroderma gigas and Nyctophilus gouldi. J Comp
Mulsow J, Martin C, Perisho S (2014) High-resolution Physiol A 162(5):653–668. https://doi.org/10.1007/
measurement of a bottlenose dolphin’s (Tursiops Bf01342641
truncatus) biosonar transmission beam pattern in the Hartley DJ (1992a) Stabilization of perceived echo
horizontal plane. J Acoust Soc Am 136(4):2025–2038 amplitudes in echolocating bats. I. Echo detection and
Forsman KA, Malmquist MG (1988) Evidence for echolo- automatic gain control in the big brown bat, Eptesicus
cation in the common shrew, Sorex araneus. J Zool fuscus, and the fishing bat, Noctilio leporinus. J Acoust
Soc Lond 216:655–662 Soc Am 91:1120–1132
Fullard JH, Barclay RMR, Thomas DW (1993) Echoloca- Hartley DJ (1992b) Stabilization of perceived echo
tion in free-flying Atiu swiftlets (Aerodramus sawtelli). amplitudes in echolocating bats. II. The acoustic
Biotropica 25:334–339 behavior of the big brown bat, Eptesicus fuscus,
Galatius A, Olsen MT, Steeman ME, Racicot RA, when tracking moving prey. J Acoust Soc Am 91:
Bradshaw CD, Kyhn L, Miller LA (2019) Raising 1133–1149
your voice: evolution of narrow band high frequency Hartley DJ, Suthers RA (1987) The sound emission pat-
signals in odontocetes. Biol J Linn Soc 126:213–224. tern and the acoustical role of the noseleaf in the
https://doi.org/10.1093/biolinnean/bly194 echolocating bat, Carollia perspicillata. J Acoust Soc
Gardiner JD, Dimitriadis G, Sellers WI, Codd JR (2008) Am 82:1892–1900
The aerodynamics of big ears in the brown long-eared Hartley DJ, Suthers RA (1989) The sound emission pat-
bat Plecotus auritus. Acta Chiropterol 10(2):313–321. tern of the echolocating bat, Eptesicus fuscus. J Acoust
https://doi.org/10.3161/150811008x414881 Soc Am 85:1348–1351
Ghose K, Moss CF (2003) The sonar beam pattern of a He K, Liu Q, Xu D-M, Qi F-Y, Bai J, He S-W, Chen P,
flying bat as it tracks tethered insects. J Acoust Soc Am Zhou X, Cai W-Z, Chen Z-L, Jiang X-L, Shi P (2021)
114(2):1120–1131 Echolocation in soft-furred tree mice. Science 372:1–
Goldman LJ, Henson OW (1977) Prey recognition and 10. https://doi.org/10.1126/science.aay1513
selection by the constant frequency bat, Pteronotus p. Henson OW Jr (1965) The activity and function of the
parnellii. Behav Ecol Sociobiol 2:411–419 middle-ear muscles in echo-locating bats. J Physiol
Gonzalez-Terrazas TP, Martel C, Milet-Pinheiro P, 180(4):871–887. https://doi.org/10.1113/jphysiol.
Ayasse M, Kalko EKV, Tschapka M (2016) Finding 1965.sp007737
flowers in the dark: nectar-feeding bats integrate olfac- Herzing DL (2000) Acoustics and social behavior of wild
tion and echolocation while foraging for nectar. R Soc dolphins: implications for a sound society. In: Au
Open Sci 3(8). https://doi.org/10.1098/rsos.160199 WWL, Popper AN, Fay RR (eds) Hearing by whales
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 453

and dolphins, Hearing by whales and dolphins, vol 12. Kalko EKV, Schnitzler H-U (1993) Plasticity in echoloca-
Springer, New York, pp 225–272 tion signals of European pipistrelle bats in search
Hiryu S, Mora EC, Riquimaroux H (2016) Behavioral and flight: implications for habitat use and prey detection.
physiological bases for doppler shift compensation by Behav Ecol Sociobiol 33:415–428
echolocating bats. In: Fenton M, Grinnell A, Popper A, Kastelein RA, Hoek L, de Jong CAF, Wensveen PJ (2010)
Fay R (eds) Bat Bioacoustics, Springer handbook of The effect of signal duration on the underwater detec-
auditory research, vol 54. Springer, New York, NY. tion thresholds of a harbor porpoise (Phocoena
https://doi.org/10.1007/978-1-4939-3527-7_9 phocoena) for single frequency-modulated tonal signals
Holderied MW, Korine C, Fenton MB, Parsons S, between 0.25 and 160 kHz. J Acoust Soc Am 128(5):
Robson S, Jones G (2005) Echolocation call intensity 3211–3222. https://doi.org/10.1121/1.3493435_
in the aerial hawking bat Eptesicus bottae (Vesperti- Kloepper LN, Nachtigall PE, Donahue MJ, Breese M
lionidae) studied using stereo videogrammetry. J Exp (2012) Active echolocation beam focusing in the
Biol 208:1321–1327 false killer whale, Pseudorca crassidens. J Exp Biol
Houston RD, Boonman AM, Jones G (2004) Do echolo- 215:1306–1312. https://doi.org/10.1242/jeb.066605
cation signal parameters restrict bats’ choice Koay G, Heffner RS, Heffner HE (1998) Hearing in a
of prey? In: Thomas JA, Moss CF, Vater M (eds) Megachiropteran fruit bat (Rousettus aegyptiacus). J
Echolocation in bats and dolphins. Chicago University Comp Psychol 112(4):371–382
Press, Chicago, pp 339–345 Koblitz JC, Wahlberg M, Stilz P, Madsen PT,
Huggenberger S, Rauschmann MA, Vogl TJ, Oelschläger Beedholm K, Schnitzler HU (2012) Asymmetry and
HHA (2009) Functional morphology of the nasal com- dynamics of a narrow sonar beam in an echolocating
plex in the harbor porpoise (Phocoena phocoena L.). harbor porpoise. J Acoust Soc Am 131(3):2315–2324.
Anat Rec 292:902–920 https://doi.org/10.1121/1.3683254
Hulgard K, Moss CF, Jakobsen L, Surlykke A (2016) Big Konishi M, Knudsen EI (1979) The oilbird: hearing and
brown bats (Eptesicus fuscus) emit intense search calls echolocation. Science 204:425–427
and fly in stereotyped flight paths as they forage in the Kössl M, Vater M (1995) Cochlear structure and function
wild. J Exp Biol 219(3):334–340. https://doi.org/10. in bats. In: Popper AN, Fay RR (eds) Hearing by bats,
1242/jeb.128983 vol 5. Springer, New York, pp 191–234
Iwaniuk AN, Clayton DH, Wylie DR (2006) Echoloca- Kounitsky P, Rydell J, Amichai E, Boonman A, Eitan O,
tion, vocal learning, auditory localization and the rela- Weiss AJ, Yovel Y (2015) Bats adjust their mouth
tive size of the avian auditory midbrain nucleus (MLd). gape to zoom their biosonar field of view. Proc Natl
Behav Brain Res 167:307–317 Acad Sci USA 112(21):6724–6729. https://doi.org/10.
Jakobsen L, Surlykke A (2010) Vespertilionid bats control 1073/pnas.1422843112
the width of their biosonar sound beam dynamically Kubke MF, Massoglia DP, Carr CE (2004) Bigger brains
during prey pursuit. PNAS 107(31):13930–13935 or bigger nuclei? Regulating the size of auditory
Jakobsen L, Brinklov S, Surlykke A (2013a) Intensity and structures in birds. Brain Behav Evol 63:169–180
directionality of bat echolocation signals. Front Physiol Kuroda M, Sasaki M, Yamada K, Miki N, Matsuishi T
4:89. https://doi.org/10.3389/fphys.2013.00089 (2015) Tissue physical property of the harbor porpoise
Jakobsen L, Ratcliffe JM, Surlykke A (2013b) Convergent Phocoena phocoena for investigation of the sound
acoustic field of view in echolocating bats. Nature emission process. J Acoust Soc Am 138(3):1451–1456
493(7430):93–96. https://doi.org/10.1038/ Kürten L, Schmidt U (1982) Thermoperception in the
nature11664 common vampire bat (Desmodus rotundus). J Comp
Jakobsen L, Olsen MN, Surlykke A (2015) Dynamics of Physiol 146:223–228. https://doi.org/10.1007/
the echolocation beam during prey pursuit in aerial BF00610241
hawking bats. Proc Natl Acad Sci USA 112(26): Ladegaard M, Jensen FH, Beedholm K, da Silva VMF,
8118–8123. https://doi.org/10.1073/pnas.1419943112 Madsen PT (2017) Amazon river dolphins (Inia
Jen PHS, Suga N (1976) Coordinated activities of middle- geoffrensis) modify biosonar output level and directiv-
ear and laryngeal muscles in echolocating bats. Science ity during prey interception in the wild. J Exp Biol 220:
191:950–952 2654–2665. https://doi.org/10.1242/jeb.159913
Johansson LC, Hakansson J, Jakobsen L, Hedenstrom A Lammers MO, Au WWL, Aubauer R, Nachtigall PE
(2016) Ear-body lift and a novel thrust generating (2004) A comparative analysis of the pulsed emissions
mechanism revealed by the complex wake of brown of free-ranging Hawaiian spinner dolphins (Stenella
long-eared bats (Plecotus auritus). Sci Rep 6:24886. longirostris). In: Thomas JA, Moss CF, Vater M
https://doi.org/10.1038/srep24886 (eds) Echolocation in bats and dolphins. University of
Johnson M, Madsen PT, Zimmer WMX, Aguilar de Chicago Press, Chicago, pp 414–419
Soto N, Tyack PL (2006) Foraging Blainville’s beaked Lawrence BD, Simmons JA (1982a) Echolocation in bats:
whales (Mesoplodon densirostris) produce distinct the external ear and perception of the vertical position
click types matched to different phases of echoloca- of targets. Science 218:481–483
tion. J Exp Biol 209:5038–5050 Lawrence BD, Simmons JA (1982b) Measurements of
Kalko EKV, Schnitzler H-U (1989) The echolocation and atmospheric attenuation at ultrasonic frequencies and
hunting behavior of Daubenton’s bat, Myotis the significance for echolocation by bats. J Acoust Soc
daubentoni. Behav Ecol Sociobiol 24:225–238 Am 71(3):585–590. https://doi.org/10.1121/1.387529
454 S. M. M. Brinkløv et al.

13 The Effects of Noise on Animals

Christine Erbe, Micheal L. Dent, William L. Gannon, Robert D. McCauley, Heinrich Römer, Brandon L. Southall, Amanda L. Stansbury, Angela S. Stoeger, and Jeanette A. Thomas

Jeanette A. Thomas (deceased) contributed to this chapter while at the Department of Biological Sciences, Western Illinois University-Quad Cities, Moline, IL, USA

C. Erbe (*) · R. D. McCauley
Centre for Marine Science & Technology, Curtin University, Perth, WA, Australia
e-mail: c.erbe@curtin.edu.au; r.mccauley@cmst.curtin.edu.au

M. L. Dent
Department of Psychology, University at Buffalo, SUNY, Buffalo, NY, USA
e-mail: mdent@buffalo.edu

W. L. Gannon
Department of Biology and Graduate Studies, Museum of Southwestern Biology, University of New Mexico, Albuquerque, NM, USA
e-mail: wgannon@unm.edu

H. Römer
Department of Biology, Graz University, Graz, Austria
e-mail: heinrich.roemer@uni-graz.at

B. L. Southall
Southall Environmental Associates, Inc., Aptos, CA, USA
e-mail: brandon.southall@sea-inc.net

A. L. Stansbury
El Paso Zoo, El Paso, TX, USA

A. S. Stoeger
Mammal Communication Laboratory, University of Vienna, Vienna, Austria
e-mail: angela.stoeger-horwath@univie.ac.at

13.1 Introduction

Noise is ubiquitous in all animal habitats, often at substantial levels (Brumm and Slabbekoorn 2005). Habitats typically contain a myriad of geophysical, biological, and anthropogenic sounds, which constitute the local soundscape (see Chap. 7). Some of these sounds can interfere with the life functions of animals and hence are often referred to as "noise" (American National Standards Institute 2013).

Communication plays a critical role in animals' life functions as it is the foundation for social relationships among animals. However, acoustic communication often is constrained by background noise, which reduces the signal-to-noise ratio (SNR) and thus the signal detection and discrimination success of receivers. In terrestrial habitats, natural, abiotic noise is caused by wind, precipitation, thunder, running water, and seismicity. Birds, frogs, insects, and mammals create biotic noise. In aquatic environments, natural, abiotic noise is caused by wind, precipitation, breaking waves, polar ice break-up, and natural seismic activity. Biotic noise sources include shrimps, fishes, and marine mammals. Such natural noise has been shown to interfere with sound usage by animals. For example, wind noise might interfere with marine mammal communication, and as a counteraction, humpback whales (Megaptera novaeangliae) increase the sound pressure level of their sounds as a function of increasing wind noise level (Dunlop et al. 2014). Also, animals of the same or different species can interfere with sound usage. Snapping shrimp are known to mask toothed whale biosonar (Au et al. 1974, 1985) and harp seals (Pagophilus groenlandicus) have been shown to
increase their call repetition to be heard above the chorus of their conspecifics (Serrano and Terhune 2001). Similarly, king penguins (Aptenodytes patagonicus; Aubin and Jouventin 1998), zebra finches (Taeniopygia guttata; Narayan et al. 2007), and big brown bats (Eptesicus fuscus; Warnecke et al. 2015) communicate in a cacophony of conspecific calls. Animals have evolved sound production and reception capabilities in natural biotic and abiotic background noise. However, anthropogenic noise is fairly recent on evolutionary time scales. Researchers have tried to assess whether existing adaptations are sufficient for animals to deal with anthropogenic noise.

Anthropogenic noise in terrestrial environments originates from road traffic, trains, aircraft, industrial sites, energy plants, construction machinery, etc. Anthropogenic noise in aquatic environments originates from recreational boating, commercial shipping, commercial fishing, offshore hydrocarbon and mineral exploration, hydrocarbon production, mineral mining, marine construction, offshore renewable energy production, military activities, etc. Such anthropogenic sounds, in air or water, have distinct "sound signatures," and their contributions to the marine and terrestrial soundscapes are discussed in Chap. 7.

The effects of anthropogenic noise have been studied extensively in humans (Kryter 1994); however, less is known about how human-generated noise affects other animals. Four edited books (Brumm 2013; Popper and Hawkins 2012, 2016; Slabbekoorn et al. 2018a) and some journal special issues (Erbe et al. 2016b, 2019c; Le Prell et al. 2019; Thomsen et al. 2020) compile many examples outlining the effects of noise. The effects of anthropogenic noise on animals are a growing concern, having resulted in an exponential increase in the number of research publications on this topic (Williams et al. 2015).

What are the effects of anthropogenic noise? They can vary from mere auditory sensation, mild and temporary annoyance, brief behavioral changes, temporary avoidance of an area, and masking to long-term changes in the usage of important feeding or breeding areas, prolonged stress, hearing loss, barotrauma (in aquatic species), injury, and ultimately death (Kight and Swaddle 2011). In addition to such direct effects of noise, there may be indirect effects (e.g., when a prey species is impacted, leading to reduced prey availability). The effects of noise do not always have to be negative from the animals' point of view. In some cases, animals actually use anthropogenic sounds to their advantage. For example, the sound of a dumpster lid closing in a campground might indicate a food source to some birds and mammals. Underwater sounds from ships can increase the settlement, growth rate, and absolute growth of biofouling organisms such as bryozoans, oysters, calcareous tubeworms, and barnacles (Stanley et al. 2014). Sounds from fishing vessels may attract birds, seals, and dolphins, which then feed on the bait or catch (Söffker et al. 2015). This attraction to a food source elicited by anthropogenic noise is called the "dinner bell effect."

In terms of the potential negative effects of anthropogenic noise on animals, Fig. 13.1 shows a generalized view of increasingly severe effects closer to the noise source. Depending on where the noise source and the receiving animals are located in space, received noise will differ in spectral and temporal characteristics (see Chaps. 5 and 6 on sound propagation in air and water, respectively). While there are widely varying sound propagation conditions depending on the specific environment in which a sound is produced and received, received levels generally attenuate or decrease as sound propagates from its source. Given that no habitat is acoustically homogeneous or isotropic, received levels vary with azimuth (direction) and inclination (height or depth), leading to different impact ranges in all directions.
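
The decrease of received level with range can be illustrated, very roughly, with a geometric spreading law. In the following Python sketch, the source level, the spreading coefficient, and the ranges are arbitrary assumed values for illustration only, not measurements from any study cited in this chapter.

# Illustrative sketch only: geometric spreading RL = SL - N log10(r), ignoring
# absorption, refraction, and boundary effects. All numbers are assumed.
import math

def received_level_db(source_level_db, range_m, spreading_coefficient=20.0):
    # spreading_coefficient = 20 approximates spherical spreading, 10 cylindrical
    return source_level_db - spreading_coefficient * math.log10(range_m)

for r in (10, 100, 1000, 10000):  # ranges in meters (assumed)
    print(r, "m:", round(received_level_db(220.0, r), 1), "dB")  # 220 dB is an arbitrary example source level

In practice, measured or modeled propagation (see Chaps. 5 and 6) replaces such a simplified law, and the resulting impact ranges differ with direction, as noted above.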

The absolute range and order of noise impact severity can differ based on features of the propagation environment, exposure context, and species involved (Ellison et al. 2012). In general, at the longest ranges, a noise might barely be audible to an animal and may be less likely to have any negative effect. Audibility of a noise depends on its amplitude and spectrum, propagation
conditions from the source to the receiver, ambient noise conditions, and hearing abilities of the animal.

Fig. 13.1 Sketch of generalized ranges from a noise source, at which different types of impact may occur

Stress is a physiological response, which might occur at long and short ranges and at low and high noise levels. Stress can be a direct response to noise (e.g., if a novel noise is suddenly heard) and an indirect response to noise (e.g., if masking causes stress). Stress can affect numerous life functions (including immune response, reproductive success, predator avoidance, etc.; Tarlow and Blumstein 2007).

Acoustic masking might occur over long ranges when a distant noise masks a faint signal. Masking is the process (and amount) by which the audibility threshold for a sound is raised by the presence of another sound (i.e., noise; American National Standards Institute 2013).¹ The higher the noise level is, the greater the masking effect. Masking can interfere with signals important to animals, such as their social communication calls, mother-offspring recognition sounds, echolocation signals, environmental sounds, or sounds by predators and prey (Dooling and Leek 2018). The animal's auditory system splits incoming sound into a series of overlapping bandpass filters, thus optimizing SNR in the bands occupied by the signal and enabling parallel processing (Moore 2013). The critical ratio is the most commonly measured parameter related to auditory masking. It is defined as the mean-square sound pressure of a narrowband signal (e.g., a tone) divided by the mean-square sound pressure spectral density of the masking noise at a level where the signal is just detectable (see Chap. 10 on audiometry; International Organization for Standardization 2017). There are two categories of masking. Energetic masking occurs when the masking sound overlaps with the signal in both frequency and time, such that the signal is inaudible. Informational masking occurs later in the auditory process; the signal is still audible, but it cannot be disentangled from the masker (Moore 2013).

¹ ANSI/ASA S1.1 & S3.20 Standard Acoustical & Bioacoustical Terminology Database; https://asastandards.org/asa-standard-term-database/
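
In decibel form, the critical ratio defined above becomes a simple difference between the level of the just-detectable tone and the spectral density level of the masking noise. The numbers in this minimal sketch are invented for illustration:

# Critical ratio (in dB) = level of the just-detectable tone minus the spectral
# density level of the masking noise. Example values are assumptions.
tone_level_at_threshold_db = 90.0   # dB re 1 uPa (assumed example)
noise_spectral_density_db = 70.0    # dB re 1 uPa^2/Hz (assumed example)

critical_ratio_db = tone_level_at_threshold_db - noise_spectral_density_db
print("Critical ratio:", critical_ratio_db, "dB")  # 20 dB in this invented example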

Somewhat closer to the source, changes in behavior of varying severity might be seen. An animal might change its orientation, cease prior behavior (e.g., feeding), move away from the source, or alter its vocal behavior, which may have implications for social functions.

Animals must be closer to sound sources to receive sound levels sufficiently high for noise-induced hearing loss (NIHL). NIHL results from overstimulation of the sensory cells in the inner ear, leading to metabolic exhaustion of the hair cells, damage to the organ of Corti, and in extreme cases, degeneration of retrograde
ganglion cells and axons. NIHL includes both temporary and permanent loss of hearing, termed temporary threshold shift (TTS) and permanent threshold shift (PTS), respectively. Both TTS and PTS depend on the spectral and temporal (duration of exposure and duty cycle) characteristics of the noise received (Moore 2013; Saunders and Dooling 2018). TTS, by definition, is recoverable, but the time to recover depends on the amplitude, frequency, rise time, and duration of noise exposure. While experiencing TTS, animals could have a decreased ability to communicate, interact with offspring, assess their environment, detect predators or prey, etc. While TTS implies a full recovery without physical injury, TTS might still involve submicroscopic physical damage. Kujawa and Liberman (2009) showed that for high levels of TTS, sensory hair cells appear unharmed, yet afferent nerve terminals might be injured leading to cochlear nerve degeneration. Death of sensory hair cells in the ear, damage to the auditory nerve, or injury to tissues in the auditory pathway may lead to PTS (Liberman 2016).

At high levels of noise exposure, animals may incur injury (i.e., acoustic trauma) to tissues and organs, such as damage to ear bones, lungs, kidney, or gonads (Popper et al. 2014). In aquatic species, fast changes in pressure can cause blood gases to exit solution and gas-filled tissues or organs (e.g., swim bladders in fish) to expand and contract rapidly, which may damage surrounding tissues and organs (e.g., rupture the swim bladder). Rapid changes in sound pressure are more likely to cause damage than gradual changes (Popper et al. 2014).

Whether the effect of noise is auditory, behavioral, or physiological, individual animals of the same species or population respond at different ranges and in different ways. Age, health, sex, individual hearing abilities, prior experience (habituation versus sensitization), context, current behavioral state, and environmental conditions may all affect the responses of individuals. For example, bowhead whale (Balaena mysticetus) and gray whale (Eschrichtius robustus) responses to seismic surveys ranged from none-observed to moderate (i.e., changing vocalization rates and swimming behavior; Blackwell et al. 2015; Malme et al. 1983; Miller et al. 2005). Therefore, some studies have developed a dose-response curve (Fig. 13.2) relating likelihood of response (or percentage of a population that might respond) to the received level of the specific source of noise under consideration (e.g., Hawkins et al. 2014; Miller et al. 2014; Williams et al. 2014).

Fig. 13.2 Example of a historical dose-response curve based on received exposure level as a metric of sound dose used to assess the likelihood of bioacoustic impact from mid-frequency sonar (Department of the Navy 2008). Half of a population was modeled to respond at 165 dB re 1 μPa, with fewer animals responding at lower levels, and more animals responding at higher levels
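
Dose-response curves of the kind shown in Fig. 13.2 are often represented by a sigmoid function of received level. The sketch below is a generic logistic example; the 50% response point is set, purely for illustration, at the 165 dB re 1 μPa value mentioned in the figure caption, and the slope parameter is an arbitrary assumption rather than the parameterization of the Navy model.

# Generic logistic dose-response curve; parameters are illustrative assumptions.
import math

def probability_of_response(received_level_db, level_at_50_percent=165.0, slope_db=5.0):
    return 1.0 / (1.0 + math.exp(-(received_level_db - level_at_50_percent) / slope_db))

for level in (150, 160, 165, 170, 180):  # received levels in dB re 1 uPa
    print(level, "dB:", round(probability_of_response(level), 2))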

The effects of noise discussed so far, and the concepts of impact ranges (Fig. 13.1) and dose-response curves (Fig. 13.2) relate to acute noise exposures (e.g., to a single discharge of a seismic airgun array or a single supersonic overflight). The scientific difficulty is to link short-term, individual impacts to long-term, population-level impacts, considering that animals might travel and be exposed to aggregate noise from multiple sources distributed through space and time. While some studies have documented long-term reductions in species abundance and diversity (e.g., near highways or in industrialized areas; Francis et al. 2009; Goodwin and Shriver 2011), in the majority of cases (i.e., species and noise sources), it remains unknown how the impacts on individuals accumulate over time (i.e., over multiple exposures) and over a population.

Fig. 13.3 Population Consequences of Acoustic Disturbance (PCAD) model (National Research Council 2005), which links noise exposure from individual to population-level consequences via a series of stages, connected by transfer functions

Extrapolating temporary effects on individuals to population-level effects is problematic. The Population Consequences of Acoustic Disturbance (PCAD) model (Fig. 13.3) was originally developed for marine mammals and provides a framework for the link between noise exposure and population impacts (National Research Council 2005). The link is broken down into five stages and four transfer functions.
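
Purely as an illustration of this structure (a chain of stages linked by transfer functions), and not as the actual PCAD/PCoD formulation, the model can be sketched as a composition of placeholder functions. The stage names are indicative and the coefficients are arbitrary assumptions:

# Illustrative placeholder only: the real transfer functions are species-specific
# and largely unparameterized. Five stages, four transfer functions.
def behavior_change(noise_exposure):               # transfer function 1 (placeholder)
    return 0.1 * noise_exposure

def life_function_effect(behavior_change_index):   # transfer function 2 (placeholder)
    return 0.5 * behavior_change_index

def vital_rate_change(life_function_index):        # transfer function 3 (placeholder)
    return 0.2 * life_function_index

def population_effect(vital_rate_index):           # transfer function 4 (placeholder)
    return 2.0 * vital_rate_index

exposure_index = 1.0  # arbitrary exposure metric
print(population_effect(vital_rate_change(life_function_effect(behavior_change(exposure_index)))))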

Data to fully parameterize this model are not available for any species. However, progress has been made for a few selected species, with the elephant seal (Mirounga angustirostris) being an excellent model in the marine world, having been studied extensively over long periods (Costa et al. 2016). This conceptual model has recently been more fully developed mathematically and broadened to consider potential changes in vital rates to estimate population-level effects of any form of disturbance (New et al. 2014); the resulting framework is now more broadly termed the Population Consequences of Disturbance (PCoD) model. Furthermore, novel conceptual paradigms have been proposed to consider population consequences of noise exposure from multiple stressors, complex interactions of which may be additive, synergistic, or antagonistic (Ocean Studies Board 2016). These models have implications for other taxa and their conservation management.

One important aspect of noise impact management is mitigation. To reduce the risk of impacts from acute noise exposure (e.g., from a marine seismic survey or detonation), the surrounding area is commonly observed (e.g., visually or acoustically), and operations are changed (e.g., temporarily reducing power or shutting down) if animals are detected within the so-called safety zones (Fig. 13.4; Weir and Dolman 2007). Sometimes, alternative (e.g., quieter) technology is available. Also, noise barriers may be employed (e.g., temporary, sound-absorbing walls in terrestrial environments, or bubble curtains in marine environments; Bohne et al. 2019). Operations may be ramped up in an attempt to warn animals (e.g., Wensveen et al. 2017). Short-term operations may be timed to avoid biologically critical seasons or habitats.

In the case of chronic noise, such as from shipping, voluntary area-wide speed reductions reduced noise levels (Joy et al. 2019). Similarly, voluntarily turning off engines in drive-through national parks is encouraged (Fig. 13.5). For long-term operations or installations (such as highways), permanent sound barriers are commonly erected in the terrestrial environment. But these mitigation measures can reduce habitat connectivity. Instead, overpasses and long underground roadways may shelter large areas from noise exposure while concurrently increasing habitat connectivity. Understanding the role sound plays in habitat fragmentation will increase the ability to make barriers, underpasses, and overpasses more effective at reducing noise exposure, while also increasing landscape connectivity.

Fig. 13.4 Bird's-eye sketch of different mitigation methods employed in the marine environment to reduce the risk of noise impacts (Erbe et al. 2018). The offshore, noise-producing platform is indicated by the black star. It is surrounded by safety zones, which are observed in real time. MMO: marine mammal observer, who might be on shore, or on the operations platform, or on an additional vessel. PAM: passive acoustic monitoring using hydrophones, possibly as a towed array. Operations temporarily reduce power or shut down if animals are detected within these zones and resume once animals have departed. In addition, modifications might be possible to the source or its operational parameters. Noise reduction gear (e.g., a bubble curtain around pile driving in shallow water) is indicated by gray dots. MPA: marine protected area, which might only be accessible during low-risk seasons

Fig. 13.5 Photograph from Addo Elephant National Park, South Africa, encouraging visitors to switch off their car engines to limit noise effects on wildlife (courtesy of Cathy Dreyer, Conservation Manager, Addo Elephant National Park)

Overall, the effects of anthropogenic noise are a challenge to researchers, noise producers, and policy makers. Often, stakeholders have data from only a few studies on a few species from which to develop criteria for noise exposure. This chapter gives examples of the effects of noise on a variety of animal taxa.

13.2 Behavioral Options in a Noisy Environment

When exposed to anthropogenic noise, animals have choices of responses. Behavioral changes are perhaps the most frequently observed and reported effects of noise. In many cases, such changes might be an "affordable" adaptation, for example when an animal temporarily moves away from the noise. The response (or lack thereof) is likely based on a cost-benefit ratio: the cost of changing behavior to improve fitness versus the magnitude of the benefit gained by changing. Although a variety of behavioral changes in response to noise have been studied in several species, their implications for biological fitness are difficult to determine.

13.2.1 Habituation

Animals sometimes habituate to anthropogenic noise. Habituation is a form of learning in which an animal reduces or ceases its response to a stimulus after repeated presentations; in other words, the animal learns to stop responding to anthropogenic noise when it learns there are no significant consequences. Habituation can be difficult to determine in the wild. A lack of observed behavioral response does not necessarily mean that there was no response or that the animal habituated; the response might have been too small to be observed, or it was of physiological type, or the animal's hearing sensitivity might have been reduced by prior exposure.

There are many accounts of animals living without apparent detrimental impacts in areas of high ambient noise, for example small mammals that live and breed along runways, railroad tracks, or highways. The densities of white-footed mice (Peromyscus leucopus) and eastern chipmunks (Tamias striatus) did not decrease near roads. While both species were significantly less likely to cross a road than move the same distance away from roads, traffic volume (and noise level) had no effect (McGregor et al. 2008). Wale et al. (2013b) investigated the physiological responses of shore crabs (Carcinus maenas) to single and multiple ship-noise playbacks. Crabs consumed more oxygen, indicative of a higher metabolic rate and potential stress, when exposed to ship noise compared to ambient noise. However, repeated exposures to ship noise showed no change. The authors proposed that crabs exhibited the maximum response on the first exposure to ship noise, then habituated or became tolerant of the noise.

Even when no behavioral response is detectable, animals might accept noise exposure at levels that could have long-term hearing impacts, especially if there are benefits of sticking around. For example, each winter endangered manatees (Trichechus manatus) congregate around power plants in Florida likely in order to stay in the warm water effluence produced by the plant. In the process, they are potentially exposed to high levels of underwater noise for long periods. Seemingly, the benefit of the warm water outweighs the cost of noise exposure (JA Thomas, pers. obs.). Similarly, seals depredating at aquaculture sites might accept hearing-loss-inducing noise levels from acoustic harassment devices or "seal scarers" (Coram et al. 2014).

13.2.2 Change of Behavior

Temporary behavioral responses have been reported for gray whales that took a somewhat wider route around the noise from offshore oil drilling platforms, while continuing their normal round-trip migration from Alaska to Mexico (Malme et al. 1984). Such a subtle response likely won't have any long-term impact on fitness. Harbour porpoises (Phocoena phocoena), on the other hand, have been shown to forage almost
continuously around the clock and hence even moderate occurrences of anthropogenic disturbance might have significant fitness consequences (Wisniewska et al. 2016).

A permanent displacement from habitat has been suggested in egrets (Ardea alba) and great blue herons (Ardea herodias), judged by the altered distribution of nests along the Mississippi River, potentially in response to increased vessel traffic, such as tugboats and barges (JA Thomas, pers. obs.). A long-term displacement lasting six years occurred in killer whales (Orcinus orca) in response to acoustic harassment devices installed in parts of their habitat. Whales returned when the devices were removed (Morton and Symonds 2002).

Noise affects not only animal movement but also other behaviors. Chaffinches (Fringilla coelebs) reduced their food pecking during increased background noise, which increased their vigilance; however, the increased alertness and hence reduction in predation risk might have reduced fitness via the reduction in food intake (Quinn et al. 2006). Similarly, California ground squirrels (Otospermophilus beecheyi) showed increased vigilance near wind turbines, potentially at the cost of other behaviors (Rabin et al. 2006). In the marine environment, anthropogenic noise interfered with the predator-prey relationship. Motorboat noise elevated metabolic rate in prey fish, which then responded less often and less rapidly to predation attempts. Predator fish consumed more than twice as much prey during boat noise exposure (Simpson et al. 2016).

Reinforcing an acoustic communication message with a visual display can enhance communication in a noisy environment. For example, male foot-flagging frogs (Dendropsophus parviceps) live in neotropical areas with fast-flowing streams, high levels of rain, and numerous other species of calling frogs. Foot-flagging frogs evolved the visual signal of stretching out one or two hind legs, vibrating their feet, or stretching out their toes while calling, assisting with their communication (Amézquita and Hödl 2004).

13.2.3 Change of Acoustic Signaling

Vocal behaviors can also change in response to noise. To reduce interference from urban daytime noise, chaffinches sang earlier in the day and European robins (Erithacus rubecula) changed vocal activities to nighttime (Bergen and Abs 1997; Fuller et al. 2007). The cost of this change in vocal behavior is unknown. Animals might also change the characteristics of their sounds to avoid masking. Changes in vocal effort such as increases in amplitude, repetition rate, and duration, or frequency shifts are collectively known as the Lombard effect, which has been demonstrated in several taxa, including frogs (Halfwerk et al. 2016), birds (Slabbekoorn and Peet 2003), and cetaceans (Scheifele et al. 2005). The Lombard effect has also been observed during odontocete echolocation: a captive beluga whale (Delphinapterus leucas) increased the amplitude and frequency of its echolocation signal when moved from a quiet habitat in San Diego to an area with high snapping shrimp noise in Hawaii (Au et al. 1985).

Some animal taxa might be limited in their ability to voluntarily and temporarily change the spectrographic features of their sounds, an ability often called behavioral plasticity. Insects, for example, generate sound by stridulation of body parts, the resonance of which cannot be actively controlled. Consequently, a Lombard effect failed to be observed in Oecanthus tree crickets (Costello and Symes 2014); however, grasshoppers (Chorthippus biguttulus) from noisy habitats or those exposed to noise as nymphs produced higher-frequency sounds with higher duty cycles (i.e., increased sound-to-pause ratio), indicating developmental plasticity (Lampe et al. 2012, 2014).

A cessation of sound emission in the presence of anthropogenic noise can also occur. Thomas et al. (2016) studied the effects of construction noise on yellow-cheeked gibbons (Nomascus gabriellae) at Niabi Zoo. Before construction, a bonded pair and their four-year-old offspring
were quite soniferous. The pair commonly duetted in the early morning and displayed behaviors typical of a bonded pair. Once construction near their exhibit commenced, they gradually vocalized less often, and by the end of the four-month construction period, the pair bond had dissolved and the young became ill (possibly due to decreased quality of care with the loss of the parent pair bond). For about a year, the pair remained distant from each other and did not vocalize. One of the authors (JA Thomas) played back recordings of the pair's own duet and those of wild gibbons. Already during the first playback, the pair slowly started to vocalize and move to the top of the exhibit where they normally performed their duet. They vocalized in response to their own duet as opposed to playbacks of other gibbon duets. The pair continued duetting for several more years of observation.

13.3 Physiological Effects

In addition to eliciting changes in fine- or gross-motor behavior and acoustic behavior, sound can also cause physiological impacts, like stress, hearing loss, or injury to tissues and organs. An animal with impaired hearing might exhibit different responses to sound and different acoustic behavior, compared to an animal with normal hearing.

A stress response may occur when noise is loud, novel, or unexpected (Wale et al. 2013a, b). Studies often concentrate on the effects of noise-induced stress on reproduction. However, stress also can result in: (1) a reduction or cessation of normal movement, with a reduced likelihood of escaping a predator; (2) reduced appetite, feeding, or food acquisition; and (3) excessive anti-predation behaviors. Attention is required to capture prey or avoid detection by a predator. Many animals use auditory cues to detect the presence of predators or prey, and any noise-induced distraction could limit this detection (Siemers and Schaub 2011). Chan et al. (2010) termed this the "distracted prey hypothesis".

The consequences of elevated stress levels can be far-reaching. Tarlow and Blumstein (2007) reviewed the effects of increased stress in birds resulting from human disturbances. The review documented changes in hormone levels, changes in heart rate, immunosuppression, changes in flight-initiation distance, disturbed breeding success, altered mate choice, and fluctuating anatomical asymmetry, all as a result of stress. While there have not been many long-term studies of noise-induced, chronic stress in animals, there is plenty of evidence from humans documenting, for example, hypertension and cardiovascular disease (Bolm-Audorff et al. 2020; Hahad et al. 2019; World Health Organization 2011).

Noise can further affect other non-acoustic sensing and information use (termed cross-modal impacts). For example, road noise impacted the ability of mongooses (Helogale parvula) to smell predator feces, leaving these mammals more susceptible to predation and loss of group cohesion (Morris-Drake et al. 2016). The effects of noise are complex and they differ by species. The following sections describe observed responses to sound by different taxa.

13.4 Noise Effects on Marine Invertebrates

Marine invertebrates comprise a great diversity of fauna with a corresponding diversity of sensory systems and modes of detecting sound or vibration. Only a few publications exist on the impacts of underwater sound on marine invertebrates.

13.4.1 Marine Invertebrate Hearing

Invertebrate species exhibit a diversity of sensory systems for detecting sound and vibration. Many crustaceans and molluscs have acoustic sensory systems that are an analogue to the fish otolith hearing system as they contain statocysts. These are small organs that house a dense mass (i.e., a statolith), which moves in response to sound and thus drives sensory hair cells, which create the
nervous response to the appropriate stimuli. Statocysts are involved in balance and motion sensing (e.g., in squids and cuttlefish; Arkhipkin and Bizikov 2000). Invertebrates can sense the particle motion of an incoming sound wave with the statocyst system, as reported, for example, in common prawn (Palaemon serratus; Lovell et al. 2005), octopus (Octopus ocellatus; Kaifu et al. 2008), and longfin squid (Loligo pealeii; Mooney et al. 2010).

Benthic molluscs, which are site-attached and fixed to the substrate, possess statocysts. These animals may be responsive to water-borne sound, to substrate-borne sound, or to sound waves traveling along the seabed-water interface. Some high-energy sound sources (e.g., impulsive seismic survey signals) can directly excite the ground (Day et al. 2016a). A benthic animal might derive information on nearby surf conditions or on an approaching predator grubbing along the seafloor from seabed-transmitted sound. Thus, benthic invertebrates, including molluscs and crustaceans, may be adapted to sense substrate-borne sound, as well as respond to water-borne sound.

Other invertebrates do not possess statocyst organs. Many invertebrates may be comprised primarily of soft tissue with no organs containing internal masses capable of exciting hair cells. Small animals of a single or few cells might merely vibrate in phase with the sound wave. Other vibratory sensory systems documented in invertebrates include single sensory hairs or antennal organs, such as in the copepod Lepeophtheirus salmonis, which responded to low-frequency vibrations or infrasound (<10 Hz; Heuch and Karlsen 1997).

Invertebrate larvae undergo multiple developmental stages of which the later stages, just before settlement, have the most developed sensory systems. These pre-settlement larvae are critical for recruitment success and thus of great concern with regard to anthropogenic impacts. Many late-stage larvae are responsive to sound cues for settlement; for example, those of corals (Vermeij et al. 2010) and crabs (Stanley et al. 2009). Information on the responses of late-stage larvae to anthropogenic sound is limited.

13.4.2 Effects of Noise by Taxon

Invertebrate statocyst systems can be over-excited by excessive motion of the statolith in response to intense sound, resulting in damage to surrounding hair cells or membranes, as observed in lobsters exposed to seismic airguns (Day et al. 2016a, 2019). There were no signs of repair over the 365-day holding period in these lobsters. While such damage likely results in a degradation of an animal's sensory capability, the degree to which the fitness of wild animals is affected remains unclear and in at least one documented case did not seem to alter population success (Day et al. 2020).

Invertebrates comprised of soft tissue with no dense masses might vibrate with a sound wave. In the case of intense impulse signals, this mechanical motion might cause physiological trauma to cells, although the onset level is not known (Lee-Dadswell 2011). Planktonic invertebrates with no statocyst systems but with sensory appendages and antennal organs have been shown to be susceptible to damage from intense impulse signals (McCauley et al. 2017).

Studies on noise effects on marine invertebrates show a range of impacts from none to severe, and results are difficult to compare due to vastly different experimental regimes. The following sections provide examples of study results on a species level.

13.4.2.1 Squid

Caged squid (Sepioteuthis australis) that were approached by a 20-in³ airgun moved away from the airgun at received sound exposure levels (SEL) of 140–150 dB re 1 μPa²s and spent more time near the sea surface; a strong startle response of the squid inking and jetting away from the airgun was observed when the airgun was discharged at about 30-m range with a received SEL of 163 dB re 1 μPa²s (Fewtrell and McCauley 2012; McCauley et al. 2003a). Two events of giant squid (Architeuthis dux) mass mortality in the Bay of Biscay in 2001 and 2003 were suggested to have been a result of marine seismic surveys, based on tissue damage (Guerra
et al. 2004). Statocyst hair cell damage was found in cephalopods (cuttlefish and squid) subjected to simulated sonar sweeps in a laboratory tank (André et al. 2011; Solé et al. 2013; Fig. 13.6).

Fig. 13.6 Scanning electron microscope images of squid (Illex coindetii) epithelium 48 h after sound exposure. Arrows point to missing cilia and holes. Scale bars: A, B, C = 50 μm, D = 10 μm (Solé et al. 2013). © Solé et al.; https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0078825; licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
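
The sound exposure levels (SEL) quoted in this section are time integrals of squared sound pressure, expressed in dB re 1 μPa²s. The following sketch uses a synthetic pressure pulse, with all numbers assumed rather than taken from the cited studies, to show how a per-pulse SEL is computed and how the SELs of repeated pulses accumulate:

# Illustrative sketch with a synthetic pulse; all values are assumptions.
import math

def sel_db(pressure_pa, sample_rate_hz, p_ref=1e-6):
    # per-pulse SEL: 10*log10 of time-integrated squared pressure, re 1 uPa^2 s
    energy = sum(p * p for p in pressure_pa) / sample_rate_hz  # Pa^2 s
    return 10.0 * math.log10(energy / p_ref**2)

def cumulative_sel_db(per_pulse_sels_db):
    # cumulative SEL over repeated pulses: energies add in the linear domain
    return 10.0 * math.log10(sum(10.0 ** (s / 10.0) for s in per_pulse_sels_db))

fs = 10000  # samples per second (assumed)
pulse = [100.0 * math.exp(-n / (0.02 * fs)) for n in range(int(0.1 * fs))]  # decaying 100-Pa pulse
single = sel_db(pulse, fs)
print(round(single, 1), round(cumulative_sel_db([single] * 10), 1))  # ten identical pulses add 10 dB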

13.4.2.2 Scallops

Scallops (Pecten fumatus) exhibited behavioral changes as a result of exposure to a 150-in³ airgun, which continued during the full 120-day post-exposure monitoring, suggesting damage to the statocyst organ, which controls balance (Day et al. 2016a, 2017). Physiological measures changed for the worse and mortality increased with dose from 1 to 4 passes of the airgun (Day et al. 2016a, 2017). A different study failed to find any significant effects of seismic airguns on scallops (Parry et al. 2002); however, animals had been removed from their seafloor habitat and were suspended in lantern nets in the water column where they would not have experienced substrate-borne and interface (i.e., at the seafloor) sound and vibration. Also, physiological measurements and long-term monitoring were not conducted. Przeslawski et al. (2018) made observations of wild scallops exposed to seismic airguns and found no discernible impacts, but the study had insufficient controls and no physiological measurements, and longer-term post-exposure sampling was not undertaken.

13.4.2.3 Crustaceans

Spiny lobsters (Jasus edwardsii) were exposed to single passes of a 45 or 150-in³ airgun and monitored for 365 days after exposure (Day et al.
2016a). No mortality or significant morphological changes were found in adults or in egg viability (Day et al. 2016b). However, impaired righting ability correlating with damaged statocyst organs (ablated hair cells) and compromised immune function were reported (Day et al. 2019; Fitzgibbon et al. 2017). How these changes would impact wild lobsters is unclear, especially as another study using an apparently healthy lobster population found pre-existing statocyst damage and no further increase in damage after experimental airgun exposure, suggesting the animals had been exposed to intense noise in situ before the experiment but had adapted to the damage (Day et al. 2020). American lobsters (Homarus americanus) exposed to 202–227 dB re 1 μPa pk-pk airgun signals in a large tank exhibited physiological changes but no impact on righting times and no mortality (Payne et al. 2007). Andriguetto-Filho et al. (2005) compared shrimp (Litopenaeus schmitti, Farfantepenaeus subtilis, and Xyphopenaeus kroyeri) catch rates before and after airgun exposure (635 in³) in shallow (2–15 m) water in north-eastern Brazil, finding no difference. The playback of ship noise as opposed to ambient noise negatively affected the foraging and antipredator behavior of shore crabs (Carcinus maenas; Wale et al. 2013a). Furthermore, oxygen consumption was greater during ship noise playback (possibly a stress response), and heavier crabs were more affected (Wale et al. 2013b). Evidently, there might be different responses to anthropogenic noise, depending on the size of an individual organism.

13.4.2.4 Coral

Experiments on the potential impacts of a 2055-in³ 3D seismic survey on corals were undertaken in the 60-m deep lagoon of Scott Reef, north-western Australia. Corals within and outside of the lagoon were exposed to airgun noise over a 59-day period. Some corals received airgun pulses from straight overhead (seismic source at 7-m depth, corals at ~60-m depth), whereas the full seismic survey passed within tens to hundreds of meters horizontal offset, yielding maximum received levels of 226–232 dB re 1 μPa pk-pk and 214–220 dB re 1 μPa rms (McCauley 2014). No evidence of mechanical trauma (i.e., breakage), physiological impairment (i.e., polyp withdrawal or reduction in soft coral rigidity), or long-term change in coral community structure was found (Battershill et al. 2008; Heyward et al. 2018).

13.4.2.5 Larvae/Plankton

Noise and vibration from ships can enhance the settlement and growth of larvae of bryozoans, oysters, calcareous tubeworms, and barnacles, and thus increase biofouling (Stanley et al. 2014). The effects of a 150-in³ airgun were studied by Day et al. (2016b) with berried (with eggs) spiny lobster (Jasus edwardsii) off Tasmania. No mortality of adult lobster or eggs could be attributed to the airgun at cumulative received SEL of up to 199 dB re 1 μPa²s. Some differences in exposed larvae morphology were noted (i.e., slightly larger than controls), but no differences in larval hatching rates or viability were found. These were early-stage larvae with under-developed sensory organs; results might differ for late-stage larvae. Parry et al. (2002) found no impacts on plankton from a 3542-in³ seismic array, but their statistical power to detect impacts was low. Aguilar de Soto et al. (2013) exposed early-stage scallop larvae to airgun signals simulated by an underwater loudspeaker 9 cm away from the larval tank. Morphological deformities were found in all exposed larvae. However, the exact stimulus was unknown owing to the experimental setup and inherent acoustic limitations in small tanks.

McCauley et al. (2017) reported negative impacts, including a 2–3 times greater mortality rate, on various zooplankton out to 1 km from passage of a 150-in³ seismic airgun. In contrast, Fields et al. (2019) exposed constrained adult North Sea copepods (Calanus finmarchicus) to a 520-in³ airgun cluster with measured impacts limited to within 10 m. McCauley et al. stated that the "'copepods dead' category was dominated by the smaller copepod species (Acartia tranteri, Oithona spp.)". These species are ~0.5 mm in length as compared to the ~2.5-mm C. finmarchicus, suggesting a possible size dependency for impacts from airguns. The 1-km
impact range given by McCauley et al. (2017) 100 kHz. Signaling at these frequencies is impor-
was within the repeat range (400–800 m) within tant for mate attraction and localization, rivalry,
which a 3D seismic survey vessel would pass on and spacing of individuals within populations. In
an adjacent seismic line, so that the entire survey addition, many species use their ears to detect and
area could have its plankton field degraded. avoid predators. Some species of flies eavesdrop
Richardson et al. (2017) ran ecological models on calling insects to locate and parasitize them.
to assess the scale of this impact. Assuming an An evolutionary adaptation to ambient noise
area of strong tidal currents and consistent ocean from competing insect choruses is the modifica-
current, a 3-day copepod turnover rate, and a tion of peripheral sensory filters, such as the
three-fold increase in copepod mortality within sharpening of tuning in the cricket (Fig. 13.7).
1.2 km, the copepod plankton field was modeled Such sharp tuning curves reduce the amount of
to recover within three days of completion of a masking noise within the filter (Schmidt et al.
mid-size 3D seismic survey. But, when 2011).
Richardson et al. (2017) reduced the strength of However, the most prevalent form of insect
the currents in the model, the impact persisted for communication involves substrate-borne sound.
three weeks. Many larger zooplankton have a More than 139,000 described taxa are expected
longer than 3-day turnover rate (i.e., weeks to to exclusively use vibrational signaling and an
months) with larval forms having a once or additional 56,000 taxa use a combination of
twice per year recruitment cycle, enhancing vibrational communication and other forms of
impacts above the published model output. mechanical signaling (Cocroft and Rodríguez
Given the central role zooplankton play in the 2005). The sensory organs monitoring substrate-
ocean ecosystem, and given that not all turn borne sound (e.g., the subgenual organs in the
over rapidly, the results of McCauley et al. legs) are tuned to frequencies below 1 kHz and
(2017) are of concern for ocean health. are extremely sensitive.
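The qualitative behavior behind such model results can be illustrated with a deliberately simple calculation. The Python sketch below is not the Richardson et al. (2017) model; it ignores currents, mixing, and spatial structure entirely and only shows how the assumed population turnover time controls how long a locally depleted patch takes to regrow. All parameter values (a two-thirds initial loss, 3-day versus 21-day turnover, a 95% recovery criterion) are illustrative assumptions.

```python
# Deliberately simple, illustrative model of how fast a locally depleted
# plankton patch regrows. This is NOT the ecological model of Richardson
# et al. (2017): there is no advection, mixing, or spatial structure here,
# only exponential regrowth set by an assumed population turnover time.
import math

def recovery_days(initial_loss, turnover_days, recovered_fraction=0.95):
    """Days until a patch that lost `initial_loss` of its standing stock
    regains `recovered_fraction` of its pre-impact density, assuming the
    remaining deficit decays exponentially with rate 1/turnover_days."""
    rate = 1.0 / turnover_days                   # per-day regrowth rate
    deficit_start = initial_loss                 # missing fraction at t = 0
    deficit_target = 1.0 - recovered_fraction    # acceptable remaining deficit
    return math.log(deficit_start / deficit_target) / rate

if __name__ == "__main__":
    for turnover in (3, 21):  # fast, copepod-like turnover vs. slower taxa (days)
        days = recovery_days(initial_loss=0.66, turnover_days=turnover)
        print(f"turnover {turnover:>2} d -> ~{days:.0f} d to 95% recovery")
```

Adding replenishment and dilution by currents, as in the published model, shortens these times, which is consistent with the sensitivity to current strength described above.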

13.5 Noise Effects on Terrestrial Invertebrates

Soniferous terrestrial invertebrates include some crabs, spiders, and insects. Limited information exists on the impacts of sound on terrestrial invertebrates, with insects being the main group studied. Currently, little is known about how eggs and larvae of terrestrial invertebrates respond to high-amplitude anthropogenic sounds. As a result, this section concentrates on adult insects as representatives of terrestrial invertebrates.
13.5.1 Insect Hearing

The ability to hear air-borne sound evolved independently at least 24 times in seven orders of insects (Greenfield 2016), either as tympanal hearing or hearing with antennae. These ears are sensitive to a very broad range of frequencies, from less than 1 kHz to high ultrasonics beyond

Fig. 13.7 Graph of standardized mean sensitivity tuning curves of auditory interneuron AN1 in three cricket species: Paroecanthus podagrosus (P.p.), a neotropical cricket communicating under strong background noise levels, and Gryllus bimaculatus (G.b.) and G. campestris (G.c.), field crickets in environments with less background noise. The increased steepness in tuning toward higher frequencies filters out competing frequencies from other crickets (Schmidt et al. 2011). # Schmidt et al.; https://jeb.biologists.org/content/214/10/1754. Published green open access; https://jeb.biologists.org/content/rights-permissions

Anthropogenic noise sources produce signifi- neither modify the fundamental frequency of their
cant amplitudes of air-borne sound at frequencies song nor increase the amplitude of their calls in
from less than 10 Hz to 50 kHz (e.g., traffic on noise (i.e., lack of a Lombard effect), as do some
roads and railways, compressors, wind turbines, species of frogs and birds, to reduce masking by
military activities, and urban environments). At anthropogenic noise.
the same time, airport, road, and railroad traffic For insects using substrate-borne signals,
and construction are significant sources of experimentally induced noise may disrupt mat-
low-frequency, substrate-borne vibrations below ing. Insects either respond less frequently to
1 kHz. Such substrate-borne noise may be created signals of the opposite sex, or they cease signal-
directly by vibrating the substrate (e.g., by driving ing during the initial part of communication
over it) or indirectly via air-borne noise that (Polajnar and Čokl 2008). The fact that noise
induces vibrations in the substrate. The relatively can disrupt substrate-borne communication
low-frequency sound produced by many of these between the sexes may be utilized in pest control
sources suffers less attenuation and can thus in agriculture (Polajnar et al. 2015). For example,
travel farther from the source. Because many substrate-borne noise can mask the mating signals
insects have very sensitive receptors for of species of leafhoppers, which represent a major
substrate-borne sound, with displacement pest in vineyards, resulting in reduced reproduc-
thresholds less than 1 nm, they are likely to detect tive success. A similar approach was successful
anthropogenic sources over long distances. with pine bark beetles, when the substrate-borne
Anthropogenic noise may therefore have a signif- noise spectrally overlapped with beetle signals
icant impact on the ability of insects to communi- (Hofstetter et al. 2014).
cate and listen in both the air-borne and substrate- The failure to adjust the frequency or ampli-
borne channel (reviewed by Morley et al. 2014; tude of mating signals in noise does, however, not
Raboin and Elias 2019). exclude other means of behavioral plasticity. For
example, the responses of male field crickets
(Gryllus bimaculatus) to traffic noise depended
13.5.2 Behavioral Effects on prior experience (Gallego-Abenza et al. 2019).
Recordings of car noise were played back to
Anthropogenic noise may impact insects in vari- males living at different ranges from the road
ous ways. It can mask communication signals, and, therefore, with different prior experience to
increase stress, affect larval development, and road noise. Males farther from the road decreased
ultimately decrease lifespan (reviewed by Raboin their chirp rate more than those nearer by,
and Elias 2019). The most common consequence suggesting that “behavioral plasticity modulated
of noise is masking, when noise overlaps in time by experience may thus allow some insect species
and frequency with a signal. This decreases the to cope with human-induced environmental
signal-to-noise ratio and thus the detection and/or stressors” (Gallego-Abenza et al. 2019).
discrimination of signals. For example, Schmidt Developmental plasticity may also manifest in
et al. (2014) found that anthropogenic noise signal modifications in response to noise. The
resulted in less effective female cricket orienta- courtship signals of grasshoppers are more broad-
tion toward signaling males (phonotaxis: band in frequency than those of crickets. Specifi-
orientated movement in relation to a sound cally, male grasshoppers (Chorthippus
source), which, in crickets, is the usual way to biguttulus) from roadside habitats produced
bring the sexes together. In another cricket spe- higher-frequency signals compared to
cies, males shortened their calls and paused sing- grasshoppers in quieter habitats (Lampe et al.
ing with increasing noise level. However, males 2014). In an experiment that reared half of the
did not adjust the duration of intervals between grasshopper nymphs in a noisy environment and
song elements important for species identification the other half in a quiet environment, adult males
(Orci et al. 2016). Apparently, these insects can from the first group produced signals with higher-

frequency components, suggesting that develop- monsters, monitors, and bearded dragons) spe-
mental plasticity allows signal modifications in cies. Soniferous reptiles include some snakes,
noisy habitats. alligators, crocodiles, geckos, and freshwater
and marine turtles (e.g., Young 1997).
Reptiles are surrounded by anthropogenic
13.5.3 Physiological Effects noise from traffic (in water, on land, and in air),
construction, mineral and hydrocarbon explora-
Strong anthropogenic noise can result in hearing tion and production, etc. Because many anthropo-
loss. Auditory receptors in the locust ear showed genic noise sources are low in frequency and thus
a decreased ability to encode sound after noise within the reptilian hearing range, understanding
exposure. The mechanism for such hearing loss the impact of these sources on behavior and phys-
reveals striking parallels with that of the mamma- iology is an important start for reptile
lian auditory system (Warren et al. 2020). A conservation.
series of experiments was conducted to determine Little literature exists on the impacts of anthro-
whether exposure to simulated road traffic noise pogenic noise on reptiles, with sea turtles having
induces increased heart rates, as an indicator of a received recent attention. Simmons and Narins
stress response (Davis et al. 2018). Larvae of the recently reviewed the topic (2018). Currently,
monarch butterfly (Danaus plexippus) exposed little is known about how eggs and juvenile
for 2 h to road traffic noise experienced a signifi- reptiles respond to anthropogenic noise. As a
cant increase in heart rate, indicative of stress. result, this section concentrates on adult sea
Because these larvae do not have ears for turtles as a representative of reptiles.
air-borne sound, the likely sensory pathway Acoustic signals play an important role in tur-
involved vibration receptors. However, exposing tle social behavior and reproduction. Turtles
larvae for longer periods (up to 12 days) to con- make very-low-frequency calls of short duration
tinuous traffic noise did not increase heart rate at by swallowing or by forcibly expelling air from
the end of larval development; so chronic noise their lungs. Galeotti et al. (2005) published a
exposure may result in habituation or desensitiza- summary of sound occurrence, context, and
tion. However, habituation to stress during larval usage in Cryptodira chelonians—a taxon, which
stages may impair reactions to stressors in adult is quite soniferous. In general, turtles call when
insects. mating or seeking a mate, when they are sick or in
While more research is necessary to under- distress, or for other reasons. Male red-footed
stand the sensory strategies for avoiding or com- tortoises (Chelonoidis carbonaria) make a
pensating for anthropogenic noise, there are some clucking sound during mounting, Greek tortoises
cases where insects experience a significant fit- (Testudo graeca) whistle during combat, and
ness advantage. This may happen in a predator- young big-headed turtles (Platysternon
prey or parasitoid-host relationship, when the megacephalum) squeal when disturbed (Galeotti
noise decreases the ability of a parasitoid fly to et al. 2005). Nesting female leatherback sea
localize calls of their host crickets (Lee and turtles (Dermochelys coriacea) make a belching
Mason 2017), or when bats as predators of flying sound (Cook and Forrest 2005; Mrosovsky
insects are less efficient foragers in the presence 1972), and the sounds from leatherback sea turtle
of anthropogenic noise (Siemers and Schaub eggs are believed to help coordinate hatching
2011). (Ferrara et al. 2014).

13.6 Noise Effects on Reptiles 13.6.1 Reptile Hearing

Reptiles have both aquatic (sea turtles, alligators, Not all reptiles produce sound for communica-
and crocodiles) and terrestrial (geckos, snakes, tion. Most reptiles can detect substrate-borne
iguana, whiptails, geckos, chameleons, gila vibrations (e.g., Barnett et al. 1999; Christensen

et al. 2012). The auditory anatomy of most reptile 13.7 Noise Effects on Amphibians
species includes a tympanic membrane near the
rear of the head, a middle ear with a stapes, and a Frogs rely heavily on acoustic communication for
fluid-filled inner ear housing the lagena and its mating. Noise has been shown to alter both the
sound-sensing cells (Wever 1978). Brittan- production and perception of frog vocalizations.
Powell et al. (2010) indicated that reptile hearing This can have serious implications for reproduc-
is similar in frequency range to hearing in birds tion in these animals. Males that do not call as
and amphibians. The most sensitive lizards have often will not attract females to their locations
similar absolute sensitivities to birds. Ridgway along a pond edge. Females that do not hear the
et al. (1969) used electrophysiological methods advertisement calls from the males will not be
to test hearing abilities of the green sea turtle able to localize or approach them. Further, they
(Chelonia mydas) and found peak sensitivity will not be able to sample multiple males for
between 300 and 400 Hz, with the best hearing selection of the most attractive one. Studies have
range from 60 to 1000 Hz. In general, the best been conducted in both the laboratory and the
frequency range of hearing in chelonids (turtles, field to determine the effects of noise on acoustic
tortoises, and terrapins) is 50–1500 Hz (Popper communication in frogs, for both vocal produc-
et al. 2014). tion and auditory perception.

13.6.2 Behavioral Responses to Noise 13.7.1 Frog Hearing

Sea turtles may be exposed to acute and chronic The amphibian ear consists of a tympanic mem-
noise. The soundscape of the Peconic Bay Estu- brane on the outside through which sound enters
ary, Long Island, NY, USA, a major coastal for- the ear, a middle ear containing a columella,
aging area for juvenile sea turtles, was recorded similar to the mammalian stapes, that provides
during sea turtle season. There was considerable mechanical lever action, and an inner ear in
boating and recreational activity, especially which sound is converted to neural signals
between early July and early September. Samuel (Wever 1985). The inner ear contains two papil-
et al. (2005) suggested that increasing and chronic lae, known as the amphibian papilla, which
exposure to high levels of anthropogenic noise responds to lower frequencies, and the basilar
could affect sea turtle behavior and ecology. papilla, which responds to higher frequencies.
Indeed, loggerhead sea turtles have been shown Audiograms show good sensitivity between
to dive when exposed to seismic airgun noise— 100 Hz and a few kHz (e.g., Megela-Simmons
perhaps as a means of avoidance (DeRuiter and et al. 1985). Some species, however, exhibit sen-
Larbi Doukara 2012). In the terrestrial world, sitivity also to ultrasound (Narins et al. 2014), and
desert tortoises (Gopherus agassizii) exposed to others to infrasound (Lewis and Narins 1985).
simulated jet overflights did not show a startle
response or increased heart rate, but they froze;
and in response to simulated sonic booms, they
13.7.2 Behavioral Responses to Noise
exhibited brief periods of alertness (Bowles et al.
1999).
Some species of frogs, like other animals, are
Unfortunately, there is a complete lack of data
known to avoid roads and highways, possibly to
on masking of biologically important signals in
avoid both traffic mortality and a reduced trans-
sea turtles and other reptiles by anthropogenic
mission of vocal signals (reviewed by
noise (Popper et al. 2014). Similarly, there has
Cunnington and Fahrig 2010). Several studies,
been little research on physiological effects of
however, failed to document behavioral avoid-
noise in reptiles.
ance of noise by frogs or did not find reduced

frog abundance near continuous noise sources 2008). Barber et al. (2010) believed that these
such as highways (Herrera-Montes and Aide frogs were unable to adjust the frequency or dura-
2011). tion of their calls to increase signal transmission.
Nonetheless, noise does affect the perception Penna et al. (2005) found a similar decrease in
of acoustic signals by frogs. Bee and Swanson call rate in leptodactylid frogs (Eupsophus
(2007) investigated the potential of noise from calcaratus) exposed to recordings of natural
road traffic to interfere with the perception of noise in the wild.
male gray treefrog (Dryophytes chrysoscelis) An effective way to increase the likelihood
signals by females. Using a phonotaxis assay, that acoustic signals will be received is by
they presented females with a male advertisement increasing the intensity of those signals (Lombard
call at various signal levels (37–85 dB re 20 μPa) effect). Love and Bee (2010) measured the
in three masking conditions: (1) no masking intensities of vocalizations produced in the labo-
noise, (2) a moderately dense breeding chorus, ratory by Cope’s gray treefrog (Dryophytes
and (3) road traffic noise recorded in wetlands chrysoscelis) in the midst of different levels of
near major roads. In both the chorus and traffic background noise, similar to a frog chorus. They
noise maskers, female response latency increased, found no evidence for the existence of the Lom-
orientation behavior toward the signal decreased, bard effect in their frogs. Frogs produced calls at a
and response thresholds increased by about level of 92–93 dB re 20 μPa, regardless of noise
20–25 dB. The authors concluded that realistic level. Similar to findings from other frogs, Cope’s
levels of traffic noise could limit the active space, gray treefrogs increased call duration and
or the maximum transmission distance, of male decreased call rate with increasing noise levels.
treefrog advertisement calls. Another treefrog However, they appeared to be maximizing their
(Dendropsophus ebraccatus) tested in a labora- call amplitudes in every calling situation, which
tory to compare the effects of dominant frequency does not allow them to increase their call
and signal-to-noise ratio on call perception intensities further when needed. On the contrary,
showed a low-frequency call preference in quiet túngara frogs (Engystomops pustulosus) and
conditions (usually correlated with larger, more rhacophorid treefrogs (Kurixalus chaseni) did
attractive males), but no preference at higher increase their call levels in noise (Halfwerk et al.
signal-to-noise ratios (Wollerman and Wiley 2016; Yi and Sheridan 2019).
2002). These results indicate that females listen- Another possible way for a frog to increase
ing to males in a noisy environment will likely communication efficacy would be to increase
make errors in mate choice. the frequencies of their calls to be above the
Sun and Narins (2005) examined the effects of frequency of the masking noise. Parris et al.
fly-by noise from airplanes and played back (2009) found that two species of frogs (southern
low-frequency sound from motorcycles to an brown treefrog, Litoria ewingii, and common
assemblage of frog species in Thailand. Three of eastern froglet, Crinia signifera) called at a higher
the most acoustically active species (Microhyla frequency in traffic noise (e.g., 4.1 Hz/dB for
butleri, Sylvirana nigrovittata, and Kaloula L. ewingii), and suggested this was an adaptation
pulchra) decreased their calling rate and the over- to be heard over the noisy environmental
all intensity of the assemblage calls decreased. conditions. An extreme form of this frequency-
However, calls from another frog (Hylarana increasing behavior has been discovered in
taipehensis) seemed to persist. The authors concave-eared torrent frogs (Odorrana tormota)
suggested that the anthropogenic noise in China (Feng and Narins 2008). These frogs live
suppressed the calling rate of some species, but near extremely loud streams and waterfalls
seemed to stimulate calling behavior in (58–76 dB re 20 μPa, up to 16 kHz), which should
H. taipehensis. Another study found that the make vocalizations difficult for other frogs to
vocalization rate of European treefrog (Hyla hear, at least at the lowest frequencies. The calls
arborea) decreased in traffic noise (Lengagne from these frogs are quite different from the

Fig. 13.8 Spectrograms, waveforms, and call spectra from six vocalizations from the O. tormota frog (Feng and Narins 2008). Reprinted by permission from Springer Nature. A. S. Feng and Narins, P. M. Ultrasonic communication in concave-eared torrent frogs (Amolops tormotus). Journal of Comparative Physiology A, 194(2), 159–167; https://link.springer.com/article/10.1007/s00359-007-0267-1. # Springer Nature, 2008. All rights reserved

vocalizations of other frogs, however. These tor- 13.7.3 Physiological Responses


rent frogs produce numerous vocalizations with to Noise
energy in the ultrasonic frequency range
(Fig. 13.8). A phonotaxis study found that female Spatially separating a signal from a masker is one
torrent frogs actually preferred synthetic male way to improve signal detectability. Spatial
calls embedded in higher-amplitude stream noise release from masking has been demonstrated in
than those embedded in lower-amplitude stream frogs behaviorally as well as physiologically.
noise (Zhao et al. 2017). These ultrasonic signals Ratnam and Feng (1998) recorded from single
are both produced and perceived by males and units in the inferior colliculus of northern leopard
females, suggesting that they are not just a frogs (Lithobates pipiens) and found
by-product of vocal production, and are instead improvements in signal detection thresholds
an adaptation to avoid signal masking in a very with spatially separated signals and noise maskers
noisy environment (Shen et al. 2008). relative to spatially coincident signals and
Some species of frogs are known to use visual maskers. This has been shown in laboratory stud-
signals when conditions are noisy, in an effort to ies with awake behaving animals, when female
improve communication. Grafe et al. (2012) Cope’s gray treefrogs approached a target signal
recorded acoustic and visual communication (male calling frog) more readily when it was
strategies in noisy conditions by the Bornean spatially separated (by 90 ) from a noise source
rock frog (Staurois parvus). These frogs modified (Bee 2007). This spatial release from masking, in
the amplitude, frequency, repetition rate, and the range of 6–12 dB, is similar to what is seen in
duration of their calls in response to noise, but other animals such as budgerigars (Melopsittacus
in addition engaged in visual foot-flagging and undulatus; Dent et al. 1997) and killer whales
foot-touching behaviors. In a noisy world and (Bain and Dahlheim 1994).
with limited flexibility in vocal production Finally, increased levels of corticosterone,
capabilities, adding a visual component to an which correlated with impaired female mobility,
acoustic signal may be one of the only ways have been shown in high traffic noise conditions
these animals are able to adapt. in female wood frogs (Lithobates sylvaticus)

(Tennessen et al. 2014), although a recent study otoliths of the inner ear, which sends neural
suggests that eggs taken from high traffic noise signals to the brain. The inner ear is sensitive to
conditions yielded frogs that were less affected by particle motion. Fish with swim bladders close to
noise exposure than frogs from eggs taken from or even connected to the ears are also sensitive to
low traffic noise environments, suggesting acoustic pressure. This is because the sound pres-
adaptations are possible (Tennessen et al. 2018). sure excites the gas bladder, which reradiates an
Whether it is from the stress or the masking of the acoustic wave that drives the otolith. Particle
acoustic signals, anthropogenic noise has been motion then creates differential movement
shown to have negative consequences. between the otoliths and the rest of the ear. The
lateral line system involves neuromasts that detect
water flow and acoustic particle motion. Due to
13.8 Noise Effects on Fish variability in otolith anatomy and the absence or
presence and variable connectivity of swim
All fish species studied to date can detect sound. bladders, fish hearing varies greatly with species
Hundreds of species are known to emit sound in terms of sensitivity and bandwidth, with most
with the most prominent display of sound produc- species sensitive to somewhere between 30 and
tion in fishes being their choruses on spawning 1000 Hz, but some species detecting infrasound,
grounds (Slabbekoorn et al. 2010). Adult, juve- and others ultrasound up to 180 kHz (Popper and
nile, and larval-stage fishes actively use environ- Fay 1993, 2011; Tavolga 1976). Hearing in noise
mental sound to orientate and settle (Jeffrey et al. has been studied and parameters such as the criti-
2002; Simpson et al. 2005, 2007). Herring cal ratio (signal-to-noise ratio for sound detection,
(Clupea harengus) have shown avoidance behav- see Chap. 10) have been measured (Fay and Pop-
ior to playbacks of sounds of killer whales, one of per 2012; Tavolga et al. 2012); however, the
their predators (Doksaeter et al. 2009). Underwa- significance of acoustic masking to fish fitness
ter anthropogenic noise can have a variety of and survival remains poorly understood.
effects on fish, ranging from behavioral changes,
masking, stress, and temporary threshold shifts, to
tissue and organ damage, and death in extreme 13.8.2 Behavioral Responses to Noise
cases (Hawkins and Popper 2018; Normandeau
Associates 2012; Popper and Hastings 2009). The schooling behavior of fish has been observed
Mortality can also result from an increased risk to change in response to an approaching airgun
of predation in noisy environments (Simpson with fish swimming faster, deeper in the water
et al. 2016). Despite the growing amount of liter- column, and in tighter schools (Davidsen et al.
ature, our understanding of the cumulative effects 2019; Fewtrell and McCauley 2012; Neo et al.
of multiple exposures and the fitness implications 2015; Pearson et al. 1992). Caged fish had
to wild fish is limited. compacted near the center of the cage floor at
received levels of 145–150 dB re 1 μPa2s and
swimming behavior returned to normal after
13.8.1 Fish Hearing 11–31 min (Fewtrell and McCauley 2012). A
startle response was noted when the airgun was
Fish have two systems detecting sound and vibra- discharged at close range (Pearson et al. 1992),
tion: the inner ear and the lateral line system. The but not when the received level was ramped up by
inner ear of fish resembles an accelerometer. It approaching from a longer range; also, the startle
contains otoliths, which are bones of approxi- response diminished over time (Fewtrell and
mately three times the water density. Water- McCauley 2012). Wild pelagic and mesopelagic
borne acoustic waves therefore result in differen- species dove deeper and their abundance
tial motion between the otoliths and the fish’s increased at long range from the airgun array
body, thus bending hair cells coupled to the (Slotte et al. 2004). There are a few studies

Fig. 13.9 (a) Experimental setup to study fish responses to playbacks of pile driving sound. (b) Echogram of zooplankton dropping in depth below sea surface during playback of pile driving sound (red ellipses). Time is along the x-axis; playback started at the 1st vertical black line, stopped at the 2nd line, restarted at the 3rd line, and stopped at the 4th line (modified from Hawkins et al. 2014). # Acoustical Society of America, 2014. All rights reserved

documenting a drop in catch rates of pelagic fish 13.8.3 Effects of Noise on the Auditory
after seismic surveying (Engås and Løkkeborg and other Systems
2002; Engås et al. 1996; Slotte et al. 2004),
believed to be due to behavioral responses. After exposure to intense pulsed sound from
Hawkins et al. (2014) played pile driving noise airguns, extensive hearing damage in the form
to wild zooplankton and fish. A loudspeaker was of ablated or missing hair cells was found in
deployed from one boat for sound transmission, pink snapper (Pagrus auratus) (McCauley et al.
while an echosounder and side-scan sonar were 2003a, b). Other studies have found only limited
deployed from a second boat for animal observa- or no hearing damage or threshold shift in various
tion (Fig. 13.9a). Zooplankton dropped in depth species of fish from airgun exposure (Hastings
below the sea surface after playback onset as and Miksis-Olds 2012; Popper et al. 2005; Song
shown by the echogram in Fig. 13.9b. Wild et al. 2008). Apart from the typical differences in
sprat (Sprattus sprattus) and mackerel (Scomber experimental setup, exposure regime, and species
scombrus) exhibited a diversity of responses tested, a factor influencing the degree of noise
including break-up of aggregations and reforming impact might be the direction from which sound
of much denser aggregations in deeper water. The is received (specifically, vertical versus horizontal
sprat is sensitive to sound pressure, however the incidence; McCauley et al. 2003a). Fish ears are
mackerel lacks a swim bladder and is sensitive to not symmetrical and many anthropogenic sound
the particle motion. The occurrence of behavioral sources have a strong vertical directionality under
responses increased with the received level. The water due to their near-surface deployment lead-
50% response thresholds were 163.2 and ing to a dipole sound field.
163.3 dB re 1 μPa pk-pk and 135.0 and 142.0 Halvorsen et al. (2012, Fig. 13.11) looked for
dB re 1 μPa2s (single-strike exposure) for sprat tissue and organ damage in Chinook salmon
and mackerel, respectively (Hawkins et al. 2014; (Oncorhynchus tshawytscha) that were placed
Fig. 13.10). inside a standing-wave test tube (High-Intensity

Fig. 13.10 Dose-response curves (solid lines) and 95% confidence intervals (dashed lines) of (a) sprat and (b) mackerel to peak-to-peak sound pressure levels from pile driving (modified from Hawkins et al. 2014). # Acoustical Society of America, 2014. All rights reserved
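Dose-response relationships such as those in Fig. 13.10 are commonly summarized by a logistic function of received level, from which the 50% response threshold can be read directly. The sketch below is only a generic illustration of that approach, not the fitting procedure used by Hawkins et al. (2014); the slope value is an assumption, and the 163.2 dB re 1 μPa pk-pk threshold is simply the sprat value quoted in the text.

```python
# Illustrative logistic dose-response curve: probability of a behavioral
# response as a function of received peak-to-peak sound pressure level.
# L50 is the 50% response threshold; the slope is an assumed example value.
import math

def response_probability(level_db, l50_db, slope_db=5.0):
    """Logistic dose-response: equals 0.5 at l50_db; smaller slope_db = steeper."""
    return 1.0 / (1.0 + math.exp(-(level_db - l50_db) / slope_db))

if __name__ == "__main__":
    L50 = 163.2  # dB re 1 uPa pk-pk, 50% response threshold reported for sprat
    for level in (150.0, 160.0, 163.2, 170.0, 180.0):
        p = response_probability(level, L50)
        print(f"{level:6.1f} dB re 1 uPa pk-pk -> response probability {p:.2f}")
```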

Fig. 13.11 Chinook salmon injuries from noise exposure. Mild: (a) eye hemorrhage, (b, c) fin hematoma. Moderate: (d) liver hemorrhage and (e) bruised swim bladder. Mortal: (f) intestinal hemorrhage and (g) kidney hemorrhage (Halvorsen et al. 2012). # Halvorsen et al.; https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0038968; licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Controlled Impedance Fluid-filled wave Tube, some species’ hearing extending into the infra-
HICI-FT) in which pressure and particle motion sonic range (Dooling et al. 2000).
could be controlled. Physical injury commenced
at 211 dB re 1 μPa2s cumulative sound exposure
resembling 1920 strikes of a pile driver at 177 dB 13.9.2 Behavioral Responses to Noise
re 1 μPa2s each.
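For context, when an exposure consists of N pulses of roughly equal energy, the cumulative sound exposure level follows from the single-strike level by energy summation. The relation below is a standard acoustic identity, not a formula taken from Halvorsen et al. (2012); plugging in the values quoted above is meant only as a consistency check.

```latex
% Energy summation over N equal-energy pulses
\mathrm{SEL}_{\mathrm{cum}} = \mathrm{SEL}_{\mathrm{ss}} + 10\log_{10} N,
\qquad
177 + 10\log_{10}(1920) \approx 210~\mathrm{dB~re~1~\mu Pa^{2}s}
```

This agrees to within about 1 dB with the 211 dB re 1 μPa2s onset level quoted above; the individual strikes in the experiment need not have been exactly equal in energy.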
Yelverton (1975) conducted studies of the Several studies have demonstrated that some
gross effects of sounds generated from underwa- birds are affected by low-frequency (<3 kHz)
ter explosive blasts on fish. He found three impor- anthropogenic noise from roadways and that
tant factors that influenced the degree of damage: long-term exposure can lead to lower species
the size of the fish relative to the wavelength of diversity or lower breeding densities in an area
the sound, the species’ anatomy, and the location (reviewed by Goodwin and Shriver 2011; Reijnen
of the fish in the water column relative to the and Foppen 2006). Urban noise is known to affect
sound source. reproduction and mating behaviors of birds in
several ways. Urban noise can mask acoustic
components of the lekking display by male
greater sage grouse (Centrocercus urophasianus;
13.9 Noise Effects on Birds
Blickley and Patricelli 2012). It also disrupts
female preference for low-frequency songs sung
Birds rely heavily on acoustic communication for
by male canaries (des Aunay et al. 2014) and
life functions such as warning others about
great tits (Halfwerk et al. 2011). Females of
predators, finding and assessing the quality of
these (and other) species prefer males that sing
mates, defending territories, and discerning
lower-frequency songs over those that sing
which youngster to feed (Bradbury and
higher-frequency songs because the
Vehrencamp 2011). When environmental noise
low-frequency songs are sung by males of higher
levels are high, such functions become difficult
quality (e.g., Gil and Gahr 2002). When
or impossible, unless the birds can make tempo-
low-frequency urban noise masks the
rary or permanent adjustments to their signal,
low-frequency components of calls and songs,
posture, or location. There have been several
females either cannot detect or find the males
studies on the effects of noise on survival and
that are singing or cannot discriminate between
communication in birds in the field as well as
the high-quality males singing at low frequencies
the laboratory, and on the ways that birds adjust
and the poorer-quality males singing at higher
their communication signals and/or lifestyles to
frequencies.
adapt to the noisy modern world.
Urban noise also has influences on where birds
choose to live and breed, often resulting in
consequences for choosing less favorable
13.9.1 Bird Hearing habitats. For instance, Eastern bluebirds (Sialia
sialis) living in noisier environments were found
The avian ear has three main parts: an outer, to have reduced reproductive productivity and
middle, and inner ear. The outer ear is typically brood size compared to those living in quieter
hidden by feathers, but consists of a small exter- habitats (Kight et al. 2012). The presence and
nal meatus. A tympanic membrane separates the absence of construction and highways often
outer and middle ear. The middle ear contains the changes the distribution of birds. Foppen and
columella that mechanically transmits sound to Deuzeman (2007) compared the distribution of
the inner ear, and a connected interaural canal to reed warbler (Acrocephalus arundinaceus) pairs
aid in directional hearing. The basilar papilla in in the Netherlands before a highway was built
the inner ear converts sound into neural signals. through a nesting area and after the highway
Most birds hear between 50 Hz and 10 kHz, with was present. When the highway was present

there were fewer nesting pairs, meaning that some mating and reproductive success. Nestling
birds were avoiding preferred habitats to avoid white-crowned sparrows (Zonotrichia
traffic noise. The road was temporarily closed and leucophrys) tutored with songs embedded in
the number of nesting pairs increased; however, anthropogenic noise later sung songs at higher
once the road reopened the number of nesting frequencies and with lower vocal performance
pairs again decreased. A more extensive study than those tutored with non-noisy control songs
conducted in the Netherlands found that 26 of (Moseley et al. 2018). As another example, when
43 (60%) woodland bird species showed reduced alarm calls were presented to tree swallow
numbers near roads (Reijnen et al. 1995). Another (Tachycineta bicolor) nestlings, the tree swallows
count of birds near and far from roads showed in quiet environments crouched more often (hid-
that even when habitats were similar to one ing from predators) while the nestlings in noisy
another, but either near to or far from a highway, environments produced longer calls and did not
the number of birds in each area increased with crouch (McIntyre et al. 2014). Nestling tree
increasing distance from the road (Fig. 13.12), swallows living in noisier environments produced
correlating with noise levels (Polak et al. 2013). narrower-bandwidth and higher-frequency calls
That is, both abundance and diversity of birds than those from quieter nests (Leonard and Horn
increased as noise levels decreased. Other studies 2008), although hearing of noise-reared nestlings
have confirmed that birds with higher-frequency does not differ from that of quiet-reared nestlings
calls were less likely to avoid the roadways than (Horn et al. 2020). These studies indicate that
birds with lower-frequency calls (Rheindt 2003), noise could affect how well offspring hear
again pointing to the challenges that many birds predators and how well parents hear begging
have when communicating in low-frequency calls. It also could influence the rate of feeding
urban noise, and highlighting the difficult choice nestlings and could even have long-lasting effects
that birds must face: Do the costs of choosing a on call structure, which could influence breeding
less favorable habitat outweigh the benefits of success of those nestlings as adults. In a labora-
living in quieter environments? The answer to tory study looking at the effects of noise on repro-
this question clearly differs across both individual duction, high levels of environmental noise
birds and species. eroded pair preferences in zebra finches (Swaddle
When birds do choose to nest in noisier and Page 2007). Paired females chose non-partner
environments, there could be consequences for males over their partners when moderate to high

Fig. 13.12 Relationship between bird abundance at point-count locations and distance from the road. Arrows show significant differences between points (Polak et al. 2013). Axes: number of birds (mean + SE + SD) versus distance from road (60 m, 310 m, 560 m). # Springer Nature, 2013; https://link.springer.com/article/10.1007/s10342-013-0732-z/figures/5. Licensed under CC BY; https://creativecommons.org/licenses/

levels of white noise were presented in a prefer- discriminate the calls of their parents from calls
ence test. These results have implications for of other adults at a negative signal-to-noise ratio,
noisy environments altering the population’s suggesting that the enhanced detectability of nat-
breeding styles and eventually the evolutionary ural vocal signals found in the laboratory actually
trajectory of the species (Swaddle and Page translates to excellent acuity in the wild (Aubin
2007). and Jouventin 1998).
All of the above-mentioned studies reveal that
songs and calls are more or less discriminable or
13.9.3 Communication Masking
detectable when they are presented within differ-
ent masker types. For instance, great tits have
To know exactly how noise affects acoustic com-
better thresholds for detecting song elements
munication in birds, playback or perceptual
embedded in woodland noise than urban noise
experiments must be conducted to measure audi-
(Fig. 13.13a; Pohl et al. 2009). Interestingly,
tory acuity in a controlled environment.
detection of song elements in the dawn chorus
Experiments would use either pure tones and
was the most difficult condition for the great tits
white noise or more complex and natural signals
compared to the other noise types, suggesting that
that birds use for communication purposes. Con-
birds are not necessarily listening to one another
trolled laboratory studies measuring the ability to
in the mornings while they are singing. Canaries
detect simple pure tones in broadband noise have
trained to identify canary songs embedded in one
been conducted in over a dozen bird species
to four other distractor canary songs found it more
(reviewed by Dooling et al. 2000) using operant
difficult when there were more songs present,
conditioning techniques. These studies have
similar to conditions of the dawn chorus where
shown that as the frequency of the tone increases,
many birds are singing overlapping songs
it must be incrementally louder to hear it in a
(Fig. 13.13b; Appeltants et al. 2005). Another
noisy background. This is not unlike the trend
laboratory study determined birds’ abilities to
seen in other animals, suggesting a preserved
discriminate auditory distance, a task crucially
evolutionary mechanism for hearing in noise.
important for territorial birds. Pohl et al. (2015)
Other laboratory studies measuring the detec-
trained great tits to discriminate between virtual
tion and discrimination of calls and songs embed-
birdsongs at near and far distances, presented in
ded in various types of noise can reveal more
quiet or embedded in a noisy dawn chorus. The
about the exact nature of the active space for the
birds accurately discriminated between distances,
natural acoustic signals used for communication
although this was much harder in noisy than in
by social birds. Psychoacoustic studies often test
quiet conditions. In summary, these experiments
the abilities of birds to detect, discriminate, or
and others demonstrate that hearing in noise is
identify songs or calls that are embedded in a
possible, and that factors such as the spectro-
chorus of other songs or different types of noise
temporal make-up of signals, noise type, and
(e.g., urban or woodland). Operant conditioning
noise level all have an influence on hearing
experiments on zebra finches, European starlings
signals in noise.
(Sturnus vulgaris), canaries (Serinus canaria),
As a whole, results from the laboratory and
great tits (Parus major), and budgerigars all
field experiments suggest that bird communica-
show that birds have excellent acuity for detecting
tion is more successful in quiet, rather than noisy
or discriminating communication signals relative
environments, that the type of noise matters for
to pure tones, possibly due to the ecological rele-
communication, and that if noise is present,
vance of these signals (Appeltants et al. 2005;
adjustments need to be made to the calls or
Dent et al. 2009; Hulse et al. 1997; Lohr et al.
songs of signalers for those signals to be detected,
2003; Narayan et al. 2007; Pohl et al. 2009). In a
discriminated, and localized by the receivers. One
field test of call discrimination, juvenile king
such adjustment that has shown to be effective is
penguins in a noisy colony were able to
changing the position of the signal relative to the

Fig. 13.13 (a) Masked thresholds for great tits detecting a synthetic song element embedded in silence, woodland noise, urban noise, or dawn chorus noise (adapted from Pohl et al. 2009). Performance is best for quiet conditions, worst for the chorus conditions. Thresholds are higher for urban noise than woodland noise. (b) Performance for canaries discriminating song elements embedded in 1–4 other songs (adapted from Appeltants et al. 2005). As the number of maskers increases, performance decreases. Axes: (a) threshold in dB SPL versus noise condition (silence, woodland, urban, dawn chorus); (b) percent correct responses versus number of song maskers (1–4)

masker. Dent et al. (1997) found that thresholds one another in noisy environments, changing
for budgerigars detecting a pure tone in white their position or even simply moving their heads
noise were 11 dB lower when the signal and will increase communication efficiency in similar
noise were separated by 90° in space than when ways as humans attempting to speak to one
they were co-located (i.e., spatial release from another in a noisy cocktail party will often move
masking). A follow-up study showed an even their head toward a speaker.
greater advantage when the spatially separated Another adjustment made by many birds is to
signal was zebra finch song and the masker was shift the frequency content of songs to a higher
a zebra finch chorus (Fig. 13.14; Dent et al. 2009). range, as documented for European blackbirds
Thus, when birds are trying to communicate with (Turdus merula; Slabbekoorn and Ripmeester
2008), plumbeous vireos (Vireo plumbeus;
Francis et al. 2011), gray vireos (Vireo vicinior;
Francis et al. 2011), European robins (McMullen
et al. 2014), chaffinches (Verzijden et al. 2010),
black-capped chickadees (Poecile atricapillus;
Proppe et al. 2011), and a number of tropical
birds (de Magalhães Tolentino et al. 2018).
Whether this is a true adaptation attempting to
increase the lowest frequencies of songs above
the highest frequencies of the noise, whether it is
simply easier for the birds to make high
frequencies louder, or whether urban birds live
in denser environments and want to distinguish
Fig. 13.14 Signal-to-noise ratio thresholds for detecting a their songs from those of other birds is still being
zebra finch song are higher (worse) when a chorus masker debated (e.g., Nemeth et al. 2013).
is co-located with the song (black boxes) than when the Pohl et al. (2012) tested the consequences of
song is spatially separated from the masker (green boxes),
in both budgerigars and zebra finches. Adapted from Dent such shifts on perception in the laboratory. These
et al. (2009) authors trained great tits to detect or discriminate

between song phrases embedded in urban or have an almost-immediate ability to re-occupy


woodland noises. In the urban noise background, an acoustic niche within a soundscape.
it was easier for the tits to detect the high-
frequency phrases than the low-frequency
phrases. There was no difference in the woodland 13.9.4 Physiological Effects
noise for detection of the different song types. For
birds attempting to discriminate high- or One major advantage birds possess, compared to
low-frequency songs embedded in woodland or humans, is the ability to regenerate auditory sen-
urban noises, the researchers found that the high- sory cells lost during exposure to very loud
frequency elements were more useful in urban sounds (Ryals and Rubel 1988), therefore birds
conditions, while the whole song was used for experience no hearing loss over time from either
discrimination in woodland noise. Thus, birds aging or noisy environments. Birds do, however,
that are changing their calls and songs into experience stress from noise (Blickley et al. 2012;
higher-frequency ranges for improved communi- Strasser and Heath 2013).
cation in noisy urban environments are doing so Acoustic communication in birds is vital for
adaptively. survival, and understanding how noise affects
Other vocal adjustments made by birds in sound production and perception is important
response to noise are to sing more during the for conservation efforts. Birds are clearly affected
quiet night than during the noisy day (as in by the increasing levels of urban noise in their
European robins; Fuller et al. 2007), to shift the environments, but many adjust their calling and
initiation of the dawn chorus by as much as 5 h to singing styles or locations to overcome problems
compensate for traffic noise (as in European of communicating in noise. Certainly, there are
blackbirds; Nordt and Klenke 2013), and to both limits to and consequences of those
increase the intensity of vocalizations (Lombard adjustments.
effect). Black-capped chickadees modify the
structure and frequencies of their alarm calls in
response to noise (Courter et al. 2020), while 13.10 Noise Effects on Terrestrial
house wrens (Troglodytes aedon) reduce the size Mammals
of their song repertoires in addition to changing
their song frequencies (Juárez et al. 2021). In a Anthropogenic noise affects mammals in a vari-
field study on noisy miners (Manorina ety of ways changing their behavior, physiology,
melanocephala), Lowry et al. (2012) found that and ultimately ability to succeed in what other-
individuals at noisier locations produced louder wise might be considered optimal habitat. Terres-
alarm calls than those at quieter locations. The trial mammals show responses that range from
Lombard effect has also been demonstrated in the ignoring or tolerating to avoiding noise, with
laboratory in Japanese quail (Coturnix japonica; potential impacts ranging from negligible to
Potash 1972), budgerigars (Manabe et al. 1998), severe (Slabbekoorn et al. 2018b).
chickens (Gallus gallus domesticus; Brumm et al.
2009), nightingales (Luscinia megarhynchos;
Brumm and Todt 2002), white-rumped munia 13.10.1 Terrestrial Mammal Hearing
(Lonchura striata; Kobayasi and Okanoya
2003), and zebra finches (Cynx et al. 1998). A Among terrestrial mammals, humans (Homo
recent experiment measuring songs of the white- sapiens) are the most studied species with preva-
crowned sparrows in urban San Francisco during lent research addressing hearing physiology and
the 2020 COVID-19 shutdown showed that the psychology, hearing loss, and restoration. The
birds responded to the decrease in noise levels mammalian ear consists of mechanical structures
with a return to decades-old song frequencies (incus, malleus, and stapes) evolutionarily
(Derryberry et al. 2020), suggesting that they derived from elements of the jaw that function

to translate sound from acoustic waves to nerve 13.10.2 Behavioral Responses to Noise
signals in the cochlea and auditory nerve. Though
very effective, the ear can sustain damage and it One of the most frequently studied sources of
degrades with age. Hearing loss results in reduced noise in terrestrial mammal habitats is traffic
auditory acuity and limited information for the noise from cars, trains, or aircraft. The most fre-
mammal to use. Loss can be caused by sudden quently reported response is animal movement
exposure to high-intensity sound (e.g., from an away from the noise source. For example, Sonoran
explosion or gunfire) or by repeated or prolonged pronghorn (Antilocapra americana sonoriensis)
noise exposure (e.g., at industrial workplaces, at increased their use of areas with lower levels of
rock concerts, or from personal media players). noise over areas with higher levels of noise from
While the general structure of the mammalian military aircraft (Landon et al. 2003). In the case of
ear is shared amongst terrestrial mammal species, mountain sheep (Ovis canadensis mexicana), 19%
there is great diversity in the sounds mammals showed disturbance to low-flying aircraft
can perceive, in the sounds they produce, and in (Krausman and Hervert 1983). Prairie dogs
their responses to sound. While human hearing (Cynomys ludovicianus) were exposed to playback
ranges from about 20 Hz to 20 kHz, elephants use of highway noise in an experimental prairie-dog
infrasound (sounds extending below the human town that was previously absent of anthropogenic
hearing range, i.e., below 20 Hz; Herbst et al. noise. The treatment area had fewer prairie dogs
2012; Payne et al. 1986) and bats use ultrasound above ground. Those that were above ground
(sounds extending above the human hearing spent less time foraging and much more time
range, i.e., above 20 kHz, with some species exhibiting vigilant behavior (Shannon et al.
hearing and emitting sound up to 220 kHz; 2014) leading to earlier predator detection and
Fenton et al. 2016). Rodents are known to be earlier flight response (Shannon et al. 2016).
quite diverse, with subterranean species having A major concern regarding these behavioral
excellent low-frequency hearing and terrestrial responses by wildlife to traffic corridors is habitat
rodents having excellent ultrasonic hearing fragmentation together with limited connectivity.
(reviewed by Dent et al. 2018). Mammals can Noisy areas may displace wildlife and form
thus be expected to display a diversity of barriers to migration and dispersal (Barber et al.
responses to noise. 2011; Fig. 13.15). Roads also fragment bat

Fig. 13.15 (a) Photo of the Going-to-the-Sun road in Glacier National Park, USA. (b) 3D plot of 24-h traffic noise. (c) 2D plot of 24-h traffic noise (Barber et al. 2011). Road noise may form a barrier to wildlife migration. Reprinted by permission from Springer Nature. Barber, J. R., Burdett, C. L., Reed, S. E., Warner, K. A., Formichella, C., Crooks, K. R., Theobald, D. M., and Fristrup, K. M. Anthropogenic noise exposure in protected natural areas: estimating the scale of ecological consequences. Landscape Ecology, 26(9), 1281; https://link.springer.com/article/10.1007/s10980-011-9646-7. # Springer Nature, 2011. All rights reserved

habitat, although many species cross roadways or effect). Cats increased the amplitude of calls in
fly through underpasses (Kerth and Melber 2009). noise (Nonaka et al. 1997). Common marmosets
Animals may adapt temporal behavioral (Callithrix jacchus) and cotton-top tamarins
patterns around noise exposure. Black-tufted (Saguinus oedipus) increased both amplitude
marmosets (Callithrix penicillata) living in an and duration of calls in noise (Brumm et al.
urban park in Brazil stayed in quieter, central 2004; Roian Egnor and Hauser 2006). Cotton-
(i.e., away from road noise) areas during the top tamarins timed their calls to avoid overlap
day, and only utilized the park edges at night or with periodic noise (Egnor et al. 2007). Horse-
weekends (Duarte et al. 2011). Forest elephants shoe bats (Rhinolophidae) increased echolocation
(Loxodonta cyclotis) became more nocturnal in amplitudes and shifted echolocation frequency in
areas of industrial activity; and while the study noise (Hage et al. 2013).
found no direct link to noise intensity, concern
about natural biorhythms near noisy industrial
sites was raised (Wrege et al. 2010). 13.10.3 Physiological Responses
Noise may affect foraging behavior. Wood- to Noise
land caribou stopped feeding when exposed to
noise from petroleum exploration (Bradshaw Human studies have shown that noise exposure
et al. 1997). Reduced food intake in noise slowed can lead to a variety of health effects ranging from
growth in rats, pigs, and dogs (Alario et al. 1987; a feeling of annoyance to disturbed sleep, emo-
Gue et al. 1987; Otten et al. 2004). Gleaning bats tional stress, decreased job performance, higher
(Myotis myotis) displayed reduced hunting effi- chance of developing cardiovascular disease, and
ciency during road noise playbacks (Schaub et al. decreased learning in schoolchildren (Basner
2008; Siemers and Schaub 2011). Similarly, et al. 2014). We can only begin to understand
Brazilian free-tailed bats (Tadarida brasiliensis) the effects of noise on the health of other mam-
were less active and produced fewer echolocation malian species.
bursts near a noisy gas compression station Studies on elk (Cervus canadensis) and
(Bunkley et al. 2015). Peromyscus mice, on the wolves (Canis lupus) in Yellowstone National
other hand, were more successful collecting pine Park, USA, had elevated levels of glucocorticoid
seeds (a major food source) near noisy enzymes (a blood hormone that indicates stress)
gas-extraction sites because competing, seed- when snowmobiles were allowed in the park.
collecting jays (Aphelocoma californica) aban- After banning snowmobiling, enzyme levels
doned the site (Francis et al. 2012). Additionally, returned to normal, although a direct link to
predators of the mice, like owls, avoided the noise exposure was not made (Creel et al. 2002).
noisier sites, which may result in reduced preda- After ongoing zoo visitor noise, giant pandas
tion of the mice (Mason et al. 2016). Finally, (Ailuropoda melanoleuca) exhibited increased
some animals may associate noise with reinforce- glucocorticoids, negatively impacting reproduc-
ment, such as food sources, and learn to approach tion efforts (Owen et al. 2004). In male rats
sounds. Badgers (Meles meles) quickly learned to exposed to chronic noise, testosterone decreased
approach an acoustic deterrent device baited with (Ruffoli et al. 2006). Pregnant mice exposed to
food (dinner bell effect; Ward et al. 2008). 85–95 dB re 20 μPa alarm bells had pups with
One pathway by which noise disrupts animal lower serum IgG levels, indicating impaired
behavior is by acoustic masking. Piglets use immune responses (Sobrian et al. 1997). Chronic
vocalization bouts to coordinate nursing with noise exposure in rats affected calcium regulation
sows and noise disrupted this communication leading to detrimental changes at cellular level
leading to reduced milk ingestion and increased (Gesi et al. 2002). Desert mule deer (Odocoileus
energetic costs for the piglets attempting to elicit hemionus crooki) and mountain sheep had
milk (Algers and Jensen 1985). Some animals can increased heart rates relative to increased levels
adjust their calls to reduce masking (Lombard of aircraft noise playback. Heart rate returned to

normal within 60–180 s and responses decreased over time, potentially indicating a form of habituation (Weisenberger et al. 1996).

13.10.4 Effects of Noise on the Auditory System

The physiological impact of noise is well documented in several mammalian species, particularly laboratory animals, due to the ability to systematically expose and test individuals. Systematic research has shown that several sound features (such as sound frequency, duration, intensity, amplitude rise time, continuous versus temporary exposure, etc.) impact how an animal's auditory system is affected by noise exposure. For example, chinchillas experienced TTS from exposure to the sound of a hammer hitting a nail repeatedly (Dunn et al. 1991). While some of the chinchillas were exposed to repeated hammering (a series of separate sound events), others were exposed to continuous noise of the same spectrum as nail hammering (one single sound event). While all chinchillas showed a decrease in hearing sensitivity, the chinchillas exposed to the repeated hammering had more hearing loss (Dunn et al. 1991).

NIHL can occur from mechanical damage and/or from metabolic disruption of acoustic structures (Hu 2012). Mechanical damage occurs during the sound exposure due to excessive movement caused by sound waves. Depending on the level of the sound, loud noise can damage structures at the cellular level. Metabolic damage occurs due to a cascade of changes at the cellular level from mechanical damage and can continue for weeks after sound exposure.

In TTS, damage may occur to the synapses and stereocilia, while in PTS, damage is more extensive, including outer hair cell death and fibrocyte loss. For example, the audiograms of four species of Old-World monkeys (Macaca nemestrina, M. mulatta, M. fascicularis, and Papio papio) were compared before and after exposure to octave-band noise (between 0.5 and 8 kHz at levels of 120 dB re 20 μPa) for 8 h daily for 20 days. This exposure produced loss of both inner and outer hair cells at the basal end of the organ of Corti and hence PTS (Hawkins et al. 1976). The noise exposure at which an individual transitions from temporary to permanent damage varies by species and depends on several individual factors, such as past sound exposure, age, and genetics (Hu 2012).

Exposure to continuous, high-level (>100 dB re 20 μPa) sounds has been shown to damage or destroy hair cells in multiple species, such as rats, rabbits, and guinea pigs (Borg et al. 1995; Chen and Fechter 2003; Hu et al. 2000). Recently, exposure to lower-amplitude sounds over long periods of time has also been shown to cause permanent damage. Mice exposed to 70 dB re 20 μPa continuous white noise for 8 h a day over the course of up to 3 months showed increased hearing thresholds and decreased auditory response amplitudes (Feng et al. 2020). Notably, these relatively young mice (8 weeks old at the start of exposure) also showed aggravated age-related hearing loss (Feng et al. 2020).

Some animals can mitigate the impact of noise on the auditory system using a stapedial reflex to close the auditory meatus. When exposed to a loud sound, the contraction of the stapedial muscle causes a decrease in auditory sensitivity by closing the auditory meatus, thus negating some potential damage. This reflex is well documented in humans and appears to primarily play a role in sudden, unexpected sounds with sharp rise times. The reflex is thought to function similarly in most terrestrial mammals, for example in rabbits. Rabbits exposed to sound under normal conditions showed very little threshold shift, but when their stapedial reflex was inactivated (by blocking the nerve) during noise exposure, PTS was observed at levels that would otherwise not induce NIHL (Borg et al. 1983). In cats, this reflex functions even under anesthesia (McCue and Guinan 1994). However, damage to the auditory nerve connections (synaptopathy) can also damage auditory reflexes; for example, in mice, synaptopathy was directly correlated to the function of the middle ear muscle reflex (Valero et al. 2018).
Synaptopathy not only occurs from noise exposure, but also at old age or from exposure to ototoxins (Valero et al. 2018).

13.11 Noise Effects on Marine Mammals

As with terrestrial animals, the potential effects of noise exposure on marine mammals may include a range of physical effects on auditory and other systems, as well as behavioral responses, and interference with sound communication systems (Erbe et al. 2018; Southall 2018). Several reviews have recently been completed, for specific noise sources (such as shipping, Erbe et al. 2019b; dredging, Todd et al. 2015; and wind farms, Madsen et al. 2006), and specific geographic regions (such as Antarctica; Erbe et al. 2019a). Current knowledge is summarized here, ranging from issues that are likely most experienced, but less severe, to effects that may more rarely occur but are increasingly severe. Events of the latter category, such as mass strandings and mortalities of marine mammals associated with strong acute anthropogenic sounds (notably certain military active sonar systems or explosives), have historically driven and dominated the awareness, interest, and research on the potential effects of noise on marine mammals (e.g., Filadelfo et al. 2009). However, there is increasing concern over sub-lethal, yet potentially more widespread, effects (notably behavioral influences) of more chronic noise sources and their consequences for individual fitness and ultimately population parameters (e.g., New et al. 2014; Ocean Studies Board 2016). Southall et al. (2007) reviewed the available literature at that time and made specific recommendations regarding effects of anthropogenic noise on hearing and behavior in marine mammals. Substantial additional research and synthesis of available data has expanded on their assessment, improving the empirical basis for these evaluations and expanding consideration to other important areas discussed here (e.g., masking and auditory impact thresholds; Erbe et al. 2016a; Finneran 2015). And so the Southall et al. (2007) criteria were updated in 2019 (Southall et al. 2019b).

13.11.1 Marine Mammal Hearing

In most situations of noise exposure, marine mammals might merely detect a sound without a specific adverse effect. Furthermore, animals arguably have to be able to detect signals in order for most of the effects described here to potentially occur. Hearing capabilities and specializations vary widely in marine mammals. Some species, such as pinnipeds, have adaptations to facilitate both aerial and underwater hearing (Reichmuth et al. 2013). Other species, including the odontocete cetaceans, have very wide frequency ranges of underwater hearing extending well into ultrasonic ranges to facilitate echolocation (Mooney et al. 2012). For other key species, including many of the endangered mysticete cetaceans, virtually no direct data are available regarding hearing, which is instead estimated from anatomical and sound production parameters.

Southall et al. (2007) developed the concept of functional marine mammal hearing groups. Each group was assigned a frequency-specific auditory filter (called weighting function) to account for known and presumed differences in hearing sensitivity within marine mammals (Fig. 13.16). Using additional direct data, these hearing groups and weighting functions were substantially improved and modified (Finneran 2016). These weighting functions are applied to the noise spectrum in order to estimate the likelihood of NIHL, by comparison to published TTS and PTS onset thresholds expressed as weighted cumulative sound exposure levels (National Marine Fisheries Service 2018).

Understanding and directly accounting for the frequency-specific parameters of noise and how they interact with background noise and marine mammal-specific hearing is important in considering the contextual aspects of potential behavioral responses (Ellison et al. 2012), auditory masking (Erbe et al. 2016a), and hearing impairment and damage (e.g., Finneran 2015).
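The procedure outlined above (weight the noise spectrum, accumulate energy over the exposure, and compare the result to a published onset threshold) can be illustrated with a small numerical sketch. The following Python snippet is a hypothetical example only: the band levels, weighting values, exposure duration, and TTS-onset threshold are placeholders and are not the published values of Southall et al. (2019b), Finneran (2016), or National Marine Fisheries Service (2018).

```python
import numpy as np

# Hypothetical one-third-octave band centre frequencies (kHz), received noise
# band levels (dB re 1 uPa), and a generic auditory weighting curve (dB).
# All values are placeholders, not the published weighting functions.
freqs_kHz   = np.array([1, 2, 4, 8, 16, 32])
band_spl_dB = np.array([120.0, 118.0, 115.0, 110.0, 105.0, 100.0])
weight_dB   = np.array([-20.0, -10.0, -3.0, 0.0, -1.0, -6.0])

exposure_s = 3600.0               # 1 h of continuous exposure
assumed_tts_onset_sel = 178.0     # placeholder threshold, dB re 1 uPa^2 s

# Apply the weighting in each band, then power-sum the bands to a broadband level.
weighted_band_dB = band_spl_dB + weight_dB
weighted_spl = 10 * np.log10(np.sum(10 ** (weighted_band_dB / 10)))

# Cumulative (weighted) sound exposure level for a constant-level exposure.
weighted_sel = weighted_spl + 10 * np.log10(exposure_s)

print(f"Weighted broadband SPL: {weighted_spl:.1f} dB re 1 uPa")
print(f"Weighted cumulative SEL: {weighted_sel:.1f} dB re 1 uPa^2 s")
print("Exceeds assumed TTS-onset threshold" if weighted_sel > assumed_tts_onset_sel
      else "Below assumed TTS-onset threshold")
```

In regulatory practice, the weighting functions are continuous curves with hearing-group-specific parameters, impulsive and non-impulsive sources carry different thresholds, and intermittent exposures require summing energy only over the periods of actual exposure.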
Fig. 13.16 Auditory weighting functions for marine mammal functional hearing groups; LF: low-frequency cetaceans, HF: high-frequency cetaceans, VHF: very-high-frequency cetaceans, PCW: phocid carnivores in water, OCW: other carnivores in water, PCA: phocid carnivores in air, OCA: other carnivores in air (Southall et al. 2019b)

13.11.2 Behavioral Responses to Noise

Noise exposure may lead to a variety of behavioral responses (and severity) in marine mammals, ranging from minor changes in orientation to separation of mothers and dependent offspring, or mass mortality. Southall et al. (2007) reviewed these responses and proposed a qualitative relative severity scaling that takes into account the relative duration and potential impacts on biologically meaningful activities. This approach has been applied and modified in quantifying behavioral responses in the context of exposure-response risk functions (e.g., Miller et al. 2012; Southall et al. 2019a). While sound exposure level is an important aspect of determining the relative probability of a response, other contextual factors of exposure also may be critically important, including animal behavioral state (e.g., Goldbogen et al. 2013), spatial proximity to the noise (e.g., Ellison et al. 2012), sensitization to noise exposure (Kastelein et al. 2011), or nearby vessel noise (Dunlop et al. 2020). A variety of experimental and observational methods have been applied in evaluating noise exposure and behavioral responses, resulting in a large volume of scientific literature on this subject that is reviewed generally here.

Behavioral responses to noise have been studied in both field and laboratory. The advantage of field studies is the observation of animals in their natural environment, but it can be challenging to observe individuals and determine exposure levels and responses with sufficient resolution and sample size. Field studies of large sample size include observations of changes in whale distribution in response to industrial noise and seismic surveys (see Richardson et al. 1995 for an overview), recordings of vocal behavior of whales exposed to military sonar (Fristrup et al. 2003; Miller et al. 2000), and a recent series of experiments exposing migrating humpback whales to 20, 440, and 3300 in³ seismic airgun arrays (Dunlop et al. 2016, 2017a, 2020). Many recent experimental field studies have considered potential effects of active sonar on cetaceans (Southall et al. 2016). Among the many broad results and conclusions are dose-response curves for exposure level and response probability in killer whales (Miller et al. 2014) and humpback whales (Dunlop et al. 2017b, 2018), behavioral state-dependent responses in blue whales (Balaenoptera musculus; Goldbogen et al. 2013) and humpback whales (Dunlop et al. 2017a, 2020), and changes in social behavior following noise exposure in pilot whales (Globicephala sp.; Visser et al. 2016) and humpback whales (Dunlop et al. 2020). For instance, Goldbogen et al. (2013) showed that deep-feeding blue whales are much more likely to change diving behavior and body orientation in response to noise than those in shallow-feeding or non-feeding states
(Fig. 13.17). This finding has been replicated and expanded with individual blue whales, demonstrating the same context-dependency in response probability as well as potential dependence in response probability based on horizontal range from the sound source even for the same received levels (Southall et al. 2019a).

Fig. 13.17 Relative response differences in various aspects of blue whale behavior between non-feeding, surface-feeding, and deep-feeding individuals (adapted from Goldbogen et al. 2013). Response magnitude was quantified using generalized additive mixed models for behavioral parameters relevant to each behavioral state and potential responses in terms of diving, orientation, and displacement

Some species such as long-finned pilot whales appear behaviorally tolerant of noise exposure (e.g., Antunes et al. 2014), whereas beaked whales (Family Ziphiidae) are clearly among the more sensitive species behaviorally (DeRuiter et al. 2013; Miller et al. 2015; Stimpert et al. 2014; Tyack et al. 2011). The analysis of multivariate behavioral data to determine changes in behavior, including potentially subtle but important changes, is statistically challenging, although recent substantial progress in analytical methods has been made as well (Harris et al. 2016).

Experimental laboratory approaches have the advantage of greater control and precision on multivariate aspects of exposure and response, but lack the contextual reality in which free-ranging animals experience noise. Studies that evaluated noise exposure and response probability in captive harbor porpoises (e.g., Kastelein et al. 2011, 2013) demonstrated a particular sensitivity of this species, which matched field observations. Studies with captive bottlenose dolphins (Tursiops truncatus) and California sea lions (Zalophus californianus) have included large sample sizes and repeated exposures to demonstrate species, age, and experiential differences in response probability to military sonar signals (Houser et al. 2013a, b).

Observational methods (visual and acoustic) have provided complementary data to assess both acute and chronic noise exposure. Passive acoustic monitoring over large areas and time periods demonstrated changes in acoustic behavior and inferred movement of beaked whales in response to military sonar signals (e.g., McCarthy et al. 2011) resulting in dose-response curves (Moretti et al. 2014). Similarly, large-scale monitoring linked cetacean distribution and behavior to seismic surveys (e.g., Pirotta et al. 2014; Thompson et al. 2013), impact pile driving (e.g., Dähne et al. 2013; Thompson et al. 2010; Tougaard et al. 2009), and acoustic harassment devices (e.g., Johnston 2002).

Such observational studies lack experimental control, resolution to the individual level, detail on fine-scale responses, and ability to differentiate short-term responses to noise from those to other stimuli, but offer information on broad-scale spatio-temporal changes in habitat use and behavior. Ideally, experimental approaches would be combined with broad-scale observational methods to discover potential population-level effects (see Southall et al. 2016).
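The dose-response (exposure-response) curves referred to in this section express the probability of a defined behavioral response as a function of the received level at the animal. As a purely illustrative sketch, a logistic exposure-response function might look like the following; the midpoint and slope values are assumptions for illustration, not the estimates of Miller et al. (2014), Moretti et al. (2014), or Dunlop et al. (2018).

```python
import numpy as np

def response_probability(received_level_dB, rl50=150.0, slope=0.15):
    """Logistic exposure-response (dose-response) curve.

    received_level_dB: received sound level at the animal (dB re 1 uPa).
    rl50: level at which half of exposed animals are expected to respond.
    slope: steepness of the curve per dB.
    Both rl50 and slope are hypothetical; in practice they are estimated from
    observed responses, often with mixed models that also include contextual
    covariates such as behavioral state and distance to the source.
    """
    return 1.0 / (1.0 + np.exp(-slope * (received_level_dB - rl50)))

levels = np.arange(110, 181, 10)
for rl, p in zip(levels, response_probability(levels)):
    print(f"{int(rl):3d} dB -> response probability {p:.2f}")
```

The context dependence discussed above (feeding state, range to the source, prior exposure) is why a single curve of this kind is increasingly treated as a starting point rather than a complete description of behavioral risk.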
13.11.3 Communication Masking

Noise can interfere with or “mask” acoustic communication by marine mammals (Erbe et al. 2016a). Masking is due to the simultaneous presence of signal and noise energy within the same frequency bands. Masking reduces the range over which a signal may be detected. In other words, the signal must be louder for it to be detected in the presence of noise (Fig. 13.18). The area over which an animal call can be detected by its intended recipients (i.e., the active space or communication space) fluctuates in space and time. Models have been developed to quantify lost communication space and applied to mysticetes communicating near busy shipping lanes (Fig. 13.19; Clark et al. 2009; Hatch et al. 2012).

The Lombard effect has been demonstrated in marine mammals as an increase in vocalization source levels (e.g., Helble et al. 2020; Holt et al. 2009; Thode et al. 2020), duration (Miller et al. 2000), or repetition (Thode et al. 2020). Additionally, marine mammals have demonstrated increased detection capabilities based on angular separation between signal and noise sources, termed a spatial release from masking (e.g., Turnbull 1994), or based on wide-band amplitude-modulation patterns in the noise, termed a comodulation masking release (e.g., Branstetter et al. 2013). These compensatory and signal processing capabilities reduce the masking potential of noise.

Fig. 13.18 Beluga whale (Delphinapterus leucas) audiogram (shaded green), spectrum of a call at detection threshold (measured behaviorally) in the absence of noise, spectrum of an icebreaker’s bubbler noise, and the masked call spectrum in the presence of bubbler noise. The spectra are shown as band levels, with the bandwidths aiming to represent the auditory filters. The upwards shift of the call spectrum equals the amount of masking: 37 dB (Erbe 2000)

Fig. 13.19 Chart of acoustic footprints of North Atlantic right whales (Eubalaena glacialis; light blue dots) and ships (larger footprints with red centers) off Cape Cod, Massachusetts Bay, USA. The larger and stronger ship noise footprints can easily engulf (i.e., mask) the right whale calls. Stellwagen Bank National Marine Sanctuary outlined in yellow. Figure courtesy of Chris Clark
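The “amount of masking” quoted in the caption of Fig. 13.18 can be approximated with a simple band-level argument: when detection is noise-limited, the masked detection threshold in an auditory filter band tracks the noise level received in that band (offset by a species- and signal-dependent detection term related to the critical ratio), and the amount of masking is the difference between the masked and the unmasked (quiet) thresholds. The short sketch below is hypothetical; the band levels and the detection offset are placeholders, not the beluga data of Erbe (2000).

```python
import numpy as np

# Hypothetical band levels (dB) in the auditory filter bands carrying most of
# the call energy. Placeholders only, not the measured values behind Fig. 13.18.
quiet_threshold  = np.array([62.0, 60.0, 64.0])   # call band levels at threshold in quiet
noise_band_level = np.array([95.0, 97.0, 96.0])   # masker band levels (e.g., icebreaker noise)
detection_offset = -2.0                           # assumed offset related to the critical ratio

# Masked threshold per band: noise-limited unless the quiet threshold is higher.
masked_threshold = np.maximum(quiet_threshold, noise_band_level + detection_offset)

# Amount of masking = upward shift of the call spectrum required for detection.
amount_of_masking = masked_threshold - quiet_threshold
print("Amount of masking per band (dB):", amount_of_masking)
print(f"Largest shift across bands: {amount_of_masking.max():.1f} dB")
```

Coupled with an assumed call source level and transmission-loss model, the same band-level comparison yields the reduction in detection range, which is the basis of the communication-space models cited above (Clark et al. 2009; Hatch et al. 2012).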
13.11.4 Effects of Noise on the Auditory and Other Systems

While behavioral responses and auditory masking may occur relatively far from sound sources, impacts to the auditory system are expected at higher levels and hence shorter ranges. As with masking, the frequency of noise exposure is important in terms of the potential for NIHL, and noise at frequencies where animals are more sensitive has a greater potential for inducing such effects in marine mammals (Finneran 2015). Furthermore, the temporal pattern of noise matters substantially in terms of the potential for NIHL. Impulsive signals with rapid rise times are more likely to cause NIHL (see Finneran 2015). The risk and severity of NIHL increases with repeated and longer exposures, but simple energy-based models integrating exposure level over time cannot fully predict potential NIHL.

Despite substantial recent research, our understanding of NIHL in marine mammals remains limited. TTS has been studied in fewer than ten species, and not in any mysticete. Controlled exposure experiments that would produce a PTS are infeasible due to animal ethics considerations. Nonetheless, TTS studies in odontocetes and pinnipeds produced TTS-onset levels and information on frequency-dependence (reviewed by Finneran 2015). Recent experiments produced frequency-weighted TTS-onset levels higher than the original exposure criteria compiled by Southall et al. (2007). However, some studies (e.g., Kastelein et al. 2012; Lucke et al. 2009) demonstrated much lower TTS-onset levels, specifically in harbor porpoises.

Noise may further cause non-auditory physiological impacts that may not be immediately apparent. Noise has increased stress hormones in the blood of captive marine mammals (e.g., Romano et al. 2004). In the wild, stress hormones in right whales decreased when ambient noise from shipping was lower (Rolland et al. 2012). Such measurements of noise-induced stress in marine mammals are comparable to studies with other vertebrates (Romero and Butler 2007). However, information is lacking on how stress scales with noise exposure and on the long-term health impacts of prolonged stress.

Finally, beaked whales that stranded after exposure to military sonar exhibited lesions and gas or fat emboli (Fernandez et al. 2005; Jepson et al. 2003). While some form of decompression sickness has been hypothesized, the physiological mechanisms for such emboli to occur are poorly understood. These physiological effects may have been secondarily caused or exacerbated by the animals’ behavioral responses to sonar.

13.12 Summary

This chapter presented examples of the variety of effects noise can have on animals in terrestrial and aquatic habitats. Studies on hearing in noise and on behavioral and physiological responses to noise have concentrated on fish, frogs, birds, terrestrial mammals, and marine mammals. Clearly, more research is needed for invertebrates, reptiles, and all groups of freshwater species. In addition, more studies on the metabolic costs of these responses are needed.
Animals demonstrate a hierarchy of behavioral and physiological responses to noise. Behavioral reactions to anthropogenic noise include a startle response, change in movement and direction, freezing in place, cessation of vocal behavior, and change in behavioral budgets. Animals can also modify their signals to counteract the effects of noise and improve communication. Such modifications include changes in amplitude, duration, and frequency. Some animals also increase the redundancy of their signals by repeating them more often. Physiological reactions to anthropogenic noise are indicated by increased cortisol levels (indication of stress), temporary or permanent hearing loss, and physical damage to tissues and organs such as lungs and swim bladders.

The effects of anthropogenic noise on individual animals can escalate to the population level. Ultimately, species-richness and biodiversity could be affected. However, methods and models to address these topics are in their infancy.

There is the potential to mitigate any negative impacts of anthropogenic noise by modifying the noise source characteristics and operation schedules, finding alternative means to obtain operational goals of the noise source, and protecting critical habitats. Effective management of habitats should include noise assessment. Further research is needed to understand the ecological consequences of chronic noise in terrestrial and aquatic environments.

Remote wilderness areas are not immune to the effects of anthropogenic noise, because sound travels very well (with little loss over long ranges) in many terrestrial and aquatic habitats. Resource managers should continue to be vigilant in monitoring and mitigating the effects of anthropogenic noise on animals.

References

Alario P, Gamallo A, Beato MJ, Trancho G (1987) Body weight gain, food intake and adrenal development in chronic noise stressed rats. Physiol Behav 40(1):29–32. https://doi.org/10.1016/0031-9384(87)90181-8
Algers B, Jensen P (1985) Communication during suckling in the domestic pig. Effects of continuous noise. Appl Anim Behav Sci 14(1):49–61. https://doi.org/10.1016/0168-1591(85)90037-1
American National Standards Institute (2013) Acoustical Terminology (ANSI/ASA S1.1-2013). Acoustical Society of America, Melville, NY
Amézquita A, Hödl W (2004) How, when, and where to perform visual displays: The case of the Amazonian frog Hyla parviceps. Herpetologica 60(4):420–429. https://doi.org/10.1655/02-51
André M, Solé M, Lenoir M, Durfort M, Quero C, Mas A, Lombarte A, Mvd S, López-Bejar M, Morell M, Zaugg S, Houégnigan L (2011) Low-frequency sounds induce acoustic trauma in cephalopods. Front Ecol Environ 9(9):489–493. https://doi.org/10.1890/100124
Andriguetto-Filho JM, Ostrensky A, Pie MR, Silva UA, Boeger WA (2005) Evaluating the impact of seismic prospecting on artisanal shrimp fisheries. Cont Shelf Res 25(14):1720–1727. https://doi.org/10.1016/j.csr.2005.05.003
Antunes R, Kvadsheim PH, Lam FPA, Tyack PL, Thomas L, Wensveen PJ, Miller PJO (2014) High thresholds for avoidance of sonar by free-ranging long-finned pilot whales (Globicephala melas). Mar Pollut Bull 83(1):165–180. https://doi.org/10.1016/j.marpolbul.2014.03.056
Appeltants D, Gentner TQ, Hulse SH, Balthazart J, Ball GF (2005) The effect of auditory distractors on song discrimination in male canaries (Serinus canaria). Behav Process 69(3):331–341. https://doi.org/10.1016/j.beproc.2005.01.010
Arkhipkin AI, Bizikov VA (2000) Role of the statolith in functioning of the acceleration receptor system in squids and sepioids. J Zool 250(1):31–55
Au WWL, Floyd RW, Penner RH, Murchison AE (1974) Measurement of echolocation signals of the Atlantic bottlenose dolphin, Tursiops truncatus Montagu, in open waters. J Acoust Soc Am 56(4):1280–1290
Au WWL, Carder DA, Penner RH, Scronce BL (1985) Demonstration of adaptation in beluga whale echolocation signals. J Acoust Soc Am 77(2):726–730. https://doi.org/10.1121/1.392341
Aubin T, Jouventin P (1998) Cocktail–party effect in king penguin colonies. Proc R Soc Lond Ser B Biol Sci 265(1406):1665
Bain D, Dahlheim M (1994) Effects of masking noise on detection thresholds of killer whales. In: Loughlin T (ed) Marine mammals and the Exxon Valdez. Academic Press, San Diego, CA, pp 243–256
Barber JR, Crooks KR, Fristrup KM (2010) The costs of chronic noise exposure for terrestrial organisms. Trends Ecol Evol 25(3):180–189. https://doi.org/10.1016/j.tree.2009.08.002
Barber JR, Burdett CL, Reed SE, Warner KA, Formichella C, Crooks KR, Theobald DM, Fristrup KM (2011) Anthropogenic noise exposure in protected natural areas: estimating the scale of ecological consequences. Landsc Ecol 26(9):1281. https://doi.org/10.1007/s10980-011-9646-7
Barnett KE, Cocroft RB, Fleishman LJ (1999) Possible rabbits. Morphological and electrophysiological
communication by substrate vibration in a chameleon. features, exposure parameters and temporal factors,
Copeia 1:225–228. https://doi.org/10.2307/1447408 variability and interactions. Scand Audiol Suppl 40:
Basner M, Babisch W, Davis A, Brink M, Clark C, 1–147
Janssen S, Stansfeld S (2014) Auditory and Bowles AE, Eckert S, Starke L, Berg E, Wolski L (1999)
non-auditory effects of noise on health. Lancet Effects of flight noise from jet aircraft and sonic booms
383(9925):1325–1332. https://doi.org/10.1016/ on hearing, behavior, heart rate and oxygen consump-
S0140-6736(13)61613-X tion of desert tortoises (Gopherus agassizii). Hubbs-
Battershill C, Cappo M, Colquhoun J, Cripps E, Sea World Research Institution, San Diego, CA
Jorgensen D, McCorry D, Stowar M, Venables W Bradbury JW, Vehrencamp SL (2011) Principles of animal
(2008) Coral damage monitoring using Towed Video communication, 2nd edn. Sinauer Associates,
(TVA) and Photo Quadrat Assessments (PQA). Sunderland, MA
Australian Institute of Marine Science, Bradshaw CJA, Boutin S, Hebert DM (1997) Effects of
Townsville, QLD petroleum exploration on Woodland Caribou in North-
Bee MA (2007) Sound source segregation in grey eastern Alberta. J Wildl Manag 61(4):1127–1133.
treefrogs: spatial release from masking by the sound https://doi.org/10.2307/3802110
of a chorus. Anim Behav 74(3):549–558. https://doi. Branstetter BK, Trickey JS, Bakhtiari K, Black A,
org/10.1016/j.anbehav.2006.12.012 Aihara H, Finneran JJ (2013) Auditory masking
Bee MA, Swanson EM (2007) Auditory masking of patterns in bottlenose dolphins (Tursiops truncatus)
anuran advertisement calls by road traffic noise. with natural, anthropogenic, and synthesized noise. J
Anim Behav 74(6):1765–1776. https://doi.org/10. Acoust Soc Am 133(3):1811–1818. https://doi.org/10.
1016/j.anbehav.2007.03.019 1121/1.4789939
Bergen F, Abs M (1997) Etho-ecological study of the Brittan-Powell EF, Christensen-Dalsgaard J, Tang Y,
singing activity of the Blue Tit (Parus caeruleus), Carr C, Dooling RJ (2010) The auditory brainstem
Great Tit (Parus major) and Chaffinch (Fringilla response in two lizard species. J Acoust Soc Am
coelebs). J Ornithol 138(4):451–467. https://doi.org/ 128(2):787–794. https://doi.org/10.1121/1.3458813
10.1007/bf01651380 Brumm H (ed) (2013) Animal communication and
Blackwell SB, Nations CS, TL MD, Thode AM, noise. Animal signals and communication, vol 2.
Mathias D, Kim KH, Green CR, Macrander AM Springer, Berlin. https://doi.org/10.1007/978-3-642-
(2015) Effects of airgun sounds on bowhead whale 41494-7
calling rates: evidence for two behavioral thresholds. Brumm H, Slabbekoorn H (2005) Acoustic communica-
PLoS One 10(6):e0125720. https://doi.org/10.1371/ tion in noise. In: Advances in the study of behavior, vol
journal.pone.0125720 35. Academic Press, New York, pp 151–209. https://
Blickley JL, Patricelli GL (2012) Potential acoustic doi.org/10.1016/S0065-3454(05)35004-2
masking of greater sage-grouse (Centrocercus Brumm H, Todt D (2002) Noise-dependent song ampli-
urophasianus) display components by chronic indus- tude regulation in a territorial songbird. Anim Behav
trial noise. Ornithol Monogr 74:23–35. https://doi.org/ 63(5):891–897. https://doi.org/10.1006/anbe.2001.
10.1525/om.2012.74.1.23 1968
Blickley JL, Word KR, Krakauer AH, Phillips JL, Sells Brumm H, Voss K, Köllmer I, Todt D (2004) Acoustic
SN, Taff CC, Wingfield JC, Patricelli GL (2012) communication in noise: regulation of call
Experimental chronic noise is related to elevated fecal characteristics in a New World monkey. J Exp Biol
corticosteroid metabolites in lekking male greater 207(3):443. https://doi.org/10.1242/jeb.00768
Sage-Grouse (Centrocercus urophasianus). PLoS Brumm H, Schmidt R, Schrader L (2009) Noise-
One 7(11):e50462. https://doi.org/10.1371/journal. dependent vocal plasticity in domestic fowl. Anim
pone.0050462 Behav 78(3):741–746. https://doi.org/10.1016/j.
Bohne T, Grießmann T, Rolfes R (2019) Modeling the anbehav.2009.07.004
noise mitigation of a bubble curtain. J Acoust Soc Am Bunkley JP, McClure CJW, Kleist NJ, Francis CD, Barber
146(4):2212–2223. https://doi.org/10.1121/1.5126698 JR (2015) Anthropogenic noise alters bat activity
Bolm-Audorff U, Hegewald J, Pretzsch A, Freiberg A, levels and echolocation calls. Global Ecol Conserv 3:
Nienhaus A, Seidler A (2020) Occupational noise and 62–71. https://doi.org/10.1016/j.gecco.2014.11.002
hypertension risk: a systematic review and meta- Chan AAY-H, Giraldo-Perez P, Smith S, Blumstein DT
analysis. Int J Environ Res Public Health 17(17): (2010) Anthropogenic noise affects risk assessment
6281. https://doi.org/10.3390/ijerph17176281 and attention: the distracted prey hypothesis. Biol
Borg E, Nilsson R, Engström B (1983) Effect of the Lett 6(4):458–461. https://doi.org/10.1098/rsbl.2009.
acoustic reflex on inner ear damage induced by indus- 1081
trial noise. Acta Otolaryngol 96(5-6):361–369. https:// Chen G-D, Fechter LD (2003) The relationship between
doi.org/10.3109/00016488309132721 noise-induced hearing loss and hair cell loss in rats.
Borg E, Canlon B, Engström B (1995) Noise-induced Hear Res 177(1):81–90. https://doi.org/10.1016/
hearing loss. Literature review and experiments in S0378-5955(02)00802-X
Christensen CB, Christensen-Dalsgaard J, Brandt C, Davidsen JG, Dong H, Linné M, Andersson MH, Piper A,
Madsen PT (2012) Hearing with an atympanic ear: Prystay TS, Hvam EB, Thorstad EB, Whoriskey F,
good vibration and poor sound-pressure detection in Cooke SJ, Sjursen AD, Rønning L, Netland TC,
the royal python, Python regius. J Exp Biol 215(2): Hawkins AD (2019) Effects of sound exposure from
331. https://doi.org/10.1242/jeb.062539 a seismic airgun on heart rate, acceleration and depth
Clark CW, Ellison WT, Southall BL, Hatch L, Van Parijs use in free-swimming Atlantic cod and saithe. Conserv
SM, Frankel A, Ponirakis D (2009) Acoustic masking Physiol 7(1). https://doi.org/10.1093/conphys/coz020
in marine ecosystems: intuitions, analysis, and impli- Davis AK, Schroeder H, Yeager I, Pearce J (2018) Effects
cation. Mar Ecol Prog Ser 395:201–222. https://doi. of simulated highway noise on heart rates of larval
org/10.3354/Meps08402 monarch butterflies, Danaus plexippus: implications
Cocroft RB, Rodríguez RL (2005) The behavioral ecology for roadside habitat suitability. Biol Lett 14(5):
of insect vibrational communication. Bioscience 55(4): 20180018. https://doi.org/10.1098/rsbl.2018.0018
323–334. https://doi.org/10.1641/0006-3568(2005) Day RD, McCauley RD, Fitzgibbon QP, Hartmann K,
055[0323:TBEOIV]2.0.CO;2 Semmens JM (2016a) Assessing the impact of marine
Cook SL, Forrest TG (2005) Sounds produced by nesting seismic surveys on southeast Australian scallop and
leatherback sea turtles (Dermochelys coriacea). lobster fisheries. Fisheries Research & Development
Herpetol Rev 36(4):387–390 Corporation
Coram A, Gordon J, Thompson D, Northridge SP (2014) Day RD, McCauley RD, Fitzgibbon QP, Semmens JM
Evaluating and Assessing the relative effectiveness of (2016b) Seismic air gun exposure during early-stage
acoustic deterrent devices and other non-lethal embryonic development does not negatively affect
measures on marine mammals. University of St spiny lobster Jasus edwardsii larvae (Decapoda:
Andrews, Sea Mammal Research Unit, St Andrews, Palinuridae). Sci Rep 6:22723. https://doi.org/10.
Scotland 1038/srep22723
Costa DP, Schwarz L, Robinson P, Schick RS, Morris PA, Day RD, McCauley RD, Fitzgibbon QP, Hartmann K,
Condit R, Crocker DE, Kilpatrick AM (2016) A bioen- Semmens JM (2017) Exposure to seismic air gun
ergetics approach to understanding the population signals causes physiological harm and alters behavior
consequences of disturbance: elephant seals as a in the scallop Pecten fumatus. Proc Natl Acad Sci
model system. In: Popper AN, Hawkins T (eds) The 114(40):E8537. https://doi.org/10.1073/pnas.
effects of noise on aquatic life II. Springer, New York, 1700564114
pp 161–169. https://doi.org/10.1007/978-1-4939- Day RD, McCauley RD, Fitzgibbon QP, Hartmann K,
2981-8_19 Semmens JM (2019) Seismic air guns damage rock
Costello RA, Symes LB (2014) Effects of anthropogenic lobster mechanosensory organs and impair righting
noise on male signalling behaviour and female reflex. Proc Biol Sci 286(1907):20191424. https://doi.
phonotaxis in Oecanthus tree crickets. Anim Behav org/10.1098/rspb.2019.1424
95:15–22. https://doi.org/10.1016/j.anbehav.2014. Day RD, Fitzgibbon QP, McCauley RD, Hartmann K,
05.009 Semmens JM (2020) Lobsters with pre-existing dam-
Courter JR, Perruci RJ, McGinnis KJ, Rainieri JK (2020) age to their mechanosensory statocyst organs do not
Black-capped chickadees (Poecile atricapillus) alter incur further damage from exposure to seismic air gun
alarm call duration and peak frequency in response to signals. Environ Pollut. https://doi.org/10.1016/j.
traffic noise. PLoS One 15(10):e0241035. https://doi. envpol.2020.115478
org/10.1371/journal.pone.0241035 de Magalhães Tolentino VC, Baesse CQ, Melo C (2018)
Creel S, Fox JE, Hardy A, Sands J, Garrott B, Peterson RO Dominant frequency of songs in tropical bird species is
(2002) Snowmobile activity and glucocorticoid stress higher in sites with high noise pollution. Environ Pollut
responses in wolves and elk. Conserv Biol 16(3): 235:983–992. https://doi.org/10.1016/j.envpol.2018.
809–814. https://doi.org/10.1046/j.1523-1739.2002. 01.045
00554.x de Soto NA, Delorme N, Atkins J, Howard S, Williams J,
Cunnington GM, Fahrig L (2010) Plasticity in the Johnson M (2013) Anthropogenic noise causes body
vocalizations of anurans in response to traffic noise. malformations and delays development in marine lar-
Acta Oecol 36(5):463–470. https://doi.org/10.1016/j. vae. Sci Rep 3:2831. https://doi.org/10.1038/
actao.2010.06.002 srep02831
Cynx J, Lewis R, Tavel B, Tse H (1998) Amplitude Dent ML, Larsen ON, Dooling RJ (1997) Free-field bin-
regulation of vocalizations in noise by a songbird, aural unmasking in budgerigars (Melopsittacus
Taeniopygia guttata. Anim Behav 56(1):107–113. undulatus). Behav Neurosci 111(3):590–598. https://
https://doi.org/10.1006/anbe.1998.0746 doi.org/10.1037/0735-7044.111.3.590
Dähne M, Gilles A, Lucke K, Peschko V, Adler S, Dent ML, McClaine EM, Best V, Ozmeral E, Narayan R,
Krügel K, Sundermeyer J, Siebert U (2013) Effects of Gallun FJ, Sen K, Shinn-Cunningham BG (2009) Spa-
pile-driving on harbour porpoises (Phocoena tial unmasking of birdsong in zebra finches
phocoena) at the first offshore wind farm in Germany. (Taeniopygia guttata) and budgerigars (Melopsittacus
Environ Res Lett 8(2):025002. https://doi.org/10.1088/ undulatus). J Comp Psychol 123(4):357–367. https://
1748-9326/8/2/025002 doi.org/10.1037/a0016898
Dent ML, Screven LA, Kobrina A (2018) Hearing in Dunlop RA, Noad MJ, McCauley RD, Kniest E, Slade R,
Rodents. In: Dent ML, Fay RR, Popper AN (eds) Paton D, Cato DH (2017a, 1869) The behavioural
Rodent bioacoustics. Springer, Cham, pp 71–105. response of migrating humpback whales to a full seis-
https://doi.org/10.1007/978-3-319-92495-3_4 mic airgun array. Proc Biol Sci 284. https://doi.org/10.
Department of the Navy (2008) Atlantic fleet active sonar 1098/rspb.2017.1901
training environmental impact statement. Naval Dunlop RA, Noad MJ, McCauley RD, Scott-Hayward L,
Facilities Engineering Command, Atlantic, Kniest E, Slade R, Paton D, Cato DH (2017b) Deter-
Norfolk, VA mining the behavioural dose–response relationship of
Derryberry EP, Phillips JN, Derryberry GE, Blum MJ, marine mammals to air gun noise and source proxim-
Luther D (2020) Singing in a silent spring: Birds ity. J Exp Biol 220(16):2878–2886. https://doi.org/10.
respond to a half-century soundscape reversion during 1242/jeb.160192
the COVID-19 shutdown. Science 370(6516):575. Dunlop RA, Noad MJ, McCauley RD, Kniest E, Slade R,
https://doi.org/10.1126/science.abd5777 Paton D, Cato DH (2018) A behavioural dose-response
DeRuiter SL, Larbi Doukara K (2012) Loggerhead turtles model for migrating humpback whales and seismic air
dive in response to airgun sound exposure. Endanger gun noise. Mar Pollut Bull 133:506–516. https://doi.
Species Res 16(1):55–63. https://doi.org/10.3354/ org/10.1016/j.marpolbul.2018.06.009
esr00396 Dunlop RA, McCauley RD, Noad MJ (2020) Ships and air
DeRuiter SL, Southall BL, Calambokidis J, Zimmer guns reduce social interactions in humpback whales at
WMX, Sadykova D, Falcone EA, Friedlaender AS, greater ranges than other behavioral impacts. Mar
Joseph JE, Moretti D, Schorr GS, Thomas L, Tyack Pollut Bull 154:111072. https://doi.org/10.1016/j.
PL (2013) First direct measurements of behavioural marpolbul.2020.111072
responses by Cuvier’s beaked whales to Dunn DE, Davis RR, Merry CJ, Franks JR (1991) Hearing
mid-frequency active sonar. Biol Lett 9(4). https:// loss in the chinchilla from impact and continuous noise
doi.org/10.1098/rsbl.2013.0223 exposure. J Acoust Soc Am 90(4):1979–1985. https://
des Aunay GH, Slabbekoorn H, Nagle L, Passas F, doi.org/10.1121/1.401677
Nicolas P, Draganoiu TI (2014) Urban noise Egnor SER, Wickelgren JG, Hauser MD (2007) Tracking
undermines female sexual preferences for silence: adjusting vocal production to avoid acoustic
low-frequency songs in domestic canaries. Anim interference. J Comp Physiol A 193(4):477–483.
Behav 87:67–75. https://doi.org/10.1016/j.anbehav. https://doi.org/10.1007/s00359-006-0205-7
2013.10.010 Ellison W, Southall B, Clark C, Frankel A (2012) A new
Doksaeter L, Godo OR, Handegard NO (2009) context-based approach to assess marine mammal
Behavioural responses of herring (Clupea harengus) behavioral responses to anthropogenic sounds.
to 1-2 and 6-7 kHz sonar signals and killer whale Conserv Biol 26(1):21–28. https://doi.org/10.1111/j.
feedings sounds. J Acoust Soc Am 125(1):554–564. 1523-1739.2011.01803.x
https://doi.org/10.1121/1.3021301 Engas A, Løkkeborg S (2002) Effects of seismic shooting
Dooling RJ, Leek MR (2018) Communication masking by and vessel-generated noise, on fish behaviour and
man-made noise. In: Slabbekoorn H, Dooling RJ, Pop- catch rates. Bioacoustics 12(2–3):313–316
per AN, Fay RR (eds) Effects of anthropogenic noise Engås A, Løkkeborg S, Ona E, Soldal AV (1996) Effects
on animals. Springer, New York, pp 23–46. https://doi. of seismic shooting on local abundance and catch rates
org/10.1007/978-1-4939-8574-6_2 of cod (Gadus morhua) and haddock
Dooling RJ, Lohr B, Dent ML (2000) Hearing in birds and (Melanogrammus aeglefinus). Can J Fish Aquat Sci
reptiles. In: Dooling RJ, Fay RR, Popper AN (eds) 53(10):2238–2249. https://doi.org/10.1139/f96-177
Comparative hearing: birds and reptiles. Springer, Erbe C (2000) Detection of whale calls in noise: perfor-
New York, pp 308–359. https://doi.org/10.1007/978- mance comparison between a beluga whale, human
1-4612-1182-2_7 listeners and a neural network. J Acoust Soc Am
Duarte MHL, Vecci MA, Hirsch A, Young RJ (2011) 108(1):297–303. https://doi.org/10.1121/1.429465
Noisy human neighbours affect where urban monkeys Erbe C, Reichmuth C, Cunningham KC, Lucke K,
live. Biol Lett 7(6):840–842. https://doi.org/10.1098/ Dooling RJ (2016a) Communication masking in
rsbl.2011.0529 marine mammals: a review and research strategy. Mar
Dunlop RA, Cato DH, Noad MJ (2014) Evidence of a Pollut Bull 103:15–38. https://doi.org/10.1016/j.
Lombard response in migrating humpback whales marpolbul.2015.12.007
(Megaptera novaeangliae). J Acoust Soc Am 136(1): Erbe C, Sisneros J, Thomsen F, Hawkins A, Popper A
430–437. https://doi.org/10.1121/1.4883598 (2016b) Overview of the fourth international confer-
Dunlop RA, Noad MJ, McCauley RD, Kniest E, Slade R, ence on the effects of noise on aquatic life. Proc Meet
Paton D, Cato DH (2016) Response of humpback Acoust 27(1):010006. https://doi.org/10.1121/2.
whales (Megaptera novaeangliae) to ramp-up of a 0000256
small experimental air gun array. Mar Pollut Bull Erbe C, Dunlop R, Dolman S (2018) Effects of noise on
103(1–2):72–83. https://doi.org/10.1016/j.marpolbul. marine mammals. In: Slabbekoorn H, Dooling RJ,
2015.12.044 Popper AN, Fay RR (eds) Effects of anthropogenic
noise on animals. Springer, New York, pp 277–309. Filadelfo R, Mintz J, Michlovich E, Amico AD, Tyack PL,
https://doi.org/10.1007/978-1-4939-8574-6_10 Ketten DR (2009) Correlating military sonar use with
Erbe C, Dähne M, Gordon J, Herata H, Houser DS, beaked whale mass strandings: what do the historical
Koschinski S, Leaper R, McCauley R, Miller B, data show? Aquat Mamm 35:435–444. https://doi.org/
Müller M, Murray A, Oswald JN, Scholik-Schlomer 10.1578/AM.35.4.2009.435
AR, Schuster M, van Opzeeland IC, Janik VM (2019a) Finneran JJ (2015) Noise-induced hearing loss in marine
Managing the effects of noise from ship traffic, seismic mammals: a review of temporary threshold shift stud-
surveying and construction on marine mammals in ies from 1996 to 2015. J Acoust Soc Am 138(3):
Antarctica. Front Mar Sci 6:647. https://doi.org/10. 1702–1726. https://doi.org/10.1121/1.4927418
3389/fmars.2019.00647 Finneran JJ (2016) Auditory weighting functions and
Erbe C, Marley S, Schoeman R, Smith JN, Trigg L, TTS/PTS exposure functions for marine mammals
Embling CB (2019b) The effects of ship noise on exposed to underwater noise. National Marine
marine mammals: a review. Front Mar Sci 6:606. Fisheries Service, Silver Spring, MD
https://doi.org/10.3389/fmars.2019.00606 Fitzgibbon QP, Day RD, McCauley RD, Simon CJ,
Erbe C, Sisneros J, Thomsen F, Lepper P, Hawkins A, Semmens JM (2017) The impact of seismic air gun
Popper A (2019c) Overview of the fifth international exposure on the haemolymph physiology and
conference on the effects of noise on aquatic life. Proc nutritional condition of spiny lobster, Jasus edwardsii.
Meet Acoust 37(1):001001. https://doi.org/10.1121/2. Mar Pollut Bull 125(1):146–156. https://doi.org/10.
0001052 1016/j.marpolbul.2017.08.004
Fay RR, Popper AN (2012) Fish hearing: new perspectives Foppen RPB, Deuzeman S (2007) De Grote karekiet in de
from two ‘senior’ bioacousticians. Brain Behav Evol noordelijke randmeren; een dilemma voor natuuront-
79(4):215–217. https://doi.org/10.1159/000338719 wikkelingsplannen!? De Levende Natuur 108:20–26
Feng AS, Narins PM (2008) Ultrasonic communication in Francis CD, Ortega CP, Cruz A (2009) Noise pollution
concave-eared torrent frogs (Amolops tormotus). J changes avian communities and species interactions.
Comp Physiol A 194(2):159–167. https://doi.org/10. Curr Biol 19(16):1415–1419. https://doi.org/10.1016/
1007/s00359-007-0267-1 j.cub.2009.06.052
Feng S, Yang L, Hui L, Luo Y, Du Z, Xiong W, Liu K, Francis CD, Ortega CP, Cruz A (2011) Different
Jiang X (2020) Long-term exposure to low-intensity behavioural responses to anthropogenic noise by two
environmental noise aggravates age-related hearing closely related passerine birds. Biol Lett 7(6):850–852.
loss via disruption of cochlear ribbon synapses. Am J https://doi.org/10.1098/rsbl.2011.0359
Transl Res 12(7):3674–3687 Francis CD, Kleist NJ, Ortega CP, Cruz A (2012) Noise
Fenton MB, Grinnell AD, Popper AN, Fay RR (eds) pollution alters ecological services: enhanced pollina-
(2016) Bat bioacoustics. Springer handbook of audi- tion and disrupted seed dispersal. Proc R Soc B Biol
tory research, vol 54. Springer, New York Sci 279(1739):2727–2735. https://doi.org/10.1098/
Fernandez A, Edwards JF, Rodriguez F, de los Monteros rspb.2012.0230
AE, Herraez P, Castro P, Jaber JR, Martin V, Arbelo M Fristrup KM, Hatch LT, Clark CW (2003) Variation in
(2005) “Gas and fat embolic syndrome” involving a humpback whale (Megaptera novaeangliae) song
mass stranding of beaked whales (Family Ziphiidae) length in relation to low-frequency sound broadcasts.
exposed to anthropogenic sonar signals. Vet Pathol J Acoust Soc Am 113(6):3411–3424. https://doi.org/
42(4):446–457 10.1121/1.1573637
Ferrara CR, Vogt RC, Harfush MR, Sousa-Lima RS, Fuller RA, Warren PH, Gaston KJ (2007) Daytime noise
Albavera E, Tavera A (2014) First evidence of leather- predicts nocturnal singing in urban robins. Biol Lett
back turtle (Dermochelys coriacea) embryos and 3(4):368. https://doi.org/10.1098/rsbl.2007.0134
hatchlings emitting sounds. Chelonian Conserv Biol Galeotti P, Sacchi R, Fasola M, Ballasina D (2005) Do
13(1):110–114. https://doi.org/10.2744/ccb-1045.1 mounting vocalisations in tortoises have a communi-
Fewtrell JL, McCauley RD (2012) Impact of air gun noise cation function? A comparative analysis. Herpetol J
on the behaviour of marine fish and squid. Mar Pollut 15(2):61–71
Bull 64(5):984–993. https://doi.org/10.1016/j. Gallego-Abenza M, Mathevon N, Wheatcroft D (2019)
marpolbul.2012.02.009 Experience modulates an insect’s response to anthro-
Fields DM, Handegard NO, Dalen J, Eichner C, Malde K, pogenic noise. Behav Ecol 31(1):90–96. https://doi.
Karlsen Ø, Skiftesvik AB, Durif CMF, Browman HI org/10.1093/beheco/arz159
(2019) Airgun blasts used in marine seismic surveys Gesi M, Fornai F, Lenzi P, Ferrucci M, Soldani P,
have limited effects on mortality, and no sublethal Ruffoli R, Paparelli A (2002) Morphological
effects on behaviour or gene expression, in the cope- alterations induced by loud noise in the myocardium:
pod Calanus finmarchicus. ICES J Mar Sci 76(7): the role of benzodiazepine receptors. Microsc Res
2033–2044. https://doi.org/10.1093/icesjms/fsz126 Tech 59(2):136–146. https://doi.org/10.1002/jemt.
10186
Gil D, Gahr M (2002) The honesty of bird song: multiple challenges of analyzing behavioral response
constraints for multiple traits. Trends Ecol Evol 17(3): study data: an overview of the MOCHA (Multi-study
133–141. https://doi.org/10.1016/S0169-5347(02) OCean Acoustics Human Effects Analysis) project. In:
02410-2 Popper AN, Hawkins A (eds) The effects of noise on
Goldbogen JA, Southall BL, DeRuiter SL, aquatic life II. Springer, New York, pp 399–407.
Calambokidis J, Friedlaender AS, Hazen EL, Falcone https://doi.org/10.1007/978-1-4939-2981-8_47
EA, Schorr GS, Douglas A, Moretti DJ, Kyburg C, Hastings MC, Miksis-Olds J (2012) Shipboard assessment
McKenna MF, Tyack PL (2013) Blue whales respond of hearing sensitivity of tropical fishes immediately
to simulated mid-frequency military sonar. Proc R Soc after exposure to seismic air gun emissions at Scott
B 280(1765):20130657. https://doi.org/10.1098/rspb. Reef. In: Popper AN, Hawkins A (eds) The effects of
2013.0657 noise on aquatic life. Springer, New York, pp 239–243.
Goodwin SE, Shriver WG (2011) Effects of traffic noise https://doi.org/10.1007/978-1-4419-7311-5_53
on occupancy patterns of forest birds. Conserv Biol Hatch L, Clark C, Van Parijs S, Frankel A, Ponirakis D
25(2):406–411. https://doi.org/10.1111/j.1523-1739. (2012) Quantifying loss of acoustic communication
2010.01602.x space for right whales in and around a U.S. National
Grafe TU, Preininger D, Sztatecsny M, Kasah R, Dehling Marine Sanctuary. Conserv Biol 26(6):983–994.
JM, Proksch S, Hödl W (2012) Multimodal communi- https://doi.org/10.1111/j.1523-1739.2012.01908.x
cation in a noisy environment: A case study of the Hawkins AD, Popper AN (2018) Effects of man-made
Bornean rock frog Staurois parvus. PLoS One 7(5): sound on fishes. In: Slabbekoorn H, Dooling RJ, Pop-
e37965. https://doi.org/10.1371/journal.pone.0037965 per AN, Fay RR (eds) Effects of anthropogenic noise
Greenfield MD (2016) Evolution of acoustic communica- on animals. Springer, New York, pp 145–177. https://
tion in insects. In: Pollack GS, Mason AC, Popper AN, doi.org/10.1007/978-1-4939-8574-6_6
Fay RR (eds) Insect hearing. Springer, Cham, pp Hawkins JE, Johnsson LG, Stebbins WC, Moody DB,
17–47. https://doi.org/10.1007/978-3-319-28890-1_2 Coombs SL (1976) Hearing loss and cochlear pathol-
Gue M, Fioramonti J, Frexinos J, Alvinerie M, Bueno L ogy in monkeys after noise exposure. Acta Otolaryngol
(1987) Influence of acoustic stress by noise on gastro- 81(3-6):337–343. https://doi.org/10.3109/
intestinal motility in dogs. Dig Dis Sci 32(12): 00016487609119971
1411–1417. https://doi.org/10.1007/BF01296668 Hawkins AD, Roberts L, Cheesman S (2014) Responses
Guerra A, González AF, Rocha F (2004) A review of the of free-living coastal pelagic fish to impulsive sounds.
records of giant squid in the north-eastern Atlantic and J Acoust Soc Am 135(5):3101–3116. https://doi.org/
severe injuries in Architeuthis dux stranded after acous- 10.1121/1.4870697
tic explorations. In: Paper presented at the ICES Helble TA, Guazzo RA, Martin CR, Durbach IN, Alongi
Annual Science Conference, Vigo, Spain GC, Martin SW, Boyle JK, Henderson EE (2020)
Hage SR, Jiang T, Berquist SW, Feng J, Metzner W Lombard effect: Minke whale boing call source levels
(2013) Ambient noise induces independent shifts in vary with natural variations in ocean noise. J Acoust
call frequency and amplitude within the Lombard Soc Am 147(2):698–712. https://doi.org/10.1121/10.
effect in echolocating bats. Proc Natl Acad Sci 0000596
110(10):4063. https://doi.org/10.1073/pnas. Herbst CT, Stoeger AS, Frey R, Lohscheller J, Titze IR,
1211533110 Gumpenberger M, Fitch WT (2012) How low can
Hahad O, Kröller-Schön S, Daiber A, Münzel T (2019) you go? Physical production mechanism of elephant
The cardiovascular effects of noise. Dtsch Arztebl Int infrasonic vocalizations. Science 337(6094):595–599.
116(14):245–250. https://doi.org/10.3238/arztebl. https://doi.org/10.1126/science.1219712
2019.0245 Herrera-Montes MI, Aide TM (2011) Impacts of traffic
Halfwerk W, Bot S, Buikx J, van der Velde M, Komdeur J, noise on anuran and bird communities. Urban Ecosyst
ten Cate C, Slabbekoorn H (2011) Low-frequency 14(3):415–427. https://doi.org/10.1007/s11252-011-
songs lose their potency in noisy urban conditions. 0158-7
Proc Natl Acad Sci 108(35):14549–14554. https:// Heuch PA, Karlsen E (1997) Detection of infrasonic water
doi.org/10.1073/pnas.1109091108 oscillations by copepodids of Lepeophtheirus salmonis
Halfwerk W, Lea AM, Guerra MA, Page RA, Ryan MJ (Copepoda Caligida). J Plankton Res 19(6):735–747.
(2016) Vocal responses to noise reveal the presence of https://doi.org/10.1093/plankt/19.6.735
the Lombard effect in a frog. Behav Ecol 27(2): Heyward A, Colquhoun J, Cripps E, McCorry D,
669–676. https://doi.org/10.1093/beheco/arv204 Stowar M, Radford B, Miller K, Miller I, Battershill
Halvorsen MB, Casper BM, Woodley CM, Thomas J, C (2018) No evidence of damage to the soft tissue or
Carlson TJ, Popper AN (2012) Threshold for onset of skeletal integrity of mesophotic corals exposed to a 3D
injury in Chinook salmon from exposure to impulsive marine seismic survey. Mar Pollut Bull 129(1):8–13.
pile driving sounds. PLoS One 7(6):e38968. https:// https://doi.org/10.1016/j.marpolbul.2018.01.057
doi.org/10.1371/journal.pone.0038968 Hofstetter RW, Dunn DD, McGuire R, Potter KA (2014)
Harris CM, Thomas L, Sadykova D, DeRuiter SL, Tyack Using acoustic technology to reduce bark beetle
PL, Southall BL, Read AJ, Miller PJO (2016) The
reproduction. Pest Manag Sci 70(1):24–27. https://doi. response to anthropogenic noise. Ibis 163(1):52–64.
org/10.1002/ps.3656 https://doi.org/10.1111/ibi.12844
Holt MM, Noren DP, Veirs V, Emmons CK, Veirs S Kaifu K, Akamatsu T, Segawa S (2008) Underwater sound
(2009) Speaking up: Killer whales (Orcinus orca) detection by cephalopod statocyst. Fish Sci 74(4):
increase their call amplitude in response to vessel 781–786. https://doi.org/10.1111/j.1444-2906.2008.
noise. J Acoust Soc Am 125(1):El27–El32. https:// 01589.x
doi.org/10.1121/1.3040028 Kastelein RA, Steen N, de Jong C, Wensveen PJ,
Horn AG, Aikens M, Jamieson E, Kingdon K, Leonard Verboom WC (2011) Effect of broadband-noise
ML (2020) Effect of noise on development of call masking on the behavioral response of a harbor por-
discrimination by nestling tree swallows, Tachycineta poise (Phocoena phocoena) to 1-s duration 6-7 kHz
bicolor. Anim Behav 164:143–148. https://doi.org/10. sonar up-sweeps. J Acoust Soc Am 129(4):2307–2315.
1016/j.anbehav.2020.04.008 https://doi.org/10.1121/1.3559679
Houser DS, Martin SW, Finneran JJ (2013a) Behavioural Kastelein R, Gransier R, Hoek L, Olthuis J (2012) Tempo-
responses of California sea lions to mid-frequency rary threshold shifts and recovery in a harbour porpoise
(3250-3450 Hz) sonar signals. Mar Environ Res 92: (Phocoena phocoena) after octave-band noise at
268–278. https://doi.org/10.1016/j.marenvres.2013. 4 kHz. J Acoust Soc Am 132(5):3525–3537. https://
10.007 doi.org/10.1121/1.4757641
Houser DS, Martin SW, Finneran JJ (2013b) Exposure Kastelein RA, Gransier R, van den Hoogen M, Hoek L
amplitude and repetition affect bottlenose dolphin (2013) Brief behavioural response threshold levels of a
behavioural responses to simulated mid-frequency harbour porpoise (Phocoena phocoena) to five heli-
sonar signals. J Exp Mar Biol Ecol 443:123–133. copter dipping sonar signals (1.33 to 1.43 kHz).
https://doi.org/10.1016/j.jembe.2013.02.043 Aquat Mamm 39(2):162–173
Hu B (2012) Noise-induced structural damage to the Kerth G, Melber M (2009) Species-specific barrier effects
cochlea. In: Le Prell CG, Henderson D, Fay RR, Pop- of a motorway on the habitat use of two threatened
per AN (eds) Noise-induced hearing loss: scientific forest-living bat species. Biol Conserv 142(2):
advances. Springer, New York, pp 57–86. https://doi. 270–279. https://doi.org/10.1016/j.biocon.2008.
org/10.1007/978-1-4419-9523-0_5 10.022
Hu BH, Guo W, Wang PY, Henderson D, Jiang SC (2000) Kight CR, Swaddle JP (2011) How and why environmen-
Intense noise-induced apoptosis in hair cells of guinea tal noise impacts animals: an integrative, mechanistic
pig cochleae. Acta Otolaryngol 120(1):19–24. https:// review. Ecol Lett 14(10):1052–1061. https://doi.org/
doi.org/10.1080/000164800750044443 10.1111/j.1461-0248.2011.01664.x
Hulse SH, MacDougall-Shackleton SA, Wisniewski AB Kight CR, Saha MS, Swaddle JP (2012) Anthropogenic
(1997) Auditory scene analysis by songbirds: stream noise is associated with reductions in the productivity
segregation of birdsong by European starlings (Sturnus of breeding Eastern Bluebirds (Sialia sialis). Ecol Appl
vulgaris). J Comp Psychol 111(1):3–13. https://doi. 22(7):1989–1996. https://doi.org/10.1890/12-0133.1
org/10.1037/0735-7036.111.1.3 Kobayasi KI, Okanoya K (2003) Context-dependent song
International Organization for Standardization (2017) amplitude control in Bengalese finches. Neuroreport
Underwater acoustics—terminology (ISO 18405). 14(3):521–524
Switzerland, Geneva Krausman PR, Hervert JJ (1983) Mountain sheep
Jeffrey ML, Brooke MC-E, Douglas HC (2002) Sound responses to aerial surveys. Wildl Soc Bull 11(4):
detection in situ by the larvae of a coral-reef damselfish 372–375
(Pomacentridae). Mar Ecol Prog Ser 232:259–268 Kryter KD (1994) The handbook of hearing and the effects
Jepson PD, Arbelo M, Deaville R, Patterson IAP, Castro P, of noise: physiology, psychology, and public health.
Baker JR, Degollada E, Ross HM, Herráez P, Pocknell Academic Press, New York
AM, Rodríguez F, Howie FE, Espinosa A, Reid RJ, Kujawa SG, Liberman MC (2009) Adding insult to injury:
Jaber JR, Martin V, Cunningham AA, Fernández A cochlear nerve degeneration after “temporary” noise-
(2003) Gas-bubble lesions in stranded cetaceans. induced hearing loss. J Neurosci 29(45):14077–14085.
Nature 425:575. https://doi.org/10.1038/425575a https://doi.org/10.1523/JNEUROSCI.2845-09.2009
Johnston DW (2002) The effect of acoustic harassment Lampe U, Schmoll T, Franzke A, Reinhold K (2012)
devices on harbour porpoises (Phocoena phocoena) in Staying tuned: grasshoppers from noisy roadside
the Bay of Fundy, Canada. Biol Conserv 108(1): habitats produce courtship signals with elevated fre-
113–118. https://doi.org/10.1016/s0006-3207(02) quency components. Funct Ecol 26(6):1348–1354.
00099-x https://doi.org/10.1111/1365-2435.12000
Joy R, Tollit D, Wood J, MacGillivray A, Li Z, Trounce K, Lampe U, Reinhold K, Schmoll T (2014) How
Robinson O (2019) Potential benefits of vessel grasshoppers respond to road noise: developmental
slowdowns on endangered southern resident killer plasticity and population differentiation in acoustic
whales. Front Mar Sci 6(344). https://doi.org/10. signalling. Funct Ecol 28(3):660–668. https://doi.org/
3389/fmars.2019.00344 10.1111/1365-2435.12215
Juárez R, Araya-Ajoy YG, Barrantes G, Sandoval L Landon DM, Krausman PR, Kiana KGK, Harris LK
(2021) House wrens Troglodytes aedon reduce reper- (2003) Pronghorn use of areas with varying sound
toire size and change song element frequencies in pressure levels. Southwest Nat 48(4):725–728
Le Prell CG, Hammill TL, Murphy WJ (2019) Noise- Malme CI, Miles PR, Clark CW, Tyack P, Bird JE (1983)
induced hearing loss: translating risk from animal Investigations of the potential effects of underwater
models to real-world environments. J Acoust Soc Am noise from petroleum industry activities on migrating
146(5):3646–3651. https://doi.org/10.1121/1.5133385 gray whale behavior Bolt Beranek and Newman Inc.
Lee N, Mason AC (2017) How spatial release from for U.S. Minerals Management Service, Anchorage,
masking may fail to function in a highly directional AK, Cambridge, MA
auditory system. eLife 6:e20731. https://doi.org/10. Malme CI, Miles PR, Clark CW, Tyack P, Bird JE (1984)
7554/eLife.20731 Investigations of the potential effects of underwater
Lee-Dadswell GR (2011) Physics of the interaction noise from petroleum industry activities on migrating
between a crab and a seismic test pulse - stage 4: gray whale behavior. Phase II: January 1984 migration.
continued development of mathematical model and (trans: Minerals Management Service USDotI,
testing of model via simulation. Cape Breton Univer- Washington, DC.). Bolt Beranek and Newman Inc.,
sity, Sydney, NS Anchorage, AK
Lengagne T (2008) Traffic noise affects communication Manabe K, Sadr EI, Dooling RJ (1998) Control of vocal
behaviour in a breeding anuran, Hyla arborea. Biol intensity in budgerigars (Melopsittacus undulatus): dif-
Conserv 141(8):2023–2031. https://doi.org/10.1016/j. ferential reinforcement of vocal intensity and the Lom-
biocon.2008.05.017 bard effect. J Acoust Soc Am 103(2):1190–1198.
Leonard ML, Horn AG (2008) Does ambient noise affect https://doi.org/10.1121/1.421227
growth and begging call structure in nestling birds? Mason JT, McClure CJW, Barber JR (2016) Anthropo-
Behav Ecol 19(3):502–507. https://doi.org/10.1093/ genic noise impairs owl hunting behavior. Biol
beheco/arm161 Conserv 199:29–32. https://doi.org/10.1016/j.biocon.
Lewis ER, Narins PM (1985) Do frogs communicate with 2016.04.009
seismic signals? Science 227(4683):187–189. https:// McCarthy E, Moretti D, Thomas L, DiMarzio N,
doi.org/10.1126/science.227.4683.187 Morrissey R, Jarvis S, Ward J, Izzi A, Dilley A
Liberman MC (2016) Noise-induced hearing loss: perma- (2011) Changes in spatial and temporal distribution
nent versus temporary threshold shifts and the effects and vocal behavior of Blainville’s beaked whales
of hair cell versus neuronal degeneration. In: Popper (Mesoplodon densirostris) during multiship exercises
AN, Hawkins A (eds) The effects of noise on aquatic with mid-frequency sonar. Mar Mamm Sci 27(3):
life II. Springer, New York, pp 1–7. https://doi.org/10. E206–E226. https://doi.org/10.1111/j.1748-7692.
1007/978-1-4939-2981-8_1 2010.00457.x
Lohr B, Wright TF, Dooling RJ (2003) Detection and McCauley RD (2014) Maxima seismic survey noise expo-
discrimination of natural calls in masking noise by sure, Scott Reef 2007. Centre for Marine Science &
birds: estimating the active space signal. Anim Behav Technology, Perth, WA
65:763–777 McCauley RD, Fewtrell J, Duncan AJ, Jenner C, Jenner
Love EK, Bee MA (2010) An experimental test of noise- M-N, Penrose JD, Prince RIT, Adhitya A, Murdoch J,
dependent voice amplitude regulation in Cope’s grey McCabe K (2003a) Marine seismic surveys: analysis
treefrog, Hyla chrysoscelis. Anim Behav 80(3): and propagation of air-gun signals; and effects of expo-
509–515. https://doi.org/10.1016/j.anbehav.2010. sure on humpback whales, sea turtles, fishes and
05.031 squid. In: Environmental implications of offshore oil
Lovell JM, Findlay MM, Moate RM, Yan HY (2005) The and gas development in Australia: further research.
hearing abilities of the prawn Palaemon serratus. Australian Petroleum Production and Exploration
Comp Biochem Physiol A Mol Integr Physiol 140(1): Association, Canberra, ACT
89–100. https://doi.org/10.1016/j.cbpb.2004.11.003 McCauley RD, Fewtrell J, Popper AN (2003b) High inten-
Lowry H, Lill A, Wong BBM (2012) How noisy does a sity anthropogenic sound damages fish ears. J Acoust
noisy miner have to be? Amplitude adjustments of Soc Am 113:638–642. https://doi.org/10.1121/1.
alarm calls in an avian urban ‘adapter’. PLoS One 1527962
7(1):e29960. https://doi.org/10.1371/journal.pone. McCauley RD, Day RD, Swadling KM, Fitzgibbon QP,
0029960 Watson RA, Semmens JM (2017) Widely used marine
Lucke K, Siebert U, Lepper PA, Blanchet M-A (2009) seismic survey air gun operations negatively impact
Temporary shift in masked hearing thresholds in a zooplankton. Nat Ecol Evol 1:0195. https://doi.org/
harbor porpoise (Phocoena phocoena) after exposure 10.1038/s41559-017-0195
to seismic airgun stimuli. J Acoust Soc Am 125(6): McCue MP, Guinan JJ (1994) Acoustically responsive
4060–4070. https://doi.org/10.1121/1.3117443 fibers in the vestibular nerve of the cat. J Neurosci
Madsen PT, Wahlberg M, Tougaard J, Lucke K, Tyack P 14(10):6058. https://doi.org/10.1523/JNEUROSCI.
(2006) Wind turbine underwater noise and marine 14-10-06058.1994
mammals: implications of current knowledge and McGregor RL, Bender DJ, Fahrig L (2008) Do small
data needs. Mar Ecol Prog Ser 309:279–295. https:// mammals avoid roads because of the traffic? J Appl
doi.org/10.3354/meps309279 Ecol 45(1):117–123. https://doi.org/10.1111/j.
1365-2664.2007.01403.x
13 The Effects of Noise on Animals 501

McIntyre E, Leonard ML, Horn AG (2014) Ambient Morrissey R (2014) A risk function for behavioral
noise and parental communication of predation risk disruption of Blainville’s beaked whales (Mesoplodon
in tree swallows, Tachycineta bicolor. Anim Behav densirostris) from mid-frequency active sonar. PLoS
87:85–89. https://doi.org/10.1016/j.anbehav.2013. One 9(1):e85064. https://doi.org/10.1371/journal.
10.013 pone.0085064
McMullen H, Schmidt R, Kunc HP (2014) Anthropogenic Morley EL, Jones G, Radford AN (2014) The importance
noise affects vocal interactions. Behav Process 103: of invertebrates when considering the impacts of
125–128. https://doi.org/10.1016/j.beproc.2013. anthropogenic noise. Proc Biol Sci 281(1776):
12.001 20132683. https://doi.org/10.1098/rspb.2013.2683
Megela-Simmons A, Moss CF, Daniel KM (1985) Behav- Morris-Drake A, Kern JM, Radford AN (2016) Cross-
ioral audiograms of the bullfrog (Rana catesbeiana) modal impacts of anthropogenic noise on information
and the green tree frog (Hyla cinerea). J Acoust Soc use. Curr Biol 26(20):R911–R912. https://doi.org/10.
Am 78(4):1236–1244. https://doi.org/10.1121/1. 1016/j.cub.2016.08.064
392892 Morton AB, Symonds HK (2002) Displacement of
Miller PJO, Biassoni N, Samuels A, Tyack PL (2000) Orcinus orca (L.) by high amplitude sound in British
Whale songs lengthen in response to sonar. Nature Columbia, Canada. ICES J Mar Sci 59(1):71–80
405(6789):903. https://doi.org/10.1038/35016148 Moseley DL, Derryberry GE, Phillips JN, Danner JE,
Miller GW, Moulton VD, Davis RA, Holst M, Millman P, Danner RM, Luther DA, Derryberry EP (2018) Acous-
MacGillivray A, Hannay D (2005) Monitoring seismic tic adaptation to city noise through vocal learning by a
effects on marine mammals - Southeastern Beaufort songbird. Proc R Soc B Biol Sci 285(1888):20181356.
Sea, 2001-2002. In: Armsworthy SL, Cranford PJ, https://doi.org/10.1098/rspb.2018.1356
Lee K (eds) Offshore oil and gas environmental effects Mrosovsky N (1972) Spectrographs of the sounds of leath-
monitoring: approaches and technologies. Battelle erback turtles. Herpetologica 28(3):256–258
Press, Columbus, OH, pp 511–542 Narayan R, Best V, Ozmeral E, McClaine E, Dent M,
Miller PJ, Kvadsheim PH, Lam F-PA, Wensveen PJ, Shinn-Cunningham B, Sen K (2007) Cortical interfer-
Antunes R, Alves AC, Visser F, Kleivane L, Tyack ence effects in the cocktail party problem. Nat
PL, Sivle LD (2012) The severity of behavioral Neurosci 10(12):1601–1607. https://doi.org/10.1038/
changes observed during experimental exposures of nn2009
killer (Orcinus orca), long-finned pilot (Globicephala Narins PM, Wilson M, Mann DA (2014) Ultrasound
melas), and sperm (Physeter macrocephalus) whales to detection in fishes and frogs: discovery and
naval sonar. Aquat Mamm 38(4):362–401 mechanisms. In: Köppl C, Manley GA, Popper AN,
Miller PJO, Antunes RN, Wensveen PJ, Samarra FIP, Fay RR (eds) Insights from comparative hearing
Alves AC, Tyack PL, Kvadsheim PH, Kleivane L, research. Springer, New York, pp 133–156. https://
Lam F-PA, Ainslie MA, Thomas L (2014) Dose- doi.org/10.1007/2506_2013_29
response relationships for the onset of avoidance of National Marine Fisheries Service (2018) 2018
sonar by free-ranging killer whales. J Acoust Soc Am Revisions to: technical guidance for assessing the
135(1):975. https://doi.org/10.1121/1.4861346 effects of anthropogenic sound on marine mammal
Miller PJO, Kvadsheim PH, Lam F-PA, Tyack PL, hearing (Version 2.0): underwater thresholds for
Cure C, Deruiter SL, Kleivane L, Sivle LD, Van onset of permanent and temporary threshold shifts. U.-
Ijsselmuide SP, Visser F, Wensveen PJ, Von Benda- S. Department of Commerce, National Oceanic and
Beckmann AM, Martin Lopez LM, Narazaki T, Atmospheric Administration, Silver Spring, MD
Hooker SK (2015) First indications that northern National Research Council (2005) Marine mammal
bottlenose whales are sensitive to behavioural distur- populations and ocean noise: determining when noise
bance from anthropogenic noise. R Soc Open Sci 2: causes biologically significant effects. National
140484. https://doi.org/10.1098/rsos.140484 Academies Press, Washington. https://doi.org/10.
Mooney TA, Hanlon RT, Christensen-Dalsgaard J, 17226/11147
Madsen PT, Ketten DR, Nachtigall PE (2010) Sound Nemeth E, Pieretti N, Zollinger SA, Geberzahn N,
detection by the longfin squid (Loligo pealeii) studied Partecke J, Miranda AC, Brumm H (2013) Bird song
with auditory evoked potentials: sensitivity to and anthropogenic noise: vocal constraints may
low-frequency particle motion and not pressure. J explain why birds sing higher-frequency songs in cit-
Exp Biol 213(21):3748–3759. https://doi.org/10. ies. Proc Biol Sci 280(1754). https://doi.org/10.1098/
1242/jeb.048348 rspb.2012.2798
Mooney TA, Yamato M, Branstetter BK (2012) Hearing in Neo Y, Ufkes E, Kastelein R, Winter H, ten Cate C,
cetaceans: from natural history to experimental biol- Slabbekoorn H (2015) Impulsive sounds change euro-
ogy. Adv Mar Biol 63:197–246 pean seabass swimming patterns: influence of pulse
Moore BCJ (2013) An introduction to the psychology of repetition interval. Mar Pollut Bull 97(1–2):111–117.
hearing. Brill, Leiden, The Netherlands https://doi.org/10.1016/j.marpolbul.2015.06.027
Moretti D, Thomas L, Marques T, Harwood J, Dilley A, New LF, Clark JS, Costa DP, Fleishman E, Hindell MA,
Neales B, Shaffer J, McCarthy E, New L, Jarvis S, Klanjcek T, Lusseau D, Kraus S, McMahon CR,
502 C. Erbe et al.

Robinson PW, Schick RS, Schwarz LK, Simmons SE, Penna M, Pottstock H, Velasquez N (2005) Effect of
Thomas L, Tyack PL, Harwood J (2014) Using short- natural and synthetic noise on evoked vocal responses
term measures of behaviour to estimate long-term fit- in a frog of the temperate austral forest. Anim Behav
ness of southern elephant seals. Mar Ecol Prog Ser 70(3):639–651. https://doi.org/10.1016/j.anbehav.
496:99–108. https://doi.org/10.3354/meps10547 2004.11.022
Nonaka S, Takahashi R, Enomoto K, Katada A, Unno T Pirotta E, Brookes KL, Graham IM, Thompson PM (2014)
(1997) Lombard reflex during PAG-induced vocaliza- Variation in harbour porpoise activity in response to
tion in decerebrate cats. Neurosci Res 29(4):283–289. seismic survey noise. Biol Lett 10(5):20131090.
https://doi.org/10.1016/S0168-0102(97)00097-7 https://doi.org/10.1098/rsbl.2013.1090
Nordt A, Klenke R (2013) Sleepless in town – drivers of Pohl NU, Slabbekoorn H, Klump GM, Langemann U
the temporal shift in dawn song in urban European (2009) Effects of signal features and environmental
blackbirds. PLoS One 8(8):e71476. https://doi.org/10. noise on signal detection in the great tit, Parus major.
1371/journal.pone.0071476 Anim Behav 78(6):1293–1300. https://doi.org/10.
Normandeau Associates (2012) Effects of noise on fish, 1016/j.anbehav.2009.09.005
fisheries, and invertebrates in the U.S. Atlantic and Pohl NU, Leadbeater E, Slabbekoorn H, Klump GM,
Arctic from energy industry sound-generating Langemann U (2012) Great tits in urban noise benefit
activities. U.S. Department of the Interior, Bureau of from high frequencies in song detection and discrimi-
Ocean Energy Management, Washington, DC nation. Anim Behav 83(3):711–721. https://doi.org/10.
Ocean Studies Board (2016) Approaches to understanding 1016/j.anbehav.2011.12.019
the cumulative effects of stressors on marine mammals. Pohl NU, Klump GM, Langemann U (2015) Effects of
The National Academies Press, Washington, signal features and background noise on distance cue
DC. https://doi.org/10.17226/23479 discrimination by a songbird. J Exp Biol 218(7):1006.
Orci KM, Petróczki K, Barta Z (2016) Instantaneous song https://doi.org/10.1242/jeb.113639
modification in response to fluctuating traffic noise in Polajnar J, Čokl A (2008) The effect of vibratory distur-
the tree cricket Oecanthus pellucens. Anim Behav 112: bance on sexual behaviour of the southern green stink
187–194. https://doi.org/10.1016/j.anbehav.2015. bug Nezara viridula (Heteroptera, Pentatomidae). Cent
12.008 Eur J Biol 3(2):189–197. https://doi.org/10.2478/
Otten W, Kanitz E, Puppe B, Tuchscherer M, Brüssow KP, s11535-008-0008-7
Nürnberg G, Stabenow B (2004) Acute and long term Polajnar J, Eriksson A, Lucchi A, Anfora G, Virant-
effects of chronic intermittent noise stress on Doberlet M, Mazzoni V (2015) Manipulating
hypothalamic-pituitary-adrenocortical and sympatho- behaviour with substrate-borne vibrations – potential
adrenomedullary axis in pigs. Anim Sci 78(2): for insect pest control. Pest Manag Sci 71(1):15–23.
271–283. https://doi.org/10.1017/S1357729800054060 https://doi.org/10.1002/ps.3848
Owen MA, Swaisgood RR, Czekala NM, Steinman K, Polak M, Wiącek J, Kucharczyk M, Orzechowski R
Lindburg DG (2004) Monitoring stress in captive (2013) The effect of road traffic on a breeding commu-
giant pandas (Ailuropoda melanoleuca): behavioral nity of woodland birds. Eur J For Res 132(5):931–941.
and hormonal responses to ambient noise. Zoo Biol https://doi.org/10.1007/s10342-013-0732-z
23(2):147–164. https://doi.org/10.1002/zoo.10124 Popper AN, Fay RR (1993) Sound detection and
Parris KM, Velik-Lord M, North JMA (2009) Frogs call at processing by fish: critical review and major research
a higher pitch in traffic noise. Ecol Soc 14(1):25 questions. Brain Behav Evol 41(1):14–25
Parry GD, Heislers S, Werner GF, Asplin MD (2002) Popper AN, Fay RR (2011) Rethinking sound detection by
Assessment of environmental effects of seismic testing fishes. Hear Res 273(1):25–36. https://doi.org/10.
on scallop fisheries in Bass Strait. Marine and Fresh- 1016/j.heares.2009.12.023
water Resources Institute, Queenscliff, VIC Popper AN, Hastings MC (2009) The effects of anthropo-
Payne KB, Langbauer WR, Thomas EM (1986) Infrasonic genic sources of sound on fishes. J Fish Biol 75:455–
calls of the Asian elephant (Elephas maximus). Behav 489
Ecol Sociobiol 18(4):297–301. https://doi.org/10. Popper AN, Hawkins A (eds) (2012) The effects of noise
1007/bf00300007 on aquatic life. advances in experimental medicine and
Payne JF, Andrews CA, Fancey LL, Cook AL, Christian biology 730. Springer, New York
JR (2007) Pilot study on the effects of seismic air gun Popper AN, Hawkins A (eds) (2016) The effects of noise
noise on lobster (Homarus americanus). Canadian on aquatic life II. Advances in experimental medicine
Technical Report of Fisheries and Aquatic and biology 730. Springer, New York
Sciences 2712 Popper AN, Smith ME, Cott PA, Hanna BW,
Pearson WH, Skalski JR, Malme CI (1992) Effects of MacGillivray AO, Austin ME, Mann DA (2005)
Sounds from a Geophysical Survey Device on Behav- Effects of exposure to seismic airgun use on hearing
ior of Captive Rockfish (Sebastes spp.). Can J Fish of three fish species. J Acoust Soc Am 117(6):
Aquat Sci 49(7):1343–1356. https://doi.org/10.1139/ 3958–3971. https://doi.org/10.1121/1.1904386
f92-150 Popper AN, Hawkins AD, Fay RR, Mann DA, Bartol S,
Carlson TJ, Coombs S, Ellison WT, Gentry RL,
13 The Effects of Noise on Animals 503

Halvorsen MB, Løkkeborg S, Rogers PH, Southall BL, Richardson AJ, Matear RJ, Lenton A (2017) Potential
Zeddies DG, Tavolga WN (2014) Sound exposure impacts on zooplankton of seismic surveys. CSIRO,
guidelines. In: ASA S3/SC1.4 TR-2014 sound expo- Canberra, Australia
sure guidelines for fishes and sea turtles: a technical Ridgway SH, Wever EG, McCormick JG, Palin J,
report prepared by ANSI-Accredited Standards Com- Anderson JH (1969) Hearing in the giant sea turtle,
mittee S3/SC1 and registered with ANSI. Springer, Chelonia mydas. Proc Natl Acad Sci USA 64:884–890
New York, pp 33–51. https://doi.org/10.1007/978-3- Roian Egnor SE, Hauser MD (2006) Noise-induced vocal
319-06659-2_7 modulation in cotton-top tamarins (Saguinus oedipus).
Potash LM (1972) Noise-induced changes in calls of the Am J Primatol 68(12):1183–1190. https://doi.org/10.
Japanese quail. Psychon Sci 26(5):252–254. https:// 1002/ajp.20317
doi.org/10.3758/bf03328608 Rolland RM, Parks SE, Hunt KE, Castellote M, Corkeron
Proppe DS, Sturdy CB, St Clair CC (2011) Flexibility in PJ, Nowacek DP, Wasser SK, Kraus SD (2012) Evi-
animal signals facilitates adaptation to rapidly chang- dence that ship noise increases stress in right whales.
ing environments. PLoS One 6(9):e25413. https://doi. Proc R Soc Lond Ser B Biol Sci 279(1737):
org/10.1371/journal.pone.0025413 2363–2368. https://doi.org/10.1098/rspb.2011.2429
Przeslawski R, Huang Z, Anderson J, Carroll AG, Romano TA, Keogh MJ, Kelly C, Feng P, Berk L,
Edmunds M, Hurt L, Williams S (2018) Multiple field- Schlundt CE, Carder DA, Finneran JJ (2004) Anthro-
based methods to assess the potential impacts of seismic pogenic sound and marine mammal health: measures
surveys on scallops. Mar Pollut Bull 129(2):750–761. of the nervous and immune systems before and after
https://doi.org/10.1016/j.marpolbul.2017.10.066 intense sound exposure. Can J Fish Aquat Sci 61(7):
Quinn JL, Whittingham MJ, Butler SJ, Cresswell W 1124–1134. https://doi.org/10.1139/f04-055
(2006) Noise, predation risk compensation and vigi- Romero ML, Butler LK (2007) Endocrinology of stress.
lance in the chaffinch Fringilla coelebs. J Avian Biol Int J Comp Psychol 20(2)
37(6):601–608. https://doi.org/10.1111/j.2006. Ruffoli R, Carpi A, Giambelluca MA, Grasso L, Scavuzzo
0908-8857.03781.x MC, Giannessi FF (2006) Diazepam administration
Rabin LA, Coss RG, Owings DH (2006) The effects of prevents testosterone decrease and lipofuscin accumu-
wind turbines on antipredator behavior in California lation in testis of mouse exposed to chronic noise
ground squirrels (Spermophilus beecheyi). Biol stress. Andrologia 38(5):159–165. https://doi.org/10.
Conserv 131(3):410–420. https://doi.org/10.1016/j. 1111/j.1439-0272.2006.00732.x
biocon.2006.02.016 Ryals BM, Rubel EW (1988) Hair cell regeneration after
Raboin M, Elias DO (2019) Anthropogenic noise and the acoustic trauma in adult Coturnix quail. Science
bioacoustics of terrestrial invertebrates. J Exp Biol 240(4860):1774. https://doi.org/10.1126/science.
222(12):jeb178749. https://doi.org/10.1242/jeb. 3381101
178749 Samuel Y, Morreale SJ, Clark CW, Greene CH, Richmond
Ratnam R, Feng AS (1998) Detection of auditory signals ME (2005) Underwater, low-frequency noise in a
by frog inferior collicular neurons in the presence of coastal sea turtle habitat. J Acoust Soc Am 117(3):
spatially separated noise. J Neurophysiol 80(6):2848 1465–1472. https://doi.org/10.1121/1.1847993
Reichmuth C, Holt MM, Mulsow J, Sills JM, Southall BL Saunders JC, Dooling RJ (2018) Characteristics of tempo-
(2013) Comparative assessment of amphibious hearing rary and permanent threshold shifts in vertebrates. In:
in pinnipeds. J Comp Physiol A 199(6):491–507 Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds)
Reijnen R, Foppen R (2006) Impact of road traffic on Effects of anthropogenic noise on animals. Springer,
breeding bird populations. In: Davenport J, Davenport New York, pp 83–107. https://doi.org/10.1007/978-1-
JL (eds) The ecology of transportation: managing 4939-8574-6_4
mobility for the environment. Springer, Netherlands, Schaub A, Ostwald J, Siemers BM (2008) Foraging bats
Dordrecht, pp 255–274. https://doi.org/10.1007/1- avoid noise. J Exp Biol 211(19):3174. https://doi.org/
4020-4504-2_12 10.1242/jeb.022863
Reijnen R, Foppen R, Braak CT, Thissen J (1995) The Scheifele PM, Andrew S, Cooper RA, Darre M, Musiek
effects of car traffic on breeding bird populations in FE, Max L (2005) Indication of a Lombard vocal
woodland. III. Reduction of density in relation to the response in the St. Lawrence River beluga. J Acoust
proximity of main roads. J Appl Ecol 32(1):187–202. Soc Am 117(3):1486–1492. https://doi.org/10.1121/1.
https://doi.org/10.2307/2404428 1835508
Rheindt FE (2003) The impact of roads on birds: does Schmidt AKD, Riede K, Römer H (2011) High back-
song frequency play a role in determining susceptibil- ground noise shapes selective auditory filters in a trop-
ity to noise pollution? J Ornithol 144(3):295–306. ical cricket. J Exp Biol 214(10):1754. https://doi.org/
https://doi.org/10.1007/bf02465629 10.1242/jeb.053819
Richardson WJ, Greene CR, Malme CI, Thomson DH Schmidt R, Morrison A, Kunc HP (2014) Sexy voices – no
(1995) Marine mammals and noise. Academic Press, choices: male song in noise fails to attract females.
San Diego Anim Behav 94:55–59. https://doi.org/10.1016/j.
anbehav.2014.05.018
504 C. Erbe et al.

Serrano A, Terhune JM (2001) Within-call repetition may Slabbekoorn H, Dooling RJ, Popper AN, Fay RR
be an anti-masking strategy in underwater calls of harp (eds) Effects of anthropogenic noise on animals.
seals. Can J Zool 79:1410–1413 Springer, New York, pp 243–276. https://doi.org/10.
Shannon G, Angeloni LM, Wittemyer G, Fristrup KM, 1007/978-1-4939-8574-6_9
Crooks KR (2014) Road traffic noise modifies Slotte A, Hansen K, Dalen J, Ona E (2004) Acoustic
behaviour of a keystone species. Anim Behav 94: mapping of pelagic fish distribution and abundance in
135–141. https://doi.org/10.1016/j.anbehav.2014. relation to a seismic shooting area off the Norwegian
06.004 west coast. Fish Res 67(2):143–150. https://doi.org/10.
Shannon G, Crooks KR, Wittemyer G, Fristrup KM, 1016/j.fishres.2003.09.046
Angeloni LM (2016) Road noise causes earlier preda- Sobrian SK, Vaughn VT, Ashe WK, Markovic B,
tor detection and flight response in a free-ranging Djuric V, Jankovic BD (1997) Gestational exposure
mammal. Behav Ecol 27(5):1370–1375. https://doi. to loud noise alters the development and postnatal
org/10.1093/beheco/arw058 responsiveness of humoral and cellular components
Shen J-X, Feng AS, Xu Z-M, Yu Z-L, Arch VS, Yu X-J, of the immune system in offspring. Environ Res
Narins PM (2008) Ultrasonic frogs show hyperacute 73(1):227–241. https://doi.org/10.1006/enrs.1997.
phonotaxis to female courtship calls. Nature 3734
453(7197):914–916. https://doi.org/10.1038/ Söffker M, Trathan P, Clark J, Collins MA, Belchier M,
nature06719 Scott R (2015) The impact of predation by marine
Siemers BM, Schaub A (2011) Hunting at the highway: mammals on patagonian toothfish longline fisheries.
traffic noise reduces foraging efficiency in acoustic PLoS One 10(3):e0118113. https://doi.org/10.1371/
predators. Proc R Soc B Biol Sci 278(1712): journal.pone.0118113
1646–1652. https://doi.org/10.1098/rspb.2010.2262 Solé M, Lenoir M, Durfort M, López-Bejar M,
Simmons AM, Narins PM (2018) Effects of anthropogenic Lombarte A, André M (2013) Ultrastructural damage
noise on amphibians and reptiles. In: Slabbekoorn H, of Loligo vulgaris and Illex coindetii statocysts after
Dooling RJ, Popper AN, Fay RR (eds) Effects of low frequency sound exposure. PLoS One 8(10):
anthropogenic noise on animals. Springer, New York, e78825. https://doi.org/10.1371/journal.pone.0078825
pp 179–208. https://doi.org/10.1007/978-1-4939- Song J, Mann DA, Cott PA, Hanna BW, Popper AN
8574-6_7 (2008) The inner ears of Northern Canadian freshwater
Simpson SD, Meekan M, Montgomery J, McCauley R, fishes following exposure to seismic air gun sounds. J
Jeffs A (2005) Homeward sound. Science 308:221. Acoust Soc Am 124(2):1360–1366. https://doi.org/10.
https://doi.org/10.1126/science.1107406 1121/1.2946702
Simpson SD, Jeffs A, Montgomery JC, McCauley RD, Southall BL (2018) Noise. In: Würsig B, Thewissen JGM,
Meekan MG (2007) Nocturnal relocation of adult and Kovacs KM (eds) Encyclopedia of marine mammals,
juvenile coral reef fishes in response to reef noise. 3rd edn. Academic Press, New York, pp 637–645.
Coral Reefs 27:97–104. https://doi.org/10.1007/ https://doi.org/10.1016/B978-0-12-804327-1.00183-7
s00338-007-0294-y Southall BL, Bowles AE, Ellison WT, Finneran JJ, Gentry
Simpson SD, Radford AN, Nedelec SL, Ferrari MCO, RL, Greene CRJ, Kastak D, Ketten DR, Miller JH,
Chivers DP, McCormick MI, Meekan MG (2016) Nachtigall PE, Richardson WJ, Thomas JA, Tyack
Anthropogenic noise increases fish mortality by preda- PL (2007) Marine mammal noise exposure criteria:
tion. Nat Commun 7:10544. https://doi.org/10.1038/ Initial scientific recommendations. Aquat Mamm
ncomms10544 33(4):411–521. https://doi.org/10.1080/09524622.
Slabbekoorn H, Peet M (2003) Ecology: birds sing at a 2008.9753846
higher pitch in urban noise. Nature 424(6946): Southall BL, Nowacek DP, Miller PJO, Tyack PL (2016)
267–267. https://doi.org/10.1038/424267a Experimental field studies to measure behavioral
Slabbekoorn H, Ripmeester EAP (2008) Birdsong and responses of cetaceans to sonar. Endanger Species
anthropogenic noise: implications and applications Res 31:293–315. https://doi.org/10.3354/esr00764
for conservation. Mol Ecol 17(1):72–83. https://doi. Southall BL, DeRuiter SL, Friedlaender A, Stimpert AK,
org/10.1111/j.1365-294X.2007.03487.x Goldbogen JA, Hazen E, Casey C, Fregosi S, Cade
Slabbekoorn H, Bouton N, van Opzeeland I, Coers A, ten DE, Allen AN, Harris CM, Schorr G, Moretti D,
Cate C, Popper AN (2010) A noisy spring: the impact Guan S, Calambokidis J (2019a) Behavioral responses
of globally rising underwater sound levels on fish. of individual blue whales (Balaenoptera musculus) to
Trends Ecol Evol 25(7):419–427. https://doi.org/10. mid-frequency military sonar. J Exp Biol 222(5):
1016/j.tree.2010.04.005 jeb190637. https://doi.org/10.1242/jeb.190637
Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (2018a) Southall BL, Finneran JJ, Reichmuth C, Nachtigall PE,
Effects of anthropogenic noise on animals. In: Springer Ketten DR, Bowles AE, Ellison WT, Nowacek DP,
handbook of auditory research, vol 66. Springer, Tyack PL (2019b) Marine mammal noise exposure
New York criteria: updated scientific recommendations for resid-
Slabbekoorn H, McGee J, Walsh EJ (2018b) Effects of ual hearing effects. Aquat Mamm 45(2):125–232.
man-made sound on terrestrial mammals. In: https://doi.org/10.1578/AM.45.2.2019.125
13 The Effects of Noise on Animals 505

Stanley JA, Radford CA, Jeffs AG (2009) Induction of Thompson PM, Lusseau D, Barton T, Simmons D,
settlement in crab megalopae by ambient underwater Rusin J, Bailey H (2010) Assessing the responses of
reef sound. Behav Ecol 21(1):113–120. https://doi.org/ coastal cetaceans to the construction of offshore wind
10.1093/beheco/arp159 turbines. Mar Pollut Bull 60(8):1200–1208. https://doi.
Stanley JA, Wilkens SL, Jeffs AG (2014) Fouling in your org/10.1016/j.marpolbul.2010.03.030
own nest: vessel noise increases biofouling. Biofouling Thompson PM, Brookes KL, Graham IM, Barton TR,
30(7):837–844. https://doi.org/10.1080/08927014. Needham K, Bradbury G, Merchant ND (2013)
2014.938062 Short-term disturbance by a commercial
Stimpert AK, Deruiter SL, Southall BL, Moretti DJ, two-dimensional seismic survey does not lead to
Falcone EA, Goldbogen JA, Friedlaender A, Schorr long-term displacement of harbour porpoises. Proc R
GS, Calambokidis J (2014) Acoustic and foraging Soc Lond Ser B Biol Sci 280(1771). https://doi.org/10.
behavior of a Baird’s beaked whale, Berardius bairdii, 1098/rspb.2013.2001
exposed to simulated sonar. Sci Rep 4:7031. https:// Thomsen F, Erbe C, Hawkins A, Lepper P, Popper AN,
doi.org/10.1038/srep07031 Scholik-Schlomer A, Sisneros J (2020) Introduction to
Strasser EH, Heath JA (2013) Reproductive failure of a the special issue on the effects of sound on aquatic life.
human-tolerant species, the American kestrel, is J Acoust Soc Am 148(2):934–938. https://doi.org/10.
associated with stress and human disturbance. J Appl 1121/10.0001725
Ecol 50(4):912–919. https://doi.org/10.1111/ Todd VLG, Todd IB, Gardiner JC, Morrin ECN,
1365-2664.12103 Macpherson NA, Dimarzio NA, Thomsen F (2015) A
Sun JWC, Narins PM (2005) Anthropogenic sounds dif- review of impacts of marine dredging activities on
ferentially affect amphibian call rate. Biol Conserv marine mammals. ICES (International Council for the
121(3):419–427. https://doi.org/10.1016/j.biocon. Exploration of the Seas). J Mar Sci 77(2):328–340.
2004.05.017 https://doi.org/10.1093/icesjms/fsu187
Swaddle JP, Page LC (2007) High levels of environmental Tougaard J, Carstensen J, Teilmann J (2009) Pile driving
noise erode pair preferences in zebra finches: zone of responsiveness extends beyond 20 km for
implications for noise pollution. Anim Behav 74(3): harbor porpoises (Phocoena phocoena (L.)). J Acoust
363–368. https://doi.org/10.1016/j.anbehav.2007. Soc Am 126(1):11–14. https://doi.org/10.1121/1.
01.004 3132523
Tarlow EM, Blumstein DT (2007) Evaluating methods to Turnbull SD (1994) Changes in masked thresholds of a
quantify anthropogenic stressors on wild animals. Appl harbor seal Phoca vitulina associated with angular
Anim Behav Sci 102(3):429–451. https://doi.org/10. separation of signal and noise sources. Can J Zool 72:
1016/j.applanim.2006.05.040 1863–1866. https://doi.org/10.1139/z94-253
Tavolga WN (ed) (1976) Sound reception in fishes. Tyack PL, Zimmer WMX, Moretti D, Southall BL,
Dowden, Hutchinson and Ross, Stroudsburg, PA Claridge DE, Durban JW, Clark CW, D’Amico A,
Tavolga WN, Popper AN, Fay RR (eds) (2012) Hearing DiMarzio N, Jarvis S, McCarthy E, Morrissey R,
and sound communication in fishes. Springer, Ward J, Boyd IL (2011) Beaked whales respond to
New York simulated and actual navy sonar. PLoS One 6(3):
Tennessen JB, Parks SE, Langkilde T (2014) Traffic noise e17009. https://doi.org/10.1371/journal.pone.0017009
causes physiological stress and impairs breeding Valero MD, Hancock KE, Maison SF, Liberman MC
migration behaviour in frogs. Conserv Physiol 2(1). (2018) Effects of cochlear synaptopathy on middle-
https://doi.org/10.1093/conphys/cou032 ear muscle reflexes in unanesthetized mice. Hear Res
Tennessen JB, Parks SE, Swierk L, Reinert LK, Holden 363:109–118. https://doi.org/10.1016/j.heares.2018.
WM, Rollins-Smith LA, Walsh KA, Langkilde T 03.012
(2018) Frogs adapt to physiologically costly anthropo- Vermeij MJA, Marhaver KL, Huijubers C, Nagelkerken I,
genic noise. Proc R Soc B Biol Sci 285(1891): Simpson SD (2010) Coral larvae move toward reef
20182194. https://doi.org/10.1098/rspb.2018.2194 sounds. PLoS One 5(5):e10660. https://doi.org/10.
Thode AM, Blackwell SB, Conrad AS, Kim KH, 1371/journal.pone.0010660
Marques T, Thomas L, Oedekoven CS, Harris D, Brö- Verzijden MN, Ripmeester EAP, Ohms VR,
ker K (2020) Roaring and repetition: how bowhead Snelderwaard P, Slabbekoorn H (2010) Immediate
whales adjust their call density and source level (Lom- spectral flexibility in singing chiffchaffs during experi-
bard effect) in the presence of natural and seismic mental exposure to highway noise. J Exp Biol 213(15):
airgun survey noise. J Acoust Soc Am 147(3): 2575. https://doi.org/10.1242/jeb.038299
2061–2080. https://doi.org/10.1121/10.0000935 Visser F, Curé C, Kvadsheim PH, Lam F-PA, Tyack PL,
Thomas JA, Friel B, Yegge S (2016) Restoring dueting Miller PJO (2016) Disturbance-specific social
behavior in a mated pair of buffy cheeked gibbons after responses in long-finned pilot whales, Globicephala
exposure to construction noise at a zoo through melas. Sci Rep. https://doi.org/10.1038/srep28641
playbacks of their own sounds. J Acoust Soc Am Wale MA, Simpson SD, Radford AN (2013a) Noise nega-
140(4):3415–3415. https://doi.org/10.1121/1.4970975 tively affects foraging and antipredator behaviour in
506 C. Erbe et al.

shore crabs. Anim Behav 86(1):111–118. https://doi. Williams R, Wright AJ, Ashe E, Blight LK, Bruintjes R,
org/10.1016/j.anbehav.2013.05.001 Canessa R, Clark CW, Cullis-Suzuki S, Dakin DT,
Wale MA, Simpson SD, Radford AN (2013b) Size- Erbe C, Hammond PS, Merchant ND, O’Hara PD,
dependent physiological responses of shore crabs to Purser J, Radford AN, Simpson SD, Thomas L, Wale
single and repeated playback of ship noise. Biol Lett MA (2015) Impacts of anthropogenic noise on
9(2):20121194. https://doi.org/10.1098/rsbl.2012.1194 marine life: publication patterns, new discoveries, and
Ward AI, Pietravalle S, Cowan DP, Delahay RJ (2008) future directions in research and management. Ocean
Deterrent or dinner bell? Alteration of badger activity Coast Manag 115:17–24. https://doi.org/10.1016/j.
and feeding at baited plots using ultrasonic and water ocecoaman.2015.05.021
jet devices. Appl Anim Behav Sci 115(3):221–232. Wisniewska DM, Johnson M, Teilmann J, Rojano-
https://doi.org/10.1016/j.applanim.2008.06.004 Doñate L, Shearer J, Sveegaard S, Miller Lee A,
Warnecke M, Chiu C, Engelberg J, Moss CF (2015) Siebert U, Madsen Peter T (2016) Ultra-high foraging
Active listening in a bat cocktail party: adaptive echo- rates of harbor porpoises make them vulnerable to
location and flight behaviors of big brown bats, anthropogenic disturbance. Curr Biol 26(11):
Eptesicus fuscus, foraging in a cluttered acoustic envi- 1441–1446. https://doi.org/10.1016/j.cub.2016.03.069
ronment. Brain Behav Evol 86(1):6–16. https://doi. Wollerman L, Wiley RH (2002) Background noise from a
org/10.1159/000437346 natural chorus alters female discrimination of male
Warren B, Fenton GE, Klenschi E, Windmill JFC, French calls in a neotropical frog. Anim Behav 63(1):15–22.
AS (2020) Physiological basis of noise-induced hearing https://doi.org/10.1006/anbe.2001.1885
loss in a tympanal ear. J Neurosci 40(15):3130. https:// World Health Organization (2011) Burden of disease from
doi.org/10.1523/JNEUROSCI.2279-19.2019 environmental noise: quantification of healthy life
Weir CR, Dolman SJ (2007) Comparative review of the years lost in Europe. World Health Organization,
regional marine mammal mitigation guidelines Copenhagen
implemented during industrial seismic surveys, and Wrege PH, Rowland ED, Thompson BG, Batruch N
guidance towards a worldwide standard. J Int Wildl (2010) Use of acoustic tools to reveal otherwise cryptic
Law Policy 10:1–27. https://doi.org/10.1080/ responses of forest elephants to oil exploration.
13880290701229838 Conserv Biol 24(6):1578–1585. https://doi.org/10.
Weisenberger ME, Krausman PR, Wallace MC, De Young 1111/j.1523-1739.2010.01559.x
DW, Maughan OE (1996) Effects of simulated jet Yelverton JT, Richmond DR, Hicks W, Saunders K,
aircraft noise on heart rate and behavior of desert Fletcher ER (1975) The relationship between fish size
ungulates. J Wildl Manag 60(1):52–61. https://doi. and their response to underwater blast. Lovelace Foun-
org/10.2307/3802039 dation for Medical Education and Research,
Wensveen PJ, Kvadsheim PH, Lam F-PA, von Benda- Albuquerque, NM
Beckmann AM, Sivle LD, Visser F, Curé C, Tyack Yi YZ, Sheridan JA (2019) Effects of traffic noise on
PL, Miller PJO (2017) Lack of behavioural responses vocalisations of the rhacophorid tree frog Kurixalus
of humpback whales (Megaptera novaeangliae) indi- chaseni (Anura: Rhacophoridae) in Borneo. RAFFLES
cate limited effectiveness of sonar mitigation. J Exp Bull Zool 67:77–82. https://doi.org/10.26107/RBZ-
Biol 220(22):4150–4161. https://doi.org/10.1242/jeb. 2019-0007
161232 Young BA (1997) A review of sound production and
Wever EG (1978) The reptile ear. Princeton University hearing in snakes, with a discussion of intraspecific
Press, Princeton. https://doi.org/10.2307/j.ctvbcd2f0 acoustic communication in snakes. J Pa Acad Sci
Wever EG (1985) The amphibian ear. Princeton Univer- 71(1):39–46
sity Press. https://doi.org/10.2307/j.ctt7zth8g Zhao L, Zhu B, Wang J, Brauth SE, Tang Y, Cui J (2017)
Williams R, Erbe C, Ashe E, Beerman A, Smith J (2014) Sometimes noise is beneficial: stream noise informs
Severity of killer whale behavioural responses to ship vocal communication in the little torrent frog Amolops
noise: a dose-response study. Mar Pollut Bull 79:254– torrentis. J Ethol 35(3):259–267. https://doi.org/10.
260. https://doi.org/10.1016/j.marpolbul.2013.12.004 1007/s10164-017-0515-y

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
Index

A Aircrafts, 225
Abiotic noise, 175, 459 Airgun, 210, 238, 249, 468–470
Absolute abundance, 348 Airplane, 226
Absorption, 160, 161, 187, 194, 195 Akaike’s information criterion (AIC), 346
Accelerometer, 18 Alarm calls, 98, 406
Acoustic adaptation hypothesis, 218, 236 Alder flycatchers, 173
Acoustic alarms, 234 Aliasing, 137
Acoustically hard, 165 Alligator, 222
Acoustically soft, 165 Alpacas, 366
Acoustic camera, 17, 43 Alternative hypothesis, 337
Acoustic communication, 187 Altruism, 406
Acoustic complexity index (ACI), 248 Ambient noise, 171, 187, 357
Acoustic diversity index (ADI), 247 Ambient sound, 228
Acoustic energy, 119 American bullfrog, 362
Acoustic environment, 217 Amplitude, 116
Acoustic evenness index (AEI), 247 Amplitude-modulation, 116
Acoustic habitat generalists, 218, 219 Amplitude sensitivity, 39
Acoustic habitat hypothesis, 218 Analog, 2, 137
Acoustic habitat specialists, 218, 219 Analog-to-digital converter, 3
Acoustic harassment devices, 465 ANCOVA, 341
Acoustic impedance, 195, 199, 422 Anechoic chamber, 360
Acoustic indices, 246 Angle of incidence, 195
Acoustic intensity, 120 Angular frequency, 200
Acoustic masking, 486 Animal choruses, 278
Acoustic mirage, 166, 169 Animal ethics permits, 88
Acoustic niche hypothesis, 218, 219 ANOVA, 341
Acoustic niches, 242 Antagonistic, 463
Acoustic power, 120 Antarctic blue whales, 236, 304
Acoustic scene, 219, 427 Antarctic minke whale, 273
Acoustic startle response, 360 Antennae, 471
Acoustic tag, 26 Anthropogenic noise, 175, 460
Acoustic trauma, 462 Anthropophony, 112, 218
Acoustic wavefront, 193 Anti-aliasing filter, 138
Active sonar, 21 Anti-masking strategies, 172, 187
Active sonar systems, 199 Antinode, 203
Active space, 177, 475, 482, 491 Antiphonal, 392
AD-converters, 43 Ants, 220
Adiabatic mode method, 205 Anuran, 220, 295, 296
Advertisement calls, 363 Appraiser, 393
Affected source level, 126 Approach calls, 423
Affiliative calls, 404 Aquatic biophony, 228
African elephant, 221, 296 Aquatic geophony, 232
Aggressive calls, 404 Aquatic soundscape, 227
Air absorption, 167 Aquatic technophony, 233

# The Author(s) 2022 507


C. Erbe, J. A. Thomas (eds.), Exploring Animal Behavior Through Sound: Volume 1,
https://doi.org/10.1007/978-3-030-97540-1
508 Index

Arctic fox, 371 Black lemur, 286


Artificial neural networks, 291 Black-tufted marmosets, 486
Atmospheric pressure, 116 Blainville’s beaked whales, 290, 295, 297
Atui swiftlet, 442 Blue whales, 278, 489
Audibility, 460 Blur-ratio, 162
Audience effect, 400 Boat, 226
Audio coding, 139 Bootstrap, 329
Audiogram, 187, 356 Bornean rock frog, 476
Auditory brainstem response, 357, 375 Bottlenose dolphin, 102, 284, 297, 300, 372, 378, 379,
Auditory evoked potentials, 358 432, 439, 490
Auditory nerve, 361 Bowhead whale, 462
Auditory scene analysis, 219 Brazilian free-tailed bat, 486
Auditory-vestibular nerve, 361 Broadband, 131
Australian grey swiftlet, 441 Brown long-eared bat, 429
Australian sea lion, 223 Bryozoans, 460
Autocorrelated, 331 Bubbles, 199
Autonomous recorder, 13 Budgerigar, 178, 360, 476, 482, 484
Autonomous underwater vehicles (AUVs), 25 Bullfrogs, 176
A-weighting, 129 Bumble bees, 220
Buntings, 297
B Burst-pulse sounds, 231
Backpropagation, 293 Butterfly, 473
Badgers, 486
Band-pass filter, 131 C
Bandwidth, 38, 40, 131 Calibration signal, 90
Barbastelle bats, 92 California sea lion, 372, 490
Barnacles, 460 Campbell’s monkeys, 271
Barred owl, 271 Canaries, 482
Bat, 160, 172, 222, 238, 271, 294, 300, 485 Car, 225, 226
Bat detectors, 19, 43 Cardioid, 44
Bayesian inference, 335 Catch trials, 366
Bayesian information criterion (BIC), 346 Categorical variables, 326
Beaked whales, 490 Cats, 486, 487
Beamforming, 22, 55, 147 Caustics, 201
Bearded seals, 98 Censored data, 343
Behavioral hearing threshold, 357 Center frequency, 131
Behavioral plasticity, 472 Central tendency, 326
Behavioral response, 465 Cephalopods, 469
Beluga whale, 187, 188, 239, 359, 367, 371, 372, 466 Cepstral coefficients, 283
Best hearing sensitivity, 358 Cepstrogram, 284
Beta distribution, 327 Chaffinches, 466, 483
Bias, 323 Characteristic impedance, 121
Bidirectional, 44 Chickens, 484
Big brown bat, 430, 460 Chiffchaff, 220
Bimodal acousto-vibrational communication, 398 Chimpanzees, 173
Binary coding, 3 Chinchilla, 356, 487
Binaural, 46 Chipmunks, 271, 288, 465
Binomial distribution, 327 Chorus, 482
Bioacoustic index, 246 Cicadas, 220
Biofouling, 470 Cichlids, 96
Biophony, 112, 218 Classical conditioning, 365
Biosonar, 193, 222 Classification tree, 284, 288
Biotic noise, 175, 459 Clicks, 231
Biotremology, 18, 390 Closest point of approach, 142, 197
Birds, 168, 219, 295 Cochlea, 485
Bit depth, 3, 138 Cochlear microphonic potential, 375
Bivariate, 332 Cockatoos, 302
Blackbird, 163 Cocktail party effect, 401
Black-capped chickadee, 292, 430, 483, 484 Coefficient of variation, 329
Index 509

Coherent, 124, 198 Curvature, 115


Comb-filter effects, 165 Cut-off frequency, 131
Common chiffchaffs, 225 Cuttlefish, 468, 469
Common dolphin, 231, 284 C-weighting, 129
Common eastern froglet, 475 Cylindrical spreading, 159, 193, 195
Common marmosets, 486
Common octopus, 377 D
Common yellowthroat, 88 Damselfish, 96
Communication, 390, 459 Datasheet, 94
Communication space, 224, 491 Daubenton’s bat, 421
Comodulation masking release, 172, 178, 491 3D audio, 17
Compact cassette, 8 3-dB bandwidth, 131
Compound action potential, 375 10-dB bandwidth, 131
Compressional waves, 199 3-dB duration, 132
Compressions, 113 10-dB duration, 132
Condenser microphone, 15, 42 Dear enemy effect, 173
Conditioned response, 102, 358, 359, 365 Decibels, 123
Conditioned stimulus, 365 Decidecades, 132
Conditioned suppression, 366 Deep isothermal layer, 190
Confidence intervals, 337 Deep neural networks, 293
Confounding effects, 326 Deep sound channel (DSC), 202
Confounding factors, 326 Dendrogram, 284
Confusion matrix, 280 Dependent variables, 326
Conservative response bias, 371 Depth sensor, 55
Construction equipment, 226 Descriptive studies, 321
Constructive interference, 164 Desensitization, 473
Context-dependent, 403 Desert mule deer, 486
Context-independent, 403, 410 Desert tortoises, 474
Continuous, 326 Designators, 393
Continuous variables, 326 Destructive interference, 164
Continuous wave, 115 Detection threshold, 187, 276
Contour, 115 Developmental plasticity, 472
Convective air currents, 171 Dialects, 402
Convergence zone, 202, 235 Difference limen, 379
Convergent evolution, 423 Diffraction, 162, 171, 201
Convolutional neural networks, 293 Diffractive scattering, 163
Copepod, 468, 471 Diffuse reflection, 161
Coqui frogs, 239 Digital, 3
Corals, 468, 470 Digital audio tape, 10
Correct detection, 370 Digital compact cassette, 10
Correct rejection, 370 Digital recorders, 10, 43
Cortical evoked responses, 357 Digital-to-analog converter, 3
Coterie, 406 Dinner bell effect, 460, 486
Cotton-top tamarins, 486 Dipole sound field, 478
Coupled mode method, 205 Directionality, 42
Covariates, 326 Directional microphone, 17
Crab, 229, 465, 468, 470 Directivity index, 187
Crabeater seals, 93 Discrete variables, 326
Credible intervals, 337 Discriminant function analysis, 287
Crepuscular, 401 Displacement from habitat, 466
Crickets, 178, 220, 466, 472 Distant shipping, 228
Critical bandwidth, 382 Distortion product otoacoustic emissions, 373
Critical ratio, 172, 177, 187, 381, 461, 477 Distracted prey hypothesis, 467
Crocodiles, 220 Distress calls, 406
Cross-correlation, 144 Dogs, 224
Cross-modal impacts, 467 Domain, 327
Crustaceans, 467, 468 Domestic kittens, 365
CTD measurements, 101 Doppler shift, 142, 226, 428
Cultural trait, 402 Doppler-shift compensators, 428
510 Index

Dose-response curve, 462, 489, 490 Extended source, 127


Downsweep, 115 External auditory meatus, 358
Downward refraction, 158, 166, 168, 169 Extrapolation, 325
Downwind, 169
Drumming, 396 F
Duck, 220 Fallow deer, 173
Duct, 201 False alarm, 280, 366, 370
Duetting, 98, 408 False killer whale, 101, 304, 432
Dusk chorus, 409 False negative, 339
Duty cycle, 427 False positive, 280, 339
Dwarf minke whale, 273 Far-field, 126, 127
Dynamic information, 403 Fast-field programs, 205
Dynamic microphones, 15, 42 Fear-potentiated startle, 362
Dynamic range, 39, 40 Feature vectors, 282
Dynamic range of hearing, 380 Field crickets, 472
Dynamic time-warping, 297 Field quantity, 123
File formats, 104
E Filter, 131
Earthquakes, 232, 233 Fin whale, 208, 236, 278
Eastern bluebirds, 480 Fish, 229, 245, 284, 466, 477
Eavesdropping, 400, 446 Fish chorus, 230, 243
Echolocation, 19, 222 Fixed effects, 342
Ecoacoustics, 218 Fletcher critical band, 382
Ecuadorian hillstar hummingbird, 220 Flies, 471
Eddies, 171 Foliage attenuation, 165
Effect size, 333 Foot-flagging frogs, 466
Egrets, 466 Forest canopy, 168
Egyptian fruit bat, 98, 430 Forest elephants, 486
Elastic seabeds, 199 Formant dispersion, 404
Electret condenser microphones, 16 Formants, 174
Electret microphones, 42 Forward masking, 383
Electrophysiological hearing threshold, 357 Fourier transform, 136
Electrophysiological noise, 357 Free-field, 126, 127
Electrostatic microphones, 42 Frequency, 113, 356
Elephant, 168, 174, 239, 250, 485 Frequency band, 131
Elephant seal, 463 Frequency dispersion, 300
Elk, 486 Frequency-division, 60
Empirical model, 329 Frequency domain, 134, 136
Emulation, 392 Frequency domain models, 200
End frequency, 115 Frequency-modulation, 115, 420
Energetic masking, 461 Frequency region of best sensitivity, 358
90% energy signal duration, 119, 211 Frequency resolution, 141
Energy threshold detector, 278 Frequency response, 38, 40
Entropy index, 247 Frequency selectivity, 380
Environmental conditions, 91 Frequency spacing, 139, 141
Equal loudness contours, 129 Frequency weightings, 128
Equal-power assumption, 188 Frequentist inference, 335
Error of anticipation, 369 Frog, 220, 474
Error of habituation, 369 Fruit fly, 176
Estimators, 329 Functionally referential signals, 406
European blackbirds, 483, 484 Fundamental frequency, 114, 220
European robins, 483, 484
Evanescent wave, 143 G
Evoked calling, 365 Gabor signals, 116
Ewes, 224 Gamma distribution, 327
Excess attenuation, 160 Gated recurrent unit, 293
Explanatory variables, 326 Gaussian beam tracing, 201, 207
Exploratory studies, 321 Gaussian distribution, 327
Explosions, 234 Gaussian linear regression model, 330
Index 511

Gaussian mixture models (GMMs), 295 Hierarchical modeling, 344


Generalized additive mixed models (GAMM), 342 High duty-cycle bats, 428
Generalized additive model (GAM), 342 High-pass filter, 55, 131
Generalized estimating equations (GEEs), 343 Hit rates, 366
Generalized least squares (GLS), 344 Homogeneity of variances, 341
Generalized linear mixed models (GLMM), 342 Honeybees, 220
Generalized linear models (GLMs), 341 Horse, 287
Gentoo penguins, 171 Horseshoe bat, 382, 486
Geometric spreading, 171, 193 House mouse, 356
Geophony, 112, 218 House wrens, 484
Geophysical, 112 Hoverflies, 220
Giant pandas, 486 Howler monkeys, 168
Gibbon, 98, 274, 466 Hummingbird, 284
Gleaning bats, 486 Humpback whale, 98, 105, 230, 231, 245, 271, 276, 304,
Gliders, 25 459, 489
Global positioning system (GPS), 48 Hurdle models, 343
Goldfinches, 88 Huygens’ principle, 157, 161, 162, 167
Goldfish, 365 Hydrophone, 21, 116
Go/no-go response, 367 Hydrostatic pressure, 116
Goodness-of-fit tests, 345 Hypercardioid, 44
Gorillas, 222
Graded calls, 405 I
Gramophone, 4 Icebreaker noise, 188
Grasshopper, 171, 466, 472 Identifiers, 393
Gray treefrogs, 172, 475 Image source, 197
Gray whale, 462, 465 Incoherent, 124
Grazing angle, 195, 198 Incus, 484
Greater bulldog bat, 426 Independent variables, 326
Greater horseshoe bat, 174, 428, 430 Indian elephant, 379, 380
Greater spear-nosed bat, 173 Indian false vampire bat, 429
Great tits, 482, 483 Indigo bunting, 96
Green sea turtle, 474 Inferential studies, 321
Green treefrog, 363, 382 Inferior colliculus evoked potentials, 375
Ground effect, 164, 165 Inflection point, 115
Ground squirrels, 98 Informational masking, 461
Grouper, 294 Information entropy, 279
Guinea pig, 178, 356, 360 Infrasonic, 221
Gulf corvina, 238 Infrasonic microphones, 43
Gunshots, 227 Infrasound, 41, 111, 485
Inner ear, 358
H Insects, 160, 220, 236, 471
Habituation, 465, 473 Institutional Animal Care and Use Committee (IACUC),
Hair cell, 467, 487 88
Hankel function, 206 Instrumentation microphones, 44
Harbor porpoise, 248, 377, 421, 429, 433, 436, 438, 439, Integrated Nested Laplace Approximation (INLA), 345
448, 465 Intensity, 194
Harbor seals, 98 Interactions, 330
Harmonic, 114, 406 Interference pattern, 164
Harp seals, 459 Invertebrates, 468
Hawaiian monk seal, 367, 369 Isosensitivity curve, 372
Hearing damage, 478 Isothermal layer, 192
Hearing loss, 176, 461, 485
Hearing threshold, 356 J
Heat capacity, 190 Jackhammer, 226
Helmholtz equation, 200, 204–206 Jays, 486
Herons, 466
Herring, 477 K
Heterodyning, 60 Kangaroo rats, 374
Hidden Markov model (HMM), 298 Katydids, 291
512 Index

Killer whale, 92, 98, 232, 292, 295, 297, 367, 466, 476, MANCOVA, 341
489 MANOVA, 341
King penguins, 173, 460 Marine mammal, 229
Kit foxes, 362 Markov Chain Monte Carlo (MCMC), 345
Koalas, 174 Marmoset monkeys, 175
Marmosets, 222
L Masked threshold, 356
Laboratory mice, 362 Masking, 358, 461, 472, 475, 491
Lamb, 224 Matched filters, 279
Larvae, 468, 470 Mating calls, 363
Larynx, 391 Mauthner cells, 361
Laser accelerometers, 18 Maximum frequency, 115
Laser-doppler vibrometer, 76 Maximum likelihood estimation (MLE), 336
Laser interferometers, 18 Mean-square sound pressure, 117
Laser microphones, 18 Measurement microphones, 18, 44
Law of reflection, 161 Mechanical turbulence, 170
Leafhoppers, 472 Mechanical wave forms, 395
Leaky modes, 205 Mel-frequency cepstrum, 284
Least-squares estimation (LSE), 336 Melon, 434
Leatherback sea turtles, 473 Metadata, 88
Leks, 409 Method of constant stimuli, 368, 381
Leopard frogs, 476 Method of limits, 368, 369, 381
Leopard seal, 231, 302 Mice, 274, 361, 373, 378, 487
Level quantity, 123 Micro-electrical-mechanical system (MEMS), 43
Liberal response bias, 371 Microphone, 116
Limiting ray, 167 Microphone array, 17
Linear regression models, 341 Microphonic potentials, 357
Line sources, 159 Middle ear, 358
Link function, 341 Miners, 484
Little brown bats, 173 MiniDisc, 10
Lizards, 474 Minimum frequency, 115
Lloyd’s mirror effect, 14, 197 Mink, 176
Lobster, 229, 468–470 Missed detection, 280, 370
Local extremum, 115 Mitigation, 463
Local maximum, 115 Mixed layer, 190, 192
Local minimum, 115 Mockingbirds, 106
Locust, 473 Mode shape, 203
Logbook, 93 Modified method of limits, 369
Loggerhead sea turtles, 474 Molecular relaxation, 160, 194
Logit function, 342 Molluscs, 467, 468
Log-link function, 342 Mongolian gerbil, 356, 363, 377
Lombard effect, 172, 174, 401, 466, 472, 475, 484, 486, Mongoose, 173, 467
491 Monophonic, 17
Longitudinal wave, 113 Mono recordings, 8
Longitudinal studies, 331 Moth, 422
Long short-term memory, 293 Motifs, 274
Long-term spectral averages, 243 Mountain sheep, 485, 486
Lossless compression, 139 Multi-collinearity, 345
Loudness, 117 Multilevel modeling, 344
Low duty-cycle bats, 427–428 Multiple regression, 341
Low-pass filter, 131, 160 Multivariate, 332
M-weighting, 130
M Myotis bats, 286, 289
Machine learning, 290, 293 Mysticete, 229, 488
Mackerel, 478
Magnetaphone, 6 N
Magnitude, 116 Narrowband, 131
Malleus, 484 Naval sonar, 186
Manatees, 465 Near-field, 126, 127
Index 513

Needle electrodes, 377 Pallid bats, 173


Negative binomial distribution, 327 Parabolic equation, 206, 207
Negative phonotaxis, 363 Parabolic reflector, 5, 17
Negative reinforcement, 366 Parameters, 326
Neotropical frog, 284 Parametric cluster analysis, 284
Nested variables, 344 Parametric tests, 340
Neuromasts, 477 Paraxial approximation, 206
Neutral response bias, 371 Particle acceleration, 120, 143
New River tree frog, 274 Particle displacement, 120, 143
New Zealand fur seal, 223 Particle motion, 79, 361, 477
Nightingales, 484 Particle velocity, 117, 120, 143
Nocturnal communication distances, 168 Passerines, 175
Noise, 111, 459 Passive acoustic monitoring, 22, 63
Noise control, 153 Passive bioacoustic studies, 37
Nonlinear dimensionality reduction, 290 Passive sonar, 21
Non-parametric tests, 340 Peak frequency, 131
Non-song sounds, 276 Peak sound pressure, 117
Normal distribution, 327 Peak sound pressure level, 117
Normalized difference soundscape index (NDSI), 248 Peak-to-peak sound pressure, 116
Normal mode model, 203–205, 207, 235 Peak-to-peak sound pressure level, 117
Null hypothesis, 337 Pearson’s correlation coefficient, 333
Numerical variables, 326 Penguins, 482
Nyquist frequency, 38, 137 Permanent threshold shift, 462
Phantom powering, 11
O Phase, 113
Ocean acidification, 239 Phonautograph, 3
Octave band levels, 239 Phonic lips, 434, 438
Octave bands, 132 Phonometer, 44
Octopus, 468 Phonotaxis, 472, 475
Odontocete, 229, 488 Phons, 129
Oilbird, 173, 430, 441, 442 Photosynthesis, 238
Old-World monkeys, 487 Phrases, 274
Omnidirectional, 157, 171, 175 Pickersgill’s reed frog, 225
One-choice experiment, 364 Piezoelectric, 51
One-tailed test, 337 Piezoelectric microphone, 16
One-third octave bands, 132 Piglets, 486
Open-reel recorders, 6 Pigtail macaque monkey, 383
Operant conditioning, 366 Pikas, 173
Optical microphones, 44 Pile driving, 234
Orange-eyed treefrog, 364 Pilot whale, 232, 283, 290, 295, 489
Organ of Corti, 461, 487 Pine bark beetles, 472
Orthopterans, 160 Pinna, 358
Ortolan buntings, 298 Pinnipeds, 488
Oscillograms, 27 Pipistrelle bats, 174
Oscilloscope, 19, 27 Pitch, 113
Otoacoustic emissions, 358, 373 Plague, 409
Otoliths, 477 Plankton, 470
Ototoxins, 488 Plug-in-power (PIP), 11, 42
Outdoor sound propagation, 156, 157 Plumbeous vireos, 483
Outer hair cells, 358 Point estimate, 329
Outlier, 325 Point source, 127, 158, 159
Oval squid, 377 Poisson distribution, 327
Overdispersed, 342 Polar bears, 223
Overtones, 114 Polar ice, 233
Oysters, 460 Polynomial, 342
Population consequences of acoustic disturbance (PCAD),
P 463
Pacific white-sided dolphin, 102–103, 284 Population consequences of disturbance (PCoD), 463
Pale spear-nosed bat, 223 Porpoises, 490
514 Index

Positive phonotaxis, 363 Ray models, 201


Positive reinforcement, 366 Ray propagation, 157
Posterior distribution, 336 Ray traces, 168, 169
Power analysis, 339 Ray tracing, 157, 158, 207
Power quantity, 123 Realizations, 327
Power spectral density (PSD), 244 Real-time spectrogram, 29
Power spectral density (PSD) percentiles, 141 Received level, 159, 186
Prairie dogs, 98, 292, 485 Receiver operating characteristic, 280, 370
Prawn, 468 Reception of mechanical signals, 396
Preamplifier, 51 Recurrent neural networks, 293
Precision, 324 Red deer, 174, 298
Precision-recall, 282 Redundancy, 391
Predation risk, 466 Reed warbler, 480
Predictive studies, 321 Reel-to-reel recorders, 6
Predictors, 326 Reflection, 161, 171, 187, 201, 235
Prepulse inhibition, 362 Reflexive responses, 360
Prepulse stimulus, 362 Reflex modifications, 362
Presbycusis, 176, 356 Refracted waves, 196
Prescriber, 393 Refraction, 166, 171, 187, 201, 235
Pressure-release boundary, 197 Regression coefficient, 330
Preyer reflex, 360 Regression models, 330
Primates, 222, 238, 361 Reinforcement regimen, 371
Principal component analysis, 286, 334 Relative humidity, 160
Principle of parsimony, 345 Repeated measures study, 331
Prior distribution, 336 Research permits, 88
Probability density, 142 Residuals, 336
Probability density function, 327 Response variables, 326
Probability distribution, 327 Reverberation, 103, 162, 300
Probability mass function, 327 Rhacophorid treefrogs, 475
Production of vibrational signals, 396 Rhinoceros, 274
Projector, 51 Right whale, 98, 231, 294
Propagation loss, 159, 186, 187, 193–195, 198, 199 Risso’s dolphins, 290, 295
Propagation modeling, 194 Ritualization, 391
Psychometric function, 368 Rms bandwidth, 131
Psychophysical tuning curves, 383 Road noise, 165
Pulse length, 119 Rock hyraxes, 274
Pure tone, 114 Rodents, 485
Pygmy blue whales, 237, 245 Root-mean-square sound pressure, 117
Ross seal, 231
Q Roughness, 199
Quail, 484 Ruffed grouse, 161
Rufous-sided towhee, 93
R
Radiated noise level, 126 S
Radio-frequency microphone, 42 Salinity, 192
Railway, 225 Salinity profile, 192
Rain noise, 224 Salmon, 478
Random effects, 342 Sampling frequency, 38, 137
Random error, 323 Sampling rate, 3, 38, 137
Random forest, 294 Sampling units, 323
Random variable, 326, 327 Savannah sparrows, 105
Range-dependent, 201 Scallop, 229, 469
Range-independent, 201 Scattering, 161, 171, 187
Range of hearing, 358 Scattering loss, 197
Rarefactions, 113 Sciurids, 93
Rat, 356, 361 Scops owls, 174
Rattlesnakes, 92 Sea ice, 192
Ray, 157, 201 Sea lions, 223, 232
Rayleigh roughness parameter, 197 Seals, 223, 232
Seal scarers, 465 Spatial release from masking, 172, 178, 189, 401, 476, 483, 491
Sea otters, 223
Sea turtles, 474 Species evenness, 246
Sea urchins, 229 Species-recognition, 96
Seismic airgun, 234, 302 Species richness, 246
Seismic communication, 396 Spectral density, 133
Seismic survey, 233, 238, 470 Spectral leakage, 140
Self-noise, 40 Spectral probability density (SPD), 244
Shadow zone, 167, 169, 201, 227, 235 Spectrogram, 27, 136, 240, 241
Shallow-water duct, 201 Spectrogram cross-correlation, 278
Shannon entropy, 279 Spectrogram equalization, 278
Shannon-Nyquist sampling theorem, 3 Spectrum, 131, 134, 136
Shear, 113 Specular reflection, 161
Sheep, 224 Speed of sound, 120, 165
Ship noise, 240, 245 Spermaceti organ, 434
Shotgun microphones, 17 Sperm whale, 300, 435, 436
Shrew, 445 Spherical spreading, 159, 193, 195
Shrimp, 228, 229, 239, 470 Spline, 342
SI base units, 112 Spontaneous otoacoustic emissions, 373
Sidelobes, 134, 140 Spotted dolphins, 297
Signal, 112, 393 Sprat, 478
Signal-to-noise ratio, 171, 186, 459 Spreading loss, 187
Significance level, 339 Spring peepers, 236
Sinusoidal, 117 Squid, 468, 469
SI system, 112 Squirrelfish, 376
Smartphone, 32 Squirrels, 238, 271, 466
Smooth functions, 342 Standard atmospheric pressure, 160
Snakes, 220 Stapedial reflex, 487
Snapper, 478 Stapes, 484
Snapping shrimp, 240, 459 Starling, 178, 482
Snell’s law, 166, 196, 201, 227, 235 Start frequency, 115
Snowmobiles, 226 Startle response, 477
Social adaptation, 402 Static information, 403
SOFAR channel, 202, 233, 235 Statistical controls, 323
Sonar equation, 186, 187, 208 Statistical decision theory, 347
Song, 219, 231, 274, 407 Statistical inference, 323
Songbirds, 274, 297 Statistical population, 323
Sonobuoy, 54, 101 Statistical power, 322
Sonogram, 27 Statocyst, 467–470
Sonoran pronghorn, 485 Statolith, 467, 468
Sound, 111, 113 Stereocilia, 487
Sound-attenuating chamber, 360 Stereophonic, 17
Sound detection, 363 Stereo recording, 8
Sound discrimination, 363 Stickleback, 362
Sound exposure, 118 Stonefly, 422
Sound exposure level, 118 Streaked tenrecs, 445
Sound field, 158 Stress, 461, 467, 486, 491
Sound level meter, 44 Structure-borne, 112
Sound localization, 366 Substrate-borne, 112
Sound maps, 170 Substrate-borne vibrations, 75
Sound pressure, 116 Suction-cup electrodes, 377
Sound pressure level, 117, 159, 239 Support vector machines, 284
Sound propagation, 156, 227, 239, 245 Surface duct, 192, 201, 235
Soundscape, 112, 141, 217, 218 Survival models, 343
Sound speed, 192, 199 Swallow, 481
Sound speed profile, 101, 166, 168, 192 Swiftlet, 430, 442
Source level, 126, 159, 186, 187 Syllables, 274
Source-path-receiver model, 153 Symbolic, 411
Southern brown treefrog, 475 Synapses, 487
Southern right whale, 273 Synaptopathy, 487
Sparrows, 96 Synergistic, 463
Systematic error, 323 Ultrasonic microphones, 17, 43
Ultrasound, 41, 43, 111, 485
T Uncertainty principle, 140
Tail-flip reflex, 361 Unconditioned response, 357, 359, 365
Tail-to-signal ratio, 162 Unconditioned stimulus, 365
Target motion analysis, 149 Unidirectional, 44
Target strength, 189 Uniform manifold approximation and projection, 290
Tawny owls, 173, 224 Univariate analyses, 332
t-distributed stochastic neighbor embedding, 290 Unmanned surface vessels (USVs), 25
Teager–Kaiser energy operator, 280 Unmasked threshold, 356
Technophony, 218 Up/down staircase method, 368–370, 381–382
Telegraphone, 4 Upsweep, 115
Teleost fish, 361 Upward refracting, 166, 168
Temperature, 192 Upwind propagation, 167
Temperature inversion, 168, 227
Temperature lapse, 167 V
Temperature profile, 168, 169, 192 Variables, 326
Temporary threshold shift, 462 Vector sensors, 55
Tenrec, 445 Velvet ants, 220
Terminal buzz, 420, 423 Ventriloquial calls, 406
Terrestrial biophony, 219 Vespertilionid, 406
Terrestrial geophony, 225 VHS tape recorders, 9
Test statistics, 337 Vibrational behavior, 395
Theoretical model, 329 Vibrometers, 18
Thermal turbulence, 171 Vibroscape, 398
Thermocline, 55, 190, 192 Video tape recorder, 9
Thunderstorm, 224 Vireo, 88
Timberline wrens, 275 Virtual source, 197
Time-bandwidth product, 187 Viscous friction, 165
Time difference of arrival (TDOA), 144 Vocalization, 396
Time domain, 134, 136 Volcanic eruptions, 233
Time domain models, 200 Volumetric array, 55
Time series, 136 Voucher specimens, 106
Torrent frogs, 475 48-V phantom power, 42
Tortoises, 220, 473
Total internal reflection, 196 W
Towed array, 22, 54 Walruses, 223, 232
Transient, 119 Warbler, 294
Transient otoacoustic emissions, 373 Wasps, 220
Transmission loss, 159 Water-borne, 112
Transverse, 113 Wave equation, 156, 200
Treefrog, 176, 178, 475 Waveform, 134
Tree mice, 446 Wavefronts, 157
Tree-shrews, 238 Waveguide, 201, 235
Trucks, 225 Wavelength, 114, 156, 161, 197, 227
True positives, 280 Wavenumber, 205
Truncated regression, 343 Wavenumber integration, 205, 207
Tubeworms, 460 Wave rays, 157
Túngara frogs, 174, 475 Wax-cylinder recorder, 3
Turtles, 473 Weber fraction, 380
Two-choice experiment, 364 Weddell seal, 92, 98, 174
Two-tailed test, 337 Weighting function, 488
Tympanal hearing, 471 Whale choruses, 243
Type I error, 339 Whale-watching boat, 186
Type II error, 339 Whistles, 231
Types of communication, vibrational signaler, 398 White-crowned sparrows, 481, 484
White-footed mice, 465
U White noise, 67
Ultrasonic, 220, 222 White-throated sparrow, 271
Wind noise, 224, 228, 232, 240 Zebrafish, 362
Windsock, 48 Zero-crossing detection, 19
Wind velocity profile, 168, 169 Zero-crossing points, 60
Wireless microphones, 44 Zero-inflated models, 343
Wolves, 302, 486 Zero-padding, 141
Wood frogs, 476 Zero-to-peak sound pressure, 117
Woodland caribou, 486 Zero-to-peak sound pressure level, 117
Zooplankton, 470
Z Zoosemiotics, 389
Zebra finches, 297, 460, 481, 482, 484 Z-weighting, 129
