Exploring Animal Behavior Through Sound: Volume 1
Methods
Christine Erbe • Jeanette A. Thomas
Editors
Christine Erbe
Centre for Marine Science and Technology
Curtin University
Perth, WA, Australia

Jeanette A. Thomas (deceased)
Moline, IL, USA
Cover photo: Acoustic recording of an Adélie penguin colony at Brown Bluff, Antarctic Sound
(© Ole Næsbye Larsen)
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
In loving memory of
Jeanette A. Thomas,
A pioneer of animal bioacoustics,
A role model, mentor, colleague,
And dear friend to many of us.
We miss you, Jeanette.
The ASA Press
Editorial Board
Mark F. Hamilton (Chair), University of Texas at Austin
James Cottingham, Coe College
Timothy F. Duda, Woods Hole Oceanographic Institution
Robin Glosemeyer Petrone, Threshold Acoustics
William M. Hartmann (Ex Officio), Michigan State University
Darlene R. Ketten, Boston University
James F. Lynch (Ex Officio), Woods Hole Oceanographic Institution
Philip L. Marston, Washington State University
Arthur N. Popper (Ex Officio), University of Maryland
Christine H. Shadle, Haskins Laboratories
G. Christopher Stecker, Boys Town National Research Hospital
Stephen C. Thompson, The Pennsylvania State University
Ning Xiang, Rensselaer Polytechnic Institute
Preface
The idea for this textbook on Animal Bioacoustics was Jeanette’s. She reached out to bioacousticians working on the different animal taxa and received great interest in this book. Experts from around the globe joined her effort, developing chapters on bioacoustic studies on the diverse animal taxa, from invertebrates and insects, to amphibians, reptiles, fishes, birds, and mammals. It soon became obvious that the developing chapters relied on common background knowledge, techniques, and terminology. The need for a volume on methods to precede the volume on taxon-specific bioacoustic studies was identified, and this is when I came on board.
In this volume, Chapter 1 presents a brief history of bioacoustic recording and equipment. Chapter 2 provides guidance on choosing and calibrating equipment. Chapter 3 explains how to collect bioacoustic data in the field and laboratory, and what metadata are important to document. Chapter 4 introduces basic acoustic concepts, standard terminology, quantities and units, and basic signal processing methods. Chapter 5 delves into the source–path–receiver model, applied to terrestrial bioacoustic studies, with a comprehensive treatise on sound propagation in terrestrial environments. Chapter 6 is devoted to the intricacies of sound propagation under water. Chapter 7 explores terrestrial and aquatic soundscapes and introduces basic analysis tools. Chapter 8 gives an overview of software algorithms for automated detection and classification of animal sounds. Chapter 9 unravels analytical and statistical methods for analyzing bioacoustic data. Chapter 10 presents behavioral and physiological methods for studying animal hearing. The final three chapters apply the tools presented in the first ten chapters to taxon-overarching topics. Chapter 11 explores animal acoustic and vibrational communication. Chapter 12 provides an overview of echolocation in bats, dolphins, birds, and shrews. Finally, Chapter 13 gives examples of the effects of noise on animals.
The intended audience includes students and researchers of animal ecology and, specifically, animal behavior, who wish to add acoustics to their toolbox. Environmental managers in industry and government, members of non-governmental organizations concerned with animal conservation, and regulators of noise might equally find the book useful. The book will empower its readers to understand and apply the bioacoustic research literature, design their own studies in the field and laboratory, avoid common pitfalls and mistakes, choose appropriate equipment, apply different data analysis methods, correctly interpret their data, adequately archive data for future applications, and apply their results to management and conservation.
I would like to thank Keith Attenborough, Jay Barlow, Ross Chapman,
Russ Charif, Kurt Fristrup, Karl-Heinz Frommolt, Bob Gisiner, Alan Grinnell,
Shane Guan, Shizuko Hiryu, Dorian Houser, Vincent Janik, Colleen LePrell,
Peter Narins, Eric Rexstad, James Simmons, Hans Slabbekoorn, and Meta
Virant-Doberlet for reviewing one or more chapters in this volume.
A special thank-you goes to Lars Koerner at Springer Verlag in Heidelberg
for his emotional, technical, and editorial support throughout the years, in
particular the final year.
Open access to this book was mostly funded by the Richard Lounsbery
Foundation, as a contribution to the International Quiet Ocean Experiment.
The remainder of fees was covered by the Centre for Marine Science and
Technology at Curtin University, the Cornell Lab of Ornithology, and
l’Université de Toulon. Thank you!
Jeanette A. Thomas was a pioneer of animal bioacoustics. She successfully
straddled both terrestrial and aquatic worlds, studying animals from the
tropics to the poles. This book is a testament to her legacy.
The original online version of this Frontmatter and backmatter was revised.
1 History of Sound Recording and Analysis Equipment
Gianni Pavan, Gregory Budney, Holger Klinck, Hervé Glotin,
Dena J. Clink, and Jeanette A. Thomas
most cases, and depending on the focal animals, the age, sex, reproductive status, behavior, activity patterns, and even health of an individual may be estimated from acoustic recordings. Acoustic data can be used to estimate the population density of vocal animals, and dialects can indicate the geographic boundaries of a population. However, density estimation by acoustics is still in its infancy, and will require further advancement in the spatial analysis of the acoustic environment by using multiple sensors to become reliable and widely applicable. At the community level, the entire acoustic environment or soundscape can be used to estimate species abundance and biodiversity. Changes in vocal behavior can be indicative of environmental stressors, such as anthropogenic noise or habitat degradation (Pavan 2017).
Originally, sounds of terrestrial animals were studied with equipment and methods developed for military needs, human speech analysis, and music processing (Koenig et al. 1946; Potter et al. 1947; Marler 1955). Later, scientists became interested in the sounds of aquatic animals, and underwater research was facilitated by technologies used by navies to monitor the noise made by ships and submarines. Because of the frequency limitations of transducers (i.e., microphones and hydrophones), recorders, and analysis equipment, most initial bioacoustic research was conducted in the sonic range (i.e., the frequency range audible to humans: 20 Hz–20 kHz). Even in the early stages of the digital revolution, both recorders and analysis equipment were generally limited to audible frequencies.
A major hurdle for collecting field recordings was the large size and weight of early analog equipment, along with high power consumption, which resulted in limited recording time. The development of smaller, lightweight recording devices made the collection of acoustic data significantly easier. Currently, with the advent of small digital recorders with large solid-state memories, anyone, including researchers, professionals, and amateurs, can collect large amounts of high-quality acoustic data continuously over extended periods. However, when using handheld recorders, the potential influence of the human observer on the animals’ acoustic behavior is a concern. Through the development and use of autonomous recorders, video cameras, and acoustic animal tags, human observer effects can be minimized, and unsupervised data collection over extended periods (days to months) and in remote locations is now possible.
In this chapter, we describe the history of the development of transducers, recorders, and sound analyzers, along with the advances that these developments facilitated in the field of bioacoustics. Recording equipment can now capture a wide range of frequencies, from infrasounds to ultrasounds (sounds below and above the range of human hearing, respectively), and is used in a wide range of applications, from the study of individuals and populations to entire soundscapes. The digital revolution in sound recording and analysis allowed for significant advances in the field of bioacoustics (Obrist et al. 2010) and resulted in the development of new disciplines, such as computational bioacoustics (Frommolt et al. 2008), acoustic ecology, soundscape ecology (Pijanowski et al. 2011a, b; Farina 2014), and ecoacoustics (Farina and Gage 2017). An overview of acoustic principles and the evolution of sound recording systems for musical applications is given in Rumsey and McCormick (2009) and in Rossing (2007).
1.2 Advances in Recorders
The most significant advancement in recording technology was the switch from analog to digital devices. A reduction in size and weight of the recorder, extended battery life, rechargeable batteries, more stable and larger capacity storage media, broader frequency range, and accessibility of a computer interface accompanied this transition. Together, these advances provided bioacousticians with an adaptable system for recording a variety of species, greater field portability, and generally more affordable high-quality equipment.
To understand the basic differences between analog and digital recorders, a clear explanation of the terms is necessary. Humans perceive the world in analog; this means that everything is
seen and heard as a continuous flow of information. In contrast, digital information estimates analog data by taking samples at discrete intervals and describing the sample values as a finite number represented by binary coding (Pohlmann 1995). For instance, while a vinyl record player (phonograph) is analog, a CD player is digital. A phonograph converts groove modulation from a vinyl record into a continuous electrical signal, whereas a CD player reads a pit structure that is interpreted as a series of ones and zeros (bits) that is typical of binary coding. Likewise, a video cassette recorder (VCR) is analog, yet a digital videodisc (DVD) player is digital. A VCR reads audio and video data from a tape as a continuous variation of magnetic information, whereas a DVD player reads ones and zeros from a disc similar to a CD.
Digital devices can approximate analog audio or video signals with an accuracy level that is dependent on both sampling rate and bit depth (or the number of bits in each sample). The Shannon-Nyquist sampling theorem proves that, for a given frequency range, a sampling rate greater than twice the highest frequency can capture all information in that frequency band, enabling perfect reconstruction of the analog waveform.
With proper sampling, analog signals can be transformed into the digital domain at a level that makes them indistinguishable from the original. A significant advantage of digital data is that it can be stored and manipulated more easily than analog recordings. With analog recorders, each copy produces a little degradation that accumulates through multiple successive copies. Analog tapes are also prone to degradation with time. Digital copies are a perfect duplication that is indistinguishable from the original, unless specific data codes are added to identify them. More importantly, digital recordings can be directly transferred to a computer for processing or transferred through the Internet to be shared among different laboratories. If researchers want to transfer audio or video files from old analog tapes so they can be recognized and processed by a computer, they must use a sound interface based on an analog-to-digital converter (AD-converter) to digitize the analog signal and transform it into a sequence of numbers.¹ For playing back sounds from a computer, a sound interface with a digital-to-analog converter (DA-converter) is required.
¹ Analog Definition and Meaning: www.webopedia.com/TERM/A/analog.html; accessed 24 Oct 2021.
Next, we outline a brief history of the evolution of analog and digital recording devices. For more detail on digital recording technologies, see Pohlmann (1995).
1.2.1 Analog Recorders
The first purported sound recording was made by Édouard-Léon Scott de Martinville and dates back to 1860. The recording was just a few seconds in duration and was made using a phonautograph. The phonautograph has a vibrating stylus, which moves on soot-covered paper to draw the sound waveform.² It was invented in 1857, and although it could record sounds, it never evolved to allow reproduction of the recorded sound.
² The Phonautograms of Édouard-Léon Scott de Martinville: http://www.firstsounds.org/sounds/scott.php; accessed 24 Oct 2021.
In the 1870s, Thomas Edison invented the wax-cylinder recorder (Figs. 1.1 and 1.2), which had a vibrating diaphragm that was mechanically linked to a needle that sculpted grooves. It initially recorded on aluminum foil and then on a wax layer covering the cylinder, as it was slowly rotated and translated on a screw axis. This device encoded the sound vibrations into modulations of the groove and then allowed playback of the recorded vibrations through the same needle-membrane system.
According to Ranft (2001), the first known recordings of animal sounds (a caged Indian bird, the Common Shama) were made in Germany in 1889 on an Edison wax-cylinder. One of the first known scientific studies of animal sounds occurred in 1892 when Richard Lynch Garner recorded primates on wax cylinders at a zoo in the USA (Garner 1892). Garner also
experimented with the playback of the recordings to observe the primates’ reactions.
Fig. 1.1 Thomas Alva Edison and his phonograph. Image source: https://commons.wikimedia.org/wiki/File:Edison_and_phonograph_edit2.jpg, by Levin C. Handy (per http://loc.gov/pictures/resource/cwpbh.04044/), public domain, Wikimedia Commons
The first flat disc was invented in the late 1870s, which provided an advantage over previous technology as the discs could be easily replicated. Then in 1887, Emile Berliner patented a variant of the phonograph, named the gramophone, which used flat discs instead of spinning cylinders (Fig. 1.3). Sounds were recorded on a disc as modulated grooves, with a system similar to the one developed by Edison for wax-cylinders. The first published recording of a bird sound was issued in 1910 in Germany, and the first radio broadcast of a singing bird was in Britain in 1927 (Ranft 2001).
Valdemar Poulsen, a Danish engineer, invented the telegraphone or wire recorder in 1898 (Poulsen 1900). Wire recorders were the first magnetic recording devices, and they utilized a thin metallic wire, which passed across an electromagnetic recording head. Each point along the wire was magnetized based on the intensity and polarity of the signal in the recording head. Wire recorders often had problems with kinks in the
wires, but editing was relatively easy as sections of wire could simply be cut out.
Fig. 1.2 Photographs of an Edison wax-cylinder player (left) and a wax-cylinder recording (right). Image sources: (left) https://commons.wikimedia.org/wiki/File:EdisonPhonograph.jpg, by Norman Bruderhofer, www.cylinder.de, CC BY-SA 3.0 http://creativecommons.org/licenses/by-sa/3.0/, via Wikimedia Commons; (right) https://commons.wikimedia.org/wiki/File:Bettini_1890s_brown_wax_cylinder.jpg, by Jalal Gerald Aro, CC BY-SA 2.0 https://creativecommons.org/licenses/by-sa/2.0, via Wikimedia Commons
In the early 1900s, RCA Victor developed the Victrola, which played records or albums that were readily available to the general public. Sounds were recorded as modulated grooves on a disc, and this disc was used to produce a master metallic plate where the grooves appeared as ridges. Albums were then produced for distribution by molding copies using the master plate and Bakelite (or synthetic plastic) material. In 1920, AT&T invented the Vitaphone, which recorded and reproduced sounds as optical soundtracks on photographic film; the film impression was made with a thin beam of light modulated by the sound.
Arthur Allen, the founder of Cornell University’s Laboratory of Ornithology, and Peter Kellogg made the first recordings of wild birds in 1929 at a city park in Ithaca, NY, USA. Albert R. Brand (a graduate student of Allen) and M. Peter Keane built the first equipment for recording in the field. Together, they recorded over 40 bird species within the first two years. With World War I parabola molds available from the Physics Department, Keane and True McLean (a professor in Electrical Engineering at Cornell) constructed a parabolic reflector to improve recording of bird songs in the field³ (Ranft 2001). In those years, Theodore Case of Fox Case Corporation approached Arthur Allen to record singing wild birds and demonstrate the sound-synchronized film technology. Under the guidance of Allen, a Fox Case Corporation crew filmed and recorded the songs of wild birds in North America (Little 2003). Today, two of those recordings can be heard on the Macaulay Library website.⁴ After a successful campaign with the Fox Case film crew, Allen and his colleague Peter Paul Kellogg recorded the sounds of wildlife for research and education purposes. The Library of Natural Sounds (now known as the Macaulay Library) began in 1930 at the Cornell Laboratory of Ornithology. In 1932, Allen and Kellogg used visual and audio recordings to demonstrate to the American Ornithologists’ Union that the ruffed grouse (Bonasa umbellus) produced drumming sounds (Little 2003).
³ Macaulay Library: Early milestones (1920–1950): https://www.macaulaylibrary.org/about/history/early-milestones/; accessed 24 Oct 2021.
⁴ Macaulay Library: listen to recordings of Rose-breasted Grosbeak https://macaulaylibrary.org/asset/16968 and a Song Sparrow https://macaulaylibrary.org/asset/16737; accessed 11 Oct 2021.
In 1935, Cornell biologists
carried out an expedition to record the sounds of vanishing bird species, including the ivory-billed woodpecker (Campephilus principalis), for which they used a mule-drawn wagon to transport recording equipment into the field (Fig. 1.4).⁵ Even with limited space and harsh conditions, Alton Lindsay, in 1934, took a phonograph recorder on the Little America Expedition to Antarctica and made recordings of airborne sounds from Weddell seals (Leptonychotes weddellii), available today at the Smithsonian Institution.
⁵ Macaulay Library: listen to the ivory-billed woodpecker recording made with an optical film recorder https://macaulaylibrary.org/asset/6784; accessed 11 Oct 2021.
In the late 1930s, a German company invented the Magnetophon, which was based on the same principle as the magnetic wire recorder, but instead of wire, it had long, thin strips of paper impregnated with fine particles of iron oxide that were drawn across an electromagnetic head. After World War II, the American company Ampex perfected the German technology by replacing paper with a thin plastic film. For almost 50 years, reel-to-reel magnetic tape was the standard medium for use on recorder/playback devices (Fig. 1.5). Reel-to-reel recorders (or open-reel recorders) used variable tape speeds to record different frequency ranges, with faster recording speeds providing higher-frequency recordings. Another American company, a contemporary of Ampex, the Amplifier Corporation of America, was one of the first companies to develop a truly portable reel-to-reel recorder, the Magnemite 610, which was introduced in 1951 and was
used by many pioneers in the field of bioacoustics. Figure 1.6 shows Peter Paul Kellogg using a 1950s Magnemite 610 recorder with a Western Electric 633 microphone mounted in a parabolic reflector.
Fig. 1.4 Photograph of ornithologist Peter Paul Kellogg in 1935 in a mule-drawn wagon used to haul an amplifier (center) and optical film recorder (on the right) to capture the sounds of ivory-billed woodpeckers in the Singer Tract, Madison Parish, Louisiana. Image by Arthur A. Allen courtesy of the Cornell Laboratory of Ornithology
Fig. 1.5 Open-reel recorder made by AEG (1939). Image source: https://commons.wikimedia.org/wiki/File:AEG_Magnetophon_K4_1939.jpg, by Friedrich Engel, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons
Initially, tape recordings were mono recordings with one soundtrack on the tape. Stereo recording techniques (providing two record/playback channels) were developed in the 1960s. Initially, these recorders were bulky and not field portable. Then, portable open-reel recorders were developed for the rapidly growing outdoor recording needs of the radio, music, and film industries. Stereophonic recorders allowed the recording of two synchronous signals on parallel tracks onto one tape. In bioacoustics applications, often one track was used by the recordist for comments and the second track for recording animal sounds.
In the 1970s and 1980s, the most common reel-to-reel recorders used by bioacousticians were the Nagra III and IV series and the Uher 4000 series. They offered multiple recording and playback speeds (depending on the models, 3.75, 7.5, 15, or 30 inches per second), were relatively lightweight, ruggedized, and battery powered, which meant they were better suited for field studies. Eventually, recorders had even more channels (as many as 24 in some music-recording studios), which enabled scientists to record and play back signals simultaneously from more than one acoustic sensor.
Recorders were also developed to record a wide range of frequencies. Studies by Griffin (1944), Sales and Pye (1974), and Au (1993) provided evidence that animals (bats and dolphins) produce a wide range of ultrasonic signals. The first recordings of ultrasonic echolocation signals from bats and dolphins were made on expensive dedicated tape recorders at very fast tape speeds (60 and 120 inches per second). Among them, the RACAL Store4DS recorder was used in the 1980s and 1990s, and it provided tape speeds up to 60 inches per second to record frequencies up to 300 kHz. It was battery powered and reasonably portable. However, the limited data storage capacity of these magnetic reels meant that the recordings lasted only a few minutes.
In 1964, Philips introduced the compact cassette tape, which comprised a small plastic case holding two small reels with 1/8-inch wide
magnetic tape running at 4.76 cm/s (1.875 inches per second). In the 1970s, analog cassette recorders, which could easily record and play back sounds, became available at affordable prices, but were used primarily for music and human speech, and were thus limited in frequency to the human hearing range. These recorders (Fig. 1.7) were much smaller and less expensive than reel-to-reel devices. Cassette tapes could record up to one hour on each side of the cassette (typical total recording duration was either 60, 90, or 120 min), but tapes were very thin and fragile, which made them prone to print-through (the magnetic transfer of a recorded signal to adjacent layers of tape). In 1976, Sony introduced, with little success, the Elcaset, a bigger cassette with 1/4-inch tape running at 9.5 cm/s. Today, however, it is almost impossible to find new reel-to-reel or cassette tapes as there are very few manufacturers of these media.
One of the advantages of tape recording was the possibility to play back the tapes at a speed lower or higher than the original recording speed. This way it was possible to lower the frequency of recorded ultrasonic signals to the human hearing range, thus making them audible (and longer in duration); conversely, recordings of infrasounds were played at higher speed to make them audible (and shorter in duration). The same trick can now be done easily with digital systems. Playbacks are a commonly used experimental approach in bioacoustics, wherein previously recorded sounds are broadcast to the animals of interest. Many playback studies used magnetic tape recordings containing animal sounds as the stimuli.
Researchers could easily play the sound backward (by reversing the reading direction of a spliced tape) or insert a section of tape containing sounds of another species, individual, or noise as a control stimulus. Magnetic tape was also used to record live video images. The first practical video tape recorder (VTR) was built in 1956 by Ampex Corporation. The first VTRs were reel-to-reel recorders used in television studios, which made recording for television cheaper and easier.
VHS tape recorders, introduced in the 1970s, were the first compact analog devices to record both audio and video signals simultaneously on the same tape. Commercial video cameras quickly became available for home use. Battery power for cassette recorders and VHS cameras/recorders made this equipment popular for field studies of animal behavior and sounds.
Many magnetic analog recordings had problems because the media deteriorated when tapes were not stored under properly climate-controlled
conditions. Unfortunately, some older analog recordings have been lost, or, in some cases, the players are not available to retrieve the recorded sounds. In the last decades, a great effort was made by major sound libraries to preserve old recordings (on wax-cylinders, discs, magnetic tapes, and cassettes) and to transfer them to safer digital storage (Ranft 1997, 2001, 2004). This was often not an easy task because magnetic tape recordings used a large variety of tape types, speeds, and track format arrangements. Unfortunately, many valuable tape recordings have yet to be converted to a digital format and archived. Without a long-term preservation strategy and support, it is possible that these media may be lost forever.
1.2.2 Digital Recorders
The introduction of the CD by the music industry in 1983 brought digital audio to the consumer market and started a new audio recording age (Pohlmann 1995). The ability to store sound in a digital format greatly improved acoustic data collection. It allowed easy and perfect replication of recordings, enabled accurate digital editing, and provided the means of more permanent data storage with direct access for processing and analysis by a computer.
In 1987, Rotary Digital Audio Tape (R-DAT or DAT) recorders became the first widely available digital recorders (Fig. 1.8). However, these devices still recorded on a thin magnetic tape encapsulated in a small cassette using a rotating helical-scanning magnetic head, which allowed for much faster head-to-tape speed and higher data density. Many R-DAT recorders allowed recording at different sampling rates of 32.0, 44.1, or 48.0 kHz and 16-bit resolution (the CD standard is 44.1 kHz, 16 bit) (Pohlmann 1995). The R-DAT format had little success in the consumer market because of the high cost but was used widely by professional recordists as a replacement for expensive and bulky open-reel recorders.
Fig. 1.8 (a) Photograph of a portable R-DAT recorder Sony TCD-D7 (1992) with a DAT cassette and the optical cable to provide digital data transfer to a PC. (b) A MiniDisc recorder and disc (1997)
Some specialized R-DAT models allowed recording up to 100 kHz on a single channel (i.e., by using a 204.8 kHz sampling frequency and doubled tape speed). R-DAT offered recording quality that was comparable to open-reel recorders; however, the helical-scanning head proved problematic in humid conditions, and the thin tape used in R-DAT cassettes was easily damaged. An alternative to R-DAT was the digital compact cassette (DCC) introduced by Philips in 1992. DCC was compatible with the already existing analog cassette tapes but failed to gain commercial success.
Digital recorders with optical discs (CD-R and DVD-R) never gained popularity for field applications because the equipment had to remain stationary while recording. Also, at the same time, magnetic discs (hard drives) quickly became the state-of-the-art data storage media. In contrast, the MiniDisc (MD), a small optical disc developed and marketed by Sony in 1992, had more success among nature recordists,
1 History of Sound Recording and Analysis Equipment 11
because the MD portable recorders were smaller, have quiet microphone preamplifiers, several
lighter weight, and much cheaper than DAT types of powering options and can have up
recorders. MD offered random access to the to 8 channels. Most pocket recorders lack the
recordings (DAT and analog tape recorders phantom powering required for professional
allowed only sequential access), which made it microphones, but can power external
much easier to find and listen to specific sections microphones at low voltage (Plug-In-Power, or
of a recording. These devices used the same sam- PIP; see Sect. 1.3.1).
pling mode as the CD (44.1 kHz, 16 bit). The Most digital recorders can sample at different
main disadvantage of the MD was the lossy signal sampling frequencies (e.g., 44.1, 48, 96, and
compression based on Adaptive Transform 192 kHz) with either 16 or 24 bits of resolution,
Acoustic Coding (ATRAC), similar to the MP3 yielding very high sound quality. Some models
codec developed by the Moving Picture Experts Group (Budney and Grotke 1997). With a compression ratio of 5:1, it fit 74 minutes of acoustic data onto a small digital disc with a nominal capacity of 140 megabytes (MB). The precision of some measurements of the acoustic structure of animal sounds can be significantly affected by lossy data compression schemes (Araya-Salas et al. 2017).

With hard drive recorders and the subsequent development of solid-state memory recorders, a new generation of high-quality equipment with unparalleled capacity became available in the early 2000s (Figs. 1.9 and 1.10). Solid-state memory recorders do not require mechanical moving parts for the storage and retrieval of digital information and instead use memory cards, such as Compact Flash (CF) or Secure Digital (SD and microSD) cards also used in the digital photography market.

The subsequent development of pocket digital recorders for the consumer market allowed scientists and amateurs to record many hours of sounds with high quality. Portability and storage space increased while cost decreased. Today, tape recorders have been completely replaced by solid-state digital recorders with either external (Fig. 1.9a) or built-in microphones (Fig. 1.9c). Attempts to develop portable digital recorders based on handheld portable computers or pocket PCs never gained much popularity because of the rapid development of pocket recorders. Professional and semi-professional recorders (Fig. 1.9a) provide phantom powering at 48 V (P48) for professional condenser microphones,

can sample up to 192 kHz, but some of these have input electronics that limit the bandwidth to less than 60 kHz, well beyond human hearing limits, but not enough for recording animal ultrasounds. In the music industry, other standards have been developed to allow even higher acoustic quality (Melchior 2019), up to 384 kHz sampling with 32-bit depth, but they are not yet available in low-cost consumer recorders.

1.2.3 Recording to a Computer

In the 1990s, the first sound-acquisition boards for personal computers became available, which revolutionized the way scientists collect and analyze acoustic data. Once a sound was recorded in a digital format, recordings could easily and without degradation be transferred to a computer, stored, edited, copied, distributed, played, processed, and analyzed with different algorithms. Software (either freeware or commercial) that can be used on a laptop provides scientists with “a bioacoustics laboratory in a bag.” The consumer and professional markets offer a large number of sound interfaces, to be connected by USB or other standards to a PC, which can offer very high audio quality and multiple input/output channels. Smaller versions of such a setup, or compact single-board computers costing a few tens of US dollars, are being used in autonomous stationary and mobile recording systems, which allow data collection and real-time data processing in remote areas for months at a time (e.g., Klinck et al. 2012).
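The sampling rates and bit depths quoted in this section translate directly into usable bandwidth and storage demand. The Python sketch below shows the arithmetic: the highest representable frequency is half the sampling rate (the Nyquist limit), and uncompressed PCM storage is sampling rate × bytes per sample × channels × duration. The function names are illustrative, and the reference format used in the last lines (16-bit stereo at 44.1 kHz) is an assumption, since the text gives only the playing time, the compression ratio, and the nominal disc capacity.

```python
def nyquist_khz(sample_rate_hz):
    """Highest frequency (kHz) representable at a given sampling rate."""
    return sample_rate_hz / 2 / 1000


def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size in bytes of uncompressed PCM audio."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


# Theoretical bandwidth for the sampling rates mentioned in the text.
for rate in (192_000, 384_000):
    print(f"{rate // 1000} kHz sampling -> up to {nyquist_khz(rate):.0f} kHz")

# One hour of mono recording at 384 kHz with 32-bit samples.
hour_384k = pcm_bytes(384_000, 32, 1, 3600)
print(f"1 h at 384 kHz, 32 bit, mono: {hour_384k / 1e9:.2f} GB")

# Assumed reference format: 74 min of 16-bit stereo at 44.1 kHz, compressed 5:1.
raw = pcm_bytes(44_100, 16, 2, 74 * 60)
print(f"74 min raw: {raw / 1e6:.0f} MB, at 5:1: {raw / 5 / 1e6:.0f} MB")
```

Under these assumptions the compressed size is of the same order as the nominal disc capacity quoted above; the exact figure depends on format details that the text does not give. Note also that the Nyquist limit is only an upper bound: as stated above, the input electronics of some recorders restrict the usable bandwidth well below half the sampling rate.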
Fig. 1.9 (a) Photograph of a professional portable high-quality recorder (Sound Devices, SD722) with both hard disc and solid-state memory recording capabilities, connected to two low noise microphones (Rode NT1A) for soundscape recording. (b) Photograph of SONY TC-510 open-reel recorder (1982) and a SONY PCM-M10 digital recorder with its microSD memory card. (c) Photograph of five widely used digital recorders lined-up for comparative testing. From left: Sony PCM-M10, Sony PCM-D50, Olympus LS-3, Roland R05, and Zoom H1. They feature internal microphones, but also can connect to external Plug-In-Power (PIP) microphones or hydrophones. Courtesy of M Pesente (2016)
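For the autonomous recorders discussed in this part of the chapter, deployment time is bounded by whichever runs out first, battery or memory, and duty-cycled schedules (such as 1 minute every 10 minutes) stretch both. The sketch below illustrates that trade-off. All numbers (battery capacity, current draw, card size, sampling rate) are hypothetical values chosen for illustration and do not describe any specific device named in the text.

```python
def deployment_days(duty_cycle, battery_mah, rec_ma, sleep_ma,
                    card_gb, sample_rate_hz, bit_depth=16, channels=1):
    """Return (battery_days, memory_days, limiting_days) for a duty-cycled recorder.

    duty_cycle is the fraction of time spent recording (0 < duty_cycle <= 1).
    """
    # Average current is a time-weighted mix of recording and sleep current.
    avg_ma = duty_cycle * rec_ma + (1 - duty_cycle) * sleep_ma
    battery_days = battery_mah / avg_ma / 24

    # Uncompressed PCM bytes written per day of deployment.
    bytes_per_day = (sample_rate_hz * (bit_depth // 8) * channels
                     * 86_400 * duty_cycle)
    memory_days = card_gb * 1e9 / bytes_per_day

    return battery_days, memory_days, min(battery_days, memory_days)


# Hypothetical recorder: 10 Ah battery, 50 mA while recording, 1 mA asleep,
# 128 GB card, 48 kHz mono, 16 bit.
schedules = (("continuous", 1.0),
             ("1 min every 10 min", 1 / 10),
             ("10 min every 30 min", 1 / 3))
for label, duty in schedules:
    b, m, d = deployment_days(duty, 10_000, 50, 1, 128, 48_000)
    print(f"{label:>20}: battery {b:6.1f} d, memory {m:6.1f} d, limit {d:6.1f} d")
```

In this made-up configuration the battery is the limiting factor for every schedule. A higher sampling rate or a smaller card would shift the limit to memory.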
interesting stories to tell about the evolution of their autonomous recording equipment (e.g., McCauley et al. 2017).

The first commercially available, programmable autonomous recorder, SongMeter 1 (SM1), was sold by Wildlife Acoustics in late 2007 and opened a rapidly developing market. Since then, new products have been proposed by companies and research groups, with increasing performance and autonomy. These can be programmed to record at defined intervals (e.g., every day across the dawn and dusk periods) or on more regular sampling schedules (e.g., 1 minute every 10 minutes, or 10 minutes every half-hour) to sample temporal patterns of variation in a soundscape. This way, the acoustic behavior of animals of interest can be recorded without disturbance by the recordist and for extended periods, both day and night. These recorders need to be rugged and reliable to be deployed in harsh environments. The period of time that recorders can collect data depends on the combination of available battery power and memory. Depending on these factors, terrestrial recorders can operate for weeks to months. A grid of autonomous recorders can be used for monitoring biodiversity over a large area (e.g., entire countries; Obrist et al. 2010), even in the ultrasonic range. Figure 1.10b illustrates one type of autonomous recording system made by Wildlife Acoustics. A few different types of autonomous recorders are currently available. However, as interest in continuous, long-term acoustic monitoring of remote areas (Pavan et al. 2015; Righini and Pavan 2019) increases, new devices will continue to appear on the market and in the open-source arena. In some cases, audio recorders can be coupled with photo- and video traps to get images of the animals if they are at a close enough range.

Recent open-source autonomous recorders are built around the Raspberry Pi and similar small board computers. However, these devices often have inefficient power optimization and require large batteries to supply power over long periods. The Solo acoustic monitoring platform⁶ (consisting of Raspberry Pi plus external microphone) needs a 12-V car battery to record for 40 days. Autonomous recorders need to be low-power to allow for extended periods of recording time with a manageable battery supply. The AudioMoth⁷ is an open-source device that also can be purchased assembled, and it employs a low-power microcontroller with an onboard Micro Electro-Mechanical System (MEMS) microphone (Hill et al. 2018). MEMS are very small and cheap and allow for production of autonomous recording devices at very low cost. Autonomous recorders can also be built around a wireless interface to send raw or processed data in real-time, in near real-time, or at scheduled intervals. However, data transmission requires power and the creation or use of a suitable wireless network (Sethi et al. 2018).

⁶ Project website: https://solo-system.github.io/home.html; accessed 1 Oct 2021.
⁷ https://www.openacousticdevices.info/audiomoth; accessed 22 Jun 2022.

Smartphones with an external battery supply are another option used to explore animal sounds and soundscapes. The Automated Remote Biodiversity Monitoring Network (RFCx ARBIMON) can receive acoustic data from a remote recorder based on a cellphone that, if coverage is available, directly sends data to the central server with online access.⁸ This system, coupled with Artificial Intelligence recognition algorithms, can identify sound categories to generate alerts to prevent poaching and deforestation. More information on autonomous recorders is available in Chap. 2.

⁸ Project website: https://rfcx.org/ & https://arbimon.rfcx.org; accessed 1 Oct 2021.

1.2.5 Multi-Channel Recorders

Collecting multiple channels of acoustic data allows for acoustic localization of the sound source. Multi-channel recordings can help mitigate the Lloyd’s mirror effect, a phenomenon in which low-frequency sounds near the ground may not be recorded correctly because of the interference of direct and surface-reflected sound. Increased interest in collecting multiple channels of acoustic data coupled with environmental information has driven the development of new multi-channel, multi-parametric instrumentation. Multi-channel portable recorders and computer interfaces developed primarily for professional music recording can be used for bioacoustics applications; however, dedicated recorders with very high sampling rates are also being developed for specific study systems. The recently developed JASON Qualilife⁹ can record up to 5 data channels, with a maximum sampling frequency of 800 kHz per channel, all featuring 16-bit resolution, a sharp filter to prevent aliasing, and an adjustable analog gain for a large range of uses (Fig. 1.11). Although it is already designed for low power consumption (12 V, 100 mA), an extension board (Qualilife Wake-Up Detector; Fourniol et al. 2018; Glotin et al. 2018) can be used to trigger the recorder when it receives a signal at a specified frequency, further reducing power consumption and enabling extended long-term recording. This allows for a reduction in power consumption and data storage, also reducing unnecessary post-processing work. Moreover, it includes a high dynamic luxmeter (which works from sun zenith to lunar eclipse) that is synchronized with the acoustic recorder.

⁹ Project website: https://www.univ-tln.fr/SMIoT.html; accessed 20 Jun 2022.

1.3 Advances in Microphones

There were several early attempts in the mid- to late-1800s by Johann Philipp Reis and Elisha
Gray to develop the precursor to a microphone. Reis developed the sound transmitter, which contained a metallic strip that rested on a membrane that caused intermittent contact between a metal point on the strip and an electrical circuit when it vibrated. Elisha Gray developed the liquid transmitter, consisting of a diaphragm connected to a moveable conductive rod, which was immersed in an acidic solution. In 1876, Alexander Graham Bell invented the magnetic transmitter, and Edison and Berliner developed a loosely packed carbon-granule microphone (Fig. 1.12). David Edward Hughes coined the term “microphone” in 1878 for his microphone system based on carbon granules, which performed poorly by today’s standards (due to high self-noise and distortion). However, it was an important step forward, enabling technology for long-distance voice communication or telephony (for more details see Robjohns 2010).¹⁰

¹⁰ A Brief History of Microphones: http://microphone-data.com/media/filestore/articles/History-10.pdf; accessed 11 Oct 2021.

Fig. 1.12 Left: Drawing of a carbon-button microphone (1916). Image source: https://commons.wikimedia.org/wiki/File:Carbon_button_microphone_1916.png; unknown author, public domain, via Wikimedia Commons. Right: Sennheiser MKH416 directional microphone used for bioacoustics research; https://commons.wikimedia.org/wiki/File:Sennheiser_MKH416.jpg by Galak76, CC BY-SA 3.0 http://creativecommons.org/licenses/by-sa/3.0/, via Wikimedia Commons

In 1886, Thomas Alva Edison refined the carbon granule microphone and developed the carbon-button transmitter. This transmitter consisted of a compartment filled with granules of carbonized anthracite coal, which were confined between two electrodes. One electrode was connected to an iron diaphragm. Edison’s transmitter was durable, efficient, simple, and cheap to build. His transmitter became the basis for millions of telephone transmitters used around the world.

1.3.1 Microphones Used in Bioacoustics Research

At the beginning of the twentieth century, most microphones were carbon granule sensors. These early microphones were noisy and had limited sensitivity and frequency response. This meant these early microphones were suited only for recording human voices. In those early stages, dynamic microphones based on a membrane with a coil immersed in a magnetic field were difficult to produce because they required small but strong magnets.

In 1917, Edward Wente made a great stride forward by inventing the condenser microphone, which is still used in a wide variety of applications today. In the 1920s, with the significant increase in broadcast radio, there was a high demand for better-quality microphones. The
piezoelectric microphone was created based on piezoelectric crystals, which are sensitive to pressure changes and generate a voltage when compressed/decompressed; conversely, they vibrate and produce sound waves if excited by an electric signal. Originally, they used quartz or Rochelle salt crystals, but the sound quality was poor. With the development of strong magnets, dynamic microphones were then used for decades because of their simplicity and reliability. However, for bioacoustics studies, they were not sensitive enough, and their frequency response generally did not extend beyond the human hearing range.

Today, almost 90% of the microphones manufactured annually are electret condenser microphones (Rossing 2007) because of their many advantages when compared with dynamic microphones, including higher sensitivity, higher fidelity, and wider frequency response. Piezoelectric transducers are now mainly used in hydrophones that have specialized ceramics that provide high sound quality. Robjohns (2010) provides a history of microphone evolution and outlines how advances in broadcast radio, telephones, television, and the music industry, along with the need for directional and ultrasonic recordings, drove the design of several new types of microphones (e.g., condenser, dynamic, ribbon, and carbon microphones).

The widely used condenser microphones are fairly sensitive, compared with dynamic microphones, and feature an extended frequency response, but they require external power. Professional condenser microphones are often powered through the signal cables with 48 V (phantom power, P48) provided by the recording device, by a preamplifier, or by a power unit. Consumer microphones usually use electret condenser capsules that require 3–5 Vdc powering (plug-in power, PIP) provided by the recorder via the microphone plug. Microphones well-suited for bioacoustics studies can be built with electret condenser capsules costing only a few US dollars (Fig. 1.13). For a detailed discussion of features and operation of microphones, see Chap. 2, section on selecting a microphone.

Fig. 1.13 Photograph of the PRIMO EM172 microphone capsule (left) used by many nature sound recordists for their custom-made microphones (center and right). Courtesy of M Pesente

Many animals including insects, frogs, bats, and other terrestrial and marine mammals emit ultrasonic sounds (Sales and Pye 1974). Studies of ultrasonic signals require a broadband microphone capable of responding to signals at very high frequencies. In contrast, some animals, such as elephants, produce very low-frequency sounds, and recording them requires infrasonic microphones capable of detecting signals at or below 20 Hz (Payne et al. 1986). Previously, ultrasonic and infrasonic recording required very expensive and complex transducers, recorders, and analyzers. With the
advent of broadband AD-converters in laptops and smartphones, ultrasonic and infrasonic animal sounds can now be recorded at a reasonable cost. Ultrasonic microphones may use small electret condenser capsules or MEMS, which are primarily used in smartphones. MEMS are small and inexpensive, feature an extended frequency response (including the ultrasonic frequency range), can include an AD-converter, and can be directly integrated into digital systems. Some microphones also incorporate a high-speed AD-converter and USB interface to be directly connected to a computer, a smartphone, or a tablet for recording and real-time display. The Dodotronic Ultramic series offers a range of USB ultrasonic microphones with sampling frequencies ranging from 192 kHz to 384 kHz (Buzzetti et al. 2020); the most advanced models also include the ability to record on an internal microSD memory card.¹¹

¹¹ Dodotronic webpage: http://www.dodotronic.com; accessed 20 Jun 2022.

In cases where researchers want to separate sounds coming from different directions, or target an individual animal for recording, a directional microphone, a parabolic reflector, or a microphone array can be used. One of the first documented attempts was in 1932, when Peter Paul Kellogg and Arthur Allen used a microphone installed in the focus of a parabolic reflector to record bird sounds (Wahlstrom 1985; Ranft 2001). Parabolic reflectors have been widely used to record animal sounds, capture distant speech, and detect the noise of incoming vehicles and airplanes during the first and second world wars (i.e., before the invention of radar; see Chap. 2 for a discussion of use and features of parabolic reflectors). As an alternative to parabolic reflectors, ultra-directional microphones, or so-called shotgun microphones, were developed. The design of shotgun microphones is based on the interference tube principle to attenuate off-axis sounds; these microphones were developed to have a narrow angle of forward reception. The shotgun was initially designed for use in a studio setting (as opposed to recording long-distance sounds) to minimize off-axis sounds (e.g., noise from the public and room reflections).

Single-microphone (i.e., monophonic) recordings cannot provide any spatial information. These recordings are made with a single microphone, which can be either omnidirectional, to capture sounds from all directions, or directional, to capture sounds from a specific source or direction. However, microphones can be paired to record sounds in stereo to provide a spatial sound image wherein listeners can identify the perceived spatial location of the sound source. Many different types of microphone configurations have been developed, mainly for recording music, but also for recording soundscapes.

A further development, mainly conceived for cinema and videogames, is the surround system that is based on multi-microphone (i.e., microphone array) recordings and speakers placed around the listener to create a more immersive acoustic experience (Streicher and Everest 1998; Rayburn 2011). With 3D audio, a whole acoustic space is recorded with a microphone array. From this, it is possible to extract sound information to build a stereophonic, binaural, or surround program. Today, 3D audio is mainly used for 3D Virtual Reality (in video games, cinema, or scientific applications), which places the user in a 3D audio and video environment (with special visors and headphones, or in special VR rooms) and lets the user move inside it to look and listen in any direction. The currently most used 3D audio system is Ambisonics (Fig. 1.14), which is based on 4 (first order), 9 (second order), 16 (third order), or more channels (Zotter and Frank 2019).

Specific microphone array applications in bioacoustics include localizing sound sources, either static or moving, such as flying bats (Blumstein et al. 2011). Using specific algorithms, signals can be extracted from the microphone array, and the direction and intensity of sound sources can be identified by superimposing a sound map on top of an image taken by a video camera. This type of application is called an acoustic camera and is largely employed by the automotive industry to locate sources of noise in a vehicle.
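The text does not name the algorithms used for array localization. One common building block, shown here purely as an illustrative assumption, is to estimate the time difference of arrival (TDOA) between two microphones by cross-correlation and convert it to a bearing. The sketch below uses only the Python standard library and synthetic data; the microphone spacing, sampling rate, and sound speed in air are hypothetical example values.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 °C (assumed)


def best_lag(a, b, max_lag):
    """Lag (in samples) at which b best matches a delayed copy of a.

    A positive result means the sound reached microphone b later than a.
    """
    def corr(lag):
        # Cross-correlation: sum of a[n] * b[n + lag] over valid indices.
        start = max(0, -lag)
        stop = min(len(a), len(b) - lag)
        return sum(a[n] * b[n + lag] for n in range(start, stop))

    return max(range(-max_lag, max_lag + 1), key=corr)


def bearing_deg(lag, sample_rate_hz, spacing_m):
    """Angle from broadside (degrees) of a far-field source."""
    sin_theta = SPEED_OF_SOUND * lag / sample_rate_hz / spacing_m
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_theta))))


# Synthetic test: a noise burst arriving 5 samples later at microphone B.
random.seed(1)
fs, spacing, true_lag = 48_000, 0.20, 5
burst = [random.gauss(0, 1) for _ in range(400)]
mic_a = burst + [0.0] * 20
mic_b = [0.0] * true_lag + burst + [0.0] * (20 - true_lag)

# Lags larger than spacing / c are physically impossible for this pair.
max_lag = int(spacing / SPEED_OF_SOUND * fs) + 1
lag = best_lag(mic_a, mic_b, max_lag)
print(f"lag = {lag} samples, bearing = {bearing_deg(lag, fs, spacing):.1f} degrees")
```

A single pair only resolves the angle relative to its own axis. Arrays with more microphones combine several such pairs (or use beamforming) to resolve direction unambiguously and to build the sound maps used in acoustic cameras.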
1.3.3 Accelerometers
Fig. 1.15 Left: Photograph of an early ultrasonic bat detector from the laboratory of Donald Griffin. Image courtesy of the Cornell Laboratory of Ornithology. Right: Photograph of an ultrasonic USB microphone UltraMic250k, based on MEMS, developed by Dodotronic in 2010, connected to a tablet computer that allows recording and display of ultrasounds in real-time
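Two of the bat-detector principles described in Sect. 1.3.5, frequency division and time expansion, can be illustrated numerically. The sketch below is a simplified model and not the circuit of any real detector: it counts rising zero crossings of a synthetic 40-kHz tone, emits one square-wave cycle for every n input cycles, and then shows the effect of replaying the same samples n times slower. The tone frequency, sampling rate, and division ratio are assumed example values, and n is taken to be even.

```python
import math


def rising_zero_crossings(samples):
    """Indices where the waveform crosses zero going upward."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] < 0 <= samples[i]]


def frequency_division(samples, n):
    """Square wave that completes one cycle per n input cycles (n even)."""
    crossings = set(rising_zero_crossings(samples))
    out, level, count = [], 1.0, 0
    for i in range(len(samples)):
        if i in crossings:
            count += 1
            if count % (n // 2) == 0:   # flip polarity every n/2 input cycles
                level = -level
        out.append(level)
    return out


def crossing_rate_hz(samples, sample_rate_hz):
    """Mean frequency estimated from the spacing of rising zero crossings."""
    idx = rising_zero_crossings(samples)
    return (len(idx) - 1) * sample_rate_hz / (idx[-1] - idx[0])


fs, f_call, n = 400_000, 40_000, 10   # assumed example values
# 100 ms synthetic tone; the phase offset avoids samples exactly at zero.
call = [math.sin(2 * math.pi * f_call * t / fs + 1.0) for t in range(fs // 10)]

divided = frequency_division(call, n)
print("input frequency (Hz):  ", crossing_rate_hz(call, fs))
print("divided frequency (Hz):", crossing_rate_hz(divided, fs))

# Time expansion: identical samples, played back n times slower.
playback_rate = fs / n
print("time-expanded frequency (Hz):", f_call * playback_rate / fs)
print("time-expanded duration (s):  ", len(call) / playback_rate)
```

The divided output keeps only the timing of the zero crossings, which is why such detectors lose spectral detail, whereas time expansion keeps every sample and only stretches the time axis.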
and hundreds of meters. For example, laser microphones can measure the vibration of a glass window to capture the sounds produced inside a room. These devices were developed for spying purposes and are now mostly used in industry to record vibration of machinery. In bioacoustics research, and biotremology studies in particular (Hill et al. 2019), this technology is used to record the vibration of animal body parts (e.g., wings or abdomen of insects producing sounds) or vibration of the substrates (e.g., plant stem, tree trunk, spider web, and burrow wall), which could indicate the presence of an animal. Current instruments are lightweight and easy to use; however, they require that the target being recorded is not moving and is on a stable platform. These devices should not be confused with optical microphones and hydrophones, which are being developed and have a completely optical chain, where the transducer directly produces an optical signal to be sent on an optical fiber cable, either analog or digital, from the transducer to the recorder.

1.3.5 Bat Detectors

In the eighteenth century, the Italian scientist Lazzaro Spallanzani recognized that bats were capable of navigating and capturing their prey in the dark. While Spallanzani hypothesized that this was related to their hearing, it was not until the development of ultrasonic recorders and microphones in the early 1940s (Fig. 1.15) that scientists were able to study the ultrasonic sounds produced by bats for echolocation (Griffin 1944). Donald Griffin was working with piezoelectric transducers connected to an oscilloscope when he observed high-frequency signals produced by bats flying outside his open laboratory window. This discovery opened an entirely new field of bat echolocation research.

Early bat detectors were based on the heterodyne principle and on frequency-division counters (Obrist et al. 2010), which produced audible but highly distorted sounds when receiving ultrasonic calls. Heterodyne detectors allowed only a narrow frequency band, up to a few kHz wide, to be shifted down to the audible range. The user then tuned the detector to the frequency of interest and listened to and recorded signals only around the tuned frequency. Information outside that frequency range was discarded.

Frequency division (or count-down) detectors cover a broad frequency range. They are based on zero-crossing detection. They count how many times the signal waveform crosses zero pressure and they produce a synthetic wave every n incoming waves. The output signal frequency is a fraction of the original frequency (i.e., 1/n), and advanced systems retain the amplitude envelope of the original signal. The frequency division method is much better than the heterodyne; however, both produce a distorted signal often not useful for scientific investigation. The first digital models, called time-expansion detectors, digitally recorded the incoming bat calls at a high sampling rate, and played them back at a reduced sampling rate, which allowed for human observers to hear the calls and record them on a conventional recorder (Obrist et al. 2010). This method preserves all acoustic features so that recordings can be used for scientific analysis.

Digital bat detectors include a built-in ultrasonic microphone, onboard signal sampling and processing, memory for digital data storage, a graphical display to show a spectrogram with related settings, and a speaker for monitoring incoming ultrasounds by either slowing down or shifting them in frequency. Current models are completely digital; they record and store data continuously, and can transpose ultrasounds into audible sounds in real-time by spectral shifting (or spectral compression), using a Fast Fourier Transform (FFT) algorithm (see Chap. 4 on signal processing). Some bat detectors can be used as autonomous recorders which can selectively record ultrasounds from echolocating bats for many consecutive nights, with a programmable timer to start at sunset and stop at sunrise. Some also have analysis software that identifies the species, with a margin of error that varies by species (see Chap. 2, section on bat detectors). Given the computing and storage capabilities of current tablets and smartphones, dedicated ultrasonic microphones with an integrated AD interface also are available to record bat calls and display their features on the device screen (Fig. 1.15).

1.4 Advances in Hydrophones

In 1826, Jean-Daniel Colladon and Charles-François Sturm conducted an experiment in Lake Geneva, Switzerland, to determine the speed of sound in water (Colladon 1893). They used two small boats on opposite sides of the lake, ~14 km apart. On one boat, there was an underwater bell, which was struck at the same time that gunpowder was ignited, which resulted in a paired underwater sound and above-water gunpowder flash. The operator of the second boat used an underwater listening horn to detect the sound of the bell (Fig. 1.16). The time difference between seeing the gunpowder flash and hearing the bell allowed
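The measurement principle of the Lake Geneva experiment reduces to distance divided by travel time. The short sketch below works through that arithmetic. The distance is the "~14 km" quoted in the text; the delay of 9.7 s is a hypothetical value chosen for illustration and is not taken from the text. The function also accounts for the travel time of the light flash, only to show that it is negligible.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def sound_speed_from_flash_delay(distance_m, delay_s):
    """Sound speed from the delay between seeing a flash and hearing the bell.

    The flash needs distance / c to reach the observer, so the acoustic
    travel time is the observed delay plus that (tiny) optical travel time.
    """
    light_time = distance_m / SPEED_OF_LIGHT
    return distance_m / (delay_s + light_time)


distance = 14_000.0   # m, the "~14 km" quoted in the text
delay = 9.7           # s, hypothetical observed delay (not given in the text)

c = sound_speed_from_flash_delay(distance, delay)
print(f"estimated sound speed: {c:.0f} m/s")
print(f"optical travel time:   {distance / SPEED_OF_LIGHT * 1e6:.0f} microseconds")
```

With these inputs the estimate is on the order of 1.4 km/s, roughly four times the speed of sound in air, and the correction for the light travel time changes the result by far less than the timing error of a human observer.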
shore, a small boat, or sea ice, and required the presence of a researcher.

Traditional hydrophones feature an analog output (voltage or current) and are available with or without a front-end preamplifier. Hydrophones that feature an integrated AD-converter and digitize the analog signal directly at the sensor are now commercially available. Some digital hydrophones also integrate signal processing and storage capabilities (e.g., real-time reporting of noise levels). Because of the increased power consumption of digital hydrophones, these are primarily used in cabled sensor networks, such as seafloor sensors or sub-surface towed arrays.

1.4.2 Sonobuoys

Navies of the world recognized the need for a hydrophone that could operate remotely, was mobile, and could monitor sounds at different water depths, which led to the development of sonobuoys. Sonobuoys are individual canisters that float at the water surface and house a hydrophone, dampening cable, battery, recording/transmitting electronics, and a transmitting antenna. See Chap. 2 for details of features and operation of sonobuoys. Navies of the world used sonobuoys for underwater listening to detect submarines by deploying them from airplanes or ships. A few labs were able to acquire military sonobuoys and used them for receiving and recording the sounds of marine animals.

1.4.3 Autonomous Underwater Acoustic Recorders

In recent years, a wide variety of stationary, autonomous passive acoustic monitoring (PAM) systems have been developed for the recording of acoustic activity from naturally occurring biological and geophysical sources, as well as from anthropogenic sources in marine environments (Figs. 1.19, 1.20, 1.21, and 1.22). These systems have an advantage over systems that rely on human observers as they are non-invasive and able to collect long-term data from remote areas independently of weather and light conditions (Mellinger et al. 2007; Lammers et al. 2008; Tremblay et al. 2009; Obrist et al. 2010; Sousa-Lima et al. 2013; Jacobson et al. 2016); see Chap. 2.

1.4.4 Towed Hydrophone Arrays

A towed array contains several hydrophones housed in an oil-filled plastic sleeve, which is pulled behind vessels of varying size. Towed arrays of hydrophones allow beamforming (a processing technique that combines time-delayed signals from multiple hydrophones to increase gain in a given direction) to improve signal-to-noise ratio and estimate bearings to specific sound sources. Consecutive bearing estimates allow a source to be localized and its range to be determined. A towed array in effect provides a high-gain, directional sensor that can be steered in different directions either in real-time or in the post-processing of recordings (see Chap. 2 for details of towed hydrophone arrays). During World War I, a towed sonar array (the first documented towed array) known as the Electric Eel was developed by the US Navy physicist Harvey Hayes (Naramoto 2000). Bill Watkins and William Schevill at Woods Hole Oceanographic Institution were among the first bioacousticians to use this technology to record and study the sounds of marine mammals (e.g., Watkins and Schevill 1977; Watkins et al. 1987). The original towed arrays focused on lower-frequency signals (i.e., frequencies typical of foreign vessel noise), but Schevill and Watkins developed new instruments to record the higher frequencies emitted by dolphins. Their recordings are of high scientific value and are available online in digital format at the WHOI Watkins Sound Library.¹²

¹² WHOI Library: http://cis.whoi.edu/science/B/whalesounds/index.cfm; accessed 11 Oct 2021.

In 1983, Thomas et al. (1986, 1987) worked with a geophysical company to build a modified towed array specifically for the study of marine mammal sounds (Fig. 1.18), which was capable
of capturing low- and medium-frequency underwater sounds (20 Hz–15 kHz). Depth and temperature sensors on the array measured the thermocline and sound propagation conditions in the area. Self-noise from the moving ship was present, but filtered out as much as possible. Many species of marine mammals were heard, which helped the fishermen find tuna, as tuna tend to associate with dolphin pods.

Fig. 1.18 Left: Photograph of the topside electronics required to receive, record, and process data from a towed array in 1983. Right: Photograph of deploying a towed array from the deck of a tuna seiner, the MV Queen Mary, to listen for underwater sounds of marine mammals and fish in the Eastern Tropical Pacific. Photos by Jeanette Thomas

In recent years, lightweight towed arrays have been developed to meet the requirements of studying marine mammal sounds from small platforms, such as sailboats (Pavan and Borsani 1997). Deployment of the towed array from a sailboat minimizes recorded self-noise of the towing vessel. Current towed arrays can capture sounds over a large geographic area and cover a wide frequency range (from infrasound to ultrasound).

1.4.5 Seafloor Hydrophone Arrays

Arrays of bottom-mounted hydrophones were an important naval asset for the surveillance of oceans for the presence and movements of enemy vessels and submarines. In the 1950s, at the height of the Cold War, the US Navy launched a classified project known as the SOund SUrveillance System (SOSUS). The SOSUS large-aperture arrays allowed the Navy to detect signals at ranges of several hundred kilometers. SOSUS arrays were highly successful in detecting and tracking Soviet submarines of that era. The sailors operating the early SOSUS arrays also detected numerous biological sounds of unknown origin. An unknown low-frequency sound was attributed to the “Jezebel Monster,” but was later found to be from blue (Balaenoptera musculus) and fin whales (Balaenoptera physalus). After the end of the Cold War, the SOSUS system was made available to scientists (Nishimura and Conlon 1994; Stafford et al. 1998; Watkins et al. 2000), who monitored the presence of marine mammal sounds and tracked the animals’ long-range seasonal movements across the oceans. In one case, a blue whale was tracked for 80 days along the eastern seaboard of the USA using the 20-Hz signal the animal repeatedly produced.

At present, bottom-mounted arrays of hydrophones are deployed across oceans worldwide, with some strictly dedicated to military applications, and others dedicated to monitoring
Fig. 1.19 The JASON Qualilife DAQ 3x600 kHz in the custom array by H Glotin, recording sperm whales in the near
field in 2018. Courtesy of V Sarano
Fig. 1.20 Left: Photograph of the passive acoustic seaglider™ developed by the Applied Physics Laboratory, University of Washington. Courtesy of G Shilling. Right: The Sphyrna ASV allows 3D passive acoustic tracking of diving cetaceans
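One basic building block of hydrophone-array processing is the beamforming introduced for towed arrays in Sect. 1.4.4: signals from several hydrophones are time-shifted and combined so that sound from one direction adds coherently. The sketch below is a minimal time-domain delay-and-sum example on synthetic data, written with the Python standard library only. The array geometry, tone frequency, sampling rate, and nominal sound speed are assumptions chosen for illustration and do not describe any of the systems shown in the figures.

```python
import math

C_WATER = 1500.0  # m/s, nominal sound speed in seawater (assumed)


def element_delays(n_elements, spacing_m, angle_deg, sample_rate_hz):
    """Per-hydrophone delay (whole samples) for a plane wave from angle_deg.

    The angle is measured from broadside of a straight, equally spaced array.
    """
    step = (spacing_m * math.sin(math.radians(angle_deg))
            / C_WATER * sample_rate_hz)
    return [round(m * step) for m in range(n_elements)]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average across channels."""
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for x, d in zip(channels, delays):
            if 0 <= t + d < n:       # ignore samples shifted out of range
                acc += x[t + d]
        out.append(acc / len(channels))
    return out


def power(signal):
    """Mean squared amplitude."""
    return sum(v * v for v in signal) / len(signal)


# Synthetic data: 8 hydrophones, 1.5 m apart, 400-Hz tone from +30 degrees.
fs, n_el, spacing, f0, true_angle = 48_000, 8, 1.5, 400.0, 30
n_samples = 2_000
arrival = element_delays(n_el, spacing, true_angle, fs)
channels = [[math.sin(2 * math.pi * f0 * (t - d) / fs) for t in range(n_samples)]
            for d in arrival]

# Steer the beam across candidate bearings and keep the strongest output.
scan = {a: power(delay_and_sum(channels, element_delays(n_el, spacing, a, fs)))
        for a in range(-90, 91, 5)}
best = max(scan, key=scan.get)
print(f"strongest beam at {best} degrees (power {scan[best]:.3f})")
```

Because the steering delays are applied in software, the same recording can be re-scanned in any direction during post-processing, which is the sense in which an array acts as a steerable directional sensor. A straight line array cannot distinguish left from right, so practical systems need additional information to resolve that ambiguity.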
Fig. 1.21 The evolution of the DTAG over fifteen years. Each design comprises electronics, batteries, suction cups, floatation material, and a VHF transmitter for retrieval when the tag is floating on the sea surface. The tags all record sound, depth, and motion to solid-state memory. However, the size, capabilities, and endurance have changed over the years. The earliest version developed in 2000 (a) had 400 MB of memory and could record a single sound channel at 16 kHz sampling frequency for a few hours. The most recent version developed in 2009 (b) records stereo sound at up to 500 kHz sampling frequency for almost two days. (c) is an intermediate version of the tag. Courtesy of P Tyack and M Johnson (2016)
1.5.3 Animal Acoustic Tags

A recent development for studying animals in-situ is the animal-worn acoustic tag. Such devices allow detailed observations of the movement and acoustic behavior of tagged animals. However, for some species, such as cetaceans, developing a reliable, long-term instrument attachment has been problematic.

Recorders in collars, similar to those used for radio tracking, have also been tested to record sounds and activity of terrestrial animals while moving freely, but with few applications. More successful was the crittercam, developed and used by National Geographic primarily to provide amazing video¹⁴ of wild animals either on land or in water. Lynch et al. (2013) attached an inexpensive collar-mounted recording device on ten wild mule deer (Odocoileus hemionus) over two weeks in Colorado. Recorded sounds included rumination, which allowed the researchers to document foraging activities.

¹⁴ https://www.nationalgeographic.org/education/crittercam-education/; accessed 11 Oct 2021.

Video tags have been attached to whales, dolphins, sirenians, and penguins to document their underwater life. Sophisticated acoustic tags provided an important step forward in marine mammal bioacoustics. The development of these tags was primarily driven by the need to document and understand the reaction of cetaceans to underwater sounds such as naval sonars, airguns, and pile drivers. The D-TAG (Johnson and Tyack 2003), A-Tag (Akamatsu et al. 2007), Acousonde recorder (Burgess et al. 2011), and other similar instruments feature a variety of animal movement detectors (three-axial accelerometer, magnetometer, depth-sensor, light sensor, etc.) and acoustic sensors (hydrophones). These tags are attached to the animals with non-invasive suction cups, and usually stay attached for a few hours, but can stay on the animal for up to a few days. Once detached, the tag floats to the surface and transmits a radio signal to aid recovery. This kind of technology (Fig. 1.21) has enabled important
1 History of Sound Recording and Analysis Equipment 27
research on sound usage and behavioral responses of animals to anthropogenic sounds, such as naval sonars (Tyack 2009; Tyack et al. 2011).

Often a variety of sensors can be attached to the animal to provide additional environmental or behavioral data to accompany acoustic recordings. Evans et al. (2004) attached a waterproof video camera with a hydrophone, VHS recorder, and depth sensor to examine vocal behavior during dives of Weddell seals in Antarctica. Each time the seal vocalized, the depth and time of the sound were documented, audio and video were recorded, and the call type was later analyzed in the laboratory. Researchers had to retrieve the VHS tapes, but this species remains close to a colony during the breeding season, hauls out on the ice daily, and is easily (re)captured for recovery of the tag and data. Current digital video equipment is highly miniaturized and allows new exciting options for exploring the life of animals in the wild.

1.6 Advances in Sound Analysis Hard- and Software

The most important advancements in sound analysis equipment were the transition from analog to digital systems, along with the transition from hardware to software signal processing. This provided lightweight, field-portable, battery-operated units with higher storage capacity, more stable storage media, and broadband analysis, often at a more affordable price than before. Now, even a smartphone can produce a spectrogram in real-time. Another important breakthrough was the ability of scientists to share digital data using the internet and shared storage in the cloud.

Initially, the basic analysis of acoustic signals was done using oscilloscopes. These instruments provided a visual representation of the waveform of acoustic signals known as oscillograms, which are plots with amplitude on the y-axis and time on the x-axis. Originally, oscilloscopes were large, heavy, expensive, AC powered, and used vacuum tubes. To obtain a hardcopy of the waveform, a camera was used to capture an image from the display. In some cases, the waveforms were traced on paper by an oscillating pen (similar to a seismometer).

The Kay Electric Company (later to become Kay Elemetrics) developed the Sona-Graph™ machine, which was a completely analog instrument and one of the first instruments to create an image of a sound, known as a SonaGram™. Developed primarily for navy applications and initially called vibralyzer, this technology was applied successfully to the study of human speech and animal sounds (Koenig et al. 1946; Borror and Reese 1953; Thorpe 1954; Marler 1955; Fig. 1.22). A SonaGram (sometimes called a sonogram by biologists) is a visual representation of the frequencies (on the y-axis) and intensity (color or shades of gray as the z-axis) in a sound as they vary with time (on the x-axis). This type of image visualization is also called a spectrogram. The Sona-Graph™ was very expensive and capable of analyzing a signal of only a few seconds in duration up to 8 or 16 kHz. The device offered two analysis settings, wideband (300 Hz) and narrowband (45 Hz). The wideband setting provided better time resolution, while the narrowband setting provided better frequency resolution (Beecher 1988). The sound could be played back from a reel-to-reel recorder and recorded on an iron oxide magnetic track, which ran the circumference of a large internal turntable. A special thermo-sensitive paper was wrapped around a drum mounted on top of the turntable. The drum spun synchronously with the turntable as the signal was played back through a variable band-pass filter or a filter bank, and a stylus burned the signal onto the paper on the rotating drum according to the level of sound at the frequencies given by the filter (Fig. 1.23).

This was a smelly, smoky process, which made the procedure unpleasant for researchers. To analyze a long sound recording, several short spectrogram sections had to be printed and taped together. The resulting sheets of paper often required a lot of wall or table space for review and further analysis. Because of the large size, these spectrograms were also difficult to reduce in size and adapt for inclusion in a publication.

In the 1970s, a camera using Kodak photographic paper (the size of 35-mm film) was attached to the screen of an advanced
28 G. Pavan et al.
oscilloscope capable of performing real-time FFT spectrum analysis (Hopkins et al. 1974). As the sound played, a spectrogram image appeared on the screen and the camera photographed the resulting image in real-time. Measurements of frequency and time could be taken as the spectrograms were displayed. The photographic paper had to be developed in a dark room and produced a roll of 35-mm paper about 4 m long. One advantage of this system was the ability to
view the sounds in real-time, which allowed scientists to study patterns of sounds. This system produced long-lasting spectrograms that are still usable 40 years later (see Thomas and Kuechle 1982 for samples of sonogram output).

Once thermal imaging paper (similar to the paper used in older fax machines) was developed, Kay, Unigon, and other companies developed real-time spectrogram imaging units, which had a continuous output using large rolls (8 inches wide) of thermal imaging paper. For further analysis, segments had to be cut with scissors. However, these data were difficult to analyze, store, and prepare for publication. Measurements of frequency and time could be taken as the images were displayed on the analyzer but were not provided on the output itself. If exposed to light or heat, the hardcopies gradually turned brown and were generally unusable after a few years.

In the mid-1970s, the first attempts were made to use general-purpose computers to analyze sounds, mainly for speech analysis. These attempts used the Fast Fourier Transform (Strong and Palmer 1975), an algorithm that decomposes a signal segment into a finite number of sinusoids, each one characterized by frequency, amplitude, and phase. This algorithm was successfully applied to the human voice and to animal sounds to produce spectrograms in different formats. The speed and data-handling capabilities of computers in subsequent years allowed for the implementation of more complex mathematical signal processing algorithms (see Chap. 4 on signal processing).

A few years later, in 1980, a computer-based digital spectrographic workstation was developed at the University of Pavia (Italy) that produced black-and-white spectrograms of animal sounds on a computer screen, with a moving cursor to take measurements. The workstation produced and printed a spectrogram of a 1-s signal in about 40 minutes (Pavan 1983, 1985). The AD-converter allowed users to acquire and analyze sounds in the ranges of 5, 10, and 20 kHz with a sampling frequency of 51.2 kHz. Hardcopies of displays were made on the computer's printer and then joined together (Fig. 1.24).

Around that same time, in 1984, a group of acousticians at The Rockefeller University and
Fig. 1.24 Black-and-white spectrogram of a 2.4-s bird song (Thekla lark) produced in 1981 by joining three printouts of 800 ms each; the spectrogram generation required 2 hours. The x-axis is time in seconds and y-axis is the frequency in hertz. Frequency range 0–5 kHz, sampling frequency 20,480 Hz, and 12-bit resolution (72-dB dynamic range). From top: spectrogram, envelope, tracking of dominant frequency, and amplitude plot in dB
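The decomposition mentioned in the text, in which a signal segment is broken into sinusoids each characterized by frequency, amplitude, and phase, can be illustrated with a minimal sketch. It uses a direct discrete Fourier transform in plain Python rather than an optimized FFT (the FFT computes the same result faster). The function name `dft_components`, the 8-kHz sampling frequency, and the 1-kHz test tone are assumptions chosen for illustration and are not taken from the systems described in this chapter.

```python
import cmath
import math


def dft_components(samples, fs):
    """Return (frequency_hz, amplitude, phase_rad) for each DFT bin up to Nyquist."""
    n = len(samples)
    components = []
    for k in range(n // 2 + 1):
        # Correlate the segment with a complex sinusoid at bin k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # DC and (for even n) the Nyquist bin occur once; other bins occur twice.
        single = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 / n if single else 2.0 / n
        components.append((k * fs / n, abs(coeff) * scale, cmath.phase(coeff)))
    return components


# Hypothetical test signal: a 1-kHz cosine of amplitude 0.5 sampled at 8 kHz.
FS = 8000
N = 64
signal = [0.5 * math.cos(2 * math.pi * 1000 * i / FS) for i in range(N)]

components = dft_components(signal, FS)
peak = max(components, key=lambda c: c[1])
print(f"strongest component: {peak[0]:.0f} Hz, amplitude {peak[1]:.2f}")
```

In this sketch the frequency bins are spaced fs/N = 125 Hz apart. A longer segment gives finer frequency spacing but describes a longer stretch of time, which is the same trade-off as the wideband and narrowband settings of the analog spectrograph.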
Engineering Design Inc. developed a software program called Signal. This software was developed for computers and was able to control and communicate with the recording hardware. The system was able to display spectrograms in real-time, provide basic time-frequency information of recorded signals, and store data digitally on the computer's hard disc. These developments revolutionized bioacoustics sound analysis; however, at the time, these units were expensive, custom-made, and had very little storage capacity (the typical storage available in 1985 was 5 MB on a 15-inch magnetic disc).

In 1985, the University of Pavia spectrographic workstation was upgraded to produce color spectrograms (Fig. 1.25; Pavan 1992) on a mainframe computer (HP 1000) interfaced to an AD-converter and to a graphic workstation.15 Around this time, the first personal computers (PC) appeared, and the software was rewritten to produce real-time color spectrograms and signal envelopes using Intel 8086/8087 processors and a high-quality Audiologic Duetto sound board produced in Italy, with sampling frequency up to 48 kHz with 16-bit resolution, and later with a widely available and cheap Sound Blaster sound card. A mouse-driven cursor allowed accurate measurements to be taken directly on the computer screen, and printouts were possible in gray scales on standard dot-matrix printers or on thermal printers. By storing the recordings in a digital format, it was also possible to edit the recordings and to play them back at a different speed or even backward (e.g., to produce playback tapes for behavioral experiments).

15 http://www.unipv.it/cibra/res_dspwstory_uk.html; accessed 29 Oct 2021.

At the same time, other researchers started experimenting with digital signal processing. Aubin (France) and Specht (Germany) developed similar digital sound analysis systems that also included the synthesis of sounds for playback experiments (Bremond and Aubin 1989; Specht 1992; Aubin et al. 2000). Specialized AD-converters appeared on the market to sample analog signals at high rates, which allowed digital recording and analysis of
Fig. 1.26 Photograph of the University of Pavia bioacoustic laboratory equipment in 1989 with a Kay Sona-Graph DSP 5500, color monitor, thermal printer, portable open-reel stereo recorder, cassette deck recorder, filter bank, speakers, and headphones
classification algorithms over long-term data sets for automated detection of occurrences of a target sound (see Chap. 8 on detection and classification methods). This saves much time and avoids having to view and listen to the entire recording manually. Scientists can also use readily available programming environments (including MATLAB, Octave, Python, R) to develop their own analyses, often facilitated by libraries of procedures dedicated to sound processing and bioacoustic analysis (e.g., Sueur et al. 2008; Sueur 2018; Ulloa et al. 2021).

In the late 1990s, smartphone technology was developed, along with sound analysis software for these devices. Smartphones of the twenty-first century have the same computing power as a desktop PC. Sound recording and visualization applications were developed for both Android and iPhone Operating System (iOS) platforms. In addition, the development of the Internet of Things and low-cost computer platforms (e.g., Arduino, Raspberry Pi, and others) has allowed scientists to build web-enabled data recording and analysis systems. These new technologies and analytical methods can be applied not only to audible sound but also to infrasonic and ultrasonic signals. For example, ultrasonic echolocation signals produced by bats can now easily be shifted into the human hearing range, visualized, and analyzed in real-time with handheld digital devices, with a smartphone equipped with an ultrasonic microphone, or remotely monitored with web-connected recorders.18

18 http://www.bat-pi.eu/; accessed 11 Oct 2021.

1.7 Summary

Advances in electronic technology over the last 100 years, including the dramatic size reduction of equipment, increased battery life, increased data storage capacity, the switch from analog to digital recorders, along with the transition from analog to digital signal processing, have facilitated an explosion of research in the field of bioacoustics. Many of these advances were enabled by equipment developed for military use, professional music applications, human speech analysis, and for the radio, television, and film industries. Often an improvement in one type of equipment led to advancements in another. Analog devices, which stored data on magnetic tape, were replaced by digital devices, such as optical discs, hard drives, and solid-state memory cards. Microphones and hydrophones are now used in arrays that allow long-term monitoring, localization of the sound-producing animals, and 3D acoustic recording. Towed hydrophone arrays allow mobile surveys of marine sounds, which can be coupled with animal sightings and environmental data. Autonomous transducer/recorder units can be deployed for long-term monitoring of biotic and abiotic sounds in both air and water in remote habitats. Recently, smartphone applications have provided an affordable and portable bioacoustics laboratory for use by hobbyists, citizen scientists, and researchers alike.

The digital revolution in sound recording and analysis has facilitated significant advances in the field of bioacoustics and enabled the development of ecoacoustics, which joins bioacoustics and ecology, and computational bioacoustics. Acousticians are now able to study the sounds from sound-producing species in a wide variety of locations, during day and night, year-round, and often remotely. Many free and commercially available software packages for recording and analyzing acoustic data have been developed for computers, tablets, and smartphones. Artificial Intelligence is now being applied to big data problems and to bioacoustic recordings with the aim of classifying and recognizing sounds at species level. It has never been easier or cheaper to study the acoustic world ranging from infrasounds to ultrasounds. However, it is always important to know the intrinsic limitations of each piece of equipment or software, the constraints given by the environmental context, and their potential impact on the final results. It is also worth considering that bioacoustics and ecoacoustics are now being widely used to study and monitor critical and endangered species and to monitor entire ecosystems to understand climate change impacts.
localisation, & density estimation workshop. Sorbonne, Paris
Griffin DR (1944) Echolocation by blind men, bats and radar. Science 29:589–590
Griffiths ET, Barlow J (2015) Design of the drifting acoustic spar buoy recorder (DASBR). Proceedings of the 7th international workshop on detection, classification, localization, and density estimation of marine mammals using passive acoustics, San Diego, CA, USA, pp. 86
Hill AP, Prince P, Covarrubias EP, Doncaster CP, Snaddon JL, Rogers A (2018) AudioMoth: evaluation of a smart open acoustic device for monitoring biodiversity and the environment. Methods Ecol Evol:1–13
Hill P, Lakes-Harlan R, Mazzoni V, Narins PM, Virant-Doberlet M, Wessel A (eds) (2019) Biotremology: studying vibrational behavior. Springer Verlag, p 534. https://doi.org/10.1007/978-3-030-22293-2
Hopkins CD, Rosetto M, Lutjen A (1974) A continuous sound spectrum analyzer for animal sounds. Z Tierpsychol 34(3):313–320. https://doi.org/10.1111/j.1439-0310.1974.tb01804.x
Hopp SL, Owren MJ, Evans CS (eds) (1998) Animal acoustic communication: sound analysis and research methods. Springer, p 421
Jacobson EK, Forney KA, Harvey JT (2016) Evaluation of a passive acoustic monitoring network for harbor porpoise in California. Tech rep no. CEC-500-2016-008. https://doi.org/10.13140/RG.2.1.2282.9680
Johnson MP, Tyack PL (2003) A digital acoustic recording tag for measuring the response of wild marine mammals to sound. IEEE J Ocean Eng 28:3–12
Klinck H, Mellinger DK, Klinck K, Bogue NL, Luby JC, Jump WA, Shilling GS, Litchendorf T, Wood AS, Schorr GS, Baird RW (2012) Near-real-time acoustic monitoring of beaked whales and other cetaceans using a Seaglider™. PLoS One 7(5):e36128. https://doi.org/10.1371/journal.pone.0036128
Klinck H, Fregosi S, Matsumoto H, Turpin A, Mellinger DK, Erofeev A, Barth JA, Shearman RK, Jafarmadar K, Stelzer R (2015) Mobile autonomous platforms for passive-acoustic monitoring of high-frequency cetaceans. Proceedings of the 8th International Robotic Sailing Conference, Mariehamn, Åland, Finland, August 2015: 29–38
Koenig W, Dunn HK, Lacy LY (1946) The sound spectrograph. J Acoust Soc Am 18:19–49
Lammers MO, Brainard RE, Au WWL, Mooney TA, Wong KB (2008) An ecological acoustic recorder (EAR) for long term monitoring of biological and anthropogenic sounds on coral reefs and other marine habitats. J Acoust Soc Am 123:1720–1728
Little RS (2003) For the birds: the Laboratory of Ornithology and Sapsucker Woods at Cornell University. Basking Ridge, NJ
Lynch E, Angeloni L, Fristrup K, Joyce D, Wittemyer G (2013) The use of on-animal acoustical recording devices for studying animal behavior. Ecol Evol 3(7):2030–2037. https://doi.org/10.1002/ece3.608
Marler P (1955) Characteristics of some animal calls. Nature 176:6–8
Matsumoto H, Jones C, Klinck H, Mellinger DK, Dziak RP, Meinig C (2013) Tracking beaked whales with a passive acoustic profiler float. J Acoust Soc Am 133:731–740
McCauley RD, Thomas F, Parsons MJG, Erbe C, Cato D, Duncan AJ, Gavrilov AN, Parnum IM, Salgado-Kent C (2017) Developing an underwater sound recorder. Acoust Aust 45(2):301–311. https://doi.org/10.1007/s40857-017-0113-8
Melchior VR (2019) High resolution audio: a history and perspective. J Audio Eng Soc 67(5):246–257
Mellinger DK, Stafford KM, Moore SE, Dziak RP, Matsumoto H (2007) An overview of fixed passive acoustic observation methods for cetaceans. Oceanography 20(4):36–45
Munk W, Wunsch C (1979) Ocean acoustic tomography: a scheme for large scale monitoring. Deep-Sea Res 26A:123–161
Naramoto MV (2000) A concise history of acoustics in warfare. Appl Acoust 59:137–147
Nishimura CE, Conlon DM (1994) IUSS dual use: monitoring whales and earthquakes using SOSUS. Mar Tech Soc J 27:13–21
Nosengo N (2009) The neutrino and the whale. Nature 462:560–561
Obrist MK, Pavan G, Sueur J, Riede K, Llusia D, Marquez R (2010) Bioacoustic approaches in biodiversity inventories. In: Manual on field recording techniques and protocols for all taxa biodiversity inventories. Abc Taxa 8:68–99
Pavan G (1983) Ricerche di elettronica applicata alla biologia. 1. Analisi computerizzata del canto degli uccelli. Pubbl Ist Entom Univ di Pavia 24:1–43
Pavan G (1985) Analisi con calcolatore delle emissioni acustiche degli uccelli. Annuario EST (Enciclopedia della Scienza e della Tecnica). Arnoldo Mondadori, Milano, pp 135–140
Pavan G (1992) A portable DSP workstation for real-time analysis of cetacean sounds in the field. European Research on Cetaceans 6:165–169
Pavan G (1994) A digital signal processing workstation for bioacoustical research. Atti 6 conv. Ital. Ornit. Torino 1991:227–234
Pavan G (2017) Fundamentals of soundscape conservation. In: Farina A, Gage SH (eds) Ecoacoustics: the ecological role of sound. Wiley, pp 235–258
Pavan G (web) http://www.unipv.it/cibra/res_dspwstory_uk.html. Accessed 28 March 2018
Pavan G, Borsani JF (1997) Bioacoustic research on cetaceans in the Mediterranean Sea. Mar Freshw Behav Physiol 30:99–123
Pavan G, Fossati C, Manghi M, Priano M (2004) Passive acoustics tools for the implementation of acoustic risk mitigation policies. In: Evans PGH, Miller LA (eds) Proceedings of the workshop on active sonar and cetaceans, 17th ECS conference, March 2003. European Cetacean Society Newsletter no. 42 – Special Issue: 52–58
Pavan G, Favaretto A, Bovelacci B, Scaravelli D, Macchio S, Glotin H (2015) Bioacoustics and ecoacoustics applied to environmental monitoring and management. Rivista Italiana di Acustica 39(2):68–74
Payne KB, Langbauer WR Jr, Thomas EM (1986) Infrasonic calls of the Asian elephant (Elephas maximus). Behav Ecol Sociobiol 18:297–301
Pijanowski BC, Farina A, Gage SH et al (2011a) What is soundscape ecology? An introduction and overview of an emerging new science. Landsc Ecol 26:1213–1232. https://doi.org/10.1007/s10980-011-9600-8
Pijanowski BC, Villanueva-Rivera LJ, Dumyahn SL et al (2011b) Soundscape ecology: the science of sound in the landscape. Bioscience 61:203–216. https://doi.org/10.1525/bio.2011.61.3.6
Pohlmann KC (1995) Principles of digital audio. McGraw-Hill, New York
Potter RK, Kopp GA, Green HC (1947) The visible speech. Bell Laboratory, Cambridge, MA
Poulsen V (1900) Method of recording and reproducing sounds or signals. US Patent 661619A
Poupard M, Ferrari M, Schluter J, Marxer R, Giraudet P, Barchasz V, Gies V, Pavan G, Glotin H (2019) Real-time passive acoustic 3D tracking of deep diving cetacean by small non-uniform mobile surface antenna. IEEE ICASSP:8251–8255
Ranft R (1997) The wildlife section of the British Library National Sound Archive. Bioacoustics 7:315–319
Ranft R (2001) Capturing and preserving the sounds of nature. In: Linehan A (ed) Aural history: essays on recorded sound. The British Library, London, pp 65–78
Ranft R (2004) Natural sound archives: past, present and future. An Acad Bras Cienc 76(2):455–465
Rayburn RA (2011) Eargle's the microphone book: from mono to stereo to surround - a guide to microphone design and application, 3rd edn. Elsevier, Oxford
Righini R, Pavan G (2019) First assessment of the soundscape of the integral nature reserve "Sasso Fratino" in the central Apennine, Italy. Biodivers J 21(1):4–14. https://doi.org/10.1080/14888386.2019.1696229
Robjohns H (2010) A brief history of microphones. http://microphone-data.com/media/filestore/articles/History-10.pdf
Rossing TD (ed) (2007) Springer handbook of acoustics. Springer, New York. 1167 pp with CD-ROM
Rumsey F, McCormick T (2009) Sound and recording. Focal Press, New York. 628 pp
Sales G, Pye D (1974) Ultrasonic communication by animals. Chapman and Hall, London
Sciacca V, Caruso F, Beranzoli L, Chierici F, De Domenico E, Embriaco D, Favali P, Giovanetti G, Larosa G, Marinaro G, Papale E, Pavan G, Pellegrino C, Pulvirenti S, Simeone F, Viola S, Riccobene G (2015) Annual acoustic presence of fin whale (Balaenoptera physalus) offshore eastern Sicily, Central Mediterranean Sea. PLoS One 10(11):e0141838. https://doi.org/10.1371/journal.pone.0141838
Sethi SS, Ewers RM, Jones NS, Orme CDL, Picinali L (2018) Robust, real-time and autonomous monitoring of ecosystems with an open, low-cost, networked device. Methods Ecol Evol 9(12):2383–2387
Sousa-Lima R, Norris TF, Oswald JN, Fernandes DP (2013) A review and inventory of fixed autonomous recorders for passive acoustic monitoring of marine mammals. Aquat Mamm 39:23–53
Specht R (1992) Sonagraphische Analysen mit dem Personal Computer. Der Falke 1992:281–283
Stafford KM, Fox CG, Clark DS (1998) Long-range acoustic detection and localization of blue whale calls in the Northeast Pacific Ocean. J Acoust Soc Am 104(6):3616–3625
Stoeger AS, Heilmann G, Zeppelzauer M, Ganswindt A, Hensman S, Charlton BD (2012) Visualizing sound emission of elephant vocalizations: evidence for two rumble production types. PLoS One 7(11):e48907. https://doi.org/10.1371/journal.pone.0048907
Streicher R, Everest FA (1998) The new stereo soundbook. AES Publishing, Pasadena, CA
Strong WJ, Palmer EP (1975) Computer-based sound spectrograph system. J Acoust Soc Am 58:899–904
Sueur J (2018) Sound analysis and synthesis with R. Springer, New York
Sueur J, Aubin T, Simonis C (2008) Seewave: a free modular tool for sound analysis and synthesis. Bioacoustics 18:213–226
Thomas JA, Kuechle VB (1982) Quantitative analysis of Weddell seal (Leptonychotes weddellii) underwater vocalizations at McMurdo Sound, Antarctica. J Acoust Soc Am 72(6):1730–1738
Thomas JA, Awbrey FT, Fisher SR (1986) Use of acoustic techniques for studying whale behavior. Rep Int Whal Commn Special Issue 8, No. SC/A82/Bw10: 121–138
Thomas JA, Fisher SR, Ferm LM, Holt RS (1987) Acoustic detection of cetaceans using a towed-array of hydrophones. Rep Int Whal Commn Special Issue 8, No. SC/37/03: 139–148
Thorpe WH (1954) The process of song-learning in the chaffinch as studied by means of the sound spectrograph. Nature 173:465
Tremblay C, Calupca T, Clark CW, Robbins M, Spaulding E, Warde A, Kemp J, Newhall K (2009) Autonomous seafloor recorders and auto detection buoys to monitor whale activity for long-term and near-real-time applications. J Acoust Soc Am 125:2548
Tyack PL (2009) Human-generated sound and marine mammals. Phys Today, November: 39–44
Tyack PL, Zimmer WMX, Moretti D, Southall BL, Claridge DE, Durban JW, Clark CW, D'Amico A, DiMarzio N, Jarvis S, McCarthy E, Morrissey R, Ward J, Boyd IL (2011) Beaked whales respond to simulated and actual navy sonar. PLoS One
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
2 Choosing Equipment for Animal Bioacoustic Research
Shyam Madhusudhana, Gianni Pavan, Lee A. Miller,
William L. Gannon, Anthony Hawkins, Christine Erbe,
Jennifer A. Hamel, and Jeanette A. Thomas
non-optimal sampling frequency can produce misrepresentations of components in the original waveform, which often manifest as artifacts in a spectrographic display but are not actually present in the original signal (see Chap. 4, section on aliasing). In a spectrogram, the alias is mostly in the higher frequency region and appears as the mirror-image of the actual signals beyond the Nyquist frequency (Fig. 2.2). In digital recording, anti-aliasing filters (Sect. 2.3.2.2) are required before the sampling stage to prevent aliasing from sounds that have components higher than the Nyquist frequency.

2.2.3 Amplitude Sensitivity

1 mV/Pa (= 0.001 V/Pa) can be expressed as −60 dB re 1 V/Pa. Note that an rms sound pressure of 1 Pa is equal to a sound pressure level (SPL) of 94 dB re 20 μPa, because

$$1\ \mathrm{Pa} = 1{,}000{,}000\ \mu\mathrm{Pa} = 50{,}000 \times 20\ \mu\mathrm{Pa};$$

apply $20\log_{10}$ and get: $20\log_{10}(50{,}000) \approx 94$.

The most sensitive sensor is not necessarily the "best" sensor. When attempting to capture very loud sound, less sensitive equipment should be chosen to avoid signal distortion or, in extreme cases, damaging the equipment. If only a sensor of low sensitivity is available, then an amplifier may be used in the recording chain, but self-noise may become an issue. High sensitivity allows lower gain settings to promote a good recording.
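The decibel conversions in the sensitivity discussion above can be checked with a short sketch. The function names and the 94-dB example input are illustrative assumptions; the reference values (1 V/Pa for sensitivity, 20 μPa for sound pressure in air, 1 μPa in water) are the ones quoted in the text.

```python
import math


def sensitivity_db(volts_per_pascal):
    """Sensitivity in dB re 1 V/Pa."""
    return 20 * math.log10(volts_per_pascal)


def spl_db(pressure_pa, ref_pa=20e-6):
    """Sound pressure level of an rms pressure, in dB re ref_pa (20 uPa in air)."""
    return 20 * math.log10(pressure_pa / ref_pa)


def output_voltage(spl, sens_db, ref_pa=20e-6):
    """rms output voltage of a sensor for a given SPL and sensitivity."""
    pressure_pa = ref_pa * 10 ** (spl / 20)
    return pressure_pa * 10 ** (sens_db / 20)


print(f"1 mV/Pa  = {sensitivity_db(0.001):.0f} dB re 1 V/Pa")
print(f"1 Pa rms = {spl_db(1.0):.1f} dB re 20 uPa")
print(f"1 Pa rms = {spl_db(1.0, ref_pa=1e-6):.0f} dB re 1 uPa")
# Hypothetical case: a 94-dB tone on a -60 dB re 1 V/Pa microphone.
print(f"output   = {output_voltage(94, -60) * 1000:.2f} mV rms")
```

Working in decibels lets the gains and sensitivities along a recording chain be added rather than multiplied, which is why equipment specifications are normally given this way.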
Fig. 2.2 Spectrogram (top) and oscillogram (bottom) of the output of an AD-converter with a sinusoidal frequency sweep from 40 kHz to 100 kHz as input. Sampling frequency 96 kHz, and thus Nyquist frequency 48 kHz. In an ideal system with a sharp anti-aliasing filter, the spectrogram would only go up to 48 kHz and show nothing once the signal frequency went beyond Nyquist. In this real-world example, however, as the signal frequency $f$ exceeds the Nyquist frequency $f_N$, the alias (appearing as the downsweep) is created with frequency $f_N - (f - f_N) = 2f_N - f$. As such, a 50-kHz input produces a 46-kHz alias and a 52-kHz input produces a 44-kHz alias, etc. The amplitude of the alias depends on the attenuation of the anti-aliasing filter at the input frequency. An attenuation of 10 dB at 50 kHz produces an alias at 46 kHz with a level of −10 dB relative to the input level. Spectrogram generated by SeaPro (http://www.unipv.it/cibra/seapro.html; accessed 15 Mar. 2021) software
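The folding described in the caption of Fig. 2.2 can be reproduced with a few lines of code. This is a minimal sketch that assumes an ideal sampler with no anti-aliasing filter; the function name `alias_frequency` is hypothetical, and the sampling frequency and input tones are taken from the caption.

```python
def alias_frequency(f_in, fs):
    """Apparent frequency of a tone at f_in after sampling at fs."""
    f_nyquist = fs / 2
    # Sampling cannot distinguish f_in from f_in plus any multiple of fs.
    f_folded = f_in % fs
    # Components above Nyquist are mirrored back into the 0..Nyquist band.
    if f_folded > f_nyquist:
        f_folded = fs - f_folded
    return f_folded


FS = 96_000  # sampling frequency from Fig. 2.2, Nyquist frequency 48 kHz
for f in (40_000, 50_000, 52_000, 100_000):
    print(f"{f} Hz input appears at {alias_frequency(f, FS)} Hz")
```

Inputs below the Nyquist frequency are returned unchanged, while the 50-kHz and 52-kHz inputs fold to 46 kHz and 44 kHz, matching the downsweep in the figure.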
40 S. Madhusudhana et al.
dynamic range of about 96 dB (unipolar, 90 dB bipolar) and 24 bits theoretically produce a dynamic range of 144 dB (unipolar, 138 dB bipolar), thus encompassing the dynamic range of human hearing. However, even the best analog circuits rarely exceed 110 dB of dynamic range. This means that of the available 24 bits, only about 18–20 bits are effectively used to encode the sound and the others are dominated by noise. In many conditions, the real dynamic range is limited to 70–80 dB by the noise of the sensor and preamplifier. An accurate setting of the recording levels can allow effective use of 16-bit recorders, without wasting the extra storage space required for 24-bit recording. However, when incoming sound levels cannot be predicted, the 24-bit setting allows additional dynamic range for unpredictable sound events (e.g., high-intensity impulsive noises such as from pile driving). The recording level should be set to exploit the dynamic range of the recording setup: high enough to rise above the equipment self-noise during quiet times, but not so high as to cause clipping of loud sounds. Recently introduced recorders allow 32-bit floating-point recording by combining the output of two 24-bit converters working with different signal gains. This simplifies the setting of recording levels but cannot yet overcome the dynamic range limitations of the microphones and of associated preamplifiers.

can generate broadband background noise with various spectral shapes (i.e., not necessarily flat across the frequency band, like white noise, but worse at higher frequencies). The level of this noise is expressed in decibels (e.g., dB(A) after frequency weighting, dB re 20 μPa unweighted in air, or dB re 1 μPa unweighted in water) to indicate the equivalent sound level of noise as if generated by the environment. The self-noise of a sensor is almost always declared in its technical specifications; the same is true for professional recorders. In contrast, for many consumer recorders, even of high quality, self-noise measurements are rarely available. A useful comparison of the self-noise of consumer recorders available on the market is presented on the website of Avisoft Bioacoustics.1

The noisiest component of the chain determines the quality of the recording. This is particularly important when recording low-level sounds (Fig. 2.3). The input self-noise is expressed as the Equivalent Input Noise (EIN) measured in an open or unloaded circuit and expressed in dBU (the "U" stands for "unloaded"). Very good values range from −130 dBU to −120 dBU, and poor recorders have a −100 dBU EIN.

2.3 Instrumentation of Signal Chain Components
Fig. 2.3 Spectrogram depicting high self-noise versus low self-noise output by three microphone/recorder combinations. In the left section, a low-noise system was used and the signal clearly emerged from the environment background. In the following sections, noisier systems were used; the sounds appear unclear and listening was unpleasant
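As a check on the bit-depth figures quoted in the dynamic range discussion (96 dB for 16 bits, 144 dB for 24 bits), the relationship between word length and theoretical dynamic range can be sketched as follows. The formula assumes an ideal linear PCM converter, and the function names and the 96-kHz mono recording used for the storage estimate are illustrative assumptions.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def dynamic_range_db(bits):
    """Theoretical dynamic range of an ideal linear PCM converter."""
    return bits * DB_PER_BIT


def bits_for_dynamic_range(range_db):
    """Number of bits needed to span a given dynamic range."""
    return range_db / DB_PER_BIT


def megabytes_per_hour(fs_hz, bits, channels=1):
    """Uncompressed PCM storage per hour of recording, in MB (10**6 bytes)."""
    return fs_hz * (bits / 8) * channels * 3600 / 1e6


for bits in (12, 16, 24):
    print(f"{bits} bits: {dynamic_range_db(bits):.1f} dB")
print(f"110 dB spans about {bits_for_dynamic_range(110):.1f} bits")
# Hypothetical recording setup: 96 kHz sampling frequency, one channel.
print(f"16 bit: {megabytes_per_hour(96_000, 16):.0f} MB/h, "
      f"24 bit: {megabytes_per_hour(96_000, 24):.0f} MB/h")
```

Under these assumptions, an analog chain limited to 110 dB fills only about 18 of the 24 available bits, while 24-bit files need 1.5 times the storage of 16-bit files. The bipolar figures in the text correspond to one bit fewer (15 and 23 bits).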
Fig. 2.4 Schematic of a dynamic microphone (left) and a condenser microphone (right) showing the conversion of sound waves into electrical audio-signal outputs. Microphone schematic components: 1. vibrating diaphragm, 2. coil attached to the diaphragm, 3. magnet, 4. backplate, 5. battery, 6. resistor, 7. output
directionality. A sound sensor, to be omnidirectional, should be smaller than the minimum wavelength of the signal to be received. Large sensors are more sensitive but tend to limit responses at high frequencies. Large sensors become directional at lower frequencies than small sensors do.

2.3.1.1 Microphones
Microphones convert sound energy (from sound waves) into an electrical audio-signal using a moving diaphragm or membrane. Two main types of microphones are common: dynamic microphones and electrostatic microphones (condenser and electret microphones) (Brüel and Kjær 1982). Some microphones are sensitive to particle motion, as well as sound pressure, which results in them being very sensitive to sounds very close to the microphone (i.e., in the near-field). This often exaggerates the low-frequency components of the received sound.

In dynamic microphones, a coil on the back of the diaphragm is immersed in a magnetic field and generates a current by electromagnetic induction when the membrane moves (Fig. 2.4). Such microphones do not require external power, but they have limited sensitivity, making them most useful for loud signals or at close range to the sound source. The delicate mechanical suspension in dynamic microphones may warrant gentle handling.

Electrostatic microphones are based on a condenser with a thin moving diaphragm (Fig. 2.4). Movement of the diaphragm changes capacitance in the condenser. Capacitance changes are then converted to voltage. Condenser microphones need a high voltage to polarize the condenser. In contrast, electret microphones are permanently polarized as their diaphragms are made of metallic-coated, pre-polarized, plastic membrane. Both condenser and electret microphones need power for their integrated preamplifier, with condenser microphones requiring additional power to polarize the condenser. This power may be supplied by an internal 3–5 V battery, 48-V phantom power (P48), or a Power-In-Plug (PIP) unit. P48 is a standard means of feeding power to a condenser microphone with 48 Vdc and is commonly used in professional recorders. Modern pocket digital recorders use PIP units for powering their microphones. The membranes in electrostatic microphones are delicate and sensitive to humidity, which can be problematic in humid environments. The lower mass of electrostatic elements generally yields superior high-frequency response. However, electrostatic sensors may be noisier than dynamic sensors. For studies involving low-frequency sounds, dynamic sensors may be a better choice.

A radio-frequency microphone is a special type of condenser microphone, developed by Sennheiser2 in its MKH series. With this type of microphone, variations of the capacitor modulate the frequency of a radio-frequency oscillator, and then a demodulator extracts the audio-signal to be

2 http://www.sennheiser.com/; accessed 15 Mar. 2021.
transmitted over a cable. The radio-frequency oscillator and the demodulator are both housed inside the microphone, and these microphones are less prone to problems of interference and humidity.

The more recently developed Micro-Electrical-Mechanical System (MEMS) microphones have pressure-sensitive elements integrated directly into a silicon chip (as found in most cell phones) with similar fabrication technologies used to make semi-conductor devices. Some integrate an AD-converter to produce a digital output. Their development resulted from the need for tiny microphones for cell phones. Because of the small size and low inertia of their sensors, MEMS microphones are sensitive to high frequencies and consequently are used in ultrasonic microphones, such as in bat detectors. Because of their low cost, they are the perfect candidates for array applications, including “acoustic cameras” that overlay the image taken by a video-camera with a map of the sound sources generated by a matrix of tens or hundreds of MEMS microphones.

Most condenser microphones have a self-noise lower than 20 dB(A), which is sufficient to record music or speech at a close distance, but not suited to record faint animal sounds and noises in a quiet environment. The quietest studio microphones have a self-noise below 10 dB(A); among these microphones is the Rode NT1A, a cardioid microphone that has an excellent self-noise of only 5.5 dB(A). Even quieter microphones are available in the category of instrumentation microphones, but few very expensive models are available. Lynch et al. (2011) and Pavan (2017) used very quiet instruments to show that noise in natural environments can be as low as 10 dB re 20 μPa and even go below 0 dB re 20 μPa below 1 kHz. Of course, a quiet microphone must be connected to a quiet recorder!

Sometimes, microphone specifications are difficult to read or self-noise is not provided. One must examine the parameters that are given, such as amplitude sensitivity and the signal-to-noise ratio (SNR). If not differently declared, the SNR is relative to 94 dB re 20 μPa (i.e., 1 Pa) at 1 kHz and thus the self-noise can be obtained by subtracting the given SNR from 94. If properly measured and reported, an SNR of 80 dB(A) means a self-noise of 14 dB(A), which is pretty good. In other cases, the sensitivity, the maximum allowed SPL, and the dynamic range are presented. In this case, the self-noise can be obtained by subtracting the dynamic range from the maximum allowed SPL.

Ultrasonic and Infrasonic Microphones
Microphones for ultrasounds are typically small, with a small membrane with very low inertia. Ultrasonic microphones are usually condenser microphones developed for measurement purposes, not for recording music; however, the increasing interest in ultrasonic communication and echolocation in animals (mainly bats and rodents, but also insects) has fostered the development of a wide range of sensors for ultrasounds. Ultrasonic microphones for measurement purposes need to have a flat frequency response; usually they also have high self-noise and are very expensive. If the flatness of the frequency response is not a necessity, other, lower-cost microphones can be used instead (e.g., low-cost small condenser microphones and tiny MEMS microphones). Considering that ultrasonic microphones need high sampling rates, often beyond those available in consumer digital recorders or AD-converters (see Sect. 2.3.4), ultrasonic sensors with integrated AD-converter and USB interface have been developed. In bioacoustic studies, these are mainly used for detecting and recording bats (Sect. 2.3.5), insects (Buzzetti et al. 2020), and rodents either in the wild or in etho-pharmacological studies (Buck et al. 2014).

Infrasonic microphones are specially designed for low-frequency recording, down to 1 Hz or even 0.1 Hz. Until a few decades ago, Sennheiser produced the MKH 110, a condenser microphone with 12-V powering. Now discontinued, it is still appreciated in the used equipment market. These microphones have been widely used to record elephant communication (Payne et al. 1986; Poole et al. 1988). Currently, microphones designed for infrasonic applications are largely
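The self-noise arithmetic described above for reading microphone specifications can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the 94 dB reference applies only when the datasheet does not state otherwise, and the second example uses made-up datasheet values.

```python
# Minimal sketch of the two self-noise estimates described in the text.
# Function names and the second example's values are illustrative assumptions.

REFERENCE_SPL_DB = 94.0  # 94 dB re 20 uPa (1 Pa) at 1 kHz, the usual SNR reference


def self_noise_from_snr(snr_db, reference_spl_db=REFERENCE_SPL_DB):
    """Self-noise = reference level minus the quoted signal-to-noise ratio."""
    return reference_spl_db - snr_db


def self_noise_from_dynamic_range(max_spl_db, dynamic_range_db):
    """Self-noise = maximum allowed SPL minus the quoted dynamic range."""
    return max_spl_db - dynamic_range_db


# Example from the text: an SNR of 80 dB(A) corresponds to 14 dB(A) self-noise.
print(self_noise_from_snr(80.0))

# Hypothetical datasheet: 130 dB maximum SPL and 116 dB dynamic range.
print(self_noise_from_dynamic_range(130.0, 116.0))
```

Both estimates are only as good as the datasheet values behind them; weighting (e.g., A-weighting) must be the same for the quantities being combined.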
Fig. 2.5 Polar patterns of directionality of different microphones. With microphones facing the top of the page, these patterns extend from the axis of the microphones, and thus present directivity in the vertical plane. In the horizontal plane, these patterns are symmetrical (i.e., they rotate about the vertical axis). (a) omnidirectional, (b) cardioid, (c) bidirectional (figure-of-8), and (d) shotgun (lobar). [Polar plots: 0° at top, 180° at bottom; radial scale from −5 dB to −25 dB in 5-dB steps]
figure-of-8 pattern equally from two, opposite directions.

Shotgun microphones (Fig. 2.5d) are the most directional and commonly used for recording a specific animal. Their use is desirable when it is necessary to improve the recording level of a specific sound source, or to attenuate unwanted sound coming from other directions. The design of shotgun microphones (such as the Sennheiser K6/ME66 or the MKH 8070) is based on the interference tube principle; usually a cardioid condenser microphone is placed at the end of a tube with slits on sides, canceling off-axis signals (Fig. 2.6). The directivity increases with the length of the interference tube and with the frequency of incoming signals, so that at high frequency (> 4 kHz), the receiving lobe is quite narrow. For lower frequencies, the directivity decreases. This also means that off-axis sounds are not only attenuated, but also have a modified frequency spectrum, with high frequencies more attenuated than low frequencies. At wavelengths longer than tube length, off-axis attenuation is null. If interested in higher frequencies, such as bird songs above 1 kHz, a high-pass filter to cut off low frequencies (e.g., to attenuate wind noise or traffic noise below 150 Hz) is available in high-quality microphones.
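The statement that off-axis attenuation vanishes at wavelengths longer than the tube can be turned into a rough frequency estimate. The sketch below assumes a sound speed of 343 m/s in air (about 20 °C) and a hypothetical tube length of 0.25 m; neither value comes from a specific microphone, and real interference tubes lose directivity gradually rather than at a sharp limit.

```python
# Rough estimate of the frequency below which an interference tube stops
# attenuating off-axis sound (wavelength longer than the tube).
# The sound speed and tube length are illustrative assumptions.

SPEED_OF_SOUND_AIR = 343.0  # m/s, assumed for air at about 20 degrees C


def wavelength_m(frequency_hz, c=SPEED_OF_SOUND_AIR):
    """Wavelength of a tone in air."""
    return c / frequency_hz


def tube_limit_frequency_hz(tube_length_m, c=SPEED_OF_SOUND_AIR):
    """Frequency whose wavelength equals the tube length."""
    return c / tube_length_m


tube_length = 0.25  # m, hypothetical shotgun tube
limit = tube_limit_frequency_hz(tube_length)
print(f"Tube of {tube_length} m: little off-axis rejection below ~{limit:.0f} Hz")

for f in (150.0, 1000.0, 4000.0):
    lam = wavelength_m(f)
    relation = "longer" if lam > tube_length else "shorter"
    print(f"{f:.0f} Hz: wavelength {lam:.3f} m ({relation} than the tube)")
```

This is why a long tube is needed to keep directivity at the lower end of the bird-song range, and why low-frequency wind or traffic noise is handled with a high-pass filter rather than by the tube.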
Fig. 2.6 Photograph (left) of a modular microphone (Sennheiser K6/ME66) with the preamplifier body that hosts a battery to power the microphone in case the P48 powering is not available; the sensing capsule is interchangeable (omni ME62, cardioid ME64, short shotgun ME66, shotgun ME67). Polar pattern (top-right) of the microphone at different frequencies and the frequency response (bottom-right) on axis and at 90° from the sound. Reprinted with permission from Sennheiser

Fig. 2.7 XY recording configuration (left) using two cardioid microphones, and MS recording configuration (right) which typically combines a cardioid microphone in the middle and a bidirectional microphone taking the sounds coming from the sides (figure-of-8 polar pattern)
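For the MS configuration of Fig. 2.7, the combination of the mid (cardioid) and side (figure-of-8) signals into a left/right stereo pair is commonly done with a sum-and-difference matrix. The sketch below uses that common convention, which is an assumption rather than a description of any particular recorder; the function name and the `width` parameter are illustrative.

```python
# Minimal sketch of mid/side (MS) to left/right decoding using the common
# sum-and-difference matrix: L = M + w*S, R = M - w*S.
# The function name and the width parameter are illustrative assumptions.


def ms_to_lr(mid, side, width=1.0):
    """Decode mid and side sample sequences into left and right channels.

    width scales the side signal: 0 gives mono (left == right == mid),
    larger values widen the stereo spread.
    """
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right


mid = [1.0, 0.5, -0.25]    # cardioid capsule samples (made-up values)
side = [0.25, -0.5, 0.0]   # figure-of-8 capsule samples (made-up values)

left, right = ms_to_lr(mid, side)
print(left, right)

mono_left, mono_right = ms_to_lr(mid, side, width=0.0)
print(mono_left == mono_right)
```

Because the width is applied after recording, the stereo spread can be adjusted in post-processing, and setting it to zero recovers a mono-compatible signal.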
produces the best spatial image when heard through headphones. In some setups, cardioid microphones angled at 60°–90°, like in the XY configuration, are used to enhance left-right separation.

In the MS microphone stereo recording setup, a cardioid microphone is piggy-backed on top of a bidirectional microphone. The cardioid picks up frontal information, whereas the bidirectional microphone gets sounds coming from the sides only. This type of recording requires specific electronics, or signal processing to combine the signals to produce a traditional stereo image. In essence, the signals from the left and right capsules are summed out-of-phase before being combined with the mono-signal. This computation allows the recordist to control the width of the stereo spread and make other adjustments in post-processing. In the early stages of the sound industry, this helped to maintain the compatibility among mono and stereo recordings. Several microphone arrangements have been developed for stereophonic recording; for a comprehensive review, see Rayburn (2011) or Streicher and Everest (1998).

Latest developments, mainly driven by the film industry to produce an immersive 3D (full-sphere, surround-sound) acoustic environment, capture sound not only in the horizontal plane, but also above and below the listener. Surround-sound recording requires several microphones in a 3D configuration, whose signals (channels) are electronically or digitally combined to produce both stereo and multi-channel surround-sound experiences, or to create specific receiving beams (e.g., to focus on a sub-space or on a specific source). The Ambisonics system allows recording of sound pressure on 3 axes with 4 microphone capsules mounted as a small tetrahedron (first order Ambisonics) (Zotter and Frank 2019). Higher-order Ambisonics microphones can have up to 32 capsules on a small sphere to achieve higher directional details and to simulate virtual directional microphones to be oriented in any direction during post-processing.

Microphone Arrays
Arrays of sound sensors are used to monitor animals across habitats, locate and track sound sources (such as individual animals), and study environmental noise. Arrays may be stationary (fixed in location), freely drifting (e.g., suspended from balloons), or towed. Ambisonic microphones are a special case of microphone
Fig. 2.8 Noise pattern observed at Sapsucker Woods (Ithaca, NY, USA), caused by a jet plane taking off from a nearby runway. Receiver locations are denoted by white circles. Regions shaded red show high noise levels and follow parallel to the flight path. The ambient noise levels are raised by about 15 dB at the frequencies of chorusing birds (2 to 4 kHz). Image courtesy of Dimitri Ponirakis, K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology
Fig. 2.10 Diagram of a parabolic dish and microphone used to record a bird on a tree. The parabolic solution gives
added amplification and directivity, which helps in recording a single animal, a quiet animal, or animals at a distance
Fig. 2.11 Sketch of frequency response and gain of a generic microphone placed in parabolas of different diameters. The red lines show the frequency response of an ideal microphone, with the option of a high-pass filter to reduce low-frequency noise below 80 Hz. The blue lines show the theoretical gain of three parabolas of different sizes. The gain is proportional to frequency and to the parabola diameter. Actual response may vary depending on the shape and depth of the parabola and on the response and positioning of the microphone
adequate for detecting this range of mid-frequency calls.

To produce a more pleasant recording, it is possible to record in stereo by using two microphones in the focus, separated by a thin plate. This way, sounds coming from the frontal axis of the parabola reach both microphones with the same level, while off-axis sounds are focused more on one side. Another option is to place an MS microphone combination in the focus of the parabola. Listening with headphones helps in pointing the parabola on the source of interest and gives immediate feedback on the quality of the sounds being recorded. When analyzing recordings made with a parabola, it is important to take into account that the frequency response is not flat as it increases with frequency (Fig. 2.11). In some cases, slightly moving the microphone out of focus reduces the high-frequency emphasis and produces a more pleasant sound.

2.3.1.2 Hydrophones
A hydrophone is a piezoelectric transducer that converts sound waves in water to electrical signals. Hydrophones can receive sound in air, but the sound has to be of very high amplitude. Because the acoustic impedances of the medium and the sensor match much better in water than in air, hydrophones have to be less sensitive, or they would easily overload. The underwater sensor usually is sealed in a resin package with a waterproof connector and needs to be handled with care. After use in saltwater, a hydrophone should be rinsed with freshwater or else connections are likely to corrode.

A piezoelectric transducer can be used as a sensor or projector; however, when the transducer has a built-in preamplifier, it can no longer be used as a projector, but only as a sensor. Hydrophones are much less sensitive, and a great deal of power is needed (from an external amplifier) to drive a hydrophone as a projector. As a sensor, a hydrophone can have a built-in preamplifier that matches the frequency response, dynamic range, and high impedance of the transducer. A few hydrophones on the market with built-in preamplifier (Fig. 2.12) can be powered directly by a recorder, computer, or analysis system (e.g., either by P48 or by PIP at 2–5 Vdc). Most preamplified hydrophones require powering through dedicated cables and can require single or dual powering (e.g., +12 V, or −12 V and +12 V) to be provided by a battery box (Fig. 2.12). A popular low-cost hydrophone is the H2c from Aquarian Audio,7 which allows PIP powering. The DolphinEar8 is an inexpensive, lightweight,

7 http://www.aquarianaudio.com/; accessed 15 Mar. 2021.
8 http://www.dolphinearglobal.com/; accessed 19 Jun. 2022.
Fig. 2.12 Photographs of an ITC 6050C hydrophone with built-in preamplifier and external battery power (left) and a
Cetacean Research Technology C57 hydrophone with cable and battery box (right; courtesy of J R Olson)
Fig. 2.13 Specifications and polar plot of directional ITC 3003D transducer (left) and omnidirectional ITC 1007 transducer (right). Reprinted with permission from Gavial ITC (https://www.gavial.com/itc-products; accessed 22 Aug. 2021)
transducer, sounds are received and projected uniformly in the horizontal plane, assuming the transducer is suspended vertically. In the vertical plane, the transducer will have a directivity pattern. If the transducer has a planar shape, it will have two beams on its opposite faces as shown in the left polar plot in Fig. 2.13. When used as a sensor, a spherical hydrophone is typically omnidirectional (receives sounds equally from all directions) as shown by the right polar plot of Fig. 2.13. Used as a projector, the directivity pattern of a hydrophone changes depending on the frequency being projected (directivity increases with frequency).
sound (<15 kHz). In more recent years, lightweight, wideband towed arrays sensitive up to 100 kHz and more have been developed to meet the requirements of researchers aiming to study marine mammals from small platforms, such as sailboats (Pavan and Borsani 1997; Pavan et al. 2013). By simultaneously processing sound from more than one hydrophone (or group of hydrophones), the bearing (or even location) of the vocalizing animal may be determined (see Chap. 4, section on sound localization). Towed arrays are used for line-transect surveys and to sample animals in their environment over a wide geographic range.

A straight-line array cannot resolve between signals arriving from the port or starboard side without the vessel changing course or using multiple array deployments (Thode et al. 2010). Large arrays (sometimes hundreds of sensors, possibly with different frequency sensitivities and bandwidths) allow tracking of multiple sources simultaneously by selective beamforming (Zimmer 2011). More complex towed systems use a 3D hydrophone configuration called a volumetric array (Zimmer 2013) or vector sensors (Thode et al. 2010) to locate sound sources in three dimensions. Acoustic vector sensors are sensitive to particle velocity rather than to pressure and hence sense the direction of incoming sound waves and resolve the directional ambiguities. Thode et al. (2010) attached a vector sensor module to the end of an 800-m towed array to detect sperm whale clicks and compute unambiguous bearing estimates of whales over time.

Many towed arrays have a depth sensor, so the operator knows the tow-depth in relation to the sound velocity profile in the water column. Such information allows the user to position the array either in a surface duct or below the thermocline to listen to sounds coming from deep water (see Chap. 6 on sound propagation under water). Additionally, the depth information enables subsequent array processing to exploit the surface effects on sound propagation to improve localization accuracy.

Array performance is degraded (in particular below ~1 kHz) by vessel self-noise, hydrodynamic noise artifacts (flow noise), and non-acoustic mechanical vibration, which reduce the ability to capture low-frequency animal sounds and which can cause an acoustic overload of the recording chain. To mitigate these issues, tow speed should usually not exceed 6 knots. A long cable with special elastic sections in the array can dampen vibrations. Flow- and vessel-noise can be mitigated with a smooth high-pass filter (e.g., 500 Hz, 12 dB/octave; see Sect. 2.3.2.1).

Deployment Considerations
To operate properly, hydrophones must have little vertical or horizontal movement. Water flow over the surface of the hydrophone generates pressure fluctuations, which appear as noise in spectrograms but which are not due to an acoustic wave. This flow noise is an artifact of deployment (see Chap. 3, section on flow noise). It is typically of low to mid frequencies (see, for example, the spectrogram in Fig. 3 in Erbe et al. (2015) showing flow noise in marine soundscape recordings) and thus can be filtered out with a high-pass filter, but this limits the recording of low-frequency sounds. Large or rapid vertical or horizontal movement of a hydrophone (e.g., if it is deployed over the side of a boat) may cause the system to be saturated with no useable recordings collected. It is very difficult to make good recordings in the open ocean; a hydrophone often needs to have its own flotation system, rather than be suspended from a boat; otherwise, the movement of the boat will translate into movement of the hydrophone. The horizontal component of water flow past a hydrophone may be minimized by deploying freely drifting hydrophone systems (e.g., suspended from a freely drifting buoy). The vertical component of water flow past a hydrophone may be minimized by dampening systems; for example, suspending the recorder on a bungee with a movement-dampening drogue, or by using a catenary flotation line (see Chap. 3 and Fig. 5 in Erbe et al. 2019). In towed arrays, long towing cables and specifically designed hydrophones (acceleration-compensated) are used to avoid saturation of the hydrophones from movement.
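The port/starboard ambiguity of a straight-line towed array noted above follows directly from the geometry of a two-hydrophone bearing estimate. The sketch below is a simplified far-field illustration, not the processing used in the cited studies: it assumes a plane wave, a constant sound speed of 1500 m/s, and hypothetical values for the hydrophone spacing and the measured arrival-time difference.

```python
import math

# Simplified far-field bearing estimate from the arrival-time difference
# between two hydrophones on a towed line array.
# Sound speed, spacing, and delay values are illustrative assumptions.

SOUND_SPEED_WATER = 1500.0  # m/s, nominal value; varies with the water column


def bearings_from_delay(delay_s, spacing_m, c=SOUND_SPEED_WATER):
    """Return the two candidate bearings (degrees) relative to the array axis.

    delay_s is the arrival-time difference between the two hydrophones.
    A line array cannot tell +theta from -theta (port versus starboard).
    """
    cos_theta = c * delay_s / spacing_m
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against measurement noise
    theta = math.degrees(math.acos(cos_theta))
    return theta, -theta


# Hypothetical case: hydrophones 3 m apart, 1 ms arrival-time difference.
candidate_a, candidate_b = bearings_from_delay(0.001, 3.0)
print(f"Candidate bearings: {candidate_a:.1f} and {candidate_b:.1f} degrees")
```

Both candidates fit the measured delay equally well, which is why a course change, a second array, or a vector sensor is needed to resolve the side.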
Fig. 2.16 Spectrogram of a sinusoidal tone sampled at 44,100 Hz with a poor AD-converter (top panel). Note the low-intensity broadband noise (blue components) due to random jitter around the red line representing the tone’s central frequency. Spectrogram of the same sinusoidal tone sampled at 44,100 Hz with a good AD-converter (middle panel); the broad blue band is absent in this image. The bottom panel shows the constant amplitude of the signal waveform
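The relation between sampling frequency and the highest frequency in the input signal, discussed in this section, can be checked numerically. The sketch below applies the standard sampling theorem (content above half the sampling frequency is folded back into the recorded band); the function names are illustrative, and the 45-kHz tone is only an example of ultrasonic content reaching a 44,100-Hz converter without adequate low-pass filtering.

```python
# Minimal sketch of the sampling-theorem checks behind the choice of
# sampling frequency. Function names and example values are illustrative.


def nyquist_frequency_hz(sampling_rate_hz):
    """Highest frequency that can be represented at a given sampling rate."""
    return sampling_rate_hz / 2.0


def is_adequately_sampled(max_signal_hz, sampling_rate_hz):
    """True if the sampling rate exceeds twice the highest signal frequency."""
    return sampling_rate_hz > 2.0 * max_signal_hz


def aliased_frequency_hz(signal_hz, sampling_rate_hz):
    """Frequency at which an unfiltered tone appears in the recording."""
    folded = signal_hz % sampling_rate_hz
    if folded > sampling_rate_hz / 2.0:
        folded = sampling_rate_hz - folded
    return folded


print(nyquist_frequency_hz(44100.0))
print(is_adequately_sampled(45000.0, 44100.0))
print(is_adequately_sampled(45000.0, 192000.0))
print(aliased_frequency_hz(45000.0, 44100.0))
```

An unfiltered 45-kHz component recorded at 44,100 Hz would show up as a spurious tone well inside the audible band, which is the reason for low-pass filtering before the AD-converter when the upper-frequency content is unknown.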
higher than the highest frequency contained in the input signals. If the upper-frequency content of the signal (including any possible noise or interference such as those generated by video monitors, digital networks, and switching power supplies) is unknown, use a good-quality, low-pass external filter at the known or presumed upper cut-off frequency while recording and digitally filter and down-sample the recorded file thereafter. It is also important to consider that strong low-frequency sounds below the desired frequency range can limit the dynamic range at higher frequencies of interest, so using a high-pass filter at a selected low frequency while recording is recommended.

AD-converters are more commonly available in the consumer market as “digital recorders” that also include the circuitry to save recorded data to permanent storage (e.g., SD-cards or internal memory) and an interface for powering the other components (either from an external source or through internal batteries). Some digital recorders also offer built-in selectable high-pass filters, which can help reduce the low-frequency noises
produced by handling and suppress wind or flow noises.

The frequency response of the digital recorder should be matched as closely as possible to the frequency response of the sensor–preamplifier–amplifier system and to the needs of the research. The component with the narrowest frequency response is the limiting factor in the recording chain. All AD-converters have a maximum voltage range at the input that can be converted without overloading or clipping. The trick is to stay below the clip-level and still have good dynamic range and SNRs. Other important features in selecting the appropriate recorder are: the number of channels (e.g., 2, 4, 8, or more), durability, reliability for field-use, battery duration, flexibility and ease of use, maximum storage, integrated sensors (omnidirectional or directional), inputs for external sensors, power options for the external sensors (P48 and/or PIP power), and the capability to connect a remote-control or a timer. Some recorders (especially many analog and digital tape recorders and video-cameras) use Automatic Gain Control (AGC) to keep the recorded volume within the same amplitude range. Other devices have an Auto Level Control (ALC) setting or a limiter function designed to avoid overloading or clipping. Some recorders indicate clipping either by a level-meter or with a flashing light. Any AGC, ALC, or limiter options should be disabled to perform comparisons among different sounds or different recordings and if true sound level measurements are needed. The gain level should remain constant throughout a recording and should be noted; ideally, the sampling rate and gain settings should remain the same among recordings, at least for the same subject or context.

2.3.4.1 Recording Ultrasounds and Infrasounds
Ultrasonic recorders were developed mainly for bat and dolphin studies; however, other animal species also produce ultrasonic sounds (e.g., insects, frogs, and infant rodents). To record ultrasound requires a sensor with suitable frequency extension and a recorder or an AD-converter with a high enough sampling frequency. An affordable solution is available in the form of ultrasonic microphones with integrated high-speed AD-converter and USB interface (e.g., Dodotronic12 Ultramic family with sampling frequencies ranging from 200 kHz to 384 kHz). Dodotronic microphones do not need specific drivers and can be used on Windows, MacOS, and Linux, and also on Android smartphones. Recent models include support for internal storage (miniSD card) and powering with a USB battery box. The internal recorder can be set by Bluetooth to record on trigger or on a time schedule. Other similar devices are the Wildlife Acoustics Echo Meter Touch and Pettersson Ultrasound Microphone. Another option for recording at very high sampling frequency is to use an instrumentation AD-converter like the PoScope Mega1+.

Many recorders are not suited for very-low-frequency recording. Most have a lower limit of 10–20 Hz; others can record down to 7–10 Hz. Recording very-low-frequency animal signals is complicated because this frequency range also contains environmental and electronic noise, which typically would be filtered out. For recording infrasounds (e.g., calls of elephants or baleen whales), it is important to check the specifications of the recorder and, if necessary, to bench-test the available frequency range using a signal generator (a tone sweeping through a wide range of frequencies is a good test signal). An option is to use an instrumentation AD-converter with DC coupling.

2.3.4.2 Special Features of Digital Recorders
Pre-recording buffer memory allows the user to save the few seconds of sound before pressing the record button. Auto-start initiates the recording automatically when a certain input level is exceeded. Double recording allows a lower-level backup copy in case some parts of the primary recording are overloaded. With this method, the incoming sound is recorded twice, in two different files, with the second stereo file stored some dB below the first file. In terrestrial

12 http://www.dodotronic.com/; accessed 15 Mar. 2021.
applications, a wired remote-control can be useful when it is required to hide or protect the recorder (e.g., from rain). A wireless remote-control, by Bluetooth or by Wi-Fi (wireless fidelity), allows controlling the functions and levels by a smartphone application, but this would consume additional power and could impact energy budgets. File time-stamping inserts the date and time of the recording in the file name, rather than just a sequential number. This is extremely helpful when storing and cataloging the recordings. Some recorders have a computer audio-interface or the ability to connect a computer to record directly on a laptop or a tablet. This option allows the same recording quality while using special software for managing files (e.g., to tag files with a time-stamp and GPS position, or to automatically start and stop the recording according to received signals or according to a user-defined schedule).

2.3.5 Equipment for Monitoring Bats

Acoustic detection of ultrasonic bat calls has emerged as the most commonly used method for monitoring bat presence and activity (Collins and Jones 2009; Gorresen et al. 2008; Weller and Baldwin 2012). Observing and recording bats, other than for scientific research, is a widespread hobby and a common topic of citizen science. This results in a wide variety of bat detectors produced by small companies or DIY bat detector kits. The common types of detectors are heterodyne, frequency-division, time-expansion, zero-crossing, and full-bandwidth digital recorders (Obrist et al. 2010). Some bat detectors have their own specific software, either free or to be purchased, for further processing of recorded data.

Heterodyning was the first developed system, completely analog, to shift one frequency (the incoming signal) to another by multiplying it with a second frequency (set by the user). The user can tune the detector (similar to tuning a radio) to select a frequency range accessing a small portion of the available received frequency. For example, with a bat detector (e.g., Pettersson Elektronik13 D100) tuned to the 40–50 kHz range, the call of a bat at 45 kHz (such as the pipistrelle bat, Pipistrellus spp.) is multiplied (heterodyned) by a frequency (43 kHz) generated by an internal oscillator. This produces sidebands at 88 kHz and 2 kHz (which are the sum and the difference of the two frequencies); the higher frequency is eliminated with filters and the lower frequency is broadcast to the listener and available for recording. This makes for a tunable, inexpensive bat detector that will quickly indicate if bats are in the area. Heterodyning offers a limited view of the ultrasonic spectrum but is still appreciated by many bat specialists.

Frequency-division transforms the available frequencies and replicates the bat call by converting it into a square wave (sine wave also used) at its zero-crossing points. This wave is then divided by a preset factor (usually 10), creating another square (or sine) wave at a lower frequency (e.g., a 40-kHz call is converted to 4 kHz). All sounds in the environment are converted in this way. As such, masking of bat calls by noise, or overlapping of calls from different individuals, can produce results that could become difficult to interpret. Many devices have filters and ways to lower or otherwise adjust background noise. However, this recording option is now obsolete because modern digital ultrasound recorders are capable of recording at very high sampling frequencies (upward of 200 kHz) and capture the full bandwidth.

Time-expansion bat detectors use an AD-converter to digitize sounds, convert them so that they are audible to the human operator, and store these digital signals to memory (usually SD-card). Reduction of the recorded frequencies expands the sounds in time (hence the name). Some modern digital bat detectors do convert ultrasounds to audible sounds in real-time by means of FFT processing (Pavan et al. 2001). However, there is a delay when the signals are retrieved and played back at a slower speed (so that they can be heard with some delay). A high-frequency modulated call that sounds like a

13 http://www.batsound.com/; accessed 15 Mar. 2021.
quick click is heard as a descending note or whistle upon playback from time-expansion.
Zero-crossing is an algorithm for extracting primary frequency information by tracking when the waveform crosses the zero-amplitude level at certain rates. Zero-crossing bat detectors run constantly, wake up when certain frequencies are detected, and save information on zero-crossings in storage. Some advanced bat detectors also retain the amplitude envelope of the original call; however, they only track the most intense component of the call. Using zero-crossing, a bat detector documents the dominant frequency, so if, for some reason, a harmonic is dominant over the fundamental or other signals overlap the fundamental of the call, only the most intense frequency is recorded. The operator needs to recognize this in order to represent the true nature of the bat’s signal. The recordings produced by zero-crossing detectors are usually small (e.g., 50 KB), whereas an equivalent recording of full-spectrum calls consumes considerable storage space (e.g., 5 MB per call).
Full-spectrum digital bat detectors are digital recorders with high sampling frequency that capture the full bandwidth of the call (Dannhof and Bruns 1991; Moir et al. 2013). In some detectors, it is also possible to hear sounds in time-expansion while recording continuously. These bat detectors can record continuously or only when there are signals in a given frequency band set by the user (triggered recording); this solution reduces the storage size and shortens the time needed to analyze the recordings as only call series are recorded. Different trigger parameters allow selecting the frequency range to be recorded (spectral trigger) and the threshold level to activate the recorder. This technology is available in handheld and autonomous recorders (see Sect. 2.4.1), and computer-based bat detectors that use an external ultrasonic microphone. Some of the more advanced handheld digital bat detectors incorporate a display to visualize detected calls, and also include frequency-division, time-expansion, or frequency-shifting to provide acoustic feedback to the operator.
Some frequency-division detectors are combined with heterodyne and time-expansion capabilities into one unit. The Ciel CDB301 combines a heterodyne detector with a frequency-division detector, allowing the researcher to tune into the frequency of a known bat call and identify a bat by both its sound contour and frequency. At the same time, the detector monitors the whole frequency band and checks if there are any bats in the vicinity. The Pettersson D240, like many of these dual bat detectors, provides heterodyning ability on one channel and time-expansion on another. Connected to a voice-activated digital recorder, these detectors can be left in the field in monitor mode and retrieved data can be analyzed on a PC using the product’s software (e.g., BatSound). The Anabat Walkabout (Fig. 2.17) records bat signals using the zero-crossing technology and also saves signals as full-spectrum WAV files compatible with SonoBat software. The calls can be heard and displayed at the same time and saved to disk, making species identification instantaneous. Units are compact, mobile, and well-suited for long-term monitoring. Solar-powered units with detachable solid-state hard drives allow for greater periods of use.
For teaching or demonstration, any detector is useful, but one may consider heterodyne types of detectors because of their low cost (i.e., every student could use one). An interesting and flexible option is represented by ultrasonic microphones that incorporate a high-speed AD-converter that can be connected by USB to any computer platform (Windows, MacOS, Linux, iOS, Android, or Raspberry Pi). The Dodotronic Ultramic series, the Wildlife Acoustics Echo Meter Touch, and the Pettersson M500 are great devices for classroom demonstration. They allow recording of ultrasounds continuously or on trigger with a companion tablet or smartphone, and provide full-spectrum recording capability, audio feedback, and real-time visualization. Some of these manufacturers also provide software for either basic operations, such as recording and display, or more advanced tasks such as bat species identification.
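As a rough illustration of the zero-crossing principle described in this section, the following Python sketch estimates the dominant frequency of a synthetic tone from the sign changes of its waveform. The function name, the 45-kHz test tone, and the 500-kHz sampling frequency are assumptions chosen for illustration; real detectors implement this in dedicated hardware or firmware, typically with additional filtering.

```python
import math

def zero_crossing_frequency(samples, fs):
    """Estimate the dominant frequency (Hz) from zero-crossings.

    A tone produces two zero-crossings per cycle, so the frequency is
    half the number of crossing intervals divided by the time between
    the first and the last crossing.
    """
    crossings = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if (a < 0 <= b) or (a >= 0 > b):
            # Linear interpolation of the crossing instant (in samples)
            crossings.append(i - 1 + a / (a - b))
    if len(crossings) < 2:
        return None  # not enough crossings to estimate a frequency
    duration_s = (crossings[-1] - crossings[0]) / fs
    return (len(crossings) - 1) / 2.0 / duration_s

fs = 500_000   # assumed sampling frequency in Hz
f0 = 45_000    # assumed frequency of the synthetic test tone in Hz
tone = [math.sin(2 * math.pi * f0 * k / fs + 0.3) for k in range(5000)]
est = zero_crossing_frequency(tone, fs)
print(f"estimated frequency: {est:.0f} Hz")
```

Because only sign changes are used, the estimate follows whichever component dominates the waveform, which is the limitation noted above for calls with a strong harmonic or overlapping signals.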
shallow depths. For example, a swimming pool speaker (Lubell,¹⁴ Fig. 2.18) is an inexpensive electrodynamic device, but has a narrow frequency range that is relatively flat. On the other hand, piezoelectric projectors have projection sensitivity that varies with frequency. Note that many of the piezoelectric projectors are two-way or reciprocal devices that can also receive acoustic signals in water. The receiving sensitivity is fairly flat for a large portion of the operative frequency range; on the contrary, when working as a projector, the amplitude of the generated signal typically increases with frequency.
long-term (months to years) data from remote areas and operate independent of weather and light conditions (e.g., Lammers et al. 2008; McCauley et al. 2017; Obrist et al. 2010). Some recorders generate recordings in popular formats (e.g., WAV files) that are compatible across several analysis software packages, whereas others generate a device-specific file format requiring the use of a specific software program for analyses. Autonomous recorders eliminate the influence of an observer’s presence on the animal’s behavior, are non-invasive, operate remotely, allow systematic periodic sampling, and provide long-term recordings.
recorders in the field include: battery duration, total recording time, recorder reliability, programming capabilities, weatherproof construction, tamper-proof setup, ease of data-retrieval, and possible interface with video. The frequency response, dynamic range, and amplitude sensitivity of the unit are determined by the sound sensor, preamplifier, amplifier, and AD-converter used. By using a GPS or a highly precise internal clock, individual recorders can be time-synchronized. This allows measuring the TDOA of sounds among multiple recorders to triangulate and locate a sound source (see Chap. 4, section on localization). Another option is triggered recordings. For example, when the energy in certain frequency bands exceeds a preset threshold, data are recorded. This can reduce the amount of data to be stored onboard. Recorded data can be retrieved manually from the recorder or remotely via wireless methods. The more advanced units feature Wi-Fi, cellular network, or satellite communication interfaces for data transmission to a remote server. For instance, Pavan and team used autonomous recorders (Wildlife Acoustics SM3 and SM4) to document airborne sounds for six years at three locations with 10-min samples every 30 min (Fig. 2.19) (Pavan et al. 2015; Righini and Pavan 2019). Bat nocturnal activities were monitored via ultrasonic autonomous recorders (Wildlife Acoustics
EM3+ and SM4BAT-FS) and an ultrasonic USB microphone (Dodotronic Ultramic 250 K) connected to a PC-tablet.
The increasing interest in acoustic monitoring in the last few years has stimulated the development of many autonomous recorders; among these are the Wildlife Acoustics series, the Bioacoustic Audio Recorder (Frontier Labs,¹⁵ Brisbane, Queensland, Australia), the Swift (Cornell Lab of Ornithology, Cornell University, Ithaca, New York, USA), and the Anabat Express (Titley Scientific, Brendale, Queensland, Australia). Some recent open-source examples are built around the Raspberry Pi and similar small-board computers. In some cases, the projects are open access. However, these devices often require large batteries to sustain power over long periods. Examples include the Solo acoustic monitoring platform¹⁶ (Whytock and Christie 2017), based on the Raspberry Pi and an external microphone; the Bat Pi 2¹⁷ for monitoring bats; and the AURITA system, which combines in a waterproof package the Solo recorder and a commercially available bat recorder, the Peersonic RPA2, to capture sounds from 60 Hz to 192 kHz (Beason et al. 2018). The AudioMoth,¹⁸ an open-source device, which also can be purchased and assembled, employs a low-power microcontroller and an onboard MEMS microphone (Hill et al. 2018) and has very basic capabilities but allows remote data acquisition at very low cost on a single channel with sampling frequencies up to 384 kHz.
2.4.2 Underwater Recorders
Over the past few decades, interest in marine bioacoustics and in underwater noise monitoring has increased worldwide, and the market for underwater autonomous recorders is rapidly expanding. Autonomous recorders with a variety of features (such as operational longevity, high depth rating, onboard processing, and communication capabilities) are produced by several commercial organizations and academic entities. Examples of commercially available recorders are the AMAR from JASCO Applied Sciences,¹⁹ Snap from Loggerhead Instruments,²⁰ AURAL from Multi-Électronique,²¹ icListen from Ocean Sonics,²² SoundTrap from OceanInstrumentsNZ,²³ EAR from Oceanwide Science Institute²⁴ (Lammers et al. 2008), and RESEA from RTSYS.²⁵ Academic recorders include the Rockhopper by Cornell Lab of Ornithology (upgraded variant of MARU; Klinck et al. 2020), USR by Curtin University (McCauley et al. 2017), and HARP by Scripps Institution of Oceanography (Wiggins and Hildebrand 2007). Selection of a particular type of autonomous recorder is driven by the needs and limitations of the research project. Most of these modern recorders support recording at 16- and 24-bit resolutions and offer flexibility to record at different sampling frequencies and to program custom duty cycles. Some even offer the flexibility to easily switch components (e.g., choosing hydrophones with appropriate sensitivity or frequency range). With the market for these recorders expanding, there are numerous options available beyond the few products mentioned here.
In very shallow waters, at depths reachable by a diver, deployment and recovery operations can be relatively easy. At greater depths, specific additional equipment is needed to allow the recovery: typically, a ballast (to secure stability on the seafloor), an acoustic release, and floaters to retrieve the recorder at the surface once the

15 https://frontierlabs.com.au/; accessed 23 Aug. 2021.
16 http://solo-system.github.io/home.html; accessed 15 Mar. 2021.
17 http://www.bat-pi.eu/; accessed 23 Aug. 2021.
18 https://www.openacousticdevices.info/; accessed 23 Aug. 2021.
19 http://www.jasco.com/; accessed 15 Mar. 2021.
20 http://www.loggerhead.com/; accessed 15 Mar. 2021.
21 http://www.multi-electronique.com/; accessed 23 Aug. 2021.
22 http://oceansonics.com/; accessed 15 Mar. 2021.
23 http://www.oceaninstruments.co.nz/; accessed 15 Mar. 2021.
24 https://oceanwidescience.org/; accessed 23 Aug. 2021.
25 http://rtsys.eu/; accessed 15 Mar. 2021.
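To get a feel for how sampling frequency, bit depth, and duty cycle drive the storage needs of an autonomous recorder, the following Python sketch computes the uncompressed data volume of a duty-cycled deployment. The function name and the example settings (48 kHz or 384 kHz, 16 bit, one channel, one year) are assumptions for illustration; only the schedule of 10-min samples every 30 min is taken from the text. File headers and any compression are ignored.

```python
def storage_bytes(fs_hz, bit_depth, channels, on_s, period_s, deploy_days):
    """Uncompressed PCM data volume (bytes) of a duty-cycled recorder."""
    duty = on_s / period_s                     # fraction of time spent recording
    recorded_s = deploy_days * 86_400 * duty   # seconds actually recorded
    bytes_per_second = fs_hz * (bit_depth // 8) * channels
    return bytes_per_second * recorded_s

# Audible-range monitoring: 48 kHz, 16 bit, mono, 10 min every 30 min, 1 year
audio_gb = storage_bytes(48_000, 16, 1, 600, 1800, 365) / 1e9

# Same schedule at an ultrasonic sampling frequency of 384 kHz
ultra_gb = storage_bytes(384_000, 16, 1, 600, 1800, 365) / 1e9

print(f"48 kHz:  {audio_gb:.0f} GB per year")
print(f"384 kHz: {ultra_gb:.0f} GB per year")
```

The eightfold increase between the two settings shows why triggered recording or sparse duty cycles are commonly used for ultrasonic monitoring.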
sensitivity of the recording system. The recording system consists of several components (e.g., sensor, amplifier, and AD-converter), each with its own frequency response and amplitude sensitivity. The recording system may be calibrated as a whole by presenting a calibration signal of known amplitude and measuring the output. From the difference between output and input, the frequency response and amplitude sensitivity may be calculated. Or, each piece of equipment may be calibrated separately, and the frequency responses and amplitude sensitivities may be joined (i.e., multiplied in linear terms or summed in logarithmic terms).
The simplest calibration signal is a sine wave (i.e., a pure tone; Fig. 2.21). While the rms value is typically used in equipment calibration sheets, the peak (pk) or peak-to-peak (pk-pk) values are more easily read off signal displays on a computer or oscilloscope. For a sine wave, the conversion is:

$$p_{\mathrm{rms}} = \frac{p_{\mathrm{pk}}}{\sqrt{2}} \approx 0.707\, p_{\mathrm{pk}}$$

$$\Leftrightarrow \quad 20 \log_{10} \frac{p_{\mathrm{rms}}}{p_0} = 20 \log_{10} \frac{p_{\mathrm{pk}}}{p_0} - 20 \log_{10} \sqrt{2} \approx 20 \log_{10} \frac{p_{\mathrm{pk}}}{p_0} - 3\ \mathrm{dB}$$

The variable p denotes pressure. The reference pressure p0 is 20 μPa in air (i.e., for microphone calibration) and 1 μPa in water (i.e., for hydrophone calibration); also see Chap. 4 on an introduction to quantities and units. To add to the confusion, the dynamic range of analog electronics and AD-converters is given in pk-pk values. The simple equation is only valid for sinusoidal signals.
Using a sine wave yields an amplitude sensitivity at only one frequency. In order to measure the frequency response of the equipment, a series of sine waves at different frequencies needs to be presented. More commonly, white noise (i.e., a broadband signal of equal amplitude across frequency) is used and amplitude sensitivity is determined at all frequencies contained in the signal after Fourier transform of the output signal (see Chap. 4).
A simple recording setup is shown in Fig. 2.22. A calibration signal p(t) (i.e., pure tone or white noise of known amplitude) is presented to the sensor (i.e., microphone or hydrophone). The sensor has a sensitivity s, which relates the voltage V at its output to the pressure p at its input; so s has the unit V/Pa. The sensitivity can also be expressed in dB re 1 V/Pa: S = 20 log10(s/(V/Pa)). The output voltage V of the sensor is typically passed to an amplifier. The amplifier gain g relates the voltage at its output to the voltage at its input and is thus unit-less: g = V2/V1. Expressed in dB, the amplifier gain is G = 20 log10(g). The output voltage of the amplifier is then passed to an AD-converter such as a soundcard on a computer. The AD-converter has a digitization gain c that relates the digital values d in the audio file to the voltage V at its input. The bit-depth of the AD-converter limits the maximum digital value (i.e., the full-scale value FS) that can be stored. The digitization gain is defined as the ratio of the full-scale value
Fig. 2.22 Sketch of a generic recording system consisting of a sensor (i.e., microphone or hydrophone), amplifier, and AD-converter (e.g., a computer with soundcard). Each piece of equipment has its own sensitivity or gain (indicated by red letters). These sensitivities may be expressed in linear terms (small letters) or decibels (capital letters). The sensor converts the input pressure time series p(t) to a voltage time series V1(t), which is amplified to yield V2(t). The AD-converter produces a digital time series d(t)
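The chain sketched in Fig. 2.22 can be inverted numerically. The Python sketch below converts normalized digital values back to pressure and computes the received level in two ways, once in linear terms and once with level quantities in decibels. The numerical values (50 mV/Pa microphone, amplifier gain of 100, full scale at 2 V input, normalized rms amplitude of 0.06) are those of the bird-song example in this section; the function and variable names are illustrative and not part of any library.

```python
import math

def db20(x):
    """Level of an amplitude ratio in dB."""
    return 20.0 * math.log10(x)

s = 0.05          # microphone sensitivity in V/Pa
g = 100.0         # amplifier gain (unit-less)
v_max = 2.0       # input voltage that produces full scale, in V
c = 1.0 / v_max   # digitization gain in FS/V, with samples normalized to FS = 1

def to_pressure(d):
    """Convert a normalized digital value to pressure in Pa: p = d / (c g s)."""
    return d / (c * g * s)

d_rms = 0.06                        # rms of the normalized digital time series
p_rms = to_pressure(d_rms)          # rms pressure in Pa
spl_linear = db20(p_rms / 20e-6)    # dB re 20 uPa, via the linear route

# Same result using level quantities: RL = D - C - G - S
S, G, C, D = db20(s), db20(g), db20(c), db20(d_rms)
rl_re_1pa = D - C - G - S               # dB re 1 Pa
spl_db = rl_re_1pa - db20(20e-6)        # change reference from 1 Pa to 20 uPa

print(f"p_rms = {p_rms:.3f} Pa")
print(f"SPL (linear route) = {spl_linear:.1f} dB re 20 uPa")
print(f"SPL (dB route)     = {spl_db:.1f} dB re 20 uPa")
```

Both routes agree; the small difference from the 62 dB quoted in the text arises because the text rounds each term to whole decibels before summing.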
to the input voltage that produces the full-scale value: c = FS/Vmax. The digitization gain is expressed in dB re FS/V. The sensitivities (in linear terms) of each component in the recording system can be multiplied to yield the system sensitivity, which relates the digital values d in the audio file to the pressure p sensed by the sensor. In logarithmic terms, the overall system sensitivity is the sum of the sensitivities of each piece of equipment.
Once the recording system has been calibrated, it can be used to record animals or other sound sources. To determine the calibrated pressure time series p(t) from the stored data d(t), divide by all the sensitivities and gains: p(t) = d(t) / (c g s). Alternatively, using the level quantities (in dB) for each piece of equipment, the received level RL (e.g., rms sound pressure level) is determined by subtracting all sensitivities and gains from the rms amplitude level D: RL = D − C − G − S.
For example, somebody made a 10-minute recording of a singing bird. The microphone sensitivity was s = 50 mV/Pa, or S = 20 log10(0.05) = −26 dB re 1 V/Pa. The amplitude at the output of the microphone was amplified by, let’s say, a factor g = 100, or G = 20 log10(100) = 40 dB. The soundcard produced a full-scale amplitude at 2 V input: c = FS/2 V, or C = 20 log10(1/2) = −6 dB re FS/V. A computer is used to process the data. If the data are read using the MATLAB (The MathWorks Inc., Natick, MA, USA) function audioread with the flag “native,” then the raw digital values are presented. With the flag “double,” the data are normalized by the full-scale value and so lie between −1 and +1. Computing the rms amplitude of the normalized digital time series yields a value of, let’s say, 0.06. In logarithmic terms, the rms amplitude level of the stored normalized data is D = 20 log10(0.06) = −24 dB. What was the received sound pressure level of the bird song? Subtracting all the gains, the rms sound pressure level received at the microphone was −32 dB re 1 Pa (because −24 − (−6) − 40 − (−26) = −32). The standard reference pressure in air is, however, 20 μPa, which is equivalent to 20 log10(20/1,000,000) = −94 dB re 1 Pa. So, the rms sound pressure level recorded from the bird was −32 − (−94) = 62 dB re 20 μPa. The researcher might further want to compute calibrated sound spectrograms of the bird song, and so the question is how to convert the digital values to pressure values. Using the linear sensitivities and gains, p(t) = d(t) / (FS / 2 V) / 100 / (0.05 V/Pa) yields pressure samples in units of Pa.
2.6.1 Microphone
To make accurate recordings of sound intensity in the laboratory or field, either from an animal or a different source, a researcher should always use a calibrated microphone. A commercial microphone is calibrated when received from the manufacturer and comes with specification sheets containing amplitude sensitivity, frequency response, and reception directionality as a
Fig. 2.23 Specifications of a Brüel & Kjær 1/2-inch free-field microphone type 4191. (a) Photo. (b) Polar plot of receiving directionality from 16 kHz to 40 kHz. (c) Graph of frequency response (relative level in dB versus frequency in Hz, from 1 Hz to 100 kHz). Permission to reprint from Brüel & Kjær
function of frequency in the horizontal and vertical planes. For example, the ½-inch microphone shown in Fig. 2.23a has an amplitude sensitivity of 12.5 mV/Pa or −38 dB re 1 V/Pa and a flat frequency response (to within 3 dB) from about 3 Hz to 40 kHz (Fig. 2.23c). Given its cylindrical symmetry, it is omnidirectional about its vertical axis (Fig. 2.23b). In the vertical plane, its receiving directionality is steered toward its axis; in other words, it is most sensitive in the forward (i.e., vertical in Fig. 2.23b) direction. The lower the frequency, the more receptive it becomes from other directions. To check that the microphone maintains its sensitivity over time, a bioacoustician should periodically use a calibrator. For example, the calibrator shown in Fig. 2.24 is very stable and emits a 1 kHz tone at 94 dB re 20 μPa.
Fig. 2.24 A sound level calibrator (LUTRON, model SC-941) that generates 94 dB re 20 μPa at 1 kHz. The microphone to be calibrated must be inserted in the hole (1/4 inch diameter) on the left side. Adapters are available to fit other microphone diameters
Provided there is a commercial, calibrated microphone available, a researcher can calibrate a microphone of unknown sensitivity by comparison with a calibrated microphone. Using a loudspeaker system to do this is a convenient option. Alternatively, signals of opportunity, like roadway or jet noise, may also be considered while ensuring that both microphones receive the same signals and levels. First, calibrate the sound field at the frequencies of interest with the calibrated microphone. Then, replace the calibrated microphone with the one of unknown
Fig. 2.25 Sketch of a setup to calibrate a microphone of unknown sensitivity with a microphone of known sensitivity in a constant sound field. Redrawn from a laboratory manual with permission from Lasse Jakobsen, Institute of Biology, University of Southern Denmark, Odense, Denmark
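The substitution method of Fig. 2.25 reduces to a ratio of output amplitudes measured in the same sound field. A minimal Python sketch follows, using the values of the example in this section (a 50 mV/Pa reference microphone and outputs of 0.3 and 0.7 voltage units); the function name is hypothetical.

```python
import math

def sensitivity_by_substitution(s_ref, out_ref, out_test):
    """Sensitivity of the test microphone, in the same unit as s_ref.

    Assumes both microphones were placed, one after the other, at the same
    position in an unchanged sound field, so that the (unknown) sound
    pressure cancels out of the ratio of their outputs.
    """
    return s_ref * out_test / out_ref

s_test = sensitivity_by_substitution(0.050, 0.3, 0.7)   # V/Pa
S_test = 20.0 * math.log10(s_test)                      # dB re 1 V/Pa

print(f"{s_test * 1000:.0f} mV/Pa, {S_test:.1f} dB re 1 V/Pa")
```

The two outputs only need to be in the same (arbitrary) unit, since only their ratio enters the result.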
sensitivity and record the output in the same frequency range. Do not place the two microphones side-by-side in the sound field since this could cause diffraction and distortion of the sound field. The sound field should not contain echoes, so choose an open space or an anechoic room for low frequencies. In the example of Fig. 2.25, the calibrated microphone has a sensitivity of 50 mV/Pa. In the given sound field, it produces an output signal with an amplitude of 0.3 voltage units. After the calibrated microphone has been removed and the to-be-calibrated microphone has been installed at exactly the same location, the latter produces an output signal of 0.7 voltage units. The sensitivity of the to-be-calibrated microphone is simply 0.7/0.3 × 50 mV/Pa = 117 mV/Pa.
2.6.2 Hydrophone
High-quality commercial hydrophones are calibrated by the manufacturer with all pertinent information contained in the accompanying specification sheets. Many hydrophone types have built-in preamplifiers with amplification and impedance matching. Thus, these hydrophones come with a calibration sheet having one sensitivity value that includes the preamplifier. The sensitivity of a hydrophone is usually expressed in dB re 1 V/μPa, which is different from the expression for microphone sensitivity (dB re 1 V/Pa).
To use RESON hydrophones as examples, their most sensitive hydrophone (i.e., the one with the least negative sensitivity: TC4032; Fig. 2.26) has a sensitivity of −170 dB re 1 V/μPa (single ended). If the sound received by the hydrophone were 170 dB re 1 μPa rms, then the output from the hydrophone would be 1 V rms. To compare this to a microphone, add 120 dB, which is a factor 10^6 in pressure (20 log10(10^6) = 120 and 10^6 μPa = 1 Pa). So, −170 dB + 120 dB yields −50 dB re 1 V/Pa. The most sensitive ½- or 1-inch microphone is −26 dB re 1 V/Pa, which is 24 dB (i.e., about 16 times, because 20 log10(16) = 24) more sensitive than the TC4032 hydrophone.
Although most hydrophones are stable through time, it is wise to check the calibration periodically using a pistonphone. However, a pistonphone can determine the sensitivity of an uncalibrated hydrophone at only one frequency. The sound pressure of a pistonphone is extremely stable and is only affected by one factor: barometric pressure. For this reason, a special barometer is included with the pistonphone. For accurate calibrations, the barometric pressure should be checked, and sound pressure adjusted according to the scale on the barometer. For calibrations performed near sea level (as is often the case in marine bioacoustics), this error is negligible, but if one is working in an aquatic environment that is significantly above sea level, then this factor (which is 2 dB at 2000 m altitude) should be included. For hydrophones to be deployed at
Fig. 2.26 Graph of amplitude sensitivity and frequency response for several RESON hydrophones with preamplifiers. The most sensitive is the TC4032; the least sensitive is the TC4035. Permission to reprint from RESON (http://www.teledyne-reson.com/; accessed 15 Mar. 2021)
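Hydrophone sensitivities such as those in Fig. 2.26 are stated in dB re 1 V/μPa. The Python sketch below shows the two conversions used in this section: changing the reference to dB re 1 V/Pa for comparison with microphones, and computing a received level from a recorded voltage. The sensitivities (−170 and −215 dB re 1 V/μPa), the 60-dB gain, and the 1.2 V pk-pk reading are the values quoted in the text; the function names are illustrative.

```python
import math

def re_upa_to_re_pa(sens_db_re_1v_per_upa):
    """Convert a sensitivity from dB re 1 V/uPa to dB re 1 V/Pa.

    1 Pa = 10^6 uPa, i.e., 20*log10(10^6) = 120 dB.
    """
    return sens_db_re_1v_per_upa + 120.0

def received_level(volts, gain_db, sens_db_re_1v_per_upa):
    """Received level in dB re 1 uPa.

    The result has the same amplitude measure (rms, pk, or pk-pk) as the
    voltage reading passed in.
    """
    return 20.0 * math.log10(volts) - gain_db - sens_db_re_1v_per_upa

tc4032_re_pa = re_upa_to_re_pa(-170.0)          # dB re 1 V/Pa
rl_click = received_level(1.2, 60.0, -215.0)    # dB re 1 uPa pk-pk

print(f"TC4032: {tc4032_re_pa:.0f} dB re 1 V/Pa")
print(f"Dolphin click: {rl_click:.0f} dB re 1 uPa pk-pk")
```

If the voltage is read from digital samples rather than from an oscilloscope, the digitization gain of the AD-converter has to be subtracted as well.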
great depth in the ocean, the amplitude sensitivity (and pressure resistance) should be measured in a pressure chamber.
The frequency response of an uncalibrated hydrophone (for frequencies up to a few kHz) can be measured in air by using the same method as described for a microphone (Fig. 2.25). However, for higher frequencies, this should be done in open water (e.g., a deep lake) and the method described for microphones can be used by simply substituting the microphone with a hydrophone of known sensitivity compared to one of unknown sensitivity. An appropriate amplifier and an underwater projector are needed, but a hydrophone without a built-in preamplifier also can be used as a projector. First, the environment (lake, pool, or tank) should be checked for echoes and reverberations (see Popper and Hawkins 2018 for details). The projected calibration sound must be a pulse that ends before the first echo arrives at the sensor. This necessity restricts the frequency range that can be used for calibration since the projected pulse must be ramped up and down to reduce high-frequency artifacts caused by the onset and end of the pulse.
The next step is to determine the received level of an underwater sound. For example, a dolphin click is recorded with a TC4035 hydrophone, which has a sensitivity of −215 dB re 1 V/μPa (Fig. 2.26). If the output is amplified by 60 dB (1000x) and the recorded signal is 1.2 V pk-pk, then the received level is: 20 log10(1.2) − 60 − (−215) = 1.58 − 60 + 215 ≈ 157 dB re 1 μPa pk-pk. Usually, the analog voltage signal is converted to a digital signal by an AD-converter, which has a digitization gain that also needs to be accounted for (see above).
2.6.3 AD-Converter
A 16-bit AD-converter resolves 2^16 levels, covering 65,536 counts peak-to-peak. Its full-scale value is 2^16 − 1 = 65,535 in unipolar mode,
where the digital amplitude values lie between 0 and 65,535, or 2^15 = 32,768 in bipolar mode, where the digital amplitude values are in the range −32,768, …, 0, …, 32,767. In decibels, the dynamic range of a 16-bit AD-converter in bipolar mode is 20 log10(32,768) = 90 dB. Every bit gives ~6 dB of dynamic range in the digital domain. But a 90-dB dynamic range rarely can be realized since most electronics used before AD-conversion do not have such a large dynamic range. A 24-bit converter in bipolar mode offers a theoretical dynamic range of about 138 dB; however, only the most sophisticated electronics can provide up to 115–120 dB of dynamic range. This means that there cannot be more than 19–20 bits of real dynamic range and the remaining bits (least significant bits) are just filled by noise. AD-converter specification sheets rarely show this, thus there is a growing need to have more realistic AD-specifications to account for the intrinsic AD-converter noise and its artifacts showing as distortion and jitter. In some recording systems, the least significant bits are used to encode complementary information; however, this practice is not standard.
AD-converters thus carry an intrinsic digitization gain, which is the ratio of the full-scale value to the input voltage that leads to full-scale. The digitization gain is expressed in dB re FS/V. For example, an AD-converter with a digitization gain of −6 dB re FS/V reaches its FS value at a peak input voltage of 2 V, because 20 log10(FS/2 V) = −6 dB re FS/V. AD-converters may be calibrated with a voltage signal generator. The peak voltage of the input signal has to be less than the maximum voltage range specified in the specification sheet; otherwise, the AD-converter will be overloaded and the signal clipped.
2.6.4 Autonomous Recorder
the chosen gain will affect the amplitude sensitivity and needs to be accounted for. Some manuals (e.g., the SoundTrap User Guide²⁶) provide guidance on how to calibrate the recorded data if read by software packages such as MATLAB, PAMGuard, or Audacity.
2.6.5 Measuring Self-Noise
When intending to record quiet sounds or ambient sound levels in the absence of nearby sound sources, it is important to first measure the system self-noise to avoid confounding electronic noise with environmental noise. For this, the system should record in a quiet room and the sound sensor should be in a sound- and vibration-proof box (Fig. 2.27). If using an autonomous recorder, the entire system should rest in a sound-proof box.
To record quiet sounds under water or to accurately quantify ambient sea noise, a sensitive hydrophone with a wide frequency range is needed (e.g., the TC4032, Fig. 2.26). All of the system components should have low self-noise. A “wet-ground” ground-wire from the input equipment to the water might be necessary to reduce system noise. The amplifier should have an adjustable band-pass filter to avoid aliasing during direct digital recording. The AD-converter needs sufficient bit-resolution and sampling rate to cover the frequency band of interest. The system frequency response shown in Fig. 2.27 goes up to about 100 kHz. If the full bandwidth is desired, then the sampling frequency should be at least 200 kHz. When reporting measured levels, provide the frequency range over which sound was measured and the bandwidth over which sound levels were computed (e.g., per Hz or in 1/3-octave bands).
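The AD-converter figures quoted in Sect. 2.6.3 follow directly from the bit depth. The short Python sketch below reproduces them and estimates how many bits correspond to a given analog dynamic range; the function names are illustrative, and the calculation is the idealized one used in the text (it ignores converter noise and distortion).

```python
import math

def dynamic_range_db(bits, bipolar=True):
    """Theoretical dynamic range of an AD-converter in dB."""
    full_scale = 2 ** (bits - 1) if bipolar else 2 ** bits - 1
    return 20.0 * math.log10(full_scale)

def equivalent_bits(analog_range_db):
    """Number of bits matching an analog dynamic range (~6.02 dB per bit)."""
    return analog_range_db / (20.0 * math.log10(2.0))

def digitization_gain_db(v_full_scale):
    """Digitization gain in dB re FS/V for samples normalized to FS = 1."""
    return 20.0 * math.log10(1.0 / v_full_scale)

dr16 = dynamic_range_db(16)           # about 90 dB
dr24 = dynamic_range_db(24)           # about 138 dB
bits_115 = equivalent_bits(115.0)     # about 19 bits
bits_120 = equivalent_bits(120.0)     # about 20 bits
c_2v = digitization_gain_db(2.0)      # about -6 dB re FS/V

print(f"16 bit: {dr16:.1f} dB, 24 bit: {dr24:.1f} dB")
print(f"115-120 dB analog range ~ {bits_115:.1f}-{bits_120:.1f} bits")
print(f"Full scale at 2 V: {c_2v:.1f} dB re FS/V")
```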
[Figure 2.28: top panel, A-weighted sound level (dB) versus time (h:m, 16.48 to 16.58); bottom panel, 1/3-octave spectrogram from 20/06/2013 16.48.00 to 16.58.00, frequency (Hz, logarithmic scale up to 20 kHz) versus time, with level (dB) in color]
Fig. 2.28 Recording and spectral analysis of noise in a residential area. Recording (top) of the overall sound level (A-weighted) with the LAeq level of the shown period. The unweighted spectrographic image (bottom), with frequency up to 20 kHz on a logarithmic scale, shows the spectral composition of the recorded period. At about 20 Hz is the noise generated by a truck engine. At about 16.53 occurs the noise of a passing airplane (50–1000 Hz). Bird songs appear at 1500–9000 Hz. Courtesy of Alberto Armani
where T is the time interval of the measurement. The level may be weighted (e.g., A or C weighting). LAeq is often used in the assessment of noise dose or sound exposure in humans (Fig. 2.28). For example, LAeq,1s = 73 dB or Leq,1s = 73 dB(A) is a measurement taken with an A-weighting filter over 1 s and LCeq,1s indicates a measurement taken with a C-weighting filter for 1 s.
Some SPL meters have a 60-s Leq setting used for short-term sampling. However, if the sound level varies randomly, calculating Leq is tricky, and so, Integrating Sound Level Meters are better (Fig. 2.29) as they determine Leq during a suitable time period. When more information on the statistics of sound levels is needed, in both time and frequency, noise-level analyzers are used (Fig. 2.29). They perform statistical analyses of sound levels over a specified period, either broadband or band-limited (e.g., in a 1-octave or 1/3-octave band). The most sophisticated, and expensive, noise measuring systems can produce spectra in narrower bands (as fine as 1-Hz bands) and calculate spectral percentiles to show the level variation statistics for each frequency band. In other words, the percentile analysis of a 1/3-octave spectrum shows what percentage of time each level is reached or exceeded within the measurement period (see Chap. 4, section on power spectral density percentiles).
All these devices need to be calibrated periodically with a known calibration tone. Calibrators are standardized at the factory and usually maintain calibration for a long time. Only specialized laboratories can certify calibrators. The calibrator signal is usually a 1-kHz sinusoidal tone at 94 dB re 20 μPa SPL rms (equivalent to a pressure of 1 Pa rms, 97 dB pk, or 1.41 Pa pk).
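As a sketch of how an equivalent continuous level can be computed from calibrated pressure samples, the following Python code averages the squared pressure over the measurement interval and expresses it relative to 20 μPa. It uses the standard definition of Leq as the level of the mean-square pressure and applies no frequency weighting, so it corresponds to an unweighted Leq rather than LAeq. A synthetic 1-kHz calibrator tone of 1 Pa rms serves as a check; the function name and the 48-kHz sampling frequency are assumptions.

```python
import math

P0_AIR = 20e-6  # reference pressure in air, in Pa

def leq_db(pressure_pa, p0=P0_AIR):
    """Unweighted equivalent continuous sound level over all samples, in dB."""
    mean_square = sum(p * p for p in pressure_pa) / len(pressure_pa)
    return 10.0 * math.log10(mean_square / (p0 * p0))

fs = 48_000                 # assumed sampling frequency in Hz
p_pk = math.sqrt(2.0)       # peak pressure of a 1 Pa rms sine wave, in Pa
tone = [p_pk * math.sin(2 * math.pi * 1000 * k / fs) for k in range(fs)]

leq_1s = leq_db(tone)       # Leq over T = 1 s
print(f"Leq,1s = {leq_1s:.1f} dB re 20 uPa")
```

For a steady tone, Leq equals the rms sound pressure level, which is why the calibrator tone returns about 94 dB; for fluctuating sounds, the result depends on the averaging interval T.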
impacts the measurements that one makes of a signal and how that signal is characterized.
Some of the key considerations for selecting a type of sensor include its sensitivity and power needs (all sensors require power), the frequency and amplitude ranges of the signals, equipment ruggedness and portability (if considered for fieldwork), and cost (Table 2.1). Research questions can be framed around the signaler or receiver, and the measurement of interest can vary widely (e.g., number of signals produced, signal parameters, etc.). Different sensor types function best in different frequency ranges, and the dominant frequency of a vibrational signal can vary widely, from <50 Hz for tremulating katydids (De Souza et al. 2011; Morris 1980; Morris et al. 1994; Sarria-S et al. 2016), to between 50 and 200 Hz for tremulating stinkbugs (reviewed in Čokl et al. 2014), to above 500 Hz for diverse kinds of plant-feeding insects (reviewed in Čokl et al. 2014). Vibrational signals can also be narrowband (McNett and Cocroft 2008) or broadband, with energy distributed over several kHz (Cocroft 1996; Hamel and Cocroft 2019).
The amplitudes of vibrational signals also vary widely, even just within small arthropods. For example, large neotropical katydids produce substrate-borne vibrations by vertically oscillating their abdomens relative to the substrate (in other words, they bounce) and the amplitude of these oscillations can be large enough to observe with the naked eye (Belwood and Morris 1987; Morris et al. 1994; Rajaraman et al. 2015). In contrast, the amplitude of signals by tiny treehopper nymphs can be so low as to be difficult to detect without a very sensitive sensor, such as a laser-Doppler vibrometer (LDV) (JH, pers. obs.).
Sensor Types Based on the Quantity Measured
Displacement: Phonocartridges and other piezoelectric sensors have greatest sensitivity at low frequencies. Phonocartridges can be quite good for detecting low-frequency, low-amplitude signals in plant substrates, but placement of the phonocartridge on the plant leaf or stem necessarily loads the substrate and changes its transmission properties (Fig. 2.30a). Additionally, amplitude measurements made with phonocartridges are variable and not repeatable, because amplitude varies with the pressure with which the stylus contacts the plant tissue.
Velocity: LDVs use the reflection of a laser beam pointed at a reflective object or substrate to detect the velocity of its movement. (If a surface does not reflect enough of the laser for measurement, a small amount of reflective paint or tape can be applied to the substrate.) LDVs are highly sensitive and excellent for detecting and making measurements of low-amplitude signals that also have energy concentrated in low frequencies. They do not load any mass to a substrate, so they do not affect signal transmission in this way, and in fact, they can be used to characterize signals by recording from an animal itself (Čokl et al. 2005). LDVs provide repeatable measures of amplitude for vibrational signals. Unfortunately, LDVs can be expensive. Although they are fairly portable, they are still quite cumbersome compared with a micro-accelerometer. Additionally, because an LDV detects motion perpendicular to the laser, the researcher must decide which plane is of interest (e.g., identify the major axis of motion). LDVs are not well-suited for high-amplitude signals, as a moving branch or stem will break the contact of the laser with the reflective surface and disrupt measurement.
The animal’s use of substrates is another key
factor to consider: some vibrationally signaling Acceleration: Accelerometers can be pur-
animals, such as small, plant-feeding insects, are chased in a wide variety of sensitivities, fre-
relatively sessile and signal from specific quency ranges, and sizes, and some models have
locations on plants of a single species (McNett the capacity for adjustable gain. For example, a
and Cocroft 2008), whereas other vibrationally commonly used micro-accelerometer in studies of
signaling animals are more motile and may signal small insects has a mass of 0.8 g and a frequency
on diverse substrate types (reviewed in Elias and range of 0.8 Hz–10 kHz. Accelerometers can
Mason 2010). generate repeatable measurements of amplitude,
Table 2.1 Examples of sensors that might be selected for vibrational communication studies, taking research aim, substrate type, signal frequency and amplitude ranges, ruggedness of equipment, and cost constraints into account

Organism: Small insect that mainly signals on herbaceous plant species (e.g., stinkbug)
Aim: Characterize spectral and temporal features of signals
Substrate type: Plant stem/leaf
Dominant frequency and range: 100 Hz, most energy <200 Hz
Signal amplitude: Low
Other considerations: Study will be conducted in the lab and field; field work is in moderate environments
Cost constraints: Funding is provided, or an LDV is otherwise available
Decision: LDV is ideal for low-frequency, low-amplitude signals; will not load the substrate or affect signal transmission

Organism: Small, sessile insect that signals on a specific part of a single plant species (e.g., treehopper)
Aim: Behavioral study documenting signaling response (or lack thereof) in response to a stimulus
Substrate type: Tree branch
Dominant frequency and range: Broadband, energy distributed 100 Hz–5000 Hz
Signal amplitude: Low
Other considerations: Equipment needs to be sturdy; field study; branches sometimes move in breeze
Cost constraints: Funding is limited
Decision: Accelerometer is the best bet for funding constraints, field study, and the need to affix sensor to substrate; frequency range suggests that this approach will work

Organism: Medium or large, motile arthropod (e.g., wolf spider)
Aim: Characterize spectral and temporal features of signals
Substrate type: Leaf litter
Dominant frequency and range: Both high- and low-frequency signals
Signal amplitude: Medium; multiple signal elements
Other considerations: Animal moves around but is confined to an arena; signals on leaf litter
Cost constraints: Funding is provided, or an LDV is otherwise available
Decision: LDV is ideal for not loading the substrate and for characterizing low- and mid-frequency ranges

Organism: Large, motile insect (e.g., katydid)
Aim: Behavioral study documenting signaling response (or lack thereof) in response to a stimulus
Substrate type: Large plant stem/leaf
Dominant frequency and range: 20 Hz, most energy <100 Hz
Signal amplitude: High (can observe movement of plant stem as animal signals)
Other considerations: Equipment needs to be sturdy and function in high heat and humidity environment
Cost constraints: Funding is limited
Decision: LDV is ideal for low-frequency signals, but high signal amplitude suggests that laser would not remain in contact (with a moving substrate). Funding limitations and study setting suggest that an accelerometer is the place to start.
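The pattern of decisions in Table 2.1 can be expressed as a small rule set. The sketch below, in Python, is only one reading of the four examples in the table and is not a published selection procedure; the class, field, and function names are hypothetical. The mass check follows the guideline, given in this section with reference to Cocroft and Rodríguez (2005), that an attached sensor should weigh <5% of the substrate.

```python
from dataclasses import dataclass


@dataclass
class Study:
    signal_amplitude: str   # "low", "medium", or "high"
    ldv_available: bool     # funding is provided, or an LDV is otherwise on hand


def suggest_sensor(study: Study) -> str:
    """Suggest a starting-point sensor, following the examples in Table 2.1."""
    # High-amplitude signals move the substrate and break laser contact.
    if study.signal_amplitude == "high":
        return "accelerometer"
    # Without access to an LDV, an accelerometer is the practical choice.
    if not study.ldv_available:
        return "accelerometer"
    # Low- or medium-amplitude signals with an LDV on hand: no mass loading.
    return "LDV"


def mass_loading_ok(sensor_mass_g, substrate_mass_g, max_fraction=0.05):
    """True if the sensor mass is below the assumed fraction of substrate mass."""
    return sensor_mass_g < max_fraction * substrate_mass_g


# The four examples from Table 2.1, reduced to the two inputs used here
examples = {
    "stinkbug": Study("low", True),
    "treehopper": Study("low", False),
    "wolf spider": Study("medium", True),
    "katydid": Study("high", False),
}
for organism, study in examples.items():
    print(organism, "->", suggest_sensor(study))

# A 0.8 g micro-accelerometer on a 20 g stem stays under the 5% guideline
print(mass_loading_ok(0.8, 20.0))
```

Only signal amplitude and LDV availability are modeled here. Ruggedness, substrate type, and frequency range also feed into the decisions in the table and would need to be weighed separately.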
Fig. 2.30 Sensors that detect and measure substrate-borne vibrations. (a) A phonocartridge attached to lab-hands or a thin wooden dowel. (b) Accelerometer. (c) Piezo disc or contact microphone for detecting substrate-borne vibrations. (d–f) Accelerometers affixed to substrates with a small amount of accelerometer wax or dental wax. Lightweight supports such as twist-ties and thin hair clips are used to reduce the likelihood of the accelerometer shifting position or detaching from a substrate
and because accelerometers are necessarily attached to a substrate, they can measure high-amplitude signals that move the substrate itself. Accelerometers are lightweight and small (Fig. 2.30b), can be rugged, and several commonly used models can be powered by one or more 9-V batteries. Drawbacks of accelerometers are that attaching a sensor to a substrate loads mass to the substrate; to avoid altering of substrate transmission properties, it is recommended to limit sensor mass to <5% of the mass of the substrate (Cocroft and Rodríguez 2005). Because accelerometers detect acceleration, they are not as sensitive at low frequencies as they are at higher frequencies, and they generally have lower bandwidths than LDVs.
The study of animal vibrational communication is rapidly growing. In order to withstand the rigor of peer-review, researchers must document the type, make, model, and sensitivity of the sensors used, and also document the factors likely to affect signal characteristics and propagation (e.g., substrate type and characteristics, position of the animal). The relative position of the sensor must be logical, consistent, and be informative for the study. For sensors that attach to substrates (e.g., accelerometers), secure and even attachment will help achieve a good signal-to-noise ratio and minimize impedance mismatch (Fig. 2.30 a, d–f).
2.7.2.2 In Underwater Studies
An important issue with respect to fishes and invertebrates is their sensitivity to particle motion that accompanies sound transmission, rather than to sound pressure. Particle motion comprises particle displacement, particle velocity, and particle acceleration (ISO 18405 2017; https://www.iso.org/standard/62406.html, accessed 8 Mar. 2021) and differs from sound pressure in that it is a vector quantity. In contrast, sound pressure is a scalar quantity, acting in all directions.
Popper and Hawkins (2018) reported that it is commonplace to characterize underwater sound by the sound pressure alone, because it is easily measured by a hydrophone, and then to estimate the particle motion from the sound pressure measurements and the acoustic properties of the medium. This is relatively easy in an acoustic free-field (i.e., no nearby boundaries to sound propagation). However, near acoustic boundaries (like the seabed and the sea surface), the relationship between pressure and particle motion becomes complex and so, particularly in shallow waters that are inhabited by many fishes and invertebrates, measuring particle motion directly is necessary. The result is a dearth of data on particle motion and its importance to, and potential effects upon, animals. Although there are excellent hydrophones for monitoring sound pressure, there are far fewer devices for detecting and analyzing particle motion.
Popper and Hawkins (2018) described the many problems with measuring particle motion in a tank and recommended that measurements be taken in the field, or at least in a specially designed sound exposure chamber to control the relative magnitudes of particle motion and sound pressure. To make particle motion measurements, it is necessary to mount three orthogonally orientated vector sensors together to monitor the three spatial components of particle motion. Any sound can thus be resolved into its directional components and the direction to the sound source may be determined. Calibrated particle motion measurement systems are commercially available, but expensive. An alternative approach is to measure the sound pressure gradient in the water to derive the particle motion in a particular direction.
Many studies have used custom-built particle motion sensors for studying the impacts of anthropogenic activities on fish (e.g., Campbell et al. 2019; Solé et al. 2017; van der Knaap et al. 2021). GeoSpectrum Technologies Inc. offers a few choices for off-the-shelf particle motion sensors in their M20 line of products. Each device consists of an omnidirectional acoustic pressure sensor co-located with three (or two) dipole sensors that measure the amplitude and phase of particle motion in the three (or two) orthogonal directions. Being lightweight and having a small form factor (e.g., the M20–040 has a 64 mm diameter and is 179 mm tall; Fig. 2.31), they are
Fig. 2.31 Photograph (left) and receiving frequency response (right) of GeoSpectrum M20–040. Note that the units of the calibration curve are in terms of particle velocity level (PVL): dBV re 1 m/s. Permission to reprint from GeoSpectrum Technologies Inc. (http://www.GeoSpectrum.ca/; accessed 15 Mar. 2021)
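Section 2.7.2.2 describes two indirect routes to particle motion: a free-field estimate from sound pressure and the acoustic properties of the medium, and a derivation from the sound pressure gradient. Both can be sketched in a few lines of Python. This is a minimal illustration, not a measurement procedure. The seawater density and sound speed are assumed nominal values, the function names are hypothetical, and the plane-wave relation holds only far from boundaries such as the seabed and sea surface.

```python
import math

RHO_SEAWATER = 1025.0   # kg/m^3, assumed nominal density
C_SEAWATER = 1500.0     # m/s, assumed nominal sound speed


def plane_wave_velocity(pressure_pa, rho=RHO_SEAWATER, c=C_SEAWATER):
    """Free-field plane-wave estimate of particle velocity: u = p / (rho * c)."""
    return pressure_pa / (rho * c)


def acceleration_from_gradient(p1_pa, p2_pa, spacing_m, rho=RHO_SEAWATER):
    """Particle acceleration along one axis from Euler's equation,
    a = -(1/rho) * dp/dx, with dp/dx approximated by the difference between
    two simultaneous pressure samples (p2 lies further along +x than p1).
    The spacing must be much smaller than the acoustic wavelength."""
    return -(p2_pa - p1_pa) / (rho * spacing_m)


def tone_motion(velocity_amp, freq_hz):
    """For a pure tone, return (displacement, acceleration) amplitudes
    corresponding to a given particle velocity amplitude."""
    omega = 2.0 * math.pi * freq_hz
    return velocity_amp / omega, velocity_amp * omega


# Assumed example: a 1 Pa plane wave at 100 Hz in seawater
u = plane_wave_velocity(1.0)             # m/s
d, a = tone_motion(u, 100.0)             # m and m/s^2
# Assumed example: two hydrophones 0.1 m apart reading 1.0 Pa and 0.9 Pa
a_x = acceleration_from_gradient(1.0, 0.9, 0.1)
print(u, d, a, a_x)
```

The three tone amplitudes differ only by factors of the angular frequency, which is why a sensor that responds to acceleration is comparatively less sensitive at low frequencies.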
preferred over traditional hydrophone arrays for assessing directionality, especially for use on small unmanned underwater vehicles (e.g., Stinco et al. 2019). The M20 devices support directionality assessments over a frequency range of 1 Hz to 3 kHz, and the bearing uncertainty increases with decreasing frequency and decreasing SNR. Erbe et al. (2017) used a GeoSpectrum M20 to determine sound pressure, particle displacement, particle velocity, and particle acceleration from recreational swimmers, kayakers, and divers.
2.7.3 Smartphone Applications
Smartphone applications have put bioacoustic research in the hands of hobbyists and citizen scientists. Applications are inexpensive, rapidly evolving, and available on both Android-based phones and iPhones. These applications are well-suited for classroom and field demonstrations of bioacoustic research. The microphone and soundcard in cellphones from different manufacturers determine the frequency range and level of the sounds recorded and the type of analysis possible. A researcher needs to know the frequency range and amplitude sensitivity of the cellphone to ensure that the sounds of the target animals can be appropriately captured. Applications used in battery-operated cellphones provide the ability to select a recording time and duration for long-term, remote monitoring of ambient and animal sounds.
2.8 Summary
Technology used in bioacoustic research is changing rapidly. This chapter describes currently used equipment in bioacoustic studies, along with references and websites. The chapter starts with an introduction to the nomenclature used in the industry, describing the terms as they apply to animal bioacoustic research. An understanding of the terminology would assist a bioacoustician with choosing appropriate equipment with characteristics suitable for a particular study. Instruments that form a complete recording or playback setup are described in light of these characteristics, along with mentions of a few of the commonly used products available in the market. Considerations such as electronic noise, aliasing, sensitivity, resolution, and dynamic range are discussed for both terrestrial and underwater equipment. Autonomous recorders, which offer pre-packaged programmable solutions for passive acoustic monitoring, are also discussed. The discussions cover several indicative bioacoustic studies (targeting a wide variety of fauna) that highlight the use of specific equipment
for different purposes and under different conditions. Other related types of equipment used in closely related fields (such as biotremology, particle velocity measurement, etc.) are highlighted.
A priori knowledge of the target animal’s sounds is helpful in selecting appropriate equipment. Sensing and recording equipment needs to be appropriate for the environmental conditions being studied. This chapter summarizes how to select and operate microphones and hydrophones, digital recorders, automated recording systems, amplifiers, filters, sound pressure level meters, and cellphone applications. Knowing the equipment specifications and selecting components to match in frequency range and amplitude sensitivity is important. The dynamic range, amplitude sensitivity, and frequency response of each piece of equipment in a recording setup must match and suit the types of sound (i.e., their level and frequency range) intended to be recorded. Periodic calibrations of microphones and hydrophones are necessary to ensure accurate measurements are made, and the methods are described herein. With their wide availability and ease of use, smartphone-driven approaches are gaining popularity. The chapter aims to offer the reader a firm grounding in the concepts and available equipment options in bioacoustics. Pointers to seek further understanding are provided along with information about online resources that could offer more up-to-date information on the topic.
2.9 Additional Resources
Information about recording equipment:
• Review by the Macaulay Library of the Cornell Laboratory of Ornithology: https://www.macaulaylibrary.org/resources/audio-recording-gear/; accessed 30 Jan. 2021.
• Introductory guide on instruments and techniques for bioacoustics by the Interdisciplinary Center for Bioacoustics and Environmental Research, University of Pavia: http://www.unipv.it/cibra/edu_equipment_uk.html; accessed 30 Jan. 2021.
• Marco Pesente’s blog on getting started with nature recording: http://www.naturesound.it/; accessed 6 Sep. 2021.
• Useful instructions on how to build your own DIY microphones can be found on the email discussion lists naturerecordists (naturerecordists@yahoogroups.com) and micbuilders (micbuilders@yahoogroups.com).
• For biotremology, recent reviews that discuss sensor possibilities as well as playback equipment include Wood and O’Connell-Rodwell (2010) and Elias and Mason (2014). For a thorough discussion of considerations for vibrational playback experiments, we suggest Cocroft et al. (2014b). An email discussion list of vibrational communication researchers can be found at biotremology@googlegroups.com.
Smartphone applications:
• How to record birds for fun and science with a cellphone: https://www.allaboutbirds.org/news/how-to-record-bird-sounds-with-your-smartphone-our-tips/; accessed 30 Jan. 2021.
Acknowledgments SM thanks Holger Klinck, Director, K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, for his support and advice on some of the topics covered in the chapter. Thanks are extended by LAM to Lasse Jakobsen and Magnus Wahlberg, Institute of Biology, University of Southern Denmark, Odense, Denmark, and Jakob Tougaard and Peter T. Madsen, Institute for Bioscience, Aarhus University, Aarhus, Denmark, for comments on this chapter. WLG thanks Natalie Gannon and Mithriel for information and photographs on marine acoustic programs in Puerto Rico. Michael O’Farrell provided current notes on Anabat and other bat detector technology. Dean Julie Coonrod, University of New Mexico, provided academic support for completion of this project. GP thanks Marco Pesente for his contribution of material about DIY microphones.
References
Beason RD, Riesch R, Koricheva J (2018) AURITA: an affordable, autonomous recording device for acoustic monitoring of audible and ultrasonic
frequencies. Bioacoustics 28(4):381–396. https://doi.org/10.1080/09524622.2018.1463293
Bell PD (1980) Transmission of vibrations along plant stems: implications for insect communication. Journal of the New York Entomological Society 88:210–216
Belwood JJ, Morris GK (1987) Bat predation and its influence on calling behavior in neotropical katydids. Science 238:64–67. https://doi.org/10.1126/science.238.4823.64
Brüel and Kjær (1982) Condenser-microphones. Brüel & Kjær, Denmark: 1–146. http://www.bkhome.com/doc/be0089.pdf
Brüel and Kjær (2001) Environmental noise. Brüel & Kjær, Denmark: 1–67. http://www.bkhome.com/doc/br1626.pdf
Buck CL, Malavar JC, George O, Koob GF, Vendruscolo LF (2014) Anticipatory 50 kHz ultrasonic vocalizations are associated with escalated alcohol intake in dependent rats. Behav Brain Res 271:171–176
Buzzetti F, Brizio C, Pavan G (2020) Beyond the audible: wide band (0–125 kHz) field investigation on Italian Orthoptera (Insecta) songs. Biodiversity Journal 11(2):443–496
Campbell J, Sabet SS, Slabbekoorn H (2019) Particle motion and sound pressure in fish tanks: a behavioural exploration of acoustic sensitivity in the zebrafish. Behavioural Processes 164:38–47
Caruso F, Sciacca V, Bellia G, De Domenico E, Larosa G, Papale E, Pellegrino C, Pulvirenti S, Riccobene G, Simeone F, Speziale F, Viola S, Pavan G (2015) Size distribution of sperm whales acoustically identified during long term deep-sea monitoring in the Ionian Sea. PLoS One 10(12):e0144503. https://doi.org/10.1371/journal.pone.0144503
Cocroft RB (1996) Insect vibrational defence signals. Nature 382:679–680
Cocroft RB, Rodríguez RL (2005) The behavioral ecology of insect vibrational communication. BioScience 55:323. https://doi.org/10.1641/0006-3568(2005)055[0323:TBEOIV]2.0.CO;2
Cocroft RB, Gogala M, Hill PSM, Wessel A (eds) (2014a) Studying vibrational communication. Springer-Verlag, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43607-3
Cocroft RB, Hamel JA, Su Q, Gibson J (2014b) Vibrational playback experiments: challenges and solutions. In: Janik EM, McGregor P (eds) Studying vibrational communication. Springer, Heidelberg, pp 249–274
Čokl A, Prešern J, Virant-Doberlet M, Bagwell GJ, Millar JG (2004) Vibratory signals of the harlequin bug and their transmission through plants. Physiological Entomology 29:372–380. https://doi.org/10.1111/j.0307-6962.2004.00395.x
Čokl A, Zorović M, Žunič A, Virant-Doberlet M (2005) Tuning of host plants with vibratory songs of Nezara viridula L (Heteroptera: Pentatomidae). J Exp Biol 208:1481–1488. https://doi.org/10.1242/jeb.01557
Čokl A, Zorović M, Kosi AŽ, Stritih N, Virant-Doberlet M (2014) Communication through plants in a narrow frequency window. In: Janik EM, McGregor P (eds) Studying vibrational communication. Springer, Berlin, pp 171–195. https://doi.org/10.1007/978-3-662-43607-3_10
Collins J, Jones G (2009) Differences in bat activity in relation to bat detector height: implications for bat surveys at proposed windfarm sites. Acta Chiropterologica 11:343–350
Dannhof BJ, Bruns V (1991) The organ of Corti in the bat Hipposideros bicolor. Hear Res 53(2):253–268
De Souza LR, Kasumovic MM, Judge KA (2011) Communicating male size by tremulatory vibration in a Columbian rainforest katydid, Gnathoclita sodalis (Orthoptera, Tettigoniidae). Behaviour 148:341–357. https://doi.org/10.1163/000579511X559418
Elias DO, Mason AC (2010) Signaling in variable environments: substrate-borne signaling mechanisms and communication behavior in spiders. In: O’Connell-Rodwell CE (ed) The use of vibrations in communication: properties, mechanisms and function across taxa. Research Signpost, Kerala
Elias DO, Mason AC (2014) The role of wave and substrate heterogeneity in vibratory communication: practical issues in studying the effect of vibratory environments in communication. In: Janik EM, McGregor P (eds) Studying vibrational communication. Springer, Heidelberg, pp 215–247. https://doi.org/10.1007/978-3-662-43607-3_12
Erbe C (2009) Underwater noise from pile driving in Moreton Bay, Qld. Acoustics Australia 37(3):87–92
Erbe C, Verma A, McCauley R, Gavrilov A, Parnum I (2015) The marine soundscape of the Perth Canyon. Progress in Oceanography 137:38–51. https://doi.org/10.1016/j.pocean.2015.05.015
Erbe C, Parsons M, Duncan AJ, Lucke K, Gavrilov A, Allen K (2017) Underwater particle motion (acceleration, velocity and displacement) from recreational swimmers, divers, surfers and kayakers. Acoust Aust 45:293–299. https://doi.org/10.1007/s40857-017-0107-6
Erbe C, Marley S, Schoeman R, Smith JN, Trigg L, Embling CB (2019) The effects of ship noise on marine mammals: a review. Front Mar Sci 6:606. https://doi.org/10.3389/fmars.2019.00606
Favali P, Chierici F, Marinaro G, Giovanetti G, Azzarone A, Beranzoli L, De Santis A, Embriaco D, Monna S, Lo Bue N, Sgroi T, Cianchini G, Badiali L, Qamili E, De Caro MG, Falcone G, Montuori C, Frugoni F, Riccobene G, Sedita M, Barbagallo G, Cacopardo G, Calì C, Cocimano R, Coniglione R, Costa M, D’Amico A, Del Tevere F, Distefano C, Ferrera F, Giordano V, Imbesi M, Lattuada D, Migneco X, Musumeci M, Orlando A, Papaleo R, Piattelli P, Raia G, Rovelli X, Sapienza P, Speziale F, Trovato A, Viola S, Ameli F, Bonori M, Capone A, Masullo R, Simeone F, Pignagnoli L, Zitellini N, Bruni F, Gasparoni F,
Pavan G (2013) NEMO-SN1 Abyssal Cabled observatory in the Western Ionian Sea. IEEE J Ocean Engineer 38(2):358–374
Gannon WL, Lawlor TL (1989) Variation in the chip vocalization of three species of Townsend chipmunks (genus Eutamias). J Mammal 70:740–753
Gorresen PM, Miles AC, Todd CM, Bonaccorso FJ, Weller TJ (2008) Assessing bat detectability and occupancy with multiple automated echolocation detectors. J Mamm 89:11–17
Hamel JA, Cocroft RB (2019) Maternal vibrational signals reduce the risk of attracting eavesdropping predators. Front Ecol Evol 7:614. https://doi.org/10.3389/fevo.2019.00204
Hill PSM (2008) Vibrational communication in animals. Harvard University Press, Cambridge, MA
Hill AP, Prince P, Covarrubias EP, Doncaster CP, Snaddon JL, Rogers A (2018) AudioMoth: Evaluation of a smart open acoustic device for monitoring biodiversity and the environment. Meth Ecol Evol:1–13
Hill P, Lakes-Harlan R, Mazzoni V, Narins PM, Virant-Doberlet M, Wessel A (eds) (2019) Biotremology: studying vibrational behavior. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-030-22293-2
Jensen ME, Miller LA (1999) Echolocation signals of the bat, Eptesicus serotinus, recorded using a vertical microphone array: effect of flight altitude on searching signals. Behav Ecol Sociobiol 47:60–69
Klinck H, Winiarski D, Mack RC, Tessaglia-Hymes CT, Ponirakis DW, Dugan PJ, Jones C, Matsumoto H (2020) The ROCKHOPPER: a compact and extensible marine autonomous passive acoustic recording system. In: OCEANS 2020 MTS/IEEE Global. IEEE, pp 1–7
Lammers MO, Brainard RE, Au WW, Mooney TA, Wong KB (2008) An ecological acoustic recorder (EAR) for long-term monitoring of biological and anthropogenic sounds on coral reefs and other marine habitats. J Acoust Soc Am 123(3):1720–1728
Lynch E, Joyce D, Fristrup K (2011) An assessment of noise audibility and sound levels in U.S. National Parks. Landscape Ecol 26:1297. https://doi.org/10.1007/s10980-011-9643-x
Magal C, Schöller M, Tautz J, Casas J (2000) The role of leaf structure in vibration propagation. J Acoust Soc Am 108:2412–2418. https://doi.org/10.1121/1.1286098
McCauley RD, Thomas F, Parsons MJG, Erbe C, Cato D, Duncan AJ, Gavrilov AN, Parnum IM, Salgado-Kent C (2017) Developing an underwater sound recorder. Acoust Aust 45(2):301–311. https://doi.org/10.1007/s40857-017-0113-8
McNett GD, Cocroft RB (2008) Host shifts favor vibrational signal divergence in Enchenopa binotata treehoppers. Behav Ecol 19:650–656. https://doi.org/10.1093/beheco/arn017
Mennill DJ, Battiston M, Wilson DR, Foote JR, Doucet SM (2012) Field test of an affordable, portable, wireless microphone array for spatial monitoring of animal ecology and behaviour. Methods Ecol Evol 3(4):704–712
Michelsen A, Fink F, Gogala M, Traue D (1982) Plants as transmission channels for insect vibrational songs. Behav Ecol Sociobiol 11:269–281
Miller BS, Barlow J, Calderan S, Collins K, Leaper R, Olson P, Ensor P, Peel D, Donnelly D, Andrews-Goff V, Olavarria C, Owen K, Rekdahl M, Schmitt N, Wadley V, Gedamke J, Gales N, Double MC (2015) Validating the reliability of passive acoustic localisation: a novel method for encountering rare and remote Antarctic blue whales. Endang Species Res 26:257–269
Moir HM, Jackson JC, Windmill JFC (2013) Extremely high frequency sensitivity in a ‘simple’ ear. Biol Lett 9:20130241
Morris GK (1980) Calling display and mating behaviour of Copiphora rhinoceros Pictet (Orthoptera: Tettigoniidae). Anim Behav 28:42-IN1. https://doi.org/10.1016/S0003-3472(80)80006-6
Morris GK, Mason AC, Wall P, Belwood JJ (1994) High ultrasonic and tremulation signals in neotropical katydids (Orthoptera: Tettigoniidae). J Zool 233:129–163. https://doi.org/10.1111/j.1469-7998.1994.tb05266.x
Mortimer B (2017) Biotremology: do physical constraints limit the propagation of vibrational information? Anim Behav 130:165–174. https://doi.org/10.1016/j.anbehav.2017.06.015
Nosengo N (2009) The Neutrino and the whale. Nature 462:560–561
O’Connell-Rodwell CE (2010) The use of vibrations in communication: properties, mechanisms and function across taxa. Research Signpost, Kerala
Obrist MK, Pavan G, Sueur J, Riede K, Llusia D, Márquez R (2010) Bioacoustic approaches in biodiversity inventories. In: Manual on field recording techniques and protocols for all taxa biodiversity inventories. Abc Taxa 8:68–99. ISSN 1784-1283 (hard copy) ISSN 1784-1291 (online pdf). Available at http://www.abctaxa.be/volumes/volume-8-manual-atbi/Chapter-5/
Pavan G (2017) Fundamentals of soundscape conservation. In: Farina A, Gage SH (eds) Ecoacoustics: the ecological role of sound. Wiley, Hoboken, pp 235–258
Pavan G, Borsani JF (1997) Bioacoustic research on cetaceans in the Mediterranean Sea. Mar Freshw Behav Physiol 30:99–123
Pavan G, Manghi M, Fossati C (2001) Software and hardware sound analysis tools for field work. Proc. 2nd Symposium on Underwater Bio-sonar and Bioacoustic Systems. Proc. I.O.A. 23(part 4):175–183
Pavan G, Fossati C, Caltavuturo G (2013) Marine bioacoustics and computational bioacoustics at the University of Pavia (Italy). In: Adam O, Samaran F (eds) Detection classification and localization of marine mammals using passive acoustics. 2003–2013:
10 years of international research. DIRAC NGO, Paris, pp 3–25. 1–298. ISBN 978-2-7466-6118-9
Pavan G, Favaretto A, Bovelacci B, Scaravelli D, Macchio S, Glotin H (2015) Bioacoustics and ecoacoustics applied to environmental monitoring and management. Rivista Italiana di Acustica 39(2):68–74
Payne K, Langbauer WR Jr, Thomas EM (1986) Infrasonic calls of the Asian Elephant (Elephas maximus). Behav Ecol Sociobiol 18:297–301
Poole JH, Payne K, Langbauer WR, Moss CJ (1988) The social context of some very low frequency calls of African elephants. Behav Ecol Sociobiol 22:385–392
Popper AN, Hawkins AD (2018) The importance of particle motion to fishes and invertebrates. J Acoust Soc Am 143(1):470–488. https://doi.org/10.1121/1.5021594
Rajaraman K, Godthi V, Pratap R, Balakrishnan R (2015) A novel acoustic-vibratory multimodal duet. J Exp Biol 218:3042–3050. https://doi.org/10.1242/jeb.122911
Rayburn RA (2011) Eargle’s the microphone book: from mono to stereo to surround: a guide to microphone design and application, 3rd edn. Elsevier, Waltham, MA
Righini R, Pavan G (2019) First assessment of the soundscape of the Integral Nature Reserve “Sasso Fratino” in the Central Apennine, Italy. Biodiversity Journal 21(1):4–14. https://doi.org/10.1080/14888386.2019.1696229
Sarria-S FA, Buxton K, Jonsson T, Montealegre ZF (2016) Wing mechanics, vibrational and acoustic communication in a new bush-cricket species of the genus Copiphora (Orthoptera: Tettigoniidae) from Colombia. Zoologischer Anzeiger - A Journal of Comparative Zoology 263:55–65. https://doi.org/10.1016/j.jcz.2016.04.008
Sciacca V, Caruso F, Beranzoli L, Chierici F, De Domenico E, Embriaco D, Favali P, Giovanetti G, Larosa G, Marinaro G, Papale E, Pavan G, Pellegrino C, Pulvirenti S, Simeone F, Viola S, Riccobene G (2015) Annual acoustic presence of fin whale (Balaenoptera physalus) offshore Eastern Sicily, Central Mediterranean Sea. PLoS One 10(11):e0141838. https://doi.org/10.1371/journal.pone.0141838
Solé M, Sigray P, Lenoir M, Van Der Schaar M, Lalander E, André M (2017) Offshore exposure experiments on cuttlefish indicate received sound pressure and particle motion levels associated with acoustic trauma. Sci Rep 7(1):1–13
Stinco P, Tesei A, Biagini S, Micheli M, Ferri G, LePage KD (2019) Source localization using an acoustic vector sensor hosted on a buoyancy glider. In: OCEANS 2019-Marseille. IEEE, pp 1–5. https://doi.org/10.1109/OCEANSE.2019.8867452
Streicher R, Everest FA (1998) The new stereo sound book. AES Publishing, Pasadena, CA
Thode A, Skinner J, Scott P, Roswell J, Strailey J, Folkert K (2010) Tracking sperm whales with a towed acoustic vector sensor. J Acoust Soc Am 128(5):2681–2694
van der Knaap I, Reubens J, Thomas L, Ainslie MA, Winter HV, Hubert J, Martin B, Slabbekoorn H (2021) Effects of a seismic survey on movement of free-ranging Atlantic cod. Curr Biol 31(7):1555–1562.e4. https://doi.org/10.1016/j.cub.2021.01.050
Virant-Doberlet M, Čokl A (2004) Vibrational communication in insects. Neotropical Entomology 33:121–134. https://doi.org/10.1590/S1519-566X2004000200001
Wahlstrom S (1985) The parabolic reflector. J Audio Eng Soc 33(6):418–429
Weller TJ, Baldwin JA (2012) Using echolocation monitoring to model bat occupancy and inform mitigations at wind energy facilities. J Wildl Manag 76:619–631
Whytock RC, Christie J (2017) Solo: an open source, customizable and inexpensive audio recorder for bioacoustic research. Meth Ecol Evol 8:308–312
Wiggins SM, Hildebrand JA (2007) High-frequency acoustic recording package (HARP) for broad-band, long-term marine mammal monitoring. In: International Symposium on Underwater Technology and Workshop on Scientific Use of Submarine Cables and Related Technologies 2007 (IEEE, Tokyo, Japan):551–557
Wood JD, O’Connell-Rodwell CE (2010) Studying vibrational communication: equipment options, recording, playback and analysis techniques. In: The use of vibrations in communication: properties, mechanisms and function across taxa. Research Signpost, Kerala, pp 163–182
Zimmer WMX (2011) Passive acoustic monitoring of cetaceans. Cambridge University Press, Cambridge
Zimmer WMX (2013) Range estimation of cetaceans with compact volumetric arrays. J Acoust Soc Am 134(3):2610–2618
Zotter F, Frank M (2019) Ambisonics. Springer International Publishing, Cham. ISBN 978-3-030-17206-0. https://doi.org/10.1007/978-3-030-17207-7
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
3 Collecting, Documenting, and Archiving Bioacoustical Data and Metadata
William L. Gannon, Rebecca Dunlop, Anthony Hawkins,
and Jeanette A. Thomas
3.2 Ethical Research

As with all scientific endeavors, bioacousticians work to answer questions and address hypotheses by observing or manipulating the natural world. There is an ethical obligation to document procedures and methods, so that reported results are understandable and reproducible by other researchers. A reliable way for understanding data, and how they were collected, is by documenting metadata associated with a recording. Metadata are the description of basic information collected at the time of the recording, such as the recordist; date and time; specific location (GPS coordinates); equipment and settings; water depth or altitude; water or air medium; water or air temperature/humidity; weather conditions; and species, sex, age, and behavior of the animals. Knowing the who, what, when, and where, of acoustic recordings makes acoustic data more useful and allows a review of methods by other researchers to validate or supplement data.

Although bioacoustical studies are usually non-invasive, investigators need to consider and minimize any potential effects of their work on animals (e.g., avoid playbacks of extremely loud or injurious sounds that could disturb animals in critical breeding and feeding areas). In many cases, animal ethics permits and/or research permits are needed from the country, state, county, or any other political entity in which the study will be conducted. If the species is endangered, additional permits may be required. Most research institutions receiving funding from the USA government require investigators to submit an animal research protocol to an Institutional Animal Care and Use Committee (IACUC) for approval before conducting research involving any animals. Ethical conduct of research goes beyond satisfying the requirements of the IACUC and includes responsible data collection and management, appropriate statistical analyses, thorough presentation and archival of data, and a study that is reproducible. Additionally, research should be reported, peer-reviewed, and published ethically. This falls under research ethics principles and studies that are conducted with scientific integrity (Fig. 3.1). Most researchers consider their work with animals to be harmless and therefore ethical. However, the process of thinking through how animals could be affected, and proposing research methods during the preparation of an IACUC protocol can be very instructive. In some cases, preparing a protocol for review can save a project from mistakes (such as low statistical power, inadequate or illegal animal housing or handling methods, unnecessary duplication, unnecessary expense, or unrecognized alternative hypotheses). In fact, developing a research protocol can serve to make the research more robust.

Gannon (2014) provided two examples that illustrate a potentially unethical study and posed the question of whether a research permit was needed. In 1991, a rare migrant yellow-green vireo (Vireo flavoviridis) was spotted at protected parklands in Rattlesnake Springs, New Mexico, USA. The sighting was announced on the rare-bird hotline and a number of people went to the area to view the bird and to add it to their “life list.” During this time, a PhD student was collecting goldfinches (Spinus tristis). Knowing that genetic material and voucher specimens are important to taxonomic and conservation research, he decided to collect the rare bird for a museum research collection. To entice the bird to an unprotected area for easy and legal collection, he recorded calls of the vireo and then played them back where he could legally collect the bird. The birding community became incredulous and angry. Was it ethical to record and use playbacks of this species’ calls to lure the bird to an unprotected area for collection (see Gluck 1998)?

More recently, as characterized in Fig. 3.2, a smartphone birding application was used to lure a male common yellowthroat (Geothlypis trichas) into view. White (2013) described that broadcasting calls, using a smartphone application, generally elicits a quick response from a normally concealed bird. Possibly thinking the sounds were from another male of his species and threatening his territory, the male yellowthroat swooped down right in front of a birding tour
Fig. 3.1 A collage of common reference materials and journals that are used to advise on the responsible conduct of research with animals. Considerations of the integrity of the scientific process and the ethics of how a study is conducted undoubtedly produce better science
Fig. 3.3 Conditions in the field often contrast sharply from those in a controlled laboratory environment. Working to exclude bats (Townsend’s big-eared bat, Corynorhinus townsendii) from gold mining operations in Nevada, USA (top left). Recording assures animals are excluded prior to destroying the tunnel system for mineral extraction. Mitigation sites are identified (top right) which are gated and protected for bats to inhabit safely. Occasional sampling is completed by live-capture (bottom left) and acoustic monitoring (bottom right). All photos by authors except bottom left (MNH field biologists collect bat specimens, by Florante A. Cruz; https://www.wikiwand.com/en/UPLB_Museum_of_Natural_History; licensed under CC BY-SA 4.0; https://creativecommons.org/licenses/by-sa/4.0/)
92 W. L. Gannon et al.
Fig. 3.4 Photographs of researchers in Antarctica recording a killer whale (Orcinus orca; left) and Weddell seal (Leptonychotes weddellii; right). Equipment is both protected from being molested by the animal but also not prominent so as to not draw the subject’s attention. Note the researcher on the right maintains a distance from the seal so as not to disturb it
Documenting the ambient temperature and humidity is especially important when studying ectothermic terrestrial animals, such as reptiles, frogs, toads, insects, or other invertebrates. At low ambient temperatures, ectothermic animals are less active and sounds are lower in frequency than during higher ambient temperatures. For example, studies by Kissner et al. (1997) demonstrated that sounds from ectothermic animals, such as rattlesnakes (Crotalus viridis), change with ambient temperature and humidity.

3.3.3 Animal Considerations

The transducer should be positioned so target-animal sounds are recorded but the animal does not damage the equipment. An aggressive or curious animal can quickly demolish a recording system (Fig. 3.4). Equipment used in playback studies can be particularly susceptible to an animal attack. The goal of recording is to document sounds from natural circumstances and not from a charging or frightened animal. Captive animals often are curious about a hydrophone or a microphone in their enclosure and can need time to habituate to equipment before undisturbed sounds are produced. Placing the transducer in a protected area or in a protective mesh cage may be necessary.

Researchers should not disturb animals while recording (Fig. 3.5). If possible, the recordist should hide in a blind spot or use an automated recording system with no observer present. Note that narrating observations of the animal’s behavior during the recording is sometimes useful, which means that the researcher should decide between using a remote setup and a setup where they are nearby. To concurrently monitor animal behavior, a video camera on a tripod can be used, with minimal disturbance to the animal. However, the researcher should be aware that the audio track of a video camera has a limited frequency response and an auto-adaptive level control, meaning these sound recordings should not be relied upon for acoustical analysis. Closed Circuit Television (CCTV), synchronized with omnidirectional microphones on an ultrasonic detector, and coordinated using a mobile phone and speaking clock, has been used to document new vocalizations and activity patterns for barbastelle bats (Barbastella barbastellus; Young et al. 2018). With a little ingenuity, a researcher can create a robust recording system.

To save time and expense, it is important to know whether a species has a preferred time of day or season for producing sounds. Many species are most vocal during the breeding season. Some birds and amphibians are most soniferous at dawn and dusk whereas many chorusing
insects primarily produce sounds at dusk. For example, Thomas and DeMaster (1982) showed that Antarctic crabeater seals (Lobodon carcinophaga) preferred to call under water between 2100 h and 0500 h and were hauled-out on the ice at other times. If the number of vocalizations was used as a population count, a census of crabeater seals at 1200 h would have yielded a much lower population estimate than a census at 2400 h. Bats, obviously, are active at night. However, there is usually a notable peak of activity approximately 30 minutes after dusk (Kunz and Parsons 2009). Some species (many in the genera Myotis and Tadarida) are more likely to be recorded during the first four hours of night, while others emerge past midnight (Euderma, Artibeus). Some bats have multimodal activity patterns (Sherwin et al. 2000) and many sciurids (e.g., Marmota and Neotamias) actively vocalize in the morning and then again in late afternoon (Gannon 1999). Some species (e.g., prairie dogs, Cynomys, and pikas, Ochotona) are seasonally soniferous all day (Slobodchikoff et al. 1998; Smith et al. 2016).

It is important to know the effects of both time of day and month to interpret the behavioral context of a recording. For example, breeding data from the North American male rufous-sided towhee (Pipilo erythrophthalmus) showed that males reached breeding condition around mid-April. Testes were in regression by 20 July and had become inactive by mid- to late-September (Davis 1958). So, if a researcher desires to record sounds of this species associated with breeding, the study should be conducted from mid-April to mid-July. In addition, this species shifts its song to an earlier start time in relation to civil twilight. As day length increases between the spring equinox and the summer solstice, civil twilight occurs earlier in relation to sunrise, causing the dawn calling period to lengthen.

3.3.4 Documentation and Data Sheets

Documentation is very important. A logbook should accompany each recording to provide metadata on the recordist; the recording system and equipment settings (e.g., any filter or gain settings); the location, date and time; environmental conditions; types of sounds recorded; the animals’ behavior (e.g., breeding, feeding, or socializing); a specific animal number (if marked); and any other circumstances which could be valuable for analysis.

Many devices may record some of the metadata automatically. For instance, the Echo Meter Touch 2 PRO Ultrasonic Module using
Table 3.1 Sample logbook showing important metadata to be noted. Examples from author (JAT) notes for Weddell seal (Leptonychotes weddellii) and sea otter (Enhydra lutris)

Tape | Counter | Collector | Date | Time | Location | Subject | Quality | Comments
2 | 234 | JA Thomas | 23 March 2004 | 16:00 | McMurdo | Weddell seal | Poor | Underwater, adult male, 839W, wind 20 knts
13 | 22 | CM Smith | 18 Sept 2004 | 13:15 | Valdez, AK | Sea otter | Excellent | Airborne, mother and pup, unmarked, no wind
Kaleidoscope Pro software¹ (Wildlife Acoustics, Maynard, MA, USA) records calls to an iPhone or other device and collects metadata about each recording. Metadata can then be displayed with Kaleidoscope software or exported to a spreadsheet. Recording directly to a computer produces time-stamped (and often GPS-stamped) files.

If a datasheet (spreadsheet) is used, put metadata headers in the first row and fill the rows below with your observations (Table 3.1). Each sound or bout of sounds should be assigned a unique number for easy reference later, and a variety of variables can then be noted for each sound (Table 3.2). Spreadsheets can be imported directly into a variety of statistical and graphing software products for analyses (see Chap. 9 on analytical approaches). Note that datasheets for playback studies usually include additional variables on animal behavior (Table 3.3).

3.3.5 Trouble-shooting Equipment Problems

Often field work is conducted in remote locations, sometimes without easy access to the Internet, electricity, or equipment repairs. Consider all possible equipment problems and always have backups of everything. A good motto for field work is to “bring one to use and one to lose” (Fig. 3.5). Studies usually are costly and time-consuming, particularly in remote locations. There is nothing worse than a missed field opportunity caused by the lack of a cable or battery. Bring proper tools to the field site to make repairs: soldering iron, solder, electrical wire, heat-shrink tubing, electrical ties, electrical tape, extra cables and connectors, batteries (preferably rechargeable, with charger), multi-meter, etc. If possible, pack replacement equipment: anemometer, thermometer, laptop with extra charger, external speakers, software for data entry, backup hydrophone or microphone, headset, walkie-talkie, smartphone, microphone for narration onto a PC, and data storage devices (SD-cards, thumb-drive, external hard-drive). Why are duplicates necessary? If you cannot repair something, then use backups so the research effort is not wasted.

Moving or shipping equipment often creates problems with loose connections or fittings. If equipment is not operating properly, tighten fasteners on the equipment housing, make sure circuit boards are seated properly, check that batteries are fully charged, and make sure all cables are connected and working. To check for cable malfunction, use an ohm-meter to make sure the resistance of a cable is close to zero. If new equipment is used in a study, always unpack it and check its operation in the laboratory before going to the field. Bring manuals for all equipment to the field site or know where to reliably access them.

3.4 Playback Methods and Controls

Projections of sounds to animals (or playbacks) are common methods of study in bioacoustics (Fig. 3.6). Several authors have used playbacks to determine the function of a specific animal sound by measuring the animal’s behavioral response (Morton and Morton 1998).

¹ https://www.wildlifeacoustics.com/products/echo-meter-touch-2-pro-ios and https://www.wildlifeacoustics.com/products/kaleidoscope-pro; accessed 13 June 2022
Table 3.2 Sample data sheet for acoustic measurements on airborne vocalizations of California sea lions (Zalophus californianus); na not applicable. Courtesy of Schwalm (2012)

Date | Sound | Dominant start frequency (Hz) | Dominant end frequency (Hz) | Dominant maximum frequency (Hz) | Dominant minimum frequency (Hz) | Harmonics | First harmonic interval (Hz) | First component duration (s) | Interval to second component (s) | Total duration (s) | Rate
3 July 2011 | 1 | 642 | 755 | 943 | 491 | Yes | 453 | 0.838 | na | 0.838 | Single
4 July 2011 | 2 | 566 | 566 | 717 | 415 | Yes | 377 | 0.253 | 0.148 | 5.211 | Even, series of 2
5 July 2011 | 3 | 640 | 534 | 720 | 294 | No | na | 0.139 | na | 0.139 | Single
9 July 2011 | 4 | 614 | 800 | 881 | 480 | Yes | 26 | 0.388 | 0.146 | 3.477 | Accelerate, series of 4
9 July 2011 | 5 | 587 | 667 | 747 | 427 | Yes | 400 | 0.165 | 0.146 | 5.57 | Irregular, series of 6
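
A datasheet like Table 3.2 (column headers in the first row, one uniquely numbered sound per row, and "na" where a variable does not apply) can be kept as a plain CSV file and read back for analysis. The following is a minimal sketch, not the method used by Schwalm (2012): the short column names are hypothetical stand-ins for the Table 3.2 headers, only a few of the variables are kept, and Python's standard csv module stands in for whatever spreadsheet or statistics package is actually used.

```python
import csv
import io
import statistics

# Hypothetical short column names standing in for the Table 3.2 headers.
FIELDS = ["date", "sound_id", "dom_start_hz", "dom_end_hz", "second_interval_s"]

# Values copied from Table 3.2; "na" marks a variable that does not apply.
ROWS = [
    ["3 July 2011", "1", "642", "755", "na"],
    ["4 July 2011", "2", "566", "566", "0.148"],
    ["5 July 2011", "3", "640", "534", "na"],
    ["9 July 2011", "4", "614", "800", "0.146"],
    ["9 July 2011", "5", "587", "667", "0.146"],
]


def write_datasheet(rows):
    """Return CSV text with the column headers in the first row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buffer.getvalue()


def to_number(text):
    """Convert a cell to float, mapping 'na' to None."""
    return None if text.strip().lower() == "na" else float(text)


def read_datasheet(csv_text):
    """Parse CSV text into one dict per sound, keyed by the header row."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append({
            "date": row["date"],
            "sound_id": int(row["sound_id"]),
            "dom_start_hz": to_number(row["dom_start_hz"]),
            "dom_end_hz": to_number(row["dom_end_hz"]),
            "second_interval_s": to_number(row["second_interval_s"]),
        })
    return records


records = read_datasheet(write_datasheet(ROWS))

# Every sound should carry a unique number for later reference.
ids = [r["sound_id"] for r in records]
assert len(ids) == len(set(ids))

mean_start_hz = statistics.mean(r["dom_start_hz"] for r in records)
intervals = [r["second_interval_s"] for r in records
             if r["second_interval_s"] is not None]
print(f"mean dominant start frequency: {mean_start_hz:.1f} Hz")
print(f"sounds with a second component: {len(intervals)}")
```

Mapping "na" to None, rather than to zero, keeps not-applicable cells out of averages; the same file can be opened unchanged in a spreadsheet or statistics package.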
Table 3.3 Sample data sheet for playback study with white-cheeked gibbons (Nomascus leucogenys); na not applicable. Courtesy of Yegge (2012)

Session | Date | Time | Type of playback | Animal | Quadrant number | Mate in same quad | Number calls | Approach speaker | Move away from speaker | Stationary | Climb | Move | Groom | Out-of-sight
3 | 29 Aug | 16:00 | Control | CJ | 2 | Yes | 0 | Yes | No | No | Yes | Run | No | No
3 | 29 Aug | 16:00 | Control | Max | 2 | No | 0 | No | No | Yes | No | na | Yes | No
4 | 30 Aug | 16:30 | Own duet | CJ | 3 | Yes | 2 | No | Yes | No | No | Walk | No | No
4 | 30 Aug | 16:30 | Own duet | Max | 4 | Yes | 4 | No | No | Yes | No | na | Yes | No
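
Once playback observations are recorded one row per animal per session, as in Table 3.3, responses can be tallied by playback type so that treatment and control sessions are compared on the same variables. The sketch below is illustrative only: the tuple layout and the function name summarize are assumptions made for this example, and only the playback type, number of calls, and approach columns of Table 3.3 are used.

```python
from collections import defaultdict

# Rows copied from Table 3.3, reduced to a few columns for brevity.
# Assumed layout: (session, playback type, animal, number of calls, approached speaker)
OBSERVATIONS = [
    (3, "Control", "CJ", 0, True),
    (3, "Control", "Max", 0, False),
    (4, "Own duet", "CJ", 2, False),
    (4, "Own duet", "Max", 4, False),
]


def summarize(observations):
    """Count animals, calls, and approaches for each playback type."""
    summary = defaultdict(lambda: {"animals": 0, "calls": 0, "approaches": 0})
    for _session, playback, _animal, calls, approached in observations:
        entry = summary[playback]
        entry["animals"] += 1
        entry["calls"] += calls
        entry["approaches"] += int(approached)
    return dict(summary)


summary = summarize(OBSERVATIONS)
for playback, entry in summary.items():
    print(playback, entry)
```

Four rows are far too few to support any inference; the point is only that a consistent datasheet layout makes control and treatment sessions directly comparable.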
Fig. 3.6 Playback studies are those by which an animal or group of animals is played their calls (or calls of their conspecifics) back to them and then their response is recorded. Research using playbacks has been used commonly in mammals (such as squirrels, prairie dogs, pika, carnivores, and primates), birds, reptiles, fish, and many others. Painting “His Master’s Voice” by Francis Barraud (1856–1924). Source: Victor Talking Machine Company. Public domain; https://commons.wikimedia.org/wiki/File:His_Master%27s_Voice.jpg
Playback studies on fish have been used to determine species recognition from a particular sound, to classify different call types, to identify effects of sound on fish behavior, to study how a call was coded, and to measure acoustic parameters of the call relevant to communication (Zelick et al. 1999). For example, Myrberg and Riggio (1985), studying bicolor damselfish (Stegastes partitus), found that males produced sounds more often in response to playbacks of conspecific sounds than to sounds of other species, and responded more readily to sounds from non-resident fish than to sounds from their nearest neighbor. Playbacks of male Lake Malawi cichlid fish (Pseudotropheus zebra) sounds to female cichlids caused them to lay eggs earlier than control female fish of another Lake Malawi cichlid species (Pseudotropheus emmiltos; Amorim et al. 2008). Simpson et al. (2011) played back ambient sounds of different reefs to coral reef fish and showed that fish approached the sounds of their native coral reef rather than sounds from a foreign reef. Hawkins et al. (2014) played back recordings of impulsive pile driving sounds attracting European sprat (Sprattus sprattus) in mid-water in the sea (Fig. 3.7).

Many birds respond to playbacks of their own or other animal sounds by approaching the projector and sometimes even attacking the speaker (Fig. 3.8). Emlen (1972) investigated how information is encoded in bird song by altering components of indigo bunting (Passerina cyanea) song and playing back the modified songs to male territory holders. He quantified the intensity of responses to modified songs and thus inferred the importance of temporal, structural, and syntactical features for both individual- and species-recognition.

Beecher and Burt (2004) played back territorial sounds from male song sparrows (Melospiza melodia) that were in neighboring territories versus distant territories. The males were slower and less likely to fly over and explore the sounds from a neighbor than calls from a distant male. When a song from a distant territorial male was played, the subject almost always matched or replicated the song and approached the speaker as if looking for an intruder. In contrast, when the song of a
Fig. 3.7 Responses of sprat (Sprattus sprattus) schools to sound exposure. Vertical lines indicate the beginning and end of each sound sequence. (a) Echogram of a medium-sized sprat school, cut off abruptly after the beginning of the sound, and reappearing a few seconds later as a denser school slightly closer to the seabed. (b) A medium-sized sprat school cut off at the onset of the sound and reappearing seconds later slightly closer to the seabed. (c) A large sprat school cut off at the onset of the sound and reappearing at a greater depth at lower density. (d) A small sprat school increasing in density in response to sound exposure. From Hawkins et al. 2014. © Acoustical Society of America, 2014. All rights reserved
neighbor male was played, 85% of the time the subject sang a different song, but one familiar to the neighbor. By responding with a different, but shared song, the subject sparrow indicated it recognized that the sounds were from a neighbor.

The function of alarm calls in ground squirrels and prairie dogs (Spermophilus and Cynomys, respectively) was largely determined or confirmed by playing back previously recorded calls to an attentive colony of these rodents in the field and observing their responses (e.g., Slobodchikoff et al. 2009). Prat et al. (2016) used playback techniques of calls recorded from the Egyptian fruit bat (Rousettus aegyptiacus) to show that 16 sounds recorded and played back from this bat provided enough information to identify who was calling, where they were calling from, what they were calling about, and what sort of response the receiver made to the vocalization.

Yegge (2012) and Thomas et al. (2016) reported using playbacks of duets to restore a pair-bond in yellow-cheeked gibbons (Nomascus gabriellae). A breeding pair of captive gibbons stopped duetting during construction near their exhibit that lasted about 6 months. Afterwards, the authors played back sounds of the pair’s previous duet, along with silent and music controls. The pair slowly resumed their duet, established a pair-bond, and were still duetting some 5 years later.

Playback experiments with marine mammals are less common due to the logistical challenges of undertaking these experiments at sea. However, there are a few examples. Weddell seals (Leptonychotes weddellii) produced geographically different vocal repertoires that have potential for identifying discrete breeding stocks of Antarctic seals (Thomas et al. 1983). Charrier et al. (2013) used playback methods to confirm that bearded seals (Erignathus barbatus) recognized vocalizations of their species from different regions. Territorial male harbor seals (Phoca vitulina) use roars given by intruding seals to locate and challenge those intruders (Hayes et al. 2004). Deecke (2003) used playbacks to examine whether captive harbor seals could distinguish sounds from killer whales (Orcinus orca) that eat seals versus killer whales that eat fish; the seals exhibited fearful responses when sounds by the former were broadcast. Wild killer whales either approached or ignored playbacks of sounds from another killer whale pod, but did not call in response. However, when their own calls were played, most killer whales approached the source and the entire pod started calling in response (Filatova et al. 2011). Clark and Clark (1980) described playback experiments showing that right whales (Eubalaena australis) can differentiate between conspecific sounds and other sounds. Playbacks of their own song or social sounds to wild humpback whales (Megaptera novaeangliae) resulted in some animals approaching, some charging the source, and others moving away (Mobley et al. 1988; Tyack 1983).

Before a playback session, the researcher should always check the projected sound near the animal to make sure the sound is not distorted and is of sufficient amplitude to mimic the intended sound. Ideally, playback experiments should be carried out on wild animals that are free to move within their natural habitats. Captive animals often are de-sensitized to reoccurring sounds, and confinement within a small space can greatly alter their behaviors and vocalizations. It is especially important to ensure that playback experiments are carried out under appropriate acoustic conditions, where the transmitted sounds are free from distortion, and reflection and reverberation are minimal. This is a particular problem with playback experiments on fish, where sounds can be greatly altered by the acoustic environment, especially in small aquarium tanks (Parvulescu 1964; Grey et al. 2016; Rogers et al. 2016).

Playback studies require controls to ensure the animal is responding to the projected sound and not to the noise/hum of equipment or the novelty of a new sound. Current sound analysis and sound-generation software allows the manipulation of many sound characteristics that could be used as a control. There are several types of controls used by investigators: 1) Merely turn on the equipment to replicate the electronic/background noise. 2) Play the animal’s own sound,
but backwards. This projects the same frequency, amplitude, and time relationships of the actual sound, but in a different order. 3) Play the animal’s sound at a higher or lower speed. This transforms the projected sound into a different frequency range. 4) Play a call with parts filtered out. 5) Play something totally novel to the animal, such as sounds from another species it has never encountered, music, machinery noise, or human speech. 6) Play sounds typical of the animal’s natural environment.

3.5 Considerations for Terrestrial Field Studies

When recording on land from a vehicle (such as during a truck survey for bat sounds), ground-generated noise can be a problem. In fact, Borkin et al. (2019) reported a negative relationship between bat activity and night-time traffic volume on New Zealand highways; when traffic increased, the probability of detecting bats decreased. These researchers used stationary automatic bat detectors to avoid their own road noise. Some solutions include: stopping and turning the vehicle off and recording in silence; using a recently paved asphalt track rather than an older and noisier road or a dirt track; and carrying out vehicle transects using electric vehicles. Road surveys are valuable, but reducing non-biotic noise would make these transects even more valuable. Terrestrial recordings can be contaminated with nearby traffic noise. It is therefore advisable to make a sample recording, check it for ambient noise, and select an optimal quiet area.

Air temperature can be a problem. Thomas, Zinnel, and Ferm (1983), when recording Weddell seal breeding colonies, used water-activated chemical heat packs placed next to recording equipment and batteries in an insulated box to keep equipment warm in the Antarctic for 24-hour periods. In extremely warm locations with high humidity, moisture can collect on recorders or microphones. Placing recording equipment inside an insulated box with desiccants can minimize moisture problems. In rain forests, equipment must be totally waterproof. During periods of heavy rain, sounds from animals will either not be heard or be masked by the rain.

A common problem in bioacoustical studies in terrestrial environments is the presence of acoustically-active non-target animals. If a non-target species calls in a specific frequency band, their sounds can perhaps be filtered out, but in many cases, this is not possible. Some analysis software allows the user to define the frequency and amplitude of a target species’ calls and then automatically identifies only those calls in a recording. However, in many cases, finding locations and times when only an individual animal is vocalizing provides the best opportunity to make quality recordings.

A good solution for animals such as bats is to use units which are self-contained and weather resistant (see Chap. 2, section on bat detectors). Each unit can include a receiving transducer, storage device, or laptop programmed to record at intervals and can be powered by rechargeable battery packs or solar panels. Data can be recovered daily, weekly, monthly, or even uploaded in the proximity of Wi-Fi for automated data retrieval. Arrays of bat detectors have been used to record ultrasonic calls of bats, as well as to sample the acoustic landscape, estimate biodiversity, and estimate species density (Carles et al. 2007; Sherwin et al. 2000).

3.6 Considerations for Aquatic Field Studies

Studies in freshwater are easier on the equipment than in saltwater environments; saltwater’s corrosive properties require that underwater equipment be rinsed with freshwater after use and recorders and hydrophones be wiped down to remove saltwater deposited from the air. It is, of course, good practice to wipe down and dry all equipment, whether it was deployed in saltwater, in freshwater, or on land, after use to avoid any rusting or build-up of deposits.

Maintenance and calibration of equipment such as hydrophones have been shown to be important for long-term monitoring studies and data integrity. This includes considerations such as
the pressure rating on the hydrophone and the length of cable that is waterproofed; the longer the cable, the higher the impedance and the greater the signal attenuation. Some plastic-coated cables, if deployed for long periods, are vulnerable to damage by marine organisms, shark bites, and even sea urchins. Polytetrafluoroethylene (PTFE) coated cables are less susceptible to damage of this kind. In addition, acoustic-release mechanisms (to allow equipment to surface) can malfunction when encrusted by marine creatures. In a review of underwater soundscape ecology to monitor habitat health in general, and fish spawning in particular, Lindseth and Lobel (2018) summarized current recording and sampling methods including metrics commonly used in analyses of aquatic acoustic data. They point out that there have been significant technological advances in equipment, especially hydrophones.

In aquatic situations, there can be electronic interference from improper grounding on the vessel, depending on the types of electronic equipment running onboard (e.g., lights, radios, freezers, generators, winches, fans, air conditioners, or furnaces). A quick-fix to grounding problems on a ship is to drop a bare wire into the water with the other end attached to the recording equipment. However, a trial-and-error approach may be needed to resolve this.

Flow noise is a problem that causes artifacts in the recordings. Water flow over the hydrophone and its mooring can create turbulence and small eddies (vortex shedding). These lead to fluctuating pressure around the hydrophone, which is sensed by the hydrophone and appears as noise in recordings. But this “noise” is not due to a traveling acoustic wave and hence not due to sound in the environment. It is an artifact. Flow noise is often a problem in rivers but also offshore (see flow noise marked in the spectrograms in Fig. 3.3 in Erbe et al. 2015). It can require the use of a shield or deflector, or placement of the hydrophone in a sheltered area.

Sound-recording acoustic tags are attached to marine animals to record their vocalizations and examine the effects of anthropogenic noise in the marine environment relative to animal generated sound. Flow noise (generated simply by water flowing around the tag) can be useful in this instance, as it can be used to estimate whale speed (von Benda-Beckmann et al. 2016; Fig. 3.9). However, interference by background noise is also a common problem. Unfortunately, survey vessels produce noise while operating. Therefore, to avoid unnecessary mechanical background noise during recordings, turn off any non-essential equipment (such as engines, pumps, filters, fans, generators, lights, refrigerators, and winches). However, fishing, military, research, and whale-watching boat operators often are reluctant to do this. Alternatively, these vessel sounds can be filtered out during recording or analysis.

In rivers or shallow coastal areas, currents and tides transport sediment which may create noise. It may come as quite a shock when an entire recording is ruined by nonstop sand swishing
back and forth over the hydrophone, creating under water). It is important to measure and
noise between 10 Hz and 2 kHz (Erbe 2009). understand the sound speed profile in the study
Perhaps more amusing shallow-water “mooring area to know the propagation pattern and range of
noise” occurred when a group of teenage girls a signal, which influence the recorded sound. For
swam over to the mooring, held on to the floats years, navies of the world measured sound speed
and sang ABBA songs for 20 minutes—very profiles using disposable, battery-operated CTD
clearly recorded. The entire recording session (conductivity, temperature, depth) units, which
had to be discarded (Erbe 2013). were tossed into the ocean and data sent back to
Similarly, a hydrophone fixed to a ship, boat, the ship as the unit fell in the water and unspooled
buoy, or dock will bob up-and-down and produce a long copper wire. The units were not retrieved.
spurious signals such as flow noise as the water Today, retrievable, digital CTD units are used.
passes the hydrophone and artifacts from hydro- The sound speed profile may change over the
static pressure changes as the hydrophone course of a day—within the upper few meters
changes its depth. The recording can be saturated below the sea surface. Turl and Thomas (1992)
with such signals. This noise can be reduced by documented that a false killer whale (Pseudorca
suspending the hydrophone with a bungee cord, crassidens) echolocating during target-detection
decoupling the floating hydrophone from the sur- distance experiments in Kaneohe Bay, Hawaii,
face through a catenary line, or mounting the USA, consistently performed better during the
hydrophone on the seafloor (Fig. 3.10; also see morning than afternoon; i.e., the whale could
Chap. 2, section on PAM systems). Another solu- detect the target at a greater distance during the
tion to reduce flow noise is to use a sonobuoy or morning. After taking CTD measurements prior
an anti-heave buoy (see photograph in Chap. 4, to the morning and afternoon sessions, the
section on sonobuoys). The long cable of the researchers realized the water column, and thus
sonobuoy acts as a bungee cord to dampen verti- sound speed profile, were very different between
cal oscillations of the hydrophone. The sonobuoy the two periods because or prevailing midday
is isolated from self-noise of the vessel, but will rains.
detect sounds from the vessel until it moves out of Sound propagation is particularly complicated
range. in shallow water because of the close proximity of
Local sound propagation conditions will affect boundaries formed by the sea surface and seabed
the recording (see Chap. 6 on sound propagation (Rogers and Cox 1988). Sound is reflected,
Fig. 3.10 Mooring options to avoid noise artifacts: (a) recorder on the seafloor, (b) recorder suspended from a float via a bungee cord and drogue, and (c) recorder suspended via a catenary line (Erbe et al. 2019). Labeled components: acoustic release, floats, hydrophone, rope. © Erbe et al.; https://www.frontiersin.org/articles/10.3389/fmars.2019.00606/full. Published under a Creative Commons Attribution License (CC BY); https://creativecommons.org/licenses/by/4.0/
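To illustrate why CTD casts matter for the sound speed profile discussed above, the following minimal sketch converts temperature, salinity, and depth into an approximate sound speed. It assumes Medwin's (1975) simplified equation, which is not part of this chapter, and the two casts are hypothetical values chosen only to mimic a fresher, warmer surface layer after midday rain; the function and variable names are likewise illustrative.

```python
def sound_speed_medwin(temp_c, salinity_ppt, depth_m):
    """Approximate sound speed in seawater (m/s).

    Assumption: Medwin's (1975) simplified equation, usually quoted as
    valid for about 0-35 degC, 0-45 ppt salinity, and 0-1000 m depth.
    """
    t, s, z = temp_c, salinity_ppt, depth_m
    return (1449.2 + 4.6 * t - 0.055 * t**2 + 0.00029 * t**3
            + (1.34 - 0.010 * t) * (s - 35.0) + 0.016 * z)


def profile(cast):
    """Turn a CTD cast [(depth_m, temp_c, salinity_ppt), ...] into
    a sound speed profile [(depth_m, speed_m_per_s), ...]."""
    return [(z, sound_speed_medwin(t, s, z)) for z, t, s in cast]


# Hypothetical casts (not measured data): well-mixed water in the morning,
# and a warmer, fresher surface layer in the afternoon.
morning_cast = [(0.0, 25.0, 35.0), (2.0, 25.0, 35.0), (5.0, 24.8, 35.0)]
afternoon_cast = [(0.0, 26.5, 30.0), (2.0, 26.0, 33.0), (5.0, 24.8, 35.0)]

for label, cast in (("morning", morning_cast), ("afternoon", afternoon_cast)):
    for depth, speed in profile(cast):
        print(f"{label:9s} depth {depth:4.1f} m  c = {speed:7.1f} m/s")
```

Comparing the two printed profiles shows how modest changes in surface temperature and salinity shift the sound speed in the upper few meters, which is the layer the text identifies as most variable over a day.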
102 W. L. Gannon et al.
scattered, and absorbed at these boundaries. There is far more attenuation of low-frequency sounds in shallow water compared to deep water. Rogers and Cox (1988) suggested that the lowest frequency that could propagate in water less than 1 m deep was about 300 Hz, but this was strongly dependent on the nature of the seabed (sand, rock, or mud).

Ambient noise is an omnipresent issue and may mask the signals desired for recording (see Chap. 7 on soundscapes). Wind and precipitation create noise underwater from coastal to offshore regions. In polar regions, ice popping and cracking may dominate the soundscape. When a hydrophone was dropped in the ice-covered water next to a group of Antarctic Weddell seals (JAT, personal observations), music was heard from the radio station at the New Zealand Research Base in Antarctica about 2 km away! Organisms from tiny snapping shrimp to enormous singing whales may also mask recordings of a target species. Ship noise is almost omnipresent in the world’s oceans, so it can be difficult to obtain recordings of a target species in a quiet aquatic environment.

3.7 Considerations for Studies on Captive Animals

Because there are regulations on the housing and care of captive animals, research permit and IACUC requirements can be more detailed for research on captive species. However, often those regulations were written for laboratory animals used in medical research (mostly Rattus and Mus) and are not specific or applicable to wild animal research. For example, one of us (WLG) had to convince the university veterinarian to allow kangaroo rats (Heteromyidae, Dipodomys) to be housed using sandy desert soils instead of rat bedding so that these wild animals could properly sand-bathe and tunnel.

Zoos and aquaria support bioacoustical studies on a wide variety of species, including endangered species. Some benefits of studying captive animals in a zoo are that their history is usually known (i.e., wild caught vs. captive born, sex, age, reproductive history, relatedness to other animals, and health). Care should be taken to study healthy animals, as opposed to ill or rehabilitating animals, to best represent the acoustic abilities of their wild counterparts. However, burgeoning research by Therrien et al. (2012) indicated that changes in vocal behavior of bottlenose dolphins (Tursiops truncatus) and California sea lions (Zalophus californianus) actually could be used to indicate a health problem (Schwalm 2012). Moreover, captive animals, especially those that have been hand-reared or raised in a hatchery (such as salmon or sea bass), can show some degree of genetic selection, de-sensitization, and habituation to the presence of high levels of ambient sound. They can be much less responsive to sounds than wild animals.

Most zoos have noise created by loudspeaker announcements, music, shows, rides, or facility vehicles. Key events, such as hearing music for a show, or a vehicle delivering food, may affect animal behavior; therefore, studies should not be conducted during those times. As in Ivan Pavlov's experiments of the 1890s, in which dogs were conditioned to drool at the sound of a bell that accompanied feeding (a conditioned response), researchers need to be aware of regular triggers of animal behavior. Of course, a common source of noise in captive studies is from visitors, keepers, and maintenance workers. If at all possible, it is best to conduct research before or after humans are near the study location (i.e., before or after the zoo is open). If possible, operation of air conditioners, furnaces, air filters, and lights should be stopped, or minimized, to reduce or eliminate background sounds in recordings. Some facilities isolate their mechanical equipment in a separate building from the animals’ environment; this greatly reduces noise exposure for the animals. A preliminary survey of noise in the animals’ enclosure, using a sound pressure level meter, helps identify any particularly noisy or quiet areas.

Sometimes, ultrasonic noise or underwater noise can be present unbeknownst to zoo or aquarium staff. One of us (JAT, personal observations) provided two examples. In an underwater hearing study on a Pacific white-
3 Collecting, Documenting, and Archiving Bioacoustical Data and Metadata 103
of the sound-field in small tanks were first pointed out by Parvulescu (1964) and recently discussed by Duncan et al. (2016), Grey et al. (2016), Rogers et al. (2016), and Popper and Hawkins (2018). Even in quite large tanks, the sound-field generated by even a simple sound source is transformed by interactions with boundaries (i.e., walls, floor of pool, and water surface) and can vary rapidly as a function of both space and frequency. The resulting sound-field can be difficult to model, or even characterize, and the sound level can be very different from the natural environment. In particular, the levels of the particle motion components of the sounds (to which fish are sensitive) can be very high. Attempts at dampening reverberation by adding materials such as “horse hair” or bubble-wrap can be effective at high frequencies, but have little effect at the low frequencies to which fish are sensitive and where the sound wavelength often exceeds the dimensions of the tank (Popper and Hawkins 2018). In contrast, experiments performed in deep and open water allow the establishment of a relatively simple, well-controlled, and predictable sound-field (Hawkins 2014).

Grey et al. (2016) measured the sound-field in several large laboratory tanks and came to the following conclusions: (1) Tanks, even large ones, are not appropriate surrogates for open-water environments. (2) Tank wall thickness is largely irrelevant. Walls backed by air essentially present a low impedance, and walls in contact with a solid foundation or ground present finite (non-rigid) impedance defined by the substrate materials. (3) Resonance of the tank walls can dominate underwater sound-field characteristics. (4) Lining the walls of a tank with acoustic absorbent material is futile, because the thicknesses required at low frequencies would leave no room for the fish. (5) Both the sound pressure and the particle motion of a sound need to be measured and checked for mutual validation by calculating the particle motion from pressure gradients. Special hydrophone systems, based on seismic accelerometers, are required to measure particle motion (see Chap. 2).

3.8 Digital File Format

Several file formats are available to save digital recordings. Digital file extensions include WAV, PCM, MP3, au, ram, MIDI, and ogg, as well as others. It is best to record in an uncompressed format, such as WAV or PCM (Pulse Code Modulation), for faithful spectrum analysis.

MP3 is a digital audio-encoding format that uses lossy data compression to reduce file size. It is a common audio format for consumer audio and a de facto standard of digital audio compression used for the transfer and playback of music. However, MP3 files and other lossy compression methods are poor for spectrum analysis because compression typically retains signals only in a frequency band up to about 16 kHz (i.e., within the human hearing range). As a result, spectrum analysis using MP3 files is not trustworthy above 16 kHz. The psychoacoustic-based compression algorithms, in addition to limiting frequencies to below 16 kHz (and even less at higher compression ratios), discard fine details that cannot be heard by humans. Cuts introduced by compression appear as unpleasant “holes” in the spectrogram and can destroy details that could have meaning. However, MP3 files can be valuable for ecological monitoring of temporal and spatial patterns of well-known sounds.

A few digital recorders offer the Free Lossless Audio Codec (FLAC) format, which compresses losslessly and reduces the storage space by up to 50% without loss of detail. In addition, a few digital recorders employ the Direct Stream Digital (DSD) format, a proprietary system of digitally recreating audible signals for the Super Audio CD, using delta-sigma 1-bit A/D-converters at 2.8 or 5.6 MHz. Because of the intrinsic properties of the delta-sigma conversion made by the 1-bit A/D-converter, these recorders have the potential to record frequencies well beyond 100 kHz, but with increased noise at high frequencies. Spectrum analysis of recordings made in the DSD format is appropriate.

Waveform sound files (WAV; created by Microsoft) are perhaps the simplest of the common formats for storing audio samples. Unlike
MPEG and other compressed formats, WAV files and their derivatives (like the Broadcast Wave File, BWF) store samples “in the raw” where no pre-processing is used, other than formatting of data. When there is a choice of a recording file format, the WAV (or BWF) format should be selected, rather than the MP3 format.

With continuous recording, WAV files can become quite large and subsequently be difficult to handle with sound analysis software. For example, a single-channel WAV recording sampled at 96 kHz and 24 bit for 1 hour will occupy approximately 1 GB of storage capacity (96,000 samples/s × 24 bits × 1 byte/8 bits × 60 minutes × 60 s/minute = 1.04 GB). If monitoring is required for long periods, it is therefore important to select the appropriate sampling rate to conserve storage space. For example, if mid-frequency fish sounds are the main features of interest, then a sampling frequency of only 22 kHz, or even lower, can be appropriate. Several possible sampling frequencies and sometimes a choice of bit depth (16 or 24 bit) are available, but not on all recorders. Some recorders enable a limit to be placed on the maximum size of each recorded file. Alternatively, a recording protocol can be adopted to limit the length of each recording.

3.9 Data Storage

All storage media should be carefully labeled with who, what, where, and when. Each recording period should have a unique number. Creating a master catalog of recording numbers allows researchers to cross-reference metadata from a logbook.

Magnetic media, including magnetic tape (e.g., reel-to-reel, cassette, or DAT tapes) and computer hard drives, require storage in a dry, dark area away from any type of magnetic field. Exposure to a magnet could erase data. If tapes are not played often, the tightly packed tape could “bleed through” from one segment to another, thus contaminating data. Therefore, converting old recordings on magnetic tape to modern storage is becoming urgent so that data on historic soundscapes and animals are not lost.

When converting analog to digital formats, usually using an A/D-converter, the sampling frequency must be at least twice the highest frequency recorded (the Nyquist criterion), and the recordist needs to make sure that the parameters of the storage medium are adequate for the task. There are a number of free software applications for conversion of analog to digital formats.

Storage of digital recordings can be done on hard drives, optical drives, solid-state memory, or an Internet cloud. Bluetooth (a wireless technology standard) provides reliable exchange of data between fixed and mobile devices over short distances, using UHF radio waves.

3.10 Archiving Recordings

Properly curated recordings are critically important for assessing changes in soundscapes, ambient noise, and animal presence/absence and acoustic behavior over time. For example, underwater recordings made by the US Navy off the coast of California indicated a steady increase in background noise levels in the ocean over the last 60 years (since the 1960s). Marie Poland Fish, an oceanographer and marine biologist, recorded and analyzed the sounds of more than 300 species of marine life, from mammals to mussels. Her work (described, with spectrograms, in Fish and Mowbray 1970) helped the US Navy to distinguish fish and other animal sounds from the sounds made by submarines and remains a primary source for analysis of marine fish sounds.

Recordings of humpback whale songs date back to the 1970s and continue to document annual changes in their song within different populations. Williams et al. (2013) studied the changing songs of male savannah sparrows (Passerculus sandwichensis) recorded over three decades (1980–2011) on Kent Island, New Brunswick, in the Bay of Fundy. Life-long recordings of songs of white-crowned sparrows (Zonotrichia leucophrys) showed that they memorize syllables they hear at 10–50 days of age and sing the same song throughout their life. In contrast, life-long recordings of northern
mockingbirds (Mimus polyglottos) showed that they add elements to their songs throughout their lives. Only long-term archival data could be used for analysis of these trends. In this time of global warming and accelerated ice melts, archived recordings from the polar regions might become instrumental in monitoring the rate of climate change (by quantifying ice-cracking noise) and the effects on soundscapes and ecology (Obrist et al. 2010). The take-home message here is that good research practices with solid documentation and data archiving allow for future knowledge generation.

3.11 Repositories of Bioacoustical Data

Hafner et al. (1997) noted that collections of animal recordings with ancillary data are rich sources of reference material for bioacoustical studies. Archiving analog data by converting to a digital format has played an essential role in preserving data for future use. Species-specific sounds from a variety of regions and times, with associated voucher specimens and metadata, are available for researchers at a number of organizations. All collections and their corresponding links were valid as of 13 June 2022.

In Europe, there is a long tradition of recording animal sounds, in particular bird songs, and many collections have been published on vinyl discs and CDs, mainly in France and the UK. The British Library of Wildlife Sounds,² established in 1969, holds more than 160,000 well-documented field recordings covering all classes of sound-producing animals from many regions. The holdings cover more than 10,000 species of invertebrates, insects, amphibians, reptiles, fishes, birds, and mammals, including many rare and threatened species. A large number of these recordings were made for radio by the BBC Natural History Unit. The British Library supported a citizen-science program to create a map of the UK coastal soundscape in 2015.³ Other European online sound libraries include: Tierstimmen Archiv⁴ (approximately 120,000 sound recordings; Museum für Naturkunde, Berlin, Germany), Xeno-Canto⁵ (595,000 recordings from approximately 10,250 bird species; Naturalis Biodiversity Center, Leiden, Netherlands), and FonoZoo⁶ (11,657 recordings of 1621 animal species; Fonoteca Zoológica, Museo Nacional de Ciencias Naturales (CSIC), Madrid, Spain).

In the USA, the Macaulay Library⁷ (Cornell Lab of Ornithology, Ithaca, NY, USA) archived older analog, digital, and video recordings. To date, their holdings are approximately 24 million photos, 915,000 audio recordings, and 192,000 video recordings available for researchers. The K. Lisa Yang Center for Conservation Bioacoustics⁸ (Cornell Lab of Ornithology, Ithaca, NY, USA) covers everything “bird”, including citizen science and masterful guides and information in ornithology (including bird vocalization identification apps and bird cams). The Museum of Southwestern Biology⁹ (University of New Mexico, Albuquerque, NM, USA) and Museum of Vertebrate Zoology¹⁰ (University of California, Berkeley, CA, USA) have hundreds of thousands of cataloged natural history journals and voucher specimens and began to associate avian vocalizations with voucher specimens in the 2000s. These museums have expressed a desire to include bat call libraries before 2023. The Watkins Sound Library¹¹ (Woods Hole Oceanographic Institution, Woods Hole, MA, USA) provides particularly good collections of marine mammal sounds with a highlighted “Best of” cuts section that contains 1694 sound

² https://www.bl.uk/collection-guides/wildlife-and-environmental-sounds; accessed 13 June 2022
³ https://www.bl.uk/sounds-of-our-shores
⁴ http://www.tierstimmenarchiv.de/
⁵ https://www.xeno-canto.org/
⁶ http://www.fonozoo.com/index_eng.php
⁷ http://macaulaylibrary.org
⁸ https://www.birds.cornell.edu/ccb/
⁹ https://arctosdb.org/; http://www.msb.unm.edu/
¹⁰ http://mvz.berkeley.edu/General_Information.html
¹¹ https://cis.whoi.edu/science/B/whalesounds/index.cfm
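The storage estimate in Sect. 3.8 (96 kHz, 24 bit, 1 hour, about 1.04 GB) and the rule in Sect. 3.9 that the sampling frequency must be at least twice the highest frequency of interest can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, a single channel is assumed by default, and the small file header of a real WAV file is ignored.

```python
def min_sampling_rate_hz(highest_freq_hz):
    """Lowest sampling rate that satisfies the 'at least twice the
    highest frequency recorded' rule."""
    return 2 * highest_freq_hz


def pcm_size_bytes(sample_rate_hz, bit_depth, duration_s, channels=1):
    """Size of uncompressed PCM audio data in bytes (header not included)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * duration_s


ONE_HOUR_S = 60 * 60

# The example from the text: 96 kHz, 24 bit, 1 hour, one channel.
high_rate = pcm_size_bytes(96_000, 24, ONE_HOUR_S)
print(f"96 kHz / 24 bit: {high_rate / 1e9:.2f} GB per hour")

# A lower-rate alternative for mid-frequency fish sounds: 22 kHz, 16 bit.
low_rate = pcm_size_bytes(22_000, 16, ONE_HOUR_S)
print(f"22 kHz / 16 bit: {low_rate / 1e9:.2f} GB per hour")

# Sounds with energy up to 11 kHz need a sampling rate of at least 22 kHz.
print(f"Minimum rate for 11 kHz: {min_sampling_rate_hz(11_000)} Hz")
```

Under these assumptions, dropping from 96 kHz at 24 bit to 22 kHz at 16 bit cuts the hourly storage need by a factor of about 6.5, which is why the choice of sampling rate matters for long deployments.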
Fig. 3.12 Commercial companies and others market sounds of animals and soundscapes recorded by researchers such as Bernie Krause. Recording and analyzing natural sound is fulfilling and insightful, and can be a profound source for generating knowledge. Left photo by the authors; right photo, “Capturing the sounds of the lake” by S. Shiller; https://www.flickr.com/photos/12289718@N00/9454414945; licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/
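The labeling advice in Sect. 3.9 (who, what, where, and when, plus a unique number for each recording period that can be cross-referenced to a logbook) lends itself to a simple master catalog. The following is one possible sketch, not a prescribed format: the field names, the example entry, and the CSV layout are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class RecordingEntry:
    recording_id: str   # unique number for the recording period
    recordist: str      # who
    subject: str        # what (species, soundscape, equipment test, ...)
    location: str       # where
    start_utc: str      # when, as an ISO 8601 timestamp
    logbook_page: str   # cross-reference to the field logbook


def add_entry(catalog, entry):
    """Add an entry to the catalog (a dict keyed by recording_id),
    refusing duplicates so every recording number stays unique."""
    if entry.recording_id in catalog:
        raise ValueError(f"duplicate recording number: {entry.recording_id}")
    catalog[entry.recording_id] = entry


def to_csv(catalog):
    """Serialize the catalog as CSV text, sorted by recording number."""
    buffer = io.StringIO()
    names = [f.name for f in fields(RecordingEntry)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for key in sorted(catalog):
        writer.writerow(asdict(catalog[key]))
    return buffer.getvalue()


# Hypothetical example entry.
catalog = {}
add_entry(catalog, RecordingEntry(
    recording_id="2022-0001",
    recordist="Field team A",
    subject="Weddell seal underwater vocalizations",
    location="Hypothetical site 1",
    start_utc="2022-06-13T08:00:00Z",
    logbook_page="Logbook 3, p. 12",
))
print(to_csv(catalog))
```

Keeping the catalog in a plain, widely readable format such as CSV is a deliberate choice here, since archived metadata should remain usable long after any particular software has gone out of use.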
deal of time in transferring analog recordings to digital formats for more permanent preservation. CDs of animal and nature sounds are now commercially available. Archives are useful for education and research. As we evaluate current hypotheses related to global warming, perhaps we can hear the world change.

3.13 Additional Resources

• Sound recording tips from eBird: https://www.macaulaylibrary.org/how-to/recording-techniques/
• Bioacoustics equipment and field techniques, Centro Interdisciplinare di Bioacustica e Ricerche Ambientali, Università degli Studi di Pavia: http://www.unipv.it/cibra/edu_equipment_uk.html
• Manual on Field Recording Techniques and Protocols for All Taxa Biodiversity Inventories and Monitoring (Eymann et al. 2010): https://issuu.com/ysamyn/docs/abctaxa_vol_8_part1_lr

All web resources were last accessed 13 June 2022.

References

Amorim MCP, Simões JM, Fonseca PJ, Turner GF (2008) Species differences in courtship acoustic signals among five Lake Malawi cichlid species (Pseudotropheus spp.). J Fish Biol 72:1355–1368. https://doi.org/10.1111/j.1095-8649.2008.01802.x
Au WWL (2000) Chapter 9: Echolocation in dolphins. In: Au WWL, Popper AN, Fay RR (eds) Hearing by whales and dolphins. Springer-Verlag, New York, pp 364–408
Beecher MD, Burt JM (2004) The Role of Social Interaction in Bird Song Learning. Curr Dir Psychol Sci 13:224–228. https://doi.org/10.1111/j.0963-7214.2004.00313.x
Borkin KM, Smith DHV, Shaw WB, McQueen JC (2019) More traffic, less bat activity: the relationship between overnight traffic volumes and Chalinolobus tuberculatus activity along New Zealand highways. Acta Chiropterologica 21:321–329. https://doi.org/10.3161/15081109ACC2019.21.2.007
Carles F, Torre I, Arrizabalaga A (2007) Comparison of Sampling Methods for Inventory of Bat Communities. J Mammal 88:526–533. https://doi.org/10.1644/06-MAMM-A-135R1.1
Charrier I, Mathevon H, Aubin T (2013) Bearded seal males perceive geographic variation in their trills. Behav Ecol Sociobiol 67(10):1679–1689. https://doi.org/10.1007/s00265-013-1578-6
Clark CW, Clark JM (1980) Sound playback experiments with southern right whales (Eubalaena australis). Science 207:663–665
Davis J (1958) Singing behavior and the gonad cycle of the Rufous-sided towhee. Condor 60:308–336
Deecke V (2003) Seals are guided by voices. Discover, April, p 17
Duncan AJ, Lucke K, Erbe C, McCauley RD (2016) Issues associated with sound exposure experiments in tanks. Proc Meet Acoust 27(1):070008. https://doi.org/10.1121/2.0000280
Emlen ST (1972) An Experimental Analysis of the Parameters of Bird Song Eliciting Species Recognition. Anim Behav 41:130–171
Erbe C (2009) Underwater noise from pile driving in Moreton Bay, Qld. Acoust Aust 37(3):87–92
Erbe C (2013) Underwater noise of small personal watercraft (jet skis). J Acoust Soc Am 133(4):EL326–EL330. https://doi.org/10.1121/1.4795220
Erbe C, Verma A, McCauley R, Gavrilov A, Parnum I (2015) The marine soundscape of the Perth Canyon. Prog Oceanogr 137:38–51. https://doi.org/10.1016/j.pocean.2015.05.015
Erbe C, Marley S, Schoeman R, Smith JN, Trigg L, Embling CB (2019) The effects of ship noise on marine mammals--A review. Front Mar Sci 6:606. https://doi.org/10.3389/fmars.2019.00606
Eymann J, Degreef J, Hauser C, Monje JC, Samyn Y, VanDerSpiegel D (2010) Manual on field recording techniques and protocols for all taxa biodiversity inventories and monitoring. Abc Taxa series, The Belgian Development Corporation, Brussels, Belgium. http://www.abctaxa.be ISSN 1784-1291
Filatova OA, Fedutin ID, Burdin AM (2011) Responses of Kamchatkan fish-eating killer whales to playbacks of conspecific calls. Mar Mam Sci 27(2):E26–E42
Fish MP, Mowbray WH (1970) Sounds of Western North Atlantic fishes. A reference file of biological underwater sounds. The Johns Hopkins Press, Baltimore, MD, USA, 207 p
Gannon WL (1999) Tamias siskiyou. In: Wilson DE, Ruff S (eds) Complete book of North American Mammals. Smithsonian Institution Press, Washington, DC
Gannon WL (2014) Integrating research ethics with graduate education in Geography. J Geography High Ed 38:481–499. https://doi.org/10.1080/03098265.2014.958656
Gluck JP (1998) The death of a vagrant bird. In: Orlans BF, Beauchamp TL, Dresser R, Morton D, Gluck JP (eds) The human use of animals; Case studies in ethical choice. Oxford University Press, New York, pp 191–208
Grey MD, Rogers PH, Popper AN, Hawkins AD, Fay RR (2016) Large tank acoustics: How big is big enough? In: Popper AN, Hawkins A (eds) The effects of noise on aquatic life II, Advances in experimental medicine and biology. Springer Science + Business
Media, New York. https://doi.org/10.1007/978-1-4939-2981-8_43
Hafner MS, Gannon WL, Salazar-Bravo J, Alvarez-Castaneda ST (1997) Mammal collections of the Western Hemisphere: a survey and directory of existing collections. In: American Society of Mammalogists. Allen Press, Lawrence KS, p 97. ISBN 0-89338-055-5. http://www.mammalsociety.org/uploads/committee_files/collsurvey.pdf
Hawkins AD (2014) Examining fish in the sea: a European perspective on fish hearing experiments. In: Popper AN, Fay RR (eds) Perspectives on auditory research. Springer 247 Handbook of Auditory Research, p 50. https://doi.org/10.1007/978-1-4614-9102-6_14
Hawkins AD, Roberts L, Cheesman S (2014) Responses of free-living coastal pelagic fish to impulsive sounds. J Acoust Soc Am 135(5):3101–3116
Hayes SA, Kumar A, Costa DP, Mellinger DK, Harvey J (2004) Evaluating the function of the male harbour seal, Phoca vitulina, roar through playback experiments. Anim Behav 67:1133–1139. https://doi.org/10.1016/j.anbehav.2003.06.019
Kissner KJ, Forbes MR, Secoy DM (1997) Rattling behavior of prairie rattlesnakes (Crotalus viridis viridis, Viperidae) in relation to sex, reproductive Status, Body Size, and Body Temperature. Ethology 103(12):1042–1050. https://doi.org/10.1111/j.1439-0310.1997.tb00146.x
Kunz TH, Parsons S (eds) (2009) Ecological and behavioral methods for the study of bats, 2nd edn. The Johns Hopkins Press, Baltimore, MD, p 920
Lindseth AV, Lobel PS (2018) Underwater soundscape monitoring and fish bioacoustics: a review. Fishes 3:1–15. https://doi.org/10.3390/fishes3030036
Mobley JR Jr, Herman LM, Frankel AS (1988) Responses of wintering humpback whales (Megaptera novaeangliae) to playback of recordings of winter and summer vocalisations and of synthetic sound. Behav Ecol Sociobiol 23:211–223
Morton SL, Morton ES (1998) Sound playback studies. In: Evans CS (ed) Animal acoustic communication. Springer-Verlag, Berlin, pp 323–352
Myrberg AA Jr, Riggio RJ (1985) Acoustically mediated individual recognition by a coral reef fish (Pomacentrus partitus). Anim Behav 33:411–416
Obrist MK, Pavan G, Sueur J, Riede K, Llusia D, Márquez R (2010) Bioacoustic approaches in biodiversity inventories. In: Manual on field recording techniques and protocols for all taxa biodiversity inventories, Abc Taxa series, vol 8, http://www.abctaxa.be, ISSN 1784-1291 edn. The Belgian Development Corporation, Brussels, pp 68–99
Parvulescu A (1964) Problems of propagation and processing. In: Tavolga WN (ed) Marine bio-acoustics. Pergamon Press, Oxford, pp 87–100
Popper AN, Hawkins AD (2018) The importance of particle motion to fishes and invertebrates. J Acoust Soc Am 143(1):470–488. https://doi.org/10.1121/1.5021594
Prat Y, Taub M, Yovel Y (2016) Everyday bat vocalizations contain information about emitter. Sci Rep 6:39419. https://doi.org/10.1038/srep39419
Rogers PH, Cox M (1988) Underwater sound as a biological stimulus. In: Atema J, Fay RR, Popper AN, Tavolga WN (eds) Sensory biology of aquatic animals. Springer-Verlag, New York, pp 131–149
Rogers PH, Hawkins AD, Popper AN, Fay RR, Grey MD (2016) Parvulescu revisited: small tank acoustics for bioacousticians. In: Popper AN, Hawkins A (eds) The effects of noise on aquatic life II, Advances in experimental medicine and biology. Springer Science + Business Media, New York. https://doi.org/10.1007/978-1-4939-2981-8_43
Schwalm A (2012) Analysis of aerial vocalizations of California sea lions in rehabilitation as an indicator of health. PhD Dissertation, Western Illinois University
Sherwin RE, Gannon WL, Haymond S (2000) The efficacy of acoustic techniques to infer differential use of habitat by bats. Acta Chiropterologica 2(2):145–153
Simpson SD, Radford AN, Tickle EJ, Meekan MG, Jeffs AG (2011) Adaptive avoidance of reef noise. PLoS One 6(2):e16625. https://doi.org/10.1371/journal.pone.0016625
Slobodchikoff CN, Ackers SH, Van Ert M (1998) Geographical variation in alarm calls of Gunnison’s prairie dogs. J Mammal 79:1265–1272
Slobodchikoff CN, Perla BS, Verdolin JL (2009) Prairie dogs: Communication and community in an animal society. Harvard University Press, Boston, MA
Smith AT, Nagy JD, Millar CI (2016) Behavioral ecology of American pikas (Ochotona princeps) at Mono Craters, California: Living on the edge. West N Am Nat 76(4):459–484
Therrien SC, Thomas JA, Therrien RE, Stacey R (2012) Time of day and social change affects underwater sound production of bottlenose dolphins (Tursiops truncatus) at the Brookfield Zoo. Aquat Mamm 38(1):65–75
Thomas JA, DeMaster DP (1982) An acoustic technique for determining haulout pattern in leopard (Hydrurga leptonyx) and crabeater (Lobodon carcinophagus) seals. Can J Zool 60(8):2028–2031
Thomas JA, Zinnel KC, Ferm LM (1983) Analysis of Weddell seal (Leptonychotes weddelli) vocalizations using underwater playbacks. Can J Zool 61:1448–1456
Thomas JA, Friel B, Yegge S (2016) Restoring duetting behavior in a mated pair of buffy cheeked gibbons after exposure to construction noise at a zoo through playbacks of their own sounds. ASA abstract, Dec 2016
Tremel DP, Thomas JA, Ramirez KT, Dye GS, Bachman WA, Orban AN, Grimm KK (1998) Underwater hearing sensitivity of a Pacific white-sided dolphin, Lagenorhynchus obliquidens. Aquat Mamm 24(2):63–69
Turl CW, Thomas JA (1992) Possible relationship between oceanographic conditions and long-range target detection by a false killer whale. In: Thomas JA,
Kastelein RA, Supin AY (eds) Marine Mammal Sensory Systems. Plenum Press, New York, pp 421–432. 773 pp. ISBN 9780306443510
Tyack P (1983) Differential responses of humpback whales, Megaptera novaeangliae, to playback of song or social sounds. Behav Ecol Sociobiol 13:49–55
von Benda-Beckmann AM, Wensveen PJ, Samarra FIP, Beerens SP, Miller PJO (2016) Separating underwater ambient noise from flow noise recorded on stereo acoustic tags attached to marine mammals. J Exp Biol 219:2271–2275. https://doi.org/10.1242/jeb.133116
White M (2013) The ethical flap over birdsong apps. National Geographic, https://www.nationalgeographic.com/news/2013/6/130614-bird-watching-birdsong-smartphone-app-ethics/
Williams H, Levin II, Ryan-Norris D, Newman AEM, Wheelwright NT (2013) Three decades of cultural evolution in Savannah Sparrow songs. Anim Behav 85(1):213. https://doi.org/10.1016/j.anbehav.2012.10.028
Yegge SA (2012) Playbacks of conspecific vocalizations and music to Gabriella’s crested gibbons at Niabi Zoo to restore their duetting behavior. PhD Dissertation, Western Illinois University
Young S, Carr A, Jones G (2018) CCTV enables the discovery of new barbastelle (Barbastella barbastellus) vocalizations and activity pattern near a roost. Acta Chiropterologica 20:262–272
Zelick R, Mann DA, Popper AN (1999) Acoustic communication in fishes and frogs. In: Fay RR, Popper AN (eds) Comparative hearing: fish and amphibians. Springer-Verlag, New York, pp 363–411
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
4 Introduction to Acoustic Terminology and Signal Processing

Christine Erbe, Alec Duncan, Lauren Hawkins, John M. Terhune, and Jeanette A. Thomas
Institute 2013; International Organization for Standardization 2017). In contrast to noise, a signal is wanted, because it conveys information.

There are many ways to describe, quantify, and classify sounds. One way is to label sounds according to the medium in which they have traveled: air-borne, water-borne, or structure-borne (also called substrate-borne or ground-borne). For example, scientists studying bat echolocation work with air-borne sound. Those looking at the effects of marine seismic survey noise on baleen whales work with water-borne sounds. Some of the sound may have traveled as a structural vibration through the ground and is therefore referred to as structure-borne. Just as earthquakes can be felt on land, submarine earthquakes can be sensed by benthic organisms on the seafloor. In both cases, the sound is structure-borne (Dziak et al. 2004). Sound can cross from one medium into another. The sound of airplanes is generated and heard in air but also transmits into water where it may be detected by aquatic fauna (e.g., Erbe et al. 2017b; Kuehne et al. 2020).

Another way of grouping sounds is by their sources: geophysical, biological, or anthropogenic. Geophysical sources of sound are wind, rain, hail, breaking waves, polar ice, earthquakes, and volcanoes. Biological sounds are made by animals on land, such as insects, birds, and bats, or by animals in water, such as invertebrates, fishes, and whales. Anthropogenic sounds are made by humans and stem from airplanes, cars, trains, ships, and construction sites. The distinction by source type is common in the study of soundscapes. These comprise a geophony, biophony, and anthropophony.

The following sections explain some of the physical measurements by which sounds can be characterized and quantified. The terminology is based on international standards (including International Organization for Standardization 2007, 2017; American National Standards Institute 2013).

4.2 Terms and Definitions

4.2.1 Units

A wide (and confusing) collection of units can be found in early books and papers on acoustics, but the units now used for all scientific work are based on the International System of Units, better known as the SI system (Taylor and Thompson 2008). In this system, a unit is specified by a standard symbol representing the unit itself, and a multiplier prefix representing a power-of-10 multiple of that unit. For example, the symbol μPa (pronounced micro pascal) is made up of the multiplier prefix μ (micro), representing a factor of $10^{-6}$ (one one-millionth), and the symbol Pa (pascal), which is the SI unit of pressure. So, a measured pressure given as 1.4 μPa corresponds to $1.4 \times 10^{-6}$ Pa or 0.0000014 Pa. The SI base units are listed in Table 4.1.

Table 4.1 SI base units (length, mass, time, electric current, temperature, luminous intensity, and amount of substance) and example derived units (frequency, pressure, energy, and power)

| Quantity | Unit name | Unit symbol | Expressed in terms of base units |
|---|---|---|---|
| Length | meter | m | |
| Mass | kilogram | kg | |
| Time | second | s | |
| Electric current | ampere | A | |
| Temperature | kelvin | K | |
| Luminous intensity | candela | cd | |
| Amount of substance | mole | mol | |
| Frequency | hertz | Hz | 1/s |
| Pressure | pascal | Pa | kg/(m s²) |
| Energy | joule | J | kg m²/s² |
| Power | watt | W | kg m²/s³ |

Other quantities and their units result from quantity equations that are
based on these base quantities. The SI multiplier prefixes that go along with these units are listed in Table 4.2. Note that unit names are always written in lowercase. However, if the unit is named after a person, then the symbol is capitalized, otherwise the symbol is also lowercase. Examples for units named in honor of a person are kelvin [K], pascal [Pa], and hertz [Hz].

4.2.2 Sound

Sound refers to a mechanical wave that creates a local disturbance in pressure, stress, particle displacement, and other quantities, and that propagates through a compressible medium by oscillation of its particles. These particles are acted upon by internal elastic forces. Air and water are both fluid acoustic media and sound in these media travels as longitudinal waves (also called pressure or P-waves). A common misconception is that the air or water particles travel with the sound wave from the source to a receiver. This is not the case. Instead, individual particles oscillate back and forth about their equilibrium position. These oscillations are coupled across individual particles, which creates alternating regions of compressions and rarefactions and which allows the sound wave to propagate (Fig. 4.1)¹. The line along which the particles oscillate is parallel (or longitudinal) to the direction of propagation of the sound wave in the case of longitudinal waves.

Rock is a solid medium and here, vibration travels as both longitudinal (also called pressure or P-waves) and transverse waves (also called shear or S-waves). In S-waves, the particles oscillate perpendicular to the direction of propagation. It is again because of the coupling of particles, that the wave propagates. P-waves travel faster than S-waves so that P-waves arrive before S-waves. The P therefore also stands for “primary” and S for “secondary.”

¹ Dan Russell’s animations of particle motion during acoustic wave propagation: https://www.acs.psu.edu/drussell/Demos/waves-intro/waves-intro.html, of the amplitude at a fixed location: https://www.acs.psu.edu/drussell/Demos/wave-x-t/wave-x-t.html, and of longitudinal and transverse waves: https://www.acs.psu.edu/drussell/Demos/waves/wavemotion.html; accessed 12 October 2020.

4.2.3 Frequency

Frequency refers to the rate of oscillation. Specifically, it is the rate of change of the phase of a sine wave over time, divided by 2π. Here, phase refers to the argument of a sine (or cosine) function. It denotes a particular point in the cycle of a waveform. Phase changes with time. Phase is measured as an angle in radians or degrees. Phase is a very important factor in the interaction of one wave with another. Phase is not normally an audible characteristic of a sound wave, though it can be in the case of very-low-frequency sounds.

A simpler concept of frequency of a sine wave, as shown in Fig. 4.1, is the number of cycles per second. A full cycle lasts from one positive peak to the next positive peak. To determine the frequency, count how many full cycles and fractions thereof occur in 1 s. Note that pitch is an attribute of auditory sensation and while it is related to frequency, it is used in human auditory perception as a means to order sounds on a musical scale. As
we know very little about auditory perception in animals, the term pitch is not normally used in animal bioacoustics.

Fig. 4.1 A sinusoidal sound wave having a peak pressure of 1 Pa, a peak-to-peak pressure of 2 Pa, a root-mean-square pressure of 0.7 Pa, a period of 0.25 s, and a frequency of 4 Hz. The top plot indicates the motion of the particles of the medium; they undergo coupled oscillations back and forth, so that the sound wave propagates to the right. At regions of compression, the pressure is high; at regions of rarefaction, it is low. The bottom plot shows the change in pressure over time at a fixed location. While the plots are lined up, the horizontal axes of the top and bottom plots are space and time, respectively

The symbol for frequency is f and the unit is hertz [Hz] in honor of Heinrich Rudolf Hertz, a German physicist who proved the existence of electromagnetic waves. Expressed in SI units, 1 Hz = 1/s.

The fundamental frequency (symbol: $f_0$; unit: Hz) of an oscillation is the reciprocal of the period. The period (symbol: τ; unit: s) is the duration of one cycle and is related to the fundamental frequency as (see Fig. 4.1):

$$\tau = \frac{1}{f_0}$$

The wavelength (symbol: λ; unit: m) of a sine wave measures the spatial distance between two successive “peaks” or other identifiable points on the wave.

A sound that consists of only one frequency is commonly called a pure tone. Very often, sounds contain not only the fundamental frequency but also harmonically related overtones. The frequencies of overtones are integer multiples of the fundamental: $2f_0$, $3f_0$, $4f_0$, ... Beware that there are two schemes for naming these tones: $f_0$ can be called either the fundamental or the first harmonic. In the former case, $2f_0$ becomes the first overtone, $3f_0$ the second overtone, etc. In the latter case, $2f_0$ becomes the second harmonic, $3f_0$ the third harmonic, etc.

Musical instruments produce harmonics, which determine the characteristic timbre of the sounds they produce. For example, it is the differences in harmonics that make a flute sound unmistakably different from a clarinet, even when they are playing the same note. Animal sounds also often have harmonics as they use similar basic mechanisms to musical instruments. Most mammals have string-like vocal cords and birds have string-like syrinxes. Fish have muscles that contract around a swim bladder to produce percussive-type sounds. Insects and invertebrates stridulate or rub body parts together to produce a percussive sound.
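The relationship between period and fundamental frequency, and the two naming schemes for tones in a harmonic series, can be illustrated with a short sketch. The function names `period` and `harmonic_series` and the 440 Hz example value are hypothetical choices made for illustration only; they do not come from the chapter.

```python
def period(f0_hz):
    """Period tau [s] of an oscillation with fundamental frequency f0 [Hz]."""
    return 1.0 / f0_hz


def harmonic_series(f0_hz, n_tones):
    """List the first n_tones of a harmonic series under both naming schemes."""
    tones = []
    for n in range(1, n_tones + 1):
        tones.append({
            "frequency_hz": n * f0_hz,   # integer multiple of the fundamental
            "harmonic_number": n,        # scheme 2: f0 is the first harmonic
            "overtone_number": n - 1,    # scheme 1: 0 denotes the fundamental
        })
    return tones


# The 4 Hz sine wave of Fig. 4.1 has a period of 0.25 s.
tau = period(4.0)

# Hypothetical example: a tone with a 440 Hz fundamental.
series = harmonic_series(440.0, 4)
for tone in series:
    print(tone)
```

Under these assumptions, the tone at $2f_0$ appears both as harmonic number 2 and as overtone number 1, which is exactly the ambiguity the text warns about.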
The frequency or frequencies of a sound may change over time, so that frequency is a function of time: f(t). This is called frequency modulation (abbreviation: FM). If the frequency increases over time, the sound is called an upsweep. If the frequency decreases over time, the sound is called a downsweep. Sounds without frequency modulation are called continuous wave. The sound of jet skis under water is frequency-modulated due to frequent speed changes (Erbe 2013). Whistles of animals such as birds or dolphins (e.g., Ward et al. 2016) are commonly frequency-modulated and often exhibit overtones (Fig. 4.2).

The acoustic features of frequency-modulated sounds such as whistles can identify the species, population, and sometimes individual animal that made them (e.g., Caldwell and Caldwell 1965). Such characteristic features include the start frequency, end frequency, minimum frequency, maximum frequency, duration, number of local extrema, number of inflection points, and number of steps (e.g., Marley et al. 2017). The start frequency is the frequency at the beginning of the fundamental contour; the end frequency is the frequency at the end of the fundamental contour (Fig. 4.3). The minimum frequency is the lowest frequency of the fundamental contour and the maximum frequency is the highest. Duration measures how long the whistle lasts. Extrema are points of local minima or maxima in the contour. At a local minimum, the contour changes from downsweep to upsweep; at a local maximum, it changes from upsweep to downsweep. Mathematically, the first derivative of the whistle contour with respect to time is zero at a local extremum, and the second derivative is a positive number in the case of a minimum or a negative number in the case of a maximum. At an inflection point, the curvature of the contour changes from clockwise to counter-clockwise or vice versa. Mathematically, the first derivative of the whistle contour with respect to time exhibits a local extremum and the second derivative is zero at an inflection point. Steps in the contour are discontinuities in frequency. There is no temporal gap but the contour jumps in frequency. The frequency measurements are taken from the fundamental contour. The duration, number of local extrema, number of inflection points, and number of steps are the same in fundamental and overtones and can therefore be measured from any harmonic contour. This is beneficial if the fundamental is partly masked by noise.

Fig. 4.3 Spectrogram of a frequency-modulated sound, identifying characteristic features
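The contour measurements described above can be sketched in a few lines, assuming the fundamental contour has already been traced from the spectrogram and is available as uniformly sampled time and frequency values. The function name `contour_features`, the input format, and the example contour are hypothetical. Extrema are counted as sign changes of the first difference and inflection points as sign changes of the second difference. Step counting is left out because it needs a jump threshold that the text does not specify.

```python
def contour_features(times_s, freqs_hz):
    """Summarize a fundamental contour sampled at uniform time steps.

    times_s and freqs_hz are equal-length sequences (hypothetical format).
    """
    # First differences approximate the first derivative of the contour.
    d1 = [b - a for a, b in zip(freqs_hz, freqs_hz[1:])]
    # Second differences approximate the second derivative.
    d2 = [b - a for a, b in zip(d1, d1[1:])]

    def sign_changes(values):
        # Count sign flips, skipping exact zeros (flat segments).
        signs = [1 if v > 0 else -1 for v in values if v != 0]
        return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    return {
        "start_hz": freqs_hz[0],
        "end_hz": freqs_hz[-1],
        "min_hz": min(freqs_hz),
        "max_hz": max(freqs_hz),
        "duration_s": times_s[-1] - times_s[0],
        "n_extrema": sign_changes(d1),      # slope changes sign
        "n_inflections": sign_changes(d2),  # curvature changes sign
    }


# Hypothetical contour: up to a maximum, down to a minimum, then up again.
times = [0.1 * i for i in range(8)]
freqs = [8000, 10000, 11000, 10000, 8000, 7000, 8000, 10000]
features = contour_features(times, freqs)
print(features)
```

On measured contours, small frequency jitter produces spurious sign changes, so some smoothing of the contour before differencing would likely be needed; that step is not shown here.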
4.2.4 Pressure

$$p_{\mathrm{pk-pk}} = \max\big(p(t)\big) - \min\big(p(t)\big)$$

In other words, it is the sum of the greatest magnitude during compression and the greatest magnitude during rarefaction.

The peak sound pressure (symbol: $p_{\mathrm{pk}}$; unit: Pa) is also called zero-to-peak sound pressure and is the greatest deviation of the sound pressure from the static pressure; it is the greatest magnitude of p(t):

$$p_{\mathrm{pk}} = \max\big(|p(t)|\big)$$

(m/s)² if the signal is particle velocity). The mean-square sound pressure formula is similar to (Eq. 4.1) but without the root.

The sound pressure level (abbreviation: SPL; symbol: $L_p$) is the level of the root-mean-square sound pressure and computed as

$$L_p = 20 \log_{10}\left(\frac{p_{\mathrm{rms}}}{p_0}\right)$$
Table 4.3 Examples of sound pressure levels in air. All levels are broadband; the hearing thresholds are single-frequency. Nominal ranges from the source are given in meters. Note that the different sources listed can have a range of levels and only one example is given

| Source | Sound pressure [Pa] | Level [dB re 20 μPa] |
|---|---|---|
| Explosion at 1 m | 63,246 | 190 |
| Airplane take-off at 25 m | 632 | 150 |
| Human pain threshold at 1 kHz | 200 | 140 |
| Lion roar at 1 m | 13 | 116 |
| Human discomfort threshold at 1 kHz | 10 | 114 |
| Diesel lawn mower at 1 m | 1 | 94 |
| Truck at city speed at 20 m | 0.2 | 80 |
| Old vacuum cleaner at 1 m | 0.1 | 70 |
| Bird song at 1 m | 0.02 | 60 |
| Cricket chorus at 1 m | 0.02 | 60 |
| Human speech at 1 m | 0.01 | 55 |
| Buzzing mosquito | 0.002 | 40 |
| Human whisper at 1 m | 0.001 | 30 |
| Fluttering leaves | 0.0002 | 20 |
| Human breathing at 1 m | 0.0001 | 10 |
| Human hearing threshold at 1 kHz | 0.00002 | 0 |
Table 4.4 Examples of sound pressure levels in water. All levels are broadband; the hearing thresholds are single-frequency. Nominal ranges from the source are given in meters. Note that the different sources listed can have a range of levels and only one example is given

| Source | Sound pressure [Pa] | Level [dB re 1 μPa] |
|---|---|---|
| Subsea earthquake | 316,228 | 230 |
| Seismic survey airgun at 1 m | 10,000 | 200 |
| Container ship at 1 m | 5623 | 195 |
| Humpback whale song at 1 m | 1778 | 185 |
| Zodiac at high speed at 1 m | 178 | 165 |
| Dolphin whistle at 1 m | 32 | 150 |
| Geotechnical drilling at 1 m | 18 | 145 |
| Jet ski | 10 | 140 |
| Toadfish at 1 m | 10 | 140 |
| Damsel fish at 1 m | 1 | 120 |
| Open ocean ambient noise at sea state 4 | 0.1 | 100 |
| Open ocean ambient noise at sea state 0.5 | 0.01 | 80 |
| California sea lion hearing threshold at 10 kHz | 0.001 | 60 |
| Killer whale hearing threshold at 20 kHz | 0.0001 | 40 |
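The table entries can be checked with a small sketch that computes the root-mean-square pressure of a sampled waveform and converts it to a sound pressure level. The function names `rms` and `spl_db`, the constant names, and the 1000 Hz sampling rate are illustrative assumptions; the reference pressures of 20 μPa in air and 1 μPa in water are those used in Tables 4.3 and 4.4.

```python
import math

P0_AIR = 20e-6    # reference pressure in air: 20 µPa, expressed in Pa
P0_WATER = 1e-6   # reference pressure in water: 1 µPa, expressed in Pa


def rms(samples):
    """Root-mean-square value of a sequence of pressure samples [Pa]."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def spl_db(p_rms_pa, p0_pa):
    """Sound pressure level [dB re p0] of an rms pressure [Pa]."""
    return 20.0 * math.log10(p_rms_pa / p0_pa)


# Sine wave with 1 Pa peak pressure and 4 Hz frequency (as in Fig. 4.1),
# sampled at an assumed rate of 1000 Hz over 1 s (four full cycles).
fs = 1000
wave = [math.sin(2 * math.pi * 4 * n / fs) for n in range(fs)]
p_rms = rms(wave)  # close to 0.707 Pa

print(round(spl_db(1.0, P0_AIR)))    # 1 Pa in air, compare Table 4.3
print(round(spl_db(1.0, P0_WATER)))  # 1 Pa in water, compare Table 4.4
```

The same 1 Pa rms pressure yields a level about 26 dB higher when referenced to 1 μPa than when referenced to 20 μPa, which is why a level is only meaningful together with its reference value.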
related, and in fact, the sound exposure level can be computed from the sound pressure level as:

$$L_{E,p} = L_p + 10 \log_{10}(t_2 - t_1)$$

Conceptually, the difference is that the SPL is a time-average and therefore useful for sounds that don’t change significantly over time, or that last for a long time, or that, for the assessments of noise impacts, can be considered continuous. Examples are workplace noise or ship noise. The SEL, however, increases with time and critically depends on the time window over which it is computed. It is therefore most useful for short-duration, transient sounds, such as pulses from explosions, pile driving, or seismic surveys. The SEL is then computed over the duration of the pulse.

It can be difficult to determine the actual pulse length as the exact start and end points are often not clearly visible, in particular in background noise. Therefore, in practice, SEL is commonly computed over the 90% energy signal duration. This is the time during which 90% of the sound exposure occurs. Sound exposure is computed symmetrically about the 50% mark; i.e., from the 5% to the 95% points on the cumulative squared-pressure curve. SEL becomes (Fig. 4.5):

$$L_{E,p} = 10 \log_{10}\left(\frac{\int_{t_{5\%}}^{t_{95\%}} p^2(t)\,dt}{E_{p,0}}\right)$$

Fig. 4.5 Pressure pulse recorded from pile driving under water (top) and cumulative squared-pressure curve (bottom). The horizontal lines indicate the 5% and 95% cumulative squared-pressure points on the y-axis. The vertical lines identify the corresponding times on the x-axis. The time between the 5% and 95% marks is the 90% energy signal duration. Recording from Erbe 2009

In the presence of significant background noise $p_n(t)$, the noise exposure needs to be subtracted from the overall sound exposure in order to yield the sound exposure due to the signal alone. In practice, the noise exposure is computed over an equally long time window (from $t_1$ to $t_2$) preceding or succeeding the signal of interest:

$$L_{E,p} = 10 \log_{10}\left(\frac{\int_{t_{5\%}}^{t_{95\%}} p^2(t)\,dt - \int_{t_1}^{t_2} p_n^2(t)\,dt}{E_{p,0}}\right)$$

4.2.7 Acoustic Energy, Intensity, and Power

Apart from sound pressure and sound exposure, other physical quantities appear in the bioacoustics literature, but are often wrongly used. Acoustic energy refers to the total energy contained in an acoustic wave. This is the sum of kinetic energy (contained in the movement of the particles of the medium) and potential energy (i.e., work done by elastic forces in the medium). Acoustic energy E is proportional to squared pressure p and time interval Δt (i.e., to sound exposure) only in the case of a free plane wave or a spherical wave at a large distance from its source:

$$E = \frac{S}{Z}\, p^2\, \Delta t$$
The proportionality constant is the ratio of surface area S through which the energy flows and acoustic impedance Z. Acoustic energy increases with time; i.e., the longer the sound lasts or the longer it is measured, the greater the transmitted energy. The unit of energy is joule [J] in honor of English physicist James Prescott Joule. In SI units:

$$1\ \mathrm{J} = 1\ \mathrm{kg\,m^2/s^2}$$

More information and definitions can be found in acoustic standards (including American National Standards Institute 2013; International Organization for Standardization 2017).

4.2.8 Particle Velocity
Fig. 4.6 Spectrograms of mean-square sound pressure spectral density [dB re 1 μPa²/Hz], mean-square particle displacement spectral density [dB re 1 pm²/Hz], mean-square particle velocity spectral density [dB re 1 (nm/s)²/Hz], and mean-square particle acceleration spectral density [dB re 1 (μm/s²)²/Hz] recorded under water when a snorkeler swam above the recorder (Erbe et al. 2016b; Erbe et al. 2017a). Axes: time [s] and frequency [Hz]

Fig. 4.7 Example profiles of the speed of sound in (a) air (data from The Engineering ToolBox; https://www.engineeringtoolbox.com/elevation-speed-sound-air-d_1534.html; accessed 16 April 2021) and (b) water in polar and equatorial regions (These data were collected and made freely available by the International Argo Program and the national programs that contribute to it; https://argo.ucsd.edu, https://www.ocean-ops.org. The Argo Program is part of the Global Ocean Observing System. Argo float data and metadata from Global Data Assembly Centre (Argo GDAC); https://doi.org/10.17882/42182; accessed 16 April 2021). See Chaps. 5 and 6
By definition, the level $L_Q$ of quantity Q is proportional to the logarithm of the ratio of Q and a reference value $Q_0$, which has the same unit. In the case of a field quantity F, such as sound pressure or particle velocity, or an electrical quantity such as voltage or current, the level $L_F$ is computed as

$$L_F = 20 \log_{10}\left(\frac{F}{F_0}\right)$$

4.2.11.1 Conversion from Decibel to Field or Power Quantities

The relationships for calculating field and power

or $P_0$). For example, an underwater tone at a level of 120 dB re 1 μPa rms has an rms pressure of 1 Pa. This is worked out as follows:

$$F = 10^{120/20} \times 1\ \mu\mathrm{Pa} = 10^{6}\ \mu\mathrm{Pa} = 1\ \mathrm{Pa}$$

However, a tone of 120 dB re 20 μPa rms in air has an rms pressure of 20 Pa:

$$F = 10^{120/20} \times 20\ \mu\mathrm{Pa} = 10^{6} \times 20\ \mu\mathrm{Pa} = 20\ \mathrm{Pa}$$

$$\frac{F_1}{F_2} = 10^{(L_{F1} - L_{F2})/20}$$

$$\frac{P_1}{P_2} = 10^{(L_{P1} - L_{P2})/10}$$
Table 4.6 Level differences and their corresponding field and power quantity ratios

| Level difference ($L_{F1} - L_{F2}$ or $L_{P1} - L_{P2}$) in dB | Field quantity ratio ($F_1/F_2$); use for pressure, particle velocity, voltage, current, etc. | Power quantity ratio ($P_1/P_2$); use for power, intensity, energy, sound exposure, mean-square pressure, etc. |
|---|---|---|
| −40 | 1/100 = 0.01 | 1/10,000 = 0.0001 |
| −20 | 1/10 = 0.1 | 1/100 = 0.01 |
| −10 | 1/√10 ≈ 0.316 | 1/10 = 0.1 |
| −6 | 1/2 = 0.5 | 1/4 = 0.25 |
| −3 | 1/√2 ≈ 0.707 | 1/2 = 0.5 |
| 0 | 1 | 1 |
| 3 | √2 ≈ 1.41 | 2 |
| 6 | 2 | 4 |
| 10 | √10 ≈ 3.16 | 10 |
| 20 | 10 | 100 |
| 40 | 100 | 10,000 |
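The entries of Table 4.6 and the worked decibel conversions can be reproduced with the following sketch. The function names are hypothetical; the exponents follow the field (divide by 20) and power (divide by 10) conventions used in this section.

```python
def field_ratio(level_diff_db):
    """Ratio F1/F2 of two field quantities for a level difference in dB."""
    return 10.0 ** (level_diff_db / 20.0)


def power_ratio(level_diff_db):
    """Ratio P1/P2 of two power quantities for a level difference in dB."""
    return 10.0 ** (level_diff_db / 10.0)


def level_to_field(level_db, reference):
    """Field quantity (same unit as reference) for a level in dB re reference."""
    return reference * 10.0 ** (level_db / 20.0)


# Reproduce a few rows of Table 4.6.
for diff_db in (-3, 6, 20):
    print(diff_db, round(field_ratio(diff_db), 3), round(power_ratio(diff_db), 3))

# Worked examples: 120 dB re 1 µPa in water and 120 dB re 20 µPa in air.
p_water_pa = level_to_field(120.0, 1e-6)   # about 1 Pa
p_air_pa = level_to_field(120.0, 20e-6)    # about 20 Pa
```

Note that the 6 dB and 3 dB rows of the table are rounded: the computed field ratio for 6 dB is about 1.995 rather than exactly 2.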
quantity ratio is the square of the corresponding field quantity ratio.

For example, a tone at a level of 120 dB re 1 μPa rms is 20 dB stronger than a tone at a level of 100 dB re 1 μPa rms, so from Table 4.6, the ratio of the two rms pressures is $p_1/p_2 = F_1/F_2 = 10$, and the ratio of their intensities is $I_1/I_2 = P_1/P_2 = 100$.

4.2.11.3 Amplification of Signals

The above formulae and Table 4.6 can also be used to calculate the effect of amplifying signals. For example, if an amplifier has a gain of 20 dB, then the rms voltage at the output of the amplifier will be 10 times the rms voltage at its input. Similarly, an amplifier with a 40 dB gain will increase the rms voltage by a factor of 100. If several amplifier stages are cascaded, then their combined gain is the sum of the gains of the individual stages (in dB).

When calibrating acoustic recordings (see Chap. 2), the gains of all components of the recording systems have to be summed. An underwater recording system (Fig. 4.8), for example, contains a hydrophone that converts received acoustic pressure to a time series of voltages at its output. The sensitivity of the hydrophone specifies this relationship. For example, a hydrophone with a sensitivity $N_S = -180$ dB re 1 V/μPa produces $10^{-180/20} = 10^{-9}$ Volts output per 1 μPa input. A more sensitive hydrophone has a less negative sensitivity. The output voltage might be passed to an amplifier with $\Delta L_G = 20$ dB gain, after which it is digitized by a data acquisition board, such as a computer’s soundcard. All analog-to-digital converters have a digitization gain expressed in dB re FS/V, which specifies the input voltage that leads to full scale (FS). If the digitizer has a digitization gain $\Delta L_{DG} = 10$ dB re FS/V, then $10^{10/20}$ FS/V $= 10^{1/2}$ FS/V is the relationship between FS and input voltage, meaning that FS is reached when the input is $1/10^{1/2}$ V = 0.32 V. The actual value of FS depends on the number of bits available. A 16-bit digitizer in bipolar mode (i.e., producing both positive and negative numbers) has a full-scale value of $2^{16-1} = 2^{15} = 32{,}768$. And so the digital values v representing the acoustic pressure will lie between −32,768 and +32,767 (with one of the possible numbers being 0). The final steps in relating these digital values to the recorded acoustic pressure entail dividing by FS, converting to dB, and subtracting all the gains:

$$L_p = 20 \log_{10}(v/\mathrm{FS}) - \Delta L_{DG} - \Delta L_G - N_S = 20 \log_{10}(v/\mathrm{FS}) + 150\ \mathrm{dB\ re}\ 1\ \mu\mathrm{Pa}$$

Fig. 4.8 Sketch of an example underwater recording setup (hydrophone, amplifier, soundcard). A terrestrial setup would have a microphone instead of a hydrophone

4.2.11.4 Superposition of Field and Power Quantities

If two tones of the same frequency and level arrive in phase at a listener, then the amplitude is doubled and the combined level is therefore 6 dB above the level of each tone (see Table 4.6). If, on the other hand, there is a random phase difference between the two tones then, on average, the intensity of the two signals will sum. In this case (again from Table 4.6) the combined intensity is 3 dB higher than the level of each tone. For example, if each tone has a level of 120 dB re 1 μPa rms, then the two tones together have a level of 126 dB re 1 μPa rms if they are in phase. Their superposition has an average level of 123 dB re 1 μPa rms if they have a random phase difference. Summing signals that have the same phase, or a fixed phase difference, is known as coherent summation, whereas performing an “on average” summation of signals assuming a random phase is called incoherent summation.

The calculation is more complicated if the two tones have different levels. It is necessary to use Eq. (4.3) to convert both levels to corresponding
field (coherent summation) or power (incoherent summation) quantities, add these quantities, and then convert the result back to a level.

The outcome of this process is plotted in Fig. 4.9 in terms of the increase in the combined level from that of the higher-level signal as a function of the difference between the higher and lower levels. Note that this increase never exceeds 6 dB for a coherent summation or 3 dB for an incoherent summation. In the case of a coherent summation, proper account has to be taken of the relative phases of the two tones when adding the field quantities, and this can have a very large effect. Figure 4.9 shows the extreme cases: The upper limit occurs when the two signals are in phase, and the lower limit occurs when they have a phase difference of 180° (π radians). The latter case gives destructive interference and the combined level is lower than that of the highest individual signal. If the two individual signals have a 180° phase difference and the same amplitude, then the destructive interference is complete, the two signals cancel each other out, and the combined level is −∞!

Another useful observation from Fig. 4.9 is that when the difference in level between the two individual signals is greater than 10 dB, the

This ratio of mean-square pressures in the two media can be expressed in terms of the density and speed of sound of the two media:

$$\frac{p_w^2}{p_a^2} = \frac{Z_w}{Z_a} = \frac{\rho_w c_w}{\rho_a c_a}$$

Applying $10 \log_{10}()$ to these ratios, the difference between the mean-square sound pressure levels in water and air is:

$$L_{p_w^2} - L_{p_a^2} = 10 \log_{10}\left(\frac{p_w^2}{p_0^2}\right) - 10 \log_{10}\left(\frac{p_a^2}{p_0^2}\right) = 10 \log_{10}\left(\frac{p_w^2}{p_a^2}\right) = 10 \log_{10}\left(\frac{\rho_w c_w}{\rho_a c_a}\right) = 36\ \mathrm{dB}$$

The difference between the sound pressure levels is, of course, also 36 dB:

$$L_{p_w} - L_{p_a} = 20 \log_{10}\left(\frac{p_w}{p_0}\right) - 20 \log_{10}\left(\frac{p_a}{p_0}\right) = 20 \log_{10}\left(\frac{p_w}{p_a}\right) = 20 \log_{10}\sqrt{\frac{\rho_w c_w}{\rho_a c_a}} = 36\ \mathrm{dB}$$

In the above two equations, the same reference pressure $p_0$ is required. However, the convention
is to use $p_{a0} = 20$ μPa in air and $p_{w0} = 1$ μPa in water. The difference in reference pressures adds another 26 dB to the sound pressure level in water, because:

$$20 \log_{10}\left(\frac{p_{a0}}{p_{w0}}\right) = 20 \log_{10}\left(\frac{20\ \mu\mathrm{Pa}}{1\ \mu\mathrm{Pa}}\right) = 26\ \mathrm{dB}$$

So, if two sound sources emit the same intensity in air and water, then the sound pressure level in water referenced to 1 μPa is 62 dB (i.e., 36 dB + 26 dB) greater than the sound pressure level in air referenced to 20 μPa.

While this might be confusing, there would hardly be a sensible reason to compare levels in air and water. Such comparisons have been attempted in the past to give an analogy to levels with which humans have experience in air. For example, humans find 114 dB re 20 μPa annoying and 140 dB re 20 μPa painful, so what would be a similarly annoying level under water that might disturb animals?

But animals perceive sound differently from humans, hear sound at different frequencies and levels, and can have rather different auditory anatomy (see Chap. 10 on audiograms). As a result, a signal easily heard by a human could be barely audible to some animals or much louder to others. Even for divers, sound reception under water is quite a different process from sound reception in air, due to different acoustic impedance ratios of the acoustic medium and human tissues, and different sound propagation paths. Furthermore, the psychoacoustic effects (emotional impacts) of different types of noise on animals have not been examined thoroughly. Even in humans, for example, 110 dB re 20 μPa of rock music does not provide the same experience as 110 dB re 20 μPa of traffic noise.

4.2.12 Source Level

The source level (abbreviation: SL; symbol: $L_S$) is meant to be characteristic of the sound source and independent of both the environment in which the source operates and the method by which the source level is determined. In practice, the determination of the source level has numerous problems. Some sources are large in their physical dimensions and placing a recorder at short range (i.e., into the so-called near-field, see Sect. 4.2.13) will not result in a level that captures the full output of the source. Also, many sound sources do not operate in a free-field but rather near a boundary (e.g., air-ground, air-water, or water-seafloor). At such boundaries, reflection, scattering, absorption, and phase changes may occur, affecting the recorded level. In practice, a sound source is recorded at some range in the far-field and an appropriate (and sometimes sophisticated) sound propagation model is utilized to account for the effects of the environment in order to compute a source level that is independent of the environment. Such source levels can then be applied to new situations and different environments in order to predict received levels elsewhere. Like other levels, the source level is expressed in dB relative to a reference value. It is further referenced to a nominal distance of 1 m from the source. The source level can be a sound pressure level or a sound exposure level, depending on the source and situation.

The radiated noise level (abbreviation: RNL; symbol: $L_{RN}$) is more easily determined. It is the level of the product of the sound pressure and the range r at which the sound pressure is recorded, and it can be calculated as the received sound pressure level $L_p$ plus a spherical propagation loss term:

$$L_{RN} = 20 \log_{10}\left(\frac{p_{\mathrm{rms}}(r)\, r}{p_0 r_0}\right) = L_p + 20 \log_{10}\left(\frac{r}{r_0}\right)$$

It is expressed in dB relative to a reference value of $p_0 r_0 = 20$ μPa m in air and $p_0 r_0 = 1$ μPa m in water. The radiated noise level is dependent upon the environment and is therefore also called affected source level. Note that it is very common in the bioacoustic literature to report source levels and radiated noise levels as dB re 20 μPa @ 1 m in air and dB re 1 μPa @ 1 m in water. The ISO definition is mathematically different and the notation excludes “@ 1 m” (International Organization for Standardization 2017).

While the source level can be characteristic of the source, there are many factors that affect the source level. For example, larger ships typically
4 Introduction to Acoustic Terminology and Signal Processing 127
have a higher source level than smaller ships. Cars going fast have a higher source level than cars going slowly. Animals can vary the amplitude of the same sound depending on the context and their motivation. Different sound types can have different source levels. Territorial defense or aggressive sounds usually have the highest source level in a species’ repertoire. Mother-offspring sounds often have the lowest source level in a species’ repertoire, because mother and calf are typically close together and want to avoid detection by predators.

4.2.13 What Field? Free-Field, Far-Field, Near-Field

While this might read like the opening of a Dr. Seuss book, it is quite important to understand these concepts. The free-field, or free sound field, exists around a sound source placed in a homogeneous and isotropic medium that is free of boundaries. Homogeneous means that the medium is uniform in all of its parameters; isotropic means that the parameters do not depend on the direction of measurement. While the free-field assumption is commonly applied to estimates of particle velocity from pressure measurements or estimates of propagation loss, sound sources and receivers are rarely in a free-field. More often, sound sources and receivers are near a boundary. This is the case for sources such as trains or construction sites and for receivers such as humans, all of which are right at the air-ground boundary. This is also the case for sources such as ships at the water surface and for receivers such as fishes in shallow water, where they are near two boundaries: the air-water and the water-seafloor boundaries. At boundaries, some of the sound is transmitted into the other medium, some of it is reflected, and some of it is scattered in various directions. For more detail on source-path-receiver models in air and water, see Chaps. 5 and 6.
The far-field is the region that is far enough from the source so that the particle velocity and pressure are effectively in phase. The near-field is the region closer to the source where they become out of phase, either because sound from different parts of the source arrives at different times (the case of an extended source) or because the curvature of the spherical wavefront from the source is too great to be ignored (the case of a source small enough to be considered a point source). These two cases have different frequency dependence, with the near-field to far-field transition distance increasing with increasing frequency for an extended source, and decreasing with increasing frequency for a small source. A single source may behave as a small source at low frequencies and as an extended source at high frequencies, which implies that there is some non-zero frequency at which it will have a minimum near-field to far-field transition distance. This has resulted in much confusion.
When is a sound source small versus extended? A sound source can be considered small when its physical dimensions are small compared to the acoustic wavelength. A fin whale (Balaenoptera physalus) with a head size of perhaps 6 m produces a characteristic 20-Hz signal that has a wavelength of about 70 m and so the whale can be considered small.
When studying the effects of noise on animals, however, the noise sources one deals with are mostly extended sources. In the near-field, the amplitudes of field and power quantities are affected by the physical dimension of the sound source. This is because the surface of an extended sound source can be considered an array of separate point sources. Each point source generates an acoustic wave. At any location, the instantaneous pressure (as an example of a field quantity) is the summation of the instantaneous pressures from all of the point sources. In the near-field, the various sound waves have traveled various distances and arrive at various phases. Therefore, the near-field consists of regions of destructive and constructive interference and the pressure amplitude depends greatly on where exactly in the near-field it is measured. There may be regions close to a sound source where the pressure amplitude is always zero. The interference pattern depends on the frequency of the sound, and the regions of destructive and constructive interference will be different depending on the
effect of noise on humans. However, at present, only weightings A, C, and Z are standardized (International Electrotechnical Commission 2013).

4.2.14.1 A, C, and Z Frequency Weightings

A, C, and Z frequency weightings are derived from standardized equal-loudness contours. These curves show how the SPL must vary over the frequency spectrum for loudness to be perceived as constant (Suzuki and Takeshima 2004). Loudness is the human perception of sound pressure. Loudness levels are measured in units of phons, determined from referencing the equal-loudness contours. A sound with a loudness level of n phons is perceived as equally loud as a 1-kHz tone with an SPL of n dB. The equal-loudness contours were developed from human loudness perception studies (Fletcher and Munson 1933; Robinson and Dadson 1956; Suzuki and Takeshima 2004) and are standardized (International Organization for Standardization 2003). Table 4.7 defines the A, C, and Z-weighting values at frequencies up to 16 kHz. Figure 4.11 displays the contours of the weightings.

Fig. 4.11 Graph of A-, C-, and Z-weighting curves (gain [dB] versus frequency [Hz])

A-weighting is the primary weighting function for environmental noise assessment. It covers a broad range of frequencies from 20 Hz to 20 kHz. The function is tailored to the perception of low-level sounds and represents an idealized human 40-phon equal-loudness contour. Measurements are noted as dB(A) or dBA.
The C-weighting function provides a better representation of human auditory sensitivity to high-level sounds. This weighting is useful for stipulating peak or impact noise levels and is used for the assessment of instrument and equipment noise.
The Z-weighting function (also known as the zero-weighting function) covers a range of frequencies from 8 Hz to 20 kHz (within 1.5 dB), replacing the “FLAT” and “Linear” weighting functions. It adds no “weight” to account for the auditory sensitivity of humans and is commonly used in octave-band analysis to analyze the sound source rather than its effect.
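As an illustration of the A- and C-weighting curves described above, the following minimal sketch evaluates the closed-form expressions commonly quoted from IEC 61672-1. These formulas and their constants are not given in this chapter; the function names `a_weighting_db` and `c_weighting_db` are hypothetical and chosen only for this example. Z-weighting needs no function because its gain is 0 dB by definition.

```python
import numpy as np

def a_weighting_db(f_hz):
    """A-weighting gain [dB] at frequency f_hz [Hz].

    Closed form as commonly quoted from IEC 61672-1 (assumption: not
    taken from this chapter). Normalized to about 0 dB at 1 kHz.
    """
    f2 = np.asarray(f_hz, dtype=float) ** 2
    num = 12194.0**2 * f2**2
    den = ((f2 + 20.6**2)
           * np.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
           * (f2 + 12194.0**2))
    return 20.0 * np.log10(num / den) + 2.00

def c_weighting_db(f_hz):
    """C-weighting gain [dB] at frequency f_hz [Hz] (same source as above)."""
    f2 = np.asarray(f_hz, dtype=float) ** 2
    num = 12194.0**2 * f2
    den = (f2 + 20.6**2) * (f2 + 12194.0**2)
    return 20.0 * np.log10(num / den) + 0.06

if __name__ == "__main__":
    # Print the gains at a few frequencies; low frequencies are strongly
    # attenuated by A-weighting and only mildly by C-weighting.
    for f in (31.5, 100.0, 1000.0, 8000.0):
        print(f"{f:7.1f} Hz  A: {a_weighting_db(f):6.1f} dB  "
              f"C: {c_weighting_db(f):6.1f} dB")
```

An A-weighted level is then obtained by adding the gain at each frequency to the unweighted band level before summing the bands.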
Table 4.8 Constants of Eq. 4.4 for the six functional hearing groups of marine mammals (Southall et al. 2019)

Marine mammal hearing group                 a     b    f1 [kHz]   f2 [kHz]   C [dB]
Low-frequency cetaceans (LF)                1     2    0.2        19         0.13
High-frequency cetaceans (HF)               1.6   2    8.8        110        1.20
Very-high-frequency cetaceans (VHF)         1.8   2    12         140        1.36
Sirenians (SI)                              1.8   2    4.3        25         2.62
Phocid carnivores in water (PCW)            1     2    1.9        30         0.75
Phocid carnivores in air (PCA)              2     2    0.75       8.3        1.50
Other marine carnivores in water (OCW)      2     2    0.94       25         0.64
Other marine carnivores in air (OCA)        1.4   2    2          20         1.39
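To show how the constants of Table 4.8 are used, the sketch below evaluates a weighting curve for one hearing group. Eq. 4.4 is not reproduced in this passage, so the code assumes the functional form published by Southall et al. (2019), W(f) = C + 10 log10{ (f/f1)^(2a) / ( [1 + (f/f1)^2]^a [1 + (f/f2)^2]^b ) }, with f in kHz. The dictionary `GROUPS` and the function `marine_mammal_weighting_db` are hypothetical names for this example.

```python
import numpy as np

# Constants (a, b, f1 [kHz], f2 [kHz], C [dB]) copied from Table 4.8.
GROUPS = {
    "LF":  (1.0, 2.0, 0.2, 19.0, 0.13),
    "HF":  (1.6, 2.0, 8.8, 110.0, 1.20),
    "VHF": (1.8, 2.0, 12.0, 140.0, 1.36),
    "SI":  (1.8, 2.0, 4.3, 25.0, 2.62),
    "PCW": (1.0, 2.0, 1.9, 30.0, 0.75),
    "PCA": (2.0, 2.0, 0.75, 8.3, 1.50),
    "OCW": (2.0, 2.0, 0.94, 25.0, 0.64),
    "OCA": (1.4, 2.0, 2.0, 20.0, 1.39),
}

def marine_mammal_weighting_db(f_khz, group):
    """Weighting gain [dB] at f_khz [kHz] for a hearing group.

    Assumes the band-pass form of Southall et al. (2019); this is taken
    to be Eq. 4.4, which is not shown in this passage.
    """
    a, b, f1, f2, c = GROUPS[group]
    f = np.asarray(f_khz, dtype=float)
    num = (f / f1) ** (2.0 * a)
    den = (1.0 + (f / f1) ** 2) ** a * (1.0 + (f / f2) ** 2) ** b
    return c + 10.0 * np.log10(num / den)

if __name__ == "__main__":
    freqs_khz = np.logspace(-2, 2.3, 2000)   # 0.01 kHz to about 200 kHz
    w_lf = marine_mammal_weighting_db(freqs_khz, "LF")
    # The constant C shifts the curve so that its maximum sits near 0 dB.
    print(f"LF peak gain: {w_lf.max():.3f} dB at "
          f"{freqs_khz[np.argmax(w_lf)]:.2f} kHz")
```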
[Figure: weighting curves, gain [dB] versus frequency [kHz], for the hearing groups LF, HF, VHF, SI, PCW, PCA, OCW, and OCA]

Fig. 4.13 Illustration of the 3-dB and 10-dB bandwidths of a signal (amplitude [dB] versus frequency [Hz]); p: peak, l: lower, u: upper

Fig. 4.14 Echolocation click from a harbor porpoise (Phocoena phocoena); (a) waveform and amplitude envelope (determined by Hilbert transform), (b) cumulative energy, and (c) spectrum. Three different duration parameters (τ) are shown. The 3-dB duration is the difference in time between the two points at half power (i.e., 3 dB down from the maximum of the signal envelope). The 10-dB duration is the time difference between the points at one tenth of the peak power (i.e., 10 dB below the maximum). Computation of the 90% energy signal duration was explained in Sect. 4.2.6. Three bandwidth measures are shown. The 3-dB and 10-dB bandwidths are measured down from the maximum power, which occurs at the peak frequency $f_p$, and the rms bandwidth is measured about the center frequency $f_c$. Click recording courtesy of Whitlow Au
energy. At the cut-off frequency, the energy is typically reduced by 3 dB. Beyond the cut-off frequency, the attenuation increases; how rapidly depends on the order of the filter.
Band-pass filtering is very common in the study of broadband sounds, in particular broadband noise such as aircraft or ship noise. A number of band-pass filters are used that have adjacent pass-bands such that the sound spectrum is split into adjacent frequency bands. If these bands all have the same width, then the filters are said to have constant bandwidth. In contrast, proportional bandwidth filters split sound into adjacent bands that have a constant ratio of upper to lower frequency. These bands become wider with increasing frequency (e.g., octave bands).
Octave bands are exactly one octave wide, with an octave corresponding to a doubling of frequency. The upper edge frequency of an octave band is twice the lower edge frequency of the band: $f_{up} = 2 f_{low}$. Fractional octave bands are a fraction of an octave wide. One-third octave bands are common. The center frequencies $f_c$ of adjacent 1/3 octave bands are calculated as $f_c(n) = 2^{n/3}$, where n counts the 1/3 octave bands. The lower and upper frequencies of band n are calculated as:

$$f_{low}(n) = 2^{-1/6} f_c(n) \quad \text{and} \quad f_{up}(n) = 2^{1/6} f_c(n)$$

Another example of proportional bands is decidecades. Their center frequencies $f_c$ are
Table 4.9 Center frequencies of adjacent 1/3 octave bands [Hz]. The table can be extended to lower and higher
frequencies by division and multiplication by 10, respectively
10 12.5 16 20 25 31.5 40 50 63 80
100 125 160 200 250 315 400 500 630 800
1000 1250 1600 2000 2500 3150 4000 5000 6300 8000
10,000 12,500 16,000 20,000 25,000 31,500 40,000 50,000 63,000 80,000
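The band-edge relations for 1/3 octave bands can be sketched in a few lines. The example below assumes that the center-frequency formula $f_c(n) = 2^{n/3}$ yields values in Hz; the function name `third_octave_band` and the optional scaling argument `f_ref_hz` are hypothetical and introduced only for illustration. Exact values from the formula differ slightly from the rounded entries of Table 4.9 (e.g., n = 30 gives 1024 Hz rather than 1000 Hz).

```python
def third_octave_band(n, f_ref_hz=1.0):
    """Return (f_low, f_c, f_up) in Hz for the n-th 1/3 octave band.

    f_c(n) = f_ref_hz * 2**(n/3); f_ref_hz = 1 Hz is an assumption about
    the units implied by the formula in the text.
    """
    f_c = f_ref_hz * 2.0 ** (n / 3.0)
    f_low = 2.0 ** (-1.0 / 6.0) * f_c   # lower band edge
    f_up = 2.0 ** (1.0 / 6.0) * f_c     # upper band edge
    return f_low, f_c, f_up

if __name__ == "__main__":
    # Adjacent bands share an edge, and three bands span one octave.
    for n in range(27, 34):
        f_low, f_c, f_up = third_octave_band(n)
        print(f"n = {n}: {f_low:8.1f} Hz to {f_up:8.1f} Hz "
              f"(center {f_c:8.1f} Hz)")
```

Because the ratio of upper to lower edge is the same for every band, the bandwidth in Hz grows in proportion to the center frequency.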
[Figure: octave band levels compared with spectral density levels (1-Hz band levels)]
Fig. 4.17 Examples of signal waveforms (left) and their spectra (right). (a) A sine wave with a frequency of 1000 Hz; (b) a signal consisting of a sine wave with a fundamental frequency of 1000 Hz and five overtones; (c) a 10-ms long pulse with 2-ms rise and fall times; (d) a 10-ms long tone burst with a center frequency of 1000 Hz and 2-ms rise and fall times; (e) a 10-ms long FM sweep from 500 Hz to 1500 Hz with 2-ms rise and fall times; and (f) uncorrelated (white) random noise

apparent in (d). Finally, (f) shows a waveform consisting of uncorrelated noise and its spectrum. In this context “uncorrelated” means that knowledge of the noise at one time instant gives no information about what it will be at any other time instant. This type of noise is often called white noise because it has a flat spectrum (like white light), but as can be seen in this example,
the spectrum of any particular white noise signal is itself quite noisy and it is only flat if one averages the spectra of many similar signals, or alternatively the spectra of many segments of the same signal.
A spectrogram is a plot with, most commonly, time on the x-axis and frequency on the y-axis. A quantity proportional to acoustic power is displayed by different colors or gray levels. If properly calibrated, a spectrogram will show mean-square sound pressure spectral density. A spectrogram is computed as a succession of Fourier transforms. A window is applied in the time domain containing a fixed number of samples of the digital time series. The Fourier transform is computed over these samples. Amplitudes are squared to yield power. The power spectrum is then plotted as a vertical column with frequency on the y-axis. The window in the time domain is then moved forward in time and the next samples of the digital time series are taken and Fourier-transformed. This second spectrum is then plotted next to the first spectrum, as the second vertical column in the spectrogram. The window in the time domain is moved again, the third Fourier transform is computed and plotted as the third column of the spectrogram, and so forth (see examples in Fig. 4.2). The spectrogram, therefore, shows how the spectrum of a sound changes over time. With modern signal processing software, researchers are able to listen to the sounds in real-time while viewing the spectral patterns.

4.3.2 Fourier Transform

It turns out that any signal can be broken down into a sum of sine waves with different amplitudes, frequencies, and phases. This is done by the Fourier transform, named after French mathematician and physicist Joseph Fourier. While the original signal can be represented as a time series h(t) (e.g., sound pressure p(t)) in the time domain, the Fourier transform transforms the signal into the frequency domain, where it is represented as a spectrum H(f). The magnitude of H is the amount of that frequency in the original signal. H(f) is a complex function and its argument contains the phase of that frequency. The inverse Fourier transform recreates the original signal from its Fourier components. For a continuous function with t representing time and f representing frequency, the Fourier transform is (i is the imaginary unit):

$$H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt$$

and the inverse Fourier transform is:

$$h(t) = \int_{-\infty}^{\infty} H(f)\, e^{2\pi i f t}\, df$$

While a sound wave might be continuous, during digital recording or digitization of an analogue recording, its instantaneous pressure is sampled at equally spaced times over a finite window in time. This results in a finite and discrete time series. The equations for the discrete Fourier transform are similar to the above, where the integrals are replaced by summations. The fast Fourier transform (FFT) is the most common mathematical algorithm for computing the discrete Fourier transform. In animal bioacoustics, the FFT is the most commonly used algorithm to compute the frequency spectrum of a sound. The most common display of the frequency spectrum is as a power spectrum. Here, the amplitudes H(f) are squared and in this process, the phase information is lost and, therefore, the original time series cannot be recreated. If sufficient care is taken to properly preserve the phase information, it is not only possible, but often very convenient, to transform a signal into the frequency domain using the FFT, carry out processing (such as filtering) in this domain, and then use an inverse FFT to resynthesize the processed signal in the time domain.

4.3.3 Recording and FFT Settings

Sounds in the various displays can look rather different depending on the recording and analysis parameters. There is no set of parameters that will produce the best display for all sounds. Rather,
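The workflow described in Sect. 4.3.2 (FFT, processing in the frequency domain, inverse FFT) can be sketched with NumPy. The two-tone test signal, the sampling frequency, and the 500-Hz cut-off below are arbitrary choices for illustration, not values from the text. The sketch also shows that the power spectrum alone, which discards phase, is not enough to rebuild the waveform, whereas the complex spectrum is.

```python
import numpy as np

fs = 8000                                   # sampling frequency [Hz] (assumed)
t = np.arange(fs) / fs                      # 1 s of samples
low_tone = np.sin(2 * np.pi * 200 * t)      # 200-Hz component
high_tone = 0.5 * np.sin(2 * np.pi * 1500 * t)
signal = low_tone + high_tone

# Forward FFT of a real time series: complex amplitudes at the
# non-negative frequencies only.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)

# Power spectrum: squared magnitudes; the phase is discarded here.
power = np.abs(spectrum) ** 2

# Crude low-pass filter in the frequency domain: zero all bins above 500 Hz.
spectrum_lp = spectrum.copy()
spectrum_lp[freqs > 500] = 0.0

# The inverse FFT resynthesizes time series from the complex spectra.
roundtrip = np.fft.irfft(spectrum, n=signal.size)
filtered = np.fft.irfft(spectrum_lp, n=signal.size)

if __name__ == "__main__":
    print("strongest frequency:", freqs[np.argmax(power)], "Hz")
    print("round trip exact:", np.allclose(roundtrip, signal))
    print("filtered equals 200-Hz tone:", np.allclose(filtered, low_tone))
```

Zeroing bins abruptly is only meant to make the idea visible; practical filters usually taper the transition to limit ringing in the time domain.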
Fig. 4.18 Waveforms (pressure [Pa] versus time) of a 1-Hz sine wave (black) and a 9-Hz sine wave (blue), both sampled 8 times per second (i.e., $f_s$ = 8 Hz) as indicated by the red circles. Note that the red samples fit either sine wave. In fact, there is an infinite number of signals that fit these samples

Fig. 4.19 Examples of folding (aliasing). Top: A killer whale sound sampled at 96 kHz (a) and at 32 kHz (b) (Wellard et al. 2015). If no anti-aliasing filter is applied, frequencies above the Nyquist frequency (i.e., 16 kHz in the right panel) will appear reflected downwards; upsweeps greater than the Nyquist frequency appear as downsweeps. Bottom: Humpback whale (Megaptera novaeangliae) notes recorded with a sampling frequency of 6 kHz, but without an anti-aliasing filter. Contours above 3 kHz appear mirrored about the 3-kHz edge
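The situation in Fig. 4.18 is easy to reproduce numerically. The sketch below samples a 1-Hz and a 9-Hz sine wave at 8 Hz and confirms that the samples coincide; it also includes a small helper, `folded_frequency` (a hypothetical name for this example), that gives the apparent frequency of a tone when no anti-aliasing filter is used.

```python
import numpy as np

fs = 8.0                          # sampling frequency [Hz], as in Fig. 4.18
t = np.arange(8) / fs             # one second of sample times
samples_1hz = np.sin(2 * np.pi * 1.0 * t)
samples_9hz = np.sin(2 * np.pi * 9.0 * t)

def folded_frequency(f_hz, fs_hz):
    """Apparent frequency of a tone at f_hz sampled at fs_hz without an
    anti-aliasing filter: energy is mirrored about the Nyquist frequency."""
    f_mod = np.mod(f_hz, fs_hz)
    return np.minimum(f_mod, fs_hz - f_mod)

if __name__ == "__main__":
    # The two sine waves are indistinguishable from their samples.
    print("samples identical:", np.allclose(samples_1hz, samples_9hz))
    # A 20-kHz component sampled at 32 kHz lies 4 kHz above the 16-kHz
    # Nyquist frequency and therefore shows up 4 kHz below it.
    print("20 kHz appears at", folded_frequency(20_000.0, 32_000.0), "Hz")
```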
higher the sampling frequency is, the higher the maximum frequency that can be accurately digitized.
In practice, in order to avoid higher frequencies of animal sounds being erroneously displayed and interpreted as lower frequencies, an anti-aliasing filter is employed in the recording system. This is a low-pass filter with a cut-off frequency below the Nyquist frequency. Frequencies higher than the Nyquist frequency are thus attenuated, so that the effect of aliasing is diminished.
An example of aliasing is given in Fig. 4.19. Spectrograms of the same killer whale (Orcinus orca) call are shown sampled at 96 kHz and at 32 kHz. Without an anti-aliasing filter, energy is mirror-inverted or reflected about the Nyquist frequency of 16 kHz in the second case. Conceptually, energy is folded down about the Nyquist frequency by as much as it was above the Nyquist frequency.

4.3.3.3 Bit Depth

When a digitizer samples a sound wave (or the voltage at the end of a microphone), it stores the pressure measures with a limited accuracy. Bit depth is the number of bits of information in each sample. The more bits, the greater the resolution of that measure (i.e., the more accurate the pressure measure). Inexpensive sound digitizers
use 12 bits per sample. Commercially available CDs store each sample with 16 bits of storage, which allows greater accuracy in records of pressure. Blu-ray discs typically use 24 bits per sample. The more bits per sample, the larger the sound file to be stored, but the larger the dynamic range (ratio of loudest to quietest) of sounds that can be captured.

4.3.3.4 Audio Coding

Audio coding is used to compress large audio files to reduce storage needs. A common format is MP3, which can achieve 75–95% file reduction compared to the original time series stored on a CD or computer hard drive. Most audio coding algorithms aim to reduce the file size while retaining reasonable quality for human listeners. The MP3 compression algorithm is based on perceptual coding, optimized for human perception, ignoring features of sound that are beyond normal human auditory capabilities. Playing MP3 files back to animals might result in quite different perception compared to the playback of the original time series. Unfortunately, this is very often ignored in animal bioacoustic experiments. Lossless compression does exist (e.g., Free Lossless Audio Codec, FLAC; see Chap. 2 on recording equipment). For animal bioacoustics research, it is best to use lossless compression or none at all.

4.3.3.5 FFT Window Size (NFFT)

During Fourier analysis of a digitized sound recording, a fixed number of samples of the original time series is read and the FFT is computed on this window of samples. The number of samples is a parameter passed to the FFT algorithm and is typically represented by the variable NFFT. If NFFT samples are read from the original time series, then the Fourier transform will produce amplitude and phase measures at NFFT frequencies. However, the FFT algorithm produces a two-sided spectrum that is symmetrical about 0 Hz and contains NFFT/2 positive frequencies and NFFT/2–1 negative frequencies. To compute the power spectrum, after FFT, the amplitudes of all frequencies (positive and negative) are squared and summed. In the usual case of a time series consisting of real (i.e., not complex) numbers, the same result is obtained by doubling the squared amplitudes of the positive frequencies and discarding the negative frequencies. This means that NFFT samples in the time domain yield NFFT/2 measures in the frequency domain. The FFT values, and therefore the power spectrum calculated from them, are output at a frequency spacing:

$$\Delta f = \frac{f_s}{\mathrm{NFFT}}$$

For example, if a sound recording was sampled at 44.1 kHz and the FFT was computed over NFFT = 1024 samples, then the frequency spacing would be 43.07 Hz and the power spectrum would contain 512 frequencies: 43.07 Hz, 86.14 Hz, ..., 22,050 Hz. A different way of looking at this is that the FFT produces spectrum levels in frequency bands of constant bandwidth, and the center frequencies in this example are 43.07 Hz, 86.14 Hz, ..., 22,050 Hz. If there were two tones at 30 Hz and 50 Hz, then the combination of recording settings ($f_s$ = 44.1 kHz) and analysis settings (NFFT = 1024) would be unable to separate these tones. Their power would be added and reported as the single level in the frequency band centered on 43.07 Hz. To separate these two tones, a frequency spacing of no more than 20 Hz is required. This is achieved by increasing NFFT. To yield a 1-Hz frequency spacing, 1 s of recording needs to be read into the FFT; i.e., NFFT = $f_s$ × 1 s.
As the NFFT increases, the frequency spacing decreases, but at the cost of the temporal resolution. This is because an increase in NFFT means that more samples from the original time series are read in order to compute one spectrum. More samples imply that the time window over which the spectrum is computed increases. In the above example, with $f_s$ = 44.1 kHz, NFFT = 1024 samples correspond to a time window Δt of 0.023 s:

$$\Delta t = \frac{\mathrm{NFFT}}{f_s} = \frac{1}{\Delta f}$$

While 44,100 samples last 1 s, 1024 samples only last 0.023 s. The spectrum is computed over a time window of 0.023 s length. If the recording contained dolphin clicks of 100 μs duration, then the spectrum would be averaging over multiple clicks and ambient noise. To compute the spectrum of one click, a time window of 100 μs is desired and corresponds to NFFT = $f_s$ × 100 μs ≈ 4. This is a very short window. The resulting frequency spacing would be impractically coarse:

$$\Delta f = \frac{f_s}{\mathrm{NFFT}} = \frac{44{,}100\ \mathrm{Hz}}{4} \approx 11{,}000\ \mathrm{Hz}$$

There is a trade-off between frequency spacing and time resolution in Fourier spectrum analysis. This is often referred to as the Uncertainty Principle (e.g., Beecher 1988): Δf Δt = 1. In spectrograms, using a large NFFT will result in sounds looking stretched out in time, while a small NFFT will result in sounds looking smudged in frequency. The combination of recording settings ($f_s$) and analysis settings (NFFT) should be optimized for the sounds of interest.

4.3.3.6 FFT Window Function

The computation of a discrete Fourier transform over a finite window of samples produces spectral leakage, where some power appears at frequencies (called sidelobes) that are not part of the original time series but rather due to the length and shape of the window. If a window of samples is read off the time series and passed straight into the FFT, then the window is said to have rectangular shape. The rectangular window function has values of 1 over the length of the window and values of 0 outside (i.e., before and after). The window function is multiplied sample by sample with the original time series so that NFFT values of unaltered amplitude are passed to the FFT algorithm. A rectangular window produces a large number of sidelobes (Fig. 4.20).
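The trade-off between frequency spacing and time window can be tabulated directly from the two relations Δf = fs/NFFT and Δt = NFFT/fs. The helper `fft_spacing` below is a hypothetical name used only for this sketch; the sampling frequency is the 44.1 kHz of the worked example.

```python
def fft_spacing(fs_hz, nfft):
    """Return (frequency spacing [Hz], time window [s]) for an FFT of
    nfft samples taken from a recording sampled at fs_hz."""
    delta_f = fs_hz / nfft
    delta_t = nfft / fs_hz
    return delta_f, delta_t

if __name__ == "__main__":
    fs_hz = 44_100.0
    # Larger NFFT: finer frequency spacing, longer time window.
    for nfft in (4, 256, 1024, 4096, 44_100):
        delta_f, delta_t = fft_spacing(fs_hz, nfft)
        print(f"NFFT = {nfft:6d}: delta_f = {delta_f:10.2f} Hz, "
              f"delta_t = {delta_t * 1e3:9.3f} ms, "
              f"product = {delta_f * delta_t:.1f}")
```

Whatever NFFT is chosen, the product Δf Δt stays at 1, which is the relation stated in the text.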
Fig. 4.20 Comparison of some window functions (left) and their Fourier transforms (right) for (a) rectangular, (b) Hann,
(c) Hamming, and (d) Blackman-Harris windows
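The difference in leakage between window shapes can be checked numerically. The following sketch applies a rectangular and a Hann window to the same tone, placed halfway between two FFT bins so that leakage is at its worst, and compares the highest spectrum level found far from the tone. The sampling frequency, NFFT, and tone frequency are arbitrary illustration values, and `normalized_spectrum_db` is a hypothetical helper name.

```python
import numpy as np

fs = 1000.0                          # sampling frequency [Hz] (assumed)
nfft = 256
t = np.arange(nfft) / fs
tone_hz = 26.5 * fs / nfft           # halfway between bins 26 and 27
tone = np.sin(2 * np.pi * tone_hz * t)

def normalized_spectrum_db(x, window):
    """Power spectrum of the windowed samples in dB re its own maximum."""
    power = np.abs(np.fft.rfft(x * window)) ** 2
    return 10.0 * np.log10(power / power.max() + 1e-20)

freqs = np.fft.rfftfreq(nfft, d=1 / fs)
spec_rect = normalized_spectrum_db(tone, np.ones(nfft))
spec_hann = normalized_spectrum_db(tone, np.hanning(nfft))

# Highest level more than about 200 Hz above the tone: pure leakage.
far = freqs > 300.0
leak_rect = spec_rect[far].max()
leak_hann = spec_hann[far].max()

if __name__ == "__main__":
    print(f"rectangular window leakage: {leak_rect:7.1f} dB")
    print(f"Hann window leakage:        {leak_hann:7.1f} dB")
```

The tapered window suppresses the distant sidelobes at the price of a wider main lobe around the tone.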
Spectral leakage can be reduced by using non-rectangular windows such as Hann, Hamming, or Blackman-Harris windows. These have values of 1 in the center of the window, but then taper off toward the edges to values at or near 0. The amplitude of the original time series is thus weighted. The benefits are fewer and weaker sidelobes, which result in less spectral leakage. The smallest difference in frequency between two tones that can be separated in the spectrum is called the frequency resolution and is determined by the width of the main lobe of the window function. There is therefore a trade-off between the reduction in sidelobes and a wider main lobe, which results in poorer frequency resolution.
In order to not miss a strong signal or strong amplitude at the edges of the window where the amplitude is weighted by values close to 0, overlapping windows are used. Rather than reading samples in adjacent windows, windows commonly have 50% overlap. A spectrogram that was computed with 50% overlapping windows will have twice the number of spectrum columns and appear to have finer time resolution. Each spectrum column still has the same Δt as for a spectrogram without overlapping windows, but there will be twice as many spectrum columns, making the spectrogram appear finer in time.
Zeros can be appended to each signal block (after windowing) to increase NFFT and therefore reduce the frequency spacing Δf. This so-called zero-padding produces a smoother spectrum but does not improve the frequency resolution, which is still determined by the shape of the window and the duration of the signal to which the window was applied.

4.3.4 Power Spectral Density Percentiles and Probability Density

When recording soundscapes on land or under water, sounds fade in and out, from a diversity of sources and locations. A soundscape is dynamic, changing on short to long time scales (see Chap. 7). The variability in sound levels can be expressed as power spectral density (PSD) percentiles. The nth percentile gives the level that is exceeded n% of the time (note: in engineering, the definition is commonly reversed). The 50th percentile corresponds to the median level. An example from the ocean off southern Australia is shown in Fig. 4.21. The median ambient noise level is represented by the thin black line and goes from about 90 dB re 1 μPa²/Hz at 20 Hz to 60 dB re 1 μPa²/Hz at 30 kHz. The lowest thin gray line corresponds to the 99th percentile. It gets quieter than this only 1% of the time. Levels at low frequencies (20–50 Hz) never drop below
75 dB re 1 μPa²/Hz because of the persistent noise from distant shipping.
These plots not only give the statistical level distribution over time, but can also identify the dominant sources in a soundscape based on the shapes of the percentile curves. The hump from 100 Hz to lower frequencies is characteristic of distant shipping. The more leveled curves at mid-frequencies (200–800 Hz) are characteristic of wind noise recorded under water. The median level of about 68 dB re 1 μPa²/Hz corresponds to a Sea State of 4. The hump at 1.2 kHz is characteristic of chorusing fishes. While there are likely other sounds in this soundscape at certain times (e.g., nearby boats or marine mammals), they do not occur often enough or at a high enough level to stand out in PSD percentile plots.
Probability density of PSD identifies the most common levels. In Fig. 4.21, at 100 Hz, the most common (probable) level was 75 dB re 1 μPa²/Hz. This was equal to the median level at this frequency. The red colors indicate that the median levels were also the most probable levels. At mid-to-high frequencies, the levels were more evenly distributed (i.e., only shades of blue and no red colors). The most probable levels are not necessarily equal to the median levels. A case where the most probable level (again from distant shipping) was below the median (due to strong pygmy blue whale, Balaenoptera musculus brevicauda, calling) is shown in Fig. 4.6, and a case where two different levels were equally likely (due to two seismic surveys at different ranges) is shown in Fig. 4.8, both of Erbe et al. 2016a.³ PSD percentile and probability density plots (as well as other graphs) can be created for both terrestrial and aquatic environments with the freely available software suite by Merchant et al. 2015.

³ https://www.acoustics.asn.au/conference_proceedings/AASNZ2016/papers/p14.pdf; accessed 13 October 2020.

4.4 Localization and Tracking

There are a few simple ways to gain information about the rough location and movement of a sound source. By listening in air with two ears, we can tell the direction to the sound source and whether it remains at a fixed location or approaches or departs. From recordings made over a period of time, the closest point of approach (CPA) is often taken as the point in time when mean-square pressure (or some other acoustic quantity like particle displacement, velocity, or acceleration) peaked (Fig. 4.22).
Whether a sound source is approaching or departing can also be told from the Doppler shift. As a car or a fire engine drives past and as an airplane flies overhead, the pitch drops. In fact, as each approaches, the frequency received by a listener or a recorder is higher than the emitted frequency, and as each departs, the received frequency is lower than the emitted frequency.⁴ At CPA, the received frequency equals the emitted frequency. The time of CPA can be identified in spectrograms as the point in time when the steepest slope in the decreasing frequency occurred as the sound source passed or as the point in time when the frequency had decreased half-way (Fig. 4.23). The Doppler shift Δf can easily be quantified as

$$\Delta f = \frac{v}{c} f_0$$

where v is the speed of the source relative to a fixed receiver, c is the speed of sound, and $f_0$ is the frequency emitted by the source (i.e., half-way between the approaching and the departing frequencies). From a spectrogram, not only the CPA, but also the speed of the sound source can be determined.

⁴ Doppler shift animations by Dan Russell: https://www.acs.psu.edu/drussell/Demos/doppler/doppler.html; accessed 13 October 2020.

In the example of Fig. 4.23, one of the engine harmonics dropped from 96 Hz to 64 Hz. So the emitted frequency was 80 Hz and the Doppler shift was 16 Hz. With a speed of sound in air of 343 m/s, the airplane flew at about 70 m/s = 250 km/h. The interesting part of this example is that the recorder was actually resting on the riverbed, in 1 m of water, and hence in a different acoustic medium to the source. How this affects the results depends on the depth of the hydrophone relative to the acoustic wavelength. In this particular instance, the hydrophone was a small fraction of an acoustic wavelength below the water surface and the signal reached it via the evanescent wave (see Chap. 6 on sound propagation). The evanescent wave traveled horizontally at the in-air sound speed, so it was the in-air sound speed that determined the Doppler shift. If the measurement had been carried out in deeper water with a deeper hydrophone, the signal would have been dominated by the air-to-water refracted wave, and the Doppler shift would have been determined by the in-water sound speed.

Fig. 4.23 Spectrogram of an airplane flying over the Swan River, Perth, Australia, into Perth Airport. Recordings were made in the river, under water. The closest point of approach occurred at about 18 s, when the frequencies of the engine tone and its overtones dropped fastest (Erbe et al. 2018)

To accurately locate a sound source in space, signals from multiple simultaneous acoustic receivers need to be analyzed. These receivers are placed in specific configurations, known as arrays. Methods of localization are dependent on the configuration of the receiver array, the acoustic environment, spectral characteristics of the sound, and behavior of the sound source. There are three broad classes of these methods: time difference of arrival, beamforming, and parametric array processing methods. The following sections provide a condensed overview of the three methods. For a comprehensive treatise, please refer to the following: Schmidt 1986; Van Veen and Buckley 1988; Krim and Viberg 1996; Au and Hastings 2008; Zimmer 2011; Chiariotti et al. 2019.
Tracking is a form of passive acoustic monitoring (PAM), where an estimation of the behavior of an active sound source is maintained over time. Passive acoustic tracking has many demonstrated applications in the underwater and terrestrial domains.
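Returning to the Doppler example of Fig. 4.23, the arithmetic can be written out as a short sketch. It uses the low-speed relation Δf = (v/c) f0 given in the text and takes the emitted frequency as the midpoint of the approaching and departing frequencies; the function name `speed_from_doppler` is hypothetical.

```python
def speed_from_doppler(f_approach_hz, f_depart_hz, c_m_per_s):
    """Estimate emitted frequency, Doppler shift, and source speed from
    the frequencies received before and after the closest point of
    approach, using delta_f = (v / c) * f0."""
    f0 = 0.5 * (f_approach_hz + f_depart_hz)   # emitted frequency [Hz]
    delta_f = f_approach_hz - f0               # Doppler shift [Hz]
    v = c_m_per_s * delta_f / f0               # source speed [m/s]
    return f0, delta_f, v

if __name__ == "__main__":
    # Engine harmonic dropping from 96 Hz to 64 Hz, sound speed in air 343 m/s.
    f0, delta_f, v = speed_from_doppler(96.0, 64.0, 343.0)
    print(f"emitted frequency: {f0:.0f} Hz, Doppler shift: {delta_f:.0f} Hz")
    print(f"speed: {v:.1f} m/s = {v * 3.6:.0f} km/h")
```

The unrounded result is 68.6 m/s (about 247 km/h), which the text rounds to 70 m/s and 250 km/h.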
Fig. 4.24 Determining TDOA by cross-correlation (sound pressure [µPa] versus time [ms]). Top: Two 100-ms time series were recorded by two spatially separated receivers. A signal of interest arrived 20 ms into the recording at receiver 1 (red) and 40 ms into the recording at receiver 2 (blue). The dot product (i.e., correlation coefficient) is low. Bottom: The red time series is shifted sample by sample against the blue time series and the dot product computed over the overlapping samples. When the signals line up, the correlation coefficient is maximum. In this example, the TDOA was 20 ms
4.4.1 Time Difference of Arrival

Localization by Time Difference Of Arrival (TDOA) is a two-step process. The first step is to measure the difference in time between the arrivals of the same sound at any pair of acoustic receivers. The second step is to apply appropriate geometrical calculations to locate the sound source. TDOA methods work best for signals that contain a wide range of frequencies (i.e., have a wide bandwidth), which includes short pulses, FM sweeps, and noise-like signals.

4.4.1.1 Generalized Cross-Correlation

TDOAs are commonly determined by cross-correlation. The time series of recorded sound pressure by two spatially separated receivers are cross-correlated as a sliding dot product. This means that each sample from receiver 1 is multiplied with a corresponding sample from receiver 2, and the products are summed over the full length of the overlapping time series. This yields the first cross-correlation coefficient. Next, the time series from receiver 1 (red in Fig. 4.24) is shifted by 1 sample against the time series from receiver 2 (blue), and the dot product is computed again (over the overlapping samples), yielding the second cross-correlation coefficient. By sliding the two time series against each other (sample by sample) and computing the dot product, a time series of cross-correlation coefficients forms. A peak in cross-correlation occurs when the time series have been shifted such that the signal recorded by receiver 1 lines up with the signal recorded by receiver 2. The number of samples by which the time series were shifted, divided by the sampling frequency of the two receivers, is the TDOA.
Generalized cross-correlation is a common way of determining TDOA. It is suitable for localization in air and water in environments with high noise and reverberation and can be computed in either the time or frequency domains (Padois 2018).

4.4.1.2 TDOA Hyperbolas

TDOAs are always computed between two receivers (from a pair of receivers). Figure 4.25 sketches the arrangement of an animal A (at point A) and two receivers (R1 and R2) in space. The
distances A-R1 (mathematically noted as a line connecting points A and R1 and then taking the magnitude of it: |A R1|), A-R2, and R1-R2 are shown as red lines. If A produces a sound that is recorded by both R1 and R2, then the arrival time at point R1 is equal to the distance A-R1, divided by the speed of sound c, and the arrival time at R2 is equal to the distance A-R2, divided by the speed of sound c. The TDOA is simply the difference between the two arrival times:

$$\mathrm{TDOA} = \frac{|A R_1| - |A R_2|}{c}$$

It turns out mathematically that the animal can be anywhere on the hyperboloid and the TDOA will be the same. In other words, the TDOA defines a surface (in the shape of a hyperboloid) on which the animal may be located. With two receivers in the free-field, the animal's position cannot be specified further. If there are boundaries near the animal and/or receivers (e.g., if a bird is tracked with receivers on the ground), then the possible location of the animal can be easily limited (i.e., the bird cannot fly underground, eliminating half of the space). Reflections off boundaries can also be used to refine the location estimate. Finally, if one deploys more than two receivers, TDOAs can be computed between all possible pairs of receivers, yielding multiple hyperboloids that will intersect at the location of the animal.

Fig. 4.25 Graphs of localization hyperbolas with two receivers; (a) 3D hyperboloid and (b) 2D hyperbola (i.e., cross-section) in the x-z plane. A marks the animal's position; R1 and R2 mark the receiver positions. R2 is hidden inside the hyperboloid in the 3D image

4.4.1.3 TDOA Localization in 2 Dimensions

Localization in 2D space is, of course, simpler than in 3D, though it might seem a little contrived. In Fig. 4.26, the airport arrival flight path goes straight over a home. TDOA is used to locate (and perhaps track) each airplane. Two receivers on the ground will yield the upper half of the hyperbola in Fig. 4.25b as possible airplane locations. We know the airplane cannot be underground, but in terms of its altitude and range, two receivers are unable to resolve these. A third receiver in line with R1 and R2 is needed. With three receivers in a line array, three TDOAs can be computed and three hyperbolas can be drawn. Any two of these hyperbolas will intersect at two points: one above and one below the x-axis (i.e., above and below ground). Knowing that the
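[Figure: two panels in the x-y plane, each showing three receivers m1, m2, and m3; (a) all three receivers in a line, (b) receiver m2 offset from the line through m1 and m3]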
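The second step of TDOA localization in Sect. 4.4.1.3 (three receivers in a line on the ground, source known to be above ground) can be sketched numerically. This is a minimal illustration under stated assumptions: the receiver positions, the speed of sound, the source position, and the function names are hypothetical, and a brute-force grid search is swapped in for the geometric intersection of hyperbolas described in the text.

```python
import numpy as np

C_AIR = 343.0  # assumed speed of sound in air, m/s

def predicted_tdoas(source, receivers, pairs, c=C_AIR):
    """TDOA (s) for each receiver pair (i, j): (|A Ri| - |A Rj|) / c."""
    dist = np.linalg.norm(receivers - source, axis=1)
    return np.array([(dist[i] - dist[j]) / c for i, j in pairs])

def locate_by_grid_search(measured, receivers, pairs, xs, zs, c=C_AIR):
    """Return the (x, z) grid point whose predicted TDOAs best match the data."""
    best, best_err = None, np.inf
    for x in xs:
        for z in zs:
            candidate = np.array([x, z])
            mismatch = predicted_tdoas(candidate, receivers, pairs, c) - measured
            err = np.sum(mismatch ** 2)
            if err < best_err:
                best, best_err = (x, z), err
    return best

# Hypothetical line array on the ground (z = 0), positions in metres.
receivers = np.array([[-20.0, 0.0], [0.0, 0.0], [20.0, 0.0]])
pairs = [(0, 1), (0, 2), (1, 2)]          # all possible receiver pairs

# Simulated "measurements" from an assumed source position (noise-free).
true_source = np.array([30.0, 50.0])
measured = predicted_tdoas(true_source, receivers, pairs)

# Search only above ground (z > 0), which removes the mirror solution.
xs = np.arange(-100.0, 101.0, 5.0)
zs = np.arange(5.0, 205.0, 5.0)
estimate = locate_by_grid_search(measured, receivers, pairs, xs, zs)
print("Estimated source position (x, z) in m:", estimate)
```

Restricting the search to z > 0 plays the role of the above-ground constraint in the text. With real, noisy TDOAs the best grid point would only approximate the source position, and the range estimate degrades as the source moves far from the array.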
When a sound source is far away from an array of receivers, the TDOAs can still be used to determine the direction of the sound source but any estimate of its distance will become inaccurate.

4.4.2 Beamforming

TDOA methods give poor results for sources that emit narrow-bandwidth signals such as continuous tones (e.g., some sub-species of blue whale) and can also be confounded in situations where there are many sources of similar signals in different directions from the array (e.g., a fish chorus). However, a properly designed array can be used to determine the direction of narrowband sources and can also determine the directional distribution of sound produced by multiple, simultaneously emitting sources using a processing method called beamforming. If two or more spatially separated arrays can be deployed, then the directional information they produce can be combined to obtain a spatial localization of the source. Alternatively, if the source is known to be stationary, or moving sufficiently slowly, localization can be achieved by moving a single array, for example by towing it behind a ship.

For the convenient, and hence commonly used case of an array consisting of a line of equally spaced hydrophones, beamforming requires the hydrophone spacing to be less than half the acoustic wavelength of the sound being emitted by the source. Also, the accuracy of the bearing estimates improves as the length of the array increases. These two factors combined mean that a useful array for beamforming is likely to require at least eight hydrophones, and even that would give only modest bearing accuracy. Consequently, 16-element or even 24-element arrays are commonly deployed in practice. A straight-line array used for beamforming suffers from the same ambiguity as a TDOA array in which all the hydrophones are in a straight line. As in the TDOA case, this ambiguity can be countered by offsetting some of the hydrophones from the straight line, however beamforming requires the relative positions of all the hydrophones to be accurately known, so this is not always easy to achieve in practice.

Beamforming itself is relatively simple conceptually, but there are many subtleties (for details, see Van Veen and Buckley 1988; Krim and Viberg 1996). As for TDOA methods, the starting point is that when sound from a distant source arrives at an array of hydrophones, it will arrive at each hydrophone at a slightly different time, with the time differences depending on the direction of the sound source. The simplest type of beamformer is the delay and sum beamformer in which the array is "steered" in a particular direction by calculating the arrival time differences corresponding to that direction, delaying the received signals by amounts that cancel out those time differences, and then adding them together. This has the effect of reinforcing signals coming from the desired direction, while signals from other directions tend to cancel out. This isn't a perfect process and the array will still give some output for signals coming from other directions. The relative sensitivity of the beamformer output to signals coming from different directions can be calculated and gives the beam pattern of the array. The beam pattern of a line array depends on the steering direction, with the narrowest beams occurring when the array is steered at right-angles to the axis of the array (broadside), and the broadest beams when steered in the axial direction (end-fire). There are a number of other beamforming algorithms that can give improved performance in particular circumstances; see the above references for details.

4.4.3 Parametric Array Processing

The array requirements for parametric array processing methods are similar to those for beamforming, but these methods attempt to circumvent the direct dependence of the angular accuracy on the length of the array (in acoustic wavelengths) that is inherent to beamforming. A summary of these methods can be found in Krim and Viberg (1996). One of the earliest and best
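The delay and sum beamformer of Sect. 4.4.2 can be illustrated with a short simulation. The sketch below is not the authors' implementation: the eight-element line array, the 0.6-m spacing, the 1000-Hz tone, the sound speed of 1500 m/s, and all function and variable names are assumptions made for the example, and the delays are applied as phase shifts in the frequency domain, which is one of several possible ways to do so.

```python
import numpy as np

def delay_and_sum_scan(signals, sensor_x, fs, c, angles_deg):
    """Steer a line array over candidate bearings; return beam power per bearing.

    signals    : array (n_sensors, n_samples) of simultaneously sampled pressure
    sensor_x   : sensor positions along the array axis in metres
    angles_deg : candidate bearings measured from broadside (0 degrees)
    """
    n_sensors, n_samples = signals.shape
    spectra = np.fft.rfft(signals, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    power = np.empty(len(angles_deg))
    for k, theta in enumerate(np.radians(angles_deg)):
        # Arrival-time differences of a plane wave from bearing theta.
        delays = -sensor_x * np.sin(theta) / c
        # Cancel those differences (a time shift is a phase ramp in frequency).
        phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
        beam = (spectra * phase).sum(axis=0) / n_sensors  # add aligned channels
        power[k] = np.sum(np.abs(beam) ** 2)
    return power

# Hypothetical narrowband source in water.
c = 1500.0                 # assumed sound speed, m/s
f0 = 1000.0                # tone frequency, Hz (wavelength 1.5 m)
fs = 20_000.0              # sampling frequency, Hz
spacing = 0.6              # metres; less than half a wavelength (0.75 m)
sensor_x = spacing * np.arange(8)
t = np.arange(2000) / fs   # 0.1 s of data

true_bearing = 30.0        # degrees from broadside (assumed)
arrival = -sensor_x * np.sin(np.radians(true_bearing)) / c
signals = np.sin(2 * np.pi * f0 * (t[None, :] - arrival[:, None]))

angles = np.arange(-90, 91)
power = delay_and_sum_scan(signals, sensor_x, fs, c, angles)
estimated_bearing = angles[np.argmax(power)]
print("Bearing with maximum beam power (degrees):", estimated_bearing)
```

Plotting `power` against `angles` gives an impression of the beam pattern for this steering direction, including the residual output for directions other than the true bearing.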
such as vessels (Zhu et al. 2018). A MUSIC approach to localization also has applications in the underwater domain, having previously been used for recovering acoustically-tagged artifacts by autonomous underwater vehicles (AUVs) (Vivek and Vadakkepat 2015).

Finally, target motion analysis involves marking the bearing to a sound source (from directional sensors or a narrow-aperture array) successively over time. If the animal calls frequently and moves slowly compared to the observation platform, successive bearings will intersect at the animal location (e.g., Norris et al. 2017).

4.4.5 Passive Acoustic Tracking

Passive acoustic tracking is the sequential localization of an acoustic source, useful for monitoring its behavior. Such behavior includes kinetic elements (e.g., swim path and speed) and acoustic elements (such as vocalization rate and type). In practice, the process is a bit more complicated than just connecting TDOA locations over time. Animals will be arriving and departing; there may be more than one animal vocalizing; any one animal will have quiet times between vocalizations. So, TDOA locations need to be joined into tracks; tracks need to be continued; old tracks need to be terminated; new tracks need to be initiated; tracks may need to be merged or split. Different algorithms have been developed to aid this process, with Kalman filtering being common (Zimmer 2011; Zarchan and Musoff 2013).

While radio telemetry has historically been the primary approach to terrestrial animal tracking, passive acoustic telemetry has grown in popularity as more animals can be monitored non-invasively (e.g., McGregor et al. 1997; Matsuo et al. 2014). Passive acoustic tracking in water is a well-established method of monitoring the behavior of aquatic fauna, including their responses to environmental and anthropogenic stimuli (e.g., Thode 2005; Stanistreet et al. 2013). Both towed and moored arrays are used, with towed arrays providing greater spatial coverage in the form of line-transect surveys.

4.5 Symbols and Abbreviations (Table 4.10)
4.6 Summary

This chapter presented an introduction to acoustics and explained the basic quantities and concepts relevant to terrestrial and aquatic animal bioacoustics. Specific terminology that was introduced includes sound pressure, sound exposure, particle velocity, sound speed, longitudinal and transverse waves, frequency modulation, amplitude modulation, decibel, source level, near-field, far-field, frequency weighting, power spectral density, and one-third octave band level, amongst others. The chapter further introduced basic signal sampling and processing concepts such as sampling frequency, Nyquist frequency, aliasing, windowing, and Fourier transform. The chapter concluded with an introductory treatise of sound localization and tracking, including time difference of arrival and beamforming.

References

Acton WI (1974) The effects of industrial airborne ultrasound on humans. Ultrasonics 12(3):124–128. https://doi.org/10.1016/0041-624X(74)90069-9
Amaral FR, Serrano Rico JC, Medeiros MAF (2018) Design of microphone phased arrays for acoustic beamforming. J Braz Soc Mech Sci Eng 40(7):354. https://doi.org/10.1007/s40430-018-1275-5
American National Standards Institute (2013) Acoustical Terminology (ANSI/ASA S1.1-2013). Acoustical Society of America, Melville, NY, USA
Au WWL, Hastings M (2008) Principles of marine bioacoustics. Springer Verlag, New York
Baumann-Pickering S, McDonald MA, Simonis AE, Solsona Berga A, Merkens KPB, Oleson EM, Roch MA, Wiggins SM, Rankin S, Yack TM, Hildebrand JA (2013) Species-specific beaked whale echolocation signals. J Acoust Soc Am 134(3):2293–2301. https://doi.org/10.1121/1.4817832
Beecher MD (1988) Spectrographic analysis of animal vocalizations: Implications of the "Uncertainty Principle". Bioacoustics 1(2–3):187–208. https://doi.org/10.1080/09524622.1988.9753091
Caldwell MC, Caldwell DK (1965) Individualized whistle contours in bottlenosed dolphins (Tursiops truncatus). Nature 207(4995):434–435. https://doi.org/10.1038/207434a0
Cato DH (1998) Simple methods of estimating source levels and locations of marine animal sounds. J Acoust Soc Am 104(3):1667–1678. https://doi.org/10.1121/1.424379
Chen C-E, Ali AM, Wang H (2006) Design and testing of robust acoustic arrays for localization and enhancement of several bird sources. Proceedings of the 5th International Conference on Information Processing in Sensor Networks:268–275
Chiariotti P, Martarelli M, Castellini P (2019) Acoustic beamforming for noise source localization – Reviews, methodology and applications. Mech Syst Signal Process 120:422–448. https://doi.org/10.1016/j.ymssp.2018.09.019
Dziak RP, Bohnenstiehl DR, Matsumoto H, Fox CG, Smith DK, Tolstoy M, Lau T-K, Haxel JH, Fowler MJ (2004) P- and T-wave detection thresholds, Pn velocity estimate, and detection of lower mantle and core P-waves on ocean sound-channel hydrophones at the Mid-Atlantic Ridge. Bull Seismol Soc Am 94(2):665–677. https://doi.org/10.1785/0120030156
Erbe C (2009) Underwater noise from pile driving in Moreton Bay, Qld. Acoust Aust 37(3):87–92
Erbe C (2013) Underwater noise of small personal watercraft (jet skis). J Acoust Soc Am 133(4):EL326–EL330. https://doi.org/10.1121/1.4795220
Erbe C, King AR (2009) Modelling cumulative sound exposure around marine seismic surveys. J Acoust Soc Am 125(4):2443–2451. https://doi.org/10.1121/1.3089588
Erbe C, McCauley R, Gavrilov A, Madhusudhana S, Verma A (2016a) The underwater soundscape around Australia. Proceedings of Acoustics 2016, 9–11 November 2016, Brisbane, Australia.
Erbe C, Parsons M, Duncan AJ, Allen K (2016b) Underwater acoustic signatures of recreational swimmers, divers, surfers and kayakers. Acoust Aust 44(2):333–341. https://doi.org/10.1007/s40857-016-0062-7
Erbe C, Parsons M, Duncan AJ, Lucke K, Gavrilov A, Allen K (2017a) Underwater particle motion (acceleration, velocity and displacement) from recreational swimmers, divers, surfers and kayakers. Acoust Aust 45:293–299. https://doi.org/10.1007/s40857-017-0107-6
Erbe C, Parsons M, Duncan AJ, Osterrieder S, Allen K (2017b) Aerial and underwater sound of unmanned aerial vehicles (UAV, drones). J Unmanned Veh Syst 5(3):92–101. https://doi.org/10.1139/juvs-2016-0018
Erbe C, Williams R, Parsons M, Parsons SK, Hendrawan IG, Dewantama IMI (2018) Underwater noise from airplanes: An overlooked source of ocean noise. Mar Pollut Bull 137:656–661. https://doi.org/10.1016/j.marpolbul.2018.10.064
Finneran J, Schlundt C (2011) Subjective loudness level measurements and equal loudness contours in a bottlenose dolphin (Tursiops truncatus). J Acoust Soc Am 130(5):3124–3136. https://doi.org/10.1121/1.3641449
Fletcher H, Munson WA (1933) Loudness, its definition, measurement and calculation. J Acoust Soc Am 5(2):82–108. https://doi.org/10.1121/1.1915637
Holland RA, Waters DA, Rayner JMV (2004) Echolocation signal structure in the Megachiropteran bat
Rousettus aegyptiacus Geoffroy 1810. J Exp Biol 207(25):4361. https://doi.org/10.1242/jeb.01288
Houser DS, Yost W, Burkard R, Finneran JJ, Reichmuth C, Mulsow J (2017) A review of the history, development and application of auditory weighting functions in humans and marine mammals. J Acoust Soc Am 141(3):1371–1413. https://doi.org/10.1121/1.4976086
Huang X, Bai L, Vinogradov I, Peers E (2012) Adaptive beamforming for array signal processing in aeroacoustic measurements. J Acoust Soc Am 131(3):2152–2161. https://doi.org/10.1121/1.3682041
International Electrotechnical Commission (2013) Electroacoustics - Sound level meters - Part 1: Specifications (IEC 61672-1 Ed. 2.0 b:2013). New York.
International Organization for Standardization (2003) Acoustics—Normal equal-loudness-level contours (ISO 226:2003).
International Organization for Standardization (2007) Acoustics — Definitions of basic quantities and terms (ISO/TR 25417). Geneva, Switzerland.
International Organization for Standardization (2017) Underwater acoustics—Terminology (ISO 18405). Geneva, Switzerland.
Janik VM, Van Parijs SM, Thompson PM (2000) A two-dimensional acoustic localization system for marine mammals (Note). Mar Mamm Sci 16(2):437–447. https://doi.org/10.1111/j.1748-7692.2000.tb00935.x
Kamminga C, Beitsma GR (1990) Investigations on cetacean sonar IX, remarks on dominant sonar frequencies from Tursiops truncatus. Aquat Mamm 16(1):14–20
Kastelein RA, Wensveen PJ, Terhune JM, de Jong CAF (2011) Near-threshold equal-loudness contours for harbour seals (Phoca vitulina) derived from reaction times during underwater audiometry: a preliminary study. J Acoust Soc Am 129(1):488–495. https://doi.org/10.1121/1.3518779
Koblitz JC (2018) Arrayvolution: using microphone arrays to study bats in the field. Can J Zool 96(9):933–938. https://doi.org/10.1139/cjz-2017-0187
Krim H, Viberg M (1996) Two decades of array signal processing research: the parametric approach. IEEE Signal Process Mag 13(4):67–94. https://doi.org/10.1109/79.526899
Kuehne LM, Erbe C, Ashe E, Bogaard LT, Collins MS, Williams R (2020) Above and below: Military aircraft noise in air and under water at Whidbey Island, Washington. J Mar Sci Eng 8(11):923. https://doi.org/10.3390/jmse8110923
Leighton TG (2018) Ultrasound in air—Guidelines, applications, public exposures, and claims of attacks in Cuba and China. J Acoust Soc Am 144(4):2473–2489. https://doi.org/10.1121/1.5063351
Marley SA, Erbe C, Salgado Kent CP (2017) Underwater recordings of the whistles of bottlenose dolphins in Fremantle Inner Harbour, Western Australia. Sci Data 4(170):126. https://doi.org/10.1038/sdata.2017.126
Matsuo I, Wheeler A, Kloepper L, Gaudette J, Simmons JA (2014) Acoustic tracking of bats in clutter environments using microphone arrays. J Acoust Soc Am 135(4):2207–2207. https://doi.org/10.1121/1.4877207
McGregor PK, Dabelsteen T, Clark CW, Bower JL, Holland J (1997) Accuracy of a passive acoustic location system: empirical studies in terrestrial habitats. Ethol Ecol Evol 9(3):269–286. https://doi.org/10.1080/08927014.1997.9522887
Merchant ND, Fristrup KM, Johnson MP, Tyack PL, Witt MJ, Blondel P, Parks SE (2015) Measuring acoustic habitats. Methods Ecol Evol 6(3):257–265. https://doi.org/10.1111/2041-210X.12330
Miller PJ, Tyack PL (1998) A small towed beamforming array to identify vocalizing resident killer whales (Orcinus orca) concurrent with focal behavioral observations. Deep Sea Res Part II Top Stud Oceanogr 45(7):1389–1405. https://doi.org/10.1016/S0967-0645(98)00028-9
Mouy X, Hannay D, Zykov M, Martin B (2012) Tracking of Pacific walruses in the Chukchi Sea using a single hydrophone. J Acoust Soc Am 131(2):1349–1358. https://doi.org/10.1121/1.3675008
Norris TF, Dunleavy KJ, Yack TM, Ferguson EL (2017) Estimation of minke whale abundance from an acoustic line transect survey of the Mariana Islands. Mar Mamm Sci. https://doi.org/10.1111/mms.12397
Padois T (2018) Acoustic source localization based on the generalized cross-correlation and the generalized mean with few microphones. J Acoust Soc Am 143(5):EL393–EL398. https://doi.org/10.1121/1.5039416
Parrack HO (1966) Effect of Air-Borne Ultrasound on Humans. Int J Audiol 5(3):294–308. https://doi.org/10.3109/05384916609074198
Parsons MJ, McCauley RD, Mackie MC, Siwabessy P, Duncan AJ (2009) Localization of individual mulloway (Argyrosomus japonicus) within a spawning aggregation and their behaviour throughout a diel spawning period. ICES J Mar Sci 66(6):1007–1014. https://doi.org/10.1093/icesjms/fsp016
Prime Z, Doolan C, Zajamsek B (2014) Beamforming array optimisation and phase averaged sound source mapping on a model wind turbine. Inter-Noise and Noise-Con Congress and Conference Proceedings 249(7):1078–1086
Putland RL, Mackiewicz AG, Mensinger AF (2018) Localizing individual soniferous fish using passive acoustic monitoring. Ecol Inform 48:60–68. https://doi.org/10.1016/j.ecoinf.2018.08.004
Robinson DW, Dadson RS (1956) A re-determination of the equal-loudness relations for pure tones. Br J Appl Phys 7(5):166–181. https://doi.org/10.1088/0508-3443/7/5/302
Schmidt R (1986) Multiple emitter location and signal parameter estimation. IEEE Trans Antennas Propag 34(3):276–280. https://doi.org/10.1109/TAP.1986.1143830
Southall BL, Bowles AE, Ellison WT, Finneran JJ, Gentry RL, Greene CRJ, Kastak D, Ketten DR, Miller JH, Nachtigall PE, Richardson WJ, Thomas JA, Tyack
PL (2007) Marine mammal noise exposure criteria: Initial scientific recommendations. Aquat Mamm 33(4):411–521. https://doi.org/10.1080/09524622.2008.9753846
Southall BL, Finneran JJ, Reichmuth C, Nachtigall PE, Ketten DR, Bowles AE, Ellison WT, Nowacek DP, Tyack PL (2019) Marine mammal noise exposure criteria: Updated scientific recommendations for residual hearing effects. Aquat Mamm 45(2):125–232. https://doi.org/10.1578/AM.45.2.2019.125
Stanistreet JE, Risch D, Van Parijs SM (2013) Passive acoustic tracking of singing humpback whales (Megaptera novaeangliae) on a Northwest Atlantic feeding ground. PLoS One 8(4):e61263. https://doi.org/10.1371/journal.pone.0061263
Surlykke A, Pedersen SB, Jakobsen L (2009) Echolocating bats emit a highly directional sonar sound beam in the field. Proc R Soc B Biol Sci 276(1658):853–860. https://doi.org/10.1098/rspb.2008.1505
Suzuki Y, Takeshima H (2004) Equal-loudness-level contours for pure tones. J Acoust Soc Am 116(2):918–933. https://doi.org/10.1121/1.1763601
Taylor B, Thompson A (eds) (2008) The International System of Units (SI). National Institute of Standards and Technology, Gaithersburg, MD
Thode A (2005) Three-dimensional passive acoustic tracking of sperm whales (Physeter macrocephalus) in ray-refracting environments. J Acoust Soc Am 118(6):3575–3584. https://doi.org/10.1121/1.2049068
Tonin R (2018) A review of wind turbine-generated infrasound: Source, measurement and effect on health. Acoust Aust 46(1):69–86. https://doi.org/10.1007/s40857-017-0098-3
Tougaard J, Beedholm K (2019) Practical implementation of auditory time and frequency weighting in marine bioacoustics. Appl Acoust 145:137–143. https://doi.org/10.1016/j.apacoust.2018.09.022
Van Veen BD, Buckley KM (1988) Beamforming: a versatile approach to spatial filtering. IEEE ASSP Mag 5(2):4–24. https://doi.org/10.1109/53.665
Vivek R, Vadakkepat P (2015) Multiple signal classification (MUSIC) based underwater acoustic localization module (UALM) for AUV. Paper presented at the 2015 IEEE Underwater Technology (UT) Conference. https://doi.org/10.1109/UT.2015.7108288
Ward R, Parnum I, Erbe C, Salgado-Kent CP (2016) Whistle characteristics of Indo-Pacific bottlenose dolphins (Tursiops aduncus) in the Fremantle Inner Harbour, Western Australia. Acoust Aust 44(1):159–169. https://doi.org/10.1007/s40857-015-0041-4
Watkins WA, Schevill WE (1972) Sound source location by arrival-times on a non-rigid three-dimensional hydrophone array. Deep-Sea Res Oceanogr Abstr 19(10):691–706. https://doi.org/10.1016/0011-7471(72)90061-7
Wellard R, Erbe C, Fouda L, Blewitt M (2015) Vocalisations of killer whales (Orcinus orca) in the Bremer Canyon, Western Australia. PLoS One 10(9):e0136535. https://doi.org/10.1371/journal.pone.0136535
Zarchan P, Musoff H (2013) Fundamentals of Kalman filtering: a practical approach, 4th edn. American Institute of Aeronautics and Astronautics, Inc., Reston, VA
Zhu C, Garcia H, Kaplan A, Schinault M, Handegard NO, Godø OR, Huang W, Ratilal P (2018) Detection, localization and classification of multiple mechanized ocean vessels over continental-shelf scale regions with passive ocean acoustic waveguide remote sensing. Remote Sens 10(11):1699
Zimmer WMX (2011) Passive acoustic monitoring of cetaceans. Cambridge University Press, Cambridge
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
5 Source-Path-Receiver Model for Airborne Sounds
Ole Næsbye Larsen, William L. Gannon, Christine Erbe,
Gianni Pavan, and Jeanette A. Thomas
travels to surrounding residential buildings.² The source may be eliminated by relocating all traffic to an inner-city bypass and banning all traffic downtown. Maybe private car traffic can be substituted by a quieter, electric city bus service. Imposing a speed limit reduces noise. Some cities enforce noise emission standards for cars. Long-term engineering solutions may include building a tunnel, resurfacing the road with noise-absorbing material, installing noise barrier walls along the road, or erecting earth bunds. Residential buildings may have noise-reduction (double-glazed) windows and residents may set up their bedrooms at the opposite side of the building. The specific implementation of the SPRM depends on the application. For example, residents in an apartment building would not want to wear earmuffs at home, but for workers in a noisy plant, such PPE is common practice. A poster showing the steps involved in workplace noise control is shown in Fig. 5.2.

² Example SPRM for traffic noise. Environmental Protection Department, The Government of the Hong Kong Special Administrative Region https://www.epd.gov.hk/epd/noise_education/young/eng_young_html/m3/m3.html; accessed 4 December 2020.

Even though the SPRM was originally developed to manage hazards at the workplace, it is much more broadly applicable to the day-to-day lives of humans, and animals. In fact, the SPRM is fundamental. Without a receiver, there is no hazard. Without a listener, there is no noise. Researchers of animal bioacoustics might want to apply the SPRM to their project in order to identify parameters of the source, path, and receiver, that might influence the results. Other chapters in this book either explicitly or implicitly apply the SPRM. Chapter 13 on the effects of noise on animals provides examples where the source is a highway, the path follows from the highway into the surrounding bush, and the receivers are birds, whose abundance might decrease closer to the source as a result of habitat degradation by noise. Chapter 11 deals with acoustic communication between animals, and so the source may be a male frog, the path may lead through a tropical rain forest, and the receivers are nearby females of the same species. Chapter 12 is about echolocation. Here, the source and the receiver are the same individual animal. A bat echolocates on a moth and the echolocation signal reflects off the moth,
informing the bat how far away its prey is. The signal travels through the environment twice: from the bat to its prey and back. Chapter 10 covers audiometry, where the sources are controlled and engineered signals (often pure tones) that are played to animals over short distances or through earphones, and the receivers are individual animals whose hearing is being measured. Chapter 7 explores soundscapes on land and under water. The sources are grouped into geophony (e.g., wind, rain, and waves), biophony (i.e., animals), and anthropophony (e.g., airplanes or ships). The paths go through the air over land, under water, and through the ground. The receivers in passive acoustic monitoring of soundscapes are recorders, which collect and store acoustic data for later analysis in the laboratory. The following sections first explore the basic concepts of sound propagation in air before applying these to an example SPRM.

5.2 Sound Propagation in Terrestrial Environments

The environment through which a sound travels alters its acoustic features such as its spectral composition and level. The effects of the environment on bioacoustic signals were well explored in the classic works of Chappuis (1971), Marten and Marler (1977), Michelsen (1978), and Wiley and Richards (1978).
Fig. 5.3 Diagram of some of the factors affecting sound propagation in air. Figure donated by Sara Torres Ortiz
Airborne sound propagation (often called outdoor sound propagation) is characterized by a number of phenomena. Sounds attenuate with distance from the sender due to geometrical attenuation (i.e., spreading) and absorption by the medium. High-frequency sounds (i.e., sounds having short wavelengths; see Chap. 4 on definitions of frequency and wavelength) propagate over shorter distances than low-frequency sounds (i.e., sounds having long wavelengths). Environmental and structural factors such as substrate composition; terrain profile; obstacles along the path; amount of vegetative cover; wind speed and direction; vertical gradients (i.e., increases or decreases) in wind speed, air temperature, and humidity; air turbulence; and, to a small degree, altitude (i.e., atmospheric pressure) affect sound propagation in air (Fig. 5.3). The propagation paths, along which sounds travel, are rarely straight lines, but rather bend (i.e., refract or diffract), reflect, and scatter. The same sound traveling along different propagation paths may interfere with itself constructively or destructively. The received sound is a weaker and often distorted version of the sent sound (Wahlberg and Larsen 2017).

This section explains the basic concepts of sound propagation in air and provides some insights into environmental effects on propagation. Some environmental factors (e.g., air temperature, wind speed and direction, and humidity) vary throughout the day and among seasons, and so sound propagation can be quite variable. Sound propagation models exist and can be used to predict the distance over which sounds travel, create noise maps, estimate changes to the acoustic (e.g., spectral) features of received sounds, and identify factors that could hinder or enhance animal communication (see Lohr et al. 2003; Jensen et al. 2008). Bioacousticians should consider the characteristics of sound propagation, which could explain variability in the receiver's behavioral response or the effectiveness of acoustic communication.

5.2.1 Ray Traces

Sound propagation is accurately described by the acoustic wave equation. This is a four-dimensional (4-d: three spatial coordinates and time) differential equation of the second order. For an "easy" derivation of the acoustic wave equation, see Larsen and Radford (2018). However, in the simplest situation of symmetric geometry (i.e., omnidirectional signal in a homogeneous medium with no reverberation), the equation can be simplified and described by one variable: the range to the source (Wahlberg and Larsen 2017). Even then, solving the wave
Fig. 5.4 (a) Sketch of a rooster sitting on a branch. When the bird crows, sound is emitted in all directions (marked by a few example black arrows). The green concentric circles represent the wavefronts of the outgoing sound at times t1 to t4. The wave rays are perpendicular to the wavefronts and point in the direction of sound propagation. (b) Illustration of Huygens' principle. Each point on the wavefront at time t4 can be considered itself a (secondary) source; nine example points are marked by suns. The wavefronts of the secondary sources (shown as black circle segments) superpose to yield the new primary wavefront, drawn at time t4
equation under the various and variable conditions encountered in common sound propagation scenarios is quite a task. Fortunately, there are much simpler, conceptual principles of sound propagation, which can yield satisfactory results. One such concept is ray propagation or ray tracing.
Let us consider an omnidirectional source, which emits sound equally in all directions. An example is the crowing rooster in Fig. 5.4a (although it is only omnidirectional at the lower frequencies of its crow and it might not typically crow while roosting, but for the sake of science...; Larsen and Dabelsteen 1990). Wave rays point in the direction of sound propagation and are perpendicular to the wavefronts of the propagating sound. The wavefronts are spheres in 3D space (circles in 2D). Huygens' principle (named after Christiaan Huygens, a Dutch physicist) states that every point on a wavefront can be considered a source of a new (secondary) wave. And all of the secondary wavefronts superpose to build the next (in time) primary wavefront.
The wavefront at time t3 in Fig. 5.4a is also shown in Fig. 5.4b. Nine example points on this wavefront are "randomly" illustrated (as small suns). These each create their own set of concentric wavefronts, drawn at time t4. The secondary waves cancel out in some places but at the farthest range from the rooster in the center, the secondary wavefronts line up to yield the new primary wavefront at time t4.
As the expanding wavefront encounters features of the environment (e.g., vegetation or gradients in sound speed), its shape changes and the directions of the wave rays change. The laws of physics and principles of sound propagation can be applied to trace the propagation paths. This is called ray tracing. For an easy introduction to ray tracing, see Heller (2013). Wahlberg and Larsen (2017) suggested visualizing a ray as a "small acoustic particle travelling along a narrow beam or ray in discrete steps and bouncing-off or being refracted through surfaces." This type of sound field visualization, first introduced in 1967 (Krokstad et al. 2015), has been used extensively in linear acoustics to model phenomena in outdoor sound propagation with the computational tools now available with computers (Attenborough et al. 1995).
An example of ray tracing is shown in Fig. 5.5. The omnidirectional source is located in the lower left corner, 5 m above ground at range 0, and it emits a 10-Hz tone. The wave rays are shown and follow the sound propagation paths. Sound that is initially emitted in an upwards direction bends
Fig. 5.5 Top: Ray traces modeling the propagation of an airborne 10-Hz tone from a point source located 5 m off the ground (lower left corner). The model suggests that sound is bent downwards (downward refraction, typical for nighttime) where it bounces off the ground several times depending on the initial direction from the source. Note the scales: These effects occur at distances much longer than typical animal sound communication distances, which normally are up to only a few hundred meters. Bottom: Contour plot of propagation loss, PL (i.e., attenuation) of the 10-Hz sound. Modified from Attenborough et al. (1995). © Acoustical Society of America, 1995. All rights reserved
downward at a certain altitude (depending on its initial angle of emission). This is typical for nighttime sound propagation. Once rays hit the ground, they are reflected upwards again. The sound field (i.e., the received level at every location in space) is computed by summing sound pressure over all rays. Regions where rays travel close together have high received levels (little propagation loss) and regions that only a few rays enter have low received levels (high propagation loss).
For example, Ottemöller and Evers (2008) used ray tracing to describe the sound propagation of a massive vapor cloud explosion at Buncefield fuel depot near Hemel Hempstead, UK, on the morning of 11 December 2005. The storage tank overflowed and released over
300 tons of fuel. An explosion was triggered after a vapor cloud formed and spread over a very large area (80,000 m² or about 20 acres) before igniting. The explosion was huge, caused extensive damage, injured 43 people, and was detected by seismograph stations in the UK and the Netherlands. The data provided significant information on the ray trajectories of this explosion.
5.2.2 Geometrical Sound Spreading
Sound from an omnidirectional source in the free-field spreads out evenly in a spherical pattern (i.e., equally in all directions). The free-field is homogeneous (i.e., has no temperature or humidity gradients) and unimpeded by buildings or vegetation. At any receiver location in space, only a small proportion of the emitted sound arrives, and so the received sound is attenuated compared to the sound energy emitted at the source. The total attenuation or loss of sound energy from the source to a receiver is known as propagation loss (PL; formerly transmission loss). The sound pressure level at the source (defined as 1 m from a point source; see Chap. 4) is called the source level (SL), whereas the sound pressure level at the receiver at a distance (i.e., range r) from the source is called the received level (RL). The relation between these two levels is given by Eq. 5.1:
$$RL = SL - PL \tag{5.1}$$
Propagation loss in the free-field is termed spherical spreading loss, which can be computed as $PL_{sph} = 20 \log_{10}(r)$ (for derivation of this expression, see Wahlberg and Larsen 2017). It is independent of signal frequency and only depends on the geometry of the source and sound field. So, Eq. 5.1 may be reformulated:
$$RL = SL - 20 \log_{10}(r) \tag{5.2}$$
As a first approximation, spherical spreading is a good model for the propagation of terrestrial animal sounds produced in large open-air regions, such as grassland. Generally, if a bird sings on the ground up to about 10 m from a microphone, only spherical spreading needs to be considered. If the receiver is at a greater distance from the bird, then ground and atmospheric effects also must be considered. If the bird is flying overhead, then spherical spreading and atmospheric effects need to be considered when determining propagation characteristics.
If other sources of attenuation are negligible, then Eq. 5.2 can be used to calculate the source levels of a vocalizing animal located at distance r from the receiver. For instance, if a bioacoustician measured RL = 65 dB re 20 μPa at a distance of 10 m from a singing bird, then SL (at 1 m from the bird) becomes 65 dB re 20 μPa + $20 \log_{10}(10)$ dB re 1 m = 85 dB re 20 μPa m (e.g., Dabelsteen 1981). Similarly, if somebody played back a sound at a known source level of 85 dB re 20 μPa m, then the predicted RL at 1 km (= $10^3$ m) range would be 25 dB re 20 μPa, as $20 \log_{10}(10^3) = 60$.
In some environments, and for some sources (i.e., line sources rather than point sources), airborne sound propagation can be better described as cylindrical spreading. For an infinitely long line source, the propagation loss as a function of range becomes $PL_{cyl} = 10 \log_{10}(r)$ and so Eq. 5.1 becomes:
$$RL = SL - 10 \log_{10}(r) \tag{5.3}$$
Most biological line sources, however, are finite, such as a row of vocalizing birds on a power line. (Please be aware that this example is not a line source in the strict acoustic sense.) This means that geometrical spreading loss is somewhere between that of spherical and cylindrical spreading loss (Fig. 5.6). When the receiver distance from the finite line source is much less than the length of the finite line source, then the attenuation is close to that of an infinite line source (i.e., $10 \log_{10}(r)$), whereas at distances comparable to or larger than the length of the finite line source, the latter acts more like a point source and attenuation develops as $20 \log_{10}(r)$. At sufficiently long distances, all sources can be regarded as point sources.
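The worked numbers for Eqs. 5.1 to 5.3 can be reproduced with a few lines of Python. This is only a minimal sketch: the function names and the `model` argument are illustrative choices, not part of the chapter, and it assumes that geometric spreading is the only source of loss (no absorption, ground effect, or refraction).

```python
import math

# Spreading factors from Eq. 5.2 (spherical) and Eq. 5.3 (cylindrical).
# The dictionary and function names are hypothetical, chosen for this sketch.
SPREADING_FACTOR = {"spherical": 20.0, "cylindrical": 10.0}


def spreading_loss_db(r_m, model="spherical"):
    """Geometric propagation loss PL in dB re 1 m at range r_m (in metres)."""
    if r_m <= 0:
        raise ValueError("range must be positive")
    return SPREADING_FACTOR[model] * math.log10(r_m)


def received_level(sl_db, r_m, model="spherical"):
    """RL = SL - PL (Eq. 5.1) with geometric spreading only."""
    return sl_db - spreading_loss_db(r_m, model)


def source_level(rl_db, r_m, model="spherical"):
    """Back-calculate SL (referred to 1 m) from an RL measured at range r_m."""
    return rl_db + spreading_loss_db(r_m, model)


# The singing-bird example from the text: RL of 65 dB re 20 uPa at 10 m.
print(source_level(65.0, 10.0))
# The playback example: SL of 85 dB re 20 uPa m, receiver at 1 km.
print(received_level(85.0, 1000.0))
# The same playback if spreading were cylindrical instead.
print(received_level(85.0, 1000.0, model="cylindrical"))
```

Comparing the last two calls shows how strongly the choice of spreading model affects the predicted level at long range, which is why the finite-line-source case needs care.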
animals takes place and at frequencies below 10 kHz, the role of absorption in overall propagation loss is likely insignificant compared to other environmental factors. Garcia et al. (2012), for example, described the 40-Hz wing beat signals of drumming ruffed grouse (Bonasa umbellus). Theoretically, these sound signals would be reduced by 6 dB due to air absorption at a distance of 187 km from the drumming bird, whereas spherical spreading loss alone would have reduced the signal amplitudes to a level far below auditory threshold of most animals at a distance of 1 km already ($PL_{sph} = 60$ dB re 1 m).
5.2.4 Reflection, Scattering, and Diffraction
A second and less predictable component of EA is the attenuation caused by reflection, scattering, and diffraction. As a sound wave hits a hard surface, it is reflected. Reflection can be explained with Huygens' principle. In Fig. 5.8a, the rooster from Fig. 5.4a is very far away such that the wavefronts at any location appear planar (rather than circular) and the wave rays are parallel (rather than radial). Three incident rays are drawn, hitting the surface (e.g., a road) at times t1, t2, and t3. By Huygens' principle, each point on the road that is hit acts as the source of a secondary wave. Two secondary wavefronts are shown at time t3. From the time t1, when the first ray hits, to the time t3, the first wavefront has expanded quite a bit. The second wavefront was started at time t2, when the second ray hit, and has expanded less by time t3. The third ray is just starting its secondary wave at time t3, with its secondary wavefront not yet visible. The tangent to the secondary wavefronts at time t3 gives the new wavefront of the reflected wave. The angle of incidence (measured from the normal) is equal to the angle of reflection (also measured from the normal). This is referred to as the law of reflection. It applies to the so-called specular reflection (as from a mirror).
Reflection is not always specular but might instead be diffuse. In diffuse reflection, sound is scattered from the surface in all sorts of directions including the specular direction (Fig. 5.8b). This happens when the surface is not smooth but rough. Scattering depends on the ratio of the wavelength of sound to the size of the scatterer. When the sound wavelength is long (i.e., frequency is low) relative to the roughness of the surface, all the sound energy is reflected in the specular direction. When the wavelength is short (i.e., frequency is high) and less than the magnitude of the unevenness of the surface, then sound is scattered in other, non-specular directions. A gravel road, for instance, produces specular
Fig. 5.8 (a) Sketch of specular reflection of a plane wave (originating from a far-away rooster) off a hard surface. Wave fronts are shown as green lines; they are perpendicular to the wave rays, shown as black arrows. The three incident rays hit at times t1–t3 at the locations marked by small suns. Each of these points creates a secondary wave by Huygens' principle. The secondary wavefronts superpose to yield the new wavefront of the reflected wave, shown at time t3, when the third ray just hits, the second ray has started to grow a secondary wavefront, and the first ray has grown the largest wavefront. The angles of incidence θi are equal to the angles of reflection θr. (b) Sketch of diffuse reflection off a rough surface where the unevenness is great compared to the wavelength of incident sound. While there is a reflected ray in the specular direction, too (indicated by a blue arrow), there are many other directions in which the incident sound is scattered (indicated by red arrows)
reflection at frequencies below 15–20 kHz, but at higher frequencies, where the gravel roughness is large relative to the wavelength, sound is scattered in different directions (Michelsen and Larsen 1983).
Reverberation is a result of multiple reflections and refers to the phenomenon of sound persisting even if the source is turned off. In canyons, caves, or other enclosures, sound bounces off the boundaries again and again. The reverberant sound field is the space that is dominated by reflected sound (as opposed to the field near the source where the direct sound dominates). Once the source is switched off, the reverberant field will continue to exist for some time, yet decay due to absorption by the medium, boundaries (e.g., the walls of a music room), and absorbers in the room (e.g., furniture and people). The more reflective the boundaries, the greater the reverberation.
Reverberation severely alters the structure of the received sound and is one of the least wanted effects in analysis of recorded animal sounds (Fig. 5.9). This type of signal degradation with propagation distance can be quantified by measuring the blur-ratio (see e.g., Dabelsteen et al. 1993). The received sound appears longer in duration than the emitted sound, with the delayed echoes forming a resulting "tail." This reverberation tail can be quantified as the tail-to-signal ratio (Holland et al. 2001). Consequently, leading edges of sound segments are relatively well-preserved, whereas ending edges are lost in reverberant environments.
Diffraction occurs when a sound wave is partially obstructed. In Fig. 5.10a, a plane wave (perhaps again from a far-away rooster) hits a wall with an opening in the center. The rays that hit the wall are reflected (not drawn). The rays that hit the opening pass straight through. By Huygens' principle, each point of the opening acts as a source of secondary waves. As the secondary wavefronts expand, they superpose to form new wavefronts that appear to bend behind the wall. This is termed diffraction. It also occurs when the obstruction is finite (Fig. 5.10b).
If the object that is in the path of a propagating sound wave becomes much smaller than a wall (e.g., a bush or maybe just an insect in the air), to the point where the wavelength is much greater (at least by a factor 10) than the size of the object, then the sound wave "ignores" the object and propagates without obstruction. The sound effectively cannot "see" the object; it is too small. In laboratory experiments, bioacousticians should therefore make sure that objects in the sound path from loudspeaker to experimental animal are at least 10 times smaller than the wavelength of the stimulus sound (Larsen 1995). When the
wavelength is of the same order of magnitude as the object, or somewhat greater, then diffractive scattering occurs (Bradbury and Vehrencamp 2011). As the name suggests, this is a combination of diffraction and scattering, whereby some sound bends around the object and some sound scatters in all directions, leading to a complicated sound field.
Different surfaces or materials exhibit different degrees of sound reflection, absorption, and transmission. A hard, compact, smooth surface (such as a paved road, ice sheet, cave wall, canyon, subterranean tunnel, burrow wall, or wall of a captive animal's exhibit) reflects more and absorbs less acoustic energy than a porous, soft surface (such as tree leaves, grassy pastures, or forest canopy). Whether a surface or object is considered rough or smooth and hard or soft depends on the wavelength of the sound. In a mixed deciduous forest, reverberations for frequencies above 4 kHz are stronger with leaves on the trees than without leaves (Wiley and
Fig. 5.10 (a) Sketch of diffraction as a sound wave passes through an aperture. Wave rays are indicated by black arrows; wavefronts are indicated by green lines. As the plane wave from a distant rooster hits a wall, each point in the opening acts as a source (indicated by suns) of secondary waves. The secondary waves combine to create the new wavefronts shown at three successive instances in time. The wavefronts appear to bend behind the aperture. (b) Sketch of diffraction as a sound wave passes by a finite obstruction
Richards 1982). Reverberations essentially are absent in an open field on a calm day.
5.2.5 Ground Effect
Another component of EA is the so-called ground effect, which is always present in terrestrial sound propagation. The sound signal from a sender (S) located at some height above ground (e.g., a bird at 4 m) will reach a receiver (R; e.g., a recordist's microphone at 1.5 m) first by the direct path (PD) and a moment later by the indirect longer path when the signal has been reflected from the ground (PG) (Fig. 5.11a). This results in a range-dependent interference pattern between the sound propagating along PD and PG. The interference pattern has regions of enhanced received level (due to constructive interference) and of attenuated received level (due to destructive interference) at the position of R (Fig. 5.11b). The received sound signal is a distorted version of the emitted signal. It is said to be comb-filtered, as the destructive interference creates the "comb teeth" attenuating some frequencies in the signal, whereas the constructive interference enhances other frequencies of the signal. The magnitude of the ground effect depends on sound frequency, and on geometry of the sender-receiver separation distance and height above ground, on the roughness and softness of the ground, and on atmospheric pressure, ambient temperature, relative
Fig. 5.11 Predicted ground effect. (a) Sender 4 m above ground, Receiver 1.5 m above ground, horizontal separation distance 50 m (not to scale). The direct wave PD and the reflected wave PG superpose at R. (b) For frequencies whose wavelengths are in phase, superposition results in level enhancement up to 6 dB; at frequencies with wavelengths out of phase at R, levels are attenuated up to 20–30 dB. Black curve: The curve represents the predicted decibel values that need to be added to the geometric attenuation loss. The ground was modeled as a grass-covered field (flow resistivity 100 kPa s m⁻², porosity 30%, layer depth 0.01 m). Red curve: As in the black curve, but more realistic air absorption (at 20 °C, 75% relative humidity, standard atmospheric pressure) and moderate turbulence (mean-squared refractive index of 10⁻⁵) were added. Effects of temperature and wind-induced refraction were excluded in the model, which was developed by Keith Attenborough and Shahram Taherzadeh and improved by Kenneth Kragh Jensen
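The comb-filter geometry of Fig. 5.11a can be explored with a simple two-path sketch in Python. This is not the impedance-ground model used for Fig. 5.11b: it assumes, for illustration only, a flat and perfectly reflecting (acoustically hard) ground, spherical spreading along each path, a sound speed of 344 m/s, and the image-source construction for the reflected path. All function names are hypothetical.

```python
import cmath
import math


def path_lengths(h_s, h_r, d):
    """Direct (PD) and ground-reflected (PG) path lengths in metres.

    h_s, h_r: sender and receiver heights; d: horizontal separation.
    The reflected path is found by mirroring the sender in the ground.
    """
    direct = math.hypot(d, h_s - h_r)
    reflected = math.hypot(d, h_s + h_r)
    return direct, reflected


def notch_frequencies(h_s, h_r, d, c=344.0, n_notches=3):
    """Frequencies (Hz) where PD and PG arrive half a wavelength apart.

    Assumes a perfectly hard ground (no phase change on reflection).
    """
    direct, reflected = path_lengths(h_s, h_r, d)
    delta = reflected - direct
    return [(2 * n + 1) * c / (2.0 * delta) for n in range(n_notches)]


def excess_level_db(f, h_s, h_r, d, c=344.0):
    """Level of direct plus reflected wave relative to the direct wave alone."""
    direct, reflected = path_lengths(h_s, h_r, d)
    k = 2.0 * math.pi * f / c  # wavenumber
    total = 1.0 + (direct / reflected) * cmath.exp(1j * k * (reflected - direct))
    return 20.0 * math.log10(abs(total))


# Geometry of Fig. 5.11a: sender at 4 m, receiver at 1.5 m, 50 m apart.
for f_notch in notch_frequencies(4.0, 1.5, 50.0):
    print(round(f_notch), "Hz:", round(excess_level_db(f_notch, 4.0, 1.5, 50.0), 1), "dB")
```

Under the hard-ground assumption the notches are much deeper than the 20–30 dB in the caption, and the enhancement approaches 6 dB at low frequencies; a soft, porous ground shifts and softens this pattern.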
humidity, and turbulence (see Attenborough et al. 2007). Acoustically hard ground surfaces (such as rock or consolidated sand) produce comb-filter effects over a wide frequency range extending to relatively high frequencies, whereas acoustically soft surfaces (such as grasslands, forest floors, or unpacked snow) mainly generate the ground effect at low frequencies. Recordists may reduce the ground effect by placing microphones as high as practically possible above soft ground. For a general introduction to the phenomenon, see Michelsen and Larsen (1983) or Wahlberg and Larsen (2017). For a comparison between ground effect models and outdoor recordings, see Jensen et al. (2008).
5.2.6 Attenuation by Vegetative Cover
Absorption of sound by vegetation is a component of EA that can further dissipate airborne sounds over distance as acoustic energy is converted to heat in the plant material by viscous friction. The absorption of sound in vegetation depends on the material composition and hardness of the surfaces including the soft ground often found especially in woodland. Leaves absorb more sound energy than a tree trunk, whereas a tree trunk reflects more sound than leaves do. All of this is frequency-dependent.
This component of EA obeys no simple rules and needs to be measured by propagation experiments in the field (e.g., Dabelsteen et al. 1993). Aylor (1972a, b) measured sound propagation loss through various crops, bushes, and trees by broadcasting from a loudspeaker and recording at some distance with a microphone. He found foliage enhanced absorption and scattering. Price et al. (1988) modeled and measured attenuation by vegetation in different forest environments and documented scattering from tree trunks, enhanced ground effect in the presence of mature forest litter, and attenuation by foliage. Foliage attenuation had the greatest effect above 1 kHz and increased almost linearly with the logarithm of frequency. Through mixed coniferous forest, for instance, the attenuation over 24 m varied from about 5 dB at 2 kHz to 10 dB at 4 kHz, which is the range of dominant frequencies in many songbird songs. This foliage attenuation is less than, but needs to be added to, the 28-dB attenuation caused by spherical spreading over the same distance (Eq. 5.2).
Some research on sound propagation through vegetation was motivated by a desire to attenuate anthropogenic noise such as road noise, but generally and most surprisingly dense foliage only accounts for a small amount of attenuation. Martínez-Sala et al. (2006) concluded that a 15-m wide patch of regularly spaced trees could attenuate car noise by at least 6 dB. The effect was similar for more traditional noise barriers. Defrance et al. (2002), for instance, found that a 100-m wide forest strip was effective at providing an acoustical barrier to noise, such as shown in Fig. 5.12, where octave-band sound was broadcast through dense foliage and recorded at different distances in the forest.
At present, vegetation attenuation is not well understood. A much larger database is needed before it is possible to accurately predict the effect of different kinds of vegetation on sound propagation (see Attenborough et al. 2007).
5.2.7 Speed of Sound in Still Air
The speed of sound in still air is affected only by the ambient air temperature and, to a minimal extent, air pressure (or altitude). If the sound propagates under windy conditions, however, the effective speed of sound will be modified by the wind velocity such that the wind velocity of a tailwind will add to the speed of sound and the wind velocity of a headwind will subtract from the speed of sound.
The speed of sound determines the arrival time of a signal from the sender to the receiver and bends a propagating sound wave away from higher air temperature and towards lower air temperature (or from higher wind velocity towards lower wind velocity). The speed of sound in air at 21 °C is 344 m/s. At freezing point, 0 °C, the speed of sound in air is 331 m/s. A good approximation is given by Eq. 5.6:
$$c = (331.45 + 0.607\,T_c)\ \mathrm{m/s} \tag{5.6}$$
where $T_c$ is the air temperature in °C.
5.2.8 Refraction by Air Temperature Gradients in Still Air
Refraction is the change of the direction of sound propagation due to changes in the speed of sound. In the example of Fig. 5.13a, a plane wave in medium 1 hits an interface with medium 2. Some of the acoustic energy might be reflected (as in Fig. 5.8a, not drawn in Fig. 5.13a), and some of the energy is transmitted. The transmitted wave is refracted, because the speeds of sound differ in the two media. If c1 > c2, then the transmitted wave bends towards the normal (i.e., away from the interface; Fig. 5.13a); if c1 < c2, then the transmitted wave bends away from the normal (i.e., towards the interface; Fig. 5.13b). The angles of incidence and refraction (transmission) are related via Snell's law (named after Dutch astronomer and mathematician Willebrord Snell):
$$\frac{\sin\theta_i}{c_1} = \frac{\sin\theta_t}{c_2} \tag{5.7}$$
Note that, while the frequency of the sound does not change during transmission, the wavelength does change. With $c = \lambda f$ (see Chap. 4, section on the speed of sound), the wavelength is smaller in the medium with lower sound speed.
Refraction of sound waves in air is a common phenomenon due to vertical gradients of air temperature and/or wind velocity. A gradual change in sound speed is illustrated in Fig. 5.13b, where the rays bend more and more upwards as the sound speed increases. In terrestrial environments, the sound source is typically located close to the ground. A sound speed profile that has the speed of sound increase with altitude is downward refracting, while a sound speed profile that has the speed of sound decrease with altitude is upward refracting. Bent propagation paths have the effect that sound appears to arrive from a non-intuitive (i.e., not straight-line) direction. This phenomenon is like an acoustic mirage in analogy to optical mirages, which produce displaced images of far-away objects and which are also caused by refraction (of light).
The EA from refraction may be positive or negative, and so RL may be smaller or greater
Fig. 5.13 (a) Sketch of refraction at a boundary between medium 1 (high sound speed) and medium 2 (low sound speed). Three rays (black arrows) are shown, hitting the interface at times t1–t3. Each gives rise to secondary waves (by Huygens' principle) starting at the points marked with small suns. At time t3, the third ray just meets the interface, the second ray has produced a small secondary wave, and the first ray's secondary wave has grown quite a bit. Drawing the tangent to the secondary waves at time t3 yields the new wavefront (green line) in the second medium. With rays, by definition, being perpendicular to the wavefronts, it can be seen that the rays bend towards the normal in the second medium (θt < θi). Successive wavefronts are drawn to show that they are spaced farther apart in the medium with higher sound speed, and so the wavelength λ is greater in the medium with higher sound speed. (b) Sketch of gradual refraction by a vertical gradient in sound speed. In the illustrated example, c1 < c2 < c3 < c4 < c5
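The refraction geometry of Fig. 5.13a can be checked numerically with Snell's law in its standard form, sin(θi)/c1 = sin(θt)/c2, with both angles measured from the normal. The short Python sketch below is illustrative only: the function name is hypothetical, and the example sound speeds of 351 m/s and 347 m/s are simply plausible values for warm air near the ground and slightly cooler air above it.

```python
import math


def refraction_angle_deg(theta_i_deg, c1, c2):
    """Angle of the transmitted ray (degrees from the normal) by Snell's law.

    theta_i_deg: angle of incidence in medium 1, measured from the normal.
    c1, c2: sound speeds in medium 1 and medium 2 (same units).
    Returns None when no transmitted ray exists (total reflection).
    """
    sin_t = math.sin(math.radians(theta_i_deg)) * c2 / c1
    if sin_t > 1.0:
        return None
    return math.degrees(math.asin(sin_t))


# Passing into slower air (c2 < c1): the ray bends towards the normal.
print(refraction_angle_deg(60.0, 351.0, 347.0))
# Passing into faster air (c2 > c1): the ray bends away from the normal.
print(refraction_angle_deg(60.0, 347.0, 351.0))
# Near-grazing incidence into faster air: no transmitted ray.
print(refraction_angle_deg(89.0, 347.0, 351.0))
```

Because sound speed differences in air are small, the bending at any single layer is slight; the large effects on outdoor propagation come from many such small changes accumulating along a gradient.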
than predicted without a refracting atmosphere. Air temperature varies throughout the day and creates varying temperature gradients. So, recording at the same location at a different time of day can produce different results. Therefore, taking periodic measurements of the ambient temperature at different heights above the ground can provide the researcher with a notion of whether sound propagation is changing and at what pace.
In still air during daytime, the air is both warmer and more humid close to the ground and a stable air temperature gradient can be established with warmer air near the ground, because of sunlight heating the ground, which warms up much faster than the overlying air. At higher elevations, the air temperature decreases by 0.01 °C/m (Fig. 5.14a). Sound waves consequently bend away from locations near the ground where the temperature is higher and upwards towards locations with lower temperatures (Fig. 5.14b). Horizontal rays will be directed upwards as will downwards directed rays after bouncing from the ground. Therefore, a certain limiting ray exists that defines a shadow zone around the sound source, where the sound level decreases much faster than predicted from distance alone (Fig. 5.14b). While the shadow zone cannot be reached by a direct path, it may be ensonified by reflection off houses (or other reflectors) in the vicinity and by paths passing through turbulence, and the shadow zone is thus not totally quiet.
For example, on a sunny day with little wind, the air temperature can be 30 °C at the ground (c = 351 m/s), but at 2–3 m above ground, the temperature may be only 25 °C (c = 347 m/s). This decrease continues up through the atmosphere by 1 °C/100 m, the so-called temperature lapse. With such an air temperature gradient, the sound rays from a sound source located a few meters above ground will bend upwards, because part of the wave closest to the warmer ground will travel the fastest. In a carefully conducted experiment, a combination of upward refraction, strong upwind propagation, and air absorption was measured to reduce the level of propagating sound at a distance of 640 m by up to 20 dB more than predicted from Eq. 5.2 (Attenborough
Fig. 5.14 Sketch of the effects of upward refracting sound speed gradients on outdoor sound propagation. (a) Temperature profile: Air temperature and consequently sound speed increases towards the ground in still air. (b) Ray traces: Sounds from a source (filled circle, here 5 m above ground) are refracted upwards, creating a circular shadow zone close to the ground around the source. Dashed line indicates a sound ray bouncing off the ground. (c) Wind velocity profile: Similar upward refraction is created upwind. Arrows indicate wind direction towards the source ("headwind") and their length wind speed. Reprinted by permission from Springer Nature. Acoustic Conditions Affecting Sound Communication in Air and Underwater, Larsen and Radford (2018), Fig. 5.5.4. In: H Slabbekoorn, RJ Dooling, AN Popper and RR Fay (eds). Effects of Anthropogenic Noise on Animals, Springer Handbook of Acoustic Research 66, Springer Science and Business Media, LLC, part of Springer Nature: New York, Heidelberg, Dordrecht, London. pp. 109–144. https://doi.org/10.1007/978-1-4939-8574-6_5. © Springer Nature, 2018. All rights reserved
2007). Perhaps for this reason, birds do not commonly sing in open environments near the ground on sunny days. Rather, they sing in flight well above ground, or from a perch (Wiley 2009).
On calm nights, the opposite air temperature gradient can occur close to ground (called temperature inversion) as it cools faster than the overlying air. Air temperatures increase up to 50–100 m above ground before decreasing again with altitude. Therefore, sound rays bend downwards and hit the ground (Fig. 5.15). A temperature inversion favors long-distance sound propagation as it leads to higher received levels than predicted by spherical spreading. For this reason, nocturnal communication distances of low-frequency African savanna elephant (Loxodonta africana) sound doubled on the savanna to as much as 10 km (Garstang et al. 1995). In these conditions, sound energy is channeled, making spreading losses effectively cylindrical, rather than spherical, within the surface layer. Garstang (2010) suggested that a loud infrasonic elephant call during the middle of the day would travel no more than 1 km (i.e., be heard over an area of 3 km²), but an elephant call at night might be heard over an area of 300 km² (see also, Garstang et al. 1995; Larom et al. 1997). Elephants might adjust timing and abundance of their low-frequency calls and apply them specifically for long-distance communication according to atmospheric conditions.
An air temperature gradient can arise in other locations than just close to ground. Geiger (1965) found the air in and above the forest canopy beginning to warm immediately after sunrise, whereas the air below the canopy was slower to respond. This creates a bilinear sound speed profile with an upward refracting gradient above the canopy and a downward refracting gradient below the canopy. So, for a short period after sunrise, vocalizing birds and, for instance, howler monkeys (Alouatta sp.) located below the canopy can increase the range of their vocalizations relative to later in the day (Wiley and Richards 1978; Wiley 2009).
Fig. 5.15 Sketch of the effects of downward refracting sound speed gradients on outdoor sound propagation. (a) Temperature profile: On calm nights, air temperature and consequently sound speed may increase with height above ground until temperature lapse starts. (b) Ray traces: Sounds from a source (filled circle, here 5–10 m above ground) are refracted downwards, creating higher sound levels with distance than predicted from spherical spreading. (c) Wind velocity profile: Similar downward refraction with increased sound levels may be created downwind. Arrows indicate wind direction away from the source ("tailwind") and their length wind speed. Reprinted by permission from Springer Nature. Acoustic Conditions Affecting Sound Communication in Air and Underwater, Larsen and Radford (2018), Fig. 5.5.5. In: H Slabbekoorn, RJ Dooling, AN Popper and RR Fay (eds). Effects of Anthropogenic Noise on Animals, Springer Handbook of Acoustic Research 66, Springer Science and Business Media, LLC, part of Springer Nature: New York, Heidelberg, Dordrecht, London. pp. 109–144. https://doi.org/10.1007/978-1-4939-8574-6_5. © Springer Nature, 2018. All rights reserved
5.2.9 Refraction by Gradients of Wind Velocity

Strong air temperature gradients cannot exist during strong wind conditions, so the effects of wind velocity on sound propagation in open environments are more influential than air temperature gradients (Attenborough 2007). Wind may cause a shift in sound direction such that the apparent direction of the sound source differs from its actual direction (acoustic mirage). Wind velocity gradients can enhance or impede sound propagation, leading to negative or positive EA, respectively. The actual speed of sound is the sum of the air temperature-generated speed of sound and the net wind velocity.

Attenborough et al. (2007) reported the general relationship between the sound speed profile c(z), the air temperature profile T(z), and the wind velocity profile u(z), where z is the height above ground, when the wind blows in the direction of sound propagation (when the wind blows against propagation, u(z) is subtracted):

\[
c(z) = c(0)\,\sqrt{\frac{T(z) + 273.15}{273.15}} + u(z) \tag{5.8}
\]

Wind velocity is lowest at the ground and increases with altitude (Figs. 5.14c, 5.15c). Sound traveling upwind refracts upwards and sound traveling downwind refracts downward (Figs. 5.14b, 5.15b). As with temperature gradients, this creates a shadow zone upwind (Fig. 5.14b), where the sound is not heard. Downwind, sounds propagate in a channeled way (Fig. 5.15b) with less loss. Sound attenuates more against the wind than with the wind. Despite this
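Equation 5.8 can be evaluated directly. The following Python sketch assumes that c(0) denotes the speed of sound at 0 °C (taken here as 331.3 m/s, a commonly quoted value that is not given in the text), that T is in °C, and that u is the wind speed component along the propagation path. The two-height profile in the example is invented for illustration.

```python
import math

C0_M_PER_S = 331.3  # assumed speed of sound in air at 0 °C (not from the text)


def effective_sound_speed(temp_c, wind_m_per_s, downwind=True):
    """Eq. 5.8: temperature term plus (downwind) or minus (upwind) wind speed."""
    thermal = C0_M_PER_S * math.sqrt((temp_c + 273.15) / 273.15)
    return thermal + wind_m_per_s if downwind else thermal - wind_m_per_s


def refraction_direction(c_low, c_high):
    """Rays bend towards the layer with the lower effective sound speed."""
    if c_high > c_low:
        return "downward"
    if c_high < c_low:
        return "upward"
    return "none"


if __name__ == "__main__":
    # Hypothetical daytime profile: (height in m, temperature in °C, wind in m/s)
    low = (1.0, 20.0, 2.0)
    high = (10.0, 19.5, 5.0)
    for downwind in (True, False):
        c_low = effective_sound_speed(low[1], low[2], downwind)
        c_high = effective_sound_speed(high[1], high[2], downwind)
        label = "downwind" if downwind else "upwind"
        print(label, round(c_low, 1), round(c_high, 1),
              refraction_direction(c_low, c_high))
```

In this invented profile the wind gradient outweighs the slight temperature lapse, so the sketch reports downward refraction downwind and upward refraction upwind, in line with the qualitative description in this section.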
Fig. 5.16 Noise map showing the received levels 50 cm above ground of a gunshot fired towards east at a location (small red circle in dark blue area upper left corner) close to a lake (lake contour lines indicated by thin black curves) with varied topography. The color coding indicates iso-dB-curves in 5-dB steps. The dark arrow indicates wind direction and its length corresponds to 300 m on the ground. Note how the wind attenuates the gunshot upwind and enhances it downwind. Noise map calculated by DELTA, a part of FORCE Technology, Hørsholm, Denmark, using Nord2000 software (https://eng.mst.dk/air-noise-waste/noise/traffic-noise/nord2000-nordic-noise-prediction-method/; accessed 23 December 2020). Figure donated by Jesper Madsen, Aarhus University
common phenomenon, Wiley (2009) commented that there are no documented cases of animals selectively communicating downwind. But refraction by gradients of wind velocity played a significant role in Civil War battles in the rolling hills of the eastern U.S. There was no radio communication in the nineteenth century, so commanders often depended on what they heard of the battle in front of them to make decisions about troop movements. An acoustic shadow zone existed during the Battle of Gettysburg and commanders could not hear the sounds of battle just 10 miles away, whereas people 150 miles away in Pittsburgh clearly heard the skirmish (Ross 2000).

Sound maps portray the attenuation of sound over distance from a source. The maps take a bird’s-eye view, showing attenuation in 360° about a sound source. Such maps can be produced at a specific receiver altitude, or commonly show maximum received levels over a range of altitudes with the intent of yielding “conservative” estimates of received level. The attenuation pattern radiating from the sound source is typically irregular in shape (rather than concentric) and helps identify environmental conditions that impede or promote sound propagation. Sound mapping tools can commonly utilize data on topography and ground absorption, air temperature, and wind direction and speed. The example in Fig. 5.16 shows how wind attenuated noise from a gunshot upwind but enhanced received levels downwind.

5.2.10 Attenuation from Air Turbulence

Turbulence refers to unsteady and irregular motion of the air. It is very difficult to model and predict. It may be mechanically or thermally induced. Mechanical turbulence is caused by friction, for example, when air moves over rough ground or past obstacles such as houses and trees. Friction causes eddies and thus turbulence.
This turbulence is stronger in higher wind speeds and rougher terrain. Turbulence is particularly great during fall winds, which shoot down the slope of a mountain. Thermal turbulence is created when the sun heats the ground unevenly. For example, bare ground warms up faster than fields with vegetative cover or bodies of water. Convective air currents are established with warm and less dense air rising and cold and denser air sinking. These currents, in turn, may generate eddies. Eddies may extend from the ground to a few hundred meters height. They can be of various sizes (height and diameter) and larger eddies may break up into smaller ones. Because of air temperature gradients and wind, air is always in motion and this motion may always generate turbulence.

Turbulence causes EA, which increases with distance from the source, with the level of turbulence, and with sound frequency (see red curve in Fig. 5.11b). EA is typically highest during daytime and on hot sunny days. A characteristic effect of turbulence on sound propagation is that received levels at a fixed location quickly fluctuate with time and, at some range, this fluctuation stabilizes at a standard deviation of about 6 dB (Daigle et al. 1983). Van Staaden and Römer (1997), for instance, reported that at night, the sound pressure level of the song of an African bladder grasshopper (Bullacris intermedia) over open grassland decreased with distance at a rate very close to the 6 dB per doubling of distance expected from spherical spreading. However, during daytime, the attenuation was much larger and more variable due to air turbulence.

For more in-depth reading on outdoor sound propagation, please see Attenborough (2007), Attenborough et al. (2007), Larsen and Wahlberg (2017), Wahlberg and Larsen (2017), or Larsen and Radford (2018).

5.3 The Source-Path-Receiver Model for Animal Acoustic Communication

The SPRM can be used to examine acoustic communication among animals. In the example of Fig. 5.17, two gentoo penguins (Pygoscelis papua) are communicating within their nesting colony in Antarctica. The sender (i.e., the source) emits a penguin display call. The call spreads through the habitat, experiencing various forms of attenuation. The receiver is another gentoo penguin. It might respond acoustically and thus become the next sender. Whether this two-way acoustic communication is successful depends on a number of parameters.

The locations of sender and receiver matter; the closer together they are, the better the communication, most likely. If the source emission pattern is directional rather than omnidirectional (i.e., the call can be emitted in a specific direction), then the orientation of the sender towards the receiver matters. Similarly, if the receiver’s hearing is directional, then the receiver’s orientation affects communication success. A stronger source level will increase the likelihood of successful reception, unless the environment is highly reverberant, in which case the echoes would also be louder and potentially interfere with communication success. The frequency content of the call matters, because different frequencies propagate differently, and the hearing abilities of the receiver are frequency-dependent.

Along the path, some of the call energy is lost due to geometrical spreading and some is absorbed by the air, snow, and soil. The direction of propagation changes due to reflection and scattering off rocks, and due to refraction by sound speed gradients in air. Diffraction around mountains might play a role over longer ranges. Ambient noise in the environment does not affect sound propagation; i.e., it neither leads to attenuation nor changes the direction of propagation.

Ambient noise in the environment affects whether the call is received and correctly interpreted. Ambient noise can be of abiotic, biotic, or anthropogenic origin. Wind causes noise, as do waves and breaking ice. The other penguins in the colony create ambient noise with their own acoustic communications. Human presence (e.g., chatting tourists stomping through the snow towards the penguin colony) might add to the ambient noise. Ambient noise at the location of the receiver lowers the signal-to-noise ratio
Fig. 5.17 Example of the SPRM for animal acoustic communication. The source is a gentoo penguin emitting its display call within its nesting colony in Antarctica. The sound propagation path takes the call through the local habitat. The receiver is another gentoo penguin in a neighboring colony who might respond acoustically, thereby becoming the next source. The parameters that affect successful communication are listed below the source and the receiver. Along the path, the call experiences various propagation effects leading to attenuation. Ambient noise in the habitat stems from waves, wind, and ice (abiotic), other penguins (biotic), and perhaps humans (anthropogenic). Ambient noise at the receiver reduces the signal-to-noise ratio and hence the detectability of the call. Ambient noise at the source may lead to increases in source level and repetition (redundancy) and shifts in spectral content (Lombard effect)
(SNR) at which the call is received. The critical ratios (specific to the receiver’s auditory system; see Chap. 10) dictate below which SNR the call is masked by the ambient noise and thus not detected. At intermediate SNRs, the call might be detected, but not correctly interpreted. Masking-release processes (also specific to the receiver’s auditory system) include comodulation masking release and spatial release from masking (e.g., Erbe et al. 2016) and aid signal detection and interpretation. Ambient noise at the sender may lead to the Lombard effect (Lombard 1911), whereby the sender raises the source level of its call, actively changes the spectral characteristics to move sound energy out of the frequency band most at risk from masking, and repeats the call to increase the likelihood of reception. Finally, ambient noise may instill anti-masking strategies in both sender and receiver whereby they change their location and orientation (both towards each other) to foster communication success.

5.3.1 The Sender

In animal acoustic communication, the signal that is being sent depends on the sender’s species, demographic parameters, behavioral state, and many other factors. Obviously, different taxonomic groups produce different sounds, ranging from infrasonic rumbles of elephants to ultrasonic clicks of bats (see Chap. 8 on classifying animal sounds). But even closely related species may be told apart acoustically. For example, Gerhardt (1991) found that the number of pulses in the advertisement call in male Eastern gray treefrogs (Dryophytes versicolor) and Cope’s gray treefrogs (Dryophytes chrysoscelis) is the major cue distinguishing sympatric males who are similar in size and color. While species-specific calls of bats have been recognized for decades (Balcombe and Fenton 1988; Fenton and Bell 1981; O’Farrell et al. 1999), more recently, acoustic differences have been noted in bat
see Michelsen 1992; Genevois and Bretagnolle 1994; Fletcher 2004, and Larsen and Wahlberg 2017). For instance, Charlton et al. (2011) reported that increased body size in male koalas (Phascolarctos cinereus) was reflected in the closer spacing of vocalization formants. (Formants refer to a concentration of acoustic energy around particular frequencies caused by resonances in the vocal tract.) Stoeger-Horwath et al. (2007) reported age-dependent variations in the grunt and trumpet calls of African savanna elephants. The grunts were only recorded in individuals less than 2 months of age and infants never produced trumpet calls until they were 3 months old. The authors also reported age-dependent variations in the low-frequency rumble; older individuals rumbled at a lower fundamental frequency than younger individuals, and there also was a tendency for rumble duration to increase slightly with age. Weddell seal (Leptonychotes weddellii) pups on rookeries emit high-frequency calls that transition into low-frequency adult calls used exclusively while hauled-out on the ice (Thomas and Kuechle 1982). Reby and McComb (2003) reported that lower-frequency male roars in red deer (Cervus elaphus) stags were associated with greater age and weight, so provided “honest” cues about reproductive condition.

In many species, sex-specific differences in the acoustic repertoires are employed to ensure proper mate selection (Hardouin et al. 2014). The sender’s reproductive state and drive for mating often is represented in its acoustic signals. In songbirds and many orthopteran insects, only males sing (Miller et al. 2007; Riede et al. 2010). Songs are under the influence of reproductive hormones associated with courtship, and songbird songs are long, complex, and repeated in a typical and recognizable sequence of sounds. In species in which males compete acoustically to attract a female mate, a substandard mating call could indicate immaturity, agedness, or poor health of the caller. For example, Hardouin et al. (2007) examined hoots by 17 male scops owls (Otus scops) on the Isle of Oléron, France. Heavier male owls made lower-frequency hoots, which could give them a competitive mating advantage over lighter weight males.

Context further determines acoustic signaling. For example, predators often hunt quietly, and prey remain silent when they are aware of being stalked. A classic case where (prey) moths attempt to jam (predator) bat echolocation signals with a counter signal to confuse the approaching predator has developed another twist. Ter Hofstede and Ratcliffe (2016) found that, “specific predator counter-adaptations include calling at frequencies outside the sensitivity range of most eared prey, changing the pattern and frequency of echolocation calls during prey pursuit, and quiet, or ‘stealth,’ echolocation.” Acoustic interactions between a parent and offspring are often brief and relatively quiet to conceal and protect the young. In contrast, messages with a high reproductive value, such as mating calls or territorial defense calls, and calls with high survival value, such as infant distress calls or adult alarm calls, are produced loudly and repeatedly. To this point, it has been shown that distress calls of three species of pipistrelle bats (Pipistrellus nathusii, P. pipistrellus, and P. pygmaeus) were structurally convergent, “consisting of a series of downward-sweeping, frequency-modulated elements of short duration and high intensity with a relatively strong harmonic content” (Russ et al. 2004). The study suggested that it was not as important to have species-specific signals as it was to have some device that produced a mobbing by bats of the predator regardless of species of bat.

Ambient noise at the location of the sender may also affect signal emission level, repetition, and spectral shifts (collectively called the Lombard effect; Brumm and Zollinger 2011). For instance, male túngara frogs (Engystomops pustulosus) increased the level, repetition, and complexity of their calls when noise overlapped with their normal frequency band of calling but not when noise was higher and non-overlapping in frequency (Halfwerk et al. 2016). Brumm (2004) and Brumm and Todt (2003) noted that birds in a noisy environment called louder and more often, and repositioned themselves, possibly to increase the likelihood of the sound being received. Similarly, greater horseshoe bats (Rhinolophus ferrumequinum) increased their call level and shifted frequency in noisy
environments (Hage et al. 2013). Eliades and Wang (2012) examined the neural processes underlying the Lombard effect in marmoset monkeys (Callithrix jacchus) and found that increased vocal intensity was accompanied by a change in auditory cortex activity toward neural response patterns observed during vocalizations under normal feedback conditions.

Many animal communication calls are close to being omnidirectional, radiating equally in all directions, at least at their lower frequencies (Larsen and Dabelsteen 1990). However, some bird species (e.g., juncos, warblers, and finches) showed an ability to focus their calls in the direction of an owl to warn off the predator. Yorzinski and Patricelli (2009) examined the acoustic directionality of antipredator calls of 10 species of passerines and found that some birds would “call out of the side of their beaks” with their head pointed away from conspecifics in an apparent attempt at ventriloquist behavior. Whether terrestrial animals can actively change the sound emission directivity in response to noise (in order to enhance acoustic communication) needs to be investigated.

5.3.2 The Path and the Acoustic Environment

As the signal leaves the sender and travels through the environment, it is subjected to various forms of attenuation (as detailed above) and so the level at the receiver location is less than the source level. In addition, ambient noise at the receiver location reduces the SNR, making it harder for the receiver to detect the signal. Ambient noise may be classed according to its sources: abiotic, biotic, or anthropogenic. Chapter 7 provides a detailed overview of ambient noise with example spectrograms.

In terms of abiotic ambient noise, wind is a major contributor and its noise level increases with wind speed. In addition, remember that the direction of wind (i.e., upwind or downwind) affects the distance that sounds propagate. Wind drives other types of noise, such as noise from vegetation moving in the wind. Even without wind, there may be noise from branches creaking and breaking in the heat or noise from rustling leaves in the understory as animals walk through. Wind also drives waves; surf noise or noise from breaking waves is typical for coastal areas. Even without wind, moving water, such as waterfalls, can be noisy. Precipitation and storms (e.g., rain, hail, thunder, and lightning) create noise. Geological events such as earthquakes, seismic rumblings, and volcanic eruptions contribute noise to the terrestrial soundscape. In polar regions, melting ice and calving glaciers contribute to ambient noise.

Biotic ambient noise comes from animals in the environment. These can be of the same or different species from the target species. Several taxa call in large numbers at certain times of day and season, significantly raising ambient noise levels (e.g., chorusing cicadas, katydids, or frogs). Biologists typically think of soniferous animals as calling with specialized anatomies for sound production (e.g., syringes in birds and vocal cords in mammals). However, most animals also can produce mechanical sounds using external anatomies, such as wing-stridulation by a locust, abdomen vibration by a spider, beak-pecking by a woodpecker, teeth-chattering by a squirrel, foot-thumping by a rabbit, etc. In addition, animals can produce unintentional sounds, such as noise associated with rustling leaves as an animal walks through a forest, respiration noise, flight noise, feeding sounds, etc., not intended for communication with a conspecific. Example spectrograms for many of these sounds are found in Chap. 7 on soundscapes as well as Chap. 8 on detecting and classifying animal sounds.

Anthropogenic ambient noise is due to aircraft, road traffic, trains, ships, military activities, construction activities, etc. Increasing encroachment of human activities on animal habitats results in increased noise exposure for all taxa of animals (see Chap. 13 on noise impacts).

Ambient noise varies with time on scales of hours, days, lunar phase, season, and year. The reason is a combination of sound propagation effects and source behavior. The time of day and season of year affect sound propagation. As
explained above, sounds can be heard from farther away during the night; for example, a train can be heard in the distance at night, but not during the day. Walking in the woods during the winter, the listener can hear sounds over much greater distances than during the summer with thick vegetation. In many animals, sound-production rates are highest during the breeding season. Chorusing insects, amphibians, and birds precisely time the commencement of their cacophonies to a breeding season each year. Amphibians stop calling when they go into winter hibernation, so chorusing can stop abruptly in late autumn. Some birds migrate, so their songs are missing from the winter soundscape. Many migrating birds are soniferous and their flight calls can temporarily dominate the soundscape as they pass through an area during a spring migration (e.g., a honking flock of migrating geese or a chirping flock of starlings). Yet other species of birds remain in temperate areas over winter and produce sounds all year long (e.g., cardinals, sparrows, and snow juncos). Tropical insects, frogs, and birds can reproduce multiple times per year; they do not migrate or hibernate, and so are soniferous throughout the year. Diurnal cycles exist in many animals, with birds typically calling in the morning, insects in the afternoon, frogs in the evening, and nocturnal animals in the middle of the night.

5.3.3 The Receiver

The same factors that can affect the sender also could affect the receiver’s ability to detect and interpret a signal (i.e., species, population, individual traits, age, sex, context, and ambient noise). On the species level, different species typically hear sound at different frequencies and levels. In other words, audiograms are species-specific (Fig. 5.19). Fortunately, data on hearing abilities of invertebrates, insects, reptiles, amphibians, fish, birds, and mammals continue to accumulate (see Volume 2). Nonetheless, there is some intra-species and individual variability in hearing (see Chap. 10).

In American mink (Neovison vison), for instance, hearing sensitivity and frequency range changed markedly with postnatal age. Pups up to 32 days old were almost deaf, whereas three weeks later, their audiogram started to resemble that of an adult (in shape), but they remained less sensitive than adults, especially below 10 kHz (Brandt et al. 2013). There might be good reasons why hearing in young is immature. For example, a male fruit fly (Drosophila melanogaster) cannot hear the female’s flight tone until he is physically mature enough to mate (Eberl and Kernan 2011). This assures the female fruit fly that any pursuing male is mature. Hearing capabilities further change over an adult’s life. Natural deterioration with age due to anatomical and physiological aging is a process called presbycusis. Hearing loss can also be caused by acute noise exposure at strong levels and chronic exposure to moderate noise (see Chap. 13). Hearing loss likely affects the ability of a receiver to hear and interpret a sender’s message. For example, a hearing-impaired moth, which typically avoids a bat predator through an evasive flight pattern, will be easier to capture if the bat’s echolocation signals are not heard.

The receiver’s sex rarely influences its hearing capabilities; however, Narins and Capranica (1976, 1980) provided an example of sex differences in the auditory reception system of a Puerto Rican treefrog, the coquí frog (Eleutherodactylus coqui). Male and female treefrogs responded to different notes of the male’s two-note, co-qui call. Females were attracted to the qui-part of the call. Males paid most attention to the co-part of the call, which was important in male–male aggressive interactions. The authors found that the inner ear basilar papilla was tuned differently in males and females; males had fewer fibers tuned to the qui-part of the call and females had fewer fibers tuned to the co-part of the call. These differences also occurred in higher-order neurons in the brain, where response decisions take place. Later studies (Mason et al. 2003) showed similar sexual differences in the middle ear of bullfrogs (Lithobates catesbeianus).

Ambient noise is a ubiquitous factor influencing signal reception and interpretation.
Fig. 5.19 Hearing ranges of some animals and humans. Bars represent the approximate hearing frequency range, ordered by increasing upper frequency cut-off; blue: fish, gray: bird, green: frog, orange: terrestrial mammal, violet: human, and brown: marine mammal. The red vertical lines are the frequencies of musical notes C0–C16, for comparison. There is one octave between successive C-notes. Middle-C on a piano is C4. A full-sized piano will only range from just under C1 to C8, with tones >C11 being ultrasound. Data from Fay (1988), Fay and Popper (1994), Heffner (1983), Heffner and Heffner (2007), Lipman and Grassi (1942), Warfield (1973), and West (1985), previously compiled by Vanderbilt University and Louisiana State University (http://lsu.edu/deafness/HearingRange.html; accessed 6 January 2021), and plotted by Wikimedia Commons author Cmglee. https://commons.wikimedia.org/wiki/File:Animal_hearing_frequency_range.svg. Figure licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license; https://creativecommons.org/licenses/by-sa/3.0/deed.en
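The musical-note scale used for comparison in Fig. 5.19 follows from the octave rule stated in the caption: each successive C-note doubles in frequency. The Python sketch below assumes equal temperament with middle C (C4) at about 261.63 Hz and the conventional 20 kHz upper limit of human hearing as the onset of ultrasound; both values are assumptions added here and are not taken from the figure data.

```python
MIDDLE_C_HZ = 261.63      # assumed frequency of C4 (equal temperament, A4 = 440 Hz)
ULTRASOUND_HZ = 20_000.0  # conventional lower bound of ultrasound (assumption)


def c_note_frequency_hz(octave):
    """Frequency of the C-note in the given octave; one octave doubles it."""
    return MIDDLE_C_HZ * 2.0 ** (octave - 4)


def first_ultrasonic_c_octave():
    """Lowest octave number whose C-note lies above ULTRASOUND_HZ."""
    octave = 0
    while c_note_frequency_hz(octave) <= ULTRASOUND_HZ:
        octave += 1
    return octave


if __name__ == "__main__":
    for octave in (0, 4, 8, 10, 11, 16):
        print(f"C{octave}: {c_note_frequency_hz(octave):.1f} Hz")
    print("first ultrasonic C-note: C" + str(first_ultrasonic_c_octave()))
```

Under these assumptions C10 (roughly 16.7 kHz) is still below 20 kHz, whereas C11 (roughly 33.5 kHz) is the first C-note in the ultrasonic range.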
Having experienced various forms of attenuation along its path, a signal will be audible if its level remains above the power spectral density level of the ambient noise plus the critical ratio of the receiver. The critical ratio is essentially a minimum SNR needed for signal detection (see Chap. 10 for more information on the critical ratio). An even higher SNR is needed for signal discrimination, recognition, and finally, comfortable communication (Fig. 5.20; Lohr et al. 2003; Dooling et al. 2009; Dooling and Blumenrath 2013; Dooling and Leek 2018). Some birds take advantage of these limitations by producing both high-amplitude broadcast sounds and low-amplitude soft sounds. The former become public since they cover a large active space with many potential receivers whereas the latter become private as they cover a very small active space with only few receivers (Larsen 2020).
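The audibility criterion above (received level at or above the noise spectral density level plus the critical ratio) lends itself to a small worked example. The Python sketch below is a simplification under stated assumptions: spherical spreading from a 1 m reference distance, an optional constant absorption coefficient in dB/m, and invented values for source level, noise level, and critical ratio. The function names are hypothetical and do not correspond to any published tool.

```python
import math


def received_level_db(source_level_db, range_m, absorption_db_per_m=0.0):
    """Source level minus spherical spreading and linear absorption losses."""
    return (source_level_db - 20.0 * math.log10(range_m)
            - absorption_db_per_m * range_m)


def is_audible(received_db, noise_psd_db, critical_ratio_db):
    """Audible if the level reaches the noise spectral density plus critical ratio."""
    return received_db >= noise_psd_db + critical_ratio_db


def detection_range_m(source_level_db, noise_psd_db, critical_ratio_db,
                      absorption_db_per_m=0.0, max_range_m=1.0e5):
    """Largest range (between 1 m and max_range_m) at which the call is audible."""
    threshold = noise_psd_db + critical_ratio_db
    near, far = 1.0, max_range_m
    if received_level_db(source_level_db, near, absorption_db_per_m) < threshold:
        return 0.0
    if received_level_db(source_level_db, far, absorption_db_per_m) >= threshold:
        return far
    for _ in range(60):  # bisection on a logarithmic range axis
        mid = math.sqrt(near * far)
        if received_level_db(source_level_db, mid, absorption_db_per_m) >= threshold:
            near = mid
        else:
            far = mid
    return near


if __name__ == "__main__":
    # Invented values: 90 dB source level, 10 dB/Hz noise, 25 dB critical ratio
    quiet = detection_range_m(90.0, 10.0, 25.0)
    noisy = detection_range_m(90.0, 16.0, 25.0)
    print(f"detection range: {quiet:.0f} m (quiet), {noisy:.0f} m (6 dB more noise)")
```

With spreading loss only, a 6 dB rise in ambient noise roughly halves the detection range in this sketch, which is one way to see how ambient noise shrinks the active space of a call.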
[Fig. 5.20 graphic: zones of detection, discrimination, recognition, and comfortable communication around a calling bird; axis: distance between birds [m]]
Fig. 5.20 Sketch of the radii about a calling bird over which a broadcast public call might be detected, discriminated, and recognized. Detection (i.e., signal presence/absence) is possible over the longest ranges (i.e., lowest SNR). A higher SNR is needed for signal discrimination, then signal recognition, and finally, comfortable communication, yielding progressively shorter ranges. In louder ambient noise, the ranges will be even less. For animals with soft private calls or greater critical ratios, the radii will also be less (Erbe et al. 2016). © Erbe et al.; https://doi.org/10.1016/j.marpolbul.2015.12.007. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
The auditory systems of some animals have built-in masking-release processes to reduce the impact of ambient noise. A spatial release from masking results from the directional hearing capabilities of the animal. If the signal arrives from a direction in which the receiver is more sensitive and if the noise arrives from a direction in which the receiver is less sensitive, then the reception directivity improves the SNR and the signal can be detected in higher ambient noise. A spatial release from masking has been demonstrated in several taxa including tropical crickets (Paroecanthus podagrosus and Diatrypa sp.; Schmidt and Römer 2011), gray treefrogs (Bee 2008), budgerigars (Melopsittacus undulatus; Dent et al. 1997), and pigmented guinea pigs (Cavia porcellus; Greene et al. 2018). A comodulation masking release is possible if the noise is broadband and amplitude-modulated coherently across its frequencies. The animal might then utilize information about the noise from frequencies outside of the signal frequency to filter the noise within the frequency band of the signal. A comodulation masking release has been demonstrated in gray treefrogs (Bee and Vélez 2018), European starlings (Sturnus vulgaris; Klump and Langemann 1995), and house mice (Mus musculus; Klink et al. 2010).

Additionally, animals have a host of behavioral adaptations to optimize sound reception. For example, an animal may improve the SNR for sound arriving at its ears by approaching the source, tilting its head, adjusting its pinnae (in the case of mammals), or moving to another location away from a noise source (Nelson and Suthers 2004).

5.4 Summary

The Source-Path-Receiver Model (SPRM) is used widely in technical noise control and illustrates
Bee MA, Vélez A (2018) Masking release in temporally fluctuating noise depends on comodulation and overall level in Cope’s gray treefrog. J Acoust Soc Am 144(4):2354–2362. https://doi.org/10.1121/1.5064362
Bradbury JW, Vehrencamp SL (2011) Principles of animal communication, 2nd edn. Sinauer Associates, Sunderland, MA
Brandt C, Malmkvist J, Nielsen RL, Brande-Lavridsen N, Surlykke A (2013) Development of vocalization and hearing in American mink (Neovison vison). J Exp Biol 216:3542–3550
Brumm H (2004) The impact of environmental noise on song amplitude in a territorial bird. J Anim Ecol 73:434–440
Brumm H, Todt D (2003) Facing the rival: directional singing behaviour in nightingales. Behaviour 140(1):43–53
Brumm H, Zollinger SA (2011) The evolution of the Lombard effect: 100 years of psychoacoustic research. Behaviour 148(11–13):1173–1198. https://doi.org/10.1163/000579511X605759
Chappuis C (1971) Un exemple de l’influence du milieu sur les émissions vocales des oiseaux: L’évolution des chants en forêt équatoriale. Terre Vie 118:183–202
Charlton BD, Ellis WA, McKinnon AJ, Cowin GJ, Brumm J, Nilsson K, Fitch WT (2011) Cues to body size in the formant spacing of male koala (Phascolarctos cinereus) bellows: honesty in an exaggerated trait. J Exp Biol 214:3414–3422
Dabelsteen T (1981) The sound pressure level in the dawn song of the blackbird Turdus merula and a method for adjusting the level in experimental song to the level in natural song. Z Tierpsychol 56(2):137–149
Dabelsteen T, Larsen ON, Pedersen SB (1993) Habitat-induced degradation of sound signals: Quantifying the effects of communication sounds and bird location on blur ratio, excess attenuation, and signal-to-noise ratio in blackbird song. J Acoust Soc Am 93(4):2206–2220
Daigle GA, Piercy JE, Embleton T (1983) Line-of-sight propagation through atmospheric turbulence near the ground. J Acoust Soc Am 74(5):1505–1513
Defrance J, Barriere N, Premat E (2002) Forest as a meteorological screen for traffic noise. Proc. 9th ICSV, Orlando, FL, USA.
Dooling RJ, West EW, Leek MR (2009) Conceptual and computational models of the effects of anthropogenic noise on birds. Proc Inst Acoust 31(1):1
Eberl DF, Kernan MJ (2011) Recording sound-evoked potentials from the Drosophila antennal nerve. Cold Spring Harb Protoc 2011:prot5576
Eliades SJ, Wang X (2012) Neural correlates of the Lombard effect in primate auditory cortex. J Neurosci 32(31):10737–10748. https://doi.org/10.1523/JNEUROSCI.3448-11.2012
Erbe C, Reichmuth C, Cunningham KC, Lucke K, Dooling RJ (2016) Communication masking in marine mammals: a review and research strategy. Mar Pollut Bull 103:15–38. https://doi.org/10.1016/j.marpolbul.2015.12.007
Fay RR (1988) Hearing in vertebrates: a psychophysics databook. Hill-Fay Associates, Winnetka IL
Fay RR, Popper AN (1994) Comparative hearing: mammals. Springer handbook of auditory research series. Springer-Verlag, New York
Feng AS, Arch VS, Yu Z, Yu X-J, Xu Z-M, Shen J-X (2009) Neighbor–stranger discrimination in concave-eared torrent frogs, Odorrana tormota. Ethology 115(9):851–856
Fenton MB, Bell G (1981) Recognition of species of insectivorous bats by their echolocation calls. J Mammal 62:233–243
Fletcher NH (2004) A simple frequency-scaling rule for animal communication. J Acoust Soc Am 115:2334–2338
Galeotti P, Pavan G (1991) Individual recognition of male tawny owls (Strix aluco) using spectrograms of their territorial calls. Ethol Ecol Evol 3:113–126
Galeotti P, Appleby BM, Redpath SM (1996) Macro and microgeographical variations in the “hoot” of Italian and English Tawny Owls (Strix aluco). Ital J Zool 63:57–64
Gannon W, Racz GR (2006) Character displacement and ecomorphological analysis of two long-eared Myotis (M. auriculus and M. evotis). J Mammal 87(1):171–179
Gannon WL, Sherwin RE, de Carvalho TN, O’Farrell MJ (2001) Pinnae and echolocation call differences between Myotis californicus and M. ciliolabrum
Dent ML, Larsen ON, Dooling RJ (1997) Free-field bin- (Chiroptera: Vespertilionidae). Acta Chiropterologica
aural unmasking in budgerigars (Melopsittacus 3(1):77–91
undulatus). Behav Neurosci 111:590–598 Gannon WL, O’Farrell MJ, Corben C, Bedrick EJ (2003)
Dooling RJ, Blumenrath SH (2013) Avian sound percep- Call character lexicon and analysis of field recorded bat
tion in noise. In: Brumm H (ed) Animal communica- echolocation calls. In: Thomas JA, Moss CF, Vater M
tion and noise, Animal signals and communication, vol (eds) Echolocation in bats and dolphins. Univ Chicago
2. Springer-Verlag, Heidelberg, pp 229–250 Press, Chicago, IL, pp 478–484
Dooling RJ, Leek MR (2018) Communication masking by Garcia M, Charrier I, Rendall D, Iwaniuk AN (2012)
man-made noise. In: Slabbekoorn H, Dooling RJ, Pop- Temporal and spectral analyses reveal individual vari-
per AN, Fay RR (eds) Effects of anthropogenic noise ation in a non-vocal acoustic display: the drumming
on animals, Springer handbook of auditory research, display of the ruffed grouse (Bonasa umbellus, L.).
vol 66, New York, pp 23–46 Ethology 118(3):292–301
5 Source-Path-Receiver Model for Airborne Sounds 181
Garstang M (2010) Elephant infrasounds: Long-range Klink KB, Dierker H, Beutelmann R et al (2010)
communication. In: Brudzynski SM (ed) Handbook Comodulation masking release determined in the
of mammalian vocalization: an integrative neurosci- mouse (Mus musculus) using a flanking-band para-
ence approach. Elsevier BV, Oxford digm. JARO 11:79–88. https://doi.org/10.1007/
Garstang M, Larom D, Raspe R, Lindeque M (1995) s10162-009-0186-7
Atmospheric controls on elephant communication. J Klump GM, Langemann U (1995) Comodulation masking
Exp Biol 198:939–951 release in a songbird. Hear Res 87(1):157–164. https://
Geiger R (1965) The climate near the ground. Harvard doi.org/10.1016/0378-5955(95)00087-K
University Press, Cambridge, MA Krokstad A, Svensson UP, Strøm S (2015) The early history
Genevois F, Bretagnolle V (1994) Male blue petrels reveal of ray tracing in acoustics. In: Xiang N, Sessler G (eds)
their body mass when calling. Ethol Ecol Evol 6:377– Acoustics, information, and communication. Modern
383 acoustics and signal processing. Springer, Cham
Gerhardt HC (1991) Female mate choice in treefrogs: Larom DL, Garstang M, Payne RR, Lindeque M (1997)
static and dynamic acoustic criteria. Anim Behav The influence of surface atmospheric conditions on the
42(4):615–635 range and area reached by animal vocalizations. J Exp
Greene NT, Anbuhl KL, Ferber AT, DeGuzman M, Allen Biol 200:421–431
PD, Tollin DJ (2018) Spatial hearing ability of the Larsen ON (1995) Acoustic equipment and sound field
pigmented Guinea pig (Cavia porcellus): minimum calibration. In: Klump GM, Dooling RJ, Fay RR,
audible angle and spatial release from masking in azi- Stebbins WC (eds) Methods in comparative psycho-
muth. Hear Res 365:62–76. https://doi.org/10.1016/j. acoustics. Birkhäuser Verlag, Basel, pp 31–45
heares.2018.04.011 Larsen ON (2020) To shout or to whisper? Strategies for
Hage SR, Jiang T, Berquist SW, Feng J, Metzner W encoding public and private information in sound
(2013) Lombard effect in horseshoe bats. Proc Natl signals. In: Aubin T, Mathevon N (eds) Coding
Acad Sci 110(10):4063–4068. https://doi.org/10. strategies in vertebrate acoustic communication, Ani-
1073/pnas.1211533110 mal signals and communication, vol 7. Springer Nature
Halfwerk W, Lea AM, Guerra MA, Page RA, Ryan MJ Switzerland AG, Cham, pp 11–44
(2016) Vocal responses to noise reveal the presence of Larsen ON, Dabelsteen T (1990) Directionality of black-
the Lombard effect in a frog. Behav Ecol 27(2): bird vocalization. Implications for vocal communica-
669–676. https://doi.org/10.1093/beheco/arv204 tion. Ornis Scand 21:37–45
Hardouin LA, Reby D, Bavoux C, Burneleau G, Larsen ON, Radford C (2018) Acoustic conditions affect-
Bretagnolle V (2007) Communication of male quality ing sound communication in air and underwater. In:
in owl hoots. Am Nat 169(4):552–562 Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds)
Hardouin LA, Thompson R, Stenning M, Reby D (2014) Effects of anthropogenic noise on animals. Springer
Anatomical bases of sex- and size-related acoustic handbook of acoustic research. Springer, New York,
variation in herring gull alarm calls. J Avian Biol 45: pp 109–144
157–166 Larsen ON, Wahlberg M (2017) Sound and sound
Heffner HH (1983) Hearing in large and small dogs: abso- sources. In: Brown CH, Riede T (eds) Comparative
lute thresholds and size of the tympanic membrane. bioacoustics: an overview. Bentham Science
Behav Neurosci 97:310–318 Publishers, Sharjah, pp 3–62
Heffner HE, Heffner RS (2007) Hearing ranges of labora- Larsson C (2000) Weather effects on outdoor sound prop-
tory animals. J Am Assoc Lab Anim Sci 46(1):20–22 agation. Int J Acoust Vib 5(1):33–36
Heller EJ (2013) Why you hear what you hear. Princeton Lipman EA, Grassi JR (1942) Comparative auditory sen-
University Press, Princeton, NJ sitivity of man and dog. Amer J Psychol 55:84–89
Holland J, Dabelsteen T, Pedersen SB, Paris AL (2001) Lohr B, Wright TF, Dooling RJ (2003) Detection and
Potential ranging cues contained within the energetic discrimination of natural calls in masking noise by
pauses of transmitted wren song. Bioacoustics 12(1): birds: estimating the active space of a signal. Anim
3–20 Behav 65:763–777
Jansen DA, Cant MA, Manser MB (2012) Segmental Lombard É (1911) Le signe de l’élévation de la voix.
concatenation of individual signatures and context Annales des Maladies de L’Oreille et du Larynx
cues in banded mongoose (Mungos mungo) close XXXVII(2):101–109
calls. BMC Biol 10(1):97. https://doi.org/10.1186/ Lovell SF, Lein MR (2004) Neighbor-stranger discrimina-
1741-7007-10-97 tion by song in a suboscine bird, the alder flycatcher,
Jensen KK, Larsen ON, Attenborough K (2008) Empidonax alnorum. Behav Ecol 15(5):799–804
Measurements and predictions of hooded crow (Cor- Marten K, Marler P (1977) Sound transmission and its
vus corone cornix) call propagation over open field significance for animal vocalization. I. Temperate
habitats. J Acoust Soc Am 123(1):507–518 habitats. Behav Ecol Sociobiol 2(3):271–290
182 O. N. Larsen et al.
Martínez-Sala R, Rubio C, García-Raffi LM, Sánchez- (eds) Comparative hearing: insects. Springer handbook
Pérez JV, Sánchez-Pérez EA, Llinares J (2006) Control of auditory research. Springer-Verlag, New York, pp
of noise by trees arranged like sonic crystals. J Sound 63–96
Vib 291:100–106 Ross CD (2000) Outdoor sound propagation in the US
Mason MJ, Lin CC, Narins PM (2003) Sex differences in Civil War. Appl Acoust 59:137–147
the middle ear of the bullfrog (Rana catesbeiana). Russ J, Jones G, Mackie I, Racey P (2004) Interspecific
Brain Behav Evol 61(2):91–101 responses to distress calls in bats (Chiroptera: Vesperti-
Melendez KV, Feng AS (2010) Communication calls of lionidae): A function for convergence in call design?
little brown bats display individual-specific Anim Behav 67:1005–1014. https://doi.org/10.1016/j.
characteristics. J Acoust Soc Am 128:919–923 anbehav.2003.09.003
Michelsen A (1978) Sound reception in different Schmidt AKD, Römer H (2011) Solutions to the cocktail
environments. In: Ali MA (ed) Sensory ecology, party problem in insects: selective filters, spatial
NATO Adv Study Inst Ser, vol 18. Plenum Press, release from masking and gain control in tropical
London, pp 345–373 crickets. PLoS One 6(12):e28593. https://doi.org/10.
Michelsen A (1992) Hearing and sound communication in 1371/journal.pone.0028593
small animals: evolutionary adaptations to the laws of Searby A, Jouventin P, Aubin T (2004) Acoustic recogni-
physics. In: Webster DB, Fay RR, Popper AN (eds) tion in macaroni penguins: an original signature sys-
The evolutionary biology of hearing. Springer-Verlag, tem. Anim Behav 67:615–625
New York, pp 61–77 Stoeger-Horwath AS, Stoeger S, Schwammer HM,
Michelsen A, Larsen ON (1983) Strategies for acoustic Kratochvil H (2007) Call repertoire of infant African
communication in complex environments. In: Huber F, elephants: first insights into the early vocal ontogeny. J
Markl H (eds) Neuroethology and behavioral physiol- Acoust Soc Am 121(6):3922–3931
ogy. Springer-Verlag, Berlin, pp 321–331 Suthers RA (1994) Variable asymmetry and resonance in
Miller EH, Williams J, Jamieson SE, Gilchrist HG, the avian vocal tract: a structural basis for individually
Mallory ML (2007) Allometry, bilateral asymmetry, distinct vocalizations. J Comp Physiol A 175:457–466
and sexual differences in the vocal tract of common ter Hofstede HM, Ratcliffe JM (2016) Evolutionary esca-
eiders Somateria mollissima and king eiders lation: the bat-moth arms race. J Exp Biol 219(11):
S. spectabilis. J Avian Biol 38:224–233 1589–1602. https://doi.org/10.1242/jeb.086686
Mitani JC, Hasegawa T, Gros-Louis J, Marler P, Byrne R Thomas JA, Kuechle V (1982) Quantitative analysis of the
(1992) Dialects in wild chimpanzees? Am J Primatol underwater repertoire of the Weddell seal
27:233–243. https://doi.org/10.1002/ajp.1350270402 (Leptonychotes weddellii). J Acoust Soc Am 72:
Narins PM, Capranica RR (1976) Sexual differences in the 1730–1738
auditory system of the tree frog Eleutherodactylus Trefry SA, Hik DS (2010) Variation in pika (Ochotona
coqui. Science 192(4237):378–380 collaris, O. princeps) vocalizations within and between
Narins PM, Capranica RR (1980) Neural adaptations for populations. Ecography 33:784–795
processing the two-note call of the Puerto Rican tree Van Staaden M, Römer H (1997) Sexual signalling in
frog, Eleutherodactylus coqui. Brain Behav Evol 17: bladder grasshoppers: tactical design for maximizing
48–66 calling range. J Exp Biol 200:2597–2608
Nelson BS, Suthers RA (2004) Sound localization in a Vannoni E, McEligott AG (2007) Individual acoustic var-
small passerine bird: discrimination of azimuth as a iation in fallow deer (Dama dama) common and harsh
function of head orientation and sound frequency. J groans: a source-filter theory perspective. Ethology
Exp Biol 207:4121–4133 113:223–234
O’Farrell MJ, Miller BW, Gannon WL (1999) Qualitative Wahlberg M, Larsen ON (2017) Propagation of sound. In:
identification of free-flying bats using the Anabat Brown CH, Riede T (eds) Comparative bioacoustics:
detector. J Mammal 80(1):11–23 an overview. Bentham Science Publishers, Sharjah, pp
Ottemöller L, Evers LG (2008) Seismo-acoustic analysis 63–121
of the Buncefield oil depot explosion in the UK, Warfield D (1973) The study of hearing in animals. In:
2005 December 11. Geophys J Int 172(3):1123–1134 Gay W (ed) Methods of animal experimentation,
Price MA, Attenborough K, Heap NW (1988) Sound IV. Academic Press, London, pp 43–143
attenuation through trees: measurements and models. West CD (1985) The relationship of the spiral turns of the
J Acoust Soc Am 84:1836–1844 cochlea and the length of the basilar membrane to the
Reby D, McComb K (2003) Anatomical constraints gen- range of audible frequencies in ground dwelling
erate honesty: acoustic cues to age and weight in the mammals. J Acoust Soc Am 77:1091–1101
roars of red deer stags. Anim Behav 65:519–530 Wiley RH (2009) Signal transmission in natural
Riede T, Fisher JH, Goller F (2010) Sexual dimorphism of environments. In: Squire LR (ed) Encyclopedia of neu-
the zebra finch syrinx indicates adaptation for high roscience, vol 8. Academic Press, Oxford, pp 827–832
fundamental frequencies in males. PLoS One 5:e11368 Wiley RH, Richards DG (1978) Physical constraints on
Römer H (1998) The sensory ecology of acoustic commu- acoustic communication in the atmosphere:
nication in insects. In: Hoy RR, Popper AN, Fay RR
5 Source-Path-Receiver Model for Airborne Sounds 183
Implications for the evolution of animal vocalizations. Wilkinson GS, Boughman JW (1998) Social calls coordi-
Behav Ecol Sociobiol 3:69–94 nate foraging in greater spear-nosed bats. Anim Behav
Wiley RH, Richards DG (1982) Adaptations for acoustic 55:337–350
communication in birds: sound transmission and signal Yorzinski JL, Patricelli GL (2009) Birds adjust acoustic
detection. In: Kroodsma DE, Miller EH, Quellet H directionality to beam their antipredator calls to
(eds) Acoustic communication in birds. Academic predators and conspecifics. Proc R Soc B 277:923–932
Press, New York, pp 131–181
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
6 Introduction to Sound Propagation Under Water
Christine Erbe, Alec Duncan, and Kathleen J. Vigness-Raposa
high frequencies is more readily scattered and absorbed than energy at low frequencies. The receiver of sound can thus infer information not just about the source of sound but also about the environment’s complexity.
Understanding the physics of sound in water is an important step in studies of aquatic animal sound usage and perception, whether these are conspecific social sounds, predator sounds, prey sounds, navigational clues, environmental sounds, or anthropogenic sounds. It is also critical for the study of impacts of sound on aquatic fauna, and for using passive or active acoustic tools for monitoring aquatic fauna and mapping biodiversity. The goal of this chapter is to introduce the basic concepts of sound propagation under water.

6.2 The Sonar Equation

The sonar equation was developed by the US Navy to assess the performance of naval sonar systems. These sonar systems were designed to detect foreign submarines. The sonar emits an acoustic signal under water and listens to returning echoes. The time of arrival and acoustic features of the echo may determine not only from what target the signal reflected, but also the range and speed of the target. The term “sonar” stands for “SOund Navigation And Ranging.”
There are numerous forms of the sonar equation. What they all have in common is that (1) they each represent an equation of energy conservation, meaning that the total acoustic energy on either side of the equation is the same; and (2) all of the terms in the equation are expressed in decibels (dB). The sonar equation with its original terms as defined in Urick (1983) allows an easy conceptual exploration of various scenarios encountered in bioacoustics. The definitions and notations of some of the terms are more mathematically specific in the recent underwater acoustics terminology standard (ISO 18405)¹.

¹ International Organization for Standardization (2017) Underwater acoustics – Terminology (ISO 18405). Geneva, Switzerland.

6.2.1 Propagation Loss Form

As sound propagates through the ocean, it loses energy, termed propagation loss (PL²). A simple form of the sonar equation equates PL to the difference between the source level (SL) and the received level (RL) of sound (Urick 1983):

$$\mathit{PL} = \mathit{SL} - \mathit{RL} \quad \text{(propagation loss form)} \tag{6.1}$$

SL was defined by Urick as 10 log₁₀ of the ratio of source intensity to reference intensity (see Chap. 4). RL was equal to 10 log₁₀ of the ratio of received intensity to reference intensity. PL was computed as 10 log₁₀ of the ratio of source intensity to received intensity.
For example, a whale-watching boat might have SL = 160 dB re 1 μPa² (in terms of mean-square pressure, which is proportional to intensity; see Chap. 4) and be located 100 m from a group of whales. If PL in this environment and over this range is 40 dB, then RL at the whales is 120 dB re 1 μPa² (Erbe 2002; Erbe et al. 2016a).

² In this chapter, we italicize variables, but keep abbreviations as regular font; so upright PL is an abbreviation while italic PL is a variable.

6.2.2 Signal-to-Noise Ratio Form

Another simple form of the sonar equation relates the RL of a signal to the background noise level (NL = 10 log₁₀ of the ratio of noise intensity to reference intensity):

$$\mathit{SNR} = \mathit{RL} - \mathit{NL} \quad \text{(signal-to-noise ratio form)} \tag{6.2}$$

SNR is the level of the signal-to-noise ratio, expressed in dB. For example, a call from a whale might have a received level RL = 105 dB re 1 μPa² at another whale; however, background noise at the time might be NL = 115 dB re 1 μPa² over the frequency band of the call. The SNR is −10 dB. Can the whale still hear the other one or does the noise mask the call?
Because the SNR is a negative number in this example, if one was just considering the relative levels of signal and noise, the animals would not
be able to hear one another because the background noise level is much greater than the received signal level. However, animals (and sonar systems) can take advantage of spectral and temporal characteristics of a received sound, as is explained below. Therefore, in the example of beluga whales (Delphinapterus leucas) trying to communicate in icebreaker noise, the listening whale can indeed detect the call, because of the different spectral and temporal structures of call and noise (Erbe and Farmer 1998).

6.2.3 Forms to Assess Communication Masking

Acoustic communication under water remains an area of active research. In the conceptual model of Fig. 6.1, one animal (the sender) emits a signal, which travels through the habitat to the location of the receiver. Whether the receiver can hear the message depends on a number of factors that relate to the sender, the habitat, and the receiver. The level and spectral features of the signal will affect how far it propagates and how well it can be detected above the ambient noise in the environment. The locations of sender and receiver matter, not just the range between the two animals, but also at which depth each happens to be located. If the two animals are oriented towards each other, directional emission and reception capabilities will enhance signal detection. The environment changes the level and spectral characteristics of the signal by reflection, refraction, scattering, absorption, and spreading losses. The detection capabilities of the receiver can be quantified by the detection threshold, critical ratio, and other factors. Ambient noise in the environment can initiate anti-masking strategies at both the sender (e.g., increasing the source level) and receiver (e.g., orienting towards the signal). A sonar equation can be constructed to investigate each of these factors, as outlined in the following sections.

Fig. 6.1 Sketch of the factors related to acoustic communication in natural (not just aquatic) environments and their corresponding terms in the sonar equation: source level (SL), time-bandwidth product (TBP), sender directivity index ($\mathit{DI}_s$), propagation loss (PL), absorption (absorption coefficient α multiplied by range R), noise level (NL), and receiver detection threshold (DT), critical ratio (CR), and directivity index ($\mathit{DI}_r$). Modified from Erbe et al. (2016c); © Erbe et al. (2016); https://www.sciencedirect.com/science/article/pii/S0025326X15302125. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

The basic sonar relation for the communication scenario in Fig. 6.1 is:

$$\mathit{SL} - \mathit{PL} - \mathit{NL} > \mathit{DT} \quad \text{(basic signal detection form)},$$

where DT is the detection threshold of the receiver, expressed in dB. A sound is deemed detectable if the expression on the left side exceeds the detection threshold. In the absence of noise, DT equals the audiogram. Audiograms are measured by exposing an animal to pure-tone signals of varying levels. The RL that is just detectable defines the audiogram at that frequency (see Chap. 10 for a more thorough definition of audiogram):

$$\mathit{RL} = \mathit{DT} \quad \text{(audiogram form)}$$

The mammalian auditory system acts as a bank of overlapping bandpass filters and the listener focuses on the auditory band that receives the highest SNR (Moore 2013). Under the equal-power assumption (Fletcher 1940), a signal is detected if its power is greater than the noise power in any of the auditory bands. So, for any auditory band,

$$\mathit{RL} - \mathit{NL} > 0 \quad \text{(within an auditory band)} \tag{6.3}$$

Communication signals of many species, including birds and marine mammals (Erbe et al. 2017a), are commonly tonal, while noise is commonly broadband. In order to assess the risk of communication masking, the critical ratio (CR) is a useful quantity that has been measured in humans and animals. The CR is the level difference between the mean-square sound pressure level (SPL) of a tone and the mean-square sound pressure spectral density level of broadband noise when the tone is just audible (American National Standards Institute 2015). Conceptually, the CR quantifies the ability of the auditory system to focus on a narrowband (tonal) signal. It captures how many of the noise frequencies surrounding the tone frequency are effective at masking the tone, and the resulting band of frequencies has been termed the Fletcher critical band (American National Standards Institute 2015). A narrowband signal is thus detectable, if

$$\mathit{RL} - \mathit{CR} > \mathit{NL}_f \quad \text{(critical ratio form)} \tag{6.4}$$

RL is the tone level in dB re 1 μPa², $\mathit{NL}_f$ is the noise mean-square pressure spectral density level in dB re 1 μPa²/Hz, and CR is measured in dB re 1 Hz (see p. 29 in Erbe et al. 2016c).
In the above-mentioned study with beluga whales communicating amidst icebreaker noise, the beluga whale call consisted of a sequence of six tones with overtones from 800 to 1800 Hz, and the icebreaker’s bubbler system noise was broadband and relatively unstructured in frequency and time (Fig. 6.2) (Erbe and Farmer 1998). The bandwidth of the call, expressed in dB, was 10 log₁₀(1800 − 800) = 30 dB re 1 Hz (see Chap. 4 for definitions and formulae). Given
NL = 115 dB re 1 μPa² over the bandwidth of the call, $\mathit{NL}_f$ was equal to NL (115 dB re 1 μPa²) minus the bandwidth (30 dB re 1 Hz): $\mathit{NL}_f$ = 85 dB re 1 μPa²/Hz. Beluga whales have a CR of approximately 15 dB re 1 Hz at 800 Hz; therefore, the call with RL = 105 dB re 1 μPa² was audible, because Eq. (6.4) was satisfied (Erbe 2008; Erbe and Farmer 1998): 105 − 15 > 85.
In studies on critical ratios and in the beluga whale experiments (Erbe and Farmer 1998; Erbe 2000), signal and noise were broadcast by the same loudspeaker and thus arrived at the listener from the same direction. If the caller and the noise are spatially separated, then there is an additional processing gain in the sonar equation, the receiver’s directivity index $\mathit{DI}_r$:

$$\mathit{RL} - \mathit{CR} + \mathit{DI}_r - \mathit{NL}_f > 0 \quad \text{(critical ratio form with directivity index)}$$

The $\mathit{DI}_r$ is defined as 10 log₁₀ of the ratio of the intensity measured by an omnidirectional receiver to that of a directional receiver. Directivity indices increase with frequency and values up to 19 dB have been measured for communication sounds in marine mammals. The associated spatial release from masking should be considered in environmental impact assessments of underwater noise (Erbe 2015). Directivity indices are even greater at higher frequencies used by dolphins during echolocation (Fig. 6.3).

Fig. 6.3 Sketches of the receiving directivity pattern of a bottlenose dolphin (Tursiops truncatus) in the vertical (a) and horizontal (b) planes. Courtesy of Chong Wei, after data in Au and Moore (1984)

6.2.4 Form for Biomass Surveying

Surveys for animals ranging from zooplankton to fish and sharks may use an echosounder, fish finder, or sonar (e.g., Parsons et al. 2014; Kloser et al. 2013). In this scenario, the echosounder emits a signal, which travels to the fish, where some of it is reflected. How much of the signal is reflected is expressed by the target strength (TS), defined as 10 log₁₀ of the ratio of echo intensity to incident intensity (Urick 1983). The reflected signal travels to the receiver, which has a specific DT and $\mathit{DI}_r$. The receiver is typically co-located with the source, so that the signal travels the same path twice and thus experiences twice the PL. The fish is detected if the following sonar equation is satisfied:

$$\mathit{SL} - 2\,\mathit{PL} + \mathit{TS} - \mathit{NL} > \mathit{DT} - \mathit{DI}_r \quad \text{(two-way sonar surveying form)}$$

Target strength will vary for each type of animal, as well as with the number of animals in the group and their orientation relative to the echosounder. Figure 6.4 shows reflected signals received on a REMUS autonomous underwater vehicle. Individual animals are observed in two aggregations, with two dolphins swimming within one of the aggregations. Researchers are using cameras on the same platforms to better understand the information contained in reflected signals and ultimately convert that information into species classifications and estimates of biomass (Benoit-Bird and Waluk 2020).

Fig. 6.4 Echosounder image of marine fauna in two aggregations, with two dolphins being in the aggregation on the left. Colors represent acoustic target strength and the shapes of the two dolphins can easily be recognized by their high reflectivity (Benoit-Bird et al. 2017). © Benoit-Bird et al. 2017; https://aslopubs.onlinelibrary.wiley.com/doi/full/10.1002/lno.10606. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

are therefore very useful for understanding how sound will propagate in different geographical regions.
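The sonar-equation forms of Sect. 6.2 reduce to simple decibel arithmetic, which the following Python sketch retraces using the numbers quoted in the text (the whale-watching boat and the beluga call in icebreaker noise). The function names are hypothetical and chosen only for illustration. The conversion from band level to spectral density level assumes that the noise is spectrally flat across the call bandwidth, as in the worked example.

```python
import math


def received_level(sl_db, pl_db):
    """Propagation loss form (Eq. 6.1), rearranged: RL = SL - PL."""
    return sl_db - pl_db


def snr(rl_db, nl_db):
    """Signal-to-noise ratio form (Eq. 6.2): SNR = RL - NL."""
    return rl_db - nl_db


def noise_spectral_density_level(nl_db, f_low_hz, f_high_hz):
    """Band level minus bandwidth in dB re 1 Hz (assumes flat noise in the band)."""
    return nl_db - 10.0 * math.log10(f_high_hz - f_low_hz)


def tone_detectable(rl_db, cr_db, nlf_db, di_r_db=0.0):
    """Critical ratio form (Eq. 6.4), optionally with receiver directivity index."""
    return rl_db - cr_db + di_r_db - nlf_db > 0


def echo_detectable(sl_db, pl_db, ts_db, nl_db, dt_db, di_r_db=0.0):
    """Two-way sonar surveying form: SL - 2 PL + TS - NL > DT - DIr."""
    return sl_db - 2.0 * pl_db + ts_db - nl_db > dt_db - di_r_db


# Whale-watching boat: SL = 160 dB re 1 uPa^2, PL = 40 dB
print("RL at the whales:", received_level(160, 40))

# Whale call in noise: RL = 105 dB, NL = 115 dB over the call band
print("SNR:", snr(105, 115))

# Beluga call (800 to 1800 Hz) in icebreaker noise, CR of about 15 dB re 1 Hz
nlf = noise_spectral_density_level(115, 800, 1800)
print("NLf:", nlf)
print("Call detectable:", tone_detectable(105, 15, nlf))
```

Although the broadband SNR is negative, the critical ratio test is passed, which mirrors the conclusion in the text that the beluga call remains audible.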
heat from the sea surface, which results in a temperature profile in the upper part of the ocean that increases with increasing depth from a minimum of about −2 °C at or (in summer) slightly below the surface.
Salinity typically changes by only a small amount with depth, and in most parts of the ocean is between 34 and 36 psu. As a result, the sound speed is usually determined by temperature and depth; however, salinity can have an important effect on sound speed in situations where it changes abruptly. Examples include locations where there is a large freshwater outflow into the ocean from a river, or estuaries where it is common to have a wedge of dense, saline water underlying a surface layer of freshwater. In polar regions, the salinity of near-surface water can vary considerably depending on whether sea ice is forming, a process that excludes salt and therefore increases salinity in the water below the ice. When sea ice melts, freshwater is released, reducing near-surface salinity.

6.3.2 Sound Speed Profiles

The following equation is one of a number of equations of varying complexity that can be found in the literature relating the speed of sound to temperature, salinity, and depth (Mackenzie 1981). It is valid for temperatures from −2 to 30 °C, salinities of 30 to 40 psu, and depths from 0 to 8000 m.

$$
\begin{aligned}
c ={}& 1448.96 + 4.591\,T - 5.304\times10^{-2}\,T^{2} + 2.374\times10^{-4}\,T^{3} + 1.340\,(S-35) \\
&+ 1.630\times10^{-2}\,D + 1.675\times10^{-7}\,D^{2} - 1.025\times10^{-2}\,T\,(S-35) - 7.139\times10^{-13}\,T D^{3} \quad [\mathrm{m/s}]
\end{aligned}
$$

heating and cooling effects can eliminate or enhance this effect. As explained later in this chapter, whether or not there is a distinct increase in sound speed with depth in the mixed layer determines whether there is a surface duct, which has a considerable impact on acoustic propagation from near-surface sound sources and to near-surface receivers.
Below the mixed layer, the rapid reduction in temperature with depth (i.e., in the thermocline) causes the sound speed to decrease as well until, at a depth of about 1000 m, the temperature becomes nearly constant. In the deeper isothermal layer, the increasing pressure causes the sound speed to start increasing with depth. There is therefore a minimum in the sound speed in non-polar waters at a depth of approximately 1000 m, which, as will be seen later, is important for long-range sound propagation.
In polar waters, the temperature and pressure both increase with increasing depth, so the sound speed also increases, which results in a strong surface duct. However, in the Arctic Ocean, the existence of water masses with different properties entering from the Pacific and Atlantic oceans can lead to more complicated sound speed profiles.
Temperature and salinity profiles for the world’s oceans can be found in the World Ocean Atlas³ (Locarnini et al. 2018; Zweng et al. 2018). These are based on averages of a large amount of measured data and are very useful for calculating estimated sound speed profiles for particular locations for particular months or seasons of the year. The real ocean is, however, highly variable, particularly the upper thermocline and mixed layer, which can change on time scales of hours and, in some extreme cases, tens of minutes, so there is no substitute for in situ measurements of temperature and salinity profiles to support acoustic work.
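As a minimal sketch, the Mackenzie (1981) equation given earlier in this section can be evaluated in a few lines of Python. The function name and the example temperature, salinity, and depth values are assumptions made for illustration, with T in °C, S in psu, and D in m as implied by the stated validity ranges. The isothermal example is idealized and is not taken from measured data.

```python
def mackenzie_sound_speed(T, S, D):
    """Speed of sound in seawater [m/s] after Mackenzie (1981).

    T: temperature in deg C, S: salinity in psu, D: depth in m.
    No range checking is done; see the validity limits quoted in the text.
    """
    return (1448.96
            + 4.591 * T
            - 5.304e-2 * T**2
            + 2.374e-4 * T**3
            + 1.340 * (S - 35.0)
            + 1.630e-2 * D
            + 1.675e-7 * D**2
            - 1.025e-2 * T * (S - 35.0)
            - 7.139e-13 * T * D**3)


# Idealized deep isothermal layer (hypothetical values: 4 deg C, 35 psu):
# with temperature fixed, the pressure (depth) terms make c grow with depth.
for depth_m in (1000, 2000, 3000, 4000):
    c = mackenzie_sound_speed(4.0, 35.0, depth_m)
    print(f"D = {depth_m:4d} m  c = {c:.1f} m/s")
```

With temperature and salinity held constant, the computed sound speed rises steadily with depth, which is the behavior described above for the deep isothermal layer.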
water depth; however, while useful for very rough PL estimates, this approach should be adopted with caution as the best choice will depend on the characteristics of the seabed. The only way to accurately determine $r_t$ for a given situation is to carry out numerical propagation modeling, in which case you might as well use that to directly determine the propagation loss, removing the need for Eq. (6.7) and its inherent inaccuracies.

6.4.2 Absorption Loss

When a sound wave propagates through water, it results in a periodic motion of the molecules present in the water, and the slight friction within and between them converts some of the sound energy into heat, reducing the intensity of the sound wave. This is called absorption loss and results in a propagation loss that is proportional to the range traveled:

$$PL = \alpha\, r_{km} \tag{6.8}$$

where $r_{km}$ is the range in kilometers and α is the absorption coefficient in dB/km. The propagation loss due to absorption must be added to the propagation loss due to geometrical spreading described in Sect. 6.4.1.

A commonly used formula for α is:

$$
\alpha = 0.106\,\frac{f_1 f^2}{f_1^2 + f^2}\,e^{(\mathrm{pH}-8)/0.56}
+ 0.52\left(1+\frac{T}{43}\right)\frac{S}{35}\,\frac{f_2 f^2}{f_2^2 + f^2}\,e^{-z/6}
+ 4.9\times10^{-4}\, f^2\, e^{-(T/27 + z/17)} \tag{6.9}
$$

with $f_1 = 0.78\,(S/35)^{1/2}\,e^{T/26}$ and $f_2 = 42\,e^{T/17}$; f [kHz], α [dB/km]

valid for
−6 < T < 35 °C (S = 35 psu, pH = 8, z = 0)
7.7 < pH < 8.3 (T = 10 °C, S = 35 psu, z = 0)
5 < S < 50 psu (T = 10 °C, pH = 8, z = 0)
0 < z < 7 km (T = 10 °C, S = 35 psu, pH = 8)

(François and Garrison 1982a, b; Ainslie and McColm 1998).

The absorption coefficient increases with frequency (Fig. 6.7). At low frequencies, it is dominated by molecular relaxation of two minor constituents of seawater: B(OH)₃ and MgSO₄, whereas above a few hundred kHz, it is primarily due to the water's viscosity.

In summary, Fig. 6.8 compares how propagation loss increases with range for spherical spreading (Eq. 6.5), cylindrical spreading (Eq. 6.6), and combined spherical/cylindrical spreading with a transition range of 100 m (Eq. 6.7). The effect of absorption (Eq. 6.8) in addition to spherical spreading is also shown for frequencies of 1, 10, and 100 kHz.

6.4.3 Additional Losses

6.4.3.1 The Air–Water Interface

Reflection and Transmission Coefficients
In animal bioacoustics as well as noise research, one typically deals with sounds in one medium (i.e., either air or water) and then sticks to this medium, only modeling propagation within this medium and only considering receivers in this medium. However, sound does cross into other media, and so a fish might be able to hear an airplane flying overhead, and a bird flying directly overhead might be able to hear a submarine's sonar (Fig. 6.9).

As sound hits an interface, the incident wave, in most situations, gives rise to a reflected wave and a transmitted wave⁴ (also see Chap. 5, where reflection is explained based on Huygens' principle). The energy of the reflected wave remains within the medium of the incident sound, but the energy of the transmitted wave is lost from the medium of the incident sound and transmitted into the adjacent medium. The amplitudes of the reflected and transmitted (plane) waves are given by the reflection and transmission coefficients R and T (Medwin and Clay 1998):

$$
R = \frac{Z_2 \sin\theta_1 - Z_1 \sin\theta_2}{Z_2 \sin\theta_1 + Z_1 \sin\theta_2}, \qquad
T = \frac{2 Z_2 \sin\theta_1}{Z_2 \sin\theta_1 + Z_1 \sin\theta_2} \tag{6.10}
$$

where $\theta_1$ is the grazing angle of the incident wave, measured from the interface, and $\theta_2$ is the grazing angle of the transmitted (refracted) wave, also measured from the interface. The angle of incidence is measured from the normal (i.e., perpendicular to the interface); the angle of incidence and the grazing angle of the incident wave always add to 90°. The acoustic impedance Z is the

⁴ Dan Russell's animations of waves being reflected from hard and soft boundaries, and being transmitted: https://www.acs.psu.edu/drussell/Demos/reflect/reflect.html; accessed 12 October 2020.
Fig. 6.9 Sketches of a sound source in the air (helicopter; left) and water (submarine; right), and the incident $p_i$, reflected $p_r$, and transmitted $p_t$ rays (i.e., vectors pointing in the direction of travel, perpendicular to the wavefront), with corresponding grazing angles $\theta_1$ and $\theta_2$. In the left panel, medium 1 corresponds to air with sound speed $c_1$, and medium 2 corresponds to water with sound speed $c_2$. The situation is reversed in the right panel, where medium 1 is water, and medium 2 is air
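Eq. (6.10) is straightforward to evaluate numerically once the refracted grazing angle is known. The sketch below assumes that Snell's law takes its usual grazing-angle form, cos θ1 / c1 = cos θ2 / c2, and that the acoustic impedance is the product of density and sound speed, Z = ρc. The function name and the nominal densities and sound speeds for air and water are illustrative choices rather than values from the text.

```python
import math


def plane_wave_coefficients(rho1, c1, rho2, c2, grazing_deg):
    """Pressure reflection and transmission coefficients R, T of Eq. (6.10).

    Medium 1 carries the incident wave; grazing_deg is the grazing angle
    theta1 in degrees, measured from the interface.
    """
    z1, z2 = rho1 * c1, rho2 * c2                # impedances, assuming Z = rho * c
    theta1 = math.radians(grazing_deg)
    cos_theta2 = (c2 / c1) * math.cos(theta1)    # Snell's law, grazing-angle form
    if cos_theta2 > 1.0:
        # No real refracted ray; the evanescent field is outside this sketch
        raise ValueError("grazing angle is below the critical angle")
    sin_theta1 = math.sin(theta1)
    sin_theta2 = math.sqrt(1.0 - cos_theta2**2)
    denom = z2 * sin_theta1 + z1 * sin_theta2
    R = (z2 * sin_theta1 - z1 * sin_theta2) / denom
    T = 2.0 * z2 * sin_theta1 / denom
    return R, T


# Nominal (assumed) properties: density [kg/m^3], sound speed [m/s]
AIR = (1.2, 343.0)
WATER = (1000.0, 1500.0)

R_up, T_up = plane_wave_coefficients(*WATER, *AIR, 90.0)      # source in water
R_down, T_down = plane_wave_coefficients(*AIR, *WATER, 90.0)  # source in air
print(f"water -> air: R = {R_up:+.5f}, T = {T_up:.5f}")
print(f"air -> water: R = {R_down:+.5f}, T = {T_down:.5f}")
```

With these nominal values, sound arriving from below at normal incidence gives R very close to −1. For sound arriving from the air, only steep grazing angles (above roughly 77° here) give a real refracted ray; shallower angles raise an error in this sketch rather than modeling the evanescent field.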
negative amplitude, which means that the incident and reflected pressures cancel each other out. This is why the water–air interface is called a pressure-release boundary (or "soft" boundary) for sound incident from below. For non-normal incidence, R and T need to be computed with Eq. (6.10). Also, as a sound source is moved to shallower depth (i.e., closer to the sea surface), the proportion of transmitted sound increases. This is because of the evanescent (i.e., exponentially decaying) field, which is ignored by Eq. (6.10), but that might still have enough amplitude at the sea surface for shallow sources (Godin 2008).

Lloyd's Mirror
While not resulting in a loss of sound energy, the Lloyd's mirror effect is a result of reflection from the water–air interface from shallow sound sources. An omnidirectional source (i.e., one that emits sound in all directions) close to the sea surface (such as a ship's propeller) emits some of its sound in an upwards direction, and this sound reflects off the sea surface. At any receiver location, sound that traveled along the surface-reflected path overlaps with sound that traveled along the direct path from the source to the receiver. The reflected ray's amplitude is opposite in sign to the incident ray's amplitude (R = −1); conceptually, this ray emerged from an image source (also called virtual source) with negative amplitude on the other side of the interface. The direct ray does not experience a flip in amplitude. Depending on the relative path lengths, the surface-reflected sound will add constructively to the sound that traveled along the direct path, or they will cancel each other out. This creates a pattern of constructive and destructive interference about the sound source, called the Lloyd's mirror effect. As a ship passes a moored recorder, the spectrogram shows the characteristic U-shaped interference pattern as successive peaks and troughs in amplitude at any one frequency over time (Fig. 6.10). Additional images of the Lloyd's mirror interference pattern can be found in Parsons et al. (2020) for small electric ferries and in Erbe et al. (2016b) for recreational swimmers and boogie boarders.

Scattering at the Sea Surface
If the sea surface is not flat, then some of the reflected energy is scattered away from the geometric reflection direction, reducing the amplitude of the geometrically reflected wave. This is called surface scattering loss, which increases as the roughness of the sea surface increases, the acoustic wavelength decreases (i.e., acoustic frequency increases), and the grazing angle between the direction of the incident wave and the plane of the sea surface increases. This relationship is quantified by the Rayleigh roughness parameter (Jensen et al. 2011):
Fig. 6.11 Graphs of additional propagation loss per bounce as a function of grazing angle for reflection from rough surfaces with various ratios of rms roughness to acoustic wavelength (curves for h/λ = 0.08, 0.11, 0.14, 0.17, and 0.2; x-axis: grazing angle in degrees)

Fig. 6.12 Curves of pressure reflection coefficient versus grazing angle for four different seabed types (silt, sand, limestone, and basalt), calculated with parameters from Jensen et al. (2011)
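The extra loss per surface bounce of the kind plotted in Fig. 6.11 can be sketched in a few lines. This example assumes the Rayleigh roughness parameter has its usual form γ = 2kh sin θ, with k = 2π/λ the acoustic wavenumber, h the rms roughness, and θ the grazing angle, and that the coherently reflected amplitude is reduced by the factor e^(−0.5γ²), as in Eq. (6.14). The function names are hypothetical, and the numbers are not guaranteed to reproduce the published curves.

```python
import math


def rayleigh_parameter(h_over_lambda, grazing_deg):
    """Rayleigh roughness parameter, assuming gamma = 2 k h sin(theta), k = 2 pi / lambda."""
    return 4.0 * math.pi * h_over_lambda * math.sin(math.radians(grazing_deg))


def bounce_loss_db(gamma, R=1.0):
    """Loss per bounce, -20 log10|R'|, with R' = R exp(-0.5 gamma^2).

    Written as a sum of two terms so that a large gamma does not underflow
    the exponential. R is the flat-surface reflection coefficient (nonzero).
    """
    return -20.0 * math.log10(abs(R)) + 10.0 * gamma**2 * math.log10(math.e)


for ratio in (0.08, 0.2):
    for angle in (10, 30, 60):
        gamma = rayleigh_parameter(ratio, angle)
        print(f"h/lambda = {ratio:.2f}, grazing = {angle:2d} deg: "
              f"{bounce_loss_db(gamma):5.2f} dB per bounce")
```

The loss grows with both roughness-to-wavelength ratio and grazing angle, which is the qualitative behavior described for surface scattering loss.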
Unconsolidated sediments become more reflective as the sediment grain size increases from silt to sand. Limestone and basalt are consolidated rocks, which allow both compressional waves and shear waves to propagate, and are thus referred to as solid elastic seabeds. Basalt is a hard rock and highly reflective at all grazing angles. The reflection coefficient of limestone, however, is perhaps surprising. While it is also a rock, it has the lowest reflectivity of the four seabeds at small grazing angles. This is because the shear wave speed in limestone is very similar to the sound speed in water, which allows energy to pass easily from sound waves in the water to shear waves in the seabed.

Curves of reflection coefficients versus grazing angle are even more complicated for layered seabeds due to interference between waves reflecting from different layers, and in this case, the reflectivity becomes frequency dependent. Despite the complexity, there are computer programs available, based on techniques described in Jensen et al. (2011), that can numerically calculate the reflection coefficient curve for any arbitrarily layered seabed. A good example is BOUNCE, which is part of the Acoustics Toolbox.⁶ A much bigger problem is the common lack of information on the geoacoustic properties of the seabed, to be able to provide these programs with accurate input data.

⁶ Acoustics Toolbox: https://oalib-acoustics.org/models-and-software/acoustics-toolbox/; accessed 30 September 2020.

Seafloor roughness can further reduce the apparent acoustic reflectivity, although if the rms roughness is known, this can be dealt with (at least approximately) by using Eq. (6.12) to calculate the associated Rayleigh roughness parameter γ as a function of grazing angle. The effective seabed reflection coefficient is then:

$$R' = R\, e^{-0.5\gamma^2} \tag{6.14}$$

where R is the pressure reflection coefficient for the flat seafloor (Eq. 6.10). All terms in this equation depend on grazing angle. The propagation loss per bounce is given by $-20\log_{10}|R'|$.

6.4.3.3 Scattering Within the Water Column

Sound can be scattered within the water column by anything that causes sharp changes in sound speed, density, or both (i.e., acoustic impedance, which is the product of sound speed and density; see Chap. 4). This includes gas bubbles, biological organisms (in particular those with gas-filled organs like lungs or swim bladders), and suspended sediment particles. Water column scattering is utilized in active sonar systems, which rely on the backscattered signal to detect and/or characterize objects within the water column. However, clouds of air bubbles formed by breaking waves can cause an appreciable increase in propagation loss in some circumstances.

Air bubbles are essentially small, resonant cavities within the water column, which can both scatter and absorb sound and, when found in large numbers, can change the effective density, and hence sound speed, of the water. When a wave breaks, it entrains a large amount of air down to depths of several meters, forming a cloud of bubbles of a range of sizes. The large bubbles rise to the surface quite quickly, but the smaller bubbles can remain at depth for many minutes. This can increase the propagation loss for sound traveling close to the surface (Ainslie 2005; Hall 1989).

6.4.4 Numerical Propagation Models

6.4.4.1 The Wave Equation and Solution Approaches

The ocean is a complicated environment for sound propagation, and the simple approaches to estimating propagation loss described above are very limited in their applicability. As a result, a great deal of effort has gone into developing numerical propagation models that can calculate acoustic propagation loss for realistic situations. What follows is a brief introduction to the topic. The interested reader is referred to Etter (2018)
and Jensen et al. (2011) for a more comprehensive treatise.

Fundamentally, all numerical propagation models solve the acoustic wave equation, which is a differential equation that relates the way the pressure changes over time to how it changes spatially as a wave propagates:

$$\nabla^2 \Phi = \frac{1}{c^2}\,\frac{\partial^2 \Phi}{\partial t^2} \tag{6.15}$$

where $\nabla^2$ is the Laplace operator, ∂ indicates the partial derivative, c is the speed of sound, t represents time, and Φ is the solution to the wave equation.

The wave equation itself is well understood and straightforward to solve in simple cases; however, there are two issues that make it difficult to solve numerically for typical underwater acoustics problems:

1. Solutions are usually desired over domains that are orders of magnitude larger than the acoustic wavelength. Direct solution methods, such as finite differences or finite elements, require meshing the solution domain at a resolution of a small fraction of a wavelength, so the size of the required domain makes these approaches impractical for most propagation problems, even with modern computing hardware.
2. The boundaries of the domain, particularly the seabed, are complicated, but very important to model accurately as they have a strong influence on sound propagation.

Getting around these difficulties requires making approximations that lead to equations that are practical to solve for the problems of interest, with different approximations leading to different methods suitable for different situations.

In general, the solution of the acoustic wave equation is a function of three spatial dimensions and time. In Cartesian coordinates, the acoustic pressure can be written as: p(x, y, z, t). In most cases, we are interested in the field generated by a small source, which can be approximated as a single point in space. It is more convenient to work in cylindrical coordinates centered on the source location, p(r, z, ϕ, t), where r is the horizontal distance from the source to the receiver, z is the receiver depth below the sea surface, and ϕ is the horizontal plane azimuth angle of the receiver relative to some direction reference.

Many modeling approaches start by assuming that the solution has a harmonic time dependence so that $p(r, z, \phi, t) = p_\omega(r, z, \phi)\,e^{-i\omega t}$ where ω = 2πf is the angular frequency and $i = \sqrt{-1}$. Substituting this solution form into the wave equation (Eq. 6.15) leads to another differential equation called the Helmholtz equation, which can be solved at a specified ω to give $p_\omega(r, z, \phi)$. The computational advantage of this is that the Helmholtz equation can be solved independently for each required frequency, converting a coupled four-dimensional (4D) problem into a number of independent 3D problems. Models that use this approach are known as frequency domain models, whereas models that directly solve the wave equation are known as time domain models. If required, the time domain solution can be reconstructed from multiple frequency domain solutions using Fourier synthesis (see Jensen et al. 2011, Chap. 8, for details).

The azimuth angle dependence can be dealt with by two different approaches. Modeling in 3D retains the full azimuth dependence of the environment, whereas N × 2D modeling assumes that changes in the environment due to small changes in ϕ have negligible effect on sound propagation, so that modeling can be carried out independently along each azimuth of interest. The majority of numerical models use the N × 2D approach, because there is again a substantial computational saving, this time by reducing a coupled 3D problem, solving for $p_\omega(r, z, \phi)$, to a number of independent 2D problems, each solving for $p_{\omega,\phi}(r, z)$ using only environmental information for the corresponding azimuth.

The inherent assumption of the N × 2D method provides a good approximation to the sound field in many propagation modeling situations where horizontal sound speed gradients are much smaller than vertical sound speed gradients, the seabed slopes are small, and the ranges are not large enough for the remaining out-of-plane effects to have an appreciable effect
on the sound field. However, there are cases where full 3D modeling may be required; for example, around steep-sided submarine canyons, in the presence of nonlinear internal waves that can produce strong horizontal sound speed gradients, or for very-long-range propagation across ocean basins.

Some propagation models further simplify their calculations by assuming that the environment (but not the sound field) is independent of range, which means that the sound speed profile is a function of depth only, and the water depth and seabed properties are the same at all ranges (i.e., the seafloor is flat). These are called range-independent (RI) propagation models, whereas propagation models that allow the sound speed profile and/or the water depth and/or the seabed properties to vary with range are known as range-dependent (RD) models.

Acoustic propagation models are usually characterized by the numerical approach adopted, and the following sections describe some of the most common. Guidance on which propagation model to use in various scenarios follows this section.

6.4.4.2 Ray and Beam Tracing

A ray is a vector, normal to the wavefront, and shows the direction of sound propagation. Ray models trace rays by repeatedly applying Snell's law (Eq. 6.11). For layered media (such as layers of ocean water with differing properties), Snell's law relates the angles of incidence $\theta_1$ and refraction $\theta_2$ at every layer boundary. Rays bend towards the horizontal if $c_2 > c_1$, and away from the horizontal if $c_1 > c_2$.

There are several approaches to calculating the amplitude of the acoustic field. The simplest, known as conventional ray tracing, is to use the distance between initially adjacent rays to determine the area over which the sound power has spread and calculate the intensity as the power per unit area. Unfortunately, this method results in unphysical predictions of infinite sound amplitude at locations called caustics, where initially adjacent rays cross and therefore have zero separation. It also predicts sharp transitions to zero sound intensity in shadow zones, which are regions where rays do not enter, whereas in reality, the transition will be smoother. Both of these problems are a result of a high-frequency approximation inherent in ray theory, which cannot deal with diffraction (i.e., the phenomenon of waves bending around obstacles or spreading out after passing through a narrow gap; see Chap. 5 on sound propagation examples in the terrestrial world).

An alternative approach to calculating the amplitude of the acoustic field is to treat each ray as the center of a beam with a specified (usually Gaussian) amplitude profile. The field at a particular location is then obtained by summing the contributions from all the beams that overlap at that location. The main challenge with this approach is determining how the amplitude and width of the beam should change along the ray, but algorithms have been developed to do this (see Jensen et al. 2011, Sect. 3.5, for details). One of the best-known propagation codes of this type is Bellhop (Porter and Bucker 1987), a fully range-dependent, Gaussian beam tracing program suitable for N × 2D modeling that is available as part of the Acoustics Toolbox. The toolbox also includes a fully 3D variant called Bellhop3D.

Although Gaussian beam tracing is an improvement to conventional ray tracing and reduces the effects of the high-frequency assumption inherent in ray theory, it does not completely eliminate them. Its treatment of shadow zones and caustics produces realistic, but not necessarily accurate results and, importantly, it does not predict waveguide cutoff effects.

In underwater acoustics, the term waveguide or duct is used to describe any situation in which sound is constrained to a particular span of depths by reflection, refraction, or some combination of the two. Common examples include (Fig. 6.13):

1. A shallow-water duct in which sound is constrained by reflection from both the sea surface and the seabed.
2. A surface duct, in which the sound speed near the sea surface increases with increasing depth. This results in sound that is initially heading downward being refracted upwards towards the sea surface, where it is reflected back downward again, and so on. It is therefore constrained by reflection at the top and by refraction at the bottom. Weak surface ducts are often found in the mixed layer due to sound speed increasing with increasing pressure, and strong surface ducts are ubiquitous in polar oceans because both pressure and temperature increase with increasing depth. Sea ice can, however, reduce the acoustic reflectivity of the sea surface and therefore increase the attenuation of sound traveling in the duct.
3. The Deep Sound Channel (DSC), also known as the sound fixing and ranging (SOFAR) channel, in which sound is refracted towards the minimum in the sound speed (i.e., towards the waveguide axis). The waveguide axis occurs at a depth of about 1000 m in much of the world's ocean. The sound is constrained by refraction both above and below the axis of the waveguide. However, these are not sharp boundaries, and the steeper the angle of propagation is, the larger are the excursions of the ray paths away from the axis.
4. Convergence zone propagation in which sound is constrained by reflection from the sea surface and refraction from the increase of sound speed with increasing depth that occurs below the axis of the DSC.

Fig. 6.13 Sound speed profiles (left) and ray trace plots computed using Bellhop (Porter and Bucker 1987, right) illustrating the common underwater acoustic ducts described in the text. The source depth was 10 m for all except the deep sound channel example, which had a source depth of 1200 m

In all cases, the waveguide will only trap rays leaving the source within a certain span of angles from the horizontal. In the case of the shallow water waveguide, this is because the seabed reflectivity reduces as the grazing angle increases (Fig. 6.12), so more energy is lost on each bottom bounce at steeper angles. In the other waveguide cases, it is because the refraction is not strong enough to turn the ray around before it either reaches a depth where the sound speed gradient is refracting it away from the waveguide (surface duct) or it hits the seabed (DSC and convergence zone).

According to ray theory, rays can be launched at any angle, irrespective of the frequency, and so it should always be possible to find rays that will be trapped in the waveguide, provided the source is at a suitable depth. However, this is not actually the case at low frequencies, where the acoustic wavelength becomes an appreciable fraction of the thickness of the waveguide. It turns out that if the frequency is sufficiently low, no energy will be trapped in the waveguide, and the waveguide is said to be cut off. Understanding why this is the case requires an understanding of normal modes, which is the topic of the next section.

6.4.4.3 Normal Modes

Most people find the concept of normal modes to be less intuitive than that of rays, but it is very useful for understanding low-frequency sound propagation in the ocean and forms the basis for a class of acoustic propagation models called normal-mode models.

Normal modes are best understood by first considering an ideal shallow-water waveguide with a constant depth (i.e., flat seafloor), constant sound speed, and perfectly reflecting seafloor. Solving the Helmholtz equation for this situation requires that two so-called boundary conditions be met: one at the sea surface and one at the seafloor. The sea surface is a soft boundary as far as underwater sound is concerned, so the boundary condition here is that the acoustic pressure due to the incident and reflected waves sums to zero, which requires that an incident sound wave is inverted on reflection. Conversely, the seafloor is a hard boundary, which requires that the incident and reflected waves sum to a maximum pressure; so the amplitudes of the incident and reflected waves must have the same sign.

Both of these boundary conditions have to be satisfied simultaneously. The water depth is fixed, and normal modes consider one frequency at a time, so the wavelength is fixed. The only variable that can change to satisfy the requirements is the angle from the horizontal at which the wave propagates. There are certain, discrete propagation angles that allow the surface and seafloor boundary conditions to be met simultaneously, corresponding to the normal modes. Each normal mode consists of a pair of plane waves, one propagating upward and the other downward, at the same angle to the horizontal (Fig. 6.14). The mode that corresponds to the pair of waves propagating closest to the horizontal is called the lowest-order mode (mode 1), and the mode order increases as the propagation angle gets steeper. Note that the waves can never propagate exactly horizontally, because that does not meet the boundary conditions.

A receiver in the water column will receive the sum of the pressures from the upward and downward traveling waves. The amplitude of that combined signal can be plotted as a function of depth and range for each mode, yielding a series of mode shape curves (Fig. 6.15). Note that there is always a null in pressure (i.e., a node) at the sea surface and a maximum in pressure magnitude (i.e., +1 or −1; an antinode) at the hard seafloor. The mode shapes are reminiscent of standing waves on a guitar string, which are also normal modes. However, on a guitar string, different modes correspond to different frequencies of vibration, whereas in a waveguide, different modes correspond to sound of the same frequency propagating at different angles to the horizontal.

For any waveguide thickness, the propagation angles for a particular mode increase as frequency is reduced. The ideal waveguide considered so far has no limit to how steep the propagation angles can be, but that is not the case for real ocean
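For the ideal waveguide described above (pressure-release surface, perfectly reflecting hard seafloor, constant sound speed), the discrete propagation angles can be written down explicitly. The following is a minimal sketch, assuming the standard result that mode m satisfies sin θ_m = (m − 1/2) λ / (2D) for water depth D, which places a node at the surface and an antinode at the seafloor. The function names, the default sound speed, and the example depth and frequencies are illustrative.

```python
import math


def mode_angles_deg(frequency_hz, depth_m, sound_speed=1500.0):
    """Angles from the horizontal (degrees) of all propagating modes
    in an ideal waveguide with a soft surface and a hard seafloor."""
    wavelength = sound_speed / frequency_hz
    angles = []
    m = 1
    while True:
        sin_theta = (m - 0.5) * wavelength / (2.0 * depth_m)
        if sin_theta > 1.0:          # no real angle: this mode cannot propagate
            break
        angles.append(math.degrees(math.asin(sin_theta)))
        m += 1
    return angles


def mode_cutoff_frequency_hz(m, depth_m, sound_speed=1500.0):
    """Lowest frequency at which mode m still has a real propagation angle."""
    return (m - 0.5) * sound_speed / (2.0 * depth_m)


for f in (50.0, 25.0, 3.0):
    angles = mode_angles_deg(f, 100.0)
    print(f"{f:5.1f} Hz: {len(angles)} modes, angles = "
          + ", ".join(f"{a:.1f}" for a in angles))
print(f"mode 1 cutoff in 100 m of water: {mode_cutoff_frequency_hz(1, 100.0):.2f} Hz")
```

With these example numbers, seven modes propagate at 50 Hz in 100 m of water, none propagate below the mode 1 cutoff of 3.75 Hz, and the angle of a given mode steepens as the frequency drops.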
most applications, the short-range limitation introduced by the Hankel function approximation is of little consequence, but, if necessary, it can be removed (at additional computational cost) by directly evaluating the integral transform.

It has proved difficult to extend the wavenumber integration method to range-dependent problems in a way that results in an efficient propagation model, although the full (paid) version of OASES⁷ does have this capability. The theoretical background of this model is described in Goh and Schmidt (1996).

⁷ OASES code https://oceanai.mit.edu/lamss/pmwiki/pmwiki.php?n=Site.Oases; accessed 1 October 2020.

6.4.4.5 Parabolic Equation

Inserting a solution of the form $p_{\omega,\phi}(r, z) = f(r, z)\,H_0^{(1)}(k_0 r)$ into the Helmholtz equation yields parabolic-equation (PE) models. Here, $H_0^{(1)}$ represents an outgoing cylindrical wave with wavenumber $k_0 = 2\pi f / c_0$ where $c_0$ is an assumed sound speed. Technically, $H_0^{(1)}$ is a Hankel function of the first kind of zero order. The aim of PE models is to solve for f(r, z), which represents the way in which the true field varies from that produced by the ideal outgoing cylindrical wave.

If the sound is assumed to be propagating predominantly in the range direction (the so-called paraxial approximation), then an efficient numerical algorithm can be employed. Given f(r, z), a small range step dr is added to calculate f(r + dr, z), a little bit farther from the source. This calculation can then be repeated as many times as desired to march the solution out in range. The sound field at one range is thus used to calculate the sound field at the next range and so on, without explicitly solving the depth-separated Helmholtz equation, making this a fundamentally different approach to the normal mode and wavenumber integration methods discussed previously.

Initially, the paraxial approximation was very restrictive and severely limited the utility of PE models for solving underwater acoustics problems. The more recent development of so-called high-angle PE models greatly relaxed this approximation. The way in which the solution marches out in range makes it straightforward to include range-dependent water depth, sound speed profiles, and seabed properties, and as a result, high-angle PE models have become the method of choice for solving range-dependent propagation problems.

Perhaps the most widely used PE model is RAM (Collins 1993), which allows the user a trade-off between the valid angular range and computational efficiency by specifying the number of terms to be used in a Padé approximation, which is central to the wide-angle algorithm. The more terms that are used in the Padé approximation, the wider is the valid angular range. Even though this allows the paraxial approximation to be greatly relaxed, it cannot be completely eliminated, and so PE models should always be used with care when acoustic energy propagating at steep angles is significant.

Another consideration when running RAM or similar PE models is that they use a finite computational grid in the depth direction, and energy will be artificially reflected by the sudden truncation at the bottom of the grid. This is usually dealt with by including an extra attenuation layer underneath the layer representing the physical seabed. The attenuation layer has the same density and sound speed as the seabed but an artificially high attenuation coefficient so that little energy reaches the bottom of the grid, and any energy that does reflect is further attenuated before reappearing in the water column. A sudden change in attenuation can also lead to reflections, so in critical situations, it is advisable to ramp the attenuation up smoothly from its seabed value to a high value, rather than having a step change.

There are several variants of RAM intended for different purposes (Table 6.1). The only one that can deal with elastic seabeds is RAMS, but it requires careful tuning of parameters to avoid instability, and in some cases involving layered seabeds, it is impossible to obtain a stable solution. More recent PE models have been developed that overcome these limitations (Collis et al. 2008) yet are research codes not readily
available. The majority of PE codes are intended the greater practicality of the PE range-marching
for N 2D modeling. However, research-level algorithm.
3D PE codes have been developed (see Jensen Range-dependent modeling with layered elas-
et al. 2011, Sect. 6.8, for details). tic seabeds remains a difficult computational task.
One commonly resorts to work-around strategies,
such as replacing the true seabed with an “equiv-
alent” fluid seabed that has a similar reflection
6.4.5 Choosing the Most Appropriate
coefficient versus grazing angle dependence at
Model
low grazing angles. This allows a standard PE code to be used for the modeling but is only accurate at ranges large enough that there is no high-angle energy reaching the receiver.

If the frequency is high enough that the acoustic wavelength is less than a small fraction of the smallest significant feature in the sound speed profile (e.g., mixed layer thickness, water depth), then use a ray tracing or beam model (e.g., Bellhop); otherwise use one of the low-frequency models. A rule of thumb for the ‘small fraction’ is 1/100. However, accurately modeling sound propagation in a weak duct may require the use of a low-frequency model up to a higher frequency than this rule would suggest. If in doubt, run some tests using both types of models to determine the frequency at which the two models start to agree.

When choosing a low-frequency model, if the range is short enough that the environment can be considered range-independent, then pick a wavenumber integration model (e.g., OASES or SCOOTER); otherwise use a PE model (e.g., RAM). The benefit of wavenumber integration for range-independent modeling is its greater accuracy at short range compared to either a normal-mode model (which only considers trapped energy) or a PE model (which has high-angle limitations). Wavenumber integration can also deal accurately with elastic seabed effects, which tend to be most important at short range. PE codes have largely replaced normal-mode codes for range-dependent modeling because of

6.4.6 Accessing Acoustic Propagation Models

Many of the models described in this chapter are freely available for download from the Ocean Acoustics Library⁸ (OALIB). OALIB includes Michael B. Porter’s Acoustics Toolbox, which incorporates a Gaussian beam tracing model (Bellhop), wavenumber integration code (SCOOTER), normal-mode model (KRAKEN), as well as several other useful programs including one for calculating seabed reflectivity as a function of grazing angle for arbitrarily complicated, layered seabeds (BOUNCE). These all use similar input and output file formats, have been regularly updated until at least 2020, and are well documented. A number of MATLAB (The MathWorks Inc., Natick, MA, USA) routines for dealing with the input and output are also provided. Also available on OALIB is the free version of the wavenumber integration code

⁸ Ocean Acoustics Library https://oalib-acoustics.org/; accessed 17 June 2020.
OASES and a number of different PE codes, including the RAM family.

Unfortunately, downloading a particular code is often just the start of a journey that may include compiling it for the particular operating system you are using, deciphering the documentation to determine what input files are required and how they need to be formatted, and then working out how to read and plot the output data. There are usually a number of adjustable parameters that affect how the program operates, and it is necessary to have an understanding of the underlying numerical methods in order to set these appropriately. Inappropriate parameter selection will often lead to meaningless results, so whenever you start using a different propagation model, you should run a series of tests on simple problems (to which the answer is known) in order to make sure you are getting the correct results. The standard of documentation varies considerably between the different models that are available from OALIB and is minimal for some.

AcTUP⁹ is a MATLAB GUI to earlier (2005) versions of the Acoustics Toolbox and several of the RAM family of PE codes. AcTUP comes packaged with the required Windows executables. This provides a convenient entry point for those new to acoustic propagation modeling as it allows different codes to be run on the same problem with minimal changes. However, careful parameter selection is still required in order to get meaningful results; put garbage in, get garbage out.

⁹ AcTUP http://cmst.curtin.edu.au/products/underwater/download/; accessed 1 October 2020.

6.5 Practical Acoustic Modeling Examples

Having worked through the theory and concepts, this section finally puts all of the above into action and provides examples of some practical acoustic propagation modeling tasks of increasing complexity. These all involve the estimation of received levels due to a source with known sound emission characteristics, and are conceptually based on re-arranging the passive sonar equation (Eq. 6.1) to solve for the received level RL:

\[ \mathrm{RL} = \mathrm{SL} - \mathrm{PL} \tag{6.16} \]

The tasks are:

1. Calculate RL as a function of range and depth in a given direction from a tonal (i.e., single-frequency) source.
2. Calculate RL as a function of range and depth in a given direction from a broadband source.
3. Calculate RL as a function of geographical position and depth for an omnidirectional source in a directional environment.
4. Calculate RL as a function of geographical position and depth for a directional source in a directional environment.

Indicative execution times are given for calculations that were carried out on a desktop computer with an Intel i7-7700 CPU, a clock speed of 3.6 GHz, and 64 GB of RAM. The processor had 4 physical cores but the models used here were single-threaded so only used one core. The computer was running a 64-bit Windows 10 operating system.

6.5.1 Received Level Versus Range and Depth from a Tonal Source

For this case, it is only necessary to specify the acoustic environment (i.e., bathymetry profile, sound speed profile, and seabed properties) along a single azimuth from the source. The propagation loss PL is only required at the source transmission frequency, and can be obtained using a single run of an appropriate propagation model. The received level RL can then be obtained using Eq. (6.16).

The example of a fin whale (Balaenoptera physalus) located about 40 km off the coast of southwestern Australia, at a depth of 50 m, while emitting a 20-Hz tone at a source level of 189 dB re 1 μPa m (Sirovic et al. 2007) is depicted in Fig. 6.16. The modeled direction of propagation
was due west from the source, and the bathymetry profile (i.e., magenta line in Fig. 6.16b) was interpolated from the Geoscience Australia 0.15′ resolution bathymetry database.¹⁰ The sound speed profile (Fig. 6.16a) was calculated from salinity and temperature data obtained from the World Ocean Atlas (Locarnini et al. 2018; Zweng et al. 2018). The seabed was modeled as a fine sand half-space with parameters from Jensen et al. (2011). Propagation loss modeling was carried out with RAMGeo in AcTUP, which is very efficient at such a low frequency, taking only a few seconds. A simple program was written in MATLAB to read the propagation loss file produced by RAMGeo, calculate the received levels using Eq. (6.16), and plot the results. Note that AcTUP can be used to plot propagation loss, but not received level.

The sound field has a complicated structure of peaks and nulls that is the result of constructive and destructive interference between sound that has traveled from the source to the receiver via different paths. This is typical of the sound fields produced by tonal sources. The overall reduction in received level with increasing range is quite slow, particularly beyond 70 km, due to the sound becoming constrained by refraction in the deep sound channel. This is typical of downslope propagation from a near-surface source situated over the continental slope into deep water.

¹⁰ Whiteway, T., Australian Bathymetry and Topography Grid, June 2009, https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/67703; accessed 6 November 2020.

Fig. 6.16 (a) Sound speed profile used for the modeling examples. (b) Modeled received SPL as a function of range and depth for a fin whale at a depth of 50 m emitting a 20-Hz tone with a source level of 189 dB re 1 μPa m. The magenta line is the seafloor

6.5.2 Received Level Versus Range and Depth from a Broadband Source

Many sources of underwater sound are broadband, which means that they produce significant acoustic output over a wide range of frequencies. Ships, pile driving, and the airgun arrays used for seismic surveying all produce broadband noise, and modeling the resulting sound fields is of importance when assessing the potential impacts of these sources on marine animals.

A common way to carry out broadband modeling for continuous sound such as ship noise is:
1. Break the required frequency span into a series of frequency bands (e.g., 1/3 octave bands are commonly used; see Chap. 4).
2. Use a propagation model to estimate a typical propagation loss for each band. This can either be done by running the propagation loss model at the center frequency of each band or by running it at a number of frequencies within the band and then averaging the results. The latter is preferred as it smooths out the interference field to some extent, but if the source emits a wide range of frequencies that span many bands, then the two methods will yield very similar results for the total field.
3. Integrate the source power spectral density over each band and convert to a source level.
4. Use Eq. (6.16) to obtain the received level in each band.
5. Sum the corresponding mean-square pressures across the bands to obtain an overall mean-square pressure that can then be converted to an overall received sound pressure level (SPL, see Chap. 4).

The use of mean-square pressure as a metric is problematic for impulsive sources such as airguns or pile driving, because the results become very sensitive to the duration of the signal, which is often hard to determine. Source and received levels for impulsive sources are therefore usually characterized in terms of sound exposure, and its logarithmic measure, the sound exposure level (SEL, see Chap. 4).

Computing the received levels for impulsive sources follows the same steps as for broadband, continuous sources, except that in step 3, the source spectrum needs to be specified as an energy density spectrum instead of a power density spectrum, and in step 5, it is sound exposures that are summed across the bands to obtain the overall sound exposure, which is then converted to a sound exposure level.

As an example, the modeled received sound exposure levels due to a single 3.3-l (200-cui) airgun are plotted as a function of range and depth in Fig. 6.17. The airgun (i.e., a cylindrical tube filled with compressed air, which is suddenly released into the water) is located at the geographical location that was used for the fin whale example, but at a depth of 6 m, which is typical of seismic survey source depths. The scenario is otherwise the same as previously described. The airgun’s source waveform was modeled using the Cagam airgun array model (Duncan and Gavrilov 2019). The airgun array model also calculated the signal’s energy density spectrum, which was then used in step 3 of the broadband modeling procedure outlined above. Once again, AcTUP was used to run RAMGeo to carry out the propagation modeling, but this time at 1/3 octave band center frequencies from 7.9 Hz to 1 kHz, which took about 5 minutes. A separate MATLAB program was written to carry out the post-processing steps and to plot the results.
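The post-processing side of the five-step broadband procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB program used for the examples in this chapter: the function names are hypothetical, the source spectrum is a made-up flat spectral density, and the propagation loss is a constant placeholder where the output of a propagation model (e.g., RAMGeo) would normally be read in. Band centers follow the base-10 1/3 octave series from 7.9 Hz to 1 kHz.

```python
import numpy as np

def third_octave_centres(n_first=-21, n_last=0):
    """Step 1: base-10 1/3 octave band centres, 1000 * 10**(n/10) Hz."""
    n = np.arange(n_first, n_last + 1)
    return 1000.0 * 10.0 ** (n / 10.0)

def band_edges(fc):
    """Lower and upper edge frequencies of each 1/3 octave band."""
    return fc * 10.0 ** (-0.05), fc * 10.0 ** 0.05

def band_level_from_density(freq_hz, density_db, f_lo, f_hi, n=201):
    """Step 3: integrate a spectral density level (dB per Hz) over one band."""
    f = np.linspace(f_lo, f_hi, n)
    lin = 10.0 ** (np.interp(f, freq_hz, density_db) / 10.0)
    # Trapezoidal integration of the linear spectral density over the band
    integral = np.sum(0.5 * (lin[1:] + lin[:-1]) * np.diff(f))
    return 10.0 * np.log10(integral)

def sum_levels_db(levels_db):
    """Step 5: add band levels on a linear (mean-square or exposure) scale."""
    return 10.0 * np.log10(np.sum(10.0 ** (np.asarray(levels_db) / 10.0)))

# Assumed inputs for illustration only
fc = third_octave_centres()                  # 22 bands, 7.9 Hz to 1 kHz
f_lo, f_hi = band_edges(fc)
spec_freq = np.array([1.0, 2000.0])          # Hz
spec_db = np.array([150.0, 150.0])           # flat, made-up source spectral density
sl_band = np.array([band_level_from_density(spec_freq, spec_db, lo, hi)
                    for lo, hi in zip(f_lo, f_hi)])
pl_band = np.full(fc.shape, 60.0)            # step 2 placeholder: PL from a model run
rl_band = sl_band - pl_band                  # step 4: Eq. (6.16) applied per band
rl_total = sum_levels_db(rl_band)            # step 5: overall received level
print(f"{fc.size} bands, overall received level {rl_total:.1f} dB")
```

The same summation applies to impulsive sources if the input is an energy density spectrum, in which case the band levels and the total are sound exposure levels rather than sound pressure levels.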
Comparing Fig. 6.17 with Fig. 6.16, it can be seen that the broad range of frequencies emitted by the airgun has the effect of smoothing out the fluctuations in the sound field caused by interfering paths. The color scales on these two figures are not directly comparable because Fig. 6.16 gives SPL in dB re 1 μPa whereas Fig. 6.17 presents SEL in dB re 1 μPa² s. The two are related through:

\[ \mathrm{SEL} = \mathrm{SPL} + 10 \log_{10} T \tag{6.17} \]

where T is the duration of the received signal in seconds, conventionally defined as the duration of the time interval containing 90% of the signal’s energy (90% energy signal duration; see Chap. 4).

6.5.3 Received Level as a Function of Geographical Position and Depth

The geographical distribution of received sound levels can be modeled by repeating the tonal source modeling procedure (Sect. 6.5.1) or broadband source modeling procedure (Sect. 6.5.2) using bathymetry profiles appropriate for different directions from the source. For long-range modeling, it may also be necessary to make the sound speed profile a function of range and direction. This is called N × 2D modeling and is adequate in most circumstances, but is less accurate than running a fully 3D propagation model in situations involving sound propagating across steeply sloping seabeds, or in some special situations in which horizontal sound speed gradients become significant.

The result is a 3D grid of the received level as a function of range, depth, and azimuth (i.e., direction in the horizontal plane). To create a 2D map of the sound field, it is necessary to extract some measure of the sound field in the vertical dimension and then interpolate that in the horizontal plane, with the appropriate measure depending on the purpose of the modeling. For example, in environmental impact assessments, it is common to use the maximum level at any depth in the water column, or the maximum level in a depth range corresponding to the diving range of an animal of interest.

Here we illustrate N × 2D modeling using the previous two examples, but this time carrying out the propagation modeling with bathymetry appropriate for each of the 37 tracks shown in Fig. 6.18. These were set at 10° increments in azimuth, with some adjustment and an extra track inserted in the inshore direction to improve the definition of the received field in the vicinity of the two capes. MATLAB programs were written to automate the various steps of the process.

Results are plotted in Fig. 6.19 for the fin whale and the airgun. In both cases, the plots are of the maximum received level over depth, but once again, they are not directly comparable because SPL was plotted for the fin whale, whereas SEL was plotted for the airgun.
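Two post-processing steps from this section lend themselves to a short Python sketch: reducing the 3D grid of received level (azimuth, range, depth) to a 2D map by taking the maximum over depth, and converting between SPL and SEL with Eq. (6.17). The function names, the array layout, and the synthetic received levels below are assumptions made for illustration; they are not taken from the MATLAB programs mentioned in the text, and the received-level grid would normally come from one propagation model run per track.

```python
import numpy as np

def max_over_depth(rl, depths_m, z_min=None, z_max=None):
    """Maximum level over depth, optionally restricted to a depth range.

    rl has shape (n_azimuth, n_range, n_depth); cells below the seabed
    can be marked with NaN, which np.nanmax ignores.
    """
    keep = np.ones(depths_m.shape, dtype=bool)
    if z_min is not None:
        keep &= depths_m >= z_min
    if z_max is not None:
        keep &= depths_m <= z_max
    return np.nanmax(rl[:, :, keep], axis=2)

def track_coordinates(azimuths_deg, ranges_m):
    """East and north offsets (m) of every (azimuth, range) point.

    Azimuth is measured clockwise from north, so 90 degrees is due east.
    """
    az = np.deg2rad(azimuths_deg)[:, None]
    return ranges_m[None, :] * np.sin(az), ranges_m[None, :] * np.cos(az)

def sel_from_spl(spl_db, duration_s):
    """Eq. (6.17): SEL = SPL + 10 log10(T), with T in seconds."""
    return spl_db + 10.0 * np.log10(duration_s)

# Synthetic stand-in for model output: 36 tracks at 10-degree increments
azimuths_deg = np.arange(0.0, 360.0, 10.0)
ranges_m = np.array([1000.0, 2000.0])
depths_m = np.array([10.0, 50.0, 200.0])
rl = (200.0
      - 20.0 * np.log10(ranges_m)[None, :, None]
      - 0.01 * depths_m[None, None, :]
      + np.zeros((azimuths_deg.size, 1, 1)))

rl_map = max_over_depth(rl, depths_m)                 # any depth
rl_dive = max_over_depth(rl, depths_m, z_min=40.0)    # e.g., a diving range
east_m, north_m = track_coordinates(azimuths_deg, ranges_m)
print(rl_map.shape, round(float(sel_from_spl(160.0, 0.1)), 1))
```

The scattered (east, north, level) points can then be interpolated onto a regular geographical grid with any scattered-data interpolator to produce a map of the kind described above.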
Fig. 6.19 (a) Map of maximum SPL over depth as a function of geographical position due to a fin whale calling at a depth of 50 m off the southwest coast of Australia. (b) Map of maximum SEL over depth due to a single firing of an airgun of volume 3.3 l (200 cui) at a depth of 6 m

6.5.4 Received Level as a Function of Geographical Position and Depth for a Directional Source

Figure 6.20b combines range-depth plots for the 90° and 270° azimuths in a single plot, which illustrates the contrasting sound attenuation rates in the upslope and downslope directions.
Fig. 6.20 (a) Map of maximum SEL over depth as a function of geographical position due to a single firing of a typical airgun array off the southwest coast of Australia. The total volume of the airguns in the array was 55.7 l (3400 cui), and the array was at a depth of 6 m. The tow direction of the array was northwards. (b) Received SEL from the same airgun array as a function of range and depth. The source was at 0-km range, negative ranges correspond to the 270° azimuth (i.e., west of the source) and positive ranges correspond to the 90° azimuth (i.e., east of the source). The magenta line is the seafloor. Colorbar applies to both panels

different sets of seabed properties in order to obtain an estimate of the uncertainty in the results.

The use of N × 2D rather than fully 3D modeling in the above examples may introduce some inaccuracies for cross-slope propagation paths, which in this case are to the north and south of the source. The effect of the sloping bathymetry would be to deflect the sound towards the downslope direction, slightly increasing levels downslope and decreasing them upslope.

The modeling methods described above treat the source as an ideal point source, which is a good approximation provided the receiver is much farther away from the source than the dimensions of the source. Modeling received levels close to a large source such as an airgun array requires a different and more computationally intensive approach in which the individual airguns in the array are treated as separate sources, and their signals are combined, taking account of their relative phases at the receiver locations. The same approach accounts for the full 3D directivity of the source, rather than just the horizontal directivity, as was the case for the example in Sect. 6.5.4. Combining this approach with a process called Fourier synthesis (Jensen et al. 2011) allows the received waveforms to be simulated, which allows other signal measures such as peak sound pressure levels (SPLpk) to be calculated. Calculating SPLpk by this means works well at short ranges but tends to overestimate levels at longer ranges because the propagation models do not properly account for seabed and sea surface scattering effects that broaden the peaks and reduce their amplitudes.

Simple propagation modeling tasks such as those described in Sects. 6.5.1 and 6.5.2 can be carried out using free propagation modeling tools such as the Acoustics Toolbox and AcTUP, with the addition of some relatively straightforward post-processing coded in any convenient programming language. However, when N × 2D modeling in multiple directions is required, it becomes desirable to automate the process of interpolating bathymetry profiles from databases, generating sound speed profile files, initiating multiple runs of the propagation model, calculating received levels, interpolating and plotting results, etc. Most organizations that routinely carry out this type of modeling have written their own proprietary software for these tasks. To the authors’ knowledge, there is no freely available software package with all of these capabilities, although there is at least one commercially available package.
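The idea of treating the elements of a large source separately and combining their signals with the correct relative phases can be illustrated with a deliberately simplified Python sketch. It assumes a single frequency and free-field spherical spreading in place of a real propagation model, so it ignores the sea surface, the seabed, and refraction; the function names, element positions, and amplitudes are hypothetical. It is meant only to show how the phase-coherent sum produces directivity that a single point source cannot.

```python
import numpy as np

def coherent_field(src_xyz, src_amp, rcv_xyz, freq_hz, c=1500.0):
    """Complex pressure at each receiver from several point sources.

    Free-field assumption: each source contributes
    amp * exp(i k d) / d, where d is the source-receiver distance in m.
    """
    k = 2.0 * np.pi * freq_hz / c
    # Distance matrix with shape (n_receivers, n_sources)
    d = np.linalg.norm(rcv_xyz[:, None, :] - src_xyz[None, :, :], axis=2)
    return np.sum(src_amp[None, :] * np.exp(1j * k * d) / d, axis=1)

def level_db(p):
    """Level of the pressure magnitude, in dB relative to unit amplitude."""
    return 20.0 * np.log10(np.abs(p))

# Two identical, in-phase elements 3 m apart along the x axis (illustrative)
src_xyz = np.array([[-1.5, 0.0, 0.0], [1.5, 0.0, 0.0]])
src_amp = np.array([1.0, 1.0])
# Receivers 1 km away: broadside (along y) and endfire (along x)
rcv_xyz = np.array([[0.0, 1000.0, 0.0], [1000.0, 0.0, 0.0]])

lev = level_db(coherent_field(src_xyz, src_amp, rcv_xyz, freq_hz=100.0))
print(f"broadside minus endfire: {lev[0] - lev[1]:.2f} dB")
```

At broadside the two contributions arrive in phase and add to twice the single-element pressure, whereas at endfire the 3-m path difference introduces a phase shift and the level drops. Repeating such a sum over many frequencies is the starting point for the Fourier synthesis of received waveforms mentioned above.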
Benoit-Bird KJ, Waluk CM (2020) Exploring the promise of broadband fisheries echosounders for species discrimination with quantitative assessment of data processing effects. J Acoust Soc Am 147(1):411–427. https://doi.org/10.1121/10.0000594
Benoit-Bird KJ, Moline MA, Southall BL (2017) Prey in oceanic sound scattering layers organize to get a little help from their friends. Limnol Oceanogr 62(6):2788–2798. https://doi.org/10.1002/lno.10606
Collins MD (1993) An energy-conserving parabolic equation for elastic media. J Acoust Soc Am 94(2):975–982
Collis JM, Siegmann WL, Jensen FB, Zampolli M, Küsel ET, Collins MD (2008) Parabolic equation solution of seismo-acoustics problems involving variations in bathymetry and sediment thickness. J Acoust Soc Am 123(1):51–55. https://doi.org/10.1121/1.2799932
Duncan AJ, Gavrilov AN (2019) The CMST Airgun Array model: a simple approach to modeling the underwater sound output from seismic Airgun arrays. IEEE J Ocean Eng 44(3):589–597. https://doi.org/10.1109/JOE.2019.2899134
Erbe C (2000) Detection of whale calls in noise: performance comparison between a beluga whale, human listeners and a neural network. J Acoust Soc Am 108(1):297–303. https://doi.org/10.1121/1.429465
Erbe C (2002) Underwater noise of whale-watching boats and its effects on killer whales (Orcinus orca). Mar Mamm Sci 18(2):394–418. https://doi.org/10.1111/j.1748-7692.2002.tb01045.x
Erbe C (2008) Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration. J Acoust Soc Am 124(4):2216–2223. https://doi.org/10.1121/1.2970094
Erbe C (2015) The maskogram: a tool to illustrate zones of masking. Aquat Mamm 41(4):434–443. https://doi.org/10.1578/AM.41.4.2015.434
Erbe C, Farmer DM (1998) Masked hearing thresholds of a beluga whale (Delphinapterus leucas) in icebreaker noise. Deep Sea Res II 45(7):1373–1388. https://doi.org/10.1016/S0967-0645(98)00027-7
Erbe C, Liong S, Koessler MW, Duncan AJ, Gourlay T (2016a) Underwater sound of rigid-hulled inflatable boats. J Acoust Soc Am 139(6):EL223–EL227. https://doi.org/10.1121/1.4954411
Erbe C, Parsons M, Duncan AJ, Allen K (2016b) Underwater acoustic signatures of recreational swimmers, divers, surfers and kayakers. Acoust Aust 44(2):333–341. https://doi.org/10.1007/s40857-016-0062-7
Erbe C, Reichmuth C, Cunningham KC, Lucke K, Dooling RJ (2016c) Communication masking in marine mammals: a review and research strategy. Mar Pollut Bull 103:15–38. https://doi.org/10.1016/j.marpolbul.2015.12.007
Erbe C, Dunlop R, Jenner KCS, Jenner M-NM, McCauley RD, Parnum I, Parsons M, Rogers T, Salgado-Kent C (2017a) Review of underwater and in-air sounds emitted by Australian and Antarctic marine mammals. Acoust Aust 45:179–241. https://doi.org/10.1007/s40857-017-0101-z
Erbe C, Parsons M, Duncan AJ, Osterrieder S, Allen K (2017b) Aerial and underwater sound of unmanned aerial vehicles (UAV, drones). J Unmanned Veh Syst 5(3):92–101. https://doi.org/10.1139/juvs-2016-0018
Erbe C, Williams R, Parsons M, Parsons SK, Hendrawan IG, Dewantama IMI (2018) Underwater noise from airplanes: an overlooked source of ocean noise. Mar Pollut Bull 137:656–661. https://doi.org/10.1016/j.marpolbul.2018.10.064
Erbe C, Peel D, Smith JN, Schoeman RP (2021) Marine acoustic zones of Australia. J Mar Sci Eng 9(3):340. https://doi.org/10.3390/jmse9030340
Etter PC (2018) Underwater acoustic modeling and simulation, 5th edn. CRC Press, Boca Raton, FL. https://doi.org/10.1201/9781315166346
Fletcher H (1940) Auditory patterns. Rev Mod Phys 12:47–65
François RE, Garrison GR (1982a) Sound absorption based on ocean measurements: part I: pure water and magnesium sulphate contributions. J Acoust Soc Am 72(3):896–907
François RE, Garrison GR (1982b) Sound absorption based on ocean measurements: part II: boric acid contribution and equation for total absorption. J Acoust Soc Am 72(6):1879–1890
Godin OA (2008) Sound transmission through water–air interfaces: new insights into an old problem. Contemp Phys 49(2):105–123. https://doi.org/10.1080/00107510802090415
Goh JT, Schmidt H (1996) A hybrid coupled wave-number integration approach to range-dependent seismoacoustic modeling. J Acoust Soc Am 100(3):1409–1420. https://doi.org/10.1121/1.415988
Hall MV (1989) A comprehensive model of wind-generated bubbles in the ocean and predictions of the effects on sound propagation at frequencies up to 40 kHz. J Acoust Soc Am 86(3):1103–1117. https://doi.org/10.1121/1.398102
Hall MV (2004) Preliminary analysis of the applicability of adiabatic modes to inverting synthetic acoustic data in shallow water over a sloping sea floor. IEEE J Ocean Eng 29(1):51–58. https://doi.org/10.1109/JOE.2003.823315
Jensen FB, Kuperman WA, Porter MB, Schmidt H (2011) Computational ocean acoustics, 2nd edn. Springer, New York
Kloser RJ, Macaulay GJ, Ryan TE, Lewis M (2013) Identification and target strength of orange roughy (Hoplostethus atlanticus) measured in situ. J Acoust Soc Am 134(1):97–108
Koessler MW (2016) Modelling of underwater acoustic propagation over elastic, range-dependent seabeds. Ph.D. Thesis, Curtin University, Perth, WA, Australia
Kuehne LM, Erbe C, Ashe E, Bogaard LT, Collins MS, Williams R (2020) Above and below: military aircraft noise in air and under water at Whidbey Island, Washington. J Mar Sci Eng 8(11):923. https://doi.org/10.3390/jmse8110923
Locarnini RA, Mishonov AV, Baranova OK, Boyer TP, Zweng MM, Garcia HE, Reagan JR, Seidov D, Weathers K, Paver CR, Smolyar I (2018) World Ocean atlas 2018, volume 1: temperature. National Oceanic and Atmospheric Administration
Mackenzie KV (1981) Nine-term equation for sound speed in the oceans. J Acoust Soc Am 70:807–812
Medwin H, Clay CS (1998) Chapter 2 - sound propagation. In: Medwin H, Clay CS (eds) Fundamentals of acoustical oceanography. Academic Press, San Diego, pp 17–69. https://doi.org/10.1016/B978-012487570-8/50004-0
Moore BCJ (2013) An introduction to the psychology of hearing. Brill, Leiden, The Netherlands
Parsons MJG, Parnum IM, Allen K, McCauley RD, Erbe C (2014) Detection of sharks with the Gemini imaging sonar. Acoust Australia 42(3):185–189
Parsons MJG, Duncan AJ, Parsons SK, Erbe C (2020) Reducing vessel noise: an example of a solar-electric passenger ferry. J Acoust Soc Am 147(5):3575–3583. https://doi.org/10.1121/10.0001264
Porter MB (1990) The time-marched fast-field program (FFP) for modeling acoustic pulse propagation. J Acoust Soc Am 87(5):2013–2023. https://doi.org/10.1121/1.399329
Porter MB, Bucker HP (1987) Gaussian beam tracing for computing ocean acoustic fields. J Acoust Soc Am 82(4):1349–1359. https://doi.org/10.1121/1.395269
Porter M, Reiss EL (1984) A numerical method for ocean-acoustic normal modes. J Acoust Soc Am 76(1):244–252. https://doi.org/10.1121/1.391101
Schmidt H, Glattetre J (1985) A fast field model for three-dimensional wave propagation in stratified environments based on the global matrix method. J Acoust Soc Am 78(6):2105–2114. https://doi.org/10.1121/1.392670
Sirovic A, Hildebrand JA, Wiggins SM (2007) Blue and fin whale call source levels and propagation range in the Southern Ocean. J Acoust Soc Am 122(2):1208–1215. https://doi.org/10.1121/1.2749452
Urick RJ (1983) Principles of underwater sound, 3rd edn. McGraw Hill, New York
Vigness-Raposa KJ, Scowcroft G, Morin H, Knowlton C, Miller JH, Ketten DR, Popper AN (2016) Discovery of sound in the sea: resources for decision makers. Proc Meet Acoust 27(1):010008. https://doi.org/10.1121/2.0000257
Vigness-Raposa KJ, Scowcroft G, Morin H, Knowlton C, Miller JH, Ketten DR, Popper AN (2019) Discovery of sound in the sea: communicating underwater acoustics research to decision makers. Proc Meet Acoust 37(1):025001. https://doi.org/10.1121/2.0001204
Westwood EK, Tindle CT, Chapman NR (1996) A normal mode model for acousto-elastic ocean environments. J Acoust Soc Am 100(6):3631–3645. https://doi.org/10.1121/1.417226
Zweng MM, Reagan JR, Seidov D, Boyer TP, Locarnini RA, Garcia HE, Mishonov AV, Baranova OK, Weathers K, Paver CR, Smolyar I (2018) World Ocean atlas 2018, volume 2: salinity. National Oceanic and Atmospheric Administration
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
7 Analysis of Soundscapes as an Ecological Tool
Renée P. Schoeman, Christine Erbe, Gianni Pavan,
Roberta Righini, and Jeanette A. Thomas
Fig. 7.4 A comparison of the soundscapes at two different moments of the morning in a secondary wet forest at Las Cruces Biological Station, Costa Rica. Top spectrogram recorded minutes prior to the onset of Zammara smaragdina cicada morning choruses, displaying vocalizations from seven bird species (Arremon aurantiirostris, Picumnus olivaceus, Arremon torquatus, Catharus aurantiirostris, Arremon aurantiirostris, Phaeothlypis fulvicauda, and Formicarius analis). Bottom spectrogram recorded at the same location just after the onset of cicada morning choruses. © Hart et al. (2015); https://academic.oup.com/view-large/figure/79529274/beheco_arv018_f0001.jpeg. Published under CC BY 3.0; https://creativecommons.org/licenses/by/3.0/

Vergne et al. 2009, 2011; Reber et al. 2017). Crocodile hatchlings emit calls before, during, and after hatching, which function to synchronize hatching, alert the mother to their due arrival, and stay in contact (Vergne et al. 2011; Chabert et al. 2015). Adult crocodiles produce calls during courtship, during territorial defense, and to maintain group cohesion with offspring (Fig. 7.6; Vergne et al. 2009; Reber et al. 2017).

Mammalian species vocalize at frequencies that, for some taxa, are inversely related to their body size (Bowling et al. 2017). African elephants (Loxodonta africana) and Asian elephants (Elephas maximus), for example, vocalize within the infrasonic range (i.e., <20 Hz; fundamental frequency as low as 14 Hz). These low-frequency calls function to coordinate movement and to advertise an individual’s
reproductive status over distances as far as 2.5 km (Soltis 2010). Elephants also produce vibrations that propagate through the substrate and so provide additional cues to listening conspecifics (Payne et al. 1986; O’Connell-Rodwell et al. 2000). The majority of aerial feeding bats, at the opposite end of the body-size scale, produce short echolocation calls (biosonar) in the ultrasonic range (15–110 kHz), for navigation and hunting (Fenton et al. 1998). Bat social calls, potentially related to agonistic encounters and courtship, are also characterized by harmonics that extend well into the ultrasonic range (Fig. 7.7; Behr and van Helversen 2004; Lattenkamp et al. 2019).

Fig. 7.5 Spectrograms of the flight sound produced by the European honeybee (Apis mellifera; a) and the Japanese yellow hornet (Vespa simillima xanthoptera; b). Sound files from Kawakita and Ichikawa (2019). Spectrogram of chorusing frogs in a pond in Colli Euganei, Italy. Yellow-bellied toad (Bombina variegata) with 500-Hz tonals and overtones and the European tree frog (Hyla arborea) with higher-pitched, broadband sounds starting at around 5 s and increasing in intensity and bandwidth from 13 s onwards (c). Recording courtesy of Marco Pesente

Fig. 7.6 Male (a) and female (b) American alligator (Alligator mississippiensis) bellows that may be produced during courtship and territorial defense (Vergne et al. 2009). Modified from Reber et al. (2017). © Reber et al. (2017); https://www.nature.com/articles/s41598-017-01948-1/figures/2. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

Primate vocalizations cover a wide frequency range from approximately 100 Hz in western gorillas (Gorilla gorilla; Salmi et al. 2013) to 16 kHz in pygmy marmosets (Cebuella pygmaea; Pola and Snowdon 1975). Primate vocalizations play an important role in intergroup communication, predominantly facilitating social interactions and group movement (Cheney and Seyfarth 1996,
2018). Primates are also known to use various alarm calls, which were previously suggested to be functionally referential signals (e.g., Cheney and Seyfarth 1996). However, recent studies have shown that primates often use general alarm calls and infer meaning from previous experiences or contextual information (Fichtel 2020).

Marine mammals, such as polar bears (Ursus maritimus), pinnipeds (i.e., seals, sea lions, and walruses), and sea otters (Enhydra lutris nereis) also produce in-air sounds. Nursing female polar bears frequently emit a low-intensity, repetitive, pulsed sound when initiating or continuing body contact with their cub (20 Hz–2 kHz; Wemmer et al. 1976). Pinnipeds produce in-air sounds with main energy <9 kHz (Fig. 7.8). Mother and pup recognize each other by individually unique calls that help them to reunite amidst all other individuals of the colony (Insley et al. 2010), while males produce individually unique calls during agonistic behavior (e.g., Fernández-Juricic et al. 1999; Van Parijs and Kovacs 2002). Female
and pup sea otters produce individually distinct calls with main energy <5 kHz, which also seem to function as contact calls between separated individuals (McShane et al. 1995).

Urbanized areas may be characterized by the sounds of domesticated animals (i.e., pets and livestock). Dogs bark to greet conspecifics and humans, during play (i.e., excitement), when raising alarm, or when seeking attention (Yin and McCowan 2004), sometimes to the nuisance of the neighborhood (Flint et al. 2014). Barks are short acoustic signals with main energy between 300 Hz and 2.5 kHz (Fig. 7.9), often repeated in bouts (Yin and McCowan 2004). Ewes and their lambs recognize each other by unique calls with main energy <5 kHz (Sèbe et al. 2008), resulting in a cacophony of bleats in lambing season.

Fig. 7.9 Example spectrograms of dog barks (a) and bleating sheep (b). Sheep bleats were produced by a ewe (solid box), her lamb (dashed box), and a distant lamb (dotted box)

7.2.2 Geophony

New Zealand significantly decreased with increasing wind speeds from calm (<4 km/h) to windy (>15 km/h) conditions (Priyadarshani et al. 2018). Precipitation also creates sound (Fig. 7.10). Rain increased sound levels within a deciduous forest (Ardennes, France) within the frequency band of 100 Hz to 10 kHz (Lengagne and Slater 2002). The increase in sound levels resulted in a reduction of acoustic communication space (i.e., area over which an individual can communicate with conspecifics) for tawny owls
(Strix aluco) to 1/69th of the space without rain, with a simultaneous marked decrease in vocal activity. Thunder is the most common loud natural sound with a peak frequency near 100 Hz, although sounds extend into the infrasonic and mid-frequency range (250 Hz–4 kHz; Fig. 7.10). Other sources of terrestrial geophony are rivers, waterfalls, earthquakes, and volcanic eruptions. Infrasonic monitoring of soundscapes can identify the location of continuous geophonic sound sources, such as waterfalls and seismic activity, as well as transient (i.e., short-duration) sound sources, such as thunder, up to distances of 10 km (Johnson et al. 2006).
7.2.3 Anthropophony
Anthropophony identifies the presence and activities of human beings. Some of these sounds give cues about local culture, tradition, language, working habits, and religion (e.g., voices, music, cow and sheep bells, church bells, etc.) and can enrich a soundscape (Stack et al. 2011; Pavan 2017). However, with the industrial revolution, new sound sources have emerged at an unprecedented level and spatial extension, with consequent impacts on natural soundscapes and human health.
Terrestrial anthropophony includes sounds from transportation (e.g., road vehicles, trains, snowmobiles, ships, and airplanes; Ernstes and Quinn 2016; Mullet et al. 2017b; White et al. 2017; Duarte et al. 2019), recreational boats (Kariel 1990; Bernardini et al. 2019), machinery (e.g., excavation devices, drilling devices, generators, and chain saws; Potočnik and Poje 2010; Deichmann et al. 2017), gunshots (Wrege et al. 2017), fireworks (Kukulski et al. 2018), and outdoor events (Greta et al. 2019; Kaiser and Rohde 2013). The intensity of anthropophony correlates with the degree of urbanization (Joo et al. 2011; Kuehne et al. 2013) and is considered noise pollution with an impact on both human (European Environment Agency [EEA] 2014) and animal health (Barber et al. 2010; Shannon et al. 2016), potentially affecting entire ecosystems (Pavan 2017).
Low-frequency sound, mostly generated by engines, propagates over large distances and appears to be the most invasive and pervasive sound related to transportation infrastructures. Sound from cars and heavy trucks caused by tire-pavement interaction, aerodynamic sources, and engines peaks around 100 Hz (Rochat and Reiter 2016), but may reach as high as 10 kHz when measured close to the source (Fig. 7.11a). Both birds (e.g., Halfwerk and Slabbekoorn 2009) and anurans (e.g., Cunnington and Fahrig 2010; Caorsi et al. 2017) have been found to change vocal behavior in response to traffic noise (see Chap. 13). Conventional railway sound (i.e., electrified railway with a service speed <200 km/h) has a broad peak between 10 Hz and 2 kHz, whereas high-speed railway sound (i.e., electrified railway with a service speed >200 km/h) peaks <100 Hz (Di et al. 2014).
Sound from aircraft, especially near airports, is perceived by humans as a source of disturbance and may have negative effects on children’s learning, human sleep, and human health (Basner et al. 2017). In addition, sound during take-off and landing overlaps with biophony resulting in acoustic and behavioral responses (Fig. 7.11b; Sáncez-Pérez et al. 2013; Vidović et al. 2017). Birds near international airports in Spain, for example, were found to advance their dawn chorus to reduce overlap with aircraft sound (Gil et al. 2015), which is a common response to noise for urban species (Bermúdez-Cuamatzin et al. 2020). However, common chiffchaffs (Phylloscopus collybita) near airports in the UK and the Netherlands were found to sing songs with a lower maximum and peak frequency than conspecifics in nearby control areas, thus resulting in an increased overlap with aircraft sound (Wolfenden et al. 2019). In addition, airport populations sang at a slower rate and responded more aggressively to song playbacks. In South Africa, the critically endangered Pickersgill’s reed frog (Hyperolius pickersgilli) called more frequently and at higher frequencies during and after aircraft overflights than before (Kruger and Du Preez 2016). Even in wild remote areas, aircraft flying at ~8000 m altitude may
226 R. P. Schoeman et al.
Fig. 7.11 (a) Spectrogram of a passing car at 2-m and a truck at 5-m distance. (b) Spectrogram of a commercial passenger airplane flying overhead at an altitude of ~300 m after take-off. Note the Doppler shift from high to low frequency (from 2.8 to 2 kHz) around the time of closest approach (at ~12 s) and the bird vocalizations between 7 and 9 kHz. (c) Spectrogram of a 3-m recreational power boat with a 3-hp 2-stroke engine, passing at 5-m distance; bird vocalizations within the gray dashed boxes. (d) Spectrogram of a jackhammer breaking tar
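The Doppler shift noted for Fig. 7.11b can be turned into a rough speed estimate. The sketch below is an illustration only, not part of the original analysis. It assumes a source moving in a straight line past a stationary listener, no wind, and that the two frequencies read off the spectrogram are the limiting values well before and well after closest approach. Because the caption quotes the shift only around the time of closest approach, the result is best read as a lower bound. The function names and the default sound speed of 330 m/s (the value quoted in Sect. 7.2.4) are choices made for this example.

```python
def doppler_speed(f_approach_hz, f_recede_hz, c=330.0):
    """Estimate source speed (m/s) from the observed tone frequencies.

    Assumes f_approach = f0 * c / (c - v) and f_recede = f0 * c / (c + v),
    which gives v = c * (f_approach - f_recede) / (f_approach + f_recede).
    """
    if f_approach_hz <= 0 or f_recede_hz <= 0:
        raise ValueError("frequencies must be positive")
    return c * (f_approach_hz - f_recede_hz) / (f_approach_hz + f_recede_hz)


def emitted_frequency(f_approach_hz, f_recede_hz):
    """Emitted tone frequency (Hz): the harmonic mean of the two observed values."""
    if f_approach_hz <= 0 or f_recede_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 2.0 * f_approach_hz * f_recede_hz / (f_approach_hz + f_recede_hz)


# Values read from the caption of Fig. 7.11b: 2.8 kHz falling to 2 kHz.
speed = doppler_speed(2800.0, 2000.0)      # m/s
tone = emitted_frequency(2800.0, 2000.0)   # Hz
print(f"speed >= {speed:.0f} m/s, emitted tone ~ {tone:.0f} Hz")
```

With the caption's values, the estimate is about 55 m/s (roughly 200 km/h) for an emitted tone near 2.3 kHz.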
produce noise below 500 Hz at 60 dB re 20 μPa (unweighted) at ground level (Pavan 2017; Farina et al. 2021). It is also essential to consider that take-off and landing corridors, where the noise levels are much higher, may cross more rural lands where airplane sound creates a stark contrast with ambient sound levels.
Smaller transport vehicles, such as powered two wheelers and snowmobiles, also contribute to the soundscape (Paviotti and Vogiatzis 2012; Mullet et al. 2017b). Mullet et al. (2017b) found that snowmobile noise, with main energy <2 kHz, affected 39% of the Alaskan wilderness open to snowmobiles and may mask vocalizations from common winter bird species. In-air ship noise from machinery and ventilation systems may propagate to areas near channels, ports, and coasts (Badino et al. 2012; Borelli et al. 2016). Small recreational power boats on lakes, on rivers, and near shore also increase in-air sound levels, predominantly below 1 kHz (Fig. 7.11c), with potential negative effects on bird species and hauled-out sea lions (York 1994; Tripovich et al. 2012).
Construction equipment may generate strong sounds that are audible over long ranges. Pneumatic tools, for example, generate repetitive, broadband sound (Fig. 7.11d). Heavy and stationary equipment, such as earth-moving machinery and air-compressors, generate sounds at frequencies <2 kHz (e.g., Berglund et al. 1996; Roberts 2009). Although one may associate construction sounds with urban areas, there are many examples in rural and remote areas, too. In the western Amazon (Peru), sounds from the construction and operation of a natural gas-well and
7 Analysis of Soundscapes as an Ecological Tool 227
pipeline (i.e., generators, helicopters, and pneumatic tools) were audible up to 250 m from the source (Deichmann et al. 2017). Anthropogenic sources in rural areas include farming machinery dominating <500 Hz (Gulyas et al. 2002), chainsaws recorded in forests with main energy between 100 Hz and 9 kHz (Potočnik and Poje 2010), and transient, broadband gunshots (Prince et al. 2019), which can provide valuable information on illegal hunting, in particular in remote areas that are difficult to patrol. In urban settings, additional sources of anthropophony originate from outdoor events, such as (music) festivals (Greta et al. 2019), fun parks (Kaiser and Rohde 2013), and Formula 1 races (Payne et al. 2012).
7.2.4 Sound Propagation in Terrestrial Environments
The propagation of sound, from its source through an environment, affects the local soundscape. In environments with good sound propagation conditions, sources from far away contribute to the local soundscape; whereas in environments with poor sound propagation conditions, only nearby sources contribute. Sound propagation is affected by air temperature, humidity, ground cover (bare rock versus grasslands or bush), wind, turbulence, and the presence of sound absorbers (e.g., snow), scatterers (e.g., trees), and reflectors (e.g., cliffs or buildings; see Chap. 5).
As sound spreads, it is transmitted into and through different media, absorbed, reflected, scattered, and diffracted. Many of these effects depend on frequency, meaning that sound propagates differently at different frequencies and that the environment changes the spectral characteristics of the sound. If the wavelength of sound is smaller in size than features of the environment (e.g., rocks), then sound will reflect. The wavelength can be computed as the ratio of sound speed (about 330 m/s in air) and frequency (e.g., a 100-Hz tone has a wavelength of about 3.3 m in air; see Chap. 4). At wavelengths much greater than features in the environment, sound will travel unhindered.
The air may be layered, with layers at different altitudes having different acoustic properties. Higher temperature and higher humidity increase the speed of sound. By Snell’s law of refraction, sound bends toward the horizontal when the speed of sound increases and away from the horizontal when the speed of sound decreases. During the day, temperature typically decreases with increasing altitude, leading to an upward refracting environment that exhibits so-called shadow zones that have reduced sound levels. In the morning or in winter, the air near the ground is often relatively cold, while there might be a warmer layer of air at higher altitude; this situation is called a temperature inversion. Sound is downward refracted and channeled close to the ground. Hence, in winter, sound might travel very far at low altitude (see Chap. 5).
Vegetation attenuates sound, so in temperate areas with high vegetation, the same sound during summer propagates over shorter distances than during winter (Aylor 1972). Areas or seasons of full vegetative cover have soundscapes different from those bare in vegetation (Attenborough et al. 2012). Both temperature and humidity near the ground may change quickly; therefore, sound propagation conditions, soundscapes, and the communication space of terrestrial animals can vary within a few hours.
7.3 Aquatic Soundscapes
The vast majority of aquatic soundscape studies have focused on marine and estuarine environments, where soundscapes vary among geographic regions from the northern marginal ice-zone via equatorial regions to Antarctic waters (Haver et al. 2017), from the deep ocean (e.g., Dziak et al. 2017) to shallow coastal waters (e.g., McWilliam and Hawkins 2013), and from urban rivers (e.g., Marley et al. 2016) to estuarine reserves (e.g., Ricci et al. 2016). Soundscape studies in freshwater are less common but have covered a variety of settings from frozen lakes in Canada (Martin and Cott 2016) to urbanized lakes in the UK (Bolgan et al. 2016, 2018b), from pristine swamps in Costa Rica (Gottesman et al.
[Fig. 7.12 graphic. Horizontal axis: Frequency [Hz], 1 Hz to 100 kHz; vertical axis ticks from 0 to 120. Curve labels: upper limit of prevailing sound; lower limit of prevailing sound; wind in shallow water; heavy rain; heavy ship traffic; shallow-water traffic; deep-water traffic; sea state 0.5, 1, 2, 4, 6; molecular agitation. Legend (Prevailing Sound): seismic background; turbulent pressure fluctuations; surface waves; ship traffic; bubbles, spray]
Fig. 7.12 Spectra of prevailing and local underwater sound sources between 1 Hz and 100 kHz (after Wenz 1962; Cato
2008)
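Section 7.2.4 gives the wavelength as the ratio of sound speed to frequency and uses it to judge whether sound reflects off an environmental feature or passes it unhindered. A minimal sketch of that rule of thumb follows. The function names, the three category labels, and the factor of 10 used to stand in for "much greater than" are assumptions made for this illustration; the chapter does not specify a threshold.

```python
def wavelength_m(frequency_hz, sound_speed_m_s=330.0):
    """Wavelength in metres: sound speed divided by frequency.

    The default of 330 m/s is the in-air value quoted in Sect. 7.2.4.
    """
    if frequency_hz <= 0 or sound_speed_m_s <= 0:
        raise ValueError("frequency and sound speed must be positive")
    return sound_speed_m_s / frequency_hz


def interaction(frequency_hz, feature_size_m, sound_speed_m_s=330.0, much_greater=10.0):
    """Crude classification of how sound interacts with a feature.

    'reflects'     : wavelength smaller than the feature
    'unhindered'   : wavelength more than `much_greater` times the feature
                     (the factor 10 is an assumed stand-in for "much greater")
    'intermediate' : anything in between
    """
    if feature_size_m <= 0:
        raise ValueError("feature size must be positive")
    lam = wavelength_m(frequency_hz, sound_speed_m_s)
    if lam < feature_size_m:
        return "reflects"
    if lam > much_greater * feature_size_m:
        return "unhindered"
    return "intermediate"


print(round(wavelength_m(100.0), 2))   # 100-Hz tone in air
print(interaction(10_000.0, 0.5))      # 10-kHz sound meeting a 0.5-m rock
print(interaction(100.0, 0.1))         # 100-Hz sound meeting a 0.1-m branch
```

Passing a different `sound_speed_m_s` reuses the same functions for another medium.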
Fig. 7.13 Spectrograms of (a) snapping shrimp, (b) a swimming great scallop (Pecten maximus), and (c) a feeding spider crab (Maja brachydactyla). Spectrograms b and c were created from supplementary material in Coquereau et al. (2016). Reprinted by permission from Springer Nature. Coquereau L, Grall J, Chauvaud L, et al. Sound production and associated behaviours of benthic invertebrates from a coastal habitat in the north-east Atlantic. Mar Biol 163: 127; https://doi.org/10.1007/s00227-016-2902-2. © Springer Nature, 2020. All rights reserved
up to 200 kHz (Fig. 7.13a; Knowlton and Moulton 1963; Au and Banks 1998). This short, intense, repetitive sound is a byproduct of many shrimps rapidly closing their snapper claw, which creates a jet stream used in agonistic encounters and to stun prey (Herberholz and Schmitz 1999). As snapping shrimps predominantly live in large aggregations (Duffy 1996; Duffy and Macdonald 1999), their sounds can be heard as a constant ‘crackling’ chorus with temporal and spatial variations in intensity (e.g., Bohnenstiehl et al. 2016; Lillis et al. 2017). Other well-known sound-producing invertebrates are lobsters and sea urchins. Lobsters produce broadband pulse trains when facing predators or competing conspecifics (Staaterman et al. 2010; Jézéquel et al. 2019). Jézéquel et al. (2019) characterized pulse trains of the European spiny lobster (Palinurus elephas) as signals with a mean bandwidth of 5–23 kHz. Sea urchins scrape algae from rocks. This foraging strategy causes the fluid inside the sea urchin to resonate, producing sound at frequencies between 700 Hz and 2 kHz (Radford et al. 2008). In New Zealand, groups of foraging endemic Kina sea urchins (Evechinus chloroticus) increase sound levels between 18:00 and 20:00 compared to mid-day levels (Radford et al. 2008). Further examples of sounds from invertebrate movement and foraging activities are displayed in Fig. 7.13b, c (Coquereau et al. 2016).
Kaatz (2011) estimated that over 1200 fish species produce sounds, of which 800 were confirmed soniferous species (Kaatz 2002; Rountree et al. 2006). Fish produce sounds in a variety of behavioral contexts, such as courtship (Amorim et al. 2015), agonistic interactions (Ladich 1997), and when in distress (Knight and Ladich 2014). It is therefore not surprising that fish are common contributors to aquatic soundscapes, most noticeably when large numbers vocalize in chorus (e.g., Rice et al. 2017; Pagniello et al. 2019). Parsons et al. (2016) summarized fish chorus patterns over a 2-year period in Darwin Harbour, Australia. Nine different chorus types were detected (Fig. 7.14), dominating the frequency band from 50 Hz to 3 kHz and displaying cycles on several temporal scales (i.e., diurnal, lunar, seasonal, and annual). Fish chorusing was also associated with environmental parameters, including water temperature, depth, salinity, and tidal cycle.
Marine mammal sounds range from infrasounds of mysticetes (baleen whales; e.g., Mellinger and Clark 2003) to ultrasounds of odontocetes (toothed whales; e.g., Hiley et al. 2017). Calls may function as contact or warning signals. For example, northern right (Eubalaena
Fig. 7.14 Spectrograms of the fish calls making up nine fish choruses (50 Hz–3 kHz) in Darwin Harbour, Australia. The middle panel shows the chorus levels over time, in hours relative to sunrise and sunset. There is a peak in chorusing activity shortly after sunset. Figure created from material in Parsons et al. (2016), by permission from Oxford University Press. Parsons MJG, Salgado-Kent CP, Marley SA, et al., Characterizing diversity and variation in fish choruses in Darwin Harbour. ICES J Mar Sci 73:2058–2074; https://doi.org/10.1093/icesjms/fsw037. © International Council for the Exploration of the Sea, 2016; https://global.oup.com/academic/rights/permissions/. All rights reserved. Reuse requires permission from OUP
glacialis) and southern right (E. australis) whale upsweeps (i.e., upcalls; 50–235 Hz) seem to be used as a contact call (Fig. 7.15a; Clark 1982; Parks et al. 2007). Another characteristic call of this species is a strong, brief, broadband pulse with energy up to 16 kHz (called gunshot), which may serve as an advertisement call and/or agonistic call produced by male individuals (Parks et al. 2006). However, female right whales sometimes also produce this sound (Gerstein et al. 2014). Foraging humpback whales (Megaptera novaeangliae) produce a characteristic tonal call
Fig. 7.15 Spectrograms of marine mammal sounds. (a) Southern right whale upcall. (b) Humpback whale song. (c) Common dolphin (Delphinus delphis) whistles and (d) clicks and burst-pulse sounds. (e) Leopard seal (Hydrurga leptonyx) and (f) Ross seal (Ommatophoca rossii), both under water. © Erbe et al. (2017); https://doi.org/10.1007/s40857-017-0101-z. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
that have peak frequencies between 20 Hz and 6 kHz (Fig. 7.15b; Payne and McVay 1971). The functions of whale song may include female attraction, male-male interactions, and long-range sonar (Herman 2017; Mercado 2018).
Odontocete echolocation clicks with peak energy between ~10 and ~150 kHz are used for navigation and prey capture (Au 1993). Odontocete tonal calls (i.e., whistles) with fundamental frequencies between ~1 and ~50 kHz and broadband burst-pulse sounds are used for communication (Fig. 7.15c, d; Herzing 1996). Some odontocete species also communicate with clicks (e.g., sperm whales, Physeter macrocephalus, and porpoises, Phocoenidae; Weilgart and Whitehead 1993; Clausen et al. 2010). Delphinids may arrange their whistles and burst-pulse sounds into patterned sequences (e.g., killer whales, Orcinus orca, Wellard et al. 2020; and pilot whales, Globicephala melas, Courts et al. 2020). Seals, sea lions, and walruses use underwater vocalizations particularly during the breeding season and in social interactions (Schusterman et al. 1966; Stirling et al. 1987; Van Parijs and Kovacs 2002). The majority of pinniped underwater vocalizations fall within the frequency range between 10 Hz and 6 kHz (Fig. 7.15e, f), although Weddell seals (Leptonychotes weddellii) were found to produce calls containing energy up to 13 kHz (Thomas and Kuechle 1982).
Mysticetes, odontocetes, and pinnipeds also produce non-vocal surface-generated sounds through breaching, pectoral fin slapping, and tail slapping (e.g., Dunlop et al. 2007).
7.3.2 Geophony
The aquatic geophony comprises sounds from wind acting on the water surface (e.g., Knudsen et al. 1948); precipitation (e.g., Nystuen 1986); ice movement, pressure cracking, and melting (e.g., Mikhalevsky 2001; Martin and Cott 2016); subsea volcanoes and earthquakes (e.g., Fox et al. 2001; Dziak and Fox 2002); and sediment displacement (e.g., Lorang and Tonolla 2014). Geophony can be nearly continuous and dominate the soundscape in certain regions at certain times (e.g., wind noise in southern Australia; Erbe et al. 2021). Wind-driven sound lies between 100 Hz and 20 kHz (typical peak at 500 Hz; Wenz 1962). Rainfall can contribute to the underwater soundscape over frequencies between 500 Hz and 50 kHz depending on drop size, rainfall rate, and impact angle related to wind speed (Ma et al. 2005). In the Perth Canyon, Australia, rainfall is often accompanied by strong wind. Consequently, the weather-related sound spectrum shows two peaks: one dominated by wind at 300–600 Hz and another dominated by rain at about 3 kHz (Fig. 7.16a; Erbe et al. 2015). In polar regions and underneath frozen lakes,
[Fig. 7.16 graphic. Panels A and B. Axis labels: PSD [dB re 1 μPa²/Hz] (ticks 60 to 140); Frequency [Hz] (ticks 10 to 2000); Time [s] (0 to 300)]
Fig. 7.16 Sources of aquatic geophony. (a) Underwater power spectral density (PSD) levels illustrating an increase in levels under increased wind speeds (m/s) and rain fall rates (mm/h). (b) Spectrogram of an earthquake recorded in the Perth Canyon, Australia. Colors indicate PSD level (dB re 1 μPa²/Hz). Note the logarithmic frequency axes. Both figures were modified; © Erbe et al. (2015); https://doi.org/10.1016/j.pocean.2015.05.015. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
sounds of colliding, oscillating, breaking, and melting ice range from <10 Hz to 8 kHz (Talandier et al. 2006; Martin and Cott 2016). Sound from polar ice can be detected thousands of kilometers away at tropical latitudes (Matsumoto et al. 2014). Underwater volcanic eruptions generate impulsive sounds as well as harmonic tremors <100 Hz, which can travel over distances greater than 12,000 km through the Sound Fixing And Ranging (SOFAR) channel (Tepp et al. 2019). Similarly, earthquakes can be detected at distances of thousands of kilometers as low-frequency (<100 Hz) rumbles, lasting several minutes (Fig. 7.16b; Erbe et al. 2015). Sediment flow may generate sound in rivers and streams, creating acoustic cues for freshwater species (Tonolla et al. 2010, 2011). Depending on grain size and flow velocity, the spectrum may range from tens of hertz to kilohertz.
7.3.3 Anthropophony
In the last century, human activities began to contribute significantly to underwater sound levels. The anthropophony has raised ambient sound levels rapidly compared to evolutionary time scales, making it hard for animals to adapt (see Chap. 13). Anthropogenic sound may be present in aquatic soundscapes far away from human activities, owing to the long-range propagation of low-frequency sound in water (see Chap. 6). The aquatic anthropophony includes personal watercraft (e.g., jetskis; Erbe 2013), small boats (e.g., Erbe et al. 2016a; Dey et al. 2019), electric ferries (Parsons et al. 2020), merchant ships (e.g., Ross 1976; Hatch et al. 2008; McKenna et al. 2012), offshore hydrocarbon exploration and production (e.g., marine seismic surveys and drilling; Wyatt 2008; Erbe and King 2009; Erbe et al. 2013), near-shore construction including geotechnical work and pile-driving (e.g., Erbe 2009; Dahl et al. 2015; Erbe and McPherson 2017), windfarms (e.g., Koschinski et al. 2003; Tougaard et al. 2009), dredging (e.g., Reine et al. 2014), explosions (e.g., Soloway and Dahl 2014), military sonars (e.g., Ainslie 2010), acoustic alarms on fishing gear or shark nets (e.g., Erbe and McPherson 2012), snowmobiles and vehicles on ice-covered lakes (Martin and Cott 2016), bridge traffic (Holt and Johnston 2015; Martin and Popper 2016), augers (i.e., ice drills; Putland and Mensinger 2020), airplanes (e.g., Martin and Cott 2016; Erbe et al. 2018), and activities alongside, rather than on, the water (Kuehne et al. 2013). Lesser-known anthropophony originates from unpowered recreational activities (e.g., scuba diving and swimming; Erbe et al. 2016c).
Sound from ship traffic is the most pervasive anthropogenic sound in the ocean (e.g., Sertlek et al. 2019). The level of sound emitted depends on ship type, size, speed, and operational mode (e.g., reversing, idling, carrying, or towing load; MacGillivray and de Jong 2021). In water <300 m deep, large ships (>300 t) can temporarily increase sound levels at frequencies up to 125 kHz within 500 m from shipping routes (Hermannsen et al. 2014; Veirs et al. 2016). In deep water, low-frequency sound from ships can travel farther, especially when entering the SOFAR channel (Fig. 7.17; Erbe et al. 2019). The number of small, recreational boats that occupy coastal waters is on the rise in many places and these vessels may raise sound levels between 100 Hz and 20 kHz in coastal and estuarine habitats, depending on boat type, hull type, length, propulsion system, operational mode, and speed (Parsons et al. 2021).
Another common anthropogenic sound that has received much concern over its potential impacts on marine life (see Chap. 13) is produced by seismic surveys, used for seabed profiling and hydrocarbon exploration. Surveys are done with a vessel towing an array of airguns. Airguns are metal chambers storing compressed air, which is rapidly released, producing an acoustic pulse with energy up to at least 10 kHz (Dragoset 2000; Hermannsen et al. 2015). Airguns exist with different operating volumes and firing pressures, affecting the spectrum and level of the acoustic pulses (Fig. 7.18a; Erbe and King 2009; Hermannsen et al. 2015). Airgun arrays can be tuned to focus acoustic emission down into the seabed, yet some sound ends up traveling horizontally through the water. Hence, sounds from
Fig. 7.17 Sketch of the propagation of sound from a 156-m ship (at 0 km range) sailing at a speed of 15 knots above the continental slope in the absence of ambient sound. Propagation modeled with RAMGeo in AcTUP V2.8 (https://cmst.curtin.edu.au/products/underwater/) with an equatorial sound speed profile as indicated in the left panel and a hard, dense, limestone seafloor. Colors represent received level (RL). © Erbe et al. 2019; https://www.frontiersin.org/files/Articles/476898/fmars-06-00606-HTML/image_m/fmars-06-00606-g001.jpg. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
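Snell's law, which the chapter invokes for refraction in air (Sect. 7.2.4) and in water (Sect. 7.3.4), can be written in grazing-angle form as cos(θ1)/c1 = cos(θ2)/c2, where θ is measured from the horizontal and c is the local sound speed. The sketch below applies this relation to a single interface between two layers. It is a simplified illustration, not the RAMGeo propagation model used for Fig. 7.17: the function name is hypothetical, the layers are treated as uniform, and the sound speeds in the usage lines are illustrative values rather than data from the chapter.

```python
import math


def refracted_grazing_angle_deg(theta1_deg, c1_m_s, c2_m_s):
    """Grazing angle (degrees from horizontal) after crossing from a layer
    with sound speed c1 into a layer with sound speed c2.

    Snell's law in grazing-angle form: cos(theta1) / c1 = cos(theta2) / c2.
    Returns None when no transmitted angle exists, i.e., the ray turns back
    toward the slower layer.
    """
    if c1_m_s <= 0 or c2_m_s <= 0:
        raise ValueError("sound speeds must be positive")
    if not 0.0 <= theta1_deg <= 90.0:
        raise ValueError("grazing angle must lie between 0 and 90 degrees")
    cos_theta2 = math.cos(math.radians(theta1_deg)) * c2_m_s / c1_m_s
    if cos_theta2 >= 1.0:
        return None  # ray is turned back before entering the faster layer
    return math.degrees(math.acos(cos_theta2))


# Illustrative sound speeds only (m/s); not taken from the chapter.
into_faster = refracted_grazing_angle_deg(15.0, 1490.0, 1520.0)
into_slower = refracted_grazing_angle_deg(10.0, 1520.0, 1490.0)
shallow_ray = refracted_grazing_angle_deg(5.0, 1490.0, 1520.0)
print(into_faster, into_slower, shallow_ray)
```

A ray entering a faster layer leaves at a smaller grazing angle (bent toward the horizontal), a ray entering a slower layer leaves at a larger one, and a sufficiently shallow ray is turned back altogether. Repeated turning of this kind on either side of a sound speed minimum is the trapping mechanism described for the SOFAR channel.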
[Fig. 7.18 graphic. Panels A and B. Vertical axis: Frequency [kHz], ticks 2 to 22]
Fig. 7.18 Spectrograms of impulsive sound sources. (a) Seismic airgun pulses recorded off Western Australia (Erbe
et al. 2021). (b) Pile driving recorded in Moreton Bay, Queensland, Australia (Erbe 2009)
seismic surveys may affect marine life at both short and long ranges (Gordon et al. 2003; Slabbekoorn et al. 2019). A typical seismic survey may last several weeks, during which the airgun array is discharged every few seconds.
Other common sounds of concern are emitted by pile driving, explosions, and acoustic alarms. Pile driving for windfarm construction and detonations of World War II ammunition are regular sources of sound within European waters (Bailey et al. 2010; von Benda-Beckmann et al. 2015). Impact pile driving generates high-intensity pulses with energy exceeding 40 kHz at close range (Fig. 7.18b). Acoustic alarms are
devices that purposefully emit sound between a few hundred hertz and tens of kilohertz to deter marine animals from potential hazards, such as pile driving sites, aquaculture farms, or bather protection nets (e.g., Jacobs and Terhune 2002; Erbe and McPherson 2012), yet their efficacy remains controversial (e.g., see Erbe et al. 2016d). Acoustic alarms differ widely in their signal type, frequency, and source level (Findlay et al. 2018).
7.3.4 Sound Propagation in Aquatic Environments
Underwater, the propagation of sound is affected by water temperature, salinity, hydrostatic pressure (i.e., depth below the sea surface), sea surface roughness, potential ice cover, bathymetry, seafloor roughness, upper seafloor geology (i.e., sediment type and thickness), depth and type of the underlying bedrock, and the presence of sound absorbers, scatterers, and reflectors (e.g., aquatic fauna, bubble clouds, or suspended sediment; see Chap. 6).
The speed of sound in water changes gradually with depth. As a result, sound does not travel in straight lines. Instead, sound paths are bent by refraction. By Snell’s law, paths bend toward local minima in sound speed. The most pronounced local minimum occurs in all non-polar oceans at a depth of about 1000 m below the sea surface. Sound reaching this depth at not too steep angles can get trapped in the so-called SOFAR channel by being repeatedly refracted toward the channel axis. This is how sound can traverse entire oceans, with sound sources contributing to soundscapes thousands of kilometers away (e.g., Gavrilov 2018). The SOFAR channel does not only trap sounds from deep-water sound sources (e.g., submarines or diving megafauna) located within the channel, but also from sources near the sea surface (e.g., ships or whales) because sound can radiate into the SOFAR channel with just one reflection off a downward sloping seafloor (Fig. 7.17). The minimum in sound speed (and so the axis of the SOFAR channel) rises to shallower depths in polar waters. In fact, in the polar oceans, the speed of sound is the smallest at the surface. This leads to a surface duct, in which sound travels by repeated reflection off the sea surface and refraction at depth.
Snell’s law creates additional interesting phenomena such as shadow zones and convergence zones. Sound does not distribute evenly throughout the oceans. There are patterns of shadow zones (into which sound cannot travel by direct paths, and which receive little to no sound) and convergence zones (where received levels are enhanced; Fig. 7.17). These zones will be in different places for different source locations. In addition, sound at low frequencies does not travel far in shallow water. The waveguide concept and normal modes nicely explain this (see Chap. 6). The water depth can be too small to “fit” sound of large wavelength. As a result, ship noise may be attenuated quickly in coastal water and the spectral hump of distant shipping is characteristic only in offshore water (see Sect. 7.5.3.2). Ergo, soundscapes may differ with location and depth, merely because of sound propagation.
7.4 Soundscape Changes Over Space and Time
Soundscapes may vary on a range of spatial scales, exhibit temporal cycles (e.g., because of diurnal animal behaviors, periodic animal presence, or seasonal weather events; Erbe et al. 2015; Caruso et al. 2017; McWilliam et al. 2017), or gradually change over longer periods of time. Such changes may be natural or, directly or indirectly, related to human activity. Understanding natural variability is important for using soundscapes (1) as an ecological tool to study animal behavior and (2) as a management tool of the potential effects of human activity. Our understanding of the function of animal calls and natural or anthropogenic interferences is based on limited observational data (Slabbekoorn et al. 2018) and so interpreting changes in sounds is even more difficult. Gavrilov et al. (2012), for example, recorded the underwater soundscape between 21 and 27 May in 2002, 2006, and 2010 off Cape Leeuwin, Australia. Between
[Figure graphic. Labels: fin whales; Antarctic blue whales. Axis label: PSD [dB re 1 μPa²/Hz]]
ambient noise may result in spatial differences in vocalizations (Slabbekoorn and Smith 2002). If
Fig. 7.20 Seasonal timing of pygmy blue whale migration along the west and south coasts of Australia based on passive acoustic monitoring. The chart shows the locations of sound recordings (red dots). The diagram shows counts of pygmy blue whale singers as 24-h means. The red horizontal lines indicate when the recorders were operating (Erbe et al. 2016b)
signals (i.e., whistles) were mostly produced during the day. Seasonal variation, with a peak number of clicks in August, was also evident, but no effect of lunar cycle was observed. Off Western Australia, pygmy blue whales (Balaenoptera musculus brevicauda) are a seasonally dominant contributor to the marine soundscape and simply by listening, their seasonal migration can be traced along the coast (Fig. 7.20; Erbe et al. 2016b).
degradation by humans as a root cause. Humans add sound to soundscapes, change biodiversity through land-use, and directly remove animals from habitats (e.g., by hunting). Humans also contribute to climate change, with greenhouse gas emissions resulting in environmental changes, which can have direct and indirect effects on ecosystems and related soundscapes. The conservation of soundscapes is important not only for scientific and ecological reasons but also for touristic interests and human welfare (Pavan 2017).
exploration and production, military exercises, recreational activities, etc. These activities produce sounds over a wide range of frequencies and at a variety of intensities (see Sects. 7.2.3 and 7.3.3). While some activities are temporary, others result in sustained increases in ambient sound levels over time. For example, underwater sound from shipping has increased ambient sound levels between 10 and 100 Hz in large parts of the world’s oceans by up to 3 dB per decade (e.g., Andrew et al. 2011; Chapman and Price 2011; Miksis-Olds et al. 2013).
Seismic surveys produce intense sound over a few weeks at a time to explore a specified area; yet, Nieukirk et al. (2004, 2012) detected airgun pulses along the Mid-Atlantic ridge from seismic survey vessels located 3000–4000 km away. In 1999, airgun signals were routinely detected for more than 80% of the days in a month, which increased to 95% in 2005. Finally, anthropogenic sounds may affect animal behavior (i.e., physical or acoustic, Slabbekoorn et al. 2018; see Chap. 13), which can further alter soundscapes.
7.4.3.2 Land Use
Humans transform natural landscapes to increase agricultural land coverage, to build infrastructure (e.g., roads, buildings, and power supply systems), or to extract resources (e.g., tree logging and mining). These activities generate sound and affect animal density and biodiversity, ultimately changing soundscapes (Phillips et al. 2017). In 1962, ecologist Rachel Carson expressed her concern about the use of chemicals and pesticides in agriculture, killing not only soil micro-fauna but also macro-fauna (Carson 1962). She foresaw a silent natural world without the songs of insects, frogs, and birds, if they were lost due to urbanization or chemical pollution. She was one of the first to consider animal sounds as an expression of ecosystem integrity and quality. Kerr and Cihlar (2004) found a correlation between high-intensity, high-biomass agriculture and high numbers of endangered species on both national and regional levels in Canada.
Danielsen and Heegaard (1995) compared the primates, squirrels, tree-shrews, and bats between undisturbed, logged, and transformed patches of forest (i.e., to rubber and oil palm plantations) in eastern Sumatra, Indonesia. Logging changed the composition of bird species, revealing a decrease in the number of specialized insectivorous species and an increase in insectivore-frugivore generalist species. The species richness of bats also decreased with a concomitant increase in abundance of the most dominant bat species. However, logging impacts differed between geographical regions and management strategies (e.g., conventional selective, salvage, or reduced-impact logging; Chaudhary et al. 2016; LaManna and Martin 2017). Land transformation to plantations resulted in a dramatic decrease in biodiversity with the disappearance of primates, squirrels, and tree-shrews as well as a reduction in bird and bat species richness by 90–95% and 75–87%, respectively.
7.4.3.3 Direct Takes
Accidental, illegal, or over-harvesting of animal species occurs in both terrestrial and aquatic habitats (e.g., Challender and MacMillan 2014; Anderson et al. 2020), resulting in population declines and species extinctions (Hoffmann et al. 2011; Dulvy et al. 2014). Perhaps one of the greatest examples is the removal of millions of whales during the nineteenth and twentieth centuries (Rocha Jr. et al. 2014), which unequivocally changed marine soundscapes world-wide. A modern example is the threat of disappearing Gulf corvina (Cynoscion othonopterus) choruses in the Colorado River delta because of overfishing (Erisman and Rowell 2017).
Overfishing can also result in excessive growth of algae, ultimately changing soundscapes. Freeman et al. (2018), for example, found a positive correlation between sound levels and macroalgae coverage on Hawaiian coral reefs, attributable to ringing bubbles emitted during photosynthesis.
7.4.3.4 Climate Change
The Earth is experiencing rapid climate change, affecting soundscapes in a variety of ways. The
species richness and abundance of birds, geophony is affected by changing weather
7 Analysis of Soundscapes as an Ecological Tool 239
patterns (i.e., wind, precipitation, and storms; Sueur et al. 2019). Rising temperatures reduce sea- and land ice, which is changing polar soundscapes (Intergovernmental Panel on Climate Change [IPCC] 2014). Climate change further modifies the acoustic properties of the environment with direct effects on sound propagation and thus the audible distances of sounds. Larom et al. (1997) calculated that the effective communication range for African elephant calls varied between 2 and 10 km with temperature and windspeed. Ocean acidification, as a result of climate change, results in less absorption of low-frequency sounds (Gazioğlu et al. 2015). Thus, low-frequency sound sources, such as ships and whales, may become more prominent in future marine soundscapes.

Climate change may also directly affect a species’ vocal behavior, distribution pattern, or timing of behavioral events, such as migration and mating (Krause and Farina 2016; Sueur et al. 2019). Narins and Meenderink (2014) found that Puerto Rican coqui frogs (Eleutherodactylus coqui), over a period of 23 years, moved to higher altitudes, while their calls increased in pitch and decreased in duration. These changes in distribution and call characteristics corresponded to an overall increase in temperature of 0.37 °C, with a concomitant decrease in body size. A different response was seen in four frog species near Ithaca, NY, USA, which advanced the start of their breeding season by 14 days between 1900–1912 and 1990–1999, as evident from recordings of mating calls (Gibbs and Breisch 2001). During this time, temperatures increased on average 0.7–1.7 °C. Insects also depend on air temperature for the expression of their behavior, including sound emission (Ciceran et al. 1994). Rossi et al. (2016a, b) found that snapping shrimp (family Alpheidae) reduced their snap rate (i.e., snaps per minute) and intensity under increased levels of CO2. This might affect the behavior of species that rely on acoustic cues from snapping shrimp for navigation (Rossi et al. 2016b).

The eastern Chukchi Sea beluga whale (Delphinapterus leucas) population delayed timing of migration from foraging habitats by 2–4 weeks, corresponding to a delay in regional sea-ice freeze-up (Hauser et al. 2016). These examples stress the importance of collecting environmental data together with acoustic data, to correlate changes in animal distribution patterns and behavior with environmental change (Kloepper and Simmons 2014).

7.5 How to Analyze Soundscapes

Soundscape analysis may involve various, sometimes sequential, methods ranging from listening to recordings, via visual inspection of spectrograms, to automated detection of target signals, and computation of several acoustic metrics. Often, the larger the acoustic monitoring project, the more automated the tools, as long-term projects, which might compare multiple recording sites, might gather terabytes of data, which are virtually impossible to analyze by hand.

7.5.1 Standard Soundscape Measurements

Initial assessments of soundscapes typically involve the computation of spectrograms and some general statistics, such as the broadband root-mean-square (rms) Sound Pressure Level (SPLrms) in either dB re 20 μPa or dB re 1 μPa in air and water, respectively (see Chap. 4). This allows an initial quality-check of the recordings and the identification of potential spatial or temporal patterns in overall sound levels, highlighting areas or temporal events of interest for further investigation (e.g., very quiet or very noisy areas or times of day, Fig. 7.21). However, broadband SPLrms levels are strongly influenced by the noisiest events and cannot identify the myriad of soundscape components and contributors to spatial and temporal differences.

As sound sources are often known to cover certain frequency bands, it is beneficial to compute SPLs within purposefully chosen frequency bands or standard octave or 1/3 octave bands. Buscaino et al. (2016) used Octave Band Levels (OBLs) at center frequencies from 62.5 Hz to 64 kHz to study temporal patterns in the
240 R. P. Schoeman et al.
Fig. 7.21 Spectrograms (top) and time series (bottom) of broadband (20 Hz–22 kHz) sound pressure levels of a 24-h recording period at three sites around Bora Bora Island, French Polynesia. Recording schedule was set at 60 s every 10 min. Note the increase in sound levels at night (shaded areas) as well as the strong fluctuation in sound levels between 60-s segments (Bertucci et al. 2020). Reprinted by permission from Springer Nature. Bertucci F, Guerra AS, Sturny V, et al., A preliminary acoustic evaluation of three sites in the lagoon of Bora Bora, French Polynesia. Environ Biol Fishes 103:891–902; https://doi.org/10.1007/s10641-020-01000-8. © Springer Nature, 2020. All rights reserved
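
The broadband SPLrms and the octave bands described in Sect. 7.5.1 can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies. The function names, the synthetic 440-Hz test tone, and the sampling rate are assumptions chosen for illustration, and the band edges assume the usual factor of the square root of 2 around each center frequency.

```python
import math

P_REF_AIR = 20e-6    # reference pressure in air: 20 uPa, expressed in Pa
P_REF_WATER = 1e-6   # reference pressure in water: 1 uPa, expressed in Pa

def spl_rms(pressure_pa, p_ref):
    """Broadband rms sound pressure level in dB re p_ref (pressures in Pa)."""
    mean_square = sum(p * p for p in pressure_pa) / len(pressure_pa)
    return 10.0 * math.log10(mean_square / (p_ref * p_ref))

def octave_bands(f_lowest, f_highest):
    """List of (center, lower edge, upper edge) in Hz, doubling from f_lowest."""
    bands = []
    fc = f_lowest
    while fc <= f_highest * (1 + 1e-9):
        bands.append((fc, fc / math.sqrt(2), fc * math.sqrt(2)))
        fc *= 2.0
    return bands

# Hypothetical test signal: 1 s of a 440-Hz tone with 1 Pa amplitude
fs = 8000
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]

level_air = spl_rms(tone, P_REF_AIR)      # dB re 20 uPa
level_water = spl_rms(tone, P_REF_WATER)  # dB re 1 uPa
bands = octave_bands(62.5, 64000.0)       # the range used by Buscaino et al. (2016)
```

The same pressure time series yields a level about 26 dB higher when referenced to 1 μPa than when referenced to 20 μPa, which is why levels in air and water cannot be compared without accounting for the reference pressure.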
Fig. 7.22 Spectrograms highlighting the difference in vocalizations between 14 different tanager species, which can be used to monitor behavior and response to environmental change (Mason and Burns 2015). Reprinted by permission from Oxford University Press. Mason NA, Burns KJ, The effect of habitat and body size on the evolution of vocal displays in Thraupidae (tanagers), the largest family of songbirds. Biol J Linn Soc 114:538–551; https://doi.org/10.1111/bij.12455. © The Linnean Society of London, 2015; https://global.oup.com/academic/rights/permissions/. All rights reserved. Reuse requires permission from OUP
2001), birds (Fig. 7.22; e.g., Jahn et al. 2017), and mammals (e.g., Nijman 2001; Parks et al. 2007). Similarly, the sounds of the geophony and anthropophony have characteristic spectral features by which they can be identified.

Studies differ, however, in their methodology to identify sound sources. By listening to sounds while observing their spectrograms in real-time (see Sect. 7.5.3.1), experts can employ their personal experience to separate biotic and abiotic sounds and to identify species. Alternatively, sounds can be compared to labeled recordings in sound libraries (see URLs at the end of this chapter) and spectrograms can be compared to those found in the literature. However, manual inspection of sound files is labor intensive; and so, some studies make use of automatic detection and classification software (see Chap. 8).

7.5.3 Visual Displays of Soundscapes

7.5.3.1 Spectrograms
A spectrogram displays acoustic power density as a function of time and frequency. Each column in the spectrogram is a result of Fourier-transforming a section of the recorded time series of sound pressure. The frequency and time resolutions of the spectrogram are affected by
the window length and type of window function used (see Chap. 4). Techniques such as zero-padding (i.e., expanding a time window with zeros) and overlapping time windows may enhance the apparent resolution in frequency and time. Each pixel (or cell) of the spectrogram eventually represents an average sound power, averaged into time and frequency bins.

Spectrograms are a useful tool to examine the time, frequency, and amplitude details of a sound at different time scales, potentially identifying the sound source. Spectrograms that contain the vocalizations of multiple sound sources can provide information on species vocal dynamics, acoustic niches, and how animals may be affected by acoustic changes in their surroundings. For example, mixed anuran species’ breeding choruses in Minnesota, USA, revealed acoustic niche partitioning within the frequency domain (Fig. 7.23), while fin whale vocalizations were masked by ship noise in Italy (Fig. 7.24).
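
How window length and overlap shape a spectrogram can be sketched in a few lines of Python. This is a deliberately simple illustration using a direct discrete Fourier transform and a Hann window, not the method of any software cited in this chapter. The function names and the synthetic 20-Hz tone (loosely inspired by the fin whale example) are assumptions.

```python
import cmath
import math

def hann(n_points):
    """Periodic Hann window of length n_points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]

def power_spectrum(frame, window):
    """Squared magnitude of the DFT of one windowed frame (bins 0..N/2)."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, window)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectrogram(signal, n_window, overlap):
    """List of spectrogram columns; each column is one power spectrum."""
    hop = n_window - overlap
    window = hann(n_window)
    columns = []
    for start in range(0, len(signal) - n_window + 1, hop):
        columns.append(power_spectrum(signal[start:start + n_window], window))
    return columns

# Hypothetical test signal: 1 s of a 20-Hz tone sampled at 1 kHz
fs = 1000
signal = [math.sin(2 * math.pi * 20 * n / fs) for n in range(1000)]

n_window = 200                        # 0.2-s window
spec = spectrogram(signal, n_window, overlap=100)
freq_resolution = fs / n_window       # bin spacing in Hz
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_frequency = peak_bin * freq_resolution
```

Doubling `n_window` halves the bin spacing but also halves the number of independent columns per second, which is the trade-off between frequency and time resolution noted above. Overlap adds columns without adding independent information.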
Fig. 7.23 Anuran choruses recorded in Minnesota comprising calls of four species. Note the occupation of different frequency bands by these species, suggesting acoustic niche partitioning within the frequency domain. Modified image; © Nityananda and Bee (2011); https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0021191. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
[Figure 7.24: two spectrogram panels, A and B; Frequency [Hz] from 0 to 125 versus Time [s] from 0 to 150]
Fig. 7.24 Spectrograms of (a) 20-Hz fin whale vocalizations off Sicily, Italy, and (b) a passing ship, which masked the
fin whale sounds
Fig. 7.25 Spectrograms of the marine soundscape in the Perth Canyon, Australia. Middle panel shows a 3-week LTSA, computed with a 10-min observation window. The surrounding panels display short-term spectrograms of example sounds (Erbe et al. 2016b)
Fig. 7.26 LTSmax spectrograms from the same location (Sasso Fratino Integral Nature Reserve, Italy) on three different dates and under different weather conditions. Biophony is concentrated between 1.5 and 9 kHz and decreased in August. LTSmax produced with SeaPro software by combining 48 frames of 10 min each, recorded every 30 min (Righini and Pavan 2020)
events, which would otherwise be averaged and potentially missed in LTSAs. Fig. 7.26 shows three 24-h LTSmax of an Italian soundscape on different dates and under different weather conditions (Righini and Pavan 2020). The images show sound sources present from midnight to midnight: (top) one day in June 2015 with some bursts of rain, (middle) one day with good weather and a clear image of the biophony concentrated between dawn and dusk in the frequency range 1.5–9 kHz, and (bottom) one day recorded in August, with a less dense biophony during daylight hours but Orthopteran choruses in the night. In August, a short period of light rain is also shown on the left side. In addition, the stream noise below 1 kHz in August was lower than in June. The faint band between 12 and 18 kHz present in all 3 panels was due to the intrinsic noise of the recorder.

7.5.3.2 Power Spectral Density Percentile Plots
While spectrograms (including LTSAs) show how the sound spectrum changes over time (from one FFT window to the next or from one LTSA observation window to the next), there might be a need to quantify this variability. Power spectral density (PSD) percentile plots quantify the spectrum variability over the duration of a temporal analysis window. PSD is plotted against frequency. At each frequency, several percentile levels are shown, commonly the median (50th percentile) and the quartiles (25th and 75th percentiles), but perhaps also additional percentiles (e.g., 1st, 5th, 95th, and 99th). The nth percentile gives the levels that were exceeded n% of the time. There is no standard for the length of the temporal analysis window, and selection depends on the specific study questions. Temporal analysis windows of 24 h, one season, or one full year are common. Dominant contributors to the soundscape can then be identified by the shape and levels of the curves. Additional information is provided by plotting the Spectral Probability Density (SPD) as background colors that represent the probability of levels being reached based on normalized histograms of sound levels within each frequency bin (Fig. 7.27; Merchant et al. 2013). Merchant et al. (2015) gave detailed information on how to compute PSDs and SPDs with their publicly available software PAMGuide. Also see Chap. 4.
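
The percentile convention used here, in which the nth percentile is the level exceeded n% of the time, can be made concrete with a short Python sketch. It is an illustration only and does not reproduce PAMGuide. The function names, the linear interpolation between ranked values, and the small table of PSD levels are assumptions.

```python
def exceedance_level(levels_db, n_percent):
    """Level exceeded n_percent of the time, by linear interpolation of ranks."""
    ordered = sorted(levels_db, reverse=True)   # loudest first
    position = (n_percent / 100.0) * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

def psd_percentiles(psd_db_by_time, percents=(1, 5, 25, 50, 75, 95, 99)):
    """For each percentile, one level per frequency bin.

    psd_db_by_time[t][k] is the PSD level (dB) in time window t, frequency bin k.
    """
    n_freq = len(psd_db_by_time[0])
    result = {}
    for p in percents:
        result[p] = [
            exceedance_level([row[k] for row in psd_db_by_time], p)
            for k in range(n_freq)
        ]
    return result

# Hypothetical PSD levels: 5 time windows (rows) x 2 frequency bins (columns)
psd = [
    [60, 50],
    [62, 50],
    [64, 50],
    [66, 50],
    [68, 90],   # one loud transient in the second frequency bin
]
curves = psd_percentiles(psd)
```

In the second frequency bin, a single loud event raises the 1st percentile curve while leaving the median at 50 dB. This is the kind of separation between strongest and most common sound sources that a percentile plot makes visible.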
[Figure 7.27: power spectral density percentile curves (1%, 5%, 25%, 50%, 75%, 95%, 99%) versus Frequency [Hz] from 10^1 to 10^3, with probability density (0 to 0.08) as background color; annotations "Fish" and "Fish ?"]
Fig. 7.27 Plot of power spectral density percentiles and probability density for the annual soundscape of the Perth Canyon, Australia. The strongest sound sources were pygmy blue whales and nearby ships at 10–200 Hz, humpback whales at 300 Hz, and fishes at 2 kHz, whereas the most common sound sources were distant shipping at 10–100 Hz and wind at 300 Hz–3 kHz (Erbe et al. 2016b)
Fig. 7.28 Noise-map of a roadway in an urban area. Red indicates highest noise levels and green represents the quietest areas. © Cai et al. 2018; https://www.hindawi.com/journals/jat/2018/7031418/fig4/. Published under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
the biophony, geophony, and anthropophony separately or in comparison. Acoustic indices can be used as a tool to assess the quality of soundscapes and the underlying ecosystem. Historically, researchers assessed the number of species (i.e., species richness) and number of individuals belonging to each species (i.e., species evenness) by counting the number of acoustic identifications while walking along survey transects or listening to recordings (Obrist et al. 2010). However, this approach is inefficient, subjective, and limited to brief observation times. In contrast, a transect or grid of automated recording systems allows acoustic surveys in remote areas, over extended periods, and in most field conditions (Acevedo and Villanueva-Rivera 2006).

To support the analyses and interpretation of consequent large datasets, researchers have been developing acoustic indices that summarize and score the structure and distribution of acoustic power over frequency and/or time, reflecting a correlation with species presence and distribution (e.g., Towsey et al. 2014). While traditionally developed for terrestrial communities, acoustic indices are now also increasingly applied to the aquatic environment (e.g., Parks et al. 2014; Harris et al. 2016; Bolgan et al. 2018a). In particular when the same instruments and protocols are used, acoustic indices allow for comparisons of soundscapes between multiple sites recorded over the same period or an evaluation of the changes of a soundscape over time (Righini and Pavan 2020; Farina et al. 2021).

Examples of acoustic indices include:

1. Bioacoustic Index (BI): Aims to quantify biophonic activity by thresholding spectral power in biophony-specific frequency bands (Fig. 7.31; Boelman et al. 2007),
2. Entropy Index (H): Equals the product of two sub-indices, spectral (Hf) and temporal entropy (Ht), computed on the average frequency spectrum and on the Hilbert amplitude envelope of the raw bioacoustic signal, respectively (Sueur et al. 2008b),
3. Acoustic Diversity Index (ADI): Divides the spectrum into specific frequency bins, selects the bins surpassing a preset power threshold, and applies the Shannon entropy to these bins (Villanueva-Rivera et al. 2011),
4. Acoustic Evenness Index (AEI): Divides the spectrum into specific frequency bins, selects the bins surpassing a preset power threshold, and considers the distribution of strong frequency bins by computing the Gini coefficient (Villanueva-Rivera et al. 2011),
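
The thresholding step shared by the ADI and AEI, followed by the Shannon entropy or the Gini coefficient, can be sketched in Python as below. This is a simplified reading of the descriptions above and is not the implementation of any published package. The function names, the band width, the −50 dB threshold, and the two toy spectrograms are assumptions.

```python
import math

def band_proportions(spectrogram_db, bins_per_band, threshold_db):
    """Fraction of cells above threshold in each frequency band.

    spectrogram_db[t][k] is the level (dB) in time window t, frequency bin k.
    """
    n_freq = len(spectrogram_db[0])
    proportions = []
    for start in range(0, n_freq, bins_per_band):
        stop = min(start + bins_per_band, n_freq)
        cells = [row[k] for row in spectrogram_db for k in range(start, stop)]
        proportions.append(sum(1 for c in cells if c > threshold_db) / len(cells))
    return proportions

def shannon_entropy(proportions):
    """Shannon entropy (natural log) of the normalized band proportions."""
    total = sum(proportions)
    if total == 0:
        return 0.0
    shares = [p / total for p in proportions if p > 0]
    return -sum(s * math.log(s) for s in shares)

def gini(values):
    """Gini coefficient: 0 for perfectly even values, larger when uneven."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if total == 0:
        return 0.0
    weighted = sum((i + 1) * v for i, v in enumerate(ordered))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n

# Toy spectrograms: 2 time windows x 8 frequency bins, grouped into 4 bands
even = [[-40] * 8, [-40] * 8]                     # all bands active
uneven = [[-40, -40] + [-60] * 6 for _ in range(2)]  # only the lowest band active

p_even = band_proportions(even, bins_per_band=2, threshold_db=-50)
p_uneven = band_proportions(uneven, bins_per_band=2, threshold_db=-50)

adi_even, aei_even = shannon_entropy(p_even), gini(p_even)
adi_uneven, aei_uneven = shannon_entropy(p_uneven), gini(p_uneven)
```

When activity is spread evenly over all bands, the entropy-based value is at its maximum and the Gini-based value is zero. When activity is confined to a single band, the two values move in opposite directions, so the two indices carry largely complementary information.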
Fig. 7.30 Maps of (a) harbor porpoise (Phocoena phocoena) density, (b) audiogram-weighted ship noise, (c) areas of risk (i.e., high animal density and high noise), and (d) areas of opportunity (i.e., high animal density and low noise) in British Columbia, Canada. © Williams et al. 2015; https://doi.org/10.1016/j.marpolbul.2015.09.012. Licensed under CC BY-NC-ND 4.0; https://creativecommons.org/licenses/by-nc-nd/4.0/
Fig. 7.31 Bioacoustic Index (BI) and Acoustic Complexity Index (ACI) for three Italian locations in the Integral Nature Reserve of Sasso Fratino, Italy, showing a strong peak at sunrise, followed by a gradual decline with a second peak at sunset
species’ signatures by listening, by observing spectrograms, and by using sound recognition tools to identify the presence and recurrence of defined sound models. The R package monitoR (Katz et al. 2016) can be used to identify user-defined sound models.

It should be noted that acoustic indices applied in two different environments can produce confounding results and so the robustness of these indices to environmental change and to different soundscape compositions has been questioned (Harris et al. 2016; Bolgan et al. 2018a).

Parks et al. (2014) found that seismic airgun pulses interfered with the Entropy Index, which therefore did not accurately reflect species richness within the Atlantic Ocean where seismic surveys were commonly detected. Bolgan et al. (2018a) assessed the robustness of the Acoustic Complexity Index to fine variations in fish sound abundance (i.e., number of sounds) and diversity (i.e., number of different calls); both changed index values. Hence, it would be difficult to infer whether a change in this index resulted from a change in fish abundance or fish species diversity. Biophony and anthropophony can overlap in frequency and time as well as vary with frequency and time. Acoustic index performance depends greatly on the frequency and time resolutions used in the computation of the various quantities and is affected by temporal (and spatial) patterns as well as local (and temporally variable) sound propagation conditions (Mooney et al. 2020). As a result, acoustic indices are sometimes tuned for specific environments, limiting comparability across environments and time.

7.6 Applications of Soundscape Studies

Soundscape studies can reveal information on animal distribution, abundance, and behavior; species diversity; and changes of all of these over time under environmental and human influences. Hence, soundscape analyses can be used as ecological tools to understand, conserve, and restore soundscapes as part of conservation management plans (Pavan 2017).

7.6.1 Conservation of Natural Soundscapes

7.6.1.1 Management
Documenting, analyzing, and understanding a soundscape can provide important information for wildlife and habitat managers on species
richness, animal behavior patterns, effects of anthropogenic sounds, land-use, and climate change. Documenting relatively pristine soundscapes before they disappear (Righini and Pavan 2020; Farina et al. 2021) can aid re-establishment of degraded acoustic habitats through habitat restoration, animal relocation, elimination of invasive species, or restrictions of activities that generate anthropogenic sound and affect animal behavior. The success of soundscape restoration can then be demonstrated through acoustic monitoring and analysis (Pavan 2017).

Development and implementation of a comprehensive acoustic monitoring program can aid management of a protected area in several ways. Firstly, storage of quantitative data about the acoustic environment can be used to create pivotal repositories for immediate or future analyses of spatial and temporal patterns and differences at large scales. LTSA spectrograms, for example, provide a summary of day-by-day acoustic settings and the possibility to display information, not only on the diversity of acoustic species (as in a census) but also on the density and richness of the biophonic components. The study of an Integral Nature Reserve (Sasso Fratino, Casentinesi Forests National Park, Italy) demonstrated that the biophony dominated both geophony and anthropophony, with undisturbed daily cycles (Righini and Pavan 2020; Farina et al. 2021).

Secondly, monitoring soundscapes can help managers detect unwanted and unlawful activities in protected areas. Human voices can be used to identify trespassers, gunshots to locate hunters and poachers, humming chainsaws to find illegal logging, vehicle sounds to document unauthorized vehicle use, and sounds from livestock to pinpoint unlawful grazing. Wrege et al. (2017) found that gunshot sounds within a closed-canopy forest of the Congo could be detected over a 7–10 km² area, depending on the gun used and orientation to the acoustic receiver. Eight years of acoustic monitoring did not reveal a correlation between illegal hunting of forest elephants (Loxodonta cyclotis) and time of day or season. However, hunting intensity seemingly decreased after initiating patrols in 2009, highlighting the potential use of soundscape studies to monitor for illegal human activities and to assess the effectiveness of conservation efforts. Investigation of underwater soundscapes can also aid in the detection of foreign vessels by the military, unauthorized commercial fishing vessels, unlawful vessels in restricted areas (i.e., no-go zones or marine protected areas; Kline et al. 2020), and illegal fishing activities with explosives (Xu et al. 2020).

7.6.1.2 Education
The rates of biodiversity loss, habitat loss, invasion of alien species, and species extinctions are high (Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services [IPBES] 2019). Helping citizens and stakeholders appreciate biodiversity is a necessity to establish a general willingness to address anthropogenic causes of ecosystem demise. In this context, animal sound and soundscape recordings not only serve science but have the potential to trigger people’s curiosity to learn more about the importance of ecosystems and their preservation, which will lead to conservation efforts. Such transfer of science, via education, to conservation has been demonstrated in several case studies (e.g., Padua 1994; Macharia et al. 2010; Pavan 2017; Barthel et al. 2018). Exhibits and educational programs on the sounds from nature in museums, zoos, park visitor centers, and websites can stimulate interest in and care about the acoustic environment. An example is Bernie Krause’s Great Animal Orchestra exhibition (https://thevinylfactory.com/features/bernie-krause-great-animal-orchestra/; accessed 27 September 2020). Alternatively, listening to animal sounds during a guided nature walk can generate an appreciation for soniferous animals, which can result in long-term public engagement and commitment to conservation by citizen scientists. Soundscape studies can help to create publicly available sound libraries and help to identify areas within a park for visitors to experience songbirds, calling frogs, chorusing insects, waterfalls, rushing streams, etc. One example of integrating soundscape monitoring and education is the Natural Sound Program, established in
2000 by the U.S. National Park Service (National Park Service [NPS] 2000). This program aims to manage the acoustic environment while providing for educational and inspirational visitor experiences.

7.6.2 Monitoring the Health of Agroecosystems

High productivity from agricultural fields can be maintained through insecticides, pesticides, and fertilizers, but the use of these products may result in chemical pollution with consequent loss of plant and animal biodiversity (e.g., Carson 1962; Boatman et al. 2004; Kerr and Cihlar 2004; Kleijn et al. 2009). Hence, habitats connected to agricultural lands might exhibit poorer soundscapes. In contrast, organic farmers strive to maintain productivity through natural agroecosystems, ensuring environment quality and ecological balances. Bird, insect, amphibian, and bat communities serve as indicators of ecosystem health, and an agroecosystem should have a balance of mixed species that provide natural pest control. The ecological quality of an agroecosystem can therefore be evaluated by the species-richness of its soundscape (e.g., Hole et al. 2005; Kleijn et al. 2011; Pavan 2017). Doohan et al. (2019) identified bird and bat species-specific or guild-specific bioindicators as successful biomonitoring tools for agricultural industries. Systematic monitoring of biological sounds can provide an accurate and practical assessment tool for farmers, policymakers, researchers, and others interested in maintaining or restoring farmland ecosystems, and ultimately encourage the adoption of beneficial and sustainable farming practices.

(Wiseman et al. 2014), anthropogenic sound from mechanical devices (e.g., Wysocki et al. 2007; Scheifele et al. 2012b), background music (Scheifele et al. 2012a), and visitors (e.g., Quadros et al. 2014; Sherwen and Hemsworth 2019) is characteristic of many indoor, outdoor, and underwater animal holding facilities. O’Neal (1998), for example, found that underwater sound pressure levels were 25 dB (20–6400 Hz) louder in exhibits inside the Monterey Bay Aquarium than in a nearby natural offshore environment, predominantly due to sound from machinery. Similarly, Scheifele et al. (2012b) detected an increase in sound pressure levels by 10–20 dB (20 Hz–1 kHz) when air pumps were switched on within the Georgia Aquarium. These increases in sound levels can have adverse effects on animal welfare because of physiological and behavioral changes (e.g., Owen et al. 2004).

Sound sources that may impact animals might not be audible to humans, and so animal keepers might not be aware of acoustic disturbance to kept animals. For example, laboratory mice are sensitive to ultrasound, above the human hearing range. Laboratory equipment (e.g., air conditioners and lighting) may emit ultrasound and, unknown to humans, stress animals within these facilities (Sales et al. 1988). Identifying such sources is necessary for the improvement of acoustic conditions to increase captive animal welfare (De Queiroz 2018). Sound can further be exacerbated by hard reflective surfaces and the geometry of an exhibit; hence, some noise problems can be solved by improving exhibit design (Wark 2015; De Queiroz 2018). Restricting visitor group sizes, reducing operation hours, limiting the number of shows, and reducing the level of background music can also mitigate negative impacts of noise on captive animals.
lead to changes in ecosystem functioning and biodiversity. At present, natural soundscapes are disappearing at an unprecedented rate because of human interference. Human activities create sound, change land-use patterns, directly remove animals from their habitat through overharvesting and illegal hunting, and lead to climate change, thereby directly and indirectly affecting both geophony and biophony. Soundscape studies can be used as an ecological tool to study animal distribution, behavior, biodiversity, and the effects of environmental stressors (such as anthropogenic noise or climate change). Soundscape studies can subsequently inform conservation management and assess the effectiveness of management and conservation efforts.

7.8 Additional Resources

Below is a selection of free, online resources; last accessed 20 June 2022.

7.8.1 Sound Libraries

Sound libraries can serve as reference during the identification of sound sources. They are also an educational tool to create awareness of the myriad of sounds that may contribute to a soundscape.

• The Macaulay Library from the Cornell Lab of Ornithology contains a large collection of biophony: https://search.macaulaylibrary.org/catalog?view=List&searchField=animals
• The Discovery Of Sound In The Sea (DOSITS) website, developed by the University of Rhode Island Graduate School of Oceanography in partnership with Marine Acoustics Inc., contains an underwater sound library as well as a collection of easy-to-read scientific information on sound in the ocean: https://dosits.org
• The sounds of Australian and Antarctic marine mammals, Curtin University: https://cmst.curtin.edu.au/research/marine-mammal-bioacoustics/
• A collection of biophony, geophony, and various soundscape recordings from all over the world, the British Library: https://sounds.bl.uk/Environment
• Sounds recorded by National Park Service researchers in U.S. National Parks, such as Yellowstone National Park and Rocky Mountain National Park: https://www.nps.gov/subjects/sound/gallery.htm
• A collection of biophony (i.e., invertebrates, amphibians, fishes, reptiles, birds, and mammals), Museum für Naturkunde. Note that some sound descriptions are in German: https://www.museumfuernaturkunde.berlin/en/science/animal-sound-archive
• A collection of biophony, SeaWorld Parks and Entertainment: https://seaworld.org/animals/sounds/
• A collection of marine biophony, geophony, and anthropophony, Ocean Conservation Research: https://ocr.org/sound-library/
• The Xeno-Canto collection of animal recordings provided by scientists and amateur recordists: https://www.xeno-canto.org/
• Web pages of the University of Pavia about bioacoustics and ecoacoustics, including samples of sounds: http://www.unipv.it/cibra

7.8.2 Ocean Acoustic Observatories

Ocean acoustic observatories provide a continuous stream of acoustic data either in real-time or archived:

• Australia’s Integrated Marine Observing System (IMOS): https://imos.org.au/facilities/nationalmooringnetwork/acousticobservatories
• Indian Ocean Acoustic Observatory OHASISBIO: https://www-iuem.univ-brest.fr/lgo/les-chantiers/ohasisbio/?lang=en
• Listening to the Deep Ocean (LIDO): http://www.listentothedeep.net/
• Monterey Bay Aquarium Research Institute (MBARI): https://www.mbari.org/soundscape-listening-room/
• Characterization Of Recorded Underwater Sound (CHORUS), a MATLAB (The MathWorks Inc., Natick, MA, USA) graphic user interface developed by Curtin University: https://cmst.curtin.edu.au/products/chorus-software/ (Gavrilov and Parsons 2014).
• PAMGuard for passive acoustic monitoring: http://www.pamguard.org/download.php?id=108
• Triton Software Package, a MATLAB graphic user interface developed at Scripps Institution of Oceanography: http://www.cetus.ucsd.edu/technologies_triton.html
• OSPREY, a MATLAB graphic user interface developed by Oregon State University: https://www.mobysound.org/software.html
• R package seewave available for download from within RStudio: https://cran.r-project.org/web/packages/seewave/index.html
• R package soundecology available for download from within RStudio: https://cran.r-project.org/web/packages/soundecology/index.html
• R package bioacoustics available for download from within RStudio: https://cran.r-project.org/web/packages/bioacoustics/index.html
• SoundRuler for measuring acoustic signals: http://soundruler.sourceforge.net/main/
• Sound Analysis Pro for analysis of biophony: http://soundanalysispro.com
• SeaPro and SeaWave for recording, analysis, and real-time display of bioacoustic signals and biophony: http://www.unipv.it/cibra/seapro.html
• SoX, a command line tool for sound file manipulation and analysis: https://sourceforge.net/projects/sox/
• Raven Lite to record, save, and visualize sounds as spectrograms and waveforms: https://ravensoundsoftware.com/software/raven-lite/

• The Acoustic Toolbox User interface and Post processor (AcTUP) written in MATLAB for modeling range-independent and range-dependent environments: http://cmst.curtin.edu.au/products/underwater/ (Duncan and Maggi 2006).
• Graphical user interface i-Simpa suitable for 3D indoor sound propagation modeling as well as for modeling of environmental noise: https://i-simpa.ifsttar.fr/download/download0/
• Software tool created by the openPSTD project to aid sound propagation modeling in urban environments: http://www.openpstd.org/Download%20openPSTD.html
• The NoiseModelling tool designed to create environmental noise maps of large urban areas: https://noise-planet.org/noisemodelling.html
• The ArcGIS toolbox SPreAD-GIS for modeling engine noise propagation in natural areas incorporating atmospheric, wind, vegetation, and terrain effects (Reed et al. 2010).

7.8.5 Software for Automatic Signal Detection

Some of the software packages for soundscape analysis include signal detectors:

• CHORUS includes detectors for pygmy blue whale song, fin whale 20-Hz downsweeps, and an unidentified spot-call.
• PAMGuard includes detectors for odontocete and mysticete vocalizations.

Other automatic signal detection resources:

• R package monitoR available for download from: https://cran.r-project.org/web/packages/monitoR/index.html
• Ishmael: http://bioacoustics.us/ishmael.html
sources influence acoustic levels? Proc Meet Acoust 27:070004. https://doi.org/10.1121/2.0000260
Bolgan M, Amorim MCP, Fonseca PJ et al (2018a) Acoustic complexity of vocal fish communities: a field and controlled validation. Sci Rep 8:10559. https://doi.org/10.1038/s41598-018-28771-6
Bolgan M, O’Brien J, Chorazyczewska E et al (2018b) The soundscape of the Arctic Charr spawning grounds in lotic and lentic environments: can passive acoustic monitoring be used to detect spawning activities? Bioacoustics 27:57–85. https://doi.org/10.1080/09524622.2017.1286262
Bolin K (2009) Prediction method for wind-induced vegetation noise. Acta Acust United Acust 95:607–619. https://doi.org/10.3813/AAA.918189
Bond AB, Diamond J (2005) Geographic and ontogenetic variation in the contact calls of the kea (Nestor notabilis). Behaviour 142:1–20. https://doi.org/10.1163/1568539053627721
Borelli D, Gaggero T, Rizzuto E, Schenone C (2016) Holistic control of ship noise emissions. Noise Mapp 3:107–119. https://doi.org/10.1515/noise-2016-0008
Bourgeois K, Curé C, Legrand J et al (2007) Morphological versus acoustic analysis: what is the most efficient method for sexing yelkouan shearwaters Puffinus yelkouan? J Ornithol 148:261–269. https://doi.org/10.1007/s10336-007-0127-3
Bowling DL, Garcia M, Dunn JC et al (2017) Body size and vocalization in primates and carnivores. Sci Rep 7:41070. https://doi.org/10.1038/srep41070
Bozkurt TS, Demirkale SY (2017) The field study and numerical simulation of industrial noise mapping. J Build Eng 9:60–75. https://doi.org/10.1016/j.jobe.2016.11.007
Brady J (1974) The physiology of insect circadian rhythms. Adv Insect Phys 10:1–115. https://doi.org/10.1016/S0065-2806(08)60129-0
Bregman AS (1990) Auditory scene analysis: the perceptual organization of sound. The MIT Press, Cambridge
Brown A, Garg S, Montgomery J (2019) Automatic rain and cicada chorus filtering of bird acoustic data. Appl Soft Comput 81:105501. https://doi.org/10.1016/j.asoc.2019.105501
Brunetti AE, Saravia AM, Barrionuevo JS, Reichle S (2017) Silent sounds in the Andes: underwater vocalizations of three frog species with reduced tympanic middle ears (Anura: Telmatobiidae: Telmatobius). Can J Zool 95:335–343. https://doi.org/10.1139/cjz-2016-0177
Burbidge T, Parson T, Caycedo-Rosales PC et al (2015) Playbacks revisited: asymmetry in behavioural response across an acoustic boundary between two parapatric bird species. Behaviour 152:1933–1951. https://doi.org/10.1163/1568539X-00003309
Buscaino G, Ceraulo M, Pieretti N et al (2016) Temporal patterns in the soundscape of the shallow waters of a Mediterranean marine protected area. Sci Rep 6:34230. https://doi.org/10.1038/srep34230
Buzzetti F, Brizio C, Pavan G (2020) Beyond the audible: wide band (0-125 kHz) field investigation on Italian Orthoptera (Insecta) songs. Biodivers J 11:443–496. https://doi.org/10.31396/Biodiv.Jour.2020.11.2.443.496
Cai M, Yao Y, Wang H (2018) Urban traffic noise maps under 3D complex building environments on a supercomputer. J Adv Transp 2018:7031418. https://doi.org/10.1155/2018/7031418
Caorsi VZ, Both C, Cechin S et al (2017) Effects of traffic noise on the calling behavior of two Neotropical hylid frogs. PLoS One 12:e0183342. https://doi.org/10.1371/journal.pone.0183342
Carson R (1962) Silent Spring. Houghton Mifflin Company, Boston
Caruso F, Alonge G, Bellia G et al (2017) Long-term monitoring of dolphin biosonar activity in deep pelagic waters of the Mediterranean Sea. Sci Rep 7:4321. https://doi.org/10.1038/s41598-017-04608-6
Catchpole CK, Slater PJR (2008) Bird song: biological themes and variations. Cambridge University Press, Cambridge
Cato DH (2008) Ocean ambient noise: its measurement and its significance to marine animals. In: Proceedings of the Institute of Acoustics. Institute of Acoustics, Southampton, pp 1–9
Cerchio S, Dahlheim M (2001) Variation in feeding vocalizations of humpback whales Megaptera novaeangliae from southeast Alaska. Bioacoustics 11:277–295. https://doi.org/10.1080/09524622.2001.9753468
Chabert T, Colin A, Aubin T et al (2015) Size does matter: crocodile mothers react more to the voice of smaller offspring. Sci Rep 5:15547. https://doi.org/10.1038/srep15547
Challender DWS, MacMillan DC (2014) Poaching is more than an enforcement problem. Conserv Lett 7:484–494. https://doi.org/10.1111/conl.12082
Chapman NR, Price A (2011) Low frequency deep ocean ambient noise trend in the northeast Pacific Ocean. J Acoust Soc Am 129:EL161–EL165. https://doi.org/10.1121/1.3567084
Charrier I, Mathevon N, Jouventin P, Aubin T (2001) Acoustic communication in a black-headed gull colony: how do chicks identify their parents? Ethology 107:961–974. https://doi.org/10.1046/j.1439-0310.2001.00748.x
Chaudhary A, Burivalova Z, Koh LP, Hellweg S (2016) Impact of forest management on species richness: global meta-analysis and economic trade-offs. Sci Rep 6:23954. https://doi.org/10.1038/srep23954
Cheney DL, Seyfarth RM (1996) Function and intention in the calls of non-human primates. In: Runciman WG, Smith JM, Dunbar RIM (eds) Proceedings of the British Academy, Evolution of social behaviour patterns in primates and man, vol 88. Oxford University Press, Oxford, pp 59–76
Cheney DL, Seyfarth RM (2018) Flexible usage and social function in primate vocalizations. Proc Natl Acad Sci
USA 115:1974–1979. https://doi.org/10.1073/pnas.1717572115
Ciceran M, Murray AM, Rowell G (1994) Natural variation in the temporal patterning of calling song structure in the field cricket Gryllus pennsylvanicus: effects of temperature, age, mass, time of day, and nearest neighbor. Can J Zool 72:38–42. https://doi.org/10.1139/z94-006
Clark CW (1982) The acoustic repertoire of the southern right whale, a quantitative analysis. Anim Behav 30:1060–1071. https://doi.org/10.1016/S0003-3472(82)80196-6
Clark CJ (2021) Ways that animal wings produce sound. Integr Comp Biol 61:696–709. https://doi.org/10.1093/icb/icab008
Clausen KT, Wahlberg M, Beedholm K et al (2010) Click communication in harbour porpoises Phocoena phocoena. Bioacoustics 20:1–28. https://doi.org/10.1080/09524622.2011.9753630
Coquereau L, Grall J, Chauvaud L et al (2016) Sound production and associated behaviours of benthic invertebrates from a coastal habitat in the north-east Atlantic. Mar Biol 163:127. https://doi.org/10.1007/s00227-016-2902-2
Courts R, Erbe C, Wellard R et al (2020) Australian long-finned pilot whales (Globicephala melas) emit stereotypical, variable, biphonic, multi-component, and sequenced vocalisations, similar to those recorded in the northern hemisphere. Sci Rep 10:20609. https://doi.org/10.1038/s41598-020-74111-y
Crowley SR, Pietruszka RD (1983) Aggressiveness and vocalization in the leopard lizard (Gambelia wislizennii): the influence of temperature. Anim Behav 31:1055–1060. https://doi.org/10.1016/S0003-3472(83)80012-8
Cunnington GM, Fahrig L (2010) Plasticity in the vocalizations of anurans in response to traffic noise. Acta Oecologica 36:463–470. https://doi.org/10.1016/j.actao.2010.06.002
Cure C, Aubin T, Mathevon N (2009) Acoustic convergence and divergence in two sympatric burrowing nocturnal seabirds. Biol J Linn Soc 96:115–134. https://doi.org/10.1111/j.1095-8312.2008.01104.x
Dahl PH, Dall’Osto DR, Farrell DM (2015) The underwater sound field from vibratory pile driving. J Acoust Soc Am 137:3544–3554. https://doi.org/10.1121/1.4921288
Danielsen F, Heegaard M (1995) Impact of logging and plantation development on species diversity: a case study from Sumatra. In: Sandbukt Ø (ed) Management of tropical forests: towards an integrated perspective. Centre for Development and the Environment, University of Oslo, Oslo, pp 73–92
Davies WJ, Adams MD, Bruce NS et al (2013) Perception of soundscapes: an interdisciplinary approach. Appl Acoust 74:224–231. https://doi.org/10.1016/j.apacoust.2012.05.010
De Queiroz MB (2018) How does the zoo soundscape affect the zoo experience for animals and visitors? University of Salford, Manchester
Deichmann JL, Hernández-Serna A, Delagado CJA et al (2017) Soundscape analysis and acoustic monitoring document impacts of natural gas exploration on biodiversity in a tropical forest. Ecol Indic 74:39–48. https://doi.org/10.1016/j.ecolind.2016.11.002
Dentressangle F, Aubin T, Mathevon N (2012) Males use time whereas females prefer harmony: individual call recognition in the dimorphic blue-footed booby. Anim Behav 84:413–420. https://doi.org/10.1016/j.anbehav.2012.05.012
Dey M, Krishnaswamy J, Morisaka T, Kelkar N (2019) Interacting effects of vessel noise and shallow river depth elevate metabolic stress in Ganges river dolphins. Sci Rep 9:15426. https://doi.org/10.1038/s41598-019-51664-1
Di GQ, Lin QL, Li ZG, Kang J (2014) Annoyance and activity disturbance induced by high-speed railway and conventional railway noise: a contrastive case study. Environ Health 13:12. https://doi.org/10.1186/1476-069X-13-12
Dingle C, Halfwerk W, Slabbekoorn H (2008) Habitat-dependent song divergence at subspecies level in the grey-breasted wood-wren. J Evol Biol 21:1079–1089. https://doi.org/10.1111/j.1420-9101.2008.01536.x
Dingle C, Poelstra JW, Halfwerk W et al (2010) Asymmetric response patterns to subspecies-specific song differences in allopatry and parapatry in the gray-breasted wood-wren. Evolution 64:3537–3548. https://doi.org/10.1111/j.1558-5646.2010.01089.x
Doohan B, Fuller S, Parsons S, Peterson EE (2019) The sound of management: acoustic monitoring for agricultural industries. Ecol Indic 96:739–746. https://doi.org/10.1016/j.ecolind.2018.09.029
Dragoset B (2000) Introduction to air guns and air-gun arrays. Lead Edge 19:892–897. https://doi.org/10.1190/1.1438741
Drozdova L, Butorina M, Kuklin D (2019) Evaluation and reduction of the common effect of road and rail noise. In: Proceedings of the 26th International Congress on Sound and Vibration. International Institute of Acoustics and Vibration, Montreal
Duarte MHL, Caliari EP, Scarpelli MDA et al (2019) Effects of mining truck traffic on cricket calling activity. J Acoust Soc Am 146:656–664. https://doi.org/10.1121/1.5119125
Duffy JE (1996) Eusociality in coral-reef shrimp. Nature 381:512–514. https://doi.org/10.1038/381512a0
Duffy JE, Macdonald KS (1999) Colony structure of the social snapping shrimp Synalpheus filidigitus in Belize. J Crustac Biol 19:283–292. https://doi.org/10.1163/193724099X00097
Dulvy NK, Fowler SL, Musick JA et al (2014) Extinction risk and conservation of the world’s sharks and rays. Elife 3:e00590. https://doi.org/10.7554/eLife.00590
Duncan AJ, Maggi AL (2006) A consistent, user friendly interface for running a variety of underwater acoustic
propagation codes. In: Proceedings of Acoustics. Christchurch, 20–22 November 2006
Dunlop RA, Noad MJ, Cato DH, Stokes D (2007) The social vocalization repertoire of east Australian migrating humpback whales (Megaptera novaeangliae). J Acoust Soc Am 122:2893–2905. https://doi.org/10.1121/1.2783115
Duque FG, Rodríguez-Saltos CA, Wilczynski W (2018) High-frequency vocalizations in Andean hummingbirds. Curr Biol 28:927–928. https://doi.org/10.1016/j.cub.2018.07.058
Dziak RP, Fox CG (2002) Evidence of harmonic tremor from a submarine volcano detected across the Pacific Ocean basin. J Geophys Res 107:1–11. https://doi.org/10.1029/2001JB000177
Dziak RP, Haxel JH, Matsumoto H et al (2017) Ambient sound at Challenger Deep, Mariana Trench. Oceanography 30:186–197. https://doi.org/10.5670/oceanog.2017.240
Erbe C (2009) Underwater noise from pile driving in Moreton Bay, Qld. Acoust Aust 37:87–92
Erbe C (2013) Underwater noise of small personal watercraft (jet skis). J Acoust Soc Am 133:EL326–EL330. https://doi.org/10.1121/1.4795220
Erbe C, King AR (2009) Modelling cumulative sound exposure around marine seismic surveys. J Acoust Soc Am 125:2443–2451. https://doi.org/10.1121/1.3089588
Erbe C, McPherson C (2012) Acoustic characterisation of bycatch mitigation pingers on shark control nets in Queensland, Australia. Endanger Species Res 19:109–121. https://doi.org/10.3354/esr00467
Erbe C, McPherson C (2017) Underwater noise from geotechnical drilling and standard penetration testing. J Acoust Soc Am 142:EL281–EL285. https://doi.org/10.1121/1.5003328
Erbe C, MacGillivray A, Williams R (2012) Mapping cumulative noise from shipping to inform marine spatial planning. J Acoust Soc Am 132:EL423–EL428. https://doi.org/10.1121/1.4758779
Erbe C, McCauley RD, McPherson C, Gavrilov A (2013) Underwater noise from offshore oil production vessels. J Acoust Soc Am 133:EL465–EL470. https://doi.org/10.1121/1.4802183
Erbe C, Williams R, Sandilands D, Ashe E (2014) Identifying modeled ship noise hotspots for marine mammals of Canada’s Pacific region. PLoS One 9:e89820. https://doi.org/10.1371/journal.pone.0089820
Erbe C, Verma A, McCauley R et al (2015) The marine soundscape of the Perth Canyon. Prog Oceanogr 137:38–51. https://doi.org/10.1016/j.pocean.2015.05.015
Erbe C, Liong S, Koessler MW et al (2016a) Underwater sound of rigid-hulled inflatable boats. J Acoust Soc Am 139:EL223–EL227. https://doi.org/10.1121/1.4954411
Erbe C, McCauley R, Gavrilov A et al (2016b) The underwater soundscape around Australia. In: Proceedings of Acoustics. Brisbane, 9–11 November 2016.
Erbe C, Parsons M, Duncan AJ, Allen K (2016c) Underwater acoustic signatures of recreational swimmers, divers, surfers and kayakers. Acoust Aust 44:333–341. https://doi.org/10.1007/s40857-016-0062-7
Erbe C, Wintner S, Dudley SFJ, Plön S (2016d) Revisiting acoustic deterrence devices: long-term bycatch data from South Africa’s bather protection nets. Proc Meet Acoust 27:010025. https://doi.org/10.1121/2.0000306
Erbe C, Dunlop R, Jenner KCS et al (2017) Review of underwater and in-air sounds emitted by Australian and Antarctic marine mammals. Acoust Aust 45:179–241. https://doi.org/10.1007/s40857-017-0101-z
Erbe C, Williams R, Parsons M et al (2018) Underwater noise from airplanes: an overlooked source of ocean noise. Mar Pollut Bull 137:656–661. https://doi.org/10.1016/j.marpolbul.2018.10.064
Erbe C, Marley SA, Schoeman RP et al (2019) The effects of ship noise on marine mammals: a review. Front Mar Sci 6:606. https://doi.org/10.3389/fmars.2019.00606
Erbe C, Schoeman RP, Peel D, Smith JN (2021) It often howls more than it chugs: wind versus ship noise under water in Australia’s maritime regions. J Mar Sci Eng 9:1–27. https://doi.org/10.3390/jmse9050472
Erisman BE, Rowell TJ (2017) A sound worth saving: acoustic characteristics of a massive fish spawning aggregation. Biol Lett 13:20170656. https://doi.org/10.1098/rsbl.2017.0656
Ernstes R, Quinn JE (2016) Variation in bird vocalizations across a gradient of traffic noise as a measure of an altered urban soundscape. Cities Environ 8:7
European Environment Agency [EEA] (2014) Noise in Europe. Publications Office of the European Union, Luxembourg
Farina A, Gage SH (2017) Ecoacoustics: the ecological role of sounds. Wiley, Hoboken
Farina A, Righini R, Fuller S et al (2021) Acoustic complexity indices reveal the acoustic communities of the old-growth Mediterranean forest of Sasso Fratino Integral Natural Reserve (Central Italy). Ecol Indic 120:106927. https://doi.org/10.1016/j.ecolind.2020.106927
Feng AS, Narins PM, Xu CH et al (2006) Ultrasonic communication in frogs. Nature 440:333–336. https://doi.org/10.1038/nature04416
Fenton MB, Portfors CV, Rautenback IL, Waterman JM (1998) Compromises: sound frequencies used in echolocation by aerial-feeding bats. Can J Zool 76:1174–1182. https://doi.org/10.1139/z98-043
Fernández-Juricic E, Campagna C, Enriquez V, Ortiz CL (1999) Vocal communication and individual variation in breeding South American sea lions. Behaviour 136:495–517. https://doi.org/10.1163/156853999501441
Ferreira LM, Oliveira EG, Lopes LC et al (2018) What do insects, anurans, birds, and mammals have to say about soundscape indices in a tropical savanna. J Ecoacoust 2:#PVH6YZ. https://doi.org/10.22261/jea.pvh6yz
Fichtel C (2020) Monkey alarm calling: it ain’t all referential, or is it? Anim Behav Cogn 7:101–107. https://doi.org/10.26451/abc.07.02.04.2020
Findlay CR, Ripple HD, Coomber F et al (2018) Mapping widespread and increasing underwater noise pollution from acoustic deterrent devices. Mar Pollut Bull 135:1042–1050. https://doi.org/10.1016/j.marpolbul.2018.08.042
Flint EL, Minot EO, Perry PE, Stafford KJ (2014) A survey of public attitudes towards barking dogs in New Zealand. N Z Vet J 62:321–327. https://doi.org/10.1080/00480169.2014.921852
Fournet MEH, Gabriele CM, Sharpe F et al (2018) Feeding calls produced by solitary humpback whales. Mar Mamm Sci 34:851–865. https://doi.org/10.1111/mms.12485
Fox CG, Matsumoto H, Lau TKA (2001) Monitoring Pacific Ocean seismicity from an autonomous hydrophone array. J Geophys Res 106:4183–4206. https://doi.org/10.1029/2000JB900404
Francis CD, Newman P, Taff BD et al (2017) Acoustic environments matter: synergistic benefits to humans and ecological communities. J Environ Manag 203:245–254. https://doi.org/10.1016/j.jenvman.2017.07.041
Franco LS, Shanahan DF, Fuller RA (2017) A review of the benefits of nature experiences: more than meets the eye. Int J Environ Res Public Health 14:864. https://doi.org/10.3390/ijerph14080864
Freeman SE, Freeman LA, Giorli G, Haas AF (2018) Photosynthesis by marine algae produces sound, contributing to the daytime soundscape on coral reefs. PLoS One 13:e0201766. https://doi.org/10.1371/journal.pone.0201766
Gadziola MA, Grimsley JMS, Faure PA, Wenstrup JJ (2012) Social vocalizations of big brown bats vary with behavioral context. PLoS One 7:e44550. https://doi.org/10.1371/journal.pone.0044550
Gage SH, Axel AC (2014) Visualization of temporal change in soundscape power of a Michigan lake habitat over a 4-year period. Ecol Inform 21:100–109. https://doi.org/10.1016/j.ecoinf.2013.11.004
Galeotti P, Sacchi R, Fasola M, Ballasina D (2005) Do mounting vocalisations in tortoises have a communication function? A comparative analysis. Herpetol J 15:61–71
Garrett JK, Blondel P, Godley BJ et al (2016) Long-term underwater sound measurements in the shipping noise indicator bands 63 Hz and 125 Hz from the port of Falmouth Bay, UK. Mar Pollut Bull 110:438–448. https://doi.org/10.1016/j.marpolbul.2016.06.021
Gavrilov A (2018) Propagation of underwater noise from an offshore seismic survey in Australia to Antarctica: measurements and modelling. Acoust Aust 46:143–149. https://doi.org/10.1007/s40857-018-0131-1
Gavrilov AN, Parsons MJG (2014) A MATLAB tool for the characterisation of recorded underwater sounds (CHORUS). Acoust Aust 42:190–196
Gavrilov AN, McCauley RD, Gedamke J (2012) Steady inter and intra-annual decrease in the vocalization frequency of Antarctic blue whales. J Acoust Soc Am 131:4476–4480. https://doi.org/10.1121/1.4707425
Gazioğlu C, Müftüoğlu AE, Demir V et al (2015) Connection between ocean acidification and sound propagation. Int J Environ Geoinformatics 2:16–26. https://doi.org/10.30897/ijegeo.303538
Gerstein ER, Trygonis V, McCulloch S et al (2014) Female North Atlantic right whales produce gunshot sounds. J Acoust Soc Am 135:2369. https://doi.org/10.1121/1.4877814
Gibbs JP, Breisch AR (2001) Climate warming and calling phenology of frogs near Ithaca, New York, 1900-1999. Conserv Biol 15:1175–1178. https://doi.org/10.1046/j.1523-1739.2001.0150041175.x
Gil D, Honarmand M, Pascual J et al (2015) Birds living near airports advance their dawn chorus and reduce overlap with aircraft noise. Behav Ecol 26:435–443. https://doi.org/10.1093/beheco/aru207
Giles JC, Davis JA, McCauley RD, Kuchling G (2009) Voice of the turtle: the underwater acoustic repertoire of the long-necked freshwater turtle, Chelodina oblonga. J Acoust Soc Am 126:434–443. https://doi.org/10.1121/1.3148209
Gill SA, Bierema AMK (2013) On the meaning of alarm calls: a review of functional reference in avian alarm calling. Ethology 119:449–461. https://doi.org/10.1111/eth.12097
Gordon J, Gillespie D, Potter J et al (2003) A review of the effects of seismic surveys on marine mammals. Mar Technol Soc J 37:16–34. https://doi.org/10.4031/002533203787536998
Gordon TAC, Radford AN, Davidson IK et al (2019) Acoustic enrichment can enhance fish community development on degraded coral reef habitat. Nat Commun 10:5414. https://doi.org/10.1038/s41467-019-13186-2
Gottesman BL, Francomano D, Zhao Z et al (2020) Acoustic monitoring reveals diversity and surprising dynamics in tropical freshwater soundscapes. Freshw Biol 65:117–132. https://doi.org/10.1111/fwb.13096
Grafe TU (2005) Anuran choruses as communication networks. In: McGregor G (ed) Animal communication networks. Cambridge University Press, Cambridge, pp 277–299
Greta M, Louena S, Arianna A et al (2019) Prediction of off-site noise levels reduction in open-air music events within densely populated urban areas. In: INTER-NOISE 2019 MADRID - 48th International Congress and Exhibition on Noise Control Engineering. International Institute of Noise Control Engineering, Madrid, 16–19 June 2019
Gulyas K, Pinte G, Augusztinovicz F et al (2002) Active noise control in agricultural machines. In: Proceedings of the 2002 International Conference on Noise and Vibration Engineering, ISMA, Leuven, 16–18 September 2002
Halfwerk W, Slabbekoorn H (2009) A behavioural mechanism explaining noise-dependent frequency use in urban birdsong. Anim Behav 78:1301–1307. https://doi.org/10.1016/j.anbehav.2009.09.015
Harris SA, Shears NT, Radford CA (2016) Ecoacoustic indices as proxies for biodiversity on temperate reefs. Methods Ecol Evol 7:713–724. https://doi.org/10.1111/2041-210X.12527
Hart PJ, Hall R, Ray W et al (2015) Cicadas impact bird communication in a noisy tropical rainforest. Behav Ecol 26:839–842. https://doi.org/10.1093/beheco/arv018
Hatch L, Clark C, Merrick R et al (2008) Characterizing the relative contributions of large vessels to total ocean noise fields: a case study using the Gerry E. Studds Stellwagen Bank National Marine Sanctuary. Environ Manag 42:735–752. https://doi.org/10.1007/s00267-008-9169-4
Hauser DDW, Laidre KL, Stafford KM et al (2016) Decadal shifts in autumn migration timing by Pacific Arctic beluga whales are related to delayed annual sea ice formation. Glob Change Biol 23:2206–2217. https://doi.org/10.1111/gcb.13564
Haver SM, Klinck H, Nieukirk SL et al (2017) The not-so-silent world: measuring Arctic, Equatorial and Antarctic soundscapes in the Atlantic Ocean. Deep Res I Oceanogr Res Pap 122:95–104. https://doi.org/10.1016/j.dsr.2017.03.002
Herberholz J, Schmitz B (1999) Flow visualisation and high speed video analysis of water jets in snapping shrimp (Alpheus heterochaelis). J Comp Physiol A 185:41–49. https://doi.org/10.1007/s003590050364
Herman LM (2017) The multiple functions of male song within the humpback whale (Megaptera novaeangliae) mating system: review, evaluation, and synthesis. Biol Rev 92:1795–1818. https://doi.org/10.1111/brv.12309
Hermannsen L, Beedholm K, Tougaard J, Madsen PT (2014) High frequency components of ship noise in shallow water with a discussion of implications for harbor porpoises (Phocoena phocoena). J Acoust Soc Am 136:1640–1653. https://doi.org/10.1121/1.4893908
Hermannsen L, Tougaard J, Beedholm K et al (2015) Characteristics and propagation of airgun pulses in shallow water with implications for effects on small marine mammals. PLoS One 10:e0133436. https://doi.org/10.1371/journal.pone.0133436
Herzing DL (1996) Vocalizations and associated underwater behavior of free-ranging Atlantic spotted dolphins, Stenella frontalis and bottlenose dolphins, Tursiops truncatus. Aquat Mamm 22:61–79. https://doi.org/10.12966/abc.02.02.2015
Hiley HM, Perry S, Hartley S, King SL (2017) What’s occurring? Ultrasonic signature whistle use in Welsh bottlenose dolphins (Tursiops truncatus). Bioacoustics 26:25–35. https://doi.org/10.1080/09524622.2016.1174885
Hoffmann M, Belant JL, Chanson JS et al (2011) The changing fates of the world’s mammals. Philos Trans R Soc B 366:2598–2610. https://doi.org/10.1098/rstb.2011.0116
Hole DG, Perkins AJ, Wilson JD et al (2005) Does organic farming benefit biodiversity? Biol Conserv 122:113–130. https://doi.org/10.1016/j.biocon.2004.07.018
Holt DE, Johnston CE (2015) Traffic noise masks acoustic signals of freshwater stream fish. Biol Conserv 187:27–33. https://doi.org/10.1016/j.biocon.2015.04.004
Insley SJ, Phillips AV, Charrier I (2010) A review of social recognition in pinnipeds. Aquat Mamm 29:181–201. https://doi.org/10.1578/016754203101024149
Intergovernmental Panel on Climate Change [IPCC] (2014) Climate change 2014 synthesis report. Contribution of working groups I, II and III on the fifth assessment report of the intergovernmental panel on climate change. IPCC, Geneva
Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services [IPBES] (2019) Summary for policymakers of the global assessment report on biodiversity and ecosystem services of the intergovernmental science-policy platform on biodiversity and ecosystem services. IPBES Secretariat, Bonn
International Organization for Standardization [ISO] (2014) International Standard 12913-1 acoustics - soundscape - Part 1: definition and conceptual framework. International Organization for Standardization, Geneva
International Organization for Standardization [ISO] (2017) International Standard 18405 underwater acoustics - terminology. International Organization for Standardization, Geneva
Iversen RTS, Perkins PJ, Dionne RD (1963) An indication of underwater sound production by squid. Nature 199:250–251. https://doi.org/10.1038/199250a0
Jacobs SR, Terhune JM (2002) The effectiveness of acoustic harassment devices in the Bay of Fundy, Canada: seal reactions and a noise exposure model. Aquat Mamm 28:147–158
Jahn O, Ganchev TD, Marques MI, Schuchmann KL (2017) Automated sound recognition provides insights into the behavioral ecology of a tropical bird. PLoS One 12:e0169041. https://doi.org/10.1371/journal.pone.0169041
Jézéquel Y, Bonnel J, Coston-Guarini J, Chauvaud L (2019) Revisiting the bioacoustics of European spiny lobsters Palinurus elephas: comparison of antennal rasps in tanks and in situ. Mar Ecol Prog Ser 615:143–157. https://doi.org/10.3354/meps12935
Johnson JB, Lees JM, Yepes H (2006) Volcanic eruptions, lightning, and a waterfall: differentiating the menagerie of infrasound in the Ecuadorian jungle. Geophys Res Lett 33:L06308. https://doi.org/10.1029/2005GL025515
Joo W, Gage SH, Kasten EP (2011) Analysis and interpretation of variability in soundscapes along an urban–rural gradient. Landsc Urban Plan 103:259–276. https://doi.org/10.1016/j.landurbplan.2011.08.001
Kaatz IM (2002) Multiple sound-producing mechanisms in teleost fishes and hypotheses regarding their
behavioural significance. Bioacoustics 12:230–233. https://doi.org/10.1080/09524622.2002.9753705
Kaatz IM (2011) How fishes use sound: quiet to loud and simple to complex signalling. In: Farrell AP (ed) Encyclopedia of fish physiology. Elsevier, San Diego, pp 684–691
Kaiser F, Rohde T (2013) Orlando theme park acoustics - a soundscape analysis. In: Internoise. International Congress and Exposition on Noise Control Engineering, 15–18 September 2013, Austrian Noise Abatement Association, Innsbruck
Kariel HG (1990) Factors affecting response to noise in outdoor recreational environments. Can Geogr 34:142–149. https://doi.org/10.1111/j.1541-0064.1990.tb01259.x
Kasten EP, Gage SH, Fox J, Joo W (2012) The remote environmental assessment laboratory’s acoustic library: an archive for studying soundscape ecology. Ecol Inform 12:50–67. https://doi.org/10.1016/j.ecoinf.2012.08.001
Kasumyan AO (2008) Sounds and sound production in fishes. J Ichthyol 48:981–1030. https://doi.org/10.1134/S0032945208110039
Katz J, Hafner SD, Donovan T (2016) Tools for automated acoustic monitoring with the R package monitoR. Bioacoustics 25:197–210. https://doi.org/10.1080/09524622.2016.1138415
Kawakita S, Ichikawa K (2019) Automated classification of bees and hornet using acoustic analysis of their flight sounds. Apidologie 50:71–79. https://doi.org/10.1007/s13592-018-0619-6
Kerr JT, Cihlar J (2004) Patterns and causes of species endangerment in Canada. Ecol Appl 14:743–753. https://doi.org/10.1890/02-5117
Kleijn D, Kohler F, Báldi A et al (2009) On the relationship between farmland biodiversity and land-use intensity in Europe. Proc R Soc B 276:903–909. https://doi.org/10.1098/rspb.2008.1509
Kleijn D, Rundlöf M, Scheper J et al (2011) Does conservation on farmland contribute to halting the biodiversity decline? Trends Ecol Evol 26:474–481. https://doi.org/10.1016/j.tree.2011.05.009
Klenova AV (2015) Chick begging calls reflect degree of hunger in three Auk species (Charadriiformes: Alcidae). PLoS One 10:e0140151. https://doi.org/10.1371/journal.pone.0140151
Klinck H, Nieukirk SL, Mellinger DK et al (2012) Seasonal presence of cetaceans and ambient noise levels in polar waters of the North Atlantic. J Acoust Soc Am 132:EL176–EL181. https://doi.org/10.1121/1.4740226
Kline LR, DeAngelis AI, McBride C et al (2020) Sleuthing with sound: understanding vessel activity in marine protected areas using passive acoustic monitoring. Mar Policy 120:104138. https://doi.org/10.1016/j.
Knight L, Ladich F (2014) Distress sounds of thorny catfishes emitted underwater and in air: characteristics and potential significance. J Exp Biol 217:4068–4078. https://doi.org/10.1242/jeb.110957
Knowlton RE, Moulton JM (1963) Sound production in the snapping shrimps Alpheus (Crangon) and Synalpheus. Biol Bull 125:311–331. https://doi.org/10.2307/1539406
Knudsen VO, Alford RS, Emling JW (1948) Underwater ambient noise. J Mar Res 7:410–429
Koschinski S, Culik BM, Henriksen OD et al (2003) Behavioural reactions of free-ranging porpoises and seals to the noise of a simulated 2 MW windpower generator. Mar Ecol Prog Ser 265:263–273. https://doi.org/10.3354/meps265263
Krause BL (1987) Bio-acoustics: habitat ambience & ecological balance. Signal 57:14–16
Krause BL (1993) The niche hypothesis: a virtual symphony of animal sounds, the origins of musical expression and the health of habitats. Soundscape Newsl 6:6–10
Krause B (2008) Anatomy of the soundscape: evolving perspectives. J Audio Eng Soc 56:73–80
Krause B (2012) The great animal orchestra. Little, Brown and Company, Boston
Krause B, Farina A (2016) Using ecoacoustic methods to survey the impacts of climate change on biodiversity. Biol Conserv 195:245–254. https://doi.org/10.1016/j.biocon.2016.01.013
Kruger DJD, Du Preez LH (2016) The effect of airplane noise on frogs: a case study on the critically endangered Pickersgill’s reed frog (Hyperolius pickersgilli). Ecol Res 31:393–405. https://doi.org/10.1007/s11284-016-1349-8
Kuehne LM, Padgham BL, Olden JD (2013) The soundscapes of lakes across an urbanization gradient. PLoS One 8:e55661. https://doi.org/10.1371/journal.pone.0055661
Kukulski B, Wszolek T, Mleczko D (2018) The impact of fireworks noise on the acoustic climate in urban areas. Arch Acoust 43:697–705. https://doi.org/10.24425/aoa.2018.125163
Ladich F (1997) Agonistic behaviour and significance of sounds in vocalizing fish. Mar Freshw Behav Physiol 29:87–108. https://doi.org/10.1080/10236249709379002
Ladich F, Winkler H (2017) Acoustic communication in terrestrial and aquatic vertebrates. J Exp Biol 220:2306–2317. https://doi.org/10.1242/jeb.132944
LaManna JA, Martin TE (2017) Logging impacts on avian species richness and composition differ across latitudes and foraging and breeding habitat preferences. Biol Rev 92:1657–1674. https://doi.org/10.1111/brv.12300
Larom D, Garstang M, Payne K et al (1997) The influence of surface atmospheric conditions on the range and
marpol.2020.104138 area reached by animal vocalizations. J Exp Biol 200:
Kloepper LN, Simmons AM (2014) Bioacoustic monitor- 421–431
ing contributes to an understanding of climate change. Lattenkamp EZ, Shields SM, Schutte M et al (2019) The
Acoust Tod 10:8–15 vocal repertoire of pale spear-nosed bats in a social
roosting context. Front Ecol Evol 7:116. https://doi.org/10.3389/fevo.2019.00116
Lengagne T, Slater PJB (2002) The effects of rain on acoustic communication: Tawny owls have good reason for calling less in wet weather. Proc R Soc London B 269:2121–2125. https://doi.org/10.1098/rspb.2002.2115
Lengagne T, Jouventin P, Aubin T (1999) Finding one’s mate in a king penguin colony: efficiency of acoustic communication. Behaviour 136:833–846. https://doi.org/10.1163/156853999501595
Lewicki MS, Olshausen BA, Surlykke A, Moss CF (2014) Scene analysis in the natural environment. Front Psychol 5:199. https://doi.org/10.3389/fpsyg.2014.00199
Lillis A, Perelman JN, Panyi A, Mooney TA (2017) Sound production patterns of big-clawed snapping shrimp (Alpheus spp.) are influenced by time-of-day and social context. J Acoust Soc Am 142:3311–3320. https://doi.org/10.1121/1.5012751
Linke S, Gifford T, Desjonquères C (2020) Six steps towards operationalising freshwater ecoacoustic monitoring. Freshw Biol 65:1–6. https://doi.org/10.1111/fwb.13426
Lorang MS, Tonolla D (2014) Combining active and passive hydroacoustic techniques during flood events for rapid spatial mapping of bedload transport patterns in gravel-bed rivers. Fundam Appl Limnol 184:231–246. https://doi.org/10.1127/1863-9135/2014/0552
Ma BB, Nystuen JA, Lien RC (2005) Prediction of underwater sound levels from rain and wind. J Acoust Soc Am 117:3555–3565. https://doi.org/10.1121/1.1910283
MacGillivray A, de Jong C (2021) A reference spectrum model for estimating source levels of marine shipping based on Automated Identification System data. J Mar Sci Eng 9:369. https://doi.org/10.3390/jmse9040369
Macharia JM, Thenya T, Ndiritu GG (2010) Management of highland wetlands in central Kenya: the importance of community education, awareness and eco-tourism in biodiversity conservation. Biodiversity 11:85–90. https://doi.org/10.1080/14888386.2010.9712652
Mack AL, Jones J (2003) Low-frequency vocalizations by cassowaries (Casuarius spp.). Auk 120:1062–1068. https://doi.org/10.1093/auk/120.4.1062
Maglio A, Soares C, Bouzidi M et al (2015) Mapping shipping noise in the Pelagos Sanctuary (French part) through acoustic modelling to assess potential impacts on marine mammals. Sci Rep Port Cros Natl Park 29:167–185
Marchal J, Fabianek F, Scott C et al (2020) “bioacoustics”: analyse audio recordings and automatically extract animal vocalizations. R package version 0.2.8
Marley SA, Erbe C, Salgado-Kent CP (2016) Underwater sound in an urban estuarine river: sound sources, soundscape contribution, and temporal variability. Acoust Aust 44:171–186. https://doi.org/10.1007/s40857-015-0038-z
Martin SB, Cott PA (2016) The under-ice soundscape in Great Slave Lake near the city of Yellowknife, Northwest Territories, Canada. J Great Lakes Res 42:248–255. https://doi.org/10.1016/j.jglr.2015.09.012
Martin SB, Popper AN (2016) Short- and long-term monitoring of underwater sound levels in the Hudson River (New York, USA). J Acoust Soc Am 139:1886–1897. https://doi.org/10.1121/1.4944876
Mason NA, Burns KJ (2015) The effect of habitat and body size on the evolution of vocal displays in Traupidae (tanagers), the largest family of songbirds. Biol J Linn Soc 114:538–551. https://doi.org/10.1111/bij.12455
Matsumoto H, Bohnenstiehl DR, Tournadre J et al (2014) Antarctic icebergs: a significant natural ocean sound source in the southern hemisphere. Geochem Geophys Geosyst 15:3448–3458. https://doi.org/10.1002/2014GC005454
McKenna MF, Ross D, Wiggins SM, Hildebrand JA (2012) Underwater radiated noise from modern commercial ships. J Acoust Soc Am 131:92–103. https://doi.org/10.1121/1.3664100
McShane LJ, Estes JA, Riedman ML, Staedler MM (1995) Repertoire, structure, and individual variation of vocalizations in the sea otter. J Mammal 76:414–427. https://doi.org/10.2307/1382352
McWilliam JN, Hawkins AD (2013) A comparison of inshore marine soundscapes. J Exp Mar Biol Ecol 446:166–176. https://doi.org/10.1016/j.jembe.2013.05.012
McWilliam JN, McCauley RD, Erbe C, Parsons MJG (2017) Patterns of biophonic periodicity on coral reefs in the Great Barrier Reef. Sci Rep 7:17459. https://doi.org/10.1038/s41598-017-15838-z
Mellinger DK, Clark CW (2003) Blue whale (Balaenoptera musculus) sounds from the north Atlantic. J Acoust Soc Am 114:1108–1119. https://doi.org/10.1121/1.1593066
Mercado E III (2018) The sonar model for humpback whale song revised. Front Psychol 9:1156. https://doi.org/10.3389/fpsyg.2018.01156
Merchant ND, Barton TR, Thompson PM et al (2013) Spectral probability density as a tool for ambient noise analysis. J Acoust Soc Am 133:EL262–EL267. https://doi.org/10.1121/1.4794934
Merchant ND, Fristrup KM, Johnson MP et al (2015) Measuring acoustic habitats. Methods Ecol Evol 6:257–265. https://doi.org/10.1111/2041-210X.12330
Mikhalevsky PN (2001) Acoustics, Arctic. In: Steele JH, Thorpe SA, Turekian KK (eds) Encyclopedia of ocean sciences, vol 1. Elsevier, San Diego, pp 53–61
Miksis-Olds JL, Bradley DL, Niu XM (2013) Decadal trends in Indian Ocean ambient sound. J Acoust Soc Am 134:3464–3475. https://doi.org/10.1121/1.4821537
Mooney TA, Di Iorio L, Lammers M et al (2020) Listening forward: approaching marine biodiversity assessments using acoustic methods. R Soc Open Sci 7:201287. https://doi.org/10.1098/rsos.201287
Morton ES (1975) Ecological sources of selection on avian sounds. Am Nat 109:17–34
Mulard H, Aubin T, White JF et al (2009) Voice variance may signify ongoing divergence among black-legged kittiwake populations. Biol J Linn Soc 97:289–297. https://doi.org/10.1111/j.1095-8312.2009.01198.x
Mullet TC, Gage SH, Morton JM, Huettmann F (2016) Temporal and spatial variation of a winter soundscape in south-central Alaska. Landsc Ecol 31:1117–1137. https://doi.org/10.1007/s10980-015-0323-0
Mullet TC, Farina A, Gage SH (2017a) The acoustic habitat hypothesis: an ecoacoustics perspective on species habitat selection. Biosemiotics 10:319–336. https://doi.org/10.1007/s12304-017-9288-5
Mullet TC, Morton JM, Gage SH, Huettmann F (2017b) Acoustic footprint of snowmobile noise and natural quiet refugia in an Alaskan wilderness. Nat Areas J 37:332–349. https://doi.org/10.3375/043.037.0308
Mumm CAS, Knörnschild M (2014) The vocal repertoire of adult and neonate giant otters (Pteronura brasiliensis). PLoS One 9:e112562. https://doi.org/10.1371/journal.pone.0112562
Narins PM (1990) Seismic communication in anuran amphibians. Bioscience 40:268–274. https://doi.org/10.2307/1311263
Narins PM, Meenderink WF (2014) Climate change and frog calls: long-term correlations along a tropical altitudinal gradient. Proc R Soc B 281:20140401. https://doi.org/10.1098/rspb.2014.0401
National Park Service [NPS] (2000) Director’s order #47: soundscape preservation and noise management. https://www.nps.gov/policy/DOrders/DOrder47.html. Accessed 5 Feb 2020
Nieukirk SL, Stafford KM, Mellinger DK et al (2004) Low-frequency whale and seismic airgun sounds recorded in the mid-Atlantic Ocean. J Acoust Soc Am 115:1832–1843. https://doi.org/10.1121/1.1675816
Nieukirk SL, Mellinger DK, Moore SE et al (2012) Sounds from airguns and fin whales recorded in the mid-Atlantic Ocean, 1999–2009. J Acoust Soc Am 131:1102–1112. https://doi.org/10.1121/1.3672648
Nijman V (2001) Effect of behavioural changes due to habitat disturbance on density estimation of rain forest vertebrates, as illustrated by gibbons (Primates: Hylobatidae). In: Hillegers PJM, de Iongh HH (eds) The balance between biodiversity conservation and sustainable use of tropical rain forests. Tropenbos Foundation, Wageningen, pp 217–226
Nityananda V, Bee MA (2011) Finding your mate at a cocktail party: frequency separation promotes auditory stream segregation of concurrent voices in multi-species frog choruses. PLoS One 6:e21191. https://doi.org/10.1371/journal.pone.0021191
Nystuen JA (1986) Rainfall measurements using underwater ambient noise. J Acoust Soc Am 79:972–982. https://doi.org/10.1121/1.393695
O’Neal D (1998) Comparison of the underwater ambient noise measured in three large exhibits at the Monterey Bay aquarium and in the inner Monterey Bay. M.Sc. thesis, Naval Postgraduate School, Monterey. Available from https://apps.dtic.mil/sti/pdfs/ADA350428.pdf (accessed on 21 June 2022)
Obrist MK, Pavan G, Sueur J et al (2010) Bioacoustics approaches in biodiversity inventories. In: Eymann J, Degreef J, Häuser C et al (eds) Manual on field recording techniques and protocols for all taxa biodiversity inventories. Abc Taxa, Brussels, pp 68–99
O’Connell-Rodwell CE, Arnason BT, Hart LA (2000) Seismic properties of Asian elephant (Elephas maximus) vocalizations and locomotion. J Acoust Soc Am 108:3066–3072. https://doi.org/10.1121/1.1323460
Odom KJ, Hall ML, Riebel K et al (2014) Female song is widespread and ancestral in songbirds. Nat Commun 5:3379. https://doi.org/10.1038/ncomms4379
Owen MA, Swaisgood RR, Czekala NM et al (2004) Monitoring stress in captive giant pandas (Ailuropoda melanoleuca): behavioral and hormonal response to ambient noise. Zoo Biol 23:147–164. https://doi.org/10.1002/zoo.10124
Padua SM (1994) Conservation awareness through an environmental education programme in the Atlantic forest of Brazil. Environ Conserv 21:145–151. https://doi.org/10.1017/S0376892900024577
Pagniello CMLS, Cimino MA, Terrill E (2019) Mapping fish chorus distributions in southern California using an autonomous wave glider. Front Mar Sci 6:526. https://doi.org/10.3389/fmars.2019.00526
Parks SE, Hamilton PK, Kraus SD, Tyack PL (2006) The gunshot sound produced by male north Atlantic right whales (Eubalaena glacialis) and its potential function in reproductive advertisement. Mar Mamm Sci 21:458–475. https://doi.org/10.1111/j.1748-7692.2005.tb01244.x
Parks SE, Clark CW, Tyack PL (2007) Short- and long-term changes in right whale calling behavior: the potential effects of noise on acoustic communication. J Acoust Soc Am 122:3725–3731. https://doi.org/10.1121/1.2799904
Parks SE, Miksis-Olds JL, Denes SL (2014) Assessing marine ecosystem acoustic diversity across ocean basins. Ecol Inform 21:81–88. https://doi.org/10.1016/j.ecoinf.2013.11.003
Parsons MJG, Salgado-Kent CP, Marley SA et al (2016) Characterizing diversity and variation in fish choruses in Darwin Harbour. ICES J Mar Sci 73:2058–2074. https://doi.org/10.1093/icesjms/fsw037
Parsons MJG, Duncan AJ, Parsons SK, Erbe C (2020) Reducing vessel noise: an example of a solar-electric passenger ferry. J Acoust Soc Am 147:3575–3583. https://doi.org/10.1121/10.0001264
Parsons MJG, Erbe C, Meekan MG, Parsons SK (2021) A review and meta-analysis of underwater noise radiated by small (<25 m length) vessels. J Mar Sci Eng 9:827. https://doi.org/10.3390/jmse9080827
Pavan G (2017) Fundamentals of soundscape conservation. In: Farina A, Gage SH (eds)
Ecoacoustics. The ecological role of sound. Wiley, Hoboken, pp 235–258
Pavan G, Priano M, De Carli P et al (1997) Stridulatory organ and ultrasonic emission in certain species of Ponerine ants (Genus Ectatomma and Pachycondyla, Hymenoptera, Formicidae). Bioacoustics 8:209–221. https://doi.org/10.1080/09524622.1997.9753363
Paviotti M, Vogiatzis K (2012) On the outdoor annoyance from scooter and motorbike noise in the urban environment. Sci Total Environ 430:223–230. https://doi.org/10.1016/j.scitotenv.2012.05.010
Payne RS, McVay S (1971) Songs of humpback whales. Science 173:585–597. https://doi.org/10.1126/science.173.3997.585
Payne KB, Langbauer WR Jr, Thomas EM (1986) Infrasonic calls of the Asian elephant (Elephas maximus). Behav Ecol Sociobiol 18:297–301. https://doi.org/10.1007/BF00300007
Payne CJ, Jessop TS, Guay PJ et al (2012) Population, behavioural and physiological responses of an urban population of black swans to an intense annual noise event. PLoS One 7:e45014. https://doi.org/10.1371/journal.pone.0045014
Phillips HRP, Newbold T, Purvis A (2017) Land-use effects on local biodiversity in tropical forests vary between continents. Biodivers Conserv 26:2251–2270. https://doi.org/10.1007/s10531-017-1356-2
Picciulin M, Sebastianutto L, Fortuna CM et al (2016) Are the 1/3-octave band 63- and 125 Hz noise levels predictive of vessel activity? The case in the Cres-Lošinj Archipelago (northern Adriatic Sea, Croatia). In: Popper AN, Hawkins A (eds) The effects of noise on aquatic life II. Springer, New York, pp 821–828
Pieretti N, Farina A, Morri D (2011) A new methodology to infer the singing activity of an avian community: the Acoustic Complexity Index (ACI). Ecol Indic 11:868–873. https://doi.org/10.1016/j.ecolind.2010.11.005
Pijanowski BC, Farina A, Gage SH et al (2011a) What is soundscape ecology? An introduction and overview of an emerging new science. Landsc Ecol 26:1213–1232. https://doi.org/10.1007/s10980-011-9600-8
Pijanowski BC, Villanueva-Rivera LJ, Dumyahn SL et al (2011b) Soundscape ecology: the science of sound in the landscape. Bioscience 61:203–216. https://doi.org/10.1525/bio.2011.61.3.6
Pola YV, Snowdon CT (1975) The vocalizations of pygmy marmosets (Cebuella pygmaea). Anim Behav 23:826–842. https://doi.org/10.1016/0003-3472(75)90108-6
Polidori C, Pavan G, Ruffato G et al (2013) Common features and species-specific differences in stridulatory organs and stridulation patterns of velvet ants (Hymenoptera: Mutillidae). Zool Anz 252:457–468. https://doi.org/10.1016/j.jcz.2013.01.003
Potočnik I, Poje A (2010) Noise pollution in forest environment due to forest operations. Croat J For Eng 31:137–148
Prat Y, Taub M, Yovel Y (2016) Everyday bat vocalizations contain information about emitter, addressee, context, and behavior. Sci Rep 6:39419. https://doi.org/10.1038/srep39419
Prince P, Hill A, Covarrubias EP et al (2019) Deploying acoustic detection algorithms on low-cost, open-source sensors for environmental monitoring. Sensors 19:553. https://doi.org/10.3390/s19030553
Priyadarshani N, Castro I, Marsland S (2018) The impact of environmental factors in birdsong acquisition using automated recorders. Ecol Evol 8:5016–5033. https://doi.org/10.1002/ece3.3889
Putland RL, Mensinger AF (2020) Exploring the soundscape of small freshwater lakes. Ecol Inform 55:101018. https://doi.org/10.1016/j.ecoinf.2019.101018
Quadros S, Goulart VDL, Passos L et al (2014) Zoo visitor effect on mammal behaviour: does noise matter? Appl Anim Behav Sci 156:78–84. https://doi.org/10.1016/j.applanim.2014.04.002
Radford C, Jeffs A, Tindle C, Montgomery JC (2008) Resonating sea urchin skeletons create coastal choruses. Mar Ecol Prog Ser 362:37–43. https://doi.org/10.3354/meps07444
Rashed A, Khan MI, Dawson JW et al (2009) Do hoverflies (Diptera: Syrphidae) sound like the Hymenoptera they morphologically resemble? Behav Ecol 20:396–402. https://doi.org/10.1093/beheco/arn148
Reber SA, Janisch J, Torregrosa K et al (2017) Formants provide honest acoustic cues to body size in American alligators. Sci Rep 7:1816. https://doi.org/10.1038/s41598-017-01948-1
Reed SE, Boggs JL, Mann JP (2010) SPreAD-GIS: an ArcGIS toolbox for modeling the propagation of engine noise in a wildland setting. Version 2.0. The Wilderness Society, San Francisco, CA
Reine KJ, Clarke DG, Dickerson C (2014) Characterization of underwater sounds produced by hydraulic and mechanical dredging operations. J Acoust Soc Am 135:3280–3294. https://doi.org/10.1121/1.4875712
Ricci SW, Eggleston DB, Bohnenstiehl DR, Lillis A (2016) Temporal soundscape patterns and processes in an estuarine reserve. Mar Ecol Prog Ser 550:25–38. https://doi.org/10.3354/meps11724
Rice AN, Soldevilla MS, Quinlan JA (2017) Nocturnal patterns in fish chorusing off the coasts of Georgia and eastern Florida. Bull Mar Sci 93:455–474. https://doi.org/10.5343/bms.2016.1043
Righini R, Pavan G (2020) First assessment of the soundscape of the Integral Nature Reserve “Sasso Fratino” in the Central Apennine, Italy. Biodiversity 21:4–14. https://doi.org/10.1080/14888386.2019.1696229
Roberts C (2009) Construction noise and vibration impact on sensitive premises. In: Proceedings of Acoustics 2009. Australian Acoustical Society, Adelaide, 23–25 November 2009
Robillard T, Montealegre-Z F, Desutter-Grandcolas L et al (2013) Mechanisms of high-frequency song generation in brachypterous crickets and the role of ghost
frequencies. J Exp Biol 216:2001–2011. https://doi.org/10.1242/jeb.083964
Rocha RC Jr, Clapham PJ, Ivashchenko YV (2014) Emptying the oceans: a summary of industrial whaling catches in the 20th century. Mar Fish Rev 76:37–48. https://doi.org/10.7755/MFR.76.4.3
Rochat JL, Reiter D (2016) Highway traffic noise. Acoust Today 12:38–47
Römer H, Lewald J (1992) High-frequency sound transmission in natural habitats: implications for the evolution of insect acoustic communication. Behav Ecol Sociobiol 29:437–444. https://doi.org/10.1007/BF00170174
Ross D (1976) Mechanics of underwater noise. Pergamon Press, Oxford
Rossi T, Connell SD, Nagelkerken I (2016a) Silent oceans: ocean acidification impoverishes natural soundscapes by altering sound production of the world’s noisiest marine invertebrate. Proc R Soc B 283:20153046. https://doi.org/10.1098/rspb.2015.3046
Rossi T, Nagelkerken I, Pistevos JCA, Connell SD (2016b) Lost at sea: ocean acidification undermines larval fish orientation via altered hearing and marine soundscape modification. Biol Lett 12:20150937. https://doi.org/10.1098/rsbl.2015.0937
Rountree RA, Gilmore RG, Goudey CA et al (2006) Listening to fish: applications of passive acoustics to fisheries science. Fisheries 31:433–446. https://doi.org/10.1577/1548-8446(2006)31[433:LTF]2.0.CO;2
Rountree RA, Bolgan M, Juanes F (2019) How can we understand freshwater soundscapes without fish sound descriptions? Fisheries 44:137–143. https://doi.org/10.1002/fsh.10190
Sales GD, Wilson KJ, Spencer KEV (1988) Environmental ultrasound in laboratories and animal houses: a possible cause for concern in the welfare and use of laboratory animals. Lab Anim 22:369–375. https://doi.org/10.1258/002367788780746188
Salmi R, Hammerschmidt K, Doran-Sheehy DM (2013) Western gorilla vocal repertoire and contextual use of vocalizations. Ethology 119:831–847. https://doi.org/10.1111/eth.12122
Sánchez-Pérez LA, Sánchez-Fernández LP, Suárez-Guerra S, Carbajal-Hernández JJ (2013) Aircraft class identification based on take-off noise signal segmentation in time. Expert Syst Appl 40:5148–5159. https://doi.org/10.1016/j.eswa.2013.03.017
Scarpelli MDA, Ribeiro MC, Teixeira FZ et al (2020) Gaps in terrestrial soundscape research: it’s time to focus on tropical wildlife. Sci Total Environ 707:135403. https://doi.org/10.1016/j.scitotenv.2019.135403
Schafer RM (1969) The new soundscape: a handbook for the modern music teacher. Associated Music Publishers, New York
Schafer RM (1977) The tuning of the world. Knopf, New York
Scheifele PM, Clark JG, Sonstrom K et al (2012a) Ballroom music spillover into a beluga whale aquarium exhibit. Adv Acoust Vib 2012:402130. https://doi.org/10.1155/2012/402130
Scheifele PM, Johnson MT, Kretschmer L et al (2012b) Ambient habitat noise and vibration at the Georgia aquarium. J Acoust Soc Am 132:EL88–EL94. https://doi.org/10.1121/1.4734387
Schmitz B (2002) Sound production in crustacea with special reference to the Alpheidae. In: Wiese K (ed) The crustacean nervous system. Springer, Berlin, pp 536–547
Schoeman RP, Erbe C, Plön S (2022) Underwater chatter for the win: a first assessment of underwater soundscapes in two bays along the Eastern Cape coast of South Africa. J Mar Sci Eng 10:746. https://doi.org/10.3390/jmse10060746
Schusterman RJ, Gentry R, Schmook J (1966) Underwater vocalizations by sea lions: social and mirror stimuli. Science 154:540–542. https://doi.org/10.1126/science.154.3748.540
Sèbe F, Aubin T, Boué A, Poindron P (2008) Mother-young vocal communication and acoustic recognition promote preferential nursing in sheep. J Exp Biol 211:3554–3562. https://doi.org/10.1242/jeb.016055
Sertlek HÖ, Slabbekoorn H, ten Cate C, Ainslie MA (2019) Source specific sound mapping: spatial, temporal and spectral distribution of sound in the Dutch North Sea. Environ Pollut 247:1143–1157. https://doi.org/10.1016/j.envpol.2019.01.119
Shannon G, McKenna MF, Angeloni LM et al (2016) A synthesis of two decades of research documenting the effects of noise on wildlife. Biol Rev 91:982–1005. https://doi.org/10.1111/brv.12207
Sherwen SL, Hemsworth PH (2019) The visitor effect on zoo animals: implications and opportunities for zoo animal welfare. Animals 9:366. https://doi.org/10.3390/ani9060366
Slabbekoorn H (2004) Habitat-dependent ambient noise: consistent spectral profiles in two African forest types. J Acoust Soc Am 116:3727–3733. https://doi.org/10.1121/1.1811121
Slabbekoorn H, Bouton N (2008) Soundscape orientation: a new field in need of sound investigation. Anim Behav 76:5–8. https://doi.org/10.1016/j.anbehav.2008.06.010
Slabbekoorn H, Smith TB (2002) Habitat-dependent song divergence in the little greenbul: an analysis of environmental selection pressures on acoustic signals. Evolution 56:1849–1858. https://doi.org/10.1111/j.0014-3820.2002.tb00199.x
Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (2018) Effects of anthropogenic noise on animals. Springer, New York
Slabbekoorn H, Dalen J, de Haan D et al (2019) Population-level consequences of seismic surveys on fishes: an interdisciplinary challenge. Fish Fish 20:653–685. https://doi.org/10.1111/faf.12367
Soloway AG, Dahl PH (2014) Peak sound pressure and sound exposure level from underwater explosions in shallow water. J Acoust Soc Am 136:218–223. https://doi.org/10.1121/1.4892668
Soltis J (2010) Vocal communication in African elephants (Loxodonta africana). Zoo Biol 29:192–209. https://doi.org/10.1002/zoo.20251
Southworth M (1969) The sonic environment of cities. Environ Behav 1:49–70. https://doi.org/10.1177/001391656900100104
Staaterman ER, Claverie T, Patek SN (2010) Disentangling defense: the function of spiny lobster sounds. Behaviour 147:235–258. https://doi.org/10.1163/000579509X12523919243428
Stack DW, Peter N, Manning RE, Fristrup KM (2011) Reducing visitor noise levels at Muir Woods National Monument using experimental management. J Acoust Soc Am 129:1375–1380. https://doi.org/10.1121/1.3531803
Stirling I, Calvert W, Spencer C (1987) Evidence of stereotyped underwater vocalizations of male Atlantic walruses (Odobenus rosmarus rosmarus). Can J Zool 65:2311–2321. https://doi.org/10.1139/z87-348
Sueur J (2002) Cicada acoustic communication: potential sound partitioning in a multispecies community from Mexico (Hemiptera: Cicadomorpha: Cicadidae). Biol J Linn Soc 75:379–394. https://doi.org/10.1046/j.1095-8312.2002.00030.x
Sueur J (2018) Sound analysis and synthesis with R. Springer, Cham
Sueur J, Aubin T, Simonis C (2008a) “Seewave”: a free modular tool for sound analysis and synthesis. Bioacoustics 18:213–226. https://doi.org/10.1080/09524622.2008.9753600
Sueur J, Pavoine S, Hamerlynck O, Duvail S (2008b) Rapid acoustic survey for biodiversity appraisal. PLoS One 3:e4065. https://doi.org/10.1371/journal.pone.0004065
Sueur J, Krause B, Farina A (2019) Climate change is breaking earth’s beat. Trends Ecol Evol 34:971–973. https://doi.org/10.1016/j.tree.2019.07.014
Talandier J, Hyvernaud O, Reymond D, Okal EA (2006) Hydroacoustic signals generated by parked and drifting icebergs in the southern Indian and Pacific oceans. Geophys J Int 165:817–834. https://doi.org/10.1111/j.1365-246X.2006.02911.x
Tasker ML, Amundin M, Andre M et al (2010) Marine Strategy Framework Directive - Task Group 11: underwater noise and other forms of energy. JRC Scientific and Technical Report EUR 24341 EN - 2010. Office for Official Publications of the European Communities, Luxembourg
Tennessen JB, Parks SE (2016) Acoustic propagation modeling indicates vocal compensation in noise improves communication range for North Atlantic right whales. Endanger Species Res 30:225–237. https://doi.org/10.3354/esr00738
Tepp G, Chadwick WW Jr, Haney MM et al (2019) Hydroacoustic, seismic, and bathymetric observations of the 2014 submarine eruption at Ahyi seamount, Mariana arc. Geochem Geophys Geosyst 20:3608–3627. https://doi.org/10.1029/2019GC008311
Thiebault A, Charrier I, Aubin T et al (2019) First evidence of underwater vocalisations in hunting penguins. PeerJ 7:e8240. https://doi.org/10.7717/peerj.8240
Thomas JA, Kuechle VB (1982) Quantitative analysis of Weddell seal (Leptonychotes weddelli) underwater vocalizations at McMurdo Sound, Antarctica. J Acoust Soc Am 72:1730–1738. https://doi.org/10.1121/1.388667
Tonolla D, Acuña V, Lorang MS et al (2010) A field-based investigation to examine underwater soundscapes of five common river habitats. Hydrol Process 24:3146–3156. https://doi.org/10.1002/hyp.7730
Tonolla D, Lorang MS, Heutschi K et al (2011) Characterization of spatial heterogeneity in underwater soundscapes at the river segment scale. Limnol Oceanogr 56:2319–2333. https://doi.org/10.4319/lo.2011.56.6.2319
Torigoe K (1982) A study of the world soundscape project. York University, Toronto, Ontario
Tougaard J, Henriksen OD, Miller LA (2009) Underwater noise from three types of offshore wind turbines: estimation of impact zones for harbor porpoises and harbor seals. J Acoust Soc Am 125:3766–3773. https://doi.org/10.1121/1.3117444
Towsey M, Wimmer J, Williamson I, Roe P (2014) The use of acoustic indices to determine avian species richness in audio-recordings of the environment. Ecol Inform 21:110–119. https://doi.org/10.1016/j.ecoinf.2013.11.007
Tripovich JS, Hall-Aspland S, Charrier I, Arnould JPY (2012) The behavioural response of Australian fur seals to motor boat noise. PLoS One 7:e37228. https://doi.org/10.1371/journal.pone.0037228
Truax B (1984) Acoustic communication. Ablex Publishing Corporation, Norwood
Truax B (1996) Soundscape, acoustic communication and environmental sound composition. Contemp Music Rev 15:49–65
van der Lee GH, Desjonquères C, Sueur J, Kraak MHS, Verdonschot PFM (2020) Freshwater ecoacoustics: Listening to the ecological status of multi-stressed lowland waters. Ecol Indic 113:106252. https://doi.org/10.1016/j.ecolind.2020.106252
van Opzeeland I, van Parijs S, Bornemann H et al (2010) Acoustic ecology of Antarctic pinnipeds. Mar Ecol Prog Ser 414:267–291. https://doi.org/10.3354/meps08683
Van Parijs SM, Kovacs KM (2002) In-air and underwater vocalizations of eastern Canadian harbour seals, Phoca vitulina. Can J Zool 80:1173–1179. https://doi.org/10.1139/z02-088
Veirs S, Veirs V, Wood JD (2016) Ship noise extends to frequencies used for echolocation by endangered killer whales. PeerJ 4:e1657. https://doi.org/10.7717/peerj.1657
Vergne AL, Pritz MB, Mathevon N (2009) Acoustic communication in crocodilians: from behaviour to brain. Biol Rev 84:391–411. https://doi.org/10.1111/j.1469-185X.2009.00079.x
Vergne AL, Aubin T, Taylor P, Mathevon N (2011) Acoustic signals of baby black caimans. Zoology 114:313–320. https://doi.org/10.1016/j.zool.2011.07.003
Vidović A, Štimac I, Zečević-Tadić R (2017) Aircraft noise monitoring in function of flight safety and aircraft model determination. J Adv Transp 2017:2850860. https://doi.org/10.1155/2017/2850860
Villanueva-Rivera LJ (2014) Eleutherodactylus frogs show frequency but no temporal partitioning: implications for the acoustic niche hypothesis. PeerJ 2:e496. https://doi.org/10.7717/peerj.496
Villanueva-Rivera LJ, Pijanowski BC (2018) “Soundecology”: soundscape ecology. R package version 1.3.3
Villanueva-Rivera LJ, Pijanowski BC, Doucette J, Pekin B (2011) A primer of acoustic analysis for landscape ecologists. Landsc Ecol 26:1233–1246. https://doi.org/10.1007/s10980-011-9636-9
von Benda-Beckmann AM, Aarts G, Sertlek HO et al (2015) Assessing the impact of underwater clearance of unexploded ordnance on harbour porpoises (Phocoena phocoena) in the southern North Sea. Aquat Mamm 41:503–523. https://doi.org/10.1578/AM.41.4.2015.503
Walker SE, Cade WH (2003) The effects of temperature and age on calling song in a field cricket with a complex calling song, Teleogryllus oceanicus (Orthoptera: Gryllidae). Can J Zool 81:1414–1420. https://doi.org/10.1139/z03-106
Wark JD (2015) The influence of the sound environment on the welfare of zoo-housed callitrichine monkeys. Case Western Reserve University, Cleveland
Weilgart L, Whitehead H (1993) Coda communication by sperm whales (Physeter macrocephalus) off the Galápagos Islands. Can J Zool 71:744–752. https://doi.org/10.1139/z93-098
Wellard R, Pitman RL, Durban J, Erbe C (2020) Cold call: the acoustic repertoire of Ross Sea killer whales (Orcinus orca, Type C) in McMurdo Sound, Antarctica. R Soc Open Sci 7:191228. https://doi.org/10.1098/rsos.191228
Wemmer C, von Ebers M, Scow K (1976) An analysis of the chuffing vocalization in the polar bear (Ursus maritimus). J Zool 180:425–439. https://doi.org/10.1111/j.1469-7998.1976.tb04686.x
Wenz GM (1962) Acoustic ambient noise in the ocean: spectra and sources. J Acoust Soc Am 34:1936–1956. https://doi.org/10.1121/1.1909155
Wiseman S, Wilson PS, Sepulveda F (2014) Measuring a soundscape of the captive southern white rhinoceros (Ceratotherium simum simum). In: Davy J, Don C, McMinn T et al (eds) Proceedings of 43rd International Congress on Noise Control Engineering. The Australian Acoustical Society, Melbourne, 16–19 November 2014
Wolfenden AD, Slabbekoorn H, Kluk K, de Kort SR (2019) Aircraft sound exposure leads to song frequency decline and elevated aggression in wild chiffchaffs. J Anim Ecol 88:1720–1731. https://doi.org/10.1111/1365-2656.13059
Wrege PH, Rowland ED, Keen S, Shiu Y (2017) Acoustic monitoring for conservation in tropical forests: examples from forest elephants. Methods Ecol Evol 8:1292–1301. https://doi.org/10.1111/2041-210X.12730
Wyatt R (2008) Review of existing data on underwater sounds produced by the oil and gas industry. Joint Industry Programme on Sound and Aquatic Life, London
Wysocki LE, Davidson JW III, Smith ME et al (2007) Effects of aquaculture production noise on hearing, growth, and disease resistance of rainbow trout Oncorhynchus mykiss. Aquaculture 272:687–697. https://doi.org/10.1016/j.aquaculture.2007.07.225
Xu W, Dong L, Caruso F, Gong Z, Li S (2020) Long-term and large-scale spatiotemporal patterns of soundscape in a tropical habitat of the Indo-Pacific humpback dolphin (Sousa chinensis). PLoS One 15:e0236938. https://doi.org/10.1371/journal.pone.0236938
Yin S, McCowan B (2004) Barking in domestic dogs: context specificity and individual identification. Anim Behav 68:343–355. https://doi.org/10.1016/j.anbehav.2003.07.016
Yip DA, Bayne EM, Sólymos P et al (2017) Sound attenuation in forest and roadside environments: implications for avian point-count surveys. Condor Ornithol Appl 119:73–84. https://doi.org/10.1650/CONDOR-16-93.1
York D (1994) Recreational-boating disturbances of natural communities and wildlife: an annotated bibliography. Biological Report 22. U.S. Department of the Interior, Washington
Young BA (1991) Morphological basis of “growling” in the king cobra, Ophiophagus hannah. J Exp Zool 260:275–287. https://doi.org/10.1002/jez.1402600302
Young BA (2003) Snake bioacoustics: toward a richer understanding of the behavioral ecology of snakes. Q Rev Biol 78:303–325. https://doi.org/10.1086/377052
White K, Arntzen M, Walker F et al (2017) Noise annoy-
ance caused by continuous descent approaches com- Young BA, Brown IP (1993) On the acoustic profile of the
pared to regular descent procedures. Appl Acoust 125: rattlesnake rattle. Amphibia Reptilia 14:373–380.
194–198. https://doi.org/10.1016/j.apacoust.2017. https://doi.org/10.1163/156853893X00066
04.008 Zhang F, Zhao J, Feng AS (2017) Vocalizations of female
Williams R, Erbe C, Ashe E, Clark CW (2015) frogs contain nonlinear characteristics and individual
Quiet(er) marine protected areas. Mar Pollut Bull signatures. PLoS One 12:e0174815. https://doi.org/10.
100:154–161. https://doi.org/10.1016/j.marpolbul. 1371/journal.pone.0174815
2015.09.012
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
8 Detection and Classification Methods for Animal Sounds
Julie N. Oswald, Christine Erbe, William L. Gannon,
Shyam Madhusudhana, and Jeanette A. Thomas
analyzed while enduring the acrid smell of etched Kay Sona-Graph paper and piles of 8-s printouts removed from a spinning recording drum littering laboratory tables and floors. Output from a long-duration sound had to be spliced together (see Chap. 1). Many bioacoustic studies generated an enormous amount of data, which made this manual review process at best inefficient, and at worst impossible to accomplish.

For decades, scientists have worked to automate the process of detecting and classifying sounds into categories or types. Automated classification involves three main steps: (1) detection of potential sounds of interest, (2) extraction of relevant acoustic characteristics (or, features) from these sounds, and (3) classification of these sounds as produced by a particular species, sex, age, or individual. Methods for the automated detection of sounds have progressed quickly with technological advances in digital recording (see Chap. 2). Likewise, the extraction of sound variables useful in analysis has expanded with an increasing amount of information provided by new technology. For instance, where features such as maximum frequency or time between sounds originally were measured manually off sonagraph paper, devices today allow for measuring these, and many more variables, automatically or semi-automatically using computer software. Now, derived variables, such as time difference between individual signal elements, frequency modulation, running averages of sound frequency, and harmonic structure can be easily obtained for classifying the sounds in a repertoire.

Some of the earliest methods used for automated detection and classification included energy threshold detectors (e.g., Clark 1980) and matched filters (e.g., Freitag and Tyack 1993; Stafford et al. 1998; Dang et al. 2008; Mankin et al. 2008). These methods were used to detect and classify simple, stereotypical sounds produced by species such as the Asian longhorn beetle (Anoplophora glabripennis), cane toads (Rhinella marina), blue whales (Balaenoptera spp.), and fin whales (Balaenoptera physalus). Once sounds are detected, they can be organized into groups, or classified, based on selected acoustic characteristics. For example, development of methods for detection and automated signal processing of bat sounds led to a variety of automated, off-the-shelf, ready-to-deploy bat detectors that detect and classify sounds by species (Fenton and Jacobson 1973; Gannon et al. 2004). These detectors can be very useful in addressing biological or management issues in ecology, evolution, and impact mitigation. While the accuracy and robustness of automated approaches are always a matter of concern (Herr et al. 1997; Parsons et al. 2000), modern techniques promise much improved recognition performances that could rival manual analyses (e.g., Brown and Smaragdis 2009).

Multivariate statistical methods can be powerful for classification of sounds produced by species with variable vocal repertoires because they can identify complex relationships among many acoustic features (see Chap. 9). With the advent of powerful personal computers in the 1980s and 1990s, the use of multivariate techniques became popular for classifying bird sounds (e.g., Sparling and Williams 1978; Martindale 1980a, b). Since then, enormous effort has been expended to develop these and other automatic methods for the detection of sounds produced by many taxa and their classification into discrete categories, such as species, population, sex, or individual.

These days, there are applications (apps) for smartphones that use advanced algorithms to automatically detect and recognize sounds. For example, the BirdNET app detects and classifies bird song, similar to the Shazam app for music, and provides a listing of the top-ranked matching species. It includes almost 1000 of the most common species of North America and Europe. A similar app, Song Sleuth, recognizes songs of nearly 200 bird species likely to be heard in North America and also provides references for species identification, such as the David Sibley Bird Reference (Sibley 2000), allowing the user to “dig into” the bird's biology and conservation needs.

In this chapter, we present an overview of methods for detection and classification of sounds along with examples from different taxa. No single method is appropriate for every research
project and so the strengths and weaknesses of each method are summarized to help guide decisions on which methods are better suited for particular research scenarios. Because algorithms for statistical analyses, automated detection, and computer classification of animal sounds are advancing rapidly, this is not a comprehensive overview of methods, but rather a starting point to stimulate further investigations.

8.2 Qualitative Naming and Classification of Animal Sounds

Prior to computer-assisted detection and classification of animal sounds, bioacousticians used various qualitative methods to categorize sounds.

8.2.1 Onomatopoeic Names

Frequently, researchers describe and name animal sounds based on their perception of the sound and thus based on their own language. This approach has been common in the study of terrestrial animals (in particular, birds) and marine mammals (in particular, pinnipeds and mysticetes). Researchers also have given onomatopoeic names to sounds. These are names that phonetically resemble the sound they describe. For example, the sounds of squirrels and chipmunks have been described as barks, chatters, chirps, and growls. The primate literature is also rich in these sorts of sound descriptions (e.g., the hack sequences and boom-hack sequences described for Campbell’s monkeys, Cercopithecus campbelli; Ouattara et al. 2009). Bioacousticians studying humpback whales (Megaptera novaeangliae) have described a repertoire of sounds including barks, bellows, chirps, cries, croaks, groans, growls, grumbles, horns, moans, purrs, screams, shrieks, sighs, sirens, snorts, squeaks, thwops, trumpets, violins, wops, and yaps (Dunlop et al. 2007, 2013). While it is potentially convenient for researchers within a group to discuss sounds this way, it is more difficult for others, and perhaps impossible for foreign-language speakers to recognize the sound type. An example of this difficulty in describing a sound is the ubiquitous rooster crow, which can be described by a US citizen as “cock-a-doodle-doo” and by a German citizen as “kikeriki”. Roosters make the same sound, no matter in which country they live, yet their single sound has been named so differently, as has the bark of dogs (Fig. 8.1). Of course, onomatopoeic naming of sounds also fails when the sounds are outside of the human hearing range.

If the above was not confusing enough, bird calls have been described using onomatopoeic phrases. For example, the song of a white-throated sparrow (Zonotrichia albicollis) has been described in Canada as sounding like “O sweet Canada Canada Canada” and in New England, USA, as “Old Sam Peabody Peabody Peabody.” Another example is the barred owl (Strix varia), which hoots “Who cooks for you? Who cooks for you all?”.

8.2.2 Naming Sounds Based on Animal Behavior

Researchers sometimes name sounds based on observed and interpreted animal behavior. For example, the various echolocation signals described for insectivorous bats have been named “search clicks” (i.e., slow and regular clicks) while pursuing insect prey and “terminal feeding buzz” (i.e., accelerated click trains) during prey capture (Griffin et al. 1960). The bird and mammal literature is replete with sounds named for a behavior, such as the begging call of nestling chicks (Briskie et al. 1999; Leonard and Horn 2001), the contact call for isolated young (Kondo and Watanabe 2009), and the alarm call warning of a nearby predator (Zuberbuhler et al. 1999; Gill and Bierema 2013). In some cases, the function of sounds has been studied in detail, which justifies using their function in the name. Examples are feeding buzzes in echolocation or alarm calls in primates. However, naming sounds according to behavior can be misleading because a sound can be associated with several contexts. Names based on the associated behavior should really only be used after detailed studies of context-specificity of the calls in question.
[Figure 8.1 labels: the bark words OUAF OUAF, GUAU GUAU, GAHF GAHF, BOW WOW, WAN WAN, GUK GUK, WAI WAI, and AU AU, shown for Russia, North America, France, Spain, Japan, Nigeria, Indonesia, and Brazil.]

Try It! Say all the words out loud. Which words do you think sound most like a dog barking?

Fig. 8.1 Dogs speak out. Labels used for dog barks in different countries
An example of the confusion that can arise from different representations of sound is the boing sound made by minke whales (Balaenoptera acutorostrata), which was given an onomatopoeic name. In spectrograms, the boing might look like an FM sound (Fig. 8.3a); however, it is actually a series of rapid pulses (Rankin and Barlow 2005), similar to burst-pulse sounds produced by odontocetes (e.g., Wellard et al. 2015). As another example, the bioduck sound made by Antarctic minke whales (Balaenoptera bonaerensis) got its name because it resembles a duck’s quack to human listeners (Risch et al. 2014). A spectrogram of the bioduck sound appears as a series of pulses; however, each pulse actually is a 0.3-s FM downswept tone from 300 to 100 Hz (Fig. 8.3b). As if this was not enough in terms of interesting sounds and odd names, dwarf minke whales produce the so-called star-wars sound, which is composed of a series of pulses with varying pulse rates (Gedamke et al. 2001). The different pulse rates make this sound appear as a mixture of broadband pulses and FM sounds in spectrograms, depending on the spectrogram settings (Fig. 8.3c). The sound name presumes the reader is familiar with the soundtrack of an American movie from the 1970s.
Fig. 8.3 Spectrograms of the dwarf minke whale boing (a fs = 16 kHz, NFFT = 1024, 50% overlap, Hann window), the Antarctic minke whale bioduck sound (b fs = 96 kHz, NFFT = 8192, 50% overlap, Hann window), and the dwarf minke whale star-wars sound (c fs = 44 kHz, NFFT = 4096, 50% overlap, Hann window). Recordings a and b from Erbe et al. (2017), c from Gedamke et al. (2001)
8.2.5 Naming Sounds Based on Human Communication Patterns

The term “song” is perhaps the best-known example of using human communication labels in the description of animal sounds. The word “song” may be used to simply indicate long-duration displays of a specific structure. Songs of insects and frogs are relatively simple sequences, consisting of the same sound repeated over long periods of time. The New River tree frog (Trachycephalus hadroceps), for example, produces nearly 38,000 calls in a single night (Starnberger et al. 2014). Many frogs use trilling notes in mate attraction, which has been described as song, but switch to a different vocal pattern in aggressive territorial displays (Wells 2007). In some frog songs, different notes serve different purposes, with one type of note warding off competing males, and another attracting females. In birds and mammals, songs are often more complex, consisting of several successive sounds in a recognizable pattern. They appear to be used primarily for territorial defense or mate attraction (Bradbury and Vehrencamp 2011). Our statements in this chapter show one way to describe calls and songs in animals; however, it is important to note that borrowing terminology from human communication when studying animals can lead to confusion. The terms we discuss here are not well defined and are used differently by different authors. Make sure to pay close attention to these definitions when reading literature about animal communication.

Some ornithologists have used human-language properties further to describe the structure of bird song. Song may be broken down into phrases (also called motifs). Each phrase is composed of syllables, which consist of notes (or elements, the smallest building blocks; Catchpole and Slater 2008). Notes, syllables, and phrases are identified and defined based on their repeated occurrence. An entire taxon of birds (songbirds, Order Passeriformes) has been designated by ornithologists because of their use of these elaborate sounds for territorial defense and/or mate attraction. Birds of this taxon usually use sets of sounds that are repeated in an organized structure. In many species, males produce such songs continuously for several hours each day, producing thousands of songs in each performance. In the bird song literature, songs are distinguished from calls by their more complex and sustained nature, species-typical patterns, or syntax that governs their combination of syllables and notes into a song. Songs are under the influence of reproductive hormones and associated with courtship (Bradbury and Vehrencamp 2011). Bird song can vary geographically and over time (e.g., Fig. 8.4; Camacho-Alpizar et al. 2018). In contrast, calls are typically acoustically simple and serve non-reproductive, maintenance functions, such as coordination of parental duties, foraging, responding to threats of predation, or keeping members of a group in contact (Marler 2004).

Several terrestrial mammals have been reported to sing. For instance, adult male rock hyraxes (Procavia capensis) engage throughout most of the year in rich and complex vocalization behavior that is termed singing (Koren et al. 2008). These songs are complex signals and are composed of multiple elements (chucks, snorts, squeaks, tweets, and wails) that encode the identity, age, body mass, size, social rank, and hormonal status of the singer (Koren and Geffen 2009, 2011). Holy and Guo (2005) described ultrasonic sounds from male laboratory mice (Mus musculus) as song. Von Muggenthaler et al. (2003) reported that Sumatran rhinoceros (Dicerorhinus sumatrensis) produce a song composed of three sound types: eeps (simple short signals, 70 Hz–4 kHz), humpback-whale-like sounds (100 Hz–3.2 kHz, varying in length, only produced by females), and whistle blows (loud, 17 Hz–8 kHz vocalizations followed by a burst of air with strong infrasonic content). Clarke et al. (2006) described the syntax and meaning of wild white-handed gibbon (Hylobates lar) songs.

Among marine mammals, blue, bowhead (Balaena mysticetus), fin, humpback, minke, and right whales, Weddell seals (Leptonychotes weddellii), harbor seals (Phoca vitulina), and
Fig. 8.4 Geographic variation in birdsong. These spectrograms show a portion of song from Timberline wrens (Thryorchilus browni) recorded at four locations in Costa Rica (CBV = Cerro Buena Vista, CV = Cerro Vueltas, CCH = Cerro Chirripó, IV = Irazú Volcano) (Camacho-Alpizar et al. 2018). © Camacho-Alpizar et al.; https://doi.org/10.1371/journal.pone.0209508. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
walrus (Odobenus rosmarus) have all been reported to sing (Payne and Payne 1985; Sjare et al. 2003; McDonald et al. 2006; Stafford et al. 2008; Oleson et al. 2014; Crance et al. 2019). The songs of blue, bowhead, fin, minke, and right whales are simple compared to those of the humpback whale and little is known about the behavioral context of song in any marine mammal species besides the humpback whale. Humpback whales are well-known for their long, elaborate songs. These songs are composed of themes consisting of repetitions of phrases made up of patterns of units similar to syllables in bird song (Fig. 8.5; Payne and Payne 1985; Helweg et al. 1998). Winn and Winn (1978) suggested that only male baleen whales sing, as a means of reproductive display. Sjare et al. (2003) reported that Atlantic walrus produce two main songs: the coda song and the diving vocalization song that differ by their pattern of knocks, taps, and bell sounds.

Song production does not exclude the emission of non-song sounds and most singing species likely emit both. The non-song sounds of humpback and pygmy blue whales (Balaenoptera musculus brevicauda), for example, have been cataloged (e.g., Recalde-Salas et al. 2014, 2020). Some song units may resemble non-song sounds. Whether sounds are part of song or not, their detection and classification can be challenging when repertoires are large and possibly variable across time and space. Humpback whale songs, for example, vary by region and year (Cerchio et al. 2001; Payne and Payne 1985).

Characterizing and describing the structure of song can be a difficult task even for the experienced bioacoustician. With the assistance of computer analysis tools, sound detection and classification may be more efficient.

8.3 Detection of Animal Sounds

recorded in order to study the mating behavior of this species. Listening to the first few minutes of recording, the bioacoustician can easily hear the target species, but there are calls every few seconds, too many to pick by hand. So, the scientist looks for software tools to help detect all frog signals, and potentially sort them based on their acoustic features. The first step, signal detection, is discussed in Sect. 8.3; the second step, signal classification, is discussed in Sect. 8.4.

Automated signal detectors work by common principles. The raw input data are the ideally calibrated time series of pressure recorded with a microphone in air or hydrophone in water. There might be one or more pre-processing steps to filter or Fourier transform the data in successive time windows (see Chap. 4). The pre-processed time series is then fed into the detector, which computes a specific quantity from the acoustic data. This may be instantaneous energy, energy within a specified time window, entropy, or a correlation coefficient, as a few examples. Then, a detection threshold is applied. If the quantity exceeds the threshold, the signal is deemed present, otherwise not.

The threshold is commonly computed the following way:

$$E_{\mathrm{th}} = \bar{E} + \gamma \sigma_E$$

where $E$ symbolizes the chosen quantity (e.g., energy), $\bar{E}$ is its mean value computed over a long time window (e.g., an entire file), $\sigma_E$ is the standard deviation, and $\gamma$ is a multiplier (integer or real). Setting a high threshold will result in only the strongest signals being detected and weaker ones being missed. Setting a low threshold will result in many false alarms, which are not signals. By varying $\gamma$, the ideal threshold may be found and the performance of the detector may be assessed (see Sect. 8.3.6).
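The thresholding rule described in this section (mean of the chosen quantity plus a multiple of its standard deviation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration only: the function name energy_threshold_detector, the 0.1-s window, the multiplier of 3, and the synthetic recording (Gaussian noise with a 0.5-s, 1-kHz tone inserted at 4 s). The quantity used is energy within a fixed time window.

```python
import numpy as np

def energy_threshold_detector(x, fs, win_s=0.1, gamma=3.0):
    """Flag windows whose energy exceeds mean + gamma * std (hypothetical helper)."""
    n = int(round(win_s * fs))                  # samples per window
    n_win = len(x) // n                         # number of complete windows
    frames = np.asarray(x[: n_win * n], dtype=float).reshape(n_win, n)
    energy = np.sum(frames ** 2, axis=1)        # energy within each window
    # Mean and standard deviation are computed over the whole recording.
    threshold = energy.mean() + gamma * energy.std()
    return energy > threshold, energy, threshold

# Synthetic 10-s recording (assumed values): noise plus a tone from 4.0 to 4.5 s.
rng = np.random.default_rng(0)
fs = 8000
x = 0.1 * rng.standard_normal(10 * fs)
t = np.arange(int(0.5 * fs)) / fs
x[4 * fs : 4 * fs + t.size] += np.sin(2 * np.pi * 1000 * t)

detections, energy, threshold = energy_threshold_detector(x, fs)
print("windows flagged as signal:", np.flatnonzero(detections))
```

Because the mean and standard deviation are computed over the entire file, strong or frequent signals raise the threshold themselves; re-running the sketch with a larger gamma shows how quickly detections are lost.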
Fig. 8.5 Spectrogram of the song structure of humpback whales, with sounds organized by theme, phrases, and units (Garland et al. 2017). © Acoustical Society of America, 2017. All rights reserved
[Figure 8.6: spectrogram annotated with “fin whale chorus”, “Antarctic blue whale chorus”, and “ship noise”; horizontal axis: Date (14/9 to 2/10).]

Fig. 8.6 Spectrogram showing three weeks of choruses by fish, fin whales, and blue whales in the Perth Canyon, Australia (modified from Erbe et al. 2015). Fish raised ambient levels by 20 dB in the 1800–2500 Hz band every night. Fin whales raised ambient levels by 20 dB in the 15–35 Hz band over two days. Antarctic blue whales were the cause of ongoing tones at 18 and 28 Hz for weeks at a time. Colors represent power spectral density (PSD). Black arrows point to strong noise from passing ships. © Erbe et al.; https://doi.org/10.1016/j.pocean.2015.05.015. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
Fig. 8.7 Spectrogram of the kernel for Omura’s whales’ (Balaenoptera omurai) doublet calls, computed as the average of over 800 hand-picked calls (Madhusudhana et al. 2020)

[Figure 8.8: spectrogram; vertical axis: Frequency (Hz), horizontal axis: Time (s).]

Fig. 8.8 Spectrogram of marine mammal tonal sounds with negative entropy (black curve) overlain. Negative entropy is high when the power spectral density is concentrated in a few narrow frequency bands (Erbe and King 2008)

drawback to this method is that it can be prohibitively processor-intensive. To speed up the calculations, Harland (2008) first employed an energy threshold detector (as described above) to detect times of potential signal presence and then used spectrogram cross-correlation to detect individual signals within the flagged time periods.

The matched filter approach for sound classification is similar to spectrogram cross-correlation but is performed in the time domain. This means that the waveforms (i.e., sound pressure as a function of time) are correlated instead of the spectrograms. A kernel of the waveform of the sound to be detected is produced, often empirically using a high-quality recording, and then cross-correlated with the incoming signal (i.e., the time series of sound pressure). Matched filters are efficient at detecting signals in Gaussian noise (white noise), but colored noise (typical in many natural environments) poses more of a problem. As with spectrogram cross-correlation, the selection of kernels is critical to the performance of the detector. Matched filters are only appropriate for detection of well-known, stereotyped acoustic features, such as sounds produced by cane toads (Dang et al. 2008), blue whales (Stafford et al. 1998; Bouffaut et al. 2018), and beaked whales (Hamilton and Cleary 2010). Their performance suffers in the presence of even a small amount of sound variation compared to the kernel.

In general, entropy measures the disorder or uncertainty of a system. Applied to communication theory, the information entropy (also called Shannon entropy; Shannon and Weaver 1998) measures the amount of information contained in a data stream. Entropy is computed as the negative sum, over all outcomes, of each probability multiplied by its logarithm. Therefore, a strongly peaked probability distribution has low entropy, while a broad probability distribution has high entropy. If applied to an acoustic power spectral density distribution, entropy measures the peakedness of the power spectra and detects narrowband signals in broadband noise (Fig. 8.8). Spectral entropy has successfully been applied to animal sounds; for example, from birds, beluga whales (Delphinapterus leucas), bowhead whales, and walruses (Erbe and King 2008; Mellinger and Bradbury 2007; Valente et al. 2007).
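The idea of spectral entropy as a detection quantity can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the cited studies: the function name spectral_entropy, the 256-point Hann window, the 50% overlap, the normalization to the range 0 to 1, and the synthetic test signal are all choices made here. Each frame's power spectrum is normalized to sum to one, treated as a probability distribution, and its Shannon entropy is computed.

```python
import numpy as np

def spectral_entropy(x, nfft=256, hop=128):
    """Normalized Shannon entropy of the power spectrum in successive frames."""
    window = np.hanning(nfft)
    n_frames = 1 + (len(x) - nfft) // hop
    n_bins = nfft // 2 + 1
    H = np.empty(n_frames)
    for i in range(n_frames):
        seg = x[i * hop : i * hop + nfft] * window
        psd = np.abs(np.fft.rfft(seg)) ** 2     # power spectrum of this frame
        p = psd / psd.sum()                     # treat it as a probability distribution
        p = p[p > 0]                            # skip empty bins (0 * log 0 is taken as 0)
        H[i] = -np.sum(p * np.log2(p)) / np.log2(n_bins)
    return H

# Synthetic 2-s recording (assumed values): noise plus a 1-kHz tone from 1.0 to 1.5 s.
rng = np.random.default_rng(1)
fs = 8000
x = 0.1 * rng.standard_normal(2 * fs)
t = np.arange(int(0.5 * fs)) / fs
x[fs : fs + t.size] += np.sin(2 * np.pi * 1000 * t)

H = spectral_entropy(x)
centers = (np.arange(H.size) * 128 + 128) / fs  # frame center times in seconds
tone_H = H[(centers > 1.05) & (centers < 1.45)].mean()
noise_H = H[centers < 0.9].mean()
print(f"mean entropy in tone frames: {tone_H:.2f}, in noise-only frames: {noise_H:.2f}")
```

Frames containing the tone have a peaked spectrum and therefore low entropy, so a detector would apply its threshold to the negative entropy, consistent with the caption of Fig. 8.8.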
Reported Output     Detector Input: Signal Present            Detector Input: Signal Absent
Signal Present      True Positive (TP), Correct Detection     False Positive (FP), False Alarm
Signal Absent       False Negative (FN), Missed Detection     True Negative (TN), Correct Rejection
Fig. 8.11 (a) Generalized receiver operating characteristic (ROC) plot, in which the probability of true positives is plotted against the probability of false positives. Areas in this graph that correspond to a liberal bias, conservative bias, and deliberate mistakes are indicated. (b) Example ROC curves computed during the development of automated detectors for marine mammal calls in the Arctic. The spectral entropy detector outperformed others (Erbe and King 2008)
The major diagonal in Fig. 8.11a represents performance at chance, where the probabilities of TP and FP are equal. Responses falling below the line would indicate deliberate mistakes. The minor diagonal represents neutral bias, and splits responses into conservative versus liberal. A conservative response strategy yields decreased correct detection and false alarm probabilities; a liberal response strategy yields increased correct detection and false alarm probabilities. An example ROC curve is given in Fig. 8.11b, comparing the performances of three detectors (operating on underwater acoustic recordings from the Arctic and trying to detect marine mammal calls) based on: (1) spectral entropy, (2) bandpassed energy, and (3) waveform (i.e., broadband) energy. The performance of the entropy detector surpassed that of the other two.

8.3.6.3 Precision and Recall

The performance of a detector can be over-estimated using a ROC curve when there is a large difference between the numbers of TPs and TNs. In addition, estimation of the number of TNs requires discrete sampling units. The duration of the discrete sampling units is often somewhat arbitrary and can lead to unrealistic differences between the numbers of TPs and TNs. In these situations, precision and recall (P-R) can provide a more accurate representation of detector performance because this representation does not rely on determining the number of true negatives (Davis and Goadrich 2006). In the P-R framework, events are scored only as TPs, FPs, and FNs.

Precision is a measure of accuracy and is the proportion of automated detections that are true detections.

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

Recall is a measure of completeness and is the proportion of true events that are detected. This is the same as the true positive rate defined in the ROC framework.

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

Detectors can be evaluated by plotting precision against recall (Fig. 8.12). An ideal detector would have both scores approaching a value of 1. In other words, the curve would approach the upper right-hand corner of the graph (Davis and Goadrich 2006). Precision and recall also can be
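As a worked illustration of the precision and recall definitions in this section, the following sketch scores a set of candidate events at several detection thresholds, which yields the points of a P-R curve. The function names, the detector scores, the ground-truth labels, and the thresholds are all made up for this example.

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    recall = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    return precision, recall

def pr_curve(scores, labels, thresholds):
    """Precision and recall at each threshold (no true negatives are needed)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    points = []
    for thr in thresholds:
        detected = scores >= thr
        tp = int(np.sum(detected & labels))     # correct detections
        fp = int(np.sum(detected & ~labels))    # false alarms
        fn = int(np.sum(~detected & labels))    # missed detections
        points.append(precision_recall(tp, fp, fn))
    return points

# Hypothetical detector scores for six candidate events; label 1 marks a true call.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = pr_curve(scores, labels, thresholds=[0.85, 0.5, 0.2])
for thr, (p, r) in zip([0.85, 0.5, 0.2], points):
    print(f"threshold {thr}: precision {p:.2f}, recall {r:.2f}")
```

Lowering the threshold in this toy example raises recall while precision falls, which is the trade-off that a P-R plot makes visible.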
[Figure 8.13: spectrogram; vertical axis: Frequency [kHz] (0–40), horizontal axis: Time [ms] (0–600).]

Fig. 8.13 Spectrogram of a pilot whale (Globicephala melas) whistle showing the following features: Start frequency (Start f), End frequency (End f), Maximum frequency (Max f), Minimum frequency (Min f), locations of two local maxima and one local minimum in the fundamental contour, four inflection points (where the curvature changes from clockwise to counter-clockwise, or vice versa), and one overtone (Courts et al. 2020). © Courts et al.; https://www.nature.com/articles/s41598-020-74111-y/figures/5. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
literature (e.g., Mellinger et al. 2011; Roch et al. 2011; Gillespie et al. 2013; Kershenbaum et al. 2016). Spectrographic measurements of bat calls, for example, can be extracted using Analook (Titley Scientific, Columbia, MO, USA), SonoBat (Joe Szewczak, Department of Biology, Humboldt State University, Arcata, CA, USA), or Kaleidoscope Pro (Wildlife Acoustics, Inc., Maynard, MA, USA), exported to an Excel spreadsheet (XML, CSV, and other formats), classified using machine learning algorithms, and compared to a reference library for identification.

8.4.1.2 Cepstral Features

Cepstral coefficients are spectral features of bioacoustic signals commonly used in human speech processing (Davis and Mermelstein 1980). These features are based on the source-filter model of human speech analysis, which has been applied to many different animal species (Fitch 2003). Cepstral coefficients are well-suited for statistical pattern-recognition models because they tend to be uncorrelated (Clemins et al. 2005), which significantly reduces the number of parameters that must be estimated (Picone 1993). Cepstral coefficients are calculated by computing the Fourier transform in successive time windows over the recorded pressure time series of a sound (see Chap. 4). The frequency axis then is warped by multiplying the spectrum with a series of n filter functions at appropriately spaced frequencies. This is done because there is evidence that many animals perceive frequencies on a logarithmic scale, in a similar fashion to humans (Clemins et al. 2005). The output of the frequency band filters is then used as input to a discrete cosine transform, which results in an n-dimensional cepstral feature vector (Picone 1993; Clemins et al. 2005; Roch et al. 2007, 2008).

Using cepstral feature space allows the timbre of sounds to be captured, a quality that is lost when extracting parameters from spectrograms. Roch et al. (2007) developed an automated classification system based on cepstral feature vectors extracted for whistles, burst-pulse sounds, and clicks produced by short- and long-beaked
common dolphins (Delphinus spp.), Pacific white-sided dolphins (Lagenorhynchus obliquidens), and bottlenose dolphins (Tursiops truncatus). The system did not rely on specific sound types and had no requirement for separating individual sounds. The system performed relatively well, with correct classification scores of 65–75%, depending on the partitioning of the training- and test-data. Cepstral feature vectors also have been used as input to classifiers for many other animal species, including groupers (Epinephelus guttatus, E. striatus, Mycteroperca venenosa, M. bonaci; Ibrahim et al. 2018), frogs (Gingras and Fitch 2013), song birds (Somervuo et al. 2006), African elephants (Zeppelzauer et al. 2015), and beluga, bowhead, gray (Eschrichtius robustus), humpback, and killer (Orcinus orca) whales, and walrus (Mouy et al. 2008). Cepstral features appear to be a promising alternative to the traditional time- and frequency-parameters measured from spectrograms as input to classification algorithms. However, cepstral features are relatively sensitive to the SNR, the signal’s phase, and modeling order (Ghosh et al. 1992).
Noda et al. (2016) used mel-frequency cepstral coefficients and random forest analyses to classify sounds produced by 102 species of fish and compared the performance of three classifiers: k-nearest neighbors, random forest, and support vector machines (SVMs). The mel-frequency cepstrum (or cepstrogram) is a form of acoustic power spectrum (or spectrogram) that is computed as a linear cosine transform of a log-power spectrum that is presented on a nonlinear mel-scale of frequency. The mel-scale resembles the human auditory system better than the linearly-spaced frequency bands of the normal cepstrum. All three classifiers performed similarly, with average classification accuracy ranging between 93% and 95%.
8.4.2 Statistical Classification of Animal Sounds
For some sounds, qualitative classification is sufficient. Janik (1999) reported that humans were able to identify dolphin signature whistles more reliably than computer methods. A problem with qualitative classification of sounds in a repertoire (and taxonomy in general), however, is that some listeners are “splitters” and other listeners are “lumpers.” So, even researchers on the same project could classify an animal’s sound repertoire differently. One way to avoid individual researcher differences in classification is to use graphical, statistical, and computer-automated methods that objectively sort and compare measured variables that describe the sounds. A variety of statistical methods can be employed to classify animal sounds into categories (Frommolt et al. 2007). Below are brief descriptions of some of the statistical methods that are commonly used for classification of animal sounds.
8.4.2.1 Parametric Clustering
Parametric cluster analysis produces a dendrogram (i.e., classification tree) that organizes similar sounds into branches of a tree. A distance matrix also is generated, which gives correlation coefficients between all variables in the dataset. The resulting distance index ranges from 0 (very similar sounds) to 1 (totally dissimilar sounds). The matrix can then be joined by rows or columns to examine relationships. The type of linkage and type of distance measurement can be selected to find the best fit for a particular dataset (Zar 2009).
Cluster analysis has been used to classify sound types in several species, including owls (Nagy and Rockwell 2012), mice (Hammerschmidt et al. 2012), rats (Rattus norvegicus, Takahashi et al. 2010), African elephants (Wood et al. 2005), and primates (Hammerschmidt and Fischer 1998). In a study of six populations of the neotropical frog (Proceratophrys moratoi) in Brazil, Forti et al. (2016) measured spectrographic variables from calls produced by males and performed cluster analysis to examine similarities in acoustic traits (based on the Bray–Curtis index of acoustic similarity) across the six locations (Fig. 8.14). Baptista and Gaunt (1997) used hierarchical cluster analysis of correlation coefficients of several acoustic parameters to categorize sounds of the sparkling violet-eared hummingbird (Colibri
coruscans), which is found in two neighboring assemblages in their study area. A matrix of sound similarity values obtained from spectral cross-correlation of these birds’ songs indicated similar sound types from the two areas. Yang et al. (2007) used cluster analysis to examine syllable sharing between individuals of Anna’s hummingbird (Calypte anna). They identified 38 syllable types in songs of 44 males, which clustered into five basic syllable categories: “Bzz,” “bzz,” “chur,” “ZWEE,” and “dz!”. Also, microgeographic song variation patterns were found in that nearest neighbors sang more similar songs than non-neighbors. Pozzi et al. (2010) used several acoustic variables to group black lemur (Eulemur macaco macaco) sounds into categories, including the frequencies of the fundamental and of the first three harmonic overtones (measured at the start, middle, and end of each call), and the total duration. The agreement of this analysis with manual classification was high (>88.4%) for six of eight categories.
Fig. 8.14 Dendrogram from a hierarchical cluster analysis of the call similarities between 15 male Proceratophrys moratoi from different sites and two other Odontophrynidae species (Forti et al. 2016). © Forti et al.; https://peerj.com/articles/2014/. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
8.4.2.2 Principal Component Analysis
Principal component analysis (PCA) is a multivariate statistical method that examines a set of measurements such as the feature vectors discussed earlier in Sect. 8.4. These features may well be correlated. For example, bandwidth is sometimes correlated with maximum frequency, or the number of inflection points can be correlated with signal duration (Ward et al. 2016). PCA performs an orthogonal transformation that converts the potentially correlated variables (i.e., the features) into a set of linearly uncorrelated variables (i.e., the principal components; Hotelling 1933; Zar 2009). The principal components are linear combinations of the original variables (features). Plotting the principal components against each other shows how the measurements cluster.
For example, by examining bat biosonar signals in multivariate space, bat species that are very similar in external appearance can be distinguished. Using PCA, Gannon et al. (2001) found ear height and characteristic frequency were correlated, along with duration of the signal (Fig. 8.15).
[Figure 8.15 plot: PC2 (Pinna Shape) versus PC1 (Call Frequency Characters); legend: MYCI, MYCA]
Fig. 8.15 Plot showing the results of principal component analysis, in which two cryptic species of myotis bats (California myotis, Myotis californicus, MYCA, black squares; western small-footed bat, M. ciliolabrum, MYCI, hollow circles) were distinguished by differences in ear height and characteristic frequency of their echolocation signals. Plotted is characteristic frequency versus signal duration for these species recorded from field sites in New Mexico and Arizona, USA
As another example, Briefer et al. (2015) categorized emotional states associated with variation in whinnies from 20 domestic horses (Equus ferus) using PCA. They designed four situations to elicit different levels of emotional arousal that were likely to stimulate whinnies: separation (negative situation) and reunion (positive situation) with either all group members (high
emotional arousal) or only one group member (moderate emotional arousal). The authors measured 21 acoustic features from whinnies (Fig. 8.16). PCA transformed the feature vectors into six principal components that accounted for 83% of the variance in the original dataset.
Fig. 8.16 Spectrograms and oscillograms of horse whinnies in negative (a, c) and positive (b, d) situations emitted by two different horses. Red arrows point to fundamental frequencies (F0, G0) and first overtones (H1). Negative whinnies (a, c) are longer in duration and have higher G0 fundamentals than positive whinnies (b, d; Briefer et al. 2015). © Briefer et al.; https://www.nature.com/articles/srep09989/figures/3. Licensed under CC BY 4.0; http://creativecommons.org/licenses/by/4.0/
8.4.2.3 Discriminant Function Analysis
In discriminant function analysis (DFA), canonical discriminant functions are calculated using variables measured from a training dataset. One canonical discriminant function is produced for each sound type in the dataset. Variables measured from sounds in the test dataset are then substituted into each function and each sound is classified according to the function that produced the highest value. Because DFA is a parametric technique, it is assumed that input data have a multivariate normal distribution with the same covariance matrix (Afifi and Clark 1996;
Zar 2009). Violations of these assumptions can create problems with some datasets. One of the main weaknesses of DFA for animal sound classification is that it assumes classes are linearly separable. Because a linear combination of variables takes place in this analysis, the feature space can only be separated in certain, restricted ways that are not appropriate for all animal sounds. Figure 8.17 shows the DFA separation of California chipmunk (genus Neotamias) taxa that are morphologically similar but acoustically different, using six variables measured from their sounds.
[Figure 8.17 plot: Discriminant Function 2 versus Discriminant Function 1; legend: Neotamias siskiyou, Neotamias townsendii, Neotamias senex, Neotamias ochrogenys]
Fig. 8.17 Plot resulting from discriminant function analysis. Four species of Townsend-group chipmunks (Townsend’s chipmunk, Neotamias townsendii; Siskiyou chipmunk, N. siskiyou; Allen’s chipmunk, N. senex; and yellow-cheeked chipmunk, N. ochrogenys) in northern California, USA, produced discernibly different sounds. Discriminant function 1 was dominated by differences in maximum frequency of the signal and discriminant function 2 was most influenced by temporal features including total signal length and the number of signals emitted by a chipmunk during a signaling bout
8.4.2.4 Classification Trees
Classification tree analysis is a non-parametric statistical technique that recursively partitions data into groups known as “nodes” through a series of binary splits of the dataset (Clark and Pregibon 1992; Breiman et al. 1984). Each split is based on a value for a single variable and the criteria for making splits are known as primary splitting rules. The goal for each split is to divide the data into two nodes, each as homogeneous as possible. As the tree is grown, results are split into successively purer nodes. This continues until each node contains perfectly homogeneous data (Gillespie and Caillat 2008). Once this maximal tree has been generated, it is pruned by removing nodes and examining the error rates of these smaller trees. The smallest tree with the highest predictive accuracy is the optimal tree (Oswald et al. 2003).
Tree-based analysis provides several advantages over some of the other classification techniques. It is a non-parametric technique; therefore, data do not need to be normally distributed as required for other methods, such as DFA. In addition, tree-based analysis is a simple and naturally intuitive way for humans to classify sounds. It is essentially a series of true/false questions, which makes the classification process transparent. This allows easy examination of which variables are most important in the classification process. Tree-based analysis also
Fig. 8.18 Classification tree grown using Splus computer software (version S-PLUS 6.2 2003, TIBCO Software Inc., Palo Alto, CA, USA) from 1369 bat calls. The pruned tree used variables measured from each bat call: duration (DUR), minimum frequency (Fmin), characteristic frequency (Fc; i.e., frequency at the flattest part of the call), frequency at the “knee” of the call (Fk), time of Fc, time at Fk, and slope (S1). Along the tangents between boxes are values for variables used to split the nodes (for instance, Fmin is minimum frequency). The fraction below each box is the misclassification rate (e.g., 1/5 = 20% misclassification rate). The tree has 12 terminal nodes defining the branches, resulting in a classification designation for each species (Gannon et al. 2004)
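The binary split at the heart of a classification tree can be illustrated with a small sketch. The chapter does not state here how the homogeneity of a node is scored, so the example assumes the Gini impurity (the measure named for random forests in Sect. 8.4.3.2). The function names, the minimum-frequency values, and the species labels "A" and "B" are hypothetical and serve only as an illustration, not as the method of any cited study.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 0.0 means all labels belong to one class."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Find the threshold on ONE variable that gives the purest two child nodes.

    Returns (threshold, weighted_impurity). The threshold is None if no
    split improves on the impurity of the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # identical values cannot be separated by a threshold
        threshold = (low + high) / 2.0
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        # Impurity of the two children, weighted by the number of sounds in each
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical minimum frequencies (kHz) of six calls from two made-up species
fmin = [38.0, 40.5, 41.0, 47.5, 48.0, 50.0]
species = ["A", "A", "A", "B", "B", "B"]

threshold, impurity = best_split(fmin, species)
print(threshold, impurity)
```

Growing a full tree would repeat this search over every measured variable at every node and then prune the result; those steps are omitted to keep the sketch short.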
with promising results. Fristrup and Watkins (1993) used tree-based analysis to classify the sounds of 53 species of marine mammal (including mysticetes, odontocetes, pinnipeds, and manatees). Their correct classification score of 66% was 16% higher than the score obtained when applying DFA to the same dataset. The whistles of nine delphinid species were correctly classified 53% of the time by Oswald et al. (2003) using tree-based analysis. Oswald et al. (2007) subsequently applied classification tree analysis to the whistles of seven species and one genus of marine mammal, resulting in a correct classification score of 41%. This score was improved slightly, to 46%, when classification decisions were based on a combination of classification tree and DFA results. Gannier et al. (2010) used classification trees to identify the whistles of five delphinid species recorded in the Mediterranean, with a correct classification score of 63%. Finally, Gillespie and Caillat (2008) classified the clicks of Blainville’s beaked whales (Mesoplodon densirostris), short-finned pilot whales (Globicephala macrorhynchus), and Risso’s dolphins (Grampus griseus). Their tree-based analysis classified 80% of clicks to the correct species.
8.4.2.5 Nonlinear Dimensionality Reduction
Clustering techniques described above require that certain features or measurements, as appropriate for the problem domain, be available beforehand. They are gathered from sound recordings either manually (e.g., number of inflection points in whistle contours, number of harmonics) or using signal processing tools (e.g., peak frequency, energy), or both. Manual extraction of features is usually time-consuming and often inefficient, especially when dealing with recordings covering large spatial and temporal scales. Automated extraction of measurements improves efficiency and eliminates the risk of human biases. However, when recordings contain a lot of confounding sounds or have extreme noise variations, reliability and accuracy of the measurements can become questionable and can have adverse effects on clustering outcomes. Regardless of whether manual or automated approaches were employed, the resulting limited set of chosen features or measurements are essentially representations of the underlying data in a reduced space. Such dimensionality reduction is typically aimed at making the downstream task of clustering (with PCA, DFA, etc.) computationally tractable.
In recent years, nonlinear dimensionality reduction methods have gained widespread popularity, specifically in applications for exploring and visualizing very high-dimensional data. Originally popular for processing image-like data in the field of machine learning, these methods bring about dimensionality reduction without requiring one to explicitly choose and extract features. The methods can be easily adapted for processing bioacoustic recordings wherein the qualitative cluster structure (i.e., similarities in the visually identifiable information) in spectrogram-like data (e.g., mel-spectrogram or cepstrogram) containing hundreds or thousands of time-frequency points is effectively captured in an equivalent 2- or 3-dimensional space (e.g., Sainburg et al. 2019; Kollmorgen et al. 2020).
One of the earlier methods for capturing nonlinear structure, the t-distributed stochastic neighbor embedding (t-SNE; van der Maaten and Hinton 2008) is based on non-convex optimization. It computes a similarity measure between pairs of points (data samples) in the original high-dimensional space and in the reduced space, then minimizes the Kullback–Leibler divergence between the two sets of similarity measures. t-SNE tries to preserve distances in a neighborhood whereby points close together in the high-dimensional space have a high probability of staying close in the reduced space. The Bird Sounds project (Tan and McDonald 2017) presents an excellent demonstration of using t-SNE for organizing thousands of bird sound spectrograms in a 2-dimensional similarity grid.
Some of the shortcomings of t-SNE were addressed in a newer method called uniform manifold approximation and projection (UMAP; McInnes et al. 2018). UMAP is backed with a strong theoretical framework. While effectively capturing local structures like t-SNE, UMAP
also offers a better promise for preserving some global structures (inter-cluster relationships). UMAP processes data faster and is capable of handling very high-dimensional data. Fig. 8.19 is a demonstration of the use of UMAP for clustering sounds of five species of katydids (Tettigoniidae) from Panamanian rainforest recordings (Madhusudhana et al. 2019). Inputs to UMAP clustering comprised spectrograms (dimensions 216h × 469w) computed from 1-s clips containing katydid call(s). The inputs often contained confounding sounds and varying noise levels. The clustering results, however, demonstrate the utility of UMAP as a quick means to effective clustering. UMAP has also been used, in combination with a pre-trained neural network, for assessing habitat quality and biodiversity variations from soundscape recordings across different ecosystems (Sethi et al. 2020).
Fig. 8.19 Demonstration of clustering katydid sounds using UMAP. Randomly chosen samples of call spectrograms of the five species considered are shown on the left, and clustering outcomes are shown on the right. The clustering activity has successfully captured both inter-species and intra-species variations
We have presented here two popular methods that are currently trending in this field of research. There are, however, other alternatives available including earlier methods such as isomap (Tenenbaum et al. 2000) and diffusion map (Coifman et al. 2005), newer variants of t-SNE (e.g., Maaten 2014; Linderman et al. 2017), and modern variants of variational autoencoders (Kingma and Welling 2013).
8.4.3 Model Based Classification
8.4.3.1 Artificial Neural Networks
Artificial neural networks (ANNs) were developed by modeling biological systems of information-processing (Rosenblatt 1958) and became very popular in the areas of word recognition in human speech studies (e.g., Waibel et al. 1989; Gemello and Mana 1991) and character or image-recognition (e.g., Fukushima and Wake 1990; Van Allen et al. 1990; Belliustin et al. 1991) in the 1980s. Since that time, ANNs have been used successfully to classify a number of complex signal types, including quail crows (Coturnix spp., Deregnaucourt et al. 2001), alarm sounds of Gunnison’s prairie dogs (Cynomys gunnisoni, Placer and Slobodchikoff 2000), stress sounds by domestic pigs (Sus scrofa domesticus, Schon et al. 2001), and dolphin echolocation clicks (Roitblat et al. 1989; Au and Nachtigall 1995).
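To make the idea of an ANN classifying sounds from measured features more concrete, the sketch below trains the simplest possible network, a single artificial neuron in the spirit of the perceptron of Rosenblatt (1958). It is an illustration only and is far simpler than the networks used in the studies cited above. The feature values (duration and peak frequency), the two sound types, the learning rate, and all function names are hypothetical.

```python
def predict(weights, bias, features):
    """Return 1 if the weighted sum of the features exceeds zero, else 0."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0

def train_perceptron(samples, labels, learning_rate=0.1, max_epochs=1000):
    """Fit the weights with the perceptron learning rule.

    Training stops as soon as one full pass over the data produces no
    errors, or after max_epochs passes (an assumed safety limit).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for features, label in zip(samples, labels):
            delta = label - predict(weights, bias, features)
            if delta != 0:
                errors += 1
                # Nudge the weights toward the correct answer for this sound
                weights = [w + learning_rate * delta * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * delta
        if errors == 0:
            break
    return weights, bias

# Hypothetical feature vectors: (duration in s, peak frequency in kHz)
calls = [(0.2, 1.0), (0.3, 1.5), (0.25, 1.2), (0.8, 4.0), (0.9, 4.5), (0.7, 3.8)]
sound_type = [0, 0, 0, 1, 1, 1]  # 0 and 1 stand for two made-up sound types

weights, bias = train_perceptron(calls, sound_type)
predicted = [predict(weights, bias, call) for call in calls]
print(predicted)
```

A single neuron can only separate classes with a straight line (or plane) in feature space, which is one reason multi-layer networks are used in practice.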
The greater potential of ANNs remained largely untapped for many years, in part due to prevailing limitations in computational capabilities. In the mid-1980s, backpropagation paved a way for efficiently training multi-layer ANNs (Rumelhart et al. 1986). Backpropagation, an algorithm for supervised learning of the weights in an ANN using gradient descent, greatly facilitated development of deeper networks (having many hidden layers). Many classes of deep neural networks (DNNs; LeCun et al. 2015) such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) became easier to train. While the aforementioned ANN approaches often require hand-picked features or measurements as inputs, DNNs trained with backpropagation demonstrated the ability to learn good internal representations from raw data (i.e., the hidden layers captured non-trivial representations effectively). In their landmark work on using CNNs for the automatic recognition of handwritten digits, LeCun et al. (1989a, b) used backpropagation to learn convolutional kernel coefficients directly from images. Over the past two decades, advances in computing technology, especially the wider availability of graphics processing units (GPUs), have considerably accelerated machine learning (ML) research in many disciplines such as computer vision, speech processing, natural language processing, recommendation systems, etc. Shift invariance is an attractive characteristic of CNNs, which makes them suitable for analyzing visual imagery (LeCun et al. 1989a, b, 1998). CNN-based solutions have consistently dominated many of the large-scale visual recognition challenges. As such, several competing architectures of CNNs have been developed: AlexNet (Krizhevsky et al. 2017), ResNet (He et al. 2016), DenseNet (Huang et al. 2017), etc. Some of these architectures have become the state-of-the-art in computer vision applications such as face recognition, emotion detection, object extraction, scene classification, and also in conservation applications (e.g., species identification in camera trap data, land-use monitoring in aerial surveys). Given the image-like nature of time-frequency representations of acoustic signals (e.g., spectrogram), many of the successes of CNNs in computer vision have been replicated in the field of animal bioacoustics. In contrast to CNNs, RNNs are better suited for processing sequence inputs. RNNs contain internal states (memory) that allow them to “learn” temporal patterns. However, their utility is limited by the “vanishing gradient problem,” wherein the gradients (from the gradient descent algorithm) of the network's output with respect to the weights in the early layers become extremely small. The problem is overcome in modern flavors of RNNs such as long short-term memory (LSTM; Hochreiter and Schmidhuber 1997) networks and gated recurrent unit (GRU; Cho et al. 2014) networks.
These types of ML solutions are heavily data-driven and often require large quantities of training samples. Typically, the training samples are time-frequency representations (e.g., spectrogram or mel-spectrogram) of short clips of recordings (e.g., Stowell et al. 2016; Shiu et al. 2020). Robustness of the resulting models is improved by ensuring that the inputs adequately cover possible variations of the target signals and of the ambient background conditions. Data scientists employ a variety of data augmentation techniques to overcome data shortage. Some examples include introducing synthetic variations such as infusion of Gaussian noise, shifting in time (horizontal shift) and frequency content (vertical shift) (Jaitly and Hinton 2013; Ko et al. 2015; Park et al. 2019). The training process, which involves iteratively lowering a loss function using the backpropagation algorithm, is usually computationally intensive and is often sped up with the use of GPUs.
DNNs have been used in the automatic recognition of vocalizations of insects (e.g., Madhusudhana et al. 2019), fish (e.g., Malfante et al. 2018), birds (e.g., Stowell et al. 2016; Goëau et al. 2016), bats (e.g., Mac Aodha et al. 2018), marsupials (e.g., Himawan et al. 2018), primates (e.g., Zhang et al. 2018), and marine mammals (e.g., Bergler et al. 2019). CNNs have been used in the recognition of social calls, song calls, and whistles (e.g., Jiang et al. 2019; Thomas et al. 2019). While typical 2-dimensional CNNs have
been successfully used in the detection of echolocation clicks (e.g., Bermant et al. 2019), 1-dimensional CNNs (with waveforms as inputs) have been attempted as well (e.g., Luo et al. 2019). CNNs and LSTM networks have been compared in an application for classifying grouper species (Ibrahim et al. 2018) where the authors observed similar performances between the two models. Shiu et al. (2020) attempted combining a CNN with a GRU network for detecting North Atlantic right whale (Eubalaena glacialis) calls. Madhusudhana et al. (2021) incorporated long-term temporal context by combining independently trained CNNs and LSTM networks and achieved notable improvements in recognition performance. An attractive approach for developing recognition models is the use of transfer learning (Torrey and Shavlik 2010), where components of an already trained model are reused. Typically, weights of the early layers of a pre-trained network are frozen (no longer trainable) and the model is adapted to the target domain by training only the final layers with data from the target domain. Zhong et al. (2020) used transfer learning to produce a CNN model for classifying the calls of a few species of frogs and birds.
8.4.3.2 Random Forest Analysis
A random forest is a collection of many (hundreds or thousands) individual classification trees, which are grown without pruning. Each tree is different from every other tree in the forest because at each node, the variable to be used as a splitter is chosen from a random subset of the variables (Breiman 2001). Each tree in the forest predicts a category for the sound, and the sound is ultimately assigned to the category predicted by the majority of trees. Random forests are often more accurate than single classification trees because they are robust to over-fitting and stable to small perturbations in the data, correlations between predictor variables, and noisy predictor variables. Random forests perform well on polymorphic categories such as the variety of flight calls produced by many bird species (e.g., Liaw and Wiener 2002; Cutler et al. 2007; Armitage and Ober 2010; Ross and Allen 2014).
One of the advantages of a random forest analysis is that it provides information on the degree to which each one of the input variables contributes to the final species classification. This information is given by the Gini index and is known as the Gini variable importance. The Gini index is calculated based on the “purity” of each node in each of the classification trees, where purity is a measure of the number of whistles from different species in a given node (Breiman et al. 1984). Smaller Gini indices represent higher purity. When a random forest analysis is run, the algorithm assigns splitting variables so that the Gini index is minimized at each node (Oh et al. 2003). When a forest has been grown, the Gini importance value is calculated for each variable by summing the decreases in Gini index from one node to the next each time the variable is used. Variables are ranked according to their Gini importance values; those with the highest values contribute the most to the random forest model predictions. Random forests also produce a proximity measure, which is the fraction of trees in which particular observations end up in the same terminal nodes. This measure provides information about the similarity of individual observations because similar observations should end up in the same terminal nodes more often than dissimilar observations (Liaw and Wiener 2002).
Armitage and Ober (2010) compared the classification performance of random forests, support vector machines (SVMs), artificial neural networks, and DFA for bat echolocation signals and found that, with the exception of DFA, which had the lowest classification accuracy, all classifiers performed similarly. Keen et al. (2014) compared the performance of four classification algorithms using spectrographic measurements (spectrographic cross-correlation, dynamic time-warping, Euclidean distance, and random forest) for flight calls from four warbler species. In this study, random forests produced the most accurate results, correctly classifying 68% of calls.
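The three random forest outputs described in this section (the majority vote, the Gini importance, and the proximity measure) can be sketched in a few lines. The example does not grow a forest; it assumes the per-tree predictions, the terminal-node assignments, and the records of each split are already available, and all of those inputs are invented for illustration. The importance calculation is also simplified: each decrease in Gini index is weighted only by the share of the parent node sent to each child, which is an assumption and not necessarily how any particular software package computes it.

```python
from collections import Counter

def gini(labels):
    """Gini index of a node: smaller values mean higher purity."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def majority_vote(tree_predictions):
    """Category predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

def proximity(leaf_ids, i, j):
    """Fraction of trees in which observations i and j share a terminal node.

    leaf_ids[t][k] is the terminal-node index of observation k in tree t.
    """
    shared = sum(1 for tree in leaf_ids if tree[i] == tree[j])
    return shared / len(leaf_ids)

def gini_importance(splits):
    """Sum the decreases in Gini index for each splitting variable.

    Each split is (variable, parent_labels, left_labels, right_labels).
    """
    totals = {}
    for variable, parent, left, right in splits:
        n = len(parent)
        decrease = (gini(parent)
                    - (len(left) / n) * gini(left)
                    - (len(right) / n) * gini(right))
        totals[variable] = totals.get(variable, 0.0) + decrease
    return totals

# Hypothetical predictions of five trees for one whistle
vote = majority_vote(["A", "B", "A", "A", "C"])

# Hypothetical terminal nodes of three whistles in four trees
leaf_ids = [[0, 0, 1], [2, 2, 2], [1, 0, 0], [3, 3, 1]]
prox_01 = proximity(leaf_ids, 0, 1)
prox_02 = proximity(leaf_ids, 0, 2)

# Hypothetical splits recorded while growing the trees
splits = [
    ("peak_freq", ["A", "A", "B", "B"], ["A", "A"], ["B", "B"]),
    ("duration", ["A", "A", "A", "B"], ["A", "A"], ["A", "B"]),
]
importance = gini_importance(splits)
print(vote, prox_01, prox_02, importance)
```

In this toy example the hypothetical variable peak_freq receives the larger importance value because its split produces two perfectly pure nodes.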
Oswald et al. (2013) compared classifiers generated using DFA versus random forest classifiers for whistles produced by eight delphinid species recorded in the tropical Pacific Ocean and found that random forests resulted in the highest overall correct classification score. Rankin et al. (2016) trained a random forest classifier for five delphinid species in the California Current ecosystem. This classifier used information from whistles, clicks, and burst-pulse sounds and correctly classified 84% of acoustic encounters. Both Oswald et al. (2013) and Rankin et al. (2016) used spectrographic measurements as input variables for their classifiers.
8.4.3.3 Gaussian Mixture Models
Gaussian Mixture Models (GMMs) are used commonly to model arbitrary distributions as linear combinations of parametric variables. They are appropriate for species identification when there are no expectations, such as the sequence of sounds (Roch et al. 2007). To create a GMM, a set of n normal distributions with separate means and diagonal covariance matrices are scaled by weight-factors c_i (1 ≤ i ≤ n). The sum over all c_i must be 1 to ensure that the GMM represents a probability distribution (Huang et al. 2001; Roch et al. 2007, 2008). The number of mixtures in the GMM is chosen empirically and its parameters are estimated using an iterative algorithm, such as the Expectation Maximization algorithm (Moon 1996). Once a GMM has been trained, likelihood is computed for each sound type and a log-likelihood-ratio test is used to decide the species (Roch et al. 2008).
Gingras and Fitch (2013) used GMMs to classify male advertisement songs of four genera of anurans (Bufo, Hyla, Leptodactylus, Rana) based on spectral features and mel-frequency cepstral coefficients. The GMM based on spectral features resulted in 60% true positives and 13% false positives, and the GMM based on mel-frequency cepstral coefficients resulted in 41% true positives and 20% false positives. Somervuo et al. (2006) correctly classified 55–71% of song fragments from 14 different species of birds based on mel-frequency cepstral coefficients. The correct classification score depended on the number of cepstral coefficients and the number of Gaussian mixtures in the model. Lee et al. (2013) used GMMs to classify song segments of 28 species of birds based on image-shape features instead of traditional spectrographic features. This approach resulted in 86% or 95% classification accuracy for 3- or 5-s birdsong segments, respectively.
Roch et al. (2008) classified clicks produced by Blainville’s beaked whales, pilot whales, and Risso’s dolphins using a GMM. Correct classification scores for these three species were 96.7%, 83.2%, and 99.9%, respectively. Brown and Smaragdis (2008, 2009) used GMMs to classify sounds of killer whales, resulting in up to 92% agreement with 75 perceptually created categories of sound types, depending on the number of cepstral coefficients and Gaussians in the estimate of the probability density function. GMMs were used to classify the A and B type sounds produced by blue whales in the Northeast Pacific (McLaughlin et al. 2008), and six marine mammal species (Mouy et al. 2008) recorded in the Chukchi Sea: bowhead whales, humpback whales, gray whales, beluga whales, killer whales, and walruses. Both studies reported that their classifiers worked very well, but correct classification scores were not provided.
8.4.3.4 Support Vector Machines
Support vector machines (SVMs) are a rich family of learning algorithms based on Vapnik’s (1998) statistical learning theory. An SVM works by mapping features measured from sounds into a high-dimensional feature space. The SVM then finds the optimal hyperplane (function) that maximizes the separation among classes with the lowest number of parameters and the lowest risk of error. This approach attempts to meet the goal of minimizing both the training error and the complexity of the classifier (Mazhar et al. 2007). The best hyperplane is one that maximizes the distance between the hyperplane and the nearest data points belonging to different classes. The support vectors are the data points that determine the position of the hyperplane, and the distance between the hyperplane and the support vectors is called the margin (Fig. 8.21). The
Fig. 8.21 Examples of support vector machine hyperplanes. (a) The margin of the hyperplane is not optimal, (b) a
hyperplane with a maximized margin. The support vectors are circled
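The margin shown in Fig. 8.21 can be illustrated numerically. The following minimal Python sketch does not train an SVM; it only evaluates the margin of two hand-picked separating hyperplanes on a toy data set and lists the samples that would act as support vectors. The feature values, the hyperplane parameters w and b, and all function names are hypothetical and chosen for illustration only.

```python
import math

# Hypothetical 2-D feature vectors (e.g., two measurements per sound),
# labeled -1 or +1 for two sound classes. Values are illustrative only.
samples = [((1.0, 1.0), -1), ((0.0, 0.0), -1), ((2.0, 0.0), -1),
           ((3.0, 3.0), +1), ((4.0, 4.0), +1), ((2.0, 4.0), +1)]

def signed_distance(w, b, x, label):
    """Distance from x to the hyperplane w.x + b = 0.

    The value is positive when x lies on the side assigned to its class.
    """
    return label * (w[0] * x[0] + w[1] * x[1] + b) / math.hypot(w[0], w[1])

def margin_and_support_vectors(w, b, data, tol=1e-9):
    """Return the margin (smallest distance) and the samples that attain it."""
    distances = [signed_distance(w, b, x, label) for x, label in data]
    margin = min(distances)
    support = [x for (x, _), d in zip(data, distances) if d - margin < tol]
    return margin, support

# Two candidate separating hyperplanes, each given as (w, b).
margin_a, support_a = margin_and_support_vectors((1.0, 2.0), -6.5, samples)
margin_b, support_b = margin_and_support_vectors((1.0, 1.0), -4.0, samples)

print(f"hyperplane (a): margin = {margin_a:.3f}, support vectors = {support_a}")
print(f"hyperplane (b): margin = {margin_b:.3f}, support vectors = {support_b}")
```

Both hyperplanes separate the two classes, but the second leaves a wider margin and is supported by samples from both classes. An SVM training algorithm searches for such a hyperplane automatically instead of comparing hand-picked candidates.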
optimal classifier maximizes the margin on both sides of the hyperplane. Because the hyperplane can be defined by only a few of the training samples, SVMs tend to be generalized and robust (Cortes and Vapnik 1995; Duda et al. 2001). When classes cannot be separated linearly, SVMs can map features onto a higher dimensional space where the samples become linearly separable (see Fig. 8.26 in Zeppelzauer et al. 2015).
SVMs originally were designed for binary classification, but a number of methods have been developed for applying them to multi-class problems. The three most common methods are: (1) form k binary “one-against-the-rest” classifiers, where k is the number of classes and the class whose decision-function is maximized is chosen (Vapnik 1998), (2) form all k(k − 1)/2 pair-wise binary classifiers, and choose the class whose pair-wise decision-functions are maximized (Li et al. 2002), and (3) reformulate the objective function of SVM for the multi-class case so decision boundaries for all classes are optimized jointly (Guemeur et al. 2000).
Gingras and Fitch (2013) used four different algorithms (SVM, k-nearest neighbor, multivariate Gaussian distribution classifier, and GMM) to classify advertisement calls from four genera of anurans and obtained comparable accuracy levels from all of these algorithms. Fagerlund (2007) used SVMs to classify bird sounds produced by several species using decision trees with binary SVM classifiers at each node. The two datasets used by Fagerlund (2007) contained six and eight bird species and correct classification scores were 78–88% and 96–98% for the two datasets, respectively, depending on which variables were used in the classifiers.
Zeppelzauer et al. (2015) and Stoeger et al. (2012) both used SVM to identify African elephant rumbles. Zeppelzauer et al. (2015) used cepstral feature vectors and an SVM to distinguish African elephant rumbles from background noise. This SVM resulted in an 88% correct detection rate and a 14% false alarm rate. In addition to SVM, Stoeger et al. (2012) also used linear discriminant analysis (LDA) and nearest neighbor classification algorithms to categorize two types of rumbles produced by five captive African elephants based on spectral representations of the sounds. They obtained a classification accuracy of greater than 97% for all three classification methods.
Jarvis et al. (2006) developed a new type of multi-class SVM, called the class-specific SVM (CS-SVM). In this method, k binary SVMs are created, where each SVM discriminates between one of the k classes of interest and a common reference-class. The class whose decision-function is maximized with respect to the reference-class is selected. If all decision-functions are negative, the reference-class is selected. The advantage of this method is that noise in recordings is treated as the reference-
class. Jarvis et al. (2006) used their CS-SVM to discriminate clicks produced by Blainville’s beaked whales from ambient noise and obtained a correct classification score of 98.5%. They also created a multi-class CS-SVM that classified clicks produced by Blainville’s beaked whales, spotted dolphins (Stenella attenuata), and human-made sonar pings. This CS-SVM resulted in 98% correct classification for Blainville’s beaked whale clicks, 88% correct classification for spotted dolphin clicks, and 95% correct classification for sonar pings. It is important to note that the training data were included in their test data, which likely resulted in inflated correct classification scores.

8.4.3.5 Dynamic Time-Warping
Dynamic time-warping (DTW) is a class of algorithms originally developed for automated human speech recognition (Myers et al. 1980). DTW is used to quantitatively compare time-frequency contours of different durations using variable extension and compression of the time axis (Deecke and Janik 2006; Roch et al. 2007). There are different DTW techniques (e.g., Itakura 1975; Sakoe and Chiba 1978; Kruskal and Sankoff 1983), but all are based on comparing a reference sound to a test sound. The test sound is stretched and compressed along its contour to minimize the difference between the shapes of the two contours. Restrictions can be placed on the amount of time-warping that takes place. For example, Buck and Tyack (1993) did not time-warp contours that differed by a factor of more than 2 in duration and assigned those contours a similarity score of zero. Deecke and Janik (2006) stated that contours could only be stretched or compressed up to a factor of 3 to fit the reference contour. In a DTW analysis, all individual contours are compared to all other contours and a similarity matrix is constructed. Sounds are clustered into categories based on the similarity matrix using methods such as k-nearest neighbor cluster analysis or ANNs (Deecke and Janik 2006; Brown and Miller 2007).
DTW has been used to classify bird sounds. Anderson et al. (1996) applied DTW to recognize individual song syllables for two species of songbirds: indigo buntings (Passerina cyanea) and zebra finches (Taeniopygia guttata). Their analysis resulted in 97% correct classification of stereotyped syllables and 84% correct classification of syllables in plastic song. It is important to note, however, that these results were obtained for song recorded from a single individual of each species in a controlled setting. Somervuo et al. (2006) performed DTW to classify bird song syllables produced by 14 different species. They compared two different methods for computing distance between syllables: (1) simple Euclidean distances between frequency-amplitude vectors, and (2) absolute distance between frequencies weighted by the sum of their amplitudes. Classification accuracy was low, at about 40–50%, depending on the species and the distance method used. They obtained higher classification success using classification methods such as hidden Markov models (HMM) and GMM based on song fragments, rather than on single syllables.
Buck and Tyack (1993) performed DTW to classify three signature whistles from each of five wild bottlenose dolphins recorded in Sarasota, Florida, USA, with 100% accuracy. Deecke and Janik (2006) used DTW to classify signature whistles produced by captive bottlenose dolphins. The DTW algorithm outperformed human analysts and other statistical methods tested by Janik (1999). DTW also was applied to classify stereotypical pulsed sounds produced by killer whales, both in captivity (Brown et al. 2006) and at sea (Deecke and Janik 2006; Brown and Miller 2007). In all of these studies, sounds were classified into categories that were identified perceptually by humans with very high correct classification scores.
Oswald et al. (2021) used dynamic time-warping and neural network analysis to group whistle contours produced by short- and long-beaked common dolphins (Delphinus delphis and D. bairdii) into categories. Many of the resulting categories were shared between the two species, but each species also produced a number of species-specific categories. Random forest analysis showed that whistles in species-specific categories could be classified to species with significantly higher accuracy than whistles
in shared categories. This suggests that not every whistle carries species information, and that specific whistle types play an important role in dolphin species identification.

8.4.3.6 Hidden Markov Models
Hidden Markov model (HMM) theory was developed in the late 1960s by Baum and Eagon (1967) and now is used commonly for human speech recognition (Rabiner et al. 1983, 1996; Levinson 1985; Rabiner 1989). To create an HMM, a vector of features is extracted from a signal at discrete time steps. The temporal evolution of these features from one state to the next is modeled by creating a transition matrix M, where M_ij is the probability of transition from state i to state j, and an emission matrix E, where E_is is the probability of observing signal s in state i (Rickwood and Taylor 2008). A different HMM is created for each species in the dataset and a sound is classified by determining which of the HMMs has the highest likelihood of producing that particular set of signal states. Training HMMs requires significant amounts of computing, and proper estimation of the transition and output probabilities is of crucial importance (Makhoul and Schwarz 1995). Excellent tutorials on HMMs can be found in Rabiner and Juang (1986) and Rabiner (1989).
A significant advantage inherent to HMMs is their ability to model time and spectral variability simultaneously (Makhoul and Schwarz 1995). They are able to model time series that have subtle temporal structure and are efficient for modeling signals with varying durations by performing nonlinear, temporal alignment during both the training and classification processes (Clemins et al. 2005; Roch et al. 2007; Trifa et al. 2008). Using HMMs, complex models can be built to deal with complicated biological signals (Rickwood and Taylor 2008), but care must be taken when choosing training samples to obtain a high generalization ability. The performance of an HMM is influenced by the size of the training set, the feature extraction method, and the number of states in the model (Trifa et al. 2008). Recognition performance is also affected by noise (Trifa et al. 2008).
In addition to being successfully implemented in human speech recognition, HMMs have been used to classify the sounds produced by birds (Kogan and Margoliash 1998; Trawicki et al. 2005; Trifa et al. 2008; Adi et al. 2010), red deer (Cervus elaphus; Reby et al. 2006), African elephants (Clemins et al. 2005), common dolphins (Sturtivant and Datta 1997; Datta and Sturtivant 2002), killer whales (Brown and Smaragdis 2008, 2009), beluga whales (Clemins and Johnson 2005; Leblanc et al. 2008), bowhead whales (Mellinger and Clark 2000), and humpback whales (Suzuki et al. 2006). HMMs perform as well as, or better than, both GMMs and DTW (Weisburn et al. 1993; Kogan and Margoliash 1998) and are becoming more common in animal classification studies.
Adi et al. (2010) also used HMMs to examine individually distinct acoustic features in songs produced by ortolan buntings (Emberiza hortulana). They represented each song syllable using a 15-state HMM (Fig. 8.22). These HMMs then were connected to represent song types. The 14 most common song types were included in the analysis and correct classification ranged from 50% to 99%, depending on the song type. Overall, 90% of songs were correctly classified. Adi et al. (2010) used these results to illustrate the feasibility of using acoustic data to assess population sizes for these birds.
Reby et al. (2006) used HMMs to examine whether common roars uttered by red deer during the rutting season can be used for individual recognition. They recorded roar bouts from seven captive red deer and used HMMs to model roar bouts as successions of silences and roars. Each roar in the analysis was modeled as a succession of states of frequency components measured from the roars. Overall, the HMM correctly identified 85% of roar bouts to the individual deer, showing that roars were individually specific. Reby et al. (2006) also used HMMs to examine stability in this individuality over the rutting season. They did this by training an HMM using roar bouts recorded at the beginning of the rutting season and testing the model using roar bouts recorded later in the rutting season. Overall, 58% of roar bouts were classified correctly, suggesting that individual identification cues in roar bouts varied over time.
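The likelihood-based classification described in this section can be sketched with the forward algorithm. The Python example below is a minimal illustration, not the implementation used in any of the cited studies. The two "species" models, their transition matrices M and emission matrix E, the uniform start probabilities (which the description above does not specify), and the coding of observations as symbols 0 and 1 are all hypothetical.

```python
import math

def log_likelihood(obs, start, M, E):
    """Scaled forward algorithm: log-probability that the HMM produced `obs`.

    start[i] : probability of starting in state i (an added assumption)
    M[i][j]  : probability of a transition from state i to state j
    E[i][s]  : probability of observing symbol s in state i
    """
    states = range(len(start))
    alpha = [start[i] * E[i][obs[0]] for i in states]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate state probabilities one time step, then weight by emission.
            alpha = [sum(alpha[i] * M[i][j] for i in states) * E[j][obs[t]]
                     for j in states]
        scale = sum(alpha)
        if scale == 0.0:
            return -math.inf          # sequence impossible under this model
        alpha = [a / scale for a in alpha]
        log_p += math.log(scale)      # accumulate in log space to avoid underflow
    return log_p

# Hypothetical two-state models: species A tends to remain in a state
# (long runs of one symbol), species B tends to alternate between states.
E_shared = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "species_A": {"start": [0.5, 0.5], "M": [[0.9, 0.1], [0.1, 0.9]], "E": E_shared},
    "species_B": {"start": [0.5, 0.5], "M": [[0.1, 0.9], [0.9, 0.1]], "E": E_shared},
}

def classify(obs):
    """Return the name of the model with the highest likelihood for `obs`."""
    return max(models, key=lambda name: log_likelihood(obs, **models[name]))

print(classify([0, 0, 0, 1, 1, 1]))   # long runs of one symbol
print(classify([0, 1, 0, 1, 0, 1]))   # alternating symbols
```

In practice, the matrices would be estimated from training recordings rather than set by hand, and the observations would be feature vectors rather than two discrete symbols.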
Fig. 8.22 Example of a 15-state hidden Markov model representation of the waveform of a song syllable produced by an ortolan bunting to capture the temporal pattern of the syllable (Adi et al. 2010). © Acoustical Society of America, 2010. All rights reserved
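Returning to the contour comparison of Sect. 8.4.3.5, the following Python sketch shows one simple form of dynamic time-warping and the construction of a pairwise distance matrix. It is an illustration under stated assumptions, not the algorithm of any cited study: contours are hypothetical lists of frequencies in kHz, the local cost is the absolute frequency difference, and contours whose durations differ by more than a factor of 2 are given an infinite distance (in the spirit of the restriction used by Buck and Tyack 1993, who assigned such pairs a similarity of zero). Limits on local stretching, such as the factor of 3 reported by Deecke and Janik (2006), are not implemented.

```python
import math

def dtw_distance(ref, test):
    """Classic dynamic time-warping distance between two frequency contours."""
    n, m = len(ref), len(test)
    # cost[i][j] = smallest accumulated difference aligning ref[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a test point (stretch)
                                 cost[i][j - 1],      # repeat a reference point (compress)
                                 cost[i - 1][j - 1])  # advance along both contours
    return cost[n][m]

def contour_distance(a, b, max_ratio=2.0):
    """DTW distance, or infinity when durations differ by more than max_ratio."""
    if max(len(a), len(b)) / min(len(a), len(b)) > max_ratio:
        return math.inf
    return dtw_distance(a, b)

# Hypothetical whistle contours (frequency in kHz at equal time steps).
contours = {
    "upsweep": [5, 6, 7, 8, 9, 10],
    "slow_upsweep": [5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10],
    "downsweep": [10, 9, 8, 7, 6, 5],
}
names = list(contours)
distance_matrix = [[contour_distance(contours[p], contours[q]) for q in names]
                   for p in names]

for name, row in zip(names, distance_matrix):
    print(name, row)
```

The slow upsweep has the same shape as the upsweep and therefore a distance of zero to it after warping, whereas the downsweep remains distant. A matrix of this kind is the input to the clustering step described in Sect. 8.4.3.5.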
instruments (e.g., wipers on a turbidity sensor) may be recorded. Recorders resting on soft seafloor in coastal water may record the sound of sand swishing over the mooring. In addition, hydrostatic pressure fluctuations from the recorder bouncing in the water column or vortices at the hydrophone if deployed in strong currents will cause flow noise. All of these artifacts can last from seconds to minutes and appear in spectrograms as power from a few hertz to high kilohertz. Minimization of mooring noise and identification of recording artifacts is an art (also see Chaps. 2 and 3).
Similarly, artifacts can be recorded during airborne recordings. Wind is a primary artifact; however, moving vegetation and precipitation can also add noise to a recording. Any disturbance to the microphone can generate unwanted tapping or static on a recording. Recording systems in terrestrial environments need to be secured to minimize such noises.

8.5.2 Sound Propagation Effects

Environmental features of air or water can change the way sound propagates and thus the acoustic characteristics of a recorded sound. Bioacousticians need to understand environmental effects on the features of received sound to avoid classification of a signal variant as a new type, rather than as a particular sound type affected by propagation conditions. The sound propagation environment can affect both the spectral and temporal features of sound as it propagates from the animal to the recorder (see Chaps. 5 and 6). For example, energy at high frequencies is lost (attenuates) very quickly due to scattering and absorption, and therefore high-frequency harmonics do not propagate over long ranges. Acoustic energy at low frequencies (i.e., long wavelengths) does not travel well in narrow waveguides (e.g., shallow water). Because different frequencies within a sound can attenuate at different rates, the same sound can appear differently on a spectrogram, depending on the distance at which it was recorded.
Differential attenuation of frequencies in air is shown in Fig. 8.23. Signals produced by a big brown bat (Eptesicus fuscus) flying toward a microphone contain more ultrasonic components than signals recorded from a bat flying away from the microphone. The signal with the longest frequency modulation (from 100 to 50 kHz) is received when the bat is closest to the microphone. Variations in this spectrogram show how one sound type could be categorized differently simply because of distance between the animal and recorder, orientation to the microphone, and the gain setting.
Other sound propagation effects include reverberation (which leads to the temporal spreading of brief, pulsed sounds) and frequency dispersion. Frequency dispersion is a result of energy at different frequencies traveling at different speeds. This leads to sounds being spread out in time and, specifically in some underwater environments, can cause pulsed sounds to become frequency-modulated sounds (either up- or downsweeps; Fig. 8.24).
Finally, ambient noise (i.e., geophysical noise, anthropogenic noise, and non-target biological noise) superimposes with animal sounds, and at some distances and frequencies, parts of the animal sound spectrum will begin to drop below the levels of ambient noise. As a result, the same animal sound in a different environment and at a different distance from the animal can look quite different on a spectrogram and cause it to be misclassified as two different sound types.

8.5.3 Angular Aspects of Sound Emission

The orientation of an animal relative to the receiver (microphone or hydrophone) can change the acoustic features of the recorded sound. This complicates classification, and off-axis variations of a sound need to be known so they can be categorized as just a variant of a particular sound type, rather than as a new sound type.
Not all sounds emitted by animals are omnidirectional (i.e., propagate equally in all angles relative to the animal). Au et al. (2012) studied the directionality of bottlenose dolphin echolocation clicks by measuring the horizontal and vertical emission beam patterns of these sounds. The angle at which an echolocation click was
[Spectrogram titled "Sequence of Big Brown Bat": x-axis Time [s], 0.00 to 0.45; y-axis Frequency [Hz], 7k to 160k; calls labeled Search phase, Approach phase, and Terminal phase]
Fig. 8.23 Spectrogram of big brown bat (Eptesicus fuscus) circling a recording device while searching and pursuing aerial prey. As the bat approaches the microphone, more of the ultrasonic signal is received (calls reach up to 70 kHz). As the bat moves away, the signal is attenuated. Time between calls shortens notably as the bat pursues an insect prey for capture. Notice that the bat emits “search” calls at 25–40 kHz, approach calls at 30–70 kHz when it is in pursuit or trying to navigate flight through complex space, and finally terminal calls at 30–55 kHz
recorded relative to the transducer (or echolocating animal) not only affected its received level, but also the waveform and frequency spectrum (Fig. 8.25). Sperm whale (Physeter macrocephalus) echolocation clicks, when recorded off-axis (i.e., away from the center of its emission beam), consisted of multiple complex pulses that were likely due to internal reflections within the sperm whale’s head (Møhl et al. 2003; also see Chap. 12).

8.5.4 Geographic Variation

Geographic variation, or differences in the sounds produced by populations of the same species living in different regions, has been documented for many terrestrial and aquatic animals, including Hawaiian crickets (Mendelson and Shaw 2003), Túngara frogs (Engystomops pustulosus, Pröhl et al. 2006), bats (Law et al. 2002; Aspetsberger et al. 2003; Russo et al. 2007; Yoshino et al. 2008), pikas (Borisova et al. 2008), sciurid rodents (Gannon and Lawlor 1989; Slobodchikoff et al. 1998; Yamamoto et al. 2001; Eiler and Banack 2004), singing mice (Scotinomys spp., Campbell et al. 2010), primates (Mitani et al. 1992; Delgado 2007; Wich et al. 2008), cetaceans (Helweg et al. 1998; McDonald et al. 2006; Delarue et al. 2009; Papale et al. 2013, 2014), and elephant seals (Mirounga spp., Le Boeuf and Peterson
Fig. 8.25 Waveforms and spectra of a bottlenose dolphin echolocation click in the horizontal (a) and vertical (b) planes (Au et al. 2012). © Acoustical Society of America, 2012. All rights reserved
[Plot: x-axis Year, 2002 to 2017; y-axis Frequency (Hz), 22 to 29; legend: ABW, Cape Leeuwin & others; Spot calls at Cape Leeuwin, Perth Canyon, Portland, GAB, Bremer Bay, and Kangaroo Island]
Fig. 8.27 Weekly means of the upper part of the Antarctic blue whale Z-call over several years, as well as of the spot call, which remains to be identified to species. All locations are off Australia (GAB: Great Australian Bight). Data updated from Gavrilov et al. (2012) and Ward et al. (2017). Courtesy of Sasha Gavrilov
sounds in the field, often while visually observing animals. Scientists recorded sounds in the field and analyzed the recordings in the laboratory by listening, looking at oscillograms or spectrograms, and manually sorting sounds into types. Nowadays, with the affordability of autonomous recording equipment, bioacousticians collect vast amounts of data, which can no longer be analyzed without the aid of automated data processing, data reduction, and data analysis tools. Given simultaneous advances in computer hard- and software, datasets may be analyzed more efficiently, and with the added advantage of reducing opportunities for human subjective biases.
In this chapter, we presented software tools for automatically detecting animal sounds in acoustic recordings, and for classifying those sounds. The detectors we discussed compute a specific quantity of the sound (such as its instantaneous energy or entropy) and then apply a threshold above which the sound is deemed detected. The specific detectors were based on acoustic energy, Teager–Kaiser energy, entropy, matched filtering, and spectrogram cross-correlation. Setting the detection threshold critically affects how many signals are detected and how many are missed. We presented two ways of finding the best threshold and assessing detector performance: receiver operating characteristics and precision-recall curves.
Once signals have been detected, they can be classified. A common pre-processing step immediately prior to classification includes the measurement of sound features such as minimum and maximum frequency, duration, or cepstral features. The software tools we presented for classification included parametric clustering, principal component analysis, discriminant function analysis, classification trees, and machine learning algorithms. No single tool outperforms all others; rather, the tool best suited to the specific task needs to be employed. We discussed advantages and limitations of the various tools and provided numerous examples from the literature. Finally, challenges resulting from recording artifacts, the environment affecting sound features, and changes in sound features over time and space were explored.
It is important to remember that human perception of a sound likely is not the same as an animal’s perception of the sound and yet
bioacousticians commonly describe or classify animal sounds in human terms. Classification of the acoustic repertoire of an animal into sound types provides a convenient framework for comparing and contrasting sounds, taking systematic measurements from portions of the repertoire, and performing statistical analyses. However, categories determined based on human perception may have little or no relevance to the animals and so human categorizations can be biologically meaningless. For example, humans have limited low-frequency and high-frequency hearing abilities compared to many other species, and so aural classification of sound types is sometimes based on only a portion of a sound audible to the human listener. Whether sound types determined by humans are meaningful classes to the animals is mostly unknown. While categorizing sounds based on function is an attractive approach for the behavioral zoologist, establishing the functions of these sounds is often challenging. In our review of classification methods, it was clear that methods developed for human speech could be applied to animal sounds. Some fascinating questions lie ahead for bioacousticians as they attempt to extend understanding of the perception experienced by other animals.
Even with the above caveats, detection and classification of animal sounds is useful for research and conservation. It allows populations to be monitored, their distribution and abundance to be determined, and impacts (e.g., from human presence or climate change) to be assessed. It can also be useful for conservation of a species (i.e., to create taxonomy, identify geographic variation in populations, examine ecological connectivity among populations, and detect changes in the biological use of sounds due to the advent and growth of anthropogenic noise). Classification of animal sounds is important for understanding behavioral ecology and social systems of animals and can be used to identify individuals, social groups, and populations. The ability to study these types of topics will ultimately lead to a deeper understanding of the evolutionary forces that shape animal bioacoustics.
With a goal to foster wider participation in research on bioacoustic pattern recognition, a number of global competitions are held regularly. The annual Detection and Classification of Acoustic Scenes and Events (DCASE) workshops and BirdCLEF challenges (part of Cross Language Evaluation Forum) attract hundreds of data scientists for developing machine learning solutions for recognizing bird sounds in soundscape recordings. The marine mammal community organizes the biennial Detection, Classification, Localization, and Density Estimation (DCLDE) workshops. These challenges put out large training datasets for researchers to develop detection and classification systems, assess the performance of submitted solutions with “held out” datasets, and reward the top-ranked submissions. The datasets from these challenges are often made available for use by the research community after the competitions, while some workshops make available the submitted solutions as well.

8.7 Additional Resources

• PAMGuard is an open-source software package for acoustic detection, classification, and localization of cetacean sounds: https://www.pamguard.org/
• Ishmael is a free software package for acoustic detection, classification, and localization of cetacean sounds: http://www.bioacoustics.us/ishmael.html
• Koe is a free, web-based software for annotation, measurement, and classification of bioacoustic signals: https://koe.io.ac.nz/# (Fukuzawa et al. 2020)
• Praat is free software originally designed for human speech analysis, but used by many bioacousticians: https://www.fon.hum.uva.nl/praat/
• Characterization Of Recorded Underwater Sound (CHORUS) is a MATLAB graphical user interface developed by Curtin University, Perth, WA, Australia, with built-in automatic detectors for pygmy blue and fin whales
(Gavrilov and Parsons 2014): https://cmst.curtin.edu.au/products/chorus-software/
• Detection, Classification, Localization, and Density Estimation of Marine Mammals using Passive Acoustics meeting websites:
– Mount Hood, Oregon, USA, 2011: http://www.bioacoustics.us/dcl.html
– St Andrews, Scotland, UK, 2013: https://soi.st-andrews.ac.uk/dclde2013/
– San Diego, California, USA, 2015: http://www.cetus.ucsd.edu/dclde/index.html
– Paris, France, 2018: http://sabiod.univ-tln.fr/DCLDE/
– Hawaii, USA, 2022: http://www.soest.hawaii.edu/ore/dclde/
• Bird sound recognition challenges: http://dcase.community/ (DCASE), https://www.imageclef.org/BirdCLEF2020 (BirdCLEF)
• BirdNET is an Android app for birdsong recognition: https://birdnet.cornell.edu/
• SongSleuth is an Apple or Android app for birdsong recognition: https://www.songsleuth.com/#/
• All accessed 5 Aug 2022.

References

Adi K, Johnson MT, Osiejuk TS (2010) Acoustic censusing using automatic vocalization classification and identity recognition. J Acoust Soc Am 127:874–883. https://doi.org/10.1121/1.3273887
Afifi AA, Clark V (1996) Computer-aided multivariate analysis, 3rd edn. Chapman and Hall/CRC, New York
Amorim MC, Vasconcelos RO, Fonseca PJ (2015) Fish sounds and mate choice. In: Ladich F (ed) Sound communication in fishes. Springer, Vienna, pp 1–33
Anderson SE, Dave AS, Margoliash D (1996) Template-based automatic recognition of birdsong syllables from continuous recordings. J Acoust Soc Am 100:1209–1219. https://doi.org/10.1121/1.415968
Armitage DW, Ober HK (2010) A comparison of supervised learning techniques in the classification of bat echolocation calls. Ecol Inform 5:465–473. https://doi.org/10.1016/j.ecoinf.2010.08.001
Aspetsberger F, Brandsen D, Jacobs DS (2003) Geographic variation in the morphology, echolocation and diet of the little free-tailed bat, Chaerephon pumilus (Molossidae). Afr Zool 38:245–254. https://doi.org/10.1080/15627020.2003.11407278
Au WWL, Nachtigall PE (1995) Artificial neural network modeling of dolphin echolocation. In: Kastelein RA, Thomas JA, Nachtigall PE (eds) Sensory systems of aquatic mammals. De Spil Publishers, Woerden, The Netherlands, pp 183–199
Au WWL, Branstetter B, Moore P, Finneran J (2012) The biosonar field around an Atlantic bottlenose dolphin (Tursiops truncatus). J Acoust Soc Am 131(1):569–576. https://doi.org/10.1121/1.3662077
Baptista LF, Gaunt SSL (1997) Social interaction and vocal development in birds. In: Snowden CT, Hausberger M (eds) Social influences on vocal development. Cambridge Univ Press, Cambridge, pp 23–40
Baum LE, Eagon JA (1967) An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology. Bull Am Math Soc 73:360–363
Baumgartner MF, Fratantoni DM (2008) Diel periodicity in both Sei whale vocalization rates and the vertical migration of their copepod prey observed from ocean gliders. Limnol Oceanogr 53:2197–2209. https://doi.org/10.4319/lo.2008.53.5_part_2.2197
Beeman K (1998) Digital signal analysis, editing and synthesis. In: Hopp SL, Owren MJ, Evans CS (eds) Animal acoustic communication: sound analysis and research methods. Springer, Berlin, pp 59–103
Belliustin NS, Kuznetsov SO, Nuidel IV, Yakhno VG (1991) Neural networks with close nonlocal coupling for analyzing composite image. Neurocomputing 3:231–246. https://doi.org/10.1016/0925-2312(91)90005-V
Bergler C, Schröter H, Cheng RX, Barth V, Weber M, Nöth E, Hofer H, Maier A (2019) ORCA-SPOT: an automatic killer whale sound detection toolkit using deep learning. Sci Rep 9(1):1–7. https://doi.org/10.1038/s41598-019-47335-w
Bermant PC, Bronstein MM, Wood RJ, Gero S, Gruber DF (2019) Deep machine learning techniques for the detection and classification of sperm whale bioacoustics. Sci Rep 9(1):1–10. https://doi.org/10.1038/s41598-019-48909-4
Borisova NG, Rudneva LV, Starkov AI (2008) Interpopulation variability of vocalizations in the Daurian pika (Ochotona daurica). Zool Zh 87:850–861
Bouffaut L, Dréo R, Labat V, Boudraa AO, Barruol G (2018) Passive stochastic matched filter for Antarctic blue whale call detection. J Acoust Soc Am 144(2):955–965. https://doi.org/10.1121/1.5050520
Bradbury JW, Vehrencamp SL (2011) Principles of animal communication, 2nd edn. Sinauer Associates, New York
Brandes TS (2008) Feature-vector selection and use with Hidden Markov Models to identify frequency-modulated bioacoustic signals amidst noise. IEEE Trans Speech Lang Process 16:1173–1180. https://doi.org/10.1109/TASL.2008.925872
Breiman L (2001) Random forests. Mach Learn 45:5–32
Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth, Pacific Grove, CA
Briefer EF, Maigrot A-L, Roi T, Mandel R, Briefer Freymond S, Bachmann I, Hillmann E (2015) Segregation of information about emotional arousal and
valence in horse whinnies. Sci Rep 5(1):1–11. https://doi.org/10.1038/srep09989
Briskie JV, Martin PR, Martin TE (1999) Nest predation and the evolution of nestling begging calls. Proc R Soc Lond B 266:2153–2159. https://doi.org/10.1098/rspb.1999.0902
Brown JC, Miller PJO (2007) Automatic classification of killer whale vocalizations using dynamic time warping. J Acoust Soc Am 122:1201–1207. https://doi.org/10.1121/1.2747198
Brown JC, Smaragdis P (2008) Automatic classification of vocalizations with Gaussian mixture models and Hidden Markov Models. J Acoust Soc Am 123:3345. https://doi.org/10.1121/1.2933896
Brown JC, Smaragdis P (2009) Hidden Markov and Gaussian mixture models for automatic sound classification. J Acoust Soc Am 125:EL221–EL224. https://doi.org/10.1121/1.3124659
Brown JC, Hodgins-Davis A, Miller PJO (2006) Classification of vocalizations of killer whales using dynamic time warping. J Acoust Soc Am 119:EL34–EL40. https://doi.org/10.1121/1.2166949
Buck JR, Tyack PL (1993) A quantitative measure of similarity for Tursiops truncatus signature whistles. J Acoust Soc Am 94:2497–2506. https://doi.org/10.1121/1.407385
Camacho-Alpízar A, Fuchs EJ, Barrantes G (2018) Effect of barriers and distance on song, genetic, and morphological divergence in the highland endemic Timberline Wren (Thryorchilus browni, Troglodytidae). PLoS One 13(12):e0209508. https://doi.org/10.1371/journal.pone.0209508
Campbell P, Pasch B, Pino JL, Crino OL, Phillips M, Phelps SM (2010) Geographic variation in the songs of neotropical singing mice: testing the relative importance of drift and local adaptation. Evolution 64(7):1955–1972. https://doi.org/10.1111/j.1558-5646.2010.00962.x
Catchpole CK, Slater PJB (2008) Bird song: biological themes and variations, 2nd edn. Cambridge University Press, Cambridge
Cerchio S, Jacobsen JK, Norris TF (2001) Temporal and geographical variation in songs of humpback whales, Megaptera novaeangliae: synchronous change in Hawaiian and Mexican breeding assemblages. Anim Behav 62(2):313–329. https://doi.org/10.1006/anbe.2001.1747
Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv:1409.1259
Clark CW (1980) A real-time direction-finding device for determining the bearing to the underwater sounds of southern right whales, Eubalaena australis. J Acoust Soc Am 68:508–511. https://doi.org/10.1121/1.384762
Clark CW (1982) The acoustic repertoire of the southern right whale, a quantitative analysis. Anim Behav 30(4):1060–1071. https://doi.org/10.1016/S0003-3472(82)80196-6
Clark LA, Pregibon D (1992) Statistical models. In: Chambers JM, Hastie TJ (eds) Statistical models in S. Wadsworth and Brooks/Cole, Pacific Grove, CA
Clarke E, Reichard UH, Zuberbühler K (2006) The syntax and meaning of wild gibbon songs. PLoS One 1(1):e73. https://doi.org/10.1371/journal.pone.0000073
Clemins PJ, Johnson MT (2005) Unsupervised classification of beluga whale vocalizations. J Acoust Soc Am 117:2470. https://doi.org/10.1121/1.4809461
Clemins PJ, Johnson MT, Leong KM, Savage A (2005) Automatic classification and speaker identification of African elephant (Loxodonta africana) vocalizations. J Acoust Soc Am 117:956–963. https://doi.org/10.1121/1.1847850
Coifman RR, Lafon S, Lee AB, Maggioni M, Nadler B, Warner F, Zucker SW (2005) Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc Natl Acad Sci 102(21):7426–7431. https://doi.org/10.1073/pnas.0500334102
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
Courts R, Erbe C, Wellard R, Boisseau O, Jenner KC, Jenner M-N (2020) Australian long-finned pilot whales (Globicephala melas) emit stereotypical, variable, biphonic, multi-component, and sequenced vocalisations, similar to those recorded in the northern hemisphere. Sci Rep 10(1):20609. https://doi.org/10.1038/s41598-020-74111-y
Crance JL, Berchok CL, Wright DL, Brewer AM, Woodrich DF (2019) Song production by the North Pacific right whale, Eubalaena japonica. J Acoust Soc Am 145(6):3467–3479. https://doi.org/10.1121/1.5111338
Cutler DR, Edwards TC Jr, Beard KH, Cutler A, Hess KT, Gibson J, Lawler JJ (2007) Random forests for classification in ecology. Ecology 88:2783–2792. https://doi.org/10.1890/07-0539.1
Dang T, Bulusu N, Hu W (2008) Lightweight acoustic classification for cane toad monitoring. In: 42nd Asilomar Conference on Signals, Systems and Computers. IEEE, New York, pp 1601–1605
Datta S, Sturtivant C (2002) Dolphin whistle classification for determining group identities. Sig Process 82(2):251–258. https://doi.org/10.1016/S0165-1684(01)00184-0
Davis J, Goadrich M (2006) The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA
Davis SB, Mermelstein P (1980) Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust Speech Sig Process 28:357–366. https://doi.org/10.1109/TASSP.1980.1163420
Dawson MRW, Charrier I, Sturdy CB (2006) Using an Artificial Neural Network to classify black-capped chickadee (Poecile atricapillus) sound note types. J Acoust Soc Am 119(5):3161–3172. https://doi.org/10.1121/1.2189028
Deecke VB, Janik VM (2006) Automated categorization of bioacoustic signals: avoiding perceptual pitfalls. J Acoust Soc Am 119:645–653. https://doi.org/10.1121/1.2139067
Deecke VB, Ford JKB, Spong P (1999) Quantifying complex patterns of bioacoustic variation: use of a neural network to compare killer whale (Orcinus orca) dialects. J Acoust Soc Am 105:2499–2507. https://doi.org/10.1121/1.426853
Delarue J, Todd SK, Van Parijs SM, Di Iorio L (2009) Geographic variation in Northwest Atlantic fin whale (Balaenoptera physalus) song: implications for stock structure assessment. J Acoust Soc Am 125:1774–1782. https://doi.org/10.1121/1.3068454
Delgado RA (2007) Geographic variation in the long sounds of male orangutans (Pongo spp.). Ethology 113:487–498. https://doi.org/10.1111/j.1439-0310.2007.01345.x
Deregnaucourt S, Guyomarch JC, Richard V (2001) Classification of hybrid crows in quail using artificial neural networks. Behav Process 56:103–112. https://doi.org/10.1016/S0376-6357(01)00188-7
Duda R, Hart P, Stork D (2001) Pattern classification, 2nd edn. Wiley, Hoboken, NJ
Dunlop RA, Noad MJ, Cato DH, Stokes D (2007) The social vocalization repertoire of east Australian migrating humpback whales (Megaptera novaeangliae). J Acoust Soc Am 122(5):2893–2905. https://doi.org/10.1121/1.2783115
Dunlop RA, Cato DH, Noad MJ, Stokes DM (2013) Source levels of social sounds in migrating humpback whales (Megaptera novaeangliae). J Acoust Soc Am 134(1):706–714. https://doi.org/10.1121/1.4807828
Egan JP (1975) Signal detection theory and ROC analysis. Academic Press, New York
Eiler KC, Banack SA (2004) Variability in the alarm call of golden-mantled ground squirrels (Spermophilus lateralis and S. saturatus). J Mammal 85:43–50. https://doi.org/10.1644/1545-1542(2004)085<0043:VITACO>2.0.CO;2
Erbe C, King AR (2008) Automatic detection of marine mammals using information entropy. J Acoust Soc Am 124(5):2833–2840. https://doi.org/10.1121/1.2982368
Erbe C, Verma A, McCauley R, Gavrilov A, Parnum I (2015) The marine soundscape of the Perth Canyon. Prog Oceanogr 137:38–51. https://doi.org/10.1016/j.pocean.2015.05.015
Erbe C, Reichmuth C, Cunningham K, Lucke K, Dooling R (2016) Communication masking in marine mammals: a review and research strategy. Mar Pollut Bull 103:15–38. https://doi.org/10.1016/j.marpolbul.2015.12.007
Erbe C, Dunlop R, Jenner KCS, Jenner M-NM, McCauley RD, Parnum I, Parsons M, Rogers T, Salgado-Kent C (2017) Review of underwater and in-air sounds emitted by Australian and Antarctic marine mammals. Acoust Aust 45:179–241. https://doi.org/10.1007/s40857-017-0101-z
Esfahanian M, Erdol N, Gerstein E, Zhuang H (2017) Two-stage detection of north Atlantic right whale upcalls using local binary patterns and machine learning algorithms. Appl Acoust 120:158–166. https://doi.org/10.1016/j.apacoust.2017.01.025
Fagerlund S (2007) Bird species recognition using support vector machines. EURASIP J Appl Sig Proc 2007(1):1–8. https://doi.org/10.1155/2007/38637
Fenton MB, Jacobson SL (1973) An automatic ultrasonic sensing system for monitoring the activity of some bats. Can J Zool 51:291–299. https://doi.org/10.1139/z73-041
Fitch WT (2003) Mammalian vocal production: themes and variation. In: Proceedings of the 1st International Conference on Acoustic Communication by Animals, 27–30 July, pp 81–82
Forti LR, Costa WP, Martins LB, Nunes-de-Almeida CH, Toledo LF (2016) Advertisement call and genetic structure conservatism: good news for an endangered Neotropical frog. PeerJ 4:e2014. https://doi.org/10.7717/peerj.2014
Freitag LE, Tyack PL (1993) Passive acoustic localization of the Atlantic bottlenose dolphin using whistles and echolocation clicks. J Acoust Soc Am 93:2197–2205. https://doi.org/10.1121/1.406681
Fristrup KM, Watkins WA (1993) Marine animal sound classification. Woods Hole Oceanographic Institution Technical Report WHOI-94-13, p 29
Frommolt K-H, Bardeli R, Clausen M (eds) (2007) Computational bioacoustics for assessing biodiversity. Proceed Internat Expert meeting on IT-based detection of bioacoustical patterns, 7–10 December 2007 at the International Academy for Nature Conservation (INA) Isle of Vilm, Germany. BfN - Skripten Federal Agency for Nature Conservation, p 234
Fukushima K, Wake N (1990) Alphanumeric character recognition by neocognitron. In: Miller RE (ed) Advanced neural computers. Elsevier Science, Amsterdam, pp 263–270
Fukuzawa Y, Webb WH, Pawley MD, Roper MM, Marsland S, Brunton DH, Gilman A (2020) Koe: web-based software to classify acoustic units and analyse sequence structure in animal vocalizations. Methods Ecol Evol 11:431–441. https://doi.org/10.1111/2041-210X.13336
Gannier A, Fuchs S, Quebre P, Oswald JN (2010) Performance of a contour-based classification method for whistles of Mediterranean dolphins. Appl Acoust 71:1063–1069. https://doi.org/10.1016/j.apacoust.2010.05.019
Gannon WL, Lawlor TE (1989) Variation in the chip vocalization of three species of Townsend’s chipmunks (genus Eutamias). J Mammal 70:740–753
Gannon WL, Sherwin RE, deCarvalho TN, O’Farrell MJ (2001) Pinnae and echolocation call differences between Myotis californicus and M. ciliolabrum (Chiroptera: Vespertilionidae). Acta Chiropterol 3(1):77–91
Gannon WL, O’Farrell MJ, Corben C, Bedrick EJ (2004) Call character lexicon and analysis of field recorded bat echolocation calls. In: Thomas J, Moss C, Vater M (eds) Echolocation in bats and dolphins. The University of Chicago Press, Chicago, pp 478–484
Garland EC, Castellote M, Berchok CL (2015) Beluga whale (Delphinapterus leucas) vocalizations and call classification from the eastern Beaufort Sea population. J Acoust Soc Am 137:3054–3067. https://doi.org/10.1121/1.4919338
Garland EC, Rendell L, Lilley MS, Poole MM, Allen J, Noad MJ (2017) The devil is in the detail: quantifying vocal variation in a complex, multi-levelled, and rapidly evolving display. J Acoust Soc Am 142(1):460–472. https://doi.org/10.1121/1.4991320
Gavrilov AN, Parsons MJG (2014) A MATLAB tool for the characterization of recorded underwater sound (CHORUS). Acoust Aust 42(3):190–196
Gavrilov A, McCauley R, Gedamke J (2012) Steady inter and intra-annual decrease in the vocalization frequency of Antarctic blue whales. J Acoust Soc Am 131(6):4476–4480. https://doi.org/10.1121/1.4707425
Gedamke J, Costa DP, Dunstan A (2001) Localization and visual verification of a complex minke whale vocalization. J Acoust Soc Am 109(6):3038–3047. https://doi.org/10.1121/1.1371763
Gemello R, Mana F (1991) A neural approach to speaker independent isolated word recognition in an uncontrolled environment. In: Proceedings of the International Neural Networks Conference, Paris 9–13 July 1990, vol 1. Kluwer Academic Publishers, Dordrecht, pp 83–86
Ghosh J, Deuser LM, Beck SD (1992) A neural network based hybrid system for detection, characterization, and classification of short-duration oceanic signals. IEEE J Ocean Eng 17:351–363. https://doi.org/10.1109/48.180304
Gill SA, Bierema AM-K (2013) On the meaning of alarm calls: a review of functional reference in avian alarm calling. Ethology 119:449–461. https://doi.org/10.1111/eth.12097
Gillespie D, Caillat M (2008) Statistical classification of odontocete clicks. Can Acoust 36:20–26
Gillespie D, Caillat M, Gordon J (2013) Automatic detection and classification of odontocete whistles. J Acoust Soc Am 134:2427–2437. https://doi.org/10.1121/1.4816555
Gingras G, Fitch WT (2013) A three-parameter model for classifying anurans into four genera based on advertisement calls. J Acoust Soc Am 133:547–559. https://doi.org/10.1121/1.4768878
Goëau H, Glotin H, Vellinga WP, Planqué R, Joly A (2016) LifeCLEF bird identification task 2016: the arrival of deep learning. CLEF 1609:440–449
Griffin DR, Webster FA, Michael CR (1960) The echolocation of flying insects by bats. Anim Behav 8:141–154
Guemeur Y, Elisseeff A, Paugam-Moisey H (2000) A new multi-class SVM based on a uniform convergence result. Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium 4:183–188
Hamilton LJ, Cleary J (2010) Automatic discrimination of beaked whale clicks in noisy acoustic time series. In: OCEANS’10 IEEE Sydney, pp 1–5
Hammerschmidt K, Fischer J (1998) The vocal repertoire of Barbary macaques: a quantitative analysis of a graded signal system. Ethology 104(3):203–216. https://doi.org/10.1111/j.1439-0310.1998.tb00063.x
Hammerschmidt K, Reisinger E, Westekemper K, Ehrenreich L, Strenzke N, Fischer J (2012) Mice do not require auditory input for the normal development of their ultrasonic vocalizations. BMC Neurosci 13:40
Harland E (2008) Processing the workshop datasets using the TRUD algorithm. Can Acoust 36:27–33
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proc IEEE Conf Comput Vis Pattern Recogn 2016:770–778
Helweg DA, Cato ADH, Jenkins PF, Garrigue D, McCauley RD (1998) Geographic variation in South Pacific humpback whale songs. Behaviour 135:1–27
Herr A, Klomp NL, Atkinson JS (1997) Identification of bat echolocation calls using a decision tree classification system. Complexity International. https://www.researchgate.net/publication/293134471_Identification_of_bat_echolocation_calls_using_a_decision_tree_classification_system. Accessed 17 July 2017
Himawan I, Towsey M, Law B, Roe P (2018) Deep learning techniques for Koala Activity detection. In: INTERSPEECH, pp 2107–2111
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Holy TE, Guo Z (2005) Ultrasonic songs of male mice. PLoS Biol 3(12):e386. https://doi.org/10.1371/journal.pbio.0030386
Horn AG, Falls JB (1996) Categorization and the design of signals: the case of song repertoires. In: Kroodsma DE, Miller EH (eds) Ecology and evolution of acoustic communication in birds. Comstock Publishing Associates, Ithaca, pp 121–135
Hotelling H (1933) Analysis of a complex of statistical variables into principal components. J Edu Psychol 24:417–441
Huang X, Acero A, Hon H-W (2001) Spoken language processing. Prentice Hall, Upper Saddle River, NJ
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. Proc IEEE Conf Comput Vis Pattern Recogn 2017:4700–4708
Ibrahim AK, Chérubin LM, Zhuang H, Schärer Umpierre MT, Dalgleish F, Erdol N, Ouyang B, Dalgleish A (2018) An approach for automatic classification of grouper vocalizations with passive acoustic monitoring. J Acoust Soc Am 143:666–676. https://doi.org/10.1121/1.5022281
Itakura F (1975) Minimum prediction residual principle applied to speech recognition. IEEE Trans Acoust Speech Sig Process 23:57–72
Jacobson EK, Yack TM, Barlow J (2013) Evaluation of an automated acoustic beaked whale detection algorithm
using multiple validation and assessment methods. In: NOAA Technical Memorandum NOAA-TM-NMFS-SWFSC-509
Jaitly N, Hinton GE (2013) Vocal tract length perturbation (VTLP) improves speech recognition. In: Proceedings of ICML Workshop on Deep Learning for Audio, Speech and Language, vol 117
Janik VM (1999) Pitfalls in the categorization of behavior: a comparison of dolphin whistle classification methods. Anim Behav 57:133–143. https://doi.org/10.1006/anbe.1998.0923
Jarvis S, Dimarzio N, Morrissey R, Moretti D (2006) Automated classification of beaked whales and other small odontocetes in the Tongue of the Ocean, Bahamas. Oceans 2006:1–6. https://doi.org/10.1109/OCEANS.2006.307124
Jiang JJ, Bu LR, Duan FJ, Wang XQ, Liu W, Sun ZB, Li CY (2019) Whistle detection and classification for whales based on convolutional neural networks. Appl Acoust 150:169–178. https://doi.org/10.1016/j.apacoust.2019.02.007
Kandia V, Stylianou Y (2006) Detection of sperm whale clicks based on the Teager–Kaiser energy operator. Appl Acoust 67(11):1144–1163. https://doi.org/10.1016/j.apacoust.2006.05.007
Karlsen JD, Bisther A, Lydersen C, Haug T, Kovacs KM (2002) Summer vocalizations of adult male white whales (Delphinapterus leucas) in Svalbard, Norway. Polar Biol 25:808–817. https://doi.org/10.1007/s00300-002-0415-6
Keen S, Ross JC, Griffiths ET, Lanzone M, Farnsworth A (2014) A comparison of similarity-based approaches in the classification of flight calls of four species of North American wood-warblers (Parulidae). Ecol Inf 21:25–33. https://doi.org/10.1016/j.ecoinf.2014.01.001
Keighley MV, Langmore NE, Zdenek CN, Heinsohn R (2017) Geographic variation in the vocalizations of Australian palm cockatoos (Probosciger aterrimus). Bioacoustics 26(1):91–108. https://doi.org/10.1080/09524622.2016.1201778
Kershenbaum A, Blumstein DT, Roch MA, Akcay C, Backus G, Bee MA, Bohn K, Cao Y, Carter G, Cäsar C, Coen M, DeRuiter SL, Doyle L, Edelman S, Ferrer-i-Cancho R, Freeberg TM, Garland EC, Gustison M, Harley HE, Huetz C, Hughes M, Bruno JH, Ilany A, Jin DZ, Johnson M, Ju C, Karnowski J, Lohr B, Manser MB, McCowan B, Mercado E, Narins PM, Piel A, Rice M, Salmi R, Sasahara K, Sayigh L, Shiu Y, Taylor C, Vallejo EE, Waller S, Zamora-Gutierrez V (2016) Acoustic sequences in non-human animals: a tutorial review and prospectus. Biol Rev 91:13–52
Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114
Klinck H, Mellinger DK (2011) The energy ratio mapping algorithm: a tool to improve the energy-based detection of odontocete echolocation clicks. J Acoust Soc Am 129(4):1807–1812. https://doi.org/10.1121/1.3531924
Ko T, Peddinti V, Povey D, Khudanpur S (2015) Audio augmentation for speech recognition. In: Sixteenth Annual Conference of the International Speech Communication Association
Kogan J, Margoliash D (1998) Automated recognition of bird song elements from continuous recordings using dynamic time warping and hidden Markov models: a comparative study. J Acoust Soc Am 103:2185–2196. https://doi.org/10.1121/1.421364
Kollmorgen S, Hahnloser RH, Mante V (2020) Nearest neighbours reveal fast and slow components of motor learning. Nature 577(7791):526–530. https://doi.org/10.1038/s41586-019-1892-x
Kondo N, Watanabe S (2009) Contact calls: information and social function. Jpn Psych Res 51:197–208. https://doi.org/10.1111/j.1468-5884.2009.00399.x
Koren L, Geffen E (2009) Complex call in male rock hyrax (Procavia capensis): a multi-information distributing channel. Behav Ecol Sociobiol 63(4):581–590. https://doi.org/10.1007/s00265-008-0693-2
Koren L, Geffen E (2011) Individual identity is communicated through multiple pathways in male rock hyrax (Procavia capensis) songs. Behav Ecol Sociobiol 65(4):675–684. https://doi.org/10.1007/s00265-010-1069-y
Koren L, Mokady O, Geffen E (2008) Social status and cortisol levels in singing rock hyraxes. Horm Behav 54:212–216
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Kruskal J, Sankoff D (1983) An anthology of algorithms and concepts for sequence comparison. In: Sankoff D, Kruskal J (eds) Time warps, string edits and macromolecules: the theory and practice of string comparison. Addison-Wesley, Reading, MA, pp 265–310
Lammers MO, Au WWL, Herzing DL (2003) The broadband social acoustic signaling behavior of spinner and spotted dolphins. J Acoust Soc Am 114:1629–1639. https://doi.org/10.1121/1.1596173
Law BS, Reinhold L, Pennay M (2002) Geographic variation in the echolocation sounds of Vespadelus spp. (Vespertilionidae) from New South Wales and Queensland, Australia. Acta Chiropt 4:201–215. https://doi.org/10.3161/001.004.0208
Le Boeuf BJ, Peterson RS (1969) Dialects in elephant seals. Science 166(3913):1654–1656. https://doi.org/10.1126/science.166.3913.1654
Leblanc E, Bahoura M, Simard Y (2008) Comparison of automatic classification methods for beluga whale vocalizations. J Acoust Soc Am 123:3772
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989a) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551. https://doi.org/10.1162/neco.1989.1.4.541
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989b) Handwritten digit recognition with a back-propagation network. In:
environments. In: Proceedings of the 2nd International Conference on Underwater Acoustic Measurements: Technologies and Results, Heraklion, Greece, 25–29 June 2007
Mellinger DK, Clark CW (2000) Recognizing transient low-frequency whale sounds by spectrogram correlation. J Acoust Soc Am 107(6):3518–3529. https://doi.org/10.1121/1.429434
Mellinger DK, Martin SW, Morrissey RP, Thomas L, Yosco JJ (2011) A method for detecting whistles, moans and other frequency contour sounds. J Acoust Soc Am 129:4055–4061. https://doi.org/10.1121/1.3531926
Mendelson TC, Shaw KL (2003) Rapid speciation in an arthropod. Nature 433:375–376. https://doi.org/10.1038/433375a
Mitani JC, Hasegawa T, Groslouis J, Marler P, Byrne R (1992) Dialects in wild chimpanzees. Am J Primatol 27:233–243
Møhl B, Wahlberg M, Madsen PT, Heerford A, Lund A (2003) The monopulsed nature of sperm whale sonar clicks. J Acoust Soc Am 114(2):1143–1154. https://doi.org/10.1121/1.1586258
Moon TK (1996) The expectation-maximization algorithm. IEEE Sig Process Mag 13:47–60. https://doi.org/10.1109/79.543975
Morrissey RP, Ward J, DiMarzio N, Jarvis S, Moretti DJ (2006) Passive acoustic detection and localization of sperm whales (Physeter macrocephalus) in the tongue of the ocean. Appl Acoust 67:1091–1105. https://doi.org/10.1016/j.apacoust.2006.05.014
Mouy X, Leary D, Martin B, Laurinolli M (2008) A comparison of methods for the automatic classification of marine mammal vocalizations in the Arctic. In: Proceedings of the PASSIVE’08 Workshop on New Trends for Environmental Monitoring using Passive Systems, Hyeres, France, 14–17 October 2008
Murray SO, Mercado E, Roitblat HL (1998) Characterizing the graded structure of false killer whale (Pseudorca crassidens) vocalizations. J Acoust Soc Am 104:1679–1687. https://doi.org/10.1121/1.424380
Myers C, Rabiner LR, Rosenberg AE (1980) Performance tradeoffs in dynamic time warping algorithms for isolated word recognition. IEEE Trans Acoust Speech Sig Process 28:623–635. https://doi.org/10.1109/TASSP.1980.1163491
Nagy CM, Rockwell RF (2012) Identification of individual eastern screech-owls (Megascops asio) via vocalization analysis. Bioacoustics 21:127–140. https://doi.org/10.1080/09524622.2011.651829
Narins PM, Feng AS, Fay RR (eds) (2006) Hearing and sound communication in amphibians. Springer, New York
Noad MJ, Cato DH, Bryden MM, Jenner MN, Jenner KCS (2000) Cultural revolution in whale songs. Nature 408:537. https://doi.org/10.1038/35046199
Noda JJ, Travieso CM, Sánchez-Rodríguez D (2016) Automatic taxonomic classification of fish based on their acoustic signals. Appl Sci 6(12):443. https://doi.org/10.3390/app6120443
O’Farrell MJ, Miller BW, Gannon WL (1999) Qualitative identification of free-flying bats using Anabat detector. J Mammal 80:11–23
Oh J, Laubach M, Luczak A (2003) Estimating neuronal variable importance with random forest. Proc IEEE Bioeng Conf:33–34. https://doi.org/10.1109/NEBC.2003.1215978
Oleson EM, Širović A, Bayless AR, Hildebrand JA (2014) Synchronous seasonal change in fin whale song in the North Pacific. PLoS One 9(12):e115678. https://doi.org/10.1371/journal.pone.0115678
Oswald JN, Barlow J, Norris TF (2003) Acoustic identification of nine delphinid species in the eastern tropical Pacific Ocean. Mar Mamm Sci 19:20–37. https://doi.org/10.1111/j.1748-7692.2003.tb01090.x
Oswald JN, Rankin S, Barlow J, Lammers MO (2007) A tool for real-time acoustic species identification of delphinid whistles. J Acoust Soc Am 122:587–595. https://doi.org/10.1121/1.2743157
Oswald JN, Au WWL, Duennebier F (2011) Minke whale (Balaenoptera acutorostrata) boings detected at the Station ALOHA cabled observatory. J Acoust Soc Am 129:3353–3360. https://doi.org/10.1121/1.3575555
Oswald JN, Rankin S, Barlow J, Oswald M (2013) Real-time odontocete call classification algorithm: software for species identification of delphinid whistles. In: Adam O, Samaran F (eds) Detection, classification and localization of marine mammals using passive acoustics, 2003–2013: 10 years of international research. DIRAC NGO, Paris, France
Oswald JN, Walmsley SF, Casey C, Fregosi S, Southall B, Janik VM (2021) Species information in whistle frequency modulation patterns of common dolphins. Philos Trans R Soc B 376:20210046. https://doi.org/10.1098/rstb.2021.0046
Ou H, Au WWL, Oswald JN (2012) A non-spectrogram-correlation method of automatically detecting minke whale boings. J Acoust Soc Am 132:EL317–EL322
Ouattara K, Lemasson A, Zuberbühler K (2009) Campbell’s monkeys concatenate vocalizations into context-specific call sequences. Proc Natl Acad Sci USA 106(51):22026
Papale E, Azzolin M, Cascao I, Gannier A, Lammers MO, Martin VM, Oswald JN, Perez-Gil M, Prieto R, Silva MA, Giacoma C (2013) Geographic variability in the acoustic parameters of striped dolphin’s (Stenella coeruleoalba) whistles. J Acoust Soc Am 133:1126–1134. https://doi.org/10.1121/1.4774274
Papale E, Azzolin M, Cascao I, Gannier A, Lammers MO, Martin VM, Oswald J, Perez-Gil M, Prieto R, Silva MA, Giacoma C (2014) Macro- and micro-geographic variation of short-beaked common dolphin’s whistles in the Mediterranean Sea and Atlantic Ocean. Ethol Ecol Evol 26:392–404. https://doi.org/10.1080/03949370.2013.851122
Park DS, Chan W, Zhang Y, Chiu C, Zoph B, Cubuk ED, Le QV (2019) SpecAugment: a simple data
augmentation method for automatic speech recognition. Proc Interspeech 2019:2613–2617. https://doi.org/10.21437/Interspeech.2019-2680
Parsons S, Boonman AM, Obrist MK (2000) Advantages and disadvantages of techniques for transforming and analyzing chiropteran echolocation calls. J Mammal 81:927–938. https://doi.org/10.1644/1545-1542(2000)081<0927:AADOTF>2.0.CO;2
Payne K, Payne R (1985) Large scale changes over 19 years in songs of humpback whales in Bermuda. Z Tierpsychol 68:89–114. https://doi.org/10.1111/j.1439-0310.1985.tb00118.x
Picone JW (1993) Signal modeling techniques in speech recognition. Proc IEEE 81:1215–1247. https://doi.org/10.1109/5.237532
Placer J, Slobodchikoff CN (2000) A fuzzy-neural system for identification of species-specific alarm sounds of Gunnison’s prairie dogs. Behav Process 52:1–9. https://doi.org/10.1016/S0376-6357(00)00105-4
Potter JR, Mellinger DK, Clark CW (1994) Marine mammal sound discrimination using artificial neural networks. J Acoust Soc Am 96:1255–1262. https://doi.org/10.1121/1.410274
Pozzi L, Gamba M, Giacoma C (2010) The use of Artificial Neural Networks to classify primate vocalizations: a pilot study on black lemurs. Am J Primatol 72(4):337–348. https://doi.org/10.1002/ajp.20786
Pröhl H, Koshy RA, Mueller U, Rand AS, Ryan MJ (2006) Geographic variation of genetic and behavioral traits in northern and southern Túngara frogs. Evol 60:1669–1679. https://doi.org/10.1111/j.0014-3820.2006.tb00511.x
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77:257–285
Rabiner LR, Juang BH (1986) An introduction to Hidden Markov Models. IEEE ASSP Mag 1986:4–16
Rabiner LR, Levinson S, Sondhi M (1983) On the application of vector quantization and hidden Markov models to speaker-independent, isolated word recognition. Bell Syst Tech J 62:1075–1106. https://doi.org/10.1002/j.1538-7305.1983.tb03115.x
Rabiner LR, Juang B, Lee C (1996) An overview of automatic speech recognition. In: Lee C, Soong F, Paliwal K (eds) Automatic speech and speaker recognition. Kluwer Academic, New York, pp 1–30
Rankin S, Barlow J (2005) Source of the North Pacific ‘boing’ sound attributed to minke whales. J Acoust Soc Am 118(5):3346–3351. https://doi.org/10.1121/1.2046747
Rankin S, Ljungblad D, Clark CW, Kato H (2005) Vocalisations of Antarctic blue whales, Balaenoptera musculus intermedia, recorded during the 2001/2002 and 2002/2003 IWC/SOWER circumpolar cruises, Area V, Antarctica. J Cet Res Manag 7(1):13–20
Rankin S, Archer F, Keating JL, Oswald JN, Oswald M, Curtis A, Barlow J (2016) Acoustic classification of dolphins in the California Current using whistles, clicks and burst-pulses. Mar Mamm Sci 33:520–540. https://doi.org/10.1111/mms.12381
Reby D, André-Obrecht R, Galinier A, Farinas J, Cargnelutti B (2006) Cepstral coefficients and hidden Markov models reveal idiosyncratic voice characteristics in red deer (Cervus elaphus) stags. J Acoust Soc Am 120:4080–4089. https://doi.org/10.1121/1.2358006
Recalde-Salas A, Salgado Kent CP, Parsons MJG, Marley SA, McCauley RD (2014) Non-song vocalizations of pygmy blue whales in Geographe Bay, Western Australia. J Acoust Soc Am 135(5):EL213–EL218. https://doi.org/10.1121/1.4871581
Recalde-Salas A, Erbe C, Salgado Kent C, Parsons M (2020) Non-song vocalizations of humpback whales in Western Australia. Front Mar Sci 7:141. https://doi.org/10.3389/fmars.2020.00141
Rickwood P, Taylor A (2008) Methods for automatically analyzing humpback song units. J Acoust Soc Am 123:1763–1772. https://doi.org/10.1121/1.2836748
Risch D, Gales NJ, Gedamke J, Kindermann L, Nowacek DP, Read AJ, Siebert U, Van Opzeeland IC, Van Parijs SM, Friedlander AS (2014) Mysterious bio-duck sound attributed to the Antarctic minke whale (Balaenoptera bonaerensis). Biol Lett 10:20140175. https://doi.org/10.1098/rsbl.2014.0175
Roch MA, Soldevilla MS, Burtenshaw JC, Henderson EE, Hildebrand JA (2007) Gaussian mixture model classification of odontocetes in the Southern California Bight and the Gulf of California. J Acoust Soc Am 121:1737–1748. https://doi.org/10.1121/1.2400663
Roch MA, Soldevilla MS, Hoenigman R, Wiggins SM, Hildebrand JA (2008) Comparison of machine-learning techniques for the classification of echolocation clicks from three species of odontocetes. Can Acoust 36:41–47
Roch MA, Brandes TS, Patel B, Barkley Y, Baumann-Pickering S, Soldevilla MS (2011) Automated extraction of odontocete whistle contours. J Acoust Soc Am 130:2212–2223. https://doi.org/10.1121/1.3624821
Rocha HS, Ferreira LS, Paula BC, Rodrigues HG, Sousa-Lima RS (2015) An evaluation of manual and automated methods for detecting sounds of maned wolves (Chrysocyon brachyurus Illiger 1815). Bioacoustics 24:185–198. https://doi.org/10.1080/09524622.2015.1019361
Roitblat HL, Moore PWB, Nachtigall PE, Penner RH, Au WWL (1989) Natural echolocation with an artificial neural network. Int J Neural Syst 1:239–247
Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65:386–408. https://doi.org/10.1037/h0042519
Ross JC, Allen PE (2014) Random forest for improved analysis efficiency in passive acoustic monitoring. Ecol Inform 21:34–39. https://doi.org/10.1016/j.ecoinf.2013.12.002
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536. https://doi.org/10.1038/323533a0
Russo D, Mucedda M, Bello M, Biscardi S, Pidinchedda E, Jones G (2007) Divergent echolocation sound frequencies in insular rhinolophids (Chiroptera): a case of character displacement? J Biogeogr 34:2129–2138. https://doi.org/10.1111/j.1365-2699.2007.01762.x
Sainburg T, Theilman B, Thielk M, Gentner TQ (2019) Parallels in the sequential organization of birdsong and human speech. Nat Commun 10:3636. https://doi.org/10.1038/s41467-019-11605-y
Sakoe H, Chiba S (1978) Dynamic programming optimization for spoken word recognition. IEEE Trans Acoust Speech Sig Process 26:43–49. https://doi.org/10.1109/TASSP.1978.1163055
Slobodchikoff CN, Ackers SH, Van Ert M (1998) Geographic variation in alarm calls of Gunnison’s prairie dogs. J Mammal 79(4):1265–1272. https://doi.org/10.2307/1383018
Somervuo P, Härmä A, Fagerlund S (2006) Parametric representations of bird sounds for automatic species recognition. IEEE Trans Audio Speech Lang Process 14:2252–2263. https://doi.org/10.1109/TASL.2006.872624
Sparling DW, Williams JD (1978) Multivariate analysis of avian vocalizations. J Theor Biol 74:83–107. https://doi.org/10.1016/0022-5193(78)90291-6
Stafford KM, Fox CG, Clark DS (1998) Long-range acoustic detection and localization of blue whale sounds in the northeast Pacific Ocean. J Acoust Soc Am 104(6):3616–3625. https://doi.org/10.1121/1.423944
Schassburger RM (1993) Vocal communication in the Stafford KM, Nieukirk SL, Fox CG (1999)
timber wolf, Canis lupus, Linnaeus: structure, motiva- Low-frequency whale sounds recorded on
tion, and ontogeny. Parey Scientific Publication, hydrophones moored in the eastern tropical Pacific. J
New York Acoust Soc Am 106:3687–3698. https://doi.org/10.
Schon PC, Puppe B, Manteauffel G (2001) Linear predic- 1121/1.428220
tion coding analysis and self-organizing feature map as Stafford KM, Moore SE, Laidre KL, Heide-Jørgensen MP
tools to classify stress sounds of domestic pigs (Sus (2008) Bowhead whale springtime song off West
scrofa). J Acoust Soc Am 110:1425–1431. https://doi. Greenland. J Acoust Soc Am 124(5):3315–3323.
org/10.1121/1.1388003 https://doi.org/10.1121/1.2980443
Sethi SS, Jones NS, Fulcher BD, Picinali L, Clink DJ, Starnberger I, Preininger D, Hödl W (2014) The anuran
Klinck H, Orme CD, Wrege PH, Ewers RM (2020) vocal sac: a tool for multimodal signalling. Anim
Characterizing soundscapes across diverse ecosystems Behav 97:281–288. https://doi.org/10.1016/j.anbehav.
using a universal acoustic feature set. Proc Natl Acad 2014.07.027
Sci 117(29):17049–17055. https://doi.org/10.1073/ Stoeger AS, Heilmann G, Zeppelzauer M, Ganswindt A,
pnas.2004702117 Hensman S, Charlton BD (2012) Visualizing sound
Shannon CE, Weaver W (1998) The mathematical theory emission of elephant vocalizations: evidence for two
of communication. University of Illinois Press, rumble production types. PLoS One 7:1–8. https://doi.
Champaign org/10.1371/journal.pone.0048907
Shiu Y, Palmer KJ, Roch MA, Fleishman E, Liu X, Nosal Stowell D, Wood M, Stylianou Y, Glotin H (2016). Bird
EM, Helble T, Cholewiak D, Gillespie D, Klinck H detection in audio: a survey and a challenge. In: 2016
(2020) Deep neural networks for automated detection IEEE 26th International Workshop on Machine
of marine mammal species. Sci Rep 10(1):1–12. Learning for Signal Processing (MLSP), pp 1–6.
https://doi.org/10.1038/s41598-020-57549-y https://doi.org/10.1109/MLSP.2016.7738875
Sibley DA (2000) The Sibley field guide to birds. Knopf, Sturtivant C, Datta S (1997) Automatic dolphin whistle
New York detection, extraction, encoding, and classification. Proc
Simmons JA, Wever EG, Pylka JM (1971) Periodical Inst Acoust 19:259–266
cicada: sound production and hearing. Science Suzuki R, Buck J, Tyack P (2006) Information entropy
171(3967):212–213. https://doi.org/10.1126/science. of humpback whale songs. J Acoust Soc Am 119:
171.3967.212 1849–1866. https://doi.org/10.1121/1.2161827
Širović A (2016) Variability in the performance of the Swets JA, Dawes RM, Monahan J (2000) Better decisions
spectrogram correlation detector for north-east Pacific through science. Sci Am 283:82–87
blue whale calls. Bioacoustics 25(2):145–160. https:// Takahashi N, Kashino M, Hironaka N (2010) Structure of
doi.org/10.1080/09524622.2015.1124248 rat ultrasonic vocalizations and its relevance to behav-
Širović A, Cutter GR, Butler JL, Demer DA (2009) ior. PLoS One 5(11):e14115. https://doi.org/10.1371/
Rockfish sounds and their potential use for population journal.pone.0014115
monitoring in the Southern California Bight. ICES J Tan M, McDonald K (2017) Bird sounds | Experiments
Mar Sci 66:981–990. https://doi.org/10.1093/icesjms/ with Google [online]. https://experiments.withgoogle.
fsp064 com/bird-sounds
Sjare B, Stirling I, Spencer C (2003) Seasonal and longer- Tchernichovski O, Nottebohm F, Ho CE, Pesaran B, Mitra
term variability in the songs of Atlantic walruses PP (2000) A procedure for an automated measurement
breeding in the Canadian High Arctic. Aquat Mamm of song similarity. Anim Behav 59:1167–1176. https://
29(2):297–318 doi.org/10.1006/anbe.1999.1416
316 J. N. Oswald et al.
Tenenbaum JB, De Silva V, Langford JC (2000) A global conjunction with a digital tag (DTag) recording. Can
geometric framework for nonlinear dimensionality Acoust 36:60–66
reduction. Science 290(5500):2319–2323. https://doi. Ward R, Parnum I, Erbe C, Salgado-Kent CP (2016)
org/10.1126/science.290.5500.2319 Whistle characteristics of Indo-Pacific bottlenose
Thomas JA, Golladay CL (1995) Analysis of underwater dolphins (Tursiops aduncus) in the Fremantle Inner
vocalizations of leopard seals (Hydrurga leptonyx). In: Harbour, Western Australia. Acoust Aust 44(1):
Kastelein RA, Thomas JA, Nachtigall PE (eds) Sen- 159–169. https://doi.org/10.1007/s40857-015-0041-4
sory systems of aquatic mammals. De Spil Publishers, Ward R, Gavrilov AN, McCauley RD (2017) “Spot” call:
Amsterdam, pp 201–221 A common sound from an unidentified great whale in
Thomas M, Martin B, Kowarski K, Gaudet B, Matwin S Australian temperate waters. J Acoust Soc Am 142(2):
(2019) Marine mammal species classification using EL231–EL236. https://doi.org/10.1121/1.4998608
convolutional neural networks and a novel acoustic Weisburn BA, Mitchell SG, Clark CW, Parks TW (1993)
representation. In: Joint European Conference on Isolating biological acoustic transient signals. Proc
Machine Learning and Knowledge Discovery in IEEE Int Conf Acoust Speech Sig Process 1:269–
Databases, pp 290–305 272. https://doi.org/10.1109/ICASSP.1993.319107
Torrey L, Shavlik J (2010) Transfer learning. In: Hand- Wellard R, Erbe C, Fouda L, Blewitt M (2015)
book of research on machine learning applications and Vocalisations of killer whales (Orcinus orca) in the
trends: algorithms, methods, and techniques. IGI Bremer Canyon, Western Australia. PLoS One 10(9):
Global, New York, pp 242–264 e0136535. https://doi.org/10.1371/journal.pone.
Trawicki MB, Johnson MT, Osiejuk TS (2005) Automatic 0136535
song-type classification and speaker identification of Wells KD (2007) The ecology and behaviour of
Norwegian ortolan bunting. IEEE Int Conf Mach Learn amphibians. University of Chicago Press, Chicago, IL
Sig Process (MLSP) 2005:277–282. https://doi.org/10. Wich SA, Schel AM, De Vries H (2008) Geographic
1109/MLSP.2005.1532913 variation in Thomas langur (Presbytis thomasi) loud
Trifa VM, Kirschel ANG, Taylor CE (2008) Automated sounds. Am J Primatol 70:566–574. https://doi.org/10.
species recognition of antbirds in a Mexican rainforest 1002/ajp.20527
using hidden Markov Models. J Acoust Soc Am 123: Winn HE, Winn LK (1978) The song of the humpback
2424–2431. https://doi.org/10.1121/1.2839017 whale Megaptera novaeangliae in the West Indies.
Valente D, Wang H, Andrews P, Mitra PP, Saar S, Mar Biol 47:97–114. https://doi.org/10.1007/
Tchernichovski O, Golani I, Benjamini Y (2007) BF00395631
Characterizing animal behavior through audio and Wood JD, McCowan B, Langbauer WR, Viljoen JJ,
video signal processing. IEEE Multimedia 14:32–41. Hart LA (2005) Classification of African elephant
https://doi.org/10.1109/MMUL.2007.71 Loxodonta africana rumbles using acoustic
Van Allen E, Menon MM, Dicaprio N (1990) A modular parameters and cluster analysis. Bioacoustics 15:
architecture for object recognition using neural 143–161. https://doi.org/10.1080/09524622.2005.
networks. In: Proceedings of International Neural 9753544
Networks Conference, Paris, vol 1, pp 35–379, Yamamoto O, Moore B, Brand L (2001) Variation in the
13 July 1990. Kluwer Academic Publishers, Dordrecht bark sound of the red squirrel (Tamiasciurus
Vapnik VN (1998) Statistical learning theory. Wiley, hudsonicus). West N Am Nat 61:395–402
New York Yang X-J, Lei F-M, Wang G, Jesse AJ (2007) Syllable
Venter PJ, Hanekom JJ (2010) Automatic detection of sharing and inter-individual syllable variation in
African elephant (Loxodonta africana) infrasonic Anna’s hummingbird Calypte anna songs, in San
vocalizations from recordings. Biosyst Eng 106:286– Francisco, California. Folia Zool 56:307–318
294. https://doi.org/10.1016/j.biosystemseng.2010.04. Yoshino H, Armstrong KN, Izawa M, Yokoyama J,
001 Kawata M (2008) Genetic and acoustic population
Von Muggenthaler E, Reinhart P, Lympany B, Craft RB structuring in the Okinawa least horseshoe bat: are
(2003) Songlike vocalizations from the Sumatran rhi- intercolony acoustic differences maintained by vertical
noceros (Dicerorhinus sumatrensis). Acoust Res Lett maternal transmission? Mol Ecol 17:4978–4991.
4(3):83–88. https://doi.org/10.1121/1.1588271 https://doi.org/10.1111/j.1365-294X.2008.03975.x
Waibel A, Hanazawa T, Hinton G, Shikano K, Lang KL Zar JH (2009) Biostatistical analysis, 5th edn. Pearson,
(1989) Phoneme recognition using time-delay neural New York, p 960
networks. IEEE Trans Acoust Speech Signal Proc 37: Zeppelzauer M, Hensman S, Stoeger AS (2015) Towards
328–339. https://doi.org/10.1109/29.21701 an automated acoustic detection system for free-
Ward J, Morrissey R, Moretti D, DiMarzio N, Jarvis S, ranging elephants. Bioacoustics 24:13–29. https://doi.
Johnson M, Tyack PL, White C (2008) Passive acous- org/10.1080/09524622.2014.906321
tic detection and localization of Mesoplodon Zhang YJ, Huang JF, Gong N, Ling ZH, Hu Y (2018)
densirostris (Blainville’s beaked whale) vocalizations Automatic detection and classification of marmoset
using distributed bottom-mounted hydrophones in vocalizations using deep and recurrent neural
8 Detection and Classification Methods for Animal Sounds 317
networks. J Acoust Soc Am 144(1):478–487. https:// labeling. Appl Acoust 166:107375. https://doi.org/10.
doi.org/10.1121/1.5047743 1016/j.apacoust.2020.107375
Zhong M, LeBien J, Campos-Cerqueira M, Dodhia R, Zuberbuhler K, Jenny D, Bshary R (1999) The predator
Ferres JL, Velev JP, Aide TM (2020) Multispecies deterrence function of primate alarm calls. Ethology
bioacoustic classification using transfer learning of 105:477–490. https://doi.org/10.1046/j.1439-0310.
deep convolutional neural networks with pseudo- 1999.00396.x
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
9 Fundamental Data Analysis Tools and Concepts for Bioacoustical Research
Chandra Salgado Kent, Tiago A. Marques, and Danielle Harris
acoustic instruments, and/or surveyed animals). The reality is that for a bioacoustician to be able to confidently answer research questions, budgets must allow for robust experimental designs and sufficient time to collect sample sizes representative of the study population. Even when budgets and time allow for carefully designed experiments, however, environmental conditions and study animals often cannot be controlled, particularly when studied in their natural environment. Moreover, many studies occur opportunistically and are not the result of an experimental design developed specifically for the study aims. They are observational in nature and can take advantage of large, long-term existing datasets or unexpected opportunities to collect field data. In fact, data collected opportunistically are prevalent in bioacoustical studies, as many researchers take recording systems into the field during other work to use when time permits.
The challenges described above, from ensuring that the research questions have biological relevance, to evaluating the achievability of a study and reliability of its outcomes, are only a few of many challenges faced by bioacousticians. To overcome these challenges, bioacousticians must have solid foundational knowledge about the quantitative aspects of their research: from how to formulate quantitative research questions, to designing robust studies and undertaking suitable analyses. Only by having these skills can reliable conclusions and scientific claims be made.
Today, not only are there a wide range of analytical tools available to select from, but this ever-increasing number has been evolving quickly over recent decades due to the dramatic improvement in computer capacity. Moreover, ongoing research in statistics continually updates our knowledge on the suitability of commonly used methods (Wilcox 2010). In some instances, methods previously used over a wide range of applications may now only be acceptably applied to certain scenarios, with new methods superseding old ones. Having said this, while a new method may be considered the ‘Rolls Royce’ of analyses, sometimes an older, simpler approach may still do the job well. Consequently, not only is it important for researchers to have a solid foundation in long-established analytical approaches, but they must keep up to date with new developments. In general, a researcher should understand the fundamentals involving randomness, variability, and statistical modeling discussed in this chapter, and be able to adapt them to their specific context; this understanding is arguably more valuable than a book of recipes that tells a researcher which method to use and when.
A consequence of the many advancements over recent years and the large range of analytical approaches available today is that selecting the right tool can be an overwhelming task. In fact, the right tool might not exist for a specific setting. In such cases, collaboration with an applied statistician may be fundamental. This chapter aims to give general guidance on considerations that bioacousticians should make when tasked with undertaking research resulting in what are often complex and messy bioacoustical datasets. The information presented in this chapter is by no means meant to provide a menu of analytical tools, their mathematical basis, or conditions of use. There are a large number of widely available textbooks that do just that, and many are referenced here. Bioacousticians should consult the relevant textbooks for in-depth knowledge of approaches, their applications, limitations, and assumptions about the characteristics of the data that must be met. Rather, the focus of this chapter is to provide practical guidance on: (1) the development of meaningful research questions, (2) data exploration and experimental design considerations (also see Chap. 3), and (3) common analytical approaches used today. The approach taken in this chapter is to define basic terms and concepts as they appear in the text, so that readers new to the subject can also understand the more complex concepts discussed, regardless of their prior statistical knowledge.
Note that this chapter has been written from the perspective of a biologist faced with the challenges common to bioacoustical research. If, from this chapter, the reader gains an appreciation of limitations in their data, considerations they
should make when selecting analytical approaches, and the biological relevance of their analytical outputs, then this chapter has achieved its purpose. Entire books could be written about how a bioacoustician, in fact, any ecologist, might become more quantitative. A good example of such a book is suitably named How to be a quantitative ecologist (Matthiopoulos 2010), which we wholeheartedly recommend as good reading after this chapter.
9.2 Developing a Clear Research Question
At the concept stage of any study, the purpose and specific research aim must be clearly defined. The research aim should be novel (i.e., not already answered in previous research). Once the general aim has been defined, the specific analytical research question can be developed. While developing the question may seem to be a simple, self-evident task, it requires careful consideration. The structure of the question drives the experimental design and selection of analytical tools, thus its accurate development is essential. To frame a question in clear, concise analytical terms, it is useful to identify the type of study involved. There are many types of studies conducted for a wide range of purposes. Depending upon the discipline, groupings that describe types of studies and their definitions vary. Here, we have adopted five of the six groupings referred to by Leek and Peng (2015) as common in bioacoustics. These study types include descriptive, exploratory, inferential, explanatory (called ‘causal’ in Leek and Peng 2015), and predictive studies. Definitions we give here have been framed within the context of common bioacoustical questions, and thus are adapted from more broad definitions.
Of the study types, descriptive studies are the simplest, aiming to summarize datasets collected. Exploratory studies take a step beyond and explore relationships, trends, and patterns in datasets. Neither of these types of studies attempts to infer beyond the dataset collected to the wider population. These types of studies are commonly used during preliminary data exploration before undertaking inferential, explanatory, or predictive studies (see Sect. 9.3.3). Indeed, descriptive and exploratory surveys are often used to develop the more complex inferential, explanatory, and predictive study type questions.
Inferential studies build on descriptive and exploratory studies by quantifying whether findings are likely to be true for a broader population and hence can be generalized. For example, inferential studies are commonly used to make decisions about whether there is sufficient evidence regarding observed patterns or relationships in sample data to believe that they have not arisen from the population by pure chance alone. Explanatory studies aim to identify associated conditions (e.g., species, age, sex of an animal, date, time of day, season, and environmental factors such as temperature, noise, etc.) influencing or explaining an outcome (e.g., the rate at which animals produce their calls). These studies seek to determine the magnitude and direction of relationships (Leek and Peng 2015). Predictive studies aim to predict future outcomes in given conditions or scenarios (but may not necessarily explain conditions leading to an observed outcome). By identifying which of the study types your research aim falls into, the general structure of the analytical question can be formed. Some examples of the different study types and corresponding analytical questions are given in Table 9.1.
9.3 Designing the Study and Collecting Data
Once the analytical question has been formulated based on the study type, novelty, and whether it truly addresses the research question, the feasibility of collecting the required data will need to be assessed. Practical considerations, for instance, include identifying any hindrances to study site accessibility or timely ethics approvals and animal experimentation permits. Below (Fig. 9.1) is a checklist of some preliminary considerations before committing to developing, designing, and executing a study.
Table 9.1 Examples of study types and their corresponding objectives and questions
Study type | Purpose | Example objective | Example questions
Descriptive | Studies conducted to describe phenomena and conditions measured during a study. | Describe the characteristics of sound produced by sea turtle hatchlings recorded during a study. | • What is the frequency range of sounds produced? • What are the source levels of sounds produced? • What is the rate of sound production by sea turtle hatchlings?
Exploratory | Studies exploring relationships, trends, and patterns in datasets (not in a broader population). | Establish how observed hatchling sea turtles’ sound production varied during a survey. | • How does observed hatchling sea turtles’ sound production vary during a given survey?
Inferential | Studies aiming to estimate population parameters or test hypotheses about a broader population. | Determine the average expected sound production rate of a population of hatchling sea turtles. | • What is the average expected sound production rate of a population of hatchling sea turtles?
Explanatory | Studies that aim to understand the underlying cause(s) of a behavior, state, or phenomenon. | Identify what influences sound production in sea turtle hatchlings. | • Are communications influenced by the presence of other sea turtles, environmental conditions, or human/predator threats?
Predictive | Studies that aim to predict an outcome (such as animal behaviors) in response to a stimulus or condition. | Predict hatchling sea turtle sound production rate when threatened by humans. | • What will be the expected sound production rate of hatchling sea turtles when exposed to human threats?
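As a minimal illustration of the difference between the descriptive and inferential rows of Table 9.1, the Python sketch below computes a sample mean (a descriptive summary of the recorded animals only) and a percentile bootstrap confidence interval for the population mean (an inferential statement). The sound-production rates and the function name bootstrap_mean_ci are hypothetical and serve only as an illustration; the bootstrap is one of several ways to quantify uncertainty and is an assumption of this sketch, not a method prescribed by the chapter.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_boot=5000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    boot_means = []
    for _ in range(n_boot):
        # Resample the observed animals with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        boot_means.append(statistics.fmean(resample))
    boot_means.sort()
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical data: sounds per minute for 12 recorded hatchlings (made-up values).
rates = [2.1, 3.4, 1.8, 2.9, 4.0, 2.5, 3.1, 2.2, 3.8, 2.7, 1.9, 3.3]

# Descriptive question: what is the rate in the animals that were recorded?
mean_rate = statistics.fmean(rates)

# Inferential question: what range of population means is compatible with the sample?
ci_low, ci_high = bootstrap_mean_ci(rates)

print(f"sample mean = {mean_rate:.2f} sounds/min")
print(f"95% bootstrap CI = ({ci_low:.2f}, {ci_high:.2f}) sounds/min")
```

The interval is only meaningful as a statement about the population if the recorded animals can be regarded as a random sample from it.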
Will there be any logistical / ethical constraints that will affect the execution of the study?
9.3.1 Experimental Design
The ideal situation is to formulate the analytical question before data are collected (i.e., a priori) so that experiments can be designed to maximize the chance that, based on the observations, they produce precise (i.e., close to one another) and accurate (i.e., proximal to true values) estimates of the parameters of interest, and so that there is a high probability of detecting relevant effects (i.e., that there is sufficient statistical power) when they are present. In some cases, however, formulation of the analytical questions occurs after data have been collected (i.e., a posteriori). This may occur as a result of poor planning or of new and unforeseen research opportunities. A scenario in which this often occurs is when data already collected for another primary study are used to answer a new research question. In these cases, the methods and experiment are not necessarily designed according to the analytical requirements of the new research question. Bioacoustical studies using pre-existing opportunistic data often do so because collecting new data can be prohibitively expensive (e.g., if the field site is remote or if specialized equipment is required). Since the methods and experimental design may be sub-optimal for the current study questions, the data must be meticulously evaluated to check that newly formulated analytical questions can indeed be answered. Studies attempting to answer specific research questions using sub-optimal or poor-quality data cannot always be salvaged, even with sophisticated analyses. The prominent
twentieth century biostatistician, Sir Ronald Fisher, illustrated this problem with the following quote: “To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of” (Fisher 1959). This message cannot be overstated. It is critical, wherever possible, to consider the question carefully a priori, so that the study is able to answer the question (Cochran 1977). If you think you might need to consult with a statistician, do so before collecting the data.
For analyses to answer ecological research questions, the experimental design must yield sufficient information about the question of interest. Often, ecological questions involve sets of sampling units taken from a larger group (i.e., the statistical population, hereafter referred to as a population unless otherwise stated). For a given study species, or set of species, sampling units could be defined as individuals, groups, cohorts, communities, or local populations of the species of interest; it depends on the research question. Usually, due to logistical and time constraints, it is not possible nor desirable to make measurements over all objects or the whole population. In these cases, a sample is taken and data collected from the sample are considered to be representative of the population. It is key that the process used to draw the sample is well understood and is ideally random in design. The process of drawing conclusions regarding a population based on a sample from it is called statistical inference.
To make meaningful inferences about the properties of a population, the sampling protocol must yield a sample size that is sufficiently large to represent the population. In addition, the sampling protocol should either eliminate or control significant sources of error including random and systematic error (Cochran 1977; Panzeri et al. 2008). Random error is caused by unknown and unpredictable changes, such as in the environment, in instruments taking measurements, or as a result of the inability of an observer to take the exact same measurement in the same way. Statistical methods typically quantify this error and, in fact, build on it to draw inferences. In some sense, if there was no error then there would be no need for statistics. Of course, the performance of the analytical methods is affected by the amount of error in the data, in that the statistical power to detect significant effects decreases with increasing error, but if there was no error, by definition there would be no questions left to answer and statistics would have no role to play. Systematic error (i.e., bias) is consistent error that is repeatable if the data are recorded again. It can arise from many causes, such as a person consistently making the same erroneous observation (i.e., biased observation; e.g., incorrectly recording male birds as female birds) or an incorrectly calibrated instrument. In behavioral studies, biases in collected data can also be introduced by the presence of the researchers themselves (e.g., through human disturbance in a study on supposedly undisturbed animal vocal behavior). The introduction of bias can be further illustrated in the example of a bioacoustician estimating acoustic cue production rate (i.e., number of cues, such as calls, produced per unit time) for a population. In this example, the researcher obtains samples of animals by locating the animals producing acoustic cues. It is highly likely, however, that the sample collected will be only from animals that are in a sound-producing state (as silent animals will go undetected), hence acoustic cue rate might be inadvertently overestimated. Furthermore, animals may respond to the presence of the researcher by altering their cue production rates, thereby introducing further error to cue rate estimation. Such studies should be designed to remove or control biases. If controls cannot be integrated into the experimental design, then these may be able to be applied at the analytical stage (statistical controls; see Dytham 2011) and estimation of, and adjustments for, unavoidable biases may be made during the analysis. For topics on experimental design (e.g., systematic, stratified-random, and random-block) that aim to reduce biases and increase inferential power, the reader is referred to textbooks such as Lawson (2014), Manly and Alberto (2014), Cohen (2013), Underwood (1997), and Cochran (1977), among many others. It is critical that researchers carefully consider and identify the
• Does the scope of the experimental design match those of the questions?
• Is the sample size large enough given the effect size (see Section 9.5.1.2 for discussion on effect size) being investigated?
• Are the resources (e.g., time, money, and trained personnel) available for the project sufficient to carry out the study?
• Will data be reliable (i.e., accurate and precise) enough to answer the questions?
• Will causes of biases in data collected be able to be identified and removed or addressed adequately?
Fig. 9.2 Checklist of some considerations to determine whether a research question can be answered
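The checklist items on sample size, reliability, and bias can be explored with a small simulation before any fieldwork begins. The Python sketch below mimics the cue-rate example of Sect. 9.3.1: only animals in a sound-producing state are detected, so the naive estimate averages over vocal animals and ignores silent ones. All parameter values (P_VOCAL, VOCAL_RATE, RATE_SD) and function names are assumptions made up for illustration, not values from any study.

```python
import random
import statistics

P_VOCAL = 0.6       # assumed fraction of animals in a sound-producing state
VOCAL_RATE = 10.0   # assumed mean cue rate of vocal animals (cues per hour)
RATE_SD = 3.0       # assumed animal-to-animal variability (random error)
# Approximate population mean with silent animals (rate 0) included.
TRUE_MEAN = P_VOCAL * VOCAL_RATE


def naive_estimate(n_animals, rng):
    """Mean cue rate over detected animals only; silent animals go undetected."""
    detected = []
    for _ in range(n_animals):
        if rng.random() < P_VOCAL:  # the animal is vocal, so it is located
            detected.append(max(0.0, rng.gauss(VOCAL_RATE, RATE_SD)))
    return statistics.fmean(detected) if detected else None


def bias_and_spread(n_animals, n_surveys=400, seed=42):
    """Repeat the survey many times; return (bias, standard deviation) of estimates."""
    rng = random.Random(seed)
    estimates = [naive_estimate(n_animals, rng) for _ in range(n_surveys)]
    estimates = [e for e in estimates if e is not None]
    bias = statistics.fmean(estimates) - TRUE_MEAN   # systematic error (accuracy)
    spread = statistics.stdev(estimates)             # random error (precision)
    return bias, spread


for n in (20, 200):
    bias, spread = bias_and_spread(n)
    print(f"n = {n:3d}: bias = {bias:+.2f} cues/h, spread = {spread:.2f} cues/h")
```

Under these assumptions, a larger sample narrows the spread of the estimates (better precision) but leaves the bias essentially unchanged, which is why increasing sample size alone cannot correct a systematic error.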
most suitable sampling design for their research questions.
Despite all attempts to obtain reasonable sample sizes, minimize biases, and carefully select an appropriate experimental design, data quality is frequently sub-optimal due to logistical or practical constraints. Often unexpected restrictive weather conditions and/or failure of instruments limit data collection during fieldwork. Good planning can mitigate unexpected data limitations, thus wherever possible, there should be contingency plans in place to deal with the unexpected (e.g., budgeting for a reasonable number of poor-weather days or redundancy in instrumentation). Even with careful design and contingencies implemented, data limitations can still occur and may need to be dealt with at the analysis stage. However, as noted before, sophisticated analyses to deal with these are always a second-best option over implementing data collection methods and survey design that are robust to potential limitations. Figure 9.2 gives a list of some considerations to be made for assessing whether research questions can be answered before data are collected.
9.3.2 Instruments and Measurements
Instruments must be able to measure subject behavior and conditions of interest in the study such that estimates derived from the observations have sufficient accuracy and precision to detect the effect(s) of interest. The accuracy of an estimate is its proximity to the true value, while precision refers to the variability of successive estimates of the same quantity. Naturally, to be able to derive accurate and precise estimates, measurements must also be accurate and precise. Accuracy and precision of measurements are evaluated through calibration and testing of the instruments. Some instruments may simply not have the capacity or range required for the study. For example, a low-frequency acoustic recorder will not have the capacity to measure the acoustic behavior of bats, which produce high-frequency echolocation signals. While careful consideration must be made in selecting instrumentation, considerable advances in their capacities have been made over recent decades. Instrumentation in bioacoustical studies is discussed in detail in Chap. 2. Below is a checklist for evaluating whether the selected instrumentation will collect the required data for a project (Fig. 9.3).
9.3.3 Preliminary Data Exploration
Data quality resulting from the experimental design, selected instrumentation, and measurements must be checked through data exploration and visualization (e.g., graphics, spectrograms) before embarking on planned
• Do the instruments have the sensitivity (i.e., sufficiently low noise floor and thus sufficiently low amplitude that can be recorded), dynamic range (i.e., range of amplitudes that can be recorded), frequency range (for sound recorders), and field robustness required for the study?
• Is there a quality-control process to ensure that instrument accuracy and precision can be measured over time (e.g., systematic calibration and testing)?
• Are the instruments reliable in that they will not result in significant sets of missing or biased data?
Fig. 9.3 Checklist of example considerations for selecting instrumentation for a bioacoustical study
analyses. It can be said that it is never early enough to explore data, nor can there be too many graphs involved in doing so. In fact, a preliminary exploration of data should always be conducted at the beginning of data collection to allow the structure of the data to be investigated, including the presence of anomalous data points, missing values, and potential biases. By identifying these early in the study, unforeseen design, sampling, or instrumentation issues can be rectified. Preliminary exploration of data, after data collection has been completed, will allow for any remaining anomalies and biases to be identified and planned analyses refined. Suspicious observations can be introduced at different stages of the research, for instance through: (1) data entry error, (2) changes in the measurement methods, (3) experimental error, or (4) some unexpected, but real variation. For the first three cases, the anomalous value(s) might be removed before analysis. In the last case, there could be some biologically important reason for the observed unexpected values. Sometimes the word “outlier” is used to refer to these suspicious observations, but we prefer to avoid the term. An outlier implies something that was unexpected, but only after defining what would be expected can we decide what the word “outlier” means. Often “outliers” are very informative and can even lead to new research questions. Consequently, it is important to understand how anomalies have occurred and to ascertain whether they should be removed or not. A good and honest approach, with little added cost, is to present and discuss the results of an analysis with and without those observations. This approach provides useful information about the practical consequences of the presence of anomalous observations.
If sufficiently large gaps in information from missing values occur, the data may not be representative of the larger population, especially since it might be hard to determine after the survey whether the data were missing at random. Similarly, if measurements were collected under certain conditions (e.g., poor weather or noise), the data cannot typically be used to make inferences outside this range of conditions (which would be referred to as extrapolation). Finally, data of very poor quality may not be salvageable, and, as mentioned before, it is far preferable to get the data right in the first place than to trust analytical solutions to deal with problems introduced at the data collection stage. Data exploration and visualization are further discussed in Sects. 9.4 and 9.5.
9.4 Data Types and Statistical Concepts
Regardless of the analytical approaches used, there are some fundamental terms and concepts that need to be understood before embarking on analyses.
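The advice in Sect. 9.3.3 to present results with and without suspicious observations can be sketched in a few lines of Python. This is a minimal illustration only: the call durations, the flagged index, and the helper name `summarize` are hypothetical and are not taken from any dataset in this chapter.

```python
from statistics import mean, stdev

def summarize(values):
    """Return the sample mean and sample standard deviation."""
    return mean(values), stdev(values)

# Hypothetical call durations in seconds. The fifth value is a suspected
# data entry error (for example, 0.98 typed with a misplaced decimal point).
durations = [0.92, 1.05, 0.98, 1.10, 9.80, 1.01, 0.95]
flagged = {4}  # positions of the suspicious observations (assumed)

with_all = summarize(durations)
without_flagged = summarize(
    [d for i, d in enumerate(durations) if i not in flagged]
)

print(f"all observations:      mean={with_all[0]:.2f} s, sd={with_all[1]:.2f} s")
print(f"without flagged value: mean={without_flagged[0]:.2f} s, sd={without_flagged[1]:.2f} s")
```

Reporting both lines side by side shows the practical consequence of the anomalous value without silently discarding it.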
326 C. Salgado Kent et al.
9.4.1 Variable Types and Their Distributions
Measures of observations or conditions of interest in a study can be called variables. For instance, variables can be measurable properties of animals, their behaviors, or their environment. In a study of the acoustic characteristics of elephant vocalizations recorded at different ranges from the animal, relevant variables might include the range between the microphone and the elephant, the subject (i.e., which animal it is), the sound type, the received sound level, the spectral characteristics of the sound at the receiver locations, and the acoustic characteristics of the environment between the elephant and the receiver. In general, a researcher will have a good idea about the plausible values for the variables of interest, and hence what range of values to expect, but not know the exact values before the observations are made. Variables of known expected range but whose exact values are unknown until observed are random variables by definition. The notion of “outlier” is related to this expectation, as “unexpected” values might be considered suspicious. Within a regression context (see Sect. 9.4.3 for more detail), the variables that represent the outcome of interest are called dependent variables or response variables. When they represent the conditions that influence the outcome, they are called independent variables or explanatory variables, sometimes known as predictors or covariates. Hereafter we use all terms to discuss variables, choosing each time the definition we feel will help to make the meaning of a concept most intuitive.
Variables can be of two types: (1) categorical, which can be further subdivided into nominal or ordinal (if there is an order), and (2) numerical, which could be discrete or continuous. Categorical variables are often called factors and are qualitative. For example, if the variable was a sound type produced by a bird categorized as either song or chirp, then sound type would be a nominal factor with two levels, also called a binary variable. If the bird species was known to produce three different sound types, then the corresponding factor would have three levels. Numerical variables are quantitative, and can be discrete (e.g., integers such as counts) or continuous (where, by definition, an infinite number of values are possible between any two values). Examples of continuous variables are the height and weight of an individual or pressure and temperature, while the number of sounds or the number of individuals are examples of discrete variables. A summary of variable classification and metrics is given in Table 9.2.
Properties of these variables, such as central tendency measures like the mean, mode, and median, or measures of spread like variance and standard deviation, are statistics that can be used to describe a sample of values. When these refer to the values that these quantities have in the population (as distinct from a sample of that population), these properties are called parameters.
Often, additional variables are collected that are not necessarily of interest in explaining a research question but could influence the response variables. For example, while a bioacoustician might be interested in measuring the rate of vocalization of chicks as a function of the parents’ presence, the frequency of predator visitation could also influence vocalization rates. In this example, collecting information on the main independent variable (parent presence) and the variable not of direct interest (predator presence) would be considered important to capture all variables influencing vocalization rate. Some of these variables might be of direct interest, but some might just be included in a study because they can affect the response, and if ignored, would confound the results. For this reason, they might sometimes be referred to as confounding factors or confounding effects. Note that these terms and their definitions vary with discipline (e.g., there is some discussion about the exact definition of a covariate; see Salkind 2010) and analytical software, and sometimes are used interchangeably. Therefore, the reader should make sure that, when reading a source or when reporting their own results, the context provides the required clarity for the wording chosen.
Not only are variables described according to the properties they measure and whether they are
independent or dependent variables, but in the context of some analytical methods (e.g., linear regression models and their extensions) they are also described by whether they represent a specific or random set of values. Generally, in statistics, a variable with a value that is not known before it is observed (e.g., peak frequency of a call or number of animals in a group), but of which the range of possible values is known (e.g., a positive continuous number like the amplitude of a lion’s roar), is known as a random variable, as described above. Its range of possible values is referred to as the domain of the random variable.
A random variable can be characterized by its probability distribution, which describes the probability of observing values in a given range of the domain of the variable. An infinite number of distributions exist, but some, given their useful properties, are widely used. These distributions are given names so that we can easily refer to them. Arguably, the most widely used are the Gaussian distribution (perhaps more often known as the normal distribution, but since there is nothing normal about it and it induces practitioners to think there might be, we avoid the term here), gamma distribution, and beta distribution, used to model continuous data; while the Poisson distribution, negative binomial distribution, and binomial distribution are useful when modeling discrete values. The uniform distribution is one in which all values in the domain are equally likely and can be either continuous or discrete. These distributions are typically defined by their parameters. As an example, the Gaussian distribution is defined by its mean and standard deviation, whereas the Poisson distribution is defined by its mean only. Given the parameter values that define a random variable, all the characteristics of the random variable are unambiguously defined.
Values of a discrete variable are characterized by a probability mass function (pmf). A pmf is a function that gives the probability that a single realization of the variable takes on a specific discrete value. The number of vocalizing individuals detected in an area might be approximated by a Poisson random variable, characterized by its mean (such as 3.7 individuals). The Poisson distribution is special in that its variance is equal to its mean, a restriction that means that often it does not fit biological data well, where larger variance than the mean is the norm.
In contrast, continuous variables can be characterized by a probability density function (pdf). In the instance of a variable such as the change in duration of song, the pdf might be represented by a Gaussian distribution, a bell-shaped curve characterized by its mean and standard deviation. For example, the variable “change in song duration” could have a true mean change in duration of 240 s and a true standard deviation of 12 s. These true values are generally unobserved, but we would like to estimate them. A single measurement of change in song duration by a researcher could produce a value of 228 or 271 s. These single values are referred to as realizations of the random variable. Pdfs provide information about how the values are distributed before they are observed. Further
Fig. 9.4 Examples of samples taken from different distributions. The Gaussian, gamma (defined by its shape parameter k and scale parameter θ) and beta (defined by shape parameters α and β) are continuous distributions, represented with histograms. The Poisson (defined by its mean) and binomial (defined by n independent experiments and outcome success probability p), represented with barplots, are discrete distributions. Note some distributions can be special cases of others. As an example, the beta distribution, with shape parameters α = 1, β = 1 is shown, illustrating the fact that it is equivalent to a uniform distribution
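Samples like those summarized in Fig. 9.4 can be simulated to check the properties stated in Sect. 9.4.1. The sketch below assumes NumPy is available and reuses the mean of 3.7 individuals and the Gaussian mean of 240 s and standard deviation of 12 s from the text. The gamma shape of 2 used to create overdispersed counts, the sample size, and the seed are arbitrary assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed so the sketch is repeatable
n = 100_000

# Poisson counts, e.g., number of vocalizing individuals with mean 3.7.
poisson = rng.poisson(lam=3.7, size=n)

# Overdispersed counts with the same mean: a gamma-Poisson mixture, which is
# one way to generate negative binomial counts. The shape value is assumed.
shape = 2.0
rates = rng.gamma(shape=shape, scale=3.7 / shape, size=n)
overdispersed = rng.poisson(lam=rates)

# Gaussian "change in song duration" with mean 240 s and sd 12 s.
gaussian = rng.normal(loc=240.0, scale=12.0, size=n)

# Beta(1, 1) is the continuous uniform distribution on [0, 1].
beta11 = rng.beta(a=1.0, b=1.0, size=n)

samples = [("Poisson", poisson), ("gamma-Poisson", overdispersed),
           ("Gaussian", gaussian), ("Beta(1,1)", beta11)]
for name, x in samples:
    print(f"{name:14s} mean={x.mean():9.3f}  variance={x.var(ddof=1):9.3f}")
```

The Poisson sample should show a variance close to its mean, while the mixture keeps roughly the same mean with a clearly larger variance, which is the pattern the text describes as the norm for biological counts.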
examples of distributions are given in Fig. 9.4. The reader is referred to Quinn and Keough (2002) for a good introduction to useful probability distributions in biostatistics.
9.4.2 Estimators and Their Variance
In this section, we introduce estimators and related concepts because we will need them later, but we note that we do so very briefly, just so that the terms do not come as a surprise. The reader is referred to Casella and Berger (2002) for further details on statistical inference, estimators and their variance.
As discussed previously, a parameter is a quantity relating to the population of interest. When performing statistical inference, we want to estimate the parameters in the population (e.g., the mean cue production for a species of whale) using samples (e.g., a sample of acoustic tags put on whales). To estimate parameters, we use
estimators. An estimator is a formula that we can use to estimate a parameter based on a sample. In the case of estimating the population mean, the estimator is, not surprisingly, the well-known formula for the sample mean. Estimators are therefore based on random variables, in the sense that each time we collect a sample we would get a new observed value (i.e., a new estimate). Thus, an estimator can also be thought of as a sample statistic that estimates a population parameter such as the mean. If we collected infinite samples and computed the estimator each time, we would get the estimator sampling distribution, from which we could evaluate the bias and the variance of an estimator. Collecting infinite samples is not possible, but by understanding the properties of the estimator and the design used to collect the data, we can also quantify the variability associated with an estimator, based on a single sample. Variability is a key attribute of an estimator, and the resulting estimate from the single sample (known as the point estimate) is not enough to provide a full representation of it. For example, it is very different to say that we estimate a cue production rate to be 7.2 sounds per hour, than to provide the additional information that it could vary from 7.1 to 7.2, or that it could vary from 1.2 to 27.7. In the first example we have a small variance, and in the latter we have such a large variance that the estimator itself is borderline useless. To compute an estimator’s variance, there are two main approaches. If the estimator and the process by which we collect the sample are simple enough, we have standard formulae for the variance. That is the case for the sample mean from a simple random sample. However, often in practice, that is not the case, say because the sampling procedure is convoluted, there is a hierarchy in the process, or the estimator is composed of several random components, possibly not independent among themselves. A good example is an animal density estimator from Passive Acoustic Monitoring (PAM), where different random components like encounter rate, detection probability, cue rate, and false-positives might be at play (see Sect. 9.6.2 for a PAM density estimation example). In such cases, resampling techniques like the bootstrap might be considered. The rationale behind the bootstrap is that one can resample with replacement from the original sample, and the variability of the estimates computed over the resamples is an estimate of the estimator variability. The reader is referred to Manly (2007) for further details about these procedures. While variance is commonly reported, when comparing variances of quantities that have different means, the coefficient of variation (CV), which is the standard deviation divided by the mean, can be useful. The CV is typically reported as a percentage (%CV = standard deviation / mean × 100).
9.4.3 Modeling
In its most simplistic form, a model is a mathematical generalization of the relationship among processes (Ford 2000). Models are by necessity a simplification of reality. Extending a quote popularized by George E. P. Box (1976), all models are strictly wrong, in that they are always oversimplifications of reality, but many models are useful, in that they provide useful explanations or predictions of reality. Models can either be empirical or theoretical. A common example of a theoretical model in acoustics is the piston model used to represent the beam pattern in a directional sound source like the dolphin biosonar system (Zimmer et al. 2005). While theoretical models are based on theory, empirical models are based on observations. Here we will focus discussion on empirical models as observed data are commonly used to fit models to describe bioacoustical processes. Models describing the relationships between whale vocalization rates and season or location (Warren et al. 2017) or dolphin occupancy and pile driving noise (Paiva et al. 2015) are examples of empirical models. Another example is a mathematical equation that describes the number of bird calls recorded within a given period as a function of the number of birds present. By identifying the mathematical relationship between variables, past events can be explained and future scenarios predicted. However, finding such an association requires careful interpretation, especially in observational
studies. Finding an association between two variables, or among a set of variables, does not necessarily imply causation. The association could be spurious, or it could be induced by a variable that was not recorded. It is a statistical capital sin to confuse correlation with causation. For example, on hot days, the consumption of ice creams increases, and so does the number of fires. But you can eat an ice cream guilt-free as you will not cause a fire!
9.4.3.1 Introduction to Regression: The Cornerstone of Statistical Ecology
Arguably, the most common and most useful class of statistical models are regression models. The simplest regression model (i.e., the Gaussian linear regression model) has three basic components: (1) a dependent variable that is to be modeled (i.e., described or explained), (2) independent variables that are thought to influence the dependent variable, and (3) a random error. The third component, the random error, distinguishes statistical models from deterministic mathematical models. The random error captures how the model differs from the actual observations. In other words, it measures how well, or badly, our model describes reality. Written as a mathematical expression, the simple regression model looks like this:
$$Y = \alpha + X\beta + \varepsilon \qquad (9.1)$$
where Y is the response variable, α is the intercept (a constant), X is the fixed independent variable, β is the regression coefficient for the fixed independent variable that describes the rate of change of the response variable as a function of the independent variable, and ε is the random error. In general, the parameters α and β are not known and must be estimated based on data.
Most variables, particularly in ecology, are influenced by many covariates, and hence models can include multiple independent variables. For instance, in a study on whether the vocalization rate of sea lions differs with sex and age, vocalization rate (i.e., number of vocalizations per unit time) would be the response (dependent) variable and sex and age the explanatory (independent) variables. In addition to having these two explanatory variables of direct interest, other variables may also be relevant to include in models, because they might a priori be expected to also influence the response variable. Variables that may affect vocalization rate may include time, season, social context, or location. Studies in which multiple explanatory variables influence the outcome might have interactions between the explanatory variables that are important to consider. For instance, vocalization rate may differ between male and female sea lions, but only for sub-adults and adults and not for pups and juveniles.
In a regression model, a distribution is typically assumed for the response variable. This will induce a distribution for the random errors. Historically, regression models considered the errors of the dependent variable to be Gaussian distributed, and much of regression theory was developed under this assumption. Note that a model assuming a Gaussian error distribution in the dependent variable is commonly simply referred to as a linear model. Nowadays many generalizations to linear models exist (as described below and see Zuur et al. 2009 for common examples in ecology; see Generalized Linear Models in Sect. 9.5.3 below). Arguably, as noted above for random variables, the more commonly used distributions in regression models are Gaussian and gamma for continuous data, Poisson and negative binomial for counts, binomial for binary data, and beta for proportions (or probabilities), but many others exist. As for linear models, generalizations assuming other distributions associated with the response variable and associated error structure are commonly referred to by their distributions. For example, a model for counts of animals that assumes a Poisson-distributed response variable and associated error structure is commonly referred to simply as a Poisson model. A gamma model might be used to model continuous positive values resulting from measurements of duration of a recorded song. Values representing the probability of producing a sound (between 0 and 1), however, might be modeled assuming a beta distribution.
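The simple regression model of Eq. 9.1 and the bootstrap idea from Sect. 9.4.2 can be combined in a short, hedged sketch. The data are simulated, the "true" values of α, β, and the error standard deviation are assumptions chosen for illustration, and `fit_line` is a hypothetical helper that applies the standard least-squares formulas for one explanatory variable. The bootstrap step resamples (X, Y) pairs with replacement and refits the line, so the spread of the refitted slopes approximates the variability of the slope estimator.

```python
import random
from statistics import mean, stdev

def fit_line(x, y):
    """Least-squares estimates of the intercept (alpha) and slope (beta)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return y_bar - beta * x_bar, beta

random.seed(42)  # fixed seed so the sketch is repeatable

# Simulated data: the true alpha, beta, and error sd are assumed values.
true_alpha, true_beta, error_sd = 2.0, 1.5, 3.0
x = [random.uniform(0, 20) for _ in range(200)]  # e.g., number of birds present
y = [true_alpha + true_beta * xi + random.gauss(0, error_sd) for xi in x]

alpha_hat, beta_hat = fit_line(x, y)

# Nonparametric bootstrap: resample (x, y) pairs with replacement and refit.
pairs = list(zip(x, y))
boot_betas = []
for _ in range(1000):
    resample = random.choices(pairs, k=len(pairs))
    bx, by = zip(*resample)
    boot_betas.append(fit_line(bx, by)[1])

se_beta = stdev(boot_betas)            # bootstrap standard error of the slope
cv_percent = 100 * se_beta / beta_hat  # coefficient of variation, in percent
print(f"alpha_hat={alpha_hat:.2f}, beta_hat={beta_hat:.2f}, "
      f"bootstrap SE(beta)={se_beta:.3f}, %CV={cv_percent:.1f}")
```

For a model this simple a standard formula for the slope variance exists, so the bootstrap is not strictly needed here. It is shown because the same resampling loop carries over to estimators with several random components, where no simple formula is available.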
Fig. 9.5 Checklist of some considerations for defining variables in your study
purpose is, by way of examples, to provide a taste of the explosion of tools developed over the past few decades, the lively discussion that has arisen from their varied and inherent limitations, and the resulting developments in statistical approaches. The reader is directed to the wide range of available statistical textbooks and scientific papers to gain an in-depth understanding of the full range of approaches, their underlying concepts, and their correct use, limitations, and interpretation of outputs.
9.5.1 Descriptive and Exploratory Research Questions
Having defined the question (Sect. 9.2) and identified the variable types and some of their attributes (Sect. 9.4), tackling the analyses is the natural next step. For descriptive and exploratory questions and preliminary data exploration, summary statistics and graphical visualizations provide information about the attributes of variable measures and patterns and relationships in data. The information relates only to the properties of the observed data. Analyses that aim to generalize a sample to a population require inferential, explanatory, and predictive type analyses (discussed in Sects. 9.5.2 and 9.5.3).
9.5.1.1 Univariate Summary Statistics and Graphical Visualization
Exploration and visualization in their simplest forms are undertaken by evaluating each variable on its own (Fig. 9.6). Analyses of single variables are called univariate analyses and are used for representing and summarizing the characteristics of the variable in question. For example, univariate exploratory statistics describe a variable’s properties such as statistics for central tendency including the mean (note that there are different types of means; e.g., arithmetic, geometric, and harmonic), median, or mode, and spread of data including the range (maximum and minimum), variance, standard deviation, skewness (degree of asymmetry), kurtosis (i.e., how heavy-tailed, or loosely how peaked, a distribution is), or interquartile range (see Table 9.3). Data corresponding to a single variable can be summarized and explored using a range of graphing tools, such as histograms, box plots, bar charts, or scatterplots. Additionally, geographical data can be explored on maps and marine charts, and acoustic spectral characteristics on spectrograms (representing signal strength over different frequencies over time). As noted previously, it is (arguably) almost impossible to produce too many graphs at an exploratory stage: the more that you can learn about your data, the better. The reader is referred to standard statistical textbooks for information on the large range of summary statistics and graphical visualizations available (e.g., Zuur et al. 2007; Zuur 2015; Rahlf 2019 for examples in R).
9.5.1.2 Bivariate and Multivariate Descriptive Statistics
The analyses of two variables together are called bivariate analyses. For instance, exploration and visualization of a given variable as a function of another variable to investigate possible correlation is a bivariate analysis (see Fig. 9.7). A practical example of a bivariate visualization is the use of box plots to visualize the distribution of call types (one variable) as a function of age class (a second variable), or a scatterplot of a recorded acoustic cue rate as a function of time of day. Following this logic, multivariate analyses
Fig. 9.6 Example of univariate data visualizations of dolphin sounds detected: (left) scatterplot and (right) line chart.
Data source: WAMSI as part of Project 1.2.4 (Brown et al. 2017)
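The univariate summary statistics listed in Sect. 9.5.1.1 can be computed with the Python standard library alone. The sketch below is illustrative only: the daily counts are hypothetical and are not the data shown in Fig. 9.6, and the skewness is computed with the simple moment-based formula, which is one of several conventions.

```python
from statistics import (mean, median, mode, variance, stdev,
                        quantiles, geometric_mean, harmonic_mean)

# Hypothetical counts of dolphin sounds detected per day.
counts = [12, 15, 9, 15, 22, 18, 15, 30, 11, 14]

q1, q2, q3 = quantiles(counts, n=4)  # quartiles (default "exclusive" method)

# Moment-based skewness: third central moment over the 1.5 power of the second.
m = mean(counts)
m2 = sum((c - m) ** 2 for c in counts) / len(counts)
m3 = sum((c - m) ** 3 for c in counts) / len(counts)
skewness = m3 / m2 ** 1.5

summary = {
    "arithmetic mean": m,
    "geometric mean": geometric_mean(counts),
    "harmonic mean": harmonic_mean(counts),
    "median": median(counts),
    "mode": mode(counts),
    "range": (min(counts), max(counts)),
    "variance": variance(counts),         # sample variance (n - 1 denominator)
    "standard deviation": stdev(counts),  # sample standard deviation
    "interquartile range": q3 - q1,
    "skewness": skewness,
}
for name, value in summary.items():
    print(f"{name:20s} {value}")
```

A positive skewness here reflects the single day with an unusually high count, which a histogram or box plot of the same values would also reveal.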
naturally consist of the joint analysis of multiple variables. Visualization tools and summary statistics can also be applied to multivariate analyses. For instance, two and three-dimensional scatterplots, bar charts, stacked bar charts, and multiple line graphs can display statistics and spread of data as a function of multiple variables on the same figure.
When bi- or multivariate analyses aim to explore associations and patterns, the magnitude of the association can sometimes be quantified. For example, in a bivariate analysis, the magnitude of the linear relationship between two variables can be quantified using a statistic called Pearson’s correlation coefficient (r). The magnitude of an association such as this one is often referred to as an effect size. For example, Pearson’s correlation coefficient is a standardized metric ranging from −1 to 1; with a perfect negative association yielding a value of −1, no association 0, and a perfect positive association a value of 1. In some disciplines, conventional criteria have been suggested to classify effects as small, medium, and large (see Cohen 1988). What may be in one study considered a large effect (say, r > 0.6), however, may not necessarily be in another study (where say, r > 0.8 might be considered large). Consequently, evaluating what is a meaningful effect size that a study aims to detect should always guide the design of a study and interpretation of its outcomes. It is a question that the researcher
Fig. 9.7 Example of bivariate data visualizations of dolphin sounds detected during July 2014, plotted against date: (left) scatterplot, (middle) box plot, and (right) bar chart with standard error bars. Data source: WAMSI as part of Project 1.2.4 (Brown et al. 2017)
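Pearson’s correlation coefficient, discussed in Sect. 9.5.1.2, can be computed directly from its definition. The following is a minimal sketch in which `pearson_r` is a hypothetical helper and the values are made up to show the limiting cases. The last case is included as a reminder that r measures linear association only: a perfectly U-shaped relationship can still give a coefficient of zero.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]

r_positive = pearson_r(x, [2 * v + 1 for v in x])   # perfect positive line
r_negative = pearson_r(x, [-2 * v + 1 for v in x])  # perfect negative line
r_u_shape = pearson_r(x, [v ** 2 for v in x])       # strong but non-linear

print(f"positive line: r = {r_positive:.2f}")
print(f"negative line: r = {r_negative:.2f}")
print(f"U-shaped:      r = {r_u_shape:.2f}")
```

Plotting the data before computing r, as recommended throughout this section, is what reveals the third kind of relationship.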
should answer based on their biological knowledge and is not related to statistical considerations.
When a study’s goal is to explore associations and patterns among many variables, analyses become more complex. Multivariate approaches are commonly used to reduce many variables to a few key ones. This is known as dimension reduction. Multivariate approaches are also used to explore relationships and clustering, and to classify objects based on common multiple variable attributes. A good source for additional details on multivariate methods is Borcard et al. (2011).
One of the most common analyses used for dimension reduction is principal components analysis (PCA). The name of the method is derived from the fact that new variables, known as principal components, are obtained from the set of original variables. For example, a researcher may be interested in exploring whether populations of a social insect, such as a species of ant, can be determined based solely on acoustic signals (e.g., stridulations) its individuals produce for communication. In this case, a range of variables might be measured, such as pulse duration, bandwidth, minimum and maximum frequency, and intensity, to name a few. In acoustics, a large number of variables might be measured to capture the full range of characteristics of acoustic signals. Consequently, using a data reduction method that captures most of the variance in these variables with just one or two new variables (called principal components in PCA) makes the exploration of patterns in sound characteristics easier. The first principal component captures the largest share of the original variance, followed by the second component, and so forth. These principal components are sometimes called factors. Factors 1 and 2 can be plotted against each other, and distinct groupings of plotted values for different populations would be suggestive of differing characteristics in stridulations among populations. To statistically test differences, PCA might be used to generate factor scores as inputs into inferential, explanatory, and predictive analyses (e.g., a regression analysis). Note that there are many dimensionality reduction approaches (see Van der Maaten et al. 2007), and researchers planning on using these tools should acquaint themselves with the wide range available today, their conditions of use, and their limitations. While one approach may be suitable given the attributes of one dataset, another may be required for a different dataset.
Clustering and classification analyses assign objects into groups based on measured attributes (variables). Cluster analyses form groups (McGarigal et al. 2000; Zuur et al. 2009) using
The answer to this is always YES. Data always need to be checked for quality and attributes, and if the question requires inference or empirical models, the validity of assumptions needs to be checked (see Sections 9.3.3 and 9.4)!
Fig. 9.8 Checklist of some considerations for identifying approaches for descriptive and exploratory questions
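The dimension reduction by PCA described in Sect. 9.5.1.2 can be sketched with NumPy. This is an illustration under stated assumptions rather than a recommended workflow: the stridulation measurements are simulated, the variable names and the shared `latent` source of variation are invented for the example, and the components are obtained from an eigen-decomposition of the correlation matrix (variables standardized first), which is one common choice among several.

```python
import numpy as np

rng = np.random.default_rng(seed=7)  # fixed seed so the sketch is repeatable

# Simulated measurements of 60 stridulations. Three variables share a common
# source of variation ("latent"); minimum frequency varies independently.
n = 60
latent = rng.normal(size=n)
X = np.column_stack([
    20 + 4.0 * latent + rng.normal(scale=1.0, size=n),  # pulse duration (ms)
    5 + 1.0 * latent + rng.normal(scale=0.5, size=n),   # bandwidth (kHz)
    2 + 0.2 * rng.normal(size=n),                       # minimum frequency (kHz)
    7 + 1.0 * latent + rng.normal(scale=0.5, size=n),   # maximum frequency (kHz)
])

# Standardize each variable so that measurement units do not dominate.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix of the original variables.
corr = np.cov(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]  # sort components by variance, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()  # proportion of variance per component
scores = Z @ eigvecs                 # principal component scores per call

print("proportion of variance explained:", np.round(explained, 3))
print("scores of the first call on components 1 and 2:", np.round(scores[0, :2], 3))
```

The first two columns of `scores` are what would be plotted against each other to look for groupings among populations, and they could also serve as inputs to a later regression analysis.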
follows directly from their training, or just by convenience and actually not having thought much about the philosophical ramifications of their choice. Sometimes they are rather inflexible in their beliefs (be it in one or the other camp). We recommend a more pragmatic approach in practice. Depending upon the problem at hand, one or the other framework might be more suited to the question, easier to implement, or more sensible for incorporating all available information (Nuzzo 2014; Ortega and Navarrete 2017). Consequently, we believe that the modern bioacoustician should have a basic understanding of the differences between frequentist and Bayesian approaches, and suggest that rather than only being frequentist or Bayesian, a pragmatic approach be taken. Below, we provide a very brief introduction to statistical inference applied to parameter estimation and hypothesis testing.
9.5.2.1 Parameter Estimation
There are a range of approaches to estimate population parameters, such as the population mean or variance, or a shape or scale parameter of a distribution, from a sample. In the context of ecological modeling, the frequentist approach to estimating parameters typically uses maximum-likelihood (Hilborn and Mangel 1997). In Maximum Likelihood Estimation (MLE), parameter values of a distribution are estimated by maximizing the likelihood function, so that the maximum likelihood estimates are the parameter values under which the observed sample data are most probable. An alternative method is Least-Squares Estimation (LSE), where a solution that minimizes the sum of the squares of the residuals (the difference between the observed values and those obtained using the fitted model) is obtained. For a Gaussian-distributed response variable, and several other simple examples, the LSE solution is equivalent to the MLE. Nowadays LSE is mostly introduced for teaching purposes, and most implementations use maximum likelihood.
As indicated above, the Bayesian framework combines information on the likelihood of an outcome using observed data with prior information on the distribution of the unknown parameter being estimated. The prior distribution can be an assumption based on the researcher’s understanding and experience of the parameter before the study began or it can be based on the results from a pilot or previous study. Often the prior distribution simply reflects a lack of knowledge and may be uniform over all the possible values the parameter of interest might take (i.e., the parameter space). A posterior distribution (i.e., updated understanding) is attained by multiplying the prior distribution function with the likelihood
9 Fundamental Data Analysis Tools and Concepts for Bioacoustical Research 337
function and scaling the result to provide a prob- 9.5.2.2 Hypothesis Testing
ability distribution function. All the inferences are While hypothesis testing has been traditionally
then based on this posterior distribution. The pos- undertaken using a frequentist approach (called
terior distribution thus can be seen as a compro- null hypothesis significance testing, NHST),
mise between the prior information and the equivalent Bayesian approaches are increasingly
information contained in the data, expressed via applied. This section focuses on providing a brief
the likelihood function. There are various introduction to NHST as a foundation and
resources available for further reading on the provides references for further reading on Bayes-
Bayesian framework. Ellison (2004) provides an ian approaches. These basic concepts are
excellent and gentle introduction to the use of introduced here with examples of their applica-
Bayesian methods in ecology, while McCarthy tion to test statistics (i.e., statistics values used to
(2007) provides a more thorough overview. reject or support a null hypothesis), however, they
Stauffer (2007) gives an in-depth introduction to are also an integral part of modeling and model
Bayesian and frequentist statistical research selection in explanatory and predictive questions
methods and Gelman et al. (2013) discuss Bayes- (discussed in Sect. 9.5.3).
ian data analysis. Statistical Rethinking by NHST constitutes a widespread paradigm
McElreath (2020) is a comprehensive treatment under which research has been conducted
for a reader wanting to become fully versed in the (NHST, Fisher 1959), however, it is often not
Bayesian philosophy, including R code to explore used sensibly, and frequently blindly used and
all the key concepts. abused. In some of these cases, pressure on
When inferential methods, such as those researchers to find statistically significant effects
introduced above, are used to estimate parameters has resulted in poor research practices (see Nuzzo
from sample data, the inferences we draw from 2014; Beninger et al. 2012 for detailed
them are uncertain. Confidence intervals (CIs; a discussions on the topic). Applying NHST to
frequentist approach) and credible intervals (CrIs; reasonable hypotheses and qualifying results
Bayesian counterparts) are tools for expressing our according to the limitations and assumptions of
uncertainty about parameter estimates. Confidence NHST, however, can produce important new
intervals, although more widely used, are arguably knowledge. To achieve this, an understanding of
more difficult to interpret than credible intervals. how NHST works is required. Here we provide
Confidence intervals give information based on insight into the framework by way of example.
our sample estimate, and by definition, if we Under the NHST framework, researchers put
repeated the procedure many times, 95% would forward a hypothesis (i.e., proposed explanation)
include the true parameter value. Note a 95% CI about the phenomena being studied based on a
does not mean that 95% of the observations lie study question. Let us say the researchers’ ques-
within the interval, nor that the probability of the tion is “Do seal pup call rates differ between night
true value of the parameter being in the estimated and day?” The null hypothesis (H0) is that call
interval is 0.95. After you estimate the confidence rates do not differ between night and day, and the
interval, the true parameter value either is, or is not, corresponding alternative hypothesis (HA) is that
in the interval, even if we do not know which it pup call rates do differ between night and day.
is. In contrast, 95% CrIs would represent a range of Note that this hypothesis implies a two-tailed test,
values for which there is a 0.95 probability that the one for which the null hypothesis is rejected if a
parameter falls in that range. Ironically, what this positive or a negative effect (i.e., a large or small
means is that while most people use frequentist value of the test statistic) is found. In contrast, a
confidence intervals, they often interpret them, one-tailed test would be used by a researcher
incorrectly, as credible intervals. Although credi- interested only in the difference between groups
ble intervals are intuitively easier to understand, in a specific direction (e.g., “Are call rates greater
they can be more difficult to calculate than confi- during the day than at night?”).
dence intervals.
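The difference between two-tailed and one-tailed tests can be sketched numerically. The Python sketch below assumes, as in the seal pup example developed in the following paragraphs, that the test statistic T is the number of animals out of n = 100 that call more by day than by night, and that under H0 T follows a binomial distribution with p = 0.5. The quantity computed is the probability, under H0, of a result at least as extreme as the one observed (the p-value discussed later in this section). The function names (binom_pmf, p_value) are illustrative only and do not come from any library; doubling the smaller tail is just one common convention for a two-tailed probability, and it is exact here only because the distribution is symmetric when p = 0.5.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_value(t, n=100, p0=0.5, tail="two"):
    """Probability under H0 of a result at least as extreme as t.

    tail="two"     : extreme in either direction (two-tailed test)
    tail="greater" : only large values of T count as extreme
    tail="less"    : only small values of T count as extreme
    """
    lower = sum(binom_pmf(k, n, p0) for k in range(0, t + 1))  # P(T <= t)
    upper = sum(binom_pmf(k, n, p0) for k in range(t, n + 1))  # P(T >= t)
    if tail == "greater":
        return upper
    if tail == "less":
        return lower
    # Two-tailed: double the smaller tail, capped at 1.
    return min(1.0, 2 * min(lower, upper))


# Two-tailed question: "Do call rates differ between night and day?"
p_two_46 = p_value(46)  # close to the expected count of 50
p_two_11 = p_value(11)  # far from the expected count of 50

# One-tailed question: "Are call rates greater during the day than at night?"
p_greater_46 = p_value(46, tail="greater")

print(f"two-tailed, T = 46: {p_two_46:.3f}")
print(f"two-tailed, T = 11: {p_two_11:.2e}")
print(f"one-tailed (greater), T = 46: {p_greater_46:.3f}")
```

A count near 50 gives a large probability in the two-tailed case, whereas a count far from 50 in either direction gives a very small one. In the one-tailed case, only counts well above 50 would count as evidence against H0.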
In this example, the researchers cannot measure the call rates of all animals in the population, so they collect a random sample, say of 100 animals. Sampling at random is key to collecting data that represent the broad population, thereby avoiding biases in the parameter estimates. In this example, on a given day, for each animal, the researchers record the number of calls produced during daylight hours and during the night. Let us call the event, in which for a given animal there are more calls during the day than at night, a “success.” If we assume animals operate independently, then the number of successes in the 100 animals provides information about the null hypothesis: the further the number of successes is from the number expected if there were no differences between night and day, the stronger the evidence against H0. We also assume that the probability of a success is constant and independent across trials and animals. Under H0 we assume the probability of a success is p = 0.5. Under H0, the number of successes has a binomial distribution with parameters n (the sample size) and p. The corresponding probability mass function with n = 100 and p = 0.5 is illustrated in Fig. 9.9.
To test the null hypothesis, the researchers use the number of successes as a test statistic. The test statistic has information about the null hypothesis, and under the null hypothesis, we know the distribution of the test statistic. If call rates are on average the same during the night and day (i.e., H0 is true), then we would expect that animals have a probability of 0.5 of producing more calls during the day than at night, and on average T (number of successes) would equal 50 (T = 50).
Now imagine that the researchers observe T = 46. From Fig. 9.9, T = 46 is consistent with the null hypothesis, which we would not reject for the usual levels of statistical significance (see below for a more in-depth discussion of significance levels). In contrast, consider the case of T = 11. This result would have been extremely unlikely under the null hypothesis, and we would be tempted to reject the null hypothesis, implying that differences between night and day might occur.
The example given here illustrates the rationale under NHST, the steps of which are: (1) define the hypothesis, (2) collect the data, (3) calculate a test statistic, with known distribution under H0, (4) evaluate how likely (or unlikely) the data would be under the null hypothesis, and (5) if very unlikely, then reject the null hypothesis, but if not unlikely, do not reject it. Consequently, the trick is to put forward a null hypothesis under which the distribution of the test statistic can be evaluated to assess how likely the data are under the null hypothesis. Given the sampling uncertainty (i.e., not observing the entire population), we can make mistakes when making decisions about whether to reject the null hypothesis or not. The confusion matrix in Table 9.4 illustrates the possible outcomes of a decision.

Table 9.4 Confusion matrix showing the possible outcomes of a null hypothesis decision: correct decisions and Type I and Type II errors. Statistical tests usually require a significance level (i.e., Type I error rate), which defines the probability of being wrong if the null hypothesis is true
Reality     Decision: do not reject H0    Decision: reject H0
H0 true     Correct decision              Type I error
H0 false    Type II error                 Correct decision

The two wrong decisions we can make are to reject the null hypothesis when it is in fact true or to not reject it when it is false. The former is known as a Type I error (i.e., an incorrect rejection, sometimes referred to as a false-positive) and the latter a Type II error (i.e., failing to find a real effect, sometimes referred to as a false-negative). In general, it is believed that Type I error is what we should guard against, with the logic analogous to that of the legal system: It is better to have a guilty defendant not convicted than to have an innocent defendant sent to death. We note, however, that depending on the problem at hand, a Type II error could have a greater consequence than a Type I error. To illustrate this, imagine that you are testing whether the size of a population has decreased below a critical threshold that requires an action for it to not go extinct. If you do not reject the null hypothesis (i.e., that the population size has not changed) but it is false, you might miss the opportunity to take action and prevent the population’s extinction. Alternatively, if you mistakenly take action to protect the population while it is in fact above the minimum threshold, you might waste money but any risk of detrimental population consequences is eliminated. So, while many textbooks may allude to the importance of safeguarding against Type I error, the error type that should be of most concern is likely to be study-specific. The usual advice applies: Do not use cookbook recipes; rather, think about your study. The allowable Type I error can typically be specified with a critical significance level value (defined below). Estimation of Type II errors typically requires another step, called a power analysis (see Ellis 2010 for a textbook on power analyses).
In practice, the amount of evidence against the null hypothesis required in a study is given by setting a threshold based on how unlikely the observed data would have to be under the null hypothesis before it is rejected. Alternatively, we can compute the probability, given that the null hypothesis is true, of observing a value for the test statistic that is as extreme as, or more extreme than, the observed value. This probability value is commonly referred to as the p-value. In the above example, assuming a two-tailed test, the p-value associated with T = 46 or T = 11 would be 0.484 and ~0, respectively. This would lead us not to reject the null hypothesis in the first case, but to reject it in the second case. Note that a common error is to confuse the p-value with the probability of the null hypothesis being true or the alternative being false. Researchers should take care that their interpretation of p-values is accurate.
The predefined probability threshold below which we are willing to reject the null hypothesis is called the significance level (typically designated as α). A typical value for the significance level is 5%, with tests having p-values lower than 0.05 often being reported as statistically significant. This value has become widely used; however, it should be noted explicitly that there is nothing special about a 5% significance level. While using this threshold has been extremely useful in practice, there is arguably no other concept in statistics that has received more criticism. The abuse of the 5% significance level by blindly using it is among the most common criticisms of the p-value and hypothesis testing (Nuzzo 2014; Yoccoz 1991; Beninger et al. 2012). Using common sense is fundamental in selecting significance levels. It is intuitively sensible that it cannot be sound science to blindly claim a result to be significant if p = 0.049 but not significant if p = 0.051. Ultimately, researchers need to think carefully about the cost of errors they can incur and define suitable significance levels accordingly. The focus should arguably
be on reporting confidence intervals and assessing the biological importance of reported effects, not on claims of statistical significance that are often not more than statements about sample size. Given a large enough sample size, even the smallest difference will become statistically significant. Therefore, it is perhaps not surprising that a common pitfall for researchers is failure to consider a result’s biological significance, which is as important as, or arguably more important than, its statistical significance. Imagine two populations of a whale species that produce the same stereotyped calls. Let us say animals in population A produced calls at a mean rate of 22.7 calls per hour and animals in population B at 22.6 calls per hour, and that these are significantly different statistically. Is this result meaningful biologically? In other words, is the effect size of a magnitude that we care about? In most cases, almost certainly not. Therefore, a researcher should have a good understanding a priori of the magnitude of the effect that is biologically relevant. Researchers undertaking studies with large sample sizes having the power to detect very small effect sizes can fall into the trap of reporting results as important based on statistical significance instead of on effect size and significance together. Conversely, studies having a large probability of incurring Type II errors (also known as low power, i.e., having a low probability of correctly rejecting the null hypothesis when it is false) due to a small sample size may only be able to detect very large effect sizes and miss smaller ones that are biologically important. The effect size that is meaningful in a study, thus, needs to inform the experimental design to ensure a sufficiently large sample is collected before the study commences.
While NHST and p-values can provide valuable tools to bioacousticians, it is not amiss for researchers to be well aware of the lively discussion on their misuse, drawbacks, and limitations. Nuzzo (2014) provides an introduction to this discussion, Yoccoz (1991) provides a classical critical review regarding their use in biology and ecology, and Beninger et al. (2012) frame the problem in the wider context of statistics in (marine) ecology. An entire Forum section in the journal Ecology has been dedicated to the topic in recent years, and Ellison et al. (2014) show that, while p-values have been discussed and revisited many times in recent years, the discussion about their use is alive and kicking!
Having said this, a wide range of NHSTs have been developed over many decades to accommodate a range of questions and data types. Traditionally, many of these have been described as either “parametric tests” or “non-parametric tests,” with parametric tests often assuming samples arise from Gaussian distributions and non-parametric tests often being used for categorical or continuous data that do not fit the assumptions of parametric tests. While we urge the reader to be cautious about blindly using such tests and be aware of their limitations, we feel we must discuss them since this is how statistics is presented in most undergraduate and postgraduate courses aimed at the applied sciences, biology and ecology included. As examples, tests commonly referred to as parametric include the z-test (for testing a sample mean), t-test (for comparing the means of two groups), and analysis of variance or ANOVA (used for comparing two or more groups). Common non-parametric alternatives to the t-test and the (one-way) ANOVA are the Mann–Whitney U and Kruskal–Wallis tests, respectively. The tests referred to here are only a few of the vast range available, and readers will have no difficulty finding a plethora of textbooks describing them. Note that these tests have been used widely in past decades and continue to be used in current research. Today, however, with improved knowledge of limitations of these tests, they are losing their appeal (see e.g., Touchon and McCoy 2016). In general, they are no longer the standard go-to for particular types of problems as they have been superseded by more robust approaches. With advances in statistics, a wide range of readily available modeling approaches has been developed that more than accommodate data that would have traditionally been analyzed using non-parametric tests (see Sect. 9.5.3 for an overview). Note that while many disciplines are guided by traditional “parametric” and “non-parametric” classifications, where parametric would often be associated exclusively with the Gaussian distribution, modern approaches in statistical ecology using regression models are generally not said to be parametric or non-parametric; rather, they tend to be referred to based on the data distributions for which they are suited, such as a Poisson or gamma regression (see below for more on these).

9.5.3 Explanatory and Predictive Research Questions

Explanatory and predictive studies have questions requiring a response variable to be described as a function of a set of independent variables. Arguably, the majority of the models used by ecologists to answer this type of question are some kind of regression model. However, these models come in many forms. This section aims to introduce the reader to different types of regression models. We note upfront that model selection and validation, and inference from selected models, are fundamental aspects of these analyses and are only very briefly mentioned in Sect. 9.5.3.1. Relevant yet accessible books with plenty of practical examples addressing these steps include Zuur et al. (2007) and Zuur et al. (2009).
Historically, linear regression models (in which the errors are assumed to follow a Gaussian distribution) were the only tools available to answer this type of question. When the only tool you have is a hammer, all your problems begin to look like nails. With a Gaussian error distribution assumption, the only analytical options are simple linear regression models of the type given in Eq. (9.1) or linear regression models with several predictors (i.e., multiple regression). There are many special cases of such linear normal regression models including the independent sample t-test, ANOVA (i.e., analysis of variance for multiple sample mean comparison), ANCOVA (i.e., analysis of covariance for regressing a continuous response variable on a factor and a continuous covariate), and MANOVA or MANCOVA (i.e., multivariate extensions of the former methods). Note that these approaches have additional assumptions, such as that of homogeneity of variances. Homogeneity of variance means that the variance for a response variable is assumed to be constant across values of the independent variable. Many datasets have been forced through these methods even when they were clearly not the right tool for the job. This included, for example, transforming the response variable (e.g., by applying a log function to it) until Gaussian distributional assumptions were met to a reasonable extent. But even then, often a method’s assumptions were not met. For instance, there is no transformation that will turn a discrete count into a continuous variable. For an interesting presentation about why not to log-transform data, see O’Hara and Kotze (2010). Nonetheless, sometimes processes might have properties that make a log-transformation of the data sensible and useful (e.g., Kerkhoff and Enquist 2009). While transforming data to fulfill methods’ assumptions has been acceptable in the past given a lack of accessible alternative methods, this is often no longer the case, and successful ecologists need to have a few additional tools in their toolbox. The rule is one that practitioners do not enjoy: There is not a single rule that fits all questions and problems; we need to understand the problem to know how to model it. Sometimes it is even said that modeling is as much an art as it is a science. But like any good artist, you must master the techniques to use them correctly.
The next level of sophistication in regression models came with the advent of Generalized Linear Models (GLMs). GLMs allow for different types of response variable and some degree of non-linearity in the relationship between the response and explanatory variables. The relationship will still be linear at some level, but it might not be at the response level; it might only be linear at the level of the link function. What is the link function? It is a fundamental component of a GLM and is what allows responses to be constrained to a specific range of values. The link function, as its name implies, links the linear predictor and the response variable so that the model equation looks like:
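For a count response with a log link, one common way to write such a model (in notation assumed here, not taken from the chapter's own equation) is log(μ) = β0 + β1 x, where μ is the expected count, x is a single covariate, and the right-hand side is the linear predictor. The Python sketch below illustrates how the link function keeps predictions within the admissible range: it simulates counts, fits a Poisson GLM with a log link by Newton-Raphson, and confirms that every predicted mean is positive. The function names (rpoisson, fit_poisson_glm), the simulated covariate, and the "true" coefficients are hypothetical choices for illustration only; in practice an established statistics package would be used rather than a hand-written fitting routine.

```python
import math
import random


def rpoisson(lam, rng):
    """Draw one Poisson(lam) count (Knuth's multiplication method)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def fit_poisson_glm(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0 = math.log(sum(y) / len(y))  # start from an intercept-only fit
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(mi * xi for xi, mi in zip(x, mu))
        i11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve (information matrix) * step = gradient.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Simulated data (hypothetical): counts whose log-mean is linear in x.
rng = random.Random(1)
true_b0, true_b1 = 0.5, 0.3
x = [rng.uniform(0.0, 5.0) for _ in range(500)]
y = [rpoisson(math.exp(true_b0 + true_b1 * xi), rng) for xi in x]

b0_hat, b1_hat = fit_poisson_glm(x, y)

# Predictions are made on the link scale, then mapped back with exp(),
# so the predicted mean count can never be negative.
predicted = [math.exp(b0_hat + b1_hat * xi) for xi in (-10.0, 0.0, 2.5, 5.0)]

print(f"estimated b0 = {b0_hat:.3f}, b1 = {b1_hat:.3f}")
print("predicted means:", [round(m, 3) for m in predicted])
```

The model is linear only on the log scale. On the response scale the fitted curve is exponential, which is the "some degree of non-linearity" referred to above, and a straight-line fit to the same counts could predict negative values for small x.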
measurements sharing a correlation structure in biological studies.
Despite these advances, some data still do not fit the distributional requirements of GLMs and GAMs. Generalized Estimating Equations (GEEs) have been introduced recently, and hence they might still be considered in their infancy, but they are showing promising results. GEEs generalize GLMs and GAMs even further by not requiring that the response variable come from a particular family of distributions. GEEs simply impose a relationship between the mean and variance of the response. These models also allow a wide range of correlation structures to be imposed on the data, making them quite appealing when there are many observations clustered inside a few individuals. GEEs are marginal models in that the focus of inference is on the population average, and we are not so interested in the responses at the individual level. GEEs are quite specialized, and the reader is referred to Zuur et al. (2009, Chap. 12) for an introduction.
In addition to the somewhat “general” regression models above, there is a range of specialized regression models that are worth considering in certain biological questions. For instance, we have mentioned the problem of overdispersion. Often with biological data, we have very special cases of overdispersion in which there is an excess of zeroes. For example, consider you are trying to model the number of echolocation clicks a sperm whale produces per second as a function of depth, time of day, and sex. There are (at least) two reasons for there being zero clicks in a given second. Either the whale is in a silent state when recorded, so that many zeroes occur in successive seconds, or the whale is in a click-producing state but does not produce a click in the given second recorded. The regression models discussed above will likely fail to produce reasonable answers because the excess zeroes from the silent periods (potentially not explained by the covariates; i.e., not dependent on sex, depth, or time of day) cannot be accommodated. Under such a scenario, hurdle models or zero-inflated models might come in handy. While these are advanced methods and more difficult to implement and evaluate, they are worth knowing about. The reader is referred to Martin et al. (2005) for a gentle introduction to the topic with ecological examples.
Truncated regression is another special case of regression under which some values of the response variable cannot be observed. An example is modeling animal group sizes as a function of their acoustic footprint (e.g., the number of sounds produced by a group that are detected per minute). Now that you know about GLMs, your first thought might be to consider a Poisson or negative binomial GLM, with group size as the response variable and numbers of sounds detected as the predictor. However, in modeling this, you soon face a problem: You fit your model and make some predictions, one of which is a group size of zero! What does this mean? Nothing, really; it is what we call an inadmissible estimate and a clear sign that something is not adequate. In such a case, you might want to try a zero-truncated regression, which is essentially a GLM for which zeroes cannot be observed. Chapter 11 in Zuur et al. (2009) explores both zero-inflated and zero-truncated models.
Survival models are regression techniques that deal with a special type of response variable: the time up to an event. While these types of models were developed to model survival of animals, plants, and people, they can be used in any scenario where observations might be censored. Censored data result when we do not know the real value of the response variable but know it is at least above or below some limit or within some interval; say because we observe an animal is dead at a given time, and/or we know it was alive at a different time. For example, in a bioacoustic study, a researcher may wish to model the time animals take to produce their first acoustic cue, and animals are observed for 5 min each. However, we do not know when an animal produced a cue before observations began (i.e., left censoring). In addition, an animal might not produce any cues during the 5 min, or the animal might leave the study area before the 5 min elapse (i.e., right censoring). Finally, if we recorded only which minute, but not the actual second a sound was produced, we would only know that the event occurred sometime within the interval of that minute. These are interval censored data. While a somewhat contrived example, this allows us to introduce the different kinds of censoring that are common in survival analysis.
Generalized Least Squares (GLS) is a regression approach that might be used when we want to relax the usual assumption of homogeneous residual variance by modeling the variance as a function of covariates. Zuur et al. (2009, Chap. 4) provide examples of the use of GLS and Reyier et al. (2014) give an acoustics application of GLS. Another perhaps more specialized use of such a regression technique is when we want to consider a general non-linear model with a specific form to relate a response variable with covariates. Then we might still want to find the parameters of the model that best fit the data. A way to do so is, akin to what might happen if one considers a straight line, to find the parameter values that minimize the sum of the squares of the residuals (i.e., the differences between the observations and the model). In a simple regression context, the model produces the fitted line, while in a generalized least squares context, the model is any function in which we might be interested. For example, if you want to determine the propagation loss (PL) for a sound that has traveled from the source to the receiver, and you expect it is proportional to log(r), where r is the range, then your model is PL = K log(r). Based on measurements of received levels of sounds with known source level, you may apply a GLS regression to estimate the value of K that best fits your data. If K is close to 10, then your environment supports cylindrical spreading; if it is close to 20, then sound is predicted to spread spherically (see Chaps. 5 and 6 on sound propagation in air and under water, respectively).
All the models described so far do not consider predictor variables that are in hierarchies. Hierarchical data occur when variables are nested within each other (i.e., organized into levels). For example, individuals from different resident populations can be said to be nested within subpopulations. In turn, subpopulations can be nested within populations. Hierarchical modeling (also known as multilevel modeling) is used when inferences need to be drawn for population means at specified levels and is useful for fitting models to data obtained from complex, multilevel survey designs. For example, a study may evaluate vocal complexity of elephants at the population, sub-population, and resident population levels. Here, we do not discuss these methods further. Rather, we refer the reader to Cressie et al. (2009) and Royle and Dorazio (2008) for descriptions of these methods, including their strengths and limitations.

Table 9.5 Description of some commonly used models to test the association between multiple explanatory variables and a response variable
Model type                                            Use
Generalized Linear Modeling (GLM)                     Allows different distributions for the response variable and some degree of non-linearity in the relationship between response and explanatory variables
Generalized Linear Mixed Effects Modeling (GLMM)      An extension of GLM for use with random effects (e.g., repeated measures of subjects)
Generalized Additive Modeling (GAM)                   Allows different distributions for the response variable (as in GLMs) modeled as a function of smoothed predictors
Generalized Additive Mixed Effects Modeling (GAMM)    An extension of GAM for use with random effects (e.g., repeated measures of subjects)
Generalized Estimating Equations (GEE)                Do not require the response variable to come from a particular family of distributions, and allow correlation structures in the data to be accounted for

Given the large range of models available (a taste of which has been described above), what should aspiring ecologists today have in their statistical regression toolbox? We propose that a bare minimum is an understanding of the structure, implementation, outputs, and interpretation of GLMs, GLMMs, GAMs, and GAMMs (Table 9.5). Parameter estimates and significance tests resulting in p-values are common outputs of software capable of fitting GLMs, GLMMs, GAMs, GAMMs, and GEEs. For a practical guide to applying these in behavioral and
ecological studies, see Zuur et al. (2009). O’Hara (2009) and Bolker et al. (2009) provide good
introductions to GLMMs for ecologists, and the books by Zuur et al. (2007, 2009) provide
information to implement and interpret GLMMs. For GAMs, the book by Wood (2006) is a standard
reference, and Zuur et al. (2009) has worked-out examples in the software R.

Most of the models described in this section can be implemented in a frequentist framework, for
instance using maximum likelihood or restricted maximum likelihood estimation. Nonetheless, for
more complex models such as those including (often complex) spatial and temporal covariates
(i.e., spatio-temporal models), Bayesian implementations are gaining ground. For instance, GLMs
and GLMMs can be fitted via maximum likelihood or via Markov Chain Monte Carlo (MCMC). MCMC
methods are iterative, simulation-based solutions used in Bayesian inference and are described in
Gamerman (1997), Brémaud (1999), Draper (2000), and Link (2002). With advances of widely
available implementations, users might even be using Bayesian approaches without realizing it. An
example is the Integrated Nested Laplace Approximation (INLA) implemented via R-INLA
(www.r-inla.org) and its derivatives that allow fitting complex spatio-temporal models without the
Bayesian framework being obvious (by not requiring priors to be explicitly defined). The
philosophical nuances of which framework might be more adequate under given settings, however, are
beyond what we hope to discuss in this chapter.

9.5.3.1 Model Validation, Selection, and Averaging

Depending upon whether modeling is undertaken for explanatory or predictive purposes, approaches
for model validation and selection may differ (Shmueli 2010). Validation means that the model has
been demonstrated to have satisfactory accuracy for its intended use (Rykiel Jr 1996). Validation
in explanatory modeling commonly takes the form of goodness-of-fit and residual diagnostics.
Goodness-of-fit tests evaluate how well observed values agree with those expected under the
statistical model (Maydeu-Olivares and Garcia-Forero 2010), while residual diagnostics determine
whether residuals fit the assumption of being effectively random (see Zuur et al. 2009 for common
examples in ecology). Checking for multi-collinearity (i.e., collinearity between two or more
covariates) is also standard for explanatory modeling, while it is close to irrelevant for
predictive modeling (see Shmueli 2010 for detailed discussion). In contrast to explanatory
modeling, model validation in predictive modeling is focused on evaluating the model’s ability to
generalize and predict new data. Validation commonly is undertaken using approaches such as
cross-validation. In cross-validation, the model’s ability to accurately predict a new data set is
assessed after calibrating it with a training dataset (Shmueli 2010; Cawley and Talbot 2010).

Once a set of models has been validated, the best candidate model is selected (though model
validation and selection can often be an iterative process). Approaches to model selection, again,
depend upon whether modeling has an explanatory or predictive goal. In explanatory modeling, the
explanatory power of nested candidate models is commonly compared with a step-wise approach using
significance testing (e.g., using an F-test). Here a nested model refers to one composed of
subsets of covariates of another candidate model. Caution should be taken, however, as researchers
may be inclined to remove covariates that are not significant, even when there is a strong
theoretical justification for retaining them in the model regardless of their significance
(Shmueli 2010). For example, a covariate representing the age class of a sparrow in a study
assessing the influence of predator presence on sparrow vocal behavior may be of theoretical
importance in the model.

Model selection in predictive modeling commonly involves a priori specification of candidate
models and selecting the best model based on the smallest possible number of parameters that
adequately represent the data (i.e., the principle of parsimony). The simpler a model is, the more
it can be generalized, while more complex models (containing more parameters) are more specific to
the data used to fit the model. Consequently,
criteria for model selection have been developed that essentially maximize the likelihood while
penalizing for the number of parameters included. Akaike’s Information Criterion (AIC; see Akaike
1974) and the Bayesian Information Criterion (BIC) currently are the most commonly used, among a
range of others available. They are widely used for comparing nested and non-nested models
(Burnham and Anderson 2002), although there is some discussion around suitability for use in
non-nested models (see Ripley 2004). Resulting criteria such as AIC or BIC values for candidate
models are then compared and the model yielding the lowest value is generally deemed to be
preferred. Note that there is active research on the circumstances under which AIC, BIC, and the
many other criteria available perform best, and whether they should be used together to inform
model selection (Kuha 2004). An important take-home message is that model selection criteria such
as AIC and BIC can only suggest a preferred model from those compared, even if they all perform
poorly at the validation stage. In other words, the preferred model may still be a poorly fitting
model, and therefore, selection criteria are only relative measures of model goodness-of-fit.

In predictive modeling, averaging over a range of plausible models has become widely used to
reduce prediction error and to account for model selection uncertainty. This is undertaken, for
example, by computing a measure that ranks the set of plausible models according to their support
by the data (e.g., Akaike weights), applying the weights to predictions from each model, and then
computing the average. This provides weighted average predictions, with weights dependent on how
much each model is supported by the data. There are many other methods for undertaking model
averaging. Model averaging performance depends on each model’s predictive bias and variance and
covariance between models, among other things (see McElroy 2016 for complete discussion). In
recent work, model averaging has been shown to be particularly useful when predictive errors of
contributing model predictions are dominated by variance, and when covariance between models is
low (McElroy 2016).

While a highly simplified overview of some tools available on the topic of model validation,
selection, and averaging has been provided here, researchers should be familiar with them and
access the latest literature to identify the appropriate approaches for their study.

9.5.4 The Future of Bioacoustical Analytical Approaches

In this chapter, we have only provided a flavor of common approaches used today and have not
delved into the wide range of new developments being introduced into the discipline.
Interdisciplinary research linking the fields of biology, ecology, and statistics has a long
tradition of providing fertile ground for innovative statistical methods, with many methods having
been developed when existing methods were not adequate to cope with new problems (Olivier et al.
2014). The current revolution in data acquisition systems (see Chap. 2), such as high-resolution
sensors in animal-borne tags and increasing numbers of long-term passive acoustic deployments that
lead to big data, is also likely to influence the next generation of statistical methods suited
for ecological and acoustical analysis. Analysis of big data through increased computational
capacity has already provided a range of new powerful tools to science.

As an example of such approaches, machine learning is rapidly gaining in popularity as it
increasingly improves pattern recognition accuracy (Christin et al. 2019). Such methods can
improve processing capacity in large datasets resulting from acoustic instrumentation. An example
of more sophisticated analytical approaches is the growing use of hierarchical, state-space, and
hidden process methods (e.g., Auger-Méthé et al. 2020 for an introduction to their application in
ecology) that model underlying processes while accounting for biases and uncertainty. Advances in
these approaches may improve our ability to predict future scenarios and implement intervention
before a potentially
undesirable future scenario unfolds (see Cressie et al. 2009 for discussion).

We also suggest that readers become acquainted with the growing work being conducted in the area
of statistical decision theory, which is concerned with making decisions by accounting for
uncertainties involved in the decision process using statistical knowledge resulting from data
collected. Rather than attempting to provide a general review of the large field of decision
theory here, we refer the reader to an introduction in its application to ecology by Williams and
Hooten (2016), which will introduce the reader to a range of other resources on the topic.

Because these and many other methods are continually evolving, researchers are encouraged to keep
well-informed of current developments appearing in methods-based scientific journals, such as
Methods in Ecology and Evolution.

9.6 Examples in Bioacoustics

The wide range of quantitative approaches introduced above can be used to analyze bioacoustical
data to answer research questions ranging from understanding natural vocal behavior to activity
patterns, community and conservation ecology, habitat use, species diversity, distribution,
occupancy, density and abundance, and anthropogenic impacts (among many others). Faunal groups
that have been the subject of bioacoustics research include invertebrates, anurans (i.e., frogs
and toads), fish, birds, bats, other terrestrial mammals, and marine mammals, but many others
could be considered. As long as sound is produced, it could be used as a source of information. A
recent review documented 460 peer-reviewed published papers on passive acoustic monitoring in
terrestrial habitats alone, with bats (50% of papers) and activity patterns (24%) dominating
(Moreria Sugai et al. 2018).

Marine mammals feature prominently in bioacoustic research as water is a highly conducive medium
for sound to travel through, and visual observations can prove comparatively expensive for limited
returns on detections.

Rather than reviewing analytical approaches across the hundreds of existing bioacoustics studies,
we have selected two recent studies as examples, and discuss the rationale for the particular
analytical approaches taken. The research topics in the example studies are exploring temporal
changes in call frequency and using acoustic data for abundance and density estimation.

9.6.1 Temporal Changes in Call Frequency

As indicated previously, due to ever-increasing computing power and storage and technological
advances in acoustic equipment, acoustic studies can provide extremely long-term datasets. These
datasets allow us to explore changes to calling behavior on a scale that, until recently, would
have been very difficult. A recent example is illustrated in Miksis-Olds et al. (2018) where the
frequency content of a type of blue whale song recorded primarily in the Indian Ocean was
investigated. The song type is attributed to a pygmy blue whale subspecies (Balaenoptera musculus
indica, Committee on Taxonomy 2021) that appears to be resident in the northern Indian Ocean. The
song type has three distinct units, and this analysis focused on the ~60-Hz component of Unit 2, a
frequency-modulated upsweep, and Unit 3, a ~100-Hz tonal downsweep. A decade of data from the
Indian Ocean Comprehensive Nuclear-Test-Ban Treaty International Monitoring Station (CTBTO IMS) at
Diego Garcia was analyzed (2002–2013). Ambient noise was also analyzed, but we do not focus on
that part of the study here.

Power spectral densities (PSD) were computed for 2-h sections of data, which could be used to
detect peaks in the frequency bands of interest (approximately 56–63 Hz for the 60-Hz component of
Unit 2, and 107–100 Hz for Unit 3), using a 3-dB signal-to-noise threshold. The paper shows a
figure of the number of hours with vocal presence detected each week, for each year (Fig. 9.3 in
Miksis-Olds et al. 2018), highlighting the importance of producing exploratory plots; in this
case, the variability in the data is made clear.
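As a rough illustration of the detection step described above, the following Python sketch
averages periodograms over short segments to estimate a PSD, finds the largest peak within a
frequency band, and compares it with a noise estimate using a 3-dB threshold. It is not the
processing used by Miksis-Olds et al. (2018): the sampling rate, segment length, synthetic test
signal, adjacent noise band, median-based noise estimate, and the function names averaged_psd and
detect_band_peak are all assumptions made for this example. Only the 56–63 Hz band and the 3-dB
threshold come from the text.

```python
import cmath
import math
import random
import statistics


def averaged_psd(samples, fs, seg_len, f_lo, f_hi):
    """Average the periodograms of non-overlapping segments (Bartlett's method).

    Only DFT bins inside [f_lo, f_hi] Hz are evaluated, with no tapering window.
    Returns a list of (frequency_hz, psd) pairs.
    """
    n_seg = len(samples) // seg_len
    k_lo = math.ceil(f_lo * seg_len / fs)
    k_hi = math.floor(f_hi * seg_len / fs)
    psd = []
    for k in range(k_lo, k_hi + 1):
        # Complex exponential for DFT bin k, reused for every segment.
        kernel = [cmath.exp(-2j * math.pi * k * t / seg_len) for t in range(seg_len)]
        power = 0.0
        for s in range(n_seg):
            seg = samples[s * seg_len:(s + 1) * seg_len]
            dft = sum(x * w for x, w in zip(seg, kernel))
            power += abs(dft) ** 2 / (fs * seg_len)
        psd.append((k * fs / seg_len, power / n_seg))
    return psd


def detect_band_peak(samples, fs, seg_len, band, noise_band, threshold_db=3.0):
    """Return (peak frequency in Hz, SNR in dB, detected?) for one frequency band."""
    signal_psd = averaged_psd(samples, fs, seg_len, *band)
    noise_psd = averaged_psd(samples, fs, seg_len, *noise_band)
    peak_freq, peak_level = max(signal_psd, key=lambda pair: pair[1])
    # Assumed noise estimate: median PSD level in a nearby band without song energy.
    noise_level = statistics.median(level for _, level in noise_psd)
    snr_db = 10 * math.log10(peak_level / noise_level)
    return peak_freq, snr_db, snr_db >= threshold_db


# Synthetic stand-in for a recording: a 60-Hz tone in Gaussian noise (hypothetical data).
random.seed(1)
FS = 250        # sampling rate in Hz (assumed)
SEG_LEN = 500   # 2-s segments, giving 0.5-Hz frequency resolution (assumed)
signal = [
    math.sin(2 * math.pi * 60 * t / FS) + random.gauss(0, 1)
    for t in range(8 * SEG_LEN)
]

peak_freq, snr_db, detected = detect_band_peak(
    signal, FS, SEG_LEN, band=(56, 63), noise_band=(40, 50)
)
print(f"peak at {peak_freq:.1f} Hz, SNR {snr_db:.1f} dB, detected: {detected}")
```

In practice, an established routine such as scipy.signal.welch would typically replace the
hand-written DFT, and far longer sections of data would be averaged. The plain-Python version is
used here only to keep the sketch free of dependencies.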
The average over each week, across years, was used to identify weeks with peak average vocal
presence. Weeks 21 and 22 were those with peak average vocal presence and data from these weeks
were investigated further. The frequency peaks from the PSDs from these weeks across all years
were measured. A linear regression model was fitted to the week 21 and 22 frequency peak
measurements from all years. The response variable was frequency, and year and song unit were
explanatory variables. Song unit was included in the model as a factor variable. An interaction
was also included between year and song unit, which was used to investigate whether the rate of
any frequency change over time differed between the two song units. Model assumptions (linearity,
constant error variance, error independence, and normality) were all assessed using diagnostic
plots and relevant hypothesis tests, and all model assumptions were met.

The linear model results are depicted in Fig. 9.10. The figure shows all weekly data plotted (blue
dots) with the modeled week 21–22 data highlighted in red for both song units. Again, the utility
of plotting data is clear here: the decline in frequency is evident, with an apparent difference
in rate of decline between the two units. The linear model results confirmed the frequency
decline; the frequency of the ~60-Hz Unit 2 decreased at a rate of 0.18 Hz/year, while the
frequency of Unit 3 decreased at 0.54 Hz/year. The interaction term was selected during model
selection (using an F-test), which confirmed that the rates of frequency decline were indeed
different between the two units.

This analysis shows that simple regression analyses can be very effective in confirming patterns
observed in exploratory data plots. We note here that the regression analysis in the paper focused
on data from weeks 21 and 22 to be comparable with methods from a similar study (Gavrilov et al.
2012). However, frequency measurements were taken across all weeks of each year (as shown in
Fig. 9.10), which could also be used in a regression model. In addition, it is common for
bioacoustical analyses to have several natural extensions. In this case, relaxing the Gaussian
assumption could be considered via a Generalized Linear Model, or non-linear patterns in the
frequency decline could be explored using a Generalized Additive Model.

9.6.2 Abundance and Density Estimation

The estimation of animal population size (abundance) and the number of animals in a given area
(density) are metrics that are very informative for management and conservation actions. There are
several abundance and density estimation methods available (e.g., Borchers et al. 2002); popular
methods include mark-recapture and distance sampling. Such methods are known as absolute abundance
or density estimation methods, as the methods estimate the total number of animals (in a defined
area, for density estimates), including animals missed by a survey. Common reasons why animals are
not detected during a survey are that they may be too far away, and/or detection is made difficult
by environmental conditions (e.g., rough seas may prevent marine mammal sightings at sea unless
the animals are very close, or windy conditions may mask the sounds of singing birds in
recordings). The probability of detecting an animal is a key parameter in absolute abundance and
density estimation methods, and accounts (in part) for undetected animals during a survey.

Acoustic data are increasingly being used for absolute abundance and density estimation, both in
terrestrial and marine environments (e.g., Marques et al. 2013; Stevenson et al. 2015). Here we
discuss a density estimation analysis for Blainville’s beaked whales (Mesoplodon densirostris)
from seafloor-moored hydrophone data recorded in the Bahamas (Marques et al. 2009). The analysis
involved several of the concepts we have discussed throughout the chapter, which we highlight
here.

The paper begins by introducing the density estimation equation (i.e., the estimator; see Sect.
9.4.2). The equation contains several parameters to be estimated, including the probability of
detecting a beaked whale echolocation click on one of the seafloor-moored hydrophones. Survey
Fig. 9.11 The estimated detection function. Plots (on the response scale) of the fitted smooths
for a binomial GAM model with slant distance and a 2D smooth of hoa and voa. For the top left
plot, the off-axis angles are fixed at 0°, 45°, and 90° (respectively the solid, dashed, and
dotted lines). Remaining plots are two-dimensional representations of the smooths, where black and
white represent respectively an estimated probability of detection of 0 and 1. Distance (top right
panel) and angle not shown (bottom panels) are fixed respectively at 0 m and 0°. Reprinted with
permission from Marques et al. (2009). © Acoustical Society of America, 2009. All rights reserved
detection probability, (c) horizontal off-axis angle and slant range, and (d) vertical off-axis
angle and slant range are all depicted. The average detection probability of a beaked whale click
within 8 km of a moored hydrophone was estimated to be 0.03 (i.e., if a beaked whale click was
produced within 8 km of a moored hydrophone, the study estimated that there was, on average, a 3%
chance of detecting that same click). The variance around the average was estimated using the
bootstrap and presented as a coefficient of variation (CV, defined in Sect. 9.4.2) and was
estimated to be 0.16, or 16% when expressed as a percentage. Finally, the estimator was used to
estimate beaked whale density in the study area of either 25.3 (CV: 19.5%) or 22.5 (19.6%) animals
per 1000 km²,
depending on the false-positive proportion used (two estimates were produced using differing
methods).

9.7 Software for Analyses

There are many standard, relatively easy-to-use software packages that require no (or very little)
coding skills to carry out statistical analyses, including SPSS (IBM Corp., Armonk, NY, USA),
Statistica (TIBCO Software, CA, USA), Stata (StataCorp, College Station, TX, USA), Minitab
(Minitab Inc., State College, PA, USA), Xlstat (Addinsoft, Ile-de-France, France), and SAS (SAS
Institute, Cary, NC, USA), among others. In the field of bioacoustics, it is common for acoustic
data to be processed in MATLAB (The MathWorks Inc., Natick, MA, USA) due to its powerful signal
processing package. MATLAB users may find that their workflow is streamlined by undertaking
statistical analyses in the same software if all required tools are available.

For those planning, however, on undertaking analyses that draw from the most recent up-to-date
developments in statistical ecology and require a highly flexible environment to do so, a free
open-source software environment like R is recommended (R Core Team 2020). R is primarily used for
statistical computing and production of graphics (though R’s GIS, and even signal processing
capabilities, are expanding). The software benefits from a large number of base and contributed
packages that can easily be downloaded and an environment in which users may develop their own
algorithms and packages.

There are now many sources of instructional manuals and books guiding users on how to create
high-quality data representations and run analyses in R, including Crawley (2013), Kerns (2010),
Zuur et al. (2009), Bolker (2008), Lawson (2014), among many others. The CRAN Task View: Analysis
of Ecological and Environmental Data¹ maintained by Gavin Simpson is an excellent resource for
locating suitable packages for statistical analysis of biological data. R can be accessed and
downloaded through a web browser² and for most users, we recommend a user-friendly GUI like
RStudio (RStudio Team 2020³). RStudio is an integrated development environment for R that includes
a console, an editor for code development and execution, and tools for plotting, debugging,
tracking history, and managing the workspace. An interesting feature of R integrated with RStudio
is the ability to adhere in a straightforward way to the concept of reproducible research via
dynamic reports in RMarkdown. If the reader is new to the topic, we recommend the book by Xie
et al. (2020).⁴

¹ CRAN Task View: https://CRAN.R-project.org/view=Environmetrics; accessed 9 November 2020.
² R Core Team is accessible at https://www.r-project.org/; accessed 1 January 2020.
³ RStudio is accessible at https://www.rstudio.com/products/RStudio/; accessed 9 November 2020.
⁴ RMarkdown: The Definitive Guide by Xie Y, Allaire JJ, Grolemund G:
https://bookdown.org/yihui/rmarkdown/; accessed 9 November 2020.

9.8 Summary

A key outcome of bioacoustics research is the production of new knowledge that informs
conservation management. The knowledge produced needs to be reliable and easily understood, which
is no trivial task given the complicated nature of animal behavior. The reality is that the
phenomena from which we want to derive inferences are multifaceted, with many interconnecting
attributes, and patterns and signals obscured by statistical noise (i.e., variability not
associated with the conditions under investigation). Consequently, underlying mechanisms that
explain the patterns we observe are not easily revealed.

Not only are animal behaviors occurring in a highly complex environment, but many challenges are
presented in conducting the research itself. For instance, as researchers we are not easily able
to avoid or reduce the statistical noise in the environment by controlling field conditions; and
when we undertake experiments on animals in captivity to reduce noise in a laboratory, we cannot
be sure that results are
transferable to the wild. In addition, we introduce biases in our observations through our own
subjective, non-random filters. Only by understanding these filters can we either eliminate or
adjust biases to make reliable inferences about nature.

Quantitative skills, including survey design considerations, are therefore an essential part of a
bioacoustician’s toolkit and should be viewed as just as essential as field skills and signal
processing methods. These statistical methods are tools that enable the researcher to ask
difficult but often important and exciting questions about their research topic.

However, given the complexity in nature, research design challenges, and the multi-disciplinary
nature of studying animal behavior through acoustics, it is not realistic to expect specialists in
one field to become experts across multiple fields (i.e., behavior, ecology, bioacoustics, and
statistics). What behaviorists and bioacousticians can aim for is to understand foundational
statistical concepts, have a broad knowledge of the range of existing techniques available, and be
able to identify critical pitfalls in survey design and data analyses. In addition, practitioners
should be able to conduct a range of current standard analyses and know when to seek support for
more sophisticated approaches.

It is our hope that through the introduction of basic statistical concepts in this chapter,
readers can more confidently avoid design and analysis pitfalls and make the necessary
considerations to select the most suitable approaches to successfully answer their research
questions. We would like researchers to feel empowered to critically evaluate the transferability
of standard practices across broader spectra of questions and identify inadequacies where they
occur. Finally, and foremost, we hope that at the conclusion of this chapter, readers feel
inspired to place greater focus on the biological significance of research outputs, using
quantitative methods as a tool to support their conclusions.

We close this chapter by providing you, the reader, with our culinary rendition of the meaning of
statistics: It is the science that uses data as its main ingredient, uncertainty as a key
seasoning driving the final flavor of a meal, and guides the collection and mixing of the
ingredients, through sampling, experimentation, and analysis. Taken together, hopefully, delicious
scientific meals will result, by drawing meaningful and reliable inferences from data. Statistics
is paramount for science in general, and bioacoustics is in that regard no exception.

Acknowledgement We thank Steve Buckland and Jay Barlow for their helpful comments prior to
Springer’s peer-review.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
Auger-Méthé M, Newman K, Cole D, Empacher F, Gryba R, King AA, Leos-Barajas V, Flemming JM, Nielson A, Petris G (2020) An introduction to state-space modeling of ecological time series. arXiv preprint arXiv:2002.02001
Beninger PG, Boldina I, Katsanevakis S (2012) Strengthening statistical usage in marine ecology. J Exp Mar Biol Ecol 426:97–108
Bolker BM (2008) Ecological models and data in R. Princeton University Press, Princeton
Bolker BM, Brooks ME, Clark CJ, Geange SW, Poulsen JR, Stevens MHH, White JSS (2009) Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol Evol 24(3):127–135
Borcard D, Gillet F, Legendre P (2011) Numerical ecology with R. Springer, New York
Borchers DL, Buckland ST, Zucchini W (2002) Estimating animal abundance. Springer, New York
Box GE (1976) Science and statistics. J Am Stat Assoc 71(356):791–799
Brémaud P (1999) Markov chains: Gibbs fields and Monte Carlo simulation. Springer, New York, pp 253–322
Brown A, Smith J, Salgado Kent C, Marley S, Allen S, Thiele BL, Erbe C, Chabanne D (2017) Relative abundance, population genetic structure and passive acoustic monitoring of Australian snubfin and humpback dolphins in regions within the Kimberley. https://doi.org/10.13140/RG.2.2.17354.06082
Burnham K, Anderson D (2002) Model selection and multimodel inference: a practical information-theoretic approach, 2nd edn. Springer, New York
Casella G, Berger RL (2002) Statistical inference. Duxbury, Belmont, CA
Cawley GC, Talbot NL (2010) On over-fitting in model selection and subsequent selection bias in performance evaluation. J Mach Learn Res 11:2079–2107
Christin S, Hervet É, Lecomte N (2019) Applications for deep learning in ecology. Methods Ecol Evol 10(10):1632–1644
Cochran WG (1977) Sampling techniques, 3rd edn. Wiley, New York
Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. L. Erlbaum Associates, Hillsdale, NJ
Cohen J (2013) Statistical power analysis for the behavioral sciences, 2nd edn. Routledge, New York
Committee on Taxonomy (2021) List of marine mammal species and subspecies. Society for Marine Mammalogy. www.marinemammalscience.org. Accessed 2 Sep 2021
Crawley MJ (2013) The R book, 2nd edn. Wiley, Hoboken, NJ
Cressie N, Calder CA, Clark JS, Hoef JMV, Wikle CK (2009) Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling. Ecol Appl 19(3):553–570
Dytham C (2011) Choosing and using statistics: a biologist’s guide, 3rd edn. Wiley, Boca Raton, FL
Ellis PD (2010) The essential guide to effect sizes: statistical power, meta-analysis, and the interpretation of research results. Cambridge University Press, Cambridge
Ellison AM (2004) Bayesian inference in ecology. Ecol Lett 7(6):509–520
Ellison AM, Gotelli NJ, Inouye BD, Strong DR (2014) P values, hypothesis testing, and model selection: it's déjà vu all over again. Ecology 95(3):609–610
Everitt B, Hothorn T (2011) An introduction to applied multivariate analysis with R. Springer, New York
Fisher RA (1959) Statistical methods and scientific inference, 2nd edn. Oliver and Boyd, Edinburgh, UK
Ford ED (2000) Scientific method for ecological research. Cambridge University Press, Cambridge
Gamerman D (1997) Sampling from the posterior distribution in generalized linear mixed models. Stat Comput 7(1):57–68
Gavrilov AN, McCauley RD, Gedamke J (2012) Steady inter and intra-annual decreases in the vocalization frequency of Antarctic blue whales. J Acoust Soc Am 131:4476–4480. https://doi.org/10.1121/1.4707425
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. CRC Press, Boca Raton, FL
Harrison XA, Donaldson L, Correa-Cano ME, Evans J, Fisher DN, Goodwin CE, Robinson BS, Hodgson DJ, Inger R (2018) A brief introduction to mixed effects modelling and multi-model inference in ecology. PeerJ 6:e4794
Hilborn R, Mangel M (1997) The ecological detective: confronting models with data, vol 28. Princeton University Press, Princeton
Kerkhoff AJ, Enquist BJ (2009) Multiplicative by nature: why logarithmic transformation is necessary in allometry. J Theor Biol 257(3):519–521
Kerns GJ (2010) Introduction to probability and statistics using R, 1st edn. G. Jay Kerns, Youngstown
Kuha J (2004) AIC and BIC: comparisons of assumptions and performance. Sociol Methods Res 33(2):188–229
Lawson J (2014) Design and analysis of experiments with R, vol 115. CRC Press, Boca Raton, FL
Leek JT, Peng RD (2015) What is the question? Science 347(6228):1314–1315
Link RF (2002) Principal applications of Bayesian methods in actuarial science: a perspective. North Am Actuarial J 6(2):129
Manly BFJ (2007) Randomization, bootstrap and Monte Carlo methods in biology. CRC Press, Boca Raton, FL
Manly BF, Alberto JAN (2014) Introduction to ecological sampling. CRC Press, Boca Raton, FL
Marques TA, Thomas L, Ward J, Dimarzio N, Tyack PL (2009) Estimating cetacean population density using fixed passive acoustic sensors: an example with Blainville’s beaked whales. J Acoust Soc Am 125(4):1982–1994. https://doi.org/10.1121/1.3089590
Marques TA, Thomas L, Martin SW, Mellinger DK, Ward JA, Moretti DJ, Harris D, Tyack PL (2013) Estimating animal population density using passive acoustics. Biol Rev 88(2):287–309
Martin TG, Wintle BA, Rhodes JR, Kuhnert PM, Field SA, Low-Choy SJ, Tyre AJ, Possingham HP (2005) Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecol Lett 8(11):1235–1246
Matthiopoulos J (2010) How to be a quantitative ecologist. Wiley, Hoboken, NJ
Maydeu-Olivares A, Garcia-Forero C (2010) Goodness-of-fit testing. Int Encycl Educ 7(1):190–196
McCarthy MA (2007) Bayesian methods for ecology. Cambridge University Press, Cambridge
McElreath R (2020) Statistical rethinking: a Bayesian course with examples in R and Stan. CRC Press, Boca Raton, FL
McElroy TS (2016) Nonnested model comparisons for time series. Biometrika 103(4):905–914
McGarigal K, Cushman SA, Stafford S (2000) Multivariate statistics for wildlife and ecology research, 1st edn. Springer, New York
Miksis-Olds JL, Nieukirk SL, Harris DV (2018) Two unit analysis of Sri Lankan pygmy blue whale song over a decade. J Acoust Soc Am 144(6):3618–3626
Moreria Sugai L, Freire Silva T, Wagner Ribeiro J, Llusia D (2018) Terrestrial passive acoustic monitoring: review and perspectives. Bioscience 69. https://doi.org/10.1093/biosci/biy147
Nakagawa S, Schielzeth H (2010) Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biol Rev 85:935–956
Nuzzo R (2014) Scientific method: statistical errors. Nat News 506(7487):150
O’Hara RB (2009) How to make models add up—a primer on GLMMs. Annales Zoologici Fennici, BioOne, pp 124–137
O’Hara R, Kotze J (2010) Do not log-transform count data. Nat Precedings 1:118–122
Ortega A, Navarrete G (2017) Bayesian hypothesis testing: an alternative to null hypothesis significance
testing (NHST). In: Psychology and social sciences. IntechOpen
Paiva EG, Salgado Kent CP, Gagnon MM, McCauley R, Finn H (2015) Reduced detection of Indo-Pacific bottlenose dolphins (Tursiops aduncus) in an inner harbour channel during pile driving activities. Aquat Mamm 41(4):455–468
Panzeri S, Magri C, Carraro L (2008) Sampling bias. Scholarpedia 3(9):4258
Pedersen EJ, Miller DL, Simpson GL, Ross N (2019) Hierarchical generalized additive models in ecology: an introduction with mgcv. PeerJ 7:e6876
Quinn GP, Keough MJ (2002) Experimental design and data analysis for biologists. Cambridge University Press, Cambridge
Rahlf T (2019) Data visualisation with R: 111 examples. Springer Nature, New York
Reyier EA, Franks BR, Chapman DD, Scheidt DM, Stolen ED, Gruber SH (2014) Regional-scale migrations and habitat use of juvenile lemon sharks (Negaprion brevirostris) in the US South Atlantic. PLoS One 9(2):e88470
Ripley BD (2004) Selecting amongst large classes of models. In: Methods and models in statistics, In Honour of Professor John Nelder, FRS. World Scientific, New York, pp 155–170
Royle JA, Dorazio RM (2008) Hierarchical modeling and inference in ecology: the analysis of data from populations, metapopulations and communities. Elsevier, New York
RStudio Team (2020) RStudio: integrated development for R. RStudio, PBC, Boston, MA. http://www.rstudio.com/
R Core Team (2020) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/
Touchon JC, McCoy MW (2016) The mismatch between current statistical practice and doctoral training in ecology. Ecosphere 7(8):e01394
Underwood AJ (1997) Experiments in ecology: their logical design and interpretation using analysis of variance. Cambridge University Press, Cambridge
Van Der Maaten L, Postma E, Van den Herik J (2007) Dimensionality reduction: a comparative review. J Mach Learn Res 10(66–71):13
Warren VE, Marques TA, Harris D, Thomas L, Tyack PL, Aguilar de Soto N, Hickmott LS, Johnson MP (2017) Spatio-temporal variation in click production rates of beaked whales: implications for passive acoustic density estimation. J Acoust Soc Am 141(3):1962–1974
Wasserstein RL, Schirm AL, Lazar NA (2019) Moving to a world beyond “p<0.05”. Taylor & Francis, New York
Wilcox RR (2010) Fundamentals of modern statistical methods: substantially improving power and accuracy. Springer, New York
Williams PJ, Hooten MB (2016) Combining statistical inference and decisions in ecology. Ecol Appl 26:1930–1942
Rykiel EJ Jr (1996) Testing ecological models: the mean- Xie Y, Allaire JJ, Grolemund G (2020) R markdown: the
ing of validation. Ecol Model 90:229–244 definitive guide. CRC Press, Boca Raton, FL
Salkind NJ (2010) Encyclopedia of research design, vol Yoccoz NG (1991) Use, overuse, and misuse of signifi-
1. Sage, Thousand Oaks, CA cance tests in evolutionary biology and ecology. Bull
Shmueli G (2010) To explain or to predict? Stat Sci 25(3): Ecol Soc Am 72(2):106–111
289–310 Zimmer WM, Johnson MP, Madsen PT, Tyack PL (2005)
Stauffer HB (2007) Contemporary Bayesian and Echolocation clicks of free-ranging Cuvier’s beaked
frequentist statistical research methods for natural whales (Ziphius cavirostris). J Acoust Soc Am
resource scientists. Wiley, Hoboken, NJ 117(6):3919–3927
Stevenson BC, Borchers DL, Altwegg R, Swift RJ, Zuur A, Ieno EN, Smith GM (2007) Analyzing ecological
Gillespie DM, Measey GJ (2015) A general framework data. Springer, New York
for animal density estimation from acoustic detections Zuur A, Ieno EN, Walker N, Saveliev AA, Smith GM
across a fixed microphone array. Methods Ecol Evol (2009) Mixed effects models and extensions in ecology
6(1):38–48. https://doi.org/10.1111/2041-210X.12291 with R. Springer, New York
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license
and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder.
10 Behavioral and Physiological Audiometric Methods for Animals
Sandra L. McFadden, Andrea Megela Simmons, Christine Erbe, and Jeanette A. Thomas
animals can experience presbycusis (i.e., loss of hearing with age; Willott 1991; McFadden et al. 1997) and they can develop hearing loss if exposed to ototoxic drugs, such as aminoglycoside antibiotics or platinum-based anti-cancer medications (Henderson et al. 1999). Hearing loss in wildlife due to noise exposure is of increasing concern because of widespread noise sources associated with anthropogenic activities in the ocean and on land (see Chap. 13 on the effects of noise). Audiometric studies of animals can also contribute to the understanding and treatment of human hearing and hearing disorders. For example, the study of the genetic and biological bases of hearing disorders often involves audiometric testing of animals with induced genetic conditions (e.g., knockin and knockout mice in which an existing gene is replaced or disrupted with an artificial piece of DNA, thereby altering or eliminating its function), and pharmacological influences on human hearing are investigated in laboratory animals.

Audiometric studies have been conducted on many aquatic and terrestrial species, with the choice of species guided by availability and the particular questions (biological, medical, or evolutionary) that the experimenter poses. Hearing abilities have been studied extensively in traditional laboratory mammals (Fig. 10.1) including the house mouse (Mus musculus), chinchilla (Chinchilla lanigera), Mongolian gerbil (Meriones unguiculatus), guinea pig (Cavia porcellus), and laboratory rat (Rattus norvegicus). These species are easy to obtain, easily bred in the laboratory, and readily trained in conditioning procedures, and so have long served as models for both normal and impaired human hearing.

Audiometric studies have been conducted with many non-mammal species, including insects, amphibians, reptiles, fishes, and birds (see Volume 2). Many species are challenging to obtain, to house, and to train in a laboratory environment. For these reasons, behavioral audiograms are sometimes based on data from only one or very few animals, which limits the generalizability of the results. Further, hearing in some species is estimated by phonotaxis and evoked calling methods, which do not require training but which likely underestimate the animals’ true hearing sensitivity. Understanding the auditory capabilities of non-traditional species provides insight into how hearing has become adapted to the challenges that animals face in a variety of natural environments. Unfortunately, for the vast majority of species, and even major taxa, there are no audiometric data available.

10.2 What Is an Audiogram?

An audiogram is a graph of hearing threshold as a function of frequency (ANSI/ASA S3.20-2015; ISO 18405:2017).¹ Frequency refers to the rate of sinusoidal vibration of a pure tone (sine wave) in cycles/s (Hz). The hearing threshold of a listener is defined as the minimum stimulus level that evokes an auditory sensation in a specified fraction of trials at a given frequency. On an audiogram (Fig. 10.1), low threshold values correspond to high sensitivity to sound at that frequency and vice versa. The stimulus level is often a root-mean-square sound pressure level (SPL) expressed in dB with a reference of 20 μPa when testing in air or 1 μPa when testing under water; see Chap. 4, Introduction to Acoustics. The stimulus level may also be a root-mean-square sound particle velocity level (e.g., in the case of some fish audiograms) specified in dB re 1 nm/s. Because audiograms may be measured with signals other than pure tones (e.g., tone pips or clicks), signal type, threshold level, and reference value should be reported, along with the measured ambient noise levels. If the ambient noise is negligible, the hearing threshold is referred to as an unmasked threshold. If the ambient noise is high enough to raise the hearing threshold above its unmasked level, the hearing threshold is called a masked threshold (ISO 18405:2017).

¹ Acoustical Society of America, Standard Acoustical & Bioacoustical Terminology Database: https://asastandards.org/asa-standard-term-database/; accessed 5 January 2021.
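As a rough illustration of why the reference value must always be reported, the sketch below expresses one root-mean-square pressure as an SPL against both the in-air (20 μPa) and the underwater (1 μPa) reference named above. The function name `spl_db` and the example pressure are assumptions made for illustration only.

```python
import math

P_REF_AIR = 20e-6    # 20 μPa, reference pressure for in-air levels (Pa)
P_REF_WATER = 1e-6   # 1 μPa, reference pressure for underwater levels (Pa)

def spl_db(p_rms_pa, p_ref_pa):
    """Sound pressure level in dB for an RMS pressure given in pascals."""
    if p_rms_pa <= 0 or p_ref_pa <= 0:
        raise ValueError("pressures must be positive")
    return 20.0 * math.log10(p_rms_pa / p_ref_pa)

# Hypothetical RMS pressure of 2 mPa, expressed against both references.
p_rms = 2e-3
level_air = spl_db(p_rms, P_REF_AIR)      # dB re 20 μPa
level_water = spl_db(p_rms, P_REF_WATER)  # dB re 1 μPa
offset = level_water - level_air          # equals 20*log10(20), about 26 dB

print(f"{level_air:.1f} dB re 20 uPa")
print(f"{level_water:.1f} dB re 1 uPa")
print(f"offset {offset:.1f} dB")
```

The same pressure reads about 26 dB higher when referenced to 1 μPa than when referenced to 20 μPa. This sketch covers only the arithmetic of the reference value.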
Fig. 10.1 Left: Behavioral audiograms of rodents commonly used as laboratory animal models for hearing. Tones were presented through loudspeakers, and the animals’ conditioned responses measured. All of the audiograms are U-shaped, with frequencies of best sensitivity (tip of the audiogram, at the lowest sound pressure level) within the range of 4–16 kHz. These species differ considerably in the low-frequency limit of hearing, with the chinchilla being more sensitive to a broader range of low frequencies than the domestic mouse. Plots are averaged thresholds based on 50% correct detection. Data were collected by Heffner and Heffner (1991, from three chinchillas); Koay et al. (2002, from two domestic mice); Heffner et al. (1994, from four Norway rats); and Heffner et al. (1971, from four Mongolian gerbils). Right: The photo of a mouse participating in a behavioral hearing test is courtesy of Micheal Dent, University at Buffalo, The State University of New York (Screven and Dent 2019)
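A threshold "based on 50% correct detection" can be read off a psychometric function, that is, the proportion of detections at each tested stimulus level. The sketch below shows one simple way this might be done, by linear interpolation between the two tested levels that bracket the criterion. The function name, the data, and the choice of linear interpolation are illustrative assumptions, not the procedure used in the studies cited in the caption.

```python
def threshold_at_criterion(levels_db, p_detect, criterion=0.5):
    """Interpolate the level at which detection first reaches `criterion`.

    levels_db: tested stimulus levels in ascending order (dB).
    p_detect:  proportion of trials with a response at each level.
    Returns None if the criterion is never reached.
    """
    if len(levels_db) != len(p_detect):
        raise ValueError("levels_db and p_detect must have equal length")
    if p_detect and p_detect[0] >= criterion:
        # Criterion already met at the lowest tested level;
        # the true threshold may lie below the tested range.
        return float(levels_db[0])
    for i in range(1, len(levels_db)):
        lo_p, hi_p = p_detect[i - 1], p_detect[i]
        if lo_p < criterion <= hi_p:
            frac = (criterion - lo_p) / (hi_p - lo_p)
            return levels_db[i - 1] + frac * (levels_db[i] - levels_db[i - 1])
    return None

# Hypothetical data for one animal at one test frequency.
levels = [0, 5, 10, 15, 20]              # dB re 20 μPa
p_hit = [0.05, 0.20, 0.40, 0.80, 1.00]   # proportion of tone trials detected

print(threshold_at_criterion(levels, p_hit))  # about 11.25 dB
```

With these made-up numbers the criterion falls a quarter of the way between the 10 dB and 15 dB test levels. Other estimators, such as fitting a sigmoid to all points, would give somewhat different values, which is one reason the estimation method should be reported alongside the threshold.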
There are two general approaches to assessing the auditory thresholds of live animals: behavioral and physiological. The behavioral hearing threshold is the lowest level that evokes a behaviorally measurable auditory sensation in a specified fraction of trials (ISO 18405:2017). The pure-tone behavioral hearing threshold measurement procedure (prescribed in ANSI/ASA S3.21-2004) recommends that the behavioral hearing threshold be defined as the lowest input level at which responses occur in at least 50% of a series of ascending trials (i.e., trials in which signal level is systematically increased). The behavioral hearing threshold provides an integrated, whole-organism response to signal detection.

An electrophysiological hearing threshold is the lowest level that evokes a detectable and reproducible electrophysiological response (ISO 18405:2017). Both the ambient noise and the background electrophysiological noise levels should be reported. Electrophysiological noise is the non-acoustic self-noise arising from myogenic and neurogenic sources plus any artifact due to non-biological electrical interference. Electrophysiological hearing threshold estimates can be determined from different physiological processes (e.g., microphonic potentials, auditory brainstem response, cortical evoked responses), which characterize auditory processing at different levels of the auditory system. Various threshold estimation procedures also exist; each carries with it associated errors and assumptions, so the method for threshold estimation should be specified.

Electrophysiological methods are not equivalent to behavioral procedures, and electrophysiological hearing thresholds can differ from behavioral hearing thresholds (even for the same test animal). Within each of these two approaches, several methods can be employed, depending on the species being tested and the goals of the researcher. Behavioral techniques can be based on either unconditioned responses
that the animal makes spontaneously and as part of its natural repertoire, or conditioned responses that the animal is trained to make. Common physiological techniques measure otoacoustic emissions (OAEs; i.e., sounds generated by outer hair cells in the inner ear and measured using a very sensitive microphone) and auditory evoked potentials (AEPs; i.e., summed electrical responses of hair cells and auditory neurons recorded from electrodes). Results from behavioral and AEP experiments in the same species or even in the same animal can produce audiograms that are similar in shape and frequency range but may differ in absolute thresholds (see Sect. 10.4.3).

Audiograms in most species are typically U-shaped, but not symmetrical (Fig. 10.1). The frequency region of best sensitivity encompasses those sound frequencies at the trough of the U-shaped curve, where thresholds are lowest. The animal’s best hearing sensitivity (or lowest threshold) corresponds to the threshold range at the frequency region of best sensitivity. The range of hearing specifies the sound frequencies that are audible to an animal at some specified level (e.g., 60 dB) above the lowest threshold. The range of hearing for sounds at high sound levels is wider than the range of hearing for sounds at low sound levels because the audiogram is broad and U-shaped. The range of hearing should be expressed as between X Hz and Y Hz at Z dB above the best hearing sensitivity. Unfortunately, many publications do not include the number of decibels above the best hearing sensitivity when reporting the range of hearing for an animal or species, and they may not indicate whether the highest and lowest frequencies shown in an audiogram reflect the limits of testing or the limits of the animal’s hearing capabilities.

In terrestrial mammals, the main contributors to the U-shape of the audiogram and the location of the frequency of best sensitivity are the acoustic properties of the auditory periphery: the pinnae, external auditory meatus, and middle ear (Tonndorf 1976; Hellström 1995). The pinna serves to funnel sounds into the external auditory meatus (i.e., the ear canal), with sounds from some directions being amplified and those from other directions being attenuated. The external auditory meatus is an acoustic resonator that boosts the amplitude of received frequencies at and near its resonant frequency. The resonant frequency of the ear canal is inversely proportional to its length, so animals with short ear canals, such as mice, have their best hearing sensitivity at high frequencies, whereas animals with long ear canals, such as elephants, have their best hearing sensitivity at low frequencies. The resonant characteristics of the external auditory meatus, coupled with the sound transfer properties of the middle ear, help determine the acoustic energy levels reaching the inner ear.

Often, audiograms are incorrectly interpreted as illustrating hard thresholds to sounds, assuming that sounds at amplitudes just below the published audiogram are inaudible and sounds just above the audiogram are always audible. That is not the case. The faintest sound that an animal can hear depends on many factors, including stimulus characteristics (e.g., duration, repetition rate), environmental factors (e.g., ambient noise level, testing context such as anechoic chamber versus natural environment), and individual factors (e.g., health, response bias, attention, age). A given animal may show a loss of sensitivity due to aging, noise exposure, or exposure to ototoxic drugs, and even due to repeated or prolonged exposure to the stimulus during testing that leads to sensory adaptation and/or cognitive habituation. At high ambient noise levels or when additional sounds are present, an animal might lose the ability to hear a sound it previously heard in a quiet environment. This is because of masking, in which the presence of non-target sounds or noise decreases the detectability of the sound of interest.

Within a species, there can be significant individual differences in hearing sensitivity, which can reflect differences in attention to the task, age, health, and history of exposure to sounds, among other factors. Because there can be considerable variability among animals of a given species, it is important to test many animals when possible. Also, it is important to know when examining an audiogram whether the
curve is based on a single animal or a group of animals.

Audiograms from three beluga whales (Delphinapterus leucas) are shown in Fig. 10.2. From this graph, it can be seen that testing was conducted in water because the dB reference is 1 μPa, rather than 20 μPa for sounds presented in air (as in Fig. 10.1). In belugas, hearing sensitivity increased from low frequencies around 250 Hz to the best frequency range around 30 kHz (threshold around 37 dB re 1 μPa), and then decreased toward higher frequencies up to 120 kHz; this results in a U-shaped hearing curve. The range of hearing at 60 dB above lowest threshold extends from about 1 to 110 kHz.

Fig. 10.2 Left: Underwater behavioral audiograms of three beluga whales obtained at two different times 10 years apart. Data were obtained using an ascending Method of Limits (described in Sect. 10.3.3). The whales were trained to leave a station when they heard a tone and swim to the trainer for a food reward. Thresholds were defined as the tone level at which the whales detected the signal 50% of the time. The red triangles show the mean audiogram from one male and one female beluga whale reported by White et al. (1978). The arrow shows the most sensitive frequency at 30 kHz. The blue circles show averaged data from the same male and female and an additional juvenile male, obtained by Awbrey et al. (1988). The gray squares show the ambient noise level in the test pool, which was close to the measured thresholds at 4 and 8 kHz, indicating that the whales’ actual thresholds at these frequencies were likely lower than indicated on this graph. The gray dashed line is 60 dB above the lowest threshold at 30 kHz, where the range of hearing was measured. Right: Photo of two beluga whales at Vancouver Aquarium

10.3 Behavioral Methods for Audiometric Studies on Live Animals

Behavioral approaches can be divided into two general types, unconditioned response techniques and conditioned response techniques. Unconditioned response techniques are based on behaviors that the animal naturally makes to sound and are readily employed in the animal’s natural habitat. Animals must be trained to make conditioned responses, and this training should be based on the species’ typical behavioral repertoire. Klump et al. (1995) provide a full discussion of different methods used to study hearing sensitivity in animals.

For both techniques, establishing stimulus control over an animal’s behavior is crucial. A pure tone is typically the test signal, although broadband clicks and noises of varying bandwidths can be used, depending on the research question. How signals are generated and presented is extremely important to control and monitor. The sound may be delivered via a loudspeaker to animals that are ranging freely, confined to the experimental chamber, or trained to hold station (e.g., at a bite plate or in a hoop), or delivered via tubes, insert earphones, or headphones (Fig. 10.3). Stimuli can be presented
using several different protocols, each of which has its own assumptions and limitations. Ambient noise can influence thresholds and so must also be controlled. Ambient noise can be minimized if the animal is tested in an anechoic chamber or a sound-attenuating chamber (Fig. 10.4). If animals are tested in their natural environments where ambient noise levels cannot be controlled, researchers must take periodic measurements of the amount of ambient noise present during hearing tests.

Fig. 10.3 Photos of a budgerigar (Melopsittacus undulatus) wearing headphones during a sound localization experiment (left; Welch and Dent 2011) and receiving a reward during a frequency discrimination experiment (right; Dent et al. 2000). Courtesy of Micheal Dent, University at Buffalo, The State University of New York

10.3.1 Behavioral Methods Using Unconditioned Behaviors

10.3.1.1 Preyer Reflex and Acoustic Startle Response

The Preyer reflex and the acoustic startle response (ASR) are behaviors triggered automatically by unexpected, high-amplitude sounds. These are reflexive responses to sound that require no training of the animal and thus are relatively easy to implement. On the other hand, animals can habituate to repeated presentations of high-amplitude sounds that best evoke these reflexes. Thus, sound-evoked reflexes can be useful as fast and easy screening tests for bracketing an animal’s hearing abilities but are not good measures for determining absolute thresholds of hearing.

The Preyer reflex has been described as an orientation or attentional reflex (Jero et al. 2001). In mammalian species that are able to move their pinnae, it involves a quick retraction of the ears, a rapid twitch of the ears, or a change in orientation of the pinnae toward the source of the sound. In species with immobile pinnae, turning of the head toward the sound source (which brings the source of the sound into the animal’s line of vision) is the measure of orientation. In some studies, a trained observer simply rates the Preyer reflex as present or absent. The reflex also can be monitored using a motion-tracking camera system and reflective markers attached to each of the animal’s pinnae, as described in a study using the guinea pig (Berger et al. 2013). The magnitude and latency of the Preyer reflex can then be determined by measuring pinnae displacement during sound presentation.

The ASR is a whole-body response to unexpected sounds presented at very high amplitudes (typically above 90 dB re 20 μPa) and has been interpreted as a protective or alarm reflex. It can be elicited in a wide range of adult and developing vertebrates, including fishes and most mammals, and typically is quantified in terms of
response amplitude and response latency. In teleost fish, the ASR is called the tail-flip reflex or C-start response, and it involves an initial full flexion of the body followed by a weaker flexion in the opposite direction, so that the animal bends and swims away from the source of the stimulus. The response is mediated by the Mauthner cells, a pair of giant neurons located at the level of the auditory-vestibular nerve in the hindbrain. The Mauthner cells receive input from the auditory nerve and then send signals to motor neurons on the opposite side of the body, which then produce the behavioral response. The ASR in fishes can be measured by placing the animals in small acrylic plates filled with water and mounted on top of a vibration device that produces particle motion stimulation. A high-speed video camera is needed to visualize the C-start response (Bhandiwad and Sisneros 2016).

In small mammals such as rodents, the ASR consists of hunching of the shoulders, dorsiflexion of the neck, and rapid extension then flexion of the limbs. ASR in rodents is typically measured by placing the animal on a platform that measures displacement and force or acceleration caused by limb extension (Fig. 10.4). In primates, the ASR involves the reflex contraction of striate skeletal muscles,
primarily muscles of the face, neck, shoulders, and arms (Braff et al. 2001).

An animal that twitches its ears or startles repeatedly (e.g., in at least two out of three presentations) in response to finger snaps, hand claps or pure tones at different frequencies has demonstrated an ability to hear. At the same time, however, the presence of a startle response does not mean the animal has normal hearing. This was demonstrated clearly in a study of the sensitivity and specificity of the Preyer reflex by Jero et al. (2001). The researchers used hand claps or the metallic sound of two hammers hitting together to elicit startle responses from young adult albino laboratory mice of the FVB strain. They found that the reflex test was effective for identifying profound hearing loss, but was insensitive for identifying less severe hearing losses.

Reflex responses to sound can be used to show differences between groups of animals as a function of age or experimental treatment. Bhandiwad and Sisneros (2016) examined the development of hearing in two species of larval fishes, the three-spined stickleback (Gasterosteus aculeatus) and the zebrafish (Danio rerio), by quantifying the probability of a startle reflex in response to sounds of different frequencies at different ages post-fertilization. McFadden et al. (2010) showed declines in the amplitude and increases in the latency of the ASR with age in laboratory rats. Age-related changes in one or more of the components of the ASR circuit or in brain regions providing inhibitory input to this circuit can account for ASR changes observed in older animals and humans.

Startle responses also can be useful for determining the range of frequencies that an animal can hear. Bowles and Francine (1993) determined that kit foxes (Vulpes macrotis) have a functional hearing range from 1 to 20 kHz by observing startle responses of four wild-caught kit foxes to playbacks of tones of different frequencies. An additional advantage of startle reflex testing is that a group of animals can be tested simultaneously. Kastelein et al. (2008) determined the frequency range of hearing for eight species of marine fish by noting the frequencies at which 50% or more of the fish in a school reacted to the sound stimulus by increasing swimming speed and making tight turns. Disadvantages of using startle responses are that they require presentation of high-amplitude stimuli and they habituate quickly.

10.3.1.2 Prepulse Inhibition (PPI) and Reflex Modification

Although the ASR is a reflex that is not typically under voluntary control, it is sensitive to and can be modified by ongoing behaviors and attentional status of an animal. The ASR can be potentiated under some circumstances and attenuated or inhibited under others. Animals typically show larger ASRs when they are afraid or anxious than when they are not, so fear-potentiated startle paradigms commonly are used to study fear and anxiety states in animals. When an animal is processing another stimulus, such as a brief low-level sound or a puff of air or a flash of light, it will startle less to a sudden, loud sound than when it is not otherwise engaged. The ability of an auditory, tactile, or visual prepulse stimulus to reduce the amplitude of the ASR is termed prepulse inhibition (PPI).

Even an auditory prepulse stimulus near the hearing threshold of an animal can attenuate the ASR, and this makes the PPI paradigm suitable for testing threshold levels of sound and determining subtle effects of treatments on auditory function. PPI has been used to study the auditory sensitivity of fishes, frogs, and mammals (Fig. 10.5). In larval zebrafish, the probability of an ASR to a high-amplitude tone was reduced when the tone was preceded by other tones at sub-startle levels (Bhandiwad and Sisneros 2016). Thresholds obtained by PPI in this species were lower than thresholds obtained by using the ASR alone.

Reflexes other than acoustic startle responses can be modified by the prior presentation of a sound; these paradigms are termed reflex modifications (Hoffman and Ison 1980). Simmons and Moss (1995) adapted this paradigm to obtain audiograms for two species of frogs, the American bullfrog (Lithobates catesbeianus) and
the green treefrog (Dryophytes cinereus). Frogs were constrained inside a small dish (1–2 cm in diameter larger than the animal), which was then placed on top of a stabilimeter that picked up the frog’s movements within the dish. Two copper strips cemented to the side of the dish produced a mild electric shock that evoked small reflex contractions of the frog’s hind limbs. The reflex evoked by the electric shock was modified in strength by prepulses of pure tones, with the extent of modification varying with prepulse amplitude. At any given tone frequency, the amplitude of the prepulse producing 10% inhibition of the reflex response was defined as the threshold to that frequency. The magnitude of the reflex modification effect varied with the amplitude of the prepulse, but only when stimulation was spaced at intervals wide enough to avoid habituation.

Fig. 10.5 Schematic drawing of a setup used to study prepulse inhibition of the ASR in Mongolian gerbils. The top drawing shows a gerbil placed into an acrylic tube 10 cm in front of a loudspeaker. The force sensor under the acrylic tube monitors the gerbil’s movements. The C label shows the position of the stimulation/recording computer. Center drawing shows the timing of acoustic stimulation (dB) with the pre-stimulus (lower amplitude trace) preceding the startle-producing stimulus (higher amplitude trace). Bottom drawing shows the response measured by the force sensor. Here, the response occurs only to the stimulus and not to the pre-stimulus. After repeated pairings of the pre-stimulus and stimulus, the response to the stimulus declines (Walter et al. 2012). © Walter et al. 2012; https://www.scirp.org/journal/paperinformation.aspx?paperid=17796. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/

10.3.1.3 Phonotaxis

Some animals have a natural tendency to approach sound (positive phonotaxis) or make evasive movements away from sound (negative phonotaxis). Sounds that elicit positive phonotaxis include species advertisement calls (i.e., mating calls), while sounds that elicit negative phonotaxis include sounds made by predators. These natural behavioral responses to sound can be exploited to estimate hearing sensitivity in those species for which training procedures based on conditioned responses are extremely difficult to implement. Phonotaxis experiments are readily conducted in the animal’s habitat and so can provide crucial information on the acoustic features animals use to recognize conspecific (own species) vocal signals such as advertisement and aggressive calls. These kinds of field studies are particularly important for identifying the impact of the entire soundscape on sound detection and discrimination, and for assessing the effects of environmental variables, such as air temperature and humidity, on acoustic communication.

Phonotaxis has been especially useful for studying auditory capabilities of female orthopteran insects, frogs, and songbirds, because these animals naturally approach stationary calling males in order to mate with them. For example, gravid female frogs readily approach loudspeakers broadcasting sounds (tone bursts, amplitude-modulated tones, or frequency-modulated tones) which they recognize as components of the advertisement calls of males of their own species, or even a synthetic version of these conspecific calls (Gerhardt 1995). The sensitivity of females to these sounds is measured in experiments in which sounds of different levels, frequencies, or temporal patterning are broadcast from a loudspeaker, and the female’s approach to the loudspeaker is quantified. Sounds can be broadcast from one source (one-speaker design) to estimate sound detection or from two sources (choice or two-speaker design) to estimate sound discrimination. The researcher can obtain an estimate of the female’s relative sensitivity to sounds (if sound frequency is varied) or her ability to distinguish sounds of two intensities
(if sound level is varied). Responses are quantified in terms of the nearness and the path of the phonotactic approach, the latency of the response, and the presence of orientation movements, such as head-turning toward the sound source. Data are typically presented as the proportion of females responding to a particular stimulus as a function of whatever parameter is being varied, with the 50% correct point on the resulting function defined as the threshold in a one-choice experiment and the 75% correct point (midway between chance and perfect performance) defined as the threshold in a two-choice experiment (see Volume 2, Chap. 3 on amphibians).
Because most species of insects and frogs call at night, visualizing their movements in a phonotaxis experiment can be challenging. Figure 10.6 shows a new technique designed to monitor phonotactic movements of frogs in both the laboratory and the natural environment (Aihara et al. 2017). In this technique, a female Australian orange-eyed treefrog (Ranoidea chloris) wears a miniature LED backpack. A video camera records the energy emitted from the LEDs, thus allowing researchers to track the frog’s movements. Sounds are broadcast through multiple loudspeakers, and monitored by separate LED sound indication devices, each of which has a different pattern of illumination. In this way,
Fig. 10.6 (a) An image of a sound indication device that consists of a miniature microphone and a light-emitting diode (LED). The LED is illuminated when detecting sounds. (b) Photo of an orange-eyed female treefrog wearing a LED backpack. (c) Arena playback experiment. Two loudspeakers at each end of the arena present sounds. A sound indication device is placed in front of each loudspeaker. The female wearing the backpack is released from the middle of the arena. The lights emitted by the sound indication device and the LED backpack are recorded by a video camera. (d) Natural habitat of the orange-eyed treefrog. The position of the sound-indication device is shown (Aihara et al. 2017). © Aihara et al. 2017; https://www.nature.com/articles/s41598-017-11150-y. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
researchers can not only track the female’s movements but also which of several loudspeakers is playing the preferred sound.
There are limitations to the use and interpretation of phonotaxis data. Although phonotaxis experiments can tell us which sounds animals prefer and how sensitive they are to these sounds, they are not suitable for the compilation of entire audiograms or estimates of an animal’s entire range of hearing. When a female fails to approach a sound source, it may be because she does not hear it or because she does not recognize it as an advertisement call. Moreover, females of many species will show phonotaxis only when they are gravid. This limits the timespan during which experiments can be conducted, although phonotaxis can be induced by hormone injections (Gerhardt 1995). Male insects and frogs typically exhibit phonotaxis only in response to a high-amplitude sound resembling an advertisement call or an aggressive call from a rival male. Males treat aggressive calls from rivals as threats and respond aggressively, by approaching the source and attempting to engage it physically. Because males are less likely than females to approach sound sources, descriptions of their hearing sensitivity based on phonotaxis are not reliable.
10.3.1.4 Evoked Calling
Evoked calling is another method based on unconditioned responses that can be used to estimate hearing sensitivity and acoustic preferences. Males of some species (orthopteran insects, frogs, songbirds) vocalize in response to playbacks of signals resembling conspecific advertisement or aggressive calls. The male’s sensitivity to these playbacks can be estimated by lowering the amplitude of the signal until the male no longer vocalizes back. Varying the acoustic features (frequency, temporal patterning) of the signal can provide estimates of sensitivity to these particular features (Fay and Simmons 1999). Evoked calling experiments, like phonotaxis experiments, can be implemented either in the laboratory or in the field. As with the phonotaxis technique, the evoked calling technique does not measure audibility per se but can be useful for determining what acoustic features of communication signals are most important for mediating behavioral responses. Despite their limitations, phonotaxis and evoked calling techniques are useful because they provide insight into what sounds animals pay attention to in their natural environment and thus into perceptual decision-making in a biologically relevant context.
10.3.2 Behavioral Methods Using Conditioned Behaviors
10.3.2.1 Classical Conditioning
Classical conditioning techniques have been used to train several species of animals for audiometric studies. In classical conditioning, an unconditioned stimulus that naturally elicits an unconditioned response is paired with a conditioned stimulus. After a number of pairings of the conditioned stimulus with the unconditioned stimulus, presentation of the conditioned stimulus alone elicits a conditioned response that is the same as or similar to the unconditioned response.
Fay (1995) described the use of classical respiratory conditioning to estimate auditory thresholds in the goldfish (Carassius auratus). The goldfish was restrained in a cloth bag and submerged in a small tank. An underwater loudspeaker was placed on the bottom of the tank. A tone of a particular frequency was presented shortly before a brief electric shock (unconditioned stimulus) that produced an unconditioned suppression of the fish’s respiration. Changes in the amplitude and rate of the fish’s respiration were measured by a thermistor placed in front of the fish’s mouth. After multiple pairings of the tone and shock, presentation of the tone alone produced a conditioned suppression of respiration. By determining the amplitude level of the tone that no longer produced a conditioned response, the fish’s sensitivity to that tone frequency could be determined.
Ehret and Romand (1981) used both unconditioned and classically conditioned pinnae movements and eye-blink responses to track the postnatal development of auditory thresholds in domestic kittens (Felis catus). Unconditioned
movements of the pinnae and/or facial muscles in response to high-intensity tone bursts were observed in one group of kittens up to 12 days of age. A second group of kittens (aged 10 days to 1 month) was trained with tone-shock pairs to make conditioned movements of their eyelids and pinnae when they heard a sound. Ehret and Romand’s results showed that some kittens as young as 1–2 days of age were able to respond to some frequencies, and that sensitivity to low, mid, and high frequencies developed at different ages.
10.3.2.2 Operant Conditioning
There are many responses animals can make to indicate when sounds are heard (or not heard), such as touching a response paddle, pressing a lever with a nose or paw, lifting a paw, licking a tube from a water bottle, swimming across a barrier, or vocalizing. It is important to choose a response that is based on an animal’s natural behaviors and thus is easy to learn. Once the response is chosen, there are several behavioral methods that can be used to train animals to make the response when a sound is detected or refrain from the response when no stimulus is presented. These different paradigms have been implemented successfully with a large number of species, with modifications that take into account species-typical behaviors and habitats.
Operant conditioning techniques can use positive or negative reinforcement procedures for training or “shaping” a conditioned response. Positive reinforcement methods establish the behavior by providing a reward, such as food, water, or even verbal praise or tactile stimulation whenever the animal makes the appropriate response. Negative reinforcement methods remove an unpleasant or aversive stimulus (usually mild electric shock) whenever the animal makes the appropriate response. Methods can also be used to decrease unwanted or incorrect responses; these are termed punishment procedures. For example, a time-out period might be imposed (negative punishment) when an animal makes an incorrect response. After the desired behavior has been established through an appropriate schedule of reinforcement during a training phase, the animal is then tested using various frequencies and amplitudes of sound to determine the audiogram. Sometimes animals mistakenly respond when there is no signal present; this is a false alarm. Some animals are more inclined to make false alarms than others. To assess this bias, “catch trials” (i.e., control trials in which no signal is presented) are interspersed at random in the stimulus series. Some researchers also assess the animal’s attentiveness to a hearing task, for example by conducting a set of easily heard “warm-up trials” at the beginning of a session, and a set of easily heard “cool-down trials” at the end of a session. Criteria can be set such that if the animal’s performance does not reach a certain percent of correct responses during either the warm-up or the cool-down trials (e.g., 80%), testing is discontinued for that session or data from that session are eliminated.
In conditioned suppression/avoidance paradigms, an animal learns to suppress an ongoing behavior when it detects a sound that signals shock (Heffner and Heffner 2001). The shock levels used in these studies are kept low so that the animals do not become agitated or develop a fear of the test apparatus that would impair their performance. Heffner et al. (2014) used the conditioned suppression procedure to determine behavioral audiograms and sound localization abilities of three young male alpacas (Vicugna pacos). Thirsty alpacas were trained to break contact with a water spout when they heard a tone or noise signal (a conditioned stimulus) that warned of impending shock (unconditioned stimulus) and to resume drinking water following a safety signal. The safety signal for tone threshold testing was a shock indicator light that turned off when shock was terminated. Hit rates (measuring the percentage of correct detections of sound, indicated by breaking contact with the water bowl when the tone signal was present) and false alarm rates (measuring the percentage of false alarms, indicated by breaking contact with the water bowl when no tone was present) were determined for each stimulus intensity. The pure-tone thresholds of the three alpacas showed little variability among individuals. Indeed, Heffner and Heffner (2001) argued that individual variation among animals is less when using
conditioned suppression compared to methods based on positive reinforcement.
Another common technique based on positive reinforcement, used in many aquatic (Fig. 10.7) and terrestrial species, is a go/no-go response paradigm. Thomas et al. (1990) used this technique to measure the audiogram of a subadult male Hawaiian monk seal (Neomonachus schauinslandi). At the start of each trial, a trainer sent the seal, using a hand cue, to station under water with its chin resting on a headstand. If a tone was heard, the seal was expected to leave the station, touch a response paddle, and swim to the trainer for a fish reward (go response). If no tone was heard (either a control trial or an inaudible signal), the seal was supposed to stay at the station, wait for the trainer to give a release whistle, and then swim back to the trainer for a reward (no-go response). Half the trials were signal-present and half were signal-absent controls; the order of presentation of the trial types was pseudorandomized throughout a session so that the animal would adopt a neutral response bias. The trainer then called the seal back to the initial station with a whistle and the next trial commenced.
There are several drawbacks of behavioral audiometric studies based on conditioning procedures. Most notably, weeks or months may be required to train the animal to respond reliably. It is important to maintain the animal’s motivation to respond and attention to the task, both of which can wane if there are changes in the social environment, routine, or the animal’s health.
Because behavioral audiograms require a long period to train and test the animal, and since the number of individuals in captivity is limited for many species, in some marine mammals, hearing data are available for only a single animal. Hall and Johnson (1972) conducted a behavioral audiogram on a captive killer whale (Orcinus orca) and reported that this species had much worse high-frequency hearing than other toothed whales tested to that date. Later, Bain et al. (1993) conducted behavioral audiograms on five killer whales and found their hearing was very typical of other toothed whales. Upon investigation, the researchers found that the original test subject had been given high dosages of an ototoxic antibiotic. So, the first killer whale tested was likely hearing impaired as a result of antibiotic-induced death of hair cells in the high-frequency region of the cochlea. By now, another eight individuals have been tested, confirming more typical delphinid audiograms in killer whales (Branstetter et al. 2017).
10.3.3 Signal Presentation Paradigms for Behavioral Audiograms
There are three classic paradigms commonly used for signal presentation in behavioral audiogram
tests with animals (Levitt 1970; Klump et al. 1995): the Method of Constant Stimuli, the Method of Limits, and the Up/Down Staircase method (also called “adaptive tracking method”). One important factor to keep in mind when choosing a signal presentation paradigm is the time available for measuring thresholds, as there is a trade-off between the number of trials and the accuracy and reliability of hearing-threshold measurements.
10.3.3.1 Method of Constant Stimuli
The Method of Constant Stimuli provides the greatest accuracy and reliability for threshold measurements. In this paradigm, the animal is tested at one frequency in a session with blocks of trials having an equal number of different signal levels ranging from very low to very high amplitude (i.e., no silent controls), presented in random order. The animal makes a response when a signal is heard, and the results for each signal presentation (“Yes” the tone was heard or “No” the tone was not heard) are tallied by amplitude levels (Fig. 10.8 left panel). After all responses are tallied, a psychometric function is plotted, i.e., the animal’s responses (typically the percentage of “Yes” responses) versus amplitude level (Fig. 10.8 right panel). The threshold level is determined (often by interpolation) as the level at which the animal indicated it heard the signal on 50% of the trials.
The stimulus presentation levels cover a wide range that brackets the animal’s threshold, so additional points on the psychometric function can be estimated. Randomized presentation of stimuli prevents the animal from anticipating the stimulus level on the next trial. Many of the stimulus levels are well above threshold, so the animal is not required to make difficult detections on every trial. On the other hand, the method is time-consuming, and the choice of stimulus levels to present requires some prior knowledge of likely thresholds at a specific frequency.
10.3.3.2 Method of Limits
The Method of Limits involves the presentation of stimuli in small steps (typically 2 to 5 dB) over a fixed range of stimulus levels. At each level, the experimenter records whether the animal responded to the test tone or not (Fig. 10.9). Stimuli may be presented in an ascending series, from the lowest amplitude to the highest, or in a descending series, from the highest amplitude to the lowest. Multiple runs are conducted, and for each run, the crossover level (i.e., the level halfway between the stimulus level not heard and the
Fig. 10.8 Illustration of the Method of Constant Stimuli. Left panel: Fifty stimuli were presented at each of nine stimulus levels (450 trials total). The number of times the subject indicated that the stimulus was heard at each level was tallied in the Number column and converted to a percentage in the Percent column. At stimulus levels below threshold, the subject rarely responded, whereas at the highest stimulus levels, the subject reported detection on all 50 trials (100%). Right panel: Data from the tallies chart were used to plot a psychometric function, showing performance as a function of stimulus level. Threshold, defined as the stimulus level at which the subject made a detection response on 50% of the trials, was interpolated to be 5.2 in this example
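The tally-then-interpolate procedure of the Method of Constant Stimuli can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tallies below are invented (they are not the data of Fig. 10.8), the function names are hypothetical, and simple linear interpolation between the two bracketing levels is one of several ways a threshold could be read off a psychometric function.

```python
def percent_yes(tallies, trials_per_level):
    """Convert the number of "Yes" responses per stimulus level into percentages."""
    return {level: 100.0 * n / trials_per_level for level, n in tallies.items()}


def interpolate_threshold(psychometric, criterion=50.0):
    """Linearly interpolate the level at which the percent-"Yes" curve first
    rises through `criterion`. Returns None if the curve never does so."""
    points = sorted(psychometric.items())
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical tallies: "Yes" responses out of 50 trials at nine levels.
tallies = {1: 1, 2: 2, 3: 5, 4: 10, 5: 20, 6: 30, 7: 42, 8: 48, 9: 50}
curve = percent_yes(tallies, trials_per_level=50)

threshold_50 = interpolate_threshold(curve)          # 50% criterion
threshold_75 = interpolate_threshold(curve, 75.0)    # e.g., a two-choice criterion
print(threshold_50, threshold_75)
```

The same helper accepts other criteria, such as the 75% point used for two-choice designs, simply by changing `criterion`.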
Fig. 10.10 Example of “bracketing” a hearing threshold using the Up/Down Staircase method (Modified Method of Limits). The first signal was presented at a level that the subject easily heard (“Yes” at 40 dB re 20 μPa). Signal level was then decreased in 5-dB steps until the subject no longer signaled detection (“No” at 25 dB re 20 μPa). The change of response from “Yes” to “No” triggered the first reversal, from a descending series to an ascending one. Thereafter, each change of response triggered an immediate reversal. Signals were presented at random intervals to prevent the subject from developing a response bias based on timing. In this example, the predetermined criterion for threshold was the lowest signal level with three “Yes” responses on ascending trials (circled responses), so 30 dB re 20 μPa was the threshold for this frequency. Testing at this frequency terminated when the criterion for threshold was met
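The bracketing logic described in the caption of Fig. 10.10 can be sketched as a small simulation. The following Python code is a simplified illustration, not the procedure of any particular study: the function name, the idealized subject that always responds "Yes" at or above 30 dB, and the rule of stopping at the first level that collects three "Yes" responses on ascending trials are all assumptions made for the example.

```python
from collections import defaultdict


def run_staircase(hears, start_level=40, step=5, yes_needed=3, max_trials=100):
    """Simple up/down staircase.

    `hears(level)` returns True for a "Yes" response. The level drops by
    `step` after a "Yes" and rises by `step` after a "No". Testing stops at
    the first level that collects `yes_needed` "Yes" responses on ascending
    trials (trials presented one step above the preceding trial).
    Returns (threshold, history); threshold is None if no level qualified.
    """
    level = start_level
    ascending = False                  # the first trial is not an ascending trial
    yes_on_ascent = defaultdict(int)
    history = []
    for _ in range(max_trials):
        response = hears(level)
        history.append((level, response))
        if response and ascending:
            yes_on_ascent[level] += 1
            if yes_on_ascent[level] >= yes_needed:
                return level, history
        # Next trial: step down after "Yes", step up after "No".
        ascending = not response
        level += -step if response else step
    return None, history


# Idealized (hypothetical) subject that detects every signal at or above 30 dB.
threshold, history = run_staircase(lambda level: level >= 30)
print(threshold)
for level, response in history:
    print(level, "Yes" if response else "No")
```

With a real subject the responses near threshold are probabilistic, so the track wanders across several levels before a criterion is met; the deterministic subject here only makes the reversal logic easy to follow.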
However, receiving a reward for both correct responses to signal and silent control trials helps reduce negative effects. The major advantage of the adaptive tracking method over the Method of Constant Stimuli and the Method of Limits is that fewer trials need to be conducted, resulting in a shorter test session for both the researcher and the animal subject.
10.3.4 Receiver Operating Characteristic (ROC) Curves
broadcast), (3) false alarm (i.e., responding that a signal is present when it is not, or indicating “yes” before the signal is broadcast), and (4) missed detection or miss (i.e., responding that a signal is absent when a signal is broadcast or failing to respond). The four response choices of an animal in a behavioral hearing test are illustrated in Fig. 10.11.
Response bias can be disentangled from sensory capabilities by constructing a Receiver Operating Characteristic (ROC) curve (Green and Swets 1966). Upon signal presentation, the
animal can respond either “yes” or “no” and so the probability of correct detection, P(CD), and the probability of missed detection, P(MD), add to 1: P(CD) + P(MD) = 1. Similarly, in the case of no signal presented, the probabilities of false alarm, P(FA), and correct rejection, P(CR), add to 1: P(FA) + P(CR) = 1. In other words, the probabilities computed from the animal responses in Fig. 10.11 are not all independent. In the ROC plot, therefore, two independent probabilities are plotted against each other: P(CD) versus P(FA). As illustrated in Fig. 10.12a, the major diagonal line marks all the points at which P(CD) = P(FA), which would be expected if the subject were making random choices or simply guessing. Below this line, the animal would perform worse than by chance; i.e., the animal would be making deliberate mistakes. The minor diagonal corresponds to P(CD) + P(FA) = 1 and so represents neutral response bias, with responses falling to the left of the line indicating a conservative response bias (i.e., low false alarm probability) and to the right a liberal response bias (i.e., high false alarm probability). The best possible performance is at the point (0|1), i.e., P(FA) = 0 and P(CD) = 1, where the animal detects all signals and does not report any false alarms. Actual results from a beluga whale (Fig. 10.12b) detecting played-back beluga calls in icebreaker noise are shown in Fig. 10.12c. At decreasing signal-to-noise ratio (from 0 to −30 dB), the animal’s hit rate decreased (i.e., decreasing P(CD)). False alarms were only made at low signal-to-noise ratio (−24 dB), indicating an overall conservative response bias. Data are based on the study by Erbe and Farmer (1998); see Fig. 10.7 for a photo of the training setup.
The bias of the animal in these hearing tests can be manipulated by changing the reinforcement regimen. If the possible responses from Fig. 10.11 are differently rewarded (e.g., positive reinforcement for the two correct responses and negative reinforcement for the two false responses), then the animal will aim to maximize the percentage of correct responses. If the four responses are all differently rewarded, then the perceived values and risks will influence the animal’s response. For example, in a study with an Arctic fox (Vulpes lagopus; Stansbury et al. 2014), correct detections and correct rejections were rewarded with 3–4 pieces of kibble. When the animal missed a signal, it was rewarded with 1 piece of kibble. False alarms resulted in a 2–3 s time-out, after which the animal was restationed for the next trial. By rewarding misses (i.e., one of the two false responses) and with only false alarms receiving no food but instead a time-out, the animal was conditioned to avoid false alarms
Fig. 10.12 (a) Receiver Operating Characteristic (ROC) plot showing the lines and areas relating the probability of correct detection, P(CD), and the probability of false alarm, P(FA). (b) Photo of a beluga whale at Vancouver Aquarium. (c) ROC plot of this animal’s performance when presented with a beluga call mixed into icebreaker noise at signal-to-noise ratios of 0, −6, −12, −18, −24, and −30 dB. The animal was trained to indicate whenever it heard the call in the noise. The animal’s performance decreased with decreasing signal-to-noise ratio. The animal adopted a very conservative response bias (Erbe and Farmer 1998)
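The relationships among the four response probabilities, and the reading of an ROC point against the two diagonals, can be illustrated with a short Python sketch. The response counts are hypothetical (they are not data from Erbe and Farmer 1998), and the function names are invented for this example.

```python
def roc_point(hits, misses, false_alarms, correct_rejections):
    """Return (P(FA), P(CD)) from the four response counts.

    P(CD) is computed over signal-present trials and P(FA) over
    signal-absent trials, so P(MD) = 1 - P(CD) and P(CR) = 1 - P(FA).
    """
    p_cd = hits / (hits + misses)
    p_fa = false_alarms / (false_alarms + correct_rejections)
    return p_fa, p_cd


def describe(p_fa, p_cd, tol=1e-9):
    """Locate an ROC point relative to the major and minor diagonals."""
    if p_cd > p_fa + tol:
        performance = "better than chance"
    elif p_cd < p_fa - tol:
        performance = "worse than chance"
    else:
        performance = "chance"
    total = p_cd + p_fa            # equals 1 on the minor diagonal
    if total < 1 - tol:
        bias = "conservative"
    elif total > 1 + tol:
        bias = "liberal"
    else:
        bias = "neutral"
    return performance, bias


# Hypothetical session: 20 signal-present and 20 signal-absent trials.
p_fa, p_cd = roc_point(hits=12, misses=8, false_alarms=1, correct_rejections=19)
print(p_fa, p_cd, describe(p_fa, p_cd))
```

A point with few false alarms and a moderate hit rate, as in this made-up session, falls to the left of the minor diagonal and is therefore classed as conservative.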
but accept misses. The reinforcement regimen directly influenced the animal’s conservative bias. Similar conditioning likely happened with the beluga whale (Erbe and Farmer 1998). After the animal stationed, a sound was played randomly within a 30-s period. The animal indicated a detection (of the beluga call mixed into icebreaker noise) by breaking from the station. If the animal did not detect a call, it held station for the full 30 s. Correct detections were rewarded with fish within 2 s. False alarms received a time-out. A “no” response received a delayed (by up to 30 s) fish reward; these would have been correct rejections (i.e., signal absent trials) and missed detections (i.e., signal present trials, but under the assumption that the signal was too quiet to be detected). Effectively, the animal thus also received a reward (albeit delayed) for missed detections, even if the signal was above threshold on some of the trials. Not knowing in advance what the animal’s hearing threshold is, it is impossible to tell whether the animal truly did not hear the signal when it indicated “no” to a low-level signal-present trial.
An even greater benefit of ROC analysis is realized by measuring actual ROC curves (rather than settling for scatter plots of data as in Fig. 10.12c). To do that, the animal’s bias needs to be actively manipulated using reinforcement. For example, the beluga experiment could be redone with the same animal, but instead of rewarding both correct responses with one fish, the animal might be given 3 fish for a correct detection and only 1 fish for a correct rejection. The animal might begin to favor the “yes” response, exhibiting a more liberal response bias. So, rather than having just one data point at, say, −12 dB signal-to-noise ratio, we would get a curve for −12 dB, with the points along the curve corresponding to the same sensitivity (hence also called isosensitivity curve) but to different biases, which were driven by the different reinforcement regimens. This is exactly what was done by Schusterman et al. (1975) with a California sea lion (Zalophus californianus) and a bottlenose dolphin (Tursiops truncatus), yielding actual ROC curves. Other ways of actively changing the bias include changing the percentage of catch trials (whereby fewer catch trials render the animal more liberal; Schusterman and Johnson 1975) or even changing the probability of handing out a reward (i.e., not all correct trials are rewarded all the time; Schusterman 1976). The resulting ROC curves then allow the separation of the animal’s actual sensitivity from its bias (Green and Swets 1966; Au 1993), but much more experimental time is needed to collect all these data.
10.4 Physiological Methods for Audiometric Studies on Live Animals
Behavioral tests of hearing can be too time-consuming to conduct, too difficult to employ because of animals’ limitations in learning or performing a behavioral task, or impractical for some other reason such as animal health, disposition, or developmental status. Physiological methods offer a practical, complementary approach because they do not require training the animal and they can be completed in a relatively short period of time. However, because physiological methods do not require a behavioral response from the animal that indicates the sound was perceived, they are considered to be tests of “auditory function” rather than “hearing” per se. The relationship between behavioral and physiological measures of hearing is discussed later in this chapter.
As in behavioral studies, physiological studies test responses to different kinds of acoustic stimulation and must take into account ambient noise that can affect thresholds. Other factors to consider in physiological studies are body temperature and whether or not the animal is anesthetized, because these factors can affect neural thresholds, amplitudes, and latencies. Anesthesia is commonly used in physiological studies because it is difficult to keep an unanesthetized animal in a fixed position in a sound field during testing and physical restraint can be stressful. However, anesthesia can affect brain activity and severely diminish or abolish neural responses to sound (Cui et al. 2017; Kiebel et al. 2012; McFadden and Kiebel 2013; Fig. 10.13). Anesthesia can also
Fig. 10.13 Top: Testing apparatus devised by Kiebel et al. (2012) for recording auditory evoked potentials from awake mice. The mice were placed on a platform (i.e., an inverted jar about 3″ in diameter) in a plastic tub containing warm water in a recording chamber. Mice were acclimated to the apparatus in daily 10-min sessions for 1–2 days prior to the first recording session. Typically, a mouse placed on the platform for the first time would enter the water and after a brief period of swimming, would climb back on the platform and remain there until removed by the researcher. In subsequent sessions, the mouse typically remained on the platform for the entire testing session (30–45 min). Stimuli were delivered from a headphone speaker placed 7″ above the animal’s head. A computer-controlled camera was used to monitor the mouse, and recording was manually paused when the animal groomed or became active. Bottom: Auditory evoked responses recorded from a mouse while it was awake and then again after it had been anesthetized. The waveforms are responses to 12 kHz tones at 90 dB re 20 μPa, averaged across 100 artifact-free trials in each condition
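Averaging across artifact-free trials, as mentioned in the caption of Fig. 10.13, can be sketched as follows. This Python example is a generic illustration and not the processing used by Kiebel et al. (2012): the waveform, the noise level, the simulated movement artifacts, and the amplitude-based rejection criterion are all assumptions chosen to show why averaging helps.

```python
import math
import random


def average_artifact_free(epochs, reject_above):
    """Average equal-length epochs sample by sample, skipping any epoch whose
    absolute amplitude exceeds `reject_above`. Returns (average, n_kept)."""
    kept = [e for e in epochs if max(abs(s) for s in e) <= reject_above]
    if not kept:
        raise ValueError("all epochs were rejected")
    n_kept = len(kept)
    average = [sum(samples) / n_kept for samples in zip(*kept)]
    return average, n_kept


# Synthetic demonstration (all values are made up): a small evoked waveform
# buried in Gaussian noise, with large artifacts added to five epochs.
rng = random.Random(0)
template = [math.sin(2 * math.pi * i / 32) for i in range(64)]
epochs = [[s + rng.gauss(0, 2.0) for s in template] for _ in range(100)]
for epoch in epochs[:5]:
    epoch[10] += 50.0                  # simulated movement artifact

average, n_kept = average_artifact_free(epochs, reject_above=20.0)

# Root-mean-square difference between the average and the true waveform.
rms_error = math.sqrt(
    sum((a - t) ** 2 for a, t in zip(average, template)) / len(template)
)
print(n_kept, rms_error)
```

Because the noise is independent from trial to trial while the evoked response is time-locked to the stimulus, the residual noise in the average shrinks roughly with the square root of the number of epochs kept.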
including frogs, lizards, birds, and mammals (Manley 2001). DPOAEs are abolished by loss or dysfunction of outer hair cells, and also by middle ear dysfunction that prevents retrograde transmission of acoustic energy from the cochlea to the ear canal. It is important to recognize, however, that the absence of OAEs is not necessarily evidence of outer hair cell dysfunction, because OAEs are not recordable from all normal ears. The technique is not very useful for pinnipeds because their stapedial reflex shuts down the auditory meatus as an adaptation for diving.
DPOAE tests in mammals typically use a probe assembly that is inserted into the external auditory meatus to form a closed acoustic system. For animals lacking ear canals (e.g., fishes, frogs, reptiles, and birds), the probe tip is placed inside a plastic tube that is then coupled to the animal’s ear using silicone grease or Vaseline to seal any gaps (Bergevin et al. 2008). The probe tip contains a very sensitive external microphone and tubes from two external sound sources (Fig. 10.14). Two primary test tones, f1 and a higher frequency tone f2, are generated by separate channels of a sound-generating system and presented through the sound tubes, and the sound in the ear canal is sampled by the microphone for a fixed period of time. The output of the microphone is filtered, digitized, averaged over a number of trials, and then analyzed using a computerized signal-analysis system. A normal inner ear will generate several nonlinear distortion products that will be propagated in a reverse direction back through the middle ear and into the ear canal (when present). When this occurs, spectrum analysis of the sound recorded by the microphone will show not only the original f1 and f2 tones that were delivered to the ear, but also several new tones that were generated as nonlinear distortion products. The largest distortion product is the cubic DPOAE, with a frequency equal to 2f1 − f2. For example, if f1 = 1000 Hz and f2 = 1200 Hz, then the cochlea will generate a cubic DPOAE at 800 Hz. Because 2f1 − f2 is the largest DPOAE produced (typically 30–40 dB below the level of the primary tones) and is less variable than other distortion products, it is typically the only one reported in animal studies.
The frequency ratio f2:f1 of the primary tones, the level of the higher-frequency primary tone L2, and the difference between the levels of the two primary tones L1 − L2 are selected to maximize the amplitude of the cubic DPOAE in the ear canal. These parameters are species-specific and must be determined empirically. For all combinations of stimulus parameters (f2:f1, L2, and L1 − L2), the amplitude of the cubic DPOAE increases as the level of the primary tones increases until it saturates. DPOAEs can be difficult to measure at low frequencies due to masking by low-frequency ambient sounds in the ear canal (i.e., high noise-floor levels occur at low frequencies). But it is possible to measure low-frequency DPOAEs if great care is taken to ensure deep insertion and a good seal of the probe assembly in the ear canal.
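As a small worked illustration of the stimulus arithmetic, the cubic distortion product frequency, 2f1 − f2, and the stimulus parameters named in the text (the ratio f2:f1 and the level difference L1 − L2) can be computed as follows. The function names are hypothetical, and the primary levels of 65 and 55 dB in the example are arbitrary placeholders rather than recommended settings, since suitable values are species-specific.

```python
def cubic_dpoae_frequency(f1_hz, f2_hz):
    """Frequency of the cubic distortion product, 2*f1 - f2, for primaries f1 < f2."""
    if not 0 < f1_hz < f2_hz:
        raise ValueError("expected 0 < f1 < f2")
    return 2 * f1_hz - f2_hz


def stimulus_parameters(f1_hz, f2_hz, l1_db, l2_db):
    """Collect the stimulus parameters named in the text for one primary pair."""
    return {
        "f2:f1": f2_hz / f1_hz,
        "L2": l2_db,
        "L1 - L2": l1_db - l2_db,
        "2f1 - f2": cubic_dpoae_frequency(f1_hz, f2_hz),
    }


# Example from the text: f1 = 1000 Hz and f2 = 1200 Hz give a DPOAE at 800 Hz.
# The levels (65 and 55 dB) are arbitrary placeholders.
params = stimulus_parameters(1000, 1200, l1_db=65, l2_db=55)
print(params)
```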
Fig. 10.14 A commercially available low-noise microphone with two external sound sources. The probe tip containing the microphone and sound tubes is covered with a foam or plastic ear tip and inserted into the ear canal to form a closed acoustic system. For animals without ear canals, the probe can be inserted into a plastic tube that is then sealed in place against the ear of the animal
Shaffer and Long (2004) measured low-frequency DPOAEs in two species of kangaroo rats to test the hypothesis that a large foot-drumming species (Dipodomys spectabilis) has better low-frequency sensitivity than a small foot-drumming species (D. merriami). In both species, DPOAEs were generated at low
frequencies between 225 and 900 Hz. DPOAE amplitudes were greater in the larger kangaroo rat species compared to the smaller species. Additionally, the authors found good correspondence between DPOAE amplitudes, behavioral hearing thresholds, and electrophysiological hearing thresholds in D. merriami. This suggests that DPOAE amplitudes are good estimates of hearing sensitivity.
10.4.2 Auditory Evoked-Potential and Auditory Brainstem Response Methods
Auditory evoked-potential (AEP) methods record stimulus-evoked electrical activity at various levels of the auditory nervous system. Hair cells and neurons in the auditory system function by generating electrical potentials in response to sounds, and measurements of these stimulus-evoked potentials can provide information about the functional state of the inner ear, auditory nerve, central auditory nuclei, and their fiber pathways (Salvi et al. 2000; McFadden 2007).
There are many ways of classifying AEPs. Common classifications are based on: (1) the region involved in the generation of the response (e.g., cochlea, brainstem, thalamus, or cortex), (2) the latency of the response (i.e., short-, middle-, and long-latency potentials reflecting generation by neural elements at progressively higher regions of the auditory system), (3) electrode
Electrical potentials generated by the cochlea and auditory nerve include the cochlear microphonic potential (CM potential) generated by outer hair cells, the summating potential (SP) generated primarily by inner hair cells, and the compound action potential (CAP) generated by the synchronous depolarization of auditory nerve fibers. AEPs generated by the auditory nerve and neurons in the auditory brainstem (i.e., cochlear nucleus, superior olive, lateral lemniscus, and inferior colliculus) contribute to the short-latency scalp-recorded auditory brainstem response (ABR). AEPs recorded from electrodes implanted into the auditory midbrain of mammals are referred to as inferior colliculus evoked potentials (IC-EVPs). AEPs generated by forebrain regions (thalamus and cortex) include long-latency potentials recorded from electrodes implanted into the brain or from surface electrodes.
AEP methods share a number of common procedures. Stimuli can be presented using the same paradigms discussed in Sect. 10.3.3 (Method of Constant Stimuli, Method of Limits, Up/Down Staircase method) with the criterion for threshold being an electrophysiological, rather than a behavioral, response. Responses are recorded and averaged over a number of trials (e.g., 50–2000 trials); the number of trials depends on the size of the response relative to background electrical noise (i.e., the signal-to-noise ratio). They are typically quantified in terms of response amplitude (e.g., peak-to-peak
placement (invasive near-field recordings made voltage or peak voltage relative to a baseline
with an electrode inserted into an auditory voltage level) and latency (i.e., the lag-time
nucleus versus noninvasive far-field recordings between the onset of the stimulus and a defined
made from electrodes placed on the scalp), portion of the response). Threshold is variously
(4) the type of electrode used (high-impedance defined as the lowest stimulus level that elicits a
microelectrodes for recording potentials from detectable physiological response, the lowest
individual cells versus low-impedance surface or level at which a peak replicates, the midpoint
needle electrodes for recording activity from large between the level at which a response replicates
groups of neurons from the scalp), and (5) the size and the next lower level at which it does not, or
of the cellular population contributing to the the sound pressure level at which the amplitude of
response (e.g., local field potentials reflecting a particular peak reaches a criterion voltage level.
the extracellular electrical activity of a discrete Other parameters that are commonly measured
group of neurons versus gross potentials from AEP waveforms include peak amplitudes,
generated by large populations of cells such as peak latencies, and in the case of the ABR, inter-
those recorded from scalp electrodes). peak intervals (i.e., time between different peaks,
reflecting neural conduction time). Results are summarized as input-output functions that show response magnitude or latency as a function of stimulus level, or as an audiogram, showing threshold as a function of stimulus frequency.
Because the ABR is an onset response that requires synchronous activity of an ensemble of neural elements, stimuli with very short rise/fall times are most effective. Clicks, which are brief (e.g., 5–100 μs) and therefore spectrally broad, often are used as stimuli, particularly for screening of auditory function. Pure tones with a rapid onset are preferred when more frequency-specific information is required, as for testing the frequency range of hearing. Sinusoidally amplitude-modulated tones provide even greater frequency specificity.
At high stimulus levels that are clearly audible to an animal, several characteristic peaks are typically present in the response waveform, with latencies that correspond to their progressively higher anatomical sites of generation. ABRs from mammals typically have five prominent peaks (Fig. 10.15). The first peak of the waveform has a cochlear origin, reflecting the summed synchronous neural activity from the peripheral portion of the auditory nerve, and the second peak most likely reflects neural activity from the central portion of the auditory nerve at the level of the cochlear nucleus. Subsequent peaks are generated by brainstem regions between the cochlear nucleus and the lateral lemniscus or inferior colliculus. In all species studied, peak amplitudes of the ABR increase and latencies decrease as the stimulus level increases (Fig. 10.15). The rate of stimulus presentation can influence response amplitudes and thresholds. Data acquisition time is shortened by using a rapid signal presentation rate, but there is a cost in terms of response size, with high signal rates resulting in decreased peak amplitudes in the response waveform and increased response latencies.
Fig. 10.15 Left: Photo of a squirrelfish (Sargocentron sp.) with subcutaneous electrodes about to undergo ABR testing. Photo courtesy of Rob McCauley, Centre for Marine Science and Technology, Curtin University. Right: ABR waveforms obtained from an anesthetized C57BL/6J mouse. Needle electrodes (pictured at top left) were inserted under the skin at the top of the head (active), behind the right ear (reference), and at the base of the tail (ground). Two waveforms were collected at each stimulus level, in 5-dB steps from 90 to 55 dB re 20 μPa. Threshold, defined as the lowest level with a repeatable response, was 65 dB re 20 μPa for this frequency. The first two peaks of the ABR (short bracket) show activity from the auditory nerve, whereas the subsequent peaks (long bracket) arise from successively more rostral regions of the central auditory nervous system. Note the decrease in peak amplitude and increase in peak latency with decreasing stimulus level, typical of ABR waveforms
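Two of the procedures described in this section lend themselves to a short sketch: averaging responses over many trials, and reading a threshold off a level series such as the one in Fig. 10.15. The code below is a minimal illustration under stated assumptions, not the method of any cited study. It assumes independent Gaussian background noise (so the noise left after averaging falls roughly with the square root of the number of trials) and a response that replicates at every level at or above threshold. All function names are hypothetical.

```python
import math
import random


def average_trials(trials):
    """Point-by-point mean of equal-length single-trial recordings."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def rms(x):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def simulate_residual_noise(n_trials, n_samples=100, noise_sd=1.0, seed=0):
    """RMS of what remains after averaging n_trials noise-only sweeps.

    Assumes independent Gaussian noise; real recordings may deviate.
    """
    rng = random.Random(seed)
    trials = [[rng.gauss(0.0, noise_sd) for _ in range(n_samples)]
              for _ in range(n_trials)]
    return rms(average_trials(trials))


def thresholds(levels_db, replicated):
    """Return (lowest replicating level, midpoint to the next lower level)."""
    pairs = sorted(zip(levels_db, replicated))
    hits = [lvl for lvl, ok in pairs if ok]
    if not hits:
        return None, None
    lowest = hits[0]
    lower = [lvl for lvl, _ in pairs if lvl < lowest]
    midpoint = (lowest + max(lower)) / 2 if lower else None
    return lowest, midpoint


# Level series of Fig. 10.15: 90 to 55 dB in 5-dB steps, repeatable down to 65 dB
levels = list(range(90, 50, -5))
replicated = [lvl >= 65 for lvl in levels]
print(thresholds(levels, replicated))

# Residual noise shrinks as more trials are averaged
for n in (50, 500, 2000):
    print(n, round(simulate_residual_noise(n), 3))
```

For the Fig. 10.15 series, the two criteria give 65 dB and 62.5 dB, which illustrates how the choice of threshold definition alone can shift a reported value by half a step.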
Preparation of animals for ABR testing is minimal. Typically, the animal is restrained, sedated, or anesthetized to keep it still during the recording session. Aquatic animals under human care can be trained to remain still at a station (e.g., in a hoop) and are maintained at a good ambient water temperature in a pool. Terrestrial animals are placed on a heating pad to maintain normal body temperature. Electrodes for recording electrical activity are then applied. For most animals, the electrodes are low-impedance needle electrodes that are inserted under the skin; however, other types of electrodes, such as surface electrodes and suction-cup electrodes that attach to the surface of the head (Fig. 10.16), are suitable as well. One electrode, termed the active, non-inverting, or positive electrode, is placed at the vertex (upper surface of the head, along the midline, and between the ears) and another, termed the reference, inverting, or negative electrode, is placed behind the pinna or in another relatively neutral region of the head. A third electrode, which serves as a ground, is placed in the pool water or in a non-neural site on the animal (e.g., beneath the skin of the neck, back, or leg).
One advantage of ABR testing is that it requires less time to collect a complete set of data (often 1 h or less to obtain a complete audiogram from an anesthetized animal), as compared to the weeks or months needed to train an animal for compiling behavioral audiograms. In addition, ABR testing is practical to use in studies requiring many animals and multiple measurements (e.g., before and after a treatment is applied), and for testing young animals in developmental studies. For example, McFadden et al. (1996) used ABR methods to study the ontogeny of auditory function in the Mongolian gerbil and identified three phases of development based on frequency-threshold curves. ABRs were elicited by intense stimuli in the low- and mid-frequency range as early as 10 post-natal days (pnd) in a small proportion of animals. By 16 pnd, all gerbils were responding reliably to tones between 125 Hz and 32 kHz, similar to adult animals.
ABR testing has become the AEP method of choice for audiometric testing in a wide range of species. In particular, ABRs are useful for estimating hearing capabilities of animals that are difficult to test using other methods. For example, Hu et al. (2009) used ABR recordings to determine hearing of cephalopods: the oval squid (Sepioteuthis lessoniana) and the common octopus (Octopus vulgaris). Each cephalopod
was anesthetized and then transferred to a holder inside a plastic tub filled with seawater. Teflon-coated silver needle electrodes were inserted on the head between the eyes (non-inverting) and on the mantle (inverting) and a wire was placed in the tub to serve as the ground. In both cephalopods, the ABR had only one prominent peak. The resulting ABR audiogram showed that the squid responded to a wider frequency range (400–1500 Hz vs. 400–1000 Hz) and had significantly lower thresholds at 600 Hz (its frequency of best sensitivity) compared to the octopus.
Comparisons of ABR audiograms can show the effects of factors such as age, noise exposure, drug treatment, and genetic mutations. The ABR audiograms shown in Fig. 10.17, for example, show the effects of an induced genetic mutation of the gene that codes for the copper-zinc form of superoxide dismutase (SOD1) on auditory sensitivity in mice. SOD1, an enzyme found in the cytosol of all cells, serves as a first line of defense against oxidative damage and has been implicated in numerous degenerative disorders and age-related hearing loss (McFadden et al. 2001a, b). For example, hearing thresholds of aged (13-month-old), wild type (WT) mice with normal levels of SOD1 are lower at all four tested frequencies than those of SOD1-deficient littermates. SOD1 deficiency had a greater effect on thresholds at 16 and 32 kHz than at lower frequencies (8 and 4 kHz).
Fig. 10.17 Average ABR thresholds (dB re 20 μPa) from aged mice with normal levels of SOD1 enzyme (WT) compared to thresholds from littermates missing 50% (HET) or 100% (KO) of SOD1 due to genetic manipulation of the copper-zinc superoxide dismutase gene. WT = wildtype mice (with two normal gene alleles and normal levels of SOD1); HET = heterozygous knockout mice (with one abnormal allele, resulting in 50% reduction of SOD1); KO = homozygous knockout mice (with two abnormal alleles, resulting in complete elimination of SOD1)
10.4.3 Comparison of Behavioral and Physiological Audiograms
It is important to compare data obtained from physiological and behavioral methods to determine their reliability and validity. Even in the same species, experiments might use different stimulus presentation paradigms and different threshold criteria, making direct comparisons of results difficult. Although ABR and behavioral audiograms in the same species can have the same overall shape and similar frequencies of best hearing sensitivity, actual thresholds may differ considerably (Fig. 10.18). Some authors argue that these audiograms should not be considered equivalent (Sisneros et al. 2016). Ladich and Fay (2013) compiled AEP and behavioral audiograms of goldfish collected in different studies in different laboratories. They found that, at frequencies below 1000 Hz, median ABR thresholds were about 10 dB higher than behavioral thresholds, while at higher frequencies, ABR thresholds were lower than behavioral thresholds.
Schlundt et al. (2007) quantified differences in audiograms recorded from bottlenose dolphins in a variety of underwater test conditions (in a quiet pool and in a noisy bay). AEPs were recorded using a transducer embedded in a suction cup on the jawbone. In behavioral tests, the dolphins were conditioned by the trainer’s whistle to respond when the same tone was heard. Thresholds measured using the two techniques were very similar, although there was less variability in behavioral data.
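The kind of comparison discussed in this section can be sketched as a per-frequency difference between two audiograms. The example below is a minimal illustration: the function name is hypothetical, the threshold values are made up for demonstration and are not measured data, and both audiograms are assumed to be expressed relative to the same reference pressure.

```python
def threshold_differences(abr_db, behavioral_db):
    """ABR minus behavioral threshold (dB) at frequencies present in both.

    Both inputs map frequency in Hz to threshold in dB (same reference).
    """
    shared = sorted(set(abr_db) & set(behavioral_db))
    return {f: abr_db[f] - behavioral_db[f] for f in shared}


# Hypothetical thresholds (Hz -> dB); illustrative values only
abr = {500: 82.0, 1000: 88.0, 2000: 101.0}
behavioral = {500: 72.0, 1000: 80.0, 2000: 105.0, 4000: 120.0}

diffs = threshold_differences(abr, behavioral)
for freq, diff in diffs.items():
    print(f"{freq} Hz: ABR - behavioral = {diff:+.1f} dB")
```

A positive difference means the ABR threshold is higher (less sensitive) than the behavioral one. Because the sign can change with frequency, as in the goldfish compilation described above, a single correction factor applied across the audiogram would not be appropriate.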
Fig. 10.19 Psychometric function at a tone frequency of 1000 Hz (left) and a graph of the Weber fraction across frequency (middle) collected from an Indian elephant (right). Left: A psychometric function showing percent correct detection of a frequency difference between two tones. The base frequency is 1000 Hz, and frequency differences range from 20 to 100 Hz. The solid gray line shows the elephant’s performance and the dashed gray line shows the 75% correct criterion for the frequency DL. At 1000 Hz, the frequency difference limen is 30 Hz. Middle: The Weber fraction (ΔF/F) increases with frequency. The Weber fraction is low at frequencies of 250 and 500 Hz, indicating good ability to discriminate frequency differences, and increases at higher frequencies, indicating poorer acuity. Data collected by Heffner and Heffner (1982). Image of the elephant from Evelyn Fuchs, University of Vienna
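The DL and Weber fraction described in the caption of Fig. 10.19 can be sketched as a simple interpolation on the psychometric function. In the example below, the percent-correct values are hypothetical and were chosen only so that the 75% crossing reproduces the 30-Hz DL reported at 1000 Hz; they are not the elephant's data. The function names and the use of linear interpolation (rather than a fitted function) are also assumptions.

```python
def difference_limen(deltas_hz, percent_correct, criterion=75.0):
    """Linearly interpolate the frequency difference at the criterion.

    Returns None if the criterion is not bracketed by the data.
    """
    points = list(zip(deltas_hz, percent_correct))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 < criterion <= p1:
            return d0 + (criterion - p0) * (d1 - d0) / (p1 - p0)
    return None


def weber_fraction(dl_hz, base_hz):
    """Ratio of the difference limen to the base frequency (dimensionless)."""
    return dl_hz / base_hz


# Hypothetical percent-correct scores at a 1000-Hz base frequency
deltas = [20, 40, 60, 80, 100]       # frequency differences in Hz
scores = [65, 85, 95, 100, 100]      # percent correct (illustrative)

dl = difference_limen(deltas, scores)
print(dl, weber_fraction(dl, 1000.0))
```

With these assumed scores the DL is 30 Hz and the Weber fraction is 0.03, i.e., a 3% change in frequency is needed for reliable discrimination at 1000 Hz.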
types of pulse trains, until the animal no longer detected the difference reliably. A psychometric function for a tone frequency of 1000 Hz, a frequency of best sensitivity for the elephant, is plotted in Fig. 10.19. The 75% correct discrimination threshold is at 1030 Hz, giving a DL of 30 Hz. The DLs calculated from psychometric functions at different tone frequencies are plotted in Fig. 10.19 as the Weber fraction (ΔF/F), the ratio of the DL to the test frequency. The Weber fraction increases with frequency, showing that the ability to discriminate differences in tone frequency becomes absolutely worse with increases in frequency. Changes in the Weber fraction with tone frequency have implications for understanding how frequency is coded in the nervous system across different species.
The psychometric function illustrated in Fig. 10.19 is based on actual data points. Some investigators use a statistical procedure called Probit Analysis to find the best-fitting regression line through the data points, and then base the estimate of the DL on that regression (Levitt 1970). The center of the best-fitting regression line can then be taken as the most probable threshold value. Probit analysis is useful because it provides a standard error for the hearing threshold values.
Intensity DLs are estimated using procedures similar to those used for estimating frequency DLs, except that tone frequency is kept constant while tone intensity is varied. Difference limens are also commonly measured for noise. These measurements are useful for estimating a species’ dynamic range of hearing, the intensity range over which changes in sound levels can be perceived. Determining an animal’s sensitivity to the depth of amplitude modulation in a sound and the ability to detect a short, silent gap between two sounds is also a problem of intensity discrimination.
10.5.2 Frequency Selectivity
Frequency selectivity refers to the perceptual ability to discriminate two simultaneous signals of different frequency (e.g., a signal against noise). Behavioral measures of frequency selectivity are used to estimate the width of internal auditory filters (i.e., the physical space including number of hair cells and portion of the sensory epithelia)
[Figure 10.20: bandwidth [Hz] versus frequency [Hz] on logarithmic axes, with reference lines for 1, 1/3, 1/6, and 1/12 octave; species shown are Tursiops truncatus, Delphinapterus leucas, Neophocaena phocaenoides, Phocoena phocoena, Mirounga angustirostris, Phoca vitulina, and Zalophus californianus]
Fig. 10.20 Graph of frequency selectivity in marine mammals. *: Critical bandwidths. ★: Equivalent rectangular bandwidths. +: 3-dB bandwidths. O: 10-dB bandwidths. Some of these data were collected behaviorally, others electrophysiologically. For pinnipeds, both in-air and underwater measurements are shown (Erbe et al. 2016). © Erbe et al. 2016; https://www.sciencedirect.com/science/article/pii/S0025326X15302125. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/
[Figure 10.21: critical ratios (CR [dB]) versus frequency [Hz] on a logarithmic frequency axis, with reference lines for 1/3, 1/6, and 1/12 octave. Left panel: Critical Ratios of Cetaceans & Sirenians underwater (Delphinapterus leucas, Phocoena phocoena, Pseudorca crassidens, Tursiops truncatus, Trichechus manatus). Right panel: Critical Ratios of Pinnipeds underwater (Callorhinus ursinus, Mirounga angustirostris, Phoca largha, Phoca vitulina, Pusa hispida, Zalophus californianus)]
method. The CR can also be measured electrophysiologically.
CR measurements are relatively easy to obtain and are thus available for a number of species. In the horseshoe bat (Rhinolophus ferrumequinum) and in the green treefrog, for example, CRs are lowest, implying sharper filters, at the spectral peaks within these species’ echolocation and advertisement calls, respectively (Long 1977; Moss and Simmons 1986). In many other species, CRs gradually increase with tone frequency (e.g., Fay 1988; Erbe et al. 2016). In the absence of CR data, 1/3 octave bands are often used (in particular in the noise impact assessment literature). While this is a good approximation in birds (e.g., Dooling and Blumenrath 2013), in several species, 1/3 octave bands overestimate CRs at some frequencies (Fig. 10.21).
The CR is often taken as an estimate of the width of the auditory filters. In this case, it should be referred to as the Fletcher critical band (ANSI/ASA S3.20-2015).² If CR is in dB re 1 Hz, then the Fletcher critical band in Hz is computed as $10^{\mathrm{CR}/10}$. The Fletcher critical band is an indirect estimate of the size of the auditory filter. It is a good approximation in some bird species (Langemann et al. 1995) but in many other species differs from a more direct measure, the critical bandwidth.
10.5.2.2 Critical Bandwidth
The critical bandwidth (CB) refers to a band of frequencies within which sound at any frequency can interfere with sound at the center frequency (ANSI/ASA S3.20-2015; ISO 18405: 2017). The critical bandwidth is typically measured in noise-widening experiments. The listener tries to detect a tone at the center of a band of masking noise. As the noise band is widened, the level of the tone has to increase for it to remain audible. There comes a bandwidth at which the width of the masking noise band no longer affects the level of the tone at detection threshold. This is the critical bandwidth. The difference between a CR and a CB experiment thus is that the listener has to detect a tone in broadband masking noise in the former and in noise of variable (increasing) bandwidth in the latter. CBs are time-consuming to collect, because they require determining masked thresholds at each tone frequency at many different noise bandwidths. For this reason, measurements of CB are available for fewer species than are measurements of CR.
² Acoustical Society of America, Standard Acoustical & Bioacoustical Terminology Database: https://asastandards.org/working-groups-portal/asa-standard-term-database/; accessed 7 January 2021.
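The relation between a critical ratio and the Fletcher critical band, and the comparison with 1/3 octave bands mentioned above, can be worked through numerically. The sketch below is illustrative only: the function names are hypothetical, the 20-dB critical ratio is an assumed example value, and band edges are taken as base-2 fractional octaves around the center frequency.

```python
import math


def fletcher_critical_band_hz(cr_db):
    """Bandwidth (Hz) implied by a critical ratio in dB re 1 Hz: 10^(CR/10)."""
    return 10.0 ** (cr_db / 10.0)


def fractional_octave_bandwidth_hz(center_hz, fraction=3):
    """Width of a 1/fraction-octave band centred on center_hz (base-2 edges)."""
    half = 2.0 ** (1.0 / (2 * fraction))
    return center_hz * (half - 1.0 / half)


def bandwidth_as_cr_db(bandwidth_hz):
    """Express a bandwidth in Hz as an equivalent level in dB re 1 Hz."""
    return 10.0 * math.log10(bandwidth_hz)


# Assumed example: a critical ratio of 20 dB re 1 Hz at 1000 Hz
cr = 20.0
print(fletcher_critical_band_hz(cr))              # 100.0 Hz

third_octave = fractional_octave_bandwidth_hz(1000.0, fraction=3)
print(round(third_octave, 1), round(bandwidth_as_cr_db(third_octave), 1))
```

In this assumed case the 1/3 octave band at 1000 Hz is roughly 232 Hz wide, more than twice the 100-Hz Fletcher band, so using the 1/3 octave band in place of the measured CR would overestimate it by a few dB.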
Fig. 10.22 Psychophysical tuning curves (left) for the Pig-tailed macaque monkey (Macaca nemestrina; right), measured in a forward masking paradigm. Animals were trained to detect tones using positive reinforcement. Tones were presented via earphones, and the animals were seated inside a sound-attenuating chamber. Masked thresholds to probe tones (0.5, 2, and 8 kHz; blue, dark red, dark gray, respectively; x-axis) were determined using an adaptive tracking procedure and defined as the mean of eight reversal points at each frequency. Probe tones (25-ms duration) were presented at a level of 10 dB above absolute threshold. Masker tones (130-ms duration, with frequencies varying around that of the probe tone) were presented 2 ms before the onset of the probe tone. The blue, dark red, and dark gray curves show the psychophysical tuning curves plotting the level of the masker (y-axis) needed to just mask the probe tone at each masker frequency. The black dashed line shows the animals’ absolute thresholds (audiogram). Data collected by Serafin et al. (1982). © Stauss, 2006; https://commons.wikimedia.org/w/index.php?curid=1733069. Licensed under CC BY-SA 3.0; https://creativecommons.org/licenses/by-sa/3.0/
described common behavioral and physiological methods, along with some of their strengths and weaknesses. Testing hearing abilities in animals is not as easy as in humans because animal subjects cannot verbally report to the researcher when a test signal is heard. Instead, animals indicate that they heard a sound by making unlearned or learned responses in behavioral studies. Thresholds based on conditioned responses are the most accurate and reliable, but conditioning procedures are not suitable for all animals or research questions. Some animals are not trainable or are unable to participate in a behavioral study due to age, health, or some other factor. Physiological methods, especially auditory brainstem response testing, can be particularly helpful in these situations. While ABR and other physiological methods provide useful information about auditory function, it is important to recognize that the results they provide are not equivalent to those from behavioral studies that assess hearing directly; thresholds obtained using physiological methods may under- or overestimate behavioral thresholds in an unpredictable manner.
Research on hearing abilities in animals has advanced beyond documenting the basic audiogram of a species. Data on frequency and intensity discrimination, sound localization, and the effects of noise on hearing in animals are current topics of study for many animal species. Information on hearing and an animal’s abilities to adapt to noise can have important applications for the conservation of species in areas of high anthropogenic noise.
References
Aihara I, Bishop PJ, Ohmer MEB, Awano H, Mizumoto T, Okuno HG, Narins PM, Hero JM (2017) Visualizing phonotactic behavior of female frogs in darkness. Sci Rep 7:10539. https://doi.org/10.1038/s41598-017-11150
American National Standards Institute (2004) Methods for manual pure-tone threshold audiometry (ANSI S3.21-2004). Acoustical Society of America, New York
American National Standards Institute (2015) Bioacoustical terminology (ANSI S3.20-2015, R 2020). Acoustical Society of America, New York
Au WWL (1993) The sonar of dolphins. Springer, New York
Awbrey FT, Thomas JA, Kastelein RA (1988) Low-frequency underwater hearing sensitivity in belugas, Delphinapterus leucas. J Acoust Soc Am 84(6):2273–2275
Bain DE, Kriete B, Dahlheim M (1993) Hearing abilities of killer whales (Orcinus orca). J Acoust Soc Am 94(3):1829–1829
Berger JI, Coomber B, Shackleton TM, Palmer AR, Wallace MN (2013) A novel behavioural approach to detecting tinnitus in the Guinea pig. J Neurosci Methods 213(2):188–195
Bergevin C, Freeman DM, Saunders JC, Shera CA (2008) Otoacoustic emissions in humans, birds, lizards, and frogs: evidence for multiple generation mechanisms. J Comp Physiol A 194(7):665–683
Bhandiwad AA, Sisneros JA (2016) Revisiting psychoacoustic methods for the assessment of fish hearing. In: Sisneros JA (ed) Fish hearing and bioacoustics: an anthology in honor of Arthur N. Popper and Richard R. Fay, vol 877. Springer, New York, pp 157–184
Bowles AE, Francine JK (1993) Effects of simulated aircraft noise on hearing, food detection, and predator avoidance behavior of the kit fox, Vulpes macrotis. J Acoust Soc Am 93:2378–2378
Braff DL, Geyer MA, Swerdlow NR (2001) Human studies of prepulse inhibition of startle: normal subjects, patient groups, and pharmacological studies. Psychopharmacology 156(2–3):234–258
Branstetter BK, Leger JS, Acton D, Stewart J, Houser D, Finneran JJ, Jenkins K (2017) Killer whale (Orcinus orca) behavioral audiograms. J Acoust Soc Am 141(4):2387–2398. https://doi.org/10.1121/1.4979116
Brill RL, Moore PWB, Dankiewicz LA (2001) Assessment of dolphin (Tursiops truncatus) auditory sensitivity and hearing loss using jawphones. J Acoust Soc Am 109(4):1717–1722
Cui J, Zhu B, Fang G, Smith E, Brauth SE, Tang Y (2017) Effect of the level of anesthesia on the auditory brainstem response in the Emei Music Frog (Babina daunchina). PLoS One 12(1):e0169449. https://doi.org/10.1371/journal.pone.0169449
Dent ML, Dooling RJ, Pierce AS (2000) Frequency discrimination in budgerigars (Melopsittacus undulatus): effects of tone duration and tonal context. J Acoust Soc Am 107(5):2657–2664. https://doi.org/10.1121/1.428651
Dooling RJ, Blumenrath SH (2013) Avian sound perception in noise. In: Brumm H (ed) Animal communication in noise. Springer, Heidelberg, pp 229–250. https://doi.org/10.1007/978-3-642-41494-7_8
Ehret G, Romand R (1981) Postnatal development of absolute auditory thresholds in kittens. J Comp Physiol Psychol 95(2):304–311
Erbe C, Farmer DM (1998) Masked hearing thresholds of a beluga whale (Delphinapterus leucas) in icebreaker noise. Deep Sea Res II Top Stud Oceanogr 45(7):1373–1388. https://doi.org/10.1016/S0967-0645(98)00027-7
Erbe C, Reichmuth C, Cunningham KC, Lucke K, Dooling RJ (2016) Communication masking in marine mammals: a review and research strategy. Mar Pollut Bull 103:15–38. https://doi.org/10.1016/j.marpolbul.2015.12.007
Fay RR (1988) Hearing in vertebrates: a psychophysics databook. Hill-Fay Associates, Winnetka, IL
Fay RR (1995) Psychoacoustical studies of the sense of hearing in goldfish using conditioned respiratory suppression. In: Klump GM, Dooling RJ, Fay RR, Stebbins WC (eds) Methods in comparative psychoacoustics. Birkhauser, Basel, pp 249–261
Fay RR, Simmons AM (1999) The sense of hearing in fishes and amphibians. In: Popper AN, Fay RR (eds) Comparative hearing: fish and amphibians. Springer, New York, pp 269–318
Finneran JJ, Houser DS, Blasko D, Hicks C, Hudson J, Osborn M (2008) Estimating bottlenose dolphin (Tursiops truncatus) hearing thresholds from single and multiple simultaneous auditory evoked potentials. J Acoust Soc Am 123(1):542–551
Finneran JJ, Mulsow J, Schlundt CE, Houser DS (2011) Dolphin and sea lion auditory evoked potentials in response to single and multiple swept amplitude tones. J Acoust Soc Am 130(2):1038–1048. https://doi.org/10.1121/1.3608117
Gerhardt HC (1995) Phonotaxis in female frogs and toads: execution and design of experiments. In: Klump GM, Dooling RJ, Fay RR, Stebbins WC (eds) Methods in comparative psychoacoustics. Birkhauser, Basel, pp 209–220
Green D, Swets J (1966) Signal detection theory and psychophysics. Wiley, New York. Reprinted 1974 by Krieger, Huntington, New York
Hall JD, Johnson CS (1972) Auditory thresholds of a killer whale, Orcinus orca Linnaeus. J Acoust Soc Am 51:515–517
Heffner RS, Heffner HE (1982) Hearing in the elephant (Elephas maximus): absolute sensitivity, frequency discrimination, and sound localization. J Comp Physiol Psychol 96:926–944
Heffner RS, Heffner HE (1991) Behavioral hearing range of the chinchilla. Hear Res 52:13–16
Heffner HE, Heffner RS (2001) Behavioral assessment of hearing in mice. In: Willott JF (ed) Handbook of mouse auditory research: from behavior to molecular biology. CRC Press, Boca Raton, FL, pp 19–29
Heffner RS, Heffner HE, Masterton B (1971) Behavioral measurements of absolute and frequency-difference thresholds in Guinea pig. J Acoust Soc Am 49(6B):1888–1895
Heffner HE, Heffner RS, Contos C, Ott T (1994) Audiogram of the hooded Norway rat. Hear Res 73:244–248
Heffner RS, Koay G, Heffner HE (2014) Hearing in alpacas (Vicugna pacos): audiogram, localization acuity, and use of binaural locus cues. J Acoust Soc Am 135(2):778–788
Hellström P-A (1995) The relationship between sound transfer functions and hearing levels. Hear Res 88(1–2):54–60
Henderson D, Salvi RJ, Quaranta A, McFadden SL, Burkard RF (eds) (1999) Ototoxicity: basic science and clinical applications, Annals of the New York Academy of Sciences, vol 884. The New York Academy of Sciences, New York
Hoffman H, Ison JR (1980) Reflex modification in the domain of startle: I. Some empirical findings and their implications for how the nervous system processes sensory input. Psychol Rev 87:175–189
Houser DS, Finneran JJ (2006) A comparison of underwater hearing sensitivity in bottlenose dolphins (Tursiops truncatus) determined by electrophysiological and behavioral methods. J Acoust Soc Am 120:1713–1722
Hu MY, Yan HY, Chung W-S, Shiao J-C, Hwang PP (2009) Acoustically evoked-potentials in two cephalopods inferred using the auditory brainstem response (ABR) approach. Comp Biochem Physiol A 153(3):278–283
International Organization for Standardization (2017) Underwater acoustics - terminology (ISO 18405). Switzerland, Geneva
Jero J, Coling DE, Lalwani AK (2001) The use of Preyer’s reflex in evaluation of hearing in mice. Acta Otolaryngol 121(5):585–589
Johnson CS (1966) Auditory thresholds of the bottlenose porpoise (Tursiops truncatus, Montagu). U.S. Naval Ordnance Test Station. Tech Publ 4178:1–28
Kastelein RA, Heu S, van der Verboom W, Jennings N, Veen J, Vander J, de Haan D (2008) Startle response of captive North Sea fish species to underwater tones between 0.1 and 64 kHz. Mar Environ Res 65(5):369–377
Kemp DT (2002) Otoacoustic emissions, their origin in cochlear function, and use. Br Med Bull 63(1):223–241
Kiebel EM, Sunderman MG, Leonhard JR, McFadden SL (2012) Measurement of cortical auditory event-related potentials in conscious CBA/CaJ mice. Association for Psychological Science conference, Chicago
Klump GM, Dooling RJ, Fay RR, Stebbins WC (eds) (1995) Methods in comparative psychoacoustics. Birkhauser, Basel
Koay G, Heffner RS, Heffner HE (2002) Behavioral audiograms of homozygous medJ mutant mice with sodium channel deficiency and unaffected controls. Hear Res 171:111–118
Ladich F, Fay RR (2013) Auditory evoked-potential audiometry in fish. Rev Fish Biol Fish 23:317–364
Langemann U, Klump GM, Dooling RJ (1995) Critical bands and critical-ratio bandwidth in the European starling. Hear Res 84(1–2):167–176
Levitt H (1970) Transformed up-down methods in psychoacoustics. J Acoust Soc Am 49:467–477
Long GR (1977) Masked auditory thresholds from the bat, Rhinolophus ferrumequinum. J Comp Physiol 116:247–255
Manley GA (2001) Evidence for an active process and a cochlear amplifier in nonmammals. J Neurophysiol 86(2):541–549
McFadden SL (2007) Biochemical bases of hearing. In: Campbell K (ed) Pharmacology and ototoxicity for
audiologists. Thomson Delmar Learning, New York, Acoust Soc Am 57(6):1526–1532. https://doi.org/10.
pp 86–123 1121/1.380595
McFadden SL, Kiebel EM (2013) A parametric study of Screven LA, Dent ML (2019) Perception of ultrasonic
auditory event-related potentials recorded from cortex vocalizations by socially housed and isolated mice.
of CBA/CaJ mice. Association for Psychological Sci- eNeuro 6(5). https://doi.org/10.1523/ENEURO.
ence Conference, Washington, DC 0049-19.2019
McFadden SL, Walsh EJ, McGee J (1996) Onset and Serafin JV, Moody DB, Stebbins WC (1982) Frequency-
development of auditory brainstem response selectivity of the monkey’s auditory system: psycho-
thresholds in the Mongolian gerbil (Meriones physical tuning-curves. J Acoust Soc Am 71(6):
unguiculatus). Hear Res 100:68–79 1513–1518
McFadden SL, Campo P, Quaranta N, Henderson D Shaffer LA, Long GR (2004) Low-frequency distortion
(1997) Age-related decline of auditory function in the product otoacoustic emissions in two species of
chinchilla (Chinchilla laniger). Hear Res 111:114–126 kangaroo rats: implications for auditory sensitivity. J
McFadden SL, Ohlemiller KK, Ding DL, Salvi RJ (2001a) Comp Physiol A 190(1):55–60
The role of superoxide dismutase in age-related and Simmons AM, Moss CF (1995) Reflex modification: a tool
noise-induced hearing loss: clues from Sod1 knockout for assessing basic auditory function in anuran
mice. In: Willott JF (ed) Handbook of mouse auditory amphibians. In: Dooling R, Fay R, Klump G, Stebbins
research: from behavior to molecular biology. CRC W (eds) Methods in comparative psychoacoustics.
Press, Boca Raton, FL, pp 489–504 Birkhauser, Basel, pp 197–208
McFadden SL, Ohlemiller KK, Ding DL, Salvi RJ (2001b) Sisneros JA, Popper AN, Hawkins AD, Fay RR (2016)
The influence of superoxide dismutase and glutathione Auditory evoked-potential audiograms compared to
peroxidase deficiencies on noise-induced hearing loss behavioral audiograms in aquatic animals. In: Popper
in mice. In: Henderson D, Prasher D, Kopke R, AN, Hawkins AD (eds) Effects of noise on aquatic life
Salvi R, Hamernik R (eds) Noise-induced hearing II, vol 875. Springer, New York, pp 1049–1056
loss: basic mechanisms, prevention and control. NRN Stansbury AL, Thomas JA, Stalf CE, Murphy LD,
Publications, London, pp 3–18 Lombardi D, Carpenter J, Mueller T (2014) Behavioral
McFadden SL, Zulas AL, Morgan RE (2010) audiogram of two Arctic fox (Alopex lagopus). Polar
Age-dependent effects of modafinil on acoustic startle Biol 37:417–422. https://doi.org/10.1007/s00300-014-
and prepulse inhibition in rats. Behav Brain Res 1446-5
208(1):118–123 Thomas JA, Moore PWB, Withrow R, Stoermer M (1990)
Moss CF, Simmons AM (1986) Frequency selectivity of Underwater audiogram of a Hawaiian monk seal
hearing in the green treefrog, Hyla cinerea. J Comp (Monachus schauinslandii). J Acoust Soc Am 87(1):
Physiol A 159:257–266 417–420
Popov VV, Supin AY (1990) Electrophysiological studies Tonndorf J (1976) Relationship between the transmission
on hearing in some cetaceans and a manatee. In: characteristics of conductive system and noise-induced
Thomas JA, Kastelein RA (eds) Sensory abilities of hearing-loss. In: Henderson D, Hamernik RP, Dosanjh
cetaceans: laboratory and field evidence. Plenum Press, DS, Mills JH (eds) Effects of noise on hearing. Raven
New York, pp 405–415 Press, New York, pp 159–178
Salvi RJ, McFadden SL, Wang J (2000) Anatomy and von Békésy G (1960) Experiments in hearing. McGraw
physiology of the peripheral auditory system. In: Hill, New York
Roeser RJ, Valente M, Hosford-Dunn H (eds) Audiol- Walter M, Tziridis K, Ahlf S, Schulze H (2012) Context
ogy diagnosis. Thieme, New York, pp 19–43 dependent auditory thresholds determined by
Schlundt CE, Dear RL, Green L, Houser DS, Finneran JJ brainstem audiometry and prepulse inhibition in Mon-
(2007) Simultaneously measured behavioral and electro- golian gerbils. Open J Acoust 2:34–49. https://doi.org/
physiological hearing thresholds in a bottlenose dolphin 10.4236/oja.2012.21004
(Tursiops truncatus). J Acoust Soc Am 122:615–622 Welch TE, Dent ML (2011) Lateralization of acoustic
Schusterman RJ (1976) California Sea lion underwater signals by dichotically listening budgerigars
auditory detection and variation of reinforcement (Melopsittacus undulatus). J Acoust Soc Am 130(4):
schedules. J Acoust Soc Am 59(4):997–1000. https:// 2293–2301
doi.org/10.1121/1.380928 White JR, Norris MJH, Ljungblad DK, Barton K, di
Schusterman RJ, Johnson BW (1975) Signal probability Sciarra GN (1978) Auditory thresholds of two beluga
and response bias in California Sea lions. Psych Rec whales, Delphinapterus leucas. HSWRI Tech Rep No
25(1):39–45. https://doi.org/10.1007/BF03394287 78-109 Sea World Research Institute, San Diego, CA
Schusterman RJ, Barrett B, Moore P (1975) Detection of Willott JF (1991) Aging and the auditory system: anat-
underwater signals by a California Sea lion and a omy, physiology, and psychophysics. Singular Publi-
bottlenose porpoise: variation in the payoff matrix. J cation Group, San Diego, CA
10 Behavioral and Physiological Audiometric Methods for Animals 387
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
11 Vibrational and Acoustic Communication in Animals
Rebecca Dunlop, William L. Gannon, Marthe Kiley-Worthington, Peggy S. M. Hill, Andreas Wessel, and Jeanette A. Thomas
Bioacoustics is defined as the study of mechanical communication by acoustic (sound) waves. It is a widely used term when referring to animal communication. Biotremology is a relatively recent term. It was conceived to refer to communication signals that comprise substrate-borne vibrations, and which are detected as surface vibrations by specialized perception organs such as slit-sense organs in spiders, subgenual organs in insects, hair receptors, or Pacinian and Herbst corpuscles in vertebrates (Hill and Wessel 2016). Substrate-borne vibrations are sensed via “. . .pressure waves traveling through . . . solid matter . . . detected via the surface vibrations they elicit or the airborne waves (sound) they induce” (Hill and Wessel 2016). Bioacoustical (sound) communication refers to signals that are encoded in acoustic waves and are detected using the ear. Vibrational communication has been recognized as evolutionarily older than bioacoustic communication and is much more prevalent among some animal groups (e.g., arthropods; Fig. 11.1). Therefore, researchers are also interested in how these mechanical vibrations affect behavior.
Both areas of study use similar equipment to record and analyze communication signals. However, scientists in the field of biotremology also use devices such as laser Doppler vibrometers, and techniques such as wavelet analysis, to detect faint vibrational emissions made by animals. In addition, electromagnetic transducers produce signals and, when in contact with the substrate, serve as vibration generators for artificial playback experiments.
Now, nearly 60 years after Busnel’s (1963) paradigm of bioacoustics, tremendous changes in recording technology and analysis have occurred. Acoustic identification of anything from birds to bats can be carried out using an iPhone, an acoustic detection application, and a Bluetooth speaker or microphone!
11.2 The Origins of Substrate-Borne Vibrational and Acoustic Communication
Communication is the transfer of information from one animal (sender) to another animal (receiver) that can affect the current or future behavior of the receiver. In other words, communication conveys information. It is adaptive, in that a successful communication exchange enhances the survival of one or both participants. Vibrational communication has been suggested to have evolved, along with chemical
communication, concurrently with evolution of the Metazoa (all animals; Endler 2014). We know that any movement of an animal, whether in water or at the boundary between air and any type of substrate, creates vibrations that can be detected by any other organism with receptors capable of receiving and translating them. Increasing evidence also suggests that invertebrate hearing organs evolved from vibrational precursors millions of years ago (Stumpner and von Helversen 2001; Lakes-Harlan and Strauss 2014). Therefore, the discussion of origins of communication in this section is restricted to the more recently evolved acoustic communication.
The origins of acoustic communication are likely to be in nonverbal sounds made by chance as the animal moves through the environment. These sounds could be scraping, a stick breaking, footfalls, opening or flapping of wings, or scratching. They are the result of environmental disturbance, which in turn makes a sound through the air, earth, or water. By just being made, these sounds convey to others the presence of the animal, and something about what it might be doing. It is then a simple developmental step for a particular sound to become associated with a particular situation and thus carry a particular message to the recipient. Examples of nonverbal sounds are sounds from an elephant breaking sticks as it moves through the environment, a sigh, a cough, or a sneeze. Originally, these sounds may not have been made to communicate. However, sounds that provide an advantage for an individual, or a population, will be perpetuated if they enhance the fitness of the species. This, ultimately, gives them an evolutionary advantage that would reinforce further refinement of this new sensory mode.
This origin likely gave the evolutionary opening to develop specialized body parts that could produce auditory signals, in tandem with sophisticated sensory capabilities to receive them (Narins et al. 2009). One such specialized body part is the respiratory tract. Once a respiratory tract had developed in vertebrates, sounds associated with breathing could convey information to others, and so the necessary adaptations for sound generation began to develop. For example, holding the breath and then letting it out as a sigh or a cough produces various sounds. These sounds are then associated with situations being experienced by the sender, meaning this information is available to all who hear it. Presumably, it was this evolutionary process that gave rise to sound-making organs in the respiratory tract to the point where vocal communication now involves a larynx.
Ritualization is the evolutionary process by which a pattern of behavior changes to become more effective as a signal (Huxley 1966; Morris 1957). The behavior is performed in a consistent way and is either stereotyped or incomplete. Incomplete behaviors may be used for activities such as courtship. For example, a drake mallard (Anas platyrhynchos), when preening and displaying to a female, acts as if he is addressing a skin irritation (Morris 1956), but he may not even touch his feathers during the display. In other words, the behavior seems to be a preening behavior, but is in fact a courtship behavior. To increase the effectiveness of the ritualized signal, anatomical modifications may also have evolved. A classic example of this is the elaborate colors of the Mandarin drake (Aix galericulata). During the courtship of a female, the male will highlight these colors by pointing to them during incomplete, exaggerated, and stereotypical preening.
Exaggerated signal ritualization is characterized by a clear signaling behavior, such as the ears of a horse (Equus caballus) flattening back as a precursor signal to biting. This exaggerated ear movement has a clearer meaning than just putting the ears back. Ritualistic behavior is usually no longer tied to its original role because it has become more important for the signaler’s fitness to communicate, rather than being used for its original purpose. Therefore, the signal has evolved to produce a clear message.
Signals can also evolve to become more effective by redundancy, or by emulation of another’s acoustic or vibrational expression. Redundancy in animal acoustic communication is the repeated use of a signal. Vocal signals, for example, can be repeated for long periods of time, such as the continuous chorusing of frogs advertising during mating sessions. Redundancy reduces the risk
display, which involves nine steps, and includes both visual and acoustic modalities (Schaller 1964). In other words, the threat display can encompass several different signals.
A similar threatening display is produced by a dog (Canis lupus familiaris), drawing back its lips and exposing its teeth (visual), as well as growling (acoustic) (Fig. 11.5). Again, this is a complex display involving multiple steps and multiple modalities. However, displays can be simpler, such as a grasshopper (Orthoptera) scraping its wings as an acoustic signal to indicate location and readiness to mate.
Much of the communication in insects, other invertebrates, and nonmammalian vertebrates such as fish and amphibians, involves stereotyped signals. That is, the signal is produced in a constant form and the response is evoked only by that signal. As a result, this signal/response relationship becomes characteristic of that species. In this way, stereotyped signals can be important in evolution. For example, if a signal influences mate selection, then a slight alteration in the signal could lead to failure to reproduce, or if mating is successful, it might give rise to a new species.
Fig. 11.6 Mechanical wave forms produced by a signaling plant-dwelling insect. A planthopper is one of the small relatives of the cicadas. It has a tymbal organ to produce vibrations, which are transferred through its legs, then the thin air layer between its body and the plant surface, to the plant on which it is sucking fluids. By doing this, the planthoppers produce a very faint sound, which can be propagated through the air or soil. The planthopper tymbal organ is homologous to the “drumming organ” of the large singing cicadas. Tens of thousands of these smaller hemipteran bugs use tymbal organs to produce “silent songs.” Reprinted by permission from Elsevier. Hill PSM, Wessel A (2016). Biotremology. Current Biology 26, R181–R191; https://doi.org/10.1016/j.cub.2016.01.054. © Elsevier, 2016. All rights reserved
different sensory organs (tarsal hair receptors vs. basitarsal slit sensilla). That was a significant discovery on the path to biotremology. Until then, the substrate the scorpions use, loose sand, was considered unsuited to the transmission of vibrational signals and to the differential detection of different waveforms. Since the establishment of the view that a host of natural substrates are suitable for vibrational communication, a great number of (apparently) well-known behaviors are now seen in a new perspective, and new discoveries are made for almost all animal groups with increasing frequency (Hill et al. 2022).
The production of vibrational signals or cues can be accomplished through different forms: drumming (any sort of percussion event where a body part impacts the substrate of soil or a plant or water, etc.), tremulation (a body shaking/trembling that does not strike the substrate as the signal travels through the signaler’s legs to the surface on which they are standing), stridulation (rubbing together a specialized file and scraper, which may be found on a variety of body parts), buckling of tymbal organs in animals that have them, vocalizations, and perhaps others, such as scraping a surface while signaling, or even scratching against a tree, or rolling on the ground. Some of these signal production mechanisms, such as drumming, stridulation, and vocalization, always produce both a substrate-borne (vibrational) and an airborne (acoustic) component with a single action, even if only one of the potential signals is capable of eliciting a response in a receiver.
Arthropods, and especially insects, show the greatest variety of specialized organs to produce vibrational signals. All mentioned means of vibration production, except for vocalization, are present in several groups of arthropods and may have evolved several times, independently. For a subgroup of the insect order Hemiptera, the Tymbalia or tymbal bugs, comprising tens of thousands of species including plant- and leafhoppers, cicadas, and true bugs (Heteroptera), vibrational communication is known to be evolutionarily old and ubiquitous (Hoch et al. 2006; Wessel et al. 2014).
In mammals, most vibrational signals are produced by drumming or vocalization. Curiously, the vibrational communication of the largest land animal, the African savanna elephant (Loxodonta africana), was discovered by O’Connell-Rodwell in the 1990s, when she noticed peculiar behaviors. A freezing behavior in the elephant and a change in orientation, without an apparent cause, reminded her of the behaviors of the tiny planthoppers whose vibrational communication she had studied earlier (Fig. 11.7). O’Connell-Rodwell and colleagues demonstrated that the signals the elephants generate with low-frequency “rumbles” (about 20 Hz) could be very useful for intraspecific long-distance communication (O’Connell-Rodwell et al. 1997, 2000).
Also, drumming is a type of long-range vibrational signal production. For instance, drumming by prairie chickens (Tympanuchus cupido) can be detected up to 5 km away from the source (Jackson and DeArment 1963). Kangaroo rats (Dipodomys deserti, D. ingens, and D. spectabilis) drum the soil surface (seismic communication) with their feet to communicate such things as territorial ownership, their competitiveness, and their presence and location to other kangaroo rats (Fig. 11.8; Randall 1984; Randall and Lewis 1997; Cooper and Randall 2007).
Many species of marsupial kangaroos (Macropodidae) are known to produce a foot thump when confronted by predators. The intended recipient of the vibration is not known and could be either a predator or other kangaroos (Narins et al. 2009). Sheep and many other ungulates stamp their feet when frightened or aroused in other ways.
As every movement of an animal causes particles in the surrounding media to oscillate and evokes all possible sorts of mechanical waves, it is the mechanism of reception of mechanical signals or cues that defines acoustic vs. vibrational communication. It also follows that every act of communication establishes, at least potentially, a complex communicational network in the realm of the “acousto-vibro-active-space,” whereby the active space for vibrational signals can be surprisingly wide, even bridging air gaps (Fig. 11.9; Virant-Doberlet et al. 2014;
Fig. 11.7 Elephant vibration detection posture. (a) To detect a signal, an elephant appears to focus solely on somatosensory detection via receptors in the trunk. Its ears are relaxed, suggesting no airborne assessment for signals. (b) Elephant vibration detection posture, where it appears to be using its toenails and trunk to assess a ground-borne signal. Again, its ears are not fully extended. This suggests it uses both bone conduction through the toenails and a somatosensory pathway through Pacinian corpuscles in the trunk for signal detection. Elephants may also lean forward on their front legs with ears flat, sometimes lifting one of the front feet off the ground (possibly for triangulation or better coupling). If focused on an acoustic signal, an elephant will hold its ears out and scan its head back and forth in the general direction of the sound. Reprinted by permission from Springer Nature. Biotremology: Studying vibrational behavior, edited by P. S. M. Hill, R. Lakes-Harlan, V. Mazzoni, P. M. Narins, M. Virant-Doberlet and A. Wessel, pp. 259–276, Vibrational communication in Elephants: A case for bone conduction, C. O’Connell-Rodwell, X. Guan and S. Puria; https://link.springer.com/chapter/10.1007/978-3-030-22293-2_13. © Springer Nature, 2019. All rights reserved
Fig. 11.8 Kangaroo rats (genus Dipodomys) produce seismic signals by drumming the soil surface with their large hind feet. (left) Photo of “Kangaroo Rat by Stuart Wilson” by cameraclub231 is licensed under CC BY 2.0 (https://www.flickr.com/photos/135081788@N03/49936422922). (right) Ord’s Kangaroo rat (Dipodomys ordii). Photo of “Two Ord’s Kangaroo rats, Alberta” by Andy Teucher licensed under CC BY-NC 2.0; https://www.flickr.com/photos/63265212@N03/8736679123
Fig. 11.9 Types of communication acts by a vibrational signaler. The signaling lycosid wolf spider establishes vibrational communication with a conspecific receiver, even one that is not on the same substrate as the sender. Likewise, a vibrationally communicating prey (e.g., a planthopper) and an acoustically orienting parasite (e.g., a braconid wasp) are eavesdropping on the spider, thereby establishing a complex communication network. Reprinted by permission from Elsevier. Hill PSM, Wessel A (2016). Biotremology. Current Biology 26, R181–R191; https://doi.org/10.1016/j.cub.2016.01.054. © Elsevier, 2016. All rights reserved
Mazzoni et al. 2014; Gordon et al. 2019). On an ecosystems level, we have begun to think of, and to study, a whole complex multilevel vibroscape (Šturm et al. 2021).
Despite the importance of reception mechanisms for the study of vibrational communication, they are, for now, the least understood aspect in biotremology. Arthropods have in their bauplan, in every body segment and at every joint of their legs, mechanosensitive stretch organs (chordotonal organs) that are responsible for body and movement control, but could also pick up environmental vibrations. In some groups, such as grasshoppers, crickets, and cicadas, chordotonal organs have evolved into ears with a tympanum attached to one end of the stretch organ. It is hypothesized that in every such case these hearing organs transformed through an evolutionary intermediate stage of vibration receptors, i.e., vibrational reception is evolutionarily older than hearing.
A recent breakthrough was the demonstration of the complete pathway, from signaling through reception, to perception, and response behavior, of the vibrational component of the courtship of the fruit fly Drosophila melanogaster. It is the vibrational signaling of the male that triggers the female to freeze at the end of the courtship, facilitating copulation (McKelvey et al. 2021). The male’s vibrational signals are transmitted through the common courtship floor (overripe fruits) and are picked up by a subset of neurons of the female’s femoral chordotonal organ. By genetic knockout experiments of several mechanotransducer ion channels, McKelvey et al. also identified an involved protein that is known to be responsible for gentle touch sensitivity in vertebrates, suggesting a deep evolutionary origin of vibrational communication.
In several cases, we need to consider a bimodal acousto-vibrational communication on the signal production as well as on the reception side that results in a complex perception of the environment outside of the experience of human beings. Elephants, for example, produce low-frequency signals by vocal “rumbles” and “foot stomps” that produce airborne vibrations (sound) as well as seismic waves (O’Connell-Rodwell et al. 2000). New findings point to a simultaneous monitoring of the signaling by three reception pathways: sound hearing by the ear’s tympanum, bone conduction hearing, and somatosensory detection via receptors in the trunk (Fig. 11.7; O’Connell-Rodwell et al. 2019). In this way, the
overall chance of detecting a signal at all in a heterogeneous environment is improved, and the animals could also make use of the different propagation velocities for assessing the distance to the source of the signal.
11.3.3 Diversity in Communication
Recent evidence indicates that many messages may be conveyed auditorily in nonhuman primates when the larynx is not used. These commonly take the form of rumbling of the stomach, farting, breaking sticks, swishing of grass, sounds during digging or flying, and others. In fact, many sounds made by an individual can carry information to those who hear, but the question is whether they are used for communication. These sounds could just be the result of physiological or environmental adjustments that the sender may or may not be able to control, or that are not recognized as significant in communication. One example is surface behavior in humpback whales. Humpback whales can launch their body out of the water, turn, and splash down on their side or back (breach), or slap the water with their pectoral fins, tail flukes, and even their head. These produce loud “bang” sounds, thought to be used as communication signals during periods of high underwater noise when vocal signals are not as effective (Dunlop et al. 2010).
In general, the use of these sounds for communication has not been given much research time to date, except for cases where they have been ritualized to carry information to others. For example, we do know, from centuries of hunters’ anecdotal evidence, that a hunted antelope, elephant, or even a rhino, will move much more carefully to not make a sound when it is being hunted, compared to when traveling/grazing in a group (e.g., Baze 1950). If this is the case, the individual must recognize that the sound will carry a message (Heyes and Dickinson 1990).
In invertebrates and non-primate vertebrate animals, ascertaining whether or not these signals are being used for communication is more of a challenge. Each movement of an animal’s body creates vibrations that propagate through the environment, and production of these vibrations cannot be eliminated by the individual, even if walking more softly does lower the amplitude. Therefore, we can be certain that in both vertebrate and invertebrate predators, a substrate-borne vibration or sound that alerts potential prey to the presence and direction of movement of the predator is not communication. In animal communication, we refer to this class of unintended information as a cue. On the other hand, we may also be familiar with a hunting dog moving through a meadow and flushing birds on the ground into flight with the result that the hunter can shoot them. We simply do not know if this sort of behavior exists in a more natural, less domesticated setting.
11.4 The Advantages and Disadvantages of Vibrational and Acoustic Communication
Substrate-borne vibrational and acoustic signals are used in communication by almost all invertebrates and vertebrates. Sometimes each type of signal is used by a single species but in different contexts. There are many examples of the two being used across animal taxa in the same basic context. Some major groups of animals have evolved a heavier dependence on one than the other. For example, only as recently as 2015 did we observe the first described substrate-borne signaling in mating birds (Ota and Soma 2022) and in the very well-studied fruit fly Drosophila melanogaster (McKelvey et al. 2021), both of which were well-known for acoustic and visual signaling. These signals are essential for many species to find a mate, keep in contact (such as between mother and young), maintain territory, warn conspecifics of predators, link food location, reinforce social living, communicate emotional state, and convey many other types of information (Bradbury and Vehrencamp 1998). For any animal, being out in the world advertising its presence has many advantages, but it also has its disadvantages. The advantages of using vibrational and acoustic communication signals are
essentially the same. There is no need for light, so signals can be detected at night. Sound can flow around obstacles, so acoustic signals can be heard anywhere and at any time, and even though the substrate filters vibrational signals and cues in ways that are difficult to predict, they still can be detected without respect to time. Compared with other signals, most vibrational and acoustic signals do not need a great deal of energy to produce. Because of the physics of signal propagation, vibrational and acoustic signals can travel over long distances. For instance, in primates, the roaring of howler monkeys (genus Alouatta) can travel up to 1 km.
However, there are disadvantages to vibrational and acoustic communication. These include energetic and developmental costs, such as requiring special structures for signal production and reception. Being able to produce a loud signal often requires new, and possibly elaborate, structures, such as the larynx of vertebrates and the melon of sperm whales (Physeter macrocephalus). Invertebrates have also evolved specialized structures, such as the stridulatory apparatus in insects, which requires a receptor such as the subgenual organ (for substrate-borne vibrations) and the ear (for sound) to pick up the messages. Many animals have evolved specialized receptors to detect substrate-borne vibration signals (Pacinian corpuscles, Meissner’s corpuscles, Eimer’s organ; Narins and Lewis 1984; Narins et al. 2009).
The disadvantages of signaling can, however, be subtle, such as a wasted broadcast when there is no one to receive it, or alerting others and then being overcome by a predator. “Blurting out” who and where one is means others can find the signaler. By listening in, these others, or unintended receivers, which could be predators, prey, or even eavesdropping conspecifics, can obtain valuable information about the signaler. This may come at a cost to the signaler. If the unintended receiver is a predator, the cost is obvious: by listening in on the sound signals, the predator can recognize the signaler as prey and locate it. Conversely, prey can be alerted to, and identify, a signaling predator and its location, thus making it easier for prey to avoid predation. A conspecific eavesdropper can gain important information about the signaler/receiver relationship without having to directly take part in the interaction. Siamese fighting fish (Betta splendens), for example, eavesdrop on fighting males to gain information about their strength, which they then use in future interactions (Oliveira et al. 1998; Peake and McGregor 2004). To add further complexity, the presence of an eavesdropper audience can affect communicative interactions and force signalers to change their signaling behavior according to who else may be listening in. This is known as the audience effect and was first documented in a study of domestic chickens (Gallus gallus; Evans and Marler 1991, 1994).
Despite these and other disadvantages, it is obvious that substrate-borne vibrational and acoustic communication and all that they entail have provided extraordinary benefits in competing, surviving, and propagating the next generation. The stories of the development of vibrational and acoustic communication are ongoing and much knowledge about the mechanisms, meanings, and extent of these systems is yet to be discovered.
11.5 The Influence of the Environment on Acoustic and Vibrational Communication
For the most part, animals do not sit in a studio, acoustic lab, or anechoic chamber when signaling acoustically or with substrate-borne vibrations. They are usually in a natural environment subject to atmospheric and other conditions. Signals may be affected by spatial separation, movement of the caller, and they may even vary spatially or geographically. Environmental noise is a significant factor influencing animal signaling behavior. While few studies to date have addressed vibrational environmental noise, this topic is the focus of a recent review of both terrestrial and marine anthropogenic noise topics and literature, including previously unpublished case studies that can
be used as guides for future work (Roberts and Howard 2022).
11.5.1 Atmospheric Conditions
Atmospheric conditions, which include changes in temperature and wind, exert powerful and predictable influences on animal sounds. These influences can cause the ability to detect a signal to change rapidly. The transmission of a signal may be prolonged or modulated by topography, regional weather, seasonality, and climate. Mammalian carnivores, such as coyotes (Canis latrans) and wolves (Canis lupus), live in areas with lower nocturnal temperatures (David Mech and Boitani 2003). These animals show crepuscular calling to maximize their chances of being heard over the longest possible distances. Vibrations in the soil or other substrates due to wind or rain can also interfere with normal signal production and reception to the extent that individuals will stop courtship displays under windy or rainy conditions.
11.5.2 Masking Sounds
Masking sounds are environmental sounds, such as a stream, wind moving through the trees, and sounds from other animals, which cover, or dilute, the signal. In birds and other animals, spatially separating a signal from a masking sound is one way to improve signal detectability. If the signal and masking sound are separated spatially, the receiver can focus efforts to hear the signal. This “spatial release from masking” has been demonstrated in the behavior and physiology of the northern leopard frog (Lithobates pipiens) (Ratnam and Feng 1998). Bee (2007) showed that female Cope’s gray treefrogs (Dryophytes chrysoscelis) approached a target signal more readily when it was spatially separated by 90° from a masking sound, implying that this spatial separation aided signal reception. Spatial release from masking has also been shown to occur in budgerigars (Melopsittacus undulatus; Dent et al. 1997) and killer whales (Orcinus orca; Bain and Dahlheim 1994).
A similar mechanism to spatial release from masking is known as the cocktail party effect. Here, the receiver focuses its attention on the signaler, while selectively filtering out other stimuli such as other sounds. At a party, humans can “tune in” to one conversation when many are taking place. Many frogs and songbirds have also been shown to successfully communicate in noisy party-like situations. Frogs can recognize, localize, and respond to signals within a cacophony of chorusing (Gerhardt and Bee 2006; Wells and Schwartz 2006). Songbirds are able to recognize conspecific song and songs from other species within a dawn chorus (Benney and Braaten 2000; Hulse et al. 1997). Offspring and parents are clearly reunited successfully within noisy penguin colonies (Aubin and Jouventin 1998).
The above mechanisms demonstrate how the receiver overcomes masking sounds to improve signal detectability. Another way to improve signal detectability is for a signaler to change the way it calls. For example, a signaler could increase its call amplitude or call duration, and/or call at a different frequency. These changes are collectively known as the “Lombard effect.” The Lombard effect has been demonstrated in species such as the Japanese quail (Coturnix japonica; Potash 1972), budgerigars (Manabe et al. 1998), chickens (Gallus gallus domesticus; Brumm et al. 2009), nightingales (Luscinia megarhynchos; Brumm and Todt 2002), white-rumped munia (Lonchura striata; Brumm and Zollinger 2011), and zebra finches (Taeniopygia guttata; Cynx et al. 1998), and even in large whales such as the humpback whale (Dunlop et al. 2014).
11.5.3 Geographic Variation and Dialects
Changes in the environment may lead to geographic variation, and this variation can eventually separate animals within a species into different populations. It should be noted that geographic variation is not necessarily due to changes in the environment. While this is occurring, geographic separation can lead to the
formation of dialects. A dialect can evolve where a species is dispersing and acoustic contact among its members becomes limited (Slater 1986, 1989). As a result, individuals within a population may exhibit similar sounds to each other, but these sounds may be quite different in structure from those of other, more distant populations (Catchpole and Slater 2008; Gannon and Lawlor 1989). This results in within-species vocal variation.
Dialects are also known from biotremology studies. For example, the well-known southern green stink bug (Nezara viridula) has spread throughout the world (except for the Arctic and Antarctic) from its native Ethiopia in the past 100 years. Geographically isolated populations (e.g., California and Florida in the United States, the French Antilles, Australia, Japan, Slovenia, and France) have distinct differences in duration and repetition time of male and female signals. Individuals appear to be able to recognize adults from other populations but prefer to mate with those of their own dialect/population (Virant-Doberlet and Čokl 2004).
The study of population dialects offers a means to explore the causes and the functions of signal variation and change (Henry et al. 2015). Geographic variation in acoustic signals can reflect historical evolutionary changes within species. Not only can these signals be used to assess links between geographic variations and population connectivity, but they can also provide important information for the conservation of a species. For example, geographic variation in calls could indicate how birds disperse through a fragmented habitat, meaning the study of dialects can be used as a noninvasive tool to assess population connectivity (Kroodsma and Miller 1982; Amos et al. 2014).
The formation of dialects can occur through several mechanisms: as a side effect or “epiphenomenon” of learning that incorporates copying errors (such as adding or omitting parts of the call), through structural changes to call elements by drift, or as a possible indicator of the level of behavioral or genetic variation in a population (Baptista and Gaunt 1997; Catchpole and Slater 2008; Podos and Warren 2007; Keighley et al. 2017). Another mechanism that helps maintain variable acoustic dialects is social adaptation. Social adaptation refers to the ability to adjust behavior to a prevailing pattern in a population. Migrating birds, for example, learn calls quickly (Salinas-Melgoza and Wright 2012), which provides reproductive benefits because the calls are acoustically familiar to potential mates (Catchpole and Slater 2008; Farabaugh and Dooling 1996). In this way, newly arriving immigrants fit in quickly and do not introduce changes into the songs of the residents, thereby maintaining the local dialect.
Vocal dialects can act as precursors to genetic isolation (e.g., in coastal US chipmunks, genus Neotamias). Dialects can also be maintained over time if the populations are separated and have little acoustic contact. This separation can be reinforced by geographic boundaries, or other isolation mechanisms, that reduce breeding chances (Gannon and Lawlor 1989). Examples include the pika (Ochotona), grasshopper mice (Onychomys), white-crowned sparrows (Zonotrichia), prairie dogs (Cynomys), and bats (Myotis evotis), which have all been shown to exhibit dialects due to geographic variation. Several species of birds, such as the chaffinch (Fringilla coelebs), have been identified as having song dialects and therefore are described as having distinct “cultures” (Slater 1981). One of the most striking examples of cultural influences is the rapid spread of new humpback whale songs across the South Pacific basin. All male humpback whales within a population generally conform to the same song pattern, making it a cultural trait. These song types move eastward across the South Pacific basin in a series of cultural waves at a geographic scale unparalleled in the animal kingdom (Garland et al. 2011).
Behavioral repertoires are malleable; that is, they are affected by the environment, learning, and interactions within a population. Variants in signal characteristics are no exception (Brumm et al. 2009). Thus, signal characteristics can act as precursors to variants in other genetic characteristics, and eventually, speciation.
body mass, and formant dispersion (e.g., domestic dog, Canis lupus familiaris, Riede and Fitch 1999; southern elephant seals, Mirounga leonina, Sanvito et al. 2007).
As a result, information about the sender’s body size, sex, age, and sometimes rank can be acquired from its vocalizations. Sounds from small or young animals are typically higher in frequency than those of larger or older animals (see Riondato et al. 2021 for an exception). Sometimes rank information is used by females selecting males. For example, the “roar” of the male red deer (Cervus elaphus) contains information on its sex and size. The larger the animal, the lower the frequency of the roar. Females choose mates based on their roar and have been found to prefer the roars of larger males (Charlton et al. 2007). The signaler’s dominance rank can also be signaled using size-related formants (e.g., male fallow deer, Dama dama, Vannoni and McElligott 2008; and baboons, Papio ursinus, Fischer et al. 2004). As the sender’s features do not change (e.g., their sex), or change slowly over time (e.g., their size or age), this is known as static information.
often associated with pleasure, close contact between animals that like each other (such as mother to young), or between social partners when close (Morton 1977).
Affiliative calls can indicate a welcoming, or “I am fond of you” context. For example, familiar elephants meeting each other after a long separation may trumpet for pleasure/joy (a high state of arousal). They also murmur to a friend, infant, or person they like who has been close, indicating a low level of arousal but a similar emotion (Kiley-Worthington 2017).
Aggressive calls include territorial calls and calls used as threats, and like affiliative calls, the agonistic call structure can change because of arousal. A highly aroused bull (Bos taurus), for example, will give visual signals (pawing, lowering his head, withdrawing his chin, and rubbing his horns in the earth) at the same time as roaring. At the highest level of threat, the roar has a vocalized inspiration as well as a vocalized expiration, known as a “see saw” call (Kiley 1972).
11.6.3 Context-Dependent Meanings
food. The variation in these food calls can indicate food quality and quantity. For example, spider monkeys (genus Ateles) are known to produce a higher call rate in response to greater quantities and quality of food. Acoustic signals can attract group members to food locations and these calls can also be used to protect the food resource from others (Clay et al. 2012). These authors examined food-associated calls made by some birds and mammals (see page 326, Table 11.1 in Clay et al. 2012) and found that most species did not produce unique calls for different foods. More commonly, signalers varied their calling rate to advertise food quality or abundance.
Therefore, context-dependent vocalizations may not necessarily convey information about the type of situation but can act as an analogue system to inform the recipient about the general level of arousal of the sender, and consequently, how (or if) to respond. In some species, calls are graded, meaning that there are intermediates between one call and another. Humpback whales, for example, use a repertoire of graded signals and the use of these signals is likely related to the motivation and arousal of the signaler (Dunlop 2017). “Grumbles” and “snorts” are used by females and their calves while migrating by themselves and presumably in a low-arousal context. Female–calf pairs can be joined by male escorts and form a competitive group, where males are fighting for access to a breeding female. In these groups, where arousal level is much higher, “grumbles” turn into harsh-sounding “roars” and “purrs,” and become more modulated to sound more like “groans” and “moans.”
Different levels of graded calls can be given in one situation. For example, cattle may give a low “mmmmm” call when in close contact with other cattle. When the animal opens its mouth, the sound gains an added syllable, “en,” giving “mmen.” When it is sufficiently aroused, a “hh” syllable is added, which is the result of letting the remaining air out of its respiratory tract. This can change even further with higher excitement or arousal by being repeated. Finally, at the highest level of arousal, the inspiratory phase of the call is also vocalized (Table 11.1). This is a very different type of auditory communication from context-independent calls such as human language where auditory communication can reflect either or both and environmental contexts or come from some thought or idea generated by cognition.
11.6.4 Species Recognition
To be sure that the call maintains the same structure (and can therefore be recognized as having the same message), there are a number of measures including call interval, maximum frequency, minimum frequency, fundamental or
Table 11.1 The variety of situations that give rise to the major call types of Bos taurus (reproduced from Kiley 1969)
Situation/call mm men menh (m)enENh SeeSaw A (no inspir) SeeSaw B (+inspir)
Confident greeting + + +
Greeting equals + + + +
Defensive threat + + +
Aggressive threat + + + +
Fear + +
Close contact retain +
Tactile stimulation +
Isolation + + + + +
Startle
Pain/fear + + +
Frustration + + + + + +
Anticipation pleasant + + + + + +
Anticipation unpleasant + + + + + +
Disturbance + + + + + +
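The graded cattle calls in Table 11.1 can be read as an ordered scale along which the call changes as arousal rises. The short Python sketch below is only an illustration of that reading and is not taken from Kiley's work. It assumes that the left-to-right order of the table columns corresponds to increasing arousal, as the surrounding text suggests; the class name CattleCall, the function names, and the integer ranks are hypothetical labels introduced here.

```python
from enum import IntEnum


class CattleCall(IntEnum):
    """Call types taken from the column headers of Table 11.1.

    Assumption: column order reflects increasing arousal. The integer
    values are arbitrary ranks, not measured quantities.
    """
    MM = 0          # "mm"
    MEN = 1         # "men"
    MENH = 2        # "menh"
    M_EN_ENH = 3    # "(m)enENh"
    SEESAW_A = 4    # "SeeSaw A (no inspir)"
    SEESAW_B = 5    # "SeeSaw B (+inspir)"


def is_higher_arousal(a: CattleCall, b: CattleCall) -> bool:
    """Return True if call `a` lies further along the graded scale than `b`."""
    return a > b


def inspiration_vocalized(call: CattleCall) -> bool:
    """Only the last step of the scale has a vocalized inspiratory phase."""
    return call is CattleCall.SEESAW_B


if __name__ == "__main__":
    # Print each call with its rank and whether inspiration is vocalized.
    for call in CattleCall:
        print(call.name, int(call), inspiration_vocalized(call))
```

Treating the calls as an ordered type rather than as unrelated labels mirrors the point made in the text: a receiver can compare two calls and infer relative arousal without knowing the situation that triggered them.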
predominant frequency, call length, duration, amplitude or loudness, and the repetition rate found in both acoustic and vibrational signals. These characteristics, combined with the presence of harmonics, form patterns that are often characteristic of a species or individual. As a result, other animals are likely to be able to identify individuals from their calls, as we can with human voices. For example, many species of vespertilionid bat can be identified by time and frequency characters measured from their echolocation calls (Gannon et al. 2003). Individual recognition is also evident in bats. Playback responses in common vampire bats (Desmodus rotundus) suggested they vocally recognized individual bats, given they were biased toward callers that had fed them more (food sharing), but not biased toward kin (Carter and Wilkinson 2016). Crickets (Teleogryllus spp.) can be differentiated based on the amplitude and repetition of their call, not just their call “note” (that is, the fundamental). The mean frequency of this signal is approximately 4 kHz, but the pattern changes and the call rate increases as the cricket’s motivation changes from “calling” to “encountering” to “fighting” to “courtship” and finally “copulating.”
11.6.5 Context-Independent Meanings
Some calls in animals, like human language, have a specific meaning, whatever the context. These calls often include alarm calls used to alert a group to the danger of an approaching predator, territorial invader, or other “alarm” in the caller’s environment. The alarm call may elicit a response by recipients to retreat, freeze in place, or conduct defensive behavior. Slobodchikoff et al. (2009) discussed the complexity of alarm calls in prairie dogs (Cynomys gunnisoni) in the southwestern United States. Slobodchikoff and his students have found that prairie dogs are precise in their signaling and can communicate a description of the predator, its size, its speed, and even its color. Wild boars (Sus scrofa), which use context-dependent calls such as “grunts” and “screams” whose meanings relate to the context, also emit a specific “warning bark,” a context-independent, short, sharp alarm call that is difficult to locate (Kiley 1972). This alarm call works to conceal the position of the signaler but conveys that a disturbing object has been sighted.
The importance of altruism (or lack of it) when vocalizing has been investigated within the context of emitting alarm calls and food calls. For example, studies have shown that even those calls that are difficult to locate (ventriloquial calls) will increase the chances of being detected by a predator (Fig. 11.11). However, studies on kinship and altruism have yet to relate the ease with which a predator can locate an alarm call to the rate of vocalizations and to actual predation (Reznikova 2019). Still, it seems that coterie members of prairie dogs (Cynomys ludovicianus) alert others to the presence of potential predators using alarm calls, and that these alarms significantly reduce predation (Wilson-Henjum et al. 2019).
Functionally referential signals are those that provide very specific information. They are structurally distinct and reflect a stimulus-specific meaning used only in a very specific set of circumstances. Most alarm calls are nonspecific, but the vervet monkey (Chlorocebus pygerythrus) uses a lexicon of four or five sounds to identify the type of intruder. When a major bird or mammal predator is nearby, the vervet produces a “chirp” and “bark” (Strusaker 1966). When a snake is nearby it evokes a special “chutter” call, a minor bird or mammalian predator is indicated by an abrupt “uh” or “nyow” sounding signal, and a major bird predator elicits a “rraup.”
Distress calls can be context independent, such as the calls used by young to attract adults to their location. African wild dog (Lycaon pictus) pups, for example, emit a “lamenting call” when they are deserted by their parents. Precocial birds, such as domestic fowl, ducks, or geese, “pipe” in the same way as when they are cold or hungry. Young collared lemmings (Dicrostonyx groenlandicus) emit ultrasonic chirps when they are abandoned, cold, or feel as if they are in danger (Sales and Pye 1974). Young primates,
11.6.6 Songs
population over time. Songs can also completely change from one year to the next, known as a song revolution. This is thought to be due to the influx of males from a different population, carrying with them their own song. Males from the original population then pick up and learn this new song, causing the song within that population to completely change (Noad et al. 2000).
A duet is an exchange of sounds or substrate-borne vibrations between a pair of animals, often produced in rapid succession (Fig. 11.13). The duet may be so rapid that it is difficult to distinguish which animal is producing the various parts. It functions as a contact-maintaining signal, and individual mated pairs within a species can develop their own unique duet, helping them to maintain contact with their partner. Duets are especially common in frogs, birds (cranes, sea eagles, geese, quail, grebes, woodpeckers, barbets, megapode scrub hens, kingfishers, ravens, cuckoo-shrikes, and honey-eaters), tree shrews (mammalian order Scandentia), and siamang (Symphalangus syndactylus), as well as being common in major groups of insects that communicate via substrate-borne vibrations. Species that perform duets are often monogamous (such as siamangs) and the two sexes resemble each other in appearance (that is, they are not dimorphic).
Duets are used when mated pairs are required to remain in touch over long periods of time. Duetting can be especially important in environments, such as dense vegetation, where birds cannot see each other. By duetting, pairs keep close to each other, and in synchrony, so when conditions in a variable environment become right, mating can be achieved quickly and efficiently. In most gibbon species (family Hylobatidae), males, and some females, sing solos that function to attract mates and advertise their territory. If a male and female like one another’s song, they will find each other and conduct a short mating dance followed by a long vigorous mating ritual. The song dialect is used to identify the singing gibbon’s species and the area it is from. Therefore, duetting also reduces hybridization with closely related species (Mitani and Marler 1989).
11.6.7 From Chorusing to Copulation
Males that chorus (e.g., frogs, toads, and insects such as locusts (order Orthoptera) and cicadas (order Hemiptera)) attract females to a localized area. A classic example is the periodical cicadas (Magicicada sp.). Millions of 17-year-cycle cicadas gather to mate in forests in the eastern United States. Males aggregate into chorus centers and attract mates by producing high-intensity sounds (Fig. 11.13). The desert locust
After mate attraction comes copulation. Ovulation in female alpacas (Vicugna pacos) is thought to be stimulated during copulation, when the male produces a loud “orrgle” for 30 to 45 minutes while mounting the female (Abba et al. 2013). Calling may continue even after copulation; for example, the tree frog Phyllomedusa (Hylidae) gives a separate call after oviposition.
11.7 Comparing Human Language to Nonhuman Auditory Communication
Given the phenomenal array of different types of auditory communication in the different species, what are the defining characteristics of human language? Human language involves the use of vocal sounds that are symbolic of meanings, and therefore context independent. Thus, human language can be understood in the total absence of the communicator, such as when written, or when heard on the telephone.
There is a vast literature on human language, and a whole field of study: linguistics. Many scientists believe that the development of human language was the most important evolutionary step in distinguishing humans biologically. It is also widely maintained that development of human language was responsible for the further cognitive development of humans. Interestingly, nonhumans respond to general sounds and emotions in human language. More recent work has shown that some primates, dogs, marine mammals, horses, and elephants comprehend individual words and phrases. In fact, with experience, they understand a great deal more human language than we previously assumed (e.g., de Waal 2016; Kiley-Worthington 2017). Young mammals, human or nonhuman, learn the meaning of words not only by conditioning, as the behaviorists believed (Skinner 1957), but also by observing others, by imitation, and by learning about cause and effect.
One of the first experiments to test if nonhumans could learn to speak a human language was the Kelloggs’ studies (Kellogg and Kellogg 1933). This family raised a young chimpanzee (Pan troglodytes) with their son and treated her similarly. At the end of several years, although their son was talking, the chimp found great difficulty making human sounds, and managed only “mama.” The conclusion was that the chimp’s inability to learn language implied that chimps have lower intelligence than humans. However, it was later discovered that the reason for her difficulty in making speech sounds was not mental or cognitive but physiological. She did not have the necessary muscles to control the sophisticated movements of the tongue, larynx, and buccal and nasal cavities needed to make the different sounds (Lyn 2012). More recently, Fitch (2011) has argued that humans have what he called a “language ready brain.” However, Savage-Rumbaugh et al. (2009) argue strongly that human language may not be any more sophisticated than ape languages. This is supported by the recognition of the many mental homologies between humans and other mammals (e.g., Kiley-Worthington 2017).
Since the middle of the twentieth century, the distinguishing features found in human language have been widely discussed, and the synopsis developed by Hockett (1960) is still widely adhered to. The first question is to what degree these defining features are found in other species (Table 11.2).
This list has been elaborated, extended, and modified to include tactile, visual, taste, and olfactory communication (e.g., Christin 1999). The vocal repertoire of many species has been shown to fulfill most of these characteristics, and a list of some of the most pertinent studies is given here (e.g., Fitch 2011; Herman et al. 1984; Schusterman and Kastak 1998; Nehaniv and Dautenhahn 2002; Rendell and Whitehead 2001; Christiansen and Kirby 2003).
To simplify, two human specializations distinguish human spoken language from the communication of other species. The first is that the human spoken language, unlike auditory communication of many other species (although not all), is mainly (but not exclusively) context independent. That is, the same word means the same thing in any context. Humans have developed this
Table 11.2 Design features of human language and whether they have been recorded in other species. The species listed
here are only examples, since there are others for which better evidence exists
Design features Humans Chimpanzees Horses Elephants
PRODUCTIVITY + + + +
Different components together at different times
ARBITRARINESS + + + +
Different responses to same display
INTERCHANGEABILITY + + + ?
One display triggers another
SPECIALIZATION + + + +
Not directly related to consequences
DISPLACEMENT + + + +
Key features not related to antecedents
CULTURAL TRANSMISSION + + + +
Differences between populations as a result of learning
DUALITY + + ? ?
Symbols form sentences; components of expression contribute to
whole interpretation
“water bird” (Gardner and Gardner 1984). Gluck (2016), in his account of grappling with central philosophical problems in animal ethics, recollects one of his weekly lab meetings (he was part of a research lab known for numerous breakthroughs in psychology and animal behavior) where the graduate students would discuss their research and topics of the day; signing chimps was a hot topic at the time. He noted that one of the students, a bit of a maverick, inquired whether the chimp ever asked “Can I go home now?” or “Can I leave?” Gluck and the other students dismissed this as foolhardy and would spend the next two decades exploring how primate models could inform human biomedical and behavioral science. But that is still the question of our time. If a captive animal could, would it ask to be released? Would it ask “Why are you doing this to me?” These animal-intensive tests came under extreme criticism from other scientists (Terrace 1985). Since then, a gorilla, bonobos (Pan paniscus), and other chimps have learned to use computer symbols as a human-type language (Hopkins and Savage-Rumbaugh 1991). Kenneally explored the origin of the first word, and speculated on which great apes might have been capable of speaking the first word. Among other things, she said that such a speaker would have to have the anatomical and physiological capacity for speech, but would also have to have something to say. In her view, this probably eliminated chimps, which she thought were immature and lacking in focus, rather than cognitively limited (Kenneally 2007).
Thomas Nagel’s (1974) thought-provoking question “What is it like to be a bat?” argues that humans might imagine what it is like to be another being but can never know the conscious mental state of being that species, or even another human. We can look at systems, patterns, and responses, but each species and every human retain their own secrets and have their own experiences. That does not mean we should not try to understand nonhuman auditory and vibrational communication signals. These different world views, or knowledge of the world, lead us to a study of the epistemology of different species. Let us hope that we begin seriously to investigate this before it is too late and many species have become extinct due to our actions, most of which are the consequences of human language.
11.8 Summary
With modern technological aids and further studies, the study of acoustic and substrate-borne vibrational communication has advanced considerably since Busnel’s (1963) seminal work. The origins of acoustic communication are likely to be from sounds associated with moving about in the environment and breathing in and out through respiratory passages. These sounds have become specialized for communication. Likewise, as animals move, regardless of how quietly, the motions lead to vibrations through the substrate that can be detected by others of the same or different species. Responses to these vibrations by others are reinforced or are lethal to the receiver, but likely also inform the sender. The first step is for the sounds or vibrations to become ritualized, leading to displays. The development of the necessary sending and receiving structures, such as the larynx or the insect tymbal, and a sensory apparatus such as the ear or subgenual organ, facilitated the evolution of an extremely diverse range of auditory and vibratory signals and cues, of which only some are described here.
Auditory and vibratory communication each has advantages and disadvantages. Though a signal can travel through substrates, meaning the signaler does not have to be in visual range, it can be overheard by others. Atmospheric conditions can influence the signal and other sounds/vibrations can mask it. Geographic separation of animals within a population can cause auditory and vibrational signals to evolve over time into different dialects and cultural waves. This variation can eventually separate animals within a species into different populations. One thing that is becoming increasingly clear is that there is not much time to uncover more about the complexities of auditory and substrate-borne vibrational communication in nonhumans before
the behavior of our species, as human language users, has led to the extinction of many species.
References
Abba MA, Bianchi C, Cavilla V (2013) Chapter 15: South American camelids. In: Tynes VV (ed) The behavior of exotic pets. Wiley-Blackwell, New York
Amos JN, Harrisson JA, Radford JQ, White M, Newell G, Nally NM, Sunnucks P, Pavlova A (2014) Species- and sex-specific connectivity effects of habitat fragmentation in a suite of woodland birds. Ecology 95(6):1556–1568. https://doi.org/10.1890/13-1328.1
Aubin T, Jouventin P (1998) Cocktail-party effect in king penguin colonies. Proc R Soc B 265:1665–1673
Bain DE, Dahlheim ME (1994) Effects of masking noise on detection thresholds of killer whales. In: Bain DE, Dahlheim ME (eds) Marine mammals and the Exxon Valdez. Elsevier, New York
Baptista LF, Gaunt SSL (1997) Social interaction and vocal development in birds. In: Snowdon CT, Hausberger M (eds) Social influences on vocal development. Cambridge University Press, Cambridge
Baze W (1950) Just elephants. Corgi Books, London
Bee MA (2007) Sound source segregation in grey treefrogs: spatial release from masking by the sound of a chorus. Anim Behav 74(3):549–558. https://doi.org/10.1016/j.anbehav.2006.12.012
Bennet-Clark H (1998) How cicadas make their noise. Sci Am 278(5):58–61. http://www.jstor.org/stable/26057783. Retrieved 7 Feb 2021
Bennet-Clark HC (2000) Resonators in insect sound production: how insects produce loud pure-tone songs. J Exp Biol 202:3347–3357
Benney KS, Braaten RF (2000) Auditory scene analysis in estrildid finches (Taeniopygia guttata and Lonchura striata domestica): a species advantage for detection of conspecific song. J Comp Psychol 114:174–182
Bradbury JW, Vehrencamp SL (1998) Principles of animal communication. Sinauer Associates, Sunderland, MA
British Broadcasting Corporation (BBC) News (2020) The
Brumm H, Schmidt R, Schrader L (2009) Noise-dependent vocal plasticity in domestic fowl. Anim Behav 78:741–746
Bush K, Aldridge CL, Carpenter JE, Paszkowski CA, Boyce MS, Coltman DW (2010) Birds of a feather do not always lek together: genetic diversity and kinship structure of greater sage-grouse (Centrocercus urophasianus) in Alberta. Auk 127(2):343–353. https://doi.org/10.1525/auk.2009.09035
Busnel RG (ed) (1963) Acoustic behavior of animals. Elsevier, New York
Carter GG, Wilkinson GS (2016) Common vampire bat contact calls attract past food-sharing partners. Anim Behav 116:45–51. https://doi.org/10.1016/j.anbehav.2016.03.005
Catchpole CK, Slater PJB (2008) Bird song. Biological themes and variations, 2nd edn. Cambridge University Press, Cambridge
Charlton BD, Reby D, McComb K (2007) Female perception of size-related formant shifts in red deer (Cervus elaphus). Anim Behav 74:707–714
Christiansen MH, Kirby S (2003) Language evolution. Oxford University Press, New York
Christin AM (1999) Les origines de l’écriture: Image, signe, trace. Le Débat 106(4):28. https://doi.org/10.3917/deba.106.0028
Clay Z, Smith CL, Blumstein DT (2012) Food-associated vocalizations in mammals and birds: what do these really mean? Anim Behav 83:323–330
Cocroft R, Gogala M, Hill PSM, Wessel A (2014) Studying vibrational communication. Springer, Berlin
Cooper LD, Randall JA (2007) Seasonal changes in home ranges of the giant kangaroo rat (Dipodomys ingens): a study of flexible social structure. J Mammal 88:1000–1008. https://doi.org/10.1644/06-MAMM-A-197R1.1
Cynx J, Lewis R, Tavel B, Tse H (1998) Amplitude regulation of vocalizations in noise by a songbird, Taeniopygia guttata. Anim Behav 56(1):107–113. https://doi.org/10.1006/anbe.1998.0746
David Mech LH, Boitani L (eds) (2003) Wolves: behavior, ecology, and conservation. University of Chicago Press, Chicago, p 472
De Waal F (2016) Are we smart enough to know how smart animals are? W W Norton & Company,
biblical locust plagues of 2020. https://www.bbc.com/ New York
future/article/20200806-the-biblical-east-african- Dent ML, Larsen ON, Dooling RJ (1997) Free-field bin-
locust-plagues-of-2020 aural unmasking in budgerigars (Melopsittacus
Brownell P, Farley RD (1979) Orientation to vibrations in undulatus). Behav Neurol 111(3):590–598. https://
sand by the nocturnal scorpion Paruroctonus doi.org/10.1037/0735-7044.111.3.590
mesaensis: mechanisms of target localization. J Comp Dunlop RA (2017) Potential motivational information
Physiol A 131:31–38 encoded within humpback whale non-song vocal
Brumm H, Todt D (2002) Noise-dependent song ampli- sounds. J Acoust Soc Am 141(3):2204–2213. https://
tude regulation in a territorial songbird. Anim Behav doi.org/10.1121/1.4978615
63(5):891–897. https://doi.org/10.1006/anbe.2001. Dunlop RA, Cato DH, Noad MJ (2010) Your attention
1968 please: increasing ambient noise levels elicits a change
Brumm HB, Zollinger SA (2011) The evolution of the in communication behaviour in humpback whales
Lombard effect: 100 years of psychoacoustic research. (Megaptera novaeangliae). Proc R Soc B 277(1693):
Behav 148(11–13):1173–1198. https://doi.org/10. 2521–2529. https://doi.org/10.1098/rspb.2009.2319
1163/000579511X605759
414 R. Dunlop et al.
Dunlop RA, Cato DH, Noad MJ (2014) Evidence of a Popper AN (eds) Hearing and sound communication in
Lombard response in migrating humpback whales amphibians, vol 28. Springer, New York, pp 113–146
(Megaptera novaeangliae). J Acoust Soc Am 135(1): Gluck JP (2016) Voracious science and vulnerable
430–437 animals. A primate scientist’s ethical journey. Univer-
Emlen ST (1972) An experimental analysis of the sity of Chicago Press, Chicago, IL
parameters of bird song eliciting species recognition. Gordon SD, Tiller B, Windmill JFC, Krugner R, Narins
Behaviour 41(1/2):130–171 PM (2019) Transmission of the frequency components
Endler JA (2014) The emerging field of tremology. In: of the vibrational signal of the glassy-winged sharp-
Cocroft R, Gogala M, Hill PSM, Wessel A (eds) shooter, Homalodisca vitripennis, in and between
Studying vibrational communication. Springer, Berlin, grapevines. J Comp Physiol 205:783–791
pp vii–vix. https://doi.org/10.1007/978-3-662-43607-3 Henry L, Barbu S, Lemasson A, Hausberger M (2015)
Evans CS, Marler P (1991) On the use of video images as Dialects in animals: evidence, development and poten-
social stimuli in birds – audience effects on alarm tial functions. Anim Behav Cogn 2(2):132–155.
calling. Anim Behav 41:17–26 https://doi.org/10.12966/abc.05.03.2015
Evans CS, Marler P (1994) Food calling and audience Herman LM, Richards DG, Wolz JP (1984) Comprehen-
effects in male chickens, Gallus gallus – their sion of sentences by bottlenosed dolphins. Cognition
relationships to food availability, courtship and social 16(2):129–219. https://doi.org/10.1016/0010-0277
facilitation. Anim Behav 47(5):1159–1170 (84)90003-9
Ey E, Pfefferle D, Fischer J (2007) Do age- and sex-related Heyes C, Dickinson A (1990) The intentionality of animal
variations reliably reflect body size in non-human pri- action. Mind Lang 5(1):87–104
mate vocalizations? A review. Primates 48:253–267 Hill PSM (1998) Environmental and social influences on
Farabaugh SM, Dooling RJ (1996) Acoustic communica- calling effort in the prairie mole cricket (Gryllotalpa
tion in parrots: laboratory and field studies of major). Behav Ecol 9(1):101–108. https://doi.org/10.
budgerigars, Melopsittacus undulatus. In: Kroodsma 1093/beheco/9.1.101
DE, Miller EH (eds) Ecology and evolution of acoustic Hill PSM (1999) Lekking in Gryllotalpa major, the prairie
communication in birds. Cornell University Press, mole cricket (Insecta: Gryllotalpidae). Ethology 105:
New York, pp 97–117. https://doi.org/10.7591/ 531–545
9781501736957 Hill PSM (2008) Vibrational communication in animals,
Fischer J, Kitchen D, Seyfarth RM, Cheney DL (2004) 1st edn. Harvard University Press, London
Baboon loud calls advertise male quality: acoustic Hill PSM, Wessel A (2016) Biotremology. Curr Biol 26:
features and relation to rank, age, and exhaustion. R181–R191
Behav Ecol Sociobiol 56:140–148 Hill PSM, Lakes-Harland R, Mazzoni V, Narins PM,
Fitch WT (1997) Vocal tract length and formant frequency Virant-Doberlet M, Wessel A (2019) Biotremology:
dispersion correlate with body size in rhesus macaques. studying vibrational behavior. Springer Nature,
J Acoust Soc Am 102:1213. https://doi.org/10.1121/1. Cham, Switzerland
421048 Hill PSM, Mazzoni V, Stritih Peljhan N, Virant-Doberlet
Fitch WT (2011) Unity and diversity in human language. M, Wessel A (2022) Biotremology: physiology, ecol-
Philos Trans R Soc Lond B Biol Sci 366(1563): ogy and evolution. Springer Nature, Cham,
376–388. https://doi.org/10.1098/rstb.2010.0223 Switzerland
Gannon WL, Lawlor TE (1989) Variation of the Chip Hoch H, Deckert J, Wessel A (2006) Vibrational signal-
vocalization of three species of Townsend chipmunks ling in a Gondwanan relict insect (Hemiptera:
(Genus Eutamias). J Mammal 70(4):740–753. https:// Coleorrhyncha: Peloridiidae). Biol Lett 2:222–224
doi.org/10.2307/1381708 Hockett CF (1960) The origin of speech. Sci Am 203:88–
Gannon WL, O’Farrell MJ, Corben C, Bedrick EJ (2003) 111
Call character lexicon and analysis of field recorded bat Hopkins WD, Savage-Rumbaugh ES (1991) Vocal com-
echolocation calls. In: Thomas J, Moss C, Vater M munication as a function of differential rearing
(eds) Echolocation in bats and dolphins. University of experiences in Pan paniscus: a preliminary report. Int
Chicago Press, Chicago, IL, pp 478–484 J Primatol 12(6):559–583
Gardner RA, Gardner BT (1984) A vocabulary test for Hulse SH, MacDougall-Shackleton SA, Wisniewski AB
chimpanzees (Pan troglodytes). J Comp Psychol 98(4): (1997) Auditory scene analysis by songbirds: stream
381–404. https://doi.org/10.1037/0735-7036.98.4.381 segregation of birdsong by European starlings (Sturnus
Garland EC, Goldizen AW, Rekdahl ML, Constantine R, vulgaris). J Comp Psychol 111:3–13
Garrigue C, Daeschler Hauser N, Poole MM, Huxley JS (1966) A discussion of ritualization of behavior
Robbins J, Noad MJ (2011) Dynamic horizontal cul- in animals and man. Philos Trans R Soc B 251:247–
tural transmission of humpback whale song at the 271
ocean Basin Scale. Curr Biol 21(8):687–691. https:// Jackson A, DeArment R (1963) The lesser prairie chicken
doi.org/10.1016/j.cub.2011.03.019 in the Texas panhandle. J Wildl Manag 27(4):733–737.
Gerhardt HC, Bee MA (2006) Recognition and localization https://doi.org/10.2307/379848
of acoustic signals. In: Narins PM, Feng AS, Fay RR,
11 Vibrational and Acoustic Communication in Animals 415
Janik VM, Slater PJB (2000) The different roles of social amplitude in plant-borne vibrational
learning in vocal communication. Anim Behav 60(1): communication. In: Cocroft RB, Gogala M, Hill
1–11 PSM, Wessel A (eds) Studying vibrational communi-
Keighley MV, Langmore NE, Zdenek CN, Heinsohn R cation. Springer, Berlin, pp 125–145
(2017) Geographic variation in the vocalizations of McKelvey EGZ, Gyles JP, Michie K, Barquín
Australian palm cockatoos (Probosciger aterrimus). Pancorbo V, Sober L, Kruszewski LE, Chan A, Fabre
Bioacoustics 26(1):91–108. https://doi.org/10.1080/ CCG (2021) Drosophila females receive male
09524622.2016.1201778 substrate-borne signals through specific leg neurons
Kellogg WN, Kellogg LA (1933) The ape and the child: a during courtship. Curr Biol. https://doi.org/10.1016/j.
comparative study of the environmental influence upon cub.2021.06.002
early behavior. Hafner Publishing, New York and Mitani JC, Marler P (1989) A phonological analysis of
London male gibbon singing behavior. Behaviour 109(1/2):
Kenneally C (2007) The first word: the search for the 20–45
origins of language. Viking, New York Morris D (1956) The function and cause of courtship
Kiley M (1969) The origin and evolution of some displays ceremonies in L’instinct en le comportement des
in canids, felids and ungulates with particular reference animeaux et des hommes. Fondation Singer-Polignac,
to causation. vol 1. Vocalisations, vol 2 Tail and ear Colloque International Sur L’instinct, Paris, pp
movements. D. Phil University of Sussex 261–266
Kiley M (1972) The vocalisations of ungulates canids and Morris D (1957) Typical intensity and its relationship to
felids with particular reference to their origin, causa- the problem of ritualization. Behaviour 11:12
tion and function. Zeit fur Tierpsychol 31:71–222 Morton ES (1977) On the occurrence and significance of
Kiley-Worthington M (2017) The mental homologies of motivational-structural rules in some bird and mammal
mammals. Towards an understanding of another sounds. Am Nat 111:855–869
mammals world view. Animals 7(12):87. Morton ES (1982) Grading, discreteness, redundancy and
Kondo W, Watanabe S (2009) Contact calls: information motivational-structural rules. In: Kroodsma DE, Miller
and social function. Jpn Psychol Res 51(3):197–208. EH (eds) Evolution and ecology of acoustical communi-
https://doi.org/10.1111/j.1468-5884.2009.00399.x cation in birds. Academic Press, New York, pp 183–212
Kroodsma DE (ed) (1982) Acoustic communication in Nagel T (1974) What is it like to be a bat? Philos Rev
birds. Production, perception and design features of 83(4):435–450
sounds, vol 1. Academic Press, New York Narins PM (1990) Seismic communication in anuran
Kroodsma DE, Miller EH (eds) (1982) Acoustic commu- amphibians. Bioscience 40:268–274
nication in birds. Song learning and its consequence, Narins PM, Lewis ER (1984) The vertebrate ear as an
vol 2. Academic Press, New York exquisite seismic sensor. J Acoust Soc Am 76:1384–
Lakes-Harlan R, Strauss J (2014) Functional morphology 1387
and evolutionary diversity of vibration receptors in Narins PM, Losin N, O’Connell-Rodwell CE (2009) Seis-
insects. In: Cocroft RB, Gogala M, Hill PSM, Wessel mic and vibrational signals in animals. In: Squire LR
A (eds) Studying vibrational communication. Springer, (ed) Encyclopedia of neuroscience. Elsevier,
Berlin, pp 277–302 Amsterdam, pp 555–559
Lyn H (2012) Apes and the evolution of language: taking Nehaniv C, Dautenhahn K (2002) Imitation in animals and
stock of 40 years of research. In: Vonk J, Todd K (eds) artifacts. MIT Press, Cambridge, MA
Shackelford Oxford handbook of comparative evolu- Noad MJ, Cato DH, Bryden MM, Jenner MN, Jenner KCS
tionary psychology, Chapter 19. Oxford University (2000) Cultural revolution in whale songs. Nature 408:
Press, Oxford. https://doi.org/10.1093/oxfordhb/ 537. https://doi.org/10.1038/35046199
9780199738182.013.0019 O’Connell-Rodwell CE, Arnason BT, Hart LA (1997)
Manabe K, Sadr EL, Dooling RJ (1998) Control of vocal Seismic transmission of elephant vocalizations and
intensity in budgerigars (Melopsittacus undulatus): movement. J Acoust Soc Am 102:3124
Differential reinforcement of vocal intensity and the O’Connell-Rodwell CE, Arnason BT, Hart LA (2000)
Lombard effect. J Acoust Soc Am 103:1190. https:// Seismic properties of Asian elephant (Elephas
doi.org/10.1121/1.421227 maximus) vocalizations and locomotion. J Acoust Soc
Marler P (1961) Logical analysis of animal communica- Am 108:3066–3072
tion. J Theor Biol 1:295–317. https://doi.org/10.1016/ O’Connell-Rodwell C, Guan X, Puria S (2019) Vibra-
0022-5193(61)90032-7 tional communication in elephants: a case for bone
Marler P, Doupe AJ (2000) Singing in the brain. Proc Natl conduction. In: Hill PSM, Lakes-Harlan R,
Acad Sci 97(7):2965–2967. https://doi.org/10.1073/ Mazzoni V, Narins PM, Virant-Doberlet M, Wessel
pnas.97.7.2965 A (eds) Biotremology: studying vibrational behavior.
Marler P, Tamura M (1964) Culturally transmitted patterns Springer Nature, Cham, Switzerland, pp 259–276
of vocal behavior in sparrow. Science 146:1483–1486 O’Farrell MJ, Corben C, Gannon WL (2000) Geographic
Mazzoni V, Eriksson A, Anfora G, Lucchi A, Virant- variation in the echolocation call of the hoary bat
Doberlet M (2014) Active space and the role of (Lasiurus cinereus). Acta Chiropterol 2(2):185–196
416 R. Dunlop et al.
Oliveira RF, McGregor PK, Latruffe C (1998) Know thine Savage-Rumbaugh S, Rumbaugh D, Fields W (2009)
enemy: fighting fish gather information from observing Empirical kanzi: The ape language controversy
conspecific interactions. Proc R Soc B 265(1401): revisited. Skeptic 15(1):25–33
1045–1049 Schaller GB (1964) The year of the gorilla. University of
Ota N, Soma M (2022) Vibrational signals in multimodal Chicago Press, Chicago, p 304
courtship displays of birds. In: Hill PSM, Mazzoni V, Scheiber IBR, Weiß BM, Kingma SA, Komdeur J (2017)
Stritih-Peljhan N, Virant-Doberlet M, Wessel A (eds) The importance of the altricial – precocial spectrum for
Biotremology: physiology, ecology and evolution. social complexity in mammals and birds – a review.
Springer Nature, Cham Front Zool 14:3. https://doi.org/10.1186/s12983-016-
Peake TM, McGregor PK (2004) Information and aggres- 0185-6
sion in fishes. Anim Learn Behav 32(1):114–121 Schusterman RJ, Kastak D (1998) Functional equivalence
Podos J, Warren PS (2007) The evolution of geographic in a California Sea lion: relevance to animal social and
variation in birdsong. In: Advances in the study of communicative interactions. Anim Behav 55(5):
behavior. Elsevier, New York 1087–1095. https://doi.org/10.1006/anbe.1997.0654
Potash LM (1972) Noise induced changes in calls of the Sebeok T (1977) How animals communicate. Indiana
Japanese quail. Psychonomic Sci 26(5):252–254 University Press, Bloomington
Randall JA (1984) Territorial defense and advertisement Skinner BF (1957) Century psychology series. In: Verbal
by footdrumming in bannertail kangaroo rats behavior. Appleton-Century-Crofts, New York.
(Dipodomys spectabilis) at high and low population https://doi.org/10.1037/11256-000
densities. Behav Ecol Sociobiol 16:11–20. https://doi. Slater PJB (1981) Cultural evolution in chaffinch song:
org/10.1007/BF00293099 process inferred from micro and macro geographical.
Randall JA, Lewis ER (1997) Seismic communication Biol J Linnaean Soc 42:135–147
between the burrows of kangaroo rats, Dipodomys Slater PJB (1986) The cultural transmission of bird song.
spectabilis. J Comp Physiol 181(5):525–531. https:// Trends Ecol Evol 1(4):94–97
doi.org/10.1007/s003590050136 Slater PJB (1989) Bird song learning: causes and
Ratnam R, Feng A (1998) Detection of auditory signals by consequences. Ethol Ecol Evol 1(1):19–46. https://
frog inferior collicular neurons in the presence of spa- doi.org/10.1080/08927014.1989.9525529
tially separated noise. J Neurophysiol 80:2848–2859. Slobodchikoff CN, Perla BS, Verdolin JL (2009)
https://doi.org/10.1152/jn.1998.80.6.2848 Prairie dogs: communication and community in an
Rendell L, Whitehead H (2001) Culture in whales and animal society. Harvard University Press,
dolphins. Behav Brain Sci 24(2):309–382. https://doi. Cambridge, MA
org/10.1017/S0140525X0100396X Smith WJ (1969) Messages of vertebrate communication.
Reznikova Z (2019) Evolutionary and behavioural aspects Science 165(3889):145–150
of altruism in animal communities: is there room for Strusaker TT (1966) Auditory communication among ver-
intelligence? In: Evolution: cosmic, biological, and vet moneys (Cercopithecus aethiops). In: Alternment
social. Almanac, Dublin SA (ed) Social communication among primates.
Riede T, Fitch WT (1999) Vocal tract length and acoustics Chicago University Press, Chicago, pp 281–384
of vocalization in the domestic dog, Canis familiaris. J Stumpner A, von Helversen D (2001) Evolution and func-
Exp Biol 202:2859–2867 tion of auditory systems in insects. Naturwis-
Riondato I, Gamba M, Tan CL, Niu K, Narins PM, senschaften 88:159–170
Yang Y, Giacoma C (2021) Allometric escape and Šturm R, Rexhepi B, López Díez JJ, Blejec A, Polajnar J,
acoustic signal features facilitate high-frequency com- Sueur J, Virant-Doberlet M (2021) Vibroscape – an
munication in an endemic Chinese primate. J Comp overlooked world of vibrational communication.
Physiol 207:327–336 iScience, revision in review
Roberts L, Howard DR (2022) Substrate-borne vibrational Sullivan J, Demboski JR, Bell KC, Hird S, Sarver B,
noise in the Anthropocene: From land to sea. In: Hill Reid N, Good JM (2014) Divergence with gene flow
PSM, Mazzoni V, Stritih-Peljhan N, Virant-Doberlet within the recent chipmunk radiation (Tamias). Hered-
M, Wessel A (eds) Biotremology: physiology, ecology ity 11:185–194. https://doi.org/10.1038/hdy.2014.27
and evolution. Springer Nature, Cham Sutton D, Nadler C (1974) Systematic revision of three
Sales G, Pye D (1974) Ultrasonic communication in Townsend Chipmunks (Eutamias townsendii). South-
animals. Chapman & Hall, London west Nat 19(2):199–211. https://doi.org/10.2307/
Salinas-Melgoza A, Wright TF (2012) Evidence for vocal 3670280
learning and limited dispersal as dual mechanisms for Terrace HS (1985) Animal cognition: thinking without
dialect maintenance in a parrot. PLoS One. https://doi. language. Philos Trans R Soc Lond B 308(1135):
org/10.1371/journal.pone.0048667 113–128. https://doi.org/10.1098/rstb.1985.0014
Sanvito S, Galimberti F, Miller EH (2007) Vocal signaling Vannoni E, McElligott AG (2008) Low frequency groans
of male southern elephant seals is honest but imprecise. indicate larger and more dominant fallow deer (Dama
Anim Behav 73:287–299 dama) males. PLoS One 3:e3113
11 Vibrational and Acoustic Communication in Animals 417
Virant-Doberlet M, Čokl A (2004) Vibrational communi- RR, Popper AN (eds) Hearing and sound communica-
cation in insects. Neotrop Entomol 33:121–134 tion in amphibians, vol 28. Springer, New York, pp
Virant-Doberlet M, Mazzoni V, de Groot M, Polajnar J, 44–86
Lucchi A, Symondson WOC, Čokl A (2014) Vibra- Wessel A, Mühlethaler R, Hartung V, Kuštor V, Gogala M
tional communication networks: eavesdropping and (2014) The tymbal: evolution of a complex vibration-
biotic noise. In: Cocroft R, Gogala M, Hill PSM, producing organ in the Tymbalia (Hemiptera excl.
Wessel A (eds) Studying vibrational communication. Sternorrhyncha). In: Cocroft R, Gogala M, Hill PSM,
Springer, Berlin, pp 93–123 Wessel A (eds) Studying vibrational communication.
Volodin IA, Matrosova VA, Frey R et al (2018) Altai pika Springer, Berlin, pp 395–444
(Ochotona alpina) alarm calls: individual acoustic var- Wiley RH (1983) The evolution of communication: Infor-
iation and the phenomenon of call-synchronous ear mation and manipulation. In: Halliday TR, PJB S (eds)
folding behavior. Sci Nat 105:40. https://doi.org/10. Animal behavior, volume 2, communication, vol 225.
1007/s00114-018-1567-8 W. H. Freeman, New York, pp 156–189
Walker SF (1998) Animal communication. In: Mey JL Wilson-Henjum GE, Job JR, McKenna MF et al (2019)
(ed) Concise encyclopedia of pragmatics. Elsevier, Alarm call modification by prairie dogs in the presence
Amsterdam, pp 26–35 of juveniles. J Ethol 37:167–174. https://doi.org/10.
Wells KD, Schwartz JJ (2006) The behavioral ecology of 1007/s10164-018-0582-8
anuran communication. In: Narins PM, Feng AS, Fay
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores
Signe M. M. Brinkløv, Lasse Jakobsen, and Lee A. Miller
In this chapter, we review basic concepts about echolocation, the variety of animals known to
echolocate, the main types of echolocation signals they use, and how they produce and receive those
signals. The topic of perception by echolocating animals is beyond the scope of this chapter.

12.2 Characteristics of Echolocation Signals

Echolocating animals use two broad classes of sounds. Toothed whales, rousette bats, and birds
generate broadband clicks produced at varying rates. The vast majority of bats, however, use
tonal echolocation signals, characterized by longer duration and either a constant frequency or,
more commonly, frequency modulation (FM; i.e., sweeping across several frequencies over time).
With the exception of certain bat species, echolocating animals time their outgoing pulses
so the echo from a previous pulse does not overlap with the next outgoing signal, especially during
general orientation and searching for prey. This separation ensures that the strong outgoing
signal does not mask the fainter returning echoes from the previous signal (Jen and Suga 1976;
Kalko and Schnitzler 1989; Verfuss et al. 2009).

Bats and odontocetes both show characteristic changes in echolocation behavior as they
approach objects. Notably, most species in both groups adjust the sound emission rate to the
distance of the target. The click rate increases as they approach objects, and numerous species
emit a terminal buzz (i.e., a series of pulses or clicks in rapid succession) during prey capture
(Fig. 12.1). In bats, these temporal changes are accompanied by a change from narrow to wider
bandwidths and lower to higher frequencies as they move from an open to a cluttered aerial
environment or detect airborne insect prey. Such pronounced, systematic changes have not been
documented in oilbirds or swiftlets.

Fig. 12.1 Echolocation sequence from a harbor porpoise (Phocoena phocoena) and a Daubenton’s bat
(Myotis daubentonii) as they approach and capture prey. Both species increase the rate of sound
emission as they approach prey and emit a terminal buzz immediately before prey capture

Echolocation signals are often much higher in amplitude than other sounds produced by animals.
Amplitudes of bat echolocation signals are typically given at a reference distance of 0.1 m in front
of the mouth or nostril. For whales and birds, source levels are referenced to a distance of 1 m
in front of the animal. Source levels of bats are variable, but generally higher in aerial-feeding bats
that fly and search for prey in the open sky (typically 100–130 dB re 20 μPa at 0.1 m). Bats that fly
and forage in vegetation use lower-amplitude signals. Among these, the so-called “whispering
bats” (e.g., slit-faced bats (Nycteridae), false vampire bats (Megadermatidae), and many New
World leaf-nosed bats (Phyllostomidae)), emit echolocation sounds at about 65–70 dB re
20 μPa at 0.1 m (Jakobsen et al. 2013a). The source level of a dolphin’s echolocation signal is
several orders of magnitude greater than that of a bat’s signal, primarily owing to the different
properties of the two media (see next section) (Madsen and Surlykke 2014). Echolocation clicks
of bottlenose dolphins (Tursiops truncatus) can reach source levels of 225 dB re 1 μPa at 1 m
peak-to-peak (Au 1993, p. 78). Source levels of oilbirds (Steatornis caripensis) are around 100 dB
re 20 μPa root-mean-square (rms) at 1 m (Brinkløv et al. 2017), corresponding to roughly 120 dB re
20 μPa at 0.1 m, which is comparable to estimates from many bat species. Little has been
documented about the source levels of swiftlets, tenrecs, and shrews.

Bats and toothed whales both emit the acoustic signal energy in a focused beam, with specific
vertical and horizontal transmission patterns, akin to an “acoustic flashlight” focused on a certain
search area. The open mouth of a bat, or the nose in nasal-emitting bats, shapes the transmitted
beam (Hartley and Suthers 1987, 1989), which is much broader than that of dolphins (Madsen and
Surlykke 2014). The dolphin’s melon transmits the outgoing echolocation signals with a slightly
elevated vertical beam above the rostrum (Au 1993). There is no information on signal
directionality from oilbirds or swiftlets.

12.3 Differences in Echolocation Signals in Air and Water

Only a few of the 71 known species of toothed whales are proven to use echolocation, but by
inference probably all of them do (Culik 2011), as do presumably more than 1000 species of bats.
For echolocators, there are three important differences between sound in air and sound in
water: (1) density of the medium, (2) reflectivity of targets, and (3) maneuverability of the target
(Madsen and Surlykke 2014). These differences severely influence the way echolocation has
evolved in the two media (Au and Simmons 2007).

First, water is about 770 times denser than air: 1000 and 1.3 kg/m³, respectively, partly
explaining why sound travels about 4.4 times faster in water than in air (1520 m/s versus
344 m/s). For the same frequency of sound, the wavelength in water is about 4.4 times longer
than in air. Longer wavelengths limit detection to larger targets because reflection depends on the
relationship between the wavelength of the impinging sound and the size of the reflecting
object (Urick 1983; also see Chap. 5, section on reflection). Sound at a given frequency reflects
more effectively from smaller objects in air than in water. For example, the wavelength of a
100-kHz signal is 3.4 mm in air, and 15 mm in water. Thus, a sphere with a circumference
greater than 3.4 mm strongly reflects the 100-kHz sound in air, while in water, the sphere
must be larger than 15 mm in diameter.

The absorption coefficient (see Chaps. 5 and 6 on sound propagation) of the medium is a function
of several factors, but frequency is the most important for echolocators. In seawater, the
absorption coefficient for sound at 100 kHz is about 0.038 dB/m, while in air at the same
frequency, it is much larger: 3.3 dB/m. In addition, sound pressure is lost through geometric
spreading in both air and water. For spherical spreading, each time the distance is doubled, the
sound pressure of the emitted signal is halved (i.e., the sound pressure level is reduced by 6 dB).
Taken together, sound absorption and geometric spreading mean that an echolocating dolphin can
detect an object at much longer distances than can an echolocating bat (Madsen and Surlykke 2014).

Investigators often want to get a relative notion of the difference in amplitude of bat and dolphin
echolocation signals. However, such a comparison should be done cautiously because of the
different physical properties of air and water and the two different reference pressures. To compare
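As a rough numerical companion to the wavelength, absorption, and spreading figures quoted in this section, the following Python sketch recomputes them from the stated sound speeds and 100-kHz absorption coefficients. It is an illustration only, not a method used by the authors. The function names are hypothetical, the absorption coefficients are treated as constants that hold only at 100 kHz, and spreading is assumed to be purely spherical.

```python
import math

# Values quoted in the text of this section.
SOUND_SPEED_M_PER_S = {"air": 344.0, "water": 1520.0}
# Absorption coefficients at 100 kHz only (assumed constant with range).
ABSORPTION_DB_PER_M_AT_100_KHZ = {"air": 3.3, "water": 0.038}


def wavelength_mm(frequency_hz, medium):
    """Wavelength in millimetres from lambda = c / f."""
    return SOUND_SPEED_M_PER_S[medium] / frequency_hz * 1000.0


def one_way_loss_db(range_m, medium, ref_m=1.0):
    """One-way loss (dB) of a 100-kHz signal between ref_m and range_m.

    Sum of spherical spreading, 20*log10(r / r_ref), and absorption,
    alpha * (r - r_ref). Hypothetical helper for illustration.
    """
    spreading = 20.0 * math.log10(range_m / ref_m)
    absorption = ABSORPTION_DB_PER_M_AT_100_KHZ[medium] * (range_m - ref_m)
    return spreading + absorption


def shift_reference_distance_db(level_db, from_m, to_m):
    """Re-express a source level at another reference distance.

    Uses spherical spreading only and ignores absorption over the
    short distance between the two reference points.
    """
    return level_db + 20.0 * math.log10(from_m / to_m)


if __name__ == "__main__":
    for medium in ("air", "water"):
        print(
            medium,
            round(wavelength_mm(100e3, medium), 1), "mm wavelength,",
            round(one_way_loss_db(10.0, medium), 1), "dB loss from 1 m to 10 m",
        )
    # Oilbird example from the text: 100 dB at 1 m re-expressed at 0.1 m.
    print(round(shift_reference_distance_db(100.0, 1.0, 0.1), 1), "dB at 0.1 m")
```

Under these assumptions, a 100-kHz signal loses about 49.7 dB between 1 m and 10 m in air but only about 20.3 dB in water, and a source level of 100 dB at 1 m corresponds to 120 dB at 0.1 m. The sketch does not convert between the reference pressures used in air (20 μPa) and water (1 μPa), nor does it account for the different acoustic impedances of the two media.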
422 S. M. M. Brinkløv et al.
of predator and prey; a 3-m long dolphin is 6–15 12.4.1 Sound Production and Signal
times larger than its fish prey (20 to 50 cm long) Characteristics
and a 3–8 cm long bat is 5–10 times bigger than
its insect prey. Bats often use their wing and tail With the exception of the tongue-clicking
membranes and even their feet to catch and Rousettus bats (10 species belonging to the
manipulate insects. Toothed whales are stream- pteropodid family), all ~1200 species of
lined with only pectoral and dorsal fins and flukes echolocating bats produce their echolocation
as appendages; they must catch and manipulate signals in the larynx (Suthers and Hector 1988).
prey with their teeth and mouths (Miller 2010). The larynges and associated structures in bats are
Despite very different selective pressures specialized to varying degrees from the basic
placed on bats and toothed whales, most of mammalian pattern, notably the entire structure
which are founded in the density and viscosity ossifies much earlier during development than in
differences between air and water, they operate their biosonar in very similar ways. This similarity of the biosonar systems of bats and toothed whales (Fig. 12.5a) is a wonderful example of convergent evolution (Madsen and Surlykke 2014; Wilson et al. 2013).
12.4 Echolocation in Bats
Bats are the second-most species-rich order of mammals, currently comprising almost 1400 species (Burgin et al. 2018) and they play several trophic roles. Echolocating bats eat a diverse range of food including animals (insects, vertebrates), plant materials (leaves, fruit, nectar, and pollen), and even blood. The non-echolocating pteropodid bats all eat mainly plant materials. Traditionally, bats were arrayed in two suborders separating them into the echolocating Microchiroptera and the non-echolocating Megachiroptera, but recent phylogenetic studies do not support this division. Bats are now divided into Yinpterochiroptera and Yangochiroptera (Teeling 2009; Teeling et al. 2005). The non-echolocating pteropodid bats are found in the Yinpterochiroptera. This new division is intriguing because it creates two alternatives for the evolution of bat echolocation, either as a single event resulting in the loss of echolocation by the pteropodids or as two separate events. The current consensus favors a single origin of echolocation and subsequent loss in the pteropodids (Thiagavel et al. 2018; Wang et al. 2017).
most mammals, and for many species the vocal tract and nasal passages are modified to filter frequencies used for echolocation (Au and Suthers 2014). Most echolocating bats emit sound through the open mouth, but bats in several families emit sound through the nostrils (Pedersen 1993). Bats emitting sound through the mouth generally have plain faces, while the bats emitting sound through the nose typically have elaborate structures surrounding the nostrils such as a nose-leaf that aids in sound radiation (Fig. 12.3).
The vast majority of echolocating bats are insectivorous. Most insectivorous bats hunt flying insects and typically vary the structure of their echolocation calls as they progress from searching to approaching and capturing prey. Traditionally, prey capture is divided into three phases (Fig. 12.4): a search, an approach, and a terminal phase (Griffin 1958; Griffin et al. 1960). In the search phase, bats emit long-duration, lower-frequency, narrowband signals (search calls) at a low repetition rate. After an object of interest is detected, the bats gradually reduce the duration and intensity of the signals; while they increase the rate and the bandwidth as they approach objects (approach calls). In the terminal phase, immediately before prey capture, the repetition rates may exceed 150 calls per second (the terminal buzz). Several reasons underlie these progressive changes in call emission. The search calls facilitate a long detection range as lower frequencies are attenuated much less than are higher frequencies (Lawrence and Simmons 1982b) and the long duration and narrow
Fig. 12.3 Variation in bat facial morphology. (a) Nyctalus noctula, (b) Murina cyclotis, (c) Plecotus auritus, (d) Mimon crenulatum, (e) Rhinolophus rouxii, (f) Hipposideros lankadiva. Bats a and b are mouth-emitting echolocators while c–f are nose emitters. Note that c does not have the associated nasal structures common in nose emitters. Photos by S. Brinkløv
bandwidth focus the energy of the call in a narrow range of the sensory system. These calls are, however, not ideal for accurate localization and object classification. Short-duration, broadband, high-frequency calls are much better suited for these tasks (Simmons et al. 1975). The switch from long-duration, narrowband, low-frequency calls in the search phase to short-duration, broadband, higher-frequency calls in the approach phase is a clear indication of object detection and it has been used to estimate detection distance in echolocating bats. However, it is important to note that this is a minimum measure as the bat may well have detected the object before adjusting its call parameters (Kalko and Schnitzler 1989, 1993).
Most echolocating bats, like toothed whales, emit an echolocation call and wait for echoes from objects of interest before emitting the next call (Madsen and Surlykke 2014). While this avoids perceptual errors associated with potentially assigning echoes to the wrong calls, it also means that the distance between the bat and objects of interest limits the call emission rate. As the bats approach an object, echoes return with progressively shorter delays and the bat can emit the calls at a higher rate, up to over 200 calls/s during the terminal buzz (Simmons et al. 1979, Fig. 12.4). While this is an impressively high call rate, the echoes are still received well before the next call is emitted. At the short distances between the bat and the prey when the buzz is emitted, the bat could theoretically increase the call rate to 1000 calls/s and still avoid call-echo ambiguity. Instead, the call rate is limited by the maximum speed of the superfast muscles that control each call emission (Elemans et al. 2011). Concurrent with the increase in call rate, the call duration decreases as distance to the object decreases. This is likely to prevent overlap
Fig. 12.4 Echolocation call sequence emitted by a foraging soprano pipistrelle (Pipistrellus pygmaeus), illustrating the
progressive change in call characteristics and emission rate as the bat searches for, approaches, and captures insect prey
between the emitted call and the returning echo since the much louder call emission will mask the quieter returning echo if the two overlap (Kalko and Schnitzler 1989, 1993). Hence, echoes from objects of interest are received in a clearly defined window between the end of call emission and the beginning of the next call. For example, a bat emitting calls of 8 ms duration at a call rate of 10 calls/s can resolve echoes from objects between 1.4 and 17 m distance without masking the returning echo during call emission and without the risk of call-echo ambiguity (Fig. 12.5).
While call rate and call duration define an overlap-free window, it is the energy and frequency of the emitted call together with the bat’s hearing threshold and the nature of the echo-generating object that determine the range of the echolocation system. Echoes have to return with enough energy to be detected by the bat. Emitting more energy, either by increasing the intensity or duration of the call, increases the detection distance. Emitting lower frequencies also increases the detection distance because acoustic attenuation is less for lower frequencies. On the reflection side, small objects return quieter echoes and will therefore always be detectable at shorter ranges than large objects (Fig. 12.6). The structure and texture of the object also affect the level of the returning echo. Hard objects reflect more sound than soft objects and the same is true for plane or convex surfaces compared to concave surfaces (Urick 1983; also see Chap. 5, section on reflection). Additionally, the relationship between the wavelength of the sound impinging on the object and the size of the object affects how efficiently the sound is reflected. If the wavelength becomes too long (i.e., the frequency too low) relative to the size of the object, very little sound is reflected (Fig. 12.6). This means that prey size imposes a lower frequency limit on bat echolocation (Houston et al. 2004; Pye 1993).
Bats are limited both physically and physiologically in how high a sound pressure they can produce. Supposedly, the main reason why they emit long-duration calls in the search phase is to increase the energy of the call. Emitting sound
Fig. 12.5 Schematic illustration of why most echolocating bats adjust call duration and call emission rate relative to target distance. Echoes received during call emission are masked by the louder call and echoes received after emission of the next call may create ranging ambiguity if assigned to the incorrect call. IPI: inter-pulse interval
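The overlap-free window illustrated in Fig. 12.5 can be checked with a few lines of arithmetic. The sketch below is only an illustration, not a method from the chapter: the function name `overlap_free_window` and the value of 344 m/s for the speed of sound in air are assumptions made here. The near limit corresponds to an echo returning just as call emission ends, and the far limit to an echo returning just as the next call begins.

```python
SPEED_OF_SOUND_AIR = 344.0  # m/s, assumed value for air


def overlap_free_window(call_duration_s, call_rate_hz, c=SPEED_OF_SOUND_AIR):
    """Return (nearest, farthest) target distance in metres.

    nearest:  echo arrives just after call emission ends (no masking).
    farthest: echo arrives just before the next call starts (no ambiguity).
    The division by 2 accounts for the two-way travel of the sound.
    """
    call_period_s = 1.0 / call_rate_hz
    if call_duration_s >= call_period_s:
        raise ValueError("call duration must be shorter than the call period")
    nearest_m = c * call_duration_s / 2.0
    farthest_m = c * call_period_s / 2.0
    return nearest_m, farthest_m


# Worked example from the text: 8 ms calls at 10 calls/s.
near, far = overlap_free_window(0.008, 10.0)
print(f"{near:.1f} m to {far:.1f} m")  # prints "1.4 m to 17.2 m"
```

By the same arithmetic, a buzz at 200 calls/s keeps echoes unambiguous out to about 0.86 m, which is why the text can say that echoes during the buzz still arrive well before the next call.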
directionally also increases the source level, that is, the sound level measured directly in front of the animal. All bats studied to date emit directional echolocation calls. Most bats increase their source level by 10 dB or more purely by focusing the sound as opposed to radiating sound equally in all directions (Jakobsen et al. 2013a). The highest source levels measured from bats are around 140 dB re 20 μPa rms at 0.1 m for the greater bulldog bat (Noctilio leporinus), but most reports of open-space aerial hawking bats are around 130 dB re 20 μPa rms at 0.1 m (Holderied et al. 2005; Hulgard et al. 2016; Surlykke and Kalko 2008). Combining knowledge of source level, signal frequency, hearing threshold, and the echo-generating object, the detection distance is relatively easy to estimate using a variation of the sonar equation (Urick 1983) (also see Chap. 6, section on the sonar equation):
$$RL = SL - PL + TS$$
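As a rough illustration of how the sonar equation yields a detection distance, the sketch below solves it for the largest range at which the received level still reaches the hearing threshold. Everything beyond the equation itself is an assumption made for this example: PL is taken as the total two-way loss, modeled as spherical spreading plus a frequency-dependent absorption coefficient `alpha_db_per_m`; the target strength is referenced to the same 0.1 m distance as the source level; and the function names and numerical inputs are hypothetical.

```python
import math

R_REF_M = 0.1  # reference distance of the source level (the text quotes levels at 0.1 m)


def propagation_loss_db(r_m, alpha_db_per_m, r_ref_m=R_REF_M):
    """Assumed two-way loss: spherical spreading plus absorption, out and back."""
    one_way = 20.0 * math.log10(r_m / r_ref_m) + alpha_db_per_m * (r_m - r_ref_m)
    return 2.0 * one_way


def received_level_db(sl_db, ts_db, r_m, alpha_db_per_m):
    """RL = SL - PL + TS, all quantities in dB."""
    return sl_db - propagation_loss_db(r_m, alpha_db_per_m) + ts_db


def detection_distance_m(sl_db, ts_db, threshold_db, alpha_db_per_m, r_max_m=500.0):
    """Largest range at which RL still reaches the hearing threshold.

    RL falls monotonically with range, so bisection is sufficient.
    """
    if received_level_db(sl_db, ts_db, R_REF_M, alpha_db_per_m) < threshold_db:
        return 0.0
    if received_level_db(sl_db, ts_db, r_max_m, alpha_db_per_m) >= threshold_db:
        return r_max_m
    lo, hi = R_REF_M, r_max_m
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if received_level_db(sl_db, ts_db, mid, alpha_db_per_m) >= threshold_db:
            lo = mid
        else:
            hi = mid
    return lo


# Illustrative inputs only: SL is in the range quoted in the text,
# while TS and the absorption coefficient are made-up example values.
d = detection_distance_m(sl_db=130.0, ts_db=-20.0, threshold_db=0.0, alpha_db_per_m=1.0)
print(f"estimated detection distance: {d:.1f} m")
```

Raising `alpha_db_per_m` (higher call frequency) or lowering `ts_db` (smaller prey) shortens the estimated distance, in line with the qualitative argument in the text.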
Schnitzler 1989, 1993; Nørum et al. 2012; Surlykke and Kalko 2008; Stilz and Schnitzler 2012).
The directional echolocation calls of bats allow an increased detection distance ahead of the bat while reducing the sound levels off to the sides and the back. This reduction in off-axis sound level offers an additional benefit as it reduces echoes from objects in these directions that are likely of little interest to the bats. Echoes from irrelevant objects are known as clutter echoes and reducing them simplifies the acoustic scene that the bats experience. The obvious disadvantage in emitting directional echolocation calls is the loss of echoes from relevant off-axis objects. The degree to which the benefits outweigh the costs of emitting a very directional echolocation call varies with the environment and the behavioral context. The directionality of the echolocation call is determined by the emitted frequency and the shape and size of the sound emitter. For mouth-emitting bats, this is the shape and size of the open mouth, and for nose-emitting bats, the shape and size of the nostrils and the nose-leaf (Hartley and Suthers 1987, 1989; Strother and Mogus 1970). Higher frequencies and larger emitters produce higher directionality (Fig. 12.7). Varying the frequency, shape, and size of the emitter allows the bats to adjust the directionality of the emitted call to suit their environment (Kounitsky et al. 2015; Surlykke et al. 2009b). During the final buzz of prey pursuit, bats can broaden their echolocation beam to increase peripheral echo levels and better track the prey (Jakobsen et al. 2015; Jakobsen and Surlykke 2010; Matsuta et al. 2013; Motoi et al. 2017). This is achieved in several species by a sudden drop in call frequency by nearly an octave (as illustrated in Figs. 12.4, 12.7, and 12.8) and is often referred to as the buzz II phase.
The majority of echolocating bats, and the focus of our description so far, hunt flying insects (aerial hawking bats) using relatively short-duration echolocation calls (also known as low duty-cycle calls, with duty cycle being the duration of the call divided by the time period from the start of one call to the start of the next call). There are, however, many species that forage and echolocate differently. About 150 species, including the Old World horseshoe bats and hipposiderid bats, as well as Pteronotus parnellii and closely related species in the family Mormoopidae from the New World, also feed on flying insects. These bats are so-called high duty-cycle echolocators and are able to broadcast and receive sound at the same time. While low
Fig. 12.8 Echolocation calls emitted by a low duty-cycle bat (Myotis daubentonii) with strongly frequency-modulated
calls (left) and a high duty-cycle bat (Rhinolophus formosae) with mostly constant frequency calls (right)
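High duty-cycle bats such as the one shown on the right of Fig. 12.8 rely on Doppler-shifted echoes, and the two-way Doppler relation given in this section is easy to evaluate numerically. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the sign convention is the one written in the docstring, and the compensation step is only a first-order approximation of what a Doppler-shift compensating bat achieves.

```python
import math


def doppler_shift_hz(f_hz, v1_ms, v2_ms, theta_deg=0.0, c_ms=344.0):
    """Two-way Doppler shift: delta_f = (2 / c) * (v1 + v2) * f * cos(theta).

    Sign convention assumed here: v1 and v2 are positive when they shorten
    the distance between echolocator and target, so a positive result
    means the echo returns at a higher frequency than the emitted call.
    """
    return (2.0 / c_ms) * (v1_ms + v2_ms) * f_hz * math.cos(math.radians(theta_deg))


def compensated_call_hz(f_reference_hz, v1_ms, v2_ms, theta_deg=0.0, c_ms=344.0):
    """First-order compensation: lower the call by the expected echo shift."""
    return f_reference_hz - doppler_shift_hz(f_reference_hz, v1_ms, v2_ms, theta_deg, c_ms)


# Hypothetical case: a bat flying at 5 m/s straight at a stationary target,
# calling near 82 kHz in air.
shift = doppler_shift_hz(82_000.0, 5.0, 0.0)
print(f"echo shift: {shift:.0f} Hz")  # prints "echo shift: 2384 Hz"
print(f"compensated call: {compensated_call_hz(82_000.0, 5.0, 0.0):.0f} Hz")
```

Passing `c_ms=1500.0` gives the corresponding shift in water, which is smaller for the same speeds because sound travels faster there.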
duty-cycle bats maintain a clear time separation between the emitted call and returning echo, high duty-cycle bats separate call and echo by frequency. They all emit much longer duration, constant-frequency echolocation calls with short intervals to navigate and forage (Fig. 12.8, Fenton et al. 2012). When an echo-generating object, such as a moth, moves relative to the bat, the echo returns to the bat at a slightly different frequency than the emitted call because of the Doppler shift. The classical example used to explain the Doppler shift phenomenon is the moving ambulance. When an ambulance moves toward a nearby listener, the siren appears to be higher in frequency than the one heard by someone riding in the ambulance, which does not change. The effect of Doppler shift is apparent when the ambulance passes and moves away from the listener. Now, the frequency abruptly changes from higher to lower in pitch. Doppler shift occurs because the speed of the moving ambulance is added to, or subtracted from, the speed of sound, raising or lowering the perceived pitch of the siren. The amount of the Doppler shift is doubled for echolocating animals, as the frequencies of both outgoing and returning signals are shifted. The Doppler shift experienced by an echolocating animal may be computed as:
$$\Delta f = \frac{2}{c}\,(v_1 + v_2)\, f \cos\theta$$
Here, Δf is the amount of Doppler shift in Hz, v1 is the speed of the echolocating animal in m/s, v2 is the speed of the target in m/s (+ indicates movement away from the echolocator; − would be movement toward the echolocator), f is the emitted frequency in Hz, θ is the angle in degrees between the echolocator and the target, and c is the speed of sound in the medium (about 344 m/s in air and 1500 m/s in water).
Perception of a Doppler shift by an echolocator is facilitated by emitting long signals tuned to one frequency (narrowband or constant frequency) and by having acute hearing in the frequency band of the Doppler-shifted echo. Specifically, Doppler-shifted echoes are dominated by different frequencies than those dominating outgoing pulses (Fenton et al. 2012) and bats using this strategy are therefore not sensitive to overlap of the two.
Greater horseshoe bats (Rhinolophus ferrumequinum) detect the frequency and amplitude modulations of the Doppler-shifted echo from an insect to within a few Hz of the ~82 kHz carrier-frequencies of their echolocation calls (Neuweiler 2000). The bats that use Doppler-shifted echoes readily detect the wing beats of a fluttering insect and distinguish the prey from the background. Flutter-detection is a recurring theme among bats that exploit Doppler shifts (Goldman and Henson 1977; Schnitzler and Flieger 1983; Lazure and Fenton 2011).
Bats that exploit Doppler-shifted echoes are Doppler-shift compensators (DSC; Hiryu et al. 2016) because they continuously adjust the outgoing signal to ensure that the Doppler-shifted
echoes remain at the frequencies to which their acoustic foveae are tuned (Schuller and Pollak 1979; Schnitzler 1968; Schnitzler and Flieger 1983; Hiryu et al. 2016).
There is no current evidence that toothed whales or other echolocators using broadband clicks are capable of Doppler-shift compensation. However, the small harbor porpoise would be a good species to test for Doppler-shift sensitivity, as they have narrow auditory filters (Popov et al. 2006) and use relatively long clicks (100 μs) and narrowband echolocation signals centered around 130 kHz.
High duty-cycle bats, in general, have a highly specialized hearing to facilitate this type of echolocation and they modify their emitted echolocation calls such that the frequency of the returning echoes always falls within a very narrow frequency range for which their hearing is optimized (Fig. 12.8 and Sect. 12.4.2) (Schnitzler 1973; Schuller 1977). In spite of the large differences between high and low duty-cycle bats, the overall call emission pattern when catching flying insects is still remarkably similar. High duty-cycle bats still emit calls that correspond to the three phases of search, approach, and buzz when they pursue flying insects, including similar call-structure changes to those in the low duty-cycle bats: gradual source-level reduction, duration shortening, increasing repetition rate (Ratcliffe et al. 2013), and broadening of the echolocation beam during the terminal buzz (Matsuta et al. 2013).
Bats that do not forage for flying insects generally search for more conspicuous food. Many species hunt non-flying insects in dense vegetation, a strategy known as gleaning. Gleaning bats, in general, emit very short low-intensity calls that sweep over a broad range of frequencies (Denzinger and Schnitzler 2013). As noted earlier, such calls provide excellent localization and classification and the low intensities greatly weaken clutter echoes, which is particularly important when flying in dense vegetation. Fruit and nectar eating can be considered variations on the gleaning strategy, and the echolocation behavior of fruit-eating and nectar-drinking bats very closely resembles that of insect-gleaning bats (Denzinger and Schnitzler 2013). Notably, while these species often cluster their calls in groups with increased repetition rates when faced with increasing acoustic complexity, they do not emit the terminal buzz characteristic of bats that target flying insect prey (Gonzalez-Terrazas et al. 2016). In addition, they often rely on additional sensory input, such as olfactory cues (Gonzalez-Terrazas et al. 2016), or, in the special case of vampire bats, thermoreception (Kürten and Schmidt 1982).
12.4.2 Hearing Anatomy and Echolocation Abilities
The hearing of echolocating bats is based on standard mammalian hearing anatomy, including recognizable pinnae, tragus, ear canal, tympanic membrane, three middle ear bones, and a coiled cochlea. With few exceptions, they even have the same hearing threshold as most other mammals, measured at their best frequencies: 0 dB re 20 μPa (Fay 1988; Fig. 12.9). There are, however, notable specializations that relate to echolocation where bats differ from most mammals. It is clear that most bats have a larger than average pinna and tragus, but there is considerable variation across species in size and shape that likely relates to the bat’s echolocation signals and foraging ecology (Coles et al. 1989; Obrist et al. 1993) (Fig. 12.3). In general, bats that complement their echolocation by passive listening for prey-generated sounds have larger pinnae than bats that rely solely on echolocation (Obrist et al. 1993). The pinna provides substantial directionality and acoustic gain depending on the relationship between pinna size and sound frequency. The pinnae of gleaning bats commonly amplify sound well below the bats’ echolocation frequencies (Coles et al. 1989; Guppy and Coles 1988; Obrist et al. 1993; Schmidt et al. 1983). The acoustic gain provided by the large pinnae affords some bats extremely low hearing thresholds such as the impressive −20 dB re 20 μPa hearing threshold found in the brown long-eared bat (Plecotus auritus) and the Indian false vampire bat (Megaderma lyra) (Coles et al. 1989; Schmidt et al. 1983). While pinna structure plays a crucial
Fig. 12.9 Audiograms of three echolocating bats and two echolocating bird species. A non-echolocating bird is shown for comparison. Bat thresholds are based on behavioral experiments, bird thresholds are derived from neurophysiological experiments. Green: big brown bat (Eptesicus fuscus, from Dalland 1965); light blue: Egyptian fruit bat (Rousettus aegyptiacus, from Koay et al. 1998); purple: greater horseshoe bat (Rhinolophus ferrumequinum, from Long and Schnitzler 1975); dark blue: oilbird (Steatornis caripensis, from Konishi and Knudsen 1979); red: swiftlet (Aerodramus spodiopygia, from Coles et al. 1987); yellow: black-capped chickadee (non-echolocating, from Wong and Gall 2015). Thresholds are not directly comparable between species due to differences in experimental conditions
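The gain-control argument made in this section (echo level rising by 12 dB per halving of distance, offset by a 6 dB reduction in call level and a 6 dB reduction in hearing sensitivity) can be verified with a short calculation. The sketch below assumes idealized laboratory behavior, with each quantity following an exact logarithmic law in distance; the function names are hypothetical and the field data cited in the text show much more variable adjustments.

```python
import math


def level_change_db(r_start_m, r_end_m, db_per_decade):
    """Level change when the range shrinks from r_start to r_end,
    for a quantity that scales as db_per_decade * log10(1 / r)."""
    return db_per_decade * math.log10(r_start_m / r_end_m)


def perceived_echo_change_db(r_start_m, r_end_m):
    """Net change in perceived echo level as the bat closes on a point target."""
    echo_gain = level_change_db(r_start_m, r_end_m, 40.0)          # two-way spreading
    call_reduction = level_change_db(r_start_m, r_end_m, 20.0)     # bat lowers its output
    hearing_reduction = level_change_db(r_start_m, r_end_m, 20.0)  # raised hearing threshold
    return echo_gain - call_reduction - hearing_reduction


print(f"{level_change_db(2.0, 1.0, 40.0):.1f} dB")  # prints "12.0 dB" (per halving)
print(f"{level_change_db(2.0, 1.0, 20.0):.1f} dB")  # prints "6.0 dB" (per halving)
print(abs(perceived_echo_change_db(4.0, 0.5)) < 1e-9)  # prints "True": net change is zero
```

The two 20 dB-per-decade terms together cancel the 40 dB-per-decade growth of the echo, which is the sense in which the perceived echo level stays constant.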
role in bat echolocation, large external ears have a disadvantage during flight. Large ears create substantial drag, and it is likely that the ears of fast-flying bats are shaped as much by the aerodynamics of flight as by echolocation (Gardiner et al. 2008; Johansson et al. 2016; Vanderelst et al. 2015).
As mentioned above, bats decrease their emitted intensity progressively as they approach objects. This is primarily believed to function as gain control for the auditory system, a phenomenon also seen in echolocating odontocetes (see Sect. 12.5.2). If the bats kept their output level constant, the echo level would increase progressively by many orders of magnitude as the bat approached an object. Considering small insects as point sources, this increase would be 40 log10(r) or 12 dB per halving of distance r. So, the output call level generally decreases by 6 dB per distance halved (Boonman and Jones 2002; Brinkløv et al. 2013; Hartley 1992a, b; Lewanzik and Goerlitz 2018). Such a reduction results in a constant intensity at the object/prey, but a progressive increase in echo strength at the bat by +6 dB per halving of distance. However, the bat’s auditory system reduces its sensitivity by an additional 6 dB per halving of distance, because as the bat vocalizes, the middle ear muscles contract to avoid self-deafening, increasing the bat’s hearing threshold. This time-dependent change in hearing threshold corresponds almost perfectly to the missing 6 dB per halving of distance and presumably provides a constant perceived echo level for the bat (Hartley 1992a, b; Henson 1965; Suga and Jen 1975). The gradual relaxation of the middle ear muscles progressively decreases the bat’s hearing threshold back to resting level. It is worth noting that this is under very predictable laboratory conditions and that in a real-life field scenario, the bats encounter much more unpredictable conditions and prey behavior.
Recordings of prey capture in the field reveal that intensity reduction is much more variable and commonly exceeds 6 dB per halving of distance (Nørum et al. 2012). This subject is also
discussed below for harbor porpoises and dolphins.
Bat hearing is certainly specialized for echolocation and for high frequencies (Fig. 12.9). Other small mammals such as mice and rats have a similar high-frequency hearing. Bats are, however, much more sensitive up to their high-frequency limit and have very high sensitivity over a much wider range of frequencies. Comparing echolocating to non-echolocating bats, the cochlea is significantly larger relative to skull size, and the basilar membrane, where frequency coding occurs, is longer for echolocating bats compared to all other mammals (Kössl and Vater 1995). High duty-cycle bats have the longest basilar membranes containing an acoustic fovea, which is a large region of the membrane dedicated to a very narrow frequency range. The acoustic fovea provides the crucial frequency resolution and sharp tuning that allows high duty-cycle bats to separate call and echo by frequency instead of time (Bruns and Schmieszek 1980).
Bats use the time delay between their outgoing call and the returning echo to determine the distance to a target. They determine the horizontal direction to the object by comparing the input on the two ears. For bats, interaural intensity differences likely provide the main cues (Pollak 1988). The vertical direction is mainly coded by frequency-dependent reflections from the pinna and tragus (Lawrence and Simmons 1982a).
Bats have excellent spatial resolution and accuracy. They consistently aim their echolocation beam to within less than 5° of their target both horizontally and vertically (Ghose and Moss 2003; Jakobsen and Surlykke 2010; Masters et al. 1985; Surlykke et al. 2009a) and can discriminate between two objects in the horizontal plane if they are more than 1.5° apart (Simmons et al. 1983) and, in the vertical plane, if they are more than 3° apart (Lawrence and Simmons 1982a).
Aerial hawking bats can easily be tricked into catching small pebbles thrown in the air. This is not because bats cannot distinguish pebbles from insects, but likely because most airborne items of a given size are edible to bats. Classification of small objects is based on temporal and spectral features of the echo generated by one or more reflections from the objects (Schmidt 1988; Simmons et al. 1990; Weissenbacher and Wiegrebe 2003), while the classification of large objects such as trees is more complex (Grunwald et al. 2004). The bat’s resolution of a target depends on both the frequency of the emitted call (higher frequencies reflect more efficiently off smaller structures than do lower frequencies; Fig. 12.6 and Urick 1983) and the bat’s ability to perceive these reflections. Bats are capable of distinguishing similar-sized objects with very minute textural differences. They can clearly distinguish small disks from mealworms when both are thrown in the air and smooth hanging beads from textured beads with the same overall echo-strength (Falk et al. 2011; Griffin et al. 1965).
Our account of bat echolocation only contains broad strokes. With around 1200 species of echolocating bats, the variation in echolocation design is vast, and while most follow the outline given here, there are many deviations and many bat species that utilize their echolocation in puzzling ways that are as yet unexplained.
12.5 Echolocation in Odontocetes
Among cetaceans, only species in the suborder Odontoceti (toothed whales) are known to echolocate (Au 1993). Bioacoustical research has focused on bottlenose dolphins, belugas, false killer whales, and killer whales (all in the families Monodontidae and Delphinidae) as well as porpoises (Phocoenidae), sperm whales (Physeteridae), and a few species of beaked whales (Ziphiidae).
Odontocetes use echolocation to orient in the aquatic environment, to detect, chase, and capture prey, and to socialize (Thomas et al. 2004; Thomas and Turl 1990). They have broadband hearing and a good ability to discriminate a signal in noise. Their echolocation signals have narrow beam patterns that can be modified, as can the amplitude and frequency content of outgoing clicks.
The bottlenose dolphin has been the “laboratory rat” of odontocete biosonar studies. A series
of experiments by US Navy researchers examined the ability of captive bottlenose dolphins (Tursiops truncatus) to detect subtle differences in human-made objects for military reconnaissance purposes (Au 1993, 2015; Moore and Popper 2019). They showed that dolphins wearing eyecups (so they could not see their targets) and using only echolocation could: (1) distinguish objects of the same shape, but of different materials (e.g., cylinders of glass, metal, or rock), (2) distinguish objects of the same material but different shapes (e.g., PVC cylinders, plates, squares, and tubes), (3) detect a 3-inch hollow metal sphere at about 115 m distance and a sphere of a few millimeters at a distance of about 50 m, (4) feed normally if blind, but become disoriented if hearing-impaired, (5) discriminate metal cylinder targets with different wall-thickness (difference as little as 0.001 mm), and (6) control the amplitude and frequency of their outgoing pulses, such that in areas of high ambient noise, they produced louder and higher-frequency pulses.
12.5.1 Sound Production and Signal Characteristics
Most dolphins emit whistles and burst-pulse sounds for intraspecific communication and brief broadband clicks for echolocation. Figure 12.10 shows four echolocation clicks from a false killer whale (Pseudorca crassidens). Each click generally has four to eight cycles and a duration of 15–70 μs. Peak-to-peak source levels can be very high, from 210 to over 225 dB re 1 μPa at 1 m. High-intensity signals from dolphins generally are broadband and can contain frequencies beyond 100 kHz. The frequencies of dolphin clicks vary almost linearly with the signal intensity, such that, as the peak frequency of echolocation signals increases, the intensity of clicks increases (Au and Suthers 2014).
All odontocetes studied thus far produce echolocation signals using one or two pairs of phonic lips located in the nasal passages. The lips contain bursae, which are rod-like fatty structures situated just below the blowhole (AB, PB in Fig. 12.11b). The phonic lips produce both echolocation clicks and communication whistles (Cranford et al. 1996).
Amundin (1991) and Huggenberger et al. (2009) studied click-production in the harbor porpoise, which can serve as a general example for odontocetes other than sperm whales. Figure 12.11 shows an overview and details of the harbor porpoise sound-producing apparatus (Huggenberger et al. 2009). Air passages are shown in blue, fat in yellow, bone in white, and
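[Figure 12.10 panels. Left: waveforms of signal types I–IV, labeled 202 ± 5 dB (I), 205 ± 5 dB (II), 209 ± 4 dB (III), and 213 ± 3 dB (IV); time scale 0–100 µs. Right: spectra of types I–IV, relative amplitude (0–1) versus frequency (0–200 kHz).]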
Fig. 12.10 Left: Waveform of false killer whale biosonar signals with increasing averaged peak-to-peak source level in dB re 1 μPa (relative amplitudes are drawn). Right: Spectra of the corresponding signal type showing increasing peak-frequency with increasing signal amplitude. Adapted by permission from Springer Nature. Au WWL, Suthers RA. Production of Biosonar Signals: Structure and Form, pp. 61–105, in Surlykke A, Nachtigall PE, Fay RR, Popper AN (eds) Biosonar. Springer, New York, NY, USA; https://link.springer.com/chapter/10.1007/978-1-4614-9146-0_3. © Springer Nature, 2014. All rights reserved
Fig. 12.11 Schematic sagittal reconstruction of the head of an adult harbor porpoise showing the nasal structures and the position of the larynx (LA). (a) Overview. (b) Detail of boxed area in (a). Blue: air spaces of the upper respiratory tract; gray: digestive system; light gray: cartilage and bone of the skull; yellow: fat bodies. AB: rostral bursa cantantis; AL: rostral phonic lip; AN: anterior nasofrontal sac; AS: angle of nasofrontal sac; BC: brain cavity; BH: blowhole; BL: blowhole ligament; BM: blowhole ligament septum; C: caudal; CS: caudal sac; DI: diagonal membrane; DP: low density pathway; IV: inferior vestibulum; LA: larynx; MA: mandible; ME: melon; MT: melon terminus; NA: nasal passage; NP: nasal plug; NS: nasofrontal septum; PB: caudal bursa cantantis; PE: premaxillary eminence; PN: posterior nasofrontal sac; PS: premaxillary sac; PX: pharynx; RO: rostrum; sm: sphincter muscle of larynx; TO: tongue; TR: trachea; TT: connective tissue theca; V: ventral; VE: vertex of skull; VP: vestibulum of nasal passage; VS: vestibular sac; VV: folded ventral wall of vestibular sac. Reprinted with permission from John Wiley and Sons. Huggenberger S, Rauschmann MA, Vogl TJ, Oelschläger HHA. Functional Morphology of the Nasal Complex in the Harbor Porpoise (Phocoena phocoena L.). The Anatomical Record 292:902–920; https://anatomypubs.onlinelibrary.wiley.com/doi/full/10.1002/ar.20854. © John Wiley and Sons, 2009. All rights reserved
other tissues in red. Air in the bony nares (NA) is pressurized by the nasopharyngeal pouch and the sphincter muscle of the larynx (sm), possibly with help of the piston-like action of the rostral end of the larynx (LA) and epiglottis (Ridgway and Carter 1988). The nasal plug (NP) and the blowhole ligament septum (BM) control the flow of pressurized air past the phonic lip pair (AL: Anterior Lip/PL: Posterior Lip) in each naris resulting in a click-like vibration in the bursae (Anterior Bursa, AB and Posterior Bursa, PB), primarily on the right side. Each click projects from the bursae through a low-density pathway (DP) to the melon (ME) and from there to the water. This low-density pathway (DP) is characteristic for the families Phocoenidae (porpoises)
and Cephalorhynchinae (small dolphins). In the bottlenose dolphin, and most other delphinids, the anterior bursa (AB) directly abuts the melon. The small amount of air needed to produce a single click ends up in the vestibular air sac (VS) and eventually is recycled to the nasal cavity (NA), rather than exhaled through the blow hole (BH) (Norris et al. 1971; Dormer 1979). This process appears to be the same in all odontocetes.
Dormer (1979) showed that in three delphinids, the right pair of phonic lips produces high-frequency clicks, while the left pair produces whistles. Whistles, like clicks, are also transmitted to the melon and into the water but are much less directional due to their lower frequencies. There is conflicting evidence for click-production by the left pair of phonic lips (Madsen et al. 2013; Cranford et al. 2011, 2015). Critically designed experiments and field recordings are needed to elucidate the full function of the left pair of phonic lips, particularly in species such as porpoises that do not whistle.
In dolphins, porpoises, and river dolphins, the melon (ME in Fig. 12.11) and associated tissues are the primary structures for transmitting echolocation clicks from the phonic lips to the water (Cranford et al. 1996). In the bottlenose dolphin melon, fat is not homogeneous; rather it is composed of varying amounts of triglycerides and wax esters that differentially affect the sound transmission velocity through the melon (Au 1993, 2015). The same is true for the harbor porpoise (Au et al. 2006; Madsen et al. 2010), where the melon contains mainly triglycerides, probably of many different types (chain lengths and degree of saturation) producing different densities (acoustical impedances). The lowest density is near the low-density pathway (DP in Fig. 12.11), while the highest density approximates that of seawater and occurs in the dorsal part of the melon about four centimeters caudal to the upper lip of the harbor porpoise (Kuroda et al. 2015).
The density of muscle and connective tissue above and lateral to the melon (TT in Fig. 12.11) is greater than the density of the melon tissue and keeps sound from leaking out of the melon. In dolphins and the harbor porpoise, a vestibular air sac (VS) is associated with the melon and also acts like a shield to prevent sound leakage. New results indicate that the melon of the harbor porpoise functions as an acoustic waveguide (Wei et al. 2017, 2018).
The foreheads of beaked whales (Ziphiidae) and the two pygmy sperm whales (family Kogiidae) are quite different. Here, the anterior bursae lie against a spermaceti organ filled with wax esters (Cranford et al. 1996). The spermaceti organ abuts the melon, so an echolocation click first passes through the spermaceti organ into the melon and out into the sea. Beaked whales have an extensive sheet of thick, dense, connective tissue rather than air sacs above the spermaceti organ and melon (Cranford et al. 2008). Beaked whales dive deep and hunt at depths of more than 1000 m (Johnson et al. 2006). At such extreme pressures, air sacs would collapse, but the structural adaptation of the forehead would still protect against acoustic leakage from the melon. Song et al. (2015) measured the acoustical properties of the melon in the pygmy sperm whale (Kogia breviceps). The density of the melon tissue, and the velocity and impedance of sound are highest in the center of the melon. These physical characteristics keep sound from leaking through connective and muscular tissue surrounding the melon. In addition, air sacs above the spermaceti organ of Kogia keep sound in the spermaceti organ. It is unknown how deep Kogia dives, but the presence of air sacs above the spermaceti organ suggests that it does not dive as deeply as beaked whales. Kogia has extreme right-sided asymmetry of the skull bones, the function of which remains unclear.
The bioacoustical system of the sperm whale differs from all other odontocetes (Cranford et al. 1996). Sperm whales (Physeter macrocephalus) have only the right pair of phonic lips, which projects to the tip of the giant rostrum (Fig. 12.12). Click-production is essentially like that of other odontocetes. Air is pressurized in the right naris (Rn) causing a click from the right pair of phonic lips (Mo). A very small amount of sound energy escapes through the distal air sac (Di) at click-production (P0 in Fig. 12.12b). The major portion of sound energy projects back
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 435
through the spermaceti organ (So, heavy dashed line), hits the frontal air sac (Fr) and is reflected through the “junk” (Jo, heavy dashed line) into the water as a powerful and broadband click (P1 in Fig. 12.12b). The sperm whale P1 click is the most powerful biological sound known (with maximum source levels of 236 dB re 1 μPa rms at 1 m, Møhl et al. 2003), and is probably used as a long-distance biosonar probe signal (see Fig. 12.13b). But it has been proposed that these powerful clicks could stun prey. Norris and Møhl (1983) suggested a “big bang theory” for bottlenose dolphins and sperm whales that produce especially loud, single pulses (or bangs). These pulses could debilitate prey for easy capture, but this has never been proven. In fact, a new study using D-tags on sperm whales recorded no “big bangs,” but normal odontocete prey capture behavior (Fais et al. 2016).
A fraction of P1 energy reflects from the distal air sac causing a P2 click to be emitted at a delay consistent with the length of the head (spermaceti organ). The reverberation continues (P1 to P4 in Figs. 12.12b and 12.13a), resulting in a multi-pulse structure. Cranford et al. (1996) proposed that the spermaceti organ and the junk are homologous with the posterior and anterior bursae in the dolphin, respectively.
Although the sound-generating apparatus is basically similar in odontocetes, the outgoing sound from the melon can differ substantially among species. Initially, the action of the phonic lips, controlled by pneumatic pressure, influences the intensity of the click. Stronger hammer-action of a phonic lip pair means the transmission of more intense and higher-frequency clicks (Finneran et al. 2014; Fig. 12.10).
During orientation, most delphinids produce short, broadband echolocation clicks (Au 1993) often of high intensity. They produce less intense, but rapidly repeated clicks, analogous to a bat’s buzz when approaching objects or prey (see Fig. 12.1). A single click of a wild white-beaked dolphin lasts about 15 μs and has energy from about 30 kHz to over 200 kHz (Rasmussen and Miller 2002). The sperm whale also fits into this category (Møhl et al. 2003) with a broadband P1 click (Fig. 12.13b).

Fig. 12.12 A schematic drawing of a sperm whale head. Bl Blow hole; Di Distal air sac; Fr Frontal air sac; Jo Junk organ; Ln Left naris; Mo Monkey lips (museau de singe); Rn Right naris; So Spermaceti organ. (a) communication or coda clicks and (b) echolocation clicks, p1 being the strongest. According to the bent horn model, the production of an intense echolocation click (the solid black dashed lines and p1 in b) generates multiple weaker pulses (p2, p3, p4 in b) owing to reverberation of the initial sound (p1) between Di and Fr (the thin dashed lines). The whale can modify click generation to produce coda, or weaker communication clicks (the red solid line). This indicates that the whale can somehow control where the click, generated by the monkey lips (Mo), reflects off the frontal air sac (Fr) thus exiting near the distal air sac (Di). Modified from Caruso et al. (2015). © Caruso et al. 2015; https://doi.org/10.1371/journal.pone.0144503. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/.
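The statement that P2 follows P1 "at a delay consistent with the length of the head" can be made concrete with a two-way travel-time estimate. The sketch below is only illustrative: it assumes the delay equals one round trip along the spermaceti organ, and both the sound speed (1370 m/s) and the 4 m organ length are hypothetical placeholder values, not figures taken from this chapter.

```python
def inter_pulse_interval_s(organ_length_m, sound_speed_m_s=1370.0):
    """Round-trip travel time between the distal and frontal air sacs.

    sound_speed_m_s is an assumed, illustrative value for spermaceti oil.
    """
    if organ_length_m <= 0 or sound_speed_m_s <= 0:
        raise ValueError("length and sound speed must be positive")
    return 2.0 * organ_length_m / sound_speed_m_s


def organ_length_from_ipi_m(ipi_s, sound_speed_m_s=1370.0):
    """Invert the relation: estimate organ length from a measured delay."""
    if ipi_s <= 0 or sound_speed_m_s <= 0:
        raise ValueError("interval and sound speed must be positive")
    return ipi_s * sound_speed_m_s / 2.0


# Hypothetical 4 m spermaceti organ
ipi = inter_pulse_interval_s(4.0)
print(f"inter-pulse interval: {ipi * 1e3:.2f} ms")  # prints 5.84 ms
```

Read in the other direction, the same relation is why a measured inter-pulse interval can serve as a rough acoustic proxy for head size, provided the assumed sound speed is reasonable.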
[Figure 12.13 image not reproduced. Recoverable labels: panels a and b; pulse labels p0, p1, p2; horizontal axis "Frequency (kHz)" from 0 to 20; scale bar 10 ms.]
Fig. 12.13 Multi-pulse structure of a sperm whale click. The P1 click is the most intense and broadest in frequency. It is the most powerful biological sound known. The following clicks of decreasing amplitude (P2–P4) are caused by reverberations in the nose of the whale (see also Fig. 12.12). From Møhl et al. (2003). © Acoustical Society of America, 2003. All rights reserved
At present, it seems that the modulation of clicks in the harbor porpoise occurs in the whale’s forehead and that the basic echolocation signals entering the forehead are short-duration, broadband clicks. Madsen et al. (2010) used contact hydrophones to show that a harbor porpoise click recorded near the right (or left) phonic lip pair is broadband. The same click recorded on the melon, along the midline of the animal near the exit point of the sound, has the typical polycyclic narrowband structure. The narrowband high-frequency click (Fig. 12.14) somehow results from the melon and associated tissues, but the details of this mechanism are unknown.
Beaked whales regularly use frequency-modulated up-swept clicks for orientation and when searching for prey. These are relatively broadband and about 200 μs long (Fig. 12.15). Clicks used during prey capture in the buzz are less than 100 μs long, slightly more broadband than the regular clicks and similar to dolphin clicks. It is unknown how the upsweep of the regular click is generated, but by analogy to the porpoise, the basic signal is likely a broadband click somehow shaped in the forehead of the whale.
The directionality of the echolocation sound beam in odontocetes has been studied for many
Fig. 12.14 (a) Echolocation click from a harbor porpoise. (b) Spectrum of a harbor porpoise click. The harbor porpoise is one of several smaller toothed whales that use a high-frequency narrowband echolocation click (Galatius et al. 2019). From Fig. 12.1 in Miller and Wahlberg (2013); © Miller and Wahlberg 2013; https://doi.org/10.3389/fphys.2013.00052. Licensed under CC BY 3.0; https://creativecommons.org/licenses/by/3.0/
Fig. 12.15 Beaked whale click waveform (a), spectrogram (b Hann window, 40-point FFT, 98% overlap), and spectrum (c Hann window, 256-point FFT; dashed line shows ambient noise). Baumann-Pickering et al. (2010). © Acoustical Society of America, 2010. All rights reserved
years (Au 1993, 2015; Au et al. 1985, 1986, 1999; Kloepper et al. 2012; Koblitz et al. 2012). Recent work reveals that odontocetes control the shape and direction of the beam (Moore et al. 2008; Wisniewska et al. 2015). A bottlenose dolphin with its head stationary and its mouth on a biteplate moved its sound beam by 26° to the left and 21° to the right when echolocating a movable sphere 9 m away (Moore et al. 2008).
Wisniewska et al. (2015) used two-dimensional hydrophone arrays to verify that harbor porpoises approaching a target (a dead fish) voluntarily change the diameter of their echolocation beam to increase the ensonified area by 100–200%, while reducing the interval between clicks in the buzz phase just before prey capture (Fig. 12.16). These changes are analogous to what a bat will do when capturing an insect (Jakobsen et al. 2015). Wild Amazon river dolphins (Inia geoffrensis) also increase the beam width during prey capture (Ladegaard et al. 2017). Increasing the beam width helps the porpoise (or bat) track a moving prey at close proximity. Presumably, the musculature around the melon helps control the beam width and direction in porpoises and dolphins (Moore et al. 2008), but this needs verification.
The direction of the sound beam from the head of a porpoise carcass can be changed by artificially inflating the vestibular air sacs (Miller 2010). With no air in the vestibular air sacs, a broadband click generated by a small hydrophone between the right pair of phonic lips projects left of the midline and vice versa with an artificial click generated between the left phonic lip pair. With air in the vestibular air sacs, the artificial clicks project out the midline (Fig. 12.17; see also
Fig. 12.16 The harbor porpoise can increase the ensonified area by nearly 200% during the buzz phase with short inter-click intervals (ICI in b, blue). The large diameter circle (solid in a) illustrates the beam width for clicks with short intervals. The small diameter circle (dashed in a) shows the beam width of clicks with longer intervals emitted in the search phase at longer distances (ICI in b, red). © Wisniewska et al. 2015; https://elifesciences.org/articles/05651. Licensed under CC BY 4.0; https://creativecommons.org/licenses/by/4.0/. All rights reserved
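To relate the reported 100–200% increase in ensonified area to the change in beam diameter, one can assume a circular beam cross-section at a fixed range, so that area scales with the square of the diameter. That geometric simplification is an assumption made here for illustration, not a claim about how the original study computed area.

```python
import math


def diameter_factor(area_increase_percent):
    """Factor by which beam diameter grows for a given percent increase in
    cross-sectional area, assuming a circular cross-section (area ~ d**2)."""
    if area_increase_percent < 0:
        raise ValueError("expected a non-negative percent increase")
    return math.sqrt(1.0 + area_increase_percent / 100.0)


for percent in (100, 200):
    print(f"+{percent}% area -> diameter x {diameter_factor(percent):.2f}")
```

Under this assumption a doubling of area (+100%) needs a diameter about 1.41 times larger, and a tripling (+200%) about 1.73 times larger, so a seemingly large area change corresponds to a fairly modest widening of the beam.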
Fig. 12.17 Short broadband artificial clicks generated between the phonic lips (right lip: solid arrow and curve; left lip: dashed arrow and curve) of a cadaver harbor porpoise. With air in the vestibular air sacs (right image), the clicks emerge at the midline. Without air in the vestibular air sacs (left image), the clicks emerge on either side of the midline depending on where the artificial click was generated (clicks generated between the right pair of phonic lips emerge to the left and vice versa). Adapted with permission from Miller LA (2010); Prey Capture by Harbor Porpoises (Phocoena phocoena): A Comparison Between Echolocators in the Field and in Captivity; J Marine Acoust Soc Jpn 37 (3):156–168. © The Marine Acoustics Society of Japan, 2010
Starkhammar et al. 2011; Cranford et al. 2014). Incidentally, the exiting click remained broadband in these experiments indicating that the living melon and associated tissues are necessary for producing a high-frequency, narrowband click typical for the harbor porpoise (Madsen et al. 2010).
The primordial odontocete echolocation signal was probably a short, broadband click similar to the clicks used by most living dolphins and the sperm whale (Fig. 12.10, left). In contrast, the La Plata dolphin (Pontoporia blainvillei), six small dolphins (family Delphinidae), all porpoises (family Phocoenidae, six species with four documented), and the pygmy and dwarf sperm whales (family Kogiidae) use narrowband, high-frequency (NBHF) echolocation clicks (see Fig. 12.14). The change from broadband to NBHF echolocation clicks could reflect predation pressure by killer whales (and their ancestors), as well as environmental factors (Andersen and Amundin 1976; Madsen et al. 2005; Morisaka and Connor 2007; Miller and Wahlberg 2013; Galatius et al. 2019). NBHF clicks appear to be generated in the melon and associated tissues (Madsen et al. 2010). It is assumed that all odontocetes can control the amplitude of echolocation clicks, steer the sound beam, and manipulate its width (Moore et al. 2008; Wisniewska et al. 2015). These features are of obvious advantage for detecting and tracking prey. There are rich possibilities in future research of sound production and the use of echolocation by odontocete whales.

12.5.2 Hearing Anatomy and Echolocation Abilities

We refer to Vol. 2 Chap. 9 on aquatic mammals for more detail on hearing anatomy and abilities. Here, we focus on the hearing abilities of odontocetes as they relate to the tasks of obstacle and prey detection by echolocation.
Experimental studies show that the bottlenose dolphin (Li et al. 2011), the false killer whale (Nachtigall and Supin 2008), and the harbor porpoise (Linnenschmidt et al. 2012, 2013) have voluntary control over the level of the emitted click and of their auditory sensitivity during echolocation tasks. The results from the harbor porpoise clearly illustrate active hearing during the echolocation of targets: the porpoise maintains a constant level of auditory perception independent of target distance. If the distance to a target is doubled, the sound pressure of a click impinging on the target is halved (−6 dB). To compensate for this, the porpoise doubles the sound pressure of the outgoing click (+6 dB), keeping the level of the incident sound on the target constant and independent of distance (within a certain range). However, the returning echo is also halved (−6 dB) at double the distance. Linnenschmidt et al. (2012) showed that there is an “automatic gain control” in the auditory system of the porpoise such that its hearing increases in sensitivity by about +6 dB to compensate for the loss in the echo level over double the distance. Without compensating for the level of the outgoing click and the gain control in the auditory system, the echo pressure would drop to 1/4 (−12 dB) per doubling of distance to the target, making echolocation more difficult for the whale.
Toothed whales obviously find their prey using echolocation, but how they discriminate between prey species is not known and, to our knowledge, has not been studied experimentally. Probably the most spectacular use of echolocation to find prey is shown by bottlenose dolphins near Grand Bahama Island. The dolphins often find fish under the sand using their echolocation and stick their proboscis down in the sand, sometimes to the pectoral fins, and come up with a fish in their mouths (Rossbach and Herzing 1997). What echo information they use for this unusual behavior is unknown. Harbor porpoises can discriminate between identical spheres of different materials (Wisniewska et al. 2012). Three harbor porpoises were easily able to distinguish between an aluminum sphere and spheres of plexiglas, PVC, and brass. Two of the three had problems differentiating aluminum from steel spheres. The spectra of these two spheres were very similar, so we assume the harbor porpoises were using spectral information to detect the differences among
Fig. 12.18 Underwater audiograms of four odontocetes. Blue: Harbor porpoise behavioral audiogram using a 50-ms sound stimulus (Kastelein et al. 2010). Orange: White-beaked dolphin auditory evoked response audiogram using a 1-s sinusoidal amplitude-modulated stimulus (Nachtigall et al. 2008). Purple: Risso’s dolphin (Grampus griseus) auditory evoked response audiogram using a 20-ms sinusoidal amplitude-modulated stimulus (Nachtigall et al. 2005). Yellow: Killer whale average behavioral audiogram of two animals using a 2-s tone (Szymanski et al. 1999)
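The gain-control arithmetic described in Sect. 12.5.2 (about 6 dB per doubling of distance on each leg of the sound path) can be checked with a small calculation. This is a minimal sketch under stated assumptions: spherical spreading only (20 log10 of range, no absorption), and hypothetical source-level and target-strength values chosen purely for illustration.

```python
import math


def one_way_loss_db(range_m, ref_m=1.0):
    """Spherical-spreading loss in dB relative to the reference distance."""
    if range_m <= 0 or ref_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(range_m / ref_m)


def perceived_echo_db(range_m, source_db=150.0, target_strength_db=-30.0,
                      adjust_source=True, adjust_hearing=True):
    """Echo level as perceived by the animal, in dB.

    source_db and target_strength_db are hypothetical illustration values.
    adjust_source: emitted level rises by the one-way loss (about +6 dB per doubling).
    adjust_hearing: hearing sensitivity rises by the one-way loss as well.
    """
    loss = one_way_loss_db(range_m)
    emitted = source_db + (loss if adjust_source else 0.0)
    echo_at_ear = emitted - 2.0 * loss + target_strength_db
    return echo_at_ear + (loss if adjust_hearing else 0.0)


for r in (2.0, 4.0, 8.0):
    both = perceived_echo_db(r)
    neither = perceived_echo_db(r, adjust_source=False, adjust_hearing=False)
    print(f"{r:4.0f} m  compensated {both:6.1f} dB  uncompensated {neither:6.1f} dB")
```

With both adjustments the perceived level stays constant across range, while with neither it falls by roughly 12 dB for every doubling of distance, which is the contrast the text describes.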
the spheres. Perhaps they also use spectral information together with target strength to distinguish between different fish species.
All echolocating toothed whales have a U-shaped audiogram (Fig. 12.18) and a broad range of hearing extending up to 200 kHz. In general, the hearing of odontocetes is most sensitive at the frequencies used for echolocation. For example, the harbor porpoise, a narrowband high-frequency species, is most sensitive at around 130 kHz, the peak frequency of its narrowband signal. The killer whale uses lower frequencies in its echolocation signals and its best hearing is accordingly lower (Fig. 12.18).
Price et al. 2004). Neither seems to use echolocation to find food, but rather for crude orientation in dark caves or tunnels where they roost and nest. Arguably, bird echolocation systems are not a highly evolved sensory specialization in the same sense as in bats and odontocetes.
Disregarding nesting habits, oilbirds and swiftlets have very different ecologies. Oilbirds are nocturnal fruit-eaters from the tropical part of South America (Chantler et al. 1999). Swiftlets occur across the Indo-Pacific and use vision to locate insect prey during the day. There are records of swiftlets hunting at dusk, but it is unclear if they use echolocation during this activity (Price et al. 2004; Fullard et al. 1993).
Fig. 12.19 Schematic of syrinx anatomy in the oilbird (based on Suthers and Hector 1988, Fig. 12.2) and the Australian grey swiftlet (Aerodramus (formerly Collocalia) spodiopygia; based on Suthers and Hector 1982, Fig. 12.2), showing the trachea and its bifurcation into the two bronchi. Note the lack of intrinsic syringeal muscles (mm. broncholateralis) in the swiftlet. Note also the asymmetry of the bronchial oilbird syrinx with a more cranial placement of the right semi-syrinx. Adapted by S. Brinkløv
Suthers and Hector (1982, 1985) revealed distinct differences in the syringeal morphology of oilbirds and swiftlets (Fig. 12.19) but proposed similar sound production mechanisms in both. Oilbirds have a bronchial syrinx located caudal to the tracheal bifurcation. The two half-syringes are placed with bilateral asymmetry in the two bronchi (Suthers and Hector 1985). The swiftlet syrinx is tracheobronchial (i.e., located where the trachea splits into the two bronchi; Suthers and Hector 1982).
Suthers and Hector suggested that biosonar signals in both oilbirds and swiftlets are produced as a contraction of the extrinsic sternotrachealis muscles pulls the trachea caudally. This reduces tension across the syrinx and causes the syringeal membranes to fold into the syrinx lumen, where they induce vibrations of the expiratory airflow. Contrary to their other vocalizations, oilbirds and swiftlets actively terminate their echolocation clicks but do so by using different sets of muscles. In oilbirds, termination is controlled by contraction of the broncholateralis muscles intrinsic to the syrinx (Suthers and Hector 1985). Swiftlets lack intrinsic syringeal muscles (Fig. 12.19) and instead contract extrinsic tracheolateralis muscles to terminate their echolocation clicks (Suthers and Hector 1982).
Bird biosonar signals are relatively broadband and without structured frequency changes over time (Pye 1980). In this sense, they resemble the tongue-clicks of rousette bats more than the signals produced by other echolocators, but with a narrower frequency range, longer duration, and lacking similarly well-defined on- and offsets (Fig. 12.20).
In the wild, oilbirds emit click-bursts of two or more single clicks in rapid succession (Fig. 12.20). Their clicks and click intervals are stereotyped within such a burst, with click durations of 0.5–1 ms and click intervals of ~2.5 ms. Clicks recorded from oilbirds in the wild have the most energy around 10–15 kHz but extend from 7 to 23 kHz measured at −6 dB relative to the spectral peak (Brinkløv et al. 2017). The intervals between click-bursts are more variable, but often around 200 ms (Griffin 1953). Each click-burst is perceived by human ears as
Fig. 12.20 Waveform and spectrogram displays of bird echolocation click sequences. Top panel: oilbird (Steatornis caripensis) exiting cave roost, recorded at Dunstan’s Cave, Asa Wright Nature Centre, Trinidad. Bottom panel: swiftlet (Aerodramus unicolor) returning to its nest in a Sri Lankan railway tunnel. The overall timescale is 1 s, frequency scale is from 0 to 20 kHz. Spectrogram settings: FFT size 256, Hann window, 98% overlap. Both recordings are high-pass filtered at 1 kHz (second order Butterworth filter)
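The spectrogram settings quoted in the caption (FFT size 256, 98% overlap) translate into a frame step and a frequency-bin spacing once a sampling rate is known. The caption does not state the sampling rate, so the 96 kHz used below is a hypothetical value for illustration only; rounding the hop to the nearest whole sample is likewise an assumption about how the overlap was applied.

```python
def stft_parameters(fft_size, overlap_fraction, sample_rate_hz):
    """Derive basic time and frequency resolution figures for a spectrogram."""
    if not 0.0 <= overlap_fraction < 1.0:
        raise ValueError("overlap_fraction must be in [0, 1)")
    if fft_size <= 0 or sample_rate_hz <= 0:
        raise ValueError("fft_size and sample_rate_hz must be positive")
    # Hop between successive frames, rounded to a whole number of samples
    hop = max(1, round(fft_size * (1.0 - overlap_fraction)))
    return {
        "hop_samples": hop,
        "bin_spacing_hz": sample_rate_hz / fft_size,
        "window_duration_s": fft_size / sample_rate_hz,
        "frame_step_s": hop / sample_rate_hz,
    }


# 96 kHz is an assumed sampling rate, not taken from the figure caption
params = stft_parameters(fft_size=256, overlap_fraction=0.98, sample_rate_hz=96_000)
for name, value in params.items():
    print(f"{name}: {value}")
```

At the assumed rate the window spans about 2.7 ms, longer than a single 0.5–1 ms oilbird click, which is one reason heavy overlap is useful for displaying such short signals smoothly.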
one coherent sound (Konishi and Knudsen 1979). It is unresolved whether the number of individual clicks in a burst has functional meaning to the oilbird, but recent studies indicate that oilbirds may add click subunits to a burst as a means to increase overall burst energy and, as a result, the echolocation range (Brinkløv et al. 2017). Click-bursts typically have source levels of around 100 dB re 20 μPa rms at 1 m (Brinkløv et al. 2017).
Data from captive oilbirds differ somewhat from field recordings. Konishi and Knudsen (1979) reported that oilbird signals had most energy around 2 kHz and described each click as a pulse-like sound burst of 20 ms or more. Suthers and Hector (1985) described a large signal variation including continuous pulsed signals of 40–80 ms and shorter single or double pulses. This difference between field and captive data possibly indicates that the sounds of captive birds do not accurately reflect the echolocation behavior of birds in the wild since vocalization could be affected by reverberant confines or the stress of handling/being restrained.
Swiftlets emit biosonar signals either as single or double clicks (two single clicks in rapid succession, Thomassen et al. 2004; Fig. 12.20). As in oilbirds, it is unclear if the difference between single and double clicks has functional meaning to the swiftlets or is merely an artifact of the sound production mechanism (Suthers and Hector 1982). Of 12 swiftlet species studied, only the Atiu swiftlet (Aerodramus sawtelli) appears to consistently produce single clicks (Fullard et al. 1993), while the rest emit both single and, more often, double clicks. Each click of a pair is 1–8 ms long, with the second often of higher amplitude and slightly longer duration (Griffin and Suthers 1970; Suthers and Hector 1982; Coles et al. 1987). Clicks within a pair have intervals of 1–25 ms and click-pairs are emitted at intervals of 50–350 ms. Swiftlet clicks have most energy below 10 kHz (see spectrogram in Fig. 12.20).
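The idea that adding clicks to a burst raises burst energy, and with it echolocation range, can be illustrated numerically. The sketch rests on assumptions that are not from the source: all clicks in a burst are identical and non-overlapping, the listener integrates energy over the whole burst, and echo level falls with two-way spherical spreading (40 log10 of range) with no absorption.

```python
import math


def burst_energy_gain_db(n_clicks):
    """Energy gain of a burst of n identical, non-overlapping clicks
    relative to a single click."""
    if n_clicks < 1:
        raise ValueError("a burst needs at least one click")
    return 10.0 * math.log10(n_clicks)


def range_factor(gain_db, two_way_spreading_coeff=40.0):
    """Factor by which detection range could grow for a given gain in dB,
    assuming echo level falls as two_way_spreading_coeff * log10(range)."""
    return 10.0 ** (gain_db / two_way_spreading_coeff)


for n in (1, 2, 4):
    gain = burst_energy_gain_db(n)
    print(f"{n} clicks: +{gain:.1f} dB energy, range x {range_factor(gain):.2f}")
```

Under these assumptions the payoff is modest: quadrupling the number of clicks adds about 6 dB of energy but extends range only by a factor of about 1.4.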
Fig. 12.21 Overview of avian and mammalian middle and inner ear anatomy. Left: Birds have a single middle ear bone (columella) and a straight cochlea. Right: Mammals have three middle ear bones (malleus, incus, and stapes) and a coiled cochlea. Adapted by permission from Springer Nature. Manley GA, Peripheral hearing mechanisms in reptiles and birds; https://www.springer.com/gp/book/9783642836176. © Springer Nature, 1990. All rights reserved
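The discussion of bird hearing in this section ties the smallest detectable object to the signal wavelength (about 17 cm at 2 kHz). A minimal check of that figure follows, assuming a sound speed in air of 343 m/s (roughly 20 °C), which is an assumed value. The "one wavelength" criterion is itself a rough rule of thumb, so the inverse calculation is only indicative.

```python
def wavelength_m(frequency_hz, sound_speed_m_s=343.0):
    """Acoustic wavelength in air; the default sound speed is an assumption."""
    if frequency_hz <= 0 or sound_speed_m_s <= 0:
        raise ValueError("frequency and sound speed must be positive")
    return sound_speed_m_s / frequency_hz


def frequency_for_wavelength_hz(wavelength, sound_speed_m_s=343.0):
    """Frequency whose wavelength equals a given object size in meters."""
    if wavelength <= 0 or sound_speed_m_s <= 0:
        raise ValueError("wavelength and sound speed must be positive")
    return sound_speed_m_s / wavelength


print(f"2 kHz -> {wavelength_m(2000.0) * 100:.1f} cm")
for size_cm in (2.0, 0.6):
    f_khz = frequency_for_wavelength_hz(size_cm / 100.0) / 1000.0
    print(f"{size_cm} cm object -> {f_khz:.1f} kHz")
```

On this rule of thumb, the centimeter-scale thresholds reported for swiftlets would correspond to frequencies well above 10 kHz, which fits the suggestion that the birds draw on the weaker, upper part of their signal spectrum.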
best thresholds at 1–5 kHz (Fig. 12.9 and Coles et al. 1987). Hence, both oilbirds and swiftlets appear to have the ‘standard’ bird hearing range, with lowest thresholds between 2 and 4 kHz and poor sensitivity above 10 kHz (Dooling 1980). Curiously, it appears that oilbirds in the wild emit echolocation clicks that are not well-aligned to their best area of hearing. The lack of external ear structures in oilbirds and swiftlets means that directional cues occur at frequencies predicted by head size.
With echolocation signals matching their most sensitive area of hearing, oilbirds and swiftlets should detect objects down to at least 17 cm in diameter, equal to the wavelength of the signal at 2 kHz. For oilbirds, this prediction is supported by obstacle-avoidance experiments, suggesting that they detect discs 20 cm in diameter suspended from the ceiling of their cave roost (Konishi and Knudsen 1979). However, detection thresholds between 0.6 and 2 cm have been found for swiftlets (Griffin and Suthers 1970; Fenton 1975; Griffin and Thompson 1982; Smyth and Roberts 1983), indicating that they may somehow extract echo information from the upper, albeit weaker, frequency range of their signals.
Like bats and odontocetes, oilbirds and swiftlets detect obstacles in dark spaces using echolocation. Unlike bats and odontocetes, echolocating birds, even the nocturnal oilbird, are also vision specialists and presumably do not forage by echolocation. The importance of vision in oilbirds is reflected in their specialized retinal morphology with multiple layers of photoreceptors (Martin et al. 2004). Initial behavioral experiments revealed that oilbirds flying in darkness consistently produced sounds but could not avoid obstacles if their ears were blocked. With the lights on, the birds, in contrast, produced fewer or no sounds and negotiated obstacles also with their ears blocked (Griffin 1953).
Biosonar signals of birds are generally stereotyped (Thomassen and Povel 2006) and there is no indication that birds have similar adaptive control over signal frequency as most echolocating bats. However, Brinkløv et al. (2017) recently found that the intensity of oilbird echolocation signals increased on darker nights relative to nights with more ambient light. The higher intensity of click-bursts emitted on darker nights resulted both from an increase in the amplitude of individual clicks and an increase in the number of individual clicks per click-burst. Several studies have noted that swiftlets increase click repetition rate as they approach obstacles (Griffin and Suthers 1970; Coles et al. 1987) and Atiu swiftlets emit signals at higher repetition rate when they enter than when they emerge from their cave roost (Fullard et al. 1993).
Nesting in dark places, such as caves, mines, tunnels, and other places where the lighting is uncertain, is a common feature of the ecology of oilbirds and echolocating swiftlets. Both start clicking as they cross a threshold from light to dark (Fenton 1975; Thomassen 2005; Brinkløv et al. 2017). Neither has been shown to use echolocation for foraging, although oilbirds may be able to detect some of the larger fruits they eat (palm fruits up to 6 cm) by echolocation (Snow 1961, 1962; Bosque et al. 1995).

12.7 Orientation and Echolocation in Insectivores and Rodents

12.7.1 Echo-Based Orientation in Insectivores: Tenrecs and Shrews

Tenrecs and shrews are small insectivorous mammals that forage in dense vegetation or under leaf-litter (Fig. 12.22). Tenrecs are largely endemic to Madagascar, but shrews have a wide distribution across Eurasia and North America. Both have tiny eyes and a presumably well-developed olfactory sense and emit a variety of sounds. The use of sounds by shrews and tenrecs, as they approach and explore unfamiliar objects in their surroundings, led to initial suggestions that they may use echolocation. However, few studies have successfully tested this hypothesis directly. The current consensus is that shrews and tenrecs may use a simple echo-based orientation system to obtain rough acoustic input about their surroundings at short range beyond their snout and vibrissae. As stated by Siemers et al.
Fig. 12.22 Photographs (from left) of lowland streaked tenrec (Hemicentetes semispinosus), lesser hedgehog tenrec (Echinops telfairi), and northern short-tailed shrew (Blarina brevicauda). Photo of lowland streaked tenrec by Frank Vassen, 2010, https://commons.wikimedia.org/wiki/File:Lowland_Streaked_Tenrec,_Mantadia,_Madagascar.jpg#filelinks. Photo of lesser hedgehog tenrec by Wilfried Berns, 2006, https://en.wikipedia.org/wiki/Lesser_hedgehog_tenrec#/media/File:Kleiner-igeltanrek-a.jpg. Photo of northern short-tailed shrew by Giles Gonthier, 2007, https://en.wikipedia.org/wiki/Northern_short-tailed_shrew#/media/File:Blarina_brevicauda.jpg. All photos licensed under CC BY 2.0; https://creativecommons.org/licenses/by/2.0/deed.en
(2009): “Except for large and thus strongly reflecting objects, such as a big stone or tree trunk, shrews probably are not able to disentangle echo scenes, but rather derive information on habitat type from the overall call reverberations. This might be comparable to human hearing whether one calls into a forest or into a reverberant cave.”
Gould et al. (1964) and Gould (1965) provided the most direct evidence for echo-based orientation in several species of shrews and tenrecs. After unsuccessful attempts to use an obstacle-avoidance set-up, the animals were instead tested using a so-called disc-platform apparatus. They were trained to find and jump onto a platform suspended at a vertical distance below a disc with an area of partial overlap. The location of the overlap was varied at random between trials. Both tenrecs and shrews emitted sounds during this task in the dark, but animals with their ears blocked were less successful in finding and landing on the platform than control animals. The control experiments included two tenrecs that were blindfolded.
Gould (1965) recorded the sound pulses emitted by captive tenrecs (Echinops telfairi, Hemicentetes semispinosus, and Nesogale (formerly Microgale) dobsoni) as they explored the disc-platform apparatus. The tenrecs emitted series of tongue clicks, each less than 2 ms long with most energy between 10 and 16 kHz. The clicks were produced as singles, doubles, or in triplets. Streaked tenrecs (Hemicentetes semispinosus) emitted clicks of low intensity, while those of Nesogale dobsoni were audible to humans at 7 m.
Gould et al. (1964) found that, contrary to the audible pulses of tenrecs, shrews (Sorex vagrans, S. cinereus, S. palustris, and Blarina brevicauda) searching for the platform emitted ultrasonic pulses with most energy between 30 and 60 kHz. The pulses were about 5 ms in duration with inter-pulse intervals of about 20 ms. Sanchez et al. (2019) recorded five Sorex unguiculatus in three different experimental setups, including soft and hard barrier obstacles. Under all three conditions, the shrews emitted a variety of calls, including clicks and several tonal pulse types ranging in frequency between 5 and 45 kHz with durations of 3–40 ms. While several studies have shown that shrews and tenrecs do show context-dependent changes in vocalization rate, there is little direct evidence for echolocation by these animals (Buchler 1976; Tomasi 1979; Forsman and Malmquist 1988; Siemers et al. 2009; Sanchez et al. 2019).
No morphological adaptations for echolocation have been found in the auditory systems of
tenrecs or shrews. The limited data on hearing in Supplementing the behavioral part of their
these animals indicate that at least tenrecs hear study, He et al. (2021) also conducted anatomical
well across the frequency range of their tongue- scans to reveal that the stylohyal bone of soft-
clicks. Sales and Pye (1974) reported that the furred tree mice is fused with the tympanic bone,
hearing of streaked tenrecs is most sensitive which is characteristic of echolocating bats.
from 2 to 60 kHz. Drexl et al. (2003) used Lastly, they used genetic analyses to document a
otoacoustic emissions and auditory evoked strong convergence of hearing-related genes with
potentials from the inferior colliculus and the those of other echolocating mammal groups,
auditory cortex to determine that the auditory including the prestin gene associated with echo-
range of lesser hedgehog tenrecs (Echinops location in bats and toothed whales (Liu et al.
telfairi) extends from 5–50 kHz at 40 dB SPL, 2014). All four species of soft-furred tree mice
with a lowest threshold at 16 kHz. Siemers et al. emit similar short (~2 ms) ultrasonic pulses rang-
(2009) report a best hearing range of shrews ing from 65 to 140 kHz (He et al. 2021).
between 2 and 20 kHz.
One important test for echolocation is to blind the Studies on the role of echolocation signals for
echolocator. This was done by Griffin (1958) for intraspecific communication have included
bats and by Norris et al. (1961) for dolphins. observations and recordings, playback
Although such a “blinding test” was not experiments, and combinations of these
performed, a multifaceted study by He et al. approaches. Echolocation signals elicited territo-
(2021) convincingly suggests soft-furred tree rial behavior in foraging spotted bats, served in
mice (Typhlomys) must be added to the list of individual recognition, and assisted in
echolocating animals. Through behavioral maintaining group adhesion among foraging
experiments in total darkness, filmed with an molossids (Fenton 1995). Furthermore, bats use
infrared video camera, they showed that all four buzzes (high pulse repetition rates) not only when
species of soft-furred tree mouse emitted acoustic attacking prey, but also during landing, drinking
pulses at higher rate and grouped pulses more in and by several species in social settings (e.g.,
complex space than open space and during obsta- Schwartz et al. 2007). Many bat species roost in
cle avoidance. Further, three species (T. cinereus, large groups in caves and emerge at dusk as a
T. daloushanensis, and T. nanus) were tested in a group to forage. Several toothed whale species
disk-platform setup similar to that used by Gould forage in large numbers. Echolocation in bats and
et al. (1964) for shrews and tenrecs. The tree mice odontocetes likely plays a role in maintaining
spent increased time emitting higher pulse rates spacing among group members during foraging
on the sector of the disk above the platform before or during large group movements. However, there
dropping down onto the platform. This preference has been little research on whether all or only
was lost when their ears were blocked but specific animals echolocate while foraging as a
regained when the ears were unplugged or fitted group. The benefits of eavesdropping on each
with hollow tubes. The study also used laboratory other’s echolocation signals need to be studied.
house mice (Mus musculus) as a control to dem- Groups of flying bats and swimming toothed
onstrate absence of any location preference or whales surely eavesdrop on each other’s echolo-
sound emission during the disk-platform test. cation signals to gain general information about
Myriad tests and field studies document the func- prey location. The energetic cost of sound pro-
tional use of echolocation by bats and toothed duction for flying bats and for clicking dolphins is
whales, but such studies are not available for negligible (Speakman and Racey 1991; Noren
insectivores and rodents. et al. 2017).
Evidence suggests that toothed whales use their echolocation clicks as
communication signals. These comprise repeated patterns of rising, falling, or
constant click repetition rates up to near 1000 clicks/s. Clicks used for
communication by dolphins and porpoises have the same spectral properties as
those used for echolocation, but this does not hold true for the coda-clicks
of sperm whales, as explained below.

In toothed whales, most is known about the communication role of echolocation
clicks from studies of captive harbor porpoises, captive bottlenose dolphins,
and wild sperm whales. Porpoises and dolphins communicate with changing click
repetition rates, rather like Morse code, without changing the temporal and
spectral properties of the clicks (Rasmussen and Miller 2002; Clausen et al.
2010). These “pulse-bursts” (or burst-pulse sounds) of high repetition rate
clicks with narrow sound beams are especially good for close range and
directed communication (Clausen et al. 2010).

Figure 12.23 shows click rates used in five behavioral contexts between a
mother harbor porpoise and her calf. The porpoises used the highest click
rates in aggressive encounters, the lowest in grooming and echelon swimming
(Clausen et al. 2010). The mother may be aggressive toward her calf and toward
males. Aggressive signals were usually higher in intensity and repetition
rates and always resulted in the other animal moving away from the emitter.
Both mother and calf emitted approach signals, but only the calf emitted
contact signals and only the mother emitted grooming signals. Wild harbor
porpoises also use rapid click rates for communication (Sørensen et al. 2018).

Bottlenose dolphins use both echolocation clicks and whistles as communication
signals. Blomkvist and Amundin (2004) studied two captive female bottlenose
dolphins that used high-frequency, high repetition rate pulse-bursts during
aggressive behavior. The pulse-bursts lasted up to 900 ms with click
repetition rates from 100 to 940 clicks/s. Like the echolocation clicks used
for orientation and foraging, the pulses were between 60 and 150 kHz. The
metabolic rate of dolphins producing clicks was only slightly greater than
that of silent dolphins, indicating that echolocation is not energetically
costly (Noren et al. 2017).

Several free-ranging species of dolphins (Tursiops truncatus, Stenella
attenuata, S. longirostris, S. frontalis, Orcinus orca, and Cephalorhynchus
hectori) use pulse-bursts mostly during affiliative and aggressive behavior
(Dawson 1991; Herzing 2000; Lammers et al. 2004). Rasmussen et al. (2016)
played back artificial pulse-burst signals (repeated at 300 clicks/s for 2 s)
to 21 free-ranging white-beaked dolphins. Rather than responding with
aggressive behavior, the dolphins showed mostly a change in swimming direction
and swam around the projection equipment, mirroring the retreat of individual
captive harbor porpoises receiving an ‘aggressive’ pulse-burst. The
pulse-bursts, or rasps, of Blainville’s beaked whale are only emitted at
depths below 200 m and composed of a series of short, FM clicks similar to its
FM echolocation clicks, except with a lower peak-frequency. The communication
context is not known (Arranz et al. 2011).

Sperm whales are social and form social units in subtropical and tropical
waters worldwide. Up to 12 females with young of both sexes gather in
long-term stable social units. Sperm whales in all ocean basins communicate
using rhythmic “coda” clicks (see Fig. 12.12), which are a unique
specialization among toothed whales (Watkins and Schevill 1977) and may even
signify individual identity. The composition of codas can have many repetitive
patterns, such as one click + a group of three clicks: 1 + 3, or
2 + 1 + 1 + 1, 1 + 1 + 3, etc. The coda patterns are not stereotyped; click
intervals within a coda can vary and seem to contain information for the
receiver. One stable social unit of five adult females, a juvenile male, and a
calf in the waters off Dominica used 15 different codas. All individuals in
the unit used several codas and one individual used 11 of the 15 codas
(Antunes et al. 2011). A recent study (Oliveira et al. 2016) confirmed and
extended the findings of Antunes et al. (2011). Using digital data acquisition
tags (D-tags) attached to five individual sperm whales near the Azores,
Oliveira et al. (2016) strongly indicated that codas from these
Fig. 12.23 Use of echolocation click rates by harbor porpoise as communication
signals. Five different acoustic behaviors with seven events in each are
shown. Note the very rapid increase in click repetition rate up to 1000
clicks/s during aggressive encounters. Reprinted with permission from Taylor &
Francis. Clausen KT, Wahlberg M, Beedholm K, Dereuiter S, Madsen PT, Click
communication in harbor porpoises (Phocoena phocoena). Bioacoustics 20:1–28;
https://www.tandfonline.com/doi/abs/10.1080/09524622.2011.9753630. © Taylor &
Francis, 2011. All rights reserved
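The click repetition rates quoted in this section and in Fig. 12.23 are the
reciprocal of the interval between successive clicks, so they can be derived
directly from click arrival times. The following is a minimal Python sketch
under stated assumptions, not a method taken from the cited studies. The
function names, the synthetic click times, and the 100 clicks/s cut-off used to
flag a pulse-burst are all hypothetical; the cut-off is simply the lower end of
the range reported by Blomkvist and Amundin (2004).

```python
from typing import List


def click_rates(click_times_s: List[float]) -> List[float]:
    """Return the instantaneous click rate (clicks/s) for each inter-click interval."""
    rates = []
    for earlier, later in zip(click_times_s, click_times_s[1:]):
        interval = later - earlier
        if interval <= 0:
            raise ValueError("click times must be strictly increasing")
        rates.append(1.0 / interval)
    return rates


def is_pulse_burst(click_times_s: List[float], min_rate: float = 100.0) -> bool:
    """Hypothetical rule: a train counts as a pulse-burst if every
    inter-click interval implies a rate of at least min_rate clicks/s."""
    rates = click_rates(click_times_s)
    return bool(rates) and all(rate >= min_rate for rate in rates)


# Synthetic train at 300 clicks/s for 2 s, modeled on the playback
# stimulus described for Rasmussen et al. (2016): 600 clicks in total.
playback = [i / 300.0 for i in range(600)]

# Synthetic slow train at 20 clicks/s (illustrative only).
slow_train = [i / 20.0 for i in range(10)]

print(len(playback), round(click_rates(playback)[0]))
print(is_pulse_burst(playback), is_pulse_burst(slow_train))
```

With these synthetic inputs the playback train is flagged as a pulse-burst and
the slow train is not. For real recordings the click times would first have to
be obtained from a click detector, which is outside the scope of this sketch.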
echolocators. Bats produce their echolocation sounds in the larynx. Dolphins
emit echolocation sounds through the melon within their forehead and from here
into the water. They have phonic lips in their nasal passage to produce their
echolocation clicks and communication whistles.

A primary advantage of echolocation is allowing animals to operate and orient
in situations where light is uncertain, unpredictable, or plain absent. But as
with other sensory capacities, echolocation often does not stand alone. The
cross-modal sensory interactions between echolocation and sensory abilities
such as touch, olfaction, and vision are an area awaiting further exploration.

Information leakage is a primary disadvantage of echolocation. The signals
used in echolocation are audible to many other animals, such as competing
conspecifics, predators, and prey. The evolutionary arms race between
echolocating bats and some insect prey is a classic example of predator–prey
co-evolution. Signals used in echolocation also can function in communication,
as shown in echolocating bats and toothed whales.

Both bats and odontocetes are affected by anthropogenic activities, as
exemplified by the high mortality experienced by some bat species from wind
turbines and incidents of drowning, for example, in porpoises accidentally
entangled in stationary gillnets. Anthropogenic sound sources like road or
shipping noise may interfere with efficient foraging in bats and toothed
whales and seismic explosions used for offshore oil exploration can affect the
behavior of toothed whales and other marine mammals. Echolocating birds are
also affected by humans, for example, from poaching or nest collecting and
habitat-destructive mining activity. Gaining an increased understanding of
echolocation behavior in these animals could have important implications for
such issues and for wildlife management in general.

12.10 Additional Resources

For a more in-depth view of bat echolocation, we strongly recommend Griffin’s
book Listening in the Dark. While now more than 60 years old, the original
observations and insights detailed by Griffin (1958) are still very much to
the point and relevant today. The Springer Handbook of Auditory Research
volumes Hearing by Bats, Bat Bioacoustics, Hearing by Whales and Dolphins, and
Biosonar are also highly recommended as they hold much more detail than the
present description. Finally, Thomas, Moss, and Vater edited a book on
Echolocation in Bats and Dolphins in 2002.

Acknowledgments We dedicate this chapter to Dr. Annemarie Surlykke, who made
substantial contributions to the field of bioacoustics in insects and in
echolocating bats. She was one of the first women scientists to concentrate
her research in the area of bioacoustics, which requires a multi-disciplinary
understanding of biology, acoustics, physics, animal behavior, and electrical
engineering.

We appreciate the careful reviews of sections 5 and 8 by Mats Amundin, Senior
Advisor Kolmårdens Djurpark and Guest Prof. Linköping University, Sweden;
Professor Peter T. Madsen, Department of Bioscience, Aarhus University,
Denmark; and Associate Professor Magnus Wahlberg, Institute of Biology,
University of Southern Denmark, Odense, Denmark. We acknowledge and appreciate
the initial outline of this chapter by now deceased Jeanette Thomas.

References

Amundin M (1991) Sound production in odontocetes with emphasis on the harbour porpoise Phocoena phocoena. Stockholm University, Stockholm
Andersen SH, Amundin M (1976) Possible predator-related adaption of sound production and hearing in the harbour porpoise (Phocoena phocoena). Aquat Mamm 4(2):56–57
Antunes R, Schulz T, Gero S, Whitehead H, Gordon J, Rendell L (2011) Individual distinctive acoustic features in sperm whale codas. Anim Behav 81(4):723–730
Arranz P, Aguilar de Soto N, Madsen PT, Brito A, Bordes F, Johnson MP (2011) Following a foraging fish-finder: diel habitat use of Blainville’s beaked whales revealed by echolocation. PLoS One 6(12). https://doi.org/10.1371/journal.pone.0028353
Au WWL (1993) The sonar of dolphins. Springer, New York
Au WWL (2015) History of dolphin biosonar research. Acoust Tod 11(4):10–17
Au WWL, Simmons JA (2007) Echolocation in dolphins and bats. Phys Tod 2007:40–45
Au WWL, Suthers RA (2014) Production of biosonar signals: structure and form. In: Surlykke A, Nachtigall PE, Fay RR, Popper AN (eds) Biosonar. Springer, New York, pp 61–105. https://doi.org/10.1007/978-1-4614-9146-0_3
Au WWL, Carder DA, Penner RH, Scronce BL (1985) Demonstration of adaptation in beluga whale echolocation signals. J Acoust Soc Am 77:726–730
Au WWL, Moore PWB, Pawloski D (1986) Echolocation transmitting beam of the Atlantic bottlenose dolphin. J Acoust Soc Am 80:688–691
Au WWL, Kastelein RA, Rippe T, Schooneman NM (1999) Transmission beam pattern and echolocation signals of a harbor porpoise (Phocoena phocoena). J Acoust Soc Am 106:3699–3705
Au WWL, Kastelein RA, Benoit-Bird KJ, Cranford TW, McKenna MF (2006) Acoustic radiation from the head of echolocating harbor porpoises (Phocoena phocoena). J Exp Biol 209:2726–2733
Baumann-Pickering S, Wiggins SM, Roth EH, Roch MA, Schnitzler HU, Hildebrand JA (2010) Echolocation signals of a beaked whale at Palmyra atoll. J Acoust Soc Am 127(6):3790–3799. https://doi.org/10.1121/1.3409478
Blomkvist C, Amundin M (2004) High-frequency burst-pulse sounds in agonistic/aggressive interactions in bottlenose dolphins, Tursiops truncatus. In: Thomas JA, Moss CF, Vater M (eds) Echolocation in bats and dolphins. University of Chicago Press, Chicago, pp 425–431
Boonman AM, Jones G (2002) Intensity control during target approach in echolocating bats; stereotypical sensori-motor behaviour in Daubenton’s bats, Myotis daubentonii. J Exp Biol 205:2865–2874
Bosque C, Ramirez R, Rodriguez D (1995) The diet of the oilbird in Venezuela. Ornitol Neotrop 6:67–80
Brinkløv S, Fenton MB, Ratcliffe JM (2013) Echolocation in oilbirds and swiftlets. Front Physiol 4:123
Brinkløv S, Elemans CPH, Ratcliffe JM (2017) Oilbirds produce echolocation signals beyond their best hearing range and adjust signal design to natural light conditions. R Soc Open Sci 4(5):17025
Bruns V, Schmieszek E (1980) Cochlear innervation in the greater horseshoe bat - demonstration of an acoustic fovea. Hear Res 3(1):27–43. https://doi.org/10.1016/0378-5955(80)90006-4
Buchler ER (1976) The use of echolocation by the wandering shrew (Sorex vagrans). Anim Behav 24:858–873
Burgin CJ, Colella JP, Kahn PL, Upham NS (2018) How many species of mammals are there? J Mammal 99(1):1–14. https://doi.org/10.1093/jmammal/gyx147
Caruso F, Sciacca V, Bellia G, De Domenico E, Larosa G, Papale E et al (2015) Size distribution of sperm whales acoustically identified during long term deep-sea monitoring in the Ionian Sea. PLoS One 10(12):e0144503. https://doi.org/10.1371/journal.pone.0144503
Chantler P, Wells DR, Schuchmann KL (1999) Family Apodidae (swifts). In: Hoyo D, Elliott S (eds) Handbook of the birds of the world, barn owls to hummingbirds, vol 5. Lynx, Barcelona, pp 388–457
Clausen KT, Wahlberg M, Beedholm K, Dereuiter S, Madsen PT (2010) Click communication in harbour porpoises (Phocoena phocoena). Bioacoustics 20:1–28
Coles RB, Konishi M, Pettigrew JD (1987) Hearing and echolocation in the Australian Grey swiftlet, Collocalia spodiopygia. J Exp Biol 129:365–371
Coles RB, Guppy A, Anderson ME, Schlegel P (1989) Frequency sensitivity and directional hearing in the gleaning bat, Plecotus auritus (Linnaeus 1758). J Comp Physiol A 165:269–280
Cranford TW, Amundin M, Norris KS (1996) Functional morphology and homology in the Odontocete nasal complex: implications for sound generation. J Morphol 228:223–285
Cranford TW, McKenna MF, Soldevilla MS, Wiggins SM, Goldbogen JA, Shadwick RE, Krysl P, Leger JA, Hildebrand JA (2008) Anatomic geometry of sound transmission and reception in Cuvier’s beaked whale (Ziphius cavirostris). Anat Rec 291:353–378
Cranford TW, Elsberry WR, Van Bonn WG, Jeffress JA, Chaplin MS, Blackwood DJ, Carder DA, Kamolnick T, Todd MA, Ridgway SH (2011) Observation and analysis of sonar signal generation in the bottlenose dolphin (Tursiops truncatus): evidence for two sonar sources. J Exp Mar Biol Ecol 407(1):81–96
Cranford TW, Trijoulet V, Smith CR, Krysl P (2014) Validation of a vibroacoustic finite element model using bottlenose dolphin simulations: the dolphin biosonar beam is focused in stages. Bioacoustics 23(2):161–194
Cranford TW, Amundin M, Krysl P (2015) Sound production and sound reception in Delphinoids. In: Johnson CM, Herzing DL (eds) Dolphin communication and cognition. Past, present, and future. MIT Press, Boston, MA, pp 19–48
Culik BM (2011) Odontocetes - the toothed whales, CMS Technical Series No. 24, vol 24. United Nations Environmental Program, Bonn, Germany
Dalland JI (1965) Hearing sensitivity in bats. Science 150:1185–1186
Dawson SM (1991) Clicks and communication: the behavioural and social contexts of Hector’s dolphin vocalizations. Ethology 88:265–276
Denzinger A, Schnitzler HU (2013) Bat guilds, a concept to classify the highly diverse foraging and echolocation behaviors of microchiropteran bats. Front Physiol 4. https://doi.org/10.3389/fphys.2013.00164
Dooling RJ (1980) Behavior and psychophysics of hearing in birds. In: Popper AN, Fay RR (eds) Comparative studies of hearing in vertebrates. Springer, New York, pp 261–288
Dormer KJ (1979) Mechanism of sound production and air recycling in delphinids: cineradiographic evidence. J Acoust Soc Am 65(1):229–239
Drexl M, Faulstich MH, Von Stebut B, Radtke-Schuller S, Kössl M (2003) Distortion product otoacoustic emissions and auditory evoked potentials in the hedgehog tenrec, Echinops telfairi. J Assoc Res Otolaryngol 4:555–564
Elemans CPH, Mead AF, Jakobsen L, Ratcliffe JM (2011) Superfast muscles set maximum call rate in echolocating bats. Science 333:1885–1888
Fais M, Johnson M, Wilson M, Aguilar Soto N, Madsen PT (2016) Sperm whale predator-prey interactions involve chasing and buzzing, but no acoustic stunning. Sci Rep 6:28562:1–13. https://doi.org/10.1038/srep28562
Falk B, Williams T, Aytekin M, Moss CF (2011) Adaptive behavior for texture discrimination by the free-flying big brown bat, Eptesicus fuscus. J Comp Physiol A 197(5):491–503
Fay RR (1988) Hearing in vertebrates: a psychophysics databook. Hill-Fay Associates, Winnetka, IL
Fenton MB (1975) Acuity of echolocation in Collocalia hirundinacea (Aves: Apodidae), with comments on the distributions of echolocating swiftlets and molossid bats. Biotropica 7:1–7
Fenton MB (1995) Natural history and biosonar signals. In: Popper AN, Fay RR (eds) Hearing by bats, Springer handbook of auditory research, vol 5. Springer, New York, pp 37–86
Fenton MB, Faure PA, Ratcliffe JM (2012) Evolution of high duty cycle echolocation in bats. J Exp Biol 215(17):2935–2944
Finneran JJ, Branstetter BK, Houser DS, Moore PW, Mulsow J, Martin C, Perisho S (2014) High-resolution measurement of a bottlenose dolphin’s (Tursiops truncatus) biosonar transmission beam pattern in the horizontal plane. J Acoust Soc Am 136(4):2025–2038
Forsman KA, Malmquist MG (1988) Evidence for echolocation in the common shrew, Sorex araneus. J Zool Soc Lond 216:655–662
Fullard JH, Barclay RMR, Thomas DW (1993) Echolocation in free-flying Atiu swiftlets (Aerodramus sawtelli). Biotropica 25:334–339
Galatius A, Olsen MT, Steeman ME, Racicot RA, Bradshaw CD, Kyhn L, Miller LA (2019) Raising your voice: evolution of narrow band high frequency signals in odontocetes. Biol J Linn Soc 126:213–224. https://doi.org/10.1093/biolinnean/bly194
Gardiner JD, Dimitriadis G, Sellers WI, Codd JR (2008) The aerodynamics of big ears in the brown long-eared bat Plecotus auritus. Acta Chiropterol 10(2):313–321. https://doi.org/10.3161/150811008x414881
Ghose K, Moss CF (2003) The sonar beam pattern of a flying bat as it tracks tethered insects. J Acoust Soc Am 114(2):1120–1131
Goldman LJ, Henson OW (1977) Prey recognition and selection by the constant frequency bat, Pteronotus p. parnellii. Behav Ecol Sociobiol 2:411–419
Gonzalez-Terrazas TP, Martel C, Milet-Pinheiro P, Ayasse M, Kalko EKV, Tschapka M (2016) Finding flowers in the dark: nectar-feeding bats integrate olfaction and echolocation while foraging for nectar. R Soc Open Sci 3(8). https://doi.org/10.1098/rsos.160199
Gould E (1965) Evidence for echolocation in the Tenrecidae of Madagascar. Proc Am Philos Soc 109:352–360
Gould E, Negus NC, Novick A (1964) Evidence for echolocation in shrews. J Exp Zool 156:19–37
Griffin DR (1944) Echolocation by blind men, bats and radar. Science 100:589–590
Griffin DR (1953) Acoustic orientation in the oilbird, Steatornis. Proc Natl Acad Sci USA 39:884–893
Griffin DR (1958) Listening in the dark, 2nd edn. Cornell University, New York
Griffin DR, Suthers RA (1970) Sensitivity of echolocation in cave swiftlets. Biol Bull 139:365–371
Griffin DR, Thompson T (1982) Echolocation by cave swiftlets. Behav Ecol Sociobiol 10:119–123
Griffin DR, Webster FA, Michael CR (1960) The echolocation of flying insects by bats. Anim Behav 8:141–154
Griffin DR, Friend JH, Webster FA (1965) Target discrimination by the echolocation of bats. J Exp Zool 158:155–168
Grunwald JE, Schornich S, Wiegrebe L (2004) Classification of natural textures in echolocation. Proc Natl Acad Sci USA 101(15):5670–5674. https://doi.org/10.1073/pnas.0308029101
Guppy A, Coles RB (1988) Acoustical and neural aspects of hearing in the Australian gleaning bats, Macroderma gigas and Nyctophilus gouldi. J Comp Physiol A 162(5):653–668. https://doi.org/10.1007/Bf01342641
Hartley DJ (1992a) Stabilization of perceived echo amplitudes in echolocating bats. I. Echo detection and automatic gain control in the big brown bat, Eptesicus fuscus, and the fishing bat, Noctilio leporinus. J Acoust Soc Am 91:1120–1132
Hartley DJ (1992b) Stabilization of perceived echo amplitudes in echolocating bats. II. The acoustic behavior of the big brown bat, Eptesicus fuscus, when tracking moving prey. J Acoust Soc Am 91:1133–1149
Hartley DJ, Suthers RA (1987) The sound emission pattern and the acoustical role of the noseleaf in the echolocating bat, Carollia perspicillata. J Acoust Soc Am 82:1892–1900
Hartley DJ, Suthers RA (1989) The sound emission pattern of the echolocating bat, Eptesicus fuscus. J Acoust Soc Am 85:1348–1351
He K, Liu Q, Xu D-M, Qi F-Y, Bai J, He S-W, Chen P, Zhou X, Cai W-Z, Chen Z-L, Jiang X-L, Shi P (2021) Echolocation in soft-furred tree mice. Science 372:1–10. https://doi.org/10.1126/science.aay1513
Henson OW Jr (1965) The activity and function of the middle-ear muscles in echo-locating bats. J Physiol 180(4):871–887. https://doi.org/10.1113/jphysiol.1965.sp007737
Herzing DL (2000) Acoustics and social behavior of wild dolphins: implications for a sound society. In: Au WWL, Popper AN, Fay RR (eds) Hearing by whales and dolphins, vol 12. Springer, New York, pp 225–272
Hiryu S, Mora EC, Riquimaroux H (2016) Behavioral and physiological bases for doppler shift compensation by echolocating bats. In: Fenton M, Grinnell A, Popper A, Fay R (eds) Bat Bioacoustics, Springer handbook of auditory research, vol 54. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-3527-7_9
Holderied MW, Korine C, Fenton MB, Parsons S, Robson S, Jones G (2005) Echolocation call intensity in the aerial hawking bat Eptesicus bottae (Vespertilionidae) studied using stereo videogrammetry. J Exp Biol 208:1321–1327
Houston RD, Boonman AM, Jones G (2004) Do echolocation signal parameters restrict bats’ choice of prey? In: Thomas JA, Moss CF, Vater M (eds) Echolocation in bats and dolphins. Chicago University Press, Chicago, pp 339–345
Huggenberger S, Rauschmann MA, Vogl TJ, Oelschläger HHA (2009) Functional morphology of the nasal complex in the harbor porpoise (Phocoena phocoena L.). Anat Rec 292:902–920
Hulgard K, Moss CF, Jakobsen L, Surlykke A (2016) Big brown bats (Eptesicus fuscus) emit intense search calls and fly in stereotyped flight paths as they forage in the wild. J Exp Biol 219(3):334–340. https://doi.org/10.1242/jeb.128983
Iwaniuk AN, Clayton DH, Wylie DR (2006) Echolocation, vocal learning, auditory localization and the relative size of the avian auditory midbrain nucleus (MLd). Behav Brain Res 167:307–317
Jakobsen L, Surlykke A (2010) Vespertilionid bats control the width of their biosonar sound beam dynamically during prey pursuit. PNAS 107(31):13930–13935
Jakobsen L, Brinklov S, Surlykke A (2013a) Intensity and directionality of bat echolocation signals. Front Physiol 4:89. https://doi.org/10.3389/fphys.2013.00089
Jakobsen L, Ratcliffe JM, Surlykke A (2013b) Convergent acoustic field of view in echolocating bats. Nature 493(7430):93–96. https://doi.org/10.1038/nature11664
Jakobsen L, Olsen MN, Surlykke A (2015) Dynamics of the echolocation beam during prey pursuit in aerial hawking bats. Proc Natl Acad Sci USA 112(26):8118–8123. https://doi.org/10.1073/pnas.1419943112
Jen PHS, Suga N (1976) Coordinated activities of middle-ear and laryngeal muscles in echolocating bats. Science 191:950–952
Johansson LC, Hakansson J, Jakobsen L, Hedenstrom A (2016) Ear-body lift and a novel thrust generating mechanism revealed by the complex wake of brown long-eared bats (Plecotus auritus). Sci Rep 6:24886. https://doi.org/10.1038/srep24886
Johnson M, Madsen PT, Zimmer WMX, Aguilar de Soto N, Tyack PL (2006) Foraging Blainville’s beaked whales (Mesoplodon densirostris) produce distinct click types matched to different phases of echolocation. J Exp Biol 209:5038–5050
Kalko EKV, Schnitzler H-U (1989) The echolocation and hunting behavior of Daubenton’s bat, Myotis daubentoni. Behav Ecol Sociobiol 24:225–238
Kalko EKV, Schnitzler H-U (1993) Plasticity in echolocation signals of European pipistrelle bats in search flight: implications for habitat use and prey detection. Behav Ecol Sociobiol 33:415–428
Kastelein RA, Hoek L, de Jong CAF, Wensveen PJ (2010) The effect of signal duration on the underwater detection thresholds of a harbor porpoise (Phocoena phocoena) for single frequency-modulated tonal signals between 0.25 and 160 kHz. J Acoust Soc Am 128(5):3211–3222. https://doi.org/10.1121/1.3493435
Kloepper LN, Nachtigall PE, Donahue MJ, Breese M (2012) Active echolocation beam focusing in the false killer whale, Pseudorca crassidens. J Exp Biol 215:1306–1312. https://doi.org/10.1242/jeb.066605
Koay G, Heffner RS, Heffner HE (1998) Hearing in a Megachiropteran fruit bat (Rousettus aegyptiacus). J Comp Psychol 112(4):371–382
Koblitz JC, Wahlberg M, Stilz P, Madsen PT, Beedholm K, Schnitzler HU (2012) Asymmetry and dynamics of a narrow sonar beam in an echolocating harbor porpoise. J Acoust Soc Am 131(3):2315–2324. https://doi.org/10.1121/1.3683254
Konishi M, Knudsen EI (1979) The oilbird: hearing and echolocation. Science 204:425–427
Kössl M, Vater M (1995) Cochlear structure and function in bats. In: Popper AN, Fay RR (eds) Hearing by bats, vol 5. Springer, New York, pp 191–234
Kounitsky P, Rydell J, Amichai E, Boonman A, Eitan O, Weiss AJ, Yovel Y (2015) Bats adjust their mouth gape to zoom their biosonar field of view. Proc Natl Acad Sci USA 112(21):6724–6729. https://doi.org/10.1073/pnas.1422843112
Kubke MF, Massoglia DP, Carr CE (2004) Bigger brains or bigger nuclei? Regulating the size of auditory structures in birds. Brain Behav Evol 63:169–180
Kuroda M, Sasaki M, Yamada K, Miki N, Matsuishi T (2015) Tissue physical property of the harbor porpoise Phocoena phocoena for investigation of the sound emission process. J Acoust Soc Am 138(3):1451–1456
Kürten L, Schmidt U (1982) Thermoperception in the common vampire bat (Desmodus rotundus). J Comp Physiol 146:223–228. https://doi.org/10.1007/BF00610241
Ladegaard M, Jensen FH, Beedholm K, da Silva VMF, Madsen PT (2017) Amazon river dolphins (Inia geoffrensis) modify biosonar output level and directivity during prey interception in the wild. J Exp Biol 220:2654–2665. https://doi.org/10.1242/jeb.159913
Lammers MO, Au WWL, Aubauer R, Nachtigall PE (2004) A comparative analysis of the pulsed emissions of free-ranging Hawaiian spinner dolphins (Stenella longirostris). In: Thomas JA, Moss CF, Vater M (eds) Echolocation in bats and dolphins. University of Chicago Press, Chicago, pp 414–419
Lawrence BD, Simmons JA (1982a) Echolocation in bats: the external ear and perception of the vertical position of targets. Science 218:481–483
Lawrence BD, Simmons JA (1982b) Measurements of atmospheric attenuation at ultrasonic frequencies and the significance for echolocation by bats. J Acoust Soc Am 71(3):585–590. https://doi.org/10.1121/1.387529
Lazure L, Fenton MB (2011) High duty-cycle echoloca- Masters WM, Moffat AJM, Simmons JA (1985) Sonar
tion and prey detection by bats. J Exp Biol 214:1131– tracking of horizontally moving targets by the big
1137 brown bat Eptesicus fuscus. Science 228:1331–1333
Lewanzik D, Goerlitz HR (2018) Continued source level Matsuta N, Hiryu S, Fujioka E, Yamada Y,
reduction during attack in the low-amplitude bat Riquimaroux H, Watanabe Y (2013) Adaptive beam-
Barbastella barbastellus prevents moth evasive flight. width control of echolocation sounds by CF-FM bats,
Funct Ecol 32(5):1251–1261. https://doi.org/10.1111/ Rhinolophus ferrumequinum nippon, during prey-
1365-2435.13073 capture flight. J Exp Biol 216(Pt 7):1210–1218.
Li S, Nachtigall PE, Breese M (2011) Dolphin hearing https://doi.org/10.1242/jeb.081398
during echolocation: evoked potential responses in an Miller LA (1983) IV.3 How insects detect and avoid
Atlantic bottlenose dolphin (Tursiops truncatus). J Exp bats. In: Huber F, Markl H (eds) Neuroethology and
Biol 214:2027–2035. https://doi.org/10.1242/jeb. behavioral physiology. Springer, Berlin, pp 251–266
053397 Miller LA (2010) Prey capture by harbor porpoises
Linnenschmidt M, Beedholm K, Wahlberg M, Kristensen (Phocoena phocoena): a comparison between
JH, Nachtigall PE (2012) Keeping returns optimal: echolocators in the field and in captivity. J Mar Acoust
gain control exerted through sensitivity adjustments Soc Jpn 37(3):156–168
in the harbour porpoise auditory system. Proc Royal Miller LA, Surlykke A (2001) How some insects detect
Soc B 279:2237–2465 and avoid being eaten by bats: the tactics and counter
Linnenschmidt M, Teilmann J, Akamatsu T, Dietz R, tactics of prey and predator. Bioscience 51:570–581
Miller LA (2013) Biosonar, dive and foraging activity Miller LA, Wahlberg M (2013) Echolocation by the harbour
of satellite tracked harbour porpoises (Phocoena porpoise: life in coastal waters. Front Integr Physiol
phocoena). Mar Mamm Sci 29(2):E77–E97 4(52):1–6. https://doi.org/10.3389/fphys.2013.00052
Liu Z, Qi F-Y, Zhou X, Ren H-Q, Shi P (2014) Parallel Møhl B, Wahlberg M, Madsen PT, Heerfordt A, Lund A
sites implicate functional convergence of the hearing (2003) The monopulsed nature of sperm whale clicks. J
gene Prestin among echolocating mammals. Mol Biol Acoust Soc Am 114(2):1143–1154
Evol 31(9):2415–2424. https://doi.org/10.1093/ Moore P, Popper AN (2019) Heptuna’s contributions to
molbev/msu194 biosonar. Acoust Tod 15(1):44–52
Long GR, Schnitzler HU (1975) Behavioral audiograms Moore PWB, Dankiewicz LA, Houser DS (2008)
for the bat, Rhinolophus ferrumequinum. J Comp Beamwidth control and angular target detection in an
Physiol A 100:211–220 echolocating bottlenose dolphin (Tursiops truncatus). J
Madsen PT, Surlykke A (2014) Echolocation in air and Acoust Soc Am 124:3324–3332. https://doi.org/10.
water. In: Surlykke A, Nachtigall PE, Fay RR, Popper 1121/1.2980453
AN (eds) Biosonar, Springer handbook of auditory Morisaka T, Connor RC (2007) Predation by killer whales
research, vol 51. Springer, New York, pp 257–304. (Orcinus orca) and the evolution of whistle loss and
https://doi.org/10.1007/978-1-4614-9146-0 narrow-band high frequency clicks in odontocetes.
Madsen PT, Carder DA, Beedholm K, Ridgway SH (2005) Evol Biol 20. https://doi.org/10.1111/j.1420-9101.
Porpoise clicks from a sperm whale nose – convergent 2007.01336.x
evolution of 130 kHz pulses in toothed whale sonars? Motoi K, Sumiya M, Fujioka E, Hiryu S (2017) Three-
Bioacoustics 15:195–206 dimensional sonar beam-width expansion by Japanese
Madsen PT, Wilson M, Johnson M, Hanlon RT, house bats (Pipistrellus abramus) during natural forag-
Bocconcelli A, Aguilar de Soto N, Tyack PL (2007) ing. J Acoust Soc Am 141(5):EL439. https://doi.org/
Clicking for calamari: toothed whales can echolocate 10.1121/1.4981934
squid Loligo pealeii. Aquat Biol 1:141–150 Nachtigall PE, Supin AY (2008) A false killer whale
Madsen PT, Wisniewska D, Beedholm K (2010) Single adjusts its hearing when it echolocates. J Exp Biol
source sound production and dynamic beam formation 211:1714–1718
in echolocating harbour porpoises (Phocoena Nachtigall PE, Yuen MML, Mooney TA, Taylor KA
phocoena). J Exp Biol 213:3105–3110. https://doi. (2005) Hearing measurements from a stranded infant
org/10.1242/jeb.044420 Risso’s dolphin, Grampus griseus. J Exp Biol 208:
Madsen PT, Lammers MO, Wisniewska D, Beedholm K 4181–4188
(2013) Nasal sound production in echolocating Nachtigall PE, Mooney TA, Taylor KA, Miller LA,
delphinids (Tursiops truncatus and Pseudorca Rasmussen MH, Akamatsu T, Teilmann J,
crassidens) is dynamic, but unilateral: clicking on the Linnenschmidt M, Vikingsson GA (2008) Shipboard
right side and whistling on the left side. J Exp Biol measurements of the hearing of the white-beaked dol-
216(21):4091–4102. https://doi.org/10.1242/jeb.091306 phin, Lagenorhynchus albirostris. J Exp Biol 211:
Manley GA (1990) Peripheral hearing mechanisms in 642–647
reptiles and birds. Springer, Berlin Neuweiler G (2000) The biology of bats (trans: Covey E).
Mann DA, Lu Z, Popper AN (1997) Ultrasound detection Oxford University Press, Oxford
by a teleost fish. Nature 389:341–341 Noren DP, Holt MM, Dunkin RC, Williams TM (2017)
Martin GR, Rojas LM, Ramírez Y, McNeil R (2004) The Echolocation is cheap for some mammals: dolphins
eyes of oilbirds (Steatornis caripensis): pushing the conserve oxygen while producing high-intensity
limits of sensitivity. Naturwissenschaften 91:26–29 clicks. J Exp Mar Biol Ecol 495:103–109
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 455
Norris KS, Møhl B (1983) Can odontocetes debilitate prey NATO ASI Series, vol 156. Plenum Press, New York,
with sound? Am Nat 122(1):85–104 pp 53–60
Norris KS, Prescott JH, Asa-Dorian PV, Perkins P (1961) Rossbach KA, Herzing DL (1997) Underwater
An experimental demonstration of echo-location observations of benthic-feeding bottlenose dolphins
behavior in the porpoise, Tursiops truncatus (Tursiops truncatus) near Grand Bahama Island,
(Montagu). Biol Bull 120:163–176 Bahamas. Mar Mamm Sci 13(3):498–504
Norris KS, Dormer KJ, Pegg J, Liese GJ (1971) The Rydell J, Miller LA, Jensen ME (1999) Echolocation
mechanism of sound production and air recycling in constraints of Daubenton’s bat foraging over water.
porpoises: a preliminary report. In: Paper presented at Funct Ecol 13:247–255
the Proceedings of the eighth conference on the Sales GD, Pye JD (1974) Ultrasonic communication by
biological sonar of diving mammals. Menlo Park, animals. Chapman & Hall, London
California Sanchez L, Ohdachi SD, Kawahara A, Echenique-Diaz
Nørum U, Brinkløv S, Surlykke A (2012) New model for LM, Maruyama S, Kawata M (2019) Acoustic
gain control of signal intensity to object distance in emissions of Sorex unguiculatus (Mammalia:
echolocating bats. J Exp Biol 215(17):3045–3054 Soricidae): assessing the echo-based orientation
Novick A (1959) Acoustic orientation in the cave swiftlet. hypothesis. Ecol Evol 9(5):2629–2639. https://doi.
Biol Bull 117:497–503 org/10.1002/ece3.4930
Obrist MK, Fenton MB, Eger JL, Schlegel PA (1993) Schmidt S (1988) Evidence for a spectral basis of texture
What ears do for bats: a comparative study of pinna perception in bat sonar. Nature 331:617–619
sound pressure transformation in Chiroptera. J Exp Schmidt S, Türke B, Vogler B (1983) Behavioural audio-
Biol 180:119–152 gram from the bat Megaderma lyra (Geoffroy, 1810;
Oliveira C, Wahlberg M, Silva MA, Johnson M, Microchiroptera). Myotis 21–22:62–66
Antunes R, Wisniewska D, Fais A, Madsen PT Schnitzler HU (1968) Die Ultraschallortungslaute der
(2016) Sperm whale codas may encode individuality Hufeisen-Fledermäuse (Chiroptera-Rhinolophidae) in
as well as clan identity. J Acoust Soc Am 139(5): verschiedenen Orientierungssituationen. Z Vgl Physiol
2860–2869 57:376–408
Pedersen SC (1993) Cephalometric correlates of echolo- Schnitzler HU (1973) Control of Doppler-shift compensa-
cation in the chiroptera. J Morphol 218(1):85–98. tion in greater horseshoe bat, Rhinolophus
https://doi.org/10.1002/jmor.1052180107 ferrumequinum. J Comp Physiol 82(1):79–92. https://
Pollak GD (1988) Time is traded for intensity in the bats doi.org/10.1007/Bf00714171
auditory-system. Hear Res 36(2–3):107–124. https:// Schnitzler HU, Flieger E (1983) Detection of oscillating
doi.org/10.1016/0378-5955(88)90054-8 target movements by echolocation in the greater horse-
Popov AV, Supin AY, Wang D, Wang K (2006) Noncon- shoe bat. J Comp Physiol A 153:385–391
stant quality of auditory filters in the porpoises, Schuller G (1977) Echo delay and overlap with emitted
Phocoena phocoena and Neophocaena phocaenoides orientation sounds and Doppler-shift compensation in
(Cetacea, Phocoenidae). J Acoust Soc Am 119(5): bat, Rhinolophus ferrumequinum. J Comp Physiol
3173–3180 114(1):103–114
Price JJ, Johnson KP, Clayton DH (2004) The evolution of Schuller G, Pollack GD (1979) Disproprotionate fre-
echolocation in swiftlets. J Avian Biol 35:135–143 quency representation in the inferior colliculus of
Pye JD (1980) Echolocation signals and echoes in air. In: horseshoe bats: evidence for an “acoustic fovea”. J
Busnel R-G, Fish JF (eds) Animal sonar systems. Ple- Comp Physiol 132:47–54
num Press, New York, pp 309–353 Schwartz C, Tressler J, Keller H, Vanzant M, Ezell S,
Pye JD (1993) Is fidelity futile? The ‘true’ signal is illu- Smotherman M (2007) The tiny difference between
sory, especially with ultrasound. Bioacoustics 4(4): foraging and communication buzzes uttered by the
271–286 Mexican free-tailed bat, Tadarida brasiliensis. J
Rasmussen MH, Miller LA (2002) Whistles and clicks Comp Physiol A 193:853–863
from white-beaked dolphins, Lagenorhynchus Siemers BM, Schauermann G, Turi H, Von Merten S
albirostris, recorded in Faxafloi Bay, Iceland. Aquat (2009) Why do shrews Twitter? Communication or
Mamm 28:78–89 simple echo-based orientation. Biol Lett 5:593–596
Rasmussen MH, Atem ACG, Miller LA (2016) Behavioral Simmons JA, Howell DJ, Suga N (1975) Information
responses by Icelandic white-beaked dolphins content of bat sonar echoes. Am Sci 63:204–215
(Lagenorhynchus albirostris) to playback sounds. Simmons JA, Fenton MB, O’Farrell MJ (1979) Echoloca-
Aquat Mamm 42(3):317–329. https://doi.org/10.1578/ tion and pursuit of prey by bats. Science 203:16–21
AM.42.3.2016.317 Simmons JA, Kick SA, Lawrence BD, Hale C, Bard C,
Ratcliffe JM, Elemans CPH, Jakobsen L, Surlykke A Escudié B (1983) Acuity of horizontal angle discrimi-
(2013) How the bat got its buzz. Biol Lett 9(2). nation by the echolocating bat, Eptesicus fuscus. J
https://doi.org/10.1098/rsbl.2012.1031 Comp Physiol A 153:321–330
Ridgway SH, Carter DA (1988) Nasal pressure and sound Simmons JA, Moss CF, Ferragamo M (1990) Conver-
production in an echolocating white whale, gence of temporal and spectral information into acous-
Delphinaperus leucas. In: Nachtigall PE, Moore tic images of complex sonar targets perceived by the
PWB (eds) Animal sonar: processes and performance,
456 S. M. M. Brinkløv et al.
echolocating bat, Eptesicus fuscus. J Comp Physiol A orca) hearing: auditory brainstem response and behav-
166:449–470 ioral audiograms. J Acoust Soc Am 106:1134–1141
Smyth DM, Roberts JR (1983) The sensitivity of echolo- Teeling E (2009) A molecular and morphological perspec-
cation by the Grey Swiftlet Aerodramus spodiopygius. tive on the evolution of echolocation in bats. J Vertebr
Ibis 125:339–345 Paleontol 29:190a–190a
Snow DW (1961) The natural history of the oilbird, Teeling EC, Springer MS, Madsen O, Bates P, O’Brien SJ,
Steatornis caripensis, in Trinidad, W. I. Part 1. General Murphy WJ (2005) A molecular phylogeny for bats
behavior and breeding habits. Zoologica 46:27–48 illuminates biogeography and the fossil record. Science
Snow DW (1962) The natural history of the oilbird in 307:580–584
Trinidad. W. I. Part II. Population breeding ecology, Thiagavel J, Cechetto C, Santana SE, Jakobsen L, Warrant
and food. Zoologica 27:199–221 EJ, Ratcliffe JM (2018) Auditory opportunity and
Song Z, Xu X, Dong J, Xing L, Zhang M, Liu X, Zhang Y, visual constraint enabled the evolution of echolocation
Li S, Berggren P (2015) Acoustic property reconstruc- in bats. Nat Commun 9(1):98. https://doi.org/10.1038/
tion of a pygmy sperm whale (Kogia breviceps) fore- s41467-017-02532-x
head based on computed tomography imaging. J Thomas JA, Turl CW (1990) Echolocation characteristics
Acoust Soc Am 138(5):3129–3137 and range detection threshold of a false killer whale
Sørensen PM, Wisniewska DM, Jensen FH, Johnson M, (Pseudorca crassidens). In: Thomas JA, Kastelein RA
Teilmann J, Madsen PT (2018) Click communication (eds) Sensory abilities of cetaceans. Plenum Press,
in wild harbour porpoises (Phocoena phocoena). Sci New York, pp 321–334
Rep 8:9702. https://doi.org/10.1038/s41598-018- Thomas JA, Moss CF, Vater M (2004) Echolocation in bats
28022-8 and dolphins. University of Chicago Press, Chicago
Speakman JR, Racey PA (1991) No cost of echolocation Thomassen HA (2005) Swift as sound – design and evo-
for bats in flight. Nature 350:421–423 lution of the echolocation system in Swiftlets
Starkhammar J, Moore PW, Talmadge L, Houser DS (Apodidae: Collocaliini). Leiden University, Leiden
(2011) Frequency-dependent variation in the Thomassen HA, Povel GDE (2006) Comparative and
2-dimensional beam pattern of an echolocating dol- phylogenetic analysis of the echo clicks and social
phin. Biol Lett 7:836–839 vocalisations of swifts and swiftlets (Aves: Apodidae).
Stilz WP, Schnitzler HU (2012) Estimation of the acoustic Biol J Linn Soc 88:631–643
range of bat echolocation for extended targets. J Thomassen HA, Djasim UM, Povel GDE (2004) Echo
Acoust Soc Am 132:1765–1775. https://doi.org/10. click design in swiftlets: single as well as double clicks.
1121/1.4733537 Ibis 146:173–174
Strother GK, Mogus M (1970) Acoustical beam patterns Tomasi TE (1979) Echolocation by the short-tailed shrew,
for bats: some theoretical considerations. J Acoust Soc Blarina brevicauda. J Mammal 60:751–759
Am 48(6):1430–1432 Tomassen HA, Gea S, Maas S, Dirckx JJJ, Decraemer WF,
Suga N, Jen PH-S (1975) Peripheral control of acoustic Povel GDE (2007) Do Swiftlets have an ear for echo-
signals in the auditory system of echolocating bats. J location? The functional morphology of Swiftlets’
Exp Biol 62:277–311 middle ears. Hear Res 225:25–37. https://doi.org/10.
Surlykke A, Kalko EKV (2008) Echolocating bats cry out 1016/j.heares.2006.11.013
loud to detect their prey. PLoS One 3(4):e2036 Urick RJ (1983) Principles of underwater sound, 3rd edn.
Surlykke A, Pedersen SB, Jakobsen L (2009a) McGraw-Hill, New York
Echolocating bats emit a highly directional sonar Vanderelst D, Peremans H, Razak NA, Verstraelen E,
sound beam in the field. Proc R Soc B 276:853–860 Dimitriadis G (2015) The aerodynamic cost of head
Surlykke A, Ghose K, Moss CF (2009b) Acoustic scan- morphology in bats: maybe not as bad as it seems.
ning of natural scenes by echolocation in the big brown PLoS One 10(5):e0118545. https://doi.org/10.1371/
bat, Eptesicus fuscus. JEB 212:1011–1020 journal.pone.0126061
Suthers RA, Hector DH (1982) Mechanism for the produc- Verfuss UK, Miller LA, Pilz PKD, Schnitzler HU (2009)
tion of echolocating clicks by the Grey Swiftlet, Echolocation by two foraging harbour porpoises
Collocalia spodiopygia. J Comp Physiol A 148:457–470 (Phocoena phocoena). J Exp Biol 212:823–834
Suthers RA, Hector DH (1985) The physiology of vocali- Wang Z, Zhu T, Xue H, Fang N, Zhang J, Zhang L, Pang J,
zation by the echolocating oilbird, Steatornis Teeling EC, Zhang S (2017) Prenatal development
caripensis. J Comp Physiol A 156:243–266 supports a single origin of laryngeal echolocation in
Suthers RA, Hector DH (1988) Individual variation in bats. Nat Ecol Evol 1:0021. https://doi.org/10.1038/
vocal tract resonance may assist oilbirds in recognizing s41559-016-0021
echoes of their own sonar clicks. In: Nachtigall PE, Watkins WA, Schevill WE (1977) Sperm whale codas. J
Moore PWB (eds) Animal sonar: processes and perfor- Acoust Soc Am 62:1485–1490
mance. Plenum Press, New York, pp 87–91 Wei C, Au WWL, Ketten DR, Song Z, Zhang Y (2017)
Szymanski MD, Bain DE, Kiehi K, Pennington S, Biosonar signal propagation in the harbor porpoise’s
Wong S, Henry KR (1999) Killer whale (Orcinus (Phocoena phocoena) head: the role of various
12 Echolocation in Bats, Odontocetes, Birds, and Insectivores 457
structures in the formation of the vertical beam. J Wilson M, Wahlberg M, Surlykke A, Madsen PT (2013)
Acoust Soc Am 141(6):4179–4187 Ultrasonic predator-prey interaction in water-
Wei C, Au WWL, Ketten DR, Zhang Y (2018) Finite convergent evolution with insects and bats in air?
element simulation of broadband biosonar signal prop- Front Physiol 4(June):1–12
agation in the near- and far-field of an echolocating Wisniewska DM, Johnson M, Beedholm K, Wahlberg M,
Atlantic bottlenose dolphin (Tursiops truncatus). J Madsen PT (2012) Acoustic gaze adjustments during
Acoust Soc Am 143(5):2611–2620. https://doi.org/ active target selection in echolocating porpoises. J Exp
10.1121/1.5034464 Biol 215:4358–4373
Weissenbacher P, Wiegrebe L (2003) Classification of Wisniewska DM, Ratcliffe JM, Beedholm K, Christensen
virtual objects in the echolocating bat, Megaderma CB, Johnson M, Koblitz JC, Wahlberg M, Madsen PT
lyra. Behav Neurosci 117(4):833–839. https://doi.org/ (2015) Range-dependent flexibility in the acoustic field
10.1037/0735-7044.117.4.833 of view of echolocating porpoises (Phocoena
Wilson M, Acolas ML, Bégout ML, Madsen PT, phocoena). eLife. https://doi.org/10.7554/eLife.05651
Wahlberg M (2008) Allis shad (Alosa alosa) exhibit Wong A, Gall MD (2015) Frequency sensitivity in the
an intensity-graded behavioral response when exposed auditory periphery of male and female black-capped
to ultrasound. J Acoust Soc Am 243:1–5 chickadees (Poecile atricapillus). Zoology 118:357–
363
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license
and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder.
13 The Effects of Noise on Animals

Christine Erbe, Micheal L. Dent, William L. Gannon, Robert D. McCauley, Heinrich Römer,
Brandon L. Southall, Amanda L. Stansbury, Angela S. Stoeger, and Jeanette A. Thomas
increase their call repetition to be heard above the chorus of their conspecifics (Serrano and
Terhune 2001). Similarly, king penguins (Aptenodytes patagonicus; Aubin and Jouventin 1998),
zebra finches (Taeniopygia guttata; Narayan et al. 2007), and big brown bats (Eptesicus fuscus;
Warnecke et al. 2015) communicate in a cacophony of conspecific calls. Animals have evolved
sound production and reception capabilities in natural biotic and abiotic background noise.
However, anthropogenic noise is fairly recent on evolutionary time scales. Researchers have
tried to assess whether existing adaptations are sufficient for animals to deal with
anthropogenic noise.

Anthropogenic noise in terrestrial environments originates from road traffic, trains, aircraft,
industrial sites, energy plants, construction machinery, etc. Anthropogenic noise in aquatic
environments originates from recreational boating, commercial shipping, commercial fishing,
offshore hydrocarbon and mineral exploration, hydrocarbon production, mineral mining, marine
construction, offshore renewable energy production, military activities, etc. Such anthropogenic
sounds, in air or water, have distinct “sound signatures,” and their contributions to the marine
and terrestrial soundscapes are discussed in Chap. 7.

The effects of anthropogenic noise have been studied extensively in humans (Kryter 1994);
however, less is known about how human-generated noise affects other animals. Four edited books
(Brumm 2013; Popper and Hawkins 2012, 2016; Slabbekoorn et al. 2018a) and some journal special
issues (Erbe et al. 2016b, 2019c; Le Prell et al. 2019; Thomsen et al. 2020) compile many
examples outlining the effects of noise. The effects of anthropogenic noise on animals are a
growing concern, having resulted in an exponential increase in the number of research
publications on this topic (Williams et al. 2015).

What are the effects of anthropogenic noise? They can vary from mere auditory sensation, mild and
temporary annoyance, brief behavioral changes, temporary avoidance of an area, and masking to
long-term changes in the usage of important feeding or breeding areas, prolonged stress, hearing
loss, barotrauma (in aquatic species), injury, and ultimately death (Kight and Swaddle 2011). In
addition to such direct effects of noise, there may be indirect effects (e.g., when a prey
species is impacted, leading to reduced prey availability). The effects of noise do not always
have to be negative from the animals’ point of view. In some cases, animals actually use
anthropogenic sounds to their advantage. For example, the sound of a dumpster lid closing in a
campground might indicate a food source to some birds and mammals. Underwater sounds from ships
can increase the settlement, growth rate, and absolute growth of biofouling organisms such as
bryozoans, oysters, calcareous tubeworms, and barnacles (Stanley et al. 2014). Sounds from
fishing vessels may attract birds, seals, and dolphins, which then feed on the bait or catch
(Söffker et al. 2015). This attraction to a food source elicited by anthropogenic noise is called
the “dinner bell effect.”

In terms of the potential negative effects of anthropogenic noise on animals, Fig. 13.1 shows a
generalized view of increasingly severe effects closer to the noise source. Depending on where
the noise source and the receiving animals are located in space, received noise will differ in
spectral and temporal characteristics (see Chaps. 5 and 6 on sound propagation in air and water,
respectively). While there are widely varying sound propagation conditions depending on the
specific environment in which a sound is produced and received, received levels generally
attenuate or decrease as sound propagates from its source. Given that no habitat is acoustically
homogeneous or isotropic, received levels vary with azimuth (direction) and inclination (height
or depth), leading to different impact ranges in all directions.

The absolute range and order of noise impact severity can differ based on features of the
propagation environment, exposure context, and species involved (Ellison et al. 2012). In
general, at the longest ranges, a noise might barely be audible to an animal and may be less
likely to have any negative effect. Audibility of a noise depends on its amplitude and spectrum,
propagation
conditions from the source to the receiver, ambient noise conditions, and hearing abilities of
the animal.

Stress is a physiological response, which might occur at long and short ranges and at low and
high noise levels. Stress can be a direct response to noise (e.g., if a novel noise is suddenly
heard) and an indirect response to noise (e.g., if masking causes stress). Stress can affect
numerous life functions (including immune response, reproductive success, predator avoidance,
etc.; Tarlow and Blumstein 2007).

Acoustic masking might occur over long ranges when a distant noise masks a faint signal. Masking
is the process (and amount) by which the audibility threshold for a sound is raised by the
presence of another sound (i.e., noise; American National Standards Institute 2013).¹ The higher
the noise level is, the greater the masking effect. Masking can interfere with signals important
to animals, such as their social communication calls, mother-offspring recognition sounds,
echolocation signals, environmental sounds, or sounds by predators and prey (Dooling and Leek
2018). The animal’s auditory system splits incoming sound into a series of overlapping bandpass
filters, thus optimizing SNR in the bands occupied by the signal and enabling parallel processing
(Moore 2013). The critical ratio is the most commonly measured parameter related to auditory
masking. It is defined as the mean-square sound pressure of a narrowband signal (e.g., a tone)
divided by the mean-square sound pressure spectral density of the masking noise at a level, where
the signal is just detectable (see Chap. 10 on audiometry; International Organization for
Standardization 2017). There are two categories of masking. Energetic masking occurs when the
masking sound overlaps with the signal in both frequency and time, such that the signal is
inaudible. Informational masking occurs later in the auditory process; the signal is still
audible, but it cannot be disentangled from the masker (Moore 2013).

Somewhat closer to the source, changes in behavior of varying severity might be seen. An animal
might change its orientation, cease prior behavior (e.g., feeding), move away from the source, or
alter its vocal behavior, which may have implications for social functions.

Animals must be closer to sound sources to receive sound levels sufficiently high for
noise-induced hearing loss (NIHL). NIHL results from overstimulation of the sensory cells in the
inner ear, leading to metabolic exhaustion of the hair cells, damage to the organ of Corti, and
in extreme cases, degeneration of retrograde

¹ ANSI/ASA S1.1 & S3.20 Standard Acoustical & Bioacoustical Terminology Database;
https://asastandards.org/asa-standard-term-database/
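
The critical-ratio definition given on this page can be turned into a short calculation. The following is a minimal sketch, not a method from the chapter: the function names and the numerical values are hypothetical and chosen only for illustration. It assumes the tone's mean-square pressure is given in Pa² and the noise spectral density in Pa²/Hz, so that the ratio carries units of Hz and the critical ratio is expressed in dB re 1 Hz.

```python
import math


def critical_ratio_db(signal_mean_square, noise_spectral_density):
    """Critical ratio in dB re 1 Hz.

    signal_mean_square: mean-square sound pressure of the just-detectable
        narrowband signal (Pa^2).
    noise_spectral_density: mean-square sound pressure spectral density of
        the masking noise (Pa^2/Hz).
    """
    if signal_mean_square <= 0 or noise_spectral_density <= 0:
        raise ValueError("both quantities must be positive")
    return 10.0 * math.log10(signal_mean_square / noise_spectral_density)


def masked_threshold_db(noise_density_level_db, cr_db):
    """Level at which a tone is just detectable in noise of the given
    spectral density level, obtained by adding the critical ratio (in dB)."""
    return noise_density_level_db + cr_db


# Hypothetical numbers, for illustration only (not measured data).
cr = critical_ratio_db(1.0e-2, 1.0e-4)      # ratio of 100 Hz, about 20 dB
threshold = masked_threshold_db(80.0, cr)   # about 100 dB for an 80 dB noise density level
```

Written in levels, the division in the definition becomes a subtraction, so a higher noise spectral density level raises the masked threshold by the same number of decibels, consistent with the statement that masking grows with noise level.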
Fig. 13.3 Population Consequences of Acoustic Disturbance (PCAD) model (National Research Council
2005), which links noise exposure from individual to population-level consequences via a series
of stages, connected by transfer functions

Fig. 13.4 Bird’s-eye sketch of different mitigation methods employed in the marine environment to
reduce the risk of noise impacts (Erbe et al. 2018). The offshore, noise-producing platform is
indicated by the black star. It is surrounded by safety zones, which are observed in real time.
MMO: marine mammal observer, who might be on shore, or on the operations platform, or on an
additional vessel. PAM: passive acoustic monitoring using hydrophones, possibly as a towed array.
Operations temporarily reduce power or shut down if animals are detected within these zones and
resume once animals have departed. In addition, modifications might be possible to the source or
its operational parameters. Noise reduction gear (e.g., a bubble curtain around pile driving in
shallow water) is indicated by gray dots. MPA: marine protected area, which might only be
accessible during low-risk seasons

Fig. 13.5 Photograph from Addo Elephant National Park, South Africa, encouraging visitors to
switch off their car engines to limit noise effects on wildlife (courtesy of Cathy Dreyer,
Conservation Manager, Addo Elephant National Park)
Overall, the effects of anthropogenic noise are a challenge to researchers, noise producers, and
policy makers. Often, stakeholders have data from only a few studies on a few species from which
to develop criteria for noise exposure. This chapter gives examples of the effects of noise on a
variety of animal taxa.

13.2 Behavioral Options in a Noisy Environment

When exposed to anthropogenic noise, animals have choices of responses. Behavioral changes are
perhaps the most frequently observed and reported effects of noise. In many cases, such changes
might be an “affordable” adaptation, for example when an animal temporarily moves away from the
noise. The response (or lack thereof) is likely based on a cost-benefit ratio or the cost of
change to improve fitness versus the magnitude of the benefit by changing. Although a variety of
behavioral changes in response to noise have been studied in several species, their implications
for biological fitness are difficult to determine.

13.2.1 Habituation

Animals sometimes habituate to anthropogenic noise. Habituation is a form of learning in which an
animal reduces or ceases its response to a stimulus after repeated presentations; in other words,
the animal learns to stop responding to anthropogenic noise when it learns there are no
significant consequences. Habituation can be difficult to determine in the wild. A lack of
observed behavioral response does not necessarily mean that there was no response or that the
animal habituated; the response might have been too small to be observed, or it was of
physiological type, or the animal’s hearing sensitivity might have been reduced by prior
exposure.

There are many accounts of animals living without apparent detrimental impacts in areas of high
ambient noise, for example small mammals that live and breed along runways, railroad tracks, or
highways. The densities of white-footed mice (Peromyscus leucopus) and eastern chipmunks (Tamias
striatus) did not decrease near roads. While both species were significantly less likely to cross
a road than move the same distance away from roads, traffic volume (and noise level) had no
effect (McGregor et al. 2008). Wale et al. (2013b) investigated the physiological responses of
shore crabs (Carcinus maenas) to single and multiple ship-noise playbacks. Crabs consumed more
oxygen, indicative of a higher metabolic rate and potential stress, when exposed to ship noise
compared to ambient noise. However, repeated exposures to ship noise showed no change. The
authors proposed that crabs exhibited the maximum response on the first exposure to ship noise,
then habituated or became tolerant of the noise.

Even when no behavioral response is detectable, animals might accept noise exposure at levels
that could have long-term hearing impacts, especially if there are benefits of sticking around.
For example, each winter endangered manatees (Trichechus manatus) congregate around power plants
in Florida likely in order to stay in the warm water effluence produced by the plant. In the
process, they are potentially exposed to high levels of underwater noise for long periods.
Seemingly, the benefit of the warm water outweighs the cost of noise exposure (JA Thomas, pers.
obs.). Similarly, seals depredating at aquaculture sites might accept hearing loss inducing noise
levels from acoustic harassment devices or “seal scarers” (Coram et al. 2014).

13.2.2 Change of Behavior

Temporary behavioral responses have been reported for gray whales that took a somewhat wider
route around the noise from offshore oil drilling platforms, while continuing their normal
round-trip migration from Alaska to Mexico (Malme et al. 1984). Such a subtle response likely
won’t have any long-term impact on fitness. Harbour porpoises (Phocoena phocoena), on the other
hand, have been shown to forage almost
continuously around the clock and hence even moderate occurrences of anthropogenic disturbance
might have significant fitness consequences (Wisniewska et al. 2016).

A permanent displacement from habitat has been suggested in egrets (Ardea alba) and great blue
herons (Ardea herodias), judged by the altered distribution of nests along the Mississippi River,
potentially in response to increased vessel traffic, such as tugboats and barges (JA Thomas,
pers. obs.). A long-term displacement lasting six years occurred in killer whales (Orcinus orca)
in response to acoustic harassment devices installed in parts of their habitat. Whales returned
when devices were removed (Morton and Symonds 2002).

Noise affects not only animal movement but also other behaviors. Chaffinches (Fringilla coelebs)
reduced their food pecking during increased background noise, which increased their vigilance;
however, the increased alertness and hence reduction in predation risk might have reduced fitness
via the reduction in food intake (Quinn et al. 2006). Similarly, California ground squirrels
(Otospermophilus beecheyi) showed increased vigilance near wind turbines, potentially at the cost
of other behaviors (Rabin et al. 2006). In the marine environment, anthropogenic noise interfered
with the predator-prey relationship. Motorboat noise elevated metabolic rate in prey fish, which
then responded less often and less rapidly to predation attempts. Predator fish consumed more
than twice as much prey during boat noise exposure (Simpson et al. 2016).

Reinforcing an acoustic communication message with a visual display can enhance communication in
a noisy environment. For example, male foot-flagging frogs (Dendropsophus parviceps) live in
neotropical areas with fast-flowing streams, high levels of rain, and numerous other species of
calling frogs. Foot-flagging frogs evolved the visual signal of stretching out one or two hind
legs, vibrating their feet, or stretching out their toes while calling, assisting with their
communication (Amézquita and Hödl 2004).

13.2.3 Change of Acoustic Signaling

Vocal behaviors can also change in response to noise. To reduce interference from urban daytime
noise, chaffinches sang earlier in the day and European robins (Erithacus rubecula) changed vocal
activities to nighttime (Bergen and Abs 1997; Fuller et al. 2007). The cost of this change in
vocal behavior is unknown. Animals might also change the characteristics of their sounds to avoid
masking. Changes in vocal effort such as increases in amplitude, repetition rate, and duration,
or frequency shifts are collectively known as the Lombard effect, which has been demonstrated in
several taxa, including frogs (Halfwerk et al. 2016), birds (Slabbekoorn and Peet 2003), and
cetaceans (Scheifele et al. 2005). The Lombard effect has also been observed during odontocete
echolocation: A captive beluga whale (Delphinapterus leucas) increased the amplitude and
frequency of its echolocation signal when moved from a quiet habitat in San Diego to an area with
high snapping shrimp noise in Hawaii (Au et al. 1985).

Some animal taxa might be limited in their ability to voluntarily and temporarily change the
spectrographic features of their sounds (often called behavioral plasticity). Insects, for
example, generate sound by stridulation of body parts, the resonance of which cannot be actively
controlled. Consequently, a Lombard effect failed to be observed in Oecanthus tree crickets
(Costello and Symes 2014); however, grasshoppers (Chorthippus biguttulus) from noisy habitats or
those exposed to noise as nymphs produced higher-frequency sounds with higher duty cycles (i.e.,
increased sound-to-pause ratio), indicating developmental plasticity (Lampe et al. 2012, 2014).

A cessation of sound emission in the presence of anthropogenic noise can also occur. Thomas et
al. (2016) studied the effects of construction noise on yellow-cheeked gibbons (Nomascus
gabriellae) at Niabi Zoo. Before construction, a bonded pair and their four-year-old offspring
were quite soniferous. The pair commonly duetted in the early morning and displayed behaviors typical of a bonded pair. Once construction near their exhibit commenced, they gradually vocalized less often, and by the end of the four-month construction period, the pair bond had dissolved and the young became ill (possibly due to decreased quality of care with the loss of parent pair bond). For about a year, the pair remained distant from each other and did not vocalize. One of the authors (JA Thomas) played back recordings of the pair’s own duet and those of wild gibbons. Already during the first playback, the pair slowly started to vocalize and move to the top of the exhibit where they normally performed their duet. They vocalized in response to their own duet as opposed to playbacks of other gibbon duets. The pair continued duetting for several more years of observation.

13.3 Physiological Effects

In addition to eliciting changes in fine- or gross-motor behavior and acoustic behavior, sound can also cause physiological impacts, like stress, hearing loss, or injury to tissues and organs. An animal with impaired hearing might exhibit different responses to sound and different acoustic behavior, compared to an animal with normal hearing.

A stress response may occur when noise is loud, novel, or unexpected (Wale et al. 2013a, b). Studies often concentrate on the effects of noise-induced stress on reproduction. However, stress also can result in: (1) a reduction or cessation of normal movement, with a reduced likelihood of escaping a predator; (2) reduced appetite, feeding, or food acquisition; and (3) excessive anti-predation behaviors. Attention is required to capture prey or avoid detection by a predator. Many animals use auditory cues to detect the presence of predators or prey, and any noise-induced distraction could limit this detection (Siemers and Schaub 2011). Chan et al. (2010) termed this the “distracted prey hypothesis”.

The consequences of elevated stress levels can be far-reaching. Tarlow and Blumstein (2007) reviewed the effects of increased stress in birds resulting from human disturbances. The review documented changes in hormone levels, changes in heart rate, immunosuppression, changes in flight-initiation distance, disturbed breeding success, altered mate choice, and fluctuating anatomical asymmetry, all as a result of stress. While there have not been many long-term studies of noise-induced, chronic stress in animals, there is plenty of evidence from humans documenting, for example, hypertension and cardiovascular disease (Bolm-Audorff et al. 2020; Hahad et al. 2019; World Health Organization 2011).

Noise can further affect other non-acoustic sensing and information use (termed cross-modal impacts). For example, road noise impacted the ability of mongooses (Helogale parvula) to smell predator feces, leaving these mammals more susceptible to predation and loss of group cohesion (Morris-Drake et al. 2016).

The effects of noise are complex and they differ by species. The following sections describe observed responses to sound by different taxa.

13.4 Noise Effects on Marine Invertebrates

Marine invertebrates comprise a great diversity of fauna with a corresponding diversity of sensory systems and modes of detecting sound or vibration. Only a few publications exist on the impacts of underwater sound on marine invertebrates.

13.4.1 Marine Invertebrate Hearing

Invertebrate species exhibit a diversity of sensory systems for detecting sound and vibration. Many crustaceans and molluscs have acoustic sensory systems that are an analogue to the fish otolith hearing system as they contain statocysts. These are small organs that house a dense mass (i.e., a statolith), which moves in response to sound and thus drives sensory hair cells, which create the
et al. 2004). Statocyst hair cell damage was found in cephalopods (cuttlefish and squid) subjected to simulated sonar sweeps in a laboratory tank (André et al. 2011; Solé et al. 2013; Fig. 13.6).

13.4.2.2 Scallops

Scallops (Pecten fumatus) exhibited behavioral changes as a result of exposure to a 150-in³ airgun, which continued during the full 120-day post-exposure monitoring, suggesting damage to the statocyst organ, which controls balance (Day et al. 2016a, 2017). Physiological measures changed for the worse and mortality increased with dose from 1 to 4 passes of the airgun (Day et al. 2016a, 2017). A different study failed to find any significant effects of seismic airguns on scallops (Parry et al. 2002); however, animals had been removed from their seafloor habitat and were suspended in lantern nets in the water column where they would not have experienced substrate-borne and interface (i.e., at the seafloor) sound and vibration. Also, physiological measurements and long-term monitoring were not conducted. Przeslawski et al. (2018) made observations of wild scallops exposed to seismic airguns and found no discernible impacts, but the study had insufficient controls and no physiological measurements, and longer-term post-exposure sampling was not undertaken.

13.4.2.3 Crustaceans

Spiny lobsters (Jasus edwardsii) were exposed to single passes of a 45 or 150-in³ airgun and monitored for 365 days after exposure (Day et al.
impact range given by McCauley et al. (2017) was within the repeat range (400–800 m) within which a 3D seismic survey vessel would pass on an adjacent seismic line, so that the entire survey area could have its plankton field degraded.

Richardson et al. (2017) ran ecological models to assess the scale of this impact. Assuming an area of strong tidal currents and consistent ocean current, a 3-day copepod turnover rate, and a three-fold increase in copepod mortality within 1.2 km, the copepod plankton field was modeled to recover within three days of completion of a mid-size 3D seismic survey. However, when Richardson et al. (2017) reduced the strength of the currents in the model, the impact persisted for three weeks. Many larger zooplankton have a longer than 3-day turnover rate (i.e., weeks to months) with larval forms having a once or twice per year recruitment cycle, enhancing impacts above the published model output. Given the central role zooplankton play in the ocean ecosystem, and given that not all turn over rapidly, the results of McCauley et al. (2017) are of concern for ocean health.

100 kHz. Signaling at these frequencies is important for mate attraction and localization, rivalry, and spacing of individuals within populations. In addition, many species use their ears to detect and avoid predators. Some species of flies eavesdrop on calling insects to locate and parasitize them.

An evolutionary adaptation to ambient noise from competing insect choruses is the modification of peripheral sensory filters, such as the sharpening of tuning in the cricket (Fig. 13.7). Such sharp tuning curves reduce the amount of masking noise within the filter (Schmidt et al. 2011).

However, the most prevalent form of insect communication involves substrate-borne sound. More than 139,000 described taxa are expected to exclusively use vibrational signaling and an additional 56,000 taxa use a combination of vibrational communication and other forms of mechanical signaling (Cocroft and Rodríguez 2005). The sensory organs monitoring substrate-borne sound (e.g., the subgenual organs in the legs) are tuned to frequencies below 1 kHz and are extremely sensitive.
Anthropogenic noise sources produce significant amplitudes of air-borne sound at frequencies from less than 10 Hz to 50 kHz (e.g., traffic on roads and railways, compressors, wind turbines, military activities, and urban environments). At the same time, airport, road, and railroad traffic and construction are significant sources of low-frequency, substrate-borne vibrations below 1 kHz. Such substrate-borne noise may be created directly by vibrating the substrate (e.g., by driving over it) or indirectly via air-borne noise that induces vibrations in the substrate. The relatively low-frequency sound produced by many of these sources suffers less attenuation and can thus travel farther from the source. Because many insects have very sensitive receptors for substrate-borne sound, with displacement thresholds less than 1 nm, they are likely to detect anthropogenic sources over long distances. Anthropogenic noise may therefore have a significant impact on the ability of insects to communicate and listen in both the air-borne and substrate-borne channel (reviewed by Morley et al. 2014; Raboin and Elias 2019).

13.5.2 Behavioral Effects

Anthropogenic noise may impact insects in various ways. It can mask communication signals, increase stress, affect larval development, and ultimately decrease lifespan (reviewed by Raboin and Elias 2019). The most common consequence of noise is masking, when noise overlaps in time and frequency with a signal. This decreases the signal-to-noise ratio and thus the detection and/or discrimination of signals. For example, Schmidt et al. (2014) found that anthropogenic noise resulted in less effective female cricket orientation toward signaling males (phonotaxis: orientated movement in relation to a sound source), which, in crickets, is the usual way to bring the sexes together. In another cricket species, males shortened their calls and paused singing with increasing noise level. However, males did not adjust the duration of intervals between song elements important for species identification (Orci et al. 2016). Apparently, these insects can neither modify the fundamental frequency of their song nor increase the amplitude of their calls in noise (i.e., lack of a Lombard effect), as do some species of frogs and birds, to reduce masking by anthropogenic noise.

For insects using substrate-borne signals, experimentally induced noise may disrupt mating. Insects either respond less frequently to signals of the opposite sex, or they cease signaling during the initial part of communication (Polajnar and Čokl 2008). The fact that noise can disrupt substrate-borne communication between the sexes may be utilized in pest control in agriculture (Polajnar et al. 2015). For example, substrate-borne noise can mask the mating signals of species of leafhoppers, which represent a major pest in vineyards, resulting in reduced reproductive success. A similar approach was successful with pine bark beetles, when the substrate-borne noise spectrally overlapped with beetle signals (Hofstetter et al. 2014).

The failure to adjust the frequency or amplitude of mating signals in noise does not, however, exclude other means of behavioral plasticity. For example, the responses of male field crickets (Gryllus bimaculatus) to traffic noise depended on prior experience (Gallego-Abenza et al. 2019). Recordings of car noise were played back to males living at different ranges from the road and, therefore, with different prior experience of road noise. Males farther from the road decreased their chirp rate more than those nearer by, suggesting that “behavioral plasticity modulated by experience may thus allow some insect species to cope with human-induced environmental stressors” (Gallego-Abenza et al. 2019).

Developmental plasticity may also manifest in signal modifications in response to noise. The courtship signals of grasshoppers are more broadband in frequency than those of crickets. Specifically, male grasshoppers (Chorthippus biguttulus) from roadside habitats produced higher-frequency signals compared to grasshoppers in quieter habitats (Lampe et al. 2014). In an experiment that reared half of the grasshopper nymphs in a noisy environment and the other half in a quiet environment, adult males from the first group produced signals with higher-frequency components, suggesting that developmental plasticity allows signal modifications in noisy habitats.

13.5.3 Physiological Effects

Strong anthropogenic noise can result in hearing loss. Auditory receptors in the locust ear showed a decreased ability to encode sound after noise exposure. The mechanism for such hearing loss reveals striking parallels with that of the mammalian auditory system (Warren et al. 2020). A series of experiments was conducted to determine whether exposure to simulated road traffic noise induces increased heart rates, as an indicator of a stress response (Davis et al. 2018). Larvae of the monarch butterfly (Danaus plexippus) exposed for 2 h to road traffic noise experienced a significant increase in heart rate, indicative of stress. Because these larvae do not have ears for air-borne sound, the likely sensory pathway involved vibration receptors. However, exposing larvae for longer periods (up to 12 days) to continuous traffic noise did not increase heart rate at the end of larval development; so chronic noise exposure may result in habituation or desensitization. However, habituation to stress during larval stages may impair reactions to stressors in adult insects.

While more research is necessary to understand the sensory strategies for avoiding or compensating for anthropogenic noise, there are some cases where insects experience a significant fitness advantage. This may happen in a predator-prey or parasitoid-host relationship, when the noise decreases the ability of a parasitoid fly to localize calls of their host crickets (Lee and Mason 2017), or when bats as predators of flying insects are less efficient foragers in the presence of anthropogenic noise (Siemers and Schaub 2011).

Reptiles have both aquatic (sea turtles, alligators, and crocodiles) and terrestrial (geckos, snakes, iguanas, whiptails, chameleons, gila monsters, monitors, and bearded dragons) species. Soniferous reptiles include some snakes, alligators, crocodiles, geckos, and freshwater and marine turtles (e.g., Young 1997).

Reptiles are surrounded by anthropogenic noise from traffic (in water, on land, and in air), construction, mineral and hydrocarbon exploration and production, etc. Because many anthropogenic noise sources are low in frequency and thus within the reptilian hearing range, understanding the impact of these sources on behavior and physiology is an important start for reptile conservation.

Little literature exists on the impacts of anthropogenic noise on reptiles, with sea turtles having received recent attention. Simmons and Narins (2018) recently reviewed the topic. Currently, little is known about how eggs and juvenile reptiles respond to anthropogenic noise. As a result, this section concentrates on adult sea turtles as a representative of reptiles.

Acoustic signals play an important role in turtle social behavior and reproduction. Turtles make very-low-frequency calls of short duration by swallowing or by forcibly expelling air from their lungs. Galeotti et al. (2005) published a summary of sound occurrence, context, and usage in Cryptodira chelonians, a taxon that is quite soniferous. In general, turtles call when mating or seeking a mate, when they are sick or in distress, or for other reasons. Male red-footed tortoises (Chelonoidis carbonaria) make a clucking sound during mounting, Greek tortoises (Testudo graeca) whistle during combat, and young big-headed turtles (Platysternon megacephalum) squeal when disturbed (Galeotti et al. 2005). Nesting female leatherback sea turtles (Dermochelys coriacea) make a belching sound (Cook and Forrest 2005; Mrosovsky 1972), and the sounds from leatherback sea turtle eggs are believed to help coordinate hatching (Ferrara et al. 2014).

Not all reptiles produce sound for communication. Most reptiles can detect substrate-borne vibrations (e.g., Barnett et al. 1999; Christensen
et al. 2012). The auditory anatomy of most reptile species includes a tympanic membrane near the rear of the head, a middle ear with a stapes, and a fluid-filled inner ear housing the lagena and its sound-sensing cells (Wever 1978). Brittan-Powell et al. (2010) indicated that reptile hearing is similar in frequency range to hearing in birds and amphibians. The most sensitive lizards have similar absolute sensitivities to birds. Ridgway et al. (1969) used electrophysiological methods to test hearing abilities of the green sea turtle (Chelonia mydas) and found peak sensitivity between 300 and 400 Hz, with the best hearing range from 60 to 1000 Hz. In general, the best frequency range of hearing in chelonids (turtles, tortoises, and terrapins) is 50–1500 Hz (Popper et al. 2014).

Sea turtles may be exposed to acute and chronic noise. The soundscape of the Peconic Bay Estuary, Long Island, NY, USA, a major coastal foraging area for juvenile sea turtles, was recorded during sea turtle season. There was considerable boating and recreational activity, especially between early July and early September. Samuel et al. (2005) suggested that increasing and chronic exposure to high levels of anthropogenic noise could affect sea turtle behavior and ecology. Indeed, loggerhead sea turtles have been shown to dive when exposed to seismic airgun noise, perhaps as a means of avoidance (DeRuiter and Larbi Doukara 2012). In the terrestrial world, desert tortoises (Gopherus agassizii) exposed to simulated jet overflights did not show a startle response or increased heart rate, but they froze; and in response to simulated sonic booms, they exhibited brief periods of alertness (Bowles et al. 1999).

Unfortunately, there is a complete lack of data on masking of biologically important signals in sea turtles and other reptiles by anthropogenic noise (Popper et al. 2014). Similarly, there has been little research on physiological effects of noise in reptiles.

13.7 Noise Effects on Amphibians

Frogs rely heavily on acoustic communication for mating. Noise has been shown to alter both the production and perception of frog vocalizations. This can have serious implications for reproduction in these animals. Males that do not call as often will not attract females to their locations along a pond edge. Females that do not hear the advertisement calls from the males will not be able to localize or approach them. Further, they will not be able to sample multiple males for selection of the most attractive one. Studies have been conducted in both the laboratory and the field to determine the effects of noise on acoustic communication in frogs, for both vocal production and auditory perception.

The amphibian ear consists of a tympanic membrane on the outside through which sound enters the ear, a middle ear containing a columella, similar to the mammalian stapes, that provides mechanical lever action, and an inner ear in which sound is converted to neural signals (Wever 1985). The inner ear contains two papillae: the amphibian papilla, which responds to lower frequencies, and the basilar papilla, which responds to higher frequencies. Audiograms show good sensitivity between 100 Hz and a few kHz (e.g., Megela-Simmons et al. 1985). Some species, however, exhibit sensitivity also to ultrasound (Narins et al. 2014), and others to infrasound (Lewis and Narins 1985).

13.7.2 Behavioral Responses to Noise

Some species of frogs, like other animals, are known to avoid roads and highways, possibly to avoid both traffic mortality and a reduced transmission of vocal signals (reviewed by Cunnington and Fahrig 2010). Several studies, however, failed to document behavioral avoidance of noise by frogs or did not find reduced
frog abundance near continuous noise sources such as highways (Herrera-Montes and Aide 2011).

Nonetheless, noise does affect the perception of acoustic signals by frogs. Bee and Swanson (2007) investigated the potential of noise from road traffic to interfere with the perception of male gray treefrog (Dryophytes chrysoscelis) signals by females. Using a phonotaxis assay, they presented females with a male advertisement call at various signal levels (37–85 dB re 20 μPa) in three masking conditions: (1) no masking noise, (2) a moderately dense breeding chorus, and (3) road traffic noise recorded in wetlands near major roads. In both the chorus and traffic noise maskers, female response latency increased, orientation behavior toward the signal decreased, and response thresholds increased by about 20–25 dB. The authors concluded that realistic levels of traffic noise could limit the active space, or the maximum transmission distance, of male treefrog advertisement calls. Another treefrog (Dendropsophus ebraccatus) tested in a laboratory to compare the effects of dominant frequency and signal-to-noise ratio on call perception showed a low-frequency call preference in quiet conditions (usually correlated with larger, more attractive males), but no preference at higher signal-to-noise ratios (Wollerman and Wiley 2002). These results indicate that females listening to males in a noisy environment will likely make errors in mate choice.

Sun and Narins (2005) examined the effects of fly-by noise from airplanes and played back low-frequency sound from motorcycles to an assemblage of frog species in Thailand. Three of the most acoustically active species (Microhyla butleri, Sylvirana nigrovittata, and Kaloula pulchra) decreased their calling rate and the overall intensity of the assemblage calls decreased. However, calls from another frog (Hylarana taipehensis) seemed to persist. The authors suggested that the anthropogenic noise suppressed the calling rate of some species, but seemed to stimulate calling behavior in H. taipehensis. Another study found that the vocalization rate of European treefrog (Hyla arborea) decreased in traffic noise (Lengagne 2008). Barber et al. (2010) believed that these frogs were unable to adjust the frequency or duration of their calls to increase signal transmission. Penna et al. (2005) found a similar decrease in call rate in leptodactylid frogs (Eupsophus calcaratus) exposed to recordings of natural noise in the wild.

An effective way to increase the likelihood that acoustic signals will be received is to increase the intensity of those signals (Lombard effect). Love and Bee (2010) measured the intensities of vocalizations produced in the laboratory by Cope’s gray treefrog (Dryophytes chrysoscelis) in the midst of different levels of background noise, similar to a frog chorus. They found no evidence for the existence of the Lombard effect in their frogs. Frogs produced calls at a level of 92–93 dB re 20 μPa, regardless of noise level. Similar to findings from other frogs, Cope’s gray treefrogs increased call duration and decreased call rate with increasing noise levels. However, they appeared to be maximizing their call amplitudes in every calling situation, which does not allow them to increase their call intensities further when needed. On the contrary, túngara frogs (Engystomops pustulosus) and rhacophorid treefrogs (Kurixalus chaseni) did increase their call levels in noise (Halfwerk et al. 2016; Yi and Sheridan 2019).

Another possible way for a frog to increase communication efficacy would be to increase the frequencies of their calls to be above the frequency of the masking noise. Parris et al. (2009) found that two species of frogs (southern brown treefrog, Litoria ewingii, and common eastern froglet, Crinia signifera) called at a higher frequency in traffic noise (e.g., 4.1 Hz/dB for L. ewingii), and suggested this was an adaptation to be heard over the noisy environmental conditions. An extreme form of this frequency-increasing behavior has been discovered in concave-eared torrent frogs (Odorrana tormota) in China (Feng and Narins 2008). These frogs live near extremely loud streams and waterfalls (58–76 dB re 20 μPa, up to 16 kHz), which should make vocalizations difficult for other frogs to hear, at least at the lowest frequencies. The calls from these frogs are quite different from the
Fig. 13.8 Spectrograms, waveforms, and call spectra from six vocalizations from the O. tormota frog (Feng and Narins 2008). Reprinted by permission from Springer Nature. A. S. Feng and Narins, P. M. Ultrasonic communication in concave-eared torrent frogs (Amolops tormotus). Journal of Comparative Physiology A, 194(2), 159–167; https://link.springer.com/article/10.1007/s00359-007-0267-1. © Springer Nature, 2008. All rights reserved
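
The notion of active space used by Bee and Swanson (2007) can be made concrete with a rough calculation. The sketch below assumes simple spherical spreading (transmission loss of 20 log10 r, with no excess attenuation), which is an illustrative assumption and not the propagation model of that study. Under it, a rise in masked threshold of D dB shrinks the maximum detection distance by a factor of 10^(D/20). The function name is hypothetical.

```python
def active_space_reduction_factor(threshold_increase_db: float) -> float:
    """Factor by which the maximum detection distance shrinks.

    Assumes spherical spreading only (received level falls by
    20*log10(r)); real habitats add excess attenuation, so this is
    an illustrative approximation, not a result from the study.
    """
    return 10.0 ** (threshold_increase_db / 20.0)


# Threshold increases of about 20-25 dB, as reported for masked treefrog calls
for delta_db in (20.0, 25.0):
    factor = active_space_reduction_factor(delta_db)
    print(f"+{delta_db:.0f} dB threshold -> detection distance / {factor:.1f}")
```

Under this assumption, a threshold increase of 20–25 dB corresponds to a detection distance roughly 10 to 18 times shorter; in cluttered or absorptive habitats the actual reduction would differ.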
(Tennessen et al. 2014), although a recent study suggests that eggs taken from high traffic noise conditions yielded frogs that were less affected by noise exposure than frogs from eggs taken from low traffic noise environments, suggesting adaptations are possible (Tennessen et al. 2018). Whether it is from the stress or the masking of the acoustic signals, anthropogenic noise has been shown to have negative consequences.

13.8 Noise Effects on Fish

All fish species studied to date can detect sound. Hundreds of species are known to emit sound with the most prominent display of sound production in fishes being their choruses on spawning grounds (Slabbekoorn et al. 2010). Adult, juvenile, and larval-stage fishes actively use environmental sound to orientate and settle (Jeffrey et al. 2002; Simpson et al. 2005, 2007). Herring (Clupea harengus) have shown avoidance behavior to playbacks of sounds of killer whales, one of their predators (Doksaeter et al. 2009). Underwater anthropogenic noise can have a variety of effects on fish, ranging from behavioral changes, masking, stress, and temporary threshold shifts, to tissue and organ damage, and death in extreme cases (Hawkins and Popper 2018; Normandeau Associates 2012; Popper and Hastings 2009). Mortality can also result from an increased risk of predation in noisy environments (Simpson et al. 2016). Despite the growing amount of literature, our understanding of the cumulative effects of multiple exposures and the fitness implications to wild fish is limited.

13.8.1 Fish Hearing

Fish have two systems for detecting sound and vibration: the inner ear and the lateral line system. The inner ear of fish resembles an accelerometer. It contains otoliths, which are bones of approximately three times the water density. Water-borne acoustic waves therefore result in differential motion between the otoliths and the fish’s body, thus bending hair cells coupled to the otoliths of the inner ear, which sends neural signals to the brain. The inner ear is sensitive to particle motion. Fish with swim bladders close to or even connected to the ears are also sensitive to acoustic pressure. This is because the sound pressure excites the gas bladder, which reradiates an acoustic wave that drives the otolith. Particle motion then creates differential movement between the otoliths and the rest of the ear. The lateral line system involves neuromasts that detect water flow and acoustic particle motion. Due to variability in otolith anatomy and the absence or presence and variable connectivity of swim bladders, fish hearing varies greatly with species in terms of sensitivity and bandwidth, with most species sensitive to somewhere between 30 and 1000 Hz, but some species detecting infrasound, and others ultrasound up to 180 kHz (Popper and Fay 1993, 2011; Tavolga 1976). Hearing in noise has been studied and parameters such as the critical ratio (signal-to-noise ratio for sound detection, see Chap. 10) have been measured (Fay and Popper 2012; Tavolga et al. 2012); however, the significance of acoustic masking to fish fitness and survival remains poorly understood.

13.8.2 Behavioral Responses to Noise

The schooling behavior of fish has been observed to change in response to an approaching airgun with fish swimming faster, deeper in the water column, and in tighter schools (Davidsen et al. 2019; Fewtrell and McCauley 2012; Neo et al. 2015; Pearson et al. 1992). Caged fish compacted near the center of the cage floor at received levels of 145–150 dB re 1 μPa²s and swimming behavior returned to normal after 11–31 min (Fewtrell and McCauley 2012). A startle response was noted when the airgun was discharged at close range (Pearson et al. 1992), but not when the received level was ramped up by approaching from a longer range; also, the startle response diminished over time (Fewtrell and McCauley 2012). Wild pelagic and mesopelagic species dove deeper and their abundance increased at long range from the airgun array (Slotte et al. 2004). There are a few studies documenting a drop in catch rates of pelagic fish after seismic surveying (Engås and Løkkeborg 2002; Engås et al. 1996; Slotte et al. 2004), believed to be due to behavioral responses.

Hawkins et al. (2014) played pile driving noise to wild zooplankton and fish. A loudspeaker was deployed from one boat for sound transmission, while an echosounder and side-scan sonar were deployed from a second boat for animal observation (Fig. 13.9a). Zooplankton dropped in depth below the sea surface after playback onset as shown by the echogram in Fig. 13.9b. Wild sprat (Sprattus sprattus) and mackerel (Scomber scombrus) exhibited a diversity of responses including break-up of aggregations and reforming of much denser aggregations in deeper water. The sprat is sensitive to sound pressure, whereas the mackerel lacks a swim bladder and is sensitive to particle motion. The occurrence of behavioral responses increased with the received level. The 50% response thresholds were 163.2 and 163.3 dB re 1 μPa pk-pk and 135.0 and 142.0 dB re 1 μPa²s (single-strike exposure) for sprat and mackerel, respectively (Hawkins et al. 2014; Fig. 13.10).

Fig. 13.9 (a) Experimental setup to study fish responses to playbacks of pile driving sound. (b) Echogram of zooplankton dropping in depth below sea surface during playback of pile driving sound (red ellipses). Time is along the x-axis; playback started at the 1st vertical black line, stopped at the 2nd line, restarted at the 3rd line, and stopped at the 4th line (modified from Hawkins et al. 2014). © Acoustical Society of America, 2014. All rights reserved

13.8.3 Effects of Noise on the Auditory and other Systems

After exposure to intense pulsed sound from airguns, extensive hearing damage in the form of ablated or missing hair cells was found in pink snapper (Pagrus auratus) (McCauley et al. 2003a, b). Other studies have found only limited or no hearing damage or threshold shift in various species of fish from airgun exposure (Hastings and Miksis-Olds 2012; Popper et al. 2005; Song et al. 2008). Apart from the typical differences in experimental setup, exposure regime, and species tested, a factor influencing the degree of noise impact might be the direction from which sound is received (specifically, vertical versus horizontal incidence; McCauley et al. 2003a). Fish ears are not symmetrical and many anthropogenic sound sources have a strong vertical directionality under water due to their near-surface deployment leading to a dipole sound field.

Halvorsen et al. (2012, Fig. 13.11) looked for tissue and organ damage in Chinook salmon (Oncorhynchus tshawytscha) that were placed inside a standing-wave test tube (High-Intensity
13 The Effects of Noise on Animals 479
Fig. 13.10 Dose-response curves (solid lines) and 95% pile driving (modified from Hawkins et al. 2014).
confidence intervals (dashed lines) of (a) sprat and # Acoustical Society of America, 2014. All rights
(b) mackerel to peak-to-peak sound pressure levels from reserved
Fig. 13.11 Chinook salmon injuries from noise hemorrhage (Halvorsen et al. 2012). # Halvorsen et al.;
exposure. Mild: (a) eye hemorrhage, (b, c) fin hematoma. https://journals.plos.org/plosone/article?id¼10.1371/jour
Moderate: (d) liver hemorrhage and (e) bruised swim nal.pone.0038968; licensed under CC BY 4.0; https://
bladder. Mortal: (f) intestinal hemorrhage and (g) kidney creativecommons.org/licenses/by/4.0/
Controlled Impedance Fluid-filled wave Tube, HICI-FT) in which pressure and particle motion could be controlled. Physical injury commenced at 211 dB re 1 μPa²s cumulative sound exposure, resembling 1920 strikes of a pile driver at 177 dB re 1 μPa²s each.
Yelverton (1975) conducted studies of the gross effects of sounds generated from underwater explosive blasts on fish. He found three important factors that influenced the degree of damage: the size of the fish relative to the wavelength of the sound, the species’ anatomy, and the location of the fish in the water column relative to the sound source.
13.9 Noise Effects on Birds
Birds rely heavily on acoustic communication for life functions such as warning others about predators, finding and assessing the quality of mates, defending territories, and discerning which youngster to feed (Bradbury and Vehrencamp 2011). When environmental noise levels are high, such functions become difficult or impossible, unless the birds can make temporary or permanent adjustments to their signal, posture, or location. There have been several studies on the effects of noise on survival and communication in birds in the field as well as the laboratory, and on the ways that birds adjust their communication signals and/or lifestyles to adapt to the noisy modern world.
13.9.1 Bird Hearing
The avian ear has three main parts: an outer, middle, and inner ear. The outer ear is typically hidden by feathers, but consists of a small external meatus. A tympanic membrane separates the outer and middle ear. The middle ear contains the columella that mechanically transmits sound to the inner ear, and a connected interaural canal to aid in directional hearing. The basilar papilla in the inner ear converts sound into neural signals. Most birds hear between 50 Hz and 10 kHz, with some species’ hearing extending into the infrasonic range (Dooling et al. 2000).
13.9.2 Behavioral Responses to Noise
Several studies have demonstrated that some birds are affected by low-frequency (<3 kHz) anthropogenic noise from roadways and that long-term exposure can lead to lower species diversity or lower breeding densities in an area (reviewed by Goodwin and Shriver 2011; Reijnen and Foppen 2006). Urban noise is known to affect reproduction and mating behaviors of birds in several ways. Urban noise can mask acoustic components of the lekking display by male greater sage grouse (Centrocercus urophasianus; Blickley and Patricelli 2012). It also disrupts female preference for low-frequency songs sung by male canaries (des Aunay et al. 2014) and great tits (Halfwerk et al. 2011). Females of these (and other) species prefer males that sing lower-frequency songs over those that sing higher-frequency songs because the low-frequency songs are sung by males of higher quality (e.g., Gil and Gahr 2002). When low-frequency urban noise masks the low-frequency components of calls and songs, females either cannot detect or find the males that are singing or cannot discriminate between the high-quality males singing at low frequencies and the poorer-quality males singing at higher frequencies.
Urban noise also influences where birds choose to live and breed, and choosing less favorable habitats often has consequences. For instance, Eastern bluebirds (Sialia sialis) living in noisier environments were found to have reduced reproductive productivity and brood size compared to those living in quieter habitats (Kight et al. 2012). The presence or absence of construction and highways often changes the distribution of birds. Foppen and Deuzeman (2007) compared the distribution of great reed warbler (Acrocephalus arundinaceus) pairs in the Netherlands before a highway was built through a nesting area and after the highway was present. When the highway was present
there were fewer nesting pairs, meaning that some birds were avoiding preferred habitats to avoid traffic noise. The road was temporarily closed and the number of nesting pairs increased; however, once the road reopened the number of nesting pairs again decreased. A more extensive study conducted in the Netherlands found that 26 of 43 (60%) woodland bird species showed reduced numbers near roads (Reijnen et al. 1995). Another count of birds near and far from roads showed that even when habitats were similar to one another, but either near to or far from a highway, the number of birds in each area increased with increasing distance from the road (Fig. 13.12), correlating with noise levels (Polak et al. 2013). That is, both abundance and diversity of birds increased as noise levels decreased. Other studies have confirmed that birds with higher-frequency calls were less likely to avoid the roadways than birds with lower-frequency calls (Rheindt 2003), again pointing to the challenges that many birds have when communicating in low-frequency urban noise, and highlighting the difficult choice that birds must face: Do the costs of choosing a less favorable habitat outweigh the benefits of living in quieter environments? The answer to this question clearly differs across both individual birds and species.
Fig. 13.12 [figure not reproduced; x-axis: distance from road (60 m, 310 m, 560 m)]
When birds do choose to nest in noisier environments, there could be consequences for mating and reproductive success. Nestling white-crowned sparrows (Zonotrichia leucophrys) tutored with songs embedded in anthropogenic noise later sang songs at higher frequencies and with lower vocal performance than those tutored with non-noisy control songs (Moseley et al. 2018). As another example, when alarm calls were presented to tree swallow (Tachycineta bicolor) nestlings, the tree swallows in quiet environments crouched more often (hiding from predators) while the nestlings in noisy environments produced longer calls and did not crouch (McIntyre et al. 2014). Nestling tree swallows living in noisier environments produced narrower-bandwidth and higher-frequency calls than those from quieter nests (Leonard and Horn 2008), although hearing of noise-reared nestlings does not differ from that of quiet-reared nestlings (Horn et al. 2020). These studies indicate that noise could affect how well offspring hear predators and how well parents hear begging calls. It also could influence the rate of feeding nestlings and could even have long-lasting effects on call structure, which could influence breeding success of those nestlings as adults. In a laboratory study looking at the effects of noise on reproduction, high levels of environmental noise eroded pair preferences in zebra finches (Swaddle and Page 2007). Paired females chose non-partner males over their partners when moderate to high
levels of white noise were presented in a preference test. These results have implications for noisy environments altering the population’s breeding styles and eventually the evolutionary trajectory of the species (Swaddle and Page 2007).
13.9.3 Communication Masking
To know exactly how noise affects acoustic communication in birds, playback or perceptual experiments must be conducted to measure auditory acuity in a controlled environment. Such experiments use either pure tones and white noise or more complex and natural signals that birds use for communication purposes. Controlled laboratory studies measuring the ability to detect simple pure tones in broadband noise have been conducted in over a dozen bird species (reviewed by Dooling et al. 2000) using operant conditioning techniques. These studies have shown that as the frequency of the tone increases, the tone must be incrementally louder to be heard in a noisy background. This is not unlike the trend seen in other animals, suggesting a preserved evolutionary mechanism for hearing in noise.
Other laboratory studies measuring the detection and discrimination of calls and songs embedded in various types of noise can reveal more about the exact nature of the active space for the natural acoustic signals used for communication by social birds. Psychoacoustic studies often test the abilities of birds to detect, discriminate, or identify songs or calls that are embedded in a chorus of other songs or different types of noise (e.g., urban or woodland). Operant conditioning experiments on zebra finches, European starlings (Sturnus vulgaris), canaries (Serinus canaria), great tits (Parus major), and budgerigars all show that birds have excellent acuity for detecting or discriminating communication signals relative to pure tones, possibly due to the ecological relevance of these signals (Appeltants et al. 2005; Dent et al. 2009; Hulse et al. 1997; Lohr et al. 2003; Narayan et al. 2007; Pohl et al. 2009). In a field test of call discrimination, juvenile king penguins in a noisy colony were able to discriminate the calls of their parents from calls of other adults at a negative signal-to-noise ratio, suggesting that the enhanced detectability of natural vocal signals found in the laboratory actually translates to excellent acuity in the wild (Aubin and Jouventin 1998).
All of the above-mentioned studies reveal that songs and calls are more or less discriminable or detectable when they are presented within different masker types. For instance, great tits have better thresholds for detecting song elements embedded in woodland noise than urban noise (Fig. 13.13a; Pohl et al. 2009). Interestingly, detection of song elements in the dawn chorus was the most difficult condition for the great tits compared to the other noise types, suggesting that birds are not necessarily listening to one another in the mornings while they are singing. Canaries trained to identify canary songs embedded in one to four other distractor canary songs found it more difficult when there were more songs present, similar to conditions of the dawn chorus where many birds are singing overlapping songs (Fig. 13.13b; Appeltants et al. 2005). Another laboratory study determined birds’ abilities to discriminate auditory distance, a task crucially important for territorial birds. Pohl et al. (2015) trained great tits to discriminate between virtual birdsongs at near and far distances, presented in quiet or embedded in a noisy dawn chorus. The birds accurately discriminated between distances, although this was much harder in noisy than in quiet conditions. In summary, these experiments and others demonstrate that hearing in noise is possible, and that factors such as the spectro-temporal make-up of signals, noise type, and noise level all have an influence on hearing signals in noise.
As a whole, results from the laboratory and field experiments suggest that bird communication is more successful in quiet than in noisy environments, that the type of noise matters for communication, and that if noise is present, adjustments need to be made to the calls or songs of signalers for those signals to be detected, discriminated, and localized by the receivers. One such adjustment that has been shown to be effective is changing the position of the signal relative to the
masker. Dent et al. (1997) found that thresholds for budgerigars detecting a pure tone in white noise were 11 dB lower when the signal and noise were separated by 90° in space than when they were co-located (i.e., spatial release from masking). A follow-up study showed an even greater advantage when the spatially separated signal was zebra finch song and the masker was a zebra finch chorus (Fig. 13.14; Dent et al. 2009). Thus, when birds are trying to communicate with one another in noisy environments, changing their position or even simply moving their heads will increase communication efficiency, much as humans attempting to speak to one another at a noisy cocktail party will often move their head toward a speaker.
Fig. 13.13 (a) Masked thresholds for great tits detecting a synthetic song element embedded in silence, woodland noise, urban noise, or dawn chorus noise (adapted from Pohl et al. 2009). Performance is best for quiet conditions, worst for the chorus conditions. Thresholds are higher for urban noise than woodland noise. (b) Performance for canaries discriminating song elements embedded in 1–4 other songs (adapted from Appeltants et al. 2005). As the number of maskers increases, performance decreases
Fig. 13.14 Signal-to-noise ratio thresholds for detecting a zebra finch song are higher (worse) when a chorus masker is co-located with the song (black boxes) than when the song is spatially separated from the masker (green boxes), in both budgerigars and zebra finches. Adapted from Dent et al. (2009)
Another adjustment made by many birds is to shift the frequency content of songs to a higher range, as documented for European blackbirds (Turdus merula; Slabbekoorn and Ripmeester 2008), plumbeous vireos (Vireo plumbeus; Francis et al. 2011), gray vireos (Vireo vicinior; Francis et al. 2011), European robins (McMullen et al. 2014), chaffinches (Verzijden et al. 2010), black-capped chickadees (Poecile atricapillus; Proppe et al. 2011), and a number of tropical birds (de Magalhães Tolentino et al. 2018). Whether this is a true adaptation attempting to increase the lowest frequencies of songs above the highest frequencies of the noise, whether it is simply easier for the birds to make high frequencies louder, or whether urban birds live in denser environments and want to distinguish their songs from those of other birds is still being debated (e.g., Nemeth et al. 2013).
Pohl et al. (2012) tested the consequences of such shifts on perception in the laboratory. These authors trained great tits to detect or discriminate
to translate sound from acoustic waves to nerve signals in the cochlea and auditory nerve. Though very effective, the ear can sustain damage and it degrades with age. Hearing loss results in reduced auditory acuity and limited information for the mammal to use. Loss can be caused by sudden exposure to high-intensity sound (e.g., from an explosion or gunfire) or by repeated or prolonged noise exposure (e.g., at industrial workplaces, at rock concerts, or from personal media players).
While the general structure of the mammalian ear is shared amongst terrestrial mammal species, there is great diversity in the sounds mammals can perceive, in the sounds they produce, and in their responses to sound. While human hearing ranges from about 20 Hz to 20 kHz, elephants use infrasound (sounds extending below the human hearing range, i.e., below 20 Hz; Herbst et al. 2012; Payne et al. 1986) and bats use ultrasound (sounds extending above the human hearing range, i.e., above 20 kHz, with some species hearing and emitting sound up to 220 kHz; Fenton et al. 2016). Rodents are known to be quite diverse, with subterranean species having excellent low-frequency hearing and terrestrial rodents having excellent ultrasonic hearing (reviewed by Dent et al. 2018). Mammals can thus be expected to display a diversity of responses to noise.
13.10.2 Behavioral Responses to Noise
One of the most frequently studied sources of noise in terrestrial mammal habitats is traffic noise from cars, trains, or aircraft. The most frequently reported response is animal movement away from the noise source. For example, Sonoran pronghorn (Antilocapra americana sonoriensis) increased their use of areas with lower levels of noise over areas with higher levels of noise from military aircraft (Landon et al. 2003). In the case of mountain sheep (Ovis canadensis mexicana), 19% showed disturbance to low-flying aircraft (Krausman and Hervert 1983). Prairie dogs (Cynomys ludovicianus) were exposed to playback of highway noise in an experimental prairie-dog town that was previously free of anthropogenic noise. The treatment area had fewer prairie dogs above ground. Those that were above ground spent less time foraging and much more time exhibiting vigilant behavior (Shannon et al. 2014), leading to earlier predator detection and earlier flight response (Shannon et al. 2016).
Fig. 13.15 (a) Photo of the Going-to-the-Sun road in Glacier National Park, USA. (b) 3D plot of 24-h traffic noise. (c) 2D plot of 24-h traffic noise (Barber et al. 2011). Road noise may form a barrier to wildlife migration. Reprinted by permission from Springer Nature. Barber, J. R., Burdett, C. L., Reed, S. E., Warner, K. A., Formichella, C., Crooks, K. R., Theobald, D. M., and Fristrup, K. M. Anthropogenic noise exposure in protected natural areas: estimating the scale of ecological consequences. Landscape Ecology, 26(9), 1281; https://link.springer.com/article/10.1007/s10980-011-9646-7. © Springer Nature, 2011. All rights reserved
A major concern regarding these behavioral responses by wildlife to traffic corridors is habitat fragmentation together with limited connectivity. Noisy areas may displace wildlife and form barriers to migration and dispersal (Barber et al. 2011; Fig. 13.15). Roads also fragment bat
habitat, although many species cross roadways or fly through underpasses (Kerth and Melber 2009).
Animals may adapt temporal behavioral patterns around noise exposure. Black-tufted marmosets (Callithrix penicillata) living in an urban park in Brazil stayed in quieter, central (i.e., away from road noise) areas during the day, and only utilized the park edges at night or on weekends (Duarte et al. 2011). Forest elephants (Loxodonta cyclotis) became more nocturnal in areas of industrial activity; and while the study found no direct link to noise intensity, concern about natural biorhythms near noisy industrial sites was raised (Wrege et al. 2010).
Noise may affect foraging behavior. Woodland caribou stopped feeding when exposed to noise from petroleum exploration (Bradshaw et al. 1997). Reduced food intake in noise slowed growth in rats, pigs, and dogs (Alario et al. 1987; Gue et al. 1987; Otten et al. 2004). Gleaning bats (Myotis myotis) displayed reduced hunting efficiency during road noise playbacks (Schaub et al. 2008; Siemers and Schaub 2011). Similarly, Brazilian free-tailed bats (Tadarida brasiliensis) were less active and produced fewer echolocation bursts near a noisy gas compression station (Bunkley et al. 2015). Peromyscus mice, on the other hand, were more successful collecting pine seeds (a major food source) near noisy gas-extraction sites because competing, seed-collecting jays (Aphelocoma californica) abandoned the site (Francis et al. 2012). Additionally, predators of the mice, like owls, avoided the noisier sites, which may result in reduced predation of the mice (Mason et al. 2016). Finally, some animals may associate noise with reinforcement, such as food sources, and learn to approach sounds. Badgers (Meles meles) quickly learned to approach an acoustic deterrent device baited with food (dinner bell effect; Ward et al. 2008).
One pathway by which noise disrupts animal behavior is by acoustic masking. Piglets use vocalization bouts to coordinate nursing with sows, and noise disrupted this communication, leading to reduced milk ingestion and increased energetic costs for the piglets attempting to elicit milk (Algers and Jensen 1985). Some animals can adjust their calls to reduce masking (Lombard effect). Cats increased the amplitude of calls in noise (Nonaka et al. 1997). Common marmosets (Callithrix jacchus) and cotton-top tamarins (Saguinus oedipus) increased both amplitude and duration of calls in noise (Brumm et al. 2004; Roian Egnor and Hauser 2006). Cotton-top tamarins timed their calls to avoid overlap with periodic noise (Egnor et al. 2007). Horseshoe bats (Rhinolophidae) increased echolocation amplitudes and shifted echolocation frequency in noise (Hage et al. 2013).
13.10.3 Physiological Responses to Noise
Human studies have shown that noise exposure can lead to a variety of health effects ranging from a feeling of annoyance to disturbed sleep, emotional stress, decreased job performance, higher chance of developing cardiovascular disease, and decreased learning in schoolchildren (Basner et al. 2014). We can only begin to understand the effects of noise on the health of other mammalian species.
Elk (Cervus canadensis) and wolves (Canis lupus) in Yellowstone National Park, USA, had elevated levels of glucocorticoids (hormones that indicate stress) when snowmobiles were allowed in the park. After snowmobiling was banned, glucocorticoid levels returned to normal, although a direct link to noise exposure was not made (Creel et al. 2002). After ongoing zoo visitor noise, giant pandas (Ailuropoda melanoleuca) exhibited increased glucocorticoids, negatively impacting reproduction efforts (Owen et al. 2004). In male rats exposed to chronic noise, testosterone decreased (Ruffoli et al. 2006). Pregnant mice exposed to 85–95 dB re 20 μPa alarm bells had pups with lower serum IgG levels, indicating impaired immune responses (Sobrian et al. 1997). Chronic noise exposure in rats affected calcium regulation, leading to detrimental changes at the cellular level (Gesi et al. 2002). Desert mule deer (Odocoileus hemionus crooki) and mountain sheep had increased heart rates relative to increased levels of aircraft noise playback. Heart rate returned to
normal within 60–180 s and responses decreased over time, potentially indicating a form of habituation (Weisenberger et al. 1996).
13.10.4 Effects of Noise on the Auditory System
The physiological impact of noise is well documented in several mammalian species, particularly laboratory animals, due to the ability to systematically expose and test individuals. Systematic research has shown that several sound features (such as sound frequency, duration, intensity, amplitude rise time, continuous versus temporary exposure, etc.) impact how an animal’s auditory system is affected by noise exposure. For example, chinchillas experienced TTS from exposure to the sound of a hammer hitting a nail repeatedly (Dunn et al. 1991). While some of the chinchillas were exposed to repeated hammering (a series of separate sound events), others were exposed to continuous noise of the same spectrum as nail hammering (one single sound event). While all chinchillas showed a decrease in hearing sensitivity, the chinchillas exposed to the repeated hammering had more hearing loss (Dunn et al. 1991).
NIHL can occur from mechanical damage and/or from metabolic disruption of acoustic structures (Hu 2012). Mechanical damage occurs during the sound exposure due to excessive movement caused by sound waves. Depending on the level of the sound, loud noise can damage structures at the cellular level. Metabolic damage occurs due to a cascade of changes at the cellular level from mechanical damage and can continue for weeks after sound exposure.
In TTS, damage may occur to the synapses and stereocilia, while in PTS, damage is more extensive, including outer hair cell death and fibrocyte loss. For example, the audiograms of four species of Old-World monkeys (Macaca nemestrina, M. mulatta, M. fascicularis, and Papio papio) were compared before and after exposure to octave-band noise (between 0.5 and 8 kHz at levels of 120 dB re 20 μPa) for 8 h daily for 20 days. Loss of both inner and outer hair cells at the basal end of the organ of Corti and hence PTS were produced (Hawkins et al. 1976). The difference in noise exposure at which an individual transitions from temporary to permanent damage varies by species and also depends on several individual factors such as past sound exposure, age, genetics, etc. (Hu 2012).
Exposure to continuous, high-level (>100 dB re 20 μPa) sounds has been shown to damage or destroy hair cells in multiple species, such as rats, rabbits, and guinea pigs (Borg et al. 1995; Chen and Fechter 2003; Hu et al. 2000). Recently, exposure to lower-amplitude sounds over long periods of time has also been shown to cause permanent damage. Mice exposed to 70 dB re 20 μPa continuous white noise for 8 h a day over the course of up to 3 months showed increased hearing thresholds and decreased auditory response amplitudes (Feng et al. 2020). Notably, these relatively young mice (8 weeks old at the start of exposure) also showed aggravated age-related hearing loss (Feng et al. 2020).
Some animals can mitigate the impact of noise on the auditory system using a stapedial reflex to close the auditory meatus. When exposed to a loud sound, the contraction of the stapedial muscle causes a decrease in auditory sensitivity by closing the auditory meatus, thus negating some potential damage. This reflex is well documented in humans and appears to primarily play a role in sudden, unexpected sounds with sharp rise times. The reflex is thought to function similarly in most terrestrial mammals, for example in rabbits. Rabbits exposed to sound in normal conditions had very little threshold shift, but when their stapedial reflex was inactivated (by blocking the nerve) during noise exposure, PTS was observed at levels that otherwise do not induce NIHL (Borg et al. 1983). In cats, this reflex functions even under anesthesia (McCue and Guinan 1994). However, damage to the auditory nerve connections (synaptopathy) can also damage auditory reflexes; for example, in mice, synaptopathy was directly correlated to the function of the middle ear muscle reflex (Valero et al. 2018).
Synaptopathy not only occurs from noise exposure, but also at old age or from exposure to ototoxins (Valero et al. 2018).
et al. (2007) criteria were updated in 2019 (Southall et al. 2019b).
Fig. 13.16 Auditory weighting functions for marine mammal functional hearing groups; LF: low-frequency cetaceans, HF: high-frequency cetaceans, VHF: very-high-frequency cetaceans, PCW: phocid carnivores in water, OCW: other carnivores in water, PCA: phocid carnivores in air, OCA: other carnivores in air (Southall et al. 2019b)
Fig. 13.17 Relative response differences in various aspects of blue whale behavior between non-feeding, surface-feeding, and deep-feeding individuals (adapted from Goldbogen et al. 2013). Response magnitude was quantified using generalized additive mixed models for behavioral parameters relevant to each behavioral state and potential responses in terms of diving, orientation, and displacement
(Fig. 13.17). This finding has been replicated and expanded with individual blue whales, demonstrating the same context-dependency in response probability as well as potential dependence in response probability based on horizontal range from the sound source even for the same received levels (Southall et al. 2019a).
Some species such as long-finned pilot whales appear behaviorally tolerant of noise exposure (e.g., Antunes et al. 2014), whereas beaked whales (Family Ziphiidae) are clearly among the more sensitive species behaviorally (DeRuiter et al. 2013; Miller et al. 2015; Stimpert et al. 2014; Tyack et al. 2011). The analysis of multivariate behavioral data to determine changes in behavior, including potentially subtle but important changes, is statistically challenging, although recent substantial progress in analytical methods has been made as well (Harris et al. 2016).
Experimental laboratory approaches have the advantage of greater control and precision on multivariate aspects of exposure and response, but lack the contextual reality in which free-ranging animals experience noise. Studies that evaluated noise exposure and response probability in captive harbor porpoises (e.g., Kastelein et al. 2011, 2013) demonstrated a particular sensitivity of this species, which matched field observations. Studies with captive bottlenose dolphins (Tursiops truncatus) and California sea lions (Zalophus californianus) have included large sample sizes and repeated exposures to demonstrate species, age, and experiential differences in response probability to military sonar signals (Houser et al. 2013a, b).
Observational methods (visual and acoustic) have provided complementary data to assess both acute and chronic noise exposure. Passive acoustic monitoring over large areas and time periods demonstrated changes in acoustic behavior and inferred movement of beaked whales in response to military sonar signals (e.g., McCarthy et al. 2011) resulting in dose-response curves (Moretti et al. 2014). Similarly, large-scale monitoring linked cetacean distribution and behavior to seismic surveys (e.g., Pirotta et al. 2014; Thompson et al. 2013), impact pile driving (e.g., Dähne et al. 2013; Thompson et al. 2010; Tougaard et al. 2009), and acoustic harassment devices (e.g., Johnston 2002).
Such observational studies lack experimental control, resolution to the individual level, detail on fine-scale responses, and ability to differentiate short-term responses to noise from those to other stimuli, but offer information on broad-scale spatio-temporal changes in habitat use and behavior. Ideally, experimental approaches would be combined with broad-scale observational methods to discover potential population-level effects (see Southall et al. 2016).
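The dose-response curves mentioned above relate the received level to the probability of a behavioral response. As a minimal sketch only, assuming a logistic curve shape, the relationship and its 50% response level could be written as follows. The function names, the 140 dB re 1 μPa midpoint, and the 5 dB slope are hypothetical values chosen for illustration; they are not taken from Moretti et al. (2014) or any other study cited here.

```python
import math


def response_probability(received_level_db, l50_db, slope_db):
    """Logistic dose-response curve (an assumed shape, for illustration).

    received_level_db: received level in dB (e.g., dB re 1 uPa)
    l50_db: level at which half of the animals respond
    slope_db: scale parameter in dB; smaller values give a steeper curve
    """
    return 1.0 / (1.0 + math.exp(-(received_level_db - l50_db) / slope_db))


def level_for_probability(p, l50_db, slope_db):
    """Invert the logistic curve: received level at which P(response) = p."""
    return l50_db + slope_db * math.log(p / (1.0 - p))


# Hypothetical parameters, not measured values.
L50_DB = 140.0
SLOPE_DB = 5.0

for level in (120, 130, 140, 150, 160):
    p = response_probability(level, L50_DB, SLOPE_DB)
    print(f"{level} dB: P(response) = {p:.3f}")
```

In practice, the curve parameters would be estimated from exposure and response data, with confidence intervals, rather than fixed by hand as in this sketch.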
Fig. 13.19 Chart of acoustic footprints of North Atlantic right whales (Eubalaena glacialis; light blue dots) and ships (larger footprints with red centers) off Cape Cod, Massachusetts Bay, USA. The larger and stronger ship noise footprints can easily engulf (i.e., mask) the right whale calls. Stellwagen Bank National Marine Sanctuary outlined in yellow. Figure courtesy of Chris Clark
the blood of captive marine mammals (e.g., Romano et al. 2004). In the wild, stress hormones in right whales decreased when ambient noise from shipping was lower (Rolland et al. 2012). Such measurements of noise-induced stress in marine mammals are comparable to studies with other vertebrates (Romero and Butler 2007). However, information is lacking on how stress scales with noise exposure and on the long-term health impacts of prolonged stress.
Finally, beaked whales that stranded after exposure to military sonar exhibited lesions and gas or fat emboli (Fernandez et al. 2005; Jepson et al. 2003). While some form of decompression sickness has been hypothesized, the physiological mechanisms for such emboli to occur are poorly understood. These physiological effects may have been secondarily caused or exacerbated by the animals’ behavioral responses to sonar.
13.12 Summary
This chapter presented examples of the variety of effects noise can have on animals in terrestrial and aquatic habitats. Studies on the hearing in noise and on behavioral and physiological responses to noise have concentrated on fish, frogs, birds, terrestrial mammals, and marine mammals. Clearly, more research is needed for invertebrates, reptiles, and all groups of freshwater species. In addition, more studies on the metabolic costs of these responses are needed.
Animals demonstrate a hierarchy of behavioral and physiological responses to noise. Behavioral reactions to anthropogenic noise include a startle response, change in movement and direction, freezing in place, cessation of vocal behavior, and change in behavioral budgets. Animals can also modify their signals to counteract the effects of noise and improve communication. Such modifications include changes in amplitude, duration, and frequency. Some animals also increase the redundancy of their signals by repeating them more often. Physiological reactions to anthropogenic noise are indicated by increased cortisol levels (indication of stress), temporary or permanent hearing loss, and physical damage to tissues and organs such as lungs and swim bladders.

The effects of anthropogenic noise on individual animals can escalate to the population level. Ultimately, species-richness and biodiversity could be affected. However, methods and models to address these topics are in their infancy.

There is the potential to mitigate any negative impacts of anthropogenic noise by modifying the noise source characteristics and operation schedules, finding alternative means to obtain operational goals of the noise source, and protecting critical habitats. Effective management of habitats should include noise assessment. Further research is needed to understand the ecological consequences of chronic noise in terrestrial and aquatic environments.

Remote wilderness areas are not immune to the effects of anthropogenic noise, because sound travels very well (with little loss over long ranges) in many terrestrial and aquatic habitats. Resource managers should continue to be vigilant in monitoring and mitigating the effects of anthropogenic noise on animals.

References

Alario P, Gamallo A, Beato MJ, Trancho G (1987) Body weight gain, food intake and adrenal development in chronic noise stressed rats. Physiol Behav 40(1):29–32. https://doi.org/10.1016/0031-9384(87)90181-8
Algers B, Jensen P (1985) Communication during suckling in the domestic pig. Effects of continuous noise. Appl Anim Behav Sci 14(1):49–61. https://doi.org/10.1016/0168-1591(85)90037-1
American National Standards Institute (2013) Acoustical Terminology (ANSI/ASA S1.1-2013). Acoustical Society of America, Melville, NY
Amézquita A, Hödl W (2004) How, when, and where to perform visual displays: The case of the Amazonian frog Hyla parviceps. Herpetologica 60(4):420–429. https://doi.org/10.1655/02-51
André M, Solé M, Lenoir M, Durfort M, Quero C, Mas A, Lombarte A, Mvd S, López-Bejar M, Morell M, Zaugg S, Houégnigan L (2011) Low-frequency sounds induce acoustic trauma in cephalopods. Front Ecol Environ 9(9):489–493. https://doi.org/10.1890/100124
Andriguetto-Filho JM, Ostrensky A, Pie MR, Silva UA, Boeger WA (2005) Evaluating the impact of seismic prospecting on artisanal shrimp fisheries. Cont Shelf Res 25(14):1720–1727. https://doi.org/10.1016/j.csr.2005.05.003
Antunes R, Kvadsheim PH, Lam FPA, Tyack PL, Thomas L, Wensveen PJ, Miller PJO (2014) High thresholds for avoidance of sonar by free-ranging long-finned pilot whales (Globicephala melas). Mar Pollut Bull 83(1):165–180. https://doi.org/10.1016/j.marpolbul.2014.03.056
Appeltants D, Gentner TQ, Hulse SH, Balthazart J, Ball GF (2005) The effect of auditory distractors on song discrimination in male canaries (Serinus canaria). Behav Process 69(3):331–341. https://doi.org/10.1016/j.beproc.2005.01.010
Arkhipkin AI, Bizikov VA (2000) Role of the statolith in functioning of the acceleration receptor system in squids and sepioids. J Zool 250(1):31–55
Au WWL, Floyd RW, Penner RH, Murchison AE (1974) Measurement of echolocation signals of the Atlantic bottlenose dolphin, Tursiops truncatus Montagu, in open waters. J Acoust Soc Am 56(4):1280–1290
Au WWL, Carder DA, Penner RH, Scronce BL (1985) Demonstration of adaptation in beluga whale echolocation signals. J Acoust Soc Am 77(2):726–730. https://doi.org/10.1121/1.392341
Aubin T, Jouventin P (1998) Cocktail–party effect in king penguin colonies. Proc R Soc Lond Ser B Biol Sci 265(1406):1665
Bain D, Dahlheim M (1994) Effects of masking noise on detection thresholds of killer whales. In: Loughlin T (ed) Marine mammals and the Exxon Valdez. Academic Press, San Diego, CA, pp 243–256
Barber JR, Crooks KR, Fristrup KM (2010) The costs of chronic noise exposure for terrestrial organisms. Trends Ecol Evol 25(3):180–189. https://doi.org/10.1016/j.tree.2009.08.002
Barber JR, Burdett CL, Reed SE, Warner KA, Formichella C, Crooks KR, Theobald DM, Fristrup KM (2011) Anthropogenic noise exposure in protected natural areas: estimating the scale of ecological consequences. Landsc Ecol 26(9):1281. https://doi.org/10.1007/s10980-011-9646-7
Barnett KE, Cocroft RB, Fleishman LJ (1999) Possible communication by substrate vibration in a chameleon. Copeia 1:225–228. https://doi.org/10.2307/1447408
Basner M, Babisch W, Davis A, Brink M, Clark C, Janssen S, Stansfeld S (2014) Auditory and non-auditory effects of noise on health. Lancet 383(9925):1325–1332. https://doi.org/10.1016/S0140-6736(13)61613-X
Battershill C, Cappo M, Colquhoun J, Cripps E, Jorgensen D, McCorry D, Stowar M, Venables W (2008) Coral damage monitoring using Towed Video (TVA) and Photo Quadrat Assessments (PQA). Australian Institute of Marine Science, Townsville, QLD
Bee MA (2007) Sound source segregation in grey treefrogs: spatial release from masking by the sound of a chorus. Anim Behav 74(3):549–558. https://doi.org/10.1016/j.anbehav.2006.12.012
Bee MA, Swanson EM (2007) Auditory masking of anuran advertisement calls by road traffic noise. Anim Behav 74(6):1765–1776. https://doi.org/10.1016/j.anbehav.2007.03.019
Bergen F, Abs M (1997) Etho-ecological study of the singing activity of the Blue Tit (Parus caeruleus), Great Tit (Parus major) and Chaffinch (Fringilla coelebs). J Ornithol 138(4):451–467. https://doi.org/10.1007/bf01651380
Blackwell SB, Nations CS, TL MD, Thode AM, Mathias D, Kim KH, Green CR, Macrander AM (2015) Effects of airgun sounds on bowhead whale calling rates: evidence for two behavioral thresholds. PLoS One 10(6):e0125720. https://doi.org/10.1371/journal.pone.0125720
Blickley JL, Patricelli GL (2012) Potential acoustic masking of greater sage-grouse (Centrocercus urophasianus) display components by chronic industrial noise. Ornithol Monogr 74:23–35. https://doi.org/10.1525/om.2012.74.1.23
Blickley JL, Word KR, Krakauer AH, Phillips JL, Sells SN, Taff CC, Wingfield JC, Patricelli GL (2012) Experimental chronic noise is related to elevated fecal corticosteroid metabolites in lekking male greater Sage-Grouse (Centrocercus urophasianus). PLoS One 7(11):e50462. https://doi.org/10.1371/journal.pone.0050462
Bohne T, Grießmann T, Rolfes R (2019) Modeling the noise mitigation of a bubble curtain. J Acoust Soc Am 146(4):2212–2223. https://doi.org/10.1121/1.5126698
Bolm-Audorff U, Hegewald J, Pretzsch A, Freiberg A, Nienhaus A, Seidler A (2020) Occupational noise and hypertension risk: a systematic review and meta-analysis. Int J Environ Res Public Health 17(17):6281. https://doi.org/10.3390/ijerph17176281
Borg E, Nilsson R, Engström B (1983) Effect of the acoustic reflex on inner ear damage induced by industrial noise. Acta Otolaryngol 96(5-6):361–369. https://doi.org/10.3109/00016488309132721
Borg E, Canlon B, Engström B (1995) Noise-induced hearing loss. Literature review and experiments in rabbits. Morphological and electrophysiological features, exposure parameters and temporal factors, variability and interactions. Scand Audiol Suppl 40:1–147
Bowles AE, Eckert S, Starke L, Berg E, Wolski L (1999) Effects of flight noise from jet aircraft and sonic booms on hearing, behavior, heart rate and oxygen consumption of desert tortoises (Gopherus agassizii). Hubbs-Sea World Research Institution, San Diego, CA
Bradbury JW, Vehrencamp SL (2011) Principles of animal communication, 2nd edn. Sinauer Associates, Sunderland, MA
Bradshaw CJA, Boutin S, Hebert DM (1997) Effects of petroleum exploration on Woodland Caribou in Northeastern Alberta. J Wildl Manag 61(4):1127–1133. https://doi.org/10.2307/3802110
Branstetter BK, Trickey JS, Bakhtiari K, Black A, Aihara H, Finneran JJ (2013) Auditory masking patterns in bottlenose dolphins (Tursiops truncatus) with natural, anthropogenic, and synthesized noise. J Acoust Soc Am 133(3):1811–1818. https://doi.org/10.1121/1.4789939
Brittan-Powell EF, Christensen-Dalsgaard J, Tang Y, Carr C, Dooling RJ (2010) The auditory brainstem response in two lizard species. J Acoust Soc Am 128(2):787–794. https://doi.org/10.1121/1.3458813
Brumm H (ed) (2013) Animal communication and noise. Animal signals and communication, vol 2. Springer, Berlin. https://doi.org/10.1007/978-3-642-41494-7
Brumm H, Slabbekoorn H (2005) Acoustic communication in noise. In: Advances in the study of behavior, vol 35. Academic Press, New York, pp 151–209. https://doi.org/10.1016/S0065-3454(05)35004-2
Brumm H, Todt D (2002) Noise-dependent song amplitude regulation in a territorial songbird. Anim Behav 63(5):891–897. https://doi.org/10.1006/anbe.2001.1968
Brumm H, Voss K, Köllmer I, Todt D (2004) Acoustic communication in noise: regulation of call characteristics in a New World monkey. J Exp Biol 207(3):443. https://doi.org/10.1242/jeb.00768
Brumm H, Schmidt R, Schrader L (2009) Noise-dependent vocal plasticity in domestic fowl. Anim Behav 78(3):741–746. https://doi.org/10.1016/j.anbehav.2009.07.004
Bunkley JP, McClure CJW, Kleist NJ, Francis CD, Barber JR (2015) Anthropogenic noise alters bat activity levels and echolocation calls. Global Ecol Conserv 3:62–71. https://doi.org/10.1016/j.gecco.2014.11.002
Chan AAY-H, Giraldo-Perez P, Smith S, Blumstein DT (2010) Anthropogenic noise affects risk assessment and attention: the distracted prey hypothesis. Biol Lett 6(4):458–461. https://doi.org/10.1098/rsbl.2009.1081
Chen G-D, Fechter LD (2003) The relationship between noise-induced hearing loss and hair cell loss in rats. Hear Res 177(1):81–90. https://doi.org/10.1016/S0378-5955(02)00802-X
Christensen CB, Christensen-Dalsgaard J, Brandt C, Madsen PT (2012) Hearing with an atympanic ear: good vibration and poor sound-pressure detection in the royal python, Python regius. J Exp Biol 215(2):331. https://doi.org/10.1242/jeb.062539
Clark CW, Ellison WT, Southall BL, Hatch L, Van Parijs SM, Frankel A, Ponirakis D (2009) Acoustic masking in marine ecosystems: intuitions, analysis, and implication. Mar Ecol Prog Ser 395:201–222. https://doi.org/10.3354/Meps08402
Cocroft RB, Rodríguez RL (2005) The behavioral ecology of insect vibrational communication. Bioscience 55(4):323–334. https://doi.org/10.1641/0006-3568(2005)055[0323:TBEOIV]2.0.CO;2
Cook SL, Forrest TG (2005) Sounds produced by nesting leatherback sea turtles (Dermochelys coriacea). Herpetol Rev 36(4):387–390
Coram A, Gordon J, Thompson D, Northridge SP (2014) Evaluating and Assessing the relative effectiveness of acoustic deterrent devices and other non-lethal measures on marine mammals. University of St Andrews, Sea Mammal Research Unit, St Andrews, Scotland
Costa DP, Schwarz L, Robinson P, Schick RS, Morris PA, Condit R, Crocker DE, Kilpatrick AM (2016) A bioenergetics approach to understanding the population consequences of disturbance: elephant seals as a model system. In: Popper AN, Hawkins T (eds) The effects of noise on aquatic life II. Springer, New York, pp 161–169. https://doi.org/10.1007/978-1-4939-2981-8_19
Costello RA, Symes LB (2014) Effects of anthropogenic noise on male signalling behaviour and female phonotaxis in Oecanthus tree crickets. Anim Behav 95:15–22. https://doi.org/10.1016/j.anbehav.2014.05.009
Courter JR, Perruci RJ, McGinnis KJ, Rainieri JK (2020) Black-capped chickadees (Poecile atricapillus) alter alarm call duration and peak frequency in response to traffic noise. PLoS One 15(10):e0241035. https://doi.org/10.1371/journal.pone.0241035
Creel S, Fox JE, Hardy A, Sands J, Garrott B, Peterson RO (2002) Snowmobile activity and glucocorticoid stress responses in wolves and elk. Conserv Biol 16(3):809–814. https://doi.org/10.1046/j.1523-1739.2002.00554.x
Cunnington GM, Fahrig L (2010) Plasticity in the vocalizations of anurans in response to traffic noise. Acta Oecol 36(5):463–470. https://doi.org/10.1016/j.actao.2010.06.002
Cynx J, Lewis R, Tavel B, Tse H (1998) Amplitude regulation of vocalizations in noise by a songbird, Taeniopygia guttata. Anim Behav 56(1):107–113. https://doi.org/10.1006/anbe.1998.0746
Dähne M, Gilles A, Lucke K, Peschko V, Adler S, Krügel K, Sundermeyer J, Siebert U (2013) Effects of pile-driving on harbour porpoises (Phocoena phocoena) at the first offshore wind farm in Germany. Environ Res Lett 8(2):025002. https://doi.org/10.1088/1748-9326/8/2/025002
Davidsen JG, Dong H, Linné M, Andersson MH, Piper A, Prystay TS, Hvam EB, Thorstad EB, Whoriskey F, Cooke SJ, Sjursen AD, Rønning L, Netland TC, Hawkins AD (2019) Effects of sound exposure from a seismic airgun on heart rate, acceleration and depth use in free-swimming Atlantic cod and saithe. Conserv Physiol 7(1). https://doi.org/10.1093/conphys/coz020
Davis AK, Schroeder H, Yeager I, Pearce J (2018) Effects of simulated highway noise on heart rates of larval monarch butterflies, Danaus plexippus: implications for roadside habitat suitability. Biol Lett 14(5):20180018. https://doi.org/10.1098/rsbl.2018.0018
Day RD, McCauley RD, Fitzgibbon QP, Hartmann K, Semmens JM (2016a) Assessing the impact of marine seismic surveys on southeast Australian scallop and lobster fisheries. Fisheries Research & Development Corporation
Day RD, McCauley RD, Fitzgibbon QP, Semmens JM (2016b) Seismic air gun exposure during early-stage embryonic development does not negatively affect spiny lobster Jasus edwardsii larvae (Decapoda: Palinuridae). Sci Rep 6:22723. https://doi.org/10.1038/srep22723
Day RD, McCauley RD, Fitzgibbon QP, Hartmann K, Semmens JM (2017) Exposure to seismic air gun signals causes physiological harm and alters behavior in the scallop Pecten fumatus. Proc Natl Acad Sci 114(40):E8537. https://doi.org/10.1073/pnas.1700564114
Day RD, McCauley RD, Fitzgibbon QP, Hartmann K, Semmens JM (2019) Seismic air guns damage rock lobster mechanosensory organs and impair righting reflex. Proc Biol Sci 286(1907):20191424. https://doi.org/10.1098/rspb.2019.1424
Day RD, Fitzgibbon QP, McCauley RD, Hartmann K, Semmens JM (2020) Lobsters with pre-existing damage to their mechanosensory statocyst organs do not incur further damage from exposure to seismic air gun signals. Environ Pollut. https://doi.org/10.1016/j.envpol.2020.115478
de Magalhães Tolentino VC, Baesse CQ, Melo C (2018) Dominant frequency of songs in tropical bird species is higher in sites with high noise pollution. Environ Pollut 235:983–992. https://doi.org/10.1016/j.envpol.2018.01.045
de Soto NA, Delorme N, Atkins J, Howard S, Williams J, Johnson M (2013) Anthropogenic noise causes body malformations and delays development in marine larvae. Sci Rep 3:2831. https://doi.org/10.1038/srep02831
Dent ML, Larsen ON, Dooling RJ (1997) Free-field binaural unmasking in budgerigars (Melopsittacus undulatus). Behav Neurosci 111(3):590–598. https://doi.org/10.1037/0735-7044.111.3.590
Dent ML, McClaine EM, Best V, Ozmeral E, Narayan R, Gallun FJ, Sen K, Shinn-Cunningham BG (2009) Spatial unmasking of birdsong in zebra finches (Taeniopygia guttata) and budgerigars (Melopsittacus undulatus). J Comp Psychol 123(4):357–367. https://doi.org/10.1037/a0016898
Dent ML, Screven LA, Kobrina A (2018) Hearing in Rodents. In: Dent ML, Fay RR, Popper AN (eds) Rodent bioacoustics. Springer, Cham, pp 71–105. https://doi.org/10.1007/978-3-319-92495-3_4
Department of the Navy (2008) Atlantic fleet active sonar training environmental impact statement. Naval Facilities Engineering Command, Atlantic, Norfolk, VA
Derryberry EP, Phillips JN, Derryberry GE, Blum MJ, Luther D (2020) Singing in a silent spring: Birds respond to a half-century soundscape reversion during the COVID-19 shutdown. Science 370(6516):575. https://doi.org/10.1126/science.abd5777
DeRuiter SL, Larbi Doukara K (2012) Loggerhead turtles dive in response to airgun sound exposure. Endanger Species Res 16(1):55–63. https://doi.org/10.3354/esr00396
DeRuiter SL, Southall BL, Calambokidis J, Zimmer WMX, Sadykova D, Falcone EA, Friedlaender AS, Joseph JE, Moretti D, Schorr GS, Thomas L, Tyack PL (2013) First direct measurements of behavioural responses by Cuvier’s beaked whales to mid-frequency active sonar. Biol Lett 9(4). https://doi.org/10.1098/rsbl.2013.0223
des Aunay GH, Slabbekoorn H, Nagle L, Passas F, Nicolas P, Draganoiu TI (2014) Urban noise undermines female sexual preferences for low-frequency songs in domestic canaries. Anim Behav 87:67–75. https://doi.org/10.1016/j.anbehav.2013.10.010
Doksaeter L, Godo OR, Handegard NO (2009) Behavioural responses of herring (Clupea harengus) to 1-2 and 6-7 kHz sonar signals and killer whale feedings sounds. J Acoust Soc Am 125(1):554–564. https://doi.org/10.1121/1.3021301
Dooling RJ, Leek MR (2018) Communication masking by man-made noise. In: Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds) Effects of anthropogenic noise on animals. Springer, New York, pp 23–46. https://doi.org/10.1007/978-1-4939-8574-6_2
Dooling RJ, Lohr B, Dent ML (2000) Hearing in birds and reptiles. In: Dooling RJ, Fay RR, Popper AN (eds) Comparative hearing: birds and reptiles. Springer, New York, pp 308–359. https://doi.org/10.1007/978-1-4612-1182-2_7
Duarte MHL, Vecci MA, Hirsch A, Young RJ (2011) Noisy human neighbours affect where urban monkeys live. Biol Lett 7(6):840–842. https://doi.org/10.1098/rsbl.2011.0529
Dunlop RA, Cato DH, Noad MJ (2014) Evidence of a Lombard response in migrating humpback whales (Megaptera novaeangliae). J Acoust Soc Am 136(1):430–437. https://doi.org/10.1121/1.4883598
Dunlop RA, Noad MJ, McCauley RD, Kniest E, Slade R, Paton D, Cato DH (2016) Response of humpback whales (Megaptera novaeangliae) to ramp-up of a small experimental air gun array. Mar Pollut Bull 103(1–2):72–83. https://doi.org/10.1016/j.marpolbul.2015.12.044
Dunlop RA, Noad MJ, McCauley RD, Kniest E, Slade R, Paton D, Cato DH (2017a, 1869) The behavioural response of migrating humpback whales to a full seismic airgun array. Proc Biol Sci 284. https://doi.org/10.1098/rspb.2017.1901
Dunlop RA, Noad MJ, McCauley RD, Scott-Hayward L, Kniest E, Slade R, Paton D, Cato DH (2017b) Determining the behavioural dose–response relationship of marine mammals to air gun noise and source proximity. J Exp Biol 220(16):2878–2886. https://doi.org/10.1242/jeb.160192
Dunlop RA, Noad MJ, McCauley RD, Kniest E, Slade R, Paton D, Cato DH (2018) A behavioural dose-response model for migrating humpback whales and seismic air gun noise. Mar Pollut Bull 133:506–516. https://doi.org/10.1016/j.marpolbul.2018.06.009
Dunlop RA, McCauley RD, Noad MJ (2020) Ships and air guns reduce social interactions in humpback whales at greater ranges than other behavioral impacts. Mar Pollut Bull 154:111072. https://doi.org/10.1016/j.marpolbul.2020.111072
Dunn DE, Davis RR, Merry CJ, Franks JR (1991) Hearing loss in the chinchilla from impact and continuous noise exposure. J Acoust Soc Am 90(4):1979–1985. https://doi.org/10.1121/1.401677
Egnor SER, Wickelgren JG, Hauser MD (2007) Tracking silence: adjusting vocal production to avoid acoustic interference. J Comp Physiol A 193(4):477–483. https://doi.org/10.1007/s00359-006-0205-7
Ellison W, Southall B, Clark C, Frankel A (2012) A new context-based approach to assess marine mammal behavioral responses to anthropogenic sounds. Conserv Biol 26(1):21–28. https://doi.org/10.1111/j.1523-1739.2011.01803.x
Engas A, Løkkeborg S (2002) Effects of seismic shooting and vessel-generated noise, on fish behaviour and catch rates. Bioacoustics 12(2–3):313–316
Engås A, Løkkeborg S, Ona E, Soldal AV (1996) Effects of seismic shooting on local abundance and catch rates of cod (Gadus morhua) and haddock (Melanogrammus aeglefinus). Can J Fish Aquat Sci 53(10):2238–2249. https://doi.org/10.1139/f96-177
Erbe C (2000) Detection of whale calls in noise: performance comparison between a beluga whale, human listeners and a neural network. J Acoust Soc Am 108(1):297–303. https://doi.org/10.1121/1.429465
Erbe C, Reichmuth C, Cunningham KC, Lucke K, Dooling RJ (2016a) Communication masking in marine mammals: a review and research strategy. Mar Pollut Bull 103:15–38. https://doi.org/10.1016/j.marpolbul.2015.12.007
Erbe C, Sisneros J, Thomsen F, Hawkins A, Popper A (2016b) Overview of the fourth international conference on the effects of noise on aquatic life. Proc Meet Acoust 27(1):010006. https://doi.org/10.1121/2.0000256
Erbe C, Dunlop R, Dolman S (2018) Effects of noise on marine mammals. In: Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds) Effects of anthropogenic
noise on animals. Springer, New York, pp 277–309. https://doi.org/10.1007/978-1-4939-8574-6_10
Erbe C, Dähne M, Gordon J, Herata H, Houser DS, Koschinski S, Leaper R, McCauley R, Miller B, Müller M, Murray A, Oswald JN, Scholik-Schlomer AR, Schuster M, van Opzeeland IC, Janik VM (2019a) Managing the effects of noise from ship traffic, seismic surveying and construction on marine mammals in Antarctica. Front Mar Sci 6:647. https://doi.org/10.3389/fmars.2019.00647
Erbe C, Marley S, Schoeman R, Smith JN, Trigg L, Embling CB (2019b) The effects of ship noise on marine mammals: a review. Front Mar Sci 6:606. https://doi.org/10.3389/fmars.2019.00606
Erbe C, Sisneros J, Thomsen F, Lepper P, Hawkins A, Popper A (2019c) Overview of the fifth international conference on the effects of noise on aquatic life. Proc Meet Acoust 37(1):001001. https://doi.org/10.1121/2.0001052
Fay RR, Popper AN (2012) Fish hearing: new perspectives from two ‘senior’ bioacousticians. Brain Behav Evol 79(4):215–217. https://doi.org/10.1159/000338719
Feng AS, Narins PM (2008) Ultrasonic communication in concave-eared torrent frogs (Amolops tormotus). J Comp Physiol A 194(2):159–167. https://doi.org/10.1007/s00359-007-0267-1
Feng S, Yang L, Hui L, Luo Y, Du Z, Xiong W, Liu K, Jiang X (2020) Long-term exposure to low-intensity environmental noise aggravates age-related hearing loss via disruption of cochlear ribbon synapses. Am J Transl Res 12(7):3674–3687
Fenton MB, Grinnell AD, Popper AN, Fay RR (eds) (2016) Bat bioacoustics. Springer handbook of auditory research, vol 54. Springer, New York
Fernandez A, Edwards JF, Rodriguez F, de los Monteros AE, Herraez P, Castro P, Jaber JR, Martin V, Arbelo M (2005) “Gas and fat embolic syndrome” involving a mass stranding of beaked whales (Family Ziphiidae) exposed to anthropogenic sonar signals. Vet Pathol 42(4):446–457
Ferrara CR, Vogt RC, Harfush MR, Sousa-Lima RS, Albavera E, Tavera A (2014) First evidence of leatherback turtle (Dermochelys coriacea) embryos and hatchlings emitting sounds. Chelonian Conserv Biol 13(1):110–114. https://doi.org/10.2744/ccb-1045.1
Fewtrell JL, McCauley RD (2012) Impact of air gun noise on the behaviour of marine fish and squid. Mar Pollut Bull 64(5):984–993. https://doi.org/10.1016/j.marpolbul.2012.02.009
Fields DM, Handegard NO, Dalen J, Eichner C, Malde K, Karlsen Ø, Skiftesvik AB, Durif CMF, Browman HI (2019) Airgun blasts used in marine seismic surveys have limited effects on mortality, and no sublethal effects on behaviour or gene expression, in the copepod Calanus finmarchicus. ICES J Mar Sci 76(7):2033–2044. https://doi.org/10.1093/icesjms/fsz126
Filadelfo R, Mintz J, Michlovich E, Amico AD, Tyack PL, Ketten DR (2009) Correlating military sonar use with beaked whale mass strandings: what do the historical data show? Aquat Mamm 35:435–444. https://doi.org/10.1578/AM.35.4.2009.435
Finneran JJ (2015) Noise-induced hearing loss in marine mammals: a review of temporary threshold shift studies from 1996 to 2015. J Acoust Soc Am 138(3):1702–1726. https://doi.org/10.1121/1.4927418
Finneran JJ (2016) Auditory weighting functions and TTS/PTS exposure functions for marine mammals exposed to underwater noise. National Marine Fisheries Service, Silver Spring, MD
Fitzgibbon QP, Day RD, McCauley RD, Simon CJ, Semmens JM (2017) The impact of seismic air gun exposure on the haemolymph physiology and nutritional condition of spiny lobster, Jasus edwardsii. Mar Pollut Bull 125(1):146–156. https://doi.org/10.1016/j.marpolbul.2017.08.004
Foppen RPB, Deuzeman S (2007) De Grote karekiet in de noordelijke randmeren; een dilemma voor natuurontwikkelingsplannen!? De Levende Natuur 108:20–26
Francis CD, Ortega CP, Cruz A (2009) Noise pollution changes avian communities and species interactions. Curr Biol 19(16):1415–1419. https://doi.org/10.1016/j.cub.2009.06.052
Francis CD, Ortega CP, Cruz A (2011) Different behavioural responses to anthropogenic noise by two closely related passerine birds. Biol Lett 7(6):850–852. https://doi.org/10.1098/rsbl.2011.0359
Francis CD, Kleist NJ, Ortega CP, Cruz A (2012) Noise pollution alters ecological services: enhanced pollination and disrupted seed dispersal. Proc R Soc B Biol Sci 279(1739):2727–2735. https://doi.org/10.1098/rspb.2012.0230
Fristrup KM, Hatch LT, Clark CW (2003) Variation in humpback whale (Megaptera novaeangliae) song length in relation to low-frequency sound broadcasts. J Acoust Soc Am 113(6):3411–3424. https://doi.org/10.1121/1.1573637
Fuller RA, Warren PH, Gaston KJ (2007) Daytime noise predicts nocturnal singing in urban robins. Biol Lett 3(4):368. https://doi.org/10.1098/rsbl.2007.0134
Galeotti P, Sacchi R, Fasola M, Ballasina D (2005) Do mounting vocalisations in tortoises have a communication function? A comparative analysis. Herpetol J 15(2):61–71
Gallego-Abenza M, Mathevon N, Wheatcroft D (2019) Experience modulates an insect’s response to anthropogenic noise. Behav Ecol 31(1):90–96. https://doi.org/10.1093/beheco/arz159
Gesi M, Fornai F, Lenzi P, Ferrucci M, Soldani P, Ruffoli R, Paparelli A (2002) Morphological alterations induced by loud noise in the myocardium: the role of benzodiazepine receptors. Microsc Res Tech 59(2):136–146. https://doi.org/10.1002/jemt.10186
Gil D, Gahr M (2002) The honesty of bird song: multiple constraints for multiple traits. Trends Ecol Evol 17(3):133–141. https://doi.org/10.1016/S0169-5347(02)02410-2
Goldbogen JA, Southall BL, DeRuiter SL, Calambokidis J, Friedlaender AS, Hazen EL, Falcone EA, Schorr GS, Douglas A, Moretti DJ, Kyburg C, McKenna MF, Tyack PL (2013) Blue whales respond to simulated mid-frequency military sonar. Proc R Soc B 280(1765):20130657. https://doi.org/10.1098/rspb.2013.0657
Goodwin SE, Shriver WG (2011) Effects of traffic noise on occupancy patterns of forest birds. Conserv Biol 25(2):406–411. https://doi.org/10.1111/j.1523-1739.2010.01602.x
Grafe TU, Preininger D, Sztatecsny M, Kasah R, Dehling JM, Proksch S, Hödl W (2012) Multimodal communication in a noisy environment: A case study of the Bornean rock frog Staurois parvus. PLoS One 7(5):e37965. https://doi.org/10.1371/journal.pone.0037965
Greenfield MD (2016) Evolution of acoustic communication in insects. In: Pollack GS, Mason AC, Popper AN, Fay RR (eds) Insect hearing. Springer, Cham, pp 17–47. https://doi.org/10.1007/978-3-319-28890-1_2
Gue M, Fioramonti J, Frexinos J, Alvinerie M, Bueno L (1987) Influence of acoustic stress by noise on gastrointestinal motility in dogs. Dig Dis Sci 32(12):1411–1417. https://doi.org/10.1007/BF01296668
Guerra A, González AF, Rocha F (2004) A review of the records of giant squid in the north-eastern Atlantic and severe injuries in Architeuthis dux stranded after acoustic explorations. In: Paper presented at the ICES Annual Science Conference, Vigo, Spain
Hage SR, Jiang T, Berquist SW, Feng J, Metzner W (2013) Ambient noise induces independent shifts in call frequency and amplitude within the Lombard effect in echolocating bats. Proc Natl Acad Sci 110(10):4063. https://doi.org/10.1073/pnas.1211533110
Hahad O, Kröller-Schön S, Daiber A, Münzel T (2019) The cardiovascular effects of noise. Dtsch Arztebl Int 116(14):245–250. https://doi.org/10.3238/arztebl.2019.0245
Halfwerk W, Bot S, Buikx J, van der Velde M, Komdeur J, ten Cate C, Slabbekoorn H (2011) Low-frequency songs lose their potency in noisy urban conditions. Proc Natl Acad Sci 108(35):14549–14554. https://doi.org/10.1073/pnas.1109091108
Halfwerk W, Lea AM, Guerra MA, Page RA, Ryan MJ (2016) Vocal responses to noise reveal the presence of the Lombard effect in a frog. Behav Ecol 27(2):669–676. https://doi.org/10.1093/beheco/arv204
Halvorsen MB, Casper BM, Woodley CM, Thomas J, Carlson TJ, Popper AN (2012) Threshold for onset of injury in Chinook salmon from exposure to impulsive pile driving sounds. PLoS One 7(6):e38968. https://doi.org/10.1371/journal.pone.0038968
Harris CM, Thomas L, Sadykova D, DeRuiter SL, Tyack PL, Southall BL, Read AJ, Miller PJO (2016) The challenges of analyzing behavioral response study data: an overview of the MOCHA (Multi-study OCean Acoustics Human Effects Analysis) project. In: Popper AN, Hawkins A (eds) The effects of noise on aquatic life II. Springer, New York, pp 399–407. https://doi.org/10.1007/978-1-4939-2981-8_47
Hastings MC, Miksis-Olds J (2012) Shipboard assessment of hearing sensitivity of tropical fishes immediately after exposure to seismic air gun emissions at Scott Reef. In: Popper AN, Hawkins A (eds) The effects of noise on aquatic life. Springer, New York, pp 239–243. https://doi.org/10.1007/978-1-4419-7311-5_53
Hatch L, Clark C, Van Parijs S, Frankel A, Ponirakis D (2012) Quantifying loss of acoustic communication space for right whales in and around a U.S. National Marine Sanctuary. Conserv Biol 26(6):983–994. https://doi.org/10.1111/j.1523-1739.2012.01908.x
Hawkins AD, Popper AN (2018) Effects of man-made sound on fishes. In: Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds) Effects of anthropogenic noise on animals. Springer, New York, pp 145–177. https://doi.org/10.1007/978-1-4939-8574-6_6
Hawkins JE, Johnsson LG, Stebbins WC, Moody DB, Coombs SL (1976) Hearing loss and cochlear pathology in monkeys after noise exposure. Acta Otolaryngol 81(3-6):337–343. https://doi.org/10.3109/00016487609119971
Hawkins AD, Roberts L, Cheesman S (2014) Responses of free-living coastal pelagic fish to impulsive sounds. J Acoust Soc Am 135(5):3101–3116. https://doi.org/10.1121/1.4870697
Helble TA, Guazzo RA, Martin CR, Durbach IN, Alongi GC, Martin SW, Boyle JK, Henderson EE (2020) Lombard effect: Minke whale boing call source levels vary with natural variations in ocean noise. J Acoust Soc Am 147(2):698–712. https://doi.org/10.1121/10.0000596
Herbst CT, Stoeger AS, Frey R, Lohscheller J, Titze IR, Gumpenberger M, Fitch WT (2012) How low can you go? Physical production mechanism of elephant infrasonic vocalizations. Science 337(6094):595–599. https://doi.org/10.1126/science.1219712
Herrera-Montes MI, Aide TM (2011) Impacts of traffic noise on anuran and bird communities. Urban Ecosyst 14(3):415–427. https://doi.org/10.1007/s11252-011-0158-7
Heuch PA, Karlsen E (1997) Detection of infrasonic water oscillations by copepodids of Lepeophtheirus salmonis (Copepoda Caligida). J Plankton Res 19(6):735–747. https://doi.org/10.1093/plankt/19.6.735
Heyward A, Colquhoun J, Cripps E, McCorry D, Stowar M, Radford B, Miller K, Miller I, Battershill C (2018) No evidence of damage to the soft tissue or skeletal integrity of mesophotic corals exposed to a 3D marine seismic survey. Mar Pollut Bull 129(1):8–13. https://doi.org/10.1016/j.marpolbul.2018.01.057
Hofstetter RW, Dunn DD, McGuire R, Potter KA (2014) Using acoustic technology to reduce bark beetle
reproduction. Pest Manag Sci 70(1):24–27. https://doi.org/10.1002/ps.3656
Holt MM, Noren DP, Veirs V, Emmons CK, Veirs S (2009) Speaking up: Killer whales (Orcinus orca) increase their call amplitude in response to vessel noise. J Acoust Soc Am 125(1):EL27–EL32. https://doi.org/10.1121/1.3040028
Horn AG, Aikens M, Jamieson E, Kingdon K, Leonard ML (2020) Effect of noise on development of call discrimination by nestling tree swallows, Tachycineta bicolor. Anim Behav 164:143–148. https://doi.org/10.1016/j.anbehav.2020.04.008
Houser DS, Martin SW, Finneran JJ (2013a) Behavioural responses of California sea lions to mid-frequency (3250-3450 Hz) sonar signals. Mar Environ Res 92:268–278. https://doi.org/10.1016/j.marenvres.2013.10.007
Houser DS, Martin SW, Finneran JJ (2013b) Exposure amplitude and repetition affect bottlenose dolphin behavioural responses to simulated mid-frequency sonar signals. J Exp Mar Biol Ecol 443:123–133. https://doi.org/10.1016/j.jembe.2013.02.043
Hu B (2012) Noise-induced structural damage to the cochlea. In: Le Prell CG, Henderson D, Fay RR, Popper AN (eds) Noise-induced hearing loss: scientific advances. Springer, New York, pp 57–86. https://doi.org/10.1007/978-1-4419-9523-0_5
Hu BH, Guo W, Wang PY, Henderson D, Jiang SC (2000) Intense noise-induced apoptosis in hair cells of guinea pig cochleae. Acta Otolaryngol 120(1):19–24. https://doi.org/10.1080/000164800750044443
Hulse SH, MacDougall-Shackleton SA, Wisniewski AB (1997) Auditory scene analysis by songbirds: stream segregation of birdsong by European starlings (Sturnus vulgaris). J Comp Psychol 111(1):3–13. https://doi.org/10.1037/0735-7036.111.1.3
International Organization for Standardization (2017) Underwater acoustics—terminology (ISO 18405). Geneva, Switzerland
Jeffrey ML, Brooke MC-E, Douglas HC (2002) Sound detection in situ by the larvae of a coral-reef damselfish (Pomacentridae). Mar Ecol Prog Ser 232:259–268
Jepson PD, Arbelo M, Deaville R, Patterson IAP, Castro P, Baker JR, Degollada E, Ross HM, Herráez P, Pocknell AM, Rodríguez F, Howie FE, Espinosa A, Reid RJ, Jaber JR, Martin V, Cunningham AA, Fernández A (2003) Gas-bubble lesions in stranded cetaceans. Nature 425:575. https://doi.org/10.1038/425575a
Johnston DW (2002) The effect of acoustic harassment devices on harbour porpoises (Phocoena phocoena) in the Bay of Fundy, Canada. Biol Conserv 108(1):113–118. https://doi.org/10.1016/s0006-3207(02)00099-x
Joy R, Tollit D, Wood J, MacGillivray A, Li Z, Trounce K, Robinson O (2019) Potential benefits of vessel slowdowns on endangered southern resident killer whales. Front Mar Sci 6(344). https://doi.org/10.3389/fmars.2019.00344
Juárez R, Araya-Ajoy YG, Barrantes G, Sandoval L (2021) House wrens Troglodytes aedon reduce repertoire size and change song element frequencies in response to anthropogenic noise. Ibis 163(1):52–64. https://doi.org/10.1111/ibi.12844
Kaifu K, Akamatsu T, Segawa S (2008) Underwater sound detection by cephalopod statocyst. Fish Sci 74(4):781–786. https://doi.org/10.1111/j.1444-2906.2008.01589.x
Kastelein RA, Steen N, de Jong C, Wensveen PJ, Verboom WC (2011) Effect of broadband-noise masking on the behavioral response of a harbor porpoise (Phocoena phocoena) to 1-s duration 6-7 kHz sonar up-sweeps. J Acoust Soc Am 129(4):2307–2315. https://doi.org/10.1121/1.3559679
Kastelein R, Gransier R, Hoek L, Olthuis J (2012) Temporary threshold shifts and recovery in a harbour porpoise (Phocoena phocoena) after octave-band noise at 4 kHz. J Acoust Soc Am 132(5):3525–3537. https://doi.org/10.1121/1.4757641
Kastelein RA, Gransier R, van den Hoogen M, Hoek L (2013) Brief behavioural response threshold levels of a harbour porpoise (Phocoena phocoena) to five helicopter dipping sonar signals (1.33 to 1.43 kHz). Aquat Mamm 39(2):162–173
Kerth G, Melber M (2009) Species-specific barrier effects of a motorway on the habitat use of two threatened forest-living bat species. Biol Conserv 142(2):270–279. https://doi.org/10.1016/j.biocon.2008.10.022
Kight CR, Swaddle JP (2011) How and why environmental noise impacts animals: an integrative, mechanistic review. Ecol Lett 14(10):1052–1061. https://doi.org/10.1111/j.1461-0248.2011.01664.x
Kight CR, Saha MS, Swaddle JP (2012) Anthropogenic noise is associated with reductions in the productivity of breeding Eastern Bluebirds (Sialia sialis). Ecol Appl 22(7):1989–1996. https://doi.org/10.1890/12-0133.1
Kobayasi KI, Okanoya K (2003) Context-dependent song amplitude control in Bengalese finches. Neuroreport 14(3):521–524
Krausman PR, Hervert JJ (1983) Mountain sheep responses to aerial surveys. Wildl Soc Bull 11(4):372–375
Kryter KD (1994) The handbook of hearing and the effects of noise: physiology, psychology, and public health. Academic Press, New York
Kujawa SG, Liberman MC (2009) Adding insult to injury: cochlear nerve degeneration after “temporary” noise-induced hearing loss. J Neurosci 29(45):14077–14085. https://doi.org/10.1523/JNEUROSCI.2845-09.2009
Lampe U, Schmoll T, Franzke A, Reinhold K (2012) Staying tuned: grasshoppers from noisy roadside habitats produce courtship signals with elevated frequency components. Funct Ecol 26(6):1348–1354. https://doi.org/10.1111/1365-2435.12000
Lampe U, Reinhold K, Schmoll T (2014) How grasshoppers respond to road noise: developmental plasticity and population differentiation in acoustic signalling. Funct Ecol 28(3):660–668. https://doi.org/10.1111/1365-2435.12215
Landon DM, Krausman PR, Kiana KGK, Harris LK (2003) Pronghorn use of areas with varying sound pressure levels. Southwest Nat 48(4):725–728
Le Prell CG, Hammill TL, Murphy WJ (2019) Noise-induced hearing loss: translating risk from animal models to real-world environments. J Acoust Soc Am 146(5):3646–3651. https://doi.org/10.1121/1.5133385
Lee N, Mason AC (2017) How spatial release from masking may fail to function in a highly directional auditory system. eLife 6:e20731. https://doi.org/10.7554/eLife.20731
Lee-Dadswell GR (2011) Physics of the interaction between a crab and a seismic test pulse - stage 4: continued development of mathematical model and testing of model via simulation. Cape Breton University, Sydney, NS
Lengagne T (2008) Traffic noise affects communication behaviour in a breeding anuran, Hyla arborea. Biol Conserv 141(8):2023–2031. https://doi.org/10.1016/j.biocon.2008.05.017
Leonard ML, Horn AG (2008) Does ambient noise affect growth and begging call structure in nestling birds? Behav Ecol 19(3):502–507. https://doi.org/10.1093/beheco/arm161
Lewis ER, Narins PM (1985) Do frogs communicate with seismic signals? Science 227(4683):187–189. https://doi.org/10.1126/science.227.4683.187
Liberman MC (2016) Noise-induced hearing loss: permanent versus temporary threshold shifts and the effects of hair cell versus neuronal degeneration. In: Popper AN, Hawkins A (eds) The effects of noise on aquatic life II. Springer, New York, pp 1–7. https://doi.org/10.1007/978-1-4939-2981-8_1
Lohr B, Wright TF, Dooling RJ (2003) Detection and discrimination of natural calls in masking noise by birds: estimating the active space signal. Anim Behav 65:763–777
Love EK, Bee MA (2010) An experimental test of noise-dependent voice amplitude regulation in Cope’s grey treefrog, Hyla chrysoscelis. Anim Behav 80(3):509–515. https://doi.org/10.1016/j.anbehav.2010.05.031
Lovell JM, Findlay MM, Moate RM, Yan HY (2005) The hearing abilities of the prawn Palaemon serratus. Comp Biochem Physiol A Mol Integr Physiol 140(1):89–100. https://doi.org/10.1016/j.cbpb.2004.11.003
Lowry H, Lill A, Wong BBM (2012) How noisy does a noisy miner have to be? Amplitude adjustments of alarm calls in an avian urban ‘adapter’. PLoS One 7(1):e29960. https://doi.org/10.1371/journal.pone.0029960
Lucke K, Siebert U, Lepper PA, Blanchet M-A (2009) Temporary shift in masked hearing thresholds in a harbor porpoise (Phocoena phocoena) after exposure to seismic airgun stimuli. J Acoust Soc Am 125(6):4060–4070. https://doi.org/10.1121/1.3117443
Madsen PT, Wahlberg M, Tougaard J, Lucke K, Tyack P (2006) Wind turbine underwater noise and marine mammals: implications of current knowledge and data needs. Mar Ecol Prog Ser 309:279–295. https://doi.org/10.3354/meps309279
Malme CI, Miles PR, Clark CW, Tyack P, Bird JE (1983) Investigations of the potential effects of underwater noise from petroleum industry activities on migrating gray whale behavior. Bolt Beranek and Newman Inc. for U.S. Minerals Management Service, Anchorage, AK, Cambridge, MA
Malme CI, Miles PR, Clark CW, Tyack P, Bird JE (1984) Investigations of the potential effects of underwater noise from petroleum industry activities on migrating gray whale behavior. Phase II: January 1984 migration. (trans: Minerals Management Service USDotI, Washington, DC.). Bolt Beranek and Newman Inc., Anchorage, AK
Manabe K, Sadr EI, Dooling RJ (1998) Control of vocal intensity in budgerigars (Melopsittacus undulatus): differential reinforcement of vocal intensity and the Lombard effect. J Acoust Soc Am 103(2):1190–1198. https://doi.org/10.1121/1.421227
Mason JT, McClure CJW, Barber JR (2016) Anthropogenic noise impairs owl hunting behavior. Biol Conserv 199:29–32. https://doi.org/10.1016/j.biocon.2016.04.009
McCarthy E, Moretti D, Thomas L, DiMarzio N, Morrissey R, Jarvis S, Ward J, Izzi A, Dilley A (2011) Changes in spatial and temporal distribution and vocal behavior of Blainville’s beaked whales (Mesoplodon densirostris) during multiship exercises with mid-frequency sonar. Mar Mamm Sci 27(3):E206–E226. https://doi.org/10.1111/j.1748-7692.2010.00457.x
McCauley RD (2014) Maxima seismic survey noise exposure, Scott Reef 2007. Centre for Marine Science & Technology, Perth, WA
McCauley RD, Fewtrell J, Duncan AJ, Jenner C, Jenner M-N, Penrose JD, Prince RIT, Adhitya A, Murdoch J, McCabe K (2003a) Marine seismic surveys: analysis and propagation of air-gun signals; and effects of exposure on humpback whales, sea turtles, fishes and squid. In: Environmental implications of offshore oil and gas development in Australia: further research. Australian Petroleum Production and Exploration Association, Canberra, ACT
McCauley RD, Fewtrell J, Popper AN (2003b) High intensity anthropogenic sound damages fish ears. J Acoust Soc Am 113:638–642. https://doi.org/10.1121/1.1527962
McCauley RD, Day RD, Swadling KM, Fitzgibbon QP, Watson RA, Semmens JM (2017) Widely used marine seismic survey air gun operations negatively impact zooplankton. Nat Ecol Evol 1:0195. https://doi.org/10.1038/s41559-017-0195
McCue MP, Guinan JJ (1994) Acoustically responsive fibers in the vestibular nerve of the cat. J Neurosci 14(10):6058. https://doi.org/10.1523/JNEUROSCI.14-10-06058.1994
McGregor RL, Bender DJ, Fahrig L (2008) Do small mammals avoid roads because of the traffic? J Appl Ecol 45(1):117–123. https://doi.org/10.1111/j.1365-2664.2007.01403.x
McIntyre E, Leonard ML, Horn AG (2014) Ambient noise and parental communication of predation risk in tree swallows, Tachycineta bicolor. Anim Behav 87:85–89. https://doi.org/10.1016/j.anbehav.2013.10.013
McMullen H, Schmidt R, Kunc HP (2014) Anthropogenic noise affects vocal interactions. Behav Process 103:125–128. https://doi.org/10.1016/j.beproc.2013.12.001
Megela-Simmons A, Moss CF, Daniel KM (1985) Behavioral audiograms of the bullfrog (Rana catesbeiana) and the green tree frog (Hyla cinerea). J Acoust Soc Am 78(4):1236–1244. https://doi.org/10.1121/1.392892
Miller PJO, Biassoni N, Samuels A, Tyack PL (2000) Whale songs lengthen in response to sonar. Nature 405(6789):903. https://doi.org/10.1038/35016148
Miller GW, Moulton VD, Davis RA, Holst M, Millman P, MacGillivray A, Hannay D (2005) Monitoring seismic effects on marine mammals - Southeastern Beaufort Sea, 2001-2002. In: Armsworthy SL, Cranford PJ, Lee K (eds) Offshore oil and gas environmental effects monitoring: approaches and technologies. Battelle Press, Columbus, OH, pp 511–542
Miller PJ, Kvadsheim PH, Lam F-PA, Wensveen PJ, Antunes R, Alves AC, Visser F, Kleivane L, Tyack PL, Sivle LD (2012) The severity of behavioral changes observed during experimental exposures of killer (Orcinus orca), long-finned pilot (Globicephala melas), and sperm (Physeter macrocephalus) whales to naval sonar. Aquat Mamm 38(4):362–401
Miller PJO, Antunes RN, Wensveen PJ, Samarra FIP, Alves AC, Tyack PL, Kvadsheim PH, Kleivane L, Lam F-PA, Ainslie MA, Thomas L (2014) Dose-response relationships for the onset of avoidance of sonar by free-ranging killer whales. J Acoust Soc Am 135(1):975. https://doi.org/10.1121/1.4861346
Miller PJO, Kvadsheim PH, Lam F-PA, Tyack PL, Cure C, Deruiter SL, Kleivane L, Sivle LD, Van Ijsselmuide SP, Visser F, Wensveen PJ, Von Benda-Beckmann AM, Martin Lopez LM, Narazaki T, Hooker SK (2015) First indications that northern bottlenose whales are sensitive to behavioural disturbance from anthropogenic noise. R Soc Open Sci 2:140484. https://doi.org/10.1098/rsos.140484
Mooney TA, Hanlon RT, Christensen-Dalsgaard J, Madsen PT, Ketten DR, Nachtigall PE (2010) Sound detection by the longfin squid (Loligo pealeii) studied with auditory evoked potentials: sensitivity to low-frequency particle motion and not pressure. J Exp Biol 213(21):3748–3759. https://doi.org/10.1242/jeb.048348
Mooney TA, Yamato M, Branstetter BK (2012) Hearing in cetaceans: from natural history to experimental biology. Adv Mar Biol 63:197–246
Moore BCJ (2013) An introduction to the psychology of hearing. Brill, Leiden, The Netherlands
Moretti D, Thomas L, Marques T, Harwood J, Dilley A, Neales B, Shaffer J, McCarthy E, New L, Jarvis S, Morrissey R (2014) A risk function for behavioral disruption of Blainville’s beaked whales (Mesoplodon densirostris) from mid-frequency active sonar. PLoS One 9(1):e85064. https://doi.org/10.1371/journal.pone.0085064
Morley EL, Jones G, Radford AN (2014) The importance of invertebrates when considering the impacts of anthropogenic noise. Proc Biol Sci 281(1776):20132683. https://doi.org/10.1098/rspb.2013.2683
Morris-Drake A, Kern JM, Radford AN (2016) Cross-modal impacts of anthropogenic noise on information use. Curr Biol 26(20):R911–R912. https://doi.org/10.1016/j.cub.2016.08.064
Morton AB, Symonds HK (2002) Displacement of Orcinus orca (L.) by high amplitude sound in British Columbia, Canada. ICES J Mar Sci 59(1):71–80
Moseley DL, Derryberry GE, Phillips JN, Danner JE, Danner RM, Luther DA, Derryberry EP (2018) Acoustic adaptation to city noise through vocal learning by a songbird. Proc R Soc B Biol Sci 285(1888):20181356. https://doi.org/10.1098/rspb.2018.1356
Mrosovsky N (1972) Spectrographs of the sounds of leatherback turtles. Herpetologica 28(3):256–258
Narayan R, Best V, Ozmeral E, McClaine E, Dent M, Shinn-Cunningham B, Sen K (2007) Cortical interference effects in the cocktail party problem. Nat Neurosci 10(12):1601–1607. https://doi.org/10.1038/nn2009
Narins PM, Wilson M, Mann DA (2014) Ultrasound detection in fishes and frogs: discovery and mechanisms. In: Köppl C, Manley GA, Popper AN, Fay RR (eds) Insights from comparative hearing research. Springer, New York, pp 133–156. https://doi.org/10.1007/2506_2013_29
National Marine Fisheries Service (2018) 2018 Revisions to: technical guidance for assessing the effects of anthropogenic sound on marine mammal hearing (Version 2.0): underwater thresholds for onset of permanent and temporary threshold shifts. U.S. Department of Commerce, National Oceanic and Atmospheric Administration, Silver Spring, MD
National Research Council (2005) Marine mammal populations and ocean noise: determining when noise causes biologically significant effects. National Academies Press, Washington. https://doi.org/10.17226/11147
Nemeth E, Pieretti N, Zollinger SA, Geberzahn N, Partecke J, Miranda AC, Brumm H (2013) Bird song and anthropogenic noise: vocal constraints may explain why birds sing higher-frequency songs in cities. Proc Biol Sci 280(1754). https://doi.org/10.1098/rspb.2012.2798
Neo Y, Ufkes E, Kastelein R, Winter H, ten Cate C, Slabbekoorn H (2015) Impulsive sounds change European seabass swimming patterns: influence of pulse repetition interval. Mar Pollut Bull 97(1–2):111–117. https://doi.org/10.1016/j.marpolbul.2015.06.027
New LF, Clark JS, Costa DP, Fleishman E, Hindell MA, Klanjcek T, Lusseau D, Kraus S, McMahon CR,
Robinson PW, Schick RS, Schwarz LK, Simmons SE, Thomas L, Tyack PL, Harwood J (2014) Using short-term measures of behaviour to estimate long-term fitness of southern elephant seals. Mar Ecol Prog Ser 496:99–108. https://doi.org/10.3354/meps10547
Nonaka S, Takahashi R, Enomoto K, Katada A, Unno T (1997) Lombard reflex during PAG-induced vocalization in decerebrate cats. Neurosci Res 29(4):283–289. https://doi.org/10.1016/S0168-0102(97)00097-7
Nordt A, Klenke R (2013) Sleepless in town – drivers of the temporal shift in dawn song in urban European blackbirds. PLoS One 8(8):e71476. https://doi.org/10.1371/journal.pone.0071476
Normandeau Associates (2012) Effects of noise on fish, fisheries, and invertebrates in the U.S. Atlantic and Arctic from energy industry sound-generating activities. U.S. Department of the Interior, Bureau of Ocean Energy Management, Washington, DC
Ocean Studies Board (2016) Approaches to understanding the cumulative effects of stressors on marine mammals. The National Academies Press, Washington, DC. https://doi.org/10.17226/23479
Orci KM, Petróczki K, Barta Z (2016) Instantaneous song modification in response to fluctuating traffic noise in the tree cricket Oecanthus pellucens. Anim Behav 112:187–194. https://doi.org/10.1016/j.anbehav.2015.12.008
Otten W, Kanitz E, Puppe B, Tuchscherer M, Brüssow KP, Nürnberg G, Stabenow B (2004) Acute and long term effects of chronic intermittent noise stress on hypothalamic-pituitary-adrenocortical and sympatho-adrenomedullary axis in pigs. Anim Sci 78(2):271–283. https://doi.org/10.1017/S1357729800054060
Owen MA, Swaisgood RR, Czekala NM, Steinman K, Lindburg DG (2004) Monitoring stress in captive giant pandas (Ailuropoda melanoleuca): behavioral and hormonal responses to ambient noise. Zoo Biol 23(2):147–164. https://doi.org/10.1002/zoo.10124
Parris KM, Velik-Lord M, North JMA (2009) Frogs call at a higher pitch in traffic noise. Ecol Soc 14(1):25
Parry GD, Heislers S, Werner GF, Asplin MD (2002) Assessment of environmental effects of seismic testing on scallop fisheries in Bass Strait. Marine and Freshwater Resources Institute, Queenscliff, VIC
Payne KB, Langbauer WR, Thomas EM (1986) Infrasonic calls of the Asian elephant (Elephas maximus). Behav Ecol Sociobiol 18(4):297–301. https://doi.org/10.1007/bf00300007
Payne JF, Andrews CA, Fancey LL, Cook AL, Christian JR (2007) Pilot study on the effects of seismic air gun noise on lobster (Homarus americanus). Canadian Technical Report of Fisheries and Aquatic Sciences 2712
Pearson WH, Skalski JR, Malme CI (1992) Effects of sounds from a geophysical survey device on behavior of captive rockfish (Sebastes spp.). Can J Fish Aquat Sci 49(7):1343–1356. https://doi.org/10.1139/f92-150
Penna M, Pottstock H, Velasquez N (2005) Effect of natural and synthetic noise on evoked vocal responses in a frog of the temperate austral forest. Anim Behav 70(3):639–651. https://doi.org/10.1016/j.anbehav.2004.11.022
Pirotta E, Brookes KL, Graham IM, Thompson PM (2014) Variation in harbour porpoise activity in response to seismic survey noise. Biol Lett 10(5):20131090. https://doi.org/10.1098/rsbl.2013.1090
Pohl NU, Slabbekoorn H, Klump GM, Langemann U (2009) Effects of signal features and environmental noise on signal detection in the great tit, Parus major. Anim Behav 78(6):1293–1300. https://doi.org/10.1016/j.anbehav.2009.09.005
Pohl NU, Leadbeater E, Slabbekoorn H, Klump GM, Langemann U (2012) Great tits in urban noise benefit from high frequencies in song detection and discrimination. Anim Behav 83(3):711–721. https://doi.org/10.1016/j.anbehav.2011.12.019
Pohl NU, Klump GM, Langemann U (2015) Effects of signal features and background noise on distance cue discrimination by a songbird. J Exp Biol 218(7):1006. https://doi.org/10.1242/jeb.113639
Polajnar J, Čokl A (2008) The effect of vibratory disturbance on sexual behaviour of the southern green stink bug Nezara viridula (Heteroptera, Pentatomidae). Cent Eur J Biol 3(2):189–197. https://doi.org/10.2478/s11535-008-0008-7
Polajnar J, Eriksson A, Lucchi A, Anfora G, Virant-Doberlet M, Mazzoni V (2015) Manipulating behaviour with substrate-borne vibrations – potential for insect pest control. Pest Manag Sci 71(1):15–23. https://doi.org/10.1002/ps.3848
Polak M, Wiącek J, Kucharczyk M, Orzechowski R (2013) The effect of road traffic on a breeding community of woodland birds. Eur J For Res 132(5):931–941. https://doi.org/10.1007/s10342-013-0732-z
Popper AN, Fay RR (1993) Sound detection and processing by fish: critical review and major research questions. Brain Behav Evol 41(1):14–25
Popper AN, Fay RR (2011) Rethinking sound detection by fishes. Hear Res 273(1):25–36. https://doi.org/10.1016/j.heares.2009.12.023
Popper AN, Hastings MC (2009) The effects of anthropogenic sources of sound on fishes. J Fish Biol 75:455–489
Popper AN, Hawkins A (eds) (2012) The effects of noise on aquatic life. Advances in experimental medicine and biology 730. Springer, New York
Popper AN, Hawkins A (eds) (2016) The effects of noise on aquatic life II. Advances in experimental medicine and biology 875. Springer, New York
Popper AN, Smith ME, Cott PA, Hanna BW, MacGillivray AO, Austin ME, Mann DA (2005) Effects of exposure to seismic airgun use on hearing of three fish species. J Acoust Soc Am 117(6):3958–3971. https://doi.org/10.1121/1.1904386
Popper AN, Hawkins AD, Fay RR, Mann DA, Bartol S, Carlson TJ, Coombs S, Ellison WT, Gentry RL,
Halvorsen MB, Løkkeborg S, Rogers PH, Southall BL, Zeddies DG, Tavolga WN (2014) Sound exposure guidelines. In: ASA S3/SC1.4 TR-2014 sound exposure guidelines for fishes and sea turtles: a technical report prepared by ANSI-Accredited Standards Committee S3/SC1 and registered with ANSI. Springer, New York, pp 33–51. https://doi.org/10.1007/978-3-319-06659-2_7
Potash LM (1972) Noise-induced changes in calls of the Japanese quail. Psychon Sci 26(5):252–254. https://doi.org/10.3758/bf03328608
Proppe DS, Sturdy CB, St Clair CC (2011) Flexibility in animal signals facilitates adaptation to rapidly changing environments. PLoS One 6(9):e25413. https://doi.org/10.1371/journal.pone.0025413
Przeslawski R, Huang Z, Anderson J, Carroll AG, Edmunds M, Hurt L, Williams S (2018) Multiple field-based methods to assess the potential impacts of seismic surveys on scallops. Mar Pollut Bull 129(2):750–761. https://doi.org/10.1016/j.marpolbul.2017.10.066
Quinn JL, Whittingham MJ, Butler SJ, Cresswell W (2006) Noise, predation risk compensation and vigilance in the chaffinch Fringilla coelebs. J Avian Biol 37(6):601–608. https://doi.org/10.1111/j.2006.0908-8857.03781.x
Rabin LA, Coss RG, Owings DH (2006) The effects of wind turbines on antipredator behavior in California ground squirrels (Spermophilus beecheyi). Biol Conserv 131(3):410–420. https://doi.org/10.1016/j.biocon.2006.02.016
Raboin M, Elias DO (2019) Anthropogenic noise and the bioacoustics of terrestrial invertebrates. J Exp Biol 222(12):jeb178749. https://doi.org/10.1242/jeb.178749
Ratnam R, Feng AS (1998) Detection of auditory signals by frog inferior collicular neurons in the presence of spatially separated noise. J Neurophysiol 80(6):2848
Reichmuth C, Holt MM, Mulsow J, Sills JM, Southall BL (2013) Comparative assessment of amphibious hearing in pinnipeds. J Comp Physiol A 199(6):491–507
Reijnen R, Foppen R (2006) Impact of road traffic on breeding bird populations. In: Davenport J, Davenport JL (eds) The ecology of transportation: managing mobility for the environment. Springer, Netherlands, Dordrecht, pp 255–274. https://doi.org/10.1007/1-4020-4504-2_12
Reijnen R, Foppen R, Braak CT, Thissen J (1995) The effects of car traffic on breeding bird populations in woodland. III. Reduction of density in relation to the proximity of main roads. J Appl Ecol 32(1):187–202. https://doi.org/10.2307/2404428
Rheindt FE (2003) The impact of roads on birds: does song frequency play a role in determining susceptibility to noise pollution? J Ornithol 144(3):295–306. https://doi.org/10.1007/bf02465629
Richardson WJ, Greene CR, Malme CI, Thomson DH (1995) Marine mammals and noise. Academic Press, San Diego
Richardson AJ, Matear RJ, Lenton A (2017) Potential impacts on zooplankton of seismic surveys. CSIRO, Canberra, Australia
Ridgway SH, Wever EG, McCormick JG, Palin J, Anderson JH (1969) Hearing in the giant sea turtle, Chelonia mydas. Proc Natl Acad Sci USA 64:884–890
Roian Egnor SE, Hauser MD (2006) Noise-induced vocal modulation in cotton-top tamarins (Saguinus oedipus). Am J Primatol 68(12):1183–1190. https://doi.org/10.1002/ajp.20317
Rolland RM, Parks SE, Hunt KE, Castellote M, Corkeron PJ, Nowacek DP, Wasser SK, Kraus SD (2012) Evidence that ship noise increases stress in right whales. Proc R Soc Lond Ser B Biol Sci 279(1737):2363–2368. https://doi.org/10.1098/rspb.2011.2429
Romano TA, Keogh MJ, Kelly C, Feng P, Berk L, Schlundt CE, Carder DA, Finneran JJ (2004) Anthropogenic sound and marine mammal health: measures of the nervous and immune systems before and after intense sound exposure. Can J Fish Aquat Sci 61(7):1124–1134. https://doi.org/10.1139/f04-055
Romero ML, Butler LK (2007) Endocrinology of stress. Int J Comp Psychol 20(2)
Ruffoli R, Carpi A, Giambelluca MA, Grasso L, Scavuzzo MC, Giannessi FF (2006) Diazepam administration prevents testosterone decrease and lipofuscin accumulation in testis of mouse exposed to chronic noise stress. Andrologia 38(5):159–165. https://doi.org/10.1111/j.1439-0272.2006.00732.x
Ryals BM, Rubel EW (1988) Hair cell regeneration after acoustic trauma in adult Coturnix quail. Science 240(4860):1774. https://doi.org/10.1126/science.3381101
Samuel Y, Morreale SJ, Clark CW, Greene CH, Richmond ME (2005) Underwater, low-frequency noise in a coastal sea turtle habitat. J Acoust Soc Am 117(3):1465–1472. https://doi.org/10.1121/1.1847993
Saunders JC, Dooling RJ (2018) Characteristics of temporary and permanent threshold shifts in vertebrates. In: Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds) Effects of anthropogenic noise on animals. Springer, New York, pp 83–107. https://doi.org/10.1007/978-1-4939-8574-6_4
Schaub A, Ostwald J, Siemers BM (2008) Foraging bats avoid noise. J Exp Biol 211(19):3174. https://doi.org/10.1242/jeb.022863
Scheifele PM, Andrew S, Cooper RA, Darre M, Musiek FE, Max L (2005) Indication of a Lombard vocal response in the St. Lawrence River beluga. J Acoust Soc Am 117(3):1486–1492. https://doi.org/10.1121/1.1835508
Schmidt AKD, Riede K, Römer H (2011) High background noise shapes selective auditory filters in a tropical cricket. J Exp Biol 214(10):1754. https://doi.org/10.1242/jeb.053819
Schmidt R, Morrison A, Kunc HP (2014) Sexy voices – no choices: male song in noise fails to attract females. Anim Behav 94:55–59. https://doi.org/10.1016/j.anbehav.2014.05.018
Serrano A, Terhune JM (2001) Within-call repetition may be an anti-masking strategy in underwater calls of harp seals. Can J Zool 79:1410–1413
Shannon G, Angeloni LM, Wittemyer G, Fristrup KM, Crooks KR (2014) Road traffic noise modifies behaviour of a keystone species. Anim Behav 94:135–141. https://doi.org/10.1016/j.anbehav.2014.06.004
Shannon G, Crooks KR, Wittemyer G, Fristrup KM, Angeloni LM (2016) Road noise causes earlier predator detection and flight response in a free-ranging mammal. Behav Ecol 27(5):1370–1375. https://doi.org/10.1093/beheco/arw058
Shen J-X, Feng AS, Xu Z-M, Yu Z-L, Arch VS, Yu X-J, Narins PM (2008) Ultrasonic frogs show hyperacute phonotaxis to female courtship calls. Nature 453(7197):914–916. https://doi.org/10.1038/nature06719
Siemers BM, Schaub A (2011) Hunting at the highway: traffic noise reduces foraging efficiency in acoustic predators. Proc R Soc B Biol Sci 278(1712):1646–1652. https://doi.org/10.1098/rspb.2010.2262
Simmons AM, Narins PM (2018) Effects of anthropogenic noise on amphibians and reptiles. In: Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds) Effects of anthropogenic noise on animals. Springer, New York, pp 179–208. https://doi.org/10.1007/978-1-4939-8574-6_7
Simpson SD, Meekan M, Montgomery J, McCauley R, Jeffs A (2005) Homeward sound. Science 308:221. https://doi.org/10.1126/science.1107406
Simpson SD, Jeffs A, Montgomery JC, McCauley RD, Meekan MG (2007) Nocturnal relocation of adult and juvenile coral reef fishes in response to reef noise. Coral Reefs 27:97–104. https://doi.org/10.1007/s00338-007-0294-y
Simpson SD, Radford AN, Nedelec SL, Ferrari MCO, Chivers DP, McCormick MI, Meekan MG (2016) Anthropogenic noise increases fish mortality by predation. Nat Commun 7:10544. https://doi.org/10.1038/ncomms10544
Slabbekoorn H, Peet M (2003) Ecology: birds sing at a higher pitch in urban noise. Nature 424(6946):267–267. https://doi.org/10.1038/424267a
Slabbekoorn H, Ripmeester EAP (2008) Birdsong and anthropogenic noise: implications and applications for conservation. Mol Ecol 17(1):72–83. https://doi.org/10.1111/j.1365-294X.2007.03487.x
Slabbekoorn H, Bouton N, van Opzeeland I, Coers A, ten Cate C, Popper AN (2010) A noisy spring: the impact of globally rising underwater sound levels on fish. Trends Ecol Evol 25(7):419–427. https://doi.org/10.1016/j.tree.2010.04.005
Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (2018a) Effects of anthropogenic noise on animals. In: Springer handbook of auditory research, vol 66. Springer, New York
Slabbekoorn H, McGee J, Walsh EJ (2018b) Effects of man-made sound on terrestrial mammals. In: Slabbekoorn H, Dooling RJ, Popper AN, Fay RR (eds) Effects of anthropogenic noise on animals. Springer, New York, pp 243–276. https://doi.org/10.1007/978-1-4939-8574-6_9
Slotte A, Hansen K, Dalen J, Ona E (2004) Acoustic mapping of pelagic fish distribution and abundance in relation to a seismic shooting area off the Norwegian west coast. Fish Res 67(2):143–150. https://doi.org/10.1016/j.fishres.2003.09.046
Sobrian SK, Vaughn VT, Ashe WK, Markovic B, Djuric V, Jankovic BD (1997) Gestational exposure to loud noise alters the development and postnatal responsiveness of humoral and cellular components of the immune system in offspring. Environ Res 73(1):227–241. https://doi.org/10.1006/enrs.1997.3734
Söffker M, Trathan P, Clark J, Collins MA, Belchier M, Scott R (2015) The impact of predation by marine mammals on Patagonian toothfish longline fisheries. PLoS One 10(3):e0118113. https://doi.org/10.1371/journal.pone.0118113
Solé M, Lenoir M, Durfort M, López-Bejar M, Lombarte A, André M (2013) Ultrastructural damage of Loligo vulgaris and Illex coindetii statocysts after low frequency sound exposure. PLoS One 8(10):e78825. https://doi.org/10.1371/journal.pone.0078825
Song J, Mann DA, Cott PA, Hanna BW, Popper AN (2008) The inner ears of Northern Canadian freshwater fishes following exposure to seismic air gun sounds. J Acoust Soc Am 124(2):1360–1366. https://doi.org/10.1121/1.2946702
Southall BL (2018) Noise. In: Würsig B, Thewissen JGM, Kovacs KM (eds) Encyclopedia of marine mammals, 3rd edn. Academic Press, New York, pp 637–645. https://doi.org/10.1016/B978-0-12-804327-1.00183-7
Southall BL, Bowles AE, Ellison WT, Finneran JJ, Gentry RL, Greene CRJ, Kastak D, Ketten DR, Miller JH, Nachtigall PE, Richardson WJ, Thomas JA, Tyack PL (2007) Marine mammal noise exposure criteria: Initial scientific recommendations. Aquat Mamm 33(4):411–521. https://doi.org/10.1080/09524622.2008.9753846
Southall BL, Nowacek DP, Miller PJO, Tyack PL (2016) Experimental field studies to measure behavioral responses of cetaceans to sonar. Endanger Species Res 31:293–315. https://doi.org/10.3354/esr00764
Southall BL, DeRuiter SL, Friedlaender A, Stimpert AK, Goldbogen JA, Hazen E, Casey C, Fregosi S, Cade DE, Allen AN, Harris CM, Schorr G, Moretti D, Guan S, Calambokidis J (2019a) Behavioral responses of individual blue whales (Balaenoptera musculus) to mid-frequency military sonar. J Exp Biol 222(5):jeb190637. https://doi.org/10.1242/jeb.190637
Southall BL, Finneran JJ, Reichmuth C, Nachtigall PE, Ketten DR, Bowles AE, Ellison WT, Nowacek DP, Tyack PL (2019b) Marine mammal noise exposure criteria: updated scientific recommendations for residual hearing effects. Aquat Mamm 45(2):125–232. https://doi.org/10.1578/AM.45.2.2019.125
Stanley JA, Radford CA, Jeffs AG (2009) Induction of Thompson PM, Lusseau D, Barton T, Simmons D,
settlement in crab megalopae by ambient underwater Rusin J, Bailey H (2010) Assessing the responses of
reef sound. Behav Ecol 21(1):113–120. https://doi.org/ coastal cetaceans to the construction of offshore wind
10.1093/beheco/arp159 turbines. Mar Pollut Bull 60(8):1200–1208. https://doi.
Stanley JA, Wilkens SL, Jeffs AG (2014) Fouling in your org/10.1016/j.marpolbul.2010.03.030
own nest: vessel noise increases biofouling. Biofouling Thompson PM, Brookes KL, Graham IM, Barton TR,
30(7):837–844. https://doi.org/10.1080/08927014. Needham K, Bradbury G, Merchant ND (2013)
2014.938062 Short-term disturbance by a commercial
Stimpert AK, Deruiter SL, Southall BL, Moretti DJ, two-dimensional seismic survey does not lead to
Falcone EA, Goldbogen JA, Friedlaender A, Schorr long-term displacement of harbour porpoises. Proc R
GS, Calambokidis J (2014) Acoustic and foraging Soc Lond Ser B Biol Sci 280(1771). https://doi.org/10.
behavior of a Baird’s beaked whale, Berardius bairdii, 1098/rspb.2013.2001
exposed to simulated sonar. Sci Rep 4:7031. https:// Thomsen F, Erbe C, Hawkins A, Lepper P, Popper AN,
doi.org/10.1038/srep07031 Scholik-Schlomer A, Sisneros J (2020) Introduction to
Strasser EH, Heath JA (2013) Reproductive failure of a the special issue on the effects of sound on aquatic life.
human-tolerant species, the American kestrel, is J Acoust Soc Am 148(2):934–938. https://doi.org/10.
associated with stress and human disturbance. J Appl 1121/10.0001725
Ecol 50(4):912–919. https://doi.org/10.1111/ Todd VLG, Todd IB, Gardiner JC, Morrin ECN,
1365-2664.12103 Macpherson NA, Dimarzio NA, Thomsen F (2015) A
Sun JWC, Narins PM (2005) Anthropogenic sounds dif- review of impacts of marine dredging activities on
ferentially affect amphibian call rate. Biol Conserv marine mammals. ICES (International Council for the
121(3):419–427. https://doi.org/10.1016/j.biocon. Exploration of the Seas). J Mar Sci 77(2):328–340.
2004.05.017 https://doi.org/10.1093/icesjms/fsu187
Swaddle JP, Page LC (2007) High levels of environmental Tougaard J, Carstensen J, Teilmann J (2009) Pile driving
noise erode pair preferences in zebra finches: zone of responsiveness extends beyond 20 km for
implications for noise pollution. Anim Behav 74(3): harbor porpoises (Phocoena phocoena (L.)). J Acoust
363–368. https://doi.org/10.1016/j.anbehav.2007. Soc Am 126(1):11–14. https://doi.org/10.1121/1.
01.004 3132523
Tarlow EM, Blumstein DT (2007) Evaluating methods to Turnbull SD (1994) Changes in masked thresholds of a
quantify anthropogenic stressors on wild animals. Appl harbor seal Phoca vitulina associated with angular
Anim Behav Sci 102(3):429–451. https://doi.org/10. separation of signal and noise sources. Can J Zool 72:
1016/j.applanim.2006.05.040 1863–1866. https://doi.org/10.1139/z94-253
Tavolga WN (ed) (1976) Sound reception in fishes. Tyack PL, Zimmer WMX, Moretti D, Southall BL,
Dowden, Hutchinson and Ross, Stroudsburg, PA Claridge DE, Durban JW, Clark CW, D’Amico A,
Tavolga WN, Popper AN, Fay RR (eds) (2012) Hearing DiMarzio N, Jarvis S, McCarthy E, Morrissey R,
and sound communication in fishes. Springer, Ward J, Boyd IL (2011) Beaked whales respond to
New York simulated and actual navy sonar. PLoS One 6(3):
Tennessen JB, Parks SE, Langkilde T (2014) Traffic noise e17009. https://doi.org/10.1371/journal.pone.0017009
causes physiological stress and impairs breeding Valero MD, Hancock KE, Maison SF, Liberman MC
migration behaviour in frogs. Conserv Physiol 2(1). (2018) Effects of cochlear synaptopathy on middle-
https://doi.org/10.1093/conphys/cou032 ear muscle reflexes in unanesthetized mice. Hear Res
Tennessen JB, Parks SE, Swierk L, Reinert LK, Holden 363:109–118. https://doi.org/10.1016/j.heares.2018.
WM, Rollins-Smith LA, Walsh KA, Langkilde T 03.012
(2018) Frogs adapt to physiologically costly anthropo- Vermeij MJA, Marhaver KL, Huijubers C, Nagelkerken I,
genic noise. Proc R Soc B Biol Sci 285(1891): Simpson SD (2010) Coral larvae move toward reef
20182194. https://doi.org/10.1098/rspb.2018.2194 sounds. PLoS One 5(5):e10660. https://doi.org/10.
Thode AM, Blackwell SB, Conrad AS, Kim KH, 1371/journal.pone.0010660
Marques T, Thomas L, Oedekoven CS, Harris D, Brö- Verzijden MN, Ripmeester EAP, Ohms VR,
ker K (2020) Roaring and repetition: how bowhead Snelderwaard P, Slabbekoorn H (2010) Immediate
whales adjust their call density and source level (Lom- spectral flexibility in singing chiffchaffs during experi-
bard effect) in the presence of natural and seismic mental exposure to highway noise. J Exp Biol 213(15):
airgun survey noise. J Acoust Soc Am 147(3): 2575. https://doi.org/10.1242/jeb.038299
2061–2080. https://doi.org/10.1121/10.0000935 Visser F, Curé C, Kvadsheim PH, Lam F-PA, Tyack PL,
Thomas JA, Friel B, Yegge S (2016) Restoring dueting Miller PJO (2016) Disturbance-specific social
behavior in a mated pair of buffy cheeked gibbons after responses in long-finned pilot whales, Globicephala
exposure to construction noise at a zoo through melas. Sci Rep. https://doi.org/10.1038/srep28641
playbacks of their own sounds. J Acoust Soc Am Wale MA, Simpson SD, Radford AN (2013a) Noise nega-
140(4):3415–3415. https://doi.org/10.1121/1.4970975 tively affects foraging and antipredator behaviour in
shore crabs. Anim Behav 86(1):111–118. https://doi. Williams R, Wright AJ, Ashe E, Blight LK, Bruintjes R,
org/10.1016/j.anbehav.2013.05.001 Canessa R, Clark CW, Cullis-Suzuki S, Dakin DT,
Wale MA, Simpson SD, Radford AN (2013b) Size- Erbe C, Hammond PS, Merchant ND, O’Hara PD,
dependent physiological responses of shore crabs to Purser J, Radford AN, Simpson SD, Thomas L, Wale
single and repeated playback of ship noise. Biol Lett MA (2015) Impacts of anthropogenic noise on
9(2):20121194. https://doi.org/10.1098/rsbl.2012.1194 marine life: publication patterns, new discoveries, and
Ward AI, Pietravalle S, Cowan DP, Delahay RJ (2008) future directions in research and management. Ocean
Deterrent or dinner bell? Alteration of badger activity Coast Manag 115:17–24. https://doi.org/10.1016/j.
and feeding at baited plots using ultrasonic and water ocecoaman.2015.05.021
jet devices. Appl Anim Behav Sci 115(3):221–232. Wisniewska DM, Johnson M, Teilmann J, Rojano-
https://doi.org/10.1016/j.applanim.2008.06.004 Doñate L, Shearer J, Sveegaard S, Miller Lee A,
Warnecke M, Chiu C, Engelberg J, Moss CF (2015) Siebert U, Madsen Peter T (2016) Ultra-high foraging
Active listening in a bat cocktail party: adaptive echo- rates of harbor porpoises make them vulnerable to
location and flight behaviors of big brown bats, anthropogenic disturbance. Curr Biol 26(11):
Eptesicus fuscus, foraging in a cluttered acoustic envi- 1441–1446. https://doi.org/10.1016/j.cub.2016.03.069
ronment. Brain Behav Evol 86(1):6–16. https://doi. Wollerman L, Wiley RH (2002) Background noise from a
org/10.1159/000437346 natural chorus alters female discrimination of male
Warren B, Fenton GE, Klenschi E, Windmill JFC, French calls in a neotropical frog. Anim Behav 63(1):15–22.
AS (2020) Physiological basis of noise-induced hearing https://doi.org/10.1006/anbe.2001.1885
loss in a tympanal ear. J Neurosci 40(15):3130. https:// World Health Organization (2011) Burden of disease from
doi.org/10.1523/JNEUROSCI.2279-19.2019 environmental noise: quantification of healthy life
Weir CR, Dolman SJ (2007) Comparative review of the years lost in Europe. World Health Organization,
regional marine mammal mitigation guidelines Copenhagen
implemented during industrial seismic surveys, and Wrege PH, Rowland ED, Thompson BG, Batruch N
guidance towards a worldwide standard. J Int Wildl (2010) Use of acoustic tools to reveal otherwise cryptic
Law Policy 10:1–27. https://doi.org/10.1080/ responses of forest elephants to oil exploration.
13880290701229838 Conserv Biol 24(6):1578–1585. https://doi.org/10.
Weisenberger ME, Krausman PR, Wallace MC, De Young 1111/j.1523-1739.2010.01559.x
DW, Maughan OE (1996) Effects of simulated jet Yelverton JT, Richmond DR, Hicks W, Saunders K,
aircraft noise on heart rate and behavior of desert Fletcher ER (1975) The relationship between fish size
ungulates. J Wildl Manag 60(1):52–61. https://doi. and their response to underwater blast. Lovelace Foun-
org/10.2307/3802039 dation for Medical Education and Research,
Wensveen PJ, Kvadsheim PH, Lam F-PA, von Benda- Albuquerque, NM
Beckmann AM, Sivle LD, Visser F, Curé C, Tyack Yi YZ, Sheridan JA (2019) Effects of traffic noise on
PL, Miller PJO (2017) Lack of behavioural responses vocalisations of the rhacophorid tree frog Kurixalus
of humpback whales (Megaptera novaeangliae) indi- chaseni (Anura: Rhacophoridae) in Borneo. RAFFLES
cate limited effectiveness of sonar mitigation. J Exp Bull Zool 67:77–82. https://doi.org/10.26107/RBZ-
Biol 220(22):4150–4161. https://doi.org/10.1242/jeb. 2019-0007
161232 Young BA (1997) A review of sound production and
Wever EG (1978) The reptile ear. Princeton University hearing in snakes, with a discussion of intraspecific
Press, Princeton. https://doi.org/10.2307/j.ctvbcd2f0 acoustic communication in snakes. J Pa Acad Sci
Wever EG (1985) The amphibian ear. Princeton Univer- 71(1):39–46
sity Press. https://doi.org/10.2307/j.ctt7zth8g Zhao L, Zhu B, Wang J, Brauth SE, Tang Y, Cui J (2017)
Williams R, Erbe C, Ashe E, Beerman A, Smith J (2014) Sometimes noise is beneficial: stream noise informs
Severity of killer whale behavioural responses to ship vocal communication in the little torrent frog Amolops
noise: a dose-response study. Mar Pollut Bull 79:254– torrentis. J Ethol 35(3):259–267. https://doi.org/10.
260. https://doi.org/10.1016/j.marpolbul.2013.12.004 1007/s10164-017-0515-y
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons
license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder.