HRTF Individualization: A Survey

Conference Paper · October 2018


DOI: 10.17743/aesconv.2018.978-1-942220-25-1

HRTF Individualization: A Survey
Corentin Guezenoc, Renaud Seguier

To cite this version:

Corentin Guezenoc, Renaud Seguier. HRTF Individualization: A Survey. Audio Engineering Society
Convention 145, Audio Engineering Society, Oct 2018, New York, United States.
⟨10.17743/aesconv.2018.978-1-942220-25-1⟩. ⟨hal-01890916v2⟩

HAL Id: hal-01890916


https://hal.archives-ouvertes.fr/hal-01890916v2
Submitted on 12 Mar 2020

HRTF Individualization: A Survey
Corentin Guezenoc (1, 2, ∗) and Renaud Séguier (2, †)
(1) 3D Sound Labs SAS, Rennes, France
(2) FAST Research Team, IETR (CNRS UMR 6164), CentraleSupélec, Rennes, France

The individuality of head-related transfer functions (HRTFs) is a key issue for binaural synthesis.
Although a lot of work has been done over the years to propose end-user-friendly solutions
to HRTF personalization, it remains a challenge. In this article, we review the state of the art
of that work. We classify the proposed methods, review their respective advantages and
disadvantages and, above all, methodically check whether and how the perceptual validity of the
resulting HRTFs was assessed.

∗ Electronic address: corentin.guezenoc@centralesupelec.fr
† Electronic address: renaud.seguier@centralesupelec.fr

I. INTRODUCTION

From only two audio signals perceived at the eardrums, a listener is able to perceive the spatial
characteristics of surrounding sound sources: distance, direction, spread, etc. Among the auditory
cues are the level, time of arrival and spectrum of the incoming sound. These cues are greatly
influenced by the interaction of sound with one's pinnae, head and torso, and are thus specific to
each individual. This sound/morphology interaction is typically described mathematically by the
Head-Related Transfer Functions (HRTFs) [1].

By reproducing these cues, a virtual auditory environment can be generated using regular
headphones: by convolving a given sound sample with the appropriate pair of HRTFs before presenting
it to the listener, the sound sample is perceived at the desired location. This process is called
binaural synthesis. However, most binaural synthesis engines are currently non-individual, i.e. they
use the same generic HRTF set for all users, which is known to cause discrepancies such as weak
externalization, wrong perception of elevation and front-back inversions [2]. This is due to the
fact that there is currently no easy way to provide individual HRTFs for the average customer.

Hence, a key open issue for binaural synthesis is: how to individualize HRTFs for the end-user?
Furthermore, what is the perceptual performance of such an individualized HRTF set? In this article,
we go over the different families of approaches that address this problem, namely acoustic
measurement, numerical simulation, indirect individualization based on morphological data and
indirect individualization based on perceptual feedback. Furthermore, we systematically examine
whether perceptual studies were conducted and what their results were, and synthesize this
information in Table I.

II. ACOUSTIC MEASUREMENT

The most obvious approach to HRTF individualization is acoustic measurement: one or several
loudspeakers are positioned at each direction of interest around the subject, and microphones placed
at the entrance of the subject's ear canals record the corresponding impulse responses. The
measurement is usually performed in an anechoic or semi-anechoic environment (HRTFs are, by
definition, free-field transfer functions). Topics of interest include measurement setup,
measurement time, subject-movement-related inaccuracies and, of course, perceptual performance.

A. Measurement setup

A typical state-of-the-art measurement setup [3–6] features loudspeakers on one or several vertical
arcs and a turntable on which the subject stands or sits, though a variety of other setups can be
found in the literature, such as one or several loudspeakers moving around a still subject [7]. The
setup is the main shortcoming of the method: the equipment is expensive and hardly transportable
(not at all in the case of anechoic or semi-anechoic measurements). A more detailed presentation of
measurement setups and their respective benefits and constraints can be found in Rugeles's PhD
thesis [3, p. 46-49].

B. Measurement time

Another major disadvantage of the method is the time needed to measure the HRTFs for thousands of
directions. Depending on the method, a session lasts between a few minutes and a couple of hours,
during which the subject is supposed to remain still, which is uncomfortable and difficult. The
historical approach, which consists in measuring the head-related impulse responses (HRIRs) one
direction at a time, takes up to 1 h 45 min on a modern setup such as Carpentier et al.'s
in 2014 [4]. It is however often sped up, down to 20 min according to Rugeles in 2016 [3], using
interleaved multiple sine sweeps as proposed by Majdak et al. in 2007 [8]. A promising and
increasingly popular approach is the one proposed by Enzner in 2008 [5]. Based on continuous
azimuth-wise rotation and adaptive filtering, this paradigm allows the measurement time to be
reduced considerably further: according to his work, it would take only 4 min to measure a whole
HRIR set with a spatial resolution comparable to that of Rugeles's system [3].

C. Directional imprecision due to subject movement

Measurement time exacerbates another issue: as reported in 2010 by Hirahara et al. [9], the subject
cannot stay completely still throughout the measurement session, which causes errors between the
actual direction of the measured HRTFs and the desired one. Nevertheless, recent studies from 2010
and 2017 [10, 11] seem to have successfully limited the subject's movements by providing visual
feedback. Denk et al. [11] reported their directional error to be imperceptible. However, this
directional imprecision at measurement might be an issue in several currently used databases.

D. Perceptual performance

In spite of the aforementioned drawbacks of the method, binaural synthesis with individual measured
HRTFs has been extensively compared to real free-field sound sources in terms of localization
accuracy over the last 30 years. The consensus is that they are overall equivalent [7, 10, 12–14],
although a few defects [12] were reported and attributed either to the biasing presence of dynamic
cues when comparing against real sources or to distortion in the measurements. More details can be
found in Bahu's PhD thesis [15, p. 27].

III. NUMERICAL SIMULATION

Another approach to obtaining an individual HRTF set is to numerically simulate the propagation of
acoustic waves around the subject. Its main advantages over HRTF measurement are mobility and user
comfort. Indeed, only a 3D scan of the listener is needed for individualization, which makes for a
much less tedious acquisition session than acoustic measurement. Moreover, once the 3D geometry is
acquired, the simulation procedure is completely repeatable and free of measurement noise, and thus
holds great potential for understanding inter-individual variations in HRTFs. Furthermore, a
low-cost version can be made available to the end-user by using 2D-to-3D reconstruction techniques,
which reduce the acquisition requirements to a set of consumer-grade smartphone pictures [16]. Since
the mid-2000s, the major computation techniques have been the Fast-Multipole-accelerated Boundary
Element Method (FM-BEM) [17–19] in the harmonic domain and the Finite Difference Time Domain method
(FDTD) [20, 21] in the time domain, though other methods such as the Finite Element Method (FEM)
[22] and the more exotic ray tracing [23] and Differential Pressure Synthesis (DPS) [24] have been
used since the late 1990s, 2006 and 2003, respectively. We take a particular interest here in the
accuracy of the 3D geometry used for simulation, the computing time and the perceptual relevance of
the calculated HRTFs.

A. 3D Geometry Accuracy

A major topic of interest for HRTF calculation is the accuracy of the 3D geometry passed into
simulation.

Geometry acquisition is therefore a key issue. On this matter, there seems to be a consensus that
the ear needs to be captured more accurately than the rest of the bust. Typically, a precise scan of
the ear is stitched onto a rougher scan of the head and/or torso by an operator, which takes up to
dozens of minutes of manual labour. A wide variety of scanning solutions can be found in work on
HRTF calculation: MRI, CT scan, structured light and infrared, for instance. The pinna has sometimes
been scanned from a mold. However, the literature would benefit from more studies that evaluate and
compare the various scanning methods and their impact on the resulting HRTFs.

In contrast, the matter of geometry re-meshing has been well studied. Indeed, prior to BEM
simulation, the surface mesh of the subject must be re-arranged so that it is regular enough and so
that the edge lengths are small enough with regard to the simulation's wavelength. As computing time
increases considerably with the number of mesh elements, the re-meshing resolution is a trade-off
between numerical accuracy and computing time. Although the six-to-ten-elements-per-wavelength
empirical rule has been widely used, the Acoustics Research Institute has recently made notable
contributions to the subject. By implementing various re-meshing methods and studying their effect
on the resulting HRTFs both objectively and subjectively, they not only determined the optimal
uniform re-meshing resolution in 2015 [25] but also proposed, in 2016, a progressive re-meshing
algorithm that cut the simulation time by a factor of 10 while maintaining the same HRTF accuracy
[26]. Similar work has been carried out for FDTD simulation by studying the impact of the
voxelization of a subject's volumetric geometry on the resulting HRTFs [21].
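
To make the six-to-ten-elements-per-wavelength rule more concrete, the short Python sketch below
converts it into a maximum mesh edge length for a given upper simulation frequency. It is only an
illustration: the function name max_edge_length, the speed of sound of 343 m/s and the 16 kHz upper
frequency used in the demonstration are assumptions made for this example, not values taken from
the re-meshing studies cited above.

```python
def max_edge_length(f_max_hz, elements_per_wavelength=6, speed_of_sound=343.0):
    """Longest mesh edge (in metres) allowed by an elements-per-wavelength rule.

    speed_of_sound defaults to 343 m/s (air at about 20 degrees C), an assumed value.
    """
    if f_max_hz <= 0 or elements_per_wavelength <= 0:
        raise ValueError("frequency and elements per wavelength must be positive")
    # The shortest wavelength of the simulation occurs at the highest frequency.
    shortest_wavelength = speed_of_sound / f_max_hz
    return shortest_wavelength / elements_per_wavelength


if __name__ == "__main__":
    # Hypothetical full-band simulation up to 16 kHz, for both ends of the rule.
    for n in (6, 10):
        edge_mm = 1000.0 * max_edge_length(16000.0, n)
        print(f"{n} elements per wavelength at 16 kHz: edges of at most {edge_mm:.2f} mm")
```

Under these assumptions the bound is roughly 3.6 mm for six elements per wavelength and 2.1 mm for
ten, which gives an idea of why a full-band simulation needs a fine mesh and why the number of
elements, and hence the computing time, grows quickly with the upper frequency.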

B. Computing time

Computing time used to be the main drawback of HRTF calculation: HRTFs could not be computed over
the whole audible frequency range until 2007 [20, 22]. However, it has been reduced to a few hours
thanks to the constant increase in available computing power, to the democratization of distributed
computing on clusters over the last decade and to the introduction of FM-BEM in 2007 [17].

C. Perceptual Performance

Various objective comparisons reported computed HRTF sets to be overall similar to acoustic
measurements [17, 18, 27], although one of them [18] reported some alterations of spectral features
known to be cues for elevation perception. On a subjective level, among the studies where individual
HRTF sets were simulated for human subjects over the whole audible range (i.e. up to at least
16 kHz), two provided perceptual evaluations [6, 25]. Mokhtari et al. in 2008 [6] and Ziegelwanger
et al. in 2015 [25] performed localization tests with measured HRTFs as reference, which showed good
results. However, these studies were carried out on very few subjects: 2 and 3 respectively.

IV. INDIRECT INDIVIDUALIZATION BASED ON ANTHROPOMETRIC DATA

Though more convenient than acoustic measurement, HRTF calculation still requires specialized
equipment and non-negligible mesh processing and computing time. Hence, based on the fact that HRTF
sets depend heavily on morphology, many studies have explored the idea of a low-cost HRTF
individualization methodology based on anthropometric measurements. We distinguish three
sub-categories: adaptation, selection and regression.

A. Adaptation

One way to proceed is to take a non-individual set and to adapt it, i.e. to alter it in order to
make it more suitable for the subject at hand. Based on the idea that the most prominent
morphological difference between two individuals is size, Middlebrooks and colleagues [28] proposed
in 1999 to adapt a generic HRTF set by means of frequency scaling. In 2000 [29], they reported that
the scaling factor could be estimated from a combination of head and pinna measurements through
linear regression. In both cases, perceptual evaluations performed on 9 to 11 subjects reported
localization performance to be better than with no individualization but worse than with the
subject's own measured HRTF set. Later on, in 2005 and 2008, other researchers [30, 31] combined
frequency scaling with a rotation in space of the HRTF set, which amounts to a head tilt, in order
to further improve the adaptation's results. However, neither of these studies included any
perceptual evaluation. In particular, it was impossible for Maki et al. [30] to do so as the HRTFs
they studied were those of gerbils.

B. Selection

Complementary to adaptation, one can select an HRTF set based on anthropometric measurements in a
database that contains both kinds of data. For instance, using the CIPIC database [32], Zotkin et
al. [33] implemented in 2002 a coarse nearest-neighbor approach that used only 7 morphological
parameters measured on a picture of the pinna, and showed some improvement in localization
performance compared to no individualization (average gain of 15% in elevation score). More
recently, in 2017, Yao et al. [34] proposed a more exotic method to select an HRTF set from a
database, using a neural network trained to predict a perceptual score (from 1 to 5) from
anthropometric measurements. However, it is difficult to compare the results of their perceptual
study with others, as it only used their own perceptual score as an indicator.

C. Regression

Going further, another approach to low-cost HRTF individualization based on morphology is to
estimate an HRTF set from anthropometric measurements of the listener. To this end, multiple linear
regression has been widely used. In such work, the HRTF sets have often, since the early 2000s, been
compressed using statistical modeling such as Principal Component Analysis (PCA) [35, 36] and
Independent Component Analysis (ICA) [37]. Others, such as Bilinski et al. in 2014 [38], have
instead chosen to predict an HRTF set as a linear combination of HRTF sets, using the coefficients
of a model of anthropometric parameters. Surprisingly, among the studies reviewed for this article,
only that of Hu et al. [36] featured a perceptual evaluation and, while the results were
encouraging, they did not put elevation perception to the test. Since the late 2000s, nonlinear
regression models have been used as well, typically relying on neural networks coupled with various
data compression techniques including PCA [39], High-Order SVD [40] and Isomap [41]. However, none
of these studies carried out any perceptual evaluation of the estimated HRTF sets.

V. INDIRECT INDIVIDUALIZATION BASED ON PERCEPTUAL FEEDBACK

While methods for indirect individualization based on morphological data are practical for the
end-user and provide
individualization, they can be subject to morphological measurement errors. Indeed, the
morphological data acquisition is done by the user: measurements as well as pictures can be taken
incorrectly. As the subjective perception of spatialization is the ultimate goal, an alternative is
to propose a low-cost individualization method that is based on the listener's feedback. Similarly
to section IV, we distinguish two categories: selection and adaptation.

A. Selection

A natural strategy that has been well explored in the literature since the late 1990s is to help
the listener select the best non-individual HRTF set from a database [42, 43]. All studies reviewed
for this article evaluated the selected HRTF set perceptually, with results indicating that the
selected set was better than a non-individual one but worse than the subject's own set. However, it
should be noted that Seeber et al. [42] did not put elevation perception to the test in their study.
Reported tuning times ranged from 15 min [42] to more than 35 min [43]. In addition, in order to
improve the relevance and reduce the duration of the tuning procedure, it has been proposed to
cluster the database a priori based on either objective [44] or perceptual [43] criteria.

B. Adaptation

A non-individual HRTF set, sometimes chosen through a prior selection procedure, can be adapted
based on perceptual feedback from the listener. We distinguish three ways to adapt an HRTF set:
frequency scaling, filter-design-based tuning and statistical-model-based tuning.

1. Frequency scaling

As mentioned in IV A, Middlebrooks et al. explored in 1999 [28] the idea of adapting a generic HRTF
set through frequency scaling and reported in the companion study [45] an improvement in
localization performance compared to no scaling. In their 2000 study [29], they reported that the
scaling factor could be tuned by the listener through a 20-min tuning session, with localization
performance similar to that of previous methods for obtaining the scaling factor (minimization of a
spectrum-based metric and anthropometric measurements). This tuning method has the advantage of
offering one single tuning lever for the whole HRTF set and of bringing some perceptual improvement.

2. Filter-design-based tuning

Some studies [46, 47], in 1998 and 2000 respectively, proposed to rely on the tuning of filters to
adapt a given HRTF set. We have distinguished two directions. In the first [46], direction
dependence was not handled, which meant the adaptation was rather rough as it is basically an
equalization of the whole HRTF set. In the second [47], the listener-driven filter design had to be
done for each direction separately, and thus the number of parameters to tune for a whole set was
too high for the tuning procedure to be completed in a reasonable amount of time. Runkle et al. [47]
did not present any perceptual evaluation of their solution, while Tan and Gan [46] presented some
encouraging perceptual results but did not evaluate criteria other than the ones used for tuning,
i.e. front-back reversal and sense of elevation.

3. Statistical-model-based tuning

Alternatively, many studies have proposed to rely on a statistical model, with the goal of reducing
the number of tuning parameters while still being able to cover most of the database's HRTF space.

The main statistical modeling method used in the literature is Principal Component Analysis (PCA),
for its ease of interpretation as well as for its low implementation and computing complexity. Most
studies [48–50], in 2008, 2008 and 2015 respectively, proposed a procedure that allowed the tuning
of an HRTF in one direction at a time. The number of parameters was reduced to 3 to 5 principal
component (PC) weights per direction, making it possible for the listener to tune each direction in
a reasonable amount of time. These studies all reported a localization performance improvement over
non-individual HRTFs, although the number of subjects was rather small for [48] and [49] (3 and 4
respectively) and elevation perception was not evaluated in [50]. However, these tuning procedures
had to be performed direction by direction and thus did not allow a whole HRTF set to be tuned in a
reasonable amount of time (only 9 to 10 directions were tuned). Hölzl, in his 2014 Master's thesis
[51], proposed a solution to that flaw by applying Spherical Harmonics (SH) to the
direction-dependent PC weights. However, no subjective evaluation of this method was proposed and,
even though the overall problem dimension was reduced to 5 PC weights × 9 SH coefficients = 45, this
is still a high number of parameters to tune. Moreover, the combination of spherical harmonics
coefficients and principal component weights is rather counter-intuitive and hard to comprehend for
the end-user.

In 2017, Yamamoto and Igarashi [52] proposed a state-of-the-art method that relied on the modeling
of HRTF sets by means of a variational autoencoder neural network. The tuning procedure consisted of
a gradient descent optimization of the network's weights in which the cost was determined at every
iteration by the user's rating of two HRTF sets presented by the algorithm. They conducted a
preference test in which the participants graded HRTF sets pair by pair in a double-blind manner.
The baseline condition was a best-fit non-individual HRTF set selected from the database in a
previous preference test
procedure. The outcome was a significant improvement over an optimal non-individual HRTF set for 18
participants out of 20, although the nonstandard nature of the perceptual testing methodology makes
it hard to compare those results with those of other studies.

VI. DISCUSSION

As of today, acoustic measurement remains the reference method for acquiring individual HRTFs thanks
to substantial perceptual assessment against real sound sources [10, 12, 13], as summarized in
Table I. As such, it has been used as ground truth by all other families of HRTF individualization
methods. Nevertheless, in spite of recent major advances in terms of acquisition time, it is
impractical for consumer-grade applications because of the cost of the measurement equipment and
the difficulty of transporting it.

On the other hand, in spite of the professional-grade scanning equipment and the few hours of
processing needed, numerical simulation allows the data acquisition step to be mobile and more
comfortable for the user. Furthermore, the scanning equipment may be reduced to a simple smartphone
for consumer-grade applications by relying on 2D-to-3D reconstruction technologies [16]. In
addition, simulation is a powerful tool for investigating and understanding the link between
morphology and HRTFs. Major technical limitations such as computing time, 3D geometry acquisition
and re-meshing have mostly been overcome. However, although objective [17, 18, 27] and subjective
[6, 25] evaluations showed rather promising results, perceptual studies that compared calculated
HRTFs with measured ones were surprisingly rare and featured only 2 to 3 subjects (cf. Table I). In
addition, some objective observations underlined the possibility of perceptual defects in the
produced HRTFs. Hence, despite a lot of work on HRTF simulation over thirty years, and in particular
since the first full-band calculations ten years ago, computed HRTFs would merit more extensive
perceptual studies, both in number of studies and of participants. Possible causes for
simulation-related problems include inaccurate geometry acquisition (depending on the scanning
process) and/or incorrect modeling of the acoustics problem.

With the goal of developing more user-friendly solutions, the idea of individualizing HRTFs from
simpler morphological data has been widely explored in the literature. This has the advantage of
relying on little equipment and on an easy data acquisition process, usually a smartphone and the
shooting of one or a few pictures. However, as reported in Table I, the perceptual results are
mixed. On the one hand, the simple methods, namely selection and adaptation by frequency scaling
and/or set rotation, have demonstrated some perceptual improvement compared to no individualization,
in studies that featured 6 to 11 participants [29, 34]. On the other hand, we cannot conclude on the
quality of the HRTFs produced by more complex methods, such as linear and nonlinear regression
between anthropometric measurements and HRTF sets. Indeed, in the latter category we found only a
single perceptual study [36], and that one did not test elevation perception. In other words, there
is a lack of perceptual results for statistics-based methods, which may well indicate that the
databases are not large enough: all the studies reviewed here used similarly sized databases of 43
to 50 subjects. Thus, a key to their improvement may well reside in larger databases. However, to
the best of our knowledge, the matter of their ideal size remains an open one. More generally for
the anthropometrics-based approach, errors may also come from the fact that the measurement step is
handed over to the end-user and from the unclear relevance of the anthropometric parameters chosen
to predict HRTFs.

Alternatively, researchers have investigated the possibility of individualizing an HRTF set based
on the listener's subjective feedback. This approach has the double advantage of including the
listener's own perception in the individualization process while avoiding errors related to data
acquisition. Accordingly, the vast majority of such studies provide subjective evaluations (cf.
Table I). On the one hand, the simple techniques, which include selection and adaptation by
frequency scaling, have shown perceptual improvement over no individualization in studies that
gathered 7 to 11 listeners [29, 42]. On the other hand, the more complex methods, i.e. the
statistical-model-based ones, have been widely used in order to reduce the number of tuning
parameters in the most relevant manner. To this end, PCA models have been used in the majority of
cases [48–50]. While the models that were used needed to be tuned direction by direction, which made
the tuning of a whole HRTF set impractical, they have shown encouraging results in their
localization tests, though some [48, 49] featured only 3 to 4 subjects and the other [50] only
included azimuthal directions. As for Yamamoto and Igarashi [52], the result of their 20-listener
preference test was altogether promising, but it would merit a more standard subjective evaluation
in order to be comparable with other studies. For further advances, statistical-model-based
approaches, as in the case of anthropometry-based indirect methods, may very well benefit from
larger databases. Indeed, it would then be particularly interesting to attempt PCA modeling of whole
HRTF sets and to use its weights as tuning parameters. Yamamoto and Igarashi's [52] method seems
promising as well but would benefit from a more conventional perceptual evaluation methodology such
as localization testing.

VII. CONCLUSION

In this paper, we reviewed the state of the art of what has been done so far to tackle the problem
of HRTF individualization for the end-user. We distinguished four families of methods, namely
acoustic measurement, numerical simulation, indirect individualization from morphology and indirect
individualization from perceptual
feedback. We summarized their specific advantages and disadvantages and took stock of the current
advances while identifying some leads for improvement. In particular, we took a special interest in
the existence and outcome of related perceptual studies. Overall, significant perceptual results
are rather scarce, though not for all approaches (cf. Table I), which tends to indicate that a lot
of work remains to be done to reach an efficient end-user-friendly solution to HRTF
individualization.

Approach [references]                       Eval. type      Baseline   Nsubj  τperc (%)  Results
------------------------------------------  --------------  ---------  -----  ---------  ----------------------------------------
Acoustic measurement [7, 10, 12–14]         Localization    RS         3-10   63         Good
                                            Preference      RS         6
Numerical simulation [6, 25]                Localization    IAC        3      25         Promising but would merit more studies
                                                                                         & subjects
Indirect individualization from anthropometric data
  Selection, frequency-scaling-based        Localization    NIAC       6-11   67         Better than non-individual
  adaptation [29, 34]
  Statistical-model-based regression [36]   Localization,   NIAC       5      10         Poor: few studies and no elevation testing
                                            no elevation
Indirect individualization from perceptual feedback
  Selection, frequency-scaling-based        Localization    NIAC       7-11   100        Better than non-individual
  adaptation [29, 42, 43]                   Preference      NIAC       45
  Filter-design-based adaptation,           Localization    IAC, NIAC  3-6    80         Promising but would merit more standard
  statistical-model-based adaptation        Preference      BFAC       20                studies & more subjects
  [48–50, 52]

TABLE I: Overview of perceptual evaluations for the major HRTF individualization approaches.
The columns describe, from left to right: the type of evaluation (Eval. type), the condition(s)
used as ground truth (Baseline), the number of participants (Nsubj), the proportion of studies that
carried out a perceptual evaluation (τperc) and the results of the perceptual studies.
The acronyms RS, IAC, NIAC and BFAC stand respectively for Real sound Sources, stimuli binauralized
using Individual Acoustic HRTFs, stimuli binauralized using Non-Individual Acoustic HRTFs and
stimuli binauralized using a Best Fit non-individual Acoustic HRTF set selected from the database
in a previous preference test procedure.
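
As an illustration of the simplest adaptation technique listed in Table I, frequency scaling
(sections IV A and V B 1), the Python sketch below shifts the spectral features of a magnitude
response by a single factor alpha. It is a minimal sketch under stated assumptions and not the
procedure used by Middlebrooks et al.: the function name scale_frequency, the uniformly spaced
frequency bins starting at 0 Hz, the linear interpolation and the clamping at the upper band edge
are all choices made for this example.

```python
def scale_frequency(magnitudes_db, alpha):
    """Resample a magnitude response so that a feature at frequency f moves to alpha * f.

    magnitudes_db: magnitudes (in dB) on uniformly spaced bins starting at 0 Hz (assumed layout).
    alpha: scaling factor; values above 1 shift features towards higher frequencies.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = len(magnitudes_db)
    scaled = []
    for k in range(n):
        source = k / alpha        # fractional bin of the original response to read from
        low = int(source)
        if low >= n - 1:
            # Beyond the last bin: hold the last value (a simplifying assumption).
            scaled.append(magnitudes_db[-1])
        else:
            frac = source - low   # linear interpolation between neighbouring bins
            scaled.append((1.0 - frac) * magnitudes_db[low] + frac * magnitudes_db[low + 1])
    return scaled


if __name__ == "__main__":
    # Hypothetical response: flat, with a single 20 dB notch at bin 4.
    response = [0.0] * 16
    response[4] = -20.0
    shifted = scale_frequency(response, 1.5)
    print("notch bin before:", response.index(min(response)))
    print("notch bin after: ", shifted.index(min(shifted)))
```

A single parameter acts on the whole response here, which is in line with the "one single tuning
lever" noted for this family of methods; applying the same alpha to every direction of a set would
scale the whole set.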

[1] H. Møller, Applied Acoustics 36, 171 (1992), URL http://www.sciencedirect.com/science/article/pii/0003682X9290046U.
[2] E. M. Wenzel, M. Arruda, D. J. Kistler, and F. L. Wightman, JASA 94, 111 (1993), URL http://asa.scitation.org/doi/10.1121/1.407089.
[3] F. Rugeles Ospina, PhD Thesis, Universite Pierre et Marie Curie / Orange Labs (2016), URL https://hal.archives-ouvertes.fr/tel-01537182.
[4] T. Carpentier, H. Bahu, M. Noisternig, and O. Warusfel, in 7th Forum Acusticum (EAA) (2014), URL https://hal.archives-ouvertes.fr/hal-01247583/.
[5] G. Enzner, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2008), pp. 393–396.
[6] P. Mokhtari, R. Nishimura, and H. Takemoto, in Proceedings of the 14th International Conference on Auditory Display (Paris, France, 2008).
[7] E. H. A. Langendijk and A. W. Bronkhorst, JASA 107, 528 (1999), URL http://asa.scitation.org/doi/abs/10.1121/1.428321.
[8] P. Majdak, P. Balazs, and B. Laback, JAES 55, 623 (2007).
[9] T. Hirahara, H. Sagara, I. Toshima, and M. Otani, Acoustical Science and Technology 31, 165 (2010), URL http://joi.jlc.jst.go.jp/JST.JSTAGE/ast/31.165?from=CrossRef.
[10] P. Majdak, M. J. Goupell, and B. Laback, Attention, Perception, & Psychophysics 72, 454 (2010), URL https://link.springer.com/article/10.3758/APP.72.2.454.
[11] F. Denk, J. Heeren, S. D. Ewert, B. Kollmeier, and S. M. Ernst, in DAGA (Kiel, 2017).
[12] F. L. Wightman and D. J. Kistler, JASA 85, 868 (1989).
[13] H. Møller, M. F. Sørensen, C. B. Jensen, and D. Ham-
mershøi, JAES 44, 451 (1996), URL http://www.aes. Acoustics, 2001 IEEE Workshop on the (IEEE, 2001), pp.
org/e-lib/browse.cfm?elib=7897. 99–102.
[14] R. L. Martin, K. I. McAnally, and M. A. Senova, JAES [33] D. N. Zotkin, R. Duraiswami, and L. S. Davis (Ky-
49, 14 (2001), URL http://www.aes.org/e-lib/browse. oto, Japan, 2002), URL https://smartech.gatech.edu/
cfm?elib=10204. handle/1853/51348.
[15] H. Bahu, Ph.D. thesis, Universite Pierre et Marie [34] S.-N. Yao, T. Collins, and C. Liang, Archives of Acoustics
Curie / IRCAM (2016), URL http://www.theses.fr/ 42, 365 (2017).
2016PA066452. [35] C. Jin, P. Leong, J. Leung, A. Corderoy, and S. Carlile,
[16] S. Kaneko, T. Suenaga, and S. Sekine, in AES In- in Proceedings of the First IEEE Pacific-Rim Conference
ternational Conference on Audio for Virtual and Aug- on Multimedia (2000), pp. 235–238.
mented Reality (Audio Engineering Society, 2016), URL [36] H. Hu, L. Zhou, J. Zhang, H. Ma, and Z. Wu, in 2006
http://www.aes.org/e-lib/browse.cfm?elib=18509. International Conference on Computational Intelligence
[17] N. A. Gumerov, R. Duraiswami, and D. N. Zotkin, in and Security (2006), vol. 2, pp. 1829–1832, URL http:
IEEE International Conference on Acoustics, Speech and //sci-hub.la/10.1109/ICCIAS.2006.295380.
Signal Processing (ICASSP) (2007), vol. 1, pp. I–165. [37] Q. H. Huang and Q. L. Zhuang, Electronics Letters 45,
[18] W. Kreuzer, P. Majdak, and Z. Chen, The Jour- 1002 (2009).
nal of the Acoustical Society of America 126, [38] P. Bilinski, J. Ahrens, M. R. Thomas, I. J. Tashev, and
1280 (2009), URL http://www.ncbi.nlm.nih.gov/pmc/ J. C. Platt, in IEEE International Conference on Acous-
articles/PMC3061451/. tics, Speech and Signal Processing (ICASSP) (2014), pp.
[19] S. Ghorbal, T. Auclair, C. Soladié, and R. Séguier, in Pro- 4468–4472.
ceedings of the 20th International Conference on Digital [39] H. Hu, L. Zhou, H. Ma, and Z. Wu, Applied Acous-
Audio Effects (DAFx-17) (Edinburgh, 2017). tics 69, 163 (2008), URL http://linkinghub.elsevier.
[20] P. Mokhtari, H. Takemoto, R. Nishimura, and H. Kato, com/retrieve/pii/S0003682X07000965.
in Audio Engineering Society Convention 123 (Audio [40] L. Li and Q. Huang, in IEEE International Conference
Engineering Society, 2007), URL http://www.aes.org/ on Acoustics, Speech and Signal Processing (ICASSP)
e-lib/online/browse.cfm?elib=14298. (2013), pp. 3707–3710, URL http://ieeexplore.ieee.
[21] S. Prepelită, M. Geronazzo, F. Avanzini, and L. Savioja, org/abstract/document/6638350/.
The Journal of the Acoustical Society of America [41] F. Grijalva, L. Martini, S. Goldenstein, and D. Florencio,
139, 2489 (2016), URL http://asa.scitation.org/ in IEEE International Conference on Acoustics, Speech
doi/full/10.1121/1.4947546. and Signal Processing (ICASSP) (2014), pp. 4473–4477.
[22] T. Huttunen, E. T. Seppälä, O. Kirkeby, A. Kärkkäinen, [42] B. U. Seeber and H. Fastl, in International Conference
and L. Kärkkäinen, J. Comp. Acous. 15, 429 on Auditory Display (ICAD) (Boston, MA, USA, 2003).
(2007), URL http://www.worldscientific.com/doi/ [43] B. F. Katz and G. Parseihian, The Journal of the Acous-
abs/10.1142/S0218396X07003469. tical Society of America 131, EL99 (2012), URL http:
[23] N. Röber, S. Andres, and M. Masuch (2006). //asa.scitation.org/doi/abs/10.1121/1.3672641.
[24] Y. Tao, A. I. Tew, and S. J. Porter, JAES 51, 647 [44] B. Xie, X. Zhong, and N. He, Applied Acoustics 94, 1
(2003), URL http://www.aes.org/e-lib/browse.cfm? (2015).
elib=12212. [45] J. C. Middlebrooks, The Journal of the Acoustical So-
[25] H. Ziegelwanger, P. Majdak, and W. Kreuzer, The Journal ciety of America 106, 1493 (1999), URL http://asa.
of the Acoustical Society of America 138, 208 (2015), URL scitation.org/doi/abs/10.1121/1.427147.
http://asa.scitation.org/doi/10.1121/1.4922518. [46] C.-J. Tan and W.-S. Gan, Electronics letters 34, 2387
[26] H. Ziegelwanger, W. Kreuzer, and P. Maj- (1998), URL http://ieeexplore.ieee.org/abstract/
dak, Applied Acoustics 114, 99 (2016), URL document/744001/.
http://www.sciencedirect.com/science/article/ [47] P. Runkle, A. Yendiki, and G. H. Wakefield, in Interna-
pii/S0003682X1630192X. tional Conference on Auditory Display (ICAD) (Georgia
[27] H. Ziegelwanger, A. Reichinger, and P. Majdak, in Inter- Institute of Technology, 2000), URL https://smartech.
national Congress on Acoustics (ICA) (Acoustical Society gatech.edu/handle/1853/50665.
of America, 2013), vol. 19. [48] K. H. Shin and Y. Park, IEICE Transactions on Funda-
[28] J. C. Middlebrooks, The Journal of the Acoustical So- mentals of Electronics, Communications and Computer
ciety of America 106, 1480 (1999), URL http://asa. Sciences 91, 345 (2008).
scitation.org/doi/abs/10.1121/1.427176. [49] S. Hwang, Y. Park, and Y.-s. Park, Acta Acustica
[29] J. C. Middlebrooks, E. A. Macpherson, and Z. A. Onsan, united with Acustica 94, 965 (2008), URL http://
The Journal of the Acoustical Society of America 108, openurl.ingenta.com/content/xref?genre=article&
3088 (2000). issn=1610-1928&volume=94&issue=6&spage=965.
[30] K. Maki and S. Furukawa, The Journal of the Acoustical [50] K. J. Fink and L. Ray, Applied Acoustics 87,
Society of America 118, 2392 (2005). 162 (2015), URL http://linkinghub.elsevier.com/
[31] P. Guillon, R. Nicol, and L. Simon, in Audio Engi- retrieve/pii/S0003682X14001753.
neering Society Convention 125 (Audio Engineering So- [51] J. Hölzl, Master Thesis, Graz University of Technology
ciety, 2008), URL http://www.aes.org/e-lib/browse. (2014).
cfm?elib=14761. [52] K. Yamamoto and T. Igarashi, ACM Transactions
[32] V. R. Algazi, R. O. Duda, D. M. Thompson, and C. Aven- on Graphics 36, 1 (2017), URL http://dl.acm.org/
dano, in Applications of Signal Processing to Audio and citation.cfm?doid=3130800.3130838.
