Abstract
This paper investigates whether integrating different types of word and semantic features improves the automatic identification of themes in real-life telephone conversations from a customer care service (CCS). Three feature types are considered: all the words of the application vocabulary; the probabilities, obtained with latent Dirichlet allocation (LDA), of selected discriminative words; and semantic features, obtained with limited human supervision, from words and patterns expressing entities and relations of the application ontology. A deep neural network (DNN) is proposed for integrating these features. Experimental results on manual and automatic conversation transcriptions show that the integration contributes effectively to theme identification. The results also show how to automatically select a large subset of the test corpus with high precision and recall, making it possible to automatically obtain theme mention proportions in different time periods.
Mohamed Bouallegue thanks the ANR agency for funding through the CHIST-ERA ERA-Net JOKER project.
Carole Lailler thanks the European Commission for funding through the EUMSSI Project, number 611057, call FP7-ICT-2013-10.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Estève, Y. et al. (2015). Integration of Word and Semantic Features for Theme Identification in Telephone Conversations. In: Lee, G., Kim, H., Jeong, M., Kim, JH. (eds) Natural Language Dialog Systems and Intelligent Assistants. Springer, Cham. https://doi.org/10.1007/978-3-319-19291-8_21
DOI: https://doi.org/10.1007/978-3-319-19291-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19290-1
Online ISBN: 978-3-319-19291-8
eBook Packages: Computer Science, Computer Science (R0)