Improved Phrase Translation Modeling Using MAP Adaptation

  • Conference paper
Text, Speech and Dialogue (TSD 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7499)

Abstract

In this paper, we explore several methods of improving the estimation of translation model probabilities for phrase-based statistical machine translation given in-domain data sparsity. We introduce a hierarchical variant of maximum a posteriori (MAP) adaptation for domain adaptation with an arbitrary number of out-of-domain models. We note that domain adaptation can have a smoothing effect, and we explore the interaction between smoothing and the incorporation of out-of-domain data. We find that the relative contributions of smoothing and interpolation depend on the datasets used. For both the IWSLT 2011 and WMT 2011 English-French datasets, the MAP adaptation method we present improves on a baseline system by 1.5+ BLEU points.
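
The abstract does not state the estimator itself, so the following is only a minimal sketch of how MAP adaptation of phrase translation probabilities might look. It assumes the common count-based form, in which an out-of-domain probability acts as a prior weighted by a hyperparameter `tau`: p(t|s) = (c_in(s,t) + tau * p_out(t|s)) / (c_in(s) + tau). The chaining used for the "hierarchical" case (adapting from the most distant corpus toward the in-domain corpus) is likewise an assumption, and the function names, `tau`, and toy counts are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def relative_frequency(counts):
    """Unsmoothed estimate p(tgt | src) = c(src, tgt) / c(src)."""
    totals = defaultdict(float)
    for (src, _tgt), c in counts.items():
        totals[src] += c
    return {(src, tgt): c / totals[src] for (src, tgt), c in counts.items()}


def map_adapt(in_counts, prior, tau):
    """Count-based MAP estimate with `prior` as the out-of-domain model.

    p(tgt | src) = (c_in(src, tgt) + tau * prior(tgt | src)) / (c_in(src) + tau)
    `tau` (assumed > 0) controls how strongly the prior is trusted.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    totals = defaultdict(float)
    for (src, _tgt), c in in_counts.items():
        totals[src] += c
    adapted = {}
    # Cover phrase pairs seen in either the in-domain counts or the prior.
    for src, tgt in set(in_counts) | set(prior):
        num = in_counts.get((src, tgt), 0.0) + tau * prior.get((src, tgt), 0.0)
        adapted[(src, tgt)] = num / (totals.get(src, 0.0) + tau)
    return adapted


def hierarchical_map(count_tables, tau):
    """Assumed chaining: tables are ordered from most out-of-domain to
    in-domain, and each adapted model becomes the prior for the next."""
    model = relative_frequency(count_tables[0])
    for counts in count_tables[1:]:
        model = map_adapt(counts, model, tau)
    return model


# Toy phrase-pair counts (hypothetical data).
out_of_domain = {("house", "maison"): 1, ("house", "chambre"): 1}
in_domain = {("house", "maison"): 3, ("house", "domicile"): 1}

model = hierarchical_map([out_of_domain, in_domain], tau=4.0)
for pair in sorted(model):
    print(pair, round(model[pair], 3))
```

In this toy case the pair seen only out of domain ("house", "chambre") keeps some probability mass, which illustrates the smoothing effect the abstract attributes to domain adaptation: larger `tau` pulls the estimate toward the out-of-domain prior, smaller `tau` toward the in-domain relative frequencies.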

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aminzadeh, A.R., Drexler, J., Anderson, T., Shen, W. (2012). Improved Phrase Translation Modeling Using MAP Adaptation. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_48

  • DOI: https://doi.org/10.1007/978-3-642-32790-2_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32789-6

  • Online ISBN: 978-3-642-32790-2

  • eBook Packages: Computer Science, Computer Science (R0)
