The KIT Translation Systems for IWSLT 2012

Mohammed Mediani, Yuqi Zhang, Thanh Le Ha, Jan Niehues, Eunah Cho, Teresa Herrmann, Rainer Kärgel, Alexander Waibel

Research output: Contribution to conference › Paper › peer-review

3 Citations (Scopus)

Abstract

In this paper, we present the KIT systems participating in the English-French TED translation tasks of the IWSLT 2012 machine translation evaluation. We also present several additional experiments on the English-German, English-Chinese and English-Arabic translation pairs. Our system is a phrase-based statistical machine translation system, extended with several additional models that have been shown to improve translation quality. These include part-of-speech (POS)-based reordering, translation and language model adaptation, a bilingual language model, a word-cluster language model, discriminative word lexica (DWL), and a continuous space language model. In addition, the system incorporates dedicated preprocessing and postprocessing steps. In preprocessing, the noisy corpora are filtered by removing noisy sentence pairs; in postprocessing, the agreement between a noun and its surrounding words in the French translation is corrected based on POS tags with morphological information. The system handles speech transcription input by removing case information and all punctuation except periods from the text translation model.
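The last step of the abstract, adapting text to resemble speech transcripts, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `normalize_for_asr` is hypothetical, and the sketch assumes whitespace-tokenized input and ASCII punctuation only.

```python
import string

# Translation table that deletes every ASCII punctuation character except
# the period (assumption: non-ASCII punctuation is out of scope here).
_DROP = str.maketrans("", "", string.punctuation.replace(".", ""))


def normalize_for_asr(sentence: str) -> str:
    """Lowercase the text and strip all punctuation other than periods."""
    stripped = sentence.lower().translate(_DROP)
    # Collapse the extra whitespace left behind by removed punctuation tokens.
    return " ".join(stripped.split())


if __name__ == "__main__":
    print(normalize_for_asr("Hello , World ! This is it ."))
```

Note that this sketch also deletes apostrophes and hyphens, since both count as ASCII punctuation; a production pipeline would likely handle them separately.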

Original language: English
Pages: 38-45
Number of pages: 8
Publication status: Published - 2012
Externally published: Yes
Event: 9th International Workshop on Spoken Language Translation, IWSLT 2012 - Hong Kong, China
Duration: Dec 6 2012 to Dec 7 2012

Conference

Conference: 9th International Workshop on Spoken Language Translation, IWSLT 2012
Country/Territory: China
City: Hong Kong
Period: 12/6/12 to 12/7/12

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Linguistics and Language
