Abstract
Machine translation can be carried out via transfer between the deep syntactic structures of the source and target languages. In this paper, we examine the core parameters of such a system in the context of a statistical approach in which the translation model, based on deep syntax, is learned automatically from parsed bilingual corpora. We provide a detailed empirical investigation into the effects of these parameters on translation quality for the German-English language pair: the method of word alignment, limits on the size of transfer rules, transfer decoder beam size, n-best target input representations for generation, and deterministic versus non-deterministic generation. Results show that choosing a suitable method of word alignment is vital for this approach, and that generating n-best structures involves a significant trade-off between gains in BLEU score and increased overall translation time.