We present the main features of an MRS component added to a Norwegian computational LFG grammar.
The Norwegian cooperative (Oslo-Bergen-Trondheim) research project LOGON (Oepen et al. 2004) is centered around the development of a machine translation demonstrator, translating Norwegian tourism-related texts into English. The demonstrator has a classical rule-based backbone interacting with stochastic methods. It translates not only between two languages but also between two theoretical frameworks, since the Norwegian grammar used for analysis is written in LFG on the XLE platform as part of the ParGram project (Butt et al. 2002), while the English grammar used for generation is written in HPSG on the LKB platform (Flickinger 2002). Both platforms are integrated as modules in the system.
Translation is based on transfer over representations in Minimal Recursion Semantics (MRS). MRS representations are flat structures consisting of sets of Elementary Predications (EPs) in which relations among EPs are expressed by means of variable-sharing rather than embedding (Copestake et al. 2003). Furthermore, MRS representations allow underspecification of quantifier scope, so that a scopally ambiguous sentence can be assigned one underspecified MRS representation rather than a set of alternative scopally specified representations. Both these features are useful in a computational setting.
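To make the format concrete, the sketch below (in Python, with purely illustrative predicate names, handles and encoding that are not the LOGON interface format) shows what a flat, scope-underspecified MRS might look like for the scopally ambiguous sentence "Every dog chases a cat".

from dataclasses import dataclass, field

@dataclass
class EP:                      # Elementary Predication
    label: str                 # handle identifying the EP's scope position
    relation: str              # predicate name
    args: dict = field(default_factory=dict)

# The flat EP "bag": relations among EPs are expressed through the shared
# variables x1, x2 and e1, not through syntactic embedding.
eps = [
    EP("h1", "_every_q", {"ARG0": "x1", "RSTR": "h2", "BODY": "h3"}),
    EP("h4", "_dog_n",   {"ARG0": "x1"}),
    EP("h5", "_chase_v", {"ARG0": "e1", "ARG1": "x1", "ARG2": "x2"}),
    EP("h6", "_a_q",     {"ARG0": "x2", "RSTR": "h7", "BODY": "h8"}),
    EP("h9", "_cat_n",   {"ARG0": "x2"}),
]

# Handle constraints ("qeq": equality modulo quantifiers); h0 is the global top.
# Because the BODY holes h3 and h8 remain mutually unconstrained, this single
# MRS covers both scopal readings of the sentence.
hcons = [("h0", "qeq", "h5"), ("h2", "qeq", "h4"), ("h7", "qeq", "h9")]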
The Norwegian LFG-based analysis produces an MRS representation as part of its output. This representation undergoes a limited amount of post-processing, the output of which serves as input to the rule-based transfer component. Transfer in turn outputs MRS representations attuned to the English HPSG grammar, which then generates the final English sentences from this MRS input.
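Schematically, the processing chain can be pictured as follows; the function names and signatures are hypothetical placeholders for the XLE, transfer and LKB modules, not their actual interfaces.

from typing import List, Tuple

def analyse_lfg(sentence: str) -> Tuple[dict, dict]:
    """NorGram/XLE analysis: returns an f-structure and a co-described MRS."""
    raise NotImplementedError

def postprocess(mrs: dict) -> dict:
    """Limited post-processing of the projected MRS into the interface format."""
    raise NotImplementedError

def transfer(mrs: dict) -> List[dict]:
    """Rule-based MRS-to-MRS transfer, attuned to the English grammar (ERG)."""
    raise NotImplementedError

def generate_hpsg(mrs: dict) -> List[str]:
    """ERG/LKB generation of English sentences from a transferred MRS."""
    raise NotImplementedError

def translate(sentence: str) -> List[str]:
    _fstructure, mrs = analyse_lfg(sentence)        # LFG analysis with co-described MRS
    english = []
    for transferred in transfer(postprocess(mrs)):  # post-process, then transfer
        english.extend(generate_hpsg(transferred))  # generate English realizations
    return english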
Van Genabith and Crouch (1996a, 1996b, 1997) show how subsets of LFG f-structures and subsets of other kinds of underspecified semantic representations - Quasi-Logical Forms (QLF) and Underspecified Discourse Representation Structures (UDRS) - can be brought into one-to-one correspondence with each other. Our task within LOGON is similar, but goes further: we need to interrelate specific f-structures and MRS representations which are not only well-formed, but which also satisfy further, mutually independent constraints. First, the very fact that f-structures are syntactic representations, while MRSs are semantic representations designed to capture translational relations, frequently motivates different packagings of information at the two levels. Furthermore, the NorGram f-structures meet the requirements for f-structures developed within the ParGram project, while the NorGram MRS representations are constructed according to the same general principles as the MRS representations of the English target HPSG grammar (ERG). As a result, the f-structure and MRS analyses of the same sentence are not always in a simple structural correspondence with each other; predicative constructions and constructions with quantifiers are cases in point.
The implementation of the MRS module exploits the projection architecture of LFG: the MRS representation is projected off the f-structure by co-description, and the resulting structure is subjected to a limited amount of post-processing that converts it to the LOGON interface format. We discuss the role of MRS post-processing in the derivation of appropriate scope relations among adverbs and adverbial clauses. This post-processing issue may be taken to exemplify the limitations of pure co-description in the derivation of appropriate semantic representations from non-ad-hoc syntactic representations (c- and f-structures).
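As an illustration of the kind of operation involved, the following heavily simplified sketch (with hypothetical function, relation and handle names; the actual LOGON rules and interface format differ) re-chains two scopal adverb EPs that co-description has left unordered, so that the first adverb outscopes the second, which in turn outscopes the verb.

import itertools

def chain_scopal_adverbs(eps, hcons, verb_label, adverb_labels, fresh_handles):
    """Impose a scope ordering of the adverb EPs (outermost first) over the verb.

    eps           -- list of EPs as dicts {"label", "relation", "args"}
    hcons         -- list of (high_handle, "qeq", low_handle) constraints
    fresh_handles -- iterator yielding unused handle names
    """
    lower = verb_label
    for label in reversed(adverb_labels):       # start with the innermost adverb
        hole = next(fresh_handles)
        for ep in eps:
            if ep["label"] == label:
                ep["args"]["ARG1"] = hole       # the adverb's scopal argument
        hcons.append((hole, "qeq", lower))      # this adverb outscopes `lower`
        lower = label                           # and the next adverb outscopes it
    return eps, hcons

# Example: the adverb at h1 should outscope the adverb at h2, which outscopes the verb at h3.
eps = [
    {"label": "h1", "relation": "_probably_a", "args": {"ARG0": "e2"}},
    {"label": "h2", "relation": "_not_a",      "args": {"ARG0": "e3"}},
    {"label": "h3", "relation": "_sleep_v",    "args": {"ARG0": "e1", "ARG1": "x1"}},
]
fresh = ("h%d" % i for i in itertools.count(100))
eps, hcons = chain_scopal_adverbs(eps, [], "h3", ["h1", "h2"], fresh)
# Result: _not_a's ARG1 (h100) is qeq the verb's label h3, and _probably_a's
# ARG1 (h101) is qeq _not_a's label h2, fixing the reading probably > not > sleep.

Whether such orderings are imposed through qeq constraints, as here, or by direct identification of labels is a property of the interface format; the sketch merely illustrates the general operation.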
The implementation demonstrates the feasibility of deriving structures that meet external specifications from LFG resource grammars.