The ParGram Project: Workshop and Demo

Miriam Butt, Mary Dalrymple, Stefanie Dipper, Helge Dyvik, Ronald M. Kaplan, Tracy Holloway King, Jean-Philippe Marcotte, Hiroshi Masuichi, Tomoko Ohkuma, Christian Rohrer, Victoria Rosen, and Annie Zaenen

Abstract

The Parallel Grammar Project (ParGram) is a long-standing consortium of researchers developing LFG grammars for written input in a number of languages. The grammars are written "in parallel" (hence the name of the project), based on shared linguistic assumptions about the nature of the grammars that are produced. The project has several goals: on the theoretical side, to test the universality of LFG theory and to examine and rectify any limitations in the coverage of the theory; and on the practical side, to produce resources for applications. In developing the grammars, we rely on XLE, a grammar development platform incorporating high-performance algorithms for parsing, generating, and debugging LFG grammars. Word-level analysis is performed by finite-state morphological analyzers, which function as a separate module of the grammar.

The ParGram project began in 1994 as a collaboration between NLTT/Palo Alto Research Center, the University of Stuttgart, and MLTT/Xerox Research Center Europe. Originally, grammars for three languages were developed: an English grammar at PARC, a German grammar at the University of Stuttgart, and a French grammar at XRCE. The original partners contributed to the development of the XLE platform for large grammars and applications, to solidifying the underlying grammatical assumptions and conventions used in writing the grammars, and to the integration of morphological analyzers. After the French grammar moved to PARC in 2000, several additional partners joined the project; currently, the project encompasses grammars for six languages. The English and French grammars are being developed at PARC, and the German grammar is being developed at the University of Stuttgart. Additionally, a Norwegian grammar is under development at the University of Bergen, a Japanese grammar at the Corporate Research Center, Fuji Xerox, Japan, and a Hindi/Urdu grammar at UMIST.

Grammars developed by the ParGram project have been incorporated into a number of other research projects. Among them are the PARTRANS project, which uses the grammars in translation; the COMET project, which explores statistical disambiguation using the Wall Street Journal corpus and the English grammar; and the TIGER project, which uses the German grammar in the semi-automatic creation of a treebank of German newspaper text.

Recommended ParGram-Specific References:

Selected References: