Dependency-based Sentence Simplification for Increasing Deep LFG Parsing Coverage

Özlem Çetinoğlu, Sina Zarrieß and Jonas Kuhn

Abstract

Large-scale deep grammars can achieve high coverage of corpus data, yet cannot produce a full-fledged solution for every sentence. In this paper, we present a dependency-based sentence simplification approach that obtains full parses of simplified versions of sentences that failed to receive a complete analysis in their original form. To remove the erroneous parts that cause failure, we delete phrases from failed sentences by utilising their dependency structure, and reprocess the remaining shorter sentences with XLE to get full analyses. We ensure the grammaticality and preserve the core argument structure of simplified sentences by defining the deletion scheme only on a set of modifier phrases. We apply our approach to German data and retrieve full parses of simplified sentences for 52.37% of the failed TIGER sentences. With the combination of original and simplified sentences, the proportion of full XLE parses derived from the TIGER Treebank increases from 80.66% to 90.79%.
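The deletion step described in the abstract can be illustrated with a minimal sketch. It assumes a sentence is available as a list of tokens with head indices and dependency labels, and that modifier phrases are identified by a label set. The `Token` class, the `MODIFIER_LABELS` set, the function names, and the example sentence are hypothetical and chosen for illustration only; the abstract does not specify the actual deletion scheme, and the reparsing step with XLE is not shown.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    form: str    # surface form
    head: int    # index of the head token, 0 for the root
    deprel: str  # dependency label


# Assumption: modifiers carry the label "MO"; the real label set may differ.
MODIFIER_LABELS = {"MO"}


def subtree(tokens, root_idx):
    """Return the indices of root_idx and all tokens it dominates."""
    children = {}
    for t in tokens:
        children.setdefault(t.head, []).append(t.idx)
    seen, stack = set(), [root_idx]
    while stack:
        i = stack.pop()
        if i in seen:
            continue
        seen.add(i)
        stack.extend(children.get(i, []))
    return seen


def simplification_candidates(tokens, labels=MODIFIER_LABELS):
    """Build one shorter sentence per modifier phrase by deleting its subtree."""
    candidates = []
    for t in tokens:
        if t.deprel in labels:
            drop = subtree(tokens, t.idx)
            candidates.append([x.form for x in tokens if x.idx not in drop])
    return candidates


# Hypothetical example: "Der Hund bellt laut im Garten"
example = [
    Token(1, "Der", 2, "NK"),
    Token(2, "Hund", 3, "SB"),
    Token(3, "bellt", 0, "ROOT"),
    Token(4, "laut", 3, "MO"),
    Token(5, "im", 3, "MO"),
    Token(6, "Garten", 5, "NK"),
]

for words in simplification_candidates(example):
    print(" ".join(words))
```

Each candidate keeps the verb and its subject, so the core argument structure is untouched; in the setting of the paper, each shorter sentence would then be passed back to the parser.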
