1. Introduction of the audience

Backgrounds and interests of the participants:

  • dependency parsing, constituent-based parsing
  • MWE recognition to facilitate parsing
  • information extraction
  • acronym acquisition
  • machine translation
  • dictionary acquisition
  • NLP applications with respect to linguistic resources
  • relation extraction from texts
  • language acquisition

2. Individual contributions

Yannick: MWE-aware lexical selection (aka supertagging) for TAG
   drawback: limited MWE support (parse ranking)

Giuseppe: MWE-aware tools

 * Dependency parsing (transition-based shift-reduce parsing)
   http://desr.sourceforge.net (trained on 28 languages)
   (robust, hundreds of sentences per second, no grammar needed, only an annotated treebank)
   point: annotation is easy compared with deep grammar development
 * Word embeddings (computed offline) for word labelling (including POS tagging, MWE recognition, word clustering)
   point: does not rely on feature definition and refinement
   quality measurement? not per se, but via tagging/parsing improvement
   advantage: POS tags can be replaced with clusters for improved parsing (what about the number of classes vs. POS tags?)
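
Illustration (not from the talk): a minimal sketch of how a treebank alone, with no grammar, yields the transition sequences a shift-reduce parser is trained on. It assumes the textbook arc-standard transition system and projective trees; this is not DeSR's actual transition set, and the function name oracle_transitions is hypothetical.

```python
def oracle_transitions(heads):
    """Derive SHIFT / LEFT-ARC / RIGHT-ARC actions from gold heads.

    heads[i] is the gold head of token i+1 (tokens are 1-based, 0 is the root).
    Assumes a projective tree (arc-standard cannot build non-projective ones).
    """
    n = len(heads)
    head = {i + 1: h for i, h in enumerate(heads)}
    # Number of dependents each node still has to collect.
    pending = {i: 0 for i in range(n + 1)}
    for h in heads:
        pending[h] += 1

    stack, buffer, actions = [0], list(range(1, n + 1)), []
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # LEFT-ARC: the second item is a finished dependent of the top item.
            if below != 0 and head[below] == top and pending[below] == 0:
                actions.append(("LEFT-ARC", top, below))
                stack.pop(-2)
                pending[top] -= 1
                continue
            # RIGHT-ARC: the top item is a finished dependent of the second item.
            if head[top] == below and pending[top] == 0:
                actions.append(("RIGHT-ARC", below, top))
                stack.pop()
                pending[below] -= 1
                continue
        if not buffer:
            raise ValueError("tree is non-projective or ill-formed")
        stack.append(buffer.pop(0))
        actions.append(("SHIFT",))
    return actions


# "She eats fish": She <- eats -> fish, with "eats" attached to the root.
for action in oracle_transitions([2, 0, 2]):
    print(action)
```

Each (parser state, action) pair produced this way is one training example for the classifier that drives the parser, which is why annotation effort replaces grammar development.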

Kayla: acronym acquisition and disambiguation
   how to deal with non-adjacent expansions?
   growing number of acronyms -> need for automatic, reproducible acquisition methods
   link between acronym recognition and coreference?
   formation patterns? (half of Hebrew acronyms use additional letters, not only leading ones) -> language-dependent
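
Illustration (not from the talk): a minimal sketch of the simplest formation pattern, adjacent words whose leading letters spell the acronym. The function name find_expansion is hypothetical. The sketch deliberately shows the limits raised above: it misses non-adjacent expansions and acronyms that use non-leading letters.

```python
def find_expansion(acronym, words):
    """Return the first run of adjacent words whose initials spell `acronym`, else None."""
    letters = acronym.lower()
    k = len(letters)
    for start in range(len(words) - k + 1):
        window = words[start:start + k]
        # Leading-letter pattern only: one word per acronym letter.
        if all(w and w[0].lower() == c for w, c in zip(window, letters)):
            return " ".join(window)
    return None


# Leading letters of adjacent words: found.
print(find_expansion("MWE", "a multi word expression ( MWE )".split()))
# "radar" takes two letters from "radio", so the leading-letter pattern fails.
print(find_expansion("radar", "radio detection and ranging".split()))
```

A language-dependent method would replace the leading-letter test with patterns learned or specified per language.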

Proposals for the next WG2 meeting:

 * psycholinguistics, semantics, parallel semantic corpora (cf Peter, Norway-Iceland, Frasar.net)
 * summaries to be put online
 * select common topics/issues:
       e.g. information extraction from parse outputs (cf idiomatic meaning)
       (several people addressing a question)
 * meeting structure to be prepared in advance