This page lists the surveys carried out within PARSEME.

  1. MWE templates (WG1):
    • Objective: define a multilingually applicable classification of MWEs
    • Outcome:
      • A Wiki space with one page per language
      • Each language is documented following a common template, with examples, glosses and translations.
      • 16 languages are covered so far: Croatian, Czech, English, French, German, Greek, Hebrew, Latin, Macedonian, Polish, Romanian, Russian, Serbian, Slovak, Slovene and Spanish.
      • 3 classification axes are defined:
        • by syntactic structure
        • by fixedness/flexibility of MWE parts
        • by idiomaticity (lexical, syntactic, semantic, pragmatic and statistical)
      • Access is currently restricted to registered WG1 members.
    • Contact: [email address protected] (Germany)
  2. MWE lexicon survey (WG1):
    • Objective: list and document the existing lexical resources of MWEs
    • Outcomes:
      • Google table (one line per resource)
      • The metadata include sizes, licenses, and whether discontinuous MWEs are taken into account
      • Dozens of languages are covered.
      • LREC 2016 paper describing the survey
    • The survey form is still open for contributions
    • Contact: [email address protected] (Norway)
  3. Survey on multilingual MWE resources (WG1):
    • Objective: list and document the existing multilingual machine-readable resources of MWEs
    • Outcomes:
      • Google table (one line per resource)
      • The metadata include languages, sizes and domains
      • Dozens of languages are covered.
      • 67 resources are listed: 24 stem from the Thamus and 13 from INCYTA; 13 resources are new.
    • Contact: [email address protected] (Norway), Johanna Monti (Italy), [email address protected] (Ireland), Lonneke van der Plas (Malta) & Manfred Sailer (Germany)
  4. Survey on annotating MWEs in treebanks (WG4):
    • Objective: document the existing treebanks with MWE annotations; document the annotation techniques; prepare the state of the art for MWE annotation guidelines
    • Outcomes:
      • A Wiki table with 13 treebanks in 20 languages and 11 MWE types.
      • TLT 2015 paper describing the survey
      • LREC 2016 paper paving the way to MWE annotation guidelines
    • Contact: Victoria Rosén (Norway)
  5. Survey on hybrid processing of MWEs (WG3):
    • Objective: survey the state of the art in processing MWEs in 3 types of NLP applications: discovery, parsing and translation
    • Outcome: paper submitted to Computational Linguistics (under revision)
    • Contact: [email address protected] (Malta)

     

Last modification: 12 April 2017