Event title: PARSEME WG1 hands-on workshop on lexical encoding of MWEs
Location: Ferdinand Hall, Alexandru Ioan Cuza University, Iași, Romania
Dates: 21-22 September 2015 (co-located with PARSEME's 5th general meeting on 23-24 September)
Hosting Institution: Alexandru Ioan Cuza University, Iași, Romania
- introduction to the LMF framework and a presentation of DUELME, a Dutch MWE (proto-)lexicon in LMF format
NEW! Several other photos from the workshop are also available.
|Day||Time||Session|
|Monday, 21 September||13:00-14:00||Introduction by the workshop leader, short presentation of the participants (max. 3 min. each)|
|Monday, 21 September||14:00-14:45||Lecture 1: Jan Odijk, DUELME|
|Monday, 21 September||15:15-16:00||Lecture 2: Jan Odijk, DUELME in LMF|
|Monday, 21 September||(time not listed)||Practical sessions 1: Creation of lexicon entries for MWEs in the data set; identification of practical problems|
|Tuesday, 22 September||9:00-10:30||Discussion 1: Debugging of practical problems I: find solutions to simpler problems|
|Tuesday, 22 September||10:50-12:50||Discussion 2: Debugging of practical problems II: find solutions to more advanced problems|
|Tuesday, 22 September||12:50-14:00||Lunch|
|Tuesday, 22 September||(time not listed)||Practical session 2: Documentation - production of short videos on how to encode particular MWEs.|
|Tuesday, 22 September||16:20-17:20||Discussion 3: Identification and documentation of the main challenges of MWE encoding in general and of the LMF standard in particular.|
|Tuesday, 22 September||17:30-18:30||Discussion and preparation for a joint publication|
Rationale: The idea behind the workshop is to work hands-on with the encoding of linguistic (and other) properties of MWEs. Evaluation of frameworks for lexical encoding is a prioritized task in PARSEME, and the main objectives of this workshop are to make recommendations for the development of MWE lexicons and databases and to work towards the development of best practices.
A framework for MWE encoding (i.e., an MWE lexicon/database model) should ideally meet at least the following requirements:
- support rich linguistic descriptions
- support metadata specifications
- be language independent
- be theory neutral
- be NLP compatible
- be reusable and interoperable
The Lexical Markup Framework (LMF) will be used for lexical encoding. LMF is a standardized framework for the development of computational dictionaries and is recommended as a standard by large international language resource infrastructure initiatives such as CLARIN and META-NET. It is based on standard formalisms for data description and modeling, and adheres to the above requirements.
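To give a rough idea of what an LMF-style entry for an MWE might look like, the following is a minimal sketch in Python that builds a simplified XML entry. The element names (`LexicalEntry`, `Lemma`, `ListOfComponents`, `Component`, `feat`) loosely follow the LMF XML serialization, but the example expression, the entry ids and the helper functions `feat` and `mwe_entry` are assumptions made for illustration only. They are not taken from the workshop data set or from DUELME, and a real entry would normally carry much richer linguistic description.

```python
import xml.etree.ElementTree as ET


def feat(parent, att, val):
    """Attach an LMF-style <feat att="..." val="..."/> child to parent."""
    return ET.SubElement(parent, "feat", {"att": att, "val": val})


def mwe_entry(entry_id, lemma, pos, component_ids):
    """Build a simplified LexicalEntry element for one MWE.

    component_ids are hypothetical ids of the single-word entries
    that make up the expression (assumed to exist elsewhere in the lexicon).
    """
    entry = ET.Element("LexicalEntry", {"id": entry_id})
    feat(entry, "partOfSpeech", pos)

    # The lemma holds the citation form of the whole expression.
    lemma_el = ET.SubElement(entry, "Lemma")
    feat(lemma_el, "writtenForm", lemma)

    # Each component points to the entry of one word in the MWE.
    components = ET.SubElement(entry, "ListOfComponents")
    for cid in component_ids:
        ET.SubElement(components, "Component", {"entry": cid})
    return entry


# Illustrative English idiom; ids are invented for this sketch.
entry = mwe_entry(
    "mwe_kick_the_bucket",
    "kick the bucket",
    "verb",
    ["kick_v", "the_det", "bucket_n"],
)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Keeping the components as references to separate single-word entries, rather than as plain strings, is one way to keep the entry reusable: properties of the individual words are then described once and shared across expressions.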
Modalities: An MWE data set will be created in advance, with both straightforward examples and more difficult cases from all languages represented at the workshop. During hands-on sessions, participants will try to create lexical entries in LMF format for the MWEs in the data set. The lexical encoding of the more straightforward cases will be recorded as short "encoding do-it-yourself videos", while general challenges and challenging cases will be discussed in a problem-solving session. Proposed solutions to the more difficult cases will also be recorded and made available as an e-learning resource. Participants will be encouraged to plan and write a publication summing up the experiences from the workshop.
Participants: about 15 experts on various languages (ideally 1 per language); computational linguists, computational lexicographers (PARSEME members have priority)
- 1 June 2015: registration deadline
- 15 June 2015: notification of admission
- 9 July 2015: notify the workshop organizers about special topic(s) of interest and data sets (if you have your own data and want to use them for encoding)
- 31 July 2015: feedback from participants regarding reading materials and data (provide relevant new MWE examples for encoding if necessary)
- 7 September 30 August 2015: submission of workshop input data (encoding examples, challenges, possible solutions to problems)
Registration: if you are interested in attending the workshop, please fill in the registration form. The workshop organizers will select those participants who will be entitled to reimbursement of their travel and stay.
Accommodation and reimbursement: see the webpage of PARSEME's 5th general meeting
A reading list and a folder with the relevant documents were sent out to the workshop participants on 3 July. If you are attending the workshop but for some reason did not receive this email (e.g. because the attachment was too large), please contact the workshop organizers as soon as possible and they will find a different way of providing these materials.