LT4CloseLang: Language Technology for Closely Related Languages and Language Variants
Deadline extended to August 2, 2014!
Recent initiatives in language technology have led to the development of at least minimal language processing toolkits for all EU-official languages as well as for languages with a large number of speakers worldwide such as Chinese and Arabic. This is a big step towards the automatic processing and/or extraction of information, especially from official documents and newspapers, where the standard, literary language is used.
Apart from those official languages, a large number of dialects or closely-related language variants are in daily use, not only as spoken colloquial languages but also in some written media, e.g., in SMS, chats, and social networks. Building language resources and tools for them from scratch is expensive, but the efforts can often be reduced by making use of pre-existing resources and tools for related, resource-richer languages.
Examples of closely-related language variants include the different variants of Spanish in Latin America, the Arabic dialects in North Africa and the Middle East, German in Germany, Austria and Switzerland, French in France and in Belgium, and Dutch in the Netherlands and Flemish in Belgium. Examples of pairs of related languages include Swedish-Norwegian, Bulgarian-Macedonian, Serbian-Bosnian, Spanish-Catalan, Russian-Ukrainian, Irish-Scottish Gaelic, Malay-Indonesian, Turkish-Azerbaijani, Mandarin-Cantonese, Hindi-Urdu, and many others.
The workshop aims to bring together researchers interested in building language technology applications that make use of language closeness to exploit existing resources in a related language or a language variant. A previous version of this workshop, organised at RANLP 2013, attracted a lot of research interest, showing the need for further activities.
Topics of interest include but are not limited to the following:
- Case studies of using language resources and tools for related languages and language variants
- Adaptation of monolingual tools and resources for closely-related languages and language variants
- Evaluation of language resources and tools when applied to closely-related languages and language variants
- Linguistic issues when adapting language resources and tools, e.g., semantic discrepancies, lexical gaps, and false friends
- Machine translation between closely-related languages
Important dates:
- Submission deadline: July 26, 2014, 11:59 p.m. PST (extended to August 2, 2014)
- Acceptance/rejection notification: August 26, 2014
- Camera-ready deadline: September 12, 2014, 11:59 p.m. PST
- Workshop: October 29, 2014
Submission should be done using START:
Papers should be up to 9 pages long and should follow the formatting instructions for EMNLP 2014.