Curation project: Linguistic Annotation of Non-standard Varieties – Guidelines and Best Practices (WG 7)

Project content

The curation project for the evaluation of annotation schemes for non-standard varieties was approved in June 2012.

Current schemes and guidelines for linguistic annotation have been developed predominantly for describing newspaper language, and automatic annotation tools are still evaluated mainly on newspaper language.

This curation project aims to annotate data from different domains of so-called “non-standard varieties”. Such data contain a range of linguistic structures and phenomena that current guidelines do not cover.

In a pilot study, the curation project will evaluate established annotation schemes for three annotation layers (dependency analysis, named entity recognition and coreference) and, where necessary, extend them.

To that end, a test corpus of non-standard varieties will be compiled and annotated, with the goal of producing guidelines and best practices for the annotation of those varieties.

Duration

  • 1 September 2012 – 30 September 2013

Applicants

Responsible Institutions

  • Institut für deutsche Sprache und Linguistik, Humboldt-Universität zu Berlin

  • Sprachwissenschaftliches Institut, Ruhr-Universität Bochum

Executive Staff