This paper describes a multi-site project to annotate six sizable bilingual parallel corpora for interlingual content. After presenting the background and objectives of the effort, we describe the data set that is being annotated, the interlingua representation language used, an interface environment that supports the annotation task and the annotation process itself. We will then present a preliminary version of our evaluation methodology and conclude with a summary of the current status of the project along with a number of issues which have arisen.
|Conference||Workshop on Frontiers in Corpus Annotation, NAACL/HLT 2004 |
|Period||2/05/04 → 7/05/04|