Abstract
The Arabic language is rich in syntactic and morphological information. Diacritics, which are marks placed above and below the letters of an Arabic word, play an important role in assigning linguistic attributes to Arabic words in a part-of-speech tagging system. This paper describes a tagset that was built on the inflectional morphology system derived from traditional Arabic grammatical theory. The tagset represents an early stage of research on automatic morphosyntactic annotation of Arabic. The paper aims to present a general tagset for use in a diacritics-based automated tagging system that is under development by the author.
Original language | English |
---|---|
Pages (from-to) | 2787-2792 |
Number of pages | 6 |
Journal | WSEAS Transactions on Computers |
Volume | 5 |
Issue number | 11 |
Publication status | Published - Nov 2006 |
Keywords
- Arabic language
- Diacritics
- Morphological
- Part-of-speech (POS)
- Syntactical
- Tagset