Abstract
Unreliable data is present in datasets and
is either ignored, acknowledged ad hoc, or
undetected. This paper discusses data
quality issues with a potential framework
in mind for dealing with them. Such a framework
should be applied within data-to-text
systems at the point of text generation rather
than as an afterthought. This paper
also presents ways to express uncertainty
through language, World Health Organisation
(WHO) corpus studies, and an
experiment that analyses how subjects
approached summarising data with data
quality issues. This work is still ongoing.
Original language | English |
---|---|
Title of host publication | Proceedings of the European Natural Language Generation 2015 workshop (ENLG 2015) |
Publisher | ACL Anthology |
Pages | 95-99 |
Number of pages | 5 |
ISBN (Print) | 978-1-941643-78-5 |
Publication status | Published - 2015 |
Event | 15th European Workshop on Natural Language Generation - Brighton, United Kingdom; Duration: 10 Sep 2015 → 11 Sep 2015 |
Conference
Conference | 15th European Workshop on Natural Language Generation |
---|---|
Country/Territory | United Kingdom |
City | Brighton |
Period | 10/09/15 → 11/09/15 |