A Gold Standard Methodology for Evaluating Accuracy in Data-To-Text Systems

Craig Alexander Thomson, Ehud Reiter

Research output: Contribution to conference, Paper, peer-review

Abstract

Most Natural Language Generation systems need to produce accurate texts. We propose a methodology for high-quality human evaluation of the accuracy of generated texts, which is intended to serve as a gold standard for accuracy evaluations of data-to-text systems. We use our methodology to evaluate the accuracy of computer-generated basketball summaries. We then show how our gold standard evaluation can be used to validate automated metrics.
Original language: English
Pages: 158-168
Number of pages: 11
Publication status: Published - Dec 2020
Event: Proceedings of the 13th International Conference on Natural Language Generation - Held online, Dublin City University, Dublin, Ireland
Duration: 15 Dec 2020 to 18 Dec 2020
Conference number: 13
https://www.inlg2020.org/

Conference

Conference: Proceedings of the 13th International Conference on Natural Language Generation
Abbreviated title: INLG 2020
Country: Ireland
City: Dublin
Period: 15/12/20 to 18/12/20
