Using a randomised controlled clinical trial to evaluate an NLG system

Ehud Baruch Reiter, R Robertson, A S Lennox, L Osman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The STOP system, which generates personalised smoking-cessation letters, was evaluated by a randomised controlled clinical trial. We believe this is the largest and perhaps most rigorous task effectiveness evaluation ever performed on an NLG system. The detailed results of the clinical trial have been presented elsewhere, in the medical literature. In this paper we discuss the clinical trial itself: its structure and cost, what we did and did not learn from it (especially considering that the trial showed that STOP was not effective), and how it compares to other NLG evaluation techniques.

Original language: English
Title of host publication: 39TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE
Place of Publication: SOMERSET
Publisher: Association for Computational Linguistics
Pages: 434-441
Number of pages: 8
ISBN (Print): 1-55860-767-6
Publication status: Published - 2001

Keywords

  • GENERATION

Cite this

Reiter, E. B., Robertson, R., Lennox, A. S., & Osman, L. (2001). Using a randomised controlled clinical trial to evaluate an NLG system. In 39TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE (pp. 434-441). Association for Computational Linguistics.