The 2022 ReproGen Shared Task on Reproducibility of Evaluations in NLG: Overview and Results

Anya Belz, Anastasia Shimorina, Maja Popović, Ehud Reiter

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution


Abstract

Against a background of growing interest in reproducibility in NLP and ML, and as part of an ongoing research programme designed to develop theory and practice of reproducibility assessment in NLP, we organised the second shared task on reproducibility of evaluations in NLG, ReproGen 2022. This paper describes the shared task, summarises results from the reproduction studies submitted, and provides further comparative analysis of the results. Out of six initial team registrations, we received submissions from five teams. Meta-analysis of the five reproduction studies revealed varying degrees of reproducibility, and allowed further tentative conclusions about what types of evaluation tend to have better reproducibility.
Original language: English
Title of host publication: Proceedings of the 15th International Conference on Natural Language Generation
Subtitle of host publication: Generation Challenges
Place of Publication: Waterville, Maine, USA and virtual meeting
Publisher: Association for Computational Linguistics
Pages: 43-51
Number of pages: 9
Publication status: Published - 1 Jul 2022
Event: 15th International Natural Language Generation Conference: Generation Challenges - Colby College, Waterville, United States
Duration: 18 Jul 2022 - 22 Jul 2022
https://inlgmeeting.github.io/

Conference

Conference: 15th International Natural Language Generation Conference
Abbreviated title: INLG 2022
Country/Territory: United States
City: Waterville
Period: 18/07/22 - 22/07/22

Bibliographical note

We thank the authors of the five original papers that were up for reproduction in Track A. And of course the authors of the reproduction papers, without whom there would be no shared task. Our work was carried out as part of the ReproHum project on Investigating Reproducibility of Human Evaluations in Natural Language Processing, funded by EPSRC (UK) under grant number EP/V05645X/1. Popović's work is directly funded by the ADAPT SFI Centre for Digital Media Technology which is funded by Science Foundation Ireland through the SFI Research Centres Programme and is co-funded under the European Regional Development Fund (ERDF) through Grant 13/RC/2106. Both Popović and Belz also benefit in other ways from being members of ADAPT.
