The 2022 ReproGen Shared Task on Reproducibility of Evaluations in NLG: Overview and Results

Anya Belz, Anastasia Shimorina, Maja Popović, Ehud Reiter

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution


Abstract

Against a background of growing interest in reproducibility in NLP and ML, and as part of an ongoing research programme designed to develop theory and practice of reproducibility assessment in NLP, we organised the second shared task on reproducibility of evaluations in NLG, ReproGen 2022. This paper describes the shared task, summarises results from the reproduction studies submitted, and provides further comparative analysis of the results. Out of six initial team registrations, we received submissions from five teams. Meta-analysis of the five reproduction studies revealed varying degrees of reproducibility, and allowed further tentative conclusions about what types of evaluation tend to have better reproducibility.
Original language: English
Title of host publication: Proceedings of the 15th International Conference on Natural Language Generation
Subtitle of host publication: Generation Challenges
Place of Publication: Waterville, Maine, USA and virtual meeting
Publisher: Association for Computational Linguistics
Pages: 43-51
Number of pages: 9
Publication status: Published - 1 Jul 2022
Event: 15th International Natural Language Generation Conference: Generation Challenges - Colby College, Waterville, United States
Duration: 18 Jul 2022 - 22 Jul 2022
https://inlgmeeting.github.io/

Conference

Conference: 15th International Natural Language Generation Conference
Abbreviated title: INLG 2022
Country/Territory: United States
City: Waterville
Period: 18/07/22 - 22/07/22

Bibliographical note

We thank the authors of the five original papers that were up for reproduction in Track A. And of course the authors of the reproduction papers, without whom there would be no shared task. Our work was carried out as part of the ReproHum project on Investigating Reproducibility of Human Evaluations in Natural Language Processing, funded by EPSRC (UK) under grant number EP/V05645X/1. Popović's work is directly funded by the ADAPT SFI Centre for Digital Media Technology which is funded by Science Foundation Ireland through the SFI Research Centres Programme and is co-funded under the European Regional Development Fund (ERDF) through Grant 13/RC/2106. Both Popović and Belz also benefit in other ways from being members of ADAPT.
