State-of-the-art generalisation research in NLP: a taxonomy and review

Dieuwke Hupkes; Mario Giulianelli; Verna Dankers; Mikel Artetxe; Yanai Elazar; Tiago Pimentel; Christos Christodoulopoulos; Karim Lasri; Naomi Saphra; Arabella Sinclair; Dennis Ulmer; Florian Schottmann; Khuyagbaatar Batsuren; Kaiser Sun; Koustuv Sinha; Leila Khalatbari; Maria Ryskina; Hong Technology; Ryan Cotterell; Zhijing Jin

doi:10.48550/arXiv.2210.03050

Research output: Working paper › Preprint

Abstract

The ability to generalise well is one of the primary desiderata of natural language processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is not well understood, nor are there any common standards to evaluate it. In this paper, we aim to lay the groundwork to improve both of these issues. We present a taxonomy for characterising and understanding generalisation research in NLP, we use that taxonomy to present a comprehensive map of published generalisation studies, and we make recommendations for which areas might deserve attention in the future. Our taxonomy is based on an extensive literature review of generalisation research, and contains five axes along which studies can differ: their main motivation, the type of generalisation they aim to solve, the type of data shift they consider, the source by which this data shift is obtained, and the locus of the shift within the modelling pipeline. We use our taxonomy to classify over 400 previous papers that test generalisation, for a total of more than 600 individual experiments. Considering the results of this review, we present an in-depth analysis of the current state of generalisation research in NLP, and make recommendations for the future. Along with this paper, we release a webpage where the results of our review can be dynamically explored, and which we intend to update as new NLP generalisation studies are published. With this work, we aim to make steps towards making state-of-the-art generalisation testing the new status quo in NLP.

Original language	English
Publisher	ArXiv
Number of pages	86
DOIs	https://doi.org/10.48550/arXiv.2210.03050
Publication status	Published - 9 Jan 2023

Bibliographical note

We thank Adina Williams, Armand Joulin, Elia Bruni, Lucas Weber, Robert Kirk and Sebastian Riedel for providing us feedback on various stages of this draft, and Gary Marcus for providing detailed feedback on the final draft of this paper. We thank Elte Hupkes for making the app that allows searching through references, and we thank Daniel Haziza and Ece Takmaz for other contributions to the website.

Access to Document

10.48550/arXiv.2210.03050
Licence: CC BY

Cite this

Hupkes, D., Giulianelli, M., Dankers, V., Artetxe, M., Elazar, Y., Pimentel, T., Christodoulopoulos, C., Lasri, K., Saphra, N., Sinclair, A., Ulmer, D., Schottmann, F., Batsuren, K., Sun, K., Sinha, K., Khalatbari, L., Ryskina, M., Technology, H., Cotterell, R., & Jin, Z. (2023). State-of-the-art generalisation research in NLP: a taxonomy and review. ArXiv. https://doi.org/10.48550/arXiv.2210.03050

Hupkes, D, Giulianelli, M, Dankers, V, Artetxe, M, Elazar, Y, Pimentel, T, Christodoulopoulos, C, Lasri, K, Saphra, N, Sinclair, A, Ulmer, D, Schottmann, F, Batsuren, K, Sun, K, Sinha, K, Khalatbari, L, Ryskina, M, Technology, H, Cotterell, R & Jin, Z 2023 'State-of-the-art generalisation research in NLP: a taxonomy and review' ArXiv. https://doi.org/10.48550/arXiv.2210.03050

