TY - JOUR
T1 - Land surface model parameter optimisation using in situ flux data
T2 - Comparison of gradient-based versus random search algorithms (a case study using ORCHIDEE v1.9.5.2)
AU - Bastrikov, Vladislav
AU - MacBean, Natasha
AU - Bacour, Cédric
AU - Santaren, Diego
AU - Kuppel, Sylvain
AU - Peylin, Philippe
N1 - This work used eddy covariance data acquired by the FLUXNET community and in particular by the following networks: AmeriFlux (U.S. Department of Energy, Biological and Environmental Research, Terrestrial Carbon Program; DE-FG02-04ER63917 and DE-FG02-04ER63911), AfriFlux, AsiaFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada (supported by CFCAS, NSERC, BIOCAP, Environment Canada, and NRCan), GreenGrass, KoFlux, LBA, NECC, OzFlux, TCOS-Siberia and USCCC. We acknowledge the financial support to the eddy covariance data harmonisation provided by CarboEuropeIP, FAO-GTOS-TCO, iLEAPS, Max Planck Institute for Biogeochemistry, National Science Foundation, University of Tuscia, Université Laval, Environment Canada and US Department of Energy and the database development and technical support from Berkeley Water Center, Lawrence Berkeley National Laboratory, Microsoft Research eScience, Oak Ridge National Laboratory, University of California – Berkeley and the University of Virginia.
PY - 2018/11/30
Y1 - 2018/11/30
N2 - Land surface models (LSMs), which form the land component of Earth system models, rely on numerous processes for describing carbon, water and energy budgets, often associated with highly uncertain parameters. Data assimilation (DA) is a useful approach for optimising the most critical parameters in order to improve model accuracy and refine future climate predictions. In this study, we compare two different DA methods for optimising the parameters of seven plant functional types (PFTs) of the ORCHIDEE LSM using daily averaged eddy-covariance observations of net ecosystem exchange and latent heat flux at 78 sites across the globe. We perform a technical investigation of two classes of minimisation methods, local gradient-based (the L-BFGS-B algorithm, limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm with bound constraints) and global random search (the genetic algorithm), by evaluating their relative performance in terms of the model-data fit and the difference in retrieved parameter values. We examine the performance of each method for two cases: when optimising parameters at each site independently ("single-site" approach) and when simultaneously optimising the model at all sites for a given PFT using a common set of parameters ("multi-site" approach). We find that for the single-site case the random search algorithm results in lower values of the cost function (i.e. lower model-data root mean square differences) than the gradient-based method; the difference between the two methods is smaller for the multi-site optimisation due to a smoothing of the cost function shape with a greater number of observations. The spread of the cost function, when performing the same tests with 16 random first-guess parameters, is much larger with the gradient-based method, due to the higher likelihood of being trapped in local minima. When using pseudo-observation tests, the genetic algorithm results in a closer approximation of the true posterior parameter value than the L-BFGS-B algorithm. We demonstrate the advantages and challenges of different DA techniques and provide some advice on using them for LSM parameter optimisation.
AB - Land surface models (LSMs), which form the land component of Earth system models, rely on numerous processes for describing carbon, water and energy budgets, often associated with highly uncertain parameters. Data assimilation (DA) is a useful approach for optimising the most critical parameters in order to improve model accuracy and refine future climate predictions. In this study, we compare two different DA methods for optimising the parameters of seven plant functional types (PFTs) of the ORCHIDEE LSM using daily averaged eddy-covariance observations of net ecosystem exchange and latent heat flux at 78 sites across the globe. We perform a technical investigation of two classes of minimisation methods, local gradient-based (the L-BFGS-B algorithm, limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm with bound constraints) and global random search (the genetic algorithm), by evaluating their relative performance in terms of the model-data fit and the difference in retrieved parameter values. We examine the performance of each method for two cases: when optimising parameters at each site independently ("single-site" approach) and when simultaneously optimising the model at all sites for a given PFT using a common set of parameters ("multi-site" approach). We find that for the single-site case the random search algorithm results in lower values of the cost function (i.e. lower model-data root mean square differences) than the gradient-based method; the difference between the two methods is smaller for the multi-site optimisation due to a smoothing of the cost function shape with a greater number of observations. The spread of the cost function, when performing the same tests with 16 random first-guess parameters, is much larger with the gradient-based method, due to the higher likelihood of being trapped in local minima. When using pseudo-observation tests, the genetic algorithm results in a closer approximation of the true posterior parameter value than the L-BFGS-B algorithm. We demonstrate the advantages and challenges of different DA techniques and provide some advice on using them for LSM parameter optimisation.
UR - http://www.scopus.com/inward/record.url?scp=85039871600&partnerID=8YFLogxK
U2 - 10.5194/gmd-11-4739-2018
DO - 10.5194/gmd-11-4739-2018
M3 - Article
AN - SCOPUS:85039871600
VL - 11
SP - 4739
EP - 4754
JO - Geoscientific Model Development
JF - Geoscientific Model Development
SN - 1991-959X
IS - 12
ER -