Early Prediction of Gestational Diabetes Mellitus in the Chinese Population via Advanced Machine Learning

Yan-Ting Wu; Chen-Jie Zhang; Ben Mol; Andrew Kawai; Cheng Li; Lei Chen; Yu Wang; Jian-Zhong Sheng; Jian-Xia Fan; Yi Shi; He-Feng Huang

doi:10.1210/clinem/dgaa899

Corresponding authors: Yi Shi and He-Feng Huang

Research output: Contribution to journal › Article › peer-review

Abstract

CONTEXT: Accurate methods for predicting gestational diabetes mellitus (GDM) early, during the first trimester of pregnancy, are lacking in Chinese and other populations.

OBJECTIVES: This work aimed to establish effective models to predict early GDM.

METHODS: Pregnancy data for 73 variables during the first trimester were extracted from the electronic medical record system. A machine learning (ML)-driven feature selection method identified 17 variables for early GDM prediction. To facilitate clinical application, 7 variables were then selected from the 17-variable panel. Advanced ML approaches were applied to the 7-variable and 73-variable data sets to build models predicting early GDM for different situations.

RESULTS: A total of 16 819 and 14 992 cases were included in the training and testing sets, respectively. Using 73 variables, the deep neural network model achieved high discriminative power, with an area under the curve (AUC) of 0.80. The 7-variable logistic regression (LR) model also achieved effective discriminative power (AUC = 0.77). Low body mass index (BMI) (≤ 17) was related to an increased risk of GDM compared with a BMI in the range of 17 to 18, the minimum-risk interval (11.8% vs 8.7%, P = .09). Total 3,3',5-triiodothyronine (T3) and total thyroxine (T4) were superior to free T3 and free T4 in predicting GDM. Lipoprotein(a) demonstrated promising predictive value (AUC = 0.66).

CONCLUSIONS: We employed ML models that achieved high accuracy in predicting GDM in early pregnancy. A clinically cost-effective 7-variable LR model was simultaneously developed. The relationship of GDM with thyroxine and BMI was investigated in the Chinese population.
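
The AUC values quoted in the abstract summarise how well a model's risk scores rank women who developed GDM above those who did not. As a minimal sketch of that metric only (not the authors' pipeline), the function below computes AUC as the fraction of positive/negative pairs in which the positive case receives the higher score, counting ties as half. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation of the area under the ROC curve).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study): 1 = developed GDM, 0 = did not.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.30, 0.50, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported 0.80, 0.77, and 0.66. The pairwise loop is quadratic in the number of cases; for cohorts of the size described here, a rank-based implementation would be the more practical choice.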

Original language	English
Pages (from-to)	e1191-e1205
Number of pages	15
Journal	Journal of Clinical Endocrinology and Metabolism
Volume	106
Issue number	3
Early online date	22 Dec 2020
DOIs	https://doi.org/10.1210/clinem/dgaa899
Publication status	Published - 31 Mar 2021

Bibliographical note

Acknowledgments
We thank all those who helped to collect the data and the graduate students who took part in the statistical analysis.
Financial Support: This work was supported by the National Key Research and Development Program of China (grant Nos. 2018YFC1002804 and 2016YFC1000203), the National Natural Science Foundation of China (grant Nos. 81671412 and 81661128010), the Program of Shanghai Academic Research Leader (grant No. 20XD1424100), the Outstanding Youth Medical Talents of Shanghai Rising Stars of Medical Talent Youth Development Program, the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (grant No. 2019-12M-5-064), the Foundation of Shanghai Municipal Commission of Health and Family Planning (grant No. 20144Y0110), the Natural Science Foundation of Shanghai (grant Nos. 20511101900 and 20ZR1427200), the Shanghai Shenkang Hospital Development Center, the Clinical Technology Innovation Project (grant No. SHDC12019107), and the Clinical Skills Improvement Foundation of Shanghai Jiaotong University School of Medicine (grant No. JQ201717).

Keywords

GDM
early prediction
machine learning models
early pregnancy
BMI
thyroxine
HEMOGLOBIN
CLASSIFICATION
RISK
PREGNANCY
1ST
DISCRIMINATION
GLUCOSE
INSULIN-RESISTANCE
INTRAUTERINE EXPOSURE
ASSOCIATION

Access to Document

DOI: 10.1210/clinem/dgaa899 (Licence: CC BY-NC-ND)

Wu_etal_TJCEM_Early_Prediction_Of_VoR (final published version, 14.9 MB; Licence: CC BY-NC-ND)
© The Author(s) 2020. Published by Oxford University Press on behalf of the Endocrine Society. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

Cite this

@article{b9f8da79545a4170bc296dd45d93aa5f,

title = "Early Prediction of Gestational Diabetes Mellitus in the Chinese Population via Advanced Machine Learning",

author = "Yan-Ting Wu and Chen-Jie Zhang and Ben Mol and Andrew Kawai and Cheng Li and Lei Chen and Yu Wang and Jian-Zhong Sheng and Jian-Xia Fan and Yi Shi and He-Feng Huang",

year = "2021",

month = mar,

day = "31",

doi = "10.1210/clinem/dgaa899",

language = "English",

volume = "106",

pages = "e1191--e1205",

journal = "Journal of Clinical Endocrinology and Metabolism",

issn = "0021-972X",

publisher = "The Endocrine Society",

number = "3",

}

TY - JOUR

T1 - Early Prediction of Gestational Diabetes Mellitus in the Chinese Population via Advanced Machine Learning

AU - Wu, Yan-Ting

AU - Zhang, Chen-Jie

AU - Mol, Ben

AU - Kawai, Andrew

AU - Li, Cheng

AU - Chen, Lei

AU - Wang, Yu

AU - Sheng, Jian-Zhong

AU - Fan, Jian-Xia

AU - Shi, Yi

AU - Huang, He-Feng

PY - 2021/3/31

Y1 - 2021/3/31

UR - http://www.scopus.com/inward/record.url?scp=85102909533&partnerID=8YFLogxK

U2 - 10.1210/clinem/dgaa899

DO - 10.1210/clinem/dgaa899

M3 - Article

C2 - 33351102

SN - 0021-972X

VL - 106

SP - e1191-e1205

JO - Journal of Clinical Endocrinology and Metabolism

JF - Journal of Clinical Endocrinology and Metabolism

IS - 3

ER -
