TY - JOUR
T1 - Intelligent classification of coal structure using multinomial logistic regression, random forest and fully connected neural network with multisource geophysical logging data
AU - Wang, Zihao
AU - Cai, Yidong
AU - Liu, Dameng
AU - Qiu, Feng
AU - Sun, Fengrui
AU - Zhou, Yingfang
N1 - Acknowledgments: This research was funded by the National Natural Science Foundation of China (grant nos. 42130806, 41922016 and 41830427).
PY - 2023/2/15
Y1 - 2023/2/15
N2 - The structure of coal indicates the degree of its fragmentation after tectonic movement, which affects the exploration and development of coalbed methane (CBM). Although coal core observations are the most convenient and intuitive way of identifying the coal structure, they cannot be applied to unexplored coal seams without CBM wells, and they are also very time-consuming. In comparison, geophysical-logging interpretation of the coal structure is more efficient and economical. However, although qualitative methods, such as principal component analysis (PCA), can be used to identify the coal structure with geophysical logging, the interpretation is limited by computational capability, and improvements are required based on the structure of an empirical model. Multinomial logistic regression (MLR), random forest (RF), and deep fully connected neural network (DNN) are effective machine learning methods for model-aided identification that are more accurate than the traditional method. MLR is a classical method based on linear regression and has a low construction cost; RF is an ensemble learning algorithm based on decision trees and bagging; and DNN is a deep learning model based on self-built feature engineering that achieves high classification accuracy when trained on large amounts of data and provides obvious advantages in visual coal classification problems. In this work, the three machine learning methods, MLR, RF, and DNN, were used to identify the coal structure. Two logging data sets of different sizes from the Anze Block of the southern Qinshui Basin, North China, were selected to quantitatively compare the accuracy of coal structure identification against partial coal core observations. The results showed that for 210 and 840 samples, respectively, the accuracy was 76% and 77% for MLR, 83% and 86% for RF, and 82% and 86% for DNN. These results show that the MLR and DNN methods are superior for use with minimal and maximum amounts of data, respectively, and that the RF method is accurate overall. Furthermore, an algorithmic classification of the coal structure was established, and the geological factors controlling the predicted structure, such as geostress, coal seam thickness, and burial depth, were distinguished.
AB - The structure of coal indicates the degree of its fragmentation after tectonic movement, which affects the exploration and development of coalbed methane (CBM). Although coal core observations are the most convenient and intuitive way of identifying the coal structure, they cannot be applied to unexplored coal seams without CBM wells, and they are also very time-consuming. In comparison, geophysical-logging interpretation of the coal structure is more efficient and economical. However, although qualitative methods, such as principal component analysis (PCA), can be used to identify the coal structure with geophysical logging, the interpretation is limited by computational capability, and improvements are required based on the structure of an empirical model. Multinomial logistic regression (MLR), random forest (RF), and deep fully connected neural network (DNN) are effective machine learning methods for model-aided identification that are more accurate than the traditional method. MLR is a classical method based on linear regression and has a low construction cost; RF is an ensemble learning algorithm based on decision trees and bagging; and DNN is a deep learning model based on self-built feature engineering that achieves high classification accuracy when trained on large amounts of data and provides obvious advantages in visual coal classification problems. In this work, the three machine learning methods, MLR, RF, and DNN, were used to identify the coal structure. Two logging data sets of different sizes from the Anze Block of the southern Qinshui Basin, North China, were selected to quantitatively compare the accuracy of coal structure identification against partial coal core observations. The results showed that for 210 and 840 samples, respectively, the accuracy was 76% and 77% for MLR, 83% and 86% for RF, and 82% and 86% for DNN. These results show that the MLR and DNN methods are superior for use with minimal and maximum amounts of data, respectively, and that the RF method is accurate overall. Furthermore, an algorithmic classification of the coal structure was established, and the geological factors controlling the predicted structure, such as geostress, coal seam thickness, and burial depth, were distinguished.
KW - Coal structure identification
KW - Logging data
KW - Machine learning
KW - Random forest
KW - Neural network
KW - Regression
U2 - 10.1016/j.coal.2023.104208
DO - 10.1016/j.coal.2023.104208
M3 - Article
JO - International Journal of Coal Geology
JF - International Journal of Coal Geology
SN - 0166-5162
M1 - 104208
ER -