Automatic organofacies identification by means of Machine Learning on Raman spectra

Natalia A. Vergara Sassarini*, Andrea Schito, Marta Gasparrini, Pauline Michel, Sveva Corrado

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


In this study, we compare and evaluate different unsupervised clustering algorithms for organofacies discrimination in low-maturity dispersed organic matter based on Raman spectroscopic analyses. A total of 1363 Raman spectra were collected from a set of 27 organic-rich samples from the Lower Toarcian shale interval of the Paris Basin subsurface. Rock-Eval pyrolysis data indicate a type II to type III kerogen with a vitrinite reflectance (Ro%) between 0.45% and 0.65%, and Tmax between 415 °C and 438 °C. Organic petrographic observations under transmitted light reveal the presence of organofacies composed of amorphous organic matter and of opaque and translucent phytoclasts. An optical classification of organic particles was performed on about 40–60 fragments per sample and used as the ground truth. Raman spectra were obtained for all the classified fragments, and principal component analysis was performed to highlight the variability among spectra. Unsupervised clustering was then applied to the principal components of the Raman spectra. Three clustering methods were applied to evaluate their effectiveness in predicting the number, shape and density of clusters, and a contingency matrix was used to quantify their ability to predict different organofacies. The Gaussian Mixture Model (GMM) was found to be the best-performing algorithm for organofacies identification, with an accuracy mostly between 80% and 90%. This work outlines how unsupervised clustering of Raman spectra of dispersed organic matter can reduce the uncertainty in thermal maturity assessment and aid the classification of highly heterogeneous organofacies when using large datasets for Earth and planetary science studies.
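The workflow described in the abstract (principal component analysis, then unsupervised clustering, then a contingency matrix against the optical classification) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The spectra are synthetic, the band positions, widths and heights are made-up values, the choice of two principal components and three mixture components is an assumption, and the majority-vote accuracy at the end is one plausible way of reading an accuracy off a contingency matrix rather than the metric confirmed by the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics.cluster import contingency_matrix
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
shift = np.linspace(1000.0, 1800.0, 200)  # Raman shift axis in cm^-1 (assumed range)


def band(center, width, height):
    """Gaussian-shaped band evaluated on the shared Raman shift axis."""
    return height * np.exp(-0.5 * ((shift - center) / width) ** 2)


# Hypothetical D-band heights for three synthetic "organofacies" classes.
# These numbers are illustrative only, not measured values.
d_heights = [0.3, 0.6, 0.9]
n_per_class = 50

spectra, truth = [], []
for label, d_height in enumerate(d_heights):
    for _ in range(n_per_class):
        clean = band(1350.0, 60.0, d_height) + band(1600.0, 30.0, 1.0)
        spectra.append(clean + rng.normal(0.0, 0.02, shift.size))
        truth.append(label)  # stands in for the optical classification
spectra = np.asarray(spectra)
truth = np.asarray(truth)

# Step 1: reduce each spectrum to a few principal component scores.
scores = PCA(n_components=2).fit_transform(spectra)

# Step 2: cluster the scores without using the labels.
gmm = GaussianMixture(n_components=3, n_init=5, random_state=0)
clusters = gmm.fit_predict(scores)

# Step 3: rows are ground-truth classes, columns are predicted clusters.
table = contingency_matrix(truth, clusters)

# Assign each cluster to its most frequent class and count the matches.
accuracy = table.max(axis=0).sum() / table.sum()

print(table)
print(f"majority-vote accuracy: {accuracy:.2f}")
```

Swapping `GaussianMixture` for another scikit-learn clusterer such as `KMeans` or `DBSCAN` while keeping the same contingency-matrix step would allow a like-for-like comparison of methods. The abstract does not name the other two algorithms that were compared, so those are only examples.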

Original language: English
Article number: 104237
Journal: International Journal of Coal Geology
Early online date: 8 Apr 2023
Publication status: Published - 15 Apr 2023


  • Cluster analysis
  • Dispersed organic matter
  • Machine learning
  • Principal component analysis
  • Raman spectroscopy
  • Thermal maturity

