Chemical shift prediction in 13C NMR spectroscopy using ensembles of message passing neural networks (MPNNs)

Williamson, David, Ponte, Santiago, Iglesias, Isaac, Cobas, Carlos, Tonge, Nicola M. and Kemsley, E. Kate ORCID: https://orcid.org/0000-0003-0669-3883 (2024) Chemical shift prediction in 13C NMR spectroscopy using ensembles of message passing neural networks (MPNNs).

Abstract

This study reports a deep learning approach utilising graph convolutional neural networks with four message-passing layers for predicting chemical shifts in 13C NMR spectra of small molecules. The networks were trained on two distinct datasets: one with approximately 4,000 labelled structures and another with over 40,000. To mitigate stochastic variation, an ensemble framework was implemented, which is simple to deploy on multiple nodes of a high-performance computing facility. The results emphasise the critical role of training set size and diversity. While prediction performance was comparable on test partitions from within each dataset, models trained on the larger dataset maintained their accuracy when challenged with crossover holdout sets, whereas those trained on the smaller dataset showed a notable decline. This difference is attributed to the greater diversity of atomic environments in the larger dataset. The larger dataset also enabled more robust modelling of various error properties, providing a quantitative foundation for spectral assignment and verification in two ways. First, a clear relationship was identified between prediction errors and the frequency of different node feature vectors in the training data, from which an estimated error can be associated with any node given its node type. Such estimates may be used as weights in a modified city-block distance metric during the assignment of observed to predicted shifts. Second, the mean absolute prediction error calculated at the structure level is well fitted by a Gaussian kernel cumulative distribution. This leads to a probabilistic assessment of whether the predicted shifts and assigned observations are consistent with originating from the same molecular structure.
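
As a rough illustration of the ensemble averaging and weighted assignment described in the abstract, the following Python sketch averages the shifts predicted by several ensemble members and then matches observed to predicted shifts by minimising a weighted city-block distance. It is not the authors' implementation: the function names, the example shifts, the per-atom error estimates and the choice to divide each absolute difference by its estimated error are all assumptions made for illustration, and the brute-force search over permutations is only practical for a handful of atoms.

```python
from itertools import permutations
from statistics import mean


def ensemble_mean(member_predictions):
    """Average per-atom shift predictions across ensemble members.

    member_predictions: list of lists, one inner list of shifts (ppm) per member.
    """
    return [mean(atom_preds) for atom_preds in zip(*member_predictions)]


def weighted_cityblock(predicted, observed, est_errors):
    """Sum of absolute differences, each scaled by 1 / estimated error.

    Assumption: atoms with a larger estimated error contribute less
    to the distance. The abstract does not specify the exact weighting.
    """
    return sum(abs(p - o) / e for p, o, e in zip(predicted, observed, est_errors))


def assign(predicted, observed, est_errors):
    """Exhaustively find the ordering of observed shifts with minimal distance."""
    best = min(
        permutations(observed),
        key=lambda perm: weighted_cityblock(predicted, perm, est_errors),
    )
    return list(best), weighted_cityblock(predicted, best, est_errors)


# Hypothetical data: three ensemble members, three carbon atoms.
member_predictions = [
    [128.0, 77.5, 21.0],
    [129.0, 76.5, 20.0],
    [127.0, 77.0, 22.0],
]
predicted = ensemble_mean(member_predictions)

observed = [20.5, 128.5, 77.5]   # unassigned observed peaks (ppm), hypothetical
est_errors = [2.0, 1.0, 0.5]     # hypothetical per-atom error estimates (ppm)

assignment, distance = assign(predicted, observed, est_errors)
print(predicted, assignment, distance)
```

For realistic molecule sizes, a linear assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment` would likely replace the exhaustive search.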

Item Type: Article
Faculty \ School: Faculty of Science > School of Chemistry, Pharmacy and Pharmacology
Depositing User: LivePure Connector
Date Deposited: 12 Sep 2024 13:30
Last Modified: 12 Sep 2024 13:30
URI: https://ueaeprints.uea.ac.uk/id/eprint/96740