Automating the Annotation of Data through Machine Learning and Semantic Technologies

Pigott-Dix, Lorcán Anthony Karel (2023) Automating the Annotation of Data through Machine Learning and Semantic Technologies. Doctoral thesis, University of East Anglia.



The ever-increasing scale and complexity of scientific research are surpassing our means to assimilate newly produced knowledge. Computer tools are necessary for the organisation, retrieval, and interpretation of new scientific knowledge and data. For such tools to be effective, research outputs must be described by rich machine-readable metadata. Ontologies provide the framework to unambiguously describe the meaning of knowledge and data, so that they may be re-used or combined to synthesise new knowledge. However, manually annotating research with ontology terms, a process called semantic annotation, is itself infeasible at this scale.

This thesis describes research to develop deep learning-based tools for semantic annotation. The approaches described explore different methods for exploiting the domain knowledge encoded in ontologies to avoid the need to manually curate training corpora. They also take advantage of the inherent integrative capabilities of ontologies to leverage combinations of heterogeneous knowledge, improving annotation performance and model interpretability. Several models exceeded previous benchmarks for semantic annotation in the biomedical domain. This thesis concludes with a discussion of the strengths and limitations of the methods, and the implications for multi-domain ontology semantic annotation and for explainable artificial intelligence.

Item Type: Thesis (Doctoral)
Faculty \ School: Faculty of Science > School of Biological Sciences
Depositing User: Chris White
Date Deposited: 07 Mar 2024 13:57
Last Modified: 07 Mar 2024 13:57

