Bridging Artificial Intelligence and Plant Biology: Innovations in Phenotyping and Transcriptome Analysis

Colmer, Joshua (2023) Bridging Artificial Intelligence and Plant Biology: Innovations in Phenotyping and Transcriptome Analysis. Doctoral thesis, University of East Anglia.

PDF (2024ColmerJPhD.pdf): restricted to repository staff only until 30 June 2025.


Machine learning has the potential to revolutionise plant biology by offering unprecedented opportunities for predicting, measuring, and understanding complex biological processes. This thesis presents three projects that utilise machine learning techniques to tackle diverse challenges, ranging from seed germination detection to circadian time prediction and the identification of diagnostic biomarkers.

In the first project, SeedGerm, I developed a novel approach to automatically detect seed germination, a critical physiological process that determines the success of plant establishment. My pipeline utilises computer vision techniques for image-based seed segmentation and machine learning for predicting seed germination, enabling automated seed phenotyping for plant breeders and researchers.

For the second project, Trans-Learn, I identified diagnostic biomarkers for plant viruses through novel applications of image analysis and machine learning techniques. Plant viruses pose a major threat to global crop production, and biomarkers can facilitate the development of disease-resistant crop varieties. My supervised machine learning approach concentrates on transforming tabular datasets into tensors, intelligently arranging features, and interpreting a trained vision transformer to successfully isolate transcriptomic biomarkers in Arabidopsis halleri for turnip mosaic virus infection.
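
As a rough illustration only, and not the Trans-Learn implementation, the idea of turning one row of a tabular dataset into a two-dimensional grid that an image model can read might be sketched as follows. The function names and the choice to order features by variance are assumptions made for this example; the abstract does not say how features are arranged.

```python
import math
import statistics

def order_by_variance(rows):
    """Return feature indices sorted from most to least variable.

    `rows` is a list of samples, each a list of feature values.
    Ordering by variance is an assumed stand-in for whatever
    arrangement strategy the thesis actually uses.
    """
    n_features = len(rows[0])
    variances = [statistics.pvariance([row[j] for row in rows])
                 for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -variances[j])

def row_to_grid(values, order, pad_value=0.0):
    """Arrange one sample's features into the smallest square grid that fits."""
    side = math.isqrt(len(values))
    if side * side < len(values):
        side += 1
    flat = [values[j] for j in order]
    # Pad the tail so the flat list fills the square exactly.
    flat += [pad_value] * (side * side - len(flat))
    return [flat[r * side:(r + 1) * side] for r in range(side)]

# Hypothetical expression table: 3 samples x 3 genes.
table = [[1, 10, 5], [1, 20, 5], [1, 30, 6]]
order = order_by_variance(table)
grid = row_to_grid(table[0], order)
```

Each sample then yields one grid, and stacking the grids gives a tensor of shape (samples, side, side).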

The third project, ChronoGauge, used transcriptomic biomarkers and multi-output regression models to predict the phase of the circadian clock, a fundamental biological mechanism that regulates the timing of processes in plants. Using circular regression techniques, statistical methods to quantify gene expression rhythmicity, and wrapper feature selection methods, I predicted internal circadian time from transcriptomic data in Arabidopsis thaliana and wheat with higher accuracy than the current state-of-the-art method.
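
One common way to frame time-of-day prediction as a multi-output regression problem is to encode time as a point on the unit circle, so that 23:00 and 01:00 are close together. The sketch below shows that encoding, its inverse, and a circular error measure. It is a generic illustration under that assumption, not the ChronoGauge code, and the function names are hypothetical.

```python
import math

def encode_time(hours, period=24.0):
    """Map a time of day onto the unit circle as a (sin, cos) pair."""
    angle = 2.0 * math.pi * (hours % period) / period
    return math.sin(angle), math.cos(angle)

def decode_time(sin_val, cos_val, period=24.0):
    """Recover a time from a (sin, cos) pair.

    The pair need not have unit length, so raw outputs of a
    two-target regression model can be passed in directly.
    """
    angle = math.atan2(sin_val, cos_val) % (2.0 * math.pi)
    return angle * period / (2.0 * math.pi)

def circular_error(predicted, actual, period=24.0):
    """Smallest absolute difference between two times on the circle."""
    diff = abs(predicted - actual) % period
    return min(diff, period - diff)
```

A regression model would be trained to output the two encoded targets, and `decode_time` would turn its predictions back into hours. With this error measure, a prediction of 23.5 h against a true time of 0.5 h counts as 1 h off rather than 23 h.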

These projects collectively highlight machine learning’s potential in addressing key challenges in plant biology. The open-source methods developed through these projects can be applied to accelerate breeding practices and enable researchers to advance our understanding of plant biology. Overall, this thesis provides valuable insights into bridging the gap between computational techniques and biological research.

Item Type: Thesis (Doctoral)
Faculty \ School: Faculty of Science > School of Biological Sciences
Depositing User: Chris White
Date Deposited: 21 May 2024 07:54
Last Modified: 21 May 2024 07:54
