Driving through stop signs: Predicting stop codon reassignment improves functional annotation of bacteriophages

Cook, Ryan, Telatin, Andrea, Bouras, George, Camargo, Antonio Pedro, Larralde, Martin, Edwards, Robert A. and Adriaenssens, Evelien M. (2024) Driving through stop signs: Predicting stop codon reassignment improves functional annotation of bacteriophages. ISME Communications, 4 (1). ISSN 2730-6151

PDF (Cook_etal_2024_ISMEComms) - Published Version
Available under License Creative Commons Attribution.


Abstract

The majority of bacteriophage diversity remains uncharacterized, and new intriguing mechanisms of their biology are being continually described. Members of some phage lineages, such as the Crassvirales, repurpose stop codons to encode an amino acid by using alternate genetic codes. Here, we investigated the prevalence of stop codon reassignment in phage genomes and its subsequent impacts on functional annotation. We predicted 76 genomes within INPHARED and 712 vOTUs from the Unified Human Gut Virome Catalogue (UHGV) that repurpose a stop codon to encode an amino acid. We re-annotated these sequences with modified versions of Pharokka and Prokka, called Pharokka-gv and Prokka-gv, to automatically predict stop codon reassignment prior to annotation. Both tools significantly improved the quality of annotations, with Pharokka-gv performing best. For sequences predicted to repurpose TAG to glutamine (translation table 15), Pharokka-gv increased the median gene length (median of per genome median) from 287 to 481 bp for UHGV sequences (67.8% increase) and from 318 to 550 bp for INPHARED sequences (72.9% increase). The re-annotation increased median coding capacity from 66.8% to 90.0% and from 69.0% to 89.8% for UHGV and INPHARED sequences predicted to use translation table 15. Furthermore, the proportion of genes that could be assigned functional annotation increased, including an increase in the number of major capsid proteins that could be identified. We propose that automatic prediction of stop codon reassignment before annotation is beneficial to downstream viral genomic and metagenomic analyses.
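The effect of reading TAG as glutamine can be illustrated with a toy calculation. The sketch below is not the method used by Prodigal-gv, Pharokka-gv or Prokka-gv; it only shows why a gene caller that treats TAG as a stop (translation table 11) reports shorter genes than one that treats it as a sense codon (translation table 15). The sequence and the function name `orf_length` are invented for illustration.

```python
# Stop codons under the standard bacterial code (translation table 11).
STOPS_TABLE_11 = {"TAA", "TAG", "TGA"}
# Translation table 15 reads TAG as glutamine, so only TAA and TGA terminate.
STOPS_TABLE_15 = {"TAA", "TGA"}


def orf_length(seq, stops):
    """Length in bp (stop codon included) of the reading frame starting at position 0.

    Hypothetical helper: scans codon by codon and stops at the first
    in-frame codon found in `stops`.
    """
    seq = seq.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            return i + 3
    # No in-frame stop: the frame runs to the last complete codon.
    return len(seq) - len(seq) % 3


# Invented toy gene: ATG GCT TAG CAA GGT TAA
toy_gene = "ATGGCTTAGCAAGGTTAA"

len_table_11 = orf_length(toy_gene, STOPS_TABLE_11)  # truncated at the in-frame TAG
len_table_15 = orf_length(toy_gene, STOPS_TABLE_15)  # reads through TAG to TAA

print(f"table 11: {len_table_11} bp, table 15: {len_table_15} bp")
```

In a genome that uses translation table 15, every in-frame TAG splits a gene into fragments when annotated with table 11, which is consistent with the shorter median gene lengths and lower coding capacity reported above for the unmodified annotation.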

Item Type: Article
Additional Information: Data availability statement: The genomes used in this analysis are from two publicly available datasets: INPHARED (https://github.com/RyanCook94/inphared) and the Unified Human Gut Virome (UHGV; https://github.com/snayfach/UHGV). The details of included sequences are shown in Supplementary Table S1. The code for Prokka-gv is available on GitHub (https://github.com/telatin/metaprokka). The code for Pharokka is available on GitHub (https://github.com/gbouras13/pharokka). The code for Prodigal-gv is available on GitHub (https://github.com/apcamargo/prodigal-gv). The code for Pyrodigal-gv is available on GitHub (https://github.com/althonos/pyrodigal-gv).
Funding information: This research was supported by the BBSRC Institute Strategic Programme Food Microbiome and Health BB/X011054/1 and its constituent projects BBS/E/F/000PR13631 and BBS/E/F/000PR13633; and by the BBSRC Institute Strategic Programme Microbes and Food Safety BB/X011011/1 and its constituent projects BBS/E/F/000PR13634, BBS/E/F/000PR13635, and BBS/E/F/000PR13636. R.C. and E.M.A. were supported by the BBSRC grant Bacteriophages in Gut Health BB/W015706/1. This research was supported in part by the NBI Research Computing through the High-Performance Computing cluster. We gratefully acknowledge CLIMB-BIG-DATA infrastructure (MR/T030062/1) support for the provision of cloud resources. R.A.E. was supported by an award from the NIH NIDDK RC2DK116713 and an award from the Australian Research Council DP220102915. The work conducted by the US Department of Energy Joint Genome Institute (https://ror.org/04xm1d337) and the National Energy Research Scientific Computing Center (https://ror.org/05v3mvq14) is supported by the US Department of Energy Office of Science user facilities, operated under contract no. DE-AC02-05CH11231.
Faculty \ School: Faculty of Science > School of Biological Sciences
Depositing User: LivePure Connector
Date Deposited: 16 Jan 2025 01:10
Last Modified: 16 Jan 2025 01:10
URI: https://ueaeprints.uea.ac.uk/id/eprint/98222
DOI: 10.1093/ismeco/ycae079
