CodAn: predictive models for precise identification of coding regions in eukaryotic transcripts

Nachtigall, Pedro Gabriel; Kashiwabara, Andre Y; Durham, Alan M

Publication type

Article

Language

English

Access rights

Open access

Terms of use

CC BY

Appears in Collections:

Artigos

Abstract

Motivation: Characterization of the coding sequences (CDSs) is an essential step in transcriptome annotation. Incorrect identification of CDSs can lead to the prediction of non-existent proteins that can eventually compromise knowledge if databases are populated with similar incorrect predictions made in different genomes. Also, the correct identification of CDSs is important for the characterization of the untranslated regions (UTRs), which are known to be important regulators of the mRNA translation process. Considering this, we present CodAn (Coding sequence Annotator), a new approach to predict confident CDS and UTR regions in full or partial transcriptome sequences in eukaryote species.

Results: Our analysis revealed that CodAn performs confident predictions on full-length and partial transcripts with the strand sense of the CDS known or unknown. The comparative analysis showed that CodAn presents better overall performance than other approaches, mainly when considering the correct identification of the full CDS (i.e. correct identification of the start and stop codons). In this sense, CodAn is the best tool to be used in projects involving transcriptomic data.

Availability: CodAn is freely available at https://github.com/pedronachtigall/CodAn.

Reference

Nachtigall PG, Kashiwabara AY, Durham AM. CodAn: predictive models for precise identification of coding regions in eukaryotic transcripts. Brief. Bioinform. 2021 May;22(3):1–11. doi:10.1093/bib/bbaa045.

Link to cite this reference

https://repositorio.butantan.gov.br/handle/butantan/4057

URL

https://doi.org/10.1093/bib/bbaa045

Journal title

Briefings in Bioinformatics

Keywords

mRNA; CDS characterization; UTR characterization; annotation

Funding agency

(FAPESP) Fundação de Amparo à Pesquisa do Estado de São Paulo ; (CAPES) Coordenação de Aperfeiçoamento de Pessoal de Nível Superior ; (CNPq) Conselho Nacional de Desenvolvimento Científico e Tecnológico

Issue Date

2021

Files in This Item:

bbaa045.pdf
Size: 850.39 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License
