GeneMark-EP+ is a semi-supervised eukaryotic gene prediction tool that integrates information from spliced alignments of proteins to genomic regions into both its model training and gene prediction steps.

The protein hints are generated by ProtHint, a fast protein mapping pipeline that predicts and scores introns, start codons, and stop codons in the genome of interest using proteins of any evolutionary distance. As a source of proteins, we recommend using OrthoDB v10.

Because it is semi-supervised and can incorporate proteins of any evolutionary distance, GeneMark-EP+ is well suited to predicting genes in a novel genome, with no need for a curated training set or a set of closely related proteins.

Because of limited computational resources, we cannot provide web-based execution of GeneMark-EP+. To download the code for local installation, please follow this link: download GeneMark-ES/ET/EP

Tomáš Brůna, Alexandre Lomsadze, Mark Borodovsky
GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins
NAR Genomics and Bioinformatics, Volume 2, Issue 2, 2020

