The Vertebrate Genomes Project (VGP), a project of the G10K Consortium, aims to generate near error-free reference genome assemblies for all ~70,000 extant vertebrate species.

Credit: Digital Creative Associates

These genomes will be used to address fundamental questions in biology and disease, to identify the species most genetically at risk of extinction, and to preserve the genetic information of life. The motivation for the VGP is based in part on the G10K mission to generate genomes of 10,000 or more vertebrate species and on lessons learned from the Avian Phylogenomics Project.

The VGP will be completed in phases that follow the taxonomic hierarchy, the ranking of groups of organisms from the broadest classification (domain) to the narrowest (species): Orders (Phase 1), Families (Phase 2), Genera (Phase 3), and eventually all species (Phase 4). This strategy will allow us to gain scientific insight at each phase, to continue integrating emerging technologies, and to complete genomic analyses at increasing levels of phylogenetic scale. We also expect our approaches and questions at each phase to lead to new algorithms for genome assembly, alignment, annotation, and comparative genomics, which would then be applied in the next phase. This approach will also help us secure the needed funds in stages through grants and other fundraising efforts.

At each phase, species are selected using a combination of criteria: species with existing draft genomes in need of improvement, species with specialized traits that inform us about human biology, species in immediate danger of becoming extinct, and species with prominent use in biomedical research. Endangered species are a critical priority because our planet is experiencing its sixth mass extinction event, the worst since the die-off of the dinosaurs 66 million years ago. This extinction event is driven in part by human-caused pollution, habitat destruction, and climate change. Our planet is at risk of losing 1 in 8 vertebrate species (~8,000 in total) to extinction. The VGP aims to map the DNA of these at-risk species, not only to preserve blueprints of their genomes but also to help identify genetic variants that might protect them from total extinction. This knowledge can guide conservation strategies to prevent, or at least minimize, the loss of species resulting from humankind’s impact on the environment.

The VGP intends to use the genomic data that it produces for multiple studies. The studies planned for the Phase 1 ordinal VGP include:

1. Genome-scale family tree of vertebrates.
2. Comparative genomics of specialized traits in each vertebrate lineage.
3. Comparative genomics of convergent traits (e.g. vocal learning, flight, loss of limbs, and aquatic / terrestrial adaptations).
4. Developing universal vertebrate gene orthology and nomenclature.
5. Deciphering vertebrate chromosomal genome evolution.
6. Reconstruction of the common ancestor genomes of all vertebrates and of key vertebrate clades (e.g. mammals, birds, reptiles, amphibians, teleosts, bony vertebrates, jawed vertebrates, and tetrapods).
7. Evolution of nucleotides to chromosomes of the human genome.
8. Genetics of why some lineages are more disease resistant than others.
9. Conservation genomics of the endangered species sequenced.
10. The genomes of all remaining Kakapo parrots on the planet.
11. Genetic signatures of domestication across vertebrates.
12. Genetics of sex determination and sex chromosome evolution among vertebrates.
13. Brain cell type evolution and homologies using genomics and transcriptomics.
14. Three-dimensional genome structure across vertebrates.
15. Consequences of the evolutionary battle between transposons and host factors.
16. New algorithms for near-complete genome assemblies.
17. New algorithms for reference-free multi-way genome alignments.

Learn more about our data use policies here.


The VGP genomes will be publicly available on the Genome Ark. The Genome Ark is a database that is hosted on the Amazon cloud and integrated with existing international public genome browsing and analysis databases (NCBI, Ensembl, and UCSC), with our informatics and data management platform hosted by DNAnexus.




A master teller of fish stories

May 9, 2018

Vertebrate Genomes Project Plans to Combine Technologies for 'Near Gapless' Assemblies

January 17, 2018

Every One Of These Endearing Parrots Will Have Its Genome Sequenced

September 7, 2017