The Vertebrate Genomes Project (VGP), a project of the G10K Consortium, aims to generate near error-free reference genome assemblies of all 66,000 extant vertebrate species.

These genomes will be used to address fundamental questions in biology and disease, to identify species most genetically at risk for extinction, and to preserve genetic information of life. The motivation for the VGP is based in part on the G10K mission to generate genomes of 10,000 or more vertebrate species and on lessons learned from The Avian Phylogenomics Project.

The VGP will proceed in phases that follow the taxonomic hierarchy, the ranking of groups of organisms from the largest classification (domain) to the smallest (species): Orders (Phase 1), Families (Phase 2), Genera (Phase 3), and eventually all species (Phase 4). This strategy will allow us to gain scientific insight at each phase, to continue to integrate emerging technologies, and to complete genomic analyses at increasing levels of phylogenetic scale. Additionally, we expect our approaches and questions at each phase to lead to the development of new algorithms, including algorithms for genome assembly, alignment, annotation, and comparative genomics, which would then be applied to the next phase. This approach will also help us secure the needed funds in stages through grants and other fundraising efforts.

At each phase, species selection is based on a combination of criteria, including species with existing draft genomes in need of improvement, those with specialized traits that inform us about human biology, those in immediate danger of becoming extinct, and those with prominent use in biomedical research. Endangered species are a high priority because our planet is experiencing its sixth mass extinction event, the worst since the die-off of dinosaurs 66 million years ago. This extinction event is due in part to human-driven pollution, habitat destruction, and climate change. Our planet is at risk of losing 1 in 8 vertebrate species (~8000 in total) to extinction. The VGP aims to map the DNA of these at-risk species, not only to preserve blueprints of their genomes but also to help identify genetic variants that might protect these species from total extinction. This knowledge can guide conservation strategies to prevent or at least minimize loss of species resulting from humankind’s impact on the environment.


The VGP genomes will be publicly available on the Genome Ark. The Genome Ark is a database hosted on the Amazon cloud and integrated with existing international public genome browsing and analysis databases (NCBI, Ensembl, and UCSC), with our informatics and data management platform hosted by DNAnexus.




