Phase 1 of the VGP will generate near error-free reference genomes of 260 species representing all vertebrate orders with a divergence time of ~50 million years ago (MYA) or greater from their most recent common ordinal ancestor, including human and some species on the brink of extinction. We will sequence the heterogametic sex (when it exists) so that both sex chromosomes can be assembled for each species.
Once funding is secured for all 260 Phase 1 species, we will be able to generate ~12 genomes per week. Sequencing and assembling the Phase 1 genomes will take ~6-8 months. Sample collection will occur in parallel and add another ~4-6 months. Alignment and annotation will also proceed concurrently and, at 10 genomes per week, will add another ~6 months.
We expect to complete all 260 species plus the 4 invertebrate outgroups within 1.5 years of the start of a major source of funding. Biological analyses for publications will proceed at the same time, although some analyses can only begin after annotation and alignment of all 260 species, which will add another 12 months before papers are submitted for publication.
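As a rough check on the throughput figures above, the following sketch converts a weekly genome rate into calendar months. It assumes a constant throughput from the first week and an average of 52/12 weeks per month; the function and variable names are illustrative only and are not part of any VGP tooling.

```python
import math

# Assumption: average number of weeks in a calendar month.
WEEKS_PER_MONTH = 52 / 12


def months_at_rate(n_genomes: int, genomes_per_week: float) -> float:
    """Months needed to process n_genomes at a constant weekly throughput."""
    # Round up to whole weeks, since a partial week still has to be worked.
    weeks = math.ceil(n_genomes / genomes_per_week)
    return weeks / WEEKS_PER_MONTH


# Figures taken from the text: 260 species, ~12 genomes per week for
# sequencing and assembly, 10 genomes per week for alignment and annotation.
sequencing_months = months_at_rate(260, 12)
annotation_months = months_at_rate(260, 10)

print(f"Sequencing and assembly: {sequencing_months:.1f} months")
print(f"Alignment and annotation: {annotation_months:.1f} months")
```

Under these assumptions, alignment and annotation come out at about 6 months, matching the estimate above. Sequencing and assembly come out at about 5 months, which is a lower bound at full throughput; the ~6-8 month estimate may reflect ramp-up and assembly time that this simple calculation does not model.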
Phase 1 Pipeline
For Phase 1, we are combining long-read sequencing technology, such as PacBio, with improved algorithms to generate an initial contig assembly. Long-range scaffolding approaches, including 10X Genomics linked reads, Hi-C proximity-ligation reads, and Bionano optical maps, are then used to place the contigs into chromosomes. With further development of our algorithms, we believe that we will be able to achieve near error-free genome assemblies with no additional sequencing.
Current Project Status
As of July 2018
After ~2.5 years from initial funding, we expect the following outcomes: