We were involved in the sequencing and assembly of the first D. mojavensis genome, from the Catalina Island population (D12GC 2007). We subsequently set out to sequence, assemble and annotate the genomes of the other three host populations: Baja California, Mojave Desert and mainland Sonora Desert (Allan and Matzkin 2019). We generated and sequenced both short-insert (180 bp) and long-insert (3 kb) Illumina libraries. Using the previously assembled Catalina Island genome as a template, we were able to reconstruct the great majority of the genome scaffold (93-97%, see table). From our template-based assemblies we obtained the sequences of 14,238 genes from each of the three host populations, which is the great majority (~98%) of the 14,581 genes annotated in the published genome.
We performed a series of analyses of the patterns of molecular evolution across the genomes of the four host populations (Allan and Matzkin 2019). PAML analysis identified 244 loci showing evidence of positive selection after FDR correction (912 loci prior to correction). Further analysis of these genes has illustrated the changes associated with local ecological adaptation across the host populations. For example, the network cluster below illustrates the connectivity of biological functions among the top 10% fastest-evolving genes (908) in the D. mojavensis genomes.
Most recently, using both short-read (Illumina) and long-read (PacBio) sequencing, we have developed a pipeline to generate chromosome-level de novo genome assemblies (Jaworski et al. 2019). We are currently using this pipeline to generate de novo assemblies of several populations and species of cactophilic Drosophila (Jaworski et al. in prep).
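The FDR correction mentioned above for the positive selection tests can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the Benjamini-Hochberg procedure, which is one common FDR method and is not confirmed here as the method used in the published analysis. The function name `benjamini_hochberg`, the 0.05 threshold and the example p-values are all hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag tests that remain significant at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure (illustrative choice).
    Returns a list of booleans in the same order as p_values.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Every test ranked at or below max_k is declared significant.
    significant = [False] * m
    for idx in order[:max_k]:
        significant[idx] = True
    return significant


# Hypothetical per-locus p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, alpha=0.05)
n_nominal = sum(p < 0.05 for p in example_p)  # significant before correction
n_fdr = sum(flags)                            # significant after correction
print(f"{n_nominal} loci before correction, {n_fdr} after")
```

As in the genome-wide analysis, the number of loci passing the corrected threshold is smaller than the number that are nominally significant, because the correction accounts for the many loci tested at once.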