What is a genome?

An organism’s genome is the total set of DNA it inherits from its parents. The genome includes genes and non-coding DNA sequence as well as epigenetic modifications, such as methylation. The regulated expression of these genetic elements provides the building blocks and instructions through which an organism develops, grows, and interacts with its environment.

Why sequence the Joshua tree genome?

A genome sequence provides context. By sequencing the Joshua tree genome, we will have a global view of the genes and regulatory elements that code for the chemical building blocks of a Joshua tree and control how it grows and responds to its environment. With the genome sequence in hand, we can begin to conduct experiments and analyses that identify specific regions of that sequence that are important for Joshua tree’s adaptation to desert climates and for its interactions with pollinating moths and other members of the Mojave Desert biological community.

How are we going to do it?

Joshua tree’s genome is approximately three billion DNA bases in length — that’s as many characters as there are in more than 2,500 copies of Moby Dick. Current technology doesn’t allow us to simply “read” such a long DNA sequence from one end to the other; instead, DNA sequencing methods collect many smaller snippets of DNA sequence, which we can then assemble into a whole-genome sequence. We will use a “hybrid assembly” approach for the Joshua tree genome by combining the power of two DNA sequencing technologies. Illumina sequencing can collect large quantities of DNA sequence data, but in small snippets of just a couple hundred DNA bases. The PacBio method reads long continuous stretches of DNA sequence, though it can’t collect as much total data as Illumina. Through our collaborators, we have access to PacBio sequencing capacity, and we’re crowd-funding the collection of Illumina data to complete the assembly. We will incorporate a new “optical mapping” method from BioNano to help assemble this sequence data into the full genome sequence.
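To give a feel for what "assembling snippets" means, here is a minimal Python sketch of a toy greedy overlap assembly. It is an illustration only, not the pipeline we will use: real assemblers have to cope with sequencing errors, repeated regions, and billions of reads. The function names (`overlap`, `greedy_assemble`), the `min_len` parameter, and the example reads are all hypothetical and chosen just for this sketch.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that matches a
    prefix of `b`, or 0 if the best match is shorter than `min_len`."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Toy assumption: reads are error-free and the sequence has no repeats
    longer than `min_len`. Returns the list of remaining contigs.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Three short, overlapping "snippets" of a made-up 14-base sequence.
snippets = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
contigs = greedy_assemble(snippets)
print(contigs)
```

In this toy case the three snippets overlap by four bases each, so they merge into a single contig. Longer reads, like those from PacBio, make this kind of puzzle easier because each read spans more of the sequence and overlaps are less ambiguous.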

Once we have completed the genome assembly from Illumina and PacBio sequencing, we will annotate the genome to identify genes and other functional elements. Ultimately, we will build a “transcription atlas” showing which genome regions play roles in the development of different parts of a Joshua tree, and which control the tree’s growth and responses to its environment. This will provide a foundation for exploring form and function within Joshua tree, from questions of how the genome functions as a whole to identifying genes that shape the interaction between Joshua tree and its pollinators.