DArTreseq: Finding the Needle in the Genetic Haystack

Imagine a detective confronted with billions of pieces of evidence, who must find the one small fact that makes sense of all the rest and cracks the case: the proverbial needle in a haystack.

That’s the task geneticists face when looking for clues in a string of DNA – and why DNA sequencing can be a complex, time-consuming and expensive process.

Thanks to a newly developed technology called DArTreseq, this task has now been significantly simplified.

The genome of bread wheat, for example, contains about 16 billion DNA base pairs (‘bp’, the building blocks of DNA). Generating and analysing that much data, to take the full genetic fingerprint of that wheat, is an enormous task, despite recent progress in sequencing technologies.

The task is made harder by the fact that a large proportion of these clues are red herrings: base pairs that carry no useful information for finding the “DNA markers” that describe the genome of the sample being tested. Just like the detective, our geneticist needs to be able to recognise this “junk DNA”.

But here’s where the geneticist has an advantage: the junk DNA in plants is characterised by a high level of a chemical modification called methylation. In fact, it is this heavily methylated “junk” DNA that makes many plant genomes, such as that of wheat, so “obese”!

In the late 1990s, Andrzej Kilian, founder and director of DArT, invented a method of DNA analysis that reduces the time, and therefore the cost, of characterising the genome of a sample by significantly reducing the size of the DNA haystack.

This method, called DArT, was originally performed on the microarray platform, that is, on glass slides each containing many thousands of micrometer-sized DNA spots.

About a decade ago, this process was transferred onto more modern DNA sequencing instruments, and trademarked as DArTseq.

Both DArT and DArTseq methods work by introducing special enzymes into the analysis process, to identify only the useful segments of the genome – those with low levels of methylation. These segments are then assembled into ‘libraries’ that can be analysed and used to build the overall genetic image of the sample.
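As a rough in-silico analogy only (the real selection is done biochemically by enzymes, not by software), the idea of keeping only low-methylation segments can be sketched as a simple filter. The `Fragment` class, the fragment names, the methylation values and the 0.2 cut-off are all hypothetical and chosen purely for illustration; they are not DArT parameters.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    length_bp: int
    # Hypothetical measure: share of the fragment that is methylated (0.0 to 1.0).
    methylated_fraction: float


def build_library(fragments, max_methylation=0.2):
    """Keep only fragments at or below the (illustrative) methylation cut-off."""
    return [f for f in fragments if f.methylated_fraction <= max_methylation]


# Made-up example data: two gene-like regions and two repeat-like regions.
fragments = [
    Fragment("gene_region_A", 800, 0.05),
    Fragment("repeat_block_1", 5000, 0.90),
    Fragment("gene_region_B", 1200, 0.10),
    Fragment("repeat_block_2", 7000, 0.85),
]

library = build_library(fragments)
kept_bp = sum(f.length_bp for f in library)
total_bp = sum(f.length_bp for f in fragments)

print([f.name for f in library])
print(f"{kept_bp} of {total_bp} bp kept")
```

In this toy data set the two heavily methylated repeat blocks are dropped, so the library holds only a small share of the original base pairs.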

This process, however, selects only a subset of the useful, unmethylated fragments of the genome, which places a limit on the amount of genomic information DArTseq can provide about the sample. Usually, the 100,000 to 300,000 DNA fragments sequenced by DArTseq generate enough DNA markers for most applications, but there are cases when even this “high density” genetic profile may be insufficient.

To eliminate this potential constraint, DArT has now taken the technology one step further, with DArTreseq, in which the introduced enzymes biochemically remove the methylated sequences from the libraries prior to sequencing.

At the same time, these enzymes enrich the remaining, useful segments of the genome – the ones containing genes.

This reduces the amount of sequencing required to achieve an almost complete DNA profile of a sample, providing reliable results in a fraction of the time and at a fraction of the cost. It is now so much easier to find that needle in a greatly reduced haystack.
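To make the saving concrete, here is a back-of-the-envelope sketch of how much sequencing is needed with and without removing the methylated portion of a genome. Only the 16 billion bp genome size comes from this article; the 10x read depth and the 15% retained share are assumptions picked for illustration, not measured DArTreseq figures.

```python
def bases_to_sequence(genome_bp, depth, retained_percent=100):
    """Total bases to sequence to cover the retained part of a genome at a given depth."""
    return genome_bp * retained_percent // 100 * depth


GENOME_BP = 16_000_000_000  # bread wheat genome size quoted in the article
DEPTH = 10                  # assumed read depth, illustrative only
RETAINED_PERCENT = 15       # assumed share left after removing methylated DNA

whole_genome = bases_to_sequence(GENOME_BP, DEPTH)
reduced = bases_to_sequence(GENOME_BP, DEPTH, RETAINED_PERCENT)

print(f"Whole genome:       {whole_genome:,} bases")
print(f"After reduction:    {reduced:,} bases")
print(f"Sequencing avoided: {whole_genome - reduced:,} bases")
```

Under these assumed numbers the sequencing effort falls from 160 billion to 24 billion bases; the real saving depends on how much of a given genome is methylated.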

A patent for this sequencing complexity reduction technology is now pending, so that DArT can guarantee that it will be broadly available to breeders, scientists and ecologists through its technology delivery model – DArT Network.

This is all a part of DArT’s commitment to seeing its technology used as widely as possible, to help improve the performance of crops, monitor and assist biodiversity, and improve the use of natural resources.

Further information

More detailed information on our specific methodology can be provided as a part of a service report. Please also visit our main page on DArTreseq at https://www.diversityarrays.com/technology-and-resources/dartreseq/.

Please contact us if you are interested in discussing how DArTreseq® can be applied to your work!


DArTreseq precisely targets genes and avoids sequencing repetitive (“junk”) sequences. On the left is an overview ‘map’ of chromosome 1 of sorghum – giving an idea of the huge amount of information that needs to be analysed when sequencing the DNA of the plant. The map on the right zooms in on just a small section of this chromosome. Here, you can see the way the DArTreseq technology enriches and sequences the valuable, single-copy regions (blue), while eliminating the high-copy “junk” (light grey).