Single-Read Sequencing vs. Paired-End Sequencing
Single-read sequencing reads each DNA fragment from only one end and is the simplest way to use Illumina sequencing. Paired-end sequencing reads both ends of a fragment and generates high-quality, alignable sequence data. Because the two reads of a pair come from the same fragment, a pair that aligns at an unexpected distance or orientation points to a structural change. Paired-end sequencing therefore facilitates detection of genomic rearrangements such as insertions, deletions, and inversions, as well as repetitive sequence elements, gene fusions, and novel transcripts. Since paired-end reads are more likely to align to a reference, the quality of the entire data set improves.
Sequencing Depth (Coverage)
The depth of coverage is the number of times a specific genomic site is sequenced, that is, the number of reads that cover it. Most users determine the necessary NGS coverage level based on the application, as well as on other factors such as the size of the reference genome, gene expression level, published literature, and best practices defined by the scientific community.
- For detecting human genome mutations, SNPs, and rearrangements, publications often recommend from 10× to 30× depth of coverage, depending on the application and statistical model.
- For RNA sequencing, researchers usually think in terms of numbers of millions of reads to be sampled. Detecting rarely expressed genes often requires an increase in the depth of coverage.
- For ChIP-Seq (chromatin immunoprecipitation sequencing), publications often recommend coverage of around 100×.
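The depth figures above can be related to run size with a common approximation (the Lander/Waterman relation), in which average coverage equals read length times number of reads divided by genome size. The sketch below is illustrative only: the function names are hypothetical, the human genome size of about 3.2 billion bases is an assumed round figure, and the estimate ignores duplicates, unaligned reads, and uneven coverage. For paired-end runs, each read of a pair is counted separately.

```python
import math


def estimated_coverage(read_length: int, num_reads: int, genome_size: int) -> float:
    """Average depth C = L * N / G (an approximation, not a guarantee per site)."""
    return read_length * num_reads / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads N required to reach the target average depth, rounded up."""
    return math.ceil(target_coverage * genome_size / read_length)


# Assumed example values: 150-base reads against a ~3.2 Gb human genome.
HUMAN_GENOME_SIZE = 3_200_000_000  # assumption: approximate haploid size in bases

depth = estimated_coverage(read_length=150, num_reads=640_000_000,
                           genome_size=HUMAN_GENOME_SIZE)
needed = reads_needed(target_coverage=30, read_length=150,
                      genome_size=HUMAN_GENOME_SIZE)
print(f"Estimated average depth: {depth:.1f}x")
print(f"Reads needed for 30x: {needed:,}")
```

Because this is an average, individual sites can still fall well below the target, which is one reason published recommendations vary by application and statistical model.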
When to Sequence More?
In Illumina sequencing experiments, it is easy to increase the coverage, or sequencing depth, if you later decide you need more data. You can simply sequence more and combine the sequencing output from different flow cells. There are a number of reasons to sequence beyond the originally estimated coverage:
- The effects you see are not statistically significant. Sequencing more reads will generally increase the power of your assay.
- You are investigating events that are very rare. For example, you may want to look at transcripts that are expressed at a very low level in RNA sequencing, or at very weak binding events in ChIP-Seq.
- Certain journals or fields may require a higher level of coverage for your particular application.
- Certain genomes may need more sequencing. For example, certain regions may be hard to sequence and so require more coverage, or the genome may be polyploid.
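Since output from different flow cells can be combined, the average depth of a pooled data set is roughly the total number of sequenced bases divided by the genome size. The following sketch, with hypothetical function names and made-up run yields, estimates the pooled depth and how many more bases would be needed to reach a higher target. It assumes all runs come from the same sample and that every base contributes to coverage, which overstates usable depth in practice.

```python
import math


def combined_coverage(run_bases: list[int], genome_size: int) -> float:
    """Average depth after pooling the base yield of several runs."""
    return sum(run_bases) / genome_size


def additional_bases_needed(run_bases: list[int], genome_size: int,
                            target_coverage: float) -> int:
    """Extra bases required to reach the target average depth (0 if already met)."""
    shortfall = target_coverage * genome_size - sum(run_bases)
    return max(0, math.ceil(shortfall))


# Assumed example: two runs on a ~3.2 Gb genome, aiming for 30x after pooling.
runs = [30_000_000_000, 34_000_000_000]  # bases per run (illustrative values)
genome = 3_200_000_000                   # assumption: approximate genome size

print(f"Pooled depth: {combined_coverage(runs, genome):.1f}x")
print(f"Extra bases for 30x: {additional_bases_needed(runs, genome, 30):,}")
```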