Enhanced Microbial Exploration with Short- and Long-Read Sequencing

Written by Psomagen | Nov 29, 2023 8:28:06 PM

Metagenomics has revolutionized our understanding of microbial communities, allowing us to delve into the network of microorganisms that shape our environment. Shotgun metagenomics, a widely used approach, enables us to reconstruct complex microbial communities at a high level of detail.

However, the choice between short-read and long-read sequencing is a topic of debate among researchers. In recent studies, some organizations have employed a hybrid approach. This uses both short- and long-read sequencing methods to meet the needs of their research question.

Short-read sequencing is a cost-effective method with rapid turnaround time. Although short reads provide a high depth of coverage, this method struggles to resolve complex genomic regions. On the other hand, long-read methods such as PacBio HiFi sequencing offer longer reads that can span repetitive regions for more accurate assembly. However, long reads come at a higher cost per base pair and require specialized library preparation techniques.

Hybrid approaches combine short-read and long-read data sets so that researchers can benefit from the strengths of each. This allows for long assemblies and high mapping rates of bacterial genomes.

Researchers from the University of Copenhagen conducted a shotgun metagenomics study to recover catalogs of metagenome-assembled genomes (MAGs) from the fecal microbial communities of laboratory mice. The project compared the effectiveness of short-read only, long-read only, and hybrid sequencing strategies.

The study examined 22 fecal DNA extracts collected weekly for 11 weeks from two lab mice. Comparison of sequencing methods considered seven key performance metrics across four combinations of depth and technology:

1. 20 Gbp of Illumina short-read data,

2. 40 Gbp of short-read data,

3. 20 Gbp of PacBio HiFi long-read data, and

4. 40 Gbp of hybrid (20 Gbp of short-read + 20 Gbp of long-read) data
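
To make the four-way design concrete, the combinations above can be written as a small table of nominal depths. This is only an illustrative sketch: the names `STRATEGIES` and `nominal_totals` are hypothetical, and the figures are the nominal targets from the list, not measured yields.

```python
# Nominal sequencing depth (Gbp) per strategy, taken from the list above.
# STRATEGIES and nominal_totals are illustrative names, not from the study.
STRATEGIES = {
    "short_20": {"short": 20, "long": 0},
    "short_40": {"short": 40, "long": 0},
    "long_20": {"short": 0, "long": 20},
    "hybrid_40": {"short": 20, "long": 20},
}


def nominal_totals(strategies):
    """Return total nominal depth per strategy and summed per technology."""
    per_strategy = {
        name: depth["short"] + depth["long"] for name, depth in strategies.items()
    }
    per_technology = {
        tech: sum(depth[tech] for depth in strategies.values())
        for tech in ("short", "long")
    }
    return per_strategy, per_technology


per_strategy, per_technology = nominal_totals(STRATEGIES)
print(per_strategy)
print(per_technology)
```

Laid out this way, it is easy to see that the hybrid design matches the deeper short-read design in total nominal depth (40 Gbp), so a comparison between those two differs in the mix of technologies rather than in the amount of data.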

The four assemblies were built from a total of 78.6 Gbp of short-read and 39.6 Gbp of long-read sequencing data, generated on the NovaSeq 6000 and PacBio Sequel IIe HiFi instruments, respectively. Both sequencing strategies yielded DNA reads of comparable quality (Phred score >40).
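
For readers less familiar with Phred scores: a score Q is related to the estimated per-base error probability P by Q = -10 · log10(P), so a score of 40 corresponds to about one expected error per 10,000 bases. A minimal sketch of the conversion follows; the function names are illustrative and not part of any sequencing toolkit.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to an estimated per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a per-base error probability (0 < p <= 1) to a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


# Print the error probability and implied accuracy for a few common scores.
for q in (20, 30, 40):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:.6f}, accuracy {100 * (1 - p):.4f}%")
```

Each 10-point increase in Phred score corresponds to a tenfold drop in the estimated error rate.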

The hybrid assembly was the most effective strategy at capturing sequences through read mapping (96 and 97% of reads), and it showed the highest assembly coverage of all the strategies.

All approaches captured over 93% of metagenomic reads, showing that each can handle the complexity of these microbial communities. With the hybrid approach, researchers successfully obtained insights into plasmids, viruses, and other mobile genetic elements (MGEs). Short- and long-read approaches performed equally well in recovering circular viral genomes.

These elements are of biological interest in public health and microbial adaptation research. Plasmids facilitate a significant proportion of antibiotic resistance acquisition in pathogenic bacteria. Viruses, while often overlooked, play a crucial role in shaping the composition of the gut microbiome.

Similar hybrid approaches have been successful in bacterial disease research. In 2021, a team of researchers used long-read sequencing on 20 bacterial isolates associated with common diseases. Short-read sequencing is an effective tool for sequencing many bacterial genomes. However, some bacterial species (like those in the Enterobacteriaceae family) have repetitive structures that are difficult to resolve with short reads.

This project compared the use of different long-read sequencing technologies in hybrid assembly of bacterial genomes. Using PacBio or Oxford Nanopore long-read technologies paired with Illumina short reads completely resolved most genomes. This was more effective than long-read sequencing followed by short-read polishing.

Research focusing on Mycobacterium tuberculosis (MTB) has reached similar conclusions. MTB is known for GC-rich genome regions. Italian researchers interested in antibiotic resistance tested methods of genome resolution on 13 MTB isolates. Genome resolution is an important step in combatting drug resistance.

In this study, short-read, long-read, and hybrid sequencing were tested for variant calling, cluster analysis, drug resistance detection, and de novo assembly. The hybrid approach had better genome coverage and identified more SNPs. The research team indicated that hybrid sequencing is the preferred approach for a project that requires this many forms of analysis.

The choice of the optimal sequencing strategy depends on the specific objectives and priorities of a given study. Researchers must weigh the balance between the quantity and quality of recovered genomes and extrachromosomal elements. Current research indicates that there is no one-size-fits-all solution and emphasizes the need for a tailored choice of sequencing strategies.