What Is NGS (Next-Generation Sequencing)

This section provides a quick introduction to NGS (Next-Generation Sequencing), which randomly breaks the DNA in a patient's sample into millions of fragments, reads each fragment as a nucleotide string, then digitally aligns the reads to a reference genome sequence to construct the patient's genome sequence.

What Is NGS (Next-Generation Sequencing)? - NGS (Next-Generation Sequencing) is a genomic testing technology that can be used to determine the order of nucleotides in entire genomes or targeted regions of DNA or RNA.

Here are the main steps of the NGS process:

1. Library Preparation — A DNA library is prepared from a patient's sample cells. DNA from the cells is randomly broken into a large number (millions) of DNA fragments. Amplification, purification, and other treatments are performed to increase the efficiency of the library preparation process.
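The random fragmentation described above can be imitated in software. The following is a minimal sketch, not a model of real laboratory shearing: the function name `fragment_dna`, the length limits, and the sample string are all assumptions made for illustration.

```python
import random

def fragment_dna(sequence, n_fragments, min_len, max_len, seed=None):
    """Cut n_fragments random substrings out of sequence.

    Illustrative only: real library preparation shears physical DNA,
    and fragment sizes follow a distribution set by the protocol.
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_fragments):
        # Pick a random fragment length, then a random start position.
        length = rng.randint(min_len, min(max_len, len(sequence)))
        start = rng.randint(0, len(sequence) - length)
        fragments.append(sequence[start:start + length])
    return fragments

# Hypothetical sample sequence; a real genome has billions of letters.
sample = "ACGTTGCAAGCTTAGGCTAACGTTAGC"
library = fragment_dna(sample, n_fragments=5, min_len=6, max_len=10, seed=1)
for fragment in library:
    print(fragment)
```

Because the start positions are random, fragments overlap each other, which is what later allows many reads to cover the same position of the genome.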

2. Sequencing — DNA fragments from the library are loaded onto a flow cell and placed on the sequencer. Then the SBS (Sequencing By Synthesis) process is performed to read the nucleotide string of each DNA fragment.

During the SBS process, chemically modified nucleotides bind to the DNA template strand through natural complementarity. Each nucleotide contains a fluorescent tag and a reversible terminator that blocks incorporation of the next base. The fluorescent signal indicates which nucleotide has been added, and the terminator is cleaved so the next base can bind.
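The cycle described above (bind a complementary nucleotide, detect its fluorescent tag, cleave the terminator, repeat) can be sketched as a simple loop. This is a simplified illustration that ignores strand orientation and signal noise. The dye colors in `DYE` are hypothetical and do not represent any particular instrument's chemistry.

```python
# Base pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical dye colors, chosen only for this illustration.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}

def sequence_by_synthesis(template):
    """Simulate one SBS cycle per template base.

    Returns the synthesized strand and the list of detected colors.
    """
    synthesized = []
    signals = []
    for base in template:
        incoming = COMPLEMENT[base]    # nucleotide binds by complementarity
        signals.append(DYE[incoming])  # fluorescent tag identifies the base
        synthesized.append(incoming)   # terminator cleaved, next cycle starts
    return "".join(synthesized), signals

strand, colors = sequence_by_synthesis("TACG")
print(strand)  # ATGC
print(colors)  # ['green', 'red', 'yellow', 'blue']
```

The reversible terminator is what makes the loop advance exactly one base per cycle, so each detected color maps to exactly one nucleotide letter.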

The SBS process can be viewed as a DNA fragment reader. It reads nucleotides from a fragment and records their nucleotide letters sequentially. Each recorded nucleotide letter string is called a "read".

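Sometimes the same DNA fragment is read twice, once from each end, producing two reads: a forward read and a backward read. This approach, known as paired-end sequencing, gives the alignment step more information about each fragment and improves the overall quality of the NGS process.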
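As a rough illustration of forward and backward reads, the sketch below treats the backward read as the start of the fragment's reverse complement, since it is read from the opposite end on the complementary strand. The helper names `reverse_complement` and `paired_reads`, and the read length, are assumptions for this example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complemented = seq.translate(str.maketrans("ACGT", "TGCA"))
    return complemented[::-1]

def paired_reads(fragment, read_len):
    """Return (forward, backward) reads of read_len letters each."""
    forward = fragment[:read_len]
    # The backward read starts at the other end of the fragment,
    # on the complementary strand.
    backward = reverse_complement(fragment)[:read_len]
    return forward, backward

forward, backward = paired_reads("AAGCTTCG", read_len=4)
print(forward)   # AAGC
print(backward)  # CGAA
```

An aligner can use the fact that both reads come from the same fragment to place them more reliably than either read alone.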

The diagram (source: illumina.com) below shows the sequencing step of NGS:

NGS (Next-Generation Sequencing) - Reading

3. Data Analysis — Reads (nucleotide letter strings) generated from the previous step are then aligned to a reference genome using a computer algorithm. When a read is aligned to a section of the reference genome, each nucleotide letter in the read is recorded as a "hit" of that letter at the aligned position of the reference genome.

After all reads (millions of them) are aligned, the hits recorded at each position of the reference genome form a hit distribution, which is then used to construct the genome sequence of the patient.
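The alignment and hit counting can be sketched with a brute-force approach: slide each read along the reference, keep the offset with the fewest mismatches, and count each letter at its aligned position. This is only a toy illustration under stated assumptions. Production aligners use indexed data structures and handle insertions and deletions, and the reference and reads below are made up.

```python
from collections import Counter

def best_alignment(read, reference):
    """Return the offset where read has the fewest mismatches."""
    best_pos, best_mismatches = 0, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for r, g in zip(read, window) if r != g)
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos

def hit_distribution(reads, reference):
    """Count, for every reference position, the letters aligned to it."""
    hits = [Counter() for _ in reference]
    for read in reads:
        pos = best_alignment(read, reference)
        for offset, base in enumerate(read):
            hits[pos + offset][base] += 1
    return hits

# Hypothetical data: the third read differs from the reference by one letter.
reference = "GATTACAGGCT"
reads = ["GATTAC", "TTACAG", "TAGAGG", "ACAGGC"]
hits = hit_distribution(reads, reference)
print(hits[5])  # Counter({'C': 3, 'G': 1})
```

Position 5 of the reference is covered by all four reads, three of which report C and one of which reports G.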

The diagram (source: illumina.com) below highlights the hit distribution at a given position: C, C, T, C, C, C, C, with C at 6/7 and T at 1/7. So the constructed genome sequence should have C at this position. The single T (1/7) can be discarded as a process error. Compared to the reference sequence, the result shows a variant (mutation) at the highlighted position from T to C.

NGS (Next-Generation Sequencing) - Alignment
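The decision for the highlighted position can be worked through in a few lines. This is a minimal sketch of a majority vote over the hit distribution. The `min_fraction` threshold and the function name `call_base` are assumptions for illustration, and real variant callers also weigh read quality and coverage depth.

```python
from collections import Counter

def call_base(hits, min_fraction=0.5):
    """Return (base, fraction) for the most frequent letter.

    Returns "N" as the base when no letter reaches min_fraction.
    """
    counts = Counter(hits)
    base, count = counts.most_common(1)[0]
    fraction = count / len(hits)
    if fraction < min_fraction:
        return "N", fraction
    return base, fraction

# Hit distribution from the example: C at 6/7, T at 1/7.
hits_at_position = ["C", "C", "T", "C", "C", "C", "C"]
reference_base = "T"

base, fraction = call_base(hits_at_position)
is_variant = base != reference_base
print(base, round(fraction, 3), is_variant)  # C 0.857 True
```

The called base C differs from the reference base T, so the position is reported as a variant.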
