Molecule

This book is a collection of notes and tutorial examples written by the author while he was learning about molecules and related tools. Topics include understanding atoms, bonds and molecules; introduction to atomic isotopes and elements; introduction to proteins and amino acids; introduction to protein kinases; the molecule SDF (Structure Data File) format; generating PNG pictures from molecule SDF files; installing RDKit as a molecule tool; visualizing molecule structures in 3-D with PyMol; generating molecule movies with PyMol. Updated in 2023 (Version v1.26) with minor updates.

Table of Contents

About This Book

Introduction of Molecules

What Is Molecule

What Is Molecular Formula

What Is Atomic Bond

What Is Molecule Structure

Skeletal Formula Notations

Molecule Names and Identifications

Molecule Common Names

InChI (International Chemical Identifier)

InChIKey - InChI Hash String

IUPAC Nomenclature

Canonical SMILES

Molecule Mass and Weight

Protein and Amino Acid

What Is Amino Acid

The 20 Common Amino Acids

Peptide, Peptide Bond, Amino Acid Residues

What Is Protein

Protein Structure Levels

Alpha Helix and Beta Sheet

Protein Visualization - Ribbon Diagram

Composed Proteins or Protein Complexes - Worldwide PDB (Protein Data Bank)


Nucleobase, Nucleoside, Nucleotide, DNA and RNA

What Is Nucleobase

What Is Nucleoside

What Is Nucleotide

What Is Nucleic Acid

What Is RNA (Ribonucleic Acid)

What Is DNA (Deoxyribonucleic Acid)

RNA Primary Structure - Helix

DNA Primary Structure - Double Helix

What Is DNA/RNA Base and Sequence Pair

What Is Chromosome

Gene and Chromosome

What Is Gene

What Is Human Genome

Gene Address on Chromosome

DNA Coding and Codons

Gene Expression - Building Proteins

Genetic Transcription - Creating mRNA

Genetic Translation - Creating Protein

DNA Gene Sequence - Exons and Introns

Chromosome Replication (or DNA Replication)

Protein Kinase (PK)

DNA Sequencing

What Is DNA Sequencing

What Is PCR (Polymerase Chain Reaction)

What Is Sanger Sequencing Method

What Is NGS (Next-Generation Sequencing)

Gene Mutation

What Is Gene Mutation

What Is Point Mutation

Base-Pair Insertion and Deletion

Gene Mutation Inheritance Likelihood

Types of Genetic Testing

Mutation Detection with NGS

What Is Allele Frequency

What Is VCF (Variant Call Format)

"vcftools" - VCF Utility Command

What Is VAF (Variant Allele Frequency)

Gene Mutation Naming Convention

Gene Mutation Test Report

What Is ctDNA Testing

Sanger Sequencing Test Report

SDF (Structure Data File)

What Is SDF (Structure Data File)

SDF Format Specification

What Are CTfile and CTAB

Convert SDF to SVG using Open Babel

"sdf2svg" - PHP Script to Convert SDF to SVG

PyMol Installation

What Is PyMol

Install PyMol Incentive Edition on macOS

Install PyMol Open Source Edition

Compile PyMol Source Code

Install Open Source PyMol with Homebrew

Install Open Source PyMol with Fink

PyMol GUI and CLI

PyMol Screen Layout

Load Molecule from File into PyMol

Virtual Trackball Rotation on PyMol

Zoom In and Out on PyMol

PyMol Command Line Interface

"load" and "delete" Commands on PyMol

"log_open" and "log_close" Commands on PyMol

Model Space and Camera Space on PyMol

"get_view" and "set_view" on PyMol

View Parameters Auto Adjusted on PyMol

Zoom In/Out by Moving Camera

Rotation with Transformation Matrix

Rotation with "turn" Command

Difference of "turn" and "rotate" Commands

Difference of "move" and "translate" Commands

"center", "zoom" and "reset" Commands

Model-to-Camera Space Coordinates Mapping

Camera-to-Model Space Coordinates Mapping

Turn Structure around Camera

"show lines" Presentation Command

"show sticks" Presentation Command

"show spheres" Presentation Command

"show surface" Presentation Command

"show mesh" Presentation Command

PyMol Selections

Create Selection with Mouse in PyMol

"select" Command in PyMol

Substructure Selection Visualization in PyMol

Modify Molecule Structure in PyMol

Export Molecule Substructure in PyMol

Create Methane Molecule in PyMol

PyMol Editing Functions

"pk1", "pk2", "pk3" and "pk4" Selections

"edit id n1, id n2, id n3, id n4" Commands

"remove pk*" and "remove_picked" Commands

"unbond pk1, pk2" and "bond pk1, pk2" Commands

"replace new_atom, ..." Replace pk1 with New Atom

"attach new_atom, ..." Attach to pk1 with New Atom

Build Alcohol Molecule with PyMol

PyMol Measurement Functions

"get_extent" - Picked Atom Location

"label" - Generate Labels on Atoms

Distance between Atoms in PyMol

Angle Formed by 3 Atoms in PyMol

Dihedral Angle Formed by 4 Atoms in PyMol

Use Selection Expressions in PyMol

"get_area" - Surface Areas of Atoms

Surface Area of Bonded Atoms

Surface Area of Entire Molecule

"get_position" - Viewing Center

PyMol Movie Functions

PyMol Python Integration

Run Python Statements from PyMol

PyMol Python API

Launch PyMol from Python Interpreter

PyMol Python API Only Functions

PyMol Object Functions

What Is Object in PyMol

Visualize Objects Independently

Edit Objects Independently

ChEMBL Database - European Molecular Biology Laboratory

What Is ChEMBL

ChEMBL Special Web Portals

Download ChEMBL Database

ChEMBL FTP Repository

ChEMBL Web Services API

Call ChEMBL Data Web Service Directly

ChEMBL Data Resource - molecule

ChEMBL Data Resource - activity

ChEMBL Data Resource - assay

ChEMBL Data Resource - document

ChEMBL Data Resource - target

ChEMBL Data Resource - chembl_id_lookup

ChEMBL Related Tools

chembl_webresource_client - Python Client

chembl_webresource_client - Usage Examples

chembl_webresource_client - RetryError Exception

ChEMBL Terminologies

PubChem Database - National Library of Medicine

What Is PubChem

Download PubChem Database

PubChem Data Sources

PubChem Widgets for Web Pages

PDB (Protein Data Bank)

What Is PDB (Protein Data Bank)

What Is RCSB (Research Collaboratory for Structural Bioinformatics)

What Is PDBe (Protein Data Bank in Europe)

What Is PDBj (Protein Data Bank Japan)

VR Molecular Viewer by PDBj

INSDC (International Nucleotide Sequence Database Collaboration)


Reference Genome Sequence Data File

RefSeq Proteins of Human Genome

HGNC (HUGO Gene Nomenclature Committee)

What Is HGNC

Human Gene Symbol Report by HGNC

REST Web Service at HGNC

Synchronization with HGNC Database

Relocated Tutorials

Resources and Tools

Molecule Related Terminologies


Keywords: Molecule, DNA, Gene, Protein, BioTech