InChIKey - InChI Hash String
This section provides a quick introduction to InChIKey, which is generated from hash values of the InChI string.
InChIKey is a condensed version of the InChI string of a chemical substance, developed under
the auspices of IUPAC (International Union of Pure and Applied Chemistry).
Technically, an InChIKey is 27 characters long.
It consists of three parts separated by hyphens, of 14, 10
and 1 characters, in the format XXXXXXXXXXXXXX-YYYYYYYYFV-P, where:
- "XXXXXXXXXXXXXX" - Generated from a SHA-256 hash of the connectivity information
(the main layer and /q sublayer of the charge layer) of the InChI string.
- "YYYYYYYY" - Generated from a hash of the remaining layers of the
InChI string.
- "F" - Used to indicate the kind of InChIKey (S for standard and N
for nonstandard).
- "V" - Used to indicate the version of InChI used:
"A" for version 1.
- "P" - Used to indicate the protonation of the core parent structure,
corresponding to the /p sublayer of the charge layer:
N for no protonation, O, P, ... if protons should be added
and M, L, ... if they should be removed.
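The layout described above can be checked mechanically. The following is a minimal sketch, not an official parser: the names `parse_inchikey`, `InChIKeyParts` and `proton_offset` are hypothetical, and the linear mapping from the protonation letter to a proton count simply extends the "N, O, P, ..." and "M, L, ..." pattern given in the list, which is an assumption for letters far from N.

```python
import re
from typing import NamedTuple

# 14 letters, hyphen, 8 letters + kind flag + version letter, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])")


class InChIKeyParts(NamedTuple):
    connectivity_hash: str  # "XXXXXXXXXXXXXX"
    remaining_hash: str     # "YYYYYYYY"
    kind: str               # "F": S = standard, N = nonstandard
    version: str            # "V": A = InChI version 1
    protonation: str        # "P": N = no protonation


def parse_inchikey(key: str) -> InChIKeyParts:
    """Split a 27-character InChIKey into its five fields."""
    match = INCHIKEY_RE.fullmatch(key)
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return InChIKeyParts(*match.groups())


def proton_offset(flag: str) -> int:
    """Assumed mapping: N -> 0, O -> +1, P -> +2, M -> -1, L -> -2, ..."""
    return ord(flag) - ord("N")


if __name__ == "__main__":
    # InChIKey of ethanol, used here only as sample input.
    parts = parse_inchikey("LFQSCWFLJHTTHZ-UHFFFAOYSA-N")
    print(parts)
    print("protons to add:", proton_offset(parts.protonation))
```

This only validates the shape of the key. It cannot tell whether the hash blocks actually correspond to any InChI string.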
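Theoretically, an InChIKey is not guaranteed to be unique for each chemical substance:
because it is a fixed-length hash, two different InChI strings could produce the same key.
But the likelihood of such duplicates is very small.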
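To get a feel for how small that likelihood is, one can apply the standard birthday-bound approximation to the hashed letters of the key. This is a rough sketch under two loud assumptions: hash values are treated as uniformly distributed, and every combination of uppercase letters is treated as possible, which overstates the real key space, so the true probabilities are somewhat higher. The function name `collision_probability` and the collection size are illustrative only.

```python
import math


def collision_probability(n_items: int, space_size: int) -> float:
    """Birthday-bound approximation: p = 1 - exp(-n(n-1) / (2N))."""
    exponent = -n_items * (n_items - 1) / (2 * space_size)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(exponent)


# Assumed upper bounds on the number of distinct hash blocks.
FIRST_BLOCK_SPACE = 26 ** 14   # 14-letter connectivity block only
BOTH_BLOCKS_SPACE = 26 ** 22   # 14 + 8 hashed letters together

if __name__ == "__main__":
    n = 100_000_000  # hypothetical collection of 100 million structures
    print("same first block:", collision_probability(n, FIRST_BLOCK_SPACE))
    print("same both blocks:", collision_probability(n, BOTH_BLOCKS_SPACE))
```

Even under these simplifications, the estimate shows why a collision in the full key is far less likely than a collision in the first block alone.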
For a given molecule, the InChIKey is generated through a set of rules
that are not easy to apply by hand.
So you should use software tools to help you,
like the free "InChI Software" provided at
Open Babel can also be used to generate the InChI string for any given molecule.
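As another option, RDKit, an open-source cheminformatics toolkit, can compute the key from a structure. The sketch below assumes RDKit is installed and was built with InChI support. The helper name `inchikey_from_smiles` is hypothetical, and it returns None when RDKit cannot be imported so that the script still runs without it.

```python
from typing import Optional

try:
    from rdkit import Chem  # third-party; assumed built with InChI support
except ImportError:
    Chem = None


def inchikey_from_smiles(smiles: str) -> Optional[str]:
    """Return the InChIKey for a SMILES string, or None if RDKit is missing."""
    if Chem is None:
        return None
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # MolToInchiKey builds the InChI internally and hashes it into the key.
    return Chem.MolToInchiKey(mol)


if __name__ == "__main__":
    print(inchikey_from_smiles("CCO"))  # ethanol
```

If the InChI string is already available, RDKit's `Chem.InchiToInchiKey` converts it directly without rebuilding the molecule.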
Here is a list of InChIKeys of some molecules.
You can use
to find the InChIKey of a given molecule.