Try Python API with RDKit Native Code

This section provides a tutorial example on how to connect the RDKit Python API to the RDKit native code built from source. Unfortunately, the attempt fails because the boost_python library is missing.

Now I am ready to try the RDKit Python API with the Python 2 engine. I am hoping it works with the build I did with "-DRDK_BUILD_PYTHON_WRAPPERS=OFF".

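1. Import "Chem" from the "rdkit" package in Python 2. I see an "ImportError: No module named rdBase" error. I have no idea where the "rdBase" module is located. It is probably a compiled module that is only produced when the Python wrappers are built.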

herong$ export PYTHONPATH=/home/herong/rdkit

herong$ python2

Python 2.7.16 (default, Nov 17 2019, 00:07:27)
[GCC 8.3.1 20190507 (Red Hat 8.3.1-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/herong/rdkit/rdkit/__init__.py", line 2, in <module>
    from .rdBase import rdkitVersion as __version__
ImportError: No module named rdBase
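
One way to check whether a compiled "rdBase" module exists anywhere under the source tree is to walk the directory and look for a matching shared library. The following is only a minimal sketch for Python 3. The function name find_compiled_module is made up for this illustration, and the assumption that a compiled module is stored as a ".so" or ".pyd" file is mine, not something taken from the RDKit documentation.

```python
import os


def find_compiled_module(root, name):
    """Return sorted paths under root that look like a compiled module called name.

    Assumption: compiled Python modules are files such as rdBase.so or
    rdBase.cpython-36m-x86_64-linux-gnu.so (".pyd" on Windows).
    """
    suffixes = (".so", ".pyd")
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Compare the part before the first dot with the module name,
            # so that rdBase.py or rdBaseHelper.so are not reported.
            if filename.split(".")[0] == name and filename.endswith(suffixes):
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)
```

For example, find_compiled_module("/home/herong/rdkit", "rdBase") returns an empty list if no such file exists in the tree, which would be consistent with the ImportError above.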

2. Reading the RDKit Python documentation, I found this note: "Beginning with the 2019.03 release, the RDKit is no longer supporting Python 2. If you need to continue using Python 2, please stick with a release from the 2018.09 release cycle." So I have to rebuild RDKit with "-DRDK_BUILD_PYTHON_WRAPPERS=ON" to work with Python 3.

3. Unzip rdkit-master.zip into ~/rdkit and build it again with no options, which takes the default setting of "-DRDK_BUILD_PYTHON_WRAPPERS=ON". I see an error saying that PYTHON_LIBRARY is set to NOTFOUND.

herong$ unzip rdkit-master.zip

herong$ mv rdkit-master rdkit

herong$ cd rdkit

herong$ mkdir build

herong$ cd build

herong$ cmake ..
CMake Error: The following variables are used in this project, but
they are set to NOTFOUND. Please set them or make sure they are set
and tested correctly in the CMake files:
PYTHON_LIBRARY (ADVANCED)
linked by target "RDBoost" in directory /home/herong/rdkit/Code/RDBoost
linked by target "rdBase" in directory /home/herong/rdkit/Code/RDBoost/Wrap
...
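
A PYTHON_LIBRARY value of NOTFOUND usually suggests that the Python development files are not installed. Before running "cmake" again, the Python 3 interpreter itself can report where it expects its header and library files. The following is a rough sketch using the standard "sysconfig" module. The function name python_dev_status is made up for this illustration, and checking for "Python.h" is my assumption about what the build needs, not a statement about how the RDKit CMake files work.

```python
import os
import sysconfig


def python_dev_status():
    """Collect the locations where this interpreter expects its development files."""
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return {
        "include_dir": include_dir,
        # True only if the C header is actually present on disk.
        "has_python_h": os.path.isfile(header),
        # These two may be None on some platforms.
        "libdir": sysconfig.get_config_var("LIBDIR"),
        "ldlibrary": sysconfig.get_config_var("LDLIBRARY"),
    }


status = python_dev_status()
for key in sorted(status):
    print(key, "=", status[key])
```

If "has_python_h" is False, the development package for that interpreter is probably missing.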

4. Install "platform-python-devel" and run "cmake" again. I see the "No Boost libraries were found" error.

herong$ sudo dnf install platform-python-devel

...
Installed:
  platform-python-devel-3.6.8-15.1.el8.x86_64
  python-rpm-macros-3-37.el8.noarch
  python3-rpm-generators-5-4.el8.noarch

herong$ cmake ..
CMake Error at /usr/share/cmake/Modules/FindBoost.cmake:2044 (message):
  Unable to find the requested Boost libraries.
  Boost version: 1.66.0
  Boost include path: /usr/include
  Could not find the following Boost libraries:
          boost_python
  No Boost libraries were found. You may need to set BOOST_LIBRARYDIR to
  the directory containing Boost libraries or BOOST_ROOT to the location
  of Boost.
...

5. Search for the boost_python library file. I see no boost_python library in /usr/lib64.

herong$ ls -l /usr/lib64/libboost_p*
     35 May 13  2019 /usr/lib64/libboost_prg_exec_monitor.so
                       -> libboost_prg_exec_monitor.so.1.66.0
  89688 May 13  2019 /usr/lib64/libboost_prg_exec_monitor.so.1.66.0
     34 May 13  2019 /usr/lib64/libboost_program_options.so
                       -> libboost_program_options.so.1.66.0
 701288 May 13  2019 /usr/lib64/libboost_program_options.so.1.66.0
...

herong$ dnf info boost
Installed Packages
Name         : boost
Version      : 1.66.0
Release      : 6.el8
Architecture : x86_64
Size         : 1.3 k
Source       : boost-1.66.0-6.el8.src.rpm
Repository   : @System
From repo    : AppStream
Summary      : The free peer-reviewed portable C++ source libraries
URL          : http://www.boost.org

Too bad. The "boost 1.66" package I installed does not include the boost_python library. I am not sure if I have to install it manually.
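
The same search can be repeated over several library directories at once. The following is a small Python 3 sketch. The function name find_libraries and the list of directories in the usage note are assumptions for illustration only. The wildcard also matches version-suffixed names, in case a distribution ships the library under a name like "libboost_python3" (a hypothetical example, not verified here).

```python
import glob
import os


def find_libraries(stem, lib_dirs):
    """Return sorted paths of files named lib<stem>* in the given directories."""
    found = []
    for lib_dir in lib_dirs:
        # glob.escape() keeps special characters in the directory name literal.
        pattern = os.path.join(glob.escape(lib_dir), "lib" + stem + "*")
        found.extend(glob.glob(pattern))
    return sorted(found)
```

For example, find_libraries("boost_python", ["/usr/lib64", "/usr/lib", "/usr/local/lib"]) returns an empty list when none of those directories contains a boost_python library, which would match the "ls" output above.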
