"obabel -i ..." - Input Data Format and Source

This section provides tutorial examples on how to specify the input data format and source for the Open Babel "obabel" command.

Open Babel "obabel" command arguments are organized into 3 sections as shown in the following syntax to convert chemical data from input to output with specified options.

obabel input_section output_section option_section

You use input_section to specify Input Data Format and Source. The input_section must be the first section of the "obabel" command line.

The input_section contains two optional parts to specify the input data format and the input data source, as shown in the following syntax:

obabel [-i input_format] [input_names] output_section option_section

1. Input Data Format - There are 2 ways to specify input data format:

1.1. Explicit Format Name - A pre-defined format name is specified after the "-i" option flag with or without a space " ".

Here are some examples of specifying input data formats explicitly:

-i sdf
-isdf
-i smiles
-ismiles
-i smi
-ismi
...

1.2. Implicit Format Name - No "-i" option flag is specified. In this case, the input_names part is required. Open Babel will determine the input data format implicitly from the file name extension of each input name.

For example, the following "obabel" command uses "molecule.sdf" as the input name to determine the input data format as "sdf". This command reads data from molecule.sdf in "sdf" format.

herong$ obabel molecule.sdf -O molecule.svg

1 molecule converted
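
The explicit and implicit approaches can also be combined: when a file name carries no extension that identifies its format, the "-i" flag can supply the format. The following is a minimal sketch, assuming a hypothetical file named "molecule_record" that holds SDF data. The file name and the "result" variable are illustrative and not part of the original examples. The command is skipped if "obabel" or the file is not available.

```shell
# Hypothetical input: "molecule_record" contains SDF data but has no extension.
if command -v obabel >/dev/null 2>&1 && [ -f molecule_record ]; then
    # "-i sdf" names the input format explicitly for the extension-less file.
    obabel -i sdf molecule_record -O molecule.svg
    result=$?
else
    echo "obabel or molecule_record not available; nothing to convert" >&2
    result=skipped
fi
```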

2. Input Data Source - There are 3 ways to specify input data source:

2.1. File name of a single input file - In this case, the input data comes from the specified file.

For example, the following "obabel" command uses "molecule.sdf" as the input_section, which specifies a single file named "molecule.sdf". This command converts the molecule data from molecule.sdf to molecule.svg.

herong$ obabel molecule.sdf -O molecule.svg

1 molecule converted

2.2. Multiple file names or a file pattern to match a group of input files - In this case, the input data comes from all specified files concatenated sequentially.

For example, the following "obabel" command uses "mol-20001.sdf mol-20002.sdf mol-20003.sdf" as the input_section, which specifies 3 input files. This command converts and merges molecules from the 3 input files and generates a single SVG file, output.svg.

herong$ obabel mol-20001.sdf mol-20002.sdf mol-20003.sdf -O output.svg

3 molecules converted

Here is another way to specify multiple input files, using a file name pattern:

herong$ obabel mol-*.sdf -O output.svg

3 molecules converted

Note that when a file pattern is specified on a Unix-like system, the shell expands it into a list of matching file names before "obabel" runs. So the following two commands are identical from the "obabel" command's point of view.

herong$ obabel mol-*.sdf -O output.svg

herong$ obabel mol-20001.sdf mol-20002.sdf mol-20003.sdf -O output.svg
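
The pattern expansion can be observed without running "obabel" at all. The sketch below creates three empty placeholder files in a scratch directory (the names mirror the example above; the directory and the "count" and "expanded" variables are illustrative) and lets the shell expand the same pattern into positional arguments.

```shell
# Create three placeholder files in a temporary scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch mol-20001.sdf mol-20002.sdf mol-20003.sdf

# The shell replaces the pattern with the matching names, in sorted order,
# before any command receives its arguments.
set -- mol-*.sdf
count=$#
expanded="$*"
echo "$count arguments: $expanded"
# prints "3 arguments: mol-20001.sdf mol-20002.sdf mol-20003.sdf"
```

Any command placed in front of the pattern, including "obabel", receives those three names as separate arguments.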

2.3. "stdin" stream - No input file is specified. In this case, the input data comes from the "stdin" stream, which could be text typed on the keyboard or data piped from another command. With no input file name, "obabel" requires the "-i ..." option in the input_section to specify the input data format.

For example, the following "obabel" command uses "-i smiles" as the input_section, which tells Open Babel to read "stdin" as SMILES data. This command reads chemical data from "stdin" in SMILES format and converts it to benzene.svg.

herong$ obabel -i smiles -O benzene.svg
c1ccccc1
<Ctrl-D>

1 molecule converted

Note that the above command expects you to enter a SMILES string from the keyboard. You need to press "<Ctrl-D>" to end the input.
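
The same "stdin" input can come from a command pipe instead of the keyboard, in which case no "<Ctrl-D>" is needed. Here is a minimal sketch, assuming "obabel" is on the PATH. The "smiles" and "status" variables are illustrative, and the conversion is skipped when "obabel" is not installed.

```shell
# Benzene as a SMILES string, sent to obabel through a pipe.
smiles='c1ccccc1'

if command -v obabel >/dev/null 2>&1; then
    # "-i smi" is required because stdin has no file name extension.
    printf '%s\n' "$smiles" | obabel -i smi -O benzene.svg
    status=$?
else
    echo "obabel not found on PATH; skipping conversion" >&2
    status=skipped
fi
```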

3. Input SMILES on the command line - Open Babel also allows you to specify a SMILES string directly on the command line as input data using the "-:..." option. In the example below, no "-O" output file is given, so the result is written to "stdout":

herong$ obabel -:c1ccccc1 -o sdf --gen2D

 OpenBabel09172116032D

  6  6  0  0  0  0  0  0  0  0999 V2000
   -0.8660   -0.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7321   -0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7321    1.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8660    1.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0000    1.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  6  2  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  3  4  1  0  0  0  0
  4  5  2  0  0  0  0
  5  6  1  0  0  0  0
M  END
$$$$
1 molecule converted
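
SMILES strings often contain characters such as parentheses that the shell treats specially, so quoting the string is a safe habit. The sketch below uses acetic acid as an illustrative molecule. The "smiles" and "arg" variables are assumptions for this example, and the "obabel" call runs only if the command is installed.

```shell
# Acetic acid; unquoted parentheses would be interpreted by the shell.
smiles='CC(=O)O'
arg="-:$smiles"
printf 'argument passed to obabel: %s\n' "$arg"

if command -v obabel >/dev/null 2>&1; then
    # "-o can" requests canonical SMILES; with no "-O" file, output goes to stdout.
    obabel "$arg" -o can
fi
```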
