Cheminformatics Tutorials - Herong's Tutorial Examples - v2.01, by Herong Yang
"obabel -i ..." - Input Data Format and Source
This section provides tutorial examples on how to specify input data format and source for the Open Babel 'obabel' command.
Open Babel "obabel" command arguments are organized into 3 sections as shown in the following syntax to convert chemical data from input to output with specified options.
obabel input_section output_section option_section
You use input_section to specify Input Data Format and Source. The input_section must be the first section of the "obabel" command line.
The input_section contains two optional parts that specify the input data format and the input data source, as shown in the following syntax:
obabel [-i input_format] [input_names] output_section option_section
1. Input Data Format - There are 2 ways to specify input data format:
1.1. Explicit Format Name - A pre-defined format name is specified after the "-i" option flag with or without a space " ".
Here are some examples of specifying input data formats explicitly:
-i sdf
-isdf
-i smiles
-ismiles
-i smi
-ismi
...
1.2. Implicit Format Name - No "-i" option flag is specified. In this case, the input_names part is required. Open Babel will determine the input data format implicitly from the file name extension of each input name.
For example, the following "obabel" command uses "molecule.sdf" as the input name, so the input data format is determined to be "sdf". This command reads data from molecule.sdf in "sdf" format.
herong$ obabel molecule.sdf -O molecule.svg
1 molecule converted
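The extension-based lookup described in 1.2 can be pictured with a small shell sketch. The guess_format function below is a hypothetical helper written for illustration, not part of Open Babel. It only mimics the idea of taking the text after the last dot as the format name; Open Babel performs its own lookup internally against its list of supported formats.

```shell
#!/bin/sh
# Hypothetical helper (not an Open Babel command): return the text
# after the last "." in a file name, which is the part that implicit
# format detection looks at.
guess_format() {
    name=$1
    case "$name" in
        *.*) printf '%s\n' "${name##*.}" ;;  # strip everything up to the last dot
        *)   echo "unknown" ;;               # no extension: "-i" would be needed
    esac
}

guess_format molecule.sdf    # prints: sdf
guess_format benzene.smi     # prints: smi
guess_format molecule        # prints: unknown
```

When a file name extension does not name the format of its content (say, a hypothetical data.txt holding SDF records), the explicit form from 1.1 applies instead, for example "obabel -i sdf data.txt -O molecule.svg".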
2. Input Data Source - There are 3 ways to specify input data source:
2.1. File name of a single input file - In this case, the input data comes from the specified file.
For example, the following "obabel" command uses "molecule.sdf" as the input_section, which specifies a single file named "molecule.sdf". This command converts the molecule data from molecule.sdf to molecule.svg.
herong$ obabel molecule.sdf -O molecule.svg
1 molecule converted
2.2. Multiple file names or a file pattern to match a group of input files - In this case, the input data comes from all specified files concatenated sequentially.
For example, the following "obabel" command uses "mol-20001.sdf mol-20002.sdf mol-20003.sdf" as the input_section, which specifies 3 input files. This command converts and merges the molecules from the 3 input files and generates a single SVG file, output.svg.
herong$ obabel mol-20001.sdf mol-20002.sdf mol-20003.sdf -O output.svg
3 molecules converted
Here is another way to specify multiple input files, using a file name pattern:
herong$ obabel mol-*.sdf -O output.svg
3 molecules converted
Note that when a file pattern is specified on a Unix-like system, the command shell expands it into a list of matching file names before "obabel" starts. So the following two commands are identical from the "obabel" command's point of view, assuming those 3 files are the only matches.
herong$ obabel mol-*.sdf -O output.svg
herong$ obabel mol-20001.sdf mol-20002.sdf mol-20003.sdf -O output.svg
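The pattern expansion can be observed without running "obabel" at all. The following is a minimal sketch, assuming a POSIX shell; the scratch directory, the empty placeholder files and the variable names expanded_count and expanded_list are all made up for this illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create three empty placeholder files with the names used above.
touch mol-20001.sdf mol-20002.sdf mol-20003.sdf

# "set --" loads the expanded pattern into $1, $2, ... in the same way
# a command such as obabel would receive it as arguments.
set -- mol-*.sdf
expanded_count=$#
expanded_list="$*"

echo "$expanded_count arguments: $expanded_list"
# prints: 3 arguments: mol-20001.sdf mol-20002.sdf mol-20003.sdf
```

If no file matches, a POSIX shell by default passes the pattern through unchanged, so the command would receive the literal text "mol-*.sdf" as a single file name.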
2.3. "stdin" stream - No input file is specified. In this case, the input data comes from the "stdin" stream, which could be text typed in from the keyboard, or data redirected from a file or a command pipe. With no input file name, "obabel" requires the "-i ..." option in the input_section to specify the input data format.
For example, the following "obabel" command uses "-i smiles" as the input_section, which tells Open Babel to read "stdin" as SMILES data. This command reads chemical data from "stdin" in SMILES format and converts it to benzene.svg.
herong$ obabel -i smiles -O benzene.svg
c1ccccc1
<Ctrl-D>
1 molecule converted
Note that the above command expects you to enter a SMILES string from the keyboard. You need to press "<Ctrl-D>" (on Unix-like systems) to end the input.
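The same "stdin" input can also be supplied through a command pipe instead of the keyboard, in which case no "<Ctrl-D>" is needed because the pipe closes on its own. The sketch below assumes a POSIX shell; the benzene_smiles function and the check for an installed "obabel" are additions for this illustration only.

```shell
#!/bin/sh
# Emit one SMILES record per line, which is what "obabel -i smiles"
# reads from stdin.
benzene_smiles() {
    printf '%s\n' 'c1ccccc1'
}

# Run the conversion only if obabel is installed. The guard belongs to
# this sketch; it is not something obabel itself requires.
if command -v obabel >/dev/null 2>&1; then
    benzene_smiles | obabel -i smiles -O benzene.svg
else
    echo "obabel not found; would run: benzene_smiles | obabel -i smiles -O benzene.svg"
fi
```

Any command that writes SMILES lines to its standard output could take the place of benzene_smiles on the left side of the pipe.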
3. Input SMILES on the command line - Open Babel also allows you to specify a SMILES string directly on the command line as input data using the "-:..." option, as shown below:
herong$ obabel -:c1ccccc1 -o sdf --gen2D

 OpenBabel09172116032D

  6  6  0  0  0  0  0  0  0  0999 V2000
   -0.8660   -0.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7321   -0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7321    1.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8660    1.5000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0000    1.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  6  2  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  3  4  1  0  0  0  0
  4  5  2  0  0  0  0
  5  6  1  0  0  0  0
M  END
$$$$
1 molecule converted
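Many SMILES strings contain characters that the shell treats specially, such as parentheses, so quoting the "-:..." argument is a sensible precaution. The sketch below assumes a POSIX shell and uses acetic acid, CC(=O)O, as an arbitrary example molecule; the variable names and the check for an installed "obabel" are additions for this illustration only.

```shell
#!/bin/sh
# Acetic acid. Typed unquoted on the command line, the parentheses
# would be a shell syntax error.
smiles='CC(=O)O'

# Build the "-:" argument in a variable. Quoting "$arg" below keeps it
# as one command-line argument with the SMILES text unchanged.
arg="-:$smiles"

# Run the conversion only if obabel is installed. The guard belongs to
# this sketch; it is not something obabel itself requires.
if command -v obabel >/dev/null 2>&1; then
    obabel "$arg" -o sdf --gen2D
else
    echo "obabel not found; would run: obabel '$arg' -o sdf --gen2D"
fi
```

Typed directly at the prompt, the equivalent command would be: obabel -:"CC(=O)O" -o sdf --gen2D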