CDL Chemical Algebra
As noted in the first example, operator + is a binary operator that accepts left-hand side (LHS) and right-hand side (RHS) operands. In that example, the RHS is the fragment to be added, so the LHS operand must carry the information on where to add it. For this reason, the second element of the algebra is the indexed molecule. The indexed molecule inherits from the plain CDL molecule (in programming terms, an indexed molecule IS A plain molecule) and adds one member variable of its own, which holds the mapping to a query molecule. The mapping is represented by the graph vertex indexes returned by the SMARTS algorithm. An indexed molecule cannot be instantiated directly, since it is the result type of applying a SMARTS algorithm to a plain molecule.
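The IS A relationship described above can be sketched with a toy model. This is not CDL code: the class names, the `atoms`/`bonds` representation, and the `mapping` field name are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Toy plain molecule: atom symbols plus bonds as pairs of vertex indexes."""
    atoms: list
    bonds: list = field(default_factory=list)


@dataclass
class IndexedMolecule(Molecule):
    """A plain molecule plus the vertex indexes matched by a query (hypothetical field)."""
    mapping: list = field(default_factory=list)


ethanol = Molecule(atoms=["C", "C", "O"], bonds=[(0, 1), (1, 2)])

# In CDL an indexed molecule is only produced by a SMARTS search; it is built
# by hand here purely to show the extra member on top of the inherited ones.
matched = IndexedMolecule(ethanol.atoms, ethanol.bonds, mapping=[2])
```

Because `IndexedMolecule` derives from `Molecule`, any code that accepts a plain molecule in this sketch also accepts an indexed one.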
The third element of the algebra is CDL's SMARTS class, which is constructed from a SMARTS character string.
The SMARTS character set is a superset of the SMILES character set: any valid SMILES is a valid SMARTS, but not the other way around.
The substructure search operator is a binary operator that applies the SMARTS algorithm of the RHS operand to the molecule represented by the LHS operand. For the time being, this operator is represented by the symbol ^.
The return type of the operator is an indexed_molecule.
operator ^ (plain molecule, smarts class) --> returns an indexed_molecule
A SMILES string converts directly to a plain molecule, and a SMARTS string to a SMARTS class, so this is also valid:
operator ^ (SMILES string, SMARTS string) --> returns an indexed_molecule
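As a rough illustration of the search operator, the sketch below overloads `^` on a toy molecule. It assumes a deliberately simplified query class, `ChainQuery` (a hypothetical stand-in for the SMARTS class), that only matches an unbranched chain of one-letter atom symbols. It is not a SMARTS matcher and not CDL's API.

```python
class Molecule:
    def __init__(self, atoms, bonds=()):
        self.atoms = list(atoms)
        self.bonds = [tuple(b) for b in bonds]

    def neighbors(self, i):
        """Vertex indexes bonded to vertex i."""
        return [b if a == i else a for a, b in self.bonds if i in (a, b)]

    def __xor__(self, query):
        # molecule ^ query -> indexed molecule carrying the matched indexes
        return IndexedMolecule(self.atoms, self.bonds, query.match(self))


class IndexedMolecule(Molecule):
    def __init__(self, atoms, bonds=(), mapping=()):
        super().__init__(atoms, bonds)
        self.mapping = list(mapping)


class ChainQuery:
    """Hypothetical stand-in for the SMARTS class: "CO" means C bonded to O."""

    def __init__(self, pattern):
        self.symbols = list(pattern)

    def match(self, mol):
        def extend(path):
            # Depth-first search: grow the path one matching neighbor at a time.
            if len(path) == len(self.symbols):
                return path
            wanted = self.symbols[len(path)]
            for n in mol.neighbors(path[-1]):
                if n not in path and mol.atoms[n] == wanted:
                    found = extend(path + [n])
                    if found:
                        return found
            return None

        for start, symbol in enumerate(mol.atoms):
            if symbol == self.symbols[0]:
                found = extend([start])
                if found:
                    return found
        return []  # no match: empty mapping


ethanol = Molecule(["C", "C", "O"], [(0, 1), (1, 2)])
hit = ethanol ^ ChainQuery("CO")
miss = ethanol ^ ChainQuery("N")
```

Only the first match is kept in this sketch, and the string-to-object conversions mentioned above are not modelled.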
The addition operator attaches the fragment represented by the RHS operand to the molecule represented by the LHS operand, at the first position of the mapping to the query molecule. If there is no such mapping, the fragment is not added. The addition operator creates one bond between the LHS and the RHS, hence:
Num. bonds resulting molecule = Num. bonds LHS + Num. bonds RHS + 1
This operator is represented by the symbol +.
operator + (indexed_molecule, plain molecule) --> returns an indexed_molecule
This is also valid:
operator + (indexed_molecule, SMILES string) --> returns an indexed_molecule
The returned molecule preserves the mapping contained in the LHS operand.
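The bond-count rule and the preserved mapping can be checked on a toy model. The sketch below is an illustration only: the classes are hypothetical, and the choice to bond the fragment through its first atom is an assumption, since the text does not say which fragment atom forms the new bond.

```python
class Molecule:
    def __init__(self, atoms, bonds=()):
        self.atoms = list(atoms)
        self.bonds = [tuple(b) for b in bonds]


class IndexedMolecule(Molecule):
    def __init__(self, atoms, bonds=(), mapping=()):
        super().__init__(atoms, bonds)
        self.mapping = list(mapping)

    def __add__(self, fragment):
        if not self.mapping:
            # No mapping: the fragment is not added.
            return IndexedMolecule(self.atoms, self.bonds, [])
        offset = len(self.atoms)  # fragment vertices are renumbered after ours
        atoms = self.atoms + fragment.atoms
        bonds = self.bonds + [(a + offset, b + offset) for a, b in fragment.bonds]
        # The one new bond: first mapped position <-> first fragment atom (assumed).
        bonds.append((self.mapping[0], offset))
        return IndexedMolecule(atoms, bonds, self.mapping)


ethane = IndexedMolecule(["C", "C"], [(0, 1)], mapping=[1])
fragment = Molecule(["C", "O"], [(0, 1)])
product = ethane + fragment

unmapped = IndexedMolecule(["C", "C"], [(0, 1)], mapping=[])
unchanged = unmapped + fragment
```

Here `product` has 1 + 1 + 1 = 3 bonds and keeps the mapping of the LHS, while `unchanged` is a copy of the LHS because it carries no mapping.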
On some occasions the user wants to connect a fragment to more than one position on the target molecule, for example in ring addition.
In the case of the addition operator, the position where the addition is to be performed is indicated by the first character of the SMARTS.
In the case of the fusion operator, two positions are needed. To make this possible, CDL's SMARTS was extended to express positions in molecules directly. A position in a CDL SMARTS is given by the symbol <. The first position is always the first character of the SMARTS; the second position is either defaulted or given explicitly by the < symbol. For example, a benzene ring in which the meta position is the second position of the map is written c1cc<ccc1.
The fusion operator, represented by the symbol &, works on two indexed molecules:
operator & (indexed molecule, indexed molecule) --> returns an indexed molecule
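To make the < position marker concrete, here is a minimal sketch of how such a string might be split into a plain pattern and two positions. The function name `marked_positions` is hypothetical. The sketch assumes that < follows the atom it marks (which is consistent with the meta example above), counts every letter as one atom (so two-letter symbols such as Cl are not handled), and returns None for a defaulted second position because the default rule is not stated in the text.

```python
from typing import Optional, Tuple


def marked_positions(pattern: str) -> Tuple[str, int, Optional[int]]:
    """Strip '<' markers; return (plain pattern, first position, second position)."""
    atom_count = 0
    second = None
    plain = []
    for ch in pattern:
        if ch == "<":
            if atom_count == 0:
                raise ValueError("'<' must follow an atom")
            second = atom_count - 1  # index of the atom just before the marker
            continue
        if ch.isalpha():
            atom_count += 1  # simplification: one letter = one atom
        plain.append(ch)
    # The first position is always the first atom of the pattern.
    return "".join(plain), 0, second


benzene_meta = marked_positions("c1cc<ccc1")
benzene_default = marked_positions("c1ccccc1")
```

For c1cc<ccc1 the marker follows the third ring atom (index 2), which is meta to the first atom (index 0).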
The subtraction operator removes aliphatic (non-cyclic) fragments from a molecule.
The subtraction operator searches for the RHS operand in the LHS operand. If only one match is found, the fragment is removed. If more than one match is found, then:
The operator is defined as:
operator - (indexed molecule, molecule) --> returns an indexed molecule
operator - (indexed molecule, indexed molecule) --> returns an indexed molecule
operator - (molecule, molecule) --> returns an indexed molecule
In the resulting indexed molecule, the first position of the mapping holds the vertex index to which the removed fragment was attached.
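The single-match case can be sketched as follows. The helper `remove_fragment` is hypothetical: it assumes the matched vertex indexes are already known, that the fragment is attached through exactly one bond, and that the remaining vertices are renumbered consecutively. The multiple-match case is not modelled.

```python
def remove_fragment(atoms, bonds, matched):
    """Remove the matched vertices; return (atoms, bonds, mapping)."""
    gone = set(matched)
    # Surviving atoms that were bonded to a removed atom: the attachment points.
    attach = [a if b in gone else b
              for a, b in bonds
              if (a in gone) != (b in gone)]
    if len(attach) != 1:
        raise ValueError("expected exactly one attachment bond")
    # Old vertex index -> new vertex index for the atoms that remain.
    renumber = {}
    for old in range(len(atoms)):
        if old not in gone:
            renumber[old] = len(renumber)
    new_atoms = [atoms[old] for old in renumber]  # dicts keep insertion order
    new_bonds = [(renumber[a], renumber[b])
                 for a, b in bonds
                 if a not in gone and b not in gone]
    # First mapping position: where the removed fragment used to be attached.
    return new_atoms, new_bonds, [renumber[attach[0]]]


# Toy propanol skeleton C-C-C-O: remove the O (vertex 3).
propyl = remove_fragment(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)], [3])

# Same idea with the fragment at the front, to show the renumbering.
ethyl = remove_fragment(["O", "C", "C"], [(0, 1), (1, 2)], [0])
```

In the second call the attachment atom was vertex 1 before removal and becomes vertex 0 afterwards, so the returned mapping refers to the renumbered graph.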