Chemical Descriptors Library: Substructure Search
Client Algorithms

Substructure Search

Searching for particular fragments or backbones within virtual molecules is one of the most common tasks in computational chemistry. Search speed becomes a bottleneck as the database grows, and virtual libraries with millions of compounds are not unusual in modern drug design. This makes flexible and fast substructure search algorithms important.
CDL implements Ullmann's algorithm for exact subgraph isomorphism matching, which is fast and flexible. There are also other techniques for quickly pruning chemical structures before a full match is attempted, of which fingerprints are the most frequently used.
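To make the idea concrete, here is a minimal Python sketch of an Ullmann-style search. It is an illustration only, not CDL's implementation or API: the function names, the atom-label lists, and the bond lists are all assumptions, and bond orders are ignored for brevity. Each fragment atom keeps a set of candidate structure atoms, a refinement step discards candidates whose neighbourhoods cannot be matched, and a depth-first search assigns the atoms one at a time.

```python
def make_adjacency(n_atoms, bonds):
    """Build a neighbour set per atom from a list of (a, b) bonds."""
    adj = [set() for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def refine(cand, p_adj, t_adj):
    """Ullmann-style refinement of the candidate sets (modified in place).

    Candidate j stays valid for fragment atom i only if every neighbour
    of i still has a candidate among the neighbours of j. Returns False
    as soon as some fragment atom has no candidate left.
    """
    changed = True
    while changed:
        changed = False
        for i, options in enumerate(cand):
            for j in list(options):
                if any(not (cand[x] & t_adj[j]) for x in p_adj[i]):
                    options.discard(j)
                    changed = True
            if not options:
                return False
    return True


def find_substructure(p_labels, p_bonds, t_labels, t_bonds):
    """Return a list mapping fragment atom -> structure atom, or None."""
    p_adj = make_adjacency(len(p_labels), p_bonds)
    t_adj = make_adjacency(len(t_labels), t_bonds)
    # Initial candidates: same element label and enough neighbours.
    start = [
        {j for j, lab in enumerate(t_labels)
         if lab == p_labels[i] and len(t_adj[j]) >= len(p_adj[i])}
        for i in range(len(p_labels))
    ]

    def search(i, cand):
        if not refine(cand, p_adj, t_adj):
            return None
        if i == len(p_labels):
            # Every set is now a singleton: read off the mapping.
            return [next(iter(c)) for c in cand]
        for j in sorted(cand[i]):
            trial = [set(c) for c in cand]
            trial[i] = {j}
            for k in range(len(trial)):
                if k != i:
                    trial[k].discard(j)  # keep the mapping one-to-one
            result = search(i + 1, trial)
            if result is not None:
                return result
        return None

    return search(0, start)


# Ethanol heavy atoms C-C-O, searched for the fragment C-O.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
print(find_substructure(["C", "O"], [(0, 1)], *ethanol))  # → [1, 2]
```

The match is not induced: every fragment bond must exist in the structure, but the structure may have extra bonds between the matched atoms, which is the usual meaning of a substructure hit.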
CDL's substructure search is tackled in two steps:

  1. Fingerprint generation. Structures that can never match the fragment are pruned: if the fingerprint of the fragment is not contained within the fingerprint of the structure, no match is possible. Because different fragments can generate the same fingerprint, a full substructure search is still required when the fingerprints are compatible.
  2. Full subgraph isomorphism matching with Ullmann's algorithm. This step gives the final answer on whether the structure matches or not.
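The pruning in step 1 can be sketched in a few lines of Python. This is a hypothetical illustration, not CDL's fingerprint scheme: the feature set (element labels plus bonded label pairs), the 64-bit width, and all function names are assumptions. What it shows is the containment test: a structure survives only if every bit set in the fragment's fingerprint is also set in its own.

```python
import zlib

FP_BITS = 64  # assumed width; real fingerprints are usually much longer


def fingerprint(labels, bonds, n_bits=FP_BITS):
    """Hash simple features of a molecule into an integer bit set.

    Features used here (an assumption for this sketch): each element
    label, and each bond as an order-independent pair of labels.
    """
    features = list(labels)
    for a, b in bonds:
        features.append("-".join(sorted((labels[a], labels[b]))))
    fp = 0
    for feature in features:
        # crc32 is stable across runs, unlike Python's built-in hash().
        fp |= 1 << (zlib.crc32(feature.encode()) % n_bits)
    return fp


def may_contain(structure_fp, fragment_fp):
    """True if all fragment bits are present in the structure."""
    return (structure_fp & fragment_fp) == fragment_fp


def screen(fragment, library):
    """Return the names of library structures that pass the prefilter."""
    fragment_fp = fingerprint(*fragment)
    return [name for name, mol in library.items()
            if may_contain(fingerprint(*mol), fragment_fp)]


library = {
    "ethanol": (["C", "C", "O"], [(0, 1), (1, 2)]),
    "propane": (["C", "C", "C"], [(0, 1), (1, 2)]),
}
survivors = screen((["C", "O"], [(0, 1)]), library)
```

Because distinct features can hash to the same bit, the screen may let through structures that do not actually contain the fragment, but it never rejects one that does. Only the survivors need to be passed on to the full subgraph isomorphism check of step 2.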


Copyright (c) Vladimir Josef Sykora & Morphochem AG 2003