Chemical Descriptors Library: Ullmann Algorithm for subgraph isomorphism
Ullmann Algorithm

As a policy in CDL, all algorithms that do not involve a direct knowledge of atom, bond or molecule properties are written accepting a graph object rather than a molecule. In this way, the algorithms can be used for BGL in general, not only for CDL. Even if the knowledge is needed, then the access to the properties is parametrized somehow to hide CDL's implementation.
Ullmann's algorithm for subgraph isomorphism is an example of this case.
Subgraph isomorphism testing is a commom task in computational chemistry. The basic task in this field is the substructure search.
The algorithm prunes the search using labels for vertices and edges. For chemical graph this labels come from the properties of the molecule. Then, the labels are user defined predicates that accepts vertices/edges of two graphs: the subgraph and the graph that is being tested for isomorphism. As simple as that. If you want to search substructures taking chirality into account, then you provide a predicate that accepts a vertex descriptor from the graph, and one from the subgraph, and returns true if the atoms are the same element, and have the same chirality. Same semantics for the edge labeling.

Prototype

Ullmann's algorithm has two versions:

  template <  class Graph
            , class VertexLabeling
            , class EdgeLabeling
            , class BackInsertionSequence  
  >
  bool
  ullmann(const Graph& g1, const Graph& g2,
  VertexLabeling& vertex_labeling, EdgeLabeling& edge_labeling, BackInsertionSequence& F);


  template <  class Graph
            , class VertexLabeling
            , class EdgeLabeling
            , class DoubleBackInsertionSequence
  >
  bool
  ullmann_all(const Graph& g1, const Graph& g2,
  VertexLabeling& vertex_labeling, EdgeLabeling& edge_labeling, DoubleBackInsertionSequence& F);


Arguments

For the first version :
ParameterDescriptionModels
g1
The subgraph to search for BGL adjacency_list<>
g2
The graph to search in BGL adjacency_list<>
vertex_labeling
The binary predicate that compares the labels of the vertices Binary predicate accepting vertices descriptors of g1 and g2 in that order.
edge_labeling
The binary predicate that compares the labels of the edges Binary predicate accepting edge descriptors of g1 and g2 in that order.
F
The mapped vertices for both structures Sequence of std::pair<vertex_descriptor,vertex_descriptor> that provides push_back() method. The mapped vertices are first: g1 and second: g2

For the second:
ParameterDescriptionModels
g1
The subgraph to search for BGL adjacency_list<>
g2
The graph to search in BGL adjacency_list<>
vertex_labeling
The binary predicate that compares the labels of the vertices Binary predicate accepting vertices descriptors of g1 and g2 in that order.
edge_labeling
The binary predicate that compares the labels of the edges Binary predicate accepting edge descriptors of g1 and g2 in that order.
F
The mapped vertices for both structures Sequence of sequences of std::pair<vertex_descriptor,vertex_descriptor> that provides push_back() method. Each element of the outer container contains each possible mapping. The mapped vertices are first: g1 and second: g2

Description

Both versions searches the isomorphism of subgraph g1 into g2. The first version searches returns when the first mapped is found, the reason for which parameter F is a sequence of vertex descriptor pairs. The second version finds all the possible mappings of g1 into g2. Parameter F contains all possible mappings. Each element in the outer container is a possible mapping.

Definition

morpho/cdl/bgl_expansions/algorithms/ullmann.hpp

Complexity

Worst-case scenario : O(n!n^3) [Kukluk, 2004]. Where n are the number of nodes.
In chemistry, this complexity is largely pruned.

Example

Lest consider we want to search for a substructure considering the bond types:
#include <morpho/cdl/molecule/molecule.hpp>
#include <morpho/cdl/bgl_expansions/algorithms/ullmann.hpp>
#include <vector>
#include <algorithm>
#include <iterator>

// define the atom labeling :

  template <class Molecule>
  struct atom_labeling {
    atom_labeling(const Molecule& subgraph, const Molecule& graph) :
      m_subgraph_mol(subgraph), m_graph_mol(graph) {}

    bool operator() (typename Molecule::vertex_descriptor v1,
    typename Molecule::vertex_descriptor v2) {
      const bool rvo(AtomProperty<atomic_numberS,typename Molecule::atom_type>::get(m_subgraph_mol.get_atom(v1))
      == AtomProperty<atomic_numberS,typename Molecule::atom_type>::get(m_graph_mol.get_atom(v2)));
      return rvo;
    }
   private :
    const Molecule& m_subgraph_mol;
    const Molecule& m_graph_mol;
  };

// define the edge labeling

  template <class Molecule>
  struct bond_labeling {

    bond_labeling(const Molecule& subgraph, const Molecule& graph) :
      m_subgraph_mol(subgraph), m_graph_mol(graph) {}

    bool inline operator() (typename Molecule::edge_descriptor e1,
    typename Molecule::edge_descriptor e2) {
      const bool rvo(BondProperty<type_of_bondS,typename Molecule::bond_type>::get(m_subgraph_mol.get_bond(e1))
          == BondProperty<type_of_bondS,typename Molecule::bond_type>::get(m_graph_mol.get_bond(e2)));
      return rvo;
    }
   private :
    const Molecule& m_subgraph_mol;
    const Molecule& m_graph_mol;
  };

int main() {
  typedef molecule  M;
  nail_juice<M>     j1, j2;
  get_juice_from_stream(std::cin,j1,0,smile_formatT());
  get_juice_from_stream(std::cin,j2,1,smile_formatT());
  M m1(j1), m2(j2);
  // the labels:
  atom_labeling<M>      alabel(m1.get_graph(), m2.get_graph());
  bond_labeling<M>      blabel(m1.get_graph(), m2.get_graph());
  // get the first mapping :
  std::vector<std::pair<size_t,size_t> >     mapping;
  ullmann(m1.get_graph(), m2.get_graph(), alabel, blabel, mapping);
  std::cout << "The mapped vertices : \n";
  std::copy(mapping.begin(),mapping.end(), std::ostream_iterator<size_t>(std::cout, " "));
  return 0;
}




Copyright © Vladimir Josef Sykora & Morphochem AG 2003
Copyright © Vladimir Josef Sykora 2003-2006