Chemical Descriptors Library: Atom Types Fingerprints
Atom Types Fingerprints
The atom types fingerprints method is similar to the topological pharmacophores
method. The two methods differ in the atom types they consider, and in that an
atom types fingerprint is always built around a chosen atom of interest.
The method counts the atoms of each considered type that surround the atom
of interest, up to a maximum distance.
The method also resembles an atomic radial distribution function (RDF), except that
it counts atom types and stores each count at a predefined position
of the fingerprint. The position reflects the shortest-path distance between the
counted atoms and the atom of interest.
The method is described by Xing and Glen (2002).
The atom types considered are:
sp3 Carbon
sp2 Carbon
sp Carbon
aromatic Carbon
Carbon cation
sp3 Nitrogen
sp2 Nitrogen
sp Nitrogen
aromatic Nitrogen
amide Nitrogen
sp3 Nitrogen positively charged
sp3 Oxygen
sp2 Oxygen
Oxygen in carboxylic and phosphoric acid
sp3 Sulfur
sp2 Sulfur
Sulfoxide Sulfur
Sulfone Sulfur
Hydrogen
Fluorine
Chlorine
Bromine
Iodine
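The counting scheme described above can be sketched in standalone C++, independent of the library. This is only an illustration under stated assumptions, not the CDL implementation: the function name count_types_by_distance, the untyped marker, the use of plain integer type indices, and the fingerprint layout (one block of counters per distance, one counter per atom type inside each block) are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker for atoms that match none of the considered types.
const std::size_t untyped = static_cast<std::size_t>(-1);

// Counts, for every shortest-path distance 1..max_length from `center`,
// how many atoms of each type lie at that distance.
// dist      : square matrix of shortest-path distances (in bonds)
// types     : type index of each atom, in [0, num_types), or `untyped`
// Assumed layout: counter for (distance d, type t) is at (d - 1) * num_types + t.
std::vector<std::size_t> count_types_by_distance(
    const std::vector<std::vector<std::size_t> >& dist,
    const std::vector<std::size_t>& types,
    std::size_t num_types,
    std::size_t center,
    std::size_t max_length = 5)
{
    assert(center < dist.size());
    assert(types.size() == dist.size());
    std::vector<std::size_t> fp(num_types * max_length, 0);
    for (std::size_t a = 0; a < dist.size(); ++a) {
        const std::size_t d = dist[center][a];
        if (d == 0 || d > max_length) continue;  // skip the center and distant atoms
        if (types[a] == untyped) continue;       // atom is of no considered type
        assert(types[a] < num_types);
        ++fp[(d - 1) * num_types + types[a]];
    }
    return fp;
}
```

With the 23 atom types listed above and a maximum distance of 5, this layout would give a fingerprint of 115 counters. Each atom is visited once, so the cost is one type lookup per atom.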
Prototype
The method is implemented as a functor: its constructor takes the molecule, and
operator() takes the index of the vertex (atom) of interest.
The fingerprint is stored as a std::vector<size_t> and is returned by
get_fingerprint().
template <class Molecule>
struct pka_fingerprint {
    pka_fingerprint(const molecule_t& m, size_t max_length = 5);
    // v: index of the vertex (atom) of interest
    bool operator()(size_t v);
    const std::vector<size_t>& get_fingerprint();
};
Definition
// for the atom types :
#include <morpho/cdl/fingerprints/sybyl_typing.hpp>
// for the pka_fingerprint functor :
#include <morpho/cdl/fingerprints/pka_fp.hpp>
Preconditions
The distance matrix must already be assigned and accessible through
MolProperty<distance_matrixS,Molecule>::get().
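To illustrate what this precondition provides, the sketch below builds a matrix of shortest-path (bond count) distances with one breadth-first search per atom. It is a standalone example and not the library's own code: the function name topological_distances, the adjacency-list input, and the unreachable marker are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical marker for atom pairs in different connected components.
const std::size_t unreachable = static_cast<std::size_t>(-1);

// adjacency[u] lists the atoms bonded to atom u.
// Returns dist, where dist[s][v] is the number of bonds on the shortest
// path from s to v, or `unreachable` if no path exists.
std::vector<std::vector<std::size_t> > topological_distances(
    const std::vector<std::vector<std::size_t> >& adjacency)
{
    const std::size_t n = adjacency.size();
    std::vector<std::vector<std::size_t> > dist(
        n, std::vector<std::size_t>(n, unreachable));
    for (std::size_t s = 0; s < n; ++s) {
        dist[s][s] = 0;
        std::queue<std::size_t> pending;
        pending.push(s);
        while (!pending.empty()) {
            const std::size_t u = pending.front();
            pending.pop();
            for (std::size_t k = 0; k < adjacency[u].size(); ++k) {
                const std::size_t v = adjacency[u][k];
                assert(v < n);
                if (dist[s][v] == unreachable) {  // first visit is the shortest path
                    dist[s][v] = dist[s][u] + 1;
                    pending.push(v);
                }
            }
        }
    }
    return dist;
}
```

Computing the matrix once per molecule lets each fingerprint read the distances from the atom of interest with a single row lookup.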
Complexity
Assuming the distance matrix is already assigned (as done by the default
constructor of the molecule), the method requires a linear number of applications
of the atom type predicate.
Example
The following program prints the atom types fingerprint of the first nitrogen atom
found in each molecule read from standard input. The molecules are in SDF format:
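#include <algorithm>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <vector>
#include <morpho/cdl/fingerprints/sybyl_typing.hpp>
#include <morpho/cdl/fingerprints/pka_fp.hpp>
// The remaining CDL headers and the using-declarations that bring molecule,
// AtomProperty, nail_juice, get_juice_from_stream, sdf_formatT and tie into
// scope are not shown in this example.

int main() {
    typedef molecule<> M;
    typedef M::atom_type atype;
    typedef AtomProperty<atomic_numberS,atype>::result_type ANum;
    ANum N(7);  // atomic number of nitrogen
    std::istream::pos_type pos;
    while (std::cin.good()) {
        // peek ahead for more input, then restore the stream position
        pos = std::cin.tellg();
        if (!morpho::look_ahead(std::cin, 4)) break;
        std::cin.seekg(pos);
        // read the next SDF record and build the molecule from it
        nail_juice<> j;
        std::map<std::string, std::vector<std::string> > n_props;
        get_juice_from_stream(std::cin, j, 0, sdf_formatT(), &n_props);
        M m(j, true);
        M::vertex_iterator vi, vi_end;
        tie(vi, vi_end) = m.vertices();
        while (vi != vi_end) {
            if (AtomProperty<atomic_numberS,atype>::get(m.get_atom(*vi)) == N) {
                pka_fingerprint<M> FP(m);
                FP(*vi);
                const std::vector<size_t>& fp = FP.get_fingerprint();
                std::copy(fp.begin(), fp.end(),
                          std::ostream_iterator<size_t>(std::cout, " "));
                std::cout << std::endl;
                break;  // only the first nitrogen of each molecule is reported
            }
            ++vi;
        }
    }
    return 0;
}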
References
Xing, Glen. "Novel Methods for the Prediction of LogP, pKa,
and LogD". J. Chem. Inf. Comput. Sci. 2002. Vol 42. pp. 796-805.
Copyright © Vladimir Sykora & Cyprotex Ltd 2006