Chemical Descriptors Library: Molecular Parsers

Molecular Parsers

CDL can read and write several molecular formats, such as MDL's MOL and Daylight's SMILES.
The function that reads the stream is specialized for each supported format. The same holds for the function that writes the molecule.

The properties that a format carries are handled through a pointer to a std::map<std::string, std::vector<std::string> >, which maps the name of each property to a vector of strings holding that property's values.
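As an illustration of that container, here is a minimal sketch that uses only the standard library. The helper names add_property and dump_properties, and the sample property names, are assumptions made for this example; they are not part of CDL.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Same container type that the parser functions take by pointer.
typedef std::map<std::string, std::vector<std::string> > property_map;

// Hypothetical helper: append one value to the named property,
// creating the entry the first time the name is seen.
inline void add_property(property_map& props, const std::string& name,
                         const std::string& value) {
  assert(!name.empty());  // a property needs a name to be looked up later
  props[name].push_back(value);
}

// Hypothetical helper: render each property as "name: v1; v2", one per line.
// std::map iterates in key order, so the output is sorted by name.
inline std::string dump_properties(const property_map& props) {
  std::ostringstream out;
  for (property_map::const_iterator it = props.begin(); it != props.end(); ++it) {
    out << it->first << ':';
    for (std::size_t i = 0; i < it->second.size(); ++i)
      out << (i == 0 ? " " : "; ") << it->second[i];
    out << '\n';
  }
  return out.str();
}
```

A property with several values, such as a list of synonyms, simply ends up with more than one string in its vector.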


To read streams:

  template <class Stream, class Molecule>
  bool get_juice_from_stream(Stream& in, nail_juice<Molecule>& juice, size_t count, FORMAT_TAG,
                             std::map<std::string, std::vector<std::string> >* props = NULL);

To write molecules:

  template <class CT, class AP, class BP, class MP, class AT, class BT, class MT,
            class Format>
  print_nail<molecule<CT,AP,BP,MP,AT,BT,MT>, Format>
  write_nail(const molecule<CT,AP,BP,MP,AT,BT,MT>& nail, Format,
             const std::map<std::string, std::vector<std::string> >* props = NULL);


Format tags:

  Tag              Format
  sdf_formatT      MDL's MOL format
  smiles_formatT   Daylight's SMILES format
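Selecting behaviour through an empty tag object, as the format tags do, is usually called tag dispatch. The sketch below shows the general technique only. The types demo_sdf_tag and demo_smiles_tag and the functions format_name and describe are hypothetical stand-ins for this example and are not CDL names.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for format tags: empty types that carry no data
// and exist only so that overload resolution can tell them apart.
struct demo_sdf_tag {};
struct demo_smiles_tag {};

// One overload per tag. The compiler picks the overload from the static
// type of the argument, so the choice costs nothing at run time.
inline std::string format_name(demo_sdf_tag) { return "sdf"; }
inline std::string format_name(demo_smiles_tag) { return "smiles"; }

// Generic code forwards the tag it was given and never inspects it.
template <class FormatTag>
std::string describe(const std::string& title, FormatTag tag) {
  assert(!title.empty());  // the label would be meaningless without a title
  return title + " [" + format_name(tag) + "]";
}
```

The caller constructs the tag in place, for example describe("benzene", demo_smiles_tag()), in the same way that sdf_formatT() or smiles_formatT() is constructed at the call site.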


For get_juice_from_stream():

  Parameter    Description                                                      Type
  in           The stream that holds the molecular format                       std::basic_istream
  juice        The juice in which the parsed information is stored              nail_juice
  count        Unsigned integer that tracks which molecule is being processed   std::size_t
  FORMAT_TAG   One of the constructed molecular format tags                     sdf_formatT() or smiles_formatT()
  props        Where the external properties are stored                         non-const pointer to std::map<std::string, std::vector<std::string> >

For write_nail():

  Parameter    Description                               Type
  nail         The molecule that you are printing        molecule<>
  Format       The format in which you are printing      sdf_formatT() or smiles_formatT()
  props        The external properties of the molecule   const pointer to std::map<std::string, std::vector<std::string> >


get_juice_from_stream() reads the values of a molecular format and stores them in a juice. The function returns a boolean indicating whether the parsing was successful.
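To make the idea of a boolean result and an optional property map concrete, the following is a minimal sketch that collects the data items of one SD record (header lines of the form "> <NAME>", value lines, a blank line, and the "$$$$" terminator) into the map. It is not CDL's parser: read_sdf_properties is a hypothetical function written for this example, it ignores the connection table, and it assumes Unix line endings.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<std::string> > property_map;

// Hypothetical sketch: read the data items of one SD record.
// Returns true when the "$$$$" terminator is reached, false when the
// stream ends first or a data header is malformed.
// props may be a null pointer, in which case the values are skipped.
inline bool read_sdf_properties(std::istream& in, property_map* props = 0) {
  std::string line;
  std::string current;  // name of the data item being read, empty if none
  while (std::getline(in, line)) {
    if (line == "$$$$") return true;  // end of record
    if (!line.empty() && line[0] == '>') {
      // Data header: the field name sits between '<' and the next '>'.
      const std::string::size_type open = line.find('<');
      if (open == std::string::npos) return false;
      const std::string::size_type close = line.find('>', open + 1);
      if (close == std::string::npos || close == open + 1) return false;
      assert(open < close);
      current = line.substr(open + 1, close - open - 1);
      if (props) (*props)[current];  // create the entry even if no value follows
    } else if (line.empty()) {
      current.clear();  // a blank line closes the current data item
    } else if (!current.empty() && props) {
      (*props)[current].push_back(line);
    }
  }
  return false;  // stream ended before the record terminator
}
```

Lines that appear before the first data header, such as the connection table, are ignored by this sketch. A caller would test the returned boolean before using the collected properties.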

write_nail() returns a functor whose operator() takes a reference to a stream. You do not need to deal with this functor directly. Just make sure that the stream you are writing to provides operator<<.
If a non-NULL props pointer is given when writing SMILES, the molecule is streamed out in Daylight's TDT (Thor Data Tree) format: SMILES with properties.
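The pattern of returning a printer object that is later applied to a stream can be sketched as follows. This is an illustration of the general idiom under stated assumptions, not the definition of print_nail: demo_printer, demo_write and render are hypothetical names, and the sketch is fixed to std::ostream for simplicity.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the object returned by a write function.
// It captures what to print and does the writing when given a stream.
class demo_printer {
public:
  explicit demo_printer(const std::string& text) : text_(text) {}
  // The functor call: write the captured text to the stream.
  std::ostream& operator()(std::ostream& out) const { return out << text_; }
private:
  std::string text_;
};

// Hypothetical factory, playing the role of the write function.
inline demo_printer demo_write(const std::string& text) {
  assert(!text.empty());  // nothing sensible to print for an empty record
  return demo_printer(text);
}

// Streaming the printer simply invokes it on the stream, which is why
// callers never need to call operator() themselves.
inline std::ostream& operator<<(std::ostream& out, const demo_printer& printer) {
  return printer(out);
}

// Convenience for the example: capture the printed form in a string.
inline std::string render(const demo_printer& printer) {
  std::ostringstream out;
  out << printer;
  return out.str();
}
```

With this in place, an expression such as std::cout << demo_write("c1ccccc1") << std::endl reads like ordinary stream output, and the temporary printer is destroyed at the end of the statement.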

Where defined

#include <morpho/cdl/parsers/parsers.cpp>


// Get molecules as sdf format (with properties) and
// write them as Daylight's tdt format (smiles with
// properties)

#include <iostream>
#include <map>
#include <string>
#include <vector>

// --- cdl

#include <morpho/cdl/molecule/molecule.hpp>
#include <morpho/cdl/parsers/parsers.cpp>

int main() {
  using namespace morpho::cdl;
  while ( !std::cin.eof() ) {
    // first check that the stream has a valid sdf :
    if ( !morpho::look_ahead(std::cin, 4) ) break;
    // declare the juice
    nail_juice<>  j_from_sdf;
    // declare the object to contain the properties
    std::map<std::string, std::vector<std::string> >    mol_sdf_properties;
    // now get the juice and the external properties,
    // and stop if the record could not be parsed :
    if ( !get_juice_from_stream(std::cin, j_from_sdf, 0, sdf_formatT(), &mol_sdf_properties) ) break;
    // construct the molecule :
    molecule<>    mol(j_from_sdf);
    // Now write the molecule :
    // If we are streaming out the molecule as smiles, and we
    // provide a pointer to the properties which is not NULL, then
    // the tdt format will be used :
    std::cout << write_nail(mol, smiles_formatT(), &mol_sdf_properties) << std::endl;
    // the stream should now have the smiles and properties in the tdt format
  }
  return 0;
}

