Question

ChemmineR package - fpSim and cmp.similarity - Tanimoto kernels

I have a text file containing 100 drug compounds (rows) and 100 protein kinase targets (columns). The values reflect how tightly each compound binds to each target. I also have the SMILES strings for the drugs.

I converted the SMILES to SDF and then used the fingerprintOB function to generate fingerprints from that SDFset using OpenBabel. I generated three different fingerprints: FP2, FP3, and FP4.

Using those fingerprints, I now want to compute fingerprint-based Tanimoto kernels.

Which function should I use: fpSim or cmp.similarity?

cmp.similarity does not work. It asks for APset data, but I have no idea how to convert my FPsets (FP2, FP3, FP4) to APset data.

ChemmineR • 2.0k views

updated 8.0 years ago by Thomas Girke ★ 1.7k • written 8.0 years ago by ረ • 0

score 0 · Answer 1 · 2016-04-23

For fingerprints, use fpSim. The cmp.similarity function was designed for atom pairs, which are integer vectors of variable length, whereas fingerprints are binary representations of fixed length. A special case is atom pair fingerprints, which are fingerprints generated from atom pairs. To get these, first generate atom pairs from an SDFset with sdf2ap, and then compute the corresponding fingerprints with desc2fp (see the vignette for an example). Since fingerprints contain much less information than atom pairs, it is not possible to convert them back to atom pairs.
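As a rough illustration of the workflow described above, here is a minimal sketch. It assumes ChemmineR is installed and uses the sdfsample data set that ships with the package; the subset size of 10 and the variable names are arbitrary choices for this example. It builds atom pair fingerprints with sdf2ap and desc2fp, then uses fpSim for a single query and for an all-against-all Tanimoto matrix.

```r
# Sketch only: assumes the ChemmineR package (Bioconductor) is installed.
library(ChemmineR)

data(sdfsample)                # example SDFset shipped with ChemmineR
sdfset <- sdfsample[1:10]      # small, arbitrary subset to keep this quick

# Atom pair fingerprints: SDFset -> atom pairs -> fixed-length fingerprints
apset <- sdf2ap(sdfset)
fpset <- desc2fp(apset, descnames = 1024, type = "FPset")

# One query compound against the whole set, ranked by Tanimoto similarity
fpSim(fpset[1], fpset, method = "Tanimoto")

# All-against-all Tanimoto matrix; sorted = FALSE keeps the original
# compound order so that every column lines up the same way
simMA <- sapply(cid(fpset), function(x) fpSim(fpset[x], fpset, sorted = FALSE))
dim(simMA)
```

An FPset returned by fingerprintOB (such as the FP2, FP3, or FP4 sets from the question) should be usable in place of fpset in the two fpSim calls, although that variant was not run for this sketch.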

Additional similarity methods of interest could be Maximum Common Substructures (MCS), provided by the affiliated fmcsR package, or the Rchemcpp package from Michael Mahr and Guenter Klambauer.

Thomas