3D Decomposition Kernel
Introduction
To run learning and classification experiments, the data must first be converted from the MOL2 format to an internal format with the program parseShapes.pl.
Once converted, the data can be used with the svmDlight programs through the shapes_plugin.so plugin: use svm_Dmatrix to explicitly instantiate the kernel matrix, and svm_Dlearn/svm_Dclassify for learning and classification.
Download
You can download the source code for 3DDK [here].
Sample usage and output
 conversion from MOL2 to the internal format:
bin/parseShapes.pl data/dat.test data/list dat.3ddk list 4
Quick reference for parameters:
 MOL2 input file
 ID list [only specified molecules will be converted]
 output file
 distance from central atom
 add a target to the data: each data item is serialized as a one-line string, and the target is a single value in {-1,1} placed as the first element of the line
For testing purposes you can simply add a random column of targets:
cat dat.3ddk | awk '{if (rand()>.5) {print 1,$0} else {print -1,$0}}' > datAndTarget.3ddk
In real experiments you will have a file of targets, which you can merge with the data file using the paste command.
 make a kernel matrix using svmDlight:
svm_Dmatrix -t 0 -u 2.5 -D lib/shapes_plugin.so -R datAndTarget.3ddk testm
Quick reference for parameters:
 -t 0 type of kernel transformation (0 means no transformation)
 -u gamma value for the edge Gaussian kernel
 perform learning:
svm_Dlearn -D lib/shapes_plugin.so datAndTarget.3ddk mod
 perform classification:
svm_Dclassify -D lib/shapes_plugin.so datAndTarget.3ddk mod out
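As a sketch of the target-merging step for real experiments, the commands below join a targets file with a converted data file using paste. The file names sample_targets.txt and sample.3ddk and their contents are hypothetical stand-ins. The space delimiter is an assumption chosen to match the output of the awk command above, and may need adjusting to whatever separator the plugin expects.

```shell
# Hypothetical sample files standing in for real targets and converted data.
printf '%s\n' 1 -1 1 > sample_targets.txt
printf '%s\n' 'mol_a data' 'mol_b data' 'mol_c data' > sample.3ddk

# Refuse to merge when the two files do not have the same number of lines,
# since paste would otherwise silently produce incomplete rows.
n_targets=$(wc -l < sample_targets.txt | tr -d ' ')
n_data=$(wc -l < sample.3ddk | tr -d ' ')
if [ "$n_targets" -ne "$n_data" ]; then
  echo "line count mismatch: $n_targets targets, $n_data data items" >&2
  exit 1
fi

# Put the target first on each line, separated from the data by one space.
paste -d ' ' sample_targets.txt sample.3ddk > sampleAndTarget.3ddk
```

The line-count check matters because the targets and the data are matched purely by position: line k of the targets file is taken as the target of the k-th molecule.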
2D Weighted Decomposition Kernel
Introduction
To run learning and classification experiments, the data must first be converted from the MOL2 format to the RecursiveDecompositionKernel (rdk) format. This is done with the program MolecularConverter (note: the Boost C++ libraries are needed to compile it).
Once converted, the data can be used with the svmDlight programs through the svmDlight_rdk.so plugin: use svm_Dmatrix to explicitly instantiate the kernel matrix, and svm_Dlearn/svm_Dclassify for learning and classification.
Download
You can download the source code for 2D WDK [here].
Sample usage and output
 conversion from MOL2 to rdk format:
./MolecularConverter a AtomLabelMap.InDat b BondLabelMap.InDat f dat.test cat 3 cac 3 cbt 3 cebt 3 > dat.rdk
Quick reference for parameters:
 a [atom conversion dictionary from symbols to integers]
 b [bond type conversion dictionary from symbols to integers]
 cat [context radius for atom type attributes]
 cac [context radius for atom charge attributes]
 cbt [context radius for bond type attributes]
 add a target to the data: each data item is serialized as a one-line string, and the target is a single value in {-1,1} placed as the first element of the line
For testing purposes you can simply add a random column of targets:
cat dat.rdk | awk '{if (rand()>.5) {print 1,$0} else {print -1,$0}}' > datAndTarget.rdk
In real experiments you will have a file of targets, which you can merge with the data file using the paste command.
 make a kernel matrix using svmDlight:
svm_Dmatrix -D Plugin_svmDlight/svmDlight_rdk.so -R DataTest/datAndTarget.rdk mtx
 perform learning:
svm_Dlearn -D Plugin_svmDlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mod
 perform classification:
svm_Dclassify -D Plugin_svmDlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mod out
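Before building the kernel matrix it may be worth checking that every line of the data file begins with a valid target. The following is a minimal sketch and not part of the distribution: the file sample_check.rdk and its contents are hypothetical, and it assumes fields are whitespace-separated, as in the awk example above.

```shell
# Hypothetical sample file: the third line carries an invalid target (0).
printf '%s\n' '1 mol_a' '-1 mol_b' '0 mol_c' > sample_check.rdk

# Report every line whose first field is neither 1 nor -1.
# awk exits with status 1 if at least one bad line was found.
awk '
  $1 != "1" && $1 != "-1" {
    printf "line %d: bad target \"%s\"\n", NR, $1
    bad = 1
  }
  END { exit (bad + 0) }
' sample_check.rdk > bad_targets.txt || echo "invalid targets found" >&2
```

The comparison is done on strings, so a target written as +1 or 1.0 would also be reported. Whether the plugin accepts such spellings is not stated here, so the check is deliberately strict.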
Reference
For reference on svmDlight see http://www.dsi.unifi.it/neural/src/svmDlight/
For reference on the 3D and 2D WDK kernel see:
Demon Kernel
Introduction
The DemonKernel plugin performs learning and classification directly on precomputed kernel matrices.
Availability
The DemonKernel plugin can be downloaded from http://www.dsi.unifi.it/neural/src/DemonKernel/DemonKernel.tgz
Usage
 make a kernel matrix using svmDlight:
svm_Dmatrix -D Plugin_svmDlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mtx
 load the matrix into shared memory, giving it a unique name:
loadKernelMatrix mtx exp1.mtx
 make a special target file that contains, for each element, the target value and the element's index as used in the computation of the kernel matrix (e.g. the first element has index 0, the second has index 1, and so on up to the nth element with index n-1)
For testing purposes you can simply generate a random column of targets:
cat dat.rdk | awk '{if (rand()>.5) {print 1,NR-1} else {print -1,NR-1}}' > datAndTargetMatrix.rdk
 perform training, passing the name of the uploaded kernel matrix as the -u parameter:
svm_Dlearn -D svmDlight_demon.so -u exp1.mtx datAndTargetMatrix.rdk mod
 perform classification:
svm_Dclassify -D svmDlight_demon.so datAndTargetMatrix.rdk mod out
 remove the matrix when it is no longer needed:
unloadKernelMatrix exp1.mtx
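With real targets, the special target file described above can be built from an existing targets file instead of random values. This is a minimal sketch under stated assumptions: sample_targets.txt and its contents are hypothetical, and its lines are assumed to be in the same order as the data file that was used to compute the kernel matrix.

```shell
# Hypothetical targets, one per line, in the same order as the rows of the kernel matrix.
printf '%s\n' 1 -1 -1 1 > sample_targets.txt

# Emit "target index" pairs; NR is 1-based in awk, so NR-1 gives the 0-based index.
awk '{ print $1, NR - 1 }' sample_targets.txt > sampleTargetMatrix.rdk
```

Each output line holds the target followed by the zero-based position of the element, in the same layout as the file produced by the random-target command above.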

7th Sept 2007. Machine Learning and Neural Networks Group.
