3D Decomposition Kernel
Introduction
Before running learning and classification experiments, the data must be converted from the MOL2 format to an internal format with the program parseShapes.pl.
After conversion, the learning/classification procedures can use the shapes_plugin.so plugin with the svm-Dlight programs: use svm-Dmatrix to explicitly instantiate the kernel matrix, and svm-Dlearn/svm-Dclassify for learning and classification.
Download
You can download the source code for 3DDK [here].
Sample usage and output
- conversion from MOL2 to the internal format:
bin/parseShapes.pl data/dat.test data/list dat.3ddk list 4
Quick reference for parameters:
- input file in MOL2 format
- ID list [only specified molecules will be converted]
- output file
- distance from central atom
- add a target to the data: each data item is serialized as a one-line string, and the target is a single value in {-1,1} placed as the first element of the line
For testing purposes only, you can add a random column of targets:
cat dat.3ddk | awk '{if (rand()>.5) {print 1,$0} else {print -1,$0}}' > datAndTarget.3ddk
In real experiments you will have a file of targets, which you can merge with the data file using the paste command.
- make a kernel matrix using svm-Dlight:
svm_Dmatrix -t 0 -u 2.5 -D lib/shapes_plugin.so -R datAndTarget.3ddk testm
Quick reference for parameters:
- -t 0 type of kernel transformation (0 means no transformation)
- -u gamma value for the edge Gaussian kernel
- perform learning:
svm_Dlearn -D lib/shapes_plugin.so datAndTarget.3ddk mod
- perform classification:
svm_Dclassify -D lib/shapes_plugin.so datAndTarget.3ddk mod out
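The merge with paste mentioned in the "add a target" step can be sketched as follows. This is only a minimal sketch: targets.txt is a hypothetical file name, assumed to hold one label (-1 or 1) per line in the same order as the molecules in dat.3ddk, and the sample rows are placeholders rather than real serialized molecules.

```shell
# Assumption: targets.txt (hypothetical name) has one label per line,
# in the same order as the data items in dat.3ddk.
workdir=$(mktemp -d)
printf '%s\n' 1 -1 1 > "$workdir/targets.txt"
printf '%s\n' 'molA-serialized' 'molB-serialized' 'molC-serialized' > "$workdir/dat.3ddk"

# paste does not complain when the files differ in length,
# so compare the line counts first.
if [ "$(wc -l < "$workdir/targets.txt")" -ne "$(wc -l < "$workdir/dat.3ddk")" ]; then
    echo "line count mismatch between targets and data" >&2
    exit 1
fi

# Join line by line with a single space, so that the target
# becomes the first element of each line.
paste -d ' ' "$workdir/targets.txt" "$workdir/dat.3ddk" > "$workdir/datAndTarget.3ddk"
cat "$workdir/datAndTarget.3ddk"
```

A single space is used as the delimiter because the awk one-liner in the testing example also separates the target from the data with a space.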
2D Weighted Decomposition Kernel
Introduction
Before running learning and classification experiments, the data must be converted from the MOL2 format to the RecursiveDecompositionKernel format (rdk). This is done by the program MolecularConverter (note: the Boost C++ library is needed for compilation).
After conversion, the learning/classification procedures can use the svmDlight_rdk.so plugin with the svm-Dlight programs: use svm-Dmatrix to explicitly instantiate the kernel matrix, and svm-Dlearn/svm-Dclassify for learning and classification.
Download
You can download the source code for 2D WDK [here].
Sample usage and output
- conversion from MOL2 to rdk format:
./MolecularConverter -a AtomLabelMap.InDat -b BondLabelMap.InDat -f dat.test -cat 3 -cac 3 -cbt 3 -cebt 3 > dat.rdk
Quick reference for parameters:
- -a [atom conversion dictionary from symbols to integers]
- -b [bond type conversion dictionary from symbols to integers]
- -cat [context radius for atom type attributes]
- -cac [context radius for atom charge attributes]
- -cbt [context radius for bond type attributes]
- add a target to the data: each data item is serialized as a one-line string, and the target is a single value in {-1,1} placed as the first element of the line
For testing purposes only, you can add a random column of targets:
cat dat.rdk | awk '{if (rand()>.5) {print 1,$0} else {print -1,$0}}' > datAndTarget.rdk
In real experiments you will have a file of targets, which you can merge with the data file using the paste command.
- make a kernel matrix using svm-Dlight:
svm_Dmatrix -D Plugin_svm-Dlight/svmDlight_rdk.so -R DataTest/datAndTarget.rdk mtx
- perform learning:
svm_Dlearn -D Plugin_svm-Dlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mod
- perform classification:
svm_Dclassify -D Plugin_svm-Dlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mod out
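Because the target has to be the first element of every line and must be either -1 or 1, it can be worth checking a merged file before passing it to the svm-Dlight programs. The following is a minimal sketch of such a check; check_targets is a hypothetical helper written for this illustration and is not part of the distribution, and the sample files are placeholders.

```shell
# check_targets FILE: hypothetical helper, not part of the 2D WDK package.
# Reports every line whose first field is not exactly "1" or "-1"
# and returns a non-zero status if any such line is found.
check_targets() {
    awk '$1 != "1" && $1 != "-1" {
             printf "line %d: bad target \"%s\"\n", NR, $1
             bad = 1
         }
         END { exit bad }' "$1"
}

# Placeholder files standing in for real datAndTarget.rdk data.
workdir=$(mktemp -d)
printf '%s\n' '1 molA' '-1 molB' > "$workdir/good.rdk"
printf '%s\n' '1 molA' '0 molB' > "$workdir/bad.rdk"

check_targets "$workdir/good.rdk" && echo "good.rdk: ok"
check_targets "$workdir/bad.rdk" || echo "bad.rdk: rejected"
```

The comparison is done on strings, so variants such as +1 or 1.0 are also reported; loosen the test if your target file uses them.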
Reference
For reference on svm-Dlight see http://www.dsi.unifi.it/neural/src/svm-Dlight/
For reference on the 3D and 2D WDK kernels see:
Demon Kernel
Introduction
The Demon kernel plugin performs learning and classification directly on precomputed kernel matrices.
Availability
The DemonKernel plugin can be downloaded from http://www.dsi.unifi.it/neural/src/DemonKernel/DemonKernel.tgz
Usage
- make a kernel matrix using svm-Dlight:
svm_Dmatrix -D Plugin_svm-Dlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mtx
- load the matrix into shared memory, giving it a unique name:
loadKernelMatrix mtx exp1.mtx
- make a special target file that contains, for each element, its target value and its index in the order used to compute the kernel matrix (e.g. the first element has index 0, the second has index 1, and so on up to the nth element with index n-1)
For testing purposes only, you can add a random column of targets:
cat dat.rdk | awk '{if (rand()>.5) {print 1,NR-1} else {print -1,NR-1}}' > datAndTargetMatrix.rdk
- perform training, passing the name of the uploaded kernel matrix with the -u parameter:
svm_Dlearn -D svmDlight_demon.so -u exp1.mtx datAndTargetMatrix.rdk mod
- perform classification:
svm_Dclassify -D svmDlight_demon.so datAndTargetMatrix.rdk mod out
- remove the matrix when it is no longer needed:
unloadKernelMatrix exp1.mtx
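Outside of testing, the special target file from the steps above would be built from real labels instead of random ones. A minimal sketch, assuming a hypothetical targets.txt with one label (-1 or 1) per line, listed in the same order in which the elements were used to compute the kernel matrix:

```shell
# Assumption: targets.txt (hypothetical name) lists one label per line,
# in the same order as the rows of the kernel matrix.
workdir=$(mktemp -d)
printf '%s\n' 1 -1 -1 > "$workdir/targets.txt"

# Emit "target index" pairs; NR - 1 gives the zero-based position,
# so the first element gets index 0 and the nth gets n-1.
awk '{ print $1, NR - 1 }' "$workdir/targets.txt" > "$workdir/datAndTargetMatrix.rdk"
cat "$workdir/datAndTargetMatrix.rdk"
```

This mirrors the random-target one-liner, only with the label read from the file rather than drawn with rand().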
7th Sept 2007. Machine Learning and Neural Networks Group.