Machine Learning and Neural Networks group

Department of Systems and Computer Science
University of Florence
Via Santa Marta 3
50139 Firenze - Italy
Tel:+39 055 4796361
Fax:+39 055 4796363


3D Decomposition Kernel

Introduction

Before running learning and classification experiments, the data must be converted from the MOL2 format to an internal format with the program parseShapes.pl. The learning and classification procedures can then use the shapes_plugin.so plugin with the svm-Dlight programs: use svm_Dmatrix to explicitly instantiate the kernel matrix, and svm_Dlearn and svm_Dclassify for learning and classification.

Download

You can download the source code for 3DDK [here].

Sample usage and output

  1. conversion from MOL2 to the internal format:
    bin/parseShapes.pl data/dat.test data/list dat.3ddk list 4
    Quick reference for parameters:
    • MOL2 file
    • ID list [only specified molecules will be converted]
    • output file
    • distance from central atom
  2. add a target to the data: each data item is serialized as a one-line string, and the target is a single value in {-1,1} placed as the first element of the line.
    For testing purposes only, you can add a random column of targets:
    cat dat.3ddk | awk '{if (rand()>.5) {print 1,$0} else {print -1,$0}}' > datAndTarget.3ddk
    In real experiments you have a file of targets that you can merge with the data file using the paste command.
  3. make a kernel matrix using svm-Dlight:
    svm_Dmatrix -t 0 -u 2.5 -D lib/shapes_plugin.so -R datAndTarget.3ddk testm
    Quick reference for parameters:
    • -t 0 type of kernel transformation (0 means no transformation)
    • -u gamma value for the edge Gaussian kernel
  4. perform learning:
    svm_Dlearn -D lib/shapes_plugin.so datAndTarget.3ddk mod
  5. perform classification:
    svm_Dclassify -D lib/shapes_plugin.so datAndTarget.3ddk mod out
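
Step 2 above mentions merging a separate target file with paste. The following is a minimal sketch of that merge, not part of the 3DDK distribution. It assumes a hypothetical file targets.txt holding one label (1 or -1) per line, in the same order as the items in dat.3ddk, and it assumes a single space separates the target from the data, as in the awk test command above. The helper name merge_targets is made up for illustration.

```shell
#!/bin/sh
# merge_targets TARGETS DATA OUT
# Hypothetical helper: prepend the label on line i of TARGETS to line i of DATA.
merge_targets() {
    targets="$1"; data="$2"; out="$3"
    # Count lines in both files; tr strips the padding some wc versions add.
    nt=$(wc -l < "$targets" | tr -d ' ')
    nd=$(wc -l < "$data" | tr -d ' ')
    # Refuse to merge when the files do not line up one-to-one.
    if [ "$nt" -ne "$nd" ]; then
        echo "line count mismatch: $nt targets, $nd data items" >&2
        return 1
    fi
    # Join the two files line by line, separated by a single space.
    paste -d ' ' "$targets" "$data" > "$out"
}
```

A call such as merge_targets targets.txt dat.3ddk datAndTarget.3ddk would then produce the input file used in steps 3 to 5. The line-count check is a design choice: paste itself pads the shorter file silently, which would leave some items without a target.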

2D Weighted Decomposition Kernel

Introduction

Before running learning and classification experiments, the data must be converted from the MOL2 format to the RecursiveDecompositionKernel (rdk) format. This is done by the program MolecularConverter (note: the Boost C++ library is needed for compilation).
The learning and classification procedures can then use the svmDlight_rdk.so plugin with the svm-Dlight programs: use svm_Dmatrix to explicitly instantiate the kernel matrix, and svm_Dlearn and svm_Dclassify for learning and classification.

Download

You can download the source code for 2D WDK [here].

Sample usage and output

  1. conversion from MOL2 to rdk format:
    ./MolecularConverter -a AtomLabelMap.InDat -b BondLabelMap.InDat -f dat.test -cat 3 -cac 3 -cbt 3 -cebt 3 > dat.rdk
    Quick reference for parameters:
    • -a [atom conversion dictionary from symbols to integers]
    • -b [bond type conversion dictionary from symbols to integers]
    • -cat [context radius for atom type attributes]
    • -cac [context radius for atom charge attributes]
    • -cbt [context radius for bond type attributes]
  2. add a target to the data: each data item is serialized as a one-line string, and the target is a single value in {-1,1} placed as the first element of the line.
    For testing purposes only, you can add a random column of targets:
    cat dat.rdk | awk '{if (rand()>.5) {print 1,$0} else {print -1,$0}}' > datAndTarget.rdk
    In real experiments you have a file of targets that you can merge with the data file using the paste command.
  3. make a kernel matrix using svm-Dlight:
    svm_Dmatrix -D Plugin_svm-Dlight/svmDlight_rdk.so -R DataTest/datAndTarget.rdk mtx
  4. perform learning:
    svm_Dlearn -D Plugin_svm-Dlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mod
  5. perform classification:
    svm_Dclassify -D Plugin_svm-Dlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mod out
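
Step 2 above requires the first element of every line to be a target in {-1,1}. Checking this before building the kernel matrix can catch a misaligned merge early. The following is a minimal sketch using a hypothetical helper, check_targets, which is not part of svm-Dlight or MolecularConverter. By assumption it is strict: only the literal strings 1 and -1 are accepted as targets.

```shell
#!/bin/sh
# check_targets FILE
# Hypothetical helper: report every line whose first field is not 1 or -1.
# Exit status is 0 when all lines are valid, 1 otherwise.
check_targets() {
    awk '
        # String comparison, so values such as 0, +1 or 1.0 are rejected.
        $1 != "1" && $1 != "-1" {
            printf "line %d: bad target \"%s\"\n", NR, $1
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

For example, check_targets DataTest/datAndTarget.rdk prints nothing and returns 0 when every line starts with a valid target.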

Reference

For reference on svm-Dlight see http://www.dsi.unifi.it/neural/src/svm-Dlight/
For reference on the 3D and 2D WDK kernel see:

Demon Kernel

Introduction

The Demon Kernel plugin performs learning and classification directly on precomputed kernel matrices.

Availability

The DemonKernel plugin can be downloaded from http://www.dsi.unifi.it/neural/src/DemonKernel/DemonKernel.tgz

Usage

  • make a kernel matrix using svm-Dlight:
    svm_Dmatrix -D Plugin_svm-Dlight/svmDlight_rdk.so DataTest/datAndTarget.rdk mtx
  • load the matrix into shared memory, giving it a unique name:
    loadKernelMatrix mtx exp1.mtx
  • make a special target file that contains, for each element, the target value and the zero-based index of the element as used in the computation of the kernel matrix (the first element has index 0, the second 1, and so on up to the nth element with index n-1).
    For testing purposes only, you can generate random targets:
    cat dat.rdk | awk '{if (rand()>.5) {print 1,NR-1} else {print -1,NR-1}}' > datAndTargetMatrix.rdk
  • perform training, passing the name of the loaded kernel matrix with the -u parameter:
    svm_Dlearn -D svmDlight_demon.so -u exp1.mtx datAndTargetMatrix.rdk mod
  • perform classification:
    svm_Dclassify -D svmDlight_demon.so datAndTargetMatrix.rdk mod out
  • remove the matrix from shared memory when it is no longer needed:
    unloadKernelMatrix exp1.mtx
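
The random-target command above pairs each target with the index NR-1. With real labels the same pairing can be produced from a target file. The following is a minimal sketch, assuming a hypothetical file targets.txt with one label per line, listed in the same order as the rows of the kernel matrix. The helper name make_demon_targets is made up for illustration and is not part of the DemonKernel distribution.

```shell
#!/bin/sh
# make_demon_targets TARGETS OUT
# Hypothetical helper: write "<target> <zero-based index>" for each input line.
make_demon_targets() {
    # NR is the 1-based line number in awk, so NR-1 is the zero-based index.
    awk '{ print $1, NR-1 }' "$1" > "$2"
}
```

Calling make_demon_targets targets.txt datAndTargetMatrix.rdk yields a file in the layout described in the list above, ready for the training and classification commands.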

7th Sept 2007. Machine Learning and Neural Networks Group.