Adaptive Processing of Data Structures


People


Paolo Frasconi, DSI, University of Florence, Italy
Christoph Goller, Technische Universität München, Germany
Marco Gori, DII, University of Siena, Italy
Andreas Küchler, University of Ulm, Germany
Alessandro Sperduti, DI, University of Pisa, Italy
Ah Chung Tsoi, University of Wollongong, Australia
Michelangelo Diligenti, DII, University of Siena, Italy

Table of contents

Learning Data Structures

Related Papers


Learning Data Structures


Connectionist and probabilistic models for machine learning are often limited to processing relatively simple data types. For example, feedforward neural networks and adaptive mixture models assume that instances are represented by fixed-size (static) tuples of attributes (when using neural networks, a tuple is typically encoded as a real-valued array). While some domains naturally lead to static data representations, other domains are better described by variable-length representations. In speech recognition, for example, utterances are normally represented as discrete-time sequences of acoustic features (e.g., 12 cepstral parameters computed every 10 milliseconds). Similarly, in molecular biology, proteins are described by variable-length strings of symbols (each symbol is one of 20 amino acids). There are two important novelties when using sequences instead of static patterns. First, the number of atomic entities in a sequence is not fixed a priori, i.e., a sequence is a dynamic entity. Second, the atoms in a sequence are serially ordered, and the serial order relation provides information that is not contained in the variables themselves. Models for sequence processing, such as recurrent neural networks (RNNs) and hidden Markov models (HMMs), are generalizations of feedforward neural networks and probabilistic mixture models that can exploit this additional information.
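As a minimal illustration of this point, the following Python sketch shows the standard recurrent state update that lets an RNN exploit serial order: a fixed-size state vector is updated once per element, so the final state depends on the order of the inputs as well as on their values. This is not code from any of the papers below; all names and sizes are assumptions chosen for the example.

    import numpy as np

    # Minimal sketch of a recurrent state update; weights and sizes are
    # illustrative assumptions, not taken from any specific model.
    rng = np.random.default_rng(0)
    n_in, n_state = 12, 8          # e.g. 12 cepstral parameters per frame
    W_in = rng.normal(scale=0.1, size=(n_state, n_in))
    W_rec = rng.normal(scale=0.1, size=(n_state, n_state))

    def encode_sequence(frames):
        """Fold a variable-length sequence into one fixed-size state."""
        h = np.zeros(n_state)
        for x in frames:           # serial order matters here
            h = np.tanh(W_in @ x + W_rec @ h)
        return h

    utterance = [rng.normal(size=n_in) for _ in range(50)]  # 50 frames
    state = encode_sequence(utterance)

Swapping two frames generally changes the final state: this order sensitivity is exactly the information a static model would discard.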

However, real-world data often come with a much richer structure, and serial order can be generalized to more complex relations. These relations can be naturally represented by graphs. A data structure is therefore a dynamic collection of related variables and can be conveniently represented as a graph whose nodes are labeled by variables. A sequence is just a very special case of data structure: the relation among its variables is a (quite simple) graph whose topology is restricted to the class of linear chains:

[figure: a sequence represented as a labeled linear chain]
Clearly, it is not difficult to imagine a generalization from serial order to partial orders (directed acyclic graphs):

[figure: a partial order represented as a directed acyclic graph]
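The models studied in this project generalize the recurrent update from the previous element of a chain to the children of a node, computing node states bottom-up over the structure (the forward pass underlying backpropagation through structure; see the papers below). The Python sketch that follows is only an illustration under assumed names and sizes: it handles labeled trees with bounded out-degree, and on a linear chain it reduces to an ordinary recurrent network.

    import numpy as np

    rng = np.random.default_rng(1)
    n_label, n_state, max_children = 3, 8, 2   # illustrative sizes
    W_lab = rng.normal(scale=0.1, size=(n_state, n_label))
    W_ch = [rng.normal(scale=0.1, size=(n_state, n_state))
            for _ in range(max_children)]      # one matrix per child position

    class Node:
        def __init__(self, label, children=()):
            self.label = np.asarray(label, dtype=float)
            self.children = list(children)

    def encode(node):
        """Compute a node's state from its label and from the states of
        its children, visiting the structure bottom-up."""
        h = W_lab @ node.label
        for i, child in enumerate(node.children):
            h = h + W_ch[i] @ encode(child)
        return np.tanh(h)

    # A small labeled binary tree: the root state is a fixed-size code
    # for the whole structure and can feed any standard classifier.
    tree = Node([1, 0, 0], [Node([0, 1, 0]), Node([0, 0, 1])])
    code = encode(tree)

On a directed acyclic graph with shared subgraphs the same recursion applies, with memoization of node states to avoid recomputation.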
Learning domains yielding structured instances are not as unusual as one might think. The reason structured representations are not very popular in connectionism is perhaps the lack of adaptive models that can effectively deal with graphical inputs. For example, complex chemical compounds can be naturally represented by graphs whose nodes are labeled with atoms or simple molecules.

[figure: a chemical compound represented as a labeled graph]
Another interesting example is syntactic pattern recognition, where objects are described as labeled graphs whose labels are chosen from a finite alphabet of symbols. In the simplest case, recognition is a purely symbolic procedure based on a formal generative definition of languages: a set of production rules (defining a formal grammar) is applied to ascertain whether the object under examination belongs to a given language, and the yes/no answer is used to predict the object's class. While the most widely known theory of formal languages is restricted to strings (for example, the theory commonly used for programming languages), researchers in the syntactic pattern recognition community more often use languages defined on graphical domains, i.e., labeled trees or labeled webs instead of strings, because graphs have a much richer descriptive power. The following is an example of a tree representation of a logo image:

[figure: tree representation of a logo image]
In this case the image was recursively decomposed into a tree whose nodes correspond to sub-objects and whose edges express the contained-in relation. For each sub-object, some local features are measured; in this example, the shape, perimeter, and area of each sub-object are the local features attached to its node. While neural network and probabilistic architectures have been extensively studied for both static and sequential data types, architectures for data structures have received relatively little attention until recently.
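To make this concrete, the fragment below builds one such hypothetical logo tree, reusing the Node class and encode function from the previous sketch; the feature values are invented for illustration.

    # Each node carries the local features of one sub-object, here encoded
    # as (shape code, perimeter, area); edges express the contained-in
    # relation. Values are invented, and in practice they would be
    # normalized before entering the network.
    logo = Node([0.0, 40.0, 100.0],          # the whole logo image
                [Node([1.0, 12.0, 9.0]),     # e.g. a circle contained in it
                 Node([2.0, 16.0, 16.0])])   # e.g. a square contained in it
    logo_code = encode(logo)                 # fixed-size code for the image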


Related Papers

  1. P. Frasconi, M. Gori, and A. Sperduti.  A General Framework for Adaptive Processing of Data Structures.  Tech. Report 15/97, Dipartimento di Sistemi e Informatica, Università di Firenze, 1997.
  2. A. Sperduti and A. Starita.  Supervised Neural Networks for the Classification of Structures.  IEEE Transactions on Neural Networks, 8(3), 1997.
  3. P. Frasconi, M. Gori, and A. Sperduti.  On the Efficient Classification of Data Structures by Neural Networks.  Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-97), 1997.
  4. A. Sperduti, A. Starita, and C. Goller.  Learning Distributed Representations for the Classification of Terms.  Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), 1995.
  5. A. Sperduti.  Stability Properties of Labeling Recursive Auto-Associative Memory.  IEEE Transactions on Neural Networks, 6(6):1452--1460, 1995.
  6. A. Sperduti.  Labeling RAAM.  Connection Science, 6(4):429--459, 1994.
  7. C. Goller.  A Connectionist Approach for Learning Search-Control Heuristics for Automated Deduction Systems.  Ph.D. Thesis, Technische Universität München, Germany, 1997.
  8. T. Schmitt and C. Goller.  Relating Chemical Structure to Activity: An Application of the Neural Folding Architecture.  Submitted, 1997.
  9. A. D. Blair.  Scaling Up RAAMs.  Brandeis University Computer Science Technical Report CS-97-192, 1997.
  10. C. Goller and A. Küchler.  Learning Task-Dependent Distributed Representations by Backpropagation Through Structure.  Proceedings of the International Conference on Neural Networks (ICNN-96), 1996.
  11. J. B. Pollack.  Recursive Distributed Representations.  Artificial Intelligence, 46(1):77--105, 1990.

Copyright notice

The documents listed in this site are provided as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Machine Learning and Neural Network Group, Firenze University, Italy    April 2003.