Adaptive Processing of Data Structures
|Paolo Frasconi||DSI, University of Florence, Italy|
|Christoph Goller||University of München, Germany|
|Marco Gori||DII, University of Siena, Italy|
|Andreas Küchler||University of Ulm, Germany|
|Alessandro Sperduti||DI, University of Pisa, Italy|
|Ah Chung Tsoi||University of Wollongong, Australia|
|Michelangelo Diligenti||DI, University of Siena, Italy|
Learning Data Structures
Connectionist and probabilistic models for machine learning are often limited
to processing relatively simple data types. For example, feedforward neural networks
and adaptive mixture models assume that instances are represented by fixed-size
(static) tuples of attributes (when using neural networks, a tuple
is typically encoded as a real-valued array). While some domains naturally
lead to static data representations, other domains are better described
by variable-length representations. In speech recognition, for example,
utterances are normally represented as discrete-time sequences of acoustic
features (e.g., 12 cepstral parameters computed every 10 milliseconds).
Similarly, in molecular biology, proteins are described by variable-length
strings of symbols (each symbol is one of the 20 amino acids). There are two
important novelties when using sequences instead of static patterns. First,
the number of atomic entities in a sequence is not fixed a priori, i.e.,
a sequence is a dynamic entity. Second, the atoms in a sequence are serially
ordered, and this serial order provides information
that is not contained in the variables themselves. Models for sequence
processing, such as recurrent neural networks (RNNs) and hidden Markov
models (HMMs), are generalizations of feedforward neural networks and probabilistic
mixture models that can exploit this additional information.
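As a minimal sketch of this idea (the dimensions, weight names, and the tanh update are illustrative choices, not the models discussed in the cited papers), a recurrent network folds a variable-length sequence into a fixed-size state by applying the same update at every time step:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 12 features per frame (e.g. cepstral
# parameters) and an 8-unit hidden state.
n_in, n_hid = 12, 8
W_in = rng.normal(scale=0.1, size=(n_hid, n_in))
W_rec = rng.normal(scale=0.1, size=(n_hid, n_hid))

def rnn_encode(frames):
    """Fold a variable-length sequence of feature vectors into
    a single fixed-size state vector h."""
    h = np.zeros(n_hid)
    for x in frames:  # the serial order drives the computation
        h = np.tanh(W_in @ x + W_rec @ h)
    return h

# Two utterances of different lengths map to codes of the same size.
short = rng.normal(size=(5, n_in))
long_ = rng.normal(size=(40, n_in))
print(rnn_encode(short).shape, rnn_encode(long_).shape)
```

Note how the number of frames never appears in the model's parameters: the same two weight matrices are reused at every step, which is what lets a single network process dynamic entities.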
However, real-world data often come with a much richer structure, and serial
order can be generalized to more complex relations. These relations can be naturally
represented by graphs. A data structure is therefore a dynamic collection of
related variables and can be conveniently represented as a graph whose
nodes are labeled by variables. A sequence is just a very special case
of data structure: the relation among the variables is a (quite simple) graph
whose topology is restricted to the class of linear chains.
Clearly, it is not difficult to imagine a generalization from serial order to partial
orders, i.e., directed acyclic graphs.
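The generalization from chains to partial orders can be sketched as a recursive state update over a DAG: each node's state is computed from its label and from the states of its children, visited in topological order. This is a minimal illustration in the spirit of recursive (folding) architectures; the dimensions and the one-weight-matrix-per-child-position scheme are assumptions, not the exact formulation of the cited papers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes: 3-dimensional labels, 4-dimensional states,
# at most 2 ordered children per node.
n_lab, n_state, max_children = 3, 4, 2
W_lab = rng.normal(scale=0.1, size=(n_state, n_lab))
W_ch = [rng.normal(scale=0.1, size=(n_state, n_state))
        for _ in range(max_children)]

def encode(label, child_states):
    """State of a node from its label and its children's states.
    On a linear chain (one child per node) this reduces to the
    recurrent update used for sequences."""
    h = W_lab @ label
    for k, c in enumerate(child_states):
        h = h + W_ch[k] @ c
    return np.tanh(h)

def encode_dag(labels, children):
    """labels: node id -> label vector; children: node id -> ordered
    child ids. Nodes are assumed listed in reverse topological order,
    so every child is encoded before its parents."""
    state = {}
    for v in labels:
        state[v] = encode(labels[v], [state[c] for c in children[v]])
    return state

# A small DAG: leaves 'c' and 'd'; 'b' points to both; root 'a'
# points to 'b' and (again) 'd' -- a partial order, not a chain.
labels = {v: rng.normal(size=n_lab) for v in "cdba"}
children = {"c": [], "d": [], "b": ["c", "d"], "a": ["b", "d"]}
root_code = encode_dag(labels, children)["a"]
print(root_code.shape)
```

The state at the root is a fixed-size code for the whole structure, just as the final RNN state is a code for a whole sequence.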
Learning domains yielding structured instances are not as unusual as one might think.
The reason why structured representations are not very popular in connectionism is
perhaps the lack of adaptive models that can effectively deal with graphical
inputs. For example, complex chemical compounds can be naturally represented using
graphs whose nodes are labeled by atoms or simple molecules.
Another interesting example is syntactic pattern recognition, where objects
are described as labeled graphs and labels are chosen from a finite
alphabet of symbols. In the simplest case, recognition is a purely symbolic
procedure based on a formal generative definition of languages. A set of
production rules (defining a formal grammar) is applied to ascertain whether
the object under examination belongs to a given language, and the yes/no
answer is used to predict the object's class. While the most widely known
theory of formal languages is restricted to strings (for example, the theory
commonly used for programming languages), researchers in the syntactic
pattern recognition community more often use languages defined on graphical
domains, i.e., labeled trees or labeled webs are used instead of strings.
This is because graphs have a much richer descriptive power.
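As a toy illustration of the symbolic, yes/no recognition step (the grammar is invented for the example and has nothing to do with any particular application), consider the string language { aⁿbⁿ : n ≥ 1 } defined by the productions S → aSb | ab:

```python
def derives(s):
    """True iff the string s can be derived from the start symbol S
    of the toy grammar S -> a S b | a b."""
    if s == "ab":                               # production S -> a b
        return True
    if len(s) >= 4 and s[0] == "a" and s[-1] == "b":
        return derives(s[1:-1])                 # production S -> a S b
    return False

print(derives("aaabbb"), derives("aabbb"))
```

The membership answer is all the classifier sees: no graded similarity, no learned parameters, which is precisely what adaptive models for structures try to go beyond.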
An example is the tree representation of a logo image: the image is recursively
decomposed into a tree whose nodes correspond
to sub-objects and whose edges express the contained-in relation. For each sub-object, some
local features are measured; in this example, the shape, perimeter and area
of each sub-object are the local features attached to its node.
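Such a representation is straightforward to build in code. The sketch below is purely illustrative (the shape names and feature values are made up); each node carries the local features named above, and the children list encodes the contained-in relation:

```python
from dataclasses import dataclass, field

@dataclass
class SubObject:
    """A node of the containment tree with its local features."""
    shape: str          # e.g. "circle", "triangle" (hypothetical codes)
    perimeter: float
    area: float
    parts: list = field(default_factory=list)  # sub-objects contained in this one

def preorder(node):
    """Walk the containment tree root-first, yielding each node's
    local feature tuple."""
    yield (node.shape, node.perimeter, node.area)
    for child in node.parts:
        yield from preorder(child)

# A toy logo: an outer circle containing a triangle and a bar.
logo = SubObject("circle", 31.4, 78.5, [
    SubObject("triangle", 12.0, 6.0),
    SubObject("bar", 10.0, 4.0),
])
print(list(preorder(logo)))
```

An adaptive model for structures would consume this tree directly, rather than a flattened fixed-size vector of its features.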
While neural network and probabilistic architectures have been extensively studied for
both static and sequential data types, architectures for data structures have received
relatively little attention until recently.
- P. Frasconi, M. Gori and A. Sperduti.
A General Framework for Adaptive Processing of Data Structures.
Tech. Report 15/97, Dipartimento di Sistemi e Informatica, Università di Firenze, 1997.
- A. Sperduti and A. Starita.
Supervised Neural Networks for the Classification of Structures.
IEEE Transactions on Neural Networks, 8(3), 1997.
- P. Frasconi, M. Gori and A. Sperduti.
On the Efficient Classification of Data Structures by Neural Networks.
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-97).
- A. Sperduti, A. Starita, and C. Goller.
Learning Distributed Representations for the Classification of Terms.
Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95).
- Stability Properties of Labeling Recursive Auto-Associative Memory.
IEEE Transactions on Neural Networks, 6:1452--1460.
- Connection Science, 6(4):429--459.
- A Connectionist Approach for Learning Search-Control Heuristics for Automated Deduction Systems.
Ph.D. Thesis, Technische Universität München (Germany), 1997.
- T. Schmitt, C. Goller.
Relating Chemical Structure to Activity: An Application of the Neural Folding Architecture.
- Scaling Up RAAMs.
Brandeis University Computer Science Technical Report CS-97-192, 1997.
- C. Goller, A. Küchler.
Learning Task-Dependent Distributed Representations by Backpropagation Through Structure.
International Conference on Neural Networks (ICNN-96).
- J. B. Pollack.
Recursive Distributed Representations.
Artificial Intelligence, 46(1):77--105.
The documents listed in this site are provided as a means to ensure timely dissemination
of scholarly and technical work on a noncommercial basis. Copyright and all rights therein
are maintained by the authors or by other copyright holders, notwithstanding that they have
offered their works here electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each author's copyright.
These works may not be reposted without the explicit permission of the copyright holder.
Machine Learning and Neural Network Group,
University of Florence, Italy, April 2003.