Artificial Neural Networks for

Document Analysis and Recognition

Marco Gori  -  Simone Marinai  -   Giovanni Soda

This page contains material related to the tutorial: "Artificial Neural Networks for Document Analysis and Recognition" held on:


Tutorial Program



Research on document processing with neural networks has experienced a renewal of interest in the last fifteen years, during which some hundreds of papers on document processing applications using connectionist models have appeared in scientific journals and conference proceedings. Most of them report different neural network models for the recognition of isolated characters (either printed or handwritten), but there are many other significant, less well-known approaches to tasks such as character segmentation, line removal, region and document classification, and signature verification. In this tutorial, we present a survey of the most significant document processing tasks for which connectionist-based approaches seem to be adequate. Feature-based representations of either graphic items or whole documents are presented, which are subsequently processed using classic connectionist models like multilayer perceptrons, radial basis functions, and learning vector quantization. The major drawbacks of these widely used models are pointed out, with special emphasis on their tabula rasa approach to learning and on the rough static data representation they assume. The role of prior knowledge in the conception of either appropriate architectures or learning algorithms is discussed with specific reference to important learning tasks in the field of document processing. It is also shown that special structured representations, where data are properly modeled by graphs, can be learned from examples by using connectionist models. These representations can be successfully used for the recognition of graphic items, but also for higher-level tasks like document classification and retrieval.


The tutorial is addressed to researchers and graduate students in the field of pattern recognition, and in particular to those working in the area of document image analysis and recognition.
A general background in pattern recognition and document processing is required, whereas most basic concepts of artificial neural networks will be given in the first part of the tutorial.


Duration: 3 hours

 Artificial Neural Networks: background

 Applications to Document Analysis and Recognition
     Preprocessing
     Layout Analysis
     Character Segmentation
     Optical Character Recognition
     Word Recognition
     Signature Verification

Artificial Neural Networks: background

Applications to Document Analysis and Recognition


Layout Analysis

Character segmentation

Optical Character Recognition (OCR)

Word recognition

Signature verification


 Marco Gori

Marco Gori received the Laurea in electronic engineering from Università di Firenze, Italy, in 1984, and the Ph.D. degree in 1990 from Università di Bologna, Italy. During his graduate studies, he was also a visiting student at the School of Computer Science (McGill University, Montreal), where he was involved in problems of automatic speech recognition using artificial neural networks. In 1992, he became an associate professor of computer science at Università di Firenze and, in November 1995, he joined the Università di Siena, where he is currently professor of computer science. His main research interests are currently in the areas of neural networks and pattern recognition, with special emphasis on document analysis and recognition. He has organized many scientific events in his area of expertise, like the international summer school on ``Adapting Processing of Sequences'' held in Salerno in September 1997 and the International Joint Conference on Neural Networks (July 24-28, 2000), where he acted as a program chair.
Dr. Gori serves as an Associate Editor of a number of technical journals, including Pattern Recognition, the IEEE Trans. on Neural Networks, Pattern Analysis and Application, and the International Journal on Pattern Recognition and Artificial Intelligence. He is a fellow of the IEEE and is also the Italian chairman of the IEEE Neural Network Council.

 Simone Marinai

Simone Marinai received the Laurea in Electronic Engineering in 1992 from the Università di Firenze, Italy. He obtained the PhD degree in computer science in 1996, defending a thesis on the extraction of information from structured documents.
Currently he is Assistant Professor at Università di Firenze, where he teaches the parts of the Artificial Intelligence course related to Artificial Neural Networks and Document Processing applications.
His main research interests are in  pattern recognition, neural networks, and document processing applications.
He was the chairman of the workshop ``Document Analysis and Understanding for Document Databases'' (DAUDD) held in Firenze in 1999. He is a member of IAPR.


Simone Marinai  -- Mar 7 2004