Research and Publications
Introduction
The activity of the Dante research group began in 1993 with the first applications
of unconstrained handwritten character recognition by means of artificial neural networks.
In the following years we
addressed other application domains including form processing, layout analysis, digital
libraries, and more recently document image retrieval.
One of the main peculiarities of our research is the use of Artificial Neural Networks in
several application domains.
The interaction between Artificial Neural Networks and Document Image Analysis has been
the subject of a tutorial, Artificial Neural Networks for Document Analysis and Recognition,
held at ICDAR 2001 and ICPR 2002, and has been described in a survey paper [PAMI05].
Document Image Retrieval
The traditional approach in Document Image Analysis aims at performing a complete and accurate extraction
of the informative content in document images.
This strategy is appropriate only for small collections and when the data have a
significant commercial value.
This is not the case for Digital Libraries, where different strategies can be considered.
One strategy is to adopt document image retrieval based on layout similarity.
In this approach the user identifies one page in the database, and the most similar
pages are then identified by the system and shown to the user.
In [DAS02] we described a system for layout-based Document Image
Retrieval where pages are represented by means of MXY trees encoded with a suitable
representation.
Recent investigations have considered the use of tree transformation rules in order
to improve document retrieval [ICDAR05] [AvivDlib05].
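The query-by-example retrieval described above can be illustrated with a minimal sketch. It assumes that each page layout has already been encoded as a fixed-length numeric vector; the vector contents, the page identifiers, the function names, and the choice of cosine similarity are all hypothetical and are not taken from the cited systems.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_layout(query_id: str,
                   index: Dict[str, Sequence[float]],
                   top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k pages most similar to the query page, best first."""
    query = index[query_id]
    scored = [(page_id, cosine(query, vec))
              for page_id, vec in index.items() if page_id != query_id]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical layout vectors (assumption: one vector per stored page).
layout_index = {
    "p1": [1.0, 0.0, 2.0],
    "p2": [1.0, 0.0, 2.1],
    "p3": [0.0, 3.0, 0.0],
    "p4": [2.0, 0.0, 4.0],
}
ranking = rank_by_layout("p1", layout_index)
```

The user selects page `p1`, and the remaining pages are returned in decreasing order of layout similarity.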
A DIAR system cannot ignore the textual content of the page. This
point of view is the subject of a paper [PAMI06]
where we described a system for document retrieval on the basis of keywords
supplied by the user.
One salient feature of the proposed approach is that the indexing is independent
of the specific language and font of the stored documents.
The methods for layout-based and textual-based document retrieval have been
integrated into a single system that allows users to retrieve relevant
documents combining the two basic approaches in several ways [DIAL04].
Layout analysis and page classification
The segmentation of document images is aimed at identifying regions having
a homogeneous content that can be subsequently processed with appropriate
techniques.
Methods based on XY trees are well known and rely on a recursive
segmentation of the page along white spaces that span the whole image.
With the aim of processing pages containing horizontal and vertical ruling lines
we proposed in [ICDAR99a] the MXY (Modified X-Y) tree segmentation
algorithm.
The MXY trees can be used both for page segmentation and for document page
classification.
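The recursive segmentation along white spaces mentioned above can be sketched as follows. This is a minimal illustration of the classical XY cut on a small binary image (1 = ink); it does not handle ruling lines and is not the MXY algorithm of [ICDAR99a]. The function names, the box convention, and the `min_gap` parameter are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def _gaps(profile: List[int], min_gap: int) -> List[Tuple[int, int]]:
    """Interior runs of empty positions at least min_gap long, as (start, end)."""
    gaps, start = [], None
    for i, value in enumerate(profile):
        if value == 0 and start is None:
            start = i
        elif value != 0 and start is not None:
            if start > 0 and i - start >= min_gap:
                gaps.append((start, i))
            start = None
    return gaps


def xy_cut(img: List[List[int]], box: Optional[Box] = None,
           min_gap: int = 1) -> List[Box]:
    """Split a region along white gaps spanning it entirely; return leaf boxes."""
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    top, left, bottom, right = box
    rows = [sum(img[r][left:right]) for r in range(top, bottom)]
    cols = [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]
    if not any(rows):
        return []  # region contains no ink
    # Shrink the region to the bounding box of its ink.
    ink_r = [i for i, v in enumerate(rows) if v]
    ink_c = [i for i, v in enumerate(cols) if v]
    rows, cols = rows[ink_r[0]:ink_r[-1] + 1], cols[ink_c[0]:ink_c[-1] + 1]
    top, bottom = top + ink_r[0], top + ink_r[-1] + 1
    left, right = left + ink_c[0], left + ink_c[-1] + 1
    # Try horizontal cuts first, then vertical cuts.
    for profile, horizontal in ((rows, True), (cols, False)):
        gaps = _gaps(profile, min_gap)
        if not gaps:
            continue
        leaves, prev = [], 0
        for start, end in gaps + [(len(profile), len(profile))]:
            if horizontal:
                sub = (top + prev, left, top + start, right)
            else:
                sub = (top, left + prev, bottom, left + start)
            leaves += xy_cut(img, sub, min_gap)
            prev = end
        return leaves
    return [(top, left, bottom, right)]  # no cut possible: this is a leaf


page = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
regions = xy_cut(page)
```

On this toy page the empty row produces a horizontal cut, and the empty column in the upper part then produces a vertical cut, giving two blocks above and one full-width block below.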
In the STRETCH (STorage and RETrieval by content of imaged documents) European
Project we developed a system for the storage and retrieval of commercial
invoices that is based on the extraction of a suitable symbolic description of
the invoice structure by means of MXY trees [IJDAR02].
We recently extended our research towards information extraction from documents
(books and journals) belonging to Digital Libraries.
The DSI participation in the METAe (the Metadata Engine) project was mainly
devoted to the classification of book and journal pages by means of artificial
neural networks dealing with the MXY tree page representation [DEXA01].
MXY trees have also been used for table location in technical papers [ICPR02].
Other approaches use tree grammars for training set expansion
in order to improve page classification performance [ICDAR03b].
Logo Recognition
Logo recognition is a useful tool for document identification (for
instance for the classification of commercial invoices).
Unlike character recognition, in logo recognition the number of
classes is not fixed and can change dynamically.
It is therefore appropriate to build modular classifiers like the
one proposed in [GREC97] that is based on autoassociators.
Autoassociators have also been used for the recognition of rotated and
noisy logos [PR03].
In this case we modified the training algorithm of autoassociators so as to take
into account the information corresponding to the logo contour and reduce the
effect of blobs of noise (e.g. black or white stripes).
Form and invoice reading systems
The main difficulties in form processing are document registration and the identification
of information fields that are not placed in fixed positions on the page.
A new model for processing variable layout forms has been proposed in [ICDAR95]
and [DEXA95], where we also discussed suitable algorithms for
document registration and information field location.
The proposed approach has been demonstrated in a running system for the description and
analysis of the layout of structured documents [PAMI98].
Similar techniques have also been applied to the reading of commercial invoices [DEXA97]
and to the semi-automatic labeling of columns in invoices [ICDAR97b].
Printed and handwritten character recognition
Key components of most document processing systems are the modules devoted
to the interpretation of handwritten and printed text.
In this context we proposed a noise model [GREC97]
that can be applied to grey-level images in order to generate synthetic
patterns to be used for classifier training.
Modular classifiers are frequently used in order to improve the performance
of individual classifiers. Along these lines, we developed techniques for
a serial combination of neural classifiers within an OCR system
[IJDAR01].
This approach is based on a preliminary classification based on an MLP
(MultiLayer Perceptron) followed by a refinement made with autoassociator-based
classifiers that are identified considering the confidence of the MLP.
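The serial combination described above can be illustrated with a minimal sketch. It assumes that the MLP outputs one confidence score per class and that each class has an autoassociator exposed as a function returning a reconstruction of its input; the acceptance threshold `accept`, the `top_k` candidate count, and all function names are hypothetical and are not taken from [IJDAR01].

```python
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def reconstruction_error(x: Vector, reconstruct: Callable[[Vector], Vector]) -> float:
    """Squared distance between a pattern and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, reconstruct(x)))


def serial_classify(x: Vector,
                    mlp_scores: Dict[str, float],
                    autoassociators: Dict[str, Callable[[Vector], Vector]],
                    accept: float = 0.9,
                    top_k: int = 2) -> str:
    """First stage: MLP scores. Second stage: autoassociators for uncertain cases."""
    ranked = sorted(mlp_scores, key=mlp_scores.get, reverse=True)
    if mlp_scores[ranked[0]] >= accept:
        return ranked[0]  # the MLP is confident enough: no refinement
    # Otherwise only the autoassociators of the top-ranked classes are consulted,
    # and the class whose autoassociator reconstructs the pattern best wins.
    candidates = ranked[:top_k]
    return min(candidates,
               key=lambda c: reconstruction_error(x, autoassociators[c]))


# Toy stand-ins (assumption): each "autoassociator" returns a class prototype.
prototypes = {"O": [1.0, 1.0, 0.0], "Q": [1.0, 1.0, 1.0]}
toy_autoassociators = {c: (lambda x, p=p: p) for c, p in prototypes.items()}

pattern = [1.0, 1.0, 0.9]
uncertain = serial_classify(pattern, {"O": 0.55, "Q": 0.45}, toy_autoassociators)
confident = serial_classify(pattern, {"O": 0.95, "Q": 0.05}, toy_autoassociators)
```

In the uncertain case the MLP slightly prefers `O`, but the refinement stage selects `Q` because its reconstruction is closer to the pattern; in the confident case the MLP decision is kept.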
References
- [ICIAP09]
- S. Marinai, E. Marino, G. Soda
Nonlinear Embedded Map Projection for Dimensionality Reduction, Proc. ICIAP 09, Springer Verlag, 2009.
- [ICDAR09a]
- S. Marinai,
Metadata Extraction from PDF Papers for Digital Library Ingest, Proc. ICDAR 2009, IEEE, pp. 251-255 2009.
- [ICDAR09b]
- S. Marinai, B. Miotti, G. Soda,
Mathematical Symbol Indexing Using Topologically Ordered Clusters of Shape Contexts, Proc. ICDAR 2009, IEEE, pp. 1041-1045, 2009.
- [AND09]
- S. Marinai,
Text retrieval from early printed books, Proc. AND Workshop 2009, ACM, pp. 33-34, 2009.
- [SPR08]
- S. Marinai, E. Marino, G. Soda
Embedded map projection for dimensionality reduction based similarity search, Proc. S+SSPR 2008, Springer Verlag, 2008.
- [DAS08]
- S. Marinai, E. Marino, G. Soda
A comparison of clustering methods for word image indexing, Proc. DAS 2008, Springer Verlag, 2008.
- [MLDAR08a]
- S. Marinai,
Introduction to Document Analysis and Recognition,
In Machine Learning in Document Analysis and Recognition, Studies in Computational Intelligence 90, Ed. Simone Marinai,
Hiromichi Fujisawa, Springer Verlag, 2008.
- [MLDAR08b]
- S. Marinai, E. Marino, G. Soda,
Self-Organizing Maps for Clustering in Document Image Analysis,
In Machine Learning in Document Analysis and Recognition, Studies in Computational Intelligence 90, Ed. Simone Marinai,
Hiromichi Fujisawa, Springer Verlag, 2008.
- [ICIAP07]
- S. Marinai, E. Marino, G. Soda.
Transformation invariant SOM clustering in Document Image Analysis.
14th International Conference on Image Analysis and Processing, Modena (Italy), 2007, IEEE Press:pp. 185-190, 2007
- [ECDL07]
- S. Marinai, E. Marino, G. Soda.
Exploring Digital Libraries with Document Image Retrieval.
11th European Conference on Research and Advanced Technology for
Digital Libraries, Budapest (Hungary), 2007, Springer Verlag, pp. 368-379.
- [MYS07]
- S. Marinai.
SOM clustering for text retrieval and classification with
examples on Indian scripts.
Proc. of Brainstorming Workshop on OCR for Indian Languages
16-17 March, 2007, Mysore (India).
Invited talk
- [PAMI06]
- S. Marinai, M. Gori, G. Soda,
Font Adaptive Word Indexing of Modern Printed Documents,
IEEE Transaction PAMI, vol 28, N. 8, August 2006, pp. 1187-1199, IEEE Press, Los Alamitos (CA).
- [CIFED06]
- S. Marinai.
A survey of document image retrieval in digital libraries.
9th Colloque International Francophone sur l'Ecrit et le Document (CIFED 2006), pp. 193-198.
Invited talk
- [DAS06]
- S. Marinai, S. Faini, E. Marino, G. Soda.
Efficient word retrieval by means of SOM clustering and PCA.
7th International Workshop on Document Analysis Systems, Nelson (New Zealand), 2006, LNCS: pp.
- [DIAL06]
- S. Marinai, E. Marino, G. Soda,
Tree clustering for layout-based document image retrieval,
Proceedings of the Second Int'l Workshop on Document Image Analysis for Libraries, pp. 243-251, Lyon (France), 2006, IEEE Press, Los Alamitos (CA).
- [PAMI05]
- S. Marinai, M. Gori, G. Soda,
Artificial Neural Networks for Document Analysis and Recognition,
IEEE Transaction PAMI, vol 27, N. 1, January 2005, pp. 23-35, IEEE Press, Los Alamitos (CA).
- [ICDAR05]
- S. Marinai, E. Marino, G. Soda.
Layout based document image retrieval by means of XY tree reduction.
9th International Conference on Document Analysis and Recognition, Seoul (Korea), 2005, IEEE Press: pp. 432-436, 2005
- [NNLDAR05]
- S. Faini, S. Marinai, E. Marino, G. Soda,
SOM-based Document Image Retrieval,
Proceeding of the 1st International IAPR Workshop on Neural Networks and Learning in Document Analysis
and Recognition, pp. 33 -- 40, Seoul (Korea), 2005.
- [AvivDlib05]
- S. Marinai, E. Marino, G. Soda,
Layout based document image retrieval in Digital Libraries,
Proceeding of the 7th Int. Workshop Audio-Visual Content and Information Visualization in Digital
Libraries (AVIVDiLib '05), Cortona (Italy), 2005 pp.67-76.
- [DIAL04]
- S. Marinai, E. Marino, F. Cesarini, G. Soda,
A general system for the retrieval of document images from digital libraries,
Proceedings of the First Int'l Workshop on Document Image Analysis for Libraries, pp. 150-173, Palo Alto (CA), 2004, IEEE Press, Los Alamitos (CA).
- [PR03]
- M. Gori, M. Maggini, S. Marinai, J. Q. Sheng, G. Soda,
Edge-Backpropagation for Noisy Logo Recognition,
Pattern Recognition, vol 36, N.1, 2003, pp. 103-110, Elsevier, Amsterdam (NL).
- [ICDAR03a]
- S. Marinai, E. Marino, G. Soda,
Indexing and Retrieval of Words in Old Documents,
Proceedings of ICDAR 2003, pp. 223-227, 2003, IEEE Press, Los Alamitos (CA).
This paper won the Best Paper Award at ICDAR 2003.
- [ICDAR03b]
- S. Baldi, S. Marinai, G. Soda,
Using tree grammars for training set expansion in page classification,
Proceedings of ICDAR 2003, pp. 829-833, 2003, IEEE Press, Los Alamitos (CA).
- [IJDAR02]
- E. Appiani, F. Cesarini, A.M. Colla, M. Diligenti, M.Gori, S.Marinai, G.Soda,
Automatic document classification and indexing in high-volume applications,
IJDAR, vol 4, N. 2 2001, pp. 69-83, Springer-Verlag, Berlin (D).
- [ICPR02]
- F. Cesarini, S. Marinai, L. Sarti, G. Soda,
Trainable table location in document images,
Proceedings of the 16th ICPR, pp. 236-240,
Québec City (Canada), August 2002, IEEE Press, Los Alamitos (CA).
- [DAS02]
- F. Cesarini, S. Marinai, G. Soda,
Retrieval by layout similarity of documents represented by MXY trees,
Proceedings of the 5th IAPR International Workshop on Document Analysis Systems (DAS), pp. 353-364
Princeton (NJ, USA), August 2002, LNCS 2423, Springer-Verlag, Berlin (D).
- [IJDAR01]
- E. Francesconi, M.Gori, S.Marinai, G.Soda,
A serial combination of connectionist-based classifiers for OCR,
IJDAR, vol 3, N. 3 2001, pp. 160-168, Springer-Verlag, Berlin (D).
- [ICDAR01]
- F. Cesarini, M. Lastri, S. Marinai, G. Soda,
Encoding of modified X-Y trees for document classification,
Proceedings of ICDAR 2001, pp. 1131-1136, Seattle (USA), 2001, IEEE Press, Los Alamitos (CA).
- [DEXA01]
- F. Cesarini, M. Lastri, S. Marinai, G. Soda,
Page classification for meta-data extraction from digital collections,
Proceedings of DEXA 2001, Munich (D), 2001, pp. 82-91, LNCS 2113, Springer-Verlag, Berlin (D).
- [ICDAR99a]
- F.Cesarini, M. Gori, S. Marinai, G. Soda,
Structured Document Segmentation and Representation by the Modified X-Y Tree,
Proceedings of ICDAR 1999, pp. 563-566, Bangalore (India), 1999, IEEE Press, Los Alamitos (CA).
- [ICDAR99b]
- S. Marinai, P. Nesi,
Projection based Segmentation of Musical Sheets,
Proceedings of ICDAR 1999, pp. 563-566, Bangalore (India), 1999, IEEE Press, Los Alamitos (CA).
- [PAMI98]
- F.Cesarini, M.Gori, S.Marinai, G.Soda,
INFORMys: a flexible INvoice-like FORM reader system,
IEEE Transaction PAMI, vol 20, N. 7 July 1998, pp. 730-745, IEEE Press, Los Alamitos (CA).
- [GREC97]
- E.Francesconi, P.Frasconi, M. Gori, S. Marinai, J.Q. Sheng, G. Soda, A. Sperduti,
Logo Recognition by Recursive Neural Networks,
in Graphics Recognition, Algorithms and Systems, LNCS (1389) pp. 104 - 117, 1998, Springer Verlag, Berlin (D).
- [DEXA97]
- F.Cesarini, E.Francesconi, M.Gori, S.Marinai, J.Q.Sheng, G.Soda,
Conceptual Modelling for Invoice Document Processing,
Proceedings of the Conference DEXA '97 Workshop on
Query Processing in Multimedia Information System,
Toulouse, September 1997, pp. 596-603, IEEE Press, Los Alamitos (CA).
- [ICDAR97a]
- F.Cesarini, E.Francesconi, M. Gori, S. Marinai, J.Q. Sheng, G. Soda,
A Neural-based architecture for spot-noisy logo recognition,
Proceedings of ICDAR 1997, pp. 175-179, Ulm (Germany), 1997, IEEE Press, Los Alamitos (CA).
- [ICDAR97b]
- F.Cesarini, E.Francesconi, M. Gori, S. Marinai, J.Q. Sheng, G. Soda,
Rectangle labelling for an Invoice Understanding System,
Proceedings of ICDAR 1997, pp. 324-330, Ulm (Germany), 1997, IEEE Press, Los Alamitos (CA).
- [GREC97]
- F.Cesarini, M.Gori, S.Marinai, G.Soda,
A Hybrid System for Locating Low Level Graphic Items,
in Graphics Recognition, Methods and Applications, LNCS (1072) pp. 135 - 147, 1996, Springer Verlag, Berlin (D).
- [DEXA95]
- F. Cesarini, M. Gori, S. Marinai, G. Soda,
Data Extraction from Form Images,
Proceedings of DEXA 1995, London (UK), 1995, pp. 438-448, LNCS 978, Springer-Verlag, Berlin (D).
- [ICDAR95]
- F.Cesarini, M. Gori, S. Marinai, G. Soda,
A System for Data Extraction from Forms of Known Class,
Proceedings of ICDAR 1995, pp. 1136-1140, Montreal, 1995, IEEE Press, Los Alamitos (CA).
Copyright notice
The documents listed in this site are provided as a means to ensure timely dissemination
of scholarly and technical work on a noncommercial basis. Copyright and all rights therein
are maintained by the authors or by other copyright holders, notwithstanding that they have
offered their works here electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each author's copyright.
These works may not be reposted without the explicit permission of the copyright holder.
1st November 2007. Dante Group.