
INFORMys Reader

The form reader (FR) is used during the recognition step to process forms of a specified class. User interaction with the system is limited to telling the FR which class of forms is to be processed, feeding the scanner with forms and, if requested, correcting the output of the character recognition engine.

After the operator enters the form class, the FR processes each incoming form through the following sequence of operations.

  1. Scan the form and convert it to an electronic image. This is a routine operation, also widely used for building document databases.
  2. Register the incoming form. A critical point during acquisition by the scanner is the presence of skew, that is, a rotation of the incoming page with respect to the scanning lines. This may be due to friction in the scanner loading mechanism. Skew, as well as page translation, may significantly increase the difficulty of locating the fields.
  3. Locate the information fields in the incoming form. Once the incoming form has been registered, the system must determine the location of the information fields. The portions of the image that are expected to contain the information fields are isolated for the next recognition step.
  4. Extract the information. Once the system has found an area where an information field is expected to lie, the recognition phase takes place. We assume that the information is simply represented by printed words, which are recognized as sequences of individual characters. We adopt an optical character recognition system, developed entirely by our research group, that is based on connectionist models.
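
To illustrate how steps 2 and 3 fit together, the following Python sketch maps a field position taken from the form model into the coordinates of a skewed and translated scan, then resamples that region into an upright patch that could be handed to the character recognizer. It is only a minimal sketch under stated assumptions: the skew angle and the page offset are assumed to have been estimated already (the text does not say how), the image is a plain list of pixel rows, and the names Field, to_scan_coords and extract_field are hypothetical and not part of INFORMys.

```python
import math
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical field description in model (unskewed) pixel coordinates."""
    name: str
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int


def to_scan_coords(x, y, skew_deg, dx, dy):
    """Map a model point to scan coordinates: rotate by the skew angle
    about the origin, then translate by the page offset (dx, dy)."""
    a = math.radians(skew_deg)
    sx = x * math.cos(a) - y * math.sin(a) + dx
    sy = x * math.sin(a) + y * math.cos(a) + dy
    return sx, sy


def extract_field(image, field, skew_deg, dx, dy, background=0):
    """Resample the field region from the scan into an upright patch,
    using nearest-neighbour sampling. Pixels that fall outside the scan
    are filled with the background value."""
    rows, cols = len(image), len(image[0])
    patch = []
    for v in range(field.height):
        row = []
        for u in range(field.width):
            sx, sy = to_scan_coords(field.x + u, field.y + v, skew_deg, dx, dy)
            i, j = round(sy), round(sx)
            inside = 0 <= i < rows and 0 <= j < cols
            row.append(image[i][j] if inside else background)
        patch.append(row)
    return patch


# Usage: a synthetic 6x8 "scan" whose pixel value encodes its position,
# shifted by (2, 1) pixels with no skew (assumed, illustrative values).
scan = [[r * 10 + c for c in range(8)] for r in range(6)]
date_field = Field(name="date", x=1, y=1, width=2, height=2)
patch = extract_field(scan, date_field, skew_deg=0.0, dx=2, dy=1)
```

Resampling each field separately, rather than rotating the whole page first, keeps the work proportional to the area of the fields; whether the FR itself proceeds this way is not stated in the text.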


Tue Oct 7 10:26:36 MET 1997