- Date:
- 2013-01-30
- Main contributors:
- Rahnemoonfar, Maryam
- Summary:
- A particular challenge in the text recognition of historical document images is the considerable amount of "image noise" that can arise over the whole life cycle of a document, from printing and storage to use and scanning. Historical documents suffer from several kinds of noise, such as geometric distortions, bleed-through, textured paper, stamps, and stains. Noise affects and complicates the different stages of document image analysis, including enhancement, segmentation, layout analysis, and recognition. This talk will describe the stages of document image analysis and the challenges and opportunities in image processing and analysis of historical documents. In particular, I will explain the software that I developed in the IMPACT project for correcting arbitrary geometric artefacts in historical documents. Such distortions appear as arbitrary warping, folds, and page curl, and have detrimental effects on OCR and print-on-demand quality.