David C. Smith
Optical character recognition, image enhancement, neighbor embedding
Optical character recognition (OCR) is the electronic translation of scanned images of text documents into machine-encoded text. Text documents are frequently scanned at low resolution (LR) to conserve memory, and OCR engines have difficulty translating such LR images. In this article we apply the neighbor embedding (NE) single-image super-resolution (SISR) technique to LR images of text documents to obtain high-resolution (HR) versions, which we subsequently process with OCR. We repeat this experimental procedure using bicubic interpolation (BI) in place of NE in the preprocessing step. We report our experimental findings, comparing the character error rates (CER) of OCR translations before and after NE and BI preprocessing. Our experiments with Latin fonts in the 6pt-10pt range show that at 3x (LR scanning at 100 dpi) and 4x (LR scanning at 75 dpi) magnification, CER after NE preprocessing was nearly an order of magnitude lower than CER after BI preprocessing. We also observed that in this point range, OCR applied directly to LR images scanned at 75 dpi failed completely, and CER was at least 94% for OCR applied directly to LR images scanned at 100 dpi. By contrast, at 3x and 4x magnification, CER after NE preprocessing was under 10% at 6pt and under 3% at 8pt and 10pt.
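The abstract reports its results as character error rates but does not state the exact formula used. As a minimal sketch, assuming the common definition of CER as the Levenshtein (edit) distance between the OCR output and the ground-truth text, divided by the length of the ground truth, the metric could be computed as follows. The function names `levenshtein`, `cer`, and `effective_dpi` are illustrative and do not come from the article; `effective_dpi` only spells out the arithmetic that both reported settings (3x at 100 dpi and 4x at 75 dpi) correspond to the same nominal output resolution.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `reference` into `hypothesis`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]  # distance from i reference characters to an empty hypothesis
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # delete ref_char
                curr[j - 1] + 1,                  # insert hyp_char
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate under the ASSUMED definition:
    edit distance divided by the ground-truth length."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


def effective_dpi(scan_dpi: int, magnification: int) -> int:
    """Nominal resolution after upscaling (hypothetical helper)."""
    return scan_dpi * magnification


if __name__ == "__main__":
    truth = "optical character recognition"
    ocr_output = "optical charaoter recogmtion"
    print(f"CER = {cer(truth, ocr_output):.1%}")
    print(effective_dpi(100, 3), effective_dpi(75, 4))
```

Under this definition CER can exceed 100% when the OCR output contains many spurious characters, so a very high CER on unprocessed LR scans indicates output that is essentially unusable.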