A Complete Workflow of Multilingual Indian Regional Languages

L. Suriya Kala, R. Thangaraj

Abstract


Converting document images into editable text by hand is tedious and time consuming. The MLOCR system makes this easier by offering a one-click conversion. A scanned image is first preprocessed and then passed to the OCR engine, which recognizes the text in the selected language; at the press of a button the image is converted into text that can be freely edited. The whole process runs in a GUI-based environment. For users who prefer the Windows operating system, the interface resembles that of other Windows applications.
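The workflow described above (preprocess the scan, then recognize it in a chosen language) can be illustrated with a minimal sketch. The abstract does not say which preprocessing steps MLOCR applies, so Otsu binarization is used here purely as an assumed example. The names `otsu_threshold`, `binarize`, `image_to_text` and the `recognize` callback are hypothetical and are not taken from the paper; the OCR engine itself is left as a pluggable function.

```python
from typing import Callable, List

# A grayscale page: rows of pixel values from 0 (black) to 255 (white).
Image = List[List[int]]


def otsu_threshold(image: Image) -> int:
    """Return the gray level that best separates dark ink from light paper."""
    histogram = [0] * 256
    for row in image:
        for pixel in row:
            histogram[pixel] += 1
    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))

    best_level, best_variance = 0, -1.0
    dark_count, dark_sum = 0, 0
    for level in range(256):
        dark_count += histogram[level]
        dark_sum += level * histogram[level]
        light_count = total - dark_count
        if dark_count == 0 or light_count == 0:
            continue  # one class is empty, so no split at this level
        dark_mean = dark_sum / dark_count
        light_mean = (weighted_total - dark_sum) / light_count
        # Between-class variance, up to a constant factor.
        variance = dark_count * light_count * (dark_mean - light_mean) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def binarize(image: Image) -> Image:
    """Map pixels at or below the Otsu threshold to 0 and the rest to 255."""
    threshold = otsu_threshold(image)
    return [[0 if pixel <= threshold else 255 for pixel in row] for row in image]


def image_to_text(
    image: Image,
    language: str,
    recognize: Callable[[Image, str], str],
) -> str:
    """Preprocess a page, then hand it to an OCR engine for one language."""
    # `recognize` is an assumed stand-in for the OCR engine call.
    return recognize(binarize(image), language)
```

In a real application, `recognize` would wrap the OCR engine (the keywords name Tesseract) and `language` would be whatever language identifier that engine expects. Keeping the engine behind a callback lets the preprocessing step be checked on its own.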


Keywords


MLOCR, preprocessing, tesseract
