[Project-ideas] Improving IR methods for OCR of Indic Scripts

ABHISHEK GUPTA abhi.bansal21 at gmail.com
Tue Apr 23 12:34:31 PDT 2013


Hi,

I am a 3rd-year student at Dhirubhai Ambani Institute of Information &
Communication Technology. I am interested in working with Ankur-India on
the topic "Improving information retrieval methods for OCR data sets
consisting of Indic scripts". I would like to know more about the project.
What is its current state? What corpora, tools, algorithms and approaches
are you using? Since the project aims to improve on existing methods, what
are the current results?

I can program in Python, C++, C and Java. I took a course on Information
Retrieval this year under Prof. Prasenjit Majumder. The course focused
mainly on IR methods for Indian languages. I did a project on Authorship
Attribution for Gujarati text using POS tags. My paper on it is
here<https://docs.google.com/file/d/0B9rBcMmY_5QUTkRxNUNkRXdQbFE/edit?usp=sharing>.
I have used Lemur/Indri and the CMU SLM Toolkit for some other project
tasks. I have also taken a course on Fuzzy Neural Networks. Fuzzy
approaches could be applied to image-to-text conversion to improve results,
mainly in error-prone areas.

*Abhishek Gupta*
*9624799165*