<div dir="ltr">Hi,<div><br></div><div>I am a 3rd-year student at Dhirubhai Ambani Institute of Information &amp; Communication Technology. I am interested in working with Ankur-India on the topic "Improving information retrieval methods for OCR data sets consisting of Indic scripts", and I would like to know more about the project. What is its current state? What corpora, tools, algorithms, and approaches are you using? Since the project aims to improve the method, what are the current results?</div>


<div><br></div><div>I can program in Python, C++, C, and Java. This year I took a course on Information Retrieval under Prof. Prasenjit Majumder; its main focus is IR methods for Indian languages. I have done a project on authorship attribution for Gujarati text using POS tags, and my paper on it is <a href="https://docs.google.com/file/d/0B9rBcMmY_5QUTkRxNUNkRXdQbFE/edit?usp=sharing" target="_blank">here</a>. I have used Lemur/Indri and the CMU SLM Toolkit for some other project tasks. I have also taken a course on fuzzy neural networks. Fuzzy approaches could be applied to image-to-text conversion to get better results, mainly in error-prone areas. </div>


<div><div><br></div><div dir="ltr"><font color="#a64d79" face="trebuchet ms, sans-serif"><b>Abhishek Gupta</b></font></div>

<div dir="ltr"><font color="#a64d79" face="trebuchet ms, sans-serif"><b>9624799165</b></font></div>

</div></div>