[Project-ideas] Improving information retrieval methods for OCR data sets consisting of Indic scripts

Rabindra Rakshit rovir2r at gmail.com
Mon Feb 3 00:42:54 PST 2014


I (Rabindra Rakshit) am interested in applying for GSoC 2014 and would
like to know whether Ankur India is applying as a mentoring organization
again this year.

I am currently pursuing my B.Tech in Computer Science (CSE) at the College
of Engineering and Management, Kolaghat, and, as a Bengali, I would love
to see my language flourish in the open source community.

I am particularly interested in the project on improving information
retrieval methods for OCR data sets consisting of Indic scripts (Info
Rescue). I had a look at Abhishek Gupta's work plan, and the final voting
system, in a general (abstract) form, is yet to be implemented.

I don't have any direct experience with OCR, but I do have experience
working with information retrieval systems. In fact, I am currently working
on Consensus Sequence Segmentation, an unsupervised text segmentation
algorithm that relies entirely on statistical relationships among letters
in the input sequence to detect the locations of word boundaries.

Link: http://arxiv.org/abs/1308.3839