[Anubad] Introduction - Aniket Handa

Sankarshan Mukhopadhyay sankarshan.mukhopadhyay at gmail.com
Sat Mar 17 09:54:37 PDT 2012


Hi Aniket,

Thank you for your interest in our project.

On Sat, Mar 17, 2012 at 2:15 PM, Aniket Handa <atneik at gmail.com> wrote:

>   - New Keyboard specific to input for mobile devices.
>   - Improve the accuracy of OCR tools for bengali language to 98%
>   - Improving information retrieval methods for OCR data sets consisting
>   of Indic scripts.
>
> My question of concern is: Will *not* knowing the bengali language hinder
> in my project endeavor?
> What I suppose is some knowledge is necessary in order to present a good
> keyboard layout in case of first project Idea. And in case of OCR it
> greatly depends upon the technique being used ( As one might think of
> improving the OCR specific to the language, or upon some global or local
> matching).

I wrote up a bit about the keyboard layout idea in an earlier
email, so I'll focus on the other two ideas that you find
interesting. The basic premise of the Information Retrieval and OCR
ideas is that, while they do mention 'Bengali' as a language, the
framework and the implementation would probably be generic rather
than tied to a specific language. That said, I'd be more interested
in how you see the project shaping up. The ideas listed provide
guidance about the expected outcome; the actual end result has to be
conceived and given shape by the candidate/developer.

For what it is worth, not everyone who contributes to language
technology frameworks is a polyglot :) I hope that assuages your
concerns.

The form on the GSoC 2012 application system will ask you a few
questions about your familiarity with the project ideas, the amount
of time you will be able to commit, and so forth. Our IRC channel is
#ankur.org.in on irc.freenode.net

/sankarshan

-- 
sankarshan mukhopadhyay
<http://sankarshan.randomink.org/blog/>


