[Project-ideas] [Anubad] Project to Develop a system with multi-lingual capabilities in order to receive answer to user specific queries

Sankarshan Mukhopadhyay sankarshan.mukhopadhyay at gmail.com
Tue Mar 20 20:07:18 PDT 2012


On Mon, Mar 19, 2012 at 8:11 PM, Abhishek Gupta
<abhishekgupta.iitd at gmail.com> wrote:

> 1 - Start working on English because it is resource-rich. And once we
> achieve fairly good accuracy, we can move on to Bengali.
>
> 2 - Start with Bengali itself from the beginning.
>
> 3 - Go for a language-independent implementation, which should be a
> considerably harder problem and might lead to lower accuracy, as we won't
> be able to take advantage of domain-specific knowledge of the language.
>
> Which approach would Ankur recommend?

We should probably look at #2 as a variant of #1 above, where the query
language and corpus are in the same locale. And #3, although ideal,
would probably go beyond the available timelines (I'd admit that a
query-response system that resolves ambiguity across languages would
be a fascinating one to have).

> I looked at the GSoC documentation and believe that it would be at
> the discretion of the mentor of the project and Ankur.

I'll think over this for a while.


-- 
sankarshan mukhopadhyay
<http://sankarshan.randomink.org/blog/>


