[Project-ideas] Project Proposal ideas

Sayamindu Dasgupta sayamindu at gmail.com
Sat Mar 24 17:29:16 PDT 2012


On Thu, Mar 22, 2012 at 7:53 AM, Sankarshan Mukhopadhyay
<sankarshan.mukhopadhyay at gmail.com> wrote:
> On Thu, Mar 22, 2012 at 3:00 PM, nitesh surtani
> <nitesh.surtani0606 at gmail.com> wrote:
>
>> 1) An application UI testing framework for validating translation
>> completeness and quality
>>
>> Mentor: Runa Bhattacharjee
>>
>> Though I am not very familiar with l10n, I am very keen to explore this
>> project. I have looked into the localization of a couple of software
>> packages in Hindi (I wasn’t able to understand the UI in Bengali :-)). I
>> have gone through the translations for Pidgin for Hindi
>> (http://developer.pidgin.im/ticket/11411) and have understood a few issues.
>> I have a question though: since I am not a Bengali speaker, will that
>> affect my understanding of, and work on, this project?
>
> Have a look at the mail at
> <http://lists.ankur.org.in/pipermail/project-ideas-ankur.org.in/2012-March/000012.html>
>
>> 2) Add a language model for speech recognition software for Bengali language
>>
>> Mentor: Sayamindu Dasgupta
>>
>> I actually wanted some more insight regarding this project. Since the corpus
>> is available, some HMM model (like the HTK toolkit, usually used for speech
>> recognition) can be used to implement this language model. I have used the
>> SRILM toolkit once for an MT task, as part of a course project to develop a
>> domain-specific MT system for the tourism domain.
>
> I'll nudge Sayamindu for a response. However, when you say "corpus
> is available", what specific corpus do you refer to?


Are these corpora reusable for building Open Source software?
(One has to be careful about licensing.)

-Sayamindu


>
>
> --
> sankarshan mukhopadhyay
> <http://sankarshan.randomink.org/blog/>



-- 
Sayamindu Dasgupta
[http://sayamindu.randomink.org/ramblings]


