[Project-ideas] GSOC 2013 Introduction

Sun Apr 14 23:26:50 PDT 2013

On Fri, Apr 12, 2013 at 8:20 AM, Sankarshan Mukhopadhyay
<sankarshan.mukhopadhyay at gmail.com> wrote:
> Hi Atanu,
>
> On Thu, Apr 11, 2013 at 6:04 PM, Atanu Ghosh <atanu1991 at gmail.com> wrote:
>
>> I have done a preliminary survey on the topic and have come up with a few
>> points.
>
> Neat. Thank you.
>
>> As per the description of the project idea "Develop a language model for
>> speech processing by extending a freely available corpus" I have come up
>> with:
>>
>> We can go with CMUSphinx to build the language model for Bengali. This can
>> be done as shown in Reference [1].
>> Now, one point is that CMUSphinx has already been tried. To do something
>> new, we can use Julius, as I don't think it has been tried with Bengali. It
>> would definitely be something new.
>>
>> Next, the problem is gathering data to train our system. I have found two
>> approaches to get data. One is to use the data available on the Shruti
>> Bangla ASR site [2]; the other is to use the algorithm in this paper [3] to
>> generate phonemes, consonants, etc.
>>
>> Third, the actual STT can be done as described in Reference [1], with
>> guidance from the paper in Reference [4]. Methods to reduce noise and hence
>> improve accuracy can also be considered (I haven't researched this yet).
>>
>> Also, I was curious whether we can make a TTS system. I was looking at
>> Dhvani [5]. They say the Bengali module needs a lot of improvement [6]. If
>> we use the large dataset we have to train Dhvani to improve and to
>> recognize digits, even a good TTS system could be obtained.
>>
>> Finally, complete and concise documentation, with all source code and the
>> method of implementation, can be released for STT, TTS, or both, which
>> others can use to develop a language model for any Indic script. The
>> proof-of-concept, as said, will be done in Bengali and demonstrated.
>>
>> Thank you for your patience in going through this rather long mail. Please
>> suggest any new ideas/concepts with which I can improve upon what I wrote
>> in this mail and come up with a basic draft of the final objective.
>
> You make valid points.
>
> With regard to the creation of the training corpus, I am not sure
> about the license of the dataset for Shruti - is that free/libre?
>
> Do you feel that you are at a stage where you can begin to take a stab
> at creating a very first iteration of a proposal? If yes, please do
> so. I would prefer that you share the link to the Google Doc with me
> (off list) so that I can share it with the other mentors for them to
> provide inputs.
>
> I am not familiar with the modifications required in Dhvani to make it
> work as a TTS, but I am aware that Dhvani may be a good choice. Perhaps
> Bhavani could provide an opinion.
>

Sorry for the delay in response. I'll have a look and reply tonight.

Regards,

-- 
Bhavani Shankar
Ubuntu Developer       |  www.ubuntu.com
https://launchpad.net/~bhavi