<div dir="ltr"><div>Q. Extending the dictionary for complex phrases/sentences?<br></div><div><br></div><div style="text-align:justify">A. I plan to use grapheme-to-phoneme (g2p) tools such as<font color="#111111" face="Verdana, Tahoma, Arial, sans-serif"> Phonetisaurus and </font><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px">sequitur-g2p, and I am currently evaluating which one to use.</span></div>
<div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px"><br></span></div><div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px">Q. MITLM ?</span></div>
<div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px"><br></span></div><div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px">A. Thanks for the suggestion. Yes, besides MITLM and CMUCLMTK there are other language-modeling tools, </span></div>
<div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px">for example SRILM and NGramLibrary, but SRILM cannot be used because of licensing issues.</span></div>
<div style="text-align:justify"><font color="#111111" face="Verdana, Tahoma, Arial, sans-serif">MITLM is also slightly better than CMUCLMTK, so I will use MITLM.</font></div><div style="text-align:justify">
<span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px"><br></span></div><div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px">Q. Implementing the acoustic model?</span></div>
<div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px"><br></span></div><div style="text-align:justify"><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px">A. I have decided to train the acoustic model myself, starting by collecting plenty of training data. I know this will take time.</span></div>
<div><span style="background-color:rgb(252,252,252);color:rgb(17,17,17);font-family:Verdana,Tahoma,Arial,sans-serif;font-size:13px"><br></span></div><div class="gmail_extra"><br clear="all"><div>Manish Sharma<br>B.Tech, CSE, IV year, Indian Institute of Technology Roorkee.<br>
<a href="tel:%2B91-7579048744" value="+917579048744" target="_blank">+91-7579048744</a> <br></div>
<br><br><div class="gmail_quote">On Sun, Apr 14, 2013 at 11:25 AM, Bhavani Shankar R <span dir="ltr"><<a href="mailto:bhavi@ubuntu.com" target="_blank">bhavi@ubuntu.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>On Sat, Apr 13, 2013 at 3:22 PM, manish sharma<br>
<<a href="mailto:manish09.iitroorkee@gmail.com" target="_blank">manish09.iitroorkee@gmail.com</a>> wrote:<br>
> Hi !!<br>
><br>
> A speech recognition engine has three components:<br>
><br>
> 1) Language model<br>
> 2) Acoustic model<br>
> 3) Decoder<br>
><br>
> Each language has a distinct number and type of sounds,<br>
> so we need to develop both an acoustic model and a language model.<br>
><br>
> My plan for developing the acoustic model and the language model is shared in<br>
> this doc.<br>
><br>
> <a href="https://docs.google.com/document/d/18gk39nrmSl6mOAYZ_zelVMnSPyS2-HY0CnLi03vxc44/edit?usp=sharing" target="_blank">https://docs.google.com/document/d/18gk39nrmSl6mOAYZ_zelVMnSPyS2-HY0CnLi03vxc44/edit?usp=sharing</a><br>
><br>
> I have already shared it with you, with the message<br>
><br>
> GSOC 2013 - Project: "Add a language model for speech recognition for<br>
> Bengali language."<br>
><br>
> The report is a little lengthy :).<br>
><br>
> Looking forward to an early response.<br>
><br>
<br>
</div>Hi Manish,<br>
<br>
CMU Sphinx looks fine to me. Just a couple of quick basic questions here:<br>
<br>
a) How do you think you can extend the dictionary for complex<br>
phrases/sentences (as a general view), and how do you think you can<br>
implement an acoustic model without much distortion, ensuring clarity<br>
for Indic languages? (Since you have mentioned creating an acoustic<br>
model.)<br>
b) What do you think about using a language modeling toolkit like MITLM along with Sphinx?<br>
<br>
Regards,<br>
<div><div><br>
<br>
<br>
--<br>
Bhavani Shankar<br>
Ubuntu Developer | <a href="http://www.ubuntu.com" target="_blank">www.ubuntu.com</a><br>
<a href="https://launchpad.net/~bhavi" target="_blank">https://launchpad.net/~bhavi</a><br>
</div></div></blockquote></div><br></div></div>