[Project-ideas] GSoC idea discussion

Ishan Jain ishanjain1991 at gmail.com
Tue Apr 23 12:04:27 PDT 2013


Hello all,
I am a third-year B.Tech. student at the Dhirubhai Ambani Institute of
Information and Communication Technology (DA-IICT), Gandhinagar. I am
interested in the topic "Speech based query and result retrieval for Indian
languages". I have prior experience working with Indian languages, as I
am currently developing a named entity tagger for Gujarati using
Conditional Random Fields. The paper I wrote on this work can be found
here<https://docs.google.com/file/d/0B5xypg0s9MhZdGN2SHlacV9zRWc/edit?usp=sharing>.
The system is still at an early stage and there is a lot of scope for
improvement. I am currently in the process of adding POS tags as features.
I prefer to code in Python, but I have also coded in C, C++, and Java.
I am familiar with POS tagging and have a basic understanding of word sense
disambiguation.
I don't have much knowledge of the speech-to-text and text-to-speech aspects,
but I am confident that I will be able to learn them without
difficulty.
Since I have taken an Information Retrieval course this semester, I have a
fairly good understanding of Information Retrieval systems. I have worked
with Terrier and Lemur/Ind.
It would be a great help if you could guide me on the next steps I need to
take for this project and tell me where I can find more information about
it.
It would also help if you could share the essential details, such as corpus
availability and description, the current state of the work, and anything
else that is crucial to know.
Thanking You,
Ishan Raj Jain
7878779803