[Project-ideas] FEW QUESTIONS ABOUT GSOC2012 PROJECT PROPOSAL

Abhishek Gupta abhishekgupta.iitd at gmail.com
Sun Apr 1 17:40:28 PDT 2012


Hi Gourab,

As I am also interested in the project "Develop a system with multi-lingual
capabilities in order to receive answer to user specific queries", I have
had some discussions about it with Sankarshan, which you can browse in
the archives. You can also find my proposal at
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/abhi7/20019
which might give you a broad idea of the problem and of what I think would
be a good solution.

I welcome your comments on the list, as you won't be able to comment there,
and I sincerely hope that my proposal helps you in framing yours as well
:)

Also, I have a doubt regarding the approach that you have suggested.
An approach along the lines of general information retrieval may not
work so well (or may not be required) for this problem. As a very specific
domain, FAQs, is targeted, we can probably narrow our steps down to
well-defined approaches. *For example, we may not require the mentioned
dataset, as we might work in a way that requires no conversion from English
to Bangla or vice versa.*

Regards
Abhishek

On 28 March 2012 21:24, Gourab Saha <gourab.isikolkata at gmail.com> wrote:

>
>
> I am Gourab. As I am in the process of drafting my formal project
> proposal, I have a few questions regarding some issues.
> As I said previously, I am seriously interested in working on a project in
> the field of "information retrieval" as part of GSOC2012,
> and in continuing to work in that field for my M.Tech thesis.
>
> During the previous week I studied the following project ideas
> that you have floated in this area.
>
>
> 1. Improving models for Cross Language Text Re-use
> 2. Develop a system with multi-lingual capabilities in order to receive
> answer to user specific queries
> 3. Improving information retrieval methods for OCR data sets consisting of
> indic scripts
>
> I have talked with my professors, who have similar research
> interests. Following their valuable suggestions on the above-mentioned
> project ideas, I am now drafting a formal proposal.
>
> As you have raised a concern about the licensing of the available
> datasets/tools, I have confirmed with my
> professor that the RISOT data set (on which I am planning to work) is freely
> available and not constrained by any license.
> The Lemur toolkit is a completely open source framework for IR software
> development, and trec_eval, a standard tool for
> performance evaluation, is also completely open source.
>
> I am describing my proposal briefly here. Kindly suggest how it
> can be further improved.
>
> The key idea is to propose and implement a method to improve
> cross-language information retrieval for a
> pair of languages (Bengali/English). We have the RISOT data set, containing
> articles from the ABP Bengali newspaper corpus
> from 2004-2006 as well as The Telegraph English newspaper corpus. The system
> will take a query in English and retrieve
> results from the Bengali corpus. The query will go through
> translation, transliteration,
> blind relevance feedback, and query expansion before the final information
> retrieval step. I am aiming for a well-accepted accuracy
> measure.
>
> I have a few questions other than the technical issues of the project.
>
> 1. Apart from mentors from the organization (http://ankur.org.in/), can I
>     have a mentor from my institution or a foreign university? They would
>     not be related to GSOC2012 in any way.
>
> 2. If my proposal is accepted and my research this summer leads to a paper
>     publication, are there any constraints or to-dos from
>     GSOC or from your organization for publishing a paper? As far as I
>     understand, Google doesn't have any problem as long as I release my
>     code under an open source license.
>
> Kindly let me know of any other issues regarding the proposal (details will
> be included in the final proposal) or any impediments related to
> other issues. Kindly give your valuable feedback, as I am
> drafting my formal proposal.
> I sincerely hope to work with you this summer.
>
> If you have any questions or concerns, please feel free to send me an
> email at gourab.isikolkata at gmail.com or call me at 9051110501.
>
> Thanks
>
> Gourab Saha
> M.Tech (Computer Science)
> Indian Statistical Institute, Kolkata
> gourab.isikolkata at gmail.com
> (+91)9051110501