[Anubad] Project to Develop a system with multi-lingual capabilities in order to receive answer to user specific queries

Abhishek Gupta abhishekgupta.iitd at gmail.com
Sun Mar 18 23:17:53 PDT 2012


Hi Sankarshan,

Thanks for the reply. In my view, the possible input sets are as follows:

*1 - Model set of questions and answers.*
In this case, when a user asks a question, we find its correlation with the
existing model questions and shortlist the closest ones. The shortlisted
questions tell us which answers are relevant. The next step is to answer
the user's question based on the answers to those shortlisted model
questions.

The advantage of this approach is that most people already have a model set
of questions and answers, so the approach won't need much input from the
administrators.
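
A minimal sketch of the shortlisting step, assuming a plain bag-of-words
cosine similarity as the measure of correlation (the measure is my assumption
here, and a real system would likely need something stronger for
multi-lingual input). The MODEL_QA entries and the function names are
hypothetical, only for illustration:

```python
import math
import re
from collections import Counter

# Hypothetical model set; a real deployment would load the administrators' own Q&A.
MODEL_QA = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("How do I change my email address?", "Open Settings and edit the email field."),
    ("Where can I download the installer?", "The installer is on the Downloads page."),
]

def tokenize(text):
    # \w+ is Unicode-aware in Python 3, so non-Latin scripts are tokenized too.
    return re.findall(r"\w+", text.lower())

def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def shortlist(question, qa_pairs, top_k=2):
    # Score every model question against the user's question and keep the
    # top_k with a non-zero score, as (score, model_question, answer) tuples.
    q_vec = Counter(tokenize(question))
    scored = [(cosine(q_vec, Counter(tokenize(q))), q, a) for q, a in qa_pairs]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [item for item in scored[:top_k] if item[0] > 0]

for score, model_question, model_answer in shortlist(
    "I forgot my password, how can I reset it?", MODEL_QA
):
    print(round(score, 2), model_question, "->", model_answer)
```

The answers attached to the shortlisted questions would then be the raw
material for composing the reply.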

*2 - Ontology database*
The other, more obvious but less convenient, option is to construct an
ontology or a semantic network of the information. We then shallow-parse
the user's question and follow it with semantic analysis (for example,
mapping it to a Prolog query on our ontology database), find the answer,
and use it to reply to the user in a more 'humane' way. This is quite
similar to the model that SIRI uses.
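
A rough sketch of that pipeline, written in Python instead of Prolog to keep
it self-contained. It assumes the ontology is stored as (subject, relation,
object) triples and that shallow parsing is approximated by regular
expression patterns. The triples, patterns and function names are all
hypothetical:

```python
import re

# Hypothetical ontology, stored as (subject, relation, object) triples.
TRIPLES = {
    ("delhi", "capital_of", "india"),
    ("paris", "capital_of", "france"),
    ("hindi", "spoken_in", "india"),
}

# Hypothetical question patterns: (regex, relation to query, reply template).
PATTERNS = [
    (re.compile(r"what is the capital of (\w+)", re.I),
     "capital_of", "The capital of {obj} is {subj}."),
    (re.compile(r"which language is spoken in (\w+)", re.I),
     "spoken_in", "{subj} is spoken in {obj}."),
]

def query(relation, obj):
    # Roughly the Prolog query relation(X, obj): collect every matching subject.
    return sorted(s for s, r, o in TRIPLES if r == relation and o == obj)

def answer(question):
    # Map the question to a query, run it, and wrap the result in a sentence.
    for pattern, relation, template in PATTERNS:
        match = pattern.search(question)
        if match:
            obj = match.group(1).lower()
            subjects = query(relation, obj)
            if subjects:
                return template.format(
                    obj=obj.title(),
                    subj=", ".join(s.title() for s in subjects),
                )
    return None  # no pattern matched, or the ontology has no such fact

print(answer("What is the capital of India?"))
```

The reply template stands in for the step of answering in a more natural
way; a real system would need a proper parser and a much larger ontology.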

*3 - Unstructured data like a wiki page*
This approach, even though far more convenient for the user, might not
generate really good results. Here we give the system unstructured data,
say a wiki page. The proposed system should search the page to find the
answer and then rephrase it so that it sounds sociable and relevant to the
question asked.
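
A minimal sketch of the search step only, assuming the answer is a single
sentence of the page and that word overlap with the question is a good enough
score (both are simplifying assumptions). The stop-word list, the page text
and the function names are hypothetical:

```python
import re

# Hypothetical, very small stop-word list; a real one would be per language.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "of", "in", "on", "to", "at",
    "what", "who", "when", "where", "which", "how", "do", "does", "did",
}

def content_words(text):
    # Lower-cased words with stop words removed.
    return {w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS}

def best_sentence(question, page_text):
    # Split the page into sentences and return the one sharing the most
    # content words with the question, or None if nothing overlaps.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", page_text) if s.strip()]
    q_words = content_words(question)
    best, best_score = None, 0
    for sentence in sentences:
        score = len(q_words & content_words(sentence))
        if score > best_score:
            best, best_score = sentence, score
    return best

# Hypothetical page content, for illustration only.
PAGE = (
    "The library opens at nine. "
    "The cafeteria is on the second floor. "
    "Parking is free on weekends."
)

print(best_sentence("Where is the cafeteria?", PAGE))
```

Rephrasing the selected sentence so that it reads as a direct reply is a
separate and harder step that this sketch does not attempt.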

Note:
We can integrate the solution with the existing bot code-base like *alice*,
so as to take advantage of the extensive knowledge base created.

As in the bot at https://www1.paypal-virtualchat.com/, we can try to steer
the discussion by asking the user questions with *constrained inputs, like
extracting a boolean value* from the reply, to check whether what we are
answering indeed helps the user.
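
A small sketch of extracting such a boolean from a free-text reply, assuming
a fixed list of English yes/no words (the word lists and the function name
are hypothetical, and negations such as "not really" are not handled):

```python
import re

# Hypothetical word lists; each supported language would need its own.
YES_WORDS = {"yes", "yeah", "yep", "sure"}
NO_WORDS = {"no", "nope", "nah"}

def extract_boolean(reply):
    # Return True or False for a clear yes or no, and None when the reply
    # contains neither or both, so that the bot can ask again.
    tokens = re.findall(r"\w+", reply.lower())
    has_yes = any(t in YES_WORDS for t in tokens)
    has_no = any(t in NO_WORDS for t in tokens)
    if has_yes == has_no:
        return None
    return has_yes

for reply in ["Yes, that solved it", "Nope, still broken", "maybe"]:
    print(repr(reply), "->", extract_boolean(reply))
```

Returning None for unclear replies lets the bot repeat or rephrase the
question instead of guessing.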

The project can be done in two *phases*. In the first phase we can focus on
more *objective* questions, while in the second phase *subjective*
questions can be stressed.

Please add anything I am missing, or correct me where I am wrong. There are
some really good publications on this topic, especially in the context of
SIRI. Finally, I wanted to ask about the possibility of a research
*publication* if we are able to achieve some noteworthy results, which I am
quite confident of achieving.

Thanks and Regards
Abhishek Gupta
3rd Year, Computer Science, IIT Delhi
abhishek.cc

On 19 March 2012 10:53, Sankarshan Mukhopadhyay <
sankarshan.mukhopadhyay at gmail.com> wrote:

> Hi Abhishek,
>
> I have copied the mailing list in the reply and, have responded to the
> relevant bits of your email.
>
> On Sun, Mar 18, 2012 at 4:25 AM, Abhishek Gupta
> <abhishekgupta.iitd at gmail.com> wrote:
>
> > I am a student of 3rd year pursuing a B.Tech degree in Computer Science at
> > IIT Delhi (Indian Institute of Technology, New Delhi, India). I read about
> > the project ideas for Ankur and would be really glad if I can work on the
> > project "Develop a system with multi-lingual capabilities in order to
> > receive answer to user specific queries" under your guidance to complement
> > my present knowledge.
>
> It is good to see someone have an interest in this specific project.
> To answer the question you had asked in a separate email, I'd like the
> proposal from you to consider the training data for this system along
> with the rationale on why you selected that as a training data. The
> way we would like to undertake the projects is provide a general sense
> of direction and thereafter request the interested candidates to nail
> down the objectives and the route as well as the metrics which would
> help us gauge the success/failure of the projects with various
> milestones.
>
> /sankarshan
>
> --
> sankarshan mukhopadhyay
> <http://sankarshan.randomink.org/blog/>
>


