[Project-ideas] Query about a project idea

Sampoorna Biswas sampoorna10074 at iiitd.ac.in
Fri Apr 6 03:41:45 PDT 2012


Hi

I would like to develop a system where the query is not mapped to a
pre-existing query. Instead, the query itself is processed in order to
retrieve a suitable match from the data set being queried. It would
essentially be a multi-lingual search engine.

I have previously worked on a query-response system that used weighted
keyphrase matching to retrieve the closest match from an existing data
set. But that was all in English. The challenge obviously lies in
building an English-Bengali system.
What I can think of is this: if we have a data set in both English and
Bengali, the first step would be to determine whether the query is in
Bengali or English. If it is in Bengali, no translation is required to
search the Bengali data set. To search the English part, we can first
translate the query to English (with a high degree of accuracy) and then
search. The results from both languages can then be combined and presented
to the user. If the query is in English, the mirror-image approach can be
followed: search the English data directly and translate to Bengali.
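
To make the flow concrete, here is a rough Python sketch of the steps
described above. It is only an illustration under stated assumptions, not a
finished design: all function names are hypothetical, language detection is
done purely by Unicode script (so romanized Bengali would be misclassified
as English), the scoring is a toy stand-in for weighted keyphrase matching,
and `translate` is a placeholder callable that a real translation component
would replace.

```python
from typing import Callable, Dict, List, Tuple

# Unicode Bengali block; detection by script alone is an assumption.
BENGALI_START, BENGALI_END = 0x0980, 0x09FF


def detect_language(query: str) -> str:
    """Return 'bn' if most letters fall in the Bengali block, else 'en'."""
    letters = [ch for ch in query if ch.isalpha()]
    bengali = sum(BENGALI_START <= ord(ch) <= BENGALI_END for ch in letters)
    return "bn" if letters and bengali * 2 > len(letters) else "en"


def score(query: str, doc: str, weights: Dict[str, float]) -> float:
    """Sum the weights of query terms that also occur in the document.

    Terms without an explicit weight count as 1.0 (hypothetical default).
    """
    doc_terms = set(doc.lower().split())
    query_terms = set(query.lower().split())
    return sum(weights.get(t, 1.0) for t in query_terms if t in doc_terms)


def search(query: str, docs: List[str], weights: Dict[str, float],
           top_k: int = 3) -> List[Tuple[float, str]]:
    """Return up to top_k (score, document) pairs with a non-zero score."""
    scored = [(score(query, d, weights), d) for d in docs]
    return sorted((s for s in scored if s[0] > 0), reverse=True)[:top_k]


def cross_lingual_search(query: str,
                         corpora: Dict[str, List[str]],
                         translate: Callable[[str, str, str], str],
                         weights: Dict[str, float] = None,
                         top_k: int = 3) -> List[Tuple[float, str]]:
    """Search the query's own language directly, the other via translation."""
    weights = weights or {}
    source = detect_language(query)
    target = "en" if source == "bn" else "bn"
    # Same-language search needs no translation.
    results = search(query, corpora[source], weights, top_k)
    # The translator is pluggable: any callable (text, src, tgt) -> text.
    translated = translate(query, source, target)
    results += search(translated, corpora[target], weights, top_k)
    # Merge both result lists by score.
    return sorted(results, reverse=True)[:top_k]
```

Merging by raw score assumes the scores from the two languages are
comparable, which may not hold in practice; some normalisation per language
would probably be needed.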

However, existing machine translation systems aren't very accurate, and
machine translation is in fact one of the other projects on the ideas page.
Would it be sufficient to develop a system where the translation component
can be plugged in from the other project?
Also, I'll be very grateful for any kind of feedback on the approach that I
suggested. I will be writing the formal application soon.

Regards
Sampoorna Biswas


On Fri, Apr 6, 2012 at 8:10 AM, Sankarshan Mukhopadhyay <
sankarshan.mukhopadhyay at gmail.com> wrote:

> On Thu, Apr 5, 2012 at 5:46 PM, Sampoorna Biswas
> <sampoorna10074 at iiitd.ac.in> wrote:
>
> > In the project "Develop a system with multi-lingual capabilities in
> > order to receive answer to user specific queries.", how exactly does
> > one 'determine the question'? I mean, what is the input here? Will a
> > user query be classified into a query that already exists and the
> > closest match be delivered?
>
> The archives will have at least one email explaining some parts of the
> questions you have raised. I hope you have had the chance to look
> through them. I'd say that the system does not specifically
> "determine" the question. The user can use the input area to provide a
> string to query upon; it may be in the form of a question, or it may be
> a fragmented word. Whether your proposal would convert the user input
> into a query to map to a pre-existing query is something you should
> look at. Earlier in the conversation with others on this project, I
> had mentioned that the constraints of a FAQ-like system will
> potentially devalue this implementation.
>
>
> --
> sankarshan mukhopadhyay
> <http://sankarshan.randomink.org/blog/>
>