[Project-ideas] gsoc 2012 project idea---A Glossary Tool to index all available localizable strings for Bengali across various FOSS projects and, other available corpus

Runa Bhattacharjee runabh at gmail.com
Wed Mar 21 11:11:22 PDT 2012


Thank you Subhobrata and Arani for your interest in this project. I am replying
to both your queries inline.

On Wednesday 21 March 2012 01:53 PM, Subhobrata Dey wrote:

>
> My questions are: 1> What exactly is expected out of this project? A glossary
> is OK, but will it also include a translator from an international standard
> language like English to Bengali? Since this will act as a reference tool for
> translation, a glossary of words alone won't do. This is because, as far as I
> know, a string in English is not always translated word-for-word into
> Bengali. Java & Python i18n libraries have methods to do this translation.
> So, for high quality translation, this issue needs to be taken care of.

I am unsure what you mean by 'International Standard Language', but I am
assuming you mean either a person or a tool that can translate from English to
Bengali, preferably the latter. Anyway, there are essentially two parts to
this. The first is the corpus of terms and mappings that make up the glossary.
The second is the set of methods that make sense of that corpus and put it to
good use. The first can be produced or procured with the least effort. The
second is where the real brain power comes in. It would require an extensive
study of existing glossary tools, or systems that use such tools, and of how a
suitable system can be implemented for translators of the language (in this
case Bengali), so that they have easy access to the contents of the glossary
and can use it to maintain consistency across their applications.
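
To make the two parts concrete, here is a minimal sketch, assuming the corpus is
already available as (English, Bengali) string pairs. The function names and
the sample pairs are hypothetical and only illustrate the idea: the first
function builds the glossary, the second uses it to flag terms that have been
translated in more than one way.

```python
from collections import defaultdict


def build_glossary(pairs):
    """Map each lowercased source term to the set of translations seen for it."""
    glossary = defaultdict(set)
    for source, target in pairs:
        glossary[source.strip().lower()].add(target.strip())
    return glossary


def inconsistent_terms(glossary):
    """Return the source terms that have more than one translation, sorted."""
    return sorted(term for term, targets in glossary.items() if len(targets) > 1)


# Hypothetical sample data; a real corpus would be extracted from the
# translation files of the various projects.
SAMPLE_PAIRS = [
    ("File", "ফাইল"),
    ("Save", "সংরক্ষণ"),
    ("save", "সংরক্ষণ করুন"),
    ("Open", "খুলুন"),
]

glossary = build_glossary(SAMPLE_PAIRS)
print(inconsistent_terms(glossary))
```

A translator-facing tool would need much more than this (fuzzy matching,
context, plural forms), but a plain mapping with a consistency check is one
possible starting point.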

>
> 2> What is meant by a self-learning glossary? Is it like parsing the
> localizable strings, learning the meaning of words from them, and using them
> to construct valid strings in future? Then I think this will involve NLP. It
> will be great if these doubts of mine are clarified.

A self-learning glossary would primarily be (but is not limited to) a system
where new inclusions are parsed adequately to ensure that existing content is
not borked, content querying is not affected, and mappings can be implemented
adequately. The rest I'd leave for you to build upon. :)
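
As a rough illustration of "new inclusions must not bork existing content",
here is a small sketch. It assumes the glossary is a plain dict from a
lowercased English term to a list of Bengali translations; the function name
and the status strings are hypothetical.

```python
def add_entry(glossary, source, target):
    """Add a mapping without overwriting existing content.

    Returns 'rejected' for empty input, 'duplicate' if the mapping is already
    known, 'conflict' if the term already has a different translation (the new
    one is kept as an alternative), and 'added' otherwise.
    """
    key = source.strip().lower()
    value = target.strip()
    if not key or not value:
        return "rejected"
    existing = glossary.setdefault(key, [])
    if value in existing:
        return "duplicate"
    status = "conflict" if existing else "added"
    # Append rather than replace, so earlier translations stay queryable.
    existing.append(value)
    return status


glossary = {}
print(add_entry(glossary, "Save", "সংরক্ষণ"))
print(add_entry(glossary, "save", "সংরক্ষণ করুন"))
```

Entries reported as conflicts could then be queued for review by a translator
instead of silently replacing what is already there.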
>
> By the way, creating a desktop application will be included in my project
> too if time permits.

That would certainly be a big help. :)

On Mon, Mar 19, 2012 at 10:46 PM, Arani Bhattacharya <arani89 at gmail.com> wrote:

> 1) As pointed out, one of the major initial tasks is to study any existing
> glossary, and analyze its working, advantages and disadvantages. Could
> someone point me to some existing tool?

As mentioned earlier, I'd suggest that you check existing glossary utility tools
or systems that use these tools, especially the translation systems already
mentioned by Sankarshan.

>
> 2) Which programming language should preferably be used to implement the
> tool?

Depends on what you'd prefer as the best tool to implement your idea :)

Hope this answers your initial queries.

cheers
Runa




-- 
blog: http://arrbee.wordpress.com
irc: arrbee or runa_b on Freenode
http://fedoraproject.org/wiki/User:Runab


