[Project-ideas] follow up discussions - improve accuracy of bengali OCR

Sankarshan Mukhopadhyay sankarshan.mukhopadhyay at gmail.com
Mon Apr 22 21:07:52 PDT 2013


On Tue, Apr 23, 2013 at 7:48 AM, Debajyoti Nag <dave0908 at gmail.com> wrote:

> After further thought, I believe that it's best to rely on Tesseract's
> pre-processor for now.
> Noise due to those factors is not language-specific, and hence the minor
> noise corrections could be included only as an optional objective of the
> project, to be pursued if time permits.
> Things could always be improved.

> Tesseract 3.02 has good support for some connected scripts. Bengali is
> not among them; however, the methodology should be useful. It's mentioned in
> more detail in my proposal.
>
> Please find attached the first draft of my proposal for the project. I tried
> to build it based on the points mentioned on the Project Ideas page.

Since the proposal window has opened up, I think it would be prudent to
use the Melange system to submit your write-up. For the benefit of
others, the email with the attachment was stuck in the moderation queue,
and I think it is best that we have proposal discussions on the
Melange system.


--
sankarshan mukhopadhyay
<https://twitter.com/#!/sankarshan>


