[Project-ideas] follow up discussions - improve accuracy of bengali OCR

Mon Apr 15 15:59:50 PDT 2013

Hi,

>> Also, M.A.Hasnat, the developer of BanglaOCR, pointed out to me that the
>> accuracy may not be the same for all domains, e.g., newspaper, book,
>> typewriting docs, etc., so domain adaptability should be considered.

> Have you wondered why that would be so?

I have not given it much thought, but depending on the initial
pre-processing, factors such as the quality of the page, print, and scan
could affect the actual input supplied to the OCR.  Assuming the input data
is uniform in quality (resolution should also have an effect, but changing
the resolution of digital data should not be difficult), the OCR should
perform the same across domains.

But to begin with, I would like to focus more on the post-processing, and
selectively on some of the pre-processing steps, rather than on
pre-processing as a whole. I shall describe my plan in more detail in my
proposal. (I am still working on it; it is taking longer than I expected.)

Perhaps this particular problem is more relevant to the other OCR project?

P.S. Apologies for starting a new thread; I only get daily digest mails
and could not figure out how to reply to the same thread.

-- 
-Regards,
Debajyoti Nag
http://twitter.com/aramis7d