Re: gocr (GPL'ed Optical Character Recognition)
>For those of you with low-volume or occasional OCR requirements, this is
>the best effort I've seen so far. There are others on the link page from
>this site, some of which I've tried and found very much wanting.
>Some impressions (based on a few trials):
>1. Takes longer to compile, per program, than almost anything else I've
> ever seen. This is an indicator (not a guarantee) of a high degree of
> complexity in a fairly small amount of code.
You noticed this too? I sat there, staring at the gnome-terminal window waiting
for the thing to abend. I've never seen a file take that long to compile.
>2. The packaging stinks. This is actually good, because this guy is paying
> lots of attention to the core content, not all the froo-furrah. Later he
> can get fancy.
Not sure what you mean by this. It's a single binary; what packaging would he
need?
>5. OCR'ing a page of typed copy took about 2 minutes on a 233 MHz Pentium;
> (2:30 for a virtually flawless page of Courier). Performance ain't great.
> Who cares? For small volumes, it's livable, for large volumes and production
> apps, there's commercial software, which costs big bux... but then
> there's budget for that on large projects. This levels the playing
> field a bit.
I'd also note that it is not for low-memory machines. This thing
eats up memory like the Tasmanian Devil at a buffet.
>Whether you spend nothing or thousands for OCR, you still need to
>proofread, although there are generally fewer mistakes at the very pricey end of
>things. So far, this does a lot of the easy stuff as well as some really
>expensive packages, and it's not even close to release 1.0.
>gocr looks good enough, as is, for block-level prototyping. For an early
>release, this is great stuff!
I snatched an excerpt out of a manual last night, about two paragraphs of
Garamond type, trimmed it down to just the text with xv, OCR'd it, and had two
minor errors, which I thought was very good. On a whole page, it didn't do as
well. But if this gets only a little better, it will be as good as the
commercial Win32 OCR software I've used.
Systems and Network Administrator
Morrison Industries
1825 Monroe Ave NW.
Grand Rapids, MI. 49505