>>>Someone sent us a PDF that was from a scanned image. The quality is poor at best. It is mostly text and that is what I care about. Any way to improve the quality to make it more readable? Using just Adobe Reader 8.1 to view the PDF.
>>
>>Check out the free PDFToText utility.
>>
>>
Re: Extract text from PDF, Thread #1217313, Message #1217317
>
>Hmm, I suspect the PDF file does not contain any text per se that could be extracted via a utility like PDFToText; I'm guessing it contains only the original scanned image. It might be possible to:
>
>- use one of the utilities in XPDF to extract the image from the PDF: http://en.wikipedia.org/wiki/Xpdf
>- then, use Optical Character Recognition (OCR) to extract the text from the image
>- once the text has been extracted, it can be printed/viewed "perfectly" via Notepad, Word, etc.
>
>There are some free OCR packages such as SimpleOCR (http://www.simpleocr.com/) that could help here. Some non-free programs such as the full version of Adobe Acrobat have workflow features that help automate this kind of thing if you do it a lot.
Yes, I guess you're right. I forgot we're talking about a scanned image.
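
For anyone who wants to script the steps quoted above, here is a rough sketch, not a tested recipe. It assumes the Xpdf `pdfimages` tool is on the PATH, and it swaps in the Tesseract command-line OCR engine in place of SimpleOCR (my substitution, since Tesseract can be driven from a script). The function names and the `page` file prefix are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def extract_command(pdf_path, work_dir):
    """Build the Xpdf call: pdfimages <PDF-file> <image-root>."""
    # "page" is a hypothetical prefix; pdfimages appends a number and extension.
    return ["pdfimages", str(pdf_path), str(Path(work_dir) / "page")]


def ocr_command(image_path):
    """Build the Tesseract call: tesseract <image> <output-base>."""
    # Tesseract writes its result to <output-base>.txt.
    out_base = Path(image_path).with_suffix("")
    return ["tesseract", str(image_path), str(out_base)]


def pdf_to_text(pdf_path, work_dir):
    """Extract the scanned images from a PDF, OCR each one, return the text."""
    for tool in ("pdfimages", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} was not found on the PATH")

    Path(work_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(extract_command(pdf_path, work_dir), check=True)

    pages = []
    for image in sorted(Path(work_dir).glob("page-*")):
        if image.suffix == ".txt":
            continue  # skip OCR output left over from an earlier run
        cmd = ocr_command(image)
        subprocess.run(cmd, check=True)
        pages.append(Path(cmd[2] + ".txt").read_text(encoding="utf-8"))
    return "\n".join(pages)
```

Calling `pdf_to_text("scan.pdf", "out")` would leave the extracted images and one `.txt` file per image in `out`. How good the text is depends entirely on the scan quality, so a poor scan may still need manual cleanup.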
If it's not broken, fix it until it is.