I'm not sure if this belongs in Software for Windows or Peripherals. I recently bought a Canon 4400F scanner, mainly to scan paper documents into PDF files and use OCR to make them searchable. I am a complete scanner n00b, and there are two documents in particular that are giving me problems.
One is a bill from my gas company. The front side of the bill is black text/logos on white, and it scans and OCRs without any issue. The back side of the bill is light gray text on a white background. It's difficult to read by eye, and the scanner doesn't pick it up at all. I first tried scanning it using Adobe Acrobat. When I set the document type to "black and white" or "grayscale", Acrobat complains that it did not detect any text on the page. If I set the document type to "color", the text does show up in the scan, but the OCR fails to identify any of it. I then tried scanning the document using OmniPage Professional 16, and it also claimed that it did not detect any text on the page. Are there any settings I can adjust that will cause this text to be recognized and OCR'd correctly?
The second document is a full-size, 8.5x11" yellow receipt. There is text and color on both sides of the page. It scans fine, but because the paper is thin and semi-transparent, part of the back side "bleeds through" when I scan the front side, and the resulting image looks like a superposition of both sides of the paper. Is there any way to make the scanner ignore the back side of the paper and prevent this "bleed through" effect?
