
need some help scanning documents

Special K

Diamond Member
I'm not sure if this belongs in Software for Windows or Peripherals. I recently bought a Canon 4400F scanner mainly for the purpose of scanning paper documents into PDF files and using OCR to make these documents searchable. I am a complete scanner n00b, and there are 2 documents in particular that are giving me problems.

One is a bill from my gas company. The front side of the bill is black text/logos on white, and it scans and OCRs without any issue. The back side of the bill is light gray text on a white background. It's difficult to read even by eye, and the scanner won't pick it up at all. I first tried scanning it using Adobe Acrobat. When I set the document type to "black and white" or "grayscale", Acrobat complains that it did not detect any text on the page. If I set the document type to "color", the text shows up in the scan, but the OCR fails to identify any of it. I then tried scanning the document using OmniPage Professional 16, and it also claimed that it did not detect any text on the page. Are there any settings I can adjust that will cause this text to be recognized and OCR'd correctly?

The second document is a full-size (8.5x11") yellow receipt with text and color on both sides of the page. It scans fine, but because the paper is thin and semi-transparent, part of the back side "bleeds through" when I scan the front side, and the resulting image looks like a superposition of both sides of the paper. Is there any way to make the scanner ignore the back side of the paper and prevent this "bleed through" effect?
 
For the first one, you should be able to scan the text, change the gray to black, then OCR it. If the scanning software doesn't permit that kind of color change, you may have to bring the scan into Photoshop or a similar program first to change the gray to black, then OCR it.

For the second one, try placing a piece of cardboard on top of it so the scanner light doesn't bleed through.
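
If you'd rather script the gray-to-black step for the first document than do it by hand, here is a rough sketch of the idea in plain Python. It assumes the scan has already been loaded as rows of 8-bit grayscale values; the function name, the sample numbers, and the threshold of 200 are all made up for illustration, and the right cutoff for a real bill would have to be found by trial.

```python
def gray_to_black(pixels, threshold=200):
    """Make every pixel darker than `threshold` pure black and the rest pure white.

    `pixels` is a list of rows of 8-bit grayscale values (0 = black, 255 = white).
    The threshold is an assumed starting point, not a recommended setting.
    """
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


# Tiny made-up "scan": light gray text (about 180) on a near-white page.
scan = [
    [255, 180, 250],
    [245, 182, 255],
]
cleaned = gray_to_black(scan)
print(cleaned)
```

On a real scan the same mapping can be applied with an image library (Pillow's `Image.point`, for instance, runs every pixel value through a function), and the cleaned image can then be handed to the OCR program.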
 
Originally posted by: TastesLikeChicken
For the first one, you should be able to scan the text, change the gray to black, then OCR it. If the scanning software doesn't permit that kind of color change, you may have to bring the scan into Photoshop or a similar program first to change the gray to black, then OCR it.

For the second one, try placing a piece of cardboard on top of it so the scanner light doesn't bleed through.

For the first document, how do you change the color of just the text if the program cannot even recognize it as text to begin with?
 
Bring it into Photoshop, or just about any other image editing program, and use the magic wand tool (or its equivalent in whatever program you use) to select the gray color, then fill the selection with black.
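
For anyone curious what that select-and-fill amounts to, here is a simplified stand-in in plain Python. It is not how Photoshop implements the magic wand: a real wand usually selects a connected region, while this sketch picks every pixel in the image whose color is close to the target. The function name, the tolerance of 20, and the sample colors are assumptions for illustration only.

```python
def fill_similar_color(pixels, target, tolerance=20, fill=(0, 0, 0)):
    """Replace every pixel within `tolerance` of `target` (per channel) with `fill`.

    `pixels` is a list of rows of (R, G, B) tuples with 8-bit channels.
    """
    def is_close(color):
        return all(abs(c - t) <= tolerance for c, t in zip(color, target))

    return [[fill if is_close(color) else color for color in row] for row in pixels]


# Made-up 2x2 image: two light gray "text" pixels on a white page.
page = [
    [(255, 255, 255), (190, 190, 190)],
    [(185, 188, 192), (250, 250, 250)],
]
result = fill_similar_color(page, target=(190, 190, 190))
print(result)
```

The tolerance plays the same role as the wand's tolerance setting: too low and parts of the gray text get missed, too high and the page background starts turning black as well.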
 