need some help scanning documents

Special K

Diamond Member
Jun 18, 2000
I'm not sure if this belongs in Software for Windows or Peripherals. I recently bought a Canon 4400F scanner mainly for the purpose of scanning paper documents into PDF files and using OCR to make these documents searchable. I am a complete scanner n00b, and there are 2 documents in particular that are giving me problems.

One is a bill from my gas company. The front side of the bill is black text/logos on white, and it scans and OCRs without any issue. The back side of the bill is light gray text on a white background. It's difficult to read, but the scanner won't even pick it up. I first tried scanning it using Adobe Acrobat. When I set the document type to "black and white" or "grayscale", Acrobat complains and says it did not detect any text on the page. If I set the document type to "color", then the text is scanned, but the OCR fails to identify any of it. I then tried scanning the document using OmniPage Professional 16 and it also claimed that it did not detect any text on the page. Are there any settings I can adjust that will cause this text to be recognized and OCR'd correctly?

The second document is a full-size, 8.5x11" yellow receipt. There is text and color on both sides of the page. It scans fine, but because the paper is thin and semi-transparent, part of the backside of the page "bleeds through" when I scan the front side, and the resulting image looks like a superposition of both sides of the paper. Is there any way to make the scanner ignore the backside of the paper, and prevent this "bleed through" effect?
 
TastesLikeChicken

Sep 12, 2004
For the first one you should be able to scan the text, change the gray to black, then OCR it. If the scanning software doesn't permit that type of color change you may have to bring it into Photoshop or similar type of program first to change the gray to black, then OCR it.

For the second one, try placing a piece of cardboard on top of it so the scanner light doesn't bleed through.
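
If the scanning software has no gray-to-black option, the same step can be scripted rather than done by hand in an image editor. This is only a rough sketch: the function names, the cutoff of 200, and the sample pixel values are made up for illustration, and the file helper assumes the Pillow library is installed. The right cutoff depends on how light the gray text actually scans, so expect to experiment.

```python
def darken_faint_text(rows, cutoff=200):
    """Turn every grayscale value below `cutoff` black (0) and the rest white (255).

    `rows` is a list of rows of 0-255 grayscale values. The default cutoff
    is a guess, not a value taken from any scanner or OCR program.
    """
    return [[0 if value < cutoff else 255 for value in row] for row in rows]


def darken_scan(src_path, dst_path, cutoff=200):
    """Apply the same rule to an image file (assumes Pillow is installed)."""
    from PIL import Image  # imported here so the helper above works without Pillow

    gray = Image.open(src_path).convert("L")  # color or grayscale scan -> 8-bit gray
    fixed = gray.point(lambda value: 0 if value < cutoff else 255)
    fixed.save(dst_path)


if __name__ == "__main__":
    # Tiny stand-in for a scan: near-white paper with a column of light gray "text".
    page = [[255, 190, 255],
            [250, 185, 252]]
    print(darken_faint_text(page))
```

Scanning in color or grayscale first matters here, because a black-and-white scan has already thrown the light gray away before any of this can run. The darkened file can then be opened in the OCR program as usual.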
 

Special K

Diamond Member
Jun 18, 2000
Originally posted by: TastesLikeChicken
For the first one you should be able to scan the text, change the gray to black, then OCR it. If the scanning software doesn't permit that type of color change you may have to bring it into Photoshop or similar type of program first to change the gray to black, then OCR it.

For the second one, try placing a piece of cardboard on top of it so the scanner light doesn't bleed through.

For the first document, how do you change the color of just the text if the program cannot even recognize it as text to begin with?
 
TastesLikeChicken

Sep 12, 2004
Originally posted by: Special K
For the first document, how do you change the color of just the text if the program cannot even recognize it as text to begin with?
Bring it into Photoshop, or just about any other image editing program, and use the magic wand tool (or its equivalent in whatever program you use) to select the gray color, then fill the selection with black.
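
The point is that none of this needs the program to recognize text: the selection works on pixel colors only. A rough Python sketch of the idea follows, with the caveat that it selects every pixel close to the chosen gray anywhere on the page (more like a "select color range" command than a true magic wand, which only grabs connected areas). The function names, the tolerance of 30, and the sample colors are assumptions for illustration, and the file helper assumes Pillow is installed.

```python
def fill_similar(pixels, target, tolerance=30, fill=(0, 0, 0)):
    """Replace every RGB pixel within `tolerance` of `target` with `fill`.

    `pixels` is a list of rows of (r, g, b) tuples. A pixel counts as a match
    when each channel differs from `target` by at most `tolerance`.
    """
    def is_close(pixel):
        return all(abs(a - b) <= tolerance for a, b in zip(pixel, target))

    return [[fill if is_close(pixel) else pixel for pixel in row] for row in pixels]


def fill_similar_in_file(src_path, dst_path, target, tolerance=30):
    """Apply fill_similar to an image file (assumes Pillow is installed)."""
    from PIL import Image  # imported here so fill_similar works without Pillow

    img = Image.open(src_path).convert("RGB")
    width, height = img.size
    px = img.load()
    rows = [[px[x, y] for x in range(width)] for y in range(height)]
    for y, row in enumerate(fill_similar(rows, target, tolerance)):
        for x, value in enumerate(row):
            px[x, y] = value
    img.save(dst_path)


if __name__ == "__main__":
    # White paper, two light gray "text" pixels, and one darker logo pixel.
    sample = [[(255, 255, 255), (180, 180, 180)],
              [(170, 175, 185), (90, 90, 90)]]
    print(fill_similar(sample, target=(180, 180, 180)))
```

The target color would be whatever gray the text actually scans as, which you can read off with the eyedropper tool in any image editor.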