I ran OCR on about 1300 PDFs for a customer, and now I have to convert them to speech for CD. Sounds pretty simple... but now the hard part: each text file has a header and footer that are unnecessary, and I need to remove them. I'd seriously consider doing it by hand, but I didn't get THAT much money for the job. It's already taking more time than I thought.
Each unnecessary line is almost exactly alike, so it shouldn't be hard to match them.
The format is kind of like this, but it may vary a tiny bit from file to file: