lusolth
New Member
I have an OCR system that is fed files by a VBScript. The OCR program locks up when it gets a bad PDF. I have tried to open one of these PDFs, and sure enough I get an error message that it is a bad file: Adobe Reader says "File is damaged and could not be repaired." I know how many pages are supposed to be in each file, so I wrote a quick script to count pages, but unfortunately it returned the expected page count. So I suspect there is something like a missing EOF marker, or some other structural problem. I don't need to fix the files, just identify them so I can pull them out of the queue to the OCR machine. Any method that will identify a bad file would be greatly appreciated. VBScript is the only language I can deal with, but my son could handle a test written in PHP, Java, JavaScript, or Python.
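To make the "missing EOF marker" idea concrete, here is a minimal Python sketch of the kind of check being asked about. It assumes the damage shows up in the file's basic structure: the `%PDF-` header near the start, the `%%EOF` marker near the end, and a `startxref` entry whose offset lands on a cross-reference table or stream. The function names and the 1024-byte search windows are choices made for this sketch, not part of any existing tool, and a file that passes could still be damaged in ways this does not look at.

```python
import re

def pdf_structure_problems(data: bytes) -> list:
    """Return a list of basic structural problems found in raw PDF bytes.

    An empty list only means these particular checks passed; it is not
    proof that the file is healthy.
    """
    problems = []

    # The header is normally at byte 0; a lenient 1024-byte window is assumed here.
    if b"%PDF-" not in data[:1024]:
        problems.append("missing %PDF- header")

    # The end-of-file marker and startxref entry should sit near the end.
    tail = data[-1024:]
    if b"%%EOF" not in tail:
        problems.append("missing %%EOF marker")

    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        problems.append("missing startxref")
    else:
        offset = int(matches[-1])  # use the last one in the file
        if offset >= len(data):
            problems.append("startxref offset past end of file")
        else:
            target = data[offset:offset + 32].lstrip()
            # Either a classic "xref" table or an "N G obj" cross-reference stream.
            is_table = target.startswith(b"xref")
            is_stream = re.match(rb"\d+\s+\d+\s+obj", target) is not None
            if not (is_table or is_stream):
                problems.append("startxref does not point at xref")

    return problems

def is_suspect_pdf(path: str) -> bool:
    """True if the file at `path` fails any of the basic checks above."""
    with open(path, "rb") as f:
        return bool(pdf_structure_problems(f.read()))
```

A script could call `is_suspect_pdf` on each file in the queue folder and move any file that returns `True` aside before the VBScript hands it to the OCR program.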