parkerjfil
03-05-2010, 07:03 AM
I am going to digitize my book collection and have some questions about the process: clip the binding, scan the pages, and use OCR to convert the raster images to text.
1. Is it really necessary to destroy the book in order to get a scan that the OCR can convert reliably? I realize that depends on the binding, scanner, print size/font, etc. But generally speaking, are OCR programs good readers? As much as I appreciate the transportation paradox, I would rather like to give the physical book away after digitizing it...
2. Many of my books are rich in graphic content. Any suggestions on how to minimize file size while maintaining good resolution of graphic elements? I guess I am wondering whether OCR programs handle text, images, and white space correctly, and whether they are all created equal.
3. OK, lastly, many of my books are loaded with annotations, doodles, notes in the margin, etc. I would seriously pull a face like this :w00t: if this content can be parsed.
Thank you very much for your help,