The simplest of these would be to turn the images into a multi-page raster PDF, using freely licensed Linux-based command line tools for PDF generation. That will of course result in a rather large file compared with doing OCR, but it might be the best preservation method for books with illustrations, unusual fonts, catalogs, mixed text and photos, etc.
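As a rough sketch of the image-to-PDF step, something like the following could work. It assumes the img2pdf command line tool is installed (one freely licensed option; it was not named above and is only my pick for illustration), that each page is a separate image file in one directory, and that the file names sort into page order. The function names and the directory layout are hypothetical.

```python
import re
import shutil
import subprocess
from pathlib import Path

# Assumption: page captures use one of these common raster extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def natural_key(path):
    # Split digits from text so that "page10" sorts after "page2".
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", path.name)]


def build_img2pdf_command(image_dir, output_pdf):
    # Collect the page images in natural (page number) order.
    pages = sorted(
        (p for p in Path(image_dir).iterdir()
         if p.suffix.lower() in IMAGE_SUFFIXES),
        key=natural_key,
    )
    if not pages:
        raise ValueError(f"no page images found in {image_dir}")
    # img2pdf takes the input images as arguments and -o for the output file.
    return ["img2pdf", *map(str, pages), "-o", str(output_pdf)]


def make_pdf(image_dir, output_pdf):
    # Fail early with a readable message if the tool is missing.
    if shutil.which("img2pdf") is None:
        raise RuntimeError("img2pdf is not installed or not on PATH")
    subprocess.run(build_img2pdf_command(image_dir, output_pdf), check=True)
```

Calling `make_pdf("scans/some_book", "some_book.pdf")` would then produce one PDF with one page per image. No OCR or de-skew happens here, so the output is only as good as the camera images going in.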
I am not clear to what extent the existing workflow de-skews the camera images to deal with page curvature towards the spine.
I think I recall the Internet Archive having an open source design for something similar to this, and there are other projects which accomplish generally the same idea.