Why not simply upload the PDF version of the scanned book or document? Extracting the text from a scanned document via the GCP Document AI API sounds like an unnecessary use of resources.
I was running into context window issues when uploading the PDFs directly. I could have split the scanned books into chapters or something to get around this, and I did that for a couple of subjects. But getting the plain-text extract wasn't much work (and literally cost me pennies, about six of them), and the text is pretty easy to work with now. (Besides, which random dev doesn't love a little side challenge to explore new APIs at home every now and then? ;) )
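
To illustrate why plain text is easier to handle than a scanned PDF here, this is a minimal sketch of splitting extracted text into pieces that fit under a size budget. The function name `chunk_text` and the 8,000-character default are assumptions made up for this example, not the script actually used, and a real context limit is counted in tokens rather than characters.

```python
def chunk_text(text: str, max_chars: int = 8000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # A single paragraph over the budget is hard-split into fixed-size slices.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        # Otherwise keep packing paragraphs into the current chunk while they fit.
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent on its own, which is roughly what splitting the book into chapters by hand would achieve.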