I'm curious whether, if trained on images containing mostly English text rather than Japanese, it would produce mostly recognizable Roman characters (though not necessarily words), or whether the characters would instead be largely unrecognizable.
More reference images with what looks convincingly (to a nonspeaker's eyes) like text: