It’s anecdotal, but Google probably has a ton of speech-to-text data from YouTube captions and such. Their Hangouts captioning was phenomenal. I have full hearing but I still switch it on. It’s really good.
Don't forget that their CAPTCHAs for the visually impaired use YouTube audio, so they're building a very accurate, human-trained closed-captioning library.