It'd be interesting to hear this done with other (MIDI) voices: drums, violins, whistles, birds, dolphins, a full orchestra. You should be able to take an arbitrary soundtrack and recreate it by mixing a bunch of samples of ... just about anything.
Perhaps it's possible, but it would be harder to match an actual sound to a partial spectral print of another sound. I think here they are cropping out the part of the vocal recording that falls outside the range a piano can play, and then pressing the appropriate piano keys to emulate what is left. A pressed piano key doesn't really produce a simple tone; it produces a sound wave with its own structure that sounds like one exact note, even though it contains a combination of other frequencies that grow and decay in their own pattern over time. My guess is that, to determine which keys to press and when, they treated each key as an exact note. It would be impossible to do this with anything that produces true white noise (many percussion instruments), and hard to do with most natural sounds.
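To make that guess concrete, here is a rough sketch in Python of what "assume the keys are exact notes" could look like. It is only my reading of the idea, not the method actually used: the function names, the frame length, and the `max_keys` limit are all made up for illustration. It takes one short frame of audio, looks at its spectrum, ignores everything that does not sit on the fundamental of one of the 88 piano keys, and returns the keys with the most energy.

```python
import numpy as np

def key_frequency(key):
    """Frequency in Hz of piano key 1..88 (key 49 = A4 = 440 Hz), equal temperament."""
    return 440.0 * 2.0 ** ((key - 49) / 12.0)

def keys_for_frame(frame, sample_rate, max_keys=10):
    """Return the (key, magnitude) pairs with the most energy in one audio frame.

    Assumption: every key is treated as a pure tone at its fundamental,
    ignoring the overtones and decay of a real piano string.
    """
    # Window the frame to reduce spectral leakage, then take the magnitude spectrum.
    window = np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(frame * window))
    bin_width = sample_rate / len(frame)  # Hz per FFT bin

    # Read the spectrum only at the bin nearest each key's fundamental;
    # everything outside the piano's range is simply never looked at.
    energy = {}
    for key in range(1, 89):
        bin_index = int(round(key_frequency(key) / bin_width))
        if bin_index < len(spectrum):
            energy[key] = spectrum[bin_index]

    loudest = sorted(energy, key=energy.get, reverse=True)[:max_keys]
    return [(key, energy[key]) for key in loudest]

# Example: a frame containing A4 (440 Hz) plus a quieter middle C (key 40).
sample_rate = 44100
t = np.arange(8192) / sample_rate
frame = np.sin(2 * np.pi * 440.0 * t) + 0.5 * np.sin(2 * np.pi * key_frequency(40) * t)
top_two = [key for key, _ in keys_for_frame(frame, sample_rate, max_keys=2)]
```

Repeating this for each successive frame would give a "which keys, and when" schedule. Note that with 8192-sample frames at 44.1 kHz each bin is about 5.4 Hz wide, so the lowest few keys land in the same bin; a serious attempt would need longer frames or a different analysis for the bass, and it would still hit the white-noise problem described above.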