There's nothing wrong with starting the paths from the eye; light transport is reciprocal (Helmholtz reciprocity), so tracing in either direction can give equivalent results.
It's been a long time since I studied this, and I've forgotten most of it, but I think the key problem is that you don't know in advance what the ray density at the eye should be. When you start the rays at the eye, you typically emit a fairly even density of rays and see what they do in the scene, but that gives wrong results for scenes where paths emitted from the light sources end up much more concentrated in some parts of the eye/image than in others. Caustics are the classic case: light focused through a glass of water lands far more densely on some pixels than on others, and uniform eye-side sampling has no way of knowing that up front.
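Here's a toy illustration of that density problem (my own construction, not from any paper): trace rays forward from a point light through an ideal thin lens onto a 1-D sensor, and nearly all of the light piles into a few sensor bins, exactly the kind of concentration an even fan of eye rays would have to discover the hard way.

    # Toy: rays traced *forward* from a point light through an ideal thin
    # lens land with very uneven density on a 1-D sensor (a caustic).
    # All geometry and values here are made up for the demo.
    import numpy as np

    f = 1.0          # lens focal length (arbitrary units)
    u = 2.0          # point light sits 2 units in front of the lens
    sensor_z = 1.8   # sensor 1.8 units behind the lens (focus is at 2.0)

    # Fan of rays from the on-axis light to heights y on the lens.
    y_lens = np.linspace(-0.5, 0.5, 100_000)
    slope_in = y_lens / u

    # Ideal thin-lens refraction: slope_out = slope_in - y/f.
    slope_out = slope_in - y_lens / f
    y_sensor = y_lens + slope_out * sensor_z

    # Almost every ray lands in a narrow central band of the sensor.
    hist, _ = np.histogram(y_sensor, bins=50, range=(-0.5, 0.5))
    print("fraction of rays in the busiest 5 of 50 sensor bins:",
          np.sort(hist)[-5:].sum() / hist.sum())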
There are other hacks too: the typical start-at-the-eye technique doesn't bounce the paths around the scene until they hit a light. It just does a couple of bounces, and at each surface intersection it estimates how lit the point is by looking directly at the lights visible from there. You could keep bouncing the rays until they hit a light source, but at that point, why not just run the simulation from the light source, the way the physical world does?
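A minimal sketch of that structure (the scene and all names are my own invention, not any particular renderer's): follow an eye ray for a couple of bounces, and at each hit take the "how lit is it" estimate by casting one ray straight at the light, rather than waiting for the path to hit the light by chance.

    # Start-at-the-eye hack: a few bounces, with a direct-light estimate
    # at every surface hit. Tiny fixed scene: a sphere over a floor plane.
    import math, random

    LIGHT_POS, LIGHT_POWER = (2.0, 4.0, -1.0), 200.0

    def sub(a, b):   return tuple(x - y for x, y in zip(a, b))
    def add(a, b):   return tuple(x + y for x, y in zip(a, b))
    def scale(a, s): return tuple(x * s for x in a)
    def dot(a, b):   return sum(x * y for x, y in zip(a, b))
    def norm(a):     return scale(a, 1.0 / math.sqrt(dot(a, a)))

    def intersect(o, d):
        """Nearest hit; returns (t, normal, albedo) or None."""
        best = None
        oc = sub(o, (0.0, 0.0, -3.0))          # unit sphere at (0,0,-3)
        b, c = dot(oc, d), dot(oc, oc) - 1.0
        disc = b * b - c
        if disc > 0:
            t = -b - math.sqrt(disc)
            if t > 1e-4:
                n = norm(sub(add(o, scale(d, t)), (0.0, 0.0, -3.0)))
                best = (t, n, 0.7)
        if d[1] < -1e-9:                       # floor plane y = -1
            t = (-1.0 - o[1]) / d[1]
            if t > 1e-4 and (best is None or t < best[0]):
                best = (t, (0.0, 1.0, 0.0), 0.5)
        return best

    def direct_light(p, n):
        """The 'how lit is it' estimate: one ray straight at the light."""
        to_l = sub(LIGHT_POS, p)
        d2 = dot(to_l, to_l)
        wi = norm(to_l)
        occ = intersect(p, wi)                 # shadow test
        if occ and occ[0] * occ[0] < d2:
            return 0.0
        return LIGHT_POWER * max(dot(n, wi), 0.0) / (4.0 * math.pi * d2)

    def rand_hemisphere(n):
        """Uniform random direction in the hemisphere about n (pdf 1/2pi)."""
        while True:
            v = tuple(random.uniform(-1, 1) for _ in range(3))
            if 0 < dot(v, v) <= 1.0:
                v = norm(v)
                return v if dot(v, n) > 0 else scale(v, -1.0)

    def radiance(o, d, bounces=2):
        hit = intersect(o, d)
        if hit is None:
            return 0.0
        t, n, albedo = hit
        p = add(o, scale(d, t))
        brdf = albedo / math.pi                # Lambertian
        L = brdf * direct_light(p, n)          # direct term at every vertex
        if bounces > 0:                        # plus a couple of bounces
            wi = rand_hemisphere(n)
            # Monte Carlo weight: brdf * cos(theta) / pdf, pdf = 1/(2pi).
            L += brdf * dot(n, wi) * 2.0 * math.pi * radiance(p, wi, bounces - 1)
        return L

    # One eye ray through the "pixel" straight ahead:
    random.seed(0)
    samples = [radiance((0, 0, 0), norm((0.0, -0.2, -1.0))) for _ in range(256)]
    print("pixel value ~", sum(samples) / len(samples))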
What you really want to do at a surface intersection, when you're coming from the eye, is an integration over all the possible directions light could arrive from and scatter into the path you took back to the eye. For specular highlights and perfect reflections this is feasible, because only one or a few incident directions matter, but for diffuse and glossy surfaces an infeasibly large number of incident paths could contribute light along the direction you came from.
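Written out, that's the hemisphere integral in the standard rendering equation (the usual Kajiya form; the notation here is the textbook one, not from the comment above):

    L_o(x, \omega_o) = L_e(x, \omega_o)
        + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\,
          (\omega_i \cdot n)\, d\omega_i

For a mirror, f_r is a delta function and the integral collapses to a single incident direction; for a diffuse surface, the L_i under the integral is itself defined by the same equation at other points, which is exactly what makes the recursion blow up.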
There's even been research into swapping a lightbulb + camera setup for a scanning light source + a photodiode, exploiting exactly this reciprocity: http://graphics.stanford.edu/papers/dual_photography/ (Note there's a video at the bottom of the page that walks through the various techniques.)
The hack is not inverse ray casting. Things become hacky when the bouncing ray's bounce angle, amplitude, etc. are determined by a precomputed mapping instead of the real geometry, black-body emission, and so on.
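A small sketch of that distinction (my own toy example, nothing standard): the "real" version derives the bounce from the actual surface normal, while the "hacky" version just looks the answer up in a table baked in advance, never consulting the geometry at bounce time.

    # Toy contrast: bounce derived from real geometry vs. a baked mapping.
    import math

    def reflect_from_geometry(d, n):
        """Mirror bounce from the actual surface normal: r = d - 2(d.n)n."""
        k = 2.0 * sum(di * ni for di, ni in zip(d, n))
        return tuple(di - k * ni for di, ni in zip(d, n))

    # "Hacky" version: a coarse table from incident angle to outgoing
    # intensity, precomputed -- the values here are made up.
    BAKED = [1.0, 0.8, 0.5, 0.2, 0.05]   # 5 angle buckets over [0, 90deg]

    def intensity_from_mapping(cos_theta):
        theta = math.acos(max(-1.0, min(1.0, cos_theta)))
        return BAKED[min(int(theta / (math.pi / 2) * 5), 4)]

    d, n = (0.707, -0.707, 0.0), (0.0, 1.0, 0.0)
    print(reflect_from_geometry(d, n))     # (0.707, 0.707, 0.0)
    print(intensity_from_mapping(0.707))   # whatever the table says: 0.5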