I think the problem here is that you assume the LLM has to operate in isolation from the world, i.e. without interaction. If you put a human scientist in isolation, you cannot have high expectations either.
I don't assume that the LLM would be isolated; I assume that the LLM would be incapable of interacting in any meaningful way on its own (i.e. unless triggered by direct input from a programmer).