I've seen shaders cause problems because of branching logic. I'd try disabling them or making them perform a no-op.
I've focused on 2D with Godot, but often with a very high volume of nodes at once.
That led me to change many things in my implementation, and I also ended up patching the engine multiple times.
One example is this proposal, which I have patched into my own Godot build: https://github.com/godotengine/godot-proposals/issues/4050
I profile my game all the time, but sometimes it's not clear whether there is a single bottleneck or just too many things happening at the same time.
Also, the bottleneck can change between different parts of the game. Sometimes I have removed a bottleneck on the CPU only to find another on the GPU.
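
To illustrate that last point, here is a rough sketch in Python (not Godot code). It assumes a simplified model where CPU and GPU work overlap completely, so the slower side sets the frame time. The function names and the millisecond figures are made up for illustration, not measurements from my game.

```python
def frame_time_ms(cpu_ms: float, gpu_ms: float) -> float:
    # Simplifying assumption: CPU and GPU work overlap fully,
    # so whichever side is slower determines the frame time.
    return max(cpu_ms, gpu_ms)


def bottleneck(cpu_ms: float, gpu_ms: float) -> str:
    # Name the side that currently limits the frame time.
    return "cpu" if cpu_ms > gpu_ms else "gpu"


# Hypothetical timings before and after optimizing CPU-side work.
before = (22.0, 18.0)  # (cpu_ms, gpu_ms)
after = (9.0, 18.0)    # CPU cost reduced, GPU cost unchanged

for label, (cpu, gpu) in (("before", before), ("after", after)):
    print(label, frame_time_ms(cpu, gpu), bottleneck(cpu, gpu))
```

In this model the CPU work dropped by 13 ms, but the frame only got 4 ms faster, because the GPU became the limit. That is why it helps to re-profile after every fix instead of assuming the same bottleneck is still there.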