Typically the application server is stateless and any persistent state is kept in a database, so you can just spawn another instance on another machine.
Could you give a concrete example from your experience? I ask because in my experience, services have had a relatively small (say, less than a few hundred GB) fixed working memory usage, and the rest scales with utilisation, meaning it would help to spawn additional processes.
In other words, it sounds like you're speaking of a case where all services together consume terabytes of memory irrespective of utilisation, but if you can split it up into multiple heterogeneous services each will only use at most hundreds of GB. Is that correct, and if so, what sort of system would that be? I have trouble visualising it.
Let's imagine Facebook: we can partition the monolith by user, but you would need the entire code base (50+ million lines?) running in each process just in case a user wants to access that functionality. I'm not saying one can't build a billion dollar business using a monolith, but at some point the limit of what a single process can host might become a problem.
Things like Facebook and Google are at a level of scale where they need to do things entirely differently from everyone else though. e.g. for most companies, you'll get better database performance with monotonic keys, so that work is usually hitting the same pages. Once you reach a certain size (which very few companies/products do), you want the opposite, so that none of your nodes gets too hot/becomes a bottleneck. Unless you're at one of the largest companies, many of the things they do are the opposite of what you should be doing.
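To make the hot-partition point concrete, here's a rough sketch (the shard count, key range, and partitioning functions are all made up for illustration): with range-partitioned monotonic keys, a burst of recent inserts all lands on the newest shard, while hashing the key spreads the same burst evenly at the cost of locality:

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count

def shard_for_monotonic(key: int) -> int:
    # Range partitioning of a monotonic (auto-increment) key:
    # consecutive inserts fall in the same range, so the newest
    # shard takes nearly all the write traffic.
    return min(key * NUM_SHARDS // 1_000_000, NUM_SHARDS - 1)

def shard_for_hashed(key: int) -> int:
    # Hash partitioning: consecutive keys scatter evenly across
    # shards, but recent rows no longer share the same pages.
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Simulate a burst of 10,000 new rows with auto-incrementing ids.
recent_ids = range(990_000, 1_000_000)
monotonic_load = Counter(shard_for_monotonic(k) for k in recent_ids)
hashed_load = Counter(shard_for_hashed(k) for k in recent_ids)

print("monotonic:", dict(monotonic_load))  # one shard takes everything
print("hashed:   ", dict(hashed_load))     # roughly uniform spread
```

For a single-node database the monotonic case is the good one (the working set stays in cache); at very large scale it's exactly the hotspot you're trying to avoid.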
I agree that most will be fine with a monolith, I never said anything to the contrary. But let's not pretend that the limits don't exist and don't matter. They matter to my company (and we're not Facebook or Google, but we're far older and probably have more code).
The biggest challenge with monoliths is the limits of a single process and machine.