That is exactly why Apple's on-device strategy is the only economically viable one. If every Siri request cost $0.01 for cloud inference, Apple would go bankrupt in a month. But if inference happens on the Neural Engine on the user's phone, the marginal cost to Apple is effectively zero (well, aside from R&D). This solves the problem of unmonetizable requests like "set a timer," which killed Alexa's economics.
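To put rough numbers on that, here is a back-of-the-envelope sketch. The $0.01 per request is the figure from above; the daily request volume is a made-up placeholder, not an Apple number, so swap in whatever estimate you trust. The function name is just for illustration.

```python
def monthly_inference_cost(requests_per_day: float,
                           cost_per_request: float,
                           days: int = 30) -> float:
    """Vendor-side inference bill for one month, in dollars."""
    return requests_per_day * cost_per_request * days


# ASSUMPTION: hypothetical volume, chosen only to show the scaling.
HYPOTHETICAL_REQUESTS_PER_DAY = 1_000_000_000

# Cloud inference at the $0.01 per request figure from the comment.
cloud = monthly_inference_cost(HYPOTHETICAL_REQUESTS_PER_DAY, 0.01)

# On-device inference: the vendor pays nothing per request (R&D excluded).
on_device = monthly_inference_cost(HYPOTHETICAL_REQUESTS_PER_DAY, 0.0)

print(f"cloud:     ${cloud:,.0f} per month")
print(f"on-device: ${on_device:,.0f} per month")
```

The point is the shape, not the exact total: the cloud bill grows linearly with usage, while the on-device bill stays flat no matter how many timers people set.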
The greed to lock customers in early, cheap or free, in hopes of forcing them onto a subscription later absolutely ruined the previous era of assistants. It could have been great with offline inference, and it could have fostered competition. Instead we got mediocre assistants that got worse each year.