We have similar in the UK, especially thanks to Covid restrictions.

What I meant, though, was that the individual farmers, bakers, etc. are protected from spikes and lulls in demand because the supermarket places bulk orders with each supplier and (effectively) caches the products. At the same time, customers can buy the products they need near instantly: there's no need to wait for a cow to grow in order to get milk.

Switching to microservice terminology for a moment, rate limiting at the gateway caps the load each microservice can see, so its load profile is predictable enough to decide whether it should live on an EC2 instance or in a Lambda (etc). There's no point paying for an EC2 instance if you have long periods of inactivity and, equally, if your load is constant, Lambdas probably won't save you much money (in fact, more often than not, they'd be more expensive).
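
For anyone who hasn't seen one, here's a rough sketch of the kind of limiter a gateway might apply per service. It's a token bucket, which is only one of several common approaches, and the names (`TokenBucket`, `rate`, `capacity`, `clock`) are mine for illustration rather than from any particular gateway product.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if the request may pass, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to the time elapsed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The gateway would call `allow()` once per incoming request and reject (or queue) the request when it returns `False`, so the service behind it never sees more than the configured rate plus one burst.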

Rate limits can also be used to help with scalability, e.g. if there are too many customers, the supermarket can open more checkouts to serve more customers in parallel.
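
To put numbers on the checkout analogy, here's a back-of-the-envelope sketch. The function name and parameters are hypothetical, and it assumes every checkout (or service instance) handles customers at the same steady rate, which real systems rarely do.

```python
import math


def checkouts_needed(arrivals_per_min, served_per_checkout_per_min, max_checkouts):
    """Estimate how many checkouts to open for the current arrival rate."""
    # Enough checkouts to keep up with arrivals, rounded up.
    needed = math.ceil(arrivals_per_min / served_per_checkout_per_min)
    # Always keep one open, and never exceed what the shop physically has.
    return max(1, min(needed, max_checkouts))
```

With 12 customers arriving per minute and each checkout serving 5 per minute, this suggests opening 3 checkouts. Once the answer hits `max_checkouts`, the remaining customers have to queue, which is the point where the rate limit kicks in.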

