I remember a call with a customer from my first few months on the Azure Functions team. The customer’s function triggered from a queue message and...
Hi Jeff,
This sounds great for many of our common problems, and I can see it almost (but not quite) working for our largest problem. Think of the SQL Database in your example above and imagine it is an Azure SQL Database with a set number of DTUs assigned. You then have a queue as in your example. We would want to somehow throttle the message queue so that it consumes as many DTUs as possible without causing a flood of throttling errors (HTTP status 429s).
Can you see a nice way to use this pattern to throttle rather than circuit break?
It really feels like there's an elegant way to use this to achieve this.
This is effectively the situation where the circuit closes again and all the work that queued up while the circuit was open immediately overwhelms the database.
Bryden
Thanks Bryden - I think what you've described above falls more into rate limiting and throttling than circuit breaking. It's a feature we've wanted to implement at a more granular level. Hopefully in the upcoming months we will at least have instance-count throttling in all plans (right now it's just a premium plan feature).
Jeff,
While that will be useful, instance throttling is very much a blunt instrument. We are currently looking at a resource-based consumption throttler: monitoring the use of each resource that has some sort of rate limit associated with it, and then building something that handles the load accordingly. I'll sit down and thrash out in detail whether we can get something working that would handle this nicely.
In particular the reason we are looking at this is that we are effectively partitioning our data across multiple rate limited pieces of storage, so in theory we'd like to avoid the situation where we are limiting based on our lowest throughput piece of storage.
Rather, we'd like to circuit break those calls quickly, return a 429 or similar to the caller, and continue to consume all of our available throughput on everything else.
At worst, the durable entities sound like potentially better storage for this than the Redis cache we are using now.
I guess what I'm saying is that I suspect for most consumers, instance level throttling potentially won't cut it (also because a single instance is quite capable of completely overwhelming a downstream resource all on its own). So investigating a more granular level of throttling would be well worthwhile. For now, we are quite happy to continue investigating, but if you guys had some clever thoughts that might improve our direction that would be great.
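To make the per-resource idea above concrete, here is a minimal sketch, assuming one token bucket per rate-limited store. The names `TokenBucket` and `ResourceThrottler` are hypothetical, and the state lives in process memory only; sharing it across instances would need an external store such as the Redis cache or durable entities mentioned above.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows roughly `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Refill based on elapsed time, capped at the bucket capacity.
        now = self._clock()
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class ResourceThrottler:
    """Keeps one bucket per resource so a slow store does not limit the others."""

    def __init__(self, buckets: Dict[str, TokenBucket]):
        self._buckets = buckets

    def call(self, resource: str, operation: Callable[[], object]):
        # Fail fast with a 429-style result instead of waiting on the resource.
        if not self._buckets[resource].try_acquire():
            return 429, None
        return 200, operation()
```

With this shape, a caller hitting an exhausted partition gets `(429, None)` straight away, while calls to other partitions keep drawing from their own buckets.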
Yes, makes sense. I would be interested to learn more about what you are thinking. Beyond the "blunt force" instance limiting, we have been evaluating execution limiting, but what you are describing sounds even more granular than that. Almost something like "I have 400 locks for SQL, 2000 locks for Azure Storage -- hey functions, do your thing, but before you can run this line of code you need to make sure you have a lock first." Is that accurate?
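The "locks per resource" idea in the comment above could look something like the following within a single process. This is only a sketch using Python's standard `threading.BoundedSemaphore`; the `ResourceLocks` class and its `lease` method are hypothetical names, and a real multi-instance setup would need the lock counts held somewhere shared.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Optional


class ResourceLocks:
    """Holds a fixed number of locks per downstream resource."""

    def __init__(self, limits: Dict[str, int]):
        self._semaphores = {
            name: threading.BoundedSemaphore(count)
            for name, count in limits.items()
        }

    @contextmanager
    def lease(self, resource: str, timeout: Optional[float] = None):
        semaphore = self._semaphores[resource]
        # Wait up to `timeout` seconds for a free lock (None waits forever).
        if not semaphore.acquire(timeout=timeout):
            raise TimeoutError(f"no free lock for {resource!r}")
        try:
            yield
        finally:
            semaphore.release()


# Example limits taken from the comment above.
locks = ResourceLocks({"sql": 400, "storage": 2000})

with locks.lease("sql", timeout=1.0):
    pass  # run the SQL call here while holding one of the 400 locks
```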
How would you handle rate limiting/throttling from a downstream API inside your Azure Function? You can retry, but what if the retrying takes longer than the Azure Functions default timeout? Some downstream APIs provide a Retry-After time; what if it exceeds the 5-minute default timeout?
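One possible way to reason about this, sketched below, is to compare the Retry-After value with the time the current execution has left and, when the wait will not fit, hand the message back to a queue with a delay instead of sleeping in-process. The function `plan_retry` and its parameters are hypothetical, the 300-second default simply mirrors the 5-minute timeout mentioned in the question, and the actual re-enqueue step depends on what your queue supports.

```python
from typing import Tuple


def plan_retry(retry_after_seconds: float,
               elapsed_seconds: float,
               function_timeout_seconds: float = 300.0,
               safety_margin_seconds: float = 30.0) -> Tuple[str, float]:
    """Decide whether to wait in-process or re-enqueue the work for later.

    Returns ("wait", delay) when the delay fits in the remaining time budget,
    otherwise ("requeue", delay) so the caller can schedule the message again.
    """
    remaining = function_timeout_seconds - elapsed_seconds - safety_margin_seconds
    if retry_after_seconds <= remaining:
        return ("wait", retry_after_seconds)
    return ("requeue", retry_after_seconds)


# 100 s already used, API asks for a 10 s pause: waiting is safe.
print(plan_retry(retry_after_seconds=10, elapsed_seconds=100))
# API asks for 400 s, which can never fit in a 300 s execution.
print(plan_retry(retry_after_seconds=400, elapsed_seconds=0))
```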
Hi Bryden
"We would want to somehow throttle the message queue to consume as many DTUs as possible"
Isn't this just a matter of limiting the number of items in the batch read from the queue, based on load tests in a perf environment? You could even make the number configurable and set it externally from the scaling logic on your Azure SQL db.
Great article Jeff.
To break the circuit without stopping the entire function app, you could also just disable the specific function with the queue binding, or update an environment variable on the App Service that the code reads as a feature flag/toggle, since it's cheap to read the current state of the circuit (open/closed).
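The environment-variable toggle suggested above might be read like this. It is a rough sketch: the setting name `CIRCUIT_OPEN` and the `handle_message` helper are made up for illustration, and what to do with a message while the circuit is open (skip, raise, or requeue) is left as a decision for the application.

```python
import os
from typing import Callable, Mapping


def circuit_is_open(env: Mapping[str, str] = os.environ) -> bool:
    """Treats the hypothetical CIRCUIT_OPEN setting as a simple feature flag."""
    return env.get("CIRCUIT_OPEN", "false").strip().lower() in {"1", "true", "yes"}


def handle_message(message: str,
                   process: Callable[[str], object],
                   env: Mapping[str, str] = os.environ):
    # Check the flag before touching the downstream resource.
    if circuit_is_open(env):
        return "skipped"
    return process(message)
```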
Great point
Hi Jeff,
How can I use Cosmos or Redis as my storage in Durable Functions?
Thanks!
Djalma Freitas
Hi Jeff!
Thanks for the great post! Maybe a stupid question from me: as far as I can see, the circuit breaker stops the function, but how can I start the function again automatically?