Discussion on: How to Schedule Any Task with AWS Lambda


Replies for: Hi Renato, thanks for your interesting article. What if my task needs to be executed at (almost) exactly the time specified? Do you have any advice...

Hi Alessandro, glad you liked the post. Thanks for the comment, that is a very interesting question.

You'll need to implement custom code. I can think of a few ideas, but it would take more investigation to come up with a proper architecture. Could you describe your use case in a bit more detail?

One naive idea would be:

Set up a Lambda to run every minute, triggered by a CloudWatch Rule.
Store tasks in a DynamoDB table, indicating the precise time to execute.
The Lambda queries this table for all tasks scheduled between start = current_timestamp + 30 seconds and end = current_timestamp + 90 seconds (the 30-second start offset accounts for Lambda startup time and needs to be tuned to your setup).
Implement one or more additional Lambdas to process each type of task.
The first Lambda will invoke these executor Lambdas passing the task.
Each executor Lambda could implement a "while" loop that checks whether current_timestamp has reached task_execution_timestamp. Once it has, it executes the task. (Checking for strict equality is risky, since the loop may never observe the exact instant.)
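The steps above can be sketched roughly as follows. This is a minimal Python sketch, not a tested architecture: the names dispatch_window, wait_until, START_OFFSET_S and WINDOW_S are hypothetical, the DynamoDB query and Lambda invocation are left out, and the clock and sleep functions are injectable only so the loop can be tried without waiting in real time.

```python
import time
from typing import Callable, Tuple

# Assumed values taken from the idea above; tune them for your workload.
START_OFFSET_S = 30   # head start to absorb Lambda startup time
WINDOW_S = 60         # the dispatcher runs every minute, so each run covers 60 s


def dispatch_window(now: float) -> Tuple[float, float]:
    """Return the (start, end) range of execution times the dispatcher should query."""
    start = now + START_OFFSET_S
    return start, start + WINDOW_S


def wait_until(
    target: float,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
    poll_s: float = 0.001,
) -> float:
    """Block until the clock reaches `target`; return the time actually observed.

    Uses >= rather than ==, because the loop may never see the exact instant.
    """
    while True:
        now = clock()
        if now >= target:
            return now
        # Sleep briefly instead of spinning, but never past the target.
        sleep(min(poll_s, target - now))
```

An executor would call wait_until(task_execution_timestamp) and then run the task. The difference between the returned value and the target gives a rough idea of how late the task actually started.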

I said it's a naive idea because it ignores some important things:

What does "exactly the time specified" mean to you?

Is it enough to run the task on a given second? Or do you need time resolution down to the millisecond, maybe microsecond?

That will affect the implementation. Some programming languages only resolve time down to milliseconds.
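To make the resolution question concrete, here is a small Python sketch. The helpers truncate_ns and same_slot are hypothetical names; the only library call is time.time_ns(), which returns the current time as an integer number of nanoseconds (the real precision still depends on the platform clock).

```python
import time

SECOND_NS = 1_000_000_000
MILLISECOND_NS = 1_000_000


def truncate_ns(timestamp_ns: int, resolution_ns: int) -> int:
    """Round a nanosecond timestamp down to the given resolution."""
    return timestamp_ns - (timestamp_ns % resolution_ns)


def same_slot(a_ns: int, b_ns: int, resolution_ns: int) -> bool:
    """True if both timestamps fall in the same slot at this resolution."""
    return truncate_ns(a_ns, resolution_ns) == truncate_ns(b_ns, resolution_ns)


# Current time in nanoseconds since the epoch.
now_ns = time.time_ns()
```

Two timestamps one millisecond apart count as "the same time" at second resolution but not at millisecond resolution, so the resolution you choose decides what "on time" means.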

How much deviation can you accept to meet the "almost" requirement?

If you're using AWS Lambda, be aware that you can't control which machine runs your code. It could be multiple machines over a given period of time, and most likely a different machine for every cold start.

This has important implications, since keeping clocks in sync across distributed systems is a known problem.

Depending on how much deviation your "almost exactly" requirement can tolerate, this can be a problem.

Scalability

How many tasks do you expect to schedule and how are they distributed over time?

Is it possible that you'll have 50,000 tasks to run on a given millisecond? If so, the challenge will be setting up infrastructure that can scale to that level of concurrent requests.

Reliability (in general, not only infra-wise)

What happens if the triggering process of a task fails, or if the task executor fails entirely and a block of tasks is not executed at all?

Do you need a system in place to detect that and retry the task, or can you afford to lose some tasks?

Will it be too late if a few seconds have passed before retrying?

Is it a problem if, occasionally, the same task gets executed twice? If yes, a proper locking mechanism needs to be in place to ensure each task is processed once and only once.
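As a rough illustration of such a locking mechanism, here is an in-memory Python sketch. TaskClaims and run_once are hypothetical names, and the dictionary stands in for a shared store. With DynamoDB, the same idea would typically be a conditional write that only succeeds if no claim exists yet for the task, but that part is an assumption and is not shown here.

```python
import threading
from typing import Callable, Dict


class TaskClaims:
    """In-memory stand-in for a shared store that records who claimed each task."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed: Dict[str, str] = {}

    def try_claim(self, task_id: str, worker_id: str) -> bool:
        """Atomically claim a task; return False if someone already claimed it."""
        with self._lock:
            if task_id in self._claimed:
                return False
            self._claimed[task_id] = worker_id
            return True


def run_once(
    claims: TaskClaims, task_id: str, worker_id: str, action: Callable[[], None]
) -> bool:
    """Run `action` only if this worker wins the claim for `task_id`."""
    if not claims.try_claim(task_id, worker_id):
        return False  # another worker already took this task
    action()
    return True
```

Note that claiming before running gives "at most once" behaviour: if the action fails after the claim, the task is lost unless something else detects that and releases or retries it. That ties back to the retry questions above.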