DEV Community

Marwa Talaat for AWS Community Builders


Why do we love AWS Step Functions vs AWS Lambda in AI/ML?

Wondering whether you need AWS Step Functions in your application? This guide covers what AWS Step Functions is, how it works, its benefits, and its strengths and weaknesses compared with AWS Lambda in different situations.

What are AWS Step Functions?

AWS Step Functions is a low-code visual workflow application that allows you to create, deploy, and automate business processes as well as data and machine learning pipelines using AWS services. Workflows handle exceptions, retries, parallelization, service integration, and monitoring, allowing developers to concentrate on higher-value business logic.
In other words, Step Functions is an orchestrator that helps you design and implement complex workflows such as batch processing. It coordinates the multiple tasks involved, making it simple to build multi-step systems.

With AWS Step Functions managing and shaping these interactions, you can build complex, interactive systems that use all the features mentioned above, with full orchestration and clear visibility into each step. Let's look at the details.

AWS Step Functions Benefits

  1. Build and deploy rapidly:
    Workflow Studio offers a straightforward drag-and-drop interface that makes it quick to get started. Step Functions lets you connect services, systems, or people using low-code, event-driven workflows that describe complex business logic.

  2. Write less integration code:
    Build robust business workflows, data pipelines, or apps using AWS resources from more than 200 services, such as Lambda, ECS, Fargate, Batch, DynamoDB, SNS, SQS, SageMaker, EventBridge, or EMR.

  3. Build fault-tolerant and stateful workflows:
    Step Functions manages state, checkpoints, and restarts so that your workflows proceed as planned. Built-in try/catch, retry, and rollback capabilities handle errors and exceptions automatically, based on your predefined business logic.

  4. Designed for reliability and scale:
    Depending on your use case, you can choose between the Standard and Express workflow types that Step Functions offers. Standard Workflows handle long-running workloads, while Express Workflows support high-volume event processing.

  5. Parallelism:
    Parallelism is declarative: a state in a state machine can start several branches at the same time, so the workflow finishes sooner.

  6. Long execution time:
    If some tasks in the workflow take a long time (longer than Lambda's 15-minute limit), they can run on ECS, EC2, or as an Activity hosted outside of AWS, because a Step Functions Standard workflow can run for up to one year.
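
To make the fault-tolerance benefit (item 3) more concrete, here is a minimal sketch of a Task state with Retry and Catch fields, built as a Python dictionary and serialized to Amazon States Language JSON. The function ARN, the state names, and the retry numbers are hypothetical placeholders chosen for illustration, not values from a real deployment.

```python
import json


def build_task_with_retry(function_arn: str) -> dict:
    """Return a state machine definition with one retried Task state."""
    return {
        "StartAt": "ProcessRecord",
        "States": {
            "ProcessRecord": {
                "Type": "Task",
                "Resource": function_arn,
                # Retry failed attempts with exponential backoff: 2s, 4s, 8s.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 2,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                # Once retries are exhausted, route any error to a fallback state.
                "Catch": [
                    {"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}
                ],
                "End": True,
            },
            "HandleFailure": {
                "Type": "Fail",
                "Error": "ProcessRecordFailed",
                "Cause": "Task failed after all retries",
            },
        },
    }


# Hypothetical ARN, used only for illustration.
definition = build_task_with_retry(
    "arn:aws:lambda:us-east-2:123456789012:function:process-record"
)
print(json.dumps(definition, indent=2))
```

The retry and error-routing rules live in the workflow definition, so the function behind the ARN can stay free of retry loops.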

How are AWS Step Functions built?

AWS Step Functions consist of the following main components:

  • State Machine

AWS Step Functions models an application workflow as a state machine. Developers define state machines in JSON using the Amazon States Language (ASL).

Processes that take a while to complete or require human involvement can be defined as Standard workflows, while Express workflows are ideal for quick, high-volume processes that complete in under five minutes.

A state machine handles data in three main forms: the input to the initial state, the data passed between states, and the output of the final state.
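
The three forms of data can be illustrated with a toy interpreter. This is a sketch that models only a tiny subset of ASL (Pass and Succeed states, with `ResultPath` limited to `$` or a single top-level key); it is not how the Step Functions service is implemented, and the state names and payload are made up for the example.

```python
def apply_pass(state: dict, data):
    """Compute the output of a Pass state for the given input."""
    if "Result" not in state:
        return data  # no Result: the input is forwarded unchanged
    result_path = state.get("ResultPath", "$")
    if result_path == "$":
        return state["Result"]  # the result replaces the whole input
    if result_path.startswith("$.") and "." not in result_path[2:]:
        merged = dict(data)
        merged[result_path[2:]] = state["Result"]  # result added under one key
        return merged
    raise ValueError(f"unsupported ResultPath in this sketch: {result_path}")


def run(definition: dict, execution_input):
    """Walk the states from StartAt and return the final output."""
    data = execution_input
    name = definition["StartAt"]
    while True:
        state = definition["States"][name]
        if state["Type"] == "Succeed":
            return data
        if state["Type"] != "Pass":
            raise ValueError(f"unsupported state type: {state['Type']}")
        data = apply_pass(state, data)
        if state.get("End"):
            return data
        name = state["Next"]


definition = {
    "StartAt": "Forward",
    "States": {
        "Forward": {"Type": "Pass", "Next": "Inject"},
        "Inject": {
            "Type": "Pass",
            "Result": "ready",
            "ResultPath": "$.status",
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}

final_output = run(definition, {"order_id": 42})
print(final_output)
```

The execution input enters at `Forward`, is passed along and enriched by `Inject`, and leaves as the output of `Done`.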

For more information, see State Machine Data

  • State

Individual states can act, make decisions based on their input, and pass output to other states. States are the elements of your state machine. Each state is identified by its name, which can be any string but must be unique within the state machine.

A state represents a step in your workflow. States can perform a variety of functions:

• Do some work in your state machine (a Task state)
• Choose between branches of execution (a Choice state)
• Stop execution with a failure or success (a Fail or Succeed state)
• Simply pass its input to its output or inject some fixed data (a Pass state)
• Provide a delay for a certain amount of time or until a specified time/date (a Wait state)
• Begin parallel branches of execution (a Parallel state)
• Dynamically iterate steps (a Map state)

Example of a Task state definition in ASL:

"States": {
  "FirstState": {
    "Type": "Task",
    "Resource": "$",
    "Next": "My Next state"
  }
}


The states that you decide to include in your state machine and the relationships between your states form the core of your Step Functions workflow.
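
Because the relationships between states are just name references, a typo in a target name breaks the workflow. The sketch below checks that every transition in a definition points to a state that exists. The helper name `find_broken_transitions` and the sample states are hypothetical, and the check covers only top-level states (it does not descend into Parallel branches or Map iterators).

```python
def find_broken_transitions(definition: dict) -> list[str]:
    """List transitions whose target state is not defined."""
    states = definition["States"]
    problems = []
    start = definition.get("StartAt")
    if start not in states:
        problems.append(f"StartAt -> {start} (unknown state)")
    for name, state in states.items():
        targets = []
        if "Next" in state:
            targets.append(state["Next"])
        if "Default" in state:  # fallback branch of a Choice state
            targets.append(state["Default"])
        for rule in state.get("Choices", []):
            targets.append(rule["Next"])
        for catcher in state.get("Catch", []):
            targets.append(catcher["Next"])
        for target in targets:
            if target not in states:
                problems.append(f"{name} -> {target} (unknown state)")
    return problems


# A Choice state whose Default branch points at a state that was never defined.
definition = {
    "StartAt": "CheckScore",
    "States": {
        "CheckScore": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.score", "NumericGreaterThan": 0.8, "Next": "Deploy"}
            ],
            "Default": "Retrain",
        },
        "Deploy": {"Type": "Succeed"},
    },
}

problems = find_broken_transitions(definition)
print(problems)
```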

For more information, see Concepts States

  • Tasks state and Activities

Tasks: A Task (also called a task state) is a single unit of work in the state machine. A task can invoke a Lambda function or call the API of another service.

Activities: An Activity lets you connect your state machine to code running elsewhere, known as an activity worker, which performs the task.
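
An activity worker is essentially a polling loop. The sketch below shows one iteration, written against a client object passed in as a parameter. With the AWS SDK for Python you would pass `boto3.client("stepfunctions")`, whose `get_activity_task`, `send_task_success`, and `send_task_failure` methods take the keyword arguments used here. The `FakeClient` class, the activity ARN, the worker name, and the handler are stand-ins so the example runs locally without AWS access.

```python
import json


def poll_once(client, activity_arn, handler, worker_name="example-worker"):
    """Fetch at most one activity task, run handler on it, report the result.

    Returns True if a task was processed, False if the poll came back empty.
    """
    task = client.get_activity_task(activityArn=activity_arn, workerName=worker_name)
    token = task.get("taskToken")
    if not token:
        return False  # the long poll ended with no work available
    try:
        result = handler(json.loads(task["input"]))
    except Exception as exc:
        client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    else:
        client.send_task_success(taskToken=token, output=json.dumps(result))
    return True


class FakeClient:
    """Local stand-in for the Step Functions client, for a dry run only."""

    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.calls = []

    def get_activity_task(self, activityArn, workerName):
        return self.tasks.pop(0) if self.tasks else {}

    def send_task_success(self, taskToken, output):
        self.calls.append(("success", taskToken, output))

    def send_task_failure(self, taskToken, error, cause):
        self.calls.append(("failure", taskToken, error, cause))


client = FakeClient([{"taskToken": "token-1", "input": json.dumps({"value": 20})}])
arn = "arn:aws:states:us-east-2:123456789012:activity:example"  # hypothetical
processed = poll_once(client, arn, lambda payload: {"doubled": payload["value"] * 2})
print(processed, client.calls)
```

A real worker would call `poll_once` in a loop and could run anywhere that can reach the Step Functions API.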

You can view and verify your state machine as a series of steps in the AWS console. Step Functions records the execution time, input, output, number of retries, and errors for each step as it runs. With this information, engineering teams can quickly identify which step or steps caused a workflow to fail.

Drawbacks of Step Functions

  • Vendor lock-in: Step Functions is available only on AWS. If you decide to migrate your application to a different cloud vendor, you will need to remodel your application or replace Step Functions with an alternative service from the new vendor.
  • Complex syntax: Separating workflow logic from business logic can make your application harder to understand for teammates who need to edit or upgrade it. The Amazon States Language used to configure Step Functions is complex. Its syntax is based on JSON, so it is better suited to machine readability than to human readers. It can be difficult to learn and is used only by AWS Step Functions, as it is an AWS proprietary language.
  • Execution history logs are kept for a maximum of 90 days.
  • Some event sources, such as DynamoDB and Kinesis, have no direct trigger for Step Functions.
  • Each execution name for a state machine must be unique and must not have been used in the last 90 days.

Step Functions limitations

  • An execution's history can hold at most 25,000 events, so you may need to split a large workflow into multiple workflows to stay under the limit.
  • A request made to Step Functions must not exceed 1 MB.
  • Each Step Functions resource can have at most 50 tags.
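
One way to stay under the 25,000 limit listed above is to split the work items across several executions. The sketch below is a rough planning aid, not an AWS feature: `events_per_item` and `overhead_events` are assumptions you would have to measure from the execution history of your own workflow.

```python
MAX_HISTORY_EVENTS = 25_000  # the per-execution limit listed above


def split_into_executions(items, events_per_item, overhead_events=10):
    """Group items into batches, one batch per workflow execution.

    events_per_item: assumed number of history events each item generates.
    overhead_events: assumed fixed number of events per execution.
    """
    budget = MAX_HISTORY_EVENTS - overhead_events
    per_execution = budget // events_per_item
    if per_execution < 1:
        raise ValueError("a single item would exceed the history limit")
    return [
        items[start:start + per_execution]
        for start in range(0, len(items), per_execution)
    ]


# 10,000 items at an assumed 7 events each do not fit into one execution.
batches = split_into_executions(list(range(10_000)), events_per_item=7)
print([len(batch) for batch in batches])
```

Each batch can then be passed as the input to its own execution.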

AWS Step Functions vs AWS Lambda: what are the differences?

Overall comparison

AWS Step Functions:
  • Builds distributed applications using visual workflows.
  • Simplifies coordinating the components of distributed applications and microservices through visual workflows.
  • Lets you expand and update apps efficiently by building them from separate components that each perform a distinct function.
  • Category: Cloud Task Management
  • Integration with other services
  • Execution unit: state machine and tasks
  • Supported runtimes: Java, .NET, Ruby, PHP, Python (Boto 3), JavaScript, Go, C++
  • Free tier: 4,000 state transitions per month
  • Beyond the free tier: priced per 1,000 state transitions, with the rate depending on region

AWS Lambda:
  • Runs code automatically in response to object modifications.
  • Lets you create back-end services that use AWS scalability, performance, and security while extending existing AWS services with custom logic.
  • Category: Serverless
  • No infrastructure
  • Execution unit: Lambda function
  • Supported runtimes: Node.js 12, 14, 16; Python 3.6, 3.7, 3.8; Java 8, 11; .NET 5, 6, Core 3.1; Go 1.x; Ruby 2.7
  • Free tier: 400,000 GB-seconds of compute time per month
  • Beyond the free tier: $0.00001667 per GB-second

Check the AWS Pricing Calculator to estimate the cost of your AWS services and architecture.

Cost factors comparison

Prices are based on the US East (Ohio) Region.

| Cost factor | Lambda | Step Functions (Standard workflows) |
|---|---|---|
| Invocations | $0.20 per 1M requests | N/A |
| Compute (GB-seconds) | $0.0000166667 per GB-second | N/A |
| State transitions | N/A | $0.025 per 1,000 state transitions |

Note that Standard workflows are priced differently from Lambda: the charge is based primarily on the number of state transitions.
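
As a worked example of the two pricing models, the sketch below plugs the prices quoted above into simple formulas. The workload numbers (invocation counts, duration, memory, transitions per execution) are invented for illustration, and the free tier is ignored to keep the arithmetic simple.

```python
# Prices quoted above for the US East (Ohio) Region.
LAMBDA_PRICE_PER_REQUEST = 0.20 / 1_000_000
LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667
SFN_PRICE_PER_TRANSITION = 0.025 / 1_000


def lambda_cost(invocations, avg_duration_s, memory_gb):
    """Request charge plus compute charge (GB-seconds), free tier ignored."""
    gb_seconds = invocations * avg_duration_s * memory_gb
    return (
        invocations * LAMBDA_PRICE_PER_REQUEST
        + gb_seconds * LAMBDA_PRICE_PER_GB_SECOND
    )


def standard_workflow_cost(executions, transitions_per_execution):
    """Standard workflows are billed per state transition, free tier ignored."""
    return executions * transitions_per_execution * SFN_PRICE_PER_TRANSITION


# Hypothetical monthly workload.
lambda_total = lambda_cost(invocations=1_000_000, avg_duration_s=0.5, memory_gb=0.5)
workflow_total = standard_workflow_cost(executions=100_000, transitions_per_execution=5)
print(f"Lambda: ${lambda_total:.2f}")
print(f"Step Functions (Standard): ${workflow_total:.2f}")
```

Lambda cost grows with duration and memory, while Standard workflow cost grows with the number of states each execution passes through. When a workflow invokes Lambda functions, both charges apply.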

Flexibility Comparison

Lambda is not well known for its flexibility or its ability to perform complex, long-running operations. For instance, a Lambda function times out after 3 seconds by default, and the timeout can be raised only to a maximum of 15 minutes. This may not be practical for large, complex scripts.

For example, suppose you build a simple app on Lambda and expect common behaviors such as retrying a connection until a service becomes available before moving to the next step, or running operations in parallel. You may also want to rerun code that failed. Unfortunately, Lambda does not provide these capabilities by default.
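
To give a sense of the code you would otherwise write by hand inside a function, here is a minimal sketch of a retry loop with exponential backoff. The names `call_with_retries` and `flaky_service`, and the default delays, are hypothetical. The `sleep` parameter exists only so the example can run without actually waiting.

```python
import time


def call_with_retries(operation, max_attempts=3, base_delay_s=1.0,
                      backoff_rate=2.0, sleep=time.sleep):
    """Call operation, retrying on ConnectionError with growing delays."""
    delay = base_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(delay)
            delay *= backoff_rate


attempts = {"count": 0}


def flaky_service():
    """Stand-in for a dependency that fails twice before becoming available."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("service not available yet")
    return "connected"


delays = []  # record the requested delays instead of sleeping
result = call_with_retries(flaky_service, sleep=delays.append)
print(result, delays)
```

Every function that needs this behavior has to carry a copy of such a loop, and the time spent waiting counts toward the function's own timeout.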

This should not discourage you from using Lambda; AWS Step Functions can fill these gaps.

By using AWS Step Functions to manage and shape these interactions, you can develop complex, interactive systems that use all the elements just mentioned and more, with complete orchestration and clear visibility.

Development & Integrations Comparison

To perform these longer activities, you can add containers using a service like Amazon Elastic Container Service (ECS), but what happens when the containers are also working on other tasks? You need to keep track of state.

The issue is that AWS Lambda does not handle state effectively. Developers must add state-management code to their systems, which complicates maintenance and extends processing times. As a result, users are forced to choose between running resource-intensive apps and ensuring there is enough state capacity for peak usage.

Fortunately, AWS Step Functions makes these obstacles easy to overcome. What sets Step Functions apart is that less code is needed to get tasks done. Engineers can build standardized procedures for handling errors, retries, and parallelization, freeing developers for higher-value tasks. By coordinating multiple services, they can eliminate complicated state handling and streamline laborious application development.

For more information on integrating both services, see Using AWS Lambda with other services and Using AWS Step Functions with other services.

Benchmarking Comparison

If you are interested in benchmarking AWS Lambda and AWS Step functions, you could check the following blog posts:

Conclusion

To sum up, developers can quickly add, move, switch, and reorganize Lambda functions using AWS Step Functions' visual workflow interface without having to alter the business logic that is already in place. This abstraction makes it relatively simple to increase application performance without writing additional code.

Additionally, AWS Step Functions integrates easily with other AWS services. Developers can configure a task to run concurrently with other activities, wait for external processes, or wait for other internal tasks to complete. With Amazon SageMaker, AWS Batch, serverless ETL with AWS Glue, and many more tools, users can integrate machine learning into their applications.
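
As a rough illustration of such an integration, the sketch below builds a two-step ML pipeline definition: a Glue ETL job followed by a SageMaker training job. The `.sync` suffix on each resource ARN asks Step Functions to wait for the job to finish before moving on. The job name, role ARN, image URI, bucket, and instance settings are hypothetical placeholders, and a real training job would normally also need input data channels and hyperparameters, which are left out here.

```python
import json


def build_ml_pipeline(glue_job_name, role_arn, image_uri, output_s3):
    """Return a definition that runs a Glue ETL job, then SageMaker training."""
    return {
        "Comment": "ETL followed by model training (illustrative sketch)",
        "StartAt": "PrepareData",
        "States": {
            "PrepareData": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                # Keep the original input and store the Glue result under $.etl.
                "ResultPath": "$.etl",
                "Next": "TrainModel",
            },
            "TrainModel": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {
                    # The ".$" suffix reads the value from the state input.
                    "TrainingJobName.$": "$.training_job_name",
                    "RoleArn": role_arn,
                    "AlgorithmSpecification": {
                        "TrainingImage": image_uri,
                        "TrainingInputMode": "File",
                    },
                    "OutputDataConfig": {"S3OutputPath": output_s3},
                    "ResourceConfig": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.m5.large",
                        "VolumeSizeInGB": 10,
                    },
                    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
                },
                "End": True,
            },
        },
    }


# All values below are hypothetical placeholders.
pipeline = build_ml_pipeline(
    glue_job_name="example-etl-job",
    role_arn="arn:aws:iam::123456789012:role/example-sagemaker-role",
    image_uri="123456789012.dkr.ecr.us-east-2.amazonaws.com/example-image:latest",
    output_s3="s3://example-bucket/models/",
)
print(json.dumps(pipeline, indent=2))
```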

All these benefits make AWS Step Functions an appealing solution for AI/ML apps that are using the AWS cloud platform.
