
Sc0ra

Posted on • Originally published at blog.theodo.com

Prevent AWS from Reading Your Step Functions Data

AWS Step Functions is the perfect tool to handle long-running, complex or business-critical workflows such as payment or subscription flows. However, a naive implementation could put sensitive data at risk.

What is AWS Step Functions?

AWS Step Functions is an Amazon cloud service designed to create state machines that orchestrate multiple Lambda functions, branching logic, and external services in one place. It offers most of the advantages of serverless services: scaling is immediate and pricing is based only on the compute time of your functions and the number of state transitions.

State machines can be defined directly in the AWS Console with a JSON configuration, and you immediately get a graph representation of your workflow.

AWS Step Functions state diagram

At Theodo, we use the Serverless Framework with the Step Functions plugin to write our AWS Lambda functions, define our resources (DynamoDB tables and queues), and set up the AWS Step Functions state machines directly in our codebase.

Example: an identity verification workflow with Step Functions

The previous picture shows an identity verification workflow. It is a piece of a bigger validation flow for banking applications.

What it does is request an identification URL from a third-party system, send this link by text message to the applicant, and wait for them to complete the identity verification.
When it is done, the third-party system calls a webhook, which resumes the state machine and either triggers a retry, sends a rejection email, or continues the global validation process.

The identification flow
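The webhook handler itself lives outside the state machine. Since we rely on the task-token pattern (the token is saved in DynamoDB while the execution waits, as described below), resuming the workflow boils down to a single Step Functions API call. Here is a minimal sketch with the AWS SDK v3; the getTaskToken helper and the event shape are illustrative, not the actual code from this project:

// identification-webhook.ts (illustrative sketch)
import { SFNClient, SendTaskSuccessCommand } from "@aws-sdk/client-sfn";
// Hypothetical helper that reads back the task token stored while the execution waits
import { getTaskToken } from "./task-token-store";

const sfn = new SFNClient({});

export const handler = async (event: {
  creditApplicationId: string;
  identificationResult: string;
}) => {
  // Find the token saved in DynamoDB when the state machine reached WaitIdentification
  const taskToken = await getTaskToken(event.creditApplicationId);

  // Resume the paused execution; the output becomes the result of the waiting state
  await sfn.send(
    new SendTaskSuccessCommand({
      taskToken,
      output: JSON.stringify({ identificationResult: event.identificationResult }),
    })
  );
};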

To perform all of those actions, sensitive information will flow through the inputs and outputs of the state machine steps:

  • 👤 Personal information from the applicant, which is sent to the third-party service to check their identity
  • 📱 The phone number of the applicant to send the SMS
  • ✉️ The email address of the applicant to send the mail
  • ✔️ The result of the identity verification

Problem: we are exposing sensitive data

With this implementation, sensitive data flows directly through the inputs and outputs of the state machine.

state-machine.yaml
----------

SendIdentificationRequest:
  Type: Task
  Resource: arn:aws:states:::lambda:invoke
  Parameters:
    FunctionName:
      Fn::ImportValue: ${self:provider.stage}-sendSMS-Function-arn
    Payload:
      creditApplicationId.$: $.creditApplicationId
      creditApplicant.$: $.creditApplicant
      phoneNumber.$: $.phoneNumber
      message.$: $.smsBody
  ResultPath: $.identificationSMS
  Next: WaitIdentification

An example of sensitive data flowing through our inputs and outputs

The problem is: nowadays, most DevOps engineers, developers, and even some product managers have access to AWS and can see that information. Not very GDPR friendly, right? Furthermore, AWS Lambda dumps are stored in an S3 bucket which is not encrypted with the default configuration. Any malicious access to your AWS stack could leak sensitive information far more easily than by corrupting your database, for example. The more you spread sensitive information between multiple services, the more you put yourself at risk. Finally, as stated in the AWS documentation:

Any data that you enter into Step Functions or other services might get picked up for inclusion in diagnostic logs.

With that in mind, you may ditch AWS Step Functions completely, or you may try to find a solution, because the tool is still awesome and perfectly suits your needs!

Our solution

To prevent leaking sensitive information, we need to clean every step function input and output. Here are two ways of doing it:

  • Keep the data in the input and output but encrypt and decrypt it between each step (sketched right after this list)
  • Never transmit sensitive data between steps and retrieve it directly during their execution
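The first option boils down to a KMS encrypt call on the sensitive fields before each Lambda returns, and a decrypt call at the start of the next one. A minimal sketch with the AWS SDK v3 KMS client (the key alias and helper names are illustrative):

// kms-field-encryption.ts (illustrative sketch of the first option)
import { KMSClient, EncryptCommand, DecryptCommand } from "@aws-sdk/client-kms";

const kms = new KMSClient({});
const KEY_ID = "alias/step-functions-sensitive-data"; // illustrative key alias

// Encrypt a sensitive field before it leaves the Lambda output
export const encryptField = async (value: string): Promise<string> => {
  const { CiphertextBlob } = await kms.send(
    new EncryptCommand({ KeyId: KEY_ID, Plaintext: Buffer.from(value) })
  );
  return Buffer.from(CiphertextBlob!).toString("base64");
};

// Decrypt it again at the beginning of the next step
export const decryptField = async (encrypted: string): Promise<string> => {
  const { Plaintext } = await kms.send(
    new DecryptCommand({ CiphertextBlob: Buffer.from(encrypted, "base64") })
  );
  return Buffer.from(Plaintext!).toString("utf-8");
};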

Both solutions are viable. However, we preferred the second one because we wanted to keep our sensitive data in a single safe spot, where we pay close attention to security.

We were already using AWS DynamoDB for the waiting tokens of our lambda functions, so we decided to create a new table to store sensitive data between steps.

We configured it like below, using an external KMS key.

resources.yaml
----------

StepSensitiveInputTable:
  Type: AWS::DynamoDB::Table
  DeletionPolicy: Retain
  Properties:
    TableName: step-sensitive-input-table
    AttributeDefinitions:
      - AttributeName: sensitiveInputId
        AttributeType: S
    KeySchema:
      - AttributeName: sensitiveInputId
        KeyType: HASH
    BillingMode: PAY_PER_REQUEST
    PointInTimeRecoverySpecification:
      PointInTimeRecoveryEnabled: true
    SSESpecification:
      SSEEnabled: true
      SSEType: KMS
      KMSMasterKeyId: hsm-key-id

DynamoDB table definition in Serverless

Furthermore, on AWS you can choose to keep the encryption keys of your DynamoDB tables outside of AWS, which was a big plus for us.

To avoid repeating ourselves, we coded a wrapper that we use on all our lambdas. It does the following things:

Before the execution of the main lambda function, the wrapper:

  • Looks for a "sensitiveInputId" key in the input
  • Retrieves the content stored under that ID in DynamoDB
  • Merges the resulting object back into the input

After the return of the lambda, the wrapper:

  • Looks for sensitive data keys in the output, removes them and builds an object with that information
  • Saves that sensitive data object in a DynamoDB entry
  • Adds the ID of the new entry under a "sensitiveInputId" key in the output

// encryption-wrapper.ts
// getSensitiveData, storeSensitiveData and sensitiveInputKeys are our DynamoDB helpers
// (see the sketch below; the module path here is illustrative)
import { getSensitiveData, storeSensitiveData, sensitiveInputKeys } from "./sensitive-data-store";

type LambdaHandler = (event: any) => Promise<any>;

export default (lambdaHandler: LambdaHandler): LambdaHandler => {
  // Fetch the sensitive data referenced by the input and run the handler on the merged event
  const decryptHandler = async (event: any) => {
    const decryptedInformation = await getSensitiveData(event.sensitiveInputId);
    return lambdaHandler({ ...event, ...decryptedInformation });
  };

  return async (event: any) => {
    const clearOutput = await decryptHandler(event);

    // Persist the output in DynamoDB and keep only the id of the entry
    const sensitiveInputId = await storeSensitiveData(clearOutput);

    // Strip the sensitive keys from what goes back to the state machine
    for (const sensitiveKey of sensitiveInputKeys) {
      delete clearOutput[sensitiveKey];
    }

    return { sensitiveInputId, ...clearOutput };
  };
};

Our custom encryption wrapper
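The getSensitiveData and storeSensitiveData helpers are not shown in this post. Here is a minimal sketch of what they can look like with the AWS SDK v3 document client, assuming the table from resources.yaml (the module name, the list of sensitive keys and the id generation are illustrative):

// sensitive-data-store.ts (illustrative sketch of the helpers used by the wrapper)
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand, PutCommand } from "@aws-sdk/lib-dynamodb";
import { randomUUID } from "crypto";

const documentClient = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const TABLE_NAME = "step-sensitive-input-table";

// Keys that must never appear in the state machine inputs and outputs (illustrative list)
export const sensitiveInputKeys = ["creditApplicant", "phoneNumber", "email"];

// Persist the sensitive part of a step output and return the id of the entry
export const storeSensitiveData = async (output: Record<string, unknown>): Promise<string> => {
  const sensitiveInputId = randomUUID();
  await documentClient.send(
    new PutCommand({ TableName: TABLE_NAME, Item: { ...output, sensitiveInputId } })
  );
  return sensitiveInputId;
};

// Read the sensitive data back before the next Lambda runs
export const getSensitiveData = async (sensitiveInputId?: string) => {
  if (!sensitiveInputId) return {};
  const { Item } = await documentClient.send(
    new GetCommand({ TableName: TABLE_NAME, Key: { sensitiveInputId } })
  );
  return Item ?? {};
};

Each Lambda then exports its handler wrapped with the encryption wrapper above instead of the raw handler, so sensitive fields never reach the state machine events.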

Last of all, we needed to encrypt the first input at the start of our state machine, from the API, to finally secure our users' information all the way!
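Concretely, the API handler that starts the workflow first stores the applicant data and only puts the resulting id in the execution input. Something along these lines, assuming the state machine ARN is available through an environment variable (names are illustrative):

// start-identification.ts (illustrative sketch of starting the workflow without sensitive data)
import { SFNClient, StartExecutionCommand } from "@aws-sdk/client-sfn";
import { storeSensitiveData } from "./sensitive-data-store";

const sfn = new SFNClient({});

export const handler = async (applicant: {
  creditApplicationId: string;
  creditApplicant: object;
  phoneNumber: string;
  email: string;
}) => {
  // Store everything sensitive before it ever reaches Step Functions
  const sensitiveInputId = await storeSensitiveData(applicant);

  // The execution input only carries the id and the non-sensitive business key
  await sfn.send(
    new StartExecutionCommand({
      stateMachineArn: process.env.IDENTIFICATION_STATE_MACHINE_ARN,
      input: JSON.stringify({
        creditApplicationId: applicant.creditApplicationId,
        sensitiveInputId,
      }),
    })
  );
};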

Takeaways

  • Be careful about what information you have in your lambdas and state machine events on AWS
  • Always think twice about the data you put in your logs
  • Keep your secrets and security implementation in as few services as possible
  • Use a wrapper when you want to reuse behavior across multiple lambda functions.

Thank you very much for reading my article. Feel free to comment, ask for implementation details, or share the problems you have encountered with AWS Step Functions.

Top comments (2)

Fabrice Delhoste • Edited

I haven't played with AWS Step Functions yet, but this is quite interesting, thanks!

It is a piece of a bigger validation flow for banking applications.

Does it mean a bigger workflow has been implemented using AWS SF or was it a proof of concept? Would you definitely recommend it for advanced realtime enterprise workflows? Costs?

Sc0ra

Hello Fabrice, sorry for the late answer!

We implemented the whole credit application validation workflow with AWS Step Functions, not just a proof of concept.

However, we encountered some problems due to our lack of experience and the relative immaturity of the technology:

  • Upgrading to a breaking version of your workflow can be really hard to manage when you have multiple workflows working together (the version management of the Lambda functions is hard to grasp)
  • DX is problematic, as you have to navigate between a lot of files and the smart navigation isn't completely ready (there are some plugins in progress though)

Concerning costs, I'm not an expert but I would advise you to check this article with a cost calculator for your business requirements: medium.com/serverless-transformati...