Raphael Bottino

Extra! Extra! Amazon AppFlow is Released!

Have you heard about Amazon AppFlow? It's a brand-new AWS service that lets you easily integrate SaaS applications such as Salesforce and Marketo with AWS services such as S3, or with data warehouses like Snowflake.

Look: Amazon AppFlow logo!

Yet another day, yet another AWS release. Even in challenging times, with the COVID-19 pandemic still slowing the global economy, AWS shows it is in full-throttle mode: last week it released a new service called Amazon AppFlow.

What is it?

Quick summary on how Amazon AppFlow works.

Pretty much paraphrasing the announcement, Amazon AppFlow is a fully managed integration service that enables you to securely transfer data between Software-as-a-Service (SaaS) applications and AWS services in just a few clicks. As with pretty much everything else on AWS, you can run data flows at nearly any scale and at the frequency you choose, paying only for flow runs and data processed, with no upfront charges. For those who are security-aware (if you aren't, you should be!), AppFlow automatically encrypts data in motion.

…and what does it mean?

It means zero time invested in learning both the source's and the destination's APIs. With a few clicks you can, for instance, back up all customer support cases from Salesforce to S3 on a weekly basis, or push a daily list of new leads from Marketo to Snowflake, allowing your team to quickly understand your leads' behavior, all with no coding required.

But since I said that you should be security-aware, you might be thinking… "Can I leverage that for security purposes?" Yes! You can leverage this non-security-related service to help with your security.

Using it for Security

Trend Micro is the only security vendor to be an AppFlow launch partner, which allows AWS and Trend Micro Cloud One customers to create AppFlow flows using Workload Security data as input, easily moving data from this security service to different destinations.

OK. AppFlow looks cool. The Salesforce and Marketo examples look cool. Having a security vendor like Trend Micro as a launch partner also looks cool. But how do you use it?

Let's get our hands dirty

Of course, being the technically curious person that I am, reading the release notes and examples is definitely not enough. I need to get my hands dirty. So, feel free to follow me on this journey.

Creating our First Flow

First, of course, let's hit the AppFlow dashboard.

Amazon AppFlow dashboard

If we click the bright orange "Create flow" button, we are taken to the first step of creating our first flow. For this flow, I decided the name would be "CloudOneWorkloadSecurity-Computers", and I moved to the next step without setting any of the optional settings.

Step 1. Really easy so far.

On Step 2, we can see exactly where AppFlow shines. I picked Trend Micro as the source, and all it requires to fetch data from Cloud One is an API secret. Again, no coding required.

Did you expect to see my API secret here?

As soon as I added my API secret, AppFlow presented me with the different objects it can retrieve from Cloud One. At launch, only "Computers" and "Policies" are available, as you can see below, but we should expect more options down the road.

Object options
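
By the way, the console is not the only way to set this connection up. If you'd rather script it, the same connector profile can be created through the AWS SDK. Below is a minimal sketch using the AWS SDK for JavaScript v3; the profile name and region are made-up examples, not values from this walkthrough:

```typescript
// Minimal sketch: creating the Trend Micro connector profile programmatically.
// The profile name and region are illustrative; the API secret comes from an
// environment variable so it never ends up in code.
import {
  AppflowClient,
  CreateConnectorProfileCommand,
} from "@aws-sdk/client-appflow";

const appflow = new AppflowClient({ region: "us-east-1" });

await appflow.send(
  new CreateConnectorProfileCommand({
    connectorProfileName: "CloudOneWorkloadSecurity", // hypothetical
    connectorType: "Trendmicro",
    connectionMode: "Public",
    connectorProfileConfig: {
      connectorProfileProperties: { Trendmicro: {} },
      connectorProfileCredentials: {
        Trendmicro: { apiSecretKey: process.env.CLOUD_ONE_API_SECRET! },
      },
    },
  })
);
```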

Then I picked "Amazon S3" as my destination, deciding on my bucket and a prefix for the objects.

Step 2. Still easy!

Now we move to Step 3. Clicking the "Choose source fields" drop-down, we can decide which fields we care about for this flow. I first clicked "Map all fields directly", but because Cloud One is so thorough, I quickly realized it returned way more information than I needed for this use case. So I selected only the 9 fields I cared about.

1, 2, 3… 9 fields!
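
For the curious: under the hood, the console turns this field selection into AppFlow "tasks" on the flow definition. Here is a rough sketch of what that looks like in the API; the field names are illustrative, not the exact nine I picked:

```typescript
// Sketch of AppFlow tasks: a PROJECTION task selects the source fields, and a
// Map task per field carries it to the destination. Field names are illustrative.
import type { Task } from "@aws-sdk/client-appflow";

const tasks: Task[] = [
  {
    taskType: "Filter",
    connectorOperator: { Trendmicro: "PROJECTION" },
    sourceFields: ["hostName", "displayName", "state"], // ...and the other chosen fields
  },
  {
    taskType: "Map",
    connectorOperator: { Trendmicro: "NO_OP" },
    sourceFields: ["hostName"],
    destinationField: "hostName",
  },
  // ...one Map task per remaining field
];
```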

On the following step, I could choose to run the flow on demand or set a schedule for it. For this example, I decided to run it daily.

Step 4. I can't believe it is that easy.

And that's it. The flow is ready to use. And use it I did.

Done!

In a little over 10 seconds, AppFlow fetched my Computers data from Cloud One Workload Security and dumped it into an S3 bucket.

Details on the flow execution.

Clicking the "View data" link takes me straight to the bucket, where I can see the lonely file. Downloading it shows exactly what I expected: information taken straight from Cloud One.

Data straight from Cloud One
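
One detail worth noticing, since it matters later: the file contains one JSON-described computer per line, not a JSON array. Roughly like this, with made-up field names and values for illustration:

```
{"hostName": "web-01.example.com", "displayName": "web-01", "state": "active"}
{"hostName": "db-02.example.com", "displayName": "db-02", "state": "offline"}
```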

Houston, we have a problem…

There is a problem with that, though… there isn't a ton of value in this flow by itself, and my hands didn't get that dirty. If you just wanted to know what AppFlow is and how to use it, the article ends here for you. Thanks for stopping by! If you, like me, like to get your hands dirty, let's move on to the next stage.

Working with the data

After the daily run of this flow, I want to work with the generated data automatically, as soon as it hits the S3 bucket. The idea is to go through the generated data, process it, and write the result to another bucket. For this example, I decided to generate, daily, a JSON array of the computers whose current state is different from "active", which means they probably have some kind of connectivity issue with the Cloud One manager. The final result is something similar to the diagram below:

The flow's architecture diagram.

Before we go any further, it's important to note that the project, with its code available on my GitHub, has its infrastructure built using AWS CDK (TypeScript), while the Lambda code was written in JavaScript. If you are not familiar with CDK, I highly recommend the CDK Workshop documentation.

CDK stack code.

The code above describes the project infrastructure, generating a CloudFormation stack with a destination S3 bucket, a Lambda function, and the proper permissions. Since I wanted to trigger this Lambda as soon as the source bucket received the data, I tried for a while to add this trigger in the code, with no success, until I remembered, of course, that I wouldn't be able to: CloudFormation doesn't support adding event notifications to existing buckets.
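
Since the stack is shown above as an image, here is a minimal sketch of the same idea using the current aws-cdk-lib (CDK v2) API; the linked repository targets an earlier CDK release, so names and details differ, and the source bucket name is hypothetical:

```typescript
// Minimal sketch of the stack described above, using aws-cdk-lib (CDK v2).
import * as cdk from "aws-cdk-lib";
import * as s3 from "aws-cdk-lib/aws-s3";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { Construct } from "constructs";

export class AppFlowProcessingStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Destination bucket that will receive the filtered, non-active computers.
    const destinationBucket = new s3.Bucket(this, "DestinationBucket");

    // The source bucket already exists (AppFlow writes into it), so it is
    // only imported here, not created.
    const sourceBucket = s3.Bucket.fromBucketName(
      this,
      "SourceBucket",
      "my-appflow-source-bucket" // hypothetical
    );

    // Lambda function that filters out the computers in "active" state.
    const processor = new lambda.Function(this, "ProcessComputers", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lambda"),
      environment: { DESTINATION_BUCKET: destinationBucket.bucketName },
    });

    // The "proper permissions": read from the source, write to the destination.
    sourceBucket.grantRead(processor);
    destinationBucket.grantWrite(processor);
  }
}
```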

After creating the infrastructure, I went ahead and coded the last missing piece: the Lambda itself.

The Lambda function code

The code is pretty straightforward. First, it downloads the newly added data to the Lambda execution environment. Then it processes the data. Since the original file has one JSON-described computer per line instead of an array of objects, I trimmed the file (to remove any whitespace from the end) and split it into an array of strings. Since each string represents an object, I mapped the array to parse each string into the object it represents, and then filtered out all objects whose state is "active", since they are not relevant to us. Finally, all the non-active computers were written to the destination bucket.
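
That code is also shown as an image, so here is a sketch of the same logic, written in TypeScript with the AWS SDK for JavaScript v3 (the repository version is plain JavaScript). This sketch reads the object into memory rather than downloading it to disk, and the "state" field name follows the description above:

```typescript
// Sketch of the processing Lambda: fetch the AppFlow output, parse one JSON
// object per line, keep non-active computers, write them as a JSON array.
import { S3Client, GetObjectCommand, PutObjectCommand } from "@aws-sdk/client-s3";
import type { S3Event } from "aws-lambda";

const s3 = new S3Client({});

export const handler = async (event: S3Event): Promise<void> => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    // 1. Fetch the newly added AppFlow output.
    const { Body } = await s3.send(
      new GetObjectCommand({ Bucket: bucket, Key: key })
    );
    const raw = await Body!.transformToString();

    // 2. One JSON-described computer per line: trim trailing whitespace,
    //    split into strings, and parse each string into an object.
    const computers = raw
      .trim()
      .split("\n")
      .map((line) => JSON.parse(line));

    // 3. Keep only the computers whose state is NOT "active".
    const nonActive = computers.filter((computer) => computer.state !== "active");

    // 4. Write the result, now a proper JSON array, to the destination bucket.
    await s3.send(
      new PutObjectCommand({
        Bucket: process.env.DESTINATION_BUCKET, // set by the CDK stack above
        Key: key,
        Body: JSON.stringify(nonActive),
      })
    );
  }
};
```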

After deploying the above stack, the last step is to manually connect the source bucket to it. Go to the bucket's properties, click "Events", and create an "All object create events" notification. Make sure to select the newly created Lambda as the notification's destination. Now, for every AppFlow run, this Lambda will also be triggered.

Bucket Events.
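
If you'd rather not click through the console, the same notification can be configured with the SDK. A sketch follows, with a hypothetical bucket name and function ARN; note that, unlike the console, this route also requires separately granting S3 permission to invoke the function (lambda:AddPermission):

```typescript
// Sketch: the same "All object create events" notification, configured with
// the AWS SDK for JavaScript v3 instead of the console.
import {
  S3Client,
  PutBucketNotificationConfigurationCommand,
} from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

await s3.send(
  new PutBucketNotificationConfigurationCommand({
    Bucket: "my-appflow-source-bucket", // hypothetical
    NotificationConfiguration: {
      LambdaFunctionConfigurations: [
        {
          Events: ["s3:ObjectCreated:*"], // "All object create events"
          LambdaFunctionArn:
            "arn:aws:lambda:us-east-1:123456789012:function:ProcessComputers",
        },
      ],
    },
  })
);
```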

If you run the flow manually again to test the environment, you should see a new file in your new bucket, listing only the Cloud One computers that currently aren't in the "active" state.

Resources:

[1] https://aws.amazon.com/new/

[2] https://aws.amazon.com/blogs/aws/new-announcing-amazon-appflow/

[3] https://docs.aws.amazon.com/appflow/latest/userguide/what-is-appflow.html

[4] https://blog.trendmicro.com/trend-micro-integrates-with-amazon-appflow/

[5] https://github.com/raphabot/AppFlowWorkloadSecurityDemo

[6] https://cdkworkshop.com

Originally posted at Medium
