<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mark Nunnikhoven</title>
    <description>The latest articles on DEV Community by Mark Nunnikhoven (@marknca).</description>
    <link>https://dev.to/marknca</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F242940%2Ff4136409-8b14-4e8e-8219-ca76e138f436.jpg</url>
      <title>DEV Community: Mark Nunnikhoven</title>
      <link>https://dev.to/marknca</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/marknca"/>
    <language>en</language>
    <item>
      <title>This One Mistake Will Stop a DevSecOps Shift Left Strategy Dead in Its Tracks</title>
      <dc:creator>Mark Nunnikhoven</dc:creator>
      <pubDate>Wed, 10 Nov 2021 18:30:00 +0000</pubDate>
      <link>https://dev.to/lacework/this-one-mistake-will-stop-a-devsecops-shift-left-strategy-dead-in-its-tracks-3hka</link>
      <guid>https://dev.to/lacework/this-one-mistake-will-stop-a-devsecops-shift-left-strategy-dead-in-its-tracks-3hka</guid>
      <description>&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/3L0g2LfCPOQ"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;DevSecOps is the latest in a long line of buzzwords. The core makes sense: work on security earlier. But why isn’t this everywhere? Here are the biggest mistakes teams make when trying to “do” DevSecOps.&lt;/p&gt;

&lt;p&gt;Learn more in the video 👆 or read through the transcript 👇.&lt;/p&gt;

&lt;h2&gt;
  
  
  Transcript
&lt;/h2&gt;

&lt;p&gt;I see security teams making the same mistake over and over again when it comes to “shifting left.” It’s frustrating from afar and infuriating when you have to deal with it day-to-day.&lt;/p&gt;

&lt;p&gt;Let’s dig in to the disaster that is DevSecOps…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[00:15]&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine for a minute, you’re in your kitchen preparing dinner. You’re a reasonably good home cook. More often than not, what you put on the table is enjoyed by those you’re sharing it with.&lt;/p&gt;

&lt;p&gt;Sure, every once in a while you miss. But that’s the rare case, so when it does happen everyone smiles, you laugh, and then place an order for takeout. Mistakes happen.&lt;/p&gt;

&lt;p&gt;Not too bad, right?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[00:29]&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now, let’s say while you’re getting ready to sit down for a wonderful home-cooked meal, your neighbour invites themselves in. They immediately start hammering you with questions like, “How sharp is that knife?”, “Do you know who grew that broccoli?”, “Are there too many ovens in this neighbourhood?”&lt;/p&gt;

&lt;p&gt;Taken aback, you politely ask, “Um, are you a professional chef? Do you have a lot of experience cooking?”&lt;/p&gt;

&lt;p&gt;They reply, “Oh no, I don’t even have a kitchen in my place. I just order food every once in a while.”&lt;/p&gt;

&lt;p&gt;That’s basically the scenario I see play out in organizations around the world.&lt;/p&gt;

&lt;p&gt;The development teams and builders are working to solve business problems and address customer needs.&lt;/p&gt;

&lt;p&gt;Then the security team shows up out of nowhere and starts asking seemingly irrelevant questions and demanding that priorities change in the name of “reducing risk” and “improving the overall security posture”, without understanding what you’re working on or how you work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[01:37]&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is why even the name DevSecOps frustrates me to no end. The DevOps philosophy already assumes that you want to build a resilient, reliable system. There’s no need to jam another acronym in there.&lt;/p&gt;

&lt;p&gt;Teams know that security is important, they just need the information and support to make smart decisions at the right time.&lt;/p&gt;

&lt;p&gt;So is this whole “shift left” thing doomed?&lt;/p&gt;

&lt;p&gt;No.&lt;/p&gt;

&lt;p&gt;Not if you do it well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[02:06]&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you’re on the security team, the first thing you need to understand is that you probably don’t understand how the builders are working.&lt;/p&gt;

&lt;p&gt;You can fix that.&lt;/p&gt;

&lt;p&gt;Spend some time with them. Ask lots of questions to better understand their workflow and concerns.&lt;/p&gt;

&lt;p&gt;Most important of all, make sure that the security tools you shift left provide information with the proper context and enough data for teams to make an informed decision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[02:34]&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Just because it’s a security priority doesn’t mean it’s a business priority.&lt;/p&gt;

&lt;p&gt;For developers and builders, understand that security controls can provide real value to you. The whole goal of these controls is to make sure things work as intended.&lt;/p&gt;

&lt;p&gt;Network security tools look for malicious activity and malformed traffic. You don’t want that anywhere near your app.&lt;/p&gt;

&lt;p&gt;Threat detection on your servers and containers is looking for errant processes and other indicators of compromise. This makes sure that your resources are only working for you instead of doing things like mining cryptocurrency for cybercriminals.&lt;/p&gt;

&lt;p&gt;Posture management—ugh, horrible name—looks at the cloud services you’re using to make sure that you have configured them in a way that matches your risk appetite.&lt;/p&gt;

&lt;p&gt;Vulnerability scanners look at your tech stack, trying to find known issues before they bite you in the you-know-what.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[03:26]&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Everything on this list and most of the other security controls out there can dramatically HELP you meet your goals.&lt;/p&gt;

&lt;p&gt;With that understanding, you need to make sure that you have access to the outputs of these tools. You need to know that they are in place and doing their job, so that you can focus on other parts of yours.&lt;/p&gt;

&lt;p&gt;By now, you’ve figured out that the number one mistake I see security teams making when they “shift left” is IGNORING the developers and builders.&lt;/p&gt;

&lt;p&gt;For some reason, security teams assume that to “shift left” means doing their isolated security work earlier in the development process. That’s an archaic way of thinking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;[04:05]&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To truly shift left, you need to leverage the capability of security tools and processes to help developers and builders identify risks with their systems earlier in THEIR processes.&lt;/p&gt;

&lt;p&gt;This data will help the teams make informed decisions about what actions should be taken to meet the business goals.&lt;/p&gt;

&lt;p&gt;Shifting security left can help reduce the risks to the business while improving the quality of the systems you build.&lt;/p&gt;

&lt;p&gt;Who wouldn’t want that?&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Stop Your Password From Opening The Door To Hackers</title>
      <dc:creator>Mark Nunnikhoven</dc:creator>
      <pubDate>Fri, 22 Oct 2021 16:52:53 +0000</pubDate>
      <link>https://dev.to/lacework/stop-your-password-from-opening-the-door-to-hackers-4d3d</link>
      <guid>https://dev.to/lacework/stop-your-password-from-opening-the-door-to-hackers-4d3d</guid>
      <description>&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/iKSuO2hn5oo"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;It's cybersecurity awareness month and we all should be doing our part to &lt;a href="https://twitter.com/hashtag/BeCyberSmart"&gt;#BeCyberSmart&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;The one thing I see people struggling with the most is using passwords and I get it.&lt;/p&gt;

&lt;p&gt;A lot of what we've been subjected to about passwords is &lt;strong&gt;wrong&lt;/strong&gt; and actually makes things less secure. Making matters worse, security folks—myself included!—aren't known for being the most communicative.&lt;/p&gt;

&lt;p&gt;So, I set out to demystify passwords. In the video above 👆, I walk through how passwords are attacked, the UX around them, what makes a truly strong password, and finally I lay out a practical path for dealing with the mishmash of systems out there.&lt;/p&gt;

&lt;p&gt;Here in this post, I'll give you the highlights...&lt;/p&gt;

&lt;h2&gt;
  
  
  Strength
&lt;/h2&gt;

&lt;p&gt;A strong password is a &lt;strong&gt;long&lt;/strong&gt; password...or more probably, a passphrase. &lt;/p&gt;

&lt;p&gt;Length is the single most important factor in determining the strength of a password.&lt;/p&gt;

&lt;p&gt;The second most important factor is the variety of characters you pick from (so, not just a-z). That's the reason for those crazy password rules we're all so familiar with.&lt;/p&gt;

&lt;p&gt;Start thinking pass&lt;em&gt;phrase&lt;/em&gt;, not password.&lt;/p&gt;
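&lt;p&gt;A rough back-of-the-envelope calculation shows why length wins: a password's theoretical strength is about its length times log2 of the size of the character set it draws from. Here's a quick sketch (treat the numbers as an upper bound; predictable patterns weaken things further):&lt;/p&gt;

```python
# Rough entropy estimate: length * log2(charset size).
# An upper bound only -- real-world strength is lower if the
# password follows predictable patterns.
import math

def entropy_bits(length, charset_size):
    return length * math.log2(charset_size)

# 8 characters drawn from upper/lower/digits/symbols (~72 chars)
print(f"8-char complex: {entropy_bits(8, 72):.0f} bits")
# 28-character lowercase passphrase (26 chars to pick from)
print(f"28-char passphrase: {entropy_bits(28, 26):.0f} bits")
```

&lt;p&gt;The long, "simple" passphrase comes out way ahead of the short, "complex" password.&lt;/p&gt;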

&lt;h2&gt;
  
  
  Old Rules
&lt;/h2&gt;

&lt;p&gt;Those old rules I mentioned 👆? The whole "at least one capital letter, a number, a symbol, and be at least 8 characters long" thing?&lt;/p&gt;

&lt;p&gt;Those rules actually lead to weaker passwords. &lt;/p&gt;

&lt;p&gt;Thankfully, the &lt;a href="https://pages.nist.gov/800-63-3/"&gt;most commonly used guidelines&lt;/a&gt; were updated in 2017, but a lot of systems are still behind the times. That means we still have to deal with them. 😔&lt;/p&gt;

&lt;h2&gt;
  
  
  Password Manager
&lt;/h2&gt;

&lt;p&gt;In addition to dealing with those older systems and rules, we also need different passwords for every site and app we use.&lt;/p&gt;

&lt;p&gt;Why? Because it reduces &lt;strong&gt;your&lt;/strong&gt; risk if one of those sites is hacked or has a breach. &lt;/p&gt;

&lt;p&gt;One of the first things cybercriminals do when they get new credential sets is test them against popular sites.&lt;/p&gt;

&lt;p&gt;But keeping track of all of those passwords is a pain. The solution is to use a password manager. &lt;/p&gt;

&lt;p&gt;Which one doesn't matter much. Just make sure it runs on all of your preferred devices and has a nice user experience.&lt;/p&gt;

&lt;p&gt;That's going to keep your passwords safe and sound...and generate long, gibberish passwords for any new logins.&lt;/p&gt;

&lt;p&gt;Taking things a step further, the manager will actually log you in to those sites and apps when needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  One Password To Rule Them All
&lt;/h2&gt;

&lt;p&gt;To keep all of those passwords in the manager safe and secure, you'll need a password (couldn't avoid them completely 🤣). &lt;/p&gt;

&lt;p&gt;Thankfully, almost all password managers are up to date on the rules and we can use a passphrase here.&lt;/p&gt;

&lt;p&gt;This passphrase is only going to be used with your manager, and you should only change it when you think someone might have figured it out, or about once a year.&lt;/p&gt;

&lt;p&gt;Remember, this is the only password you're going to be typing in yourself. Make it a good one!&lt;/p&gt;

&lt;p&gt;Here are some simple guidelines to follow to create a really strong and easy to remember passphrase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use a random word generator to select at least 4 (more if you can) truly random words&lt;/li&gt;
&lt;li&gt;throw in a symbol or number (or both) just because&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Boom. Easy to remember, super strong password.&lt;/p&gt;

&lt;p&gt;Something like: &lt;strong&gt;polite2vacuumcensusmonkey!narrowfrozen&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;polite 2 vacuum census monkey ! narrow frozen&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not only is that a fun passphrase (which I swear was randomly generated) but it's easy to remember and crazy strong.&lt;/p&gt;
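&lt;p&gt;If you want to script that recipe, here's a quick sketch using Python's &lt;code&gt;secrets&lt;/code&gt; module (the tiny word list is just for illustration; in practice you'd use a large wordlist):&lt;/p&gt;

```python
# Sketch of the passphrase recipe: pick truly random words,
# then throw in a digit and a symbol. The word list here is
# illustrative only -- use a large (diceware-style) list.
import secrets

WORDS = ["polite", "vacuum", "census", "monkey", "narrow", "frozen",
         "orbit", "saddle", "pepper", "canyon", "lantern", "drift"]

def make_passphrase(num_words=6):
    words = [secrets.choice(WORDS) for _ in range(num_words)]
    # sprinkle in a digit and a symbol at random positions
    words.insert(secrets.randbelow(num_words), str(secrets.randbelow(10)))
    words.insert(secrets.randbelow(num_words), secrets.choice("!@#$%"))
    return "".join(words)

print(make_passphrase())
```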

&lt;p&gt;Stay safe out there and &lt;a href="https://twitter.com/hashtag/BeCyberSmart"&gt;#BeCyberSmart&lt;/a&gt;!&lt;/p&gt;

</description>
      <category>security</category>
      <category>privacy</category>
      <category>hacktoberfest</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Transferring Files in Amazon S3</title>
      <dc:creator>Mark Nunnikhoven</dc:creator>
      <pubDate>Tue, 11 Aug 2020 18:10:31 +0000</pubDate>
      <link>https://dev.to/marknca/transferring-files-in-amazon-s3-2db2</link>
      <guid>https://dev.to/marknca/transferring-files-in-amazon-s3-2db2</guid>
      <description>&lt;p&gt;Fellow &lt;a href="https://aws.amazon.com/developer/community/heroes/"&gt;AWS Hero&lt;/a&gt;, &lt;a href="https://twitter.com/mattbonig/"&gt;Matt Bonig&lt;/a&gt;, recently asked a very interesting question on Twitter as a poll;&lt;/p&gt;


&lt;blockquote class="ltag__twitter-tweet"&gt;

  &lt;div class="ltag__twitter-tweet__main"&gt;
    &lt;div class="ltag__twitter-tweet__header"&gt;
      &lt;img class="ltag__twitter-tweet__profile-image" src="https://res.cloudinary.com/practicaldev/image/fetch/s--u3p7vKtX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://pbs.twimg.com/profile_images/1260277338488467456/dpaUtowG_normal.jpg" alt="Matthew Bonig profile image"&gt;
      &lt;div class="ltag__twitter-tweet__full-name"&gt;
        Matthew Bonig
      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__username"&gt;
        @mattbonig
      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__twitter-logo"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ir1kO05j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-f95605061196010f91e64806688390eb1a4dbc9e913682e043eb8b1e06ca484f.svg" alt="twitter logo"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__body"&gt;
      Happy Monday. AWS pop quiz time:&lt;br&gt;&lt;br&gt;When using &lt;br&gt;`aws s3 cp s3://somebucket/somefile s3://otherbucket/somefile`&lt;br&gt;&lt;br&gt;'somefile' is transferred through the host machine
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__date"&gt;
      14:27 - 10 Aug 2020
    &lt;/div&gt;


    &lt;div class="ltag__twitter-tweet__actions"&gt;
      &lt;a href="https://twitter.com/intent/tweet?in_reply_to=1292829980624330754" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fFnoeFxk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-reply-action-238fe0a37991706a6880ed13941c3efd6b371e4aefe288fe8e0db85250708bc4.svg" alt="Twitter reply action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/retweet?tweet_id=1292829980624330754" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--k6dcrOn8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-retweet-action-632c83532a4e7de573c5c08dbb090ee18b348b13e2793175fea914827bc42046.svg" alt="Twitter retweet action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/like?tweet_id=1292829980624330754" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SRQc9lOp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-like-action-1ea89f4b87c7d37465b0eb78d51fcb7fe6c03a089805d7ea014ba71365be5171.svg" alt="Twitter like action"&gt;
      &lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  Three Possible Answers
&lt;/h2&gt;

&lt;p&gt;It's an interesting question because, depending on your perspective and experience, each of the three possible poll answers makes sense.&lt;/p&gt;

&lt;p&gt;If you don't know about the S3 bucket-to-bucket copy feature (which, while &lt;a href="https://aws.amazon.com/blogs/aws/amazon-s3-copy/"&gt;introduced in 2008&lt;/a&gt;, isn't crystal clear in the docs), passing data through the system that called the command makes sense. That's how most file transfers work.&lt;/p&gt;

&lt;p&gt;If you've been working with &lt;a href="https://aws.amazon.com/s3/"&gt;Amazon S3&lt;/a&gt; regularly, you've no doubt seen the ultrafast transfer speeds even over bad connections. The only way that makes sense is if the data is only moving within the S3 infrastructure.&lt;/p&gt;

&lt;p&gt;There are &lt;strong&gt;always&lt;/strong&gt; exceptions to everything, which is why "It depends" holds up as a valid answer. Anyone who's been working with tech for any length of time intuitively understands this...after many, many frustrating stories &amp;amp; experiences.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Happens
&lt;/h2&gt;

&lt;p&gt;Let's break down this command and figure out what's going on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3 cp s3://SOURCE_BUCKET/KEY s3://DESTINATION_BUCKET/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;aws&lt;/code&gt; calls the &lt;a href="https://aws.amazon.com/blogs/developer/aws-cli-v2-is-now-generally-available/"&gt;AWS CLI program&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;s3&lt;/code&gt; filters the commands to the S3 service. a/k/a, "We're using S3!"&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cp&lt;/code&gt; is &lt;a href="https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/cp.html"&gt;the action&lt;/a&gt; to call within the specified AWS service.&lt;/p&gt;

&lt;p&gt;Now, &lt;code&gt;cp&lt;/code&gt; is a little different from most of the AWS CLI commands. Most commands are direct parallels of the AWS API for the service in question.&lt;/p&gt;

&lt;p&gt;In this case, there's a lot of &lt;a href="https://en.wikipedia.org/wiki/Syntactic_sugar"&gt;syntactic sugar&lt;/a&gt; applied with the goal of making &lt;code&gt;aws s3 cp&lt;/code&gt; work as similarly as possible to Linux's &lt;code&gt;cp&lt;/code&gt; &lt;a href="https://man7.org/linux/man-pages/man1/cp.1.html"&gt;command&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The rest of the command provides a mix of options to indicate the source and destination of the file copy. For our example;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;s3://SOURCE_BUCKET/KEY&lt;/code&gt; is the source file. A &lt;strong&gt;key&lt;/strong&gt; is the S3 term for what we commonly think of as the directory structure + filename. In this case, the file is in an existing S3 bucket.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;s3://DESTINATION_BUCKET/&lt;/code&gt; is the destination. Here, we've indicated another S3 bucket and because the path ends in &lt;code&gt;/&lt;/code&gt;, we are telling &lt;code&gt;cp&lt;/code&gt; that we want the same filename (or key) in the destination bucket.&lt;/p&gt;
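&lt;p&gt;That argument parsing can be sketched in a few lines (a hypothetical helper, not the actual AWS CLI code):&lt;/p&gt;

```python
# How an s3:// argument splits into bucket and key.
# Hypothetical helper, not the real AWS CLI implementation.
def parse_s3_uri(uri):
    assert uri.startswith("s3://")
    bucket, _, key = uri[5:].partition("/")
    return bucket, key

print(parse_s3_uri("s3://SOURCE_BUCKET/path/to/file.txt"))
# a destination ending in "/" has an empty key; the CLI fills
# in the source filename
print(parse_s3_uri("s3://DESTINATION_BUCKET/"))
```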

&lt;h2&gt;
  
  
  API Calls
&lt;/h2&gt;

&lt;p&gt;Behind the scenes, the S3 API action that &lt;code&gt;cp&lt;/code&gt; calls &lt;strong&gt;depends&lt;/strong&gt; on what you've asked it to do.&lt;/p&gt;

&lt;p&gt;Here are the most common possibilities;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;If you're copying a file from the local system into S3, it calls the &lt;code&gt;PutObject&lt;/code&gt; &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html"&gt;action&lt;/a&gt; or—if it's a really large file—the &lt;code&gt;CreateMultipartUpload&lt;/code&gt; &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html"&gt;action&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you're copying a file from a bucket to another bucket, it calls the &lt;code&gt;CopyObject&lt;/code&gt; &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html"&gt;action&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you're copying a file from a bucket to the local system, it calls the &lt;code&gt;GetObject&lt;/code&gt; &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html"&gt;action&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
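&lt;p&gt;That decision logic boils down to something like this (a hypothetical sketch of the dispatch, not the real CLI source; in real code each branch would end up calling the matching S3 API via boto3):&lt;/p&gt;

```python
# Sketch of the cp dispatch rules listed above. Hypothetical
# helper, not the actual AWS CLI code.
def s3_api_for_cp(source, destination):
    src_remote = source.startswith("s3://")
    dst_remote = destination.startswith("s3://")
    if src_remote and dst_remote:
        return "CopyObject"    # data never leaves S3
    if dst_remote:
        return "PutObject"     # or CreateMultipartUpload for large files
    return "GetObject"         # download to the local system

print(s3_api_for_cp("s3://BUCKETA/KEY", "s3://BUCKETB/"))
```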

&lt;h2&gt;
  
  
  Our Transfer
&lt;/h2&gt;

&lt;p&gt;In the case of our command;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3 cp s3://SOURCE_BUCKET/KEY s3://DESTINATION_BUCKET/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI translates that to the &lt;code&gt;CopyObject&lt;/code&gt; action, which means the &lt;strong&gt;data never leaves AWS&lt;/strong&gt;. The contents of our file (or key) are copied via the S3 backend from the source bucket to the destination bucket.&lt;/p&gt;

&lt;p&gt;We can verify that by looking at the outbound traffic on the local system. Here's a screenshot of the original file upload.&lt;/p&gt;

&lt;p&gt;I've run the command;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3 cp LOCAL_FILENAME s3://BUCKETA/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This results in an outbound transfer running at ~17 MB/s to AWS. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Uo-SKcVy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/90ztegjjeijgxlzqh6j5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Uo-SKcVy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/90ztegjjeijgxlzqh6j5.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see that not only as reported by the AWS CLI but also by my outbound firewall. The firewall is reporting ~14 MB/s, but the difference is just a matter of how each tool updates its numbers.&lt;/p&gt;

&lt;p&gt;The reported speed also lines up with the amount of time the command was running based on the file size (just over a minute for a 1.1 GB file).&lt;/p&gt;
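&lt;p&gt;A quick sanity check of that math:&lt;/p&gt;

```python
# 1.1 GB uploaded at ~17 MB/s should take about a minute,
# matching the observed run time.
file_size_mb = 1.1 * 1024          # ~1126 MB
speed_mb_per_s = 17
seconds = file_size_mb / speed_mb_per_s
print(f"~{seconds:.0f} seconds")
```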

&lt;p&gt;The key point here is that they are reporting very similar numbers.&lt;/p&gt;

&lt;p&gt;After that upload completes, to copy the file from one bucket to another, I run the command;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3 cp s3://BUCKETA/KEY s3://BUCKETB/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here are the transfer results from my local system for this command; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Jxr4DV-k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/pwjt1jfin1ias7b0pkj6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Jxr4DV-k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/pwjt1jfin1ias7b0pkj6.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The AWS CLI is reporting ~180 MB/s while my local network traffic is only 30 &lt;strong&gt;KB&lt;/strong&gt;/s to AWS. The file transfer takes about 10 seconds to complete.&lt;/p&gt;

&lt;p&gt;This proves that the &lt;code&gt;CopyObject&lt;/code&gt; API is being used to copy the file through the S3 backend.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;The bucket-to-bucket copy feature is a massive time and bandwidth saver when you're working with files in AWS. This works not only between buckets in the same account and region but also across different accounts and regions (when the appropriate permissions are in place).&lt;/p&gt;

&lt;p&gt;The results of this little experiment also highlight a key rule of working with data in Amazon S3;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Keep data inside of Amazon S3 for as long as possible&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You don't want to have to wait on data to be downloaded outside of AWS, nor do you want to have &lt;a href="https://www.lastweekinaws.com/blog/understanding-data-transfer-in-aws/"&gt;to pay for it&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The AWS CLI itself is a really interesting open source project that has a lot of very cool code behind the scenes to make it work. You can &lt;a href="https://github.com/aws/aws-cli"&gt;check that out on GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
    </item>
    <item>
      <title>Applying the Well-Architected Framework, Small Edition</title>
      <dc:creator>Mark Nunnikhoven</dc:creator>
      <pubDate>Sat, 27 Jun 2020 15:08:01 +0000</pubDate>
      <link>https://dev.to/aws-heroes/applying-the-well-architected-framework-small-edition-18fj</link>
      <guid>https://dev.to/aws-heroes/applying-the-well-architected-framework-small-edition-18fj</guid>
      <description>&lt;p&gt;Do you ever tackle a problem and know that you’ve just spent way too much time on it? But you also know that it was worth it? This post (which is also long!) sums up my recent experience with exactly that type of problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  tl;dr
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;AWS Lambda has a storage limit for &lt;code&gt;/tmp&lt;/code&gt; of 512MB&lt;/li&gt;
&lt;li&gt;AWS Lambda functions need to be in a VPC to connect to an Amazon EFS filesystem&lt;/li&gt;
&lt;li&gt;AWS Lambda functions within a VPC &lt;strong&gt;require&lt;/strong&gt; a NAT gateway to access the internet&lt;/li&gt;
&lt;li&gt;Amazon EC2 instances can use cloud-init to run a custom script on boot&lt;/li&gt;
&lt;li&gt;A solution needs to address all five pillars of the AWS Well-Architected Framework in order to be the “best”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read on to find out the whole story…&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;I love learning and want to keep tabs on several key areas. Ironically, tabs themselves &lt;a href="https://twitter.com/marknca/status/1017237825199247361"&gt;are a massive issue for me&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Every morning, I get a couple of tailored emails from a service called &lt;a href="https://mailbrew.com"&gt;Mailbrew&lt;/a&gt;. Each of these emails contains the latest results from Twitter queries, subreddits, and key websites (via RSS).&lt;/p&gt;

&lt;p&gt;The problem is that I want to track a lot of websites and Mailbrew only supports adding sites one-by-one. &lt;/p&gt;

&lt;p&gt;This leads to a problem statement of…&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Combine N feeds into one super feed  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Constraints
&lt;/h2&gt;

&lt;p&gt;Ideally these &lt;em&gt;super feeds&lt;/em&gt; would be published on my website. That site is built with &lt;a href="https://gohugo.io"&gt;Hugo&lt;/a&gt; and deployed to &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteHosting.html"&gt;Amazon S3&lt;/a&gt; with &lt;a href="https://www.cloudflare.com"&gt;CloudFlare&lt;/a&gt; in front. This setup is ideal for me.&lt;/p&gt;

&lt;p&gt;Following the &lt;a href="https://aws.amazon.com/architecture/well-architected/"&gt;AWS Well-Architected Framework&lt;/a&gt;; it’s highly performant, low cost, has minimal operational burden, a strong security posture, and is very reliable. It’s a win across all five pillars. &lt;/p&gt;

&lt;p&gt;Adding the feeds to this setup shouldn’t compromise any of these attributes.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I think it’s important to point out that there are quite a few services out there that combine feeds for you.  &lt;a href="https://rssunify.com/"&gt;RSSUnify&lt;/a&gt; and &lt;a href="http://www.rssmix.com/"&gt;RSSMix&lt;/a&gt; come to mind, but there are many, many others…  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The nice thing about Hugo is that it uses text files to build out your site, including &lt;a href="https://benjamincongdon.me/blog/2020/01/14/Tips-for-Customizing-Hugo-RSS-Feeds/"&gt;custom RSS feeds&lt;/a&gt;.  The solution should write these feed items as unique posts (a/k/a text files) in my Hugo content directory structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  🐍 Python To The Rescue
&lt;/h2&gt;

&lt;p&gt;A quick little python script (&lt;a href="https://gist.github.com/marknca/c863d166cf91d710c247f6af563ca73b"&gt;available here&lt;/a&gt;) and I’ve got a tool that takes a list of feeds and writes them into unique Hugo posts.&lt;/p&gt;
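&lt;p&gt;The core of it looks roughly like this (a hypothetical sketch; the actual script in the gist differs):&lt;/p&gt;

```python
# Sketch of the core idea: each feed item becomes a Hugo post,
# i.e. a text file with front matter in the content directory.
# Hypothetical helper -- the real script linked above differs.
import pathlib

def write_hugo_post(content_dir, slug, title, link, date):
    front_matter = "\n".join([
        "---",
        f'title: "{title}"',
        f"date: {date}",
        f"externalLink: {link}",
        "---",
        "",
    ])
    path = pathlib.Path(content_dir) / "feeds" / f"{slug}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(front_matter)
    return path

p = write_hugo_post("content", "example-item", "Example Post",
                    "https://example.com/post", "2020-06-27")
print(p)
```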

&lt;p&gt;Problem? &lt;strong&gt;Solved.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hmmm…I forgot these feeds need to be kept up to date and my current build pipeline (a &lt;a href="https://github.com/features/actions"&gt;GitHub Action&lt;/a&gt;) doesn’t support running on a schedule. &lt;/p&gt;

&lt;p&gt;Besides, trying to run that code in the action is going to require another event to hook into or it’ll get stuck in an update loop as the new feed items are committed to the repo.&lt;/p&gt;

&lt;p&gt;New problem statement…&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Run a python script on-demand and a set schedule  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This feels like a good problem for &lt;a href="https://markn.ca/2019/road-to-reinvent-what-is-serverless/"&gt;a serverless solution&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS Lambda
&lt;/h2&gt;

&lt;p&gt;I immediately thought of &lt;a href="https://aws.amazon.com/lambda/"&gt;AWS Lambda&lt;/a&gt;. I run a crazy amount of little operational tasks just like this using AWS Lambda functions triggered by a  scheduled &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/WhatIsCloudWatchEvents.html"&gt;Amazon CloudWatch Event&lt;/a&gt;. It’s a strong, simple pattern.&lt;/p&gt;
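&lt;p&gt;Setting that pattern up boils down to one scheduled rule and one target. Here’s a sketch that just builds the request parameters (the names are hypothetical, and the actual &lt;code&gt;events.put_rule&lt;/code&gt;/&lt;code&gt;events.put_targets&lt;/code&gt; calls need real AWS credentials):&lt;/p&gt;

```python
# Sketch of the scheduled-trigger pattern: a rule with a
# schedule expression targeting the Lambda function. This only
# builds the parameters; you'd pass them to boto3's
# events.put_rule(**rule) / events.put_targets(**targets).
# The rule name and ARN below are hypothetical.
def schedule_rule_params(rule_name, schedule, function_arn):
    rule = {"Name": rule_name, "ScheduleExpression": schedule}
    targets = {"Rule": rule_name,
               "Targets": [{"Id": "1", "Arn": function_arn}]}
    return rule, targets

rule, targets = schedule_rule_params(
    "update-feeds-daily",
    "rate(1 day)",
    "arn:aws:lambda:us-east-1:123456789012:function:update-feeds")
print(rule["ScheduleExpression"])
```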

&lt;p&gt;It turns out that getting a Lambda function to access a git repo isn’t very straightforward. I’ll save you the sob story, but here’s how I got it running using a python 3.8 runtime;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;add the git binaries as a Lambda Layer, I used &lt;a href="https://github.com/lambci/git-lambda-layer"&gt;git-lambda-layer&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;use the &lt;a href="https://github.com/gitpython-developers/GitPython"&gt;GitPython&lt;/a&gt; module&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That allows a simple code setup like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;git&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Repo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clone_from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;REPO_URL_WITH_AUTH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="c1"&gt;# do some other work, like collect RSS feeds
&lt;/span&gt;&lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fn&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;untracked_file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File {} hasn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t been added to the repo yet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# You can also use git commands directly-ish
&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;git&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;git&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Updated from python&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes it easy enough to work with a repo. With a little bit of hand wavy magic, I wired a Lambda function up to a scheduled CloudWatch Event and I was done.&lt;/p&gt;

&lt;p&gt;…until I remembered—and by “remembered,” I mean the function threw an exception—about the Lambda &lt;code&gt;/tmp&lt;/code&gt; storage &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html"&gt;limit of 512MB&lt;/a&gt;. The repo for my website is around 800MB and growing.&lt;/p&gt;
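&lt;p&gt;A quick sanity check before shipping a repo to a Lambda function is to total the file sizes on disk. This is a generic sketch (not an AWS API), so the names here are my own:&lt;/p&gt;

```python
import os

LAMBDA_TMP_LIMIT = 512 * 1024 * 1024  # Lambda's /tmp cap: 512MB in bytes

def dir_size(path):
    """Total bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total

def tmp_headroom(path):
    """Bytes of /tmp left over after a clone; negative means it won't fit."""
    return LAMBDA_TMP_LIMIT - dir_size(path)
```

&lt;p&gt;Run against an ~800MB working tree, the headroom comes back negative, which is exactly the failure I hit.&lt;/p&gt;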

&lt;h2&gt;
  
  
  Amazon EFS
&lt;/h2&gt;

&lt;p&gt;Thankfully, AWS just released a new integration between &lt;a href="https://aws.amazon.com/blogs/compute/using-amazon-efs-for-aws-lambda-in-your-serverless-applications/"&gt;Amazon EFS and AWS Lambda&lt;/a&gt;. I followed the relatively simple process to get it set up.&lt;/p&gt;

&lt;p&gt;I hit two big hiccups.&lt;/p&gt;

&lt;p&gt;The first: for a Lambda function to connect to an EFS file system, both need to be “in” the same VPC. This is easy enough to do if you already have a VPC set up and &lt;a href="https://docs.aws.amazon.com/vpc/latest/userguide/vpc-getting-started.html"&gt;even if you don’t&lt;/a&gt;. We’ll come back to this one in a second.&lt;/p&gt;

&lt;p&gt;The second issue was that I initially set the path for the EFS access point to &lt;code&gt;/&lt;/code&gt;. There wasn’t a warning (that I saw) in the official documentation, but an off-handed remark in &lt;a href="https://read.acloud.guru/how-i-used-lambda-and-efs-for-massively-parallel-compute-96575bc85157"&gt;a fantastic post by Peter Sbarski&lt;/a&gt; highlighted this problem.&lt;/p&gt;

&lt;p&gt;That was a simple fix (I went with &lt;code&gt;/data&lt;/code&gt;) but the VPC issue brought up a bigger challenge.&lt;/p&gt;

&lt;p&gt;The simplest of VPCs that will solve this problem is one or two subnets with an internet gateway configured. This structure is free and only incurs &lt;a href="https://www.duckbillgroup.com/blog/understanding-data-transfer-in-aws/"&gt;charges for inbound/outbound data transfer&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Except&lt;/strong&gt; that my Lambda function needs internet access and that requires one more piece.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/"&gt;That piece is a NAT gateway&lt;/a&gt;. No big deal, it’s &lt;a href="https://aws.amazon.com/premiumsupport/knowledge-center/nat-gateway-vpc-private-subnet/"&gt;a one click deploy&lt;/a&gt; and a same routing change. The new problem is cost.&lt;/p&gt;

&lt;p&gt;The need for a NAT gateway makes complete sense. Lambda runs adjacent to your network structure. Routing those functions into your VPC requires an explicit structure. From a security perspective, we don’t want an implicit connection between our private network (VPC) and other random bits of AWS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fz5l0421astnspeouzzyl.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fz5l0421astnspeouzzyl.jpg" alt="Alt Text" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Well-Architected Principles
&lt;/h2&gt;

&lt;p&gt;This is where things really start to go off the rails. As mentioned above, the Well-Architected Framework is built on five pillars:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://d1.awsstatic.com/whitepapers/architecture/AWS-Operational-Excellence-Pillar.pdf"&gt;Operational Excellence&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://d1.awsstatic.com/whitepapers/architecture/AWS-Reliability-Pillar.pdf"&gt;Reliability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://d1.awsstatic.com/whitepapers/architecture/AWS-Security-Pillar.pdf"&gt;Security&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://d1.awsstatic.com/whitepapers/architecture/AWS-Performance-Efficiency-Pillar.pdf"&gt;Performance Efficiency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://d1.awsstatic.com/whitepapers/architecture/AWS-Cost-Optimization-Pillar.pdf"&gt;Cost Optimization&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AWS Lambda + Amazon EFS route continues to perform well in all of the pillars except for one: cost optimization.&lt;/p&gt;

&lt;p&gt;Why? Well I use accounts and VPCs as &lt;a href="https://d0.awsstatic.com/aws-answers/AWS_Multi_Account_Security_Strategy.pdf"&gt;strong security barriers&lt;/a&gt;. So the VPC that this Lambda function and EFS filesystem are running in is only for this solution. The NAT gateway would only be used during the build of my site.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://aws.amazon.com/vpc/pricing/"&gt;cost of a NAT gateway&lt;/a&gt; per month? &lt;strong&gt;$32.94&lt;/strong&gt; + bandwidth consumed.&lt;/p&gt;

&lt;p&gt;That’s not an unreasonable amount of money until you put that in the proper context of the project. The site costs less than $0.10 per month to host. If we add in the AWS Lambda function + EFS filesystem, that skyrockets to &lt;strong&gt;$0.50 per month&lt;/strong&gt; 😉.&lt;/p&gt;

&lt;p&gt;That NAT gateway is looking very unreasonable now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Alternatives
&lt;/h2&gt;

&lt;p&gt;Easy alternatives to AWS Lambda for computation are &lt;a href="https://aws.amazon.com/fargate/"&gt;AWS Fargate&lt;/a&gt; and good old &lt;a href="https://aws.amazon.com/ec2/"&gt;Amazon EC2&lt;/a&gt;. As much as everyone would probably lean towards containers and I’ve heard people say it’s the next logical choice…&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1273575669390327809-115" src="https://platform.twitter.com/embed/Tweet.html?id=1273575669390327809"&gt;
&lt;/iframe&gt;

&lt;/p&gt;

&lt;p&gt;…I went old school and started to explore what a solution in EC2 would look like.&lt;/p&gt;

&lt;p&gt;For an Amazon EC2 instance to access the internet, it only needs to be in a public subnet of a VPC with an internet gateway. No NAT gateway needed. This removes the $32.94 each month but does put us into a more expensive compute range.&lt;/p&gt;

&lt;p&gt;But can we automate this easily? Is this going to be a reliable solution? What about the security aspects?&lt;/p&gt;

&lt;h2&gt;
  
  
  Amazon EC2 Solution
&lt;/h2&gt;

&lt;p&gt;The 🔑 key? Remembering the &lt;em&gt;&lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html"&gt;user-data&lt;/a&gt;&lt;/em&gt; feature of EC2 and that all AWS-managed AMIs support &lt;a href="https://cloudinit.readthedocs.io/en/latest/"&gt;cloud-init&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This provides us with 16KB of space to configure an instance on the fly to accomplish our task. That should be plenty...if you haven't figured it out from my profile pic, I'm approaching the greybeard powers phase of my career 🧙‍♂️.&lt;/p&gt;

&lt;p&gt;A quick bit of troubleshooting (lots of instances firing up and down), and I ended up with this &lt;a href="https://www.gnu.org/software/bash/manual/html_node/index.html#SEC_Contents"&gt;bash&lt;/a&gt; script (yup, bash) to solve the problem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#! /bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;sleep &lt;/span&gt;15
&lt;span class="nb"&gt;sudo &lt;/span&gt;yum &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nb"&gt;install &lt;/span&gt;git
&lt;span class="nb"&gt;sudo &lt;/span&gt;yum &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nb"&gt;install &lt;/span&gt;python3
&lt;span class="nb"&gt;sudo &lt;/span&gt;pip3 &lt;span class="nb"&gt;install &lt;/span&gt;boto3
&lt;span class="nb"&gt;sudo &lt;/span&gt;pip3 &lt;span class="nb"&gt;install &lt;/span&gt;dateparser
&lt;span class="nb"&gt;sudo &lt;/span&gt;pip3 &lt;span class="nb"&gt;install &lt;/span&gt;feedparser

&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /home/ec2-user/get_secret.py &lt;span class="o"&gt;&amp;lt;&amp;lt;-&lt;/span&gt; &lt;span class="no"&gt;PY_FILE&lt;/span&gt;&lt;span class="sh"&gt;
# Standard libraries
import base64
import json
import os
import sys

# 3rd party libraries
import boto3
from botocore.exceptions import ClientError

def get_secret(secret_name, region_name):
    secret = None
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager', region_name=region_name)
    try:
        get_secret_value_response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        print(e)
    else:
        if 'SecretString' in get_secret_value_response:
            secret = get_secret_value_response['SecretString']
        else:
            secret = base64.b64decode(get_secret_value_response['SecretBinary'])

    return json.loads(secret)

if __name__ == '__main__':
    github_token = get_secret(secret_name="GITHUB_TOKEN", region_name="us-east-1")['GITHUB_TOKEN']
    print(github_token)
&lt;/span&gt;&lt;span class="no"&gt;PY_FILE

&lt;/span&gt;git clone https://&lt;span class="se"&gt;\`&lt;/span&gt;python3 get_secret.py&lt;span class="se"&gt;\`&lt;/span&gt;:x-oauth-basic@github.com/USERNAME/REPO /home/ec2-user/website

python3 RUN_MY_FEED_UPDATE_SCRIPT

&lt;span class="nb"&gt;cd&lt;/span&gt; /home/ec2-user/website

&lt;span class="c"&gt;# Build my website&lt;/span&gt;
/home/ec2-user/website/bin/hugo &lt;span class="nt"&gt;-b&lt;/span&gt; https://markn.ca

&lt;span class="c"&gt;# Update the repo&lt;/span&gt;
git add &lt;span class="nb"&gt;.&lt;/span&gt;
git config &lt;span class="nt"&gt;--system&lt;/span&gt; user.email MY_EMAIL
git config &lt;span class="nt"&gt;--system&lt;/span&gt; user.name &lt;span class="s2"&gt;"MY_NAME"&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Updated website via AWS"&lt;/span&gt;
git push

&lt;span class="c"&gt;# Sync to S3&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;sync&lt;/span&gt; /home/ec2-user/website/public s3://markn.ca &lt;span class="nt"&gt;--acl&lt;/span&gt; public-read

&lt;span class="c"&gt;# Handy URL to clear the cache&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; GET &lt;span class="s2"&gt;"https://CACHE_PURGING_URL"&lt;/span&gt;

&lt;span class="c"&gt;# Clean up by terminating the EC2 instance this is running on&lt;/span&gt;
aws ec2 terminate-instances &lt;span class="nt"&gt;--instance-ids&lt;/span&gt; &lt;span class="sb"&gt;`&lt;/span&gt;wget &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="nt"&gt;-O&lt;/span&gt; - http://169.254.169.254/latest/meta-data/instance-id&lt;span class="sb"&gt;`&lt;/span&gt; &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This entire run takes on average 4 minutes. Even at the higher on-demand cost (vs. &lt;a href="https://aws.amazon.com/ec2/spot/pricing/"&gt;spot&lt;/a&gt;), each run costs $0.000346667 on a t3.nano in us-east-1.&lt;/p&gt;

&lt;p&gt;For the month, that’s $0.25376024 (for 732 scheduled runs).&lt;/p&gt;

&lt;p&gt;We’re well 😉 over the AWS Lambda compute price ($0.03/mth) but still below the Lambda + EFS cost ($0.43/mth) and certainly well below the total cost including a NAT gateway. It’s weird, but this is how cloud works.&lt;/p&gt;

&lt;p&gt;Each of these runs is triggered by a CloudWatch Event that calls an AWS Lambda function to create the EC2 instance. That’s one extra step compared to the pure Lambda solution but it’s still reasonable.&lt;/p&gt;
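&lt;p&gt;That launcher function can be small. Here’s a minimal sketch of what it could look like with boto3; the environment variable names and the “Task” tag value are hypothetical placeholders, not the exact ones I use:&lt;/p&gt;

```python
import os

def build_run_params(user_data, ami_id, subnet_id, profile_arn):
    """Assemble RunInstances parameters for a tagged, self-terminating build box."""
    return {
        'ImageId': ami_id,
        'InstanceType': 't3.nano',
        'MinCount': 1,
        'MaxCount': 1,
        'SubnetId': subnet_id,
        'IamInstanceProfile': {'Arn': profile_arn},
        # A shutdown from inside the instance terminates it instead of stopping it
        'InstanceInitiatedShutdownBehavior': 'terminate',
        # The tag the IAM policy's condition checks before allowing TerminateInstances
        'TagSpecifications': [{
            'ResourceType': 'instance',
            'Tags': [{'Key': 'Task', 'Value': 'SITE_BUILD_TASK_VALUE'}],
        }],
        # boto3 base64-encodes user data for RunInstances automatically
        'UserData': user_data,
    }

def lambda_handler(event, context):
    import boto3  # available in the Lambda runtime
    params = build_run_params(
        user_data=os.environ['BUILD_SCRIPT'],
        ami_id=os.environ['AMI_ID'],
        subnet_id=os.environ['SUBNET_ID'],
        profile_arn=os.environ['INSTANCE_PROFILE_ARN'],
    )
    return boto3.client('ec2').run_instances(**params)
```

&lt;p&gt;Setting &lt;code&gt;InstanceInitiatedShutdownBehavior&lt;/code&gt; at launch is what makes the &lt;code&gt;shutdown&lt;/code&gt; safety net described below actually terminate the instance.&lt;/p&gt;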

&lt;h2&gt;
  
  
  Reliability
&lt;/h2&gt;

&lt;p&gt;In practice, this solution has been working well. After 200+ runs, I have experienced zero failures. That’s a solid start. Additionally, the cost of failure is low. If this process fails to run, the site isn’t updated.&lt;/p&gt;

&lt;p&gt;Looking at the overall blast radius, there are really only two issues that need to be considered:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;If the sync to S3 fails and leaves the site in an incomplete state&lt;/li&gt;
&lt;li&gt;If the instance fails to terminate&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The likelihood of a sync failure is very low, but if it does fail, the damage would be limited to one asset on the site. The AWS CLI command copies files over one-by-one if they are newer. If one fails, the command stops. This means that only one asset (page, image, etc.) would be in a damaged state.&lt;/p&gt;

&lt;p&gt;As long as that’s not the main .css file for the site, we should be ok. Even then, clean HTML markup leaves the site still readable.&lt;/p&gt;

&lt;p&gt;The second issue could have more of an impact. &lt;/p&gt;

&lt;p&gt;The hourly cost of the t3.nano instance is $0.0052/hr. Every time this function runs, another instance is created. This means I could have a few dozen of these instances running in a failure event…running up a bill that would quickly top $100/month if left unchecked.&lt;/p&gt;

&lt;p&gt;In order to mitigate this risk, I added one more command to the bash script: &lt;code&gt;shutdown&lt;/code&gt;. I also made sure the API parameter &lt;code&gt;--instance-initiated-shutdown-behavior&lt;/code&gt; is set to &lt;code&gt;terminate&lt;/code&gt; on instance creation. This means the instance calls the EC2 API to terminate itself and, as a backup, shuts itself down, which also triggers termination…just in case.&lt;/p&gt;

&lt;p&gt;Adding &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/monitor_estimated_charges_with_cloudwatch.html"&gt;a billing alert&lt;/a&gt; rounds out the mitigations to significantly reduce the risk.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security
&lt;/h2&gt;

&lt;p&gt;The security aspects of this solution concerned me. AWS Lambda presents a much smaller set of responsibilities for the user. In fact, an instance carries the most responsibility a builder can take on within &lt;a href="https://markn.ca/2019/road-to-reinvent-the-shared-responsibility-model/"&gt;the Shared Responsibility Model&lt;/a&gt;. That’s the opposite of the direction we want to be moving.&lt;/p&gt;

&lt;p&gt;Given that this instance isn’t handling inbound requests, the security group is completely locked down. It only allows outbound traffic, nothing inbound.&lt;/p&gt;

&lt;p&gt;Additionally, using an IAM Role, I’ve only provided the necessary permissions to accomplish the task at hand. This is called the principle of least privilege. It can be a pain to set up sometimes but does wonders to reduce the risk of any solution.&lt;/p&gt;

&lt;p&gt;You may have noticed in the &lt;em&gt;user-data&lt;/em&gt; script above that we’re actually writing a Python script to the instance on boot. This script allows the instance to access &lt;a href="https://aws.amazon.com/secrets-manager/"&gt;AWS Secrets Manager&lt;/a&gt; to get a secret and print its value to stdout.&lt;/p&gt;

&lt;p&gt;I’m using that to store the &lt;a href="https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token"&gt;GitHub Personal Access Token&lt;/a&gt; required to clone and update the private repo that holds my website. This reduces the risk to my GitHub credentials which are the most important piece of data in this entire solution.&lt;/p&gt;

&lt;p&gt;This means that the instance needs the following permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;secretsmanager:GetSecretValue
secretsmanager:DescribeSecret
s3:ListBucket
s3:*Object
ec2:TerminateInstances
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The permissions for &lt;code&gt;secretsmanager&lt;/code&gt; are locked to the specific ARN of the secret for the GitHub token. The &lt;code&gt;s3&lt;/code&gt; permissions are restricted to &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_rw-bucket.html"&gt;read/write&lt;/a&gt; my website bucket.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ec2:TerminateInstances&lt;/code&gt;  was a bit trickier as we don’t know the instance ID ahead of time. You could dynamically assign the permission but that adds needless complexity. Instead, this is a great use case for resource tags as a condition for the permission. If the instance isn’t tagged properly (in this case I use a “Task” key with a value set to a random, constant value), this role can’t terminate it.&lt;/p&gt;
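&lt;p&gt;A condition-keyed statement along these lines does the trick. This is a sketch, not my exact policy: the region, account ID, and tag value are placeholders, and the tag is matched with the &lt;code&gt;ec2:ResourceTag&lt;/code&gt; condition key:&lt;/p&gt;

```json
{
  "Effect": "Allow",
  "Action": "ec2:TerminateInstances",
  "Resource": "arn:aws:ec2:us-east-1:ACCOUNT_ID:instance/*",
  "Condition": {
    "StringEquals": {
      "ec2:ResourceTag/Task": "RANDOM_TASK_VALUE"
    }
  }
}
```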

&lt;p&gt;Similarly, the AWS Lambda function has standard execution rights plus:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;iam:PassRole
ec2:CreateTags
ec2:RunInstances
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cybersecurity is simply making sure that what you build does what you intend…and only what is intended.&lt;/p&gt;

&lt;p&gt;If we run through the possibilities for this solution, there isn’t anything that an attacker could do without already having other rights and access within our account.&lt;/p&gt;

&lt;p&gt;We’ve minimized the risk to a more than acceptable level, even though we’re using EC2. It appears that this solution can only do what is intended.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Did I Learn?
&lt;/h2&gt;

&lt;p&gt;The Well-Architected Framework isn’t just useful for big projects or for a point-in-time review. The principles promoted by the framework apply continuously to any project.&lt;/p&gt;

&lt;p&gt;I thought I had a simple, slam dunk solution using a serverless design. In this case, a pricing challenge required me to change the approach. Sometimes it’s performance, sometimes it’s security, sometimes it’s another aspect.&lt;/p&gt;

&lt;p&gt;Regardless, you need to be evaluating your solution across all five pillars to make sure you’re striking the right balance.&lt;/p&gt;

&lt;p&gt;There’s something about the instance approach that still worries me a bit. I don’t have the same peace of mind that I do when I deploy Lambda but the data is showing this as reliable and meeting all of my needs.&lt;/p&gt;

&lt;p&gt;This setup also leaves room for expansion. Adding additional tasks to the &lt;em&gt;user-data&lt;/em&gt; script is straightforward and would not dramatically shift any of the concerns around the five pillars if done well. The risk here is expanding this into a custom CI/CD pipeline, which is something to avoid.&lt;/p&gt;

&lt;p&gt;“I built my own…”, generally means you’ve taken a wrong turn somewhere along the way. Be concerned when you find yourself echoing those sentiments.&lt;/p&gt;

&lt;p&gt;This is also a reminder that there are a ton of features and functionality within the big three cloud platforms (&lt;a href="https://aws.amazon.com"&gt;AWS&lt;/a&gt;, &lt;a href="https://azure.microsoft.com/en-ca/"&gt;Azure&lt;/a&gt;, &lt;a href="https://cloud.google.com"&gt;Google Cloud&lt;/a&gt;), and that can be a challenge to stay on top of.&lt;/p&gt;

&lt;p&gt;If I hadn’t remembered the cloud-init/user-data combo, I’m not sure I would’ve evaluated EC2 as a possible solution.&lt;/p&gt;

&lt;p&gt;One more reason to keep on learning and sharing!&lt;/p&gt;

&lt;p&gt;Btw, check out the results of this work at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://markn.ca/security-super-feed/"&gt;Security Super Feed&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://markn.ca/devops-super-feed/"&gt;DevOps Super Feed&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://markn.ca/cloud-super-feed/"&gt;Cloud Super Feed&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And if I'm missing a link with a feed that you think should be there, &lt;a href="https://docs.google.com/forms/d/e/1FAIpQLSeASl4P9NEgdtVQeLrJ8sOm0x-x_1SWAsIvTGDYcfNpOW1vTA/viewform"&gt;please let me know&lt;/a&gt;!&lt;/p&gt;

&lt;h2&gt;
  
  
  Total Cost
&lt;/h2&gt;

&lt;p&gt;If you’re interested in the total cost breakdown for the solution at the end of all this, here it is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Per month
===========
24 * 30.5 = 732 scheduled runs
+ N manual runs
===========
750 runs per month

EC2 instance, t3.nano at $0.0052/hour
+ running for 4m on average
===========
(0.0052/60) * 4 = $0.000346667/run

AWS Lambda function, 128MB at $0.0000002083/100ms
+ running for 3500ms on average
============
$0.00000729/run

Per run cost
============
$0.000346667/run EC2
$0.00000729/run Lambda
============
$0.000353957/run

Monthly cost
=============
$0.26546775/mth to run the build 750 times
$0.00 for VPC with 2 public subnets and Internet Gateway
$0.00 for 2x IAM roles and 2x policies
$0.40 for 1x secret in AWS Secrets Manager
$0.00073 for 732 CloudWatch Events (billed eventually)
$0.00 for 750 GB inbound transfer to EC2 (from GitHub)
$0.09 for 2 GB outbound transfer (to GitHub)
=============
$0.75619775/mth

This means it'll take 3.5 years before we've spent the same as one month of NAT Gateway support.

* Everything is listed in us-east-1 pricing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
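&lt;p&gt;The arithmetic in that breakdown can be double-checked in a few lines of Python, with the rates copied straight from above:&lt;/p&gt;

```python
# Re-derive the per-run and monthly figures from the quoted us-east-1 rates
runs = 750                               # 732 scheduled + manual runs
ec2_per_run = round(0.0052 / 60 * 4, 9)  # t3.nano for 4 minutes
lambda_per_run = 0.00000729              # 128MB launcher, ~3500ms
per_run = ec2_per_run + lambda_per_run

fixed = 0.40 + 0.00073 + 0.09            # secret, events, outbound transfer
total = per_run * runs + fixed

print(f'per run: ${per_run:.9f}')        # $0.000353957
print(f'total:   ${total:.8f}/mth')      # about $0.76/mth
```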



</description>
      <category>aws</category>
      <category>serverless</category>
      <category>security</category>
    </item>
  </channel>
</rss>
