<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrii Liashenko</title>
    <description>The latest articles on DEV Community by Andrii Liashenko (@liashenko).</description>
    <link>https://dev.to/liashenko</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F280895%2Fdef53efa-1119-4c6a-b5f8-549593577c1b.jpg</url>
      <title>DEV Community: Andrii Liashenko</title>
      <link>https://dev.to/liashenko</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/liashenko"/>
    <language>en</language>
    <item>
      <title>Fault-tolerance in distributed systems</title>
      <dc:creator>Andrii Liashenko</dc:creator>
      <pubDate>Thu, 30 Jan 2020 15:57:31 +0000</pubDate>
      <link>https://dev.to/liashenko/fault-tolerance-in-distributed-systems-3hdd</link>
      <guid>https://dev.to/liashenko/fault-tolerance-in-distributed-systems-3hdd</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F1zy6ang7kyzhu0agyzug.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F1zy6ang7kyzhu0agyzug.jpg" alt="distributed system" width="800" height="587"&gt;&lt;/a&gt; &lt;br&gt;
&lt;em&gt;A distributed system is a network of computers that communicate with each other by passing messages but act as a single computer to the end user.&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;With distributed power come big challenges, and one of them is the inevitable failures caused by the system's distributed nature.&lt;br&gt;
Network connections fail or degrade, servers crash or respond extremely slowly, software has bugs, etc.&lt;br&gt;&lt;br&gt;
How do you make your system stable and tolerant of failures?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Make your components redundant&lt;/strong&gt;. Avoid single point of failure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handle your Integration Points&lt;/strong&gt; (calls to remote services).&lt;/li&gt;
&lt;li&gt;When it's possible, &lt;strong&gt;respond to requests when failures happen&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test your system&lt;/strong&gt; to discover its behavior under pressure.&lt;/li&gt;
&lt;li&gt;Embrace the chaos to bring order to your system by running &lt;strong&gt;Chaos Engineering&lt;/strong&gt; experiments.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Handle your Integration Points
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Integration points are the number-one killer of systems.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every remote call is a risk to your system's health, and a single failing call can take the whole system down if not handled properly.&lt;br&gt;&lt;br&gt;
Let's review some common patterns to handle remote calls.&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9pafqexuz1e3f5f9rp0o.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9pafqexuz1e3f5f9rp0o.jpg" alt="Something failed" width="800" height="587"&gt;&lt;/a&gt;  &lt;/p&gt;
&lt;h4&gt;
  
  
  Retries
&lt;/h4&gt;

&lt;p&gt;Often simply trying the same request again makes it succeed. That's because many failures are partial or transient.&lt;br&gt;
A partial failure is when only a fraction of requests fail. &lt;br&gt;
A transient failure is when requests fail only for a short period of time.&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fj1gzir3fpe3uw5jlgby2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fj1gzir3fpe3uw5jlgby2.jpg" alt="Retry" width="800" height="387"&gt;&lt;/a&gt;&lt;br&gt;
But it's not always safe to retry. A retry can increase the load on the system being called. Instead of retrying immediately, you can use &lt;strong&gt;exponential backoff&lt;/strong&gt;, where the wait time is increased exponentially after every attempt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;waitTime = min(maxWait, baseInterval * exponentialFactor ** attempt)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When failures are caused by overload, backoff alone doesn't help much: if all the failed calls back off for the same amount of time, they retry in synchronized waves and overload the system again.&lt;br&gt;
The solution is &lt;strong&gt;jitter&lt;/strong&gt;. Jitter adds randomness to the backoff to spread the retries out in time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;waitTime = rand(0, min(maxWait, baseInterval * exponentialFactor ** attempt))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
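
&lt;p&gt;Putting the two formulas together, a retry helper might look like the minimal sketch below. It is illustrative only: the function and parameter names are made up, and &lt;em&gt;remoteCall&lt;/em&gt; stands for whatever function performs the remote request.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Illustrative retry helper with exponential backoff and full jitter (not from any particular library).
async function callWithRetry(remoteCall, maxAttempts, baseInterval, maxWait) {
  let attempt = 0;
  while (true) {
    try {
      return await remoteCall();                // success: return the result
    } catch (error) {
      attempt += 1;
      if (attempt &gt;= maxAttempts) throw error;  // give up after the last attempt
      // full jitter: wait a random time between 0 and the capped exponential backoff
      const backoff = Math.min(maxWait, baseInterval * 2 ** attempt);
      const waitTime = Math.random() * backoff;
      await new Promise(function (resolve) { setTimeout(resolve, waitTime); });
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only retry operations that are safe to repeat (idempotent); otherwise a retry can duplicate the side effect.&lt;/p&gt;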



&lt;h4&gt;
  
  
  Timeouts
&lt;/h4&gt;

&lt;p&gt;When a request takes longer than usual, it increases latency in your system (and may fail eventually anyway).&lt;br&gt;&lt;br&gt;
The caller also holds on to the resources used for that request, and under high load the server can quickly run out of them (memory, threads, connections, etc.).&lt;br&gt;&lt;br&gt;
To avoid this situation, set &lt;strong&gt;connection and request timeouts&lt;/strong&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp5yvopziklyap5m5lbd6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp5yvopziklyap5m5lbd6.jpg" alt="Timeout" width="800" height="370"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  Circuit breakers
&lt;/h4&gt;

&lt;p&gt;When there’s an issue with a dependency, stop calling it!&lt;br&gt;&lt;br&gt;
In the normal “closed” state, the circuit breaker executes requests as usual.&lt;br&gt;
Once the number of failures or the frequency of failures exceeds a threshold, the circuit breaker “opens” the circuit for some time.&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwxb4hfvy3g684ois4wdi.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwxb4hfvy3g684ois4wdi.jpg" alt="Circuit Breaker" width="800" height="320"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  Bulkhead
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;In a ship, a bulkhead is a dividing wall or barrier between compartments.&lt;br&gt;&lt;br&gt;
If the hull of a ship is compromised, only the damaged section fills with water, which prevents the ship from sinking.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Isolate the failure.&lt;br&gt;
Dedicate separate thread pools to different functions (e.g. a separate thread pool for each remote service), so that if one fails, the others continue to function.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp8qhnq821gloiz5g7ssj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp8qhnq821gloiz5g7ssj.jpg" alt="Bulkheads" width="800" height="563"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Respond when failure happens
&lt;/h3&gt;

&lt;p&gt;“Fail fast” is generally a good idea:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no increased latency&lt;/li&gt;
&lt;li&gt;no risk for the whole system to halt&lt;/li&gt;
&lt;li&gt;no invalid system behaviour&lt;/li&gt;
&lt;li&gt;releasing the pressure on underlying systems (i.e. shed load) when they are having issues &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, there are scenarios where your service can provide responses in a “fallback mode” to reduce the impact of failure on users.  &lt;/p&gt;

&lt;p&gt;Some fallback approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cache&lt;/strong&gt;
Save the data that comes from remote services to a local or remote cache and reuse the cached data as a response when the service fails (see the sketch after this list).
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Frynjjwts3tidybmiccf3.jpg" alt="Cache" width="800" height="696"&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue&lt;/strong&gt;
Set up a queue where requests to a remote service are persisted until the dependency is available again.
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F1ngl4rssawmmnjue8ejk.jpg" alt="Queue" width="800" height="584"&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stubbed (default) values&lt;/strong&gt;
Return default values when personalized options can’t be retrieved.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fail silently&lt;/strong&gt;.
Return an empty or null response that can be handled by the caller (e.g. the UI).
If possible, disable the functionality that is failing.&lt;/li&gt;
&lt;/ul&gt;
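
&lt;p&gt;As an illustration of the cache fallback, the sketch below serves the last known value when the remote call fails. The cache and the &lt;em&gt;bookCatalogService&lt;/em&gt; name are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Illustrative cache fallback: prefer fresh data, fall back to the last known value.
const lastKnownGood = new Map();  // in practice this could be a local or remote cache

async function getBookDetails(bookId) {
  try {
    const details = await bookCatalogService.fetch(bookId);  // hypothetical remote service
    lastKnownGood.set(bookId, details);                      // refresh the cache on success
    return details;
  } catch (error) {
    if (lastKnownGood.has(bookId)) {
      return lastKnownGood.get(bookId);  // serve possibly stale data instead of failing
    }
    throw error;  // no fallback available: propagate the failure
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;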

&lt;h3&gt;
  
  
  Hystrix
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/Netflix/Hystrix/" rel="noopener noreferrer"&gt;Hystrix&lt;/a&gt; is a Netflix open-source library that helps you handle Integration Points using the techniques described before: Timeout, Circuit Breaker, Bulkhead, and without effort allows you to provide fallback options.&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F8wyz522eeyca0vhrdusi.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F8wyz522eeyca0vhrdusi.jpg" alt="Hystrix flow" width="800" height="388"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;Embed fault tolerance and latency tolerance in your system by wrapping calls to external services into HystrixCommands:&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fruvibukekkzoenloz2fu.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fruvibukekkzoenloz2fu.jpg" alt="Hystrix" width="800" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Testing
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Load testing and stress testing
&lt;/h4&gt;

&lt;p&gt;Perform load and stress testing to discover how your system behaves under load. It might uncover unexpected issues and failures.&lt;br&gt;&lt;br&gt;
Run the tests for a long period of time to discover how your system behaves under continuous stress.&lt;br&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2qn9waf3plt07misajoz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2qn9waf3plt07misajoz.jpg" alt="Load testing" width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Test for remote service failures
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;no response&lt;/li&gt;
&lt;li&gt;failed response&lt;/li&gt;
&lt;li&gt;slow response&lt;/li&gt;
&lt;/ul&gt;
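
&lt;p&gt;One way to exercise these failure modes in automated tests is to stub the remote service. For example, with the &lt;em&gt;nock&lt;/em&gt; HTTP-mocking library for Node.js (the endpoint below is hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Simulating remote-service failures in tests with nock (hypothetical endpoint).
const nock = require("nock");

// failed response: the dependency returns a 500
nock("https://books-api.example.com").get("/purchases").reply(500);

// slow response: the reply is delayed past the client timeout
nock("https://books-api.example.com").get("/purchases").delay(5000).reply(200, []);

// no response: the connection errors out entirely
nock("https://books-api.example.com").get("/purchases").replyWithError("connection reset");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;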

&lt;h4&gt;
  
  
  Chaos engineering (resilience testing)
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Run Chaos Engineering experiments to understand your system's robustness and discover its weaknesses.  &lt;/p&gt;

&lt;p&gt;&lt;a href="http://principlesofchaos.org/" rel="noopener noreferrer"&gt;Chaos Engineering experiments&lt;/a&gt; follow four steps:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;Start by defining ‘steady state’ as some measurable output of a system that indicates normal behavior.&lt;/li&gt;
&lt;li&gt;Hypothesize that this steady state will continue in both the control group and the experimental group.&lt;/li&gt;
&lt;li&gt;Introduce variables that reflect real world events like servers that crash, hard drives that malfunction, network connections that are severed, etc.&lt;/li&gt;
&lt;li&gt;Try to disprove the hypothesis by looking for a difference in steady state between the control group and the experimental group.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Gremlin
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://www.gremlin.com" rel="noopener noreferrer"&gt;Gremlin&lt;/a&gt; is a chaos engineering platform. Gremlin provides the framework to safely and simply simulate real outages.&lt;br&gt;
Be prepared - Gremlins come:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Resource gremlins: throttle CPU, memory, I/O, and disk&lt;/li&gt;
&lt;li&gt;State gremlins: reboot hosts, kill processes, travel in time&lt;/li&gt;
&lt;li&gt;Network gremlins: introduce latency, blackhole traffic, lose packets, fail DNS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;In distributed systems, failures are unavoidable by nature. Keep that in mind while architecting, implementing and testing your system.&lt;br&gt;
Handle your Integration Points with Retries, Timeouts, Bulkheads and Circuit Breakers. &lt;br&gt;
Minimize the impact of failures on users by responding even when failures happen: leverage caching and queues, return default values, or disable the failing functionality. &lt;br&gt;
Test your system rigorously. Test for remote service failures.&lt;br&gt;
Break your system to make it unbreakable by running chaos engineering experiments.  &lt;/p&gt;

&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Michael T. Nygard. Release It!: Design and Deploy Production-Ready Software 2nd Edition (2018)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://netflixtechblog.com/fault-tolerance-in-a-high-volume-distributed-system-91ab4faae74a" rel="noopener noreferrer"&gt;Article "Fault-tolerance in a high volume distributed system" by Netflix&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://netflixtechblog.com/lessons-netflix-learned-from-the-aws-outage-deefe5fd0c04" rel="noopener noreferrer"&gt;Article "Lessons learned from the AWS outage" by Netflix&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/" rel="noopener noreferrer"&gt;Article "Exponential backoff and jitter" by AWS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Netflix/Hystrix/wiki" rel="noopener noreferrer"&gt;Hystrix&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://principlesofchaos.org/" rel="noopener noreferrer"&gt;Principles of chaos&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.gremlin.com/" rel="noopener noreferrer"&gt;Gremlin&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>architecture</category>
      <category>distributedsystems</category>
      <category>faulttolerance</category>
    </item>
    <item>
      <title>AWS S3 + Athena real-time business analytics</title>
      <dc:creator>Andrii Liashenko</dc:creator>
      <pubDate>Sat, 30 Nov 2019 17:27:18 +0000</pubDate>
      <link>https://dev.to/liashenko/aws-s3-athena-real-time-business-analytics-333b</link>
      <guid>https://dev.to/liashenko/aws-s3-athena-real-time-business-analytics-333b</guid>
      <description>&lt;p&gt;&lt;strong&gt;Overview&lt;/strong&gt;&lt;br&gt;
Business analytics is crucial:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It gives business people a view of the current status of your software.&lt;/li&gt;
&lt;li&gt;It is the key to the data-driven door (oh, what a pun).
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To make business analytics possible we need data, data that represents the right business metrics for the software (KPIs).&lt;br&gt;
Business metrics can be stored in a database, logs, files or a dedicated warehouse.&lt;/p&gt;

&lt;p&gt;In this article I’d like to show you &lt;strong&gt;real-time business analytics in AWS S3 using AWS Athena&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS S3&lt;/strong&gt; is a simple object storage service. &lt;br&gt;
It is highly available (99.9%) and durable (99.999999999%).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS Athena&lt;/strong&gt; is a service to query data (basically files with records) in S3 using SQL.&lt;br&gt;
Athena supports querying CSV, JSON, and Apache Parquet data formats, among others.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It’s serverless! You don't need to set up or maintain hosts/databases.&lt;/li&gt;
&lt;li&gt;You pay per query! $5 per TB of data scanned.&lt;/li&gt;
&lt;li&gt;You can use it with different business intelligence or SQL clients.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How does Athena work?&lt;/strong&gt;&lt;br&gt;
Athena uses &lt;strong&gt;Presto&lt;/strong&gt; under the hood.&lt;br&gt;
&lt;a href="https://aws.amazon.com/big-data/what-is-presto/" rel="noopener noreferrer"&gt;What is Presto?&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9osyjh8afoj9n9ho89d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9osyjh8afoj9n9ho89d.jpg" alt="bookstore" width="800" height="599"&gt;&lt;/a&gt;&lt;br&gt;
Now, getting back to our topic, let's imagine we have an online bookstore and we need to analyze book purchases that are processed by &lt;em&gt;PurchaseService&lt;/em&gt;. &lt;br&gt;
Let's define our purchase metrics metadata:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bookId&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ec437d98-455d-4fec-8dbe-2c2630454bdd&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;title&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Orlando&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Virginia Woolf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;genre&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Fiction&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;userId&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;test-user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;userCountry&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;country&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;userAge&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;purchaseTimestamp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1575121641&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Good enough to support different kinds of business analytics.&lt;/p&gt;

&lt;p&gt;We could publish our purchase metrics directly to S3, but &lt;strong&gt;AWS Kinesis Firehose&lt;/strong&gt; is a better choice, and here is why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Firehose buffers incoming records and delivers them in batches.&lt;/li&gt;
&lt;li&gt;Firehose can convert the records to another data format, for example Apache Parquet, which is much more efficient for Athena querying (1TB of JSON records can shrink to about 130GB, meaning faster and cheaper queries).&lt;/li&gt;
&lt;li&gt;Firehose can compress the data (gzip, snappy, etc.).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's create a Firehose stream in the AWS Console called &lt;em&gt;books-purchase-stream&lt;/em&gt; that delivers data to S3.&lt;br&gt;
&lt;em&gt;PurchaseService&lt;/em&gt; is a NodeJS AWS Lambda function that will publish purchase events (in the purchase metrics format we defined above) to &lt;em&gt;books-purchase-stream&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;AWS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;firehose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;AWS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Firehose&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;firehoseStream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;books-purchase-stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="nx"&gt;exports&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;metricsPublisher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;purchaseRecord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;firehoseRecord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;DeliveryStreamName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;firehoseStream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;Record&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;Data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;purchaseRecord&lt;/span&gt;
       &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="nx"&gt;firehose&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;putRecord&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;firehoseRecord&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;done&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Now that we have our purchase metrics stored in S3, how do we query them?&lt;/strong&gt;&lt;br&gt;
Athena is integrated with the AWS Glue Data Catalog and requires a Glue database and a Glue table for querying. &lt;br&gt;
The &lt;strong&gt;AWS Glue Data Catalog&lt;/strong&gt; is a metadata repository: a Glue table is the data model (schema), and a Glue database contains tables.&lt;/p&gt;

&lt;p&gt;To create a Glue database and a table with our purchase metrics metadata we're going to use a &lt;strong&gt;Glue Crawler&lt;/strong&gt;.&lt;br&gt;
Point the crawler at the data in S3 and it will extract the metadata into the AWS Glue Data Catalog.&lt;/p&gt;
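
&lt;p&gt;For example, the crawler can also be created and started with the AWS SDK; the crawler name, IAM role, database and S3 path below are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Create and start a Glue Crawler over the purchase metrics in S3 (illustrative values).
const AWS = require('aws-sdk');
const glue = new AWS.Glue();

const crawlerParams = {
    Name: 'books-purchase-crawler',
    Role: 'AWSGlueServiceRole-books',  // an IAM role with Glue and S3 access
    DatabaseName: 'books_analytics',
    Targets: { S3Targets: [{ Path: 's3://books-purchase-bucket/' }] }
};

glue.createCrawler(crawlerParams, function (error) {
    if (error) {
        console.log(error, error.stack);
    } else {
        glue.startCrawler({ Name: 'books-purchase-crawler' }, function (startError) {
            if (startError) console.log(startError, startError.stack);
        });
    }
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;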

&lt;p&gt;&lt;strong&gt;The flow we've created so far:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb93o249ywdl1jvssefsl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb93o249ywdl1jvssefsl.png" alt="AWS S3 + Athena real-time business analytics" width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We have our purchase metrics in AWS S3 and their metadata in the AWS Glue Data Catalog, can we query them now?&lt;/strong&gt;&lt;br&gt;
Yes! Let's go to Athena and write a simple query:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8j1gln7cw68zf4wu7cw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8j1gln7cw68zf4wu7cw.png" alt="Athena query" width="800" height="317"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
In the end we have a simple yet powerful serverless real-time business analytics infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/athena/latest/ug/what-is.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/athena/latest/ug/what-is.html&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aws.amazon.com/premiumsupport/knowledge-center/error-json-athena/" rel="noopener noreferrer"&gt;https://aws.amazon.com/premiumsupport/knowledge-center/error-json-athena/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/glue/latest/dg/populate-data-catalog.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/glue/latest/dg/populate-data-catalog.html&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>tutorial</category>
      <category>serverless</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
