<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Amit Kayal</title>
    <description>The latest articles on DEV Community by Amit Kayal (@amitkayal).</description>
    <link>https://dev.to/amitkayal</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F500645%2Fe0c703c3-855c-4fbd-a1c0-b546a60c022e.png</url>
      <title>DEV Community: Amit Kayal</title>
      <link>https://dev.to/amitkayal</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amitkayal"/>
    <language>en</language>
    <item>
      <title>Building a Hybrid AWS Microservices Platform with API Gateway, Lambda, ECS, and Load Balancers</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Mon, 20 Apr 2026 18:41:39 +0000</pubDate>
      <link>https://dev.to/amitkayal/building-a-hybrid-aws-microservices-platform-with-api-gateway-lambda-ecs-and-load-balancers-mnn</link>
      <guid>https://dev.to/amitkayal/building-a-hybrid-aws-microservices-platform-with-api-gateway-lambda-ecs-and-load-balancers-mnn</guid>
      <description>&lt;h1&gt;
  
  
  Building a Hybrid AWS Microservices Platform with API Gateway, Lambda, ECS, and Load Balancers
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;When teams start splitting a large backend into smaller services, the first infrastructure question is usually not "How do we build a microservice?" but "How do we expose many different services safely, consistently, and without creating a networking mess?"&lt;/p&gt;

&lt;p&gt;Our architecture provides a practical answer to that problem using a hybrid AWS design:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API Gateway as the front door&lt;/li&gt;
&lt;li&gt;Lambda for lightweight serverless capabilities and supporting workflows&lt;/li&gt;
&lt;li&gt;ECS Fargate for containerized business services&lt;/li&gt;
&lt;li&gt;Internal load balancers for private service routing&lt;/li&gt;
&lt;li&gt;Terraform for repeatable, staged infrastructure delivery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important architectural idea is separation of concerns. Public access, authentication, routing, container execution, and service discovery are all handled by different layers. That keeps the platform easier to scale and much easier to evolve as the number of services grows.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Pattern
&lt;/h2&gt;

&lt;p&gt;At a high level, the platform follows this flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A client sends an HTTPS request to API Gateway.&lt;/li&gt;
&lt;li&gt;API Gateway applies request-level controls such as API key enforcement, CORS behavior, and route matching.&lt;/li&gt;
&lt;li&gt;The request is sent either to a Lambda-backed endpoint or to a private containerized service.&lt;/li&gt;
&lt;li&gt;For ECS services, traffic goes through a VPC Link into internal load balancing.&lt;/li&gt;
&lt;li&gt;The load balancer forwards the request to the correct ECS service based on path rules.&lt;/li&gt;
&lt;li&gt;ECS Fargate runs one or more healthy tasks for that service and returns the response.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This gives a single API surface to consumers while allowing the backend implementation to vary by use case.&lt;/p&gt;
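&lt;p&gt;The dispatch decision in steps 3 and 4 can be sketched as a simple route table. This is an illustrative model, not the actual gateway configuration; the route paths and integration labels are hypothetical:&lt;/p&gt;

```python
# Illustrative model of how the gateway chooses a backend integration.
# Paths and integration types are placeholders, not the real config.
ROUTES = {
    "/quotes": {"integration": "vpc_link"},    # ECS service behind the internal ALB
    "/pricing": {"integration": "vpc_link"},
    "/onboarding": {"integration": "lambda"},  # Lambda proxy integration
}

def resolve_integration(path):
    """Return the integration type for the longest matching base path."""
    matches = [base for base in ROUTES if path.startswith(base)]
    if not matches:
        return None
    best = max(matches, key=len)  # longest prefix wins, as with nested API resources
    return ROUTES[best]["integration"]
```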

&lt;h2&gt;
  
  
  Why Combine Lambda and ECS?
&lt;/h2&gt;

&lt;p&gt;A platform like this benefits from using both compute models rather than forcing every workload into one.&lt;/p&gt;

&lt;p&gt;Lambda is a strong fit for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;lightweight request handlers&lt;/li&gt;
&lt;li&gt;event-driven tasks&lt;/li&gt;
&lt;li&gt;simple orchestration&lt;/li&gt;
&lt;li&gt;platform support functions&lt;/li&gt;
&lt;li&gt;endpoints that do not need a full container lifecycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ECS Fargate is a better fit for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;long-lived HTTP microservices&lt;/li&gt;
&lt;li&gt;containerized frameworks and dependencies&lt;/li&gt;
&lt;li&gt;services that need more predictable runtime behavior&lt;/li&gt;
&lt;li&gt;APIs that benefit from load balancing, health checks, and horizontal scaling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our architecture supports both. Some APIs are routed to Lambda-based services, while others are routed to ECS services defined through service configuration. That hybrid model is useful in real organizations because not all services have the same runtime needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Three-Stage Infrastructure Model
&lt;/h2&gt;

&lt;p&gt;One of the strongest ideas in our architecture is the staged Terraform layout. Instead of deploying everything together, the infrastructure is split into three layers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 1: Networking
&lt;/h3&gt;

&lt;p&gt;The first stage establishes the network foundation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VPC selection or creation&lt;/li&gt;
&lt;li&gt;public and private subnet discovery or provisioning&lt;/li&gt;
&lt;li&gt;internal Network Load Balancer&lt;/li&gt;
&lt;li&gt;internal Application Load Balancer&lt;/li&gt;
&lt;li&gt;VPC Link for API Gateway&lt;/li&gt;
&lt;li&gt;ECS task security group&lt;/li&gt;
&lt;li&gt;ALB log storage and network observability components&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This stage is intentionally infrastructure-only. No application services are deployed here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 2: Compute
&lt;/h3&gt;

&lt;p&gt;The second stage provisions the actual execution environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ECS cluster on Fargate&lt;/li&gt;
&lt;li&gt;ECR repositories for service images&lt;/li&gt;
&lt;li&gt;target groups per service&lt;/li&gt;
&lt;li&gt;ALB listener and listener rules&lt;/li&gt;
&lt;li&gt;ECS service definitions&lt;/li&gt;
&lt;li&gt;CloudWatch log groups&lt;/li&gt;
&lt;li&gt;Lambda functions used by the platform&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This stage consumes outputs from the networking stage, so the compute layer never hardcodes network assumptions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 3: API Gateways
&lt;/h3&gt;

&lt;p&gt;The third stage exposes services through API Gateway:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a public API for internet-facing consumption&lt;/li&gt;
&lt;li&gt;a private API for VPC-only access&lt;/li&gt;
&lt;li&gt;route creation from service metadata&lt;/li&gt;
&lt;li&gt;VPC Link integrations for containerized services&lt;/li&gt;
&lt;li&gt;Lambda proxy integrations for Lambda-backed services&lt;/li&gt;
&lt;li&gt;API keys, usage plans, and stage configuration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This split is operationally important. Teams can change routing without rebuilding networking, and they can add services without redesigning the entire platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Request Path for ECS Services
&lt;/h2&gt;

&lt;p&gt;For containerized microservices, the implementation follows a private ingress model.&lt;/p&gt;

&lt;p&gt;The path is:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Client -&amp;gt; API Gateway -&amp;gt; VPC Link -&amp;gt; internal NLB -&amp;gt; internal ALB -&amp;gt; ECS service -&amp;gt; ECS task&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;That may look like one hop too many at first, but each layer has a purpose.&lt;/p&gt;

&lt;h3&gt;
  
  
  API Gateway
&lt;/h3&gt;

&lt;p&gt;API Gateway is the public control plane. It handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TLS termination at the edge&lt;/li&gt;
&lt;li&gt;route exposure&lt;/li&gt;
&lt;li&gt;API key enforcement&lt;/li&gt;
&lt;li&gt;request and header mapping&lt;/li&gt;
&lt;li&gt;CORS handling&lt;/li&gt;
&lt;li&gt;stage-based deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It gives consumers a stable API contract while keeping the backend private.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why a VPC Link Is Used
&lt;/h3&gt;

&lt;p&gt;ECS services are not exposed directly to the internet. Instead, API Gateway connects privately into the VPC using a VPC Link. That allows the public API layer to reach internal services without making the services themselves public.&lt;/p&gt;

&lt;p&gt;This is a strong security pattern because the application runtime stays inside the VPC, but consumers still get a clean managed API endpoint.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why the Repository Uses Both NLB and ALB
&lt;/h3&gt;

&lt;p&gt;A useful implementation detail in our architecture is that the VPC Link targets an internal Network Load Balancer, and that NLB forwards to an internal Application Load Balancer.&lt;/p&gt;

&lt;p&gt;This arrangement provides two separate benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The NLB is used as the stable target for the API Gateway VPC Link.&lt;/li&gt;
&lt;li&gt;The ALB performs path-based routing to the actual microservices.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ALB is what makes many ECS services practical behind one internal entry point. Each service gets its own listener rule and target group, so the platform can route based on URL path rather than provisioning a separate load balancer per service.&lt;/p&gt;
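&lt;p&gt;The listener-rule mechanics can be modeled in a few lines: rules are evaluated in priority order, and the first matching path pattern selects a target group. The service names, patterns, and priorities below are hypothetical examples, not the repository's actual values:&lt;/p&gt;

```python
import fnmatch

# Minimal model of ALB path-based routing: rules evaluated by ascending
# priority, first pattern match wins. Entries here are illustrative.
LISTENER_RULES = [
    {"priority": 10, "pattern": "/quoting/*", "target_group": "tg-quoting"},
    {"priority": 20, "pattern": "/pricing/*", "target_group": "tg-pricing"},
]

def route(path):
    """Return the target group for the first matching rule, else the default action."""
    for rule in sorted(LISTENER_RULES, key=lambda r: r["priority"]):
        if fnmatch.fnmatch(path, rule["pattern"]):
            return rule["target_group"]
    return "default-404"
```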

&lt;h2&gt;
  
  
  How Load Balancing Works
&lt;/h2&gt;

&lt;p&gt;The load-balancing model is service-oriented.&lt;/p&gt;

&lt;p&gt;Each ECS microservice contributes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a base API path&lt;/li&gt;
&lt;li&gt;an ALB path pattern&lt;/li&gt;
&lt;li&gt;a listener rule priority&lt;/li&gt;
&lt;li&gt;a container port&lt;/li&gt;
&lt;li&gt;a health check definition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From that metadata, Terraform creates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one target group per service&lt;/li&gt;
&lt;li&gt;one listener rule per service&lt;/li&gt;
&lt;li&gt;one ECS service definition per service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means the routing layer is not manually duplicated for every new microservice. The service declares its path and runtime settings, and the platform generates the infrastructure around it.&lt;/p&gt;
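&lt;p&gt;The "metadata in, infrastructure out" fan-out can be sketched as follows. This mirrors what a Terraform &lt;code&gt;for_each&lt;/code&gt; over a service map might produce; the resource names and naming scheme are assumptions for illustration:&lt;/p&gt;

```python
# One registry entry fans out to a target group, a listener rule, and an
# ECS service. Service metadata and naming are hypothetical.
SERVICES = {
    "quoting": {"port": 8080, "path_pattern": "/quoting/*", "priority": 10},
    "pricing": {"port": 8081, "path_pattern": "/pricing/*", "priority": 20},
}

def plan(env):
    """Expand the service registry into per-service infrastructure resources."""
    resources = []
    for name, meta in sorted(SERVICES.items()):
        resources.append(("aws_lb_target_group", f"{env}-{name}-tg", meta["port"]))
        resources.append(("aws_lb_listener_rule", f"{env}-{name}-rule", meta["priority"]))
        resources.append(("aws_ecs_service", f"{env}-{name}-svc", meta["port"]))
    return resources
```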

&lt;h3&gt;
  
  
  Target Groups
&lt;/h3&gt;

&lt;p&gt;Each target group points to ECS tasks using IP targets. That is the correct choice for Fargate because tasks run with their own elastic network interfaces rather than on shared EC2 hosts.&lt;/p&gt;

&lt;p&gt;The target groups in this repository also use application-level health checks. A task is considered healthy only when its service endpoint responds successfully on the configured health path.&lt;/p&gt;

&lt;p&gt;That matters because container startup is not the same as application readiness. A service may be running from ECS's perspective but still not ready to receive traffic.&lt;/p&gt;
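&lt;p&gt;The distinction between "running" and "ready" is visible in how target group health checks admit a task only after consecutive successful responses. The sketch below models that behavior; the threshold is an example value, not the repository's configured one:&lt;/p&gt;

```python
# Model of target-group admission: a task is healthy only after a streak of
# consecutive successful health-check responses. Threshold is illustrative.
def target_state(probe_results, healthy_threshold=3):
    """probe_results: HTTP status codes from the health path, oldest first."""
    streak = 0
    for code in probe_results:
        streak = streak + 1 if code == 200 else 0
    return "healthy" if streak >= healthy_threshold else "unhealthy"
```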

&lt;h3&gt;
  
  
  Listener Rules
&lt;/h3&gt;

&lt;p&gt;The ALB listener is configured once, and each service gets a path-based rule. For example, a service under a quoting path can be matched independently from a service under a product-pricing path.&lt;/p&gt;

&lt;p&gt;This keeps the routing layer centralized and avoids deploying a dedicated ALB per service, which would become expensive and operationally noisy as the platform grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Health Checks and Traffic Protection
&lt;/h3&gt;

&lt;p&gt;The repository uses health checks in multiple places:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API health endpoints at the application level&lt;/li&gt;
&lt;li&gt;ALB target group health checks&lt;/li&gt;
&lt;li&gt;ECS service health grace periods&lt;/li&gt;
&lt;li&gt;container health checks inside the task definition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That layered approach improves resilience:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;unhealthy tasks are removed from target groups&lt;/li&gt;
&lt;li&gt;ECS replaces failed tasks&lt;/li&gt;
&lt;li&gt;API Gateway continues to route through the same private entry point&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a platform that can recover from task-level failures without changing the public API contract.&lt;/p&gt;

&lt;h2&gt;
  
  
  How ECS Is Structured
&lt;/h2&gt;

&lt;p&gt;The ECS side of the platform is built for repeatability rather than one-off service definitions.&lt;/p&gt;

&lt;h3&gt;
  
  
  ECS Cluster
&lt;/h3&gt;

&lt;p&gt;The platform provisions a shared ECS cluster per environment. That allows multiple microservices to run within the same operational boundary while still being isolated at the task and service level.&lt;/p&gt;

&lt;p&gt;The cluster uses Fargate, which removes the need to manage EC2 worker nodes. This simplifies operations significantly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no patching of container hosts&lt;/li&gt;
&lt;li&gt;no cluster capacity management at the instance level&lt;/li&gt;
&lt;li&gt;easier scaling by task count&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Reusable ECS Service Module
&lt;/h3&gt;

&lt;p&gt;Instead of defining each ECS service from scratch, the repository uses a reusable Terraform module for service deployment.&lt;/p&gt;

&lt;p&gt;That module is responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;task definition creation&lt;/li&gt;
&lt;li&gt;container logging configuration&lt;/li&gt;
&lt;li&gt;IAM role wiring&lt;/li&gt;
&lt;li&gt;ECS service creation&lt;/li&gt;
&lt;li&gt;target group attachment&lt;/li&gt;
&lt;li&gt;subnet and security group placement&lt;/li&gt;
&lt;li&gt;optional capacity provider strategy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a strong platform choice. It makes service onboarding consistent and reduces drift between services.&lt;/p&gt;

&lt;h3&gt;
  
  
  Task Definitions
&lt;/h3&gt;

&lt;p&gt;Each service runs as a Fargate task with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a named container image from ECR&lt;/li&gt;
&lt;li&gt;CPU and memory settings&lt;/li&gt;
&lt;li&gt;environment variables&lt;/li&gt;
&lt;li&gt;a health check command&lt;/li&gt;
&lt;li&gt;CloudWatch logging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repository also includes support for an additional X-Ray sidecar container in the task definition pattern, which is useful for distributed tracing in a microservice environment.&lt;/p&gt;
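&lt;p&gt;Putting those pieces together, a Fargate task definition for this pattern has roughly the shape below. This is a hedged sketch: the image URI, sizes, ports, and health command are placeholders, and the X-Ray daemon sidecar is shown as the optional addition described above:&lt;/p&gt;

```python
# Sketch of the Fargate task definition shape. All concrete values
# (image, port, sizes, health command) are illustrative placeholders.
def task_definition(service, image_uri, cpu="256", memory="512", with_xray=False):
    containers = [{
        "name": service,
        "image": image_uri,
        "essential": True,
        "environment": [{"name": "SERVICE_NAME", "value": service}],
        "healthCheck": {
            "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        },
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {"awslogs-group": f"/ecs/{service}"},
        },
    }]
    if with_xray:
        # Optional tracing sidecar alongside the application container.
        containers.append({"name": "xray-daemon",
                           "image": "amazon/aws-xray-daemon",
                           "essential": False})
    return {"family": service,
            "cpu": cpu,
            "memory": memory,
            "networkMode": "awsvpc",
            "requiresCompatibilities": ["FARGATE"],
            "containerDefinitions": containers}
```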

&lt;h3&gt;
  
  
  Network Mode
&lt;/h3&gt;

&lt;p&gt;Tasks run with &lt;code&gt;awsvpc&lt;/code&gt; networking, which gives each task its own network interface and private IP. This is the standard model for ECS on Fargate and is what allows ALB target groups to use IP mode cleanly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Subnet and Security Group Design
&lt;/h2&gt;

&lt;p&gt;This repository supports both existing/default VPC usage and a more segmented custom VPC model.&lt;/p&gt;

&lt;p&gt;That flexibility matters because many teams start in a default-VPC or dev-friendly setup and later move to stricter network isolation for staging and production.&lt;/p&gt;

&lt;h3&gt;
  
  
  Subnet Placement
&lt;/h3&gt;

&lt;p&gt;The network layer discovers public and private subnets where available. In a custom VPC, the design supports proper private subnet deployment. In a simpler default VPC setup, the platform can fall back to available public subnets when private ones are not present.&lt;/p&gt;

&lt;p&gt;This is an important operational nuance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;development environments often optimize for simplicity&lt;/li&gt;
&lt;li&gt;higher environments usually optimize for stricter isolation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repository is built to handle both.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security Groups
&lt;/h3&gt;

&lt;p&gt;The security model follows least-privilege intent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ECS tasks accept application traffic from the internal load-balancing layer&lt;/li&gt;
&lt;li&gt;services are not directly internet-facing&lt;/li&gt;
&lt;li&gt;API Gateway reaches backend services through private network integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This keeps the application tier out of direct public exposure while still allowing a public API facade.&lt;/p&gt;

&lt;h2&gt;
  
  
  Config-Driven Service Onboarding
&lt;/h2&gt;

&lt;p&gt;One of the most scalable ideas in our architecture is that services are registered through configuration rather than by handcrafting infrastructure every time.&lt;/p&gt;

&lt;p&gt;There is a master service registry that lists enabled services per environment, and each service provides its own deployment metadata, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;service identity&lt;/li&gt;
&lt;li&gt;container port&lt;/li&gt;
&lt;li&gt;desired task count&lt;/li&gt;
&lt;li&gt;CPU and memory&lt;/li&gt;
&lt;li&gt;API base path&lt;/li&gt;
&lt;li&gt;ALB path pattern&lt;/li&gt;
&lt;li&gt;listener priority&lt;/li&gt;
&lt;li&gt;health check behavior&lt;/li&gt;
&lt;li&gt;logging retention&lt;/li&gt;
&lt;li&gt;autoscaling preferences&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates a platform model rather than a collection of unrelated microservices.&lt;/p&gt;
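&lt;p&gt;A registry like this also lends itself to validation before Terraform ever runs. The sketch below checks a registry entry for required fields; the field names echo the metadata list above, but the exact required set is an assumption:&lt;/p&gt;

```python
# Validate a service-registry entry before applying infrastructure.
# The required field set is an illustrative assumption.
REQUIRED = {"name", "container_port", "desired_count", "api_base_path",
            "alb_path_pattern", "listener_priority"}

def validate(entry):
    """Return a sorted list of missing required fields (empty means valid)."""
    missing = REQUIRED - entry.keys()
    return sorted(missing)
```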

&lt;p&gt;Adding a new service becomes a repeatable process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the service.&lt;/li&gt;
&lt;li&gt;Define its configuration.&lt;/li&gt;
&lt;li&gt;Register it in the service catalog.&lt;/li&gt;
&lt;li&gt;Build and publish the image.&lt;/li&gt;
&lt;li&gt;Apply Terraform stages.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is much easier to maintain than cloning infrastructure blocks over and over.&lt;/p&gt;

&lt;h2&gt;
  
  
  Container Delivery with ECR
&lt;/h2&gt;

&lt;p&gt;For ECS workloads, the container supply chain is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build the service image.&lt;/li&gt;
&lt;li&gt;Push it to an ECR repository.&lt;/li&gt;
&lt;li&gt;Reference the tagged image in the ECS task definition.&lt;/li&gt;
&lt;li&gt;Update the ECS service to roll out the new task definition.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Our platform provisions one ECR repository per service, with image scanning enabled. That is a good baseline for a microservices platform because it keeps artifacts separated by service while still following a common naming convention.&lt;/p&gt;
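&lt;p&gt;With one repository per service, the image reference in the task definition follows the standard ECR URI convention. A small helper makes the naming scheme explicit (the account and region values below are placeholders):&lt;/p&gt;

```python
# Build the ECR image URI referenced by a task definition.
# Follows the standard ECR URI format; inputs are placeholders.
def image_uri(account, region, service, tag):
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{service}:{tag}"
```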

&lt;p&gt;There is also an explicit deployment phase between infrastructure provisioning and API exposure where container images are built and pushed. That is a practical real-world step many diagrams omit, but it is essential because ECS cannot run a service until the image exists in the registry.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Lambda Fits into the Platform
&lt;/h2&gt;

&lt;p&gt;Lambda is used here as a first-class platform option, not as an afterthought.&lt;/p&gt;

&lt;p&gt;There are two useful Lambda patterns in our architecture.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Lambda as an API Backend
&lt;/h3&gt;

&lt;p&gt;Some services can be exposed through API Gateway using Lambda proxy integration. This is ideal for capabilities that are naturally event-driven, lightweight, or operationally simpler as functions than as always-on containers.&lt;/p&gt;

&lt;p&gt;In this model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API Gateway owns the route&lt;/li&gt;
&lt;li&gt;Lambda executes the business logic&lt;/li&gt;
&lt;li&gt;API Gateway returns the Lambda response directly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This avoids unnecessary load-balancer and container overhead for smaller workloads.&lt;/p&gt;
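&lt;p&gt;A minimal Lambda proxy-integration handler looks like the sketch below. API Gateway passes the raw request as the event and expects a response with &lt;code&gt;statusCode&lt;/code&gt;, &lt;code&gt;headers&lt;/code&gt;, and &lt;code&gt;body&lt;/code&gt;; the business logic here is a placeholder:&lt;/p&gt;

```python
import json

# Minimal Lambda proxy-integration handler. API Gateway returns this
# response to the client directly. Business logic is a placeholder.
def handler(event, context):
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```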

&lt;h3&gt;
  
  
  2. Lambda as a Platform Support Function
&lt;/h3&gt;

&lt;p&gt;Our architecture also provisions Lambda functions that support the overall platform, such as authentication-related or onboarding-related workflows.&lt;/p&gt;

&lt;p&gt;This is a smart use of Lambda in a hybrid platform because not every supporting concern needs to run inside ECS.&lt;/p&gt;

&lt;h2&gt;
  
  
  Authentication and API Protection
&lt;/h2&gt;

&lt;p&gt;Our architecture clearly treats API protection as an API Gateway concern.&lt;/p&gt;

&lt;p&gt;The current public API implementation enforces API key usage through API Gateway methods, API keys, and usage plans. The codebase also provisions a supporting API key validation Lambda function and related permissions, which shows the platform is designed to accommodate Lambda-based validation flows where needed.&lt;/p&gt;

&lt;p&gt;The important architectural takeaway is this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;keep authentication and traffic governance at the gateway layer&lt;/li&gt;
&lt;li&gt;keep service containers focused on business logic&lt;/li&gt;
&lt;li&gt;keep private workloads private&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That separation keeps the platform easier to secure and easier to reason about.&lt;/p&gt;

&lt;h2&gt;
  
  
  Public and Private API Models
&lt;/h2&gt;

&lt;p&gt;Another strength of our architecture is that it supports both public and private APIs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Public API
&lt;/h3&gt;

&lt;p&gt;The public API is intended for internet-facing access. It handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;external client access&lt;/li&gt;
&lt;li&gt;API keys and usage plans&lt;/li&gt;
&lt;li&gt;CORS behavior&lt;/li&gt;
&lt;li&gt;Lambda and ECS route exposure&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Private API
&lt;/h3&gt;

&lt;p&gt;The private API is intended for internal or VPC-scoped access. It is useful when services should only be reachable from trusted network boundaries such as internal AWS workloads, integration environments, or enterprise connectivity paths.&lt;/p&gt;

&lt;p&gt;This split is helpful when some capabilities should be public and others should remain internal even though they share the same service platform underneath.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observability and Operations
&lt;/h2&gt;

&lt;p&gt;A microservices platform is only as good as its operational visibility.&lt;/p&gt;

&lt;p&gt;Our architecture includes observability at several levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CloudWatch log groups for ECS services&lt;/li&gt;
&lt;li&gt;CloudWatch logs for Lambda functions&lt;/li&gt;
&lt;li&gt;API Gateway stage logging&lt;/li&gt;
&lt;li&gt;ALB logging support&lt;/li&gt;
&lt;li&gt;VPC flow logging&lt;/li&gt;
&lt;li&gt;X-Ray-friendly task patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That combination helps answer the most common production questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Did the request reach the gateway?&lt;/li&gt;
&lt;li&gt;Was it routed to the right backend?&lt;/li&gt;
&lt;li&gt;Was the target healthy?&lt;/li&gt;
&lt;li&gt;Did the service fail or time out?&lt;/li&gt;
&lt;li&gt;Was the problem in networking, routing, or application logic?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without that layered visibility, hybrid platforms become difficult to troubleshoot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scaling Characteristics
&lt;/h2&gt;

&lt;p&gt;This architecture scales well because each layer can evolve somewhat independently.&lt;/p&gt;

&lt;h3&gt;
  
  
  API Layer Scaling
&lt;/h3&gt;

&lt;p&gt;API Gateway absorbs public traffic without requiring the backend to manage edge-facing concerns directly.&lt;/p&gt;

&lt;h3&gt;
  
  
  ECS Scaling
&lt;/h3&gt;

&lt;p&gt;ECS services scale by task count. Each service can define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;desired count&lt;/li&gt;
&lt;li&gt;minimum and maximum capacity&lt;/li&gt;
&lt;li&gt;CPU and memory sizing&lt;/li&gt;
&lt;li&gt;autoscaling thresholds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means heavily used services can scale out without affecting lighter services.&lt;/p&gt;
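&lt;p&gt;Per-service scaling can be sketched as a rough target-tracking step: desired capacity scales in proportion to how far the observed metric sits from its target, clamped to the service's own bounds. The target and bounds below are illustrative defaults, not the repository's values:&lt;/p&gt;

```python
import math

# Rough target-tracking step: scale desired count toward the CPU target,
# clamped to per-service min/max bounds. Values are illustrative.
def scale_decision(current_tasks, cpu_utilization, target=60, min_tasks=2, max_tasks=10):
    desired = math.ceil(current_tasks * cpu_utilization / target)
    return max(min_tasks, min(max_tasks, desired))
```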

&lt;h3&gt;
  
  
  Platform Growth
&lt;/h3&gt;

&lt;p&gt;As more services are added, the platform does not need a new ingress pattern each time. The same path-based routing model continues to work as long as route definitions and listener priorities stay clean.&lt;/p&gt;
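&lt;p&gt;Keeping listener priorities "clean" usually means allocating them mechanically instead of hand-picking numbers. A sketch of such an allocator, assuming a fixed step and the ALB's priority ceiling:&lt;/p&gt;

```python
# Assign the next free listener-rule priority in a fixed band rather than
# hand-picking numbers. Step size is an assumption; 50000 is the ALB
# rule-priority maximum.
def next_priority(taken, start=10, step=10, limit=50000):
    used = set(taken)
    p = start
    while p in used:
        p += step
        if p > limit:
            raise ValueError("priority band exhausted")
    return p
```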

&lt;h2&gt;
  
  
  Alignment with AWS Well-Architected Best Practices
&lt;/h2&gt;

&lt;p&gt;This architecture also aligns well with AWS best-practice design principles, especially the AWS Well-Architected mindset.&lt;/p&gt;

&lt;h3&gt;
  
  
  Operational Excellence
&lt;/h3&gt;

&lt;p&gt;We have structured the platform so that it is operated as a system rather than as a collection of one-off deployments.&lt;/p&gt;

&lt;p&gt;This is reflected in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;staged Terraform deployments for clearer ownership and safer changes&lt;/li&gt;
&lt;li&gt;configuration-driven service onboarding&lt;/li&gt;
&lt;li&gt;consistent ECS service patterns through reusable modules&lt;/li&gt;
&lt;li&gt;standardized logging and deployment workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduces manual drift and makes operational changes more repeatable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;p&gt;Security is addressed through layered controls rather than a single protection point.&lt;/p&gt;

&lt;p&gt;We have adhered to good AWS security practices by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;placing ECS services behind private networking rather than exposing them directly&lt;/li&gt;
&lt;li&gt;using API Gateway as the controlled ingress layer&lt;/li&gt;
&lt;li&gt;applying API-level protection at the gateway&lt;/li&gt;
&lt;li&gt;using security groups to limit east-west traffic&lt;/li&gt;
&lt;li&gt;supporting encrypted log and storage patterns&lt;/li&gt;
&lt;li&gt;separating public access from internal service routing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This follows the AWS principle of strong boundaries, least privilege, and defense in depth.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reliability
&lt;/h3&gt;

&lt;p&gt;Reliability comes from designing for failure at the service and routing layers.&lt;/p&gt;

&lt;p&gt;We have incorporated that through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-AZ subnet placement&lt;/li&gt;
&lt;li&gt;load balancer health checks&lt;/li&gt;
&lt;li&gt;ECS task replacement behavior&lt;/li&gt;
&lt;li&gt;target group isolation per service&lt;/li&gt;
&lt;li&gt;decoupled gateway and backend layers&lt;/li&gt;
&lt;li&gt;staged infrastructure dependencies with clear outputs between layers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means a failing task or unhealthy target does not require the API surface itself to change.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance Efficiency
&lt;/h3&gt;

&lt;p&gt;The architecture chooses the right compute model for the right workload.&lt;/p&gt;

&lt;p&gt;That is an AWS best practice because it avoids treating all traffic the same.&lt;/p&gt;

&lt;p&gt;Examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lambda for lighter, event-oriented, or supporting workflows&lt;/li&gt;
&lt;li&gt;ECS Fargate for containerized services that need steady HTTP handling&lt;/li&gt;
&lt;li&gt;ALB path-based routing for efficient multi-service consolidation&lt;/li&gt;
&lt;li&gt;service-specific CPU, memory, and scaling settings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This lets us tune services independently instead of overprovisioning everything at the platform level.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost Optimization
&lt;/h3&gt;

&lt;p&gt;Cost optimization is also visible in the design choices.&lt;/p&gt;

&lt;p&gt;We are not multiplying infrastructure unnecessarily. Instead, the architecture encourages shared but controlled platform components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one API layer for many services&lt;/li&gt;
&lt;li&gt;one internal routing layer for many ECS workloads&lt;/li&gt;
&lt;li&gt;shared ECS cluster patterns per environment&lt;/li&gt;
&lt;li&gt;service-level scaling instead of blanket scaling&lt;/li&gt;
&lt;li&gt;support for Fargate and optional capacity-provider strategies where appropriate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is much closer to AWS best practice than provisioning separate ingress and compute stacks for every small service.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sustainability and Maintainability
&lt;/h3&gt;

&lt;p&gt;Even when sustainability is not called out directly, maintainable designs usually consume fewer engineering and infrastructure resources over time.&lt;/p&gt;

&lt;p&gt;The architecture helps here by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reducing duplicated infrastructure definitions&lt;/li&gt;
&lt;li&gt;making service onboarding metadata-driven&lt;/li&gt;
&lt;li&gt;encouraging reuse of shared platform components&lt;/li&gt;
&lt;li&gt;keeping the public contract stable while backend services evolve&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That leads to lower long-term complexity, which is a practical form of architectural efficiency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Pattern Works Well
&lt;/h2&gt;

&lt;p&gt;This AWS pattern is effective because it balances standardization with flexibility.&lt;/p&gt;

&lt;p&gt;It standardizes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deployment stages&lt;/li&gt;
&lt;li&gt;ingress architecture&lt;/li&gt;
&lt;li&gt;service registration&lt;/li&gt;
&lt;li&gt;load-balancer behavior&lt;/li&gt;
&lt;li&gt;logging and health checks&lt;/li&gt;
&lt;li&gt;ECS service creation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It stays flexible by allowing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lambda-backed endpoints&lt;/li&gt;
&lt;li&gt;ECS-backed endpoints&lt;/li&gt;
&lt;li&gt;public and private APIs&lt;/li&gt;
&lt;li&gt;different service-level scaling and runtime settings&lt;/li&gt;
&lt;li&gt;multiple environments with different networking strategies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is exactly what a growing microservices platform needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Implementation Advice
&lt;/h2&gt;

&lt;p&gt;If you want to implement a similar architecture, a good sequence is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build the networking foundation first.&lt;/li&gt;
&lt;li&gt;Keep all service backends private.&lt;/li&gt;
&lt;li&gt;Put API Gateway in front of everything external.&lt;/li&gt;
&lt;li&gt;Use ECS Fargate for containerized APIs that benefit from long-lived service behavior.&lt;/li&gt;
&lt;li&gt;Use Lambda for support functions and lightweight endpoints.&lt;/li&gt;
&lt;li&gt;Register services through metadata, not repetitive infrastructure definitions.&lt;/li&gt;
&lt;li&gt;Use path-based ALB routing so many services can share one internal ingress layer.&lt;/li&gt;
&lt;li&gt;Add strong health checks and centralized logs before traffic grows.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key is not just choosing AWS services, but assigning each AWS service a clear responsibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Our architecture demonstrates a mature way to implement Lambda and ECS-based microservices through API Gateway without exposing backend services directly.&lt;/p&gt;

&lt;p&gt;The architecture uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;staged Terraform for separation of concerns&lt;/li&gt;
&lt;li&gt;API Gateway as the public and private API facade&lt;/li&gt;
&lt;li&gt;Lambda where serverless execution makes sense&lt;/li&gt;
&lt;li&gt;ECS Fargate for containerized microservices&lt;/li&gt;
&lt;li&gt;NLB and ALB together for private, path-aware routing&lt;/li&gt;
&lt;li&gt;config-driven onboarding for scale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For teams building an enterprise microservices platform, this is a strong pattern because it supports security, operational clarity, and service growth without forcing every workload into the same runtime model.&lt;/p&gt;

&lt;p&gt;Most importantly, it turns infrastructure into a reusable platform. Once that platform is in place, adding the next service becomes much easier than adding the first one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lessons Learned
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Keeping API Gateway as the front door and backend services private makes the architecture easier to secure and easier to evolve.&lt;/li&gt;
&lt;li&gt;Using both Lambda and ECS is more practical than forcing every use case into a single compute model.&lt;/li&gt;
&lt;li&gt;Path-based routing through shared internal load balancing scales better than creating isolated ingress infrastructure for every service.&lt;/li&gt;
&lt;li&gt;Service onboarding becomes significantly easier when routing, health checks, scaling, and runtime settings are driven by configuration.&lt;/li&gt;
&lt;li&gt;Health checks, logging, and observability need to be designed from the beginning; adding them later is much harder in a distributed system.&lt;/li&gt;
&lt;li&gt;A staged infrastructure model reduces operational risk because networking, compute, and API exposure can be changed independently.&lt;/li&gt;
&lt;li&gt;Standardizing platform patterns early saves substantial effort as the number of microservices grows.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>architecture</category>
      <category>aws</category>
      <category>lambda</category>
      <category>apigateway</category>
    </item>
    <item>
      <title>Building a Practical Lambda Capacity Provider Platform: Lessons Learned from Warm Pools, Version Hygiene, and CI/CD Reality</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Mon, 20 Apr 2026 18:25:06 +0000</pubDate>
      <link>https://dev.to/amitkayal/building-a-practical-lambda-capacity-provider-platform-lessons-learned-from-warm-pools-version-1l7j</link>
      <guid>https://dev.to/amitkayal/building-a-practical-lambda-capacity-provider-platform-lessons-learned-from-warm-pools-version-1l7j</guid>
      <description>&lt;h1&gt;
  
  
  Building a Practical Lambda Capacity Provider Platform: Lessons Learned from Warm Pools, Version Hygiene, and CI/CD Reality
&lt;/h1&gt;

&lt;p&gt;There is a big difference between a slide-deck architecture and a system you can trust in operation on a Monday morning.&lt;/p&gt;

&lt;p&gt;This implementation captures that difference well. On paper, the idea is simple: create a shared AWS Lambda Managed Instances capacity provider, run latency-sensitive workloads on ARM64, keep the pool warm with EventBridge, prune old Lambda versions before they become operational debt, and wrap the whole thing in a GitHub Actions plus CodeBuild delivery model. In practice, each of those choices changes how you think about performance, cost, blast radius, and developer discipline.&lt;/p&gt;

&lt;p&gt;What follows is not a generic cloud post. It is the kind of write-up you produce after actually building and living with the system.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Problem We Were Solving
&lt;/h2&gt;

&lt;p&gt;Traditional Lambda is excellent when you want abstraction and convenience. It becomes less elegant when your workload is sensitive to startup time, carries heavier dependencies, or needs more predictable execution behavior under bursty load.&lt;/p&gt;

&lt;p&gt;That is where a Lambda capacity provider changes the discussion.&lt;/p&gt;

&lt;p&gt;In this implementation, the platform is built around a shared &lt;code&gt;aws_lambda_capacity_provider&lt;/code&gt; that uses ARM64 Graviton instances and auto scaling. The core idea is straightforward: instead of leaving execution placement entirely to the default Lambda fleet, we deliberately provide a managed compute pool that multiple functions can share. That gives us more control over cost-performance characteristics and lets us design around cold-start pain rather than merely complaining about it.&lt;/p&gt;

&lt;p&gt;The choice is visible in the Terraform:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The provider runs on &lt;code&gt;arm64&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Allowed instance types are constrained to &lt;code&gt;m6g.large&lt;/code&gt;, &lt;code&gt;m6g.xlarge&lt;/code&gt;, &lt;code&gt;m7g.large&lt;/code&gt;, and &lt;code&gt;m7g.xlarge&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Scaling is set to &lt;code&gt;Auto&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The maximum pool ceiling is set to &lt;code&gt;64&lt;/code&gt; vCPU&lt;/li&gt;
&lt;li&gt;The capacity provider is placed in the default VPC, with unsupported Availability Zones filtered out&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point matters more than it first appears. The code explicitly excludes unsupported AZs such as &lt;code&gt;us-east-1e&lt;/code&gt;, which is a good example of operational maturity: the happy path is not enough when the service itself has placement constraints.&lt;/p&gt;

&lt;h2&gt;
  
  
  How We Actually Created the Capacity Provider
&lt;/h2&gt;

&lt;p&gt;One thing I wanted this platform to avoid was "concept architecture" with no implementation backbone. So the capacity provider here is not described abstractly. It is provisioned directly in Terraform and wired into the Lambda lifecycle in a fairly intentional way.&lt;/p&gt;

&lt;p&gt;The build starts in &lt;code&gt;terraform_file/agent_core_sync_cp.tf&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;First, the capacity provider itself is created with &lt;code&gt;aws_lambda_capacity_provider&lt;/code&gt;. The naming pattern ties it to the service and environment, which is the right instinct for multi-environment operation. The provider is tagged as shared compute for agent workloads, which matters later for discoverability and platform governance.&lt;/p&gt;

&lt;p&gt;Second, the provider is placed inside the default VPC, but not blindly. In &lt;code&gt;terraform_file/data.tf&lt;/code&gt;, the code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;discovers the default VPC&lt;/li&gt;
&lt;li&gt;fetches the default subnets&lt;/li&gt;
&lt;li&gt;inspects subnet Availability Zones one by one&lt;/li&gt;
&lt;li&gt;excludes unsupported zones such as &lt;code&gt;us-east-1e&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;optionally caps how many subnets are used&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a subtle but important design choice. Lambda Managed Instances often create one placement footprint per subnet or AZ. If you do not control subnet spread, you can end up creating more infrastructure surface area than you intended.&lt;/p&gt;
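&lt;p&gt;The filtering rule itself is simple enough to sketch outside Terraform. The subnet records and the unsupported-zone list below are illustrative stand-ins for what &lt;code&gt;data.tf&lt;/code&gt; actually discovers at plan time:&lt;/p&gt;

```python
# Sketch of the AZ-filtering rule described above. The subnet records are
# illustrative; the real logic lives in Terraform (terraform_file/data.tf).

UNSUPPORTED_AZS = {"us-east-1e"}

def eligible_subnets(subnets, max_subnets=None):
    """Keep only subnets in supported AZs, optionally capping the spread."""
    keep = [s for s in subnets if s["availability_zone"] not in UNSUPPORTED_AZS]
    return keep[:max_subnets] if max_subnets else keep

subnets = [
    {"id": "subnet-a", "availability_zone": "us-east-1a"},
    {"id": "subnet-e", "availability_zone": "us-east-1e"},  # unsupported zone
    {"id": "subnet-b", "availability_zone": "us-east-1b"},
]
print(eligible_subnets(subnets, max_subnets=2))
```

&lt;p&gt;Capping &lt;code&gt;max_subnets&lt;/code&gt; is what keeps the placement footprint deliberate rather than accidental.&lt;/p&gt;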

&lt;p&gt;Third, the provider uses a dedicated security group rather than inheriting something vague and accidental. The current implementation keeps outbound traffic fully open and allows inbound HTTPS. That is permissive, but it is at least explicit and repeatable. Early-stage platforms benefit from that kind of clarity.&lt;/p&gt;

&lt;p&gt;Fourth, the capacity provider gets its own operator role through &lt;code&gt;AWSLambdaManagedEC2ResourceOperator&lt;/code&gt;. That is a critical detail. Capacity providers are not just Lambda resources; they need AWS to manage the EC2-backed execution infrastructure on your behalf. If you miss that role, the platform does not really exist no matter how nice your Terraform looks.&lt;/p&gt;

&lt;p&gt;Fifth, the instance requirements are opinionated. The code forces &lt;code&gt;arm64&lt;/code&gt; and narrows the fleet to supported Graviton M-family instance types. That is one of the better engineering decisions in this implementation because it converts an architectural preference into an enforceable runtime rule.&lt;/p&gt;

&lt;p&gt;Finally, the Lambda function is attached to the capacity provider in &lt;code&gt;terraform_file/lambda_clm_router_agent.tf&lt;/code&gt; through &lt;code&gt;capacity_provider_config&lt;/code&gt;. That is where the abstraction becomes real. We are not just provisioning a pool and hoping someone uses it later. We are explicitly binding a published Lambda to that pool and tuning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;memory GiB per vCPU&lt;/li&gt;
&lt;li&gt;max concurrency per execution environment&lt;/li&gt;
&lt;li&gt;ARM64 runtime alignment&lt;/li&gt;
&lt;li&gt;published versioning through Lambda aliases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the full loop: provision shared compute, constrain placement, grant AWS the operator role it needs, attach live functions to the pool, and then manage the resulting version sprawl with automation. That is what makes this feel like a platform artifact rather than a loose Terraform experiment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 1: A Capacity Provider Is Not a Tuning Knob. It Is an Operating Model.
&lt;/h2&gt;

&lt;p&gt;Teams often talk about capacity providers as if they are just a performance optimization. That framing is too shallow.&lt;/p&gt;

&lt;p&gt;The moment you move Lambda onto managed instances, you are no longer only buying faster startup. You are adopting a new operating model with very clear implications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You now care about instance family compatibility&lt;/li&gt;
&lt;li&gt;You need to think about subnet strategy and AZ support&lt;/li&gt;
&lt;li&gt;You have to reason about pool scaling ceilings, concurrency, and memory per vCPU&lt;/li&gt;
&lt;li&gt;You are effectively blending serverless ergonomics with infrastructure accountability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This implementation shows that transition clearly. The CLM router Lambda is not just declared with a runtime and handler. It is attached to the shared capacity provider and explicitly tuned with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;execution_environment_memory_gib_per_vcpu&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;per_execution_environment_max_concurrency&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;publish = true&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;architectures = ["arm64"]&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the tell. Once we start specifying how execution environments should behave, we are no longer simply "deploying a Lambda." We are shaping compute economics.&lt;/p&gt;

&lt;p&gt;The practical lesson here is simple: if you adopt Lambda Managed Instances, treat it like platform engineering, not like a runtime checkbox.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 2: ARM64 Delivers Real Value, but Only if You Respect Service Constraints
&lt;/h2&gt;

&lt;p&gt;One of the strongest decisions in this implementation is the bias toward Graviton. For Python-heavy agent workloads, ARM64 is usually the right default. The economics are better, and the performance-per-dollar story is often compelling.&lt;/p&gt;

&lt;p&gt;But there is an important nuance that the Terraform comments correctly capture: not every EC2 family you might expect is supported in the way you assume. This implementation explicitly avoids unsupported combinations and narrows the fleet to supported M-family Graviton instances.&lt;/p&gt;

&lt;p&gt;That is a good lesson in cloud architecture generally: cloud products market flexibility, but production systems survive on constraint management.&lt;/p&gt;

&lt;p&gt;The teams that do well with modern AWS services are not the ones that assume every SKU works. They are the ones that encode the service's real boundaries in Terraform so no one has to rediscover them during an incident window.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 3: Warmup Is Not a Hack. It Is a Deliberate Control Loop.
&lt;/h2&gt;

&lt;p&gt;There is a tendency in engineering circles to treat "warming" as a slightly embarrassing workaround. I think that is the wrong mindset.&lt;/p&gt;

&lt;p&gt;This implementation schedules the CLM router Lambda every five minutes through EventBridge. The handler itself is intentionally lightweight and effectively acts as a keep-alive mechanism. That is not laziness. It is an explicit decision to keep the shared pool alive for latency-sensitive traffic.&lt;/p&gt;

&lt;p&gt;More specifically, the warmer exists to reduce the probability that the capacity provider has to spin up fresh managed instance capacity for a new invocation path after a quiet period. That is the practical point of the EventBridge rule in &lt;code&gt;terraform_file/eventbridge_cp_arm.tf&lt;/code&gt;. By invoking the Lambda on a steady &lt;code&gt;rate(5 minutes)&lt;/code&gt; schedule, the platform keeps the execution path warm enough that the shared capacity provider is less likely to fall all the way back to a cold, scale-from-zero posture right before a real request arrives.&lt;/p&gt;
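&lt;p&gt;The handler side of that loop is easy to sketch. This is not the actual CLM router code, just the short-circuit shape a scheduled warmer relies on; EventBridge scheduled invocations arrive with &lt;code&gt;source&lt;/code&gt; set to &lt;code&gt;aws.events&lt;/code&gt;, so the handler can answer them cheaply:&lt;/p&gt;

```python
# Minimal keep-alive handler sketch. The real CLM router handler is not shown
# in this post; this only illustrates the short-circuit pattern a
# rate(5 minutes) EventBridge warmer relies on.

def handler(event, context=None):
    # Scheduled EventBridge invocations carry source "aws.events";
    # answer them immediately so the execution environment stays warm.
    if event.get("source") == "aws.events":
        return {"warmed": True}
    # ... real request handling would go here ...
    return {"warmed": False, "handled": True}

print(handler({"source": "aws.events"}))          # warm-up path
print(handler({"path": "/route", "body": "{}"}))  # normal request path
```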

&lt;p&gt;The important insight is this: once you care about cold-start predictability, you need a control loop.&lt;/p&gt;

&lt;p&gt;That control loop can be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provisioned concurrency&lt;/li&gt;
&lt;li&gt;Scheduled warmers&lt;/li&gt;
&lt;li&gt;Request shaping&lt;/li&gt;
&lt;li&gt;A shared managed instance pool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this design, the team chose scheduled warm invocation plus a shared capacity provider. That is a sensible middle ground. It is cheaper and simpler than overcommitting always-on infrastructure, while still materially reducing the first-hit penalty.&lt;/p&gt;

&lt;p&gt;In plain English: the EventBridge warmer is being used here so the capacity provider does not need to spin up a brand-new server footprint every time traffic reappears after idle time. For interactive or latency-sensitive agent workloads, that is a very practical optimization.&lt;/p&gt;

&lt;p&gt;The strategic lesson is that warmup should be measured against business latency, not ideological purity. If a five-minute EventBridge schedule protects user experience and keeps cost acceptable, it is doing its job.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 4: Shared Pools Create Efficiency, but They Also Create Coupling
&lt;/h2&gt;

&lt;p&gt;The capacity provider here is intentionally shared across platform agents and automation services. That is the right move early in a platform journey because it improves utilization and prevents every Lambda from inventing its own isolated infrastructure story.&lt;/p&gt;

&lt;p&gt;But shared pools always introduce two forms of coupling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Technical coupling, because multiple workloads compete for the same execution substrate&lt;/li&gt;
&lt;li&gt;Organizational coupling, because one team's deployment patterns can affect another team's cost and performance envelope&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is why the concurrency controls here matter. The CLM router function uses a per-execution-environment concurrency setting, and the environment-specific &lt;code&gt;.tfvars&lt;/code&gt; files pin that concurrency to &lt;code&gt;4&lt;/code&gt;. That is more than a performance number. It is a fairness policy.&lt;/p&gt;

&lt;p&gt;If I were advising a platform team scaling this pattern, I would say this clearly: shared capacity providers are excellent, but they need quota thinking from day one. Otherwise the first successful workload becomes the first noisy neighbor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 5: If You Publish Versions Aggressively, You Need Lifecycle Hygiene on Day One
&lt;/h2&gt;

&lt;p&gt;This implementation makes another good call: the Lambda functions are published, aliased, and then cleaned up with an automated version pruner.&lt;/p&gt;

&lt;p&gt;That matters because version sprawl is one of those quiet operational problems that teams ignore until it becomes annoying enough to disrupt deployments. Published versions accumulate quickly when CI/CD is active. If you do not manage them, you eventually pay in clutter, confusion, or hard service limits.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;lambda_version_pruner&lt;/code&gt; implementation is stronger than a simplistic cleanup script because it preserves what actually matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It scans all Lambda functions&lt;/li&gt;
&lt;li&gt;It filters only functions associated with the target capacity provider&lt;/li&gt;
&lt;li&gt;It lists all aliases and protects aliased versions&lt;/li&gt;
&lt;li&gt;It keeps the latest N published versions&lt;/li&gt;
&lt;li&gt;It deletes everything older that is neither current nor aliased&lt;/li&gt;
&lt;/ul&gt;
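&lt;p&gt;To make the selection rule concrete, here is the keep-or-delete decision reduced to a pure function. The version numbers and alias below are illustrative, not taken from the repo; the real pruner gathers this data from the Lambda API (listing functions, aliases, and versions) with boto3:&lt;/p&gt;

```python
# The pruner's selection rule, reduced to a pure function. In the real
# implementation the inputs come from the Lambda API via boto3.

def versions_to_delete(published_versions, aliased_versions, keep_latest=3):
    """Return versions that are neither recent, aliased, nor $LATEST."""
    numbered = sorted(
        (v for v in published_versions if v != "$LATEST"),
        key=int,
    )
    protected = set(numbered[-keep_latest:]) | set(aliased_versions)
    return [v for v in numbered if v not in protected]

versions = ["$LATEST", "1", "2", "3", "4", "5", "6"]
# Keeps 4, 5, 6 (latest three) and 2 (aliased); deletes 1 and 3.
print(versions_to_delete(versions, aliased_versions={"2"}, keep_latest=3))
```

&lt;p&gt;Everything interesting about the pruner is in that &lt;code&gt;protected&lt;/code&gt; set: recency and deployment intent both count, and tidiness never overrides either.&lt;/p&gt;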

&lt;p&gt;This is exactly the kind of automation mature teams invest in. Not glamorous. Very valuable.&lt;/p&gt;

&lt;p&gt;There is also an understated platform principle here: rollback is not just about keeping artifacts. It is about keeping the right artifacts. By preserving aliased versions, the pruner respects deployment intent rather than blindly optimizing for tidiness.&lt;/p&gt;

&lt;p&gt;There is also a more practical capacity-provider reason for doing this, and it deserves to be stated directly.&lt;/p&gt;

&lt;p&gt;When you run a shared Lambda Managed Instances pool, you want the platform to spend its effort on the versions that are actually serving traffic, warming correctly, or remaining available for safe rollback. If old published versions keep accumulating forever, three unhealthy things tend to happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;operators lose clarity on which versions are still meaningful&lt;/li&gt;
&lt;li&gt;rollback and alias management become noisier than they should be&lt;/li&gt;
&lt;li&gt;the shared platform carries more deployment residue than useful runtime intent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Strictly speaking, deleting old Lambda versions does not magically increase CPU on the capacity provider. What it does do is improve platform hygiene around the shared pool. It ensures that the versions attached to aliases, warmup patterns, and deployment workflows remain deliberate and limited. In other words, it improves capacity-provider utilization indirectly by reducing version sprawl around the workloads that consume that shared capacity.&lt;/p&gt;

&lt;p&gt;That matters in real operations. The healthier the deployment surface is, the easier it is to reason about what is warming, what is active, what can be rolled back, and what should no longer influence the platform at all.&lt;/p&gt;

&lt;p&gt;So the version pruner is not just a cleanup utility. It is part of making the shared capacity provider operationally efficient. Not by adding raw compute, but by reducing noise, protecting the versions that matter, and keeping the platform focused on live execution paths instead of historical leftovers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 6: GitHub Actions Should Orchestrate. CodeBuild Should Execute.
&lt;/h2&gt;

&lt;p&gt;Architecturally, the CI/CD model here is sensible.&lt;/p&gt;

&lt;p&gt;GitHub Actions is used as the control plane:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;branch-based triggering&lt;/li&gt;
&lt;li&gt;security scanning&lt;/li&gt;
&lt;li&gt;environment selection&lt;/li&gt;
&lt;li&gt;AWS credential injection&lt;/li&gt;
&lt;li&gt;build orchestration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AWS CodeBuild is used as the execution plane:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Terraform install&lt;/li&gt;
&lt;li&gt;&lt;code&gt;terraform init&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;terraform validate&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;terraform plan&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;terraform apply&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I like this split. It keeps GitHub Actions lightweight and makes AWS the place where the actual infrastructure mutation happens. That usually gives better access control, cleaner auditability, and fewer surprises around long-running plan or apply steps.&lt;/p&gt;

&lt;p&gt;The buildspecs pin Terraform &lt;code&gt;1.12.2&lt;/code&gt;, install the CLI explicitly, and then execute plan/apply flows with environment-specific variable files. That is exactly the kind of boring repeatability you want in infrastructure delivery.&lt;/p&gt;

&lt;p&gt;This is one of the most practical lessons from the implementation: do not force GitHub Actions to be your full deployment runtime if AWS-native execution gives you better control.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 7: CI/CD Maturity Is Not About Having a Pipeline. It Is About Where the Gates Actually Are.
&lt;/h2&gt;

&lt;p&gt;The implementation also reveals a harder truth: CI/CD design is won or lost not by YAML volume, but by trigger discipline.&lt;/p&gt;

&lt;p&gt;There are some good instincts here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dev deployment is chained off a successful security workflow&lt;/li&gt;
&lt;li&gt;Security scanning runs on push and PR for &lt;code&gt;dev&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;PR security review is scoped only to actual code and infrastructure changes&lt;/li&gt;
&lt;li&gt;Environment-specific secrets are used for AWS access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That said, the current implementation also shows the kinds of issues every fast-moving team encounters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The dev deploy workflow is triggered by &lt;code&gt;Security Checks (Push)&lt;/code&gt;, not by a broader quality gate such as tests plus security plus static analysis&lt;/li&gt;
&lt;li&gt;The QA workflow is currently triggered on &lt;code&gt;pull_request&lt;/code&gt; to &lt;code&gt;qa&lt;/code&gt;, yet it also includes an apply stage, which is a risky combination&lt;/li&gt;
&lt;li&gt;The sanity workflow references a different CodeBuild project naming pattern, which looks like copy-forward drift from another implementation&lt;/li&gt;
&lt;li&gt;One dev apply step mixes generic and environment-specific secrets in a way that deserves tightening&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a criticism of the team. It is actually the most authentic part of the system.&lt;/p&gt;

&lt;p&gt;Real pipelines evolve through reuse, renaming, urgency, and partial migration. The useful engineering habit is not pretending they are pristine. It is recognizing that pipeline drift is itself a production concern.&lt;/p&gt;

&lt;p&gt;My blunt lesson here is this: CI/CD is software. It needs the same review rigor as application code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 8: Documentation Drift Is a Reliability Signal
&lt;/h2&gt;

&lt;p&gt;The README here is ambitious and useful, but parts of it clearly describe a broader or earlier architecture than the exact files currently present. That mismatch is more important than most teams realize.&lt;/p&gt;

&lt;p&gt;When documentation and implementation diverge, three things happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;new engineers learn the wrong system&lt;/li&gt;
&lt;li&gt;reviewers approve changes with outdated mental models&lt;/li&gt;
&lt;li&gt;incidents take longer to resolve because operators trust stale diagrams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One of the best engineering habits is to treat documentation drift as an operational bug, not as a cosmetic issue.&lt;/p&gt;

&lt;p&gt;This implementation makes that case well. The code is the source of truth. The docs are directionally strong, but some names, workflow descriptions, and file references have clearly moved over time. That is normal. What matters is catching it before the next engineer builds decisions on old assumptions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 9: The Default VPC Is Fine for Speed, but It Should Be a Conscious Temporary Convenience
&lt;/h2&gt;

&lt;p&gt;The Terraform intentionally uses the default VPC and default subnets, then layers in filtering and a custom security group. For early velocity, that is an acceptable choice. It removes friction and makes the first deployment much easier.&lt;/p&gt;

&lt;p&gt;But teams should be honest about the tradeoff.&lt;/p&gt;

&lt;p&gt;Using the default VPC accelerates setup. It does not provide the same clarity, segmentation, or policy hygiene that a dedicated workload VPC eventually should. The inbound HTTPS rule from &lt;code&gt;0.0.0.0/0&lt;/code&gt; is another example of where a practical early-stage decision should later be revisited with a more opinionated security posture.&lt;/p&gt;

&lt;p&gt;My view is simple: default VPC usage is fine when it is a speed decision. It becomes dangerous when it silently hardens into architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 10: Least Privilege Usually Loses the First Battle. Do Not Let It Lose the War.
&lt;/h2&gt;

&lt;p&gt;The Lambda IAM policy for the router function is broad. Very broad.&lt;/p&gt;

&lt;p&gt;That is common when a platform team is trying to unblock integration work quickly across S3, SQS, SNS, DynamoDB, Bedrock, AppSync, logs, X-Ray, and secrets. The version pruner is noticeably tighter, which is encouraging. But the broader pattern remains familiar: the first version of a system usually over-grants.&lt;/p&gt;

&lt;p&gt;The lesson is not "never do that." The lesson is "know when you are doing it, and schedule the hardening work while the platform is still comprehensible."&lt;/p&gt;

&lt;p&gt;Security debt compounds. The longer a wide-open policy survives, the more invisible it becomes.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Repo Gets Right
&lt;/h2&gt;

&lt;p&gt;If I strip away the drift and focus on the platform instincts, this implementation gets a lot right:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It treats capacity provider infrastructure as shared platform capability, not one-off function plumbing&lt;/li&gt;
&lt;li&gt;It optimizes for ARM64 economics instead of defaulting to x86 out of habit&lt;/li&gt;
&lt;li&gt;It acknowledges cold starts as a business problem and addresses them operationally&lt;/li&gt;
&lt;li&gt;It preserves rollback safety with aliases while still pruning version sprawl&lt;/li&gt;
&lt;li&gt;It separates orchestration from execution in CI/CD&lt;/li&gt;
&lt;li&gt;It encodes AWS service constraints in Terraform comments and defaults, which reduces tribal knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a strong foundation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Would Improve Next
&lt;/h2&gt;

&lt;p&gt;If I were turning this into the next version of a production-grade internal platform, I would prioritize the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Tighten naming consistency across the implementation.&lt;br&gt;
The capacity provider name appears in slightly different forms across resources. That is how automation misses its target. Shared naming locals should eliminate this class of error.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make QA and production promotion rules stricter.&lt;br&gt;
A PR-triggered apply path should be removed. Plan on PR, apply on protected branch or approved environment gate is the cleaner model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Run Terraform from a single explicit working directory.&lt;br&gt;
The current layout places Terraform under &lt;code&gt;terraform_file/&lt;/code&gt;, while some buildspec commands read like root-level execution. That ambiguity should be eliminated.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Move from broad IAM toward intent-based policies.&lt;br&gt;
Especially for the router Lambda, policy scope should narrow as the workload stabilizes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Revisit networking posture.&lt;br&gt;
The default VPC is fine for speed; a dedicated VPC model is better for longevity, auditability, and controlled ingress.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add stronger deployment quality gates.&lt;br&gt;
Security review is useful, but infrastructure promotion should also hang off validation, tests, linting, and explicit approval where appropriate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add platform observability as code.&lt;br&gt;
CloudWatch alarms, dashboarding, and cost visibility for the capacity provider should be treated as first-class Terraform resources, not follow-up tasks.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Bigger Technical Lesson
&lt;/h2&gt;

&lt;p&gt;The biggest takeaway from this implementation is not about Lambda specifically.&lt;/p&gt;

&lt;p&gt;It is about how modern platform teams should build.&lt;/p&gt;

&lt;p&gt;We should absolutely chase better cost-performance curves. We should use managed primitives aggressively. We should automate the boring work. But we also need the discipline to encode what we learn while the system is still small enough to reason about.&lt;/p&gt;

&lt;p&gt;What makes this useful is that it shows both halves of real engineering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the architectural intent&lt;/li&gt;
&lt;li&gt;the implementation scars&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That combination is where credible engineering judgment comes from.&lt;/p&gt;

&lt;p&gt;Anyone can present a clean target state. The harder and more useful skill is building systems that survive contact with deployment friction, service constraints, naming drift, and operational reality.&lt;/p&gt;

&lt;p&gt;That is what this implementation is doing. And that is why the lessons here matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing Thought
&lt;/h2&gt;

&lt;p&gt;Capacity providers, warmers, version pruning, and GitHub-driven delivery are not separate topics. They are all answers to the same technical question:&lt;/p&gt;

&lt;p&gt;How do we make cloud systems faster, cheaper, safer, and more repeatable without turning every application team into a specialized infrastructure group?&lt;/p&gt;

&lt;p&gt;In this implementation, the answer was to centralize the hard platform decisions, automate the hygiene, keep the runtime warm where it matters, and stay honest about the places where the system still needs tightening.&lt;/p&gt;

&lt;p&gt;That is not just good infrastructure work.&lt;/p&gt;

&lt;p&gt;That is good engineering practice.&lt;/p&gt;

</description>
      <category>serverless</category>
      <category>lambda</category>
      <category>aws</category>
    </item>
    <item>
      <title>Lessons I learned building a memory-aware agent with Amazon Bedrock AgentCore Runtime</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Mon, 20 Apr 2026 18:10:16 +0000</pubDate>
      <link>https://dev.to/amitkayal/lessons-i-learned-building-a-memory-aware-agent-with-amazon-bedrock-agentcore-runtime-4lc9</link>
      <guid>https://dev.to/amitkayal/lessons-i-learned-building-a-memory-aware-agent-with-amazon-bedrock-agentcore-runtime-4lc9</guid>
      <description>&lt;h1&gt;
  
  
  Lessons I learned building a memory-aware agent with Amazon Bedrock AgentCore Runtime
&lt;/h1&gt;

&lt;p&gt;When I started building an agent with Amazon Bedrock AgentCore Runtime, I thought the difficult parts would be model selection, tool wiring, and deployment. Those certainly mattered, but the part that shaped the quality of the agent most was memory.&lt;/p&gt;

&lt;p&gt;The first version of the agent could answer single prompts well enough, but it did not behave like a real multi-turn system. Follow-up questions were brittle. The agent lost short-range intent. Tool usage worked, but only within the narrow boundaries of the current prompt. As soon as the conversation depended on what happened one or two turns earlier, the system started to feel less like an agent and more like a stateless inference endpoint.&lt;/p&gt;

&lt;p&gt;That experience changed how I approached the design. I stopped thinking about memory as a convenience feature and started treating it as part of the runtime architecture itself. This article is a distillation of the most important lessons I learned while building a short-term-memory-aware agent with Amazon Bedrock AgentCore Runtime and Strands.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 1: An agent is not really multi-turn until memory is part of the lifecycle
&lt;/h2&gt;

&lt;p&gt;One of the first things I learned is that conversational continuity does not emerge automatically just because the application calls the same runtime repeatedly.&lt;/p&gt;

&lt;p&gt;Without short-term memory, the agent only sees the current prompt unless the application keeps reconstructing and replaying history manually. That creates several problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;previous instructions are easy to lose,&lt;/li&gt;
&lt;li&gt;tool chains become fragile across turns,&lt;/li&gt;
&lt;li&gt;users have to restate identifiers and intent,&lt;/li&gt;
&lt;li&gt;the system becomes increasingly prompt-shaped rather than interaction-shaped.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What became clear to me is that short-term memory is not about storing everything forever. It is about preserving enough recent state for the current conversation to remain coherent.&lt;/p&gt;

&lt;p&gt;That distinction matters. I was not trying to build a knowledge base or semantic fact store. I was trying to answer a simpler question: how do I help the agent remember what we were just doing?&lt;/p&gt;

&lt;p&gt;Once I framed the problem that way, the architecture became much clearer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 2: The cleanest pattern is explicit memory, not implicit transcript magic
&lt;/h2&gt;

&lt;p&gt;Another lesson I learned quickly is that I did not want memory to be hidden behind vague runtime behavior. I wanted the agent code to make memory use explicit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;where memory comes from,&lt;/li&gt;
&lt;li&gt;when it is read,&lt;/li&gt;
&lt;li&gt;when it is written,&lt;/li&gt;
&lt;li&gt;which user it belongs to,&lt;/li&gt;
&lt;li&gt;which conversation it belongs to.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That led me to a pattern built around &lt;code&gt;MemoryClient&lt;/code&gt; and hooks.&lt;/p&gt;

&lt;p&gt;Instead of treating memory like a passive transcript that somehow appears at the edge of the request, I found it much more reliable to think about it as a lifecycle-managed dependency:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;create a short-term memory resource,&lt;/li&gt;
&lt;li&gt;pass the memory identity into the runtime,&lt;/li&gt;
&lt;li&gt;read recent turns when the agent initializes,&lt;/li&gt;
&lt;li&gt;write new messages as events when the conversation changes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The important shift for me was this: memory worked best when it was part of the agent object model, not just part of request handling glue code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 3: Hooks are where memory belongs
&lt;/h2&gt;

&lt;p&gt;This was probably the biggest implementation insight.&lt;/p&gt;

&lt;p&gt;Once I had a Strands-based agent running inside AgentCore Runtime, I needed to decide where the memory logic should live. I could have put everything directly into the entrypoint and manually stitched together request parsing, history retrieval, message persistence, and prompt injection. That would have worked, but it would have made the agent lifecycle harder to reason about.&lt;/p&gt;

&lt;p&gt;What worked better was using hooks tied to the agent lifecycle itself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AgentInitializedEvent&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;MessageAddedEvent&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That structure gave me a much cleaner mental model.&lt;/p&gt;

&lt;p&gt;On initialization, the agent needs context before it reasons. That is the right moment to retrieve the most recent turns from memory and inject them into prompt context.&lt;/p&gt;

&lt;p&gt;When a new message is added, the conversation state has changed. That is the right moment to persist the latest user or assistant message back into memory.&lt;/p&gt;

&lt;p&gt;The core interaction looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;recent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memory_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_last_k_turns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;actor_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;actor_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;memory_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memory_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;actor_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;actor_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What I like about this model is that it is deterministic.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;memory load happens before reasoning,&lt;/li&gt;
&lt;li&gt;memory write happens when conversation state changes,&lt;/li&gt;
&lt;li&gt;both operations use the same identity boundaries,&lt;/li&gt;
&lt;li&gt;the entrypoint stays focused on request extraction rather than conversation orchestration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That made the system easier to debug, easier to extend, and much easier to explain.&lt;/p&gt;
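&lt;p&gt;As a simplified illustration of that split, here is a sketch of the hook pattern using plain Python stand-ins rather than the actual Strands and AgentCore classes (names like &lt;code&gt;InMemoryShortTermMemory&lt;/code&gt; and &lt;code&gt;MemoryHooks&lt;/code&gt; are hypothetical):&lt;/p&gt;

```python
# Simplified stand-in for the hook-based memory pattern described above.
# A real implementation registers callbacks for AgentInitializedEvent and
# MessageAddedEvent; here the lifecycle is modeled with two plain methods.

class InMemoryShortTermMemory:
    """Hypothetical in-process stand-in for the AgentCore memory plane."""
    def __init__(self):
        self.events = {}  # (actor_id, session_id) mapped to (role, text) pairs

    def create_event(self, actor_id, session_id, role, text):
        self.events.setdefault((actor_id, session_id), []).append((role, text))

    def get_last_k_turns(self, actor_id, session_id, k):
        return self.events.get((actor_id, session_id), [])[-k:]


class MemoryHooks:
    """Loads recent turns on initialization, persists new messages on change."""
    def __init__(self, memory, actor_id, session_id, k=5):
        self.memory = memory
        self.actor_id = actor_id
        self.session_id = session_id
        self.k = k

    def on_agent_initialized(self):
        # Analogue of the AgentInitializedEvent hook: read before reasoning.
        return self.memory.get_last_k_turns(self.actor_id, self.session_id, self.k)

    def on_message_added(self, role, text):
        # Analogue of the MessageAddedEvent hook: write when state changes.
        self.memory.create_event(self.actor_id, self.session_id, role, text)
```

&lt;p&gt;The value of the pattern is that both operations share the same identity boundaries, and the entrypoint never has to orchestrate the conversation itself.&lt;/p&gt;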

&lt;h2&gt;
  
  
  Lesson 4: Identity is the real memory boundary
&lt;/h2&gt;

&lt;p&gt;Before building this, I thought of memory mostly as a storage problem. In practice, I learned it is just as much an identity problem.&lt;/p&gt;

&lt;p&gt;The two identifiers that mattered most were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;actor_id&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;session_id&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This separation ended up being foundational.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why &lt;code&gt;actor_id&lt;/code&gt; matters
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;actor_id&lt;/code&gt; is the user boundary. If that identifier is unstable, absent, or inconsistent, memory quality degrades immediately.&lt;/p&gt;

&lt;p&gt;What I learned is that a memory system is only as good as the application identity you feed into it. If the same user appears under multiple IDs, the agent cannot retrieve a coherent conversational history. If two users are accidentally mapped to the same identity, memory becomes unsafe.&lt;/p&gt;

&lt;p&gt;So one of my strongest takeaways is that &lt;code&gt;actor_id&lt;/code&gt; should always come from a stable authenticated user identity, not from an incidental client-generated value.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why &lt;code&gt;session_id&lt;/code&gt; matters
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;session_id&lt;/code&gt; turned out to be just as important. A single user does not have just one conversation. They may have multiple active threads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one troubleshooting flow,&lt;/li&gt;
&lt;li&gt;one transcript analysis request,&lt;/li&gt;
&lt;li&gt;one abandoned conversation from earlier,&lt;/li&gt;
&lt;li&gt;one brand-new task.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without a session boundary, all of that collapses into one memory stream. The agent might technically “remember,” but it remembers too much of the wrong thing.&lt;/p&gt;

&lt;p&gt;That was a key lesson for me: useful memory is not just preserved memory. It is correctly scoped memory.&lt;/p&gt;
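&lt;p&gt;To make the scoping concrete, here is a minimal sketch (hypothetical helper names, not AgentCore APIs) showing why both identifiers must participate in the memory key:&lt;/p&gt;

```python
# Minimal sketch: memory scoped by the (actor_id, session_id) pair.
# If either identifier is dropped from the key, histories bleed together.

store = {}

def remember(actor_id, session_id, text):
    store.setdefault((actor_id, session_id), []).append(text)

def recall(actor_id, session_id):
    return store.get((actor_id, session_id), [])

# Two threads for the same user, plus a different user entirely.
remember("alice", "troubleshooting", "the lambda times out")
remember("alice", "transcripts", "analyze meeting 42")
remember("bob", "troubleshooting", "my bucket is empty")
```

&lt;p&gt;Each thread stays isolated even though both the actor names and session names overlap across entries.&lt;/p&gt;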

&lt;h2&gt;
  
  
  Lesson 5: The agent should be rebuilt per request, but memory should persist across requests
&lt;/h2&gt;

&lt;p&gt;This was an architectural point that became clearer as I implemented the runtime flow.&lt;/p&gt;

&lt;p&gt;The Strands agent instance itself is created per request. That makes sense because each invocation carries request-specific state:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the current user prompt,&lt;/li&gt;
&lt;li&gt;the active user identity,&lt;/li&gt;
&lt;li&gt;the active conversation session,&lt;/li&gt;
&lt;li&gt;the active tool and runtime context.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But memory should not behave like request-local state. Memory has to outlive the agent instance and remain keyed to the same user and conversation across invocations.&lt;/p&gt;

&lt;p&gt;That split was important for me to internalize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;agent instance lifecycle is short,&lt;/li&gt;
&lt;li&gt;conversation memory lifecycle is longer,&lt;/li&gt;
&lt;li&gt;the link between them is established through state and hooks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once I started thinking in those terms, the design felt much more natural.&lt;/p&gt;
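&lt;p&gt;A sketch of that lifecycle split, with simplified stand-ins rather than the real Strands classes: the agent object is rebuilt per request, while the memory store is a longer-lived dependency shared across requests.&lt;/p&gt;

```python
# The store outlives any single agent instance; the agent does not.
SHARED_MEMORY = {}

class Agent:
    """Hypothetical per-request agent wired to persistent memory."""
    def __init__(self, prompt, actor_id, session_id, memory):
        self.prompt = prompt
        self.history = memory.setdefault((actor_id, session_id), [])

    def run(self):
        self.history.append(self.prompt)
        return len(self.history)  # turns seen so far in this conversation

def handle_request(prompt, actor_id, session_id):
    # A fresh agent per invocation, bound to the same persistent memory.
    agent = Agent(prompt, actor_id, session_id, SHARED_MEMORY)
    return agent.run()
```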

&lt;h2&gt;
  
  
  Lesson 6: Deployment is part of the memory design
&lt;/h2&gt;

&lt;p&gt;I originally thought of deployment as a separate concern from conversational behavior. Building this agent convinced me that the two are tightly connected.&lt;/p&gt;

&lt;p&gt;The runtime needs to know which memory resource it should use, but I did not want that decision hardcoded in application logic. The better pattern was to resolve the correct memory resource during deployment and pass that identity into the runtime as configuration.&lt;/p&gt;

&lt;p&gt;In practice, that meant the runtime received environment-specific values such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AGENT_NAME=&amp;lt;agent-name&amp;gt;
MEMORY_ID=&amp;lt;memory-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That gave me a few benefits immediately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the same application code could move across environments,&lt;/li&gt;
&lt;li&gt;memory resources stayed aligned with environment boundaries,&lt;/li&gt;
&lt;li&gt;the runtime remained configurable without source changes,&lt;/li&gt;
&lt;li&gt;the control plane remained the primary place where resource binding happened.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One of the clearest lessons here is that memory should be treated like any other environment-bound infrastructure dependency. If it is not part of deployment, it tends to become a hidden assumption.&lt;/p&gt;
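&lt;p&gt;A minimal sketch of that resolution step, assuming the environment variables shown earlier: the runtime reads its memory binding from configuration and fails loudly if it is missing, rather than silently degrading to stateless behavior.&lt;/p&gt;

```python
import os

def resolve_memory_id(env=os.environ):
    # The MEMORY_ID value is injected at deployment time, not hardcoded.
    memory_id = env.get("MEMORY_ID")
    if not memory_id:
        raise RuntimeError(
            "MEMORY_ID is not configured; refusing to start without a "
            "memory binding"
        )
    return memory_id
```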

&lt;h2&gt;
  
  
  Lesson 7: Short-term memory and long-term memory solve different problems
&lt;/h2&gt;

&lt;p&gt;I found it helpful to stop using the word “memory” as if it meant one thing.&lt;/p&gt;

&lt;p&gt;Short-term memory answers the question:&lt;/p&gt;

&lt;p&gt;"What was happening in this conversation recently?"&lt;/p&gt;

&lt;p&gt;Long-term memory answers a different question:&lt;/p&gt;

&lt;p&gt;"What durable information should the system remember beyond this immediate interaction?"&lt;/p&gt;

&lt;p&gt;For the agent I was building, the short-term problem came first. I needed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;recent-turn continuity,&lt;/li&gt;
&lt;li&gt;bounded replay,&lt;/li&gt;
&lt;li&gt;session-scoped context,&lt;/li&gt;
&lt;li&gt;predictable event retention.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I did not need semantic fact retrieval in the first phase. I did not need vector search for historical knowledge. I needed the agent to remain coherent across adjacent turns.&lt;/p&gt;

&lt;p&gt;That was an important design simplification. It kept the first version of the memory architecture focused on event continuity instead of overextending into knowledge retrieval prematurely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 8: Recent-turn replay should be bounded
&lt;/h2&gt;

&lt;p&gt;Once I had memory retrieval working, the next question was how much of it to inject back into the agent context.&lt;/p&gt;

&lt;p&gt;My lesson here was simple: more memory is not always better memory.&lt;/p&gt;

&lt;p&gt;If too much prior conversation is replayed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompt size grows,&lt;/li&gt;
&lt;li&gt;token cost grows,&lt;/li&gt;
&lt;li&gt;stale context starts competing with the current task,&lt;/li&gt;
&lt;li&gt;reasoning quality can actually decline.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I found the most practical pattern was to retrieve the last few turns and inject them into prompt context in a compact representation. In this design, that replay window was bounded at five turns.&lt;/p&gt;

&lt;p&gt;That gave me a good balance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;enough recent context for continuity,&lt;/li&gt;
&lt;li&gt;small enough context for predictable prompt growth,&lt;/li&gt;
&lt;li&gt;simple enough formatting to inspect and debug.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This also reinforced another lesson: short-term memory should be operationally understandable. I want to know what context the model saw, not just trust that some opaque memory layer handled it correctly.&lt;/p&gt;
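&lt;p&gt;A sketch of what bounded, inspectable replay can look like (the exact format is a design choice, and the helper name is hypothetical):&lt;/p&gt;

```python
# Turn the last k turns into a compact, human-readable context block.
# Bounding the window keeps prompt growth predictable, and the plain-text
# form makes it easy to inspect exactly what the model saw.

def format_recent_context(turns, k=5):
    window = turns[-k:]
    lines = ["%s: %s" % (role, text) for role, text in window]
    return "\n".join(lines)
```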

&lt;h2&gt;
  
  
  Lesson 9: Memory becomes more valuable when tools are involved
&lt;/h2&gt;

&lt;p&gt;The agent I built was not just a conversational shell. It had tools, including domain-specific behavior such as transcript retrieval and AWS interactions.&lt;/p&gt;

&lt;p&gt;That is where the value of short-term memory became even more obvious.&lt;/p&gt;

&lt;p&gt;In a tool-using workflow, the user often does not repeat the full context every turn. They say things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"use the same meeting"&lt;/li&gt;
&lt;li&gt;"what did the second speaker say?"&lt;/li&gt;
&lt;li&gt;"now summarize that"&lt;/li&gt;
&lt;li&gt;"check the S3 output from before"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without memory, the agent has to reconstruct working state from a single prompt. With memory, the agent has a much better chance of preserving:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the active object under discussion,&lt;/li&gt;
&lt;li&gt;the prior user instruction,&lt;/li&gt;
&lt;li&gt;the last tool result,&lt;/li&gt;
&lt;li&gt;the intended next step.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One of my strongest takeaways is that memory is not just a conversational improvement. It is a workflow improvement. It makes tool orchestration across turns materially more coherent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 10: Failure modes need to be designed, not discovered in production
&lt;/h2&gt;

&lt;p&gt;Building this also made me think much more carefully about degraded behavior.&lt;/p&gt;

&lt;p&gt;If memory resolution fails and the runtime cannot find a memory resource, the agent may still run. That sounds convenient, but it also means the system may silently shift from stateful to stateless behavior.&lt;/p&gt;

&lt;p&gt;That taught me to treat the following as first-class operational conditions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;memory enabled,&lt;/li&gt;
&lt;li&gt;memory disabled,&lt;/li&gt;
&lt;li&gt;memory load succeeded,&lt;/li&gt;
&lt;li&gt;memory write succeeded,&lt;/li&gt;
&lt;li&gt;memory resolution failed,&lt;/li&gt;
&lt;li&gt;identity inputs were missing or malformed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same thing applies to identity mistakes.&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;actor_id&lt;/code&gt; is unstable, memory becomes fragmented.&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;session_id&lt;/code&gt; is reused incorrectly, unrelated conversations bleed into each other.&lt;/p&gt;

&lt;p&gt;If replay windows grow without discipline, prompt quality degrades.&lt;/p&gt;

&lt;p&gt;These are not edge cases. They are part of the normal operating surface of a memory-aware agent.&lt;/p&gt;
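&lt;p&gt;One concrete pattern that follows from this, sketched with a hypothetical helper: wrap the memory load so a failure downgrades to stateless operation with a visible status flag, instead of disappearing into generic request handling.&lt;/p&gt;

```python
def load_memory_or_degrade(fetch_turns):
    """fetch_turns is any callable that returns recent turns or raises."""
    try:
        turns = fetch_turns()
        return {"memory_status": "loaded", "turns": turns}
    except Exception as exc:
        # Surface the condition to logs and metrics; do not pretend it
        # succeeded. The agent can still run, but the degradation is visible.
        return {"memory_status": "unavailable: %s" % exc, "turns": []}
```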

&lt;h2&gt;
  
  
  Lesson 11: Retention, privacy, and compliance show up earlier than expected
&lt;/h2&gt;

&lt;p&gt;Short-term memory sounds lightweight, but it is still stored interaction data.&lt;/p&gt;

&lt;p&gt;That means retention policy is not just a platform setting. It is part of the product design. While building this, I became much more aware that memory decisions quickly intersect with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;data handling policy,&lt;/li&gt;
&lt;li&gt;privacy expectations,&lt;/li&gt;
&lt;li&gt;deletion and retention requirements,&lt;/li&gt;
&lt;li&gt;security review,&lt;/li&gt;
&lt;li&gt;production observability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The technical implementation can be elegant, but if these operational questions are not addressed early, the design will be incomplete.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 12: AgentCore became more useful to me when I treated it as a runtime system, not just a hosting target
&lt;/h2&gt;

&lt;p&gt;This may be the broadest lesson of all.&lt;/p&gt;

&lt;p&gt;At first, I thought of AgentCore Runtime mainly as the place where the agent container would run. But while building with memory, I started appreciating it more as a runtime environment with clear operational boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the runtime executes the agent,&lt;/li&gt;
&lt;li&gt;the framework manages reasoning and tools,&lt;/li&gt;
&lt;li&gt;the memory plane manages event continuity,&lt;/li&gt;
&lt;li&gt;the deployment workflow binds the right resources together.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That view helped me move beyond “deploy a model wrapper in a container” toward “operate an agent system with state, identity, and lifecycle.”&lt;/p&gt;

&lt;p&gt;For me, that was the real shift.&lt;/p&gt;

&lt;h2&gt;
  
  
  The technical pattern I would reuse
&lt;/h2&gt;

&lt;p&gt;If I were building the same class of agent again, I would reuse the same high-level pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a dedicated short-term memory resource.&lt;/li&gt;
&lt;li&gt;Resolve the correct memory resource during deployment.&lt;/li&gt;
&lt;li&gt;Pass memory identity into the runtime explicitly.&lt;/li&gt;
&lt;li&gt;Build the agent per request with user and session state.&lt;/li&gt;
&lt;li&gt;Load recent turns during agent initialization.&lt;/li&gt;
&lt;li&gt;Persist new messages when they are added.&lt;/li&gt;
&lt;li&gt;Keep replay windows bounded.&lt;/li&gt;
&lt;li&gt;Treat &lt;code&gt;actor_id&lt;/code&gt; and &lt;code&gt;session_id&lt;/code&gt; as core correctness boundaries.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I would also keep the same mental model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;short-term memory is for continuity,&lt;/li&gt;
&lt;li&gt;long-term memory is for durable recall,&lt;/li&gt;
&lt;li&gt;hooks are the right place for memory orchestration,&lt;/li&gt;
&lt;li&gt;deployment is part of memory architecture,&lt;/li&gt;
&lt;li&gt;observability should make degraded memory behavior visible.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;The biggest lesson I learned while building with Amazon Bedrock AgentCore Runtime is that memory is not something you sprinkle onto an agent once the rest of the system works. Memory changes the shape of the system.&lt;/p&gt;

&lt;p&gt;It affects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;request lifecycle,&lt;/li&gt;
&lt;li&gt;identity boundaries,&lt;/li&gt;
&lt;li&gt;prompt construction,&lt;/li&gt;
&lt;li&gt;deployment,&lt;/li&gt;
&lt;li&gt;observability,&lt;/li&gt;
&lt;li&gt;privacy,&lt;/li&gt;
&lt;li&gt;and tool coherence across turns.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once I accepted that, the architecture became much more disciplined. The agent became easier to reason about, easier to operate, and much more capable in real multi-turn interactions.&lt;/p&gt;

&lt;p&gt;That is the lesson I would carry into any future AgentCore build: if the experience is meant to feel conversational, memory has to be designed as a first-class runtime concern from the beginning.&lt;/p&gt;

</description>
      <category>agentcore</category>
      <category>aws</category>
      <category>serverless</category>
      <category>agentskills</category>
    </item>
    <item>
      <title>API Gateway as Websocket</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Tue, 21 Jan 2025 07:49:42 +0000</pubDate>
      <link>https://dev.to/amitkayal/api-gateway-as-websocket-5eee</link>
      <guid>https://dev.to/amitkayal/api-gateway-as-websocket-5eee</guid>
      <description>&lt;h1&gt;
  
  
  API Gateway as WebSocket
&lt;/h1&gt;

&lt;h2&gt;
  
  
  API Gateway as WS Components
&lt;/h2&gt;

&lt;p&gt;WebSocket provides bidirectional, session-aware communication between caller and receiver, and is a crucial component for real-time applications.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Setup API Gateway for WebSocket&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a WebSocket API in the Amazon API Gateway console or through IaC.&lt;/li&gt;
&lt;li&gt;Define the WebSocket API route selection expression (e.g., $request.body.action); routes act as a bridge between incoming messages and their integrations.&lt;/li&gt;
&lt;li&gt;Define the following WebSocket routes:

&lt;ul&gt;
&lt;li&gt;$connect: Triggered when a client establishes a connection.&lt;/li&gt;
&lt;li&gt;$disconnect: Triggered when a client disconnects.&lt;/li&gt;
&lt;li&gt;Custom routes, e.g., sendMessage, to handle specific actions.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Create an Integration with AWS Lambda&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For each route ($connect, $disconnect, custom routes), integrate a Lambda function to handle the respective logic.&lt;/li&gt;
&lt;li&gt;Use the Lambda function's handler to process:

&lt;ul&gt;
&lt;li&gt;$connect: Store the connection in DynamoDB.&lt;/li&gt;
&lt;li&gt;$disconnect: Remove the connection from DynamoDB.&lt;/li&gt;
&lt;li&gt;Custom routes: Process the message and forward it to SQS.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;DynamoDB for Connection Management&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a DynamoDB table to store:

&lt;ul&gt;
&lt;li&gt;Connection ID (Primary Key).&lt;/li&gt;
&lt;li&gt;Session ID or other metadata for grouping connections.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;This table allows tracking active WebSocket connections for broadcasting messages.&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Configure SQS for Message Queue&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use an SQS FIFO queue for guaranteed order and deduplication.&lt;/li&gt;
&lt;li&gt;Messages processed in Lambda (custom routes) are sent to SQS for downstream services.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;IAM Roles and Permissions&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Assign an IAM role to the API Gateway to invoke the integrated Lambda functions.&lt;/li&gt;
&lt;li&gt;Grant Lambda permissions to read/write from DynamoDB and send messages to SQS.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Client Connection and Messaging&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use WebSocket-compatible libraries (e.g., ws in Node.js or the WebSocket API in browsers) to:

&lt;ul&gt;
&lt;li&gt;Establish a WebSocket connection to the API Gateway endpoint.&lt;/li&gt;
&lt;li&gt;Send and receive messages using the WebSocket protocol.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
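&lt;p&gt;A minimal sketch of the $connect and $disconnect handlers described above. The DynamoDB table is injected so the logic can be exercised without AWS; in a real Lambda it would come from boto3.resource("dynamodb").Table("connections"), where the table name is an assumption.&lt;/p&gt;

```python
# $connect stores the connection in DynamoDB; $disconnect removes it.
# The event shape follows the API Gateway WebSocket Lambda proxy format,
# where the connection ID lives under requestContext.connectionId.

def on_connect(event, table):
    connection_id = event["requestContext"]["connectionId"]
    table.put_item(Item={"connectionId": connection_id})
    return {"statusCode": 200}

def on_disconnect(event, table):
    connection_id = event["requestContext"]["connectionId"]
    table.delete_item(Key={"connectionId": connection_id})
    return {"statusCode": 200}
```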

&lt;h2&gt;
  
  
  Architecture of Websocket mechanism
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;WebSocket Client:

&lt;ul&gt;
&lt;li&gt;Initiates WebSocket connection and communicates via send() and onmessage().&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;API Gateway (WebSocket API):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manages WebSocket connections and invokes Lambda functions for defined routes.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Route Integration (Lambda Functions):&lt;br&gt;
Every route should have an integration. There are three types: Mock, HTTP, and Lambda.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$connect: Adds connection metadata to DynamoDB.&lt;/li&gt;
&lt;li&gt;$disconnect: Removes connection metadata from DynamoDB.&lt;/li&gt;
&lt;li&gt;$default route: Selected when the incoming message cannot be matched against any defined route.&lt;/li&gt;
&lt;li&gt;Custom Routes: Processes messages to invoke integration based on message content and forwards them to SQS.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;DynamoDB:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintains active connection records, including connectionId and associated metadata.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;SQS FIFO Queue:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Queues messages for downstream processing, ensuring delivery order and deduplication.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Downstream Services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processes messages from SQS and performs actions like notifications, data updates, or storage.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
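&lt;p&gt;To push messages back to connected clients, the backend uses the API Gateway Management API's post_to_connection call. Here is a sketch with the client injected for testability; in a Lambda it would be boto3.client("apigatewaymanagementapi", endpoint_url=...) pointed at the WebSocket stage endpoint, and the helper name is hypothetical.&lt;/p&gt;

```python
# Broadcast a payload to a set of stored connection IDs. A failed send
# (typically a GoneException) means the client disconnected, so the stale
# connectionId should be pruned from DynamoDB by the caller.

def broadcast(api_client, connection_ids, payload):
    delivered = []
    for connection_id in connection_ids:
        try:
            api_client.post_to_connection(
                ConnectionId=connection_id,
                Data=payload.encode("utf-8"),
            )
            delivered.append(connection_id)
        except Exception:
            # Stale connection; skip it and let the caller clean up.
            pass
    return delivered
```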

&lt;h2&gt;
  
  
  Security
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Authentication and Authorization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Custom Authorizer (Lambda Authorizer)&lt;br&gt;
It can only be used for the $connect route.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a Lambda Authorizer to validate custom tokens or headers sent during connection attempts.&lt;/li&gt;
&lt;li&gt;Example:

&lt;ul&gt;
&lt;li&gt;Validate a JWT token from an identity provider (e.g., Cognito, Auth0).&lt;/li&gt;
&lt;li&gt;Check the token against allowed users or roles.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Amazon Cognito:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use Amazon Cognito for user authentication.&lt;/li&gt;
&lt;li&gt;Configure API Gateway to use Cognito to validate tokens in connection requests.&lt;/li&gt;
&lt;li&gt;Best suited for applications with user pools.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Secure WebSocket Connections
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Always use the secure WebSocket protocol (wss://). API Gateway enforces HTTPS/TLS, ensuring encrypted communication.&lt;/li&gt;
&lt;li&gt;Associate a custom domain with the API Gateway WebSocket endpoint, and use AWS Certificate Manager (ACM) to manage SSL/TLS certificates.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  IP Whitelisting and Blacklisting
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Attach AWS WAF to API Gateway to block or allow requests based on IP addresses or CIDR ranges. Rate-based rules also help protect against DDoS attacks.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  API Gateway Throttling
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Set rate and burst limits on API Gateway routes to limit the number of connections per client.&lt;/li&gt;
&lt;li&gt;Create API keys, associate them with a usage plan, and limit the number of allowed requests per API key.&lt;/li&gt;
&lt;/ul&gt;
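&lt;p&gt;For WebSocket APIs, stage-level throttling can be applied through the apigatewayv2 API. A sketch with the client injected for testability; in practice it would be boto3.client("apigatewayv2"), and the ApiId and stage name below are placeholders.&lt;/p&gt;

```python
# Apply default route throttling to a WebSocket API stage. Rate limit is
# steady-state requests per second; burst limit caps short spikes.

def apply_throttling(client, api_id, stage, rate_limit, burst_limit):
    return client.update_stage(
        ApiId=api_id,
        StageName=stage,
        DefaultRouteSettings={
            "ThrottlingRateLimit": float(rate_limit),
            "ThrottlingBurstLimit": int(burst_limit),
        },
    )
```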

&lt;h3&gt;
  
  
  Environment-based Access Control
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;We should always use distinct stages (e.g., dev, prod) and restrict connections to the production API through IP rules.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Tools to test
&lt;/h2&gt;

&lt;p&gt;The following tools can be used to test WebSocket APIs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PieSocket&lt;/li&gt;
&lt;li&gt;Postman&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>apigateway</category>
      <category>api</category>
    </item>
    <item>
      <title>S3 table &amp; S3 Metadata table</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Mon, 09 Dec 2024 18:26:23 +0000</pubDate>
      <link>https://dev.to/aws-builders/s3-table-s3-metadata-table-91i</link>
      <guid>https://dev.to/aws-builders/s3-table-s3-metadata-table-91i</guid>
      <description>&lt;h2&gt;
  
  
  Open table format and its architecture
&lt;/h2&gt;

&lt;p&gt;Open table formats, such as Apache Iceberg, Apache Hudi, and Delta Lake, have gained popularity in data analytics mainly because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ACID Transactions: Open table formats (e.g., Apache Iceberg, Delta Lake) ensure reliable and consistent data updates, even with concurrent access.&lt;/li&gt;
&lt;li&gt;Schema Evolution: They allow seamless updates to schemas without disrupting existing pipelines, simplifying data management. Metadata tracks changes to the dataset: the files held in the data layer are captured by metadata files in the metadata layer, and as data files change, the metadata files attached to them track those changes.&lt;/li&gt;
&lt;li&gt;Optimized Queries: Partitioning and indexing enable faster queries by scanning only relevant data, improving performance and cost-efficiency.&lt;/li&gt;
&lt;li&gt;Time Travel: Users can access historical versions of data for debugging, compliance, or analytics.&lt;/li&gt;
&lt;li&gt;Interoperability: These formats integrate seamlessly with big data tools like Spark, Flink, and Presto, making them versatile and widely adopted.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Open file format
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkl9mm5r6t0aqp4uy7dqa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkl9mm5r6t0aqp4uy7dqa.png" alt="img" width="750" height="588"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  S3 table
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;p&gt;Amazon S3 Tables is optimized for analytics workloads. It is designed to continuously enhance query performance and reduce storage costs for tabular data. This solution looks very promising if you are working with a lakehouse architecture. It is a new type of bucket that organizes tables as sub-resources.&lt;br&gt;
&lt;strong&gt;A new bucket type, the S3 table bucket, has been introduced to support this. Like any other AWS resource, it has an ARN, can take a resource policy, and, as a unique feature, has a dedicated endpoint.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 Tables are intended explicitly for storing data in a tabular format, such as daily purchase transactions, streaming sensor data, or ad impressions. This data is organized into columns and rows like a database table.&lt;/li&gt;
&lt;li&gt;Table buckets support storing tables in the Apache Iceberg format. You can query these tables using standard SQL in query engines that support Iceberg.&lt;/li&gt;
&lt;li&gt;Read and write are allowed on data files and metadata files. Delete and update are not allowed, to preserve data integrity.&lt;/li&gt;
&lt;li&gt;Compatible query engines include Amazon Athena, Amazon Redshift, and Apache Spark.&lt;/li&gt;
&lt;li&gt;S3 Table automatically performs maintenance tasks like compaction and snapshot management to optimize your tables for querying, including removing unreferenced files.&lt;/li&gt;
&lt;li&gt;S3 Table offers access management for both table and bucket&lt;/li&gt;
&lt;li&gt;Fully managed Apache Iceberg tables in S3.&lt;/li&gt;
&lt;li&gt;It supports automatic compaction of underlying files to improve query performance and tune them further for better latency.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  S3 Table buckets namespace
&lt;/h3&gt;

&lt;p&gt;A namespace logically groups related S3 tables together, giving us greater control at the namespace level. It helps with the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Logical segmentation of data and multi-tenancy

&lt;ul&gt;
&lt;li&gt;Supports multi-tenancy by using separate namespaces, which also supports compliance with data isolation requirements in regulated industries.&lt;/li&gt;
&lt;li&gt;Separates tables by application, project, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Prevent naming conflicts

&lt;ul&gt;
&lt;li&gt;Each namespace acts like a "container," allowing tables with the same name in different namespaces without conflicts.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Better Access Control

&lt;ul&gt;
&lt;li&gt;Policies can grant or restrict access to specific namespaces, ensuring data security and compliance.  It also reduces the risk of unauthorized access to unrelated tables in the same bucket.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Easy data management

&lt;ul&gt;
&lt;li&gt;Makes it easier to query, update, or delete related tables in bulk.&lt;/li&gt;
&lt;li&gt;Simplifies metadata management for tables grouped under a namespace.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Advanced workflows based on namespace

&lt;ul&gt;
&lt;li&gt;It helps to simplify automation for data pipelines or real-time analytics applications.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  S3 Table operation &amp;amp; management
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Table Operations&lt;/strong&gt;&lt;br&gt;
These are quite similar to CRUD operations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;List tables&lt;/li&gt;
&lt;li&gt;Create tables&lt;/li&gt;
&lt;li&gt;Get table metadata location&lt;/li&gt;
&lt;li&gt;Update table metadata location&lt;/li&gt;
&lt;li&gt;Delete table&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Table Management&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Put Table Policy&lt;/li&gt;
&lt;li&gt;Put Table Bucket Policy&lt;/li&gt;
&lt;li&gt;Put Table Maintenance Config&lt;/li&gt;
&lt;li&gt;Put Table Bucket Maintenance Config&lt;/li&gt;
&lt;/ul&gt;
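&lt;p&gt;The operations above map onto the boto3 &lt;code&gt;s3tables&lt;/code&gt; client. The sketch below shows one of them; the client name and the parameter casing follow the S3 Tables API reference and should be verified against your boto3 version.&lt;/p&gt;

```python
def table_bucket_arn(region, account_id, bucket_name):
    """Build the table bucket ARN that the s3tables APIs expect."""
    return f"arn:aws:s3tables:{region}:{account_id}:bucket/{bucket_name}"


def list_tables_in_namespace(region, account_id, bucket_name, namespace):
    """List the tables in one namespace of a table bucket.

    boto3 is imported lazily so the ARN helper above stays
    dependency-free; a recent boto3 release is required for the
    's3tables' service.
    """
    import boto3

    client = boto3.client("s3tables", region_name=region)
    return client.list_tables(
        tableBucketARN=table_bucket_arn(region, account_id, bucket_name),
        namespace=namespace,
    )
```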
&lt;h2&gt;
  
  
  Policies related to S3 table operation
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Allow access to create and use table buckets
&lt;/h3&gt;

&lt;p&gt;Here, Action lists the specific actions the policy allows. &lt;/p&gt;

&lt;p&gt;These actions are S3 Tables-specific: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;s3tables:CreateTableBucket: Grants permission to create a table bucket in S3 Tables. &lt;/li&gt;
&lt;li&gt;s3tables:PutTableBucketPolicy: Allows setting or updating the bucket policy for a table bucket. &lt;/li&gt;
&lt;li&gt;s3tables:GetTableBucketPolicy: Allows retrieving the bucket policy associated with a table bucket. &lt;/li&gt;
&lt;li&gt;s3tables:ListTableBuckets: Allows listing all table buckets within the specified scope. &lt;/li&gt;
&lt;li&gt;s3tables:GetTableBucket: Grants permission to access the metadata of a specific table bucket.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Resource defines the scope of the resources these actions can apply to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"arn:aws:s3tables:region:account_id:bucket/*": Specifies all table buckets in the account (account_id) and region (region).&lt;/li&gt;
&lt;li&gt;The * after bucket/ indicates that permissions apply to all buckets under this account and region.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowBucketActionsForUser",
        "Effect": "Allow",
        "Action": [
            "s3tables:CreateTableBucket",
            "s3tables:PutTableBucketPolicy",
            "s3tables:GetTableBucketPolicy",
            "s3tables:ListTableBuckets",
            "s3tables:GetTableBucket"
        ],
        "Resource": "arn:aws:s3tables:region:account_id:bucket/*"
    }]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
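&lt;p&gt;The same policy document can also be generated programmatically, which avoids hand-editing the region and account placeholders. A minimal sketch in Python:&lt;/p&gt;

```python
import json


def bucket_admin_policy(region, account_id):
    """Build the table-bucket administration policy shown above,
    with the region and account_id placeholders filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowBucketActionsForUser",
            "Effect": "Allow",
            "Action": [
                "s3tables:CreateTableBucket",
                "s3tables:PutTableBucketPolicy",
                "s3tables:GetTableBucketPolicy",
                "s3tables:ListTableBuckets",
                "s3tables:GetTableBucket",
            ],
            "Resource": f"arn:aws:s3tables:{region}:{account_id}:bucket/*",
        }],
    }


# Serialize for use with the IAM APIs or CLI
policy_json = json.dumps(bucket_admin_policy("us-east-1", "111122223333"), indent=4)
```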

&lt;h3&gt;
  
  
  Allow access to create and use tables in a table bucket
&lt;/h3&gt;

&lt;p&gt;Here, Action lists the specific actions allowed by the policy, related to S3 Tables. &lt;em&gt;Note that the first policy focused on creating and managing table buckets and their metadata; it did not include granular operations such as creating tables, querying data, or updating metadata at the table level. These table-level operations are where namespaces become relevant.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;s3tables:CreateTable: Allows creating new tables in the specified table bucket. &lt;/li&gt;
&lt;li&gt;s3tables:PutTableData: Grants permission to write data to tables within the table bucket. &lt;/li&gt;
&lt;li&gt;s3tables:GetTableData: Allows reading data from tables in the bucket.&lt;/li&gt;
&lt;li&gt;s3tables:GetTableMetadataLocation: Allows retrieving metadata location information for a table.&lt;/li&gt;
&lt;li&gt;s3tables:UpdateTableMetadataLocation: Grants permission to update the metadata location of a table. &lt;/li&gt;
&lt;li&gt;s3tables:GetNamespace: Allows retrieving namespace information associated with the table bucket. &lt;/li&gt;
&lt;li&gt;s3tables:CreateNamespace: Grants permission to create namespaces for organizing table data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Resource section specifies&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Grants permissions on the bucket named amzn-s3-demo-table-bucket&lt;/li&gt;
&lt;li&gt;Grants permissions on all tables within the amzn-s3-demo-table-bucket
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
     "Version": "2012-10-17",
     "Statement": [ 
         {
             "Sid": "AllowBucketActions",
             "Effect": "Allow",
             "Action": [
                 "s3tables:CreateTable",
                 "s3tables:PutTableData",
                 "s3tables:GetTableData",
                 "s3tables:GetTableMetadataLocation",
                 "s3tables:UpdateTableMetadataLocation",
                 "s3tables:GetNamespace",
                 "s3tables:CreateNamespace"
             ],

             "Resource": [
               "arn:aws:s3tables:region:account_id:bucket/amzn-s3-demo-table-bucket",
               "arn:aws:s3tables:region:account_id:bucket/amzn-s3-demo-table-bucket/table/*"
            ]
         }
     ]
 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  Table bucket policy to allows read access to the namespace
&lt;/h4&gt;

&lt;p&gt;This policy allows a principal to read S3 tables from a namespace. Here, Action lists the specific actions allowed by the policy, related to S3 Tables. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;s3tables:GetTableData: Allows reading data from tables in the bucket.&lt;/li&gt;
&lt;li&gt;s3tables:GetTableMetadataLocation: Allows retrieving metadata location information for a table.&lt;/li&gt;
&lt;li&gt;The Resource section grants access to all tables under the bucket amzn-s3-demo-table-bucket1, but the s3tables:namespace condition then restricts access to tables in the hr namespace.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
     "Version": "2012-10-17",
     "Statement": [ 
         {
             "Effect": "Allow",
             "Principal": {
                 "AWS": "arn:aws:iam::123456789012:user/Jane"
             },
             "Action": [
                 "s3tables:GetTableData",
                 "s3tables:GetTableMetadataLocation"
             ],
             "Resource": "arn:aws:s3tables:region:account_id:bucket/amzn-s3-demo-table-bucket1/table/*",
             "Condition": {
                 "StringLike": { "s3tables:namespace": "hr" }
             }
         }
     ]
 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
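&lt;p&gt;The StringLike operator in the condition above is case-sensitive and supports the * and ? wildcards. Its behavior can be approximated locally with Python's fnmatch (an approximation only: fnmatch also supports [seq] character classes, which IAM does not):&lt;/p&gt;

```python
from fnmatch import fnmatchcase


def namespace_allowed(request_namespace, condition_pattern):
    """Approximate how IAM's StringLike operator evaluates the
    s3tables:namespace condition key for a given request."""
    return fnmatchcase(request_namespace, condition_pattern)
```

&lt;p&gt;A request against the hr namespace matches the pattern "hr", while finance does not; a wildcard pattern like "hr*" would also admit namespaces such as hr_payroll.&lt;/p&gt;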

&lt;h2&gt;
  
  
  S3 table automatic maintenance
&lt;/h2&gt;

&lt;p&gt;It provides automated maintenance through configurations that help simplify table management, optimize performance, and reduce operational overhead.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Table Lifecycle Management

&lt;ul&gt;
&lt;li&gt;We can add S3 Table configurations that include lifecycle policies to automatically handle data expiration, transitions, or archival.&lt;/li&gt;
&lt;li&gt;Automatic snapshot expiration can be configured easily.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Data Compaction

&lt;ul&gt;
&lt;li&gt;S3 Tables automatically compact small files (often produced by incremental writes) into larger, optimized files, enabling faster queries and reducing storage costs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Schema Evolution

&lt;ul&gt;
&lt;li&gt;Automated checks ensure compatibility between new and existing data.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Metadata Optimization

&lt;ul&gt;
&lt;li&gt;Indexing of metadata for faster querying and retrieval of table details.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these can be driven by policy-based configuration.&lt;/p&gt;
&lt;h3&gt;
  
  
  Policy for snapshot management
&lt;/h3&gt;

&lt;p&gt;By configuring maximumSnapshotAge, we can specify the retention period for table snapshots. The following example ensures the S3 table automatically retains only snapshots from the last 30 days (720 hours).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MinimumSnapshots: Ensures that at least one snapshot is always retained, regardless of age. &lt;/li&gt;
&lt;li&gt;MaximumSnapshotAge: Specifies the maximum age (in hours) for snapshots to be retained.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3tables put-table-maintenance-configuration \
    --table-arn arn:aws:s3tables:region:account_id:bucket/bucket_name/table/table_name \
    --maintenance-configuration '{
        "SnapshotManagement": {
            "MinimumSnapshots": 1,
            "MaximumSnapshotAge": 720
        }
    }'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
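&lt;p&gt;The same configuration can be produced from Python, making the hours conversion explicit (30 days x 24 = 720 hours). The put call below mirrors the CLI invocation above; the exact boto3 parameter shape for the s3tables client is an assumption and should be verified against your boto3 version.&lt;/p&gt;

```python
def snapshot_retention_config(days, minimum_snapshots=1):
    """Build the SnapshotManagement maintenance configuration,
    converting a retention period in days to the hours the API
    expects."""
    return {
        "SnapshotManagement": {
            "MinimumSnapshots": minimum_snapshots,
            "MaximumSnapshotAge": days * 24,
        }
    }


def apply_snapshot_retention(table_arn, days):
    """Apply the configuration via boto3 (sketch; parameter names
    assumed from the CLI call above)."""
    import boto3

    client = boto3.client("s3tables")
    client.put_table_maintenance_configuration(
        tableARN=table_arn,
        maintenanceConfiguration=snapshot_retention_config(days),
    )
```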

&lt;h2&gt;
  
  
  S3 Table Integration with AWS Analytics
&lt;/h2&gt;

&lt;p&gt;S3 Tables integrate seamlessly with AWS analytics services to enable querying, processing and insight generation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Athena - Run serverless SQL queries on S3 Tables&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use AWS Glue to create a Data Catalog for S3 Tables.&lt;/li&gt;
&lt;li&gt;Query data directly using SQL in Athena.&lt;/li&gt;
&lt;li&gt;Leverage table formats like Apache Iceberg or Parquet for optimized performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AWS Glue - Automate ETL processes for S3 Tables&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use Glue Crawlers to discover table metadata.&lt;/li&gt;
&lt;li&gt;Create ETL jobs to transform and load data into S3 Tables or other destinations.&lt;/li&gt;
&lt;/ul&gt;
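&lt;p&gt;With the Glue Data Catalog integration enabled, Athena addresses an S3 table through a catalog entry scoped to its table bucket. The helper below builds the fully qualified identifier; the s3tablescatalog catalog name is an assumption based on the AWS analytics integration and should be confirmed for your setup.&lt;/p&gt;

```python
def athena_table_ref(table_bucket, namespace, table):
    """Build a fully qualified Athena identifier for an S3 table,
    assuming the integration registers each table bucket under a
    Glue catalog named 's3tablescatalog/BUCKET_NAME'."""
    return f'"s3tablescatalog/{table_bucket}"."{namespace}"."{table}"'


# An Athena query against the hr.payroll table of analytics-bucket
query = f"SELECT COUNT(*) FROM {athena_table_ref('analytics-bucket', 'hr', 'payroll')}"
```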
&lt;h2&gt;
  
  
  S3 Metadata table
&lt;/h2&gt;

&lt;p&gt;An S3 Metadata table captures system metadata, object tags, and user-defined metadata. The metadata is stored in an S3 table and is generated in near real time as data is created, so it becomes queryable within minutes.&lt;/p&gt;
&lt;h3&gt;
  
  
  Use case for S3 metadata table
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Real-Time Analytics

&lt;ul&gt;
&lt;li&gt;efficient query execution on metadata to identify relevant data partitions.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Machine Learning Pipelines

&lt;ul&gt;
&lt;li&gt;metadata tables to filter, select, and partition data for model training.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Governance and Compliance

&lt;ul&gt;
&lt;li&gt;Track data retention and enforce lifecycle policies via metadata.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Multi-Tenant Data Applications

&lt;ul&gt;
&lt;li&gt;Use namespaces within metadata tables to logically isolate tenant data.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Data Cataloging and Discovery

&lt;ul&gt;
&lt;li&gt;Use metadata queries to identify datasets matching specific criteria.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is a sample Python function that queries the metadata table through Athena.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def query_metadata_table(criteria):

    query = f"""
        SELECT *
        FROM {DATABASE}.{TABLE}
        WHERE {criteria}
    """

    print(f"Running query: {query}")

    # Start Athena query
    response = athena_client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={'Database': DATABASE},
        ResultConfiguration={'OutputLocation': S3_OUTPUT}
    )

    query_execution_id = response['QueryExecutionId']

    # Wait for query completion
    print("Waiting for query to complete...")
    while True:
        status = athena_client.get_query_execution(QueryExecutionId=query_execution_id)
        state = status['QueryExecution']['Status']['State']
        if state in ['SUCCEEDED', 'FAILED', 'CANCELLED']:
            break
        time.sleep(2)

    if state != 'SUCCEEDED':
        raise Exception(f"Query failed with state: {state}")

    # Retrieve results
    results = athena_client.get_query_results(QueryExecutionId=query_execution_id)
    datasets = []
    for row in results['ResultSet']['Rows'][1:]:  # Skip the header row
        datasets.append([col['VarCharValue'] for col in row['Data']])

    print(f"Query returned {len(datasets)} datasets matching the criteria.")
    return datasets
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
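&lt;p&gt;Note that query_metadata_table() interpolates the criteria string directly into the SQL text, so the criteria should never be built from untrusted input. A small helper (column names here are illustrative) that quotes values and validates column names before they reach the query:&lt;/p&gt;

```python
def build_criteria(**filters):
    """Build an AND-ed WHERE clause from column=value pairs.

    Column names must be plain identifiers, and single quotes in
    values are doubled, which covers simple string literals; for
    anything richer, use Athena's parameterized queries instead.
    """
    parts = []
    for column, value in filters.items():
        if not column.isidentifier():
            raise ValueError(f"unexpected column name: {column}")
        escaped = str(value).replace("'", "''")
        parts.append(f"{column} = '{escaped}'")
    return " AND ".join(parts)
```

&lt;p&gt;For example: &lt;code&gt;query_metadata_table(build_criteria(bucket="demo"))&lt;/code&gt;.&lt;/p&gt;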



</description>
      <category>aws</category>
      <category>s3</category>
      <category>analytics</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Brief Notes on AWS CodeDeploy</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Thu, 21 Mar 2024 19:04:04 +0000</pubDate>
      <link>https://dev.to/aws-builders/brief-notes-on-aws-codedeploy-2731</link>
      <guid>https://dev.to/aws-builders/brief-notes-on-aws-codedeploy-2731</guid>
      <description>&lt;p&gt;Service that automates code deployments to any instance, including Amazon EC2 instances and instances running on-premises.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supported Platforms/Deployment Types:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;EC2/On-Premises: In-Place or Blue/Green Deployments&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Describes instances of physical servers that can be Amazon EC2 cloud instances, on-premises servers, or both. Applications created using the EC2/On-Premises compute platform can be composed of executable files, configuration files, images, and more. Deployments that use the EC2/On-Premises compute platform manage the way in which traffic is directed to instances by using an in-place or blue/green deployment type.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;AWS Lambda: Canary, Linear, All-At-Once Deployments&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Applications created using the AWS Lambda compute platform can manage the way in which traffic is directed to the updated Lambda function versions during a deployment by choosing a canary, linear, or all-at-once configuration.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Amazon ECS: Blue/Green Deployment&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Used to deploy an Amazon ECS containerized application as a task set. &lt;/li&gt;
&lt;li&gt;CodeDeploy performs a blue/green deployment by installing an updated version of the containerized application as a new replacement task set. CodeDeploy reroutes production traffic from the original application, or task set, to the replacement task set. The original task set is terminated after a successful deployment.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Deployment approach for EC2
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Deploys a revision to a set of instances.&lt;/li&gt;
&lt;li&gt;Deploys a new revision that consists of an application and AppSpec file. The AppSpec specifies how to deploy the application to the instances in a deployment group.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fezsau98rpjq1qlkvy69j.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fezsau98rpjq1qlkvy69j.jpg" alt="URL" width="635" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deployment approach for Lambda
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Deploys a new version of a serverless Lambda function on a high-availability compute infrastructure.&lt;/li&gt;
&lt;li&gt;Shifts production traffic from one version of a Lambda function to a new version of the same function. The AppSpec file specifies which Lambda function version to deploy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7gabfs5volddde900t0u.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7gabfs5volddde900t0u.jpg" alt="url" width="660" height="290"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deployment approach for ECS
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Deploys an updated version of an Amazon ECS containerized application as a new, replacement task set. CodeDeploy reroutes production traffic from the task set with the original version to the new replacement task set with the updated version. When the deployment completes, the original task set is terminated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobk0qy9jw9jw03ddevli.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobk0qy9jw9jw03ddevli.jpg" alt="URL" width="660" height="294"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  App Spec File
&lt;/h2&gt;

&lt;p&gt;The application specification file (AppSpec file) is a YAML-formatted or JSON-formatted file used by CodeDeploy to manage a deployment. Note: the name of the AppSpec file for an EC2/On-Premises deployment must be appspec.yml. The name of the AppSpec file for an Amazon ECS or AWS Lambda deployment must be appspec.yaml.&lt;/p&gt;

&lt;p&gt;For ECS&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The container and port in replacement task set where your Application Load Balancer or Network Load Balancer reroutes traffic during a deployment. This is specified with the LoadBalancerInfo instruction in the AppSpec file.&lt;/li&gt;
&lt;li&gt;Amazon ECS task definition file. This is specified with its ARN in the TaskDefinition instruction in the AppSpec file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For Lambda&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lambda function version to deploy.&lt;/li&gt;
&lt;li&gt;Lambda functions to use as validation tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For EC2&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which lifecycle event hooks to run in response to deployment lifecycle events.&lt;/li&gt;
&lt;/ul&gt;
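&lt;p&gt;A minimal appspec.yml for an EC2/On-Premises deployment, wiring shell scripts to lifecycle event hooks (the file paths and script names here are illustrative):&lt;/p&gt;

```yaml
version: 0.0
os: linux
files:
  - source: /app
    destination: /var/www/app
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies.sh
      timeout: 300
      runas: root
  ApplicationStart:
    - location: scripts/start_server.sh
      timeout: 300
  ValidateService:
    - location: scripts/validate_service.sh
      timeout: 120
```

&lt;p&gt;CodeDeploy runs each script at the named lifecycle event and fails the deployment if a script exits non-zero or exceeds its timeout.&lt;/p&gt;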

</description>
      <category>aws</category>
      <category>devops</category>
      <category>cicd</category>
    </item>
    <item>
      <title>Bedrock Agent &amp; Tools - Tracing Best practises</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Wed, 20 Mar 2024 17:58:52 +0000</pubDate>
      <link>https://dev.to/aws-builders/bedrock-agent-tools-tracing-best-practises-4217</link>
      <guid>https://dev.to/aws-builders/bedrock-agent-tools-tracing-best-practises-4217</guid>
      <description>&lt;p&gt;I understand most of bedrock agent userss will have a use case where you have implemented multiple Lambda functions with a Bedrock Agent to perform different tasks and are looking for guidance in Debugging the API calls and responses from the Agent and lambda functions.&lt;/p&gt;

&lt;p&gt;Here are some of the approaches that we have been using and have found quite effective for tracking and tracing agents and the usage of their tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enable Tracing for the Agent: When invoking the agent, set the &lt;code&gt;debug&lt;/code&gt; parameter to &lt;code&gt;true&lt;/code&gt;. This will enable detailed tracing for the agent's execution, including the tools (Lambda functions) invoked and their responses. The trace will be printed to the console or returned as part of the agent's response, depending on how you invoke the agent. [1] Example (Python): &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;result = agent.run(query, debug=True)&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Log Within Lambda Functions: Within each of your Lambda functions (tools), add logging statements to capture relevant information and events. You can use AWS Lambda's built-in logging capabilities or integrate with a centralized logging service like Amazon CloudWatch Logs. [2] Example (Python): &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;import logging&lt;br&gt;
logger = logging.getLogger(__name__)&lt;br&gt;
def lambda_handler(event, context):&lt;br&gt;
    logger.info(f"Received event: {event}")&lt;br&gt;
    # Your Lambda function's logic here&lt;br&gt;
    logger.info(f"Returning result: {result}")&lt;br&gt;
    return result&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Correlate Logs Using Request IDs or Tracing IDs: To correlate logs across multiple Lambda functions and the agent, you can use request IDs or tracing IDs. Pass a unique ID as part of the event or context to your Lambda functions and include it in your log statements. This will allow you to trace the flow of events across different components of your system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;&lt;br&gt;
import logging&lt;br&gt;
import uuid&lt;br&gt;
def lambda_handler(event, context):&lt;br&gt;
    request_id = event.get("request_id", str(uuid.uuid4()))&lt;br&gt;
    logger = logging.getLogger(__name__)&lt;br&gt;
    logger = logging.LoggerAdapter(logger, {"request_id": request_id})&lt;br&gt;
    logger.info(f"Received event: {event}")&lt;br&gt;
    # Your Lambda function's logic here&lt;br&gt;
    logger.info(f"Returning result: {result}")&lt;br&gt;
    return result&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Use AWS X-Ray for Distributed Tracing: AWS X-Ray is a service that can help you analyze and debug distributed applications, including Lambda functions. By integrating X-Ray with your Bedrock application, you can trace requests as they travel through your Lambda functions and gain insights into their performance and potential issues. [3] - Enable X-Ray tracing for your Lambda functions by adding the necessary configuration. - Instrument your Lambda functions with X-Ray tracing code to capture relevant information and events. - Use the X-Ray console or integrate with other monitoring tools to analyze the traces and identify potential bottlenecks or issues.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Implement Advanced prompts : By using advanced prompts, you can enhance your agent's accuracy through modifying these prompt templates to provide detailed configurations. You can also provide hand-curated examples for few-shot prompting, in which you improve model performance by providing labeled examples for a specific task. [4] By combining the built-in tracing mechanism, custom logging within your Lambda functions, and distributed tracing with AWS X-Ray, you can gain better visibility into the API calls, events, and interactions happening within your Bedrock agent and its associated tools. This can help you debug issues more effectively and trace errors back to their source across multiple Lambda functions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Reference&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/trace-events.html" rel="noopener noreferrer"&gt;Trace events in Amazon Bedrock - Amazon Bedrock&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/lambda/latest/operatorguide/best-practices-debugging.html" rel="noopener noreferrer"&gt;Best practices for your debugging environment - AWS Lambda&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/xray/latest/devguide/aws-xray.html" rel="noopener noreferrer"&gt;What is AWS X-Ray? - AWS X-Ray &lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/advanced-prompts.html" rel="noopener noreferrer"&gt;Advanced prompts in Amazon Bedrock - Amazon Bedrock &lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>sagemaker</category>
      <category>aws</category>
      <category>bedrock</category>
    </item>
    <item>
      <title>AWS DEV OPS Professional Exam short notes</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Sun, 17 Mar 2024 05:55:58 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-dev-ops-professional-exam-short-notes-4b47</link>
      <guid>https://dev.to/aws-builders/aws-dev-ops-professional-exam-short-notes-4b47</guid>
      <description>&lt;p&gt;Last few weeks I have been preparing for this exam and have summarized below key notes for further quick reference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Notes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You can use CloudWatch Logs to monitor applications and systems using log data. For example, CloudWatch Logs can track the number of errors that occur in your application logs and send you a notification whenever the rate of errors exceeds a threshold you specify. CloudWatch Logs uses your log data for monitoring; so, no code changes are required. For more information on Cloudwatch logs , please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html&lt;/a&gt; The correct answer is: Install the CloudWatch Logs Agent on your AMI, and configure CloudWatch Logs Agent to stream your logs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can add another layer of protection by enabling MFA Delete on a versioned bucket. Once you do so, you must provide your AWS account’s access keys and a valid code from the account’s MFA device in order to permanently delete an object version or suspend or reactivate versioning on the bucket. For more information on MFA please refer to the below link: &lt;a href="https://aws.amazon.com/blogs/security/securing-access-to-aws-using-mfa-part-3/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/security/securing-access-to-aws-using-mfa-part-3/&lt;/a&gt; IAM roles are designed so that your applications can securely make API requests from your instances, without requiring you to manage the security credentials that the applications use. Instead of creating and distributing your AWS credentials, you can delegate permission to make API requests using IAM roles For more information on Roles for EC2 please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;As your infrastructure grows, common patterns can emerge in which you declare the same components in each of your templates. You can separate out these common components and create dedicated templates for them. That way, you can mix and match different templates but use nested stacks to create a single, unified stack. Nested stacks are stacks that create other stacks. To create nested stacks, use the AWS::CloudFormation::Stack resource in your template to reference other templates. For more information on best practices for CloudFormation please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/best-practices.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/best-practices.html&lt;/a&gt; The correct answer is: Separate the AWS CloudFormation template into a nested structure that has individual templates for the resources that are to be governed by different departments, and use the outputs from the networking and security stacks for the application template that you control.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use Amazon CloudWatch Logs to monitor, store, and access your log files from Amazon Elastic Compute Cloud (Amazon EC2) instances, AWS CloudTrail, and other sources. You can then retrieve the associated log data from CloudWatch Logs. For more information on Cloudwatch logs please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html&lt;/a&gt; You can the use Kinesis to process those logs For more information on Amazon Kinesis please refer to the below link: &lt;a href="http://docs.aws.amazon.com/streams/latest/dev/introduction.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/streams/latest/dev/introduction.html&lt;/a&gt; The correct answers are: Using AWS CloudFormation, create a CloudWatch Logs LogGroup and send the operating system and application logs of interest using the CloudWatch Logs Agent., Using configuration management, set up remote logging to send events to Amazon Kinesis and insert these into Amazon CloudSearch or Amazon Redshift, depending on available analytic tools.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;IAM roles are designed so that your applications can securely make API requests from your instances, without requiring you to manage the security credentials that the applications use. Instead of creating and distributing your AWS credentials For more information on IAM Roles please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users). The token can then be used to grant access to the objects in S3. You can then provides access to the objects based on the key values generated via the user id&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;As your infrastructure grows, common patterns can emerge in which you declare the same components in each of your templates. You can separate out these common components and create dedicated templates for them. That way, you can mix and match different templates but use nested stacks to create a single, unified stack. Nested stacks are stacks that create other stacks. To create nested stacks, use the AWS::CloudFormation::Stack resource in your template to reference other templates. For more information on CloudFormation best practices please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/best-practices.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/best-practices.html&lt;/a&gt; The correct answer is: Create separate templates based on functionality, create nested stacks with CloudFormation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The default Auto Scaling termination policy is designed to help ensure that your network architecture spans Availability Zones evenly. When using the default termination policy, Auto Scaling selects an instance to terminate as follows: Auto Scaling determines whether there are instances in multiple Availability Zones. If so, it selects the Availability Zone with the most instances and at least one instance that is not protected from scale in. If there is more than one Availability Zone with this number of instances, Auto Scaling selects the Availability Zone with the instances that use the oldest launch configuration. For more information on Auto Scaling instance termination please refer to the below link: &lt;a href="http://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html&lt;/a&gt; The correct answer is: Auto Scaling will select the AZ with 4 EC2 instances and terminate an instance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon RDS Read Replicas provide enhanced performance and durability for database (DB) instances. This replication feature makes it easy to elastically scale out beyond the capacity constraints of a single DB Instance for read-heavy database workloads. You can create one or more replicas of a given source DB Instance and serve high-volume application read traffic from multiple copies of your data, thereby increasing aggregate read throughput. Sharding is a common concept to split data across multiple tables in a database. Shard your data set among multiple Amazon RDS DB instances. Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory data store or cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory data stores, instead of relying entirely on slower disk-based databases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Elastic Beanstalk simplifies this process by managing the Amazon SQS queue and running a daemon process on each instance that reads from the queue for you. When the daemon pulls an item from the queue, it sends an HTTP POST request locally to &lt;a href="http://localhost/" rel="noopener noreferrer"&gt;http://localhost/&lt;/a&gt; with the contents of the queue message in the body. All that your application needs to do is perform the long-running task in response to the POST. For more information on Elastic Beanstalk worker environments, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
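&lt;li&gt;&lt;p&gt;The worker application described above can be sketched with the standard library alone. This is a minimal illustration, not Elastic Beanstalk's own code: the daemon's POST path and the job payload shape are assumptions, and the actual task logic is a stand-in.&lt;/p&gt;

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def process_task(message):
    """Illustrative stand-in for the long-running work; echoes the job id."""
    return "processed job %s" % message.get("job_id")

class WorkerHandler(BaseHTTPRequestHandler):
    """Handles the local POSTs the Elastic Beanstalk SQS daemon delivers."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        process_task(body)       # perform the long-running task
        self.send_response(200)  # a 200 response tells the daemon the message is done
        self.end_headers()

# On the worker instance this would listen where the daemon posts:
# HTTPServer(("127.0.0.1", 80), WorkerHandler).serve_forever()
```

&lt;p&gt;Anything other than a 200 response leaves the message on the queue for redelivery, which is why the handler only acknowledges after the task completes.&lt;/p&gt;&lt;/li&gt;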
&lt;li&gt;&lt;p&gt;If you suspend AddToLoadBalancer, Auto Scaling launches the instances but does not add them to the load balancer or target group. If you resume the AddToLoadBalancer process, Auto Scaling resumes adding instances to the load balancer or target group when they are launched. However, Auto Scaling does not add the instances that were launched while this process was suspended. You must register those instances manually. For more information on the Suspension and Resumption process, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/autoscaling/latest/userguide/as-suspend-resume-processes.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/autoscaling/latest/userguide/as-suspend-resume-processes.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use the container_commands key of Elastic Beanstalk to execute commands that affect your application source code. Container commands run after the application and web server have been set up and the application version archive has been extracted, but before the application version is deployed. Non-container commands and other customization operations are performed prior to the application source code being extracted. You can use leader_only to only run the command on a single instance, or configure a test to only run the command when a test command evaluates to true. Leader-only container commands are only executed during environment creation and deployments, while other commands and server customization operations are performed every time an instance is provisioned or updated. Leader-only container commands are not executed due to launch configuration changes, such as a change in the AMI ID or instance type. For more information on customizing containers, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html&lt;/a&gt; The correct answer is: Use a “Container command” within an Elastic Beanstalk configuration file to execute the script, ensuring that the “leader only” flag is set to true.&lt;/p&gt;&lt;/li&gt;
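&lt;li&gt;&lt;p&gt;The shape of such a configuration can be sketched as a Python dict; the real artifact is a YAML file under .ebextensions/ in the source bundle, and the command and file name here are hypothetical examples.&lt;/p&gt;

```python
import json

# Hypothetical .ebextensions/db-migrate.config, expressed as a dict.
# container_commands run after the source bundle is extracted but
# before the new application version is deployed.
config = {
    "container_commands": {
        "01_run_migrations": {
            "command": "python manage.py migrate",  # illustrative script
            "leader_only": True,  # run on a single (leader) instance only
        }
    }
}

print(json.dumps(config, indent=2))
```

&lt;p&gt;Setting leader_only to true is what prevents a one-time task, such as a schema migration, from running once per instance in the environment.&lt;/p&gt;&lt;/li&gt;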
&lt;li&gt;&lt;p&gt;A Dockerrun.aws.json file is an Elastic Beanstalk–specific JSON file that describes how to deploy a set of Docker containers as an Elastic Beanstalk application. You can use a Dockerrun.aws.json file for a multicontainer Docker environment. Dockerrun.aws.json describes the containers to deploy to each container instance in the environment as well as the data volumes to create on the host instance for the containers to mount. &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker_v2config.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker_v2config.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
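&lt;li&gt;&lt;p&gt;A minimal version 2 (multicontainer) Dockerrun.aws.json can be built up as follows; the image name, ports, and paths are illustrative, not a required layout.&lt;/p&gt;

```python
import json

# Sketch of a multicontainer Dockerrun.aws.json: one host volume and
# one container that mounts it. Image, memory, and paths are examples.
dockerrun = {
    "AWSEBDockerrunVersion": 2,  # version 2 = multicontainer environment
    "volumes": [
        {"name": "web-content", "host": {"sourcePath": "/var/app/current/web"}}
    ],
    "containerDefinitions": [
        {
            "name": "nginx",
            "image": "nginx:latest",
            "essential": True,
            "memory": 128,
            "portMappings": [{"hostPort": 80, "containerPort": 80}],
            "mountPoints": [
                {"sourceVolume": "web-content",
                 "containerPath": "/usr/share/nginx/html"}
            ],
        }
    ],
}

print(json.dumps(dockerrun, indent=2))
```
&lt;/li&gt;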
&lt;li&gt;&lt;p&gt;Elastic Beanstalk supports the deployment of web applications from Docker containers. With Docker containers, you can define your own runtime environment. You can choose your own platform, programming language, and any application dependencies (such as package managers or tools) that aren’t supported by other platforms. Docker containers are self-contained and include all the configuration information and software your web application requires to run.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When Amazon Kinesis appears as an option, it is the ideal choice for processing data in real time. Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application. With Amazon Kinesis, you can ingest real-time data such as application logs, website clickstreams, IoT telemetry data, and more into your databases, data lakes and data warehouses, or build your own real-time applications using this data. For more information on Amazon Kinesis, please visit the below URL: &lt;a href="https://aws.amazon.com/kinesis" rel="noopener noreferrer"&gt;https://aws.amazon.com/kinesis&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use CloudWatch Logs to monitor applications and systems using log data. CloudWatch Logs uses your log data for monitoring, so no code changes are required. For example, you can monitor application logs for specific literal terms (such as “NullReferenceException”) or count the number of occurrences of a literal term at a particular position in log data (such as “404” status codes in an Apache access log). When the term you are searching for is found, CloudWatch Logs reports the data to a CloudWatch metric that you specify. Log data is encrypted while in transit and while it is at rest. For more information on CloudWatch Logs, please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html&lt;/a&gt; Amazon CloudWatch uses Amazon SNS to send email. First, create and subscribe to an SNS topic. When you create a CloudWatch alarm, you can add this SNS topic to send an email notification when the alarm changes state. The correct answers are: Install a CloudWatch Logs Agent on your servers to stream web application logs to CloudWatch; create a CloudWatch Logs group and define metric filters that capture 500 Internal Server Errors, and set a CloudWatch alarm on that metric; and use Amazon Simple Notification Service to notify an on-call engineer when a CloudWatch alarm is triggered.&lt;/p&gt;&lt;/li&gt;
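&lt;li&gt;&lt;p&gt;What a metric filter for 500 errors counts can be reproduced locally with a few lines of Python. This is only an illustration of the matching behaviour (a filter pattern along the lines of [ip, id, user, timestamp, request, status=500, size]); the log lines are made up.&lt;/p&gt;

```python
import re

def count_server_errors(log_lines):
    """Count Apache common-log lines whose status code is 500, mimicking
    what a CloudWatch Logs metric filter on the status field would report."""
    # Status code appears after the quoted request, before the byte count.
    pattern = re.compile(r'"\s+(\d{3})\s+\d+\s*$')
    return sum(1 for line in log_lines
               if (m := pattern.search(line)) and m.group(1) == "500")

logs = [
    '127.0.0.1 - - [10/Oct/2025:13:55:36] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2025:13:55:37] "GET /api/orders HTTP/1.1" 500 512',
    '127.0.0.1 - - [10/Oct/2025:13:55:38] "POST /api/orders HTTP/1.1" 500 512',
]
print(count_server_errors(logs))  # 2
```

&lt;p&gt;In CloudWatch Logs this count becomes a metric datapoint, and the alarm on that metric is what triggers the SNS notification.&lt;/p&gt;&lt;/li&gt;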
&lt;li&gt;&lt;p&gt;When you provision an Amazon EC2 instance in an AWS CloudFormation stack, you might specify additional actions to configure the instance, such as installing software packages or bootstrapping applications. Normally, CloudFormation proceeds with stack creation after the instance has been successfully created. However, you can use a CreationPolicy so that CloudFormation proceeds with stack creation only after your configuration actions are done. That way you’ll know your applications are ready to go after stack creation succeeds.&lt;/p&gt;&lt;/li&gt;
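&lt;li&gt;&lt;p&gt;A resource with a CreationPolicy can be sketched as follows; the AMI ID is a placeholder and the bootstrap steps are illustrative. The instance signals success with cfn-signal from its UserData, and CloudFormation waits for that signal before marking the resource complete.&lt;/p&gt;

```python
import json

# EC2 resource fragment: CloudFormation waits for one success signal
# within 15 minutes (PT15M) before proceeding with stack creation.
resource = {
    "WebServer": {
        "Type": "AWS::EC2::Instance",
        "Properties": {
            "ImageId": "ami-xxxxxxxx",  # placeholder AMI
            "UserData": {"Fn::Base64": {"Fn::Sub":
                "#!/bin/bash\n"
                "# ... install and configure the application ...\n"
                "/opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} "
                "--resource WebServer --region ${AWS::Region}\n"}},
        },
        "CreationPolicy": {
            "ResourceSignal": {"Count": 1, "Timeout": "PT15M"}
        },
    }
}

print(json.dumps(resource["WebServer"]["CreationPolicy"], indent=2))
```

&lt;p&gt;If the signal never arrives within the timeout, the resource fails and the stack rolls back, so a broken bootstrap script cannot produce a "successful" stack.&lt;/p&gt;&lt;/li&gt;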
&lt;li&gt;&lt;p&gt;Auto Scaling periodically performs health checks on the instances in your Auto Scaling group and identifies any instances that are unhealthy. You can configure Auto Scaling to determine the health status of an instance using Amazon EC2 status checks, Elastic Load Balancing health checks, or custom health checks. By default, Auto Scaling health checks use the results of the EC2 status checks to determine the health status of an instance. Auto Scaling marks an instance as unhealthy if it fails one or more of the status checks. For more information on monitoring in Auto Scaling, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/autoscaling/latest/userguide/as-monitoring-features.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/autoscaling/latest/userguide/as-monitoring-features.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You need a custom health check that evaluates the application functionality; the standard health checks are not enough. If the application functionality does not work and you don’t have custom health checks, the instances will still be deemed healthy. With custom health checks, you can send the information from your checks to Auto Scaling so that Auto Scaling can use it. For example, if you determine that an instance is not functioning as expected, you can set the health status of the instance to Unhealthy. The next time that Auto Scaling performs a health check on the instance, it will determine that the instance is unhealthy and then launch a replacement instance. For more information on Auto Scaling health checks, please refer to the below AWS documentation link: &lt;a href="http://docs.aws.amazon.com/autoscaling/latest/userguide/healthcheck.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/autoscaling/latest/userguide/healthcheck.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
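&lt;li&gt;&lt;p&gt;A custom check of this kind might look as follows. The /health path and instance ID are hypothetical, and the boto3 call is left commented so the sketch runs without AWS credentials; the real API is the Auto Scaling set_instance_health action.&lt;/p&gt;

```python
from urllib.request import urlopen
from urllib.error import URLError

def app_is_healthy(url, timeout=2.0):
    """Application-level probe: only an HTTP 200 counts as healthy."""
    try:
        return urlopen(url, timeout=timeout).status == 200
    except URLError:
        return False

def health_report(instance_id, healthy):
    """Parameters in the shape set_instance_health() expects."""
    return {
        "InstanceId": instance_id,
        "HealthStatus": "Healthy" if healthy else "Unhealthy",
        "ShouldRespectGracePeriod": True,
    }

# import boto3  # on the instance, report the result back to Auto Scaling:
# boto3.client("autoscaling").set_instance_health(
#     **health_report("i-0123456789abcdef0",
#                     app_is_healthy("http://localhost/health")))
print(health_report("i-0123456789abcdef0", False)["HealthStatus"])  # Unhealthy
```

&lt;p&gt;Reporting Unhealthy is what causes Auto Scaling to terminate the instance and launch a replacement on its next health-check pass.&lt;/p&gt;&lt;/li&gt;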
&lt;li&gt;&lt;p&gt;A blue group carries the production load while a green group is staged and deployed with the new code. When it’s time to deploy, you simply attach the green group to the existing load balancer to introduce traffic to the new environment. For HTTP/HTTPS listeners, the load balancer favors the green Auto Scaling group because it uses a least outstanding requests routing algorithm. As you scale up the green Auto Scaling group, you can take blue Auto Scaling group instances out of service by either terminating them or putting them in Standby state. For more information on blue/green deployments, please refer to the below AWS whitepaper: &lt;a href="https://d0.awsstatic.com/whitepapers/AWS_Blue_Green_Deployments.pdf" rel="noopener noreferrer"&gt;https://d0.awsstatic.com/whitepapers/AWS_Blue_Green_Deployments.pdf&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;First, ensure that the CloudFormation template is updated with the new instance type. The AWS::AutoScaling::AutoScalingGroup resource supports an UpdatePolicy attribute. This is used to define how an Auto Scaling group resource is updated when an update to the CloudFormation stack occurs. A common approach to updating an Auto Scaling group is to perform a rolling update, which is done by specifying the AutoScalingRollingUpdate policy. This retains the same Auto Scaling group and replaces old instances with new ones, according to the parameters specified.&lt;/p&gt;&lt;/li&gt;
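&lt;li&gt;&lt;p&gt;The UpdatePolicy attribute sits alongside the resource's Properties, as in this fragment; the batch sizes and pause time are illustrative choices, not defaults.&lt;/p&gt;

```python
import json

# Auto Scaling group with a rolling-update policy: replace instances one
# at a time, keep at least two in service, pause 5 minutes between batches.
asg = {
    "AppGroup": {
        "Type": "AWS::AutoScaling::AutoScalingGroup",
        "Properties": {"MinSize": "2", "MaxSize": "4"},
        "UpdatePolicy": {
            "AutoScalingRollingUpdate": {
                "MinInstancesInService": 2,
                "MaxBatchSize": 1,
                "PauseTime": "PT5M",
            }
        },
    }
}
print(json.dumps(asg["AppGroup"]["UpdatePolicy"], indent=2))
```

&lt;p&gt;With this in place, updating the instance type in the template and running a stack update rolls the fleet instead of replacing the whole group at once.&lt;/p&gt;&lt;/li&gt;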
&lt;li&gt;&lt;p&gt;With web identity federation, you don’t need to create custom sign-in code or manage your own user identities. Instead, users of your app can sign in using a well-known identity provider (IdP) such as Login with Amazon, Facebook, Google, or any other OpenID Connect (OIDC)-compatible IdP, receive an authentication token, and then exchange that token for temporary security credentials in AWS that map to an IAM role with permissions to use the resources in your AWS account. Using an IdP helps you keep your AWS account secure, because you don’t have to embed and distribute long-term security credentials with your application.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The optional Conditions section includes statements that define when a resource is created or when a property is defined. For example, you can compare whether a value is equal to another value. Based on the result of that condition, you can conditionally create resources. If you have multiple conditions, separate them with commas. You might use conditions when you want to reuse a template that can create resources in different contexts, such as a test environment versus a production environment. In your template, you can add an EnvironmentType input parameter, which accepts either prod or test as inputs. For the production environment, you might include Amazon EC2 instances with certain capabilities; however, for the test environment, you want to use reduced capabilities to save money. With conditions, you can define which resources are created and how they’re configured for each environment type. For more information on Cloudformation conditions please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/conditions-section-structure.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/conditions-section-structure.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
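&lt;li&gt;&lt;p&gt;The EnvironmentType pattern from the text can be sketched as a template fragment; the instance types chosen for prod and test are illustrative.&lt;/p&gt;

```python
import json

# A condition driven by an input parameter, then used with Fn::If to pick
# the instance sizing for each environment type.
template = {
    "Parameters": {
        "EnvironmentType": {"Type": "String",
                            "AllowedValues": ["prod", "test"]}
    },
    "Conditions": {
        "IsProd": {"Fn::Equals": [{"Ref": "EnvironmentType"}, "prod"]}
    },
    "Resources": {
        "AppServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                # full capability in prod, reduced capability in test
                "InstanceType": {"Fn::If": ["IsProd", "m5.large", "t3.micro"]}
            },
        }
    },
}
print(json.dumps(template["Conditions"], indent=2))
```

&lt;p&gt;The same IsProd condition can also gate whole resources, so test stacks simply omit the expensive pieces.&lt;/p&gt;&lt;/li&gt;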
&lt;li&gt;&lt;p&gt;Elastic Beanstalk already has the facility to manage various versions, and you don’t need to use S3 separately for this. AWS Elastic Beanstalk is the perfect solution for developers to maintain application versions. With AWS Elastic Beanstalk, you can quickly deploy and manage applications in the AWS Cloud without worrying about the infrastructure that runs those applications. AWS Elastic Beanstalk reduces management complexity without restricting choice or control. You simply upload your application, and AWS Elastic Beanstalk automatically handles the details of capacity provisioning, load balancing, scaling, and application health monitoring.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The first step in using Elastic Beanstalk is to create an application, which represents your web application in AWS. In Elastic Beanstalk an application serves as a container for the environments that run your web app, and versions of your web app’s source code, saved configurations, logs and other artifacts that you create while using Elastic Beanstalk. For more information on Applications, please refer to the below link: &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/applications.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/applications.html&lt;/a&gt; Deploying a new version of your application to an environment is typically a fairly quick process. The new source bundle is deployed to an instance and extracted, and then the web container or application server picks up the new version and restarts if necessary. During deployment, your application might still become unavailable to users for a few seconds. You can prevent this by configuring your environment to use rolling deployments to deploy the new version to instances in batches. For more information on deployment, please refer to the below link: &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.deploy-existing-version.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.deploy-existing-version.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Weighted routing lets you associate multiple resources with a single domain name (example.com) or subdomain name (acme.example.com) and choose how much traffic is routed to each resource. This can be useful for a variety of purposes, including load balancing and testing new versions of software. For more information on the Routing policy please refer to the below link: &lt;a href="http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
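&lt;li&gt;&lt;p&gt;A 90/10 weighted split between a current and a new software version can be sketched as the record sets a Route 53 change batch would carry; the domain, target names, and weights here are examples. Route 53 sends each record traffic in proportion to its weight over the sum of weights.&lt;/p&gt;

```python
# Two weighted records for the same name, distinguished by SetIdentifier.
records = [
    {"Name": "app.example.com.", "Type": "CNAME", "SetIdentifier": "current",
     "Weight": 90, "TTL": 60,
     "ResourceRecords": [{"Value": "elb-current.example.com"}]},
    {"Name": "app.example.com.", "Type": "CNAME", "SetIdentifier": "new",
     "Weight": 10, "TTL": 60,
     "ResourceRecords": [{"Value": "elb-new.example.com"}]},
]

def traffic_share(records, set_id):
    """Fraction of queries a record receives: its weight / sum of weights."""
    total = sum(r["Weight"] for r in records)
    return next(r["Weight"] for r in records
                if r["SetIdentifier"] == set_id) / total

print(traffic_share(records, "new"))  # 0.1
```

&lt;p&gt;Shifting the weights gradually (90/10, then 50/50, then 0/100) is the usual way to canary a new version behind a single domain name.&lt;/p&gt;&lt;/li&gt;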
&lt;li&gt;&lt;p&gt;Amazon Elasticsearch Service makes it easy to deploy, operate, and scale Elasticsearch for log analytics, full text search, application monitoring, and more. Amazon Elasticsearch Service is a fully managed service that delivers Elasticsearch’s easy-to-use APIs and real-time capabilities along with the availability, scalability, and security required by production workloads. The service offers built-in integrations with Kibana, Logstash, and AWS services including Amazon Kinesis Firehose, AWS Lambda, and Amazon CloudWatch so that you can go from raw data to actionable insights quickly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use CloudWatch Logs to monitor applications and systems using log data. For example, CloudWatch Logs can track the number of errors that occur in your application logs and send you a notification whenever the rate of errors exceeds a threshold you specify. CloudWatch Logs uses your log data for monitoring; so, no code changes are required. For example, you can monitor application logs for specific literal terms (such as “NullReferenceException”) or count the number of occurrences of a literal term at a particular position in log data (such as “404” status codes in an Apache access log). When the term you are searching for is found, CloudWatch Logs reports the data to a CloudWatch metric that you specify. For more information on Cloudwatch Logs please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html&lt;/a&gt; Amazon CloudWatch uses Amazon SNS to send email. First, create and subscribe to an SNS topic. When you create a CloudWatch alarm, you can add this SNS topic to send an email notification when the alarm changes state. For more information on Cloudwatch and SNS please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/US_SetupSNS.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/US_SetupSNS.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AWS OpsWorks is a configuration management service that uses Chef, an automation platform that treats server configurations as code. OpsWorks uses Chef to automate how servers are configured, deployed, and managed across your Amazon Elastic Compute Cloud (Amazon EC2) instances or on-premises compute environments. OpsWorks has two offerings, AWS OpsWorks for Chef Automate and AWS OpsWorks Stacks. For more information on OpsWorks, please refer to the below link: &lt;a href="https://aws.amazon.com/opsworks/" rel="noopener noreferrer"&gt;https://aws.amazon.com/opsworks/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use Kinesis Streams for rapid and continuous data intake and aggregation. The type of data used includes IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data. Because the response time for the data intake and processing is in real time, the processing is typically lightweight. The following are typical scenarios for using Kinesis Streams: Accelerated log and data feed intake and processing – You can have producers push data directly into a stream. For example, push system and application logs and they’ll be available for processing in seconds. This prevents the log data from being lost if the front end or application server fails. Kinesis Streams provides accelerated data feed intake because you don’t batch the data on the servers before you submit it for intake. Real-time metrics and reporting – You can use data collected into Kinesis Streams for simple data analysis and reporting in real time. For example, your data-processing application can work on metrics and reporting for system and application logs as the data is streaming in, rather than wait to receive batches of data. For more information on Amazon Kinesis Streams, please refer to the below link: &lt;a href="http://docs.aws.amazon.com/streams/latest/dev/introduction.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/streams/latest/dev/introduction.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;With Elastic Beanstalk, you can quickly deploy and manage applications in the AWS Cloud without worrying about the infrastructure that runs those applications. AWS Elastic Beanstalk reduces management complexity without restricting choice or control. You simply upload your application, and Elastic Beanstalk automatically handles the details of capacity provisioning, load balancing, scaling, and application health monitoring. For more information on Elastic Beanstalk, please refer to the below link: &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/Welcome.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/Welcome.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use intrinsic functions, such as Fn::If, Fn::Equals, and Fn::Not, to conditionally create stack resources. These conditions are evaluated based on input parameters that you declare when you create or update a stack. After you define all your conditions, you can associate them with resources or resource properties in the Resources and Outputs sections of a template.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon RDS Multi-AZ deployments provide enhanced availability and durability for Database (DB) Instances, making them a natural fit for production database workloads. When you provision a Multi-AZ DB Instance, Amazon RDS automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ). Each AZ runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable. In case of an infrastructure failure, Amazon RDS performs an automatic failover to the standby (or to a read replica in the case of Amazon Aurora), so that you can resume database operations as soon as the failover is complete.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use AWS CloudTrail to get a history of AWS API calls and related events for your account. This history includes calls made with the AWS Management Console, AWS Command Line Interface, AWS SDKs, and other AWS services. For more information on Cloudtrail, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html&lt;/a&gt; Amazon CloudWatch Events delivers a near real-time stream of system events that describe changes in Amazon Web Services (AWS) resources. Using simple rules that you can quickly set up, you can match events and route them to one or more target functions or streams. CloudWatch Events becomes aware of operational changes as they occur. CloudWatch Events responds to these operational changes and takes corrective action as necessary, by sending messages to respond to the environment, activating functions, making changes, and capturing state information&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;By default, all AWS accounts are limited to 5 Elastic IP addresses per region, because public (IPv4) Internet addresses are a scarce public resource. We strongly encourage you to use an Elastic IP address primarily for the ability to remap the address to another instance in the case of instance failure, and to use DNS hostnames for all other inter-node communication.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can manage Amazon SQS messages with Amazon S3. This is especially useful for storing and consuming messages with a message size of up to 2 GB. To manage Amazon SQS messages with Amazon S3, use the Amazon SQS Extended Client Library for Java. Specifically, you use this library to: Specify whether messages are always stored in Amazon S3 or only when a message’s size exceeds 256 KB. Send a message that references a single message object stored in an Amazon S3 bucket. Get the corresponding message object from an Amazon S3 bucket. Delete the corresponding message object from an Amazon S3 bucket. For more information on processing large messages for SQS, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-s3-messages.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-s3-messages.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
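&lt;li&gt;&lt;p&gt;The Extended Client Library itself is Java, but the pattern it implements is simple to sketch in any language: payloads over a threshold go to an object store, and the queue message carries only a pointer. In this illustration a plain dict stands in for the S3 bucket, and the 256 KB threshold matches the SQS message size limit mentioned above.&lt;/p&gt;

```python
import json
import uuid

THRESHOLD = 256 * 1024  # SQS maximum message size
object_store = {}       # stand-in for an S3 bucket

def send(payload):
    """Return the queue message body: inline, or an S3-style pointer."""
    if len(payload) <= THRESHOLD:
        return payload.decode()
    key = str(uuid.uuid4())
    object_store[key] = payload                 # "upload" the large payload
    return json.dumps({"s3_pointer": key})      # only the pointer is queued

def receive(body):
    """Resolve a pointer back to the stored payload, or return it inline."""
    try:
        pointer = json.loads(body)
        if isinstance(pointer, dict) and "s3_pointer" in pointer:
            return object_store[pointer["s3_pointer"]]
    except json.JSONDecodeError:
        pass
    return body.encode()

big = b"x" * (300 * 1024)
assert receive(send(big)) == big          # large payload round-trips via the store
assert receive(send(b"small")) == b"small"  # small payload stays inline
```

&lt;p&gt;The real library also deletes the S3 object when the message is deleted, which this sketch omits.&lt;/p&gt;&lt;/li&gt;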
&lt;li&gt;&lt;p&gt;AWS CloudFormation provisions and configures resources by making calls to the AWS services that are described in your template. After all the resources have been created, AWS CloudFormation reports that your stack has been created. You can then start using the resources in your stack. If stack creation fails, AWS CloudFormation rolls back your changes by deleting the resources that it created. For more information on how CloudFormation works, please refer to the below link: &lt;a href="http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-whatis-howdoesitwork.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-whatis-howdoesitwork.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Because Elastic Beanstalk performs an in-place update when you update your application versions, your application may become unavailable to users for a short period of time. It is possible to avoid this downtime by performing a blue/green deployment, where you deploy the new version to a separate environment, and then swap CNAMEs of the two environments to redirect traffic to the new version instantly. Blue/green deployments require that your environment runs independently of your production database, if your application uses one. If your environment has an Amazon RDS DB instance attached to it, the data will not transfer over to your second environment, and will be lost if you terminate the original environment. For more information on Blue Green deployments with Elastic beanstalk , please refer to the below link: &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.CNAMESwap.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.CNAMESwap.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon RDS Read Replicas provide enhanced performance and durability for database (DB) instances. This replication feature makes it easy to elastically scale out beyond the capacity constraints of a single DB Instance for read-heavy database workloads. You can create one or more replicas of a given source DB Instance and serve high-volume application read traffic from multiple copies of your data, thereby increasing aggregate read throughput. Read replicas can also be promoted when needed to become standalone DB instances.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon Route 53 health checks monitor the health and performance of your web applications, web servers, and other resources.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you use SSL termination at the load balancer, your servers always receive non-secure connections and never know whether users used a more secure channel. If you are using Elastic Beanstalk to configure the ELB, you can use the below article to ensure end-to-end encryption: &lt;a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/configuring-https-endtoend.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/configuring-https-endtoend.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture, transform, and load streaming data into Amazon Kinesis Analytics, Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service, enabling near real-time analytics with existing business intelligence tools and dashboards you’re already using today. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security. For more information on Kinesis firehose, please visit the below URL: &lt;a href="https://aws.amazon.com/kinesis/firehose/" rel="noopener noreferrer"&gt;https://aws.amazon.com/kinesis/firehose/&lt;/a&gt; &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use a CloudFront distribution to offload heavy reads from your application. You can create a zone apex record that points to the CloudFront distribution. You can control how long your objects stay in a CloudFront cache before CloudFront forwards another request to your origin. Reducing the duration allows you to serve dynamic content. Increasing the duration means your users get better performance because your objects are more likely to be served directly from the edge cache. A longer duration also reduces the load on your origin.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon EBS encryption offers you a simple encryption solution for your EBS volumes without the need for you to build, maintain, and secure your own key management infrastructure. When you create an encrypted EBS volume and attach it to a supported instance type, the following types of data are encrypted: data at rest inside the volume, all data moving between the volume and the instance, and all snapshots created from the volume. Snapshots that are taken from encrypted volumes are automatically encrypted. Volumes that are created from encrypted snapshots are also automatically encrypted.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A tag is a label that you or AWS assigns to an AWS resource. Each tag consists of a key and a value. A key can have more than one value. You can use tags to organize your resources, and cost allocation tags to track your AWS costs on a detailed level. After you activate cost allocation tags, AWS uses the cost allocation tags to organize your resource costs on your cost allocation report, to make it easier for you to categorize and track your AWS costs. AWS provides two types of cost allocation tags, an AWS-generated tag and user-defined tags. AWS defines, creates, and applies the AWS-generated tag for you, and you define, create, and apply user-defined tags. You must activate both types of tags separately before they can appear in Cost Explorer or on a cost allocation report.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can monitor the progress of a stack update by viewing the stack’s events. The console’s Events tab displays each major step in the creation and update of the stack, sorted by the time of each event with the latest events on top. The start of the stack update process is marked with an UPDATE_IN_PROGRESS event for the stack. For more information on monitoring your stack, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-monitor-stack.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-monitor-stack.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A placement group is a logical grouping of instances within a single Availability Zone. Placement groups are recommended for applications that benefit from low network latency, high network throughput, or both. To provide the lowest latency, and the highest packet-per-second network performance for your placement group, choose an instance type that supports enhanced networking. For more information on Placement Groups, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AWS CloudTrail is an AWS service that helps you enable governance, compliance, and operational and risk auditing of your AWS account. Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail. Events include actions taken in the AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs. Visibility into your AWS account activity is a key aspect of security and operational best practices. You can use CloudTrail to view, search, download, archive, analyze, and respond to account activity across your AWS infrastructure. You can identify who or what took which action, what resources were acted upon, when the event occurred, and other details to help you analyze and respond to activity in your AWS account. For more information on Cloudtrail, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Custom resources enable you to write custom provisioning logic in templates that AWS CloudFormation runs anytime you create, update (if you changed the custom resource), or delete stacks. For example, you might want to include resources that aren’t available as AWS CloudFormation resource types. You can include those resources by using custom resources. That way you can still manage all your related resources in a single stack. Use the AWS::CloudFormation::CustomResource or Custom::String resource type to define custom resources in your templates. Custom resources require one property: the service token, which specifies where AWS CloudFormation sends requests to, such as an Amazon SNS topic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Failover routing lets you route traffic to a resource when the resource is healthy or to a different resource when the first resource is unhealthy. The primary and secondary resource record sets can route traffic to anything from an Amazon S3 bucket that is configured as a website to a complex tree of records. For more information on Route53 Failover Routing, please visit the below URL: &lt;a href="http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html" rel="noopener noreferrer"&gt;http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deployment Types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single Target Deployment - small dev projects, legacy or non-HA infrastructure; an outage occurs on failure, and testing opportunities are limited.&lt;/li&gt;
&lt;li&gt;All-at-Once Deployment - deployment happens on multiple targets at once; requires orchestration tooling; suitable for non-critical apps in the 5-10 target range.&lt;/li&gt;
&lt;li&gt;Minimum In-Service Deployment - keeps a minimum number of targets in service and deploys in multiple stages; suitable for large environments; allows automated testing; no downtime.&lt;/li&gt;
&lt;li&gt;Rolling Deployments - x targets per stage, deployed in multiple stages; after stage 1 completes, the next stage begins; orchestration and health checks are required; can be the least efficient approach if x is small; allows automated testing; no downtime if x is not large enough to impact the application; can be paused, allowing multi-version testing.&lt;/li&gt;
&lt;li&gt;Blue/Green Deployment - deploy to a separate Green environment and update the code there; extra cost due to the duplicate environment during deployment; deployment is rapid; cutover and migration are clean (DNS change) and rollback is easy (DNS regression); can be fully automated using CFN etc.; binary cutover with no traffic split, so not used for feature testing.&lt;/li&gt;
&lt;li&gt;A/B Testing - distributes traffic between blue/green; allows gradual performance/stability/health analysis; allows new-feature testing; rollback is quick; the end goal of A/B testing is not migration; uses Route 53 for DNS resolution with two records, one pointing to A and the other to B, weighted/round robin.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
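&lt;li&gt;
&lt;p&gt;The A/B pattern above boils down to two weighted DNS records. A minimal sketch of the change batch you would pass to Route 53's ChangeResourceRecordSets API, assuming illustrative record names and ALB targets:&lt;/p&gt;

```python
# Weighted record sets for A/B testing: 90% of DNS responses point at
# the "blue" environment, 10% at "green". With credentials configured
# you would pass this batch to
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="...", ChangeBatch=change_batch)
def weighted_record(name, set_id, target, weight):
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "CNAME",
            "SetIdentifier": set_id,  # distinguishes records sharing a name
            "Weight": weight,         # relative share of DNS responses
            "TTL": 60,                # short TTL so weight shifts apply quickly
            "ResourceRecords": [{"Value": target}],
        },
    }

change_batch = {
    "Changes": [
        weighted_record("app.example.com", "blue", "blue-alb.example.com", 90),
        weighted_record("app.example.com", "green", "green-alb.example.com", 10),
    ]
}
```

&lt;p&gt;Shifting traffic is then just another UPSERT with new weights; setting a weight to 0 removes that environment from rotation.&lt;/p&gt;
&lt;/li&gt;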

&lt;li&gt;

&lt;p&gt;Intrinsic &amp;amp; Conditional Functions&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intrinsic Fn - built-in functions provided by AWS to help manage, reference, and conditionally act upon resources, situations, and inputs to a stack. &lt;/li&gt;
&lt;li&gt;Fn::Base64 - Base64 encoding for User Data&lt;/li&gt;
&lt;li&gt;Fn::FindInMap - Mapping lookup &lt;/li&gt;
&lt;li&gt;Fn::GetAtt - Advanced reference look up &lt;/li&gt;
&lt;li&gt;Fn::GetAZs - retrieve list of AZs in a region &lt;/li&gt;
&lt;li&gt;Fn::Join - construct complex strings; concatenate strings &lt;/li&gt;
&lt;li&gt;Fn::Select - selects a value from a list by zero-based index &lt;/li&gt;
&lt;li&gt;Ref - returns the default value of a resource (typically its physical ID) or the value of a parameter &lt;/li&gt;
&lt;li&gt;Conditional Functions - Fn::And, Fn::Equals, Fn::If, Fn::Not, Fn::Or&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
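&lt;li&gt;
&lt;p&gt;To make the functions above concrete, here is a hypothetical template fragment expressed as a Python dictionary (the parameter and resource names are illustrative):&lt;/p&gt;

```python
import json

# Several intrinsic functions working together: Ref returns a parameter
# or resource's default value, Fn::GetAZs lists the region's AZs,
# Fn::Select picks one by index, Fn::GetAtt reads a resource attribute,
# and Fn::Join concatenates strings.
template = {
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "AmiId"},  # Ref -> parameter value
                "AvailabilityZone": {
                    "Fn::Select": [0, {"Fn::GetAZs": {"Ref": "AWS::Region"}}]
                },
            },
        }
    },
    "Outputs": {
        "SiteUrl": {
            "Value": {
                "Fn::Join": [
                    "",
                    ["http://", {"Fn::GetAtt": ["WebServer", "PublicDnsName"]}],
                ]
            }
        }
    },
}

print(json.dumps(template, indent=2))  # render as a JSON template body
```
&lt;/li&gt;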

&lt;li&gt;

&lt;p&gt;CFN Resource Deletion Policies&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A policy/setting which is associated with each resource in a template; A way to control what happens to each resource when a stack is deleted.&lt;/li&gt;
&lt;li&gt;Policy value - Delete (Default), Retain, Snapshot&lt;/li&gt;
&lt;li&gt;Delete - useful for testing environments, CI/CD/QA workflows, presales demos, and short-lifecycle/immutable environments.&lt;/li&gt;
&lt;li&gt;Retain - the resource lives beyond the lifecycle of the stack; useful for Windows Server platforms (AD), servers with state (SQL, Exchange, file servers), and non-immutable architectures.&lt;/li&gt;
&lt;li&gt;Snapshot - a restricted policy type only available for resources that support snapshots (such as EBS volumes); takes a snapshot before deleting so data can be recovered.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
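&lt;li&gt;
&lt;p&gt;A minimal sketch of how these policies appear on resources, written here as a Python dictionary (the resource names, AMI ID, and properties are illustrative):&lt;/p&gt;

```python
# DeletionPolicy is set per resource in the template. "Snapshot" is
# only valid for resources that support snapshots (EBS volumes being
# the classic example); "Retain" leaves the resource in place when the
# stack is deleted; the default is "Delete".
resources = {
    "DataVolume": {
        "Type": "AWS::EC2::Volume",
        "DeletionPolicy": "Snapshot",  # snapshot taken before deletion
        "Properties": {"Size": 100, "AvailabilityZone": "us-east-1a"},
    },
    "AdServer": {
        "Type": "AWS::EC2::Instance",
        "DeletionPolicy": "Retain",  # lives beyond the stack's lifecycle
        "Properties": {"ImageId": "ami-12345678"},
    },
}
```
&lt;/li&gt;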

&lt;li&gt;&lt;p&gt;Immutable Architecture - replace infrastructure instead of upgrading or repairing faulty components; treat servers as unchangeable objects; don't diagnose and fix, throw away and re-create; nothing is bootstrapped except the AMI.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;CFN Stack updates&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The stack policy is checked first, and updates can be prevented; the absence of a stack policy allows all updates. A stack policy cannot be deleted once applied. Once a stack policy is applied, ALL resources are protected and updates are denied by default; to remove the default DENY, an explicit allow is required. A policy can be applied to a single resource (by logical ID), a wildcard, or NotResource; statements have Principal and Action elements, and a Condition element (on resource type) can also be used.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
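&lt;li&gt;
&lt;p&gt;A sketch of a stack policy document as a Python dictionary, assuming a hypothetical ProductionDatabase logical ID: the explicit allow restores updates everywhere, while the deny protects one resource:&lt;/p&gt;

```python
# Once a stack policy is attached, everything is denied by default, so
# the first statement explicitly allows all update actions; the second
# statement then carves out one protected resource by logical ID.
stack_policy = {
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "Update:*",
            "Principal": "*",
            "Resource": "*",
        },
        {
            "Effect": "Deny",
            "Action": "Update:Replace",  # block replacement-style updates
            "Principal": "*",
            "Resource": "LogicalResourceId/ProductionDatabase",
        },
    ]
}
```
&lt;/li&gt;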

&lt;li&gt;&lt;p&gt;Stack updates: 4 types - Update with No Interruption, Some Interruption, Replacement, Delete&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Sagemaker Model deployment and Integration</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Wed, 18 Jan 2023 07:29:11 +0000</pubDate>
      <link>https://dev.to/aws-builders/sagemaker-model-deployment-and-integration-2l6c</link>
      <guid>https://dev.to/aws-builders/sagemaker-model-deployment-and-integration-2l6c</guid>
      <description>&lt;h1&gt;
  
  
  Sagemaker Model deployment and Integration
&lt;/h1&gt;


&lt;h2&gt;
  
  
  AWS Feature store
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/sagemaker/feature-store/" rel="noopener noreferrer"&gt;SageMaker Feature Store&lt;/a&gt; is a purpose-built solution for ML feature management. It helps data science teams reuse ML features across teams and models, serve features for model predictions at scale with low latency, and train and deploy new models more quickly and effectively.&lt;/p&gt;

&lt;p&gt;Refer the notebook &lt;a href="https://github.com/aws-samples/ml-lineage-helper/blob/main/examples/example.ipynb" rel="noopener noreferrer"&gt;https://github.com/aws-samples/ml-lineage-helper/blob/main/examples/example.ipynb&lt;/a&gt; for more details.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3gpthlqmcqtvgoq5vloj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3gpthlqmcqtvgoq5vloj.png" alt="im" width="696" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why is feature lineage important?
&lt;/h3&gt;

&lt;p&gt;Imagine trying to manually track all of this for a large team, multiple teams, or even multiple business units. Lineage tracking and querying helps make this more manageable and helps organizations move to ML at scale. The following are four examples of how feature lineage helps scale the ML process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Build confidence for reuse of existing features&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Avoid reinventing features that are based on the same raw data as existing features&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Troubleshoot and audit models and model predictions&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Manage features proactively&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  AWS ML Lens and built-in models
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foe752fq1d6ubqp0vjx03.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foe752fq1d6ubqp0vjx03.PNG" alt="im" width="800" height="397"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22dz1xo92o6rvnjlo7zl.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22dz1xo92o6rvnjlo7zl.PNG" alt="im" width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Deployment Options
&lt;/h1&gt;

&lt;p&gt;ML inference can be done in real time on individual records, such as with a REST API endpoint. Inference can also be done in batch mode as a processing job on a large dataset. While both approaches push data through a model, each has its own target goal when running inference at scale.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;em&gt;Real Time&lt;/em&gt;&lt;/th&gt;
&lt;th&gt;&lt;em&gt;Micro Batch&lt;/em&gt;&lt;/th&gt;
&lt;th&gt;&lt;em&gt;Batch&lt;/em&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Execution Mode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Synchronous&lt;/td&gt;
&lt;td&gt;Synchronous/Asynchronous&lt;/td&gt;
&lt;td&gt;Asynchronous&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prediction Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Subsecond&lt;/td&gt;
&lt;td&gt;Seconds to minutes&lt;/td&gt;
&lt;td&gt;Indefinite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Bounds&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unbounded/stream&lt;/td&gt;
&lt;td&gt;Bounded&lt;/td&gt;
&lt;td&gt;Bounded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Execution Frequency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;Variable/fixed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Invocation Mode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Continuous stream/API calls&lt;/td&gt;
&lt;td&gt;Event-based&lt;/td&gt;
&lt;td&gt;Event-based/scheduled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Examples&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real-time REST API endpoint&lt;/td&gt;
&lt;td&gt;Data analyst running a SQL UDF&lt;/td&gt;
&lt;td&gt;Scheduled inference job&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h1&gt;
  
  
  Real-time deployment
&lt;/h1&gt;

&lt;p&gt;SageMaker real-time deployment follows the approach below. The key point here is that the inference endpoint can be coupled with auto scaling. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1enzoeiv9tfr3tzqxjm.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1enzoeiv9tfr3tzqxjm.PNG" alt="im" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are the different ways we can deploy a real-time endpoint with SageMaker. You can see multiple options, from bringing your own model or your own container to using a prebuilt container.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7qu02wcm6yyjwwhzwmu.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7qu02wcm6yyjwwhzwmu.PNG" alt="im" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With sagemaker, prebuilt container and its own inference script, we can use this as shared below. &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhvqtgdqh809cnifspdi4.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhvqtgdqh809cnifspdi4.PNG" alt="im" width="800" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Quite a lot of time, we add our own inference script and this is quite simple as shown below. &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcbmd1e0636z7r3cs70w.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcbmd1e0636z7r3cs70w.PNG" alt="im" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is not rare to have our own container and own trained model along with inference script. The architecture does not change for that and we still follow same architecture as shared below. &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqqeo555pymn5de7r0jpv.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqqeo555pymn5de7r0jpv.PNG" alt="im" width="800" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Autoscale
&lt;/h3&gt;

&lt;p&gt;we can set autoscale policy for sagemaker endpoint to scale up and scale down automatically.  &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9ppfuwozwag6o0hc2e2.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9ppfuwozwag6o0hc2e2.PNG" alt="im" width="800" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We have to set up the auto scaling policy for the endpoint. You can see here that ServiceNamespace is set to sagemaker and ResourceId is set to the endpoint name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cito27qme7jkwk097ug.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cito27qme7jkwk097ug.PNG" alt="im" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;
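&lt;p&gt;The two calls can be sketched as parameter dictionaries (the endpoint and variant names are illustrative; with credentials configured you would pass them to the Application Auto Scaling client's register_scalable_target and put_scaling_policy calls):&lt;/p&gt;

```python
# Target-tracking auto scaling for a SageMaker endpoint variant. The
# ResourceId combines the endpoint and variant names; the scalable
# dimension is the variant's desired instance count.
resource_id = "endpoint/my-endpoint/variant/AllTraffic"

scalable_target = {
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "MinCapacity": 1,
    "MaxCapacity": 4,
}

scaling_policy = {
    "PolicyName": "invocations-target-tracking",
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        # keep each instance at roughly 100 invocations per minute
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
}
```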

&lt;h2&gt;
  
  
  Multi-model endpoints
&lt;/h2&gt;

&lt;p&gt;SageMaker multi-model endpoints work with several frameworks, such as TensorFlow, PyTorch, MXNet, and sklearn, and you can &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html" rel="noopener noreferrer"&gt;build your own container with a multi-model server.&lt;/a&gt; Multi-model endpoints are also supported natively in the following popular SageMaker built-in algorithms: &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html" rel="noopener noreferrer"&gt;XGBoost&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/linear-learner.html" rel="noopener noreferrer"&gt;Linear Learner&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/randomcutforest.html" rel="noopener noreferrer"&gt;Random Cut Forest&lt;/a&gt; (RCF), and &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/k-nearest-neighbors.html" rel="noopener noreferrer"&gt;K-Nearest Neighbors&lt;/a&gt; (KNN). &lt;/p&gt;

&lt;p&gt;Refer to the notebook &lt;a href="https://github.com/aws-samples/sagemaker-multi-model-endpoint-tensorflow-computer-vision/blob/main/multi-model-endpoint-tensorflow-cv.ipynb" rel="noopener noreferrer"&gt;https://github.com/aws-samples/sagemaker-multi-model-endpoint-tensorflow-computer-vision/blob/main/multi-model-endpoint-tensorflow-cv.ipynb&lt;/a&gt; to understand how to deploy this. Also refer to the blog &lt;a href="https://aws.amazon.com/blogs/machine-learning/save-on-inference-costs-by-using-amazon-sagemaker-multi-model-endpoints/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/machine-learning/save-on-inference-costs-by-using-amazon-sagemaker-multi-model-endpoints/&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;All of the models that are hosted on a multi-model endpoint must share the same serving container image. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Multi-model endpoints are an option that can improve endpoint utilization when your models are of similar size, share the same container image, and have similar invocation latency requirements.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All of the models need to share the same S3 prefix (in the same bucket) to host their artifacts.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0zo39qdu5vgu15dvzi7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0zo39qdu5vgu15dvzi7.jpg" alt="im" width="800" height="552"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sf346dtlj3r28w14wh5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sf346dtlj3r28w14wh5.gif" alt="im" width="800" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost advantages
&lt;/h3&gt;

&lt;p&gt;This diagram compares running 10 models on a multi-model endpoint with using 10 separate endpoints; in this example it results in savings of about $3,000 per month, as shown in the following figure. Multi-model endpoints can easily scale to hundreds or thousands of models. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0ho24l9vl6vcvkyqv6e.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv0ho24l9vl6vcvkyqv6e.gif" alt="img" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How to use?
&lt;/h3&gt;

&lt;p&gt;To create a multi-model endpoint in Amazon SageMaker, choose the multi-model option, provide the inference serving container image path, and provide the &lt;a href="http://aws.amazon.com/s3" rel="noopener noreferrer"&gt;Amazon S3&lt;/a&gt; prefix in which the trained model artifacts are stored. You can organize your models in S3 any way you wish, so long as they all use the same prefix. &lt;/p&gt;

&lt;p&gt;When you invoke the multi-model endpoint, you provide the relative path of a specific model with the new TargetModel parameter of InvokeEndpoint. To add models to the multi-model endpoint, simply store a newly trained model artifact in S3 under the prefix associated with the endpoint. The model will then be immediately available for invocations. &lt;/p&gt;

&lt;p&gt;To update a model already in use, add the model to S3 with a new name and begin invoking the endpoint with the new model name. To stop using a model deployed on a multi-model endpoint, stop invoking the model and delete it from S3.&lt;/p&gt;
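&lt;p&gt;Selecting a model at invocation time can be sketched as follows (the endpoint and artifact names are illustrative; with credentials configured you would pass the request to the SageMaker runtime client's invoke_endpoint call):&lt;/p&gt;

```python
import json

# TargetModel is the artifact's relative path under the S3 prefix the
# multi-model endpoint was created with; switching models is just a
# different TargetModel value on an otherwise identical request.
request = {
    "EndpointName": "my-multi-model-endpoint",
    "ContentType": "application/json",
    "TargetModel": "model-v2.tar.gz",  # relative path under the prefix
    "Body": json.dumps({"instances": [[0.5, 1.2, 3.4]]}),
}
```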

&lt;p&gt;Instead of downloading all the models into the container from S3 when the endpoint is created, Amazon SageMaker multi-model endpoints dynamically load models from S3 when invoked. As a result, an initial invocation to a model might see higher inference latency than the subsequent inferences, which are completed with low latency. If the model is already loaded on the container when invoked, then the download step is skipped and the model returns the inferences with low latency. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sf346dtlj3r28w14wh5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sf346dtlj3r28w14wh5.gif" alt="im" width="800" height="257"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitoring multi-model endpoints using Amazon CloudWatch metrics
&lt;/h3&gt;

&lt;p&gt;To make price and performance tradeoffs, you will want to test multi-model endpoints with models and representative traffic from your own application. Amazon SageMaker provides additional metrics in CloudWatch for multi-model endpoints so you can determine the endpoint usage and the cache hit rate and optimize your endpoint. The metrics are as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ModelLoadingWaitTime&lt;/strong&gt; – The interval of time that an invocation request waits for the target model to be downloaded or loaded to perform the inference.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ModelUnloadingTime&lt;/strong&gt; – The interval of time that it takes to unload the model through the container’s &lt;code&gt;UnloadModel&lt;/code&gt; API call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ModelDownloadingTime&lt;/strong&gt; – The interval of time that it takes to download the model from S3.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ModelLoadingTime&lt;/strong&gt; – The interval of time that it takes to load the model through the container’s &lt;code&gt;LoadModel&lt;/code&gt; API call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ModelCacheHit&lt;/strong&gt; – The number of &lt;code&gt;InvokeEndpoint&lt;/code&gt; requests sent to the endpoint where the model was already loaded. Taking the Average statistic shows the ratio of requests in which the model was already loaded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LoadedModelCount&lt;/strong&gt; – The number of models loaded in the containers in the endpoint. This metric is emitted per instance. The &lt;code&gt;Average&lt;/code&gt; statistic with a period of 1 minute tells you the average number of models loaded per instance, and the &lt;code&gt;Sum&lt;/code&gt; statistic tells you the total number of models loaded across all instances in the endpoint. The models that this metric tracks are not necessarily unique because you can load a model in multiple containers in the endpoint.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;You can use CloudWatch charts to help make ongoing decisions on the optimal choice of instance type, instance count, and number of models that a given endpoint should host.&lt;/strong&gt; &lt;/p&gt;
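&lt;p&gt;As an example, the cache hit ratio described above can be read back with a query like this (the endpoint and variant names are illustrative; with credentials configured you would pass it to the CloudWatch client's get_metric_statistics call):&lt;/p&gt;

```python
from datetime import datetime, timedelta

# ModelCacheHit is emitted in the AWS/SageMaker namespace; the Average
# statistic over the period is the fraction of invocations that found
# the target model already loaded in the container.
query = {
    "Namespace": "AWS/SageMaker",
    "MetricName": "ModelCacheHit",
    "Dimensions": [
        {"Name": "EndpointName", "Value": "my-multi-model-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    "StartTime": datetime.utcnow() - timedelta(hours=1),
    "EndTime": datetime.utcnow(),
    "Period": 300,  # seconds per datapoint
    "Statistics": ["Average"],
}
```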

&lt;h2&gt;
  
  
  Inference Pipeline sagemaker
&lt;/h2&gt;

&lt;p&gt;You can use trained models in an inference pipeline to make real-time predictions directly without performing external preprocessing. When you configure the pipeline, you can choose to use the built-in feature transformers already available in Amazon SageMaker. Or, you can implement your own transformation logic using just a few lines of scikit-learn or Spark code.&lt;/p&gt;

&lt;p&gt;Refer &lt;a href="https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-python-sdk/scikit_learn_inference_pipeline/Inference%20Pipeline%20with%20Scikit-learn%20and%20Linear%20Learner.html" rel="noopener noreferrer"&gt;https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-python-sdk/scikit_learn_inference_pipeline/Inference%20Pipeline%20with%20Scikit-learn%20and%20Linear%20Learner.html&lt;/a&gt; / &lt;a href="https://catalog.us-east-1.prod.workshops.aws/workshops/f238037c-8f0b-446e-9c15-ebcc4908901a/en-US/002-services/003-machine-learning/020-sagemaker" rel="noopener noreferrer"&gt;https://catalog.us-east-1.prod.workshops.aws/workshops/f238037c-8f0b-446e-9c15-ebcc4908901a/en-US/002-services/003-machine-learning/020-sagemaker&lt;/a&gt; for more details.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; An inference pipeline allows you to host multiple models behind a single endpoint. In this case, the models form a sequential chain implementing the steps required for inference. This allows you to take your data transformation model, your predictor model, and your post-processing transformer, and host them so they run sequentially behind a single endpoint.&lt;/li&gt;
&lt;li&gt; As you can see in this picture, the inference request comes into the endpoint, then the first model is invoked, and that model is your data transformation. The output of that model is then passed to the next step, which is actually your XGBoost model here, or your predictor model. 

&lt;ul&gt;
&lt;li&gt;That output is then passed to the next step, where ultimately in that final step in the pipeline, it provides the final response or  the post-process response to that inference request. &lt;/li&gt;
&lt;li&gt;This allows you to couple your pre- and post-processing code behind the same endpoint and helps ensure that your training and inference code stay synchronized.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F77s20por3defocmh57xy.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F77s20por3defocmh57xy.PNG" alt="im" width="800" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Sagemaker Production Variant
&lt;/h2&gt;

&lt;p&gt;Amazon SageMaker enables you to test multiple models or model versions behind the same endpoint using production variants. Each production variant identifies a machine learning (ML) model and the resources deployed for hosting the model. By using production variants, you can test ML models that have been trained using different datasets, trained using different algorithms and ML frameworks, or deployed to different instance types, or any combination of these. You can distribute endpoint invocation requests across multiple production variants by providing the traffic distribution for each variant, or you can invoke a specific variant directly for each request. In this topic, we look at both methods for testing ML models.&lt;/p&gt;

&lt;p&gt;Refer the notebook &lt;a href="https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_endpoints/a_b_testing/a_b_testing.html" rel="noopener noreferrer"&gt;https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_endpoints/a_b_testing/a_b_testing.html&lt;/a&gt; for implementation details.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test models by specifying traffic distribution
&lt;/h3&gt;

&lt;p&gt;Specify the percentage of the traffic that gets routed to each model by specifying the weight for each production variant in the endpoint configuration.&lt;/p&gt;
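&lt;p&gt;A sketch of such an endpoint configuration as a parameter dictionary (the model and config names are illustrative; with credentials configured you would pass it to the SageMaker client's create_endpoint_config call):&lt;/p&gt;

```python
# Two production variants splitting traffic 80/20. A variant's share is
# its weight divided by the sum of all weights, so 0.8 and 0.2 here
# route 80% of invocations to VariantA and 20% to VariantB.
endpoint_config = {
    "EndpointConfigName": "ab-test-config",
    "ProductionVariants": [
        {
            "VariantName": "VariantA",
            "ModelName": "model-a",
            "InstanceType": "ml.m5.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.8,
        },
        {
            "VariantName": "VariantB",
            "ModelName": "model-b",
            "InstanceType": "ml.m5.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.2,
        },
    ],
}
```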

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F370os14y9wo7g1w0nwc3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F370os14y9wo7g1w0nwc3.png" alt="im" width="676" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Test models by invoking specific variants
&lt;/h3&gt;

&lt;p&gt;Specify the specific version of the model you want to invoke by providing a value for the &lt;code&gt;TargetVariant&lt;/code&gt; parameter when you call &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html" rel="noopener noreferrer"&gt;InvokeEndpoint&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftg60eazu4x6xn1p5acvv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftg60eazu4x6xn1p5acvv.png" alt="im" width="693" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Amazon SageMaker Batch Transform: Batch Inference
&lt;/h1&gt;

&lt;p&gt;We’ll use SageMaker Batch Transform jobs with a trained machine learning model. It is assumed that we have already trained the model, pushed the Docker image to ECR, and registered the model in SageMaker. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We need the identifier of the SageMaker model we want to use and the location of the input data.&lt;/li&gt;
&lt;li&gt;You can either use a built-in container for your inference image or bring your own.&lt;/li&gt;
&lt;li&gt;Batch Transform &lt;strong&gt;partitions the Amazon S3 objects in the input by key and maps Amazon S3 objects to instances&lt;/strong&gt;. When you have multiple files, one instance might process input1.csv, and another instance might process the file named input2.csv. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Batch Transform you provide your inference data as an S3 URI, and SageMaker takes care of downloading it, running the predictions, and uploading the results back to S3. You can find more documentation for Batch Transform &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you trained a model using the Hugging Face Estimator, call the &lt;code&gt;transformer()&lt;/code&gt; method to create a transform job for a model based on the training job (see &lt;a href="https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-batch-transform" rel="noopener noreferrer"&gt;here&lt;/a&gt; for more details); see also &lt;a href="https://huggingface.co/docs/sagemaker/inference" rel="noopener noreferrer"&gt;https://huggingface.co/docs/sagemaker/inference&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;the &lt;code&gt;transformer()&lt;/code&gt; call (the batch job) specifies&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;instance count&lt;/li&gt;
&lt;li&gt;instance type&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the &lt;code&gt;transform()&lt;/code&gt; call specifies&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;data location&lt;/li&gt;
&lt;li&gt;content type
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;batch_job = huggingface_estimator.transformer(
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    strategy='SingleRecord')


batch_job.transform(
    data='s3://s3-uri-to-batch-data',
    content_type='application/json',    
    split_type='Line')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to run your batch transform job later or with a model from the 🤗 Hub, create a &lt;code&gt;HuggingFaceModel&lt;/code&gt; instance and then call the &lt;code&gt;transformer()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sagemaker.huggingface.model import HuggingFaceModel

# Hub model configuration &amp;lt;https://huggingface.co/models&amp;gt;
hub = {
    'HF_MODEL_ID':'distilbert-base-uncased-finetuned-sst-2-english',
    'HF_TASK':'text-classification'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,                                                # configuration for loading model from Hub
   role=role,                                              # IAM role with permissions to create an endpoint
   transformers_version="4.6",                             # Transformers version used
   pytorch_version="1.7",                                  # PyTorch version used
   py_version='py36',                                      # Python version used
)

# create transformer to run a batch job
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    output_path=output_s3_path, # we are using the same s3 path to save the output with the input
    strategy='SingleRecord'
)

# starts batch transform job and uses S3 data as input
batch_job.transform(
    data='s3://sagemaker-s3-demo-test/samples/input.jsonl',
    content_type='application/json',    
    split_type='Line'
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the transform job completes, we can download the result file (the input file name plus &lt;code&gt;.out&lt;/code&gt;) and parse the predictions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json
from sagemaker.s3 import S3Downloader, s3_path_join
from ast import literal_eval

# s3 uri of the result file: the input file name plus .out
# (dataset_jsonl_file and output_s3_path were defined when the job was created)
output_file = f"{dataset_jsonl_file}.out"
output_path = s3_path_join(output_s3_path, output_file)

# download the result file
S3Downloader.download(output_path, '.')

batch_transform_result = []
with open(output_file) as f:
    for line in f:
        # converts each jsonline array to a normal array and accumulates it
        line = "[" + line.replace("[", "").replace("]", ",") + "]"
        batch_transform_result += literal_eval(line)

# print the first three results
print(batch_transform_result[:3])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;input.jsonl&lt;/code&gt; used as input for the batch job looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"inputs":"this movie is terrible"}
{"inputs":"this movie is amazing"}
{"inputs":"SageMaker is pretty cool"}
{"inputs":"SageMaker is pretty cool"}
{"inputs":"this movie is terrible"}
{"inputs":"this movie is amazing"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;📓 Open the &lt;a href="https://github.com/huggingface/notebooks/blob/main/sagemaker/12_batch_transform_inference/sagemaker-notebook.ipynb" rel="noopener noreferrer"&gt;notebook&lt;/a&gt; for an example of how to run a batch transform job for inference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Speeding up the processing
&lt;/h2&gt;

&lt;p&gt;We have only one instance running, so processing the entire file may take some time. We can increase the number of instances using the &lt;code&gt;instance_count&lt;/code&gt; parameter to speed it up. We can also send multiple requests to the Docker container simultaneously. To configure concurrent transformations, use the &lt;code&gt;max_concurrent_transforms&lt;/code&gt; parameter.&lt;/p&gt;
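&lt;p&gt;The effective parallelism is roughly the product of those two knobs. A small sketch (the &lt;code&gt;transformer()&lt;/code&gt; call is shown as a comment reusing the names from the example above; the numbers are illustrative):&lt;/p&gt;

```python
def max_parallel_requests(instance_count, max_concurrent_transforms):
    """Upper bound on requests SageMaker has in flight at once: each
    instance handles up to max_concurrent_transforms requests."""
    return instance_count * max_concurrent_transforms

# Sketch of a faster transformer (not executed here):
# batch_job = huggingface_model.transformer(
#     instance_count=4,                 # four instances instead of one
#     max_concurrent_transforms=8,      # eight concurrent requests per container
#     instance_type="ml.p3.2xlarge",
#     strategy="SingleRecord",
# )

print(max_parallel_requests(4, 8))  # 32 requests in flight at once
```

Raising `max_concurrent_transforms` only helps if the container can actually serve that many requests concurrently, so it is worth load-testing one instance before scaling out.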

&lt;h2&gt;
  
  
  Processing the output
&lt;/h2&gt;

&lt;p&gt;In the end, we need access to the output. We’ll find the output files in the location specified in the Transformer constructor. Every line contains the prediction for the corresponding input record.&lt;/p&gt;

</description>
      <category>motivation</category>
    </item>
    <item>
      <title>My journey into Sentence Transformer</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Sun, 16 Oct 2022 16:49:41 +0000</pubDate>
      <link>https://dev.to/aws-builders/my-journey-into-sentence-transformer-1b7m</link>
      <guid>https://dev.to/aws-builders/my-journey-into-sentence-transformer-1b7m</guid>
<description>&lt;p&gt;Over the last few days I have been exploring sentence transformers, and this page documents my notes and understanding. It explains the basics of sentence transformers and how to deploy one through a SageMaker endpoint. The endpoint is then accessed from Lambda, and Terraform is used for deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Sentence Embedding?
&lt;/h2&gt;

&lt;p&gt;I came across a nice example posted by Mathias about sentence comparison. Consider the following statements: &lt;em&gt;“Nuclear power is dangerous!”&lt;/em&gt; and &lt;em&gt;“Nuclear power is the future of energy!”&lt;/em&gt; Are the two statements similar?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If we are talking about the topic, then definitely yes: both statements are opinions on nuclear power. So in that sense, they are very similar. &lt;/li&gt;
&lt;li&gt;However, if we are talking about sentiment, then the answer is a  resounding no. They are about as dissimilar in terms of sentiment as we can get.&lt;/li&gt;
&lt;/ul&gt;
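&lt;p&gt;Cosine similarity over embedding vectors is the usual way to quantify this. A toy sketch in plain Python (the two 3-dimensional vectors are invented for illustration, not real model outputs):&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 for identical
    directions, 0.0 for orthogonal, -1.0 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: the first two dimensions stand in for "topic",
# the last one for "sentiment" (made-up values).
danger = [0.9, 0.8, -0.7]   # "Nuclear power is dangerous!"
future = [0.9, 0.8, 0.7]    # "Nuclear power is the future of energy!"

# High agreement on topic, pulled down by the opposite sentiment dimension.
print(round(cosine_similarity(danger, future), 3))
```

What "similar" means thus depends entirely on which properties the embedding model was trained to encode.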

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxo80ns2a1x3u4pknw9ph.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxo80ns2a1x3u4pknw9ph.png" alt="im" width="720" height="545"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Transformer and BERT?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;BART&lt;/strong&gt; is a &lt;a href="https://paperswithcode.com/method/denoising-autoencoder" rel="noopener noreferrer"&gt;denoising autoencoder&lt;/a&gt; for pretraining sequence-to-sequence models. It is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard &lt;a href="https://paperswithcode.com/method/transformer" rel="noopener noreferrer"&gt;Transformer&lt;/a&gt;-based seq2seq/NMT architecture with a bidirectional encoder (like &lt;a href="https://paperswithcode.com/method/bert" rel="noopener noreferrer"&gt;BERT&lt;/a&gt;) and a left-to-right decoder (like &lt;a href="https://paperswithcode.com/method/gpt" rel="noopener noreferrer"&gt;GPT&lt;/a&gt;). This means the encoder's attention mask is fully visible, like BERT, and the decoder's attention mask is causal, like &lt;a href="https://paperswithcode.com/method/gpt-2" rel="noopener noreferrer"&gt;GPT2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FEPWlKGfW4AABHof%3Fformat%3Djpg%26name%3D4096x4096" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FEPWlKGfW4AABHof%3Fformat%3Djpg%26name%3D4096x4096" alt="im" width="3170" height="1660"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;While BERT was trained using a simple token-masking technique, BART empowers the BERT encoder by using more challenging kinds of masking mechanisms in its pre-training&lt;/strong&gt;. Once we get the token- and sentence-level representation of an input text sequence, a decoder needs to interpret these to map them to the output target.&lt;/p&gt;

&lt;h1&gt;
  
  
  Building task specific Transformer based solution
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;the green portion is the pretrained model&lt;/li&gt;
&lt;li&gt;the other portion (purple) is the custom head which we train further&lt;/li&gt;
&lt;li&gt;the QA head predicts the span of text which contains the answer within the context&lt;/li&gt;
&lt;li&gt;the classification head predicts a class label (here, a binary value)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgxra5h5nzdptdfslhypd.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgxra5h5nzdptdfslhypd.PNG" alt="im" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  BERT for Sentence Similarity
&lt;/h1&gt;

&lt;p&gt;Transformers work using word or &lt;em&gt;token&lt;/em&gt;-level embeddings, &lt;em&gt;not&lt;/em&gt; sentence-level embeddings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regular transformers produce sentence embeddings by performing some pooling operation, such as the element-wise arithmetic mean, on their token-level embeddings. A good pooling choice for BERT is CLS pooling. BERT has a special &lt;code&gt;[CLS]&lt;/code&gt; token that is supposed to capture all the sequence information. It gets tuned on next-sentence prediction (NSP) during pre-training&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Before sentence transformers, the approach to calculating &lt;em&gt;accurate&lt;/em&gt; sentence similarity with BERT was to use a cross-encoder structure.  This meant that we would pass two sentences to BERT, add a  classification head to the top of BERT — and use this to output a  similarity score.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zkjiva6pl72zl5iwgwj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zkjiva6pl72zl5iwgwj.jpg" alt="Cross encoder" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The BERT cross-encoder architecture consists of a BERT model  which consumes sentences A and B. Both are processed in the same  sequence, separated by a &lt;code&gt;[SEP]&lt;/code&gt; token. All of this is followed by a feedforward NN classifier that outputs a similarity score.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The cross-encoder network does produce very accurate similarity scores (better than SBERT), but it’s &lt;em&gt;not scalable&lt;/em&gt;. If we wanted to perform a similarity search through a small 100K  sentence dataset, we would need to complete the cross-encoder inference  computation 100K times.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To cluster sentences, we would need to compare all sentences in our 100K dataset, resulting in just under 5 billion comparisons — this is simply not realistic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ideally, we need to pre-compute sentence vectors that can be stored and  then used whenever required. If these vector representations are good,  all we need to do is calculate the cosine similarity between each.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;With the original BERT (and other transformers), we can build a  sentence embedding by averaging the values across all token embeddings  output by BERT (if we input 512 tokens, we output 512 embeddings).  Alternatively, we can use the output of the first &lt;code&gt;[CLS]&lt;/code&gt; token (a BERT-specific token whose output embedding is used in classification tasks).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Using one of these two approaches gives us our sentence embeddings that can  be stored and compared much faster, shifting search times from 65 hours  to around 5 seconds (see below). However, the accuracy is not good, and  is worse than using averaged GloVe embeddings (which were developed in  2014).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
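&lt;p&gt;The comparison count above is just n(n-1)/2 unique pairs, which is what makes the cross-encoder approach explode:&lt;/p&gt;

```python
def pairwise_comparisons(n):
    """Number of unique sentence pairs in a collection of n sentences."""
    return n * (n - 1) // 2

# Clustering with a cross-encoder means one full BERT inference per pair;
# with pre-computed embeddings it is one encode per sentence plus cheap
# vector math afterwards.
print(pairwise_comparisons(10_000))    # 49,995,000 pairs
print(pairwise_comparisons(100_000))   # 4,999,950,000 pairs
```

Growing the collection 10x multiplies the pair count by roughly 100x, which is why pre-computed embeddings are the only workable option at scale.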

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fak0cs30lni75n51a4cg0.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fak0cs30lni75n51a4cg0.jpeg" alt="im" width="640" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Sentence Transformer?
&lt;/h1&gt;

&lt;h2&gt;
  
  
  How Does it Work?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.sbert.net/index.html" rel="noopener noreferrer"&gt;SentenceTransformers&lt;/a&gt; is a Python framework for state-of-the-art sentence, text, and image  embeddings. Embeddings can be computed for 100+ languages and they can  be easily used for common tasks like &lt;a href="https://en.wikipedia.org/wiki/Semantic_similarity" rel="noopener noreferrer"&gt;semantic text similarity&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Semantic_search" rel="noopener noreferrer"&gt;semantic search&lt;/a&gt;, and paraphrase mining.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution&lt;/strong&gt; of the above lack of an accurate model &lt;em&gt;with&lt;/em&gt; reasonable latency was designed by Nils Reimers and Iryna Gurevych in  2019 with the introduction of sentence-BERT (SBERT) and the &lt;code&gt;sentence-transformers&lt;/code&gt; library.&lt;/p&gt;

&lt;p&gt;SBERT produces sentence embeddings — so we do &lt;em&gt;not&lt;/em&gt; need to perform a whole inference computation for every sentence-pair comparison.&lt;/p&gt;

&lt;p&gt;SBERT is fine-tuned on sentence pairs using a &lt;em&gt;siamese&lt;/em&gt; architecture. We can think of this as having two identical BERTs in parallel that share the exact same network weights.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1juzzjp9oknnsmh31v3o.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1juzzjp9oknnsmh31v3o.jpg" alt="im" width="800" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In reality, we are using a single BERT model. However, because we process sentence A followed by sentence B as &lt;em&gt;pairs&lt;/em&gt; during training, it is easier to think of this as two models with tied weights.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Use cases
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fev1j67qbig0o6oyvu02j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fev1j67qbig0o6oyvu02j.png" alt="im" width="720" height="373"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Sentence Transformer Architecture Changes
&lt;/h2&gt;

&lt;p&gt;SBERT uses a siamese architecture where it contains 2 BERT architectures that are essentially identical and share the same weights, and SBERT processes 2 sentences as pairs during training.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The training process of sentence transformers is especially designed  with semantic similarity in mind.&lt;/strong&gt; &lt;/p&gt;

&lt;h3&gt;
  
  
  Cross-encoders
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flu8d1tmpxzlywy93ykyd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flu8d1tmpxzlywy93ykyd.png" alt="im" width="720" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A cross-encoder is thus trained by sentence-pairs along with a ground-truth label of how semantically similar they are.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfmwf3ym15ossz2qe7md.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfmwf3ym15ossz2qe7md.png" alt="img" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Cross-encoders tend to perform very well on sentence-level tasks, but they suffer from a major drawback: &lt;strong&gt;cross-encoders do not produce sentence embeddings&lt;/strong&gt;. In the context of information retrieval, this implies that we &lt;strong&gt;cannot pre-compute document embeddings&lt;/strong&gt; and efficiently compare these to a query embedding. We are also &lt;strong&gt;not able to index document embeddings&lt;/strong&gt; for efficient search.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bi-encoders
&lt;/h3&gt;

&lt;p&gt;Bi-encoders avoid this limitation: each sentence is encoded independently into its own embedding, so document embeddings can be pre-computed, indexed, and efficiently compared to a query embedding.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;we feed sentence A to BERT A and sentence B to BERT B in SBERT. Each BERT  outputs pooled sentence embeddings. While the original research paper  tried several pooling methods, they found mean-pooling was the best  approach. Pooling is a technique for generalizing features in a network, and in this case, mean pooling works by averaging groups of features in the BERT.&lt;/li&gt;
&lt;li&gt;After the pooling is done, we now have 2 embeddings: 1 for sentence A and 1  for sentence B. When the model is training, SBERT concatenates the 2  embeddings &lt;strong&gt;which will then run through a softmax classifier and be  trained using a softmax-loss function&lt;/strong&gt;. &lt;/li&gt;
&lt;li&gt;At inference — or when the model  actually begins predicting — the two embeddings are then compared using a cosine similarity function, which will output a similarity score for  the two sentences. Here is a diagram for SBERT when it is fine-tuned and at inference.&lt;/li&gt;
&lt;/ul&gt;
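&lt;p&gt;The mean-pooling step can be sketched in a few lines of plain Python (the token vectors and attention mask below are made-up toy values, not real BERT outputs):&lt;/p&gt;

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token embeddings of one sentence, skipping padding
    positions (mask == 0), to get a single sentence embedding."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            count += 1
            for i, v in enumerate(vec):
                sums[i] += v
    return [s / count for s in sums]

# Three token vectors of dimension 2; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

Real models do the same thing over 384- or 768-dimensional token vectors, masking out padding so it does not distort the average.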

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlnttcj8fc717p7vmksy.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlnttcj8fc717p7vmksy.jpeg" alt="im" width="562" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjayejtll1tljz0ub4qa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjayejtll1tljz0ub4qa.png" alt="im" width="720" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How can you use SBERT for sagemaker endpoint?
&lt;/h3&gt;

&lt;p&gt;SBERT has its own Python library (&lt;code&gt;sentence-transformers&lt;/code&gt;). Using it is as simple as using a model from the Hugging Face transformers library. Here, we have used the &lt;strong&gt;multi-qa-MiniLM-L6-cos-v1&lt;/strong&gt; model for sentence similarity.&lt;/p&gt;

&lt;p&gt;Here, I have shown how we can deploy this model to a SageMaker endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Choose transformer model for embeddings
from transformers import AutoTokenizer, AutoModel
import os
import sagemaker
import time
saved_model_dir = 'transformer'
os.makedirs(saved_model_dir, exist_ok=True)

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-MiniLM-L6-cos-v1")
model = AutoModel.from_pretrained("sentence-transformers/multi-qa-MiniLM-L6-cos-v1") 

tokenizer.save_pretrained(saved_model_dir)
model.save_pretrained(saved_model_dir)

#Defining default bucket for SageMaker pretrained model hosting
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

!cd transformer &amp;amp;&amp;amp; tar czvf ../model.tar.gz *

model_data = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='autofaiss-demo/huggingface-models')

from sagemaker.huggingface.model import HuggingFaceModel


# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data=model_data,       # path to your model and script
   entry_point = 'predict.py',
   source_dir = 'source_dir',
   role=role,                    # iam role with permissions to create an Endpoint
   transformers_version="4.12",  # transformers version used
   pytorch_version="1.9",        # pytorch version used
   py_version='py38',            # python version used
)

# deploy the model to a real-time endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.2xlarge"
    )

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, I have created a lambda_handler function where the above endpoint is being called for similarity prediction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import logging
import json
import boto3
import io
import os
import time
import sagemaker
from sagemaker.deserializers import JSONDeserializer
from sagemaker.serializers import IdentitySerializer

logger = logging.getLogger()
logger.setLevel(logging.INFO)
ENDPOINT_NAME = "huggingface-pytorch-inference-2022-10-14-20-02-16-258"
sagemaker_session = sagemaker.Session()

"""
FunctionName: invoke_endpoint
Input: transcript_item (sentence), label_map
    transcript_item type: string
    label_map type: dict
Output: Question
    type: string
"""
# @tracer.capture_method
def invoke_endpoint(payload, endpoint_name):
    runtime = boto3.client('runtime.sagemaker')
    response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                      ContentType="application/json",
                                      Body=json.dumps(payload))
    embeddings = json.loads((response["Body"].read()))
    return embeddings

# @tracer.capture_lambda_handler
def lambda_handler(event):
    start = time.time()
    similarity_scores = invoke_endpoint(event, ENDPOINT_NAME)
    end = time.time()
    logger.info(f"Profiling: \n Getting Embeddings: {1000*(end-start)} milliseconds")   
    return similarity_scores


json_event = {  
    "query_from_app" : "How many people live in London?",
    "actual_queries" : ["Around 9 Million people live in London", "London is known for its financial district"]
}

lambda_handler(json_event)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sample output is as shown below...&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{'Scores': [['Around 9 Million people live in London', 0.9156370759010315],
  ['London is known for its financial district', 0.49475768208503723]]}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Serverless deployment of Sentence Transformer
&lt;/h2&gt;

&lt;p&gt;I have shared below our lambda code and terraform code for deployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda Code
&lt;/h3&gt;

&lt;p&gt;I have used terraform to deploy the lambda function and the endpoint is being defined here as environment variable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import json
import time
import logging

import boto3

from query_request import query

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# the endpoint name and region are injected by Terraform as environment variables
ENDPOINT_NAME = os.environ["ServiceConfiguration__ENDPOINT_NAME"]
REGION = os.environ["ServiceConfiguration__REGION"]

def invoke_endpoint(payload, endpoint_name):
    runtime = boto3.client('runtime.sagemaker')
    response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                      ContentType="application/json",
                                      Body=json.dumps(payload))
    embeddings = json.loads((response["Body"].read()))
    return embeddings

def lambda_handler(event, context):
    start = time.time()
    similarity_scores = invoke_endpoint(event, ENDPOINT_NAME)
    end = time.time()
    logger.info(f"Similarity scores: {similarity_scores}")
    logger.info(f"Profiling: \n Getting Embeddings: {1000*(end-start)} milliseconds")   
    return similarity_scores

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Terraform
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;module "questn_similarity_classification" {
  source  = "terraform-module/lambda/aws"
  version = "2.12.6"

  function_name = "questn_similarity_classification"
  filename      = data.archive_file.questn_similarity_classification-zip.output_path
  source_code_hash = data.archive_file.questn_similarity_classification-zip.output_base64sha256
  description      = "questn_similarity_classification"
  handler        = "questn_similarity_classification.lambda_handler"
  runtime        = "python3.7"
  memory_size    = "1280"
  concurrency    = "25"
  lambda_timeout = "120"
  log_retention  = "30"
  publish        = true
  role_arn       = aws_iam_role.questn_similarity_classification_role.arn
  tracing_config = { mode = "Active" }
 # layers = [aws_lambda_layer_version.numpy_layer_37.arn, data.aws_lambda_layer_version.ml_faiss_layer_version.arn]

  vpc_config = {
    subnet_ids         = tolist(data.aws_subnet.efs_subnet.*.id)
    security_group_ids = [data.aws_security_group.default_sec_grp.id]
  }
  environment = {
    ServiceConfiguration__ENDPOINT_NAME    = var.ServiceConfiguration__ENDPOINT_NAME
    ServiceConfiguration__REGION = var.ServiceConfiguration__REGION
  }
  file_system_config = {
    # file_system_arn              = data.aws_efs_access_point.knn_efs.arn
    efs_access_point_arn = data.aws_efs_access_point.knn_efs.arn
    local_mount_path     = var.file_system_local_mount_path # Local mount path inside the lambda function. Must start with '/mnt/'.
    # file_system_local_mount_path = var.file_system_local_mount_path # Local mount path inside the lambda function. Must start with '/mnt/'.

  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>aws</category>
      <category>nlp</category>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>A learning journey with ELK stack</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Wed, 01 Dec 2021 17:15:27 +0000</pubDate>
      <link>https://dev.to/aws-builders/a-learning-journey-with-elk-stack-4om4</link>
      <guid>https://dev.to/aws-builders/a-learning-journey-with-elk-stack-4om4</guid>
      <description>&lt;h1&gt;
  
  
  Kibana and Elastic Search
&lt;/h1&gt;


&lt;h2&gt;
  
  
  What is Kibana?
&lt;/h2&gt;

&lt;p&gt;Kibana is a data visualization and exploration tool used for log and time-series analytics, application monitoring, and operational intelligence use cases. It offers powerful and easy-to-use features such as histograms, line graphs, pie charts, heat maps, and built-in geospatial support. It also provides tight integration with &lt;a href="https://aws.amazon.com/opensearch-service/the-elk-stack/what-is-elasticsearch/" rel="noopener noreferrer"&gt;Elasticsearch&lt;/a&gt;, a popular analytics and search engine, which makes Kibana the default choice for visualizing data stored in Elasticsearch. Kibana works in sync with Elasticsearch and Logstash, which together form the so-called &lt;strong&gt;ELK&lt;/strong&gt; stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is ELK Stack?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;ELK&lt;/strong&gt; stands for Elasticsearch, Logstash, and Kibana. &lt;strong&gt;ELK&lt;/strong&gt; is one of the most popular log management platforms used worldwide for log analysis. In the ELK stack, Logstash extracts the logging data or other events from different input sources. It processes the events and later stores them in Elasticsearch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kibana&lt;/strong&gt; is a visualization tool which accesses the logs from Elasticsearch and displays them to the user in the form of line graphs, bar graphs, pie charts, etc. The basic flow of the ELK stack is shown in the image here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgmnnh4c7fouo3lpq6umf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgmnnh4c7fouo3lpq6umf.jpg" alt="ELK Stack" width="600" height="154"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Logstash is responsible for collecting the data from all the remote sources where the logs are generated and pushing it to Elasticsearch. Elasticsearch acts as a database where the data is collected, and Kibana uses the data from Elasticsearch to present it to the user in the form of bar graphs, pie charts, and heat maps, as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F35rwy8hzbq9b75mkzk8d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F35rwy8hzbq9b75mkzk8d.jpg" alt="Elastic search" width="600" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For a small-sized development environment, the classic architecture will look as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8nt60gmusmeydwtadoo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb8nt60gmusmeydwtadoo.jpg" alt="img" width="727" height="149"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, for more complex pipelines built to handle large amounts of data in production, additional components are likely to be added to your logging architecture for resiliency (Kafka, RabbitMQ, Redis) and security (nginx):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2o1ywvwo11es9nlnzgm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2o1ywvwo11es9nlnzgm.jpg" alt="img" width="788" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is, of course, a simplified diagram for the sake of illustration.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Elasticsearch
&lt;/h2&gt;

&lt;p&gt;Elasticsearch is a distributed document store, which means that once a document is stored, it can be retrieved from any node of the cluster.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the default port for Elasticsearch is 9200&lt;/li&gt;
&lt;li&gt;the default port for Kibana is 5601&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It shows data to the user in real time, for example day-wise or hourly. The Kibana UI is user friendly and very easy for a beginner to understand. The following images give a direct comparison of these terms:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-gcp.marutitech.com%2Fwp-media%2F2017%2F06%2Fb68fa448-what-is-elasticsearch-used-for.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-gcp.marutitech.com%2Fwp-media%2F2017%2F06%2Fb68fa448-what-is-elasticsearch-used-for.png" alt="im" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjv5ejoh5okeext1pr15u.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjv5ejoh5okeext1pr15u.jpg" alt="im" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every row in an RDBMS has a unique row identifier; similarly, every document in Elasticsearch has a unique document ID.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Elasticsearch is built on top of Lucene. Every shard is simply a Lucene index, and a Lucene index, simplified, is an inverted index. Every Elasticsearch index is a collection of shards, i.e. Lucene indices. When you &lt;strong&gt;query&lt;/strong&gt; for a document, Elasticsearch sub-queries all shards, merges the results, and returns them to you. When you &lt;strong&gt;index&lt;/strong&gt; a document, Elasticsearch calculates which shard the document should be written to using the formula&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;shard = hash(routing) % number_of_primary_shards
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
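&lt;p&gt;As a rough sketch, the routing formula can be simulated in Python. Note the assumptions: Elasticsearch actually uses a Murmur3 hash of the routing value (which defaults to the document ID); md5 below is just a stable stand-in, since Python's built-in hash() is randomized per process.&lt;/p&gt;

```python
# Toy illustration of shard routing: shard = hash(routing) % number_of_primary_shards.
# Elasticsearch really uses Murmur3 on the routing value; md5 is a stand-in here.
import hashlib

def pick_shard(routing, number_of_primary_shards):
    # Hash the routing value deterministically, then take the modulo.
    digest = hashlib.md5(routing.encode("utf-8")).hexdigest()
    return int(digest, 16) % number_of_primary_shards

# Every document with the same routing value always lands on the same shard.
assert pick_shard("5159683", 5) == pick_shard("5159683", 5)
assert pick_shard("5159683", 5) in range(5)
```

&lt;p&gt;This formula is also why the number of primary shards cannot be changed after index creation without reindexing: changing the divisor would send existing routing values to different shards.&lt;/p&gt;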



&lt;h2&gt;
  
  
  Key Concepts
&lt;/h2&gt;

&lt;p&gt;In Elasticsearch terms, Index = Database, Type = Table, Document = Row.&lt;/p&gt;

&lt;p&gt;The key concepts of Elasticsearch are as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node

&lt;ul&gt;
&lt;li&gt;It refers to a single running instance of Elasticsearch.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Cluster

&lt;ul&gt;
&lt;li&gt;It is a collection of one or more nodes. Cluster provides collective indexing and search capabilities across all the nodes for entire data.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Index

&lt;ul&gt;
&lt;li&gt;It is a collection of different types of documents and their properties. An index also uses the concept of shards to improve performance.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Document

&lt;ul&gt;
&lt;li&gt;It is a collection of fields in a specific manner defined in JSON format. Every document belongs to a type and resides inside an index. Every document is associated with a unique identifier called the UID.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Shard

&lt;ul&gt;
&lt;li&gt;Indexes are horizontally subdivided into shards. Each shard contains all the properties of a document but fewer JSON objects than the full index. The horizontal separation makes a shard an independent unit that can be stored on any node.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Replicas

&lt;ul&gt;
&lt;li&gt;Elasticsearch allows a user to create replicas of their indexes and shards.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Checking that Elasticsearch is running
&lt;/h2&gt;

&lt;p&gt;You can test that your Elasticsearch node is running by sending an HTTP request to port &lt;code&gt;9200&lt;/code&gt; on &lt;code&gt;localhost&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -X GET "localhost:9200/?pretty"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response will be similar to the following.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "name" : "BLR-AA202394",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "pYm2eSPlTh-EUo8hcVce2A",
  "version" : {
    "number" : "7.15.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "93d5a7f6192e8a1a12e154a2b81bf6fa7309da0c",
    "build_date" : "2021-11-04T14:04:42.515624022Z",
    "build_snapshot" : false,
    "lucene_version" : "8.9.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  REST APIs in Elasticsearch
&lt;/h2&gt;

&lt;p&gt;Every feature of Elasticsearch is exposed as a REST API:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Index API&lt;/strong&gt; – Used to add a document to an index&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get API&lt;/strong&gt; – Used to retrieve the document&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search API&lt;/strong&gt; – Used to submit your query and get the result&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Put Mapping API&lt;/strong&gt; – Used to override default choices and define our own mapping&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Elasticsearch has its own query domain-specific language, where you specify the query in JSON format. Queries can also be nested based on your needs. Real projects require searching on different fields, applying conditions, different weights, recency of documents, values of predefined fields, and so on. All such complexity can be expressed through a single query.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-gcp.marutitech.com%2Fwp-media%2F2017%2F06%2Fed18c2ed-indexing-and-searching-in-elasticsearch.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-gcp.marutitech.com%2Fwp-media%2F2017%2F06%2Fed18c2ed-indexing-and-searching-in-elasticsearch.png" alt="im" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Document analysis process by Elasticsearch
&lt;/h2&gt;

&lt;p&gt;When a request to index a document is received by Elasticsearch, it is handed to Lucene, which converts the document into a stream of tokens. After the tokens are generated, they are filtered by the configured filters. This entire process is called the analysis process, and it is applied to every document that gets indexed.&lt;/p&gt;

&lt;p&gt;One thing to learn from this is that the key to efficient storage and retrieval is the analysis process defined on the index, tailored to the application's needs.&lt;/p&gt;

&lt;p&gt;Sometimes we also do the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stemming, which means we convert a word into its root word and store that in the index (e.g., population, populations)&lt;/li&gt;
&lt;li&gt;synonym handling, where we store a single word rather than storing all the synonyms (e.g., declined, reduced)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Remember that this same normalization process (used when building the inverted index) needs to be applied to the search query as well.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvsy0n1eqv5653uyvjtq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvsy0n1eqv5653uyvjtq.png" alt="im" width="800" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The analyzer can be a custom or default one, and it needs to be specified during index creation. If no analyzer is provided, Elasticsearch uses the default standard analyzer.&lt;/p&gt;

&lt;p&gt;Remember that the Elasticsearch standard analyzer will not do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stemming (Running to Run) – the process of converting a word to its root word&lt;/li&gt;
&lt;li&gt;stop word removal&lt;/li&gt;
&lt;/ul&gt;
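&lt;p&gt;A minimal Python sketch of what such an analysis pipeline does: tokenize and lowercase (roughly what the standard analyzer does), with stop-word removal shown as an optional extra filter a custom analyzer could add. The stop-word list here is an illustrative subset, not the real one.&lt;/p&gt;

```python
import re

# Illustrative stop-word subset; real stop filters use longer lists.
STOP_WORDS = {"is", "to", "the", "a", "i", "want"}

def standard_like_analyze(text):
    # Roughly what the standard analyzer does: split on non-word
    # characters and lowercase. No stemming, no stop-word removal.
    return [token.lower() for token in re.findall(r"\w+", text)]

def with_stop_filter(text):
    # A custom analyzer could add a stop-word token filter on top.
    return [t for t in standard_like_analyze(text) if t not in STOP_WORDS]

assert standard_like_analyze("Elasticsearch is fast") == ["elasticsearch", "is", "fast"]
assert with_stop_filter("Elasticsearch is fast") == ["elasticsearch", "fast"]
```

&lt;p&gt;Running the same pipeline on both the indexed documents and the search query is what keeps the two sides of the inverted index consistent.&lt;/p&gt;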

&lt;h2&gt;
  
  
  Inverted Index
&lt;/h2&gt;

&lt;p&gt;Elasticsearch uses a method called the inverted index. Since we are talking about searching for a term in a large collection of documents (a collection of chapters, in this case), we can use inverted indexes to solve this problem; and yes, almost all books use inverted indexes to make your life easier. Just like many other books, "Team of Rivals" has an inverted index at the end of the book, as shown in this image.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An inverted index consists of a list of all the unique words that appear in any document, and for each word, a list of the documents in which it appears&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fres.cloudinary.com%2Fpracticaldev%2Fimage%2Ffetch%2Fs--REcFcmWx--%2Fc_limit%252Cf_auto%252Cfl_progressive%252Cq_auto%252Cw_880%2Fhttps%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fm40ibvqdoapeo68l80e2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fres.cloudinary.com%2Fpracticaldev%2Fimage%2Ffetch%2Fs--REcFcmWx--%2Fc_limit%252Cf_auto%252Cfl_progressive%252Cq_auto%252Cw_880%2Fhttps%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fm40ibvqdoapeo68l80e2.jpg" alt="im" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So after checking the inverted index at the end of the book, we know that "Baltimore" is mentioned on pages 629 and 630. There are two parts to this: &lt;em&gt;searching&lt;/em&gt; for "Baltimore" in the lexicographically ordered inverted index list, and &lt;em&gt;fetching&lt;/em&gt; the pages based on the value of the index (here 629 and 630). The search time for a term in the inverted index is very small, since in computing we actually use dictionaries (hash-based or search trees) to keep track of these terms.&lt;/p&gt;

&lt;p&gt;The purpose of an inverted index is to store text in a structure that allows very efficient and fast full-text searches. When performing full-text searches, we are actually querying the inverted index and not the JSON documents that we defined when indexing the documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An inverted index consists of all of the unique terms that appear in any document covered by the index&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Here is a diagram showing a very simplified structure of an inverted index. Stop words, which are the common words, have been separated out; they should also be excluded from queries.&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fres.cloudinary.com%2Fpracticaldev%2Fimage%2Ffetch%2Fs--vbWYcU0Y--%2Fc_limit%252Cf_auto%252Cfl_progressive%252Cq_auto%252Cw_880%2Fhttps%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F8m6i1jb4oig5ohzdd6mx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fres.cloudinary.com%2Fpracticaldev%2Fimage%2Ffetch%2Fs--vbWYcU0Y--%2Fc_limit%252Cf_auto%252Cfl_progressive%252Cq_auto%252Cw_880%2Fhttps%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F8m6i1jb4oig5ohzdd6mx.png" alt="im" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After the analysis process, when the data has been converted into tokens, those tokens are stored in an internal structure called the inverted index. This structure maps each unique term in an index to the documents in which it appears. This data structure allows faster search and text analytics. Attributes such as term count and term position are associated with each term. Below is a sample visualization of what an inverted index may look like.&lt;/p&gt;

&lt;p&gt;After the tokens are mapped, the document is stored on disk. One can choose to store the original input of the document along with the analyzed document; the original input is stored in a system field named "_source". One can even choose not to analyze the input and store the document without any analysis. The structure of the inverted index depends entirely on the analyzer chosen for indexing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmm516u6nmabgmvgirpzc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmm516u6nmabgmvgirpzc.png" alt="im" width="800" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another example:&lt;/p&gt;

&lt;p&gt;Document 1: &lt;code&gt;Elasticsearch is fast&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Document 2: &lt;code&gt;I want to learn Elasticsearch&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Let’s take a peek into the Inverted Index and see the result of the Analysis and Indexing process:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7UkpILE3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/nnc1hrou6zpa3n03l1ci.PNG" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fres.cloudinary.com%2Fpracticaldev%2Fimage%2Ffetch%2Fs--7UkpILE3--%2Fc_limit%252Cf_auto%252Cfl_progressive%252Cq_auto%252Cw_880%2Fhttps%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fnnc1hrou6zpa3n03l1ci.PNG" alt="Alt Text" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, the terms are counted and mapped to document identifiers and their positions in the document. The reason we don't see the full documents &lt;code&gt;Elasticsearch is fast&lt;/code&gt; or &lt;code&gt;I want to learn Elasticsearch&lt;/code&gt; is that they go through the analysis process, which is our main topic in this article.&lt;/p&gt;
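&lt;p&gt;To make this concrete, the two example documents above can be turned into a toy inverted index in a few lines of Python: each term maps to a postings list of (document id, position) pairs. This is only a sketch; Lucene's real structures are far more compact and carry more per-term attributes.&lt;/p&gt;

```python
from collections import defaultdict

def build_inverted_index(docs):
    # docs: {doc_id: text}. Tokenization here is a naive
    # lowercase-and-split stand-in for the analysis process.
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].append((doc_id, position))
    return dict(index)

docs = {1: "Elasticsearch is fast", 2: "I want to learn Elasticsearch"}
index = build_inverted_index(docs)
# "elasticsearch" appears in both documents, at different positions.
assert index["elasticsearch"] == [(1, 0), (2, 4)]
assert index["fast"] == [(1, 2)]
```

&lt;p&gt;Looking up a term is then a single dictionary access followed by a walk of its postings list, which is exactly what makes full-text search fast.&lt;/p&gt;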

&lt;h2&gt;
  
  
  Operation with Elasticsearch
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Inserting a document into Elasticsearch means indexing the document&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you execute the PUT command for a fresh document, the document will be created, and the result attribute in the response will indicate "created". When you execute the PUT command a second time for the same document, it will not create a new document; instead, it will update the existing one, and the result attribute in the response will indicate "updated".&lt;/p&gt;
&lt;h3&gt;
  
  
  What is indexing
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Index as a verb&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The process of inserting a document into Elasticsearch is called indexing. Indexing a document into Elasticsearch is similar to inserting a row into a database table. The only difference is that if the document already exists in the cluster, the indexing process will replace the old document.&lt;/p&gt;
&lt;h3&gt;
  
  
  Indexing a document from Kibana
&lt;/h3&gt;

&lt;p&gt;Before indexing a document from Kibana, first we need to decide where the document will be stored. In Elasticsearch, a document belongs to a type, and the type lives within an index. An Elasticsearch cluster may contain multiple indexes, and version 7 allows only one type per index. An index may contain multiple documents, and each document may have multiple fields.&lt;/p&gt;

&lt;p&gt;The base URL pattern for indexing a document is&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{{host}}/:index/_alias/:alias
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a sample indexing request, which will create the index and add the document. I have also added the following index pattern to elasticsearch.yml to enable auto-indexing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;action.auto_create_index: .monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*,.doc*,.json*,_json*,_doc*,*subscriber*,*product*,*event*

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: passing a type value (here it is json) is deprecated in the current V7.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Indexing request&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here I have intentionally used a PUT request, as it requires an ID value to be provided. My aim here is to search by subscriberId, and hence I am passing the ID value.&lt;/strong&gt; Otherwise we can use a POST command like “POST /subscriber/json/”, which will auto-generate the ID.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /subscriber/json/5159683
{
    "Addresses": [
        {
            "City": "TX",
            "Country": "USA",
            "LineOne": "Parkway Suite Houston",
            "PostalCode": "77047",
            "State": "TX",
            "DefaultBilling": true,
            "DefaultHome": false,
            "DefaultPostal": true,
            "DefaultService": true,
            "DefaultShipping": false,
            "DefaultWork": false,
            "Id": 1412213,
            "Created": "2021-11-22T16:05:40.943Z",
            "Modified": "2021-11-22T16:05:40.943Z",
            "Name": "Parkway E Suite Houston",
            "ShipToName": "India",
            "Status": 1,
            "StatusName": "Active",
            "TaxPCode": null,
            "Verifiable": true,
            "Verified": true
        }
    ],
    "Subscriber": {
        "Category": 1,
        "CompanyName": "IH-163759714",
        "ConvergentBillerId": "40003771",
        "Created": "2021-11-22T16:05:40.890Z",
        "ExternalReference": "IH-EX163759714",
        "Id": 5159683,
        "Language": "en-GB",
        "Login": "aa@mallinator.com",
        "State": 0,
        "StateChangeDate": "2021-11-22T16:05:40.840Z",
        "Status": 1,
        "SubscriberTypeCode": 10121,
        "AdditionalProperties": [
            {
                "ExternalReference": "Invoice_Ref",
                "Id": 1886,
                "Name": "CustomerInvoiceRef",
                "Values": []
            },
            {
                "ExternalReference": "Bill_From_Address",
                "Id": 1888,
                "Name": "BillFromAddress",
                "Values": [
                    "add-sws"
                ]
            }
        ],
        "ContactPreferences": [
            {
                "ContactEventType": 12,
                "ContactMethod": 40,
                "Id": 2702216,
                "OptIn": true
            }
        ],
        "EffectiveStartDate": "2021-11-22T16:05:40.863Z",
        "HomeCountry": "USA",
        "InvoiceConfiguration": {
            "HideZeroAmount": false,
            "InvoiceDetailLevel": 1
        },
        "StateName": "Prospect",
        "StatusName": "Active",
        "SubscriberCurrency": "USD",
        "SubscriberTypeDetails": {
            "AccountingMethod": 2,
            "BillCycle": "40002163",
            "BillCycleDay": 1,
            "BillCycleName": "BillCycle_1",
            "IsReadOnly": false,
            "PaymentDueDays": 10,
            "PostpaidAccountNumber": "CAQ400037719"
        },
        "TermsAndConditionsAccepted": "2021-11-22T16:05:40.883Z"
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Indexing response&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the type has been set to our input value&lt;/li&gt;
&lt;li&gt;the index has been set to our input, subscriber&lt;/li&gt;
&lt;li&gt;the result value has been set to "created"&lt;/li&gt;
&lt;li&gt;the sequence number has been set to 3 by Elasticsearch
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#! [types removal] Specifying types in document index requests is deprecated, use the typeless endpoints instead (/{index}/_doc/{id}, /{index}/_doc, or /{index}/_create/{id}).
{
  "_index" : "subscriber",
  "_type" : "json",
  "_id" : "5159684",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 3,
  "_primary_term" : 1
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the response when I &lt;strong&gt;passed the input for an existing ID; the result value shows as "updated"&lt;/strong&gt;, which means the document has been updated. The version has been set to 3, which means the document has been updated three times. Every document in Elasticsearch has a version, so a PUT request for an existing document will increment the version, while a request for a new document will create it.&lt;/p&gt;

&lt;p&gt;The _index mentions the index name for this document.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Any document attribute that starts with _ is called document metadata.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#! [types removal] Specifying types in document index requests is deprecated, use the typeless endpoints instead (/{index}/_doc/{id}, /{index}/_doc, or /{index}/_create/{id}).
{
  "_index" : "subscriber",
  "_type" : "json",
  "_id" : "5159683",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Checking whether a document exists
&lt;/h3&gt;

&lt;p&gt;The HEAD command is used to check whether a document exists; it responds with HTTP 200, as shared below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HEAD /subscriber/_doc/5159685
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;200 - OK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Retrieve an Index document
&lt;/h3&gt;

&lt;p&gt;Here we pass the index name, document type, and ID to retrieve the document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/json/5159685
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response contains the document along with its metadata. The found field is set to true, which means the document has been found.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "_index" : "subscriber",
  "_type" : "json",
  "_id" : "5159685",
  "_version" : 1,
  "_seq_no" : 7,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "Addresses" : [
      {
        "City" : "TX",
        "Country" : "USA",
        "LineOne" : "Parkway E Suite Houston",
        "PostalCode" : "77047",
        "State" : "TX",
        "DefaultBilling" : true,
        "DefaultHome" : false,
        "DefaultPostal" : true,
        "DefaultService" : true,
        "DefaultShipping" : false,
        "DefaultWork" : false,
        "Id" : 1412213,
        "Created" : "2021-11-22T16:05:40.943Z",
        "Modified" : "2021-11-22T16:05:40.943Z",
        "Name" : "Parkway E Suite 100 Houston",
        "ShipToName" : "MARLINK GROUP USA - HOUSTON",
        "Status" : 1,
        "StatusName" : "Active",
        "TaxPCode" : null,
        "Verifiable" : true,
        "Verified" : true
      }
    ],
    "Subscriber" : {
      "Category" : 1,
      "CompanyName" : "IH1-163759714",
      "ConvergentBillerId" : "400033771",
      "Created" : "2021-11-22T16:05:40.890Z",
      "ExternalReference" : "IH-EX163759714",
      "Id" : 5159685,
      "Language" : "en-GB",
      "Login" : "IH16375339714@aaar.com",
      "State" : 0,
      "StateChangeDate" : "2021-11-22T16:05:40.840Z",
      "Status" : 1,
      "SubscriberTypeCode" : 10121,
      "AdditionalProperties" : [
        {
          "ExternalReference" : "Invoice_Ref",
          "Id" : 1886,
          "Name" : "CustomerInvoiceRef",
          "Values" : [ ]
        },
        {
          "ExternalReference" : "Bill_From_Address",
          "Id" : 1888,
          "Name" : "BillFromAddress",
          "Values" : [
            "xxxx"
          ]
        }
      ],
      "ContactPreferences" : [
        {
          "ContactEventType" : 12,
          "ContactMethod" : 40,
          "Id" : 2702216,
          "OptIn" : true
        }
      ],
      "EffectiveStartDate" : "2021-11-22T16:05:40.863Z",
      "HomeCountry" : "USA",
      "InvoiceConfiguration" : {
        "HideZeroAmount" : false,
        "InvoiceDetailLevel" : 1
      },
      "StateName" : "Prospect",
      "StatusName" : "Active",
      "SubscriberCurrency" : "USD",
      "SubscriberTypeDetails" : {
        "AccountingMethod" : 2,
        "BillCycle" : "40002163",
        "BillCycleDay" : 1,
        "BillCycleName" : "BillCycle_1",
        "IsReadOnly" : false,
        "PaymentDueDays" : 10,
        "PostpaidAccountNumber" : "CAQ400037719"
      },
      "TermsAndConditionsAccepted" : "2021-11-22T16:05:40.883Z"
    }
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Updating an Index document
&lt;/h3&gt;

&lt;p&gt;A PUT request for an existing ID will update the document; the response will have result set to "updated".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note that a PUT request replaces the whole document with the new content, which means the new PUT request needs to contain all the elements&lt;/strong&gt;. A PUT request updates the complete document.&lt;/p&gt;

&lt;p&gt;If we want to update partially and don't want to pass all the document elements, we should use a POST request, which updates only the relevant elements.&lt;/p&gt;

&lt;p&gt;Key points to note here are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the POST command includes _update in the path to tell Elasticsearch this is an update&lt;/li&gt;
&lt;li&gt;the doc keyword has to be added, as Elasticsearch expects the partial document inside doc&lt;/li&gt;
&lt;li&gt;if we provide any new attribute in the POST request, it will be added to the document&lt;/li&gt;
&lt;li&gt;The following logic is triggered by Elasticsearch after the POST request

&lt;ul&gt;
&lt;li&gt;retrieves the existing document&lt;/li&gt;
&lt;li&gt;applies the changes requested in the POST request&lt;/li&gt;
&lt;li&gt;removes the old document&lt;/li&gt;
&lt;li&gt;indexes the new document (with the update) in place of the old document
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
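&lt;p&gt;The retrieve-apply-reindex cycle described above can be sketched as a recursive dictionary merge in Python. One assumption worth flagging, which matches Elasticsearch's actual partial-update behavior: object fields are merged key by key, but array fields are replaced wholesale, not merged element by element.&lt;/p&gt;

```python
def apply_partial_update(existing, partial_doc):
    # Sketch of what _update does server-side: fetch the old _source,
    # overlay the fields from the "doc" body, and reindex the result.
    # Nested objects merge recursively; arrays and scalars are replaced.
    updated = dict(existing)
    for key, value in partial_doc.items():
        if isinstance(value, dict) and isinstance(updated.get(key), dict):
            updated[key] = apply_partial_update(updated[key], value)
        else:
            updated[key] = value
    return updated

existing = {"Subscriber": {"Id": 5159683, "Status": 1}, "Addresses": [{"City": "TX"}]}
partial = {"Addresses": [{"City": "Austin"}]}
result = apply_partial_update(existing, partial)
assert result["Addresses"] == [{"City": "Austin"}]   # array replaced wholesale
assert result["Subscriber"]["Id"] == 5159683         # untouched fields kept
```

&lt;p&gt;So a partial update of the Addresses array, as in the request that follows, replaces the entire array with the new one-element list rather than merging the City field into the existing address.&lt;/p&gt;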

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /subscriber/_update/5159683
{
  "doc":
  {
    "Addresses": [
        {
            "City": "Austin"

        }
    ]
  }

}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Deleting a document
&lt;/h3&gt;

&lt;p&gt;The request is similar to a GET request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DELETE /subscriber/json/5159699
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response will have the result element populated with the value "deleted".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "_index" : "subscriber",
  "_type" : "json",
  "_id" : "5159699",
  "_version" : 2,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 33,
  "_primary_term" : 2
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Insert
&lt;/h3&gt;

&lt;p&gt;In a bulk request, individual items are separated by newline characters (not commas), and &lt;em&gt;there are no square brackets around the payload (i.e., the payload is a sequence of JSON documents)&lt;/em&gt;. Each document must be on a single line; no newlines are allowed within it. This seems a bit unintuitive, I know, since resources don't have to be on a single line when you create them using PUT or POST in non-bulk operations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST _bulk
{"index": {"_index": "subscriber", "_id": "5159199"}}
{"Subscriber": {"Category": 1, "CompanyName": "IH-163759714","ConvergentBillerId": "40003771", "Created": "2021-11-22T16:05:40.890Z","ExternalReference": "IH-EX163759714","Id": 5159199,"Language": "en-GB","Login": "1@1.com","State": 0, "StateChangeDate": "2021-10-22T16:05:40.840Z","Status": 1,"SubscriberTypeCode": 10121}}
{"index": {"_index": "subscriber", "_id": "5159099"}}
{"Subscriber": {"Category": 1, "CompanyName": "IH-163759714","ConvergentBillerId": "40003771", "Created": "2021-11-22T16:05:40.890Z","ExternalReference": "IH-EX163759714","Id": 5159099,"Language": "en-GB","Login": "1@1.com","State": 0, "StateChangeDate": "2021-10-22T16:05:40.840Z","Status": 2,"SubscriberTypeCode": 10122}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
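&lt;p&gt;Because of this one-line-per-document rule, it is safer to generate the bulk body programmatically than to hand-edit it. A small Python sketch (the index name and document shapes here are illustrative, reusing the subscriber example):&lt;/p&gt;

```python
import json

def build_bulk_payload(index, docs):
    # The _bulk body alternates an action line and a source line,
    # each as compact single-line JSON, ending with a trailing newline.
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload("subscriber", {
    "5159199": {"Subscriber": {"Id": 5159199, "Status": 1}},
    "5159099": {"Subscriber": {"Id": 5159099, "Status": 2}},
})
assert payload.count("\n") == 4          # 2 docs x 2 lines, plus trailing newline
assert payload.endswith("\n")            # _bulk requires the final newline
```

&lt;p&gt;json.dumps guarantees each document stays on one line, and the trailing newline is required by the _bulk endpoint.&lt;/p&gt;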



&lt;p&gt;The bulk insert response returns the status of each document insert, as shared below.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the status code for each item will be set to 201
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "took" : 20,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "subscriber",
        "_type" : "_doc",
        "_id" : "5159199",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 34,
        "_primary_term" : 2,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "subscriber",
        "_type" : "_doc",
        "_id" : "5159099",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 35,
        "_primary_term" : 2,
        "status" : 201
      }
    }
  ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bulk Update
&lt;/h3&gt;

&lt;p&gt;Here I am updating all the attributes of both documents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST _bulk
{"index": {"_index": "subscriber", "_id": "5159199"}}
{"Subscriber": {"Category": 1, "CompanyName": "IH-163759714","ConvergentBillerId": "40003771", "Created": "2021-11-22T16:05:40.890Z","ExternalReference": "IH-EX163759714","Id": 5159199,"Language": "en-GB","Login": "1@1.com","State": 1, "StateChangeDate": "2021-10-22T16:05:40.840Z","Status": 1,"SubscriberTypeCode": 10121}}
{"index": {"_index": "subscriber", "_id": "5159099"}}
{"Subscriber": {"Category": 1, "CompanyName": "IH-163759714","ConvergentBillerId": "40003771", "Created": "2021-11-22T16:05:40.890Z","ExternalReference": "IH-EX163759714","Id": 5159099,"Language": "en-GB","Login": "1@1.com","State": 1, "StateChangeDate": "2021-10-22T16:05:40.840Z","Status": 2,"SubscriberTypeCode": 10122}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response has the status code set to 200 for each document, as shown below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "took" : 9,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "subscriber",
        "_type" : "_doc",
        "_id" : "5159199",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 36,
        "_primary_term" : 2,
        "status" : 200
      }
    },
    {
      "index" : {
        "_index" : "subscriber",
        "_type" : "_doc",
        "_id" : "5159099",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 37,
        "_primary_term" : 2,
        "status" : 200
      }
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Full-Text Search Basics of Elasticsearch
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Note: Whatever analyzer was used during index creation will also be used during query operations. Every query term goes through the same analysis process.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The English analyzer in Elasticsearch does the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tokenizes the text, splitting on whitespace and punctuation&lt;/li&gt;
&lt;li&gt;stems each token to its root form&lt;/li&gt;
&lt;li&gt;removes stop words&lt;/li&gt;
&lt;/ul&gt;
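To build intuition for these steps, here is a toy Python imitation. This is only an illustration: the real english analyzer uses a proper stemmer and a much fuller stop-word list.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "of"}  # tiny sample list

def toy_english_analyze(text):
    # 1. Tokenize: split on whitespace/punctuation and lower-case.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # 2. Remove stop words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # 3. Crude suffix stripping, standing in for real stemming.
    return [re.sub(r"(ing|ed|s)$", "", t) for t in tokens]
```

For example, toy_english_analyze("The subscribers are searching") yields ["subscriber", "search"]: the stop words are dropped and both remaining tokens are reduced to a root form.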

&lt;h3&gt;
  
  
  Getting a document by Id
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_doc/5159199
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response has found set to true, as shown below, which indicates that a document with that id exists. The original document is returned under the "_source" metadata field.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "_index" : "subscriber",
  "_type" : "_doc",
  "_id" : "5159199",
  "_version" : 2,
  "_seq_no" : 36,
  "_primary_term" : 2,
  "found" : true,
  "_source" : {
    "Subscriber" : {
      "Category" : 1,
      "CompanyName" : "IH-163759714",
      "ConvergentBillerId" : "40003771",
      "Created" : "2021-11-22T16:05:40.890Z",
      "ExternalReference" : "IH-EX163759714",
      "Id" : 5159199,
      "Language" : "en-GB",
      "Login" : "1@1.com",
      "State" : 1,
      "StateChangeDate" : "2021-10-22T16:05:40.840Z",
      "Status" : 1,
      "SubscriberTypeCode" : 10121
    }
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Searching an Index
&lt;/h3&gt;

&lt;p&gt;The following request returns all documents in the index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The search result carries the following metadata: hits.total.value indicates how many documents matched. Here timed_out is set to false, meaning the request did not time out; a timeout can also be specified in the request to override the cluster default.&lt;/p&gt;

&lt;p&gt;The max_score of 1.0 reflects that a match-all search assigns every document the same constant score.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "took": 11,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 12,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [{
                "_index": "subscriber",
                "_type": "json",
                "_id": "5159698",
                "_score": 1.0,
                "_source": {
                ..........
                }
            }
        ]
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Searching by query string
&lt;/h3&gt;

&lt;p&gt;We can pass a query string via the q parameter, as shown below. Here Elasticsearch searches all documents for the value 10122, irrespective of the field.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_search?q="10122"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
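When calling this endpoint from code rather than the Kibana console, the q parameter should be URL-encoded. A small Python sketch using only the standard library (the host and port assume a local cluster):

```python
from urllib.parse import urlencode

base = "http://localhost:9200/subscriber/_search"

# urlencode percent-escapes characters such as ':' that are legal in
# Lucene query strings but must be encoded in a URL.
url_all = base + "?" + urlencode({"q": "10122"})
url_field = base + "?" + urlencode({"q": "SubscriberTypeCode:10121"})
```

The second URL corresponds to the field-scoped search shown later in this article, with the colon encoded as %3A.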



&lt;p&gt;&lt;strong&gt;The response has the relation field set to eq, which means the total hit count is exact&lt;/strong&gt;. &lt;strong&gt;The results are ordered by _score; the first hit has a _score of 2.302585 because it contains the most occurrences of the query string 10122&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "took" : 429,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 2.302585,
    "hits" : [
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "49",
        "_score" : 2.302585,
        "_source" : {
          "Subscriber" : {
            "Category" : 1,
            "CompanyName" : "IH-10122",
            "ConvergentBillerId" : "10122",
            "Created" : "2021-11-22T16:05:40.890Z",
            "ExternalReference" : "IH-EX163759714",
            "Id" : 49,
            "Language" : "en-GB",
            "Login" : "1@1.com",
            "State" : 1,
            "StateChangeDate" : "2021-10-22T16:05:40.840Z",
            "Status" : 1,
            "SubscriberTypeCode" : 10121
          }
        }
      },
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "79",
        "_score" : 1.7917595,
        "_source" : {
          "Subscriber" : {
            "Category" : 1,
            "CompanyName" : "IH-163759714",
            "ConvergentBillerId" : "10122",
            "Created" : "2021-11-22T16:05:40.890Z",
            "ExternalReference" : "IH-EX163759714",
            "Id" : 79,
            "Language" : "en-GB",
            "Login" : "1@1.com",
            "State" : 1,
            "StateChangeDate" : "2021-10-22T16:05:40.840Z",
            "Status" : 2,
            "SubscriberTypeCode" : 10122
          }
        }
      },
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "5159099",
        "_score" : 1.0,
        "_source" : {
          "Subscriber" : {
            "Category" : 1,
            "CompanyName" : "IH-163759714",
            "ConvergentBillerId" : "40003771",
            "Created" : "2021-11-22T16:05:40.890Z",
            "ExternalReference" : "IH-EX163759714",
            "Id" : 5159099,
            "Language" : "en-GB",
            "Login" : "1@1.com",
            "State" : 1,
            "StateChangeDate" : "2021-10-22T16:05:40.840Z",
            "Status" : 2,
            "SubscriberTypeCode" : 10122
          }
        }
      },
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "59099",
        "_score" : 1.0,
        "_source" : {
          "Subscriber" : {
            "Category" : 1,
            "CompanyName" : "IH-163759714",
            "ConvergentBillerId" : "40003771",
            "Created" : "2021-11-22T16:05:40.890Z",
            "ExternalReference" : "IH-EX163759714",
            "Id" : 59099,
            "Language" : "en-GB",
            "Login" : "1@1.com",
            "State" : 1,
            "StateChangeDate" : "2021-10-22T16:05:40.840Z",
            "Status" : 2,
            "SubscriberTypeCode" : 10122
          }
        }
      },
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "599",
        "_score" : 1.0,
        "_source" : {
          "Subscriber" : {
            "Category" : 1,
            "CompanyName" : "IH-163759714",
            "ConvergentBillerId" : "40003771",
            "Created" : "2021-11-22T16:05:40.890Z",
            "ExternalReference" : "IH-EX163759714",
            "Id" : 599,
            "Language" : "en-GB",
            "Login" : "1@1.com",
            "State" : 1,
            "StateChangeDate" : "2021-10-22T16:05:40.840Z",
            "Status" : 2,
            "SubscriberTypeCode" : 10122
          }
        }
      },
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "19",
        "_score" : 1.0,
        "_source" : {
          "Subscriber" : {
            "Category" : 1,
            "CompanyName" : "IH-163759714",
            "ConvergentBillerId" : "40003771",
            "Created" : "2021-11-22T16:05:40.890Z",
            "ExternalReference" : "IH-EX163759714",
            "Id" : 19,
            "Language" : "en-GB",
            "Login" : "1@1.com",
            "State" : 1,
            "StateChangeDate" : "2021-10-22T16:05:40.840Z",
            "Status" : 2,
            "SubscriberTypeCode" : 10122
          }
        }
      }
    ]
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Searching by specific element of document
&lt;/h3&gt;

&lt;p&gt;We can include a field name in the query string; this returns all documents whose value in that specific field matches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_search?q=SubscriberTypeCode:"10121"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response has a _score of 1.0 for every returned document, since this is an exact value match on a single field.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "149",
        "_score" : 1.0,
        "_source" : {
          "Category" : 1,
          "CompanyName" : "IH-10122",
          "ConvergentBillerId" : "12",
          "Created" : "2021-11-22T16:05:40.890Z",
          "ExternalReference" : "IH-EX163759714",
          "Id" : 149,
          "Language" : "en-GB",
          "Login" : "1@1.com",
          "State" : 1,
          "StateChangeDate" : "2021-10-22T16:05:40.840Z",
          "Status" : 1,
          "SubscriberTypeCode" : 10121
        }
      },
      {
        "_index" : "subscriber",
        "_type" : "json",
        "_id" : "139",
        "_score" : 1.0,
        "_source" : {
          "Category" : 1,
          "CompanyName" : "IH-10122",
          "ConvergentBillerId" : "12",
          "Created" : "2021-11-22T16:05:40.890Z",
          "ExternalReference" : "IH-EX163759714",
          "Id" : 139,
          "Language" : "en-GB",
          "Login" : "1@1.com",
          "State" : 1,
          "StateChangeDate" : "2021-10-22T16:05:40.840Z",
          "Status" : 1,
          "SubscriberTypeCode" : 10121
        }
      }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Query DSL
&lt;/h3&gt;

&lt;p&gt;The curl equivalent of the query above is shown below. Note that the query sits in the URL, where it can end up in logs and browser history and is easier for an attacker to tamper with.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -XGET "http://localhost:9200/subscriber/_search?q=SubscriberTypeCode:"10121""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So instead of passing the query in the URL, we pass it in the request body, which is more secure. In the Query DSL, every query sits under the top-level query element; here is an example of a match query comparing a field value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_search
{
  "query":
  {
    "match": {
      "SubscriberTypeCode": "10122"
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
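The same request body can be built programmatically before sending it with any HTTP client. A minimal Python sketch; the helper name is an illustration, not part of the article:

```python
import json

def match_query(field, value):
    """Build a Query DSL body for a single-field match query."""
    return {"query": {"match": {field: value}}}

# Serialize the body for an HTTP request to /subscriber/_search.
body = json.dumps(match_query("SubscriberTypeCode", "10122"))
```

The serialized body would be sent with Content-Type: application/json, exactly like the curl example in the next section.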





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  match
&lt;/h4&gt;

&lt;p&gt;Now, let's see the corresponding curl command for this Query DSL request. The URL no longer carries any id or field details; the query is specified in the request body instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -XGET "http://localhost:9200/subscriber/_search" -H 'Content-Type: application/json' -d'
{
  "query":
  {
    "match": {
      "SubscriberTypeCode": "10122"
    }
  }
}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's see the following query and its response below. &lt;strong&gt;The field value is sent in lower case, yet the query still returns results. This is because the query term goes through the same analysis process used at indexing time, which includes lower-casing, stemming to root words, and so on.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/subscriber/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ih-EX163759714"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and the response of the above query&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"took"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timed_out"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_shards"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"successful"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"skipped"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"failed"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hits"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"relation"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eq"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"max_score"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.21072102&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"hits"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_index"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"subscriber"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_type"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"149"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.21072102&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Category"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"CompanyName"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-10122"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ConvergentBillerId"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"12"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Created"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-11-22T16:05:40.890Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-EX163759714"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;149&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Language"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en-GB"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Login"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1@1.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"State"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"StateChangeDate"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-10-22T16:05:40.840Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"SubscriberTypeCode"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10121&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_index"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"subscriber"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_type"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"179"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.21072102&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Category"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"CompanyName"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-163759714"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ConvergentBillerId"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"16"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Created"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-11-22T16:05:40.890Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-EX163759714"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;179&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Language"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en-GB"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Login"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1@1.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"State"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"StateChangeDate"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-10-22T16:05:40.840Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"SubscriberTypeCode"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10122&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_index"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"subscriber"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_type"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"139"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.21072102&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Category"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"CompanyName"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-10122"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ConvergentBillerId"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"12"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Created"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-11-22T16:05:40.890Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-EX163759714"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;139&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Language"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en-GB"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Login"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1@1.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"State"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"StateChangeDate"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-10-22T16:05:40.840Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"SubscriberTypeCode"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10121&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_index"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"subscriber"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_type"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"169"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_score"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.21072102&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Category"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"CompanyName"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-163759714"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ConvergentBillerId"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"16"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Created"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-11-22T16:05:40.890Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IH-EX163759714"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Id"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;169&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Language"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en-GB"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Login"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1@1.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"State"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"StateChangeDate"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2021-10-22T16:05:40.840Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"SubscriberTypeCode"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10122&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Limiting search results
&lt;/h5&gt;

&lt;p&gt;&lt;strong&gt;Here, we limit the number of hits returned with the &lt;code&gt;size&lt;/code&gt; parameter.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The following query shows how we can implement pagination in a search query: &lt;code&gt;from&lt;/code&gt; skips that many hits, and &lt;code&gt;size&lt;/code&gt; limits the page length.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "match": {
      "status": "draft"
    }
  },
  "size": 5,
  "from": 10  
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
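&lt;p&gt;The &lt;code&gt;from&lt;/code&gt; offset for a given page is just &lt;code&gt;(page - 1) * size&lt;/code&gt;. A minimal Python sketch that builds the request body (the helper name &lt;code&gt;paginated_query&lt;/code&gt; is hypothetical, not part of any client library):&lt;/p&gt;

```python
def paginated_query(status: str, page: int, page_size: int = 5) -> dict:
    """Build an Elasticsearch search body for 1-based page numbers."""
    return {
        "query": {"match": {"status": status}},
        "size": page_size,                # hits per page
        "from": (page - 1) * page_size,  # offset of the first hit
    }

# Page 3 with 5 hits per page starts at offset 10, as in the query above.
body = paginated_query("draft", page=3, page_size=5)
```

&lt;p&gt;Keep in mind that deep offsets get expensive; by default Elasticsearch caps &lt;code&gt;from + size&lt;/code&gt; at 10,000 via &lt;code&gt;index.max_result_window&lt;/code&gt;.&lt;/p&gt;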



&lt;h5&gt;
  
  
  Limiting the number of elements/attributes
&lt;/h5&gt;

&lt;p&gt;Sometimes the input document has many fields and we may not need all of them in the search result. In Elasticsearch, each hit's document is returned in the &lt;code&gt;_source&lt;/code&gt; attribute, and we can limit which fields it contains as follows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;  
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There can also be scenarios where we want to exclude just one field; in that case we add an &lt;code&gt;excludes&lt;/code&gt; condition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"excludes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tags"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;  
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  match_all
&lt;/h4&gt;

&lt;p&gt;The &lt;code&gt;match_all&lt;/code&gt; query tells Elasticsearch to return all documents from the index. Within &lt;code&gt;_source&lt;/code&gt;, we specify which fields we need from each document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  multi_match
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;multi_match&lt;/code&gt; performs a full-text search for a query string across multiple fields of a document, as shown below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_search
{
  "query": {
    "multi_match": {
      "query": "ih",
      "fields": ["ExternalReference","CompanyName"]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;curl&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-XGET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:9200/subscriber/_search"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-H&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Content-Type:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;application/json'&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-d'&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"multi_match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ih"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"CompanyName"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  post_filter
&lt;/h4&gt;

&lt;p&gt;The &lt;code&gt;post_filter&lt;/code&gt; is applied to the search &lt;code&gt;hits&lt;/code&gt; at the very end of a search request,  after aggregations have already been calculated. Its purpose is best explained by example:&lt;/p&gt;

&lt;p&gt;Imagine that you are selling shirts that have the following properties:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;PUT /shirts
{
    "mappings": {
        "_doc": {
            "properties": {
                "brand": { "type": "keyword"},
                "color": { "type": "keyword"},
                "model": { "type": "keyword"}
            }
        }
    }
}

PUT /shirts/_doc/1?refresh
{
    "brand": "gucci",
    "color": "red",
    "model": "slim"
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Imagine a user has specified two filters:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;color:red&lt;/code&gt; and &lt;code&gt;brand:gucci&lt;/code&gt;.  You only want to show them red shirts made by Gucci in the search results.  Normally you would do this with a &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/6.8/query-dsl-bool-query.html" rel="noopener noreferrer"&gt;&lt;code&gt;bool&lt;/code&gt; query&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    }
  }
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, you would also like to use &lt;em&gt;faceted navigation&lt;/em&gt; to display a list of other options that the user could click on.  Perhaps you have a &lt;code&gt;model&lt;/code&gt; field that would allow the user to limit their search results to red Gucci &lt;code&gt;t-shirts&lt;/code&gt; or &lt;code&gt;dress-shirts&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This can be done with a &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-aggregations-bucket-terms-aggregation.html" rel="noopener noreferrer"&gt;&lt;code&gt;terms&lt;/code&gt; aggregation&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    }
  },
  "aggs": {
    "models": {
      "terms": { "field": "model" } 
    }
  }
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Returns the most popular models of red shirts by Gucci.&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;But perhaps you would also like to tell the user how many Gucci shirts are available in &lt;strong&gt;other colors&lt;/strong&gt;. If you just add a &lt;code&gt;terms&lt;/code&gt; aggregation on the &lt;code&gt;color&lt;/code&gt; field, you will only get back the color &lt;code&gt;red&lt;/code&gt;, because your query returns only red shirts by Gucci.&lt;/p&gt;

&lt;p&gt;Instead, you want to include shirts of all colors during aggregation, then apply the &lt;code&gt;colors&lt;/code&gt; filter only to the search results.  This is the purpose of the &lt;code&gt;post_filter&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "brand": "gucci" } 
      }
    }
  },
  "aggs": {
    "colors": {
      "terms": { "field": "color" } 
    },
    "color_red": {
      "filter": {
        "term": { "color": "red" } 
      },
      "aggs": {
        "models": {
          "terms": { "field": "model" } 
        }
      }
    }
  },
  "post_filter": { 
    "term": { "color": "red" }
  }
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;The main query now finds all shirts by Gucci, regardless of color.&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;colors&lt;/code&gt; agg returns popular colors for shirts by Gucci.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;color_red&lt;/code&gt; agg limits the &lt;code&gt;models&lt;/code&gt; sub-aggregation to &lt;strong&gt;red&lt;/strong&gt; Gucci shirts.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Finally, the &lt;code&gt;post_filter&lt;/code&gt; removes colors other than red from the search &lt;code&gt;hits&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  must and must_not conditions
&lt;/h4&gt;

&lt;p&gt;A &lt;code&gt;bool&lt;/code&gt; query lets us require one condition while excluding another. For example, to find all documents whose &lt;code&gt;ExternalReference&lt;/code&gt; contains &lt;code&gt;IH&lt;/code&gt; but whose &lt;code&gt;SubscriberTypeCode&lt;/code&gt; is not 10121, the first condition goes under &lt;code&gt;must&lt;/code&gt; and the second under &lt;code&gt;must_not&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/subscriber/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"must"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ih"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"must_not"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"SubscriberTypeCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"10121"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
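&lt;p&gt;The same &lt;code&gt;bool&lt;/code&gt; body can be assembled programmatically. A small Python sketch (the helper name &lt;code&gt;bool_query&lt;/code&gt; is an illustration, not part of any client library):&lt;/p&gt;

```python
def bool_query(must_field: str, must_value: str,
               not_field: str, not_value: str) -> dict:
    """Match one field while excluding documents that match another."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {must_field: must_value}}],
                "must_not": [{"match": {not_field: not_value}}],
            }
        }
    }

# Mirrors the request above: ExternalReference must match "ih",
# while SubscriberTypeCode 10121 is excluded.
body = bool_query("ExternalReference", "ih", "SubscriberTypeCode", "10121")
```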



&lt;h4&gt;
  
  
  sorting
&lt;/h4&gt;

&lt;p&gt;To sort the query result, we add a &lt;code&gt;sort&lt;/code&gt; clause. Notice that &lt;code&gt;sort&lt;/code&gt; is placed outside the &lt;code&gt;query&lt;/code&gt; object, because sorting is applied to the query's result set.&lt;/p&gt;

&lt;p&gt;Note: &lt;code&gt;text&lt;/code&gt; fields are analyzed and tokenized, so they cannot be sorted directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "match": {
      "status": "draft"
    }
  },
  "size": 10,
  "from": 3,
  "_source": ["title","status"],
  "sort": [
    {
      "published_date": {
        "order": "desc"
      }
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we have to sort on a text field, we need to use its &lt;code&gt;keyword&lt;/code&gt; counterpart, which is non-analyzed, though sorting this way can be slower.&lt;/p&gt;
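&lt;p&gt;As a sketch, the sort can point at the keyword sub-field. This assumes the index uses the default multi-field mapping that creates &lt;code&gt;title.keyword&lt;/code&gt; alongside the analyzed &lt;code&gt;title&lt;/code&gt; field:&lt;/p&gt;

```python
# Sort on the non-analyzed keyword sub-field instead of the text field.
# "title.keyword" assumes the default dynamic multi-field mapping.
body = {
    "query": {"match": {"status": "draft"}},
    "sort": [{"title.keyword": {"order": "asc"}}],
}
```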

&lt;h4&gt;
  
  
  Full match condition
&lt;/h4&gt;

&lt;p&gt;Here we perform a full match on the &lt;code&gt;title&lt;/code&gt; field: setting &lt;code&gt;"operator": "and"&lt;/code&gt; requires every term in the query string to match.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "match": {
      "title":{
        "query":"Introduction to elasticsearch"
        , "operator": "and"
      }
    }
  }

}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Minimum match condition
&lt;/h4&gt;

&lt;p&gt;Sometimes it is enough for a minimum number of the query terms to match; we don't need every word of the search text to be present.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Introduction to elasticsearch"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
        &lt;/span&gt;&lt;span class="nl"&gt;"minimum_should_match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also express the minimum match as a percentage instead of an exact number of terms.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Introduction to elasticsearch"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
        &lt;/span&gt;&lt;span class="nl"&gt;"minimum_should_match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"25%"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Term and match query
&lt;/h4&gt;

&lt;p&gt;A term query does not go through the analysis process, whereas a match query does. Let's understand the difference through the following examples.&lt;/p&gt;

&lt;p&gt;Both of the following match queries return the same set of records. This is because Elasticsearch stores "Draft" as "draft", thanks to the lowercase filter in the analysis process. &lt;strong&gt;Hence searching for either "Draft" or "draft" returns the same query results.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;


&lt;/span&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Draft"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's look at the term query. Since Elasticsearch applies a lowercase filter here, all text is stored in lower case, so:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the 1st query below will not return any data&lt;/li&gt;
&lt;li&gt;the 2nd query will return data
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "term": {
      "status": {
        "value": "Draft"
      }
    }
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "term": {
      "status": {
        "value": "draft"
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  terms filter
&lt;/h4&gt;

&lt;p&gt;This is used when we want to combine multiple term filters. Here "tags" is the field name, and "elasticsearch" and "fast" are the values to compare against. Note: this is an OR condition, so any document whose tags contain either "elasticsearch" or "fast" will be returned.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "terms": {
      "tags": [
        "elasticsearch",
        "fast"
      ]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  prefix query
&lt;/h4&gt;

&lt;p&gt;Remember that this query does not go through the analysis process, so the prefix must match the indexed (analyzed) terms exactly. It does not calculate relevance; every match gets a score of 1.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET  /blogposts/_search
{
  "query": {
    "prefix": {
      "title": {
        "value": "intro"
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  wildcard query
&lt;/h4&gt;

&lt;p&gt;This is similar to the prefix query, but the pattern may contain wildcards: &lt;code&gt;*&lt;/code&gt; matches any sequence of characters and &lt;code&gt;?&lt;/code&gt; matches a single character. Like the prefix query, it is not analyzed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET  /blogposts/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "n*ro"
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Complex Queries
&lt;/h4&gt;

&lt;p&gt;Let's take the following requirement and see how we can write the query.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the blogpost must not be in draft status&lt;/li&gt;
&lt;li&gt;the blogpost must have the tag elasticsearch&lt;/li&gt;
&lt;li&gt;the blogpost should have likes &amp;gt; 10&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's break this business problem into multiple phases.&lt;/p&gt;

&lt;h6&gt;
  
  
  Solution
&lt;/h6&gt;

&lt;ol&gt;
&lt;li&gt;First, we handle the condition that removes draft blogposts from the result.

&lt;ol&gt;
&lt;li&gt;We need a query here for searching&lt;/li&gt;
&lt;li&gt;Since there will be multiple conditions, we need a bool query&lt;/li&gt;
&lt;li&gt;We need to exclude blogposts in draft status, and this is a case-insensitive search, so we use must_not together with a match clause on the status field
&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;/blogposts/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"must_not"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"draft"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Next, we need to allow only blogposts that have elasticsearch in their tags.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;So we add a must clause to the bool query&lt;/li&gt;
&lt;li&gt;We then use a match clause, since we want a case-insensitive search
&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET  /blogposts/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "status": "draft"
          }
        }
      ],
      "must": [
        {
          "match": {
            "tags": "elasticsearch"
          }
        }
      ]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Now we need to add the greater-than condition.

&lt;ol&gt;
&lt;li&gt;We use a filter clause, since it further narrows the search results, much like a where clause, without affecting scoring.&lt;/li&gt;
&lt;li&gt;We add a range condition, since our requirement of more than 10 likes is range-based&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;So, our &lt;em&gt;final query&lt;/em&gt; will be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET  /blogposts/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "status": "draft"
          }
        }
      ],
      "must": [
        {
          "match": {
            "tags": "elasticsearch"
          }
        }
      ],
      "filter": [
        {"range": {
          "no_of_likes": {
            "gte": 10
          }
        }}
      ]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Query Relevance
&lt;/h3&gt;

&lt;p&gt;We can use the score value in the search result to understand how relevant a document is to the original query. The relevancy score is represented by a positive floating-point number. &lt;/p&gt;

&lt;p&gt;The scoring of a document is determined based on the field matches from the query specified and any additional configurations you apply to the search. &lt;/p&gt;

&lt;p&gt;Refer the article &lt;a href="https://www.compose.com/articles/how-scoring-works-in-elasticsearch/" rel="noopener noreferrer"&gt;How scoring works&lt;/a&gt; for more in-depth analysis on this.&lt;/p&gt;

&lt;p&gt;Document matching happens in two senses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a binary sense (matching against exact values)&lt;/li&gt;
&lt;li&gt;a relevancy sense (ranking by how well documents match)&lt;/li&gt;
&lt;/ul&gt;
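&lt;p&gt;To see how a score was computed for each hit, we can ask Elasticsearch for a scoring explanation. A minimal sketch against the same blogposts index, using the standard explain search option:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "explain": true,
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each hit then carries an _explanation object breaking its score down into the contributing factors.&lt;/p&gt;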

&lt;h4&gt;
  
  
  Scoring Function
&lt;/h4&gt;

&lt;p&gt;Scoring uses the following two factors. Intuitively, the more often a term appears in a document, the more relevant that document is; but the more often the term appears across all other documents, the less relevant it becomes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Term Frequency (TF)&lt;/li&gt;
&lt;li&gt;Inverse document Frequency (IDF)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fewn59ynfcglezdueyqfx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fewn59ynfcglezdueyqfx.png" alt="im" width="700" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-blbbd.nitrocdn.com%2FHXJwrQBMmMkQJKzvzMNlzWeAxjkYTiQA%2Fassets%2Fstatic%2Foptimized%2Frev-0fdaa1e%2Fwp-content%2Fuploads%2FTF_IDF-final-1024x399.png.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-blbbd.nitrocdn.com%2FHXJwrQBMmMkQJKzvzMNlzWeAxjkYTiQA%2Fassets%2Fstatic%2Foptimized%2Frev-0fdaa1e%2Fwp-content%2Fuploads%2FTF_IDF-final-1024x399.png.webp" alt="im" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When a document matches the query, Lucene calculates the score by  combining the score of each matching term. This scoring calculation is  done by the practical scoring function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;score(q,d)  =  
           queryNorm(q)  
          · coord(q,d)    
          · ∑ (           
                tf(t in d)   
              · idf(t)²      
              · t.getBoost() 
              · norm(t,d)    
            ) (t in q) 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;score(q,d) is the relevance score of document d for query q.&lt;/li&gt;
&lt;li&gt;queryNorm(q) is the query normalization factor.&lt;/li&gt;
&lt;li&gt;coord(q,d) is the coordination factor.&lt;/li&gt;
&lt;li&gt;The sum of the weights for each term t in the query q for document d. 

&lt;ul&gt;
&lt;li&gt;tf(t in d) is the term frequency for term t in document d.&lt;/li&gt;
&lt;li&gt;idf(t) is the inverse document frequency for term t.&lt;/li&gt;
&lt;li&gt;t.getBoost() is the boost that has been applied to the query.&lt;/li&gt;
&lt;li&gt;norm(t,d) is the field-length norm, combined with the index-time field-level boost, if any.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  Boosting
&lt;/h4&gt;

&lt;p&gt;The &lt;code&gt;_boost&lt;/code&gt; field (document-level boost) was removed, but &lt;a href="https://www.elastic.co/guide/en/elasticsearch/guide/current/practical-scoring-function.html#index-boost" rel="noopener noreferrer"&gt;field-level boosts&lt;/a&gt;, and &lt;a href="https://www.elastic.co/guide/en/elasticsearch/guide/current/query-time-boosting.html" rel="noopener noreferrer"&gt;query boosts&lt;/a&gt; still work just fine.  If we want to boost matches on &lt;code&gt;field1&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;Boosting can be of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Index time boosting

&lt;ul&gt;
&lt;li&gt;Stored at index time; the boost value can't be changed later without re-indexing&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;query time boosting

&lt;ul&gt;
&lt;li&gt;Defined at query time and can be changed at any time
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"should"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"terms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"field1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"67"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"93"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"73"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"78"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"88"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"77"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"boost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"terms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"field2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"68"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"94"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"72"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"76"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"82"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"96"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"70"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"86"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"81"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"92"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"74"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"91"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"85"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"terms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"cat2"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also boost like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A boost of 2 means the clause is twice as important compared to the others&lt;/li&gt;
&lt;li&gt;The boost figure is relative.&lt;/li&gt;
&lt;li&gt;It is tough to pick the correct boost value up front; finding the optimal relevancy score is an iterative process
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title":{
              "query": "Elasticsearch",
              "boost": 2
            }
          }
        },
        {
          "match": {
            "content":{
              "query": "Elasticsearch"

            }
          }
        }
      ]
    }
  }

}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Filter Search basics of Elasticsearch
&lt;/h2&gt;

&lt;p&gt;A filter condition allows us to add more conditions to the search criteria. The first condition here is a must clause, meaning it retrieves all documents where ExternalReference contains the text "ih". The next condition is a filter, applied on Status being greater than or equal to 1; since this is range-bound, a range filter is used.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The filter context is executed first, and then the query context, so that the filtered documents are available for the query clauses to search. Elasticsearch does not calculate a score for filter results; scores are only calculated in the query context.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/subscriber/_search&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"must"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"match"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"ExternalReference"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ih"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"filter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"range"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="nl"&gt;"gte"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A filter can also be based on a term, where we compare an exact value; a term filter is used here to compare the Status value. For a date range, we need to use a range filter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "ExternalReference": "ih"
          }
        }

      ]
      , "filter": [
        {
          "term": {
            "Status": "2"
          } 
        }
      ]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
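&lt;p&gt;For a date range, the filter might look like the following sketch; the PublishDate field name is hypothetical, while "now" is standard Elasticsearch date math:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "PublishDate": {
              "gte": "2021-01-01",
              "lte": "now"
            }
          }
        }
      ]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;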



&lt;h2&gt;
  
  
  Cluster Node
&lt;/h2&gt;

&lt;p&gt;A cluster has a master node, and any (master-eligible) node can become the master. Every node knows where each document lives. When an index is sharded, a given document within that index is stored in only one of the shards.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv5c6josjibsmbqj8siju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv5c6josjibsmbqj8siju.png" alt="im" width="800" height="462"&gt;&lt;/a&gt; &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjusvbf0hjjlq3z2g206h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjusvbf0hjjlq3z2g206h.png" alt="im" width="800" height="782"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are two main reasons why sharding is important, with the first one being that it &lt;strong&gt;allows you to split and thereby scale volumes of data&lt;/strong&gt;. So if you have growing amounts of data, you will not face a bottleneck  because you can always tweak the number of shards for a particular  index. The other reason why sharding is important, is that &lt;strong&gt;operations can be  distributed across multiple nodes and thereby parallelized&lt;/strong&gt;. This results in increased performance, because multiple machines can potentially  work on the same query. This is completely transparent to you as a user  of Elasticsearch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So how do you specify the number of shards an index has? You can optionally specify this at index creation time; if you don't, a default is used (5 in Elasticsearch versions before 7.0, and 1 from 7.0 onwards). This is sufficient in most cases, since it allows for a good amount of growth in data before you need to worry about adding additional shards&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuhcv4vm6bb5z3mqpqtit.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuhcv4vm6bb5z3mqpqtit.png" alt="im" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;
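&lt;p&gt;To see how the shards of an index are distributed across nodes, the _cat API can be used; a quick sketch (assuming an index named blogposts):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /_cat/shards/blogposts?v
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The output lists each shard, whether it is a primary or a replica, its state, and the node it lives on.&lt;/p&gt;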

&lt;h3&gt;
  
  
  Primary and Replica shard
&lt;/h3&gt;

&lt;p&gt;There are two types of shards: primaries and replicas. &lt;strong&gt;Each document in an index belongs to one primary shard&lt;/strong&gt;. The number of primary shards in an index is fixed at the time the index is created, but the number of replica shards can be changed at any time, without interrupting indexing or query operations.&lt;/p&gt;

&lt;p&gt;Elasticsearch uses the concept of the shard to subdivide the index  into multiple pieces and allows us to make one or more copies of index  shards called replicas. Please refer to this &lt;a href="https://stackoverflow.com/a/15705989/10348758" rel="noopener noreferrer"&gt;SO answer&lt;/a&gt; to get a detailed understanding of shards and replicas.&lt;/p&gt;

&lt;p&gt;To set the number of shards and replicas as index properties: here the number of primary shards will be 6, and each primary shard will have 2 replica shards.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /indexName

{
  "settings": {
    "index": {
      "number_of_shards": 6,
      "number_of_replicas": 2
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Some important tips for choosing the number of shards and replicas:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The number of shards cannot be changed after an index is created. If you later find it necessary to change the number of shards, then you will have to reindex all the documents again.&lt;/li&gt;
&lt;li&gt;To decide the number of shards, you will have to choose a starting point and then find the optimal size through testing with your data and queries.&lt;/li&gt;
&lt;li&gt;Replicas tend to improve search performance (not always). But, it is recommended to have at least 1 replica (so that data is preserved in case of hardware failure)&lt;/li&gt;
&lt;li&gt;Refer this &lt;a href="https://medium.com/@alikzlda/elasticsearch-cluster-sizing-and-performance-tuning-42c7dd54de3c" rel="noopener noreferrer"&gt;medium article&lt;/a&gt;, that states that number of nodes and number of shards (primary shard +  replicas), should be proportional to each other. This is important for  Elasticsearch to ensure proper load balancing.&lt;/li&gt;
&lt;li&gt;As stated in this &lt;a href="https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster" rel="noopener noreferrer"&gt;article&lt;/a&gt; it is recommended to keep the number of shards per node below 20 per GB heap it has configured.&lt;/li&gt;
&lt;li&gt;According to this &lt;a href="https://kb.objectrocket.com/elasticsearch/how-to-specify-the-number-of-shards-and-number-of-replicas-per-shard-in-elasticsearch" rel="noopener noreferrer"&gt;blog&lt;/a&gt;, when you’re planning for capacity, try to allocate shards at a rate of 150% to 300% (or about double) of the number of nodes you had when initially configuring your datasets&lt;/li&gt;
&lt;/ol&gt;
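&lt;p&gt;Since the replica count can be changed at any time, it can be adjusted on a live index through the index settings API; a sketch (indexName is a placeholder):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /indexName/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;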

&lt;p&gt;Replication serves two purposes, &lt;strong&gt;with the main one being to provide high availability in case nodes or shards fail&lt;/strong&gt;. For replication to be effective when something goes wrong, replica shards are &lt;em&gt;never&lt;/em&gt; allocated to the same node as their primary shards, which you can also see in the above diagram. This means that even if an entire node fails, at least one replica of every primary shard on that node survives elsewhere. The other purpose of replication, or perhaps a side benefit, is &lt;strong&gt;increased performance for search queries&lt;/strong&gt;. This is because searches can be executed on all replicas in parallel, meaning that replicas are part of the cluster's searching capabilities. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8t80a6zgia3ba8xrsc2x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8t80a6zgia3ba8xrsc2x.png" alt="im" width="800" height="652"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Replication between Primary and Replica
&lt;/h3&gt;

&lt;p&gt;So how does Elasticsearch keep everything in sync? Elasticsearch uses a model named &lt;em&gt;primary-backup&lt;/em&gt; for its data replication. &lt;/p&gt;

&lt;p&gt;Write operations (adding, updating, or removing documents) are sent to the primary shard. The primary shard is responsible for validating the operation and ensuring that it is well formed; this includes rejecting structurally invalid requests, such as trying to write a number into an object field. Once the operation has been accepted, the primary shard performs it locally. When the local operation completes, it is forwarded to each of the replica shards in the replication group; if the primary has multiple replicas, the operation is performed on them in parallel. Only when the operation has completed successfully on every replica, and each has responded to the primary shard, does the primary shard respond to the client that the operation has completed successfully. This is all illustrated in the below diagram.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl6qp6c9qv0kyt2iigrtk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl6qp6c9qv0kyt2iigrtk.png" alt="im" width="800" height="572"&gt;&lt;/a&gt;&lt;/p&gt;
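&lt;p&gt;How many shard copies must be active before a write is even attempted can be tuned per request with the &lt;code&gt;wait_for_active_shards&lt;/code&gt; parameter on the index API; a minimal sketch (the index name and document are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /blogposts/_doc/1?wait_for_active_shards=2
{
  "title": "Replication in action"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;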

&lt;h3&gt;
  
  
  Search Operations and Combining Results
&lt;/h3&gt;

&lt;p&gt;There are two phases to a search operation in Elasticsearch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;query all the matching documents from all the shards&lt;/li&gt;
&lt;li&gt;combine the results from all the shards on the coordinating node (the node that received the request)&lt;/li&gt;
&lt;/ul&gt;
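&lt;p&gt;For example, a simple search request like the following is fanned out to every shard of the index, and the coordinating node merges the per-shard hits into one sorted result set (the &lt;code&gt;blogposts&lt;/code&gt; index and &lt;code&gt;title&lt;/code&gt; field are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "query": {
    "match": { "title": "elasticsearch" }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;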

&lt;h4&gt;
  
  
  Distributed searching
&lt;/h4&gt;

&lt;p&gt;A search request can be accepted by any node in the Elasticsearch cluster. Each node in the cluster is aware of all the other nodes. The catch is that the node receiving the search request is, by default, unaware of where the data to be queried resides. Hence, the node has no choice but to broadcast the request to all the shards and then gather their responses into a globally sorted result set that it can return to the client.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdlz9d8kfz2ntbe1b8hwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdlz9d8kfz2ntbe1b8hwr.png" alt="im" width="800" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The process of determining which shard a particular document resides in is termed &lt;code&gt;routing&lt;/code&gt;. By default, routing is handled by Elasticsearch. The default routing scheme hashes the ID of a document and uses it as the partition key to find a shard; this applies both to user-provided IDs and to IDs randomly generated by Elasticsearch. Documents are assigned to shards by hashing the document ID, dividing the hash by the number of primary shards, and taking the remainder.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0h8tx73hk3d2r9l636q4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0h8tx73hk3d2r9l636q4.png" alt="im" width="800" height="300"&gt;&lt;/a&gt;&lt;/p&gt;
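&lt;p&gt;The routing value can also be supplied explicitly, so that related documents land on the same shard; the same routing value must then be passed when reading the document back. A minimal sketch (index, document, and routing value are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /blogposts/_doc/1?routing=user1
{
  "title": "Custom routing example"
}

GET /blogposts/_doc/1?routing=user1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;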

&lt;h2&gt;
  
  
  Index Components
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Manual Index Creation
&lt;/h3&gt;

&lt;p&gt;We can create an index manually; mainly we need to provide shard- and replica-related configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /blogposts/
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we can define the datatype of each index field, and it is better to always define datatypes explicitly rather than relying on Elasticsearch to infer them.&lt;/p&gt;

&lt;p&gt;Key datatypes are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;string (text)&lt;/li&gt;
&lt;li&gt;float&lt;/li&gt;
&lt;li&gt;double&lt;/li&gt;
&lt;li&gt;date&lt;/li&gt;
&lt;li&gt;boolean&lt;/li&gt;
&lt;li&gt;array (can only hold values of one datatype; it cannot mix types such as string and boolean)&lt;/li&gt;
&lt;li&gt;keyword
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /blogposts/_mapping
{
  "properties":{
    "title":{"type":"text"},
    "content": {"type": "text"},
    "published_date":{"type":"date"}
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response from Elasticsearch will be an acknowledgement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "acknowledged" : true
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What Are the Index Components?
&lt;/h3&gt;

&lt;p&gt;We can find the details of an index with a GET request on the index name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /subscriber/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;By default, Elasticsearch defines the index field mapping and its datatypes during the first document insert. If a value of a different datatype is later sent to any field, Elasticsearch will return an error. So it is always better to define the index mapping explicitly rather than letting Elasticsearch infer it.&lt;/strong&gt;&lt;/p&gt;
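&lt;p&gt;For instance, once &lt;code&gt;published_date&lt;/code&gt; has been mapped as a &lt;code&gt;date&lt;/code&gt;, indexing a non-date value into it is rejected (a sketch; the index and field names follow the earlier &lt;code&gt;blogposts&lt;/code&gt; example, and the exact error text may vary by version):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /blogposts/_doc/1
{
  "published_date": "not-a-date"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Elasticsearch typically answers such a request with a &lt;code&gt;mapper_parsing_exception&lt;/code&gt;.&lt;/p&gt;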

&lt;p&gt;The response will have the following three components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;aliases&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;settings, which contains the following information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;number_of_shards (Elasticsearch will set this by default to 1)&lt;/li&gt;
&lt;li&gt;provided_name&lt;/li&gt;
&lt;li&gt;index creation date&lt;/li&gt;
&lt;li&gt;uuid (Unique identifier of the index)&lt;/li&gt;
&lt;li&gt;number_of_replicas (Elasticsearch will set this by default to 1)
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    "settings" : {
        "index" : {
          "routing" : {
            "allocation" : {
              "include" : {
                "_tier_preference" : "data_content"
              }
            }
          },
          "number_of_shards" : "1",
          "provided_name" : "subscriber",
          "creation_date" : "1637599846785",
          "number_of_replicas" : "1",
          "uuid" : "L4jp0hosRq2Pjqk3hHxP5w",
          "version" : {
            "created" : "7150299"
          }
        }
      }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;mappings&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;defines how the documents and their fields are stored and indexed. The mapping is either generated automatically or defined explicitly. &lt;/li&gt;
&lt;li&gt;It will have all the field names of the documents we have indexed.&lt;/li&gt;
&lt;li&gt;When generated automatically, Elasticsearch detects field types from their values.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  "PostalCode" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "subscriber" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "Addresses" : {
          "properties" : {
            "City" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Country" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Created" : {
              "type" : "date"
            },
            "DefaultBilling" : {
              "type" : "boolean"
            },
            "DefaultHome" : {
              "type" : "boolean"
            },
            "DefaultPostal" : {
              "type" : "boolean"
            },
            "DefaultService" : {
              "type" : "boolean"
            },
            "DefaultShipping" : {
              "type" : "boolean"
            },
            "DefaultWork" : {
              "type" : "boolean"
            },
            "Id" : {
              "type" : "long"
            },
            "LineOne" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Modified" : {
              "type" : "date"
            },
            "Name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "PostalCode" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "ShipToName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "State" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Status" : {
              "type" : "long"
            },
            "StatusName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Verifiable" : {
              "type" : "boolean"
            },
            "Verified" : {
              "type" : "boolean"
            }
          }
        },
        "Category" : {
          "type" : "long"
        },
        "CompanyName" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "ConvergentBillerId" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "Created" : {
          "type" : "date"
        },
        "ExternalReference" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "Id" : {
          "type" : "long"
        },
        "Language" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "Login" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "State" : {
          "type" : "long"
        },
        "StateChangeDate" : {
          "type" : "date"
        },
        "Status" : {
          "type" : "long"
        },
        "Subscriber" : {
          "properties" : {
            "AdditionalProperties" : {
              "properties" : {
                "ExternalReference" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "Id" : {
                  "type" : "long"
                },
                "Name" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "Values" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "Category" : {
              "type" : "long"
            },
            "CompanyName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "ContactPreferences" : {
              "properties" : {
                "ContactEventType" : {
                  "type" : "long"
                },
                "ContactMethod" : {
                  "type" : "long"
                },
                "Id" : {
                  "type" : "long"
                },
                "OptIn" : {
                  "type" : "boolean"
                }
              }
            },
            "ConvergentBillerId" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Created" : {
              "type" : "date"
            },
            "EffectiveStartDate" : {
              "type" : "date"
            },
            "ExternalReference" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "HomeCountry" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Id" : {
              "type" : "long"
            },
            "InvoiceConfiguration" : {
              "properties" : {
                "HideZeroAmount" : {
                  "type" : "boolean"
                },
                "InvoiceDetailLevel" : {
                  "type" : "long"
                }
              }
            },
            "Language" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Login" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "State" : {
              "type" : "long"
            },
            "StateChangeDate" : {
              "type" : "date"
            },
            "StateName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "Status" : {
              "type" : "long"
            },
            "StatusName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "SubscriberCurrency" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "SubscriberTypeCode" : {
              "type" : "long"
            },
            "SubscriberTypeDetails" : {
              "properties" : {
                "AccountingMethod" : {
                  "type" : "long"
                },
                "BillCycle" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "BillCycleDay" : {
                  "type" : "long"
                },
                "BillCycleName" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "IsReadOnly" : {
                  "type" : "boolean"
                },
                "PaymentDueDays" : {
                  "type" : "long"
                },
                "PostpaidAccountNumber" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            },
            "TermsAndConditionsAccepted" : {
              "type" : "date"
            }
          }
        },
        "SubscriberTypeCode" : {
          "type" : "long"
        },
        "TermsAndConditionsAccepted" : {
          "type" : "date"
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "1",
        "provided_name" : "subscriber",
        "creation_date" : "1637599846785",
        "number_of_replicas" : "1",
        "uuid" : "L4jp0hosRq2Pjqk3hHxP5w",
        "version" : {
          "created" : "7150299"
        }
      }
    }
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Analyzer
&lt;/h2&gt;

&lt;p&gt;An &lt;em&gt;analyzer&lt;/em&gt;  — whether built-in or custom — is just a package which contains three lower-level building blocks: &lt;em&gt;character filters&lt;/em&gt;, &lt;em&gt;tokenizers&lt;/em&gt;, and &lt;em&gt;token filters&lt;/em&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Character filters&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;Character filters are used to preprocess the stream of characters before it is passed to the tokenizer. An example is the &lt;code&gt;html_strip&lt;/code&gt; character filter, which strips HTML elements from a text and replaces HTML entities with their decoded value (e.g. it replaces &lt;code&gt;&amp;amp;amp;&lt;/code&gt; with &amp;amp;).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol&gt;
&lt;li&gt;Tokenizers&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;A tokenizer receives a stream of characters, breaks it up into individual tokens (usually individual words), and outputs a stream of tokens.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol&gt;
&lt;li&gt;Token filters&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;Token filters accept a stream of tokens from a tokenizer and can modify tokens (e.g. lowercasing), delete tokens (e.g. removing stopwords), or add tokens (e.g. synonyms). An example is the lowercase token filter, which simply changes token text to lowercase. The ASCII folding token filter converts alphabetic, numeric, and symbolic characters that are not in the Basic Latin Unicode block (the first 127 ASCII characters) to their ASCII equivalent, if one exists; for example, the filter changes à to a.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.elastic.co%2Fguide%2Fen%2Felasticsearch%2Fclient%2Fnet-api%2Fcurrent%2Fanalysis-chain.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.elastic.co%2Fguide%2Fen%2Felasticsearch%2Fclient%2Fnet-api%2Fcurrent%2Fanalysis-chain.png" alt="im" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;There can be multiple token filters, as shown, and each token filter does different work. The query-time analyzer can also differ from the index-time analyzer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Characteristics of an analyzer&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An analyzer may have zero or more character filters, which are applied in order.&lt;/li&gt;
&lt;li&gt;An analyzer must have exactly one tokenizer.&lt;/li&gt;
&lt;li&gt;An analyzer may have zero or more token filters, which are applied in order.&lt;/li&gt;
&lt;li&gt;Tokenizer generates tokens, which will be passed on to the token filter and then eventually become terms in the &lt;a href="https://nlp.stanford.edu/IR-book/html/htmledition/a-first-take-at-building-an-inverted-index-1.html" rel="noopener noreferrer"&gt;inverted index&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Certain tokenizers like &lt;code&gt;ngram&lt;/code&gt; and &lt;code&gt;edge_ngram&lt;/code&gt; can generate lots of tokens, which can cause higher disk usage.&lt;/li&gt;
&lt;/ol&gt;
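&lt;p&gt;The three building blocks can be exercised together through the &lt;code&gt;_analyze&lt;/code&gt; API; a minimal sketch that chains a character filter, a tokenizer, and a token filter (the sample text is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST _analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "filter": ["lowercase"],
  "text": "The &amp;lt;b&amp;gt;QUICK&amp;lt;/b&amp;gt; Brown-Foxes"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;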

&lt;p&gt;The analyzer will affect how we search the text, but it won’t affect the content of the text itself. With this example, if we search for &lt;code&gt;let&lt;/code&gt;, Elasticsearch will still return the full text &lt;code&gt;Let’s build an autocomplete!&lt;/code&gt; instead of only &lt;code&gt;let&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fres.cloudinary.com%2Fpracticaldev%2Fimage%2Ffetch%2Fs--cGfpBdFn--%2Fc_limit%252Cf_auto%252Cfl_progressive%252Cq_auto%252Cw_880%2Fhttps%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ftgeb9kpb3s9lgonytewt.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fres.cloudinary.com%2Fpracticaldev%2Fimage%2Ffetch%2Fs--cGfpBdFn--%2Fc_limit%252Cf_auto%252Cfl_progressive%252Cq_auto%252Cw_880%2Fhttps%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ftgeb9kpb3s9lgonytewt.PNG" alt="im" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Elasticsearch’s Analyzer has three components you can modify depending on your use case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Character Filters&lt;/li&gt;
&lt;li&gt;Tokenizer&lt;/li&gt;
&lt;li&gt;Token Filter&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Character Filter
&lt;/h4&gt;

&lt;p&gt;The character filter can perform addition, removal, or replacement actions on the input text given to it.&lt;/p&gt;

&lt;p&gt;Here is a sample request&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST _analyze/?pretty
{
  "tokenizer": "standard",
  "char_filter": ["html_strip"],
  "text": "The &amp;lt;b&amp;gt; Auto-generation &amp;lt;/b&amp;gt; is a success"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -XPOST "http://localhost:9200/_analyze/?pretty" -H 'Content-Type: application/json' -d'
{
  "tokenizer": "standard",
  "char_filter": ["html_strip"],
  "text": "The &amp;lt;b&amp;gt; Auto-generation &amp;lt;/b&amp;gt; is a success"
}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response shows that the HTML tags have been removed, and the resulting tokens are as below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  "tokens" : [
    {
      "token" : "The",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 0
    },
    {
      "token" : "Auto",
      "start_offset" : 8,
      "end_offset" : 12,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 1
    },
    {
      "token" : "generation",
      "start_offset" : 13,
      "end_offset" : 23,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 2
    },
    {
      "token" : "is",
      "start_offset" : 29,
      "end_offset" : 31,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 3
    },
    {
      "token" : "a",
      "start_offset" : 32,
      "end_offset" : 33,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 4
    },
    {
      "token" : "success",
      "start_offset" : 34,
      "end_offset" : 41,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 5
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Tokenizer
&lt;/h4&gt;

&lt;p&gt;The input text, after its transformation by the character filter, is passed to the tokenizer. The tokenizer splits this input text into individual tokens (or terms) at specific characters. The default tokenizer in Elasticsearch is the standard tokenizer, which uses grammar-based tokenization.&lt;/p&gt;

&lt;p&gt;For the command above, the resulting tokens are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;“The”,”Auto”,”generation”,”is”,”a”,”success”
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The words are split wherever there is a whitespace character and also at hyphens (-).&lt;/p&gt;

&lt;h3&gt;
  
  
  Standard Analyzer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;the default analyzer of Elasticsearch&lt;/li&gt;
&lt;li&gt;uses the grammar-based tokenization specified in &lt;a href="https://unicode.org/reports/tr29/" rel="noopener noreferrer"&gt;https://unicode.org/reports/tr29/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;consists of

&lt;ul&gt;
&lt;li&gt;Tokenizer: Standard Tokenizer&lt;/li&gt;
&lt;li&gt;Token Filters: Lower Case Token Filter and Stop Token Filter (disabled by default)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
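&lt;p&gt;As with the other built-in analyzers, the standard analyzer can be exercised through the &lt;code&gt;_analyze&lt;/code&gt; API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST _analyze
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The above sentence would produce the following lowercased terms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;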

&lt;h3&gt;
  
  
  Simple Analyzer
&lt;/h3&gt;

&lt;p&gt;The simple analyzer breaks text into tokens at any non-letter character, such as numbers, spaces, hyphens and apostrophes, discards non-letter characters, and changes uppercase to lowercase. It does not remove stop words.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST _analyze
{
  "analyzer": "simple",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;simple&lt;/code&gt; analyzer parses the sentence and produces the following tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Whitespace analyzer
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;whitespace&lt;/code&gt; analyzer breaks text into terms whenever it encounters a whitespace character. The &lt;code&gt;whitespace&lt;/code&gt; analyzer is not configurable; if you need to customize it, you need to recreate it as a &lt;code&gt;custom&lt;/code&gt; analyzer and modify it, usually by adding token filters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;POST _analyze
{
  "analyzer": "whitespace",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above sentence would produce the following terms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ The, 2, QUICK, Brown-Foxes, jumped, over, the, lazy, dog's, bone. ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
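&lt;p&gt;To customize it, the built-in &lt;code&gt;whitespace&lt;/code&gt; analyzer can be recreated as a &lt;code&gt;custom&lt;/code&gt; analyzer and used as a starting point; a minimal sketch (the index and analyzer names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /whitespace_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt_whitespace": {
          "tokenizer": "whitespace",
          "filter": []
        }
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;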



&lt;h3&gt;
  
  
  Stop analyzer
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;stop&lt;/code&gt; analyzer is the same as the &lt;code&gt;simple&lt;/code&gt; analyzer but adds support for removing stop words. It defaults to using the &lt;code&gt;_english_&lt;/code&gt; stop words.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;POST _analyze
{
  "analyzer": "stop",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above sentence would produce the following terms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ quick, brown, foxes, jumped, over, lazy, dog, s, bone ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
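&lt;p&gt;The stop word list is configurable through the &lt;code&gt;stopwords&lt;/code&gt; parameter (or &lt;code&gt;stopwords_path&lt;/code&gt; for a file-based list); a minimal sketch with a custom list (the index and analyzer names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT /stop_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_stop_analyzer": {
          "type": "stop",
          "stopwords": ["the", "over"]
        }
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;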



&lt;h3&gt;
  
  
  Keyword analyzer
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;keyword&lt;/code&gt; analyzer is a “noop” analyzer which returns the entire input string as a single token.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;POST _analyze
{
  "analyzer": "keyword",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above sentence would produce the following single term:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ The 2 QUICK Brown-Foxes jumped over the lazy dog's bone. ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Fingerprint analyzer
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;fingerprint&lt;/code&gt; analyzer implements a &lt;a href="https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth#fingerprint" rel="noopener noreferrer"&gt;fingerprinting algorithm&lt;/a&gt; which is used by the OpenRefine project to assist in clustering. &lt;/p&gt;

&lt;p&gt;Input text is lowercased, normalized to remove extended characters, sorted, deduplicated and concatenated into a single token. If a stop word list is configured, stop words will also be removed.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;creates a fingerprint that can be used for duplicate detection&lt;/li&gt;
&lt;li&gt;input is lowercased, normalized to remove extended characters, sorted, de-duplicated and concatenated
&lt;/li&gt;
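&lt;p&gt;The fingerprint steps described above can be sketched in plain Python (a rough approximation for illustration, not the actual Lucene implementation):&lt;/p&gt;

```python
import re
import unicodedata

def fingerprint(text):
    # lowercase and strip accents (normalize extended characters)
    text = unicodedata.normalize("NFKD", text.lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # tokenize on word characters, de-duplicate, sort, concatenate
    tokens = re.findall(r"[a-z0-9]+", text)
    return " ".join(sorted(set(tokens)))
```

Applied to the sample sentence below, this sketch reproduces the same single token shown in the output.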

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST _analyze
{
  "analyzer": "fingerprint",
  "text": "Yes The customer is not honest. He is not reliable for company."
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "tokens" : [
    {
      "token" : "company customer for he honest is not reliable the yes",
      "start_offset" : 0,
      "end_offset" : 63,
      "type" : "fingerprint",
      "position" : 0
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Custom Analyzer
&lt;/h2&gt;

&lt;p&gt;For the custom analyzer, we have provided the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the analyzer name, set to &lt;code&gt;amit_custom_analyze&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;the analyzer type is set to &lt;code&gt;custom&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;char_filter&lt;/code&gt; is set to HTML stripping; an analyzer can have zero or more character filters, hence the array type&lt;/li&gt;
&lt;li&gt;the tokenizer is set to the standard one&lt;/li&gt;
&lt;li&gt;the token filter is set to &lt;code&gt;lowercase&lt;/code&gt; to convert tokens to lower case&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This custom analyzer can now be associated with any field of the index.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
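&lt;p&gt;Conceptually, the char_filter, tokenizer, and token filter stages run in sequence. A minimal Python sketch of that pipeline (an illustration only, not Elasticsearch code; the regexes use \x3c and \x3e escapes for the angle-bracket characters):&lt;/p&gt;

```python
import re

def custom_analyze(text):
    # char_filter: html_strip removes markup (\x3c is "less than", \x3e is "greater than")
    text = re.sub(r"\x3c[^\x3e]*\x3e", " ", text)
    # tokenizer: standard-like split on word characters
    tokens = re.findall(r"\w+", text)
    # token filter: lowercase each token
    return [tok.lower() for tok in tokens]
```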

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT  /blogposts
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "amit_custom_analyze":{
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase"]

        }
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "blogposts"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Custom Index with Custom Analyzer
&lt;/h2&gt;

&lt;p&gt;Here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the mapping defines each field and its allowed data type&lt;/li&gt;
&lt;li&gt;the settings define index metadata attributes
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT  /blogposts
{
   "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "amit_custom_analyze":{
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase"]

        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title":{"type": "text", "analyzer": "amit_custom_analyze"},
      "content":{"type": "text", "analyzer": "amit_custom_analyze"},
      "published_date":{"type": "date"},
      "no_of_likes":{"type": "text"}
    }
  }

}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "blogposts"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is an example of using the custom analyzer during the analysis process:&lt;/p&gt;

&lt;p&gt;The example shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTML content removed&lt;/li&gt;
&lt;li&gt;tokens converted to lower case
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST  /blogposts/_analyze
{
  "analyzer": "amit_custom_analyze",
  "text": ["This is &amp;lt;html&amp;gt; kayal exploring elasticsearch"]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "tokens" : [
    {
      "token" : "this",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 0
    },
    {
      "token" : "is",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 1
    },
    {
      "token" : "kayal",
      "start_offset" : 15,
      "end_offset" : 20,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 2
    },
    {
      "token" : "exploring",
      "start_offset" : 21,
      "end_offset" : 30,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 3
    },
    {
      "token" : "elasticsearch",
      "start_offset" : 31,
      "end_offset" : 44,
      "type" : "&amp;lt;ALPHANUM&amp;gt;",
      "position" : 4
    }
  ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How to create custom tokenizers and filters
&lt;/h2&gt;

&lt;p&gt;Here, I have shown an example of an index creation command where a custom tokenizer and char_filter have been defined.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;punctuation&lt;/code&gt; is the name of the tokenizer, and &lt;code&gt;symbol&lt;/code&gt; is the name of the char_filter&lt;/li&gt;
&lt;li&gt;the tokenizer splits the input wherever the pattern &lt;code&gt;@#&lt;/code&gt; occurs&lt;/li&gt;
&lt;li&gt;the &lt;code&gt;symbol&lt;/code&gt; char_filter maps the emoticons :) and :( to the words happy and sad
&lt;/li&gt;
&lt;/ul&gt;
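&lt;p&gt;The behaviour of these two custom components can be sketched in plain Python (an illustration of the mapping char_filter and pattern tokenizer, not Elasticsearch code):&lt;/p&gt;

```python
def emoji_analyze(text):
    # char_filter "symbol": map emoticons to words
    for old, new in ((":)", "happy"), (":(", "sad")):
        text = text.replace(old, new)
    # tokenizer "punctuation": split wherever the pattern "@#" occurs
    return [token for token in text.split("@#") if token]
```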

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PUT  /blogposts
{
   "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "amit_custom_analyze":{
          "type": "custom",
          "tokenizer": "punctuation",
          "char_filter":"symbol"
        }
      },
      "tokenizer": {
        "punctuation":{
          "type":"pattern",
          "pattern":"@#"
        }
      },
      "char_filter": {
        "symbol":{
          "type":"mapping",
          "mappings": [":) ==&amp;gt; happy",":( ==&amp;gt; sad"]
        }
      }
    }
  },
  "mappings": {
      "properties": {
      "title":{"type": "text", "analyzer": "amit_custom_analyze"},
      "content":{"type": "text", "analyzer": "amit_custom_analyze"},
      "published_date":{"type": "date"},
      "no_of_likes":{"type": "text"}
    }
    }

  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Aggregation
&lt;/h2&gt;

&lt;p&gt;Every aggregation is a combination of one or more buckets and metrics. An aggregation operates on the collection of documents obtained from a particular search query or filter. Aggregations form the main concept for building the desired visualizations in Kibana.&lt;/p&gt;

&lt;p&gt;Aggregations always work on top of the query and filter results, unless we use &lt;code&gt;post_filter&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Buckets
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A bucket mainly consists of a key and a set of documents. When the aggregation is executed, each document is placed in the respective bucket.&lt;/li&gt;
&lt;li&gt;Bucket aggregations categorize sets of documents as buckets. The type of bucket aggregation determines whether a given document falls into a bucket or not.&lt;/li&gt;
&lt;li&gt;Use bucket aggregations to implement faceted navigation (usually placed as a sidebar on a search result landing page) to help your users narrow down the results.&lt;/li&gt;
&lt;li&gt;Buckets are similar to SQL GROUP BY&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note: here the &lt;code&gt;tags&lt;/code&gt; field is a keyword field, not an analyzed one. If it were analyzed, we would need to aggregate on its nested raw (keyword) sub-field. &lt;/p&gt;

&lt;p&gt;We can have sub-aggregation within buckets also.&lt;/p&gt;
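&lt;p&gt;A bucket aggregation with a metric sub-aggregation behaves much like a GROUP BY with an aggregate function. A small Python sketch of the idea (illustrative documents, not an Elasticsearch query):&lt;/p&gt;

```python
from collections import defaultdict

docs = [
    {"tags": "aws", "likes": 10},
    {"tags": "aws", "likes": 20},
    {"tags": "elastic", "likes": 5},
]

# terms bucket on "tags", with an avg metric computed inside each bucket
buckets = defaultdict(list)
for doc in docs:
    buckets[doc["tags"]].append(doc["likes"])

result = {tag: {"doc_count": len(likes), "avg_likes": sum(likes) / len(likes)}
          for tag, likes in buckets.items()}
```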

&lt;p&gt;&lt;strong&gt;Metric aggregations&lt;/strong&gt;—This aggregation helps in calculating metrics from the field values of the aggregated documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline aggregations&lt;/strong&gt;—As the name suggests, this aggregation takes input from the output results of other aggregations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Matrix aggregations&lt;/strong&gt; (still in the development phase)—These aggregations work on more than one field and provide statistical results based on the documents covered by those fields.&lt;/p&gt;

&lt;p&gt;The steps are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Document&lt;/strong&gt; ==&amp;gt; &lt;strong&gt;Filter&lt;/strong&gt; ==&amp;gt; &lt;strong&gt;Query&lt;/strong&gt; ==&amp;gt; &lt;strong&gt;Query Result&lt;/strong&gt; ==&amp;gt; &lt;strong&gt;Aggregation&lt;/strong&gt; ==&amp;gt; &lt;strong&gt;Aggregation Result&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /blogposts/_search
{
  "aggs": {
    "tag_wise_stats": {
      "terms": {
        "field": "tags",
        "size": 10
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red"   }},
        { "term": { "brand": "gucci" }}
      ]
    }
  },
  "aggs": {
    "models": {
      "terms": { "field": "model" } 
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Metrics
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Metric aggregation mainly refers &lt;strong&gt;to the math calculations done on the documents present in a bucket&lt;/strong&gt;. For example, if you choose a numeric field, the metric calculations you can do on it include COUNT, SUM, MIN, MAX, AVERAGE, etc.&lt;/li&gt;
&lt;/ul&gt;
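&lt;p&gt;For a numeric field, these metric calculations are straightforward; a small Python sketch over illustrative values:&lt;/p&gt;

```python
likes = [10, 20, 5, 15]

# the usual metric aggregations over a numeric field
stats = {
    "count": len(likes),
    "sum": sum(likes),
    "min": min(likes),
    "max": max(likes),
    "avg": sum(likes) / len(likes),
}
```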

&lt;h2&gt;
  
  
  Logstash
&lt;/h2&gt;

&lt;p&gt;Logstash is typically used as the “processing” engine for any log management solution (or systems that &lt;br&gt;
deal with changing data streams). &lt;/p&gt;

&lt;p&gt;This data can be structured, semi-structured, or unstructured, and can have many different schemas. To Logstash, all of this data is “logs” containing “events”. Logstash can easily parse and filter the data from these log events using one or more of the filtering plugins that come with it. Finally, it can send the filtered output to one or more destinations; again, prebuilt output interfaces make this task simple.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcujcpi4yp0rnbk0xhexw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcujcpi4yp0rnbk0xhexw.png" alt="im" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vwu34c86prhaux29eif.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0vwu34c86prhaux29eif.png" alt="im" width="800" height="179"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffie4qn4j9v56cvj46k7f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffie4qn4j9v56cvj46k7f.png" alt="im" width="800" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Logstash itself doesn’t access the source system and collect the data, it uses &lt;em&gt;input plugins&lt;/em&gt; to ingest the data from various sources. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: There’s a multitude of input plugins available for Logstash, such as various log files, relational databases, NoSQL databases, Kafka queues, HTTP endpoints, S3 files, CloudWatch Logs, log4j events, or Twitter feeds. &lt;/p&gt;

&lt;p&gt;Once data is ingested, one or more &lt;em&gt;filter plugins&lt;/em&gt; take care of the processing part in the filter stage. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here is a sample Logstash config file that pushes data into an Elasticsearch host. It makes Logstash take JSON data from STDIN and push it into the blogposts index of an Elasticsearch instance running on localhost port 9200&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;input { stdin {codec =&amp;gt; json } }
output {
  elasticsearch { 
  hosts =&amp;gt; ["localhost:9200"]
  index =&amp;gt; "blogposts"
  document_type =&amp;gt; "_doc"
  }
  stdout { }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Filters are an in-line processing mechanism that provides the flexibility to slice and dice your data to fit your needs. Let’s take a look at some filters in action. The following configuration file sets up the &lt;code&gt;grok&lt;/code&gt; and &lt;code&gt;date&lt;/code&gt; filters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;stdin&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;grok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"message"&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"%{COMBINEDAPACHELOG}"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"timestamp"&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"dd/MMM/yyyy:HH:mm:ss Z"&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;elasticsearch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;hosts&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"localhost:9200"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;stdout&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;codec&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;rubydebug&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
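&lt;p&gt;The &lt;code&gt;date&lt;/code&gt; filter pattern above corresponds to Apache-style access log timestamps. The equivalent parse in Python, using a hypothetical sample timestamp:&lt;/p&gt;

```python
from datetime import datetime

# Logstash "dd/MMM/yyyy:HH:mm:ss Z" maps to this strptime format
ts = datetime.strptime("12/Oct/2021:14:30:00 +0000", "%d/%b/%Y:%H:%M:%S %z")
```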



&lt;h3&gt;
  
  
  Sample Filter Plugin
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczprsw1hti1eecb9slbq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczprsw1hti1eecb9slbq.jpg" alt="im" width="638" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Reference Architecture
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8vj11m0erwhj34twwm4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8vj11m0erwhj34twwm4.png" alt="im" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://coralogix.com/blog/42-elasticsearch-query-examples-hands-on-tutorial/" rel="noopener noreferrer"&gt;42 Elasticsearch Query Examples – Hands-on Tutorial&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>elasticsearch</category>
      <category>aws</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Demand Forecasting with AWS Forecast</title>
      <dc:creator>Amit Kayal</dc:creator>
      <pubDate>Sat, 02 Oct 2021 12:24:08 +0000</pubDate>
      <link>https://dev.to/aws-builders/demand-forecasting-with-aws-forecast-pc8</link>
      <guid>https://dev.to/aws-builders/demand-forecasting-with-aws-forecast-pc8</guid>
      <description>&lt;h2&gt;
  
  
  What is forecasting?
&lt;/h2&gt;

&lt;p&gt;A time series essentially is a series of quantitative values. These values are obtained over time, and often have equal time intervals between them. These intervals can be quite different and may consist of yearly, quarterly, monthly or hourly buckets for instance.&lt;/p&gt;

&lt;p&gt;Time-series methods are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Moving Average&lt;/li&gt;
&lt;li&gt;Autoregression&lt;/li&gt;
&lt;li&gt;Vector Autoregression&lt;/li&gt;
&lt;li&gt;Autoregressive Integrated Moving Average&lt;/li&gt;
&lt;li&gt;Autoregressive Moving Average&lt;/li&gt;
&lt;/ol&gt;
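&lt;p&gt;The simplest of these, a moving average, can be computed in a few lines of Python (illustrative demand values):&lt;/p&gt;

```python
# simple moving average with a window of 3 over a demand series
series = [10, 12, 9, 14, 13, 15]
window = 3
sma = [sum(series[i - window + 1 : i + 1]) / window
       for i in range(window - 1, len(series))]
```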

&lt;h2&gt;
  
  
  Components of Time Series
&lt;/h2&gt;

&lt;p&gt;Key classifications of the components of the time series are: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Random or Irregular movements.&lt;/li&gt;
&lt;li&gt;Cyclic Variations.&lt;/li&gt;
&lt;li&gt;Seasonal Variations.&lt;/li&gt;
&lt;li&gt;Trend.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Why Amazon Forecast
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Fully managed service that uses machine learning to deliver highly accurate forecasts. &lt;/li&gt;
&lt;li&gt;Based on the same machine learning forecasting technology used by Amazon.com.&lt;/li&gt;
&lt;li&gt;Automated machine learning

&lt;ul&gt;
&lt;li&gt;Includes AutoML capabilities that take care of the machine learning for you.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Works with any historical time series data to create accurate forecasts

&lt;ul&gt;
&lt;li&gt;In a retail scenario, Amazon Forecast uses machine learning to process your time series data (such as price, promotions, and store traffic) and combines that with associated data (such as product features, floor placement, and store locations) to determine the complex relationships between them. &lt;/li&gt;
&lt;li&gt;So, it can combine time series data with additional variables for time series prediction&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  How Amazon Forecast works
&lt;/h2&gt;

&lt;p&gt;Here is the flow diagram taken from the AWS site; the key points to note are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Historical data can be pushed into S3&lt;/li&gt;
&lt;li&gt;Forecast can be triggered on data arrival in S3&lt;/li&gt;
&lt;li&gt;Output of forecast can be pushed into S3 for further actions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddpimacthdqh43gkj2l7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddpimacthdqh43gkj2l7.png" alt="im" width="800" height="318"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Forecasting automation with Amazon Forecast by applying MLOps
&lt;/h2&gt;

&lt;p&gt;The following model architecture taken from AWS site allows us to build, train, and deploy a time-series forecasting model leveraging an MLOps pipeline encompassing Amazon Forecast, &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt;, and &lt;a href="https://aws.amazon.com/step-functions/" rel="noopener noreferrer"&gt;AWS Step Functions&lt;/a&gt;. To visualize the generated forecast, we will use a combination of AWS serverless analytics services such as &lt;a href="https://aws.amazon.com/athena/" rel="noopener noreferrer"&gt;Amazon Athena&lt;/a&gt; and &lt;a href="https://aws.amazon.com/quicksight/" rel="noopener noreferrer"&gt;Amazon QuickSight&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxr2bsq8oatvy8za5k132.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxr2bsq8oatvy8za5k132.png" alt="im" width="800" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Key components of this architecture are&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the dataset is uploaded to Amazon S3 cloud storage under the &lt;code&gt;/train&lt;/code&gt; directory (prefix).&lt;/li&gt;
&lt;li&gt;the uploaded file triggers Lambda, which initiates the MLOps pipeline built using a Step Functions state machine.&lt;/li&gt;
&lt;li&gt;The state machine stitches together a series of Lambda functions to build, train, and deploy an ML model in Amazon Forecast. &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/cloudwatch/" rel="noopener noreferrer"&gt;Amazon CloudWatch&lt;/a&gt;, which captures Forecast metrics, is used for log analysis&lt;/li&gt;
&lt;li&gt;SNS is used for forecasting job status change notifications

&lt;ul&gt;
&lt;li&gt;the final forecasts become available in the source Amazon S3 bucket in the &lt;code&gt;/forecast&lt;/code&gt; directory.&lt;/li&gt;
&lt;li&gt;the ML pipeline saves any old forecasts in the &lt;code&gt;/history&lt;/code&gt; directory.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Forecast Workflow
&lt;/h2&gt;

&lt;p&gt;The workflow for generating forecasts consists of the following steps.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Creating related datasets and a dataset group&lt;/li&gt;
&lt;li&gt;Retrieving training data&lt;/li&gt;
&lt;li&gt;Training predictors (trained model) using an algorithm or AutoML&lt;/li&gt;
&lt;li&gt;Evaluating the predictor with metrics&lt;/li&gt;
&lt;li&gt;Creating a forecast&lt;/li&gt;
&lt;li&gt;Retrieving forecast for users&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Amazon Forecast supports the following dataset domains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/retail-domain.html" rel="noopener noreferrer"&gt;RETAIL Domain&lt;/a&gt; – For retail demand forecasting&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/inv-planning-domain.html" rel="noopener noreferrer"&gt;INVENTORY_PLANNING Domain&lt;/a&gt; – For supply chain and inventory planning&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/ec2-capacity-domain.html" rel="noopener noreferrer"&gt;EC2 CAPACITY Domain&lt;/a&gt; – For forecasting Amazon Elastic Compute Cloud (Amazon EC2) capacity&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/workforce-domain.html" rel="noopener noreferrer"&gt;WORK_FORCE Domain&lt;/a&gt; – For work force planning&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/webtraffic-domain.html" rel="noopener noreferrer"&gt;WEB_TRAFFIC Domain&lt;/a&gt; – For estimating future web traffic&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/metrics-domain.html" rel="noopener noreferrer"&gt;METRICS Domain&lt;/a&gt; – For forecasting metrics, such as revenue and cash flow&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/custom-domain.html" rel="noopener noreferrer"&gt;CUSTOM Domain&lt;/a&gt; – For all other types of time-series forecasting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example 1: Dataset Types in the RETAIL Domain&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you are a retailer interested in forecasting demand for items, you might create the following datasets in the RETAIL domain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Target time series is the required dataset of historical time-series demand (sales) data for each item (each product a retailer sells). In the RETAIL domain, this dataset type requires that the dataset includes the &lt;code&gt;item_id&lt;/code&gt;, &lt;code&gt;timestamp&lt;/code&gt;, and the &lt;code&gt;demand&lt;/code&gt; fields. The &lt;code&gt;demand&lt;/code&gt; field is the forecast target, and is typically the number of items sold by the retailer in a particular week or day.&lt;/li&gt;
&lt;li&gt;Optionally, a dataset of the related time series type. In the RETAIL domain, this type can include optional, but suggested, time-series information such as &lt;code&gt;price&lt;/code&gt;, &lt;code&gt;inventory_onhand&lt;/code&gt;, and &lt;code&gt;webpage_hits&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Optionally, a dataset of the item metadata type. In the RETAIL domain, Amazon Forecast suggests providing metadata information related to the items that you provided in target time series, such as &lt;code&gt;brand&lt;/code&gt;, &lt;code&gt;color&lt;/code&gt;, &lt;code&gt;category&lt;/code&gt;, and &lt;code&gt;genre&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
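&lt;p&gt;A minimal target time series for the RETAIL domain could be assembled like this in pandas (hypothetical item ids and demand values):&lt;/p&gt;

```python
import pandas as pd

# the three fields required by the RETAIL domain target time series
tts = pd.DataFrame({
    "item_id": ["sku-1", "sku-1", "sku-2"],
    "timestamp": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-01"]),
    "demand": [3.0, 5.0, 2.0],
})
```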

&lt;h2&gt;
  
  
  A case study
&lt;/h2&gt;

&lt;p&gt;I took the dataset from the Kaggle &lt;strong&gt;Store Item Demand Forecasting Challenge&lt;/strong&gt;, which provides 5 years of store-item sales data and asks competitors to predict 3 months of sales for 50 different items at 10 different stores.&lt;/p&gt;

&lt;p&gt;Here is how I used Amazon Forecast with minimal coding.&lt;/p&gt;

&lt;h3&gt;
  
  
  Import your data
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Dataset Details
&lt;/h4&gt;

&lt;p&gt;The following details need to be provided.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dataset name&lt;/li&gt;
&lt;li&gt;Frequency of your data&lt;/li&gt;
&lt;li&gt;Data schema

&lt;ul&gt;
&lt;li&gt;I have used the schema builder option here, which is more visual. The other option is a JSON schema, which allows us to specify AttributeName and AttributeType in JSON format. &lt;/li&gt;
&lt;li&gt;The Forecast data schema has the concept of a domain to make dataset creation much easier. I selected the retail domain option here, and Forecast guided me to use the following attributes.&lt;/li&gt;
&lt;li&gt;item_id (attribute type: string) - &lt;strong&gt;Mandatory by forecast&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;timestamp(attribute type: timestamp and have selected format as yyyy-mm-dd) - &lt;strong&gt;Mandatory by forecast&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;demand(attribute type float) - &lt;strong&gt;Mandatory by forecast&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;store (attribute type: string) - Had to add this as my forecast has to be based on timestamp, store id and item id.&lt;/li&gt;
&lt;li&gt;It is essential that &lt;strong&gt;All attributes displayed must exist in your CSV file and must be ordered in the same order that they appear in your CSV file&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;I have used the following Python code to process my dataset from Kaggle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;train_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;train_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;train_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                                     &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%Y%m%d %H:%M:%S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;train_df_final&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;train_df_final&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;item_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;demand&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;store&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;train_df_final&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train_df.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                     &lt;span class="n"&gt;date_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%Y-%m-%d&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Upload dataset into AWS S3
&lt;/h4&gt;

&lt;p&gt;Create an AWS S3 bucket, and upload the time-series data into the bucket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bucketName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data/item-demand-time.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bucketName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
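&lt;p&gt;The snippet above assumes an existing boto3 S3 client (for example &lt;code&gt;s3 = boto3.client("s3")&lt;/code&gt;) and pre-defined &lt;code&gt;bucketName&lt;/code&gt;/&lt;code&gt;key&lt;/code&gt; variables. The import step later needs the uploaded object's S3 URI as its data location; a minimal sketch of assembling it (the bucket name here is a hypothetical placeholder):&lt;/p&gt;

```python
def s3_data_location(bucket: str, key: str) -> str:
    """Build the s3:// URI that Amazon Forecast expects as the data location."""
    return f"s3://{bucket}/{key}"

# Hypothetical names mirroring the upload snippet above
bucket_name = "my-forecast-bucket"
key = "data/item-demand-time.csv"
print(s3_data_location(bucket_name, key))  # s3://my-forecast-bucket/data/item-demand-time.csv
```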



&lt;h4&gt;
  
  
  Dataset import details
&lt;/h4&gt;

&lt;p&gt;The following details need to be provided for the import task.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dataset import name&lt;/li&gt;
&lt;li&gt;Select time zone

&lt;ul&gt;
&lt;li&gt;My dataset does not include a time zone variable, so I selected the option not to use a time zone.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Data location

&lt;ul&gt;
&lt;li&gt;The path of the input file in my S3 bucket, which needs to be provided here.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;IAM Role

&lt;ul&gt;
&lt;li&gt;Dataset groups require permissions from IAM to read your dataset files in S3. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The console then offers the option to import; once the job completes, we should see &lt;strong&gt;Successfully imported your data.&lt;/strong&gt;&lt;/p&gt;
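&lt;p&gt;If you drive the same import through the API instead of the console, the details above map onto the parameters of boto3's &lt;code&gt;create_dataset_import_job&lt;/code&gt; call. A sketch of assembling them (the names and ARNs below are hypothetical placeholders):&lt;/p&gt;

```python
def build_import_job_params(job_name, dataset_arn, s3_path, role_arn,
                            timestamp_format="yyyy-MM-dd"):
    """Assemble keyword arguments for forecast.create_dataset_import_job().
    The S3 path and IAM role correspond to the 'Data location' and
    'IAM Role' fields in the console."""
    return {
        "DatasetImportJobName": job_name,
        "DatasetArn": dataset_arn,
        "DataSource": {"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
        "TimestampFormat": timestamp_format,  # matches the daily yyyy-MM-dd data
    }
```

The resulting dict would then be passed as `forecast.create_dataset_import_job(**params)`.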

&lt;h4&gt;
  
  
  Train a predictor
&lt;/h4&gt;

&lt;p&gt;Train a predictor: a custom model, with underlying infrastructure managed by Amazon Forecast, that is trained on your datasets.&lt;br&gt;
The key parameters required here are as follows.&lt;/p&gt;
&lt;h5&gt;
  
  
  Additional configurations to be set during this phase include
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;How is the training dataset split into training and testing sets?&lt;/li&gt;
&lt;li&gt;How is missing data handled?&lt;/li&gt;
&lt;li&gt;How is model validation performed (i.e., the backtest window in time-series terms)?&lt;/li&gt;
&lt;li&gt;How many times is model validation performed during training (i.e., the number of backtest windows)?&lt;/li&gt;
&lt;li&gt;What is the forecast horizon?&lt;/li&gt;
&lt;/ul&gt;
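&lt;p&gt;To make the backtest-window questions concrete, here is a rough sketch (my illustration, not Forecast's internal code) of how a series can be split into several backtest windows, each the length of the forecast horizon, working backwards from the end of the series:&lt;/p&gt;

```python
def backtest_windows(n_points: int, horizon: int, n_windows: int):
    """Index ranges [start, end) of each held-out backtest window.
    Training for a given window uses everything before `start`, so each
    window validates the model on data it has not seen."""
    windows = []
    for w in range(n_windows):
        end = n_points - w * horizon
        windows.append((end - horizon, end))
    return windows

# 100 daily points, 10-day horizon, 2 backtest windows
print(backtest_windows(100, 10, 2))  # [(90, 100), (80, 90)]
```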
&lt;h5&gt;
  
  
  Predictor settings
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Forecast name&lt;/li&gt;
&lt;li&gt;Forecast horizon

&lt;ul&gt;
&lt;li&gt;This number tells Amazon Forecast how far into the future to predict your data at the specified forecast frequency.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Forecast frequency

&lt;ul&gt;
&lt;li&gt;My dataset has daily timestamps, so I set this to 1 day.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
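&lt;p&gt;The horizon and frequency together determine which future dates the forecast covers. For a daily frequency like mine, a small illustration:&lt;/p&gt;

```python
from datetime import date, timedelta

def forecast_dates(last_observation: date, horizon: int):
    """Dates covered by a forecast with daily frequency and the given horizon."""
    return [last_observation + timedelta(days=i) for i in range(1, horizon + 1)]

# A horizon of 3 after the last observed day covers the next 3 calendar days
print(forecast_dates(date(2015, 1, 1), 3))
```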
&lt;h5&gt;
  
  
  Predictor details
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Algorithm selection

&lt;ul&gt;
&lt;li&gt;Here I selected &lt;strong&gt;AutoML&lt;/strong&gt;, which lets Amazon Forecast choose the right algorithm for the dataset.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Optimization metric

&lt;ul&gt;
&lt;li&gt;I kept the default here.&lt;/li&gt;
&lt;li&gt;Amazon Forecast provides Root Mean Square Error (RMSE), Weighted Quantile Loss (wQL), Average Weighted Quantile Loss (Average wQL), Mean Absolute Scaled Error (MASE), Mean Absolute Percentage Error (MAPE), and Weighted Absolute Percentage Error (WAPE) metrics to evaluate your predictors.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Forecast dimensions&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Item ID is used in training by default and is mandatory in Forecast. I additionally selected Store, because my aim is to forecast by store and item ID.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Forecast type&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose up to 5 quantiles between 0.01 and 0.99 (in increments of 0.01). By default, AWS provides 0.1, 0.5, and 0.9.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
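&lt;p&gt;Quantile (probabilistic) forecasts deserve a brief illustration: a p90 forecast is a value that actual demand is expected to fall below 90% of the time, which is what you would stock against when stock-outs are costly. A small stdlib sketch of extracting p10/p50/p90 from a sample of possible demand values (the samples are hypothetical):&lt;/p&gt;

```python
def quantile(sorted_samples, q):
    """Linear-interpolation quantile of a pre-sorted list of samples."""
    idx = q * (len(sorted_samples) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(sorted_samples) - 1)
    frac = idx - lo
    return sorted_samples[lo] * (1 - frac) + sorted_samples[hi] * frac

# Hypothetical sampled demand values for one item/day
samples = sorted(range(1, 101))
p10, p50, p90 = (quantile(samples, q) for q in (0.10, 0.50, 0.90))
print(p10, p50, p90)
```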
&lt;h5&gt;
  
  
  Advanced Configuration
&lt;/h5&gt;

&lt;p&gt;This is the default FeaturizationMethod recommended by Amazon Forecast. It describes how a dataset field is featurized (transformed).&lt;/p&gt;

&lt;p&gt;Here the method is applied only to my demand field, which is specified by AttributeName.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"AttributeName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"demand"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"FeaturizationPipeline"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"FeaturizationMethodName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"filling"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"FeaturizationMethodParameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"aggregation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sum"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"frontfill"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"none"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"middlefill"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"zero"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"backfill"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"zero"&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
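&lt;p&gt;As a sketch of what this pipeline does (my reading of the filling parameters, not Forecast's actual implementation): gaps before an item's first observation are left empty (&lt;code&gt;frontfill: none&lt;/code&gt;), while gaps in the middle of the series and after its last observation are filled with zeros:&lt;/p&gt;

```python
def fill_series(obs: dict, timeline: list):
    """Apply frontfill='none', middlefill='zero', backfill='zero' to one item.
    `obs` maps timestamp -> demand; `timeline` is the full ordered grid of
    timestamps for the dataset."""
    observed = [t for t in timeline if t in obs]
    first = observed[0]
    out = {}
    for t in timeline:
        if t in obs:
            out[t] = obs[t]          # keep real observations
        elif t < first:
            continue                 # frontfill 'none': leave leading gaps empty
        else:
            out[t] = 0.0             # middlefill/backfill 'zero'
    return out

# Days 1..5, observations only on days 2 and 4
print(fill_series({2: 3.0, 4: 5.0}, [1, 2, 3, 4, 5]))
```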



&lt;h5&gt;
  
  
  Supplementary features
&lt;/h5&gt;

&lt;p&gt;Supplementary features are quite crucial and can materially affect the business problem.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Weather info

&lt;ul&gt;
&lt;li&gt;Amazon Forecast Weather Index combines multiple weather metrics from historical weather events and current forecasts at a given location to increase your demand forecast model accuracy.&lt;/li&gt;
&lt;li&gt;In retail inventory management use cases, day-to-day weather variation impacts foot traffic and product mix. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Holiday info&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
Create a Forecast
&lt;/h4&gt;

&lt;p&gt;Once the predictor is trained, use it to generate the forecast.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;create_forecast_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;forecast&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_forecast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                           &lt;span class="n"&gt;ForecastName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;forecastName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                           &lt;span class="n"&gt;PredictorArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;predictorArn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
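&lt;p&gt;&lt;code&gt;create_forecast&lt;/code&gt; is asynchronous, so in practice you poll the matching describe call until the status becomes ACTIVE before querying or exporting. A generic sketch, where &lt;code&gt;describe&lt;/code&gt; would typically be a small wrapper such as &lt;code&gt;lambda: forecast.describe_forecast(ForecastArn=forecast_arn)&lt;/code&gt;:&lt;/p&gt;

```python
import time

def wait_until_active(describe, poll_seconds=30, max_polls=120):
    """Poll an AWS 'describe' callable until its Status is ACTIVE.
    `describe` must return a dict containing a 'Status' key."""
    for _ in range(max_polls):
        status = describe()["Status"]
        if status == "ACTIVE":
            return status
        if "FAILED" in status:
            raise RuntimeError(f"resource creation failed: {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("resource did not become ACTIVE in time")
```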



&lt;p&gt;The following key inputs need to be provided from the console, after which the forecast process can start.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forecast name&lt;/li&gt;
&lt;li&gt;Predictor (the one created in the earlier step)&lt;/li&gt;
&lt;li&gt;Forecast types 

&lt;ul&gt;
&lt;li&gt;By default, Amazon Forecast will generate forecasts for 0.10, 0.50 and 0.90 quantiles.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  Make Forecasts
&lt;/h4&gt;

&lt;p&gt;Now we are ready to make forecasts. In our case, we write the forecast outputs back to the S3 bucket.&lt;/p&gt;

&lt;p&gt;The following inputs need to be provided.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start date&lt;/li&gt;
&lt;li&gt;End date&lt;/li&gt;
&lt;/ul&gt;
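&lt;p&gt;Writing the forecast outputs back to S3 corresponds to boto3's &lt;code&gt;create_forecast_export_job&lt;/code&gt; call. A sketch of its parameters (names and ARNs below are hypothetical placeholders):&lt;/p&gt;

```python
def build_export_params(export_name, forecast_arn, s3_path, role_arn):
    """Assemble keyword arguments for forecast.create_forecast_export_job(),
    which writes the forecast CSVs to the given S3 destination."""
    return {
        "ForecastExportJobName": export_name,
        "ForecastArn": forecast_arn,
        "Destination": {"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
    }
```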

&lt;p&gt;Below is a snapshot of the forecasts generated with the predictor I trained.&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcv564bkkfw97uzk1csfq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcv564bkkfw97uzk1csfq.png" alt="im" width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/amazon-forecast-weather-index-automatically-include-local-weather-to-increase-your-forecasting-model-accuracy/" rel="noopener noreferrer"&gt;Amazon Forecast Weather Index – automatically include local weather to increase your forecasting model accuracy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/machine-learning/tailor-and-prepare-your-data-for-amazon-forecast/" rel="noopener noreferrer"&gt;Prepare and clean your data for Amazon Forecast&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/forecast/latest/dg/howitworks-domains-ds-types.html" rel="noopener noreferrer"&gt;Predefined Dataset Domains and Dataset Types&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>deeplearning</category>
    </item>
  </channel>
</rss>
