<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Evaldas Buinauskas</title>
    <description>The latest articles on DEV Community by Evaldas Buinauskas (@buinauskas).</description>
    <link>https://dev.to/buinauskas</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F38132%2F185ef9c8-8948-4057-a1d6-23819249a158.jpg</url>
      <title>DEV Community: Evaldas Buinauskas</title>
      <link>https://dev.to/buinauskas</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/buinauskas"/>
    <language>en</language>
    <item>
      <title>Vinted Search Scaling Chapter 6: 4th generation of Elasticsearch metrics</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Wed, 22 Dec 2021 06:39:59 +0000</pubDate>
      <link>https://dev.to/vinted/vinted-search-scaling-chapter-6-4th-generation-of-elasticsearch-metrics-2ga7</link>
      <guid>https://dev.to/vinted/vinted-search-scaling-chapter-6-4th-generation-of-elasticsearch-metrics-2ga7</guid>
      <description>&lt;p&gt;An old fact from 2014 December 12th was significant for Vinted: the company switched from &lt;a href="http://sphinxsearch.com/" rel="noopener noreferrer"&gt;Sphinx&lt;/a&gt; search engine to Elasticsearch 1.4.1. At the time of writing this post, we use Elasticsearch 7.15. Without a doubt, a lot has happened in between. This chapter will focus on Elasticsearch metrics, we will share our accumulated experiences from four generations of collecting metrics.&lt;/p&gt;

&lt;p&gt;Managing search requires both product and engineering efforts – two complementary parts. At Vinted, search engineers work with product, maintain a sound infrastructure, set up a &lt;a href="https://dev.to/vinted/vinted-search-scaling-chapter-1-indexing-4ii1"&gt;scalable indexing pipeline&lt;/a&gt; and scale &lt;a href="https://dev.to/vinted/vinted-search-scaling-chapter-5-divide-and-conquer-m7f"&gt;search throughput&lt;/a&gt;; none of this is attainable without proper metrics.&lt;/p&gt;

&lt;p&gt;The first generation of metrics was collected by parsing the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/cat.html" rel="noopener noreferrer"&gt;/_cat APIs&lt;/a&gt; with Ruby scripts. Nothing sophisticated: we ran Elasticsearch 1.x–2.x, and monitoring was done on demand with Elasticsearch plugins such as ElasticHQ.&lt;/p&gt;

&lt;p&gt;The second generation ran on the &lt;a href="https://sensu.io/" rel="noopener noreferrer"&gt;Sensu&lt;/a&gt; observability pipeline, with Graphite as the persistence layer for the collected Elasticsearch metrics. We used it to collect metrics for indices, shards and segments. We ran Elasticsearch 2.x–5.x at the time, and monitoring segments was important because &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/1.7/index-modules-merge.html" rel="noopener noreferrer"&gt;fine-tuning&lt;/a&gt; the merge policy improved read performance. The drawback was the storage layer: Graphite downsampled metrics spanning longer periods, and that sampling was the main reason for moving to another storage engine.&lt;/p&gt;

&lt;p&gt;The third generation of metrics ran on &lt;a href="https://prometheus.io/" rel="noopener noreferrer"&gt;Prometheus&lt;/a&gt;, a &lt;a href="https://prometheus.io/docs/introduction/faq/#why-do-you-pull-rather-than-push" rel="noopener noreferrer"&gt;pull-based&lt;/a&gt; metrics collection system. Prometheus was not yet established as the de facto monitoring system; we ran Elasticsearch 5.x, just as open-source Elasticsearch exporters were emerging. We tried out numerous open-source exporters and settled on one maintained by a video streaming company. As our infrastructure grew, that exporter failed to deliver: it occasionally ran out of memory or timed out on &lt;code&gt;/metrics&lt;/code&gt; endpoint requests. It lacked fine-grained configuration, such as limiting unnecessary metrics or configuring polling time per subsystem. Metrics were static, and their naming was inconsistent. The authors did not accept code changes from the OSS community, development was inactive, and metrics from recent Elasticsearch versions weren’t being added. We branched an in-house fork from the upstream exporter repository, but it soon became apparent that the fork was turning into a complete rewrite. Since the rewriting effort was already so significant, we decided to write a completely new exporter instead.&lt;/p&gt;

&lt;p&gt;The fourth-generation Elasticsearch exporter solves multiple problems. It can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Work with large clusters (400+ nodes)&lt;/li&gt;
&lt;li&gt;Handle large amounts of metrics without crashing (the exporter has exported up to 940,272 metrics)&lt;/li&gt;
&lt;li&gt;Inject custom metric labels&lt;/li&gt;
&lt;li&gt;Handle ephemeral states of nodes, indices and shards&lt;/li&gt;
&lt;li&gt;Automatically generate new metrics and remove stale ones based on user-configurable lifetime&lt;/li&gt;
&lt;li&gt;Keep track of cluster-state metadata: node changes, cluster version and shard relocations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The new Elasticsearch exporter is written in the Rust programming language and is open-sourced on GitHub: &lt;a href="https://github.com/vinted/elasticsearch-exporter-rs" rel="noopener noreferrer"&gt;github.com/vinted/elasticsearch-exporter-rs&lt;/a&gt;. The exporter uses the asynchronous &lt;a href="https://tokio.rs" rel="noopener noreferrer"&gt;Tokio runtime&lt;/a&gt;, the &lt;a href="https://github.com/tikv/rust-prometheus" rel="noopener noreferrer"&gt;Rust Prometheus instrumentation library&lt;/a&gt; and the official &lt;a href="https://github.com/elastic/elasticsearch-rs" rel="noopener noreferrer"&gt;Elasticsearch client library&lt;/a&gt;. Metrics collection is decoupled from serving the &lt;code&gt;/metrics&lt;/code&gt; endpoint. In addition, Elasticsearch time-based metrics in milliseconds are converted into seconds to comply with &lt;a href="https://prometheus.io/docs/practices/naming/" rel="noopener noreferrer"&gt;Prometheus best practices&lt;/a&gt; (“millis” suffixes are replaced by “seconds”, and “_bytes” and “_seconds” postfixes are added where appropriate).&lt;/p&gt;

&lt;p&gt;The subsystem namespace (&lt;code&gt;/_cat/{aliases,shards,nodes,..}&lt;/code&gt;, &lt;code&gt;/_nodes/{stats,info,usage}&lt;/code&gt;) is preserved down to the last leaf of the JSON tree. For example, the &lt;code&gt;jvm mem heap_max&lt;/code&gt; leaf of the Elasticsearch &lt;code&gt;/_nodes/info&lt;/code&gt; response is converted to the Prometheus metric &lt;code&gt;elasticsearch_nodes_info_jvm_mem_heap_max_in_bytes&lt;/code&gt;; this makes metric names intuitive to use.&lt;/p&gt;
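&lt;p&gt;As a rough illustration of this naming scheme, here is a minimal Rust sketch – not the exporter’s actual code, the function and its details are hypothetical – that joins a subsystem path and a JSON leaf path into a Prometheus-style metric name and applies the “millis” to “seconds” rename:&lt;/p&gt;

```rust
/// Hypothetical sketch of the naming scheme described above: the subsystem
/// namespace plus the path to a JSON leaf becomes one Prometheus metric name.
/// This only renames; the real exporter also converts the values themselves.
fn metric_name(subsystem: &str, json_path: &[&str]) -> String {
    // "/_nodes/info" -> ["nodes", "info"]
    let mut parts: Vec<String> = subsystem
        .trim_start_matches('/')
        .split(|c: char| c == '/' || c == '_')
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect();
    parts.insert(0, "elasticsearch".to_string());
    parts.extend(json_path.iter().map(|s| s.to_string()));
    let name = parts.join("_");
    // Rewrite Elasticsearch millisecond metrics to Prometheus-style seconds.
    name.replace("_in_millis", "_seconds").replace("_millis", "_seconds")
}

fn main() {
    assert_eq!(
        metric_name("/_nodes/info", &["jvm", "mem", "heap_max_in_bytes"]),
        "elasticsearch_nodes_info_jvm_mem_heap_max_in_bytes"
    );
    println!("{}", metric_name("/_cat/recovery", &["time_in_millis"]));
}
```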

&lt;p&gt;Custom labels are injected into associated subsystems. For example, when the cluster version changes, the injected, namespaced label &lt;code&gt;vin_cluster_version&lt;/code&gt; lets us distinguish metrics by version. This allows comparing Elasticsearch performance between clusters and nodes during a rolling version upgrade. Cluster metadata is updated every 5 minutes by default; the interval is user-configurable. Metadata updates are useful when a shard is relocated to another node or a node IP changes.&lt;/p&gt;

&lt;p&gt;We &lt;a href="https://dev.to/vinted/vinted-search-scaling-chapter-3-elasticsearch-index-management-3jb5"&gt;reindex content completely&lt;/a&gt; and frequently (up to 2–3 times per day). Reindexing leaves a trace of state changes that exporters collect: nodes being added, removed or drained, indices created and removed, shard relocations, re-sharding and alias changes all leave ephemeral state behind. The exporter handles this by keeping track of a custom lifetime per subsystem: a stale metric is deleted when the last heartbeat of its unique label set is older than the user-defined lifetime. Metric lifetimes help eliminate outdated metrics, save storage engine space and keep dashboards up to date.&lt;/p&gt;
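&lt;p&gt;The lifetime mechanism can be pictured with a small Rust sketch (hypothetical types and names, not the exporter’s implementation): remember a heartbeat per unique label set and evict entries whose heartbeat is older than the configured lifetime:&lt;/p&gt;

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical sketch of per-subsystem stale-metric cleanup: each unique
/// label set gets a heartbeat, and metrics unseen for longer than the
/// user-configured lifetime are dropped.
struct MetricTracker {
    lifetime: Duration,
    heartbeats: HashMap<String, Instant>, // label set -> last time seen
}

impl MetricTracker {
    fn new(lifetime: Duration) -> Self {
        Self { lifetime, heartbeats: HashMap::new() }
    }

    /// Record that a metric with this label set was just scraped.
    fn touch(&mut self, labels: &str, now: Instant) {
        self.heartbeats.insert(labels.to_string(), now);
    }

    /// Drop metrics not seen within the lifetime; returns how many were removed.
    fn evict_stale(&mut self, now: Instant) -> usize {
        let before = self.heartbeats.len();
        let lifetime = self.lifetime;
        self.heartbeats.retain(|_, last| now.duration_since(*last) <= lifetime);
        before - self.heartbeats.len()
    }
}

fn main() {
    let t0 = Instant::now();
    let mut tracker = MetricTracker::new(Duration::from_secs(300));
    tracker.touch("index=listings,shard=0", t0);
    // 400 simulated seconds later the metric has not been seen again,
    // so one entry is evicted.
    assert_eq!(tracker.evict_stale(t0 + Duration::from_secs(400)), 1);
}
```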

&lt;p&gt;Extensive configuration is possible via CLI flags. For instance, one can include or skip labels, skip metrics, define polling intervals between subsystems, enable/disable different subsystems and control metadata refresh intervals.&lt;/p&gt;

&lt;p&gt;The best part of the exporter is that it does not define static metrics. Instead, metrics are dynamic, based on &lt;a href="https://doc.rust-lang.org/std/sync/atomic" rel="noopener noreferrer"&gt;atomic&lt;/a&gt; structures, so subsystem metrics are constructed on every request. Because metrics are dynamic, we don’t have to define new metrics when the Elasticsearch version changes or remove deprecated ones, so no metrics maintenance is needed. Adding new metric subsystems is easy, but that rarely happens.&lt;/p&gt;

&lt;p&gt;At the moment, the exporter supports the following subsystems:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=^.^=
/_cat/allocation
/_cat/shards
/_cat/indices
/_cat/segments
/_cat/nodes
/_cat/recovery
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/thread_pool
/_cat/plugins
/_cat/fielddata
/_cat/nodeattrs
/_cat/repositories
/_cat/templates
/_cat/transforms

/_cluster/health
/_cluster/stats

/_nodes/stats
/_nodes/usage
/_nodes/info

/_stats
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The metrics generation code is a little over 1,600 lines, split across 36 files.&lt;/p&gt;

&lt;p&gt;You can try running the exporter from a docker container by using the command below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ docker run --network=host -it vinted/elasticsearch_exporter --elasticsearch_url=http://IP:PORT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exporter also exposes metrics about itself.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-scrape-times.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-scrape-times.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch exporter scrape times"&gt;&lt;/a&gt;&lt;br&gt;Subsystem scraping times
  &lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-rss.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-rss.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch resident memory"&gt;&lt;/a&gt;&lt;br&gt;Exporters resident memory
  &lt;/p&gt;



&lt;p&gt;This is a sneak peek of Vinted’s Elasticsearch metrics infrastructure topology. The search infrastructure spans 3 data centers, and we run one Elasticsearch cluster per data center. Each exporter is triplicated for high availability between data centers and between racks inside a data center. Exporters generally serve a single purpose: scraping either a full subsystem or part of one, as in the case of a rich subsystem such as &lt;code&gt;/_nodes/stats&lt;/code&gt;. Exporters run side by side with Elasticsearch nodes in Docker containers and are configured using &lt;a href="https://www.chef.io" rel="noopener noreferrer"&gt;Chef&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-list.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-list.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch exporter list of used exports"&gt;&lt;/a&gt;&lt;br&gt;This is a production subsystems list of used exporters
  &lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-master-nodes.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-master-nodes.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch exporter master nodes"&gt;&lt;/a&gt;&lt;br&gt;Exporter scrapes cluster health and aliases in master nodes
  &lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-client-nodes.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-client-nodes.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch exporter client nodes"&gt;&lt;/a&gt;&lt;br&gt;Exporter scrapes thread pools in client nodes
  &lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-nodes-stats.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-nodes-stats.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch exporter for nodes stats"&gt;&lt;/a&gt;&lt;br&gt;Exporter scrapes nodes stats partitioned per path parameters in data nodes
  &lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-nodes-usage-and-info.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-nodes-usage-and-info.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch exporter for nodes info and usage"&gt;&lt;/a&gt;&lt;br&gt;Exporter scrapes nodes usage and info in data nodes
  &lt;/p&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-cat.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-exporter-cat.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch exporter for /_cat subsystems"&gt;&lt;/a&gt;&lt;br&gt;Exporter scrapes /_cat subsystems in data nodes
  &lt;/p&gt;



&lt;p&gt;We are also &lt;a href="https://github.com/vinted/elasticsearch-exporter-rs" rel="noopener noreferrer"&gt;open-sourcing&lt;/a&gt; 13 dashboards with a total of about 323 panels, covering the various exporter subsystems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-dashboards.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2Fscaling-elasticsearch%2Felasticsearch-dashboards.png%3Fstyle%3Dcentered" alt="Vinted Elasticsearch resident memory"&gt;&lt;/a&gt;&lt;br&gt;Available exporter dashboards
  &lt;/p&gt;



&lt;p&gt;Enjoy.&lt;/p&gt;




&lt;p&gt;If you liked what you just read, we are hiring Search engineers! Find out more on our &lt;a href="https://vinted.com/jobs" rel="noopener noreferrer"&gt;Jobs page&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>vinted</category>
      <category>elasticsearch</category>
      <category>scaling</category>
      <category>prometheus</category>
    </item>
    <item>
      <title>Vinted Search Scaling Chapter 5: Divide and Conquer</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Wed, 22 Dec 2021 06:34:10 +0000</pubDate>
      <link>https://dev.to/vinted/vinted-search-scaling-chapter-5-divide-and-conquer-m7f</link>
      <guid>https://dev.to/vinted/vinted-search-scaling-chapter-5-divide-and-conquer-m7f</guid>
      <description>&lt;p&gt;Elasticsearch is really fast out of the box, but it doesn’t know your data model or access patterns. It’s no secret that Elasticsearch powers our catalogue, but with the ever-increasing amount of listings and queries, we’ve had a number of problems to solve.&lt;/p&gt;

&lt;p&gt;In the middle of 2020, we noticed an increasing number of failing search requests during peak hours. Node-exporter metrics in Grafana indicated that nothing was really wrong and that the cluster load was fairly balanced – but it clearly wasn’t. Only after logging in to the actual servers did we realise that the load was distributed unevenly across the Elasticsearch nodes running there.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Fserver-load-distribution.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Fserver-load-distribution.png%3Fstyle%3Dcentered" alt="Server load distribution"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Following documented guidelines for optimising cluster and JVM settings, we experimented with shard sizes and the number of replicas while observing the &lt;code&gt;number_of_shards * (number_of_replicas + 1) &amp;lt; #data_nodes&lt;/code&gt; recommendation. This gave us no significant improvement, so we had to approach the issue with a different strategy.&lt;/p&gt;
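&lt;p&gt;One reading of that recommendation is simply to keep the total number of shard copies below the number of data nodes. A tiny Rust sketch with made-up example numbers (not Vinted’s actual setup):&lt;/p&gt;

```rust
/// The sizing rule quoted above: total shard copies (primaries times
/// replicas plus one) should stay below the number of data nodes.
fn fits_on_cluster(shards: u32, replicas: u32, data_nodes: u32) -> bool {
    shards * (replicas + 1) < data_nodes
}

fn main() {
    // 6 primaries with 2 replicas = 18 shard copies: fine on 24 data nodes,
    // too many for 16 (illustrative numbers only).
    assert!(fits_on_cluster(6, 2, 24));
    assert!(!fits_on_cluster(6, 2, 16));
}
```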

&lt;p&gt;We decided to apply the ‘divide and conquer’ strategy by creating separate indices for specific types of queries. These could live in a different cluster, have different numbers of shards or replicas, or use different index-time sorting. We started by redirecting all of our &lt;code&gt;/_count&lt;/code&gt; queries to another cluster, which significantly reduced the portion of failing requests. We then realised we had discovered something important:&lt;/p&gt;

&lt;p&gt;Vinted supports multilingual search and uses an index-per-language strategy. We also allow users to browse our catalogue by applying filters, without requiring any search text. We created a version of the listings index to support these kinds of queries: all listings go into a single index, textual fields are no longer analysed, and all browsing queries are redirected to it.&lt;/p&gt;

&lt;p&gt;Surprisingly, this helped distribute cluster load more evenly, despite the new indices being in the same cluster. The extra disk space and increased complexity were a worthy tradeoff. We decided to go further and partition this index by date, to control data growth more easily, improve query performance and reduce refresh times. Splitting the index by date also let us leverage the shard-skipping technique: for example, if a search query covers only the last 2 weeks of data, Elasticsearch can query just the latest monthly partitions and skip the remaining shards.&lt;/p&gt;

&lt;p&gt;You may already know that we use &lt;a href="https://dev.to/vinted/vinted-search-scaling-chapter-1-indexing-4ii1"&gt;Kafka Connect&lt;/a&gt; for our data ingestion and it comes with Single Message Transforms (SMTs), allowing us to transform messages as they flow through Connect. Two of these SMTs were particularly interesting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.confluent.io/platform/current/connect/transforms/timestamprouter.html" rel="noopener noreferrer"&gt;Timestamp Router&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.confluent.io/platform/current/connect/transforms/regexrouter.html" rel="noopener noreferrer"&gt;Regex Router&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Timestamp Router allows the construction of a new topic field using the record’s topic name and timestamp field with the ability to format it.&lt;/p&gt;

&lt;p&gt;Regex Router allows updating the record’s topic using the configured regular expression and replacement string.&lt;/p&gt;

&lt;p&gt;To use these routers, our records – and tombstone messages – must contain a fixed timestamp; otherwise, we would never be able to ingest data into Elasticsearch consistently. By default, each new record is produced with the current timestamp, but &lt;a href="https://www.rubydoc.info/gems/ruby-kafka/Kafka/Producer#produce-instance_method" rel="noopener noreferrer"&gt;ruby-kafka allows it to be overwritten&lt;/a&gt; with any value, and we believed that overwriting it with the listing’s creation time would work. We also considered &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html" rel="noopener noreferrer"&gt;ILM&lt;/a&gt; and &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/data-streams.html" rel="noopener noreferrer"&gt;data streams&lt;/a&gt; but quickly rejected them, as both techniques target append-only data.&lt;/p&gt;

&lt;p&gt;We built a small application that mimics our ingestion pipeline, but with a couple of changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Overrides &lt;code&gt;create_time&lt;/code&gt; with a fixed value for each key&lt;/li&gt;
&lt;li&gt;Uses Timestamp Router to partition records into monthly partitions
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TimestampRouter"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.TimestampRouter.timestamp.format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"yyyy.MM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.TimestampRouter.topic.format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${topic}-${timestamp}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.TimestampRouter.type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"org.apache.kafka.connect.transforms.TimestampRouter"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The config change is fairly simple but very powerful – it’ll suffix the record’s topic with the year and month using its timestamp.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Ftimestamp-router-1.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Ftimestamp-router-1.png%3Fstyle%3Dcentered" alt="Timestamp Router results"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is great, but we have lots of listings from the past, and creating a monthly bucket for very few records would be expensive. To combat this, we used the Regex Router to merge all the partitions before 2019 into a single &lt;code&gt;old&lt;/code&gt; one, leaving newer items untouched.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TimestampRouter,MergeOldListings"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.TimestampRouter.timestamp.format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"yyyy.MM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.TimestampRouter.topic.format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${topic}-${timestamp}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.TimestampRouter.type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"org.apache.kafka.connect.transforms.TimestampRouter"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.MergeOldListings.regex"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"(?-mix:(.+)-((2010|2011|2012|2013|2014|2015|2016|2017|2018).*))"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.MergeOldListings.replacement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$1-old"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transforms.MergeOldListings.type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"org.apache.kafka.connect.transforms.RegexRouter"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
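&lt;p&gt;The combined effect of the two routers above can be sketched in Rust. This is a hypothetical helper that mirrors the SMT configuration, not Kafka Connect itself: TimestampRouter appends the &lt;code&gt;yyyy.MM&lt;/code&gt; suffix, then MergeOldListings collapses every partition before 2019 into the single &lt;code&gt;old&lt;/code&gt; partition:&lt;/p&gt;

```rust
/// Hypothetical mirror of the SMT chain configured above:
/// TimestampRouter ("${topic}-${timestamp}" with format "yyyy.MM"),
/// then RegexRouter rewriting 2010-2018 partitions to "<topic>-old".
fn route_topic(topic: &str, year: u32, month: u32) -> String {
    if (2010..=2018).contains(&year) {
        // MergeOldListings: collapse pre-2019 partitions into one.
        format!("{}-old", topic)
    } else {
        // TimestampRouter alone: suffix with year and month.
        format!("{}-{:04}.{:02}", topic, year, month)
    }
}

fn main() {
    assert_eq!(route_topic("listings", 2021, 7), "listings-2021.07");
    assert_eq!(route_topic("listings", 2015, 12), "listings-old");
}
```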



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Ftimestamp-router-2.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Ftimestamp-router-2.png%3Fstyle%3Dcentered" alt="Timestamp Router partitioned results"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Ftimestamp-router-3.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Ftimestamp-router-3.png%3Fstyle%3Dcentered" alt="Timestamp Router partitioned results"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After confirming this works as expected, we implemented this into our application, which was fairly simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create new topics for our listings (we don’t want to affect the current ingestion pipeline)&lt;/li&gt;
&lt;li&gt;Write to both old and new topics at the same time&lt;/li&gt;
&lt;li&gt;Use the listing’s creation time when sending records to the new topics&lt;/li&gt;
&lt;li&gt;Test ingestion with the new topics and routing strategy&lt;/li&gt;
&lt;li&gt;Switch to new topics once we confirm that ingestion works as expected&lt;/li&gt;
&lt;/ol&gt;



&lt;p&gt;The initial switch to partitioned indices didn’t work out, so we had to tailor our queries to them and tweak the numbers of shards and replicas. After plenty of unsuccessful attempts, we finally arrived at a version that works great.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Fmetrics-after-implementation.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fvinted.engineering%2Fstatic%2F2021%2F12%2F14%2Fmetrics-after-implementation.png%3Fstyle%3Dcentered" alt="Metrics after implementation"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;We’ve managed to not only shed 20ms off the P99 but have had other great results too:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Query and request cache hit ratio is above 90% due to older indices refreshing less often&lt;/li&gt;
&lt;li&gt;Having load split between multiple indices results in a more even load across the whole cluster, so we no longer have hot nodes&lt;/li&gt;
&lt;li&gt;We hit fewer index shards when querying for the top-k latest items&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In retrospect, this abnormal behaviour was most likely caused by a &lt;a href="https://github.com/elastic/elasticsearch/pull/70283" rel="noopener noreferrer"&gt;bug&lt;/a&gt; in the Adaptive Replica Selection formula. At the time, though, we mitigated it by creatively applying the divide and conquer strategy, which helps us to this day.&lt;/p&gt;

</description>
      <category>vinted</category>
      <category>elasticsearch</category>
      <category>scaling</category>
    </item>
    <item>
      <title>Validating JSON input in Rust web services</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Mon, 15 Feb 2021 06:46:34 +0000</pubDate>
      <link>https://dev.to/vinted/validating-json-input-in-rust-web-services-5gp0</link>
      <guid>https://dev.to/vinted/validating-json-input-in-rust-web-services-5gp0</guid>
      <description>&lt;p&gt;One of the key features of a good web service is user input validation. External input cannot be trusted and the application must prevent malicious data from being processed.&lt;/p&gt;

&lt;p&gt;It's not only a matter of security but also of service usability and developer experience. When something goes wrong, a service should at the very least respond with &lt;code&gt;400 Bad Request&lt;/code&gt;, but a good service will also respond with the exact details of what went wrong.&lt;/p&gt;

&lt;p&gt;Rather than inventing its own error response format, a service should use &lt;a href="https://tools.ietf.org/html/rfc7807"&gt;RFC 7807&lt;/a&gt;, which provides a standard format for returning problem details from HTTP APIs.&lt;/p&gt;
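&lt;p&gt;For illustration, a problem details body for a failed validation might look like the following sketch. The &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;detail&lt;/code&gt; members come straight from the RFC; the concrete values here are only an assumption of what our service could return:&lt;/p&gt;

```json
{
  "type": "https://httpstatuses.com/400",
  "title": "One or more validation errors occurred",
  "status": 400,
  "detail": "Please refer to the errors property for additional details"
}
```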

&lt;p&gt;In this tutorial, we'll implement a web service in Rust using the &lt;a href="https://github.com/seanmonstar/warp"&gt;warp&lt;/a&gt; web framework and add request validation using &lt;a href="https://github.com/Keats/validator"&gt;validator&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Warp is a super-easy, composable, web server framework for warp speeds.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
&lt;p&gt;Validator is a simple validation library for Rust structs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Project creation
&lt;/h2&gt;

&lt;p&gt;For starters, create a new project using &lt;code&gt;cargo&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo new warp-validation &lt;span class="nt"&gt;--bin&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;warp-validation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Dependencies
&lt;/h2&gt;

&lt;p&gt;Then edit &lt;code&gt;Cargo.toml&lt;/code&gt; file and add these dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[dependencies]&lt;/span&gt;
&lt;span class="nn"&gt;tokio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;["full"]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="py"&gt;warp&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.3"&lt;/span&gt;
&lt;span class="nn"&gt;serde&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;["derive"]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="py"&gt;serde_json&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.0"&lt;/span&gt;
&lt;span class="nn"&gt;validator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.12"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;["derive"]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nn"&gt;http-api-problem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;  &lt;span class="s"&gt;"0.21.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"hyper"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"warp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/tokio-rs/tokio"&gt;tokio&lt;/a&gt; is an asynchronous runtime for Rust applications&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/seanmonstar/warp"&gt;warp&lt;/a&gt; is the web framework of choice&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/serde-rs/serde"&gt;serde&lt;/a&gt; is a serialization and deserialization framework&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Keats/validator"&gt;validator&lt;/a&gt; is a validation library&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/chridou/http-api-problem"&gt;http-api-problem&lt;/a&gt; implements &lt;a href="https://tools.ietf.org/html/rfc7807"&gt;RFC 7807 Problem Details for HTTP APIs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now that all dependencies are in place, we can start by implementing the most basic service.&lt;/p&gt;

&lt;h1&gt;
  
  
  Implementing the web service
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Basic web service
&lt;/h2&gt;

&lt;p&gt;This is the most basic web service you can build in Warp. All it does is respond to &lt;code&gt;GET /hello/:name&lt;/code&gt; requests with &lt;code&gt;Hello, :name&lt;/code&gt;, where &lt;code&gt;:name&lt;/code&gt; is a path variable. We'll build on top of that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;#[tokio::main]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;hello&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;path!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hello"&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;.map&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="nd"&gt;format!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, {}!"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;serve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;.run&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="mi"&gt;127&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;3030&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So if we called our endpoint using cURL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:3030/hello/evaldas
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This would be the response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello, evaldas!.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Accepting &lt;code&gt;POST&lt;/code&gt; requests
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;The fundamental building block of &lt;code&gt;warp&lt;/code&gt; is the &lt;code&gt;Filter&lt;/code&gt;: they can be combined and composed to express rich requirements on requests.&lt;/p&gt;

&lt;p&gt;Thanks to its &lt;code&gt;Filter&lt;/code&gt; system, warp provides these out of the box:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Path routing and parameter extraction&lt;/li&gt;
&lt;li&gt;Header requirements and extraction&lt;/li&gt;
&lt;li&gt;Query string deserialization&lt;/li&gt;
&lt;li&gt;JSON and Form bodies&lt;/li&gt;
&lt;li&gt;Multipart form data&lt;/li&gt;
&lt;li&gt;Static Files and Directories&lt;/li&gt;
&lt;li&gt;Websockets&lt;/li&gt;
&lt;li&gt;Access logging&lt;/li&gt;
&lt;li&gt;Gzip, Deflate, and Brotli compression&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;A list of prebuilt filters can be found &lt;a href="https://docs.rs/warp/0.3.0/warp/filters/index.html"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In order to accept &lt;code&gt;POST&lt;/code&gt; requests with JSON payloads, the following pieces are necessary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
A &lt;code&gt;Request&lt;/code&gt; struct is needed for strongly typed requests. Notice that it derives the &lt;code&gt;Deserialize&lt;/code&gt; trait, meaning it can be deserialized from &lt;code&gt;JSON&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;warp::post()&lt;/code&gt; filter to accept only &lt;code&gt;POST&lt;/code&gt; requests&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;warp::body::json()&lt;/code&gt; to read JSON payload&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.and(..)&lt;/code&gt; filter to chain them together and ensure that a successful request must match all of the required filters
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;serde&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Deserialize&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;#[derive(Deserialize,&lt;/span&gt; &lt;span class="nd"&gt;Debug)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;#[tokio::main]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;hello&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;path!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hello"&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;body&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.map&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nd"&gt;format!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, {}! This is request: {:?}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;serve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.run&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="mi"&gt;127&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;3030&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See what happens when we send a valid request&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:3030/hello/evaldas &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt;
    &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"name":"name","email":"email","age":1}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'd get this response back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello, evaldas! This is request: Request { name: "name", email: "email", age: 1 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Validating requests
&lt;/h2&gt;

&lt;p&gt;Validating requests using the &lt;code&gt;validator&lt;/code&gt; library is a breeze. All we really need to do is derive the &lt;code&gt;Validate&lt;/code&gt; trait (bringing it into scope with &lt;code&gt;use validator::Validate;&lt;/code&gt;) and add &lt;code&gt;#[validate]&lt;/code&gt; attributes to our fields.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(Deserialize,&lt;/span&gt; &lt;span class="nd"&gt;Debug,&lt;/span&gt; &lt;span class="nd"&gt;Validate)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;#[validate(length(min&lt;/span&gt; &lt;span class="nd"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nd"&gt;))]&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nd"&gt;#[validate(email)]&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nd"&gt;#[validate(range(min&lt;/span&gt; &lt;span class="nd"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="nd"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;max&lt;/span&gt; &lt;span class="nd"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="nd"&gt;))]&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then we can call the &lt;code&gt;.validate()&lt;/code&gt; method on our request to validate it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[tokio::main]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;hello&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;path!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hello"&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;body&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.map&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="nf"&gt;.validate&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

            &lt;span class="nd"&gt;format!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Hello, {}! This is request: {:?} and its validation result: {:?}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;serve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.run&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="mi"&gt;127&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;3030&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sending exactly the same request would result in a response with a validation failure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello, evaldas! This is request: Request { name: "name", email: "email", age: 1 } and its validation result: Err(ValidationErrors({"email": Field([ValidationError { code: "email", message: None, params: {"value": String("email")} }]), "age": Field([ValidationError { code: "range", message: None, params: {"value": Number(1), "max": Number(100.0), "min": Number(18.0)} }])}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is already great, but to fully implement validation, all we need to do is make it part of the request chain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementing a custom validation filter
&lt;/h2&gt;

&lt;p&gt;To achieve that, we'll start by creating a custom &lt;code&gt;Error&lt;/code&gt; enum type that contains only &lt;code&gt;ValidationErrors&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(Debug)]&lt;/span&gt;
&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;Validation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ValidationErrors&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It will also implement &lt;code&gt;warp::reject::Reject&lt;/code&gt;; this is needed to turn an &lt;code&gt;Error&lt;/code&gt; into a custom warp &lt;code&gt;Rejection&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reject&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we'll create a custom function that accepts a generic value constrained to implement the &lt;code&gt;Validate&lt;/code&gt; trait. It validates the value and returns either the value itself or the validation errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="n"&gt;validate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt;
    &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="nf"&gt;.validate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.map_err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Validation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally we'll create a custom &lt;code&gt;with_validated_json&lt;/code&gt; filter which does a couple of things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Limits payload sizes to &lt;code&gt;16KiB&lt;/code&gt; using &lt;code&gt;warp::body::content_length_limit&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Uses built-in &lt;code&gt;warp::body::json()&lt;/code&gt; filter to deserialize JSON into a struct&lt;/li&gt;
&lt;li&gt;And then passes the deserialized value to the &lt;code&gt;validate&lt;/code&gt; function we just built
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="n"&gt;with_validated_json&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Extract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Rejection&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;Clone&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt;
    &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DeserializeOwned&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;Validate&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;body&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;content_length_limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;body&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.and_then&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.map_err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;custom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's now enough to replace the &lt;code&gt;warp::body::json()&lt;/code&gt; filter in our request chain with &lt;code&gt;with_validated_json&lt;/code&gt;. We can also remove the manual validation from the handler.&lt;/p&gt;

&lt;p&gt;Sending that exact same request results in a validation failure, but this time it doesn't even reach the handler and returns early.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Unhandled rejection: Validation(ValidationErrors({"email": Field([ValidationError { code: "email", message: None, params: {"value": String("email")} }]), "age": Field([ValidationError { code: "range", message: None, params: {"value": Number(1), "max": Number(100.0), "min": Number(18.0)} }])}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that it says the rejection is unhandled - this is the case when warp doesn't know what to do with the &lt;code&gt;Rejection&lt;/code&gt; and simply returns it. Luckily for us, there's an easy way to handle rejections and return appropriate responses: the warp repository contains an &lt;a href="https://github.com/seanmonstar/warp/blob/master/examples/rejections.rs"&gt;excellent rejection example&lt;/a&gt;. We'll improve on it by making sure that rejections return an instance of Problem Details instead of only a status code and plain error text.&lt;/p&gt;

&lt;p&gt;We'll use the &lt;a href="https://docs.rs/warp/0.3.0/warp/reject/struct.Rejection.html#method.find"&gt;&lt;code&gt;Rejection::find&lt;/code&gt;&lt;/a&gt; method to find out whether our custom &lt;code&gt;Error&lt;/code&gt; is the reason the request failed and convert it into Problem Details; in every other case we return an internal server error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;handle_rejection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Rejection&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Reply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Infallible&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="py"&gt;.find&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;handle_crate_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nn"&gt;HttpApiProblem&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;with_title_and_type_from_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;INTERNAL_SERVER_ERROR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="nf"&gt;.to_hyper_response&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;handle_crate_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;HttpApiProblem&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nn"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Validation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;problem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                &lt;span class="nn"&gt;HttpApiProblem&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;with_title_and_type_from_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;BAD_REQUEST&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nf"&gt;.set_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"One or more validation errors occurred"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nf"&gt;.set_detail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Please refer to the errors property for additional details"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="mi"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;problem&lt;/span&gt;&lt;span class="nf"&gt;.set_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"errors"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="nf"&gt;.errors&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

            &lt;span class="n"&gt;problem&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we need to use &lt;a href="https://docs.rs/warp/0.3.0/warp/trait.Filter.html#method.recover"&gt;&lt;code&gt;Filter::recover&lt;/code&gt;&lt;/a&gt; to recover from the rejection, passing &lt;code&gt;handle_rejection&lt;/code&gt; to it as an argument.&lt;/p&gt;

&lt;p&gt;If we send exactly the same request again, it now returns a problem details instance with the appropriate status code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://httpstatuses.com/400"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"One or more validation errors occurred"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Please refer to the errors property for additional details"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"errors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"age"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"range"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"max"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"min"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;18.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"email"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The complete application looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;http_api_problem&lt;/span&gt;&lt;span class="p"&gt;::{&lt;/span&gt;&lt;span class="n"&gt;HttpApiProblem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;serde&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;de&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;DeserializeOwned&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;serde&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Deserialize&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Infallible&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;validator&lt;/span&gt;&lt;span class="p"&gt;::{&lt;/span&gt;&lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ValidationErrors&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::{&lt;/span&gt;&lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Rejection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Reply&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nd"&gt;#[derive(Deserialize,&lt;/span&gt; &lt;span class="nd"&gt;Debug,&lt;/span&gt; &lt;span class="nd"&gt;Validate)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;#[validate(length(min&lt;/span&gt; &lt;span class="nd"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="nd"&gt;))]&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nd"&gt;#[validate(email)]&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nd"&gt;#[validate(range(min&lt;/span&gt; &lt;span class="nd"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="nd"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;max&lt;/span&gt; &lt;span class="nd"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="nd"&gt;))]&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;#[tokio::main]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;hello&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;path!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hello"&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;with_validated_json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.map&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nd"&gt;format!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, {}! This is request: {:?}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="nf"&gt;.recover&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_rejection&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;serve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.run&lt;/span&gt;&lt;span class="p"&gt;(([&lt;/span&gt;&lt;span class="mi"&gt;127&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;3030&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="n"&gt;with_validated_json&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Extract&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Rejection&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;Clone&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt;
    &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DeserializeOwned&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;Validate&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;body&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;content_length_limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;.and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;body&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="nf"&gt;.and_then&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.map_err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;custom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="n"&gt;validate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt;
    &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="nf"&gt;.validate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.map_err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Validation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;#[derive(Debug)]&lt;/span&gt;
&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;Validation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ValidationErrors&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="nn"&gt;warp&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reject&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;handle_rejection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Rejection&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Reply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Infallible&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="py"&gt;.find&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;handle_crate_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nn"&gt;HttpApiProblem&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;with_title_and_type_from_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;INTERNAL_SERVER_ERROR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="nf"&gt;.to_hyper_response&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;handle_crate_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;HttpApiProblem&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nn"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Validation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;problem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                &lt;span class="nn"&gt;HttpApiProblem&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;with_title_and_type_from_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;StatusCode&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;BAD_REQUEST&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nf"&gt;.set_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"One or more validation errors occurred"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nf"&gt;.set_detail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Please refer to the errors property for additional details"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="mi"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;problem&lt;/span&gt;&lt;span class="nf"&gt;.set_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"errors"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="nf"&gt;.errors&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

            &lt;span class="n"&gt;problem&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>rust</category>
      <category>warp</category>
      <category>tokio</category>
    </item>
    <item>
      <title>Vinted Search Scaling Chapter 3: Elasticsearch Index Management</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Mon, 08 Feb 2021 08:31:00 +0000</pubDate>
      <link>https://dev.to/vinted/vinted-search-scaling-chapter-3-elasticsearch-index-management-3jb5</link>
      <guid>https://dev.to/vinted/vinted-search-scaling-chapter-3-elasticsearch-index-management-3jb5</guid>
      <description>&lt;p&gt;Managing Elasticsearch indices at scale isn’t easy.&lt;/p&gt;

&lt;p&gt;Index settings were part of our core application monolith logic. Making the slightest change to them meant redeploying the whole application and running lots of unrelated tests. It was also built to be aware of only a single language. We needed to be more flexible and have fine-grained control over our Elasticsearch clusters.&lt;/p&gt;

&lt;p&gt;At the same time, we were planning to break our single Elasticsearch cluster into multiple ones for better load distribution and the ability to upgrade smaller clusters. Managing all of that from a single application was difficult. See how we route search traffic across multiple Elasticsearch clusters in the &lt;a href="https://dev.to/vinted/vinted-search-scaling-chapter-2-routing-elasticsearch-requests-with-srouter-2jgn"&gt;second chapter&lt;/a&gt; of this series.&lt;/p&gt;

&lt;p&gt;We decided to decouple search setup and ingestion logic into a separate application, rather obviously called Elasticsearch Index Management, or EIM for short. It turned out to be a set of CLI utilities, automated jobs, and webhook listeners that automate our routine work. I’ll share how we solved these problems and what we learned.&lt;/p&gt;

&lt;h1&gt;
  
  
  Goals
&lt;/h1&gt;

&lt;p&gt;We had no well-defined goals when we started this project; all we really wanted was to solve our pain points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manage indices and their changes with ease&lt;/li&gt;
&lt;li&gt;Control the data indexing pipeline&lt;/li&gt;
&lt;li&gt;Detect Elasticsearch version inconsistencies&lt;/li&gt;
&lt;li&gt;Automate the search infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Index management
&lt;/h2&gt;

&lt;p&gt;Vinted operates in 13 countries, and we support search for our feed, promotions, forums, FAQs, and more. Naturally, we need to support multiple languages.&lt;/p&gt;

&lt;p&gt;Elasticsearch templates are well suited for this use case. They’re composable, flexible, and allow dynamic field detection and mapping.&lt;br&gt;
We identified language-agnostic analyzers, language analyzers, index settings, index mappings, and ingestion pipelines as the key elements for constructing a template. Indices created from templates are consistent and guaranteed to match the spec. To keep templates correct and up to date, we test them during each CI build and refresh them on clusters during every release.&lt;br&gt;
The way we construct templates is also reflected in our code structure: analysis, indices, pipelines, and other parts are grouped together and later merged into a single template.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;elasticsearch/
├── analysis/
│   ├── common.rb
│   ├── english.rb
│   ├── french.rb
│   ├── lithuanian.rb
├── indices/
│   ├── faq.rb
│   ├── feed.rb
│   ├── forum.rb
│   └── promotions.rb
└── pipelines/
    └── brands.rb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
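&lt;p&gt;To make the merging concrete, here is a minimal, hypothetical sketch of how such parts could be combined into a single template body. The part hashes and the &lt;code&gt;build_template&lt;/code&gt;/&lt;code&gt;deep_merge&lt;/code&gt; helpers are invented for illustration and are not Vinted’s actual code:&lt;/p&gt;

```ruby
# Hypothetical sketch: compose shared analysis, a language analyzer, and an
# index definition into one Elasticsearch index template body.
require 'json'

# Pieces, as they might live under analysis/ and indices/ in the tree above.
COMMON_ANALYSIS = { 'char_filter' => { 'strip_html' => { 'type' => 'html_strip' } } }
LITHUANIAN_ANALYSIS = { 'analyzer' => { 'lithuanian' => { 'type' => 'lithuanian' } } }
FEED_INDEX = {
  'settings' => { 'number_of_shards' => 3, 'number_of_replicas' => 1 },
  'mappings' => { 'properties' => { 'title' => { 'type' => 'text' } } }
}

# Recursively merge nested hashes; later values win on leaf conflicts.
def deep_merge(a, b)
  a.merge(b) { |_key, old, new| old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new }
end

# Fold all analysis parts together, then nest them under settings.analysis.
def build_template(index, *analysis_parts)
  analysis = analysis_parts.reduce({}) { |acc, part| deep_merge(acc, part) }
  deep_merge(index, 'settings' => { 'analysis' => analysis })
end

template = build_template(FEED_INDEX, COMMON_ANALYSIS, LITHUANIAN_ANALYSIS)
puts JSON.pretty_generate(template)
```

&lt;p&gt;In a layout like the tree above, adding a new language would then amount to adding one analysis file and including it in the merge.&lt;/p&gt;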



&lt;p&gt;To prevent human errors and catch issues before deployment, we heavily test all of the template pieces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Field index and search time analyzers&lt;/li&gt;
&lt;li&gt;Field settings, such as &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;norms&lt;/code&gt;, &lt;code&gt;doc_values&lt;/code&gt;, or &lt;code&gt;copy_to&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Index settings, such as the number of shards and replicas, and refresh intervals&lt;/li&gt;
&lt;li&gt;Ingestion pipelines, which we test by simulating them and comparing before and after payloads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We harness the results of the Elasticsearch &lt;code&gt;/_analyze&lt;/code&gt; and &lt;code&gt;/_ingest/pipeline/_simulate&lt;/code&gt; APIs by creating custom &lt;a href="https://rspec.info"&gt;RSpec&lt;/a&gt; matchers. They allow us to write domain-specific tests that are elegant, short, and intention-revealing.&lt;/p&gt;

&lt;p&gt;The following matcher extracts the tokens produced by the &lt;code&gt;/_analyze&lt;/code&gt; API and allows comparing them to an expectation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;RSpec&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Matchers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;define&lt;/span&gt; &lt;span class="ss"&gt;:contain_tokens&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="n"&gt;failure_message&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="s2"&gt;"expected: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="p"&gt;\&lt;/span&gt;
    &lt;span class="s2"&gt;"     got: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="p"&gt;\&lt;/span&gt;
    &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="p"&gt;\&lt;/span&gt;
    &lt;span class="s1"&gt;'(compared using ==)'&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'tokens'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'token'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And this is an example of how it is being used in specs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="s1"&gt;'with special letters'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;let&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s1"&gt;'šermukšnis'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;is_expected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;contain_tokens&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'sermuksnis'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Data ingestion
&lt;/h2&gt;

&lt;p&gt;From &lt;a href="https://dev.to/vinted/vinted-search-scaling-chapter-1-indexing-4ii1"&gt;chapter 1&lt;/a&gt;, you already know that we leverage Kafka and Kafka Connect for data ingestion, meaning they are both an integral part of Elasticsearch Index Management. We also make heavy use of the &lt;a href="https://docs.confluent.io/kafka-connect-elasticsearch/current/configuration_options.html"&gt;Elasticsearch Sink Connector&lt;/a&gt; and use it to reindex data whenever needed.&lt;/p&gt;

&lt;p&gt;We built a set of abstractions that help us achieve that. To reindex data, we'd begin by creating connectors for a specific type of index. The following command would generate a collection of connectors with automatically generated names and configurations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eim connector create &lt;span class="nt"&gt;--indices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Promotion Feed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
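&lt;p&gt;Under the hood, each generated connector amounts to a configuration submitted to the Kafka Connect REST API. The following Ruby sketch builds such a configuration; the topic name, URL, and values are hypothetical, but the keys are documented Elasticsearch Sink Connector options:&lt;/p&gt;

```ruby
require 'json'

# Build a minimal Elasticsearch Sink Connector configuration.
# The keys are documented connector options; the values here are examples.
def connector_config(topic:, es_url:)
  {
    'connector.class' => 'io.confluent.connect.elasticsearch.ElasticsearchSinkConnector',
    'topics'          => topic,
    'connection.url'  => es_url,
    'key.ignore'      => 'false', # use the Kafka record key as the document _id
    'batch.size'      => '2000',  # documents per bulk request
    'linger.ms'       => '1000'   # how long to buffer records before sending
  }
end

puts JSON.pretty_generate(connector_config(topic: 'feed', es_url: 'http://localhost:9200'))
```

&lt;p&gt;Such a configuration would typically be submitted with a &lt;code&gt;PUT&lt;/code&gt; to the Kafka Connect REST API at &lt;code&gt;/connectors/{name}/config&lt;/code&gt;.&lt;/p&gt;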



&lt;p&gt;Once the connector has caught up with indexing, it can be tuned for idle indexing to reduce indexing pressure on the Elasticsearch cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eim connector update cluster-core-feed_20201217140000 &lt;span class="nt"&gt;--idle-indexing&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In case of an emergency, a connector can be paused.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eim connector pause cluster-core-feed_20201217140000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We use the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html"&gt;Elasticsearch aliases API&lt;/a&gt; to atomically switch aliases between indices. After carefully observing metrics and making sure that everything's in place, we'd run the following command to assign aliases to the new indices and remove them from the old ones.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eim &lt;span class="nb"&gt;alias &lt;/span&gt;assign &lt;span class="nt"&gt;--indices&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Promotion Feed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
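&lt;p&gt;Behind the eim command, the swap boils down to a single request to the &lt;code&gt;_aliases&lt;/code&gt; endpoint, where removing the alias from the old index and adding it to the new one happens atomically. A sketch of the payload (the index and alias names are hypothetical):&lt;/p&gt;

```ruby
require 'json'

# Build the _aliases payload for an atomic alias swap: both actions are
# applied in a single request, so the alias never points to zero indices.
def alias_swap_actions(alias_name:, old_index:, new_index:)
  {
    'actions' => [
      { 'remove' => { 'index' => old_index, 'alias' => alias_name } },
      { 'add'    => { 'index' => new_index, 'alias' => alias_name } }
    ]
  }
end

payload = alias_swap_actions(
  alias_name: 'feed',
  old_index:  'feed_20201101120000',
  new_index:  'feed_20201217140000'
)
puts JSON.generate(payload) # body for POST /_aliases
```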



&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;

&lt;p&gt;We have first-hand experience that Elasticsearch upgrades are not always successful and rolling back might not be possible at all. One of the key reasons Elasticsearch Index Management exists is the ability to easily test cluster and index settings against different Elasticsearch versions.&lt;br&gt;
To avoid the hassle of setting up multiple Elasticsearch versions locally, testing environments are containerized and consist of shared &lt;code&gt;docker-compose&lt;/code&gt; files. This setup allows us to create environments for development, testing, and experiments with very little effort. For instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker-compose &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-p&lt;/span&gt; stack-7 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-f&lt;/span&gt; docker/docker-compose.es-7-base.yml &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-f&lt;/span&gt; docker/docker-compose.es-7.yml &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-f&lt;/span&gt; docker/docker-compose.kafka-base.yml &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-f&lt;/span&gt; docker/docker-compose.kafka.yml &lt;span class="se"&gt;\&lt;/span&gt;
    up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At some point, we had to support Elasticsearch 5.x, 6.x, and 7.x. The following diagram shows what a single environment looks like. The CI build would create an isolated environment for every Elasticsearch version we support and run over 600 tests against each environment in parallel.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uHWlpdHg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://vinted.engineering/static/2021/02/08/testing-environment.png%3Fstyle%3Dcentered%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uHWlpdHg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://vinted.engineering/static/2021/02/08/testing-environment.png%3Fstyle%3Dcentered%3Fstyle%3Dcentered" alt="Testing environment"&gt;&lt;/a&gt;&lt;br&gt;Figure 1. Testing environment
  &lt;/p&gt;

&lt;h2&gt;
  
  
  Webhook listeners
&lt;/h2&gt;

&lt;p&gt;Kafka Connect connector tasks occasionally fail for various reasons, such as a network hiccup or an out-of-memory exception. Initially, we would restart them manually; however, that wasn't sustainable and we needed a better solution.&lt;/p&gt;

&lt;p&gt;Vinted uses &lt;a href="https://prometheus.io/docs/alerting/latest/alertmanager/"&gt;Alertmanager&lt;/a&gt; for infrastructure monitoring. Out of the box, it can send alerts to email and Slack; additionally, it allows configuring &lt;a href="https://prometheus.io/docs/alerting/latest/configuration/#webhook_config"&gt;generic webhook receivers&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We built such a receiver on top of Elasticsearch Index Management, deployed it, and made it part of the Alertmanager configuration. Now, when a connector task fails, Alertmanager sends an HTTP request to restart it.&lt;/p&gt;
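&lt;p&gt;The core of such a receiver can be sketched as follows: given an Alertmanager webhook payload, collect the restart requests to send to the Kafka Connect REST API. The &lt;code&gt;connector&lt;/code&gt; and &lt;code&gt;task&lt;/code&gt; label names are hypothetical; the payload shape follows Alertmanager's webhook format and the restart endpoint follows the Kafka Connect REST API:&lt;/p&gt;

```ruby
require 'json'

# Given an Alertmanager webhook payload, build the Kafka Connect REST API
# URLs of the failed connector tasks that should be restarted.
# The 'connector' and 'task' alert labels are assumptions, not standard.
def restart_urls(payload, connect_url:)
  JSON.parse(payload)
      .fetch('alerts', [])
      .select { |alert| alert['status'] == 'firing' }
      .map do |alert|
        labels = alert.fetch('labels', {})
        "#{connect_url}/connectors/#{labels['connector']}/tasks/#{labels['task']}/restart"
      end
end

payload = JSON.generate(
  'alerts' => [
    { 'status' => 'firing',
      'labels' => { 'connector' => 'cluster-core-feed_20201217140000', 'task' => '3' } },
    { 'status' => 'resolved',
      'labels' => { 'connector' => 'cluster-core-promotion_20201217140000', 'task' => '0' } }
  ]
)
puts restart_urls(payload, connect_url: 'http://connect.example.com:8083')
```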

&lt;h2&gt;
  
  
  Wrap up
&lt;/h2&gt;

&lt;p&gt;One year later, Elasticsearch Index Management has become a vital part of our job. Before that, everything from cluster upgrades to a simple change of index configuration was an adventure. Now it's predictable, automated, and boring. During one year of development, we caught most of the edge cases, ironed out the most annoying bugs, and built ourselves a companion that lets us stay productive and look after our Elasticsearch estate with ease.&lt;/p&gt;

</description>
      <category>vinted</category>
      <category>elasticsearch</category>
      <category>scaling</category>
    </item>
    <item>
      <title>Vinted Search Scaling Chapter 2: Routing Elasticsearch requests with srouter</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Fri, 29 Jan 2021 09:02:15 +0000</pubDate>
      <link>https://dev.to/vinted/vinted-search-scaling-chapter-2-routing-elasticsearch-requests-with-srouter-2jgn</link>
      <guid>https://dev.to/vinted/vinted-search-scaling-chapter-2-routing-elasticsearch-requests-with-srouter-2jgn</guid>
      <description>&lt;p&gt;It began with Elasticsearch cluster upgrades. Every major Elasticsearch version brings new features, free performance, compatibility issues, and uncertainty.&lt;br&gt;
In addition, we had an itch to dissect requests, join each request with its response for analysis, record queries for stress testing, and account for every request per second.&lt;br&gt;
With these requirements in mind, we built a &lt;a href="https://docs.microsoft.com/en-us/azure/architecture/patterns/sidecar"&gt;sidecar&lt;/a&gt; component next to the primary application.&lt;/p&gt;

&lt;p&gt;The leftmost component is the core application, which is not aware that search requests are being handled by multiple Elasticsearch clusters.&lt;br&gt;
The next component is a &lt;a href="https://docs.microsoft.com/en-us/azure/architecture/patterns/sidecar"&gt;sidecar&lt;/a&gt; service written in the &lt;a href="https://www.rust-lang.org/"&gt;Rust&lt;/a&gt; programming language, which we call srouter.&lt;/p&gt;




  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sLd1O-Pv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://vinted.engineering/static/scaling-elasticsearch/srouter.png%3Fstyle%3Dcentered" alt="srouter routing scheme"&gt;Figure 1. srouter
  




&lt;p&gt;The srouter sidecar gives us fine-grained control over routing search requests to multiple Elasticsearch clusters, which allows us to test and compare different Elasticsearch versions using the real-time production workload without impacting production.&lt;br&gt;
The service is configured in a simple and readable way; configuration may be granular down to a specific HTTP request path,&lt;br&gt;
or &lt;a href="https://www.regular-expressions.info/repeat.html"&gt;greedy&lt;/a&gt;, as required by the use case.&lt;br&gt;
A simplified configuration sample can be viewed below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;--------&lt;/span&gt;
&lt;span class="na"&gt;whitelist_ends_with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/_count&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/_search&lt;/span&gt;
&lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s"&gt;main-items/_search&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:1000&lt;/span&gt;
    &lt;span class="na"&gt;mirror&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:1010&lt;/span&gt;
    &lt;span class="na"&gt;amplify&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
    &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;311&lt;/span&gt;

  &lt;span class="s"&gt;lt-main-items/_count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:1100&lt;/span&gt;
    &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
    &lt;span class="na"&gt;sampling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
    &lt;span class="na"&gt;storage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;http://dc1-host.example.com:8082/topics/storage&lt;/span&gt;

  &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:2000&lt;/span&gt;
    &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At the time of writing, we have 10 custom routes defined in production; it takes up to 3 minutes to change any route, and most of that time is spent in deployment.&lt;br&gt;
Configuration is kept in a git repository to keep track of changes. In the configuration sample, we have a &lt;a href="https://www.regular-expressions.info/repeat.html"&gt;greedy&lt;/a&gt; &lt;code&gt;main-items/_search&lt;/code&gt; route with mirror traffic amplified 4x.&lt;br&gt;
The second route, &lt;code&gt;lt-main-items/_count&lt;/code&gt;, has a timeout of 100ms, and 10 percent of its requests are sampled to &lt;a href="https://docs.confluent.io/platform/current/kafka-rest/index.html"&gt;Kafka&lt;/a&gt;.&lt;br&gt;
Unmatched requests fall back to the &lt;code&gt;default&lt;/code&gt; route. Srouter is picky about the HTTP method used: the service allows HTTP GET requests on any path, while HTTP POST requests are white-listed; in any other case the service returns a &lt;a href="https://httpstatuses.com/405"&gt;405: method not allowed&lt;/a&gt; status.&lt;br&gt;
The HTTP method limits protect against accidental index removal and direct indexing, and serve as a gateway for engineers to "safely" access Elasticsearch.&lt;br&gt;
This configuration flexibility lets us tame search scale by shaping Elasticsearch clusters dedicated to handling specific types of requests.&lt;/p&gt;
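&lt;p&gt;The routing rules described above can be sketched as a simplified model (srouter itself is written in Rust; this Ruby sketch assumes exact-path route keys and ignores mirroring, amplification, and sampling):&lt;/p&gt;

```ruby
# Simplified model of srouter's route resolution: a request is matched
# against the configured routes and falls back to the default route;
# POST requests are only allowed on white-listed path suffixes.
ROUTES = {
  'main-items/_search'   => { main: ['http://127.0.0.1:1000'], timeout: 311 },
  'lt-main-items/_count' => { main: ['http://127.0.0.1:1100'], timeout: 100 },
  'default'              => { main: ['http://127.0.0.1:2000'], timeout: 200 }
}.freeze

WHITELIST_ENDS_WITH = ['/_count', '/_search'].freeze

def resolve(method, path)
  if method == 'POST' && WHITELIST_ENDS_WITH.none? { |suffix| path.end_with?(suffix) }
    return :method_not_allowed # 405 for POST outside the whitelist
  end

  ROUTES.fetch(path, ROUTES['default'])
end

puts resolve('POST', 'main-items/_search')[:timeout] # dedicated route
puts resolve('GET', 'other-items/_search')[:timeout] # default route
puts resolve('POST', 'main-items/_delete_by_query')  # rejected method
```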

&lt;p&gt;Route configuration is updated seamlessly, without service interruption, using the &lt;a href="https://en.wikipedia.org/wiki/SIGHUP"&gt;SIGHUP&lt;/a&gt; signal.&lt;br&gt;
After receiving the signal, the service applies the update through an &lt;a href="https://doc.rust-lang.org/std/sync/atomic/struct.AtomicPtr.html"&gt;atomic pointer&lt;/a&gt;:&lt;br&gt;
the pointer is swapped with &lt;a href="https://en.cppreference.com/w/cpp/atomic/memory_order#Relaxed_ordering"&gt;relaxed memory ordering&lt;/a&gt; to point to the newly built &lt;a href="https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html"&gt;HashMap&lt;/a&gt; of routes.&lt;br&gt;
Srouter is an &lt;a href="https://en.wikipedia.org/wiki/OSI_model"&gt;OSI layer 7&lt;/a&gt; router built on top of the low-level &lt;a href="https://hyper.rs"&gt;hyper&lt;/a&gt; server and the &lt;a href="https://github.com/tokio-rs/tokio"&gt;tokio&lt;/a&gt; asynchronous runtime.&lt;br&gt;
As Rust is famous for performance, this service is no different: it can serve more than 500k requests per second while handling 200 route updates every 10ms.&lt;br&gt;
A single srouter instance's production resource footprint is one tokio core thread per CPU and a total of 124 megabytes of virtual memory; a large part of that memory (97%) is used by the &lt;a href="https://prometheus.io/"&gt;Prometheus metrics&lt;/a&gt; the service exposes.&lt;/p&gt;

&lt;p&gt;Metrics provided by the service are &lt;a href="https://prometheus.io/docs/practices/naming/"&gt;labeled&lt;/a&gt; per request path.&lt;br&gt;
Centralizing the labeling in the sidecar service makes the primary application simpler.&lt;br&gt;
Labeled metrics provide a base understanding of the frequency and volume of queries; they help to detect bad requests before production, measure latency percentiles per request path, and track the exact number of requests.&lt;br&gt;
In addition to metrics, we sample a portion of queries and all unsuccessful responses.&lt;br&gt;
Sampled requests are enriched with the request and response headers, the time it took srouter to complete the request, and the Elasticsearch response.&lt;br&gt;
The stored request/response combinations allowed us to deprecate Elasticsearch &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-slowlog.html"&gt;slow query logs&lt;/a&gt;; the change reduced I/O activity and improved our population sample of queries.&lt;br&gt;
The sample is improved because we now uniformly capture not only "slow" queries (where slow is defined by an internal ES measure that depends on many factors, such as search request queuing, indexing load, GC, etc.) but also fast and even failed queries.&lt;/p&gt;

&lt;p&gt;Srouter proved to be a critical service to our planned scaling strategy. &lt;cite&gt;Anthony Williams in C++ Concurrency in Action[1]&lt;/cite&gt; defines scalability as "reducing the time it takes to perform an action (reduce time) or increasing the amount of data that can be processed in a given time (increase throughput) as more processors are added."&lt;br&gt;
We have summarized this in two simple &lt;a href="https://en.wikipedia.org/wiki/Scale_cube"&gt;scaling cubes&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Lfq007Yg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://vinted.engineering/static/scaling-elasticsearch/scaling_cubes.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Lfq007Yg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://vinted.engineering/static/scaling-elasticsearch/scaling_cubes.png%3Fstyle%3Dcentered" alt="scaling cubes"&gt;&lt;/a&gt;&lt;br&gt;Figure 2. Scaling cubes
  &lt;/p&gt;



&lt;p&gt;Future chapters will explore how we are using "scaling cubes" to scale Elasticsearch.&lt;br&gt;
Liked what you have read? We’re currently hiring Search Engineers, find out more &lt;a href="http://bit.ly/3mGXnBE"&gt;here&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;Reference:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[1] C++ Concurrency in Action by Anthony Williams, &lt;a href="https://www.manning.com/books/c-plus-plus-concurrency-in-action"&gt;https://www.manning.com/books/c-plus-plus-concurrency-in-action&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Hands-On Concurrency with Rust: Confidently build memory-safe, parallel, and efficient software in Rust by Brian L. Troutwine &lt;a href="https://www.amazon.com/Hands-Concurrency-Rust-Confidently-memory-safe/dp/1788399978"&gt;https://www.amazon.com/Hands-Concurrency-Rust-Confidently-memory-safe/dp/1788399978&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Tokio &lt;a href="https://github.com/tokio-rs/tokio"&gt;https://github.com/tokio-rs/tokio&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Hyper &lt;a href="https://hyper.rs"&gt;https://hyper.rs&lt;/a&gt;, &lt;a href="https://github.com/hyperium/hyper"&gt;https://github.com/hyperium/hyper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Scaling cubes: &lt;a href="https://en.wikipedia.org/wiki/Scale_cube"&gt;https://en.wikipedia.org/wiki/Scale_cube&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Greedy regular expressions: &lt;a href="https://www.regular-expressions.info/repeat.html"&gt;https://www.regular-expressions.info/repeat.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>vinted</category>
      <category>elasticsearch</category>
      <category>rust</category>
    </item>
    <item>
      <title>Vinted Search Scaling Chapter 1: Indexing</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Mon, 18 Jan 2021 15:19:27 +0000</pubDate>
      <link>https://dev.to/vinted/vinted-search-scaling-chapter-1-indexing-4ii1</link>
      <guid>https://dev.to/vinted/vinted-search-scaling-chapter-1-indexing-4ii1</guid>
      <description>&lt;p&gt;Data ingestion into Elasticsearch at scale is hard. In this post, I’m going to share a success story of leveraging the Kafka Connect Elasticsearch Sink Connector to support the ever-increasing usage of the Vinted platform and to help our search engineers to ship features quickly all while having an operational and reliable system.&lt;/p&gt;

&lt;h1&gt;
  
  
  Prehistory
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Several times postponed and endless times patched with magic workarounds like future jobs, delta jobs, deferred deduplication, delta indexing, etc. - it is time to rethink this solution.&lt;/p&gt;

&lt;p&gt;--- &lt;cite&gt;Ernestas Poškus, Search SRE Team Lead&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Scope of the Post
&lt;/h1&gt;

&lt;p&gt;The job of the indexing pipeline is to bridge the gap between the primary datastore that is &lt;a href="https://www.mysql.com/"&gt;MySQL&lt;/a&gt; and the search indices backed by &lt;a href="https://www.elastic.co/what-is/elasticsearch"&gt;Elasticsearch&lt;/a&gt;. For the building blocks of the indexing pipeline, we picked &lt;a href="https://kafka.apache.org/"&gt;Apache Kafka&lt;/a&gt; and &lt;a href="https://kafka.apache.org/documentation/#connect"&gt;Kafka Connect&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wujTFj5f--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://engineering.vinted.com/static/2021/01/12/KC.png%3Fstyle%3Dcentered" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wujTFj5f--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://engineering.vinted.com/static/2021/01/12/KC.png%3Fstyle%3Dcentered" alt="architecture of the indexing pipeline"&gt;&lt;/a&gt;&lt;br&gt;Figure 1. Architecture of the indexing pipeline
  &lt;/p&gt;



&lt;p&gt;The term &lt;strong&gt;Async Jobs&lt;/strong&gt; in the chart above refers to a collection of Ruby scripts that periodically check the database for updates that happened during the last several minutes. Updates are then sent to the appropriate Kafka topics.&lt;/p&gt;

&lt;p&gt;We have a separate Kafka cluster for handling the search use cases. Kafka serves as a high-performance, durable, and fault-tolerant buffer.&lt;/p&gt;

&lt;p&gt;Kafka Connect is a scalable and reliable tool for streaming data between Apache Kafka and other systems. It allows you to quickly define connectors that move data into and out of Kafka. Luckily for us, there is an open-source &lt;a href="https://github.com/confluentinc/kafka-connect-elasticsearch"&gt;connector&lt;/a&gt; that sends data from Kafka topics to Elasticsearch indices.&lt;/p&gt;

&lt;p&gt;From now on, this post will focus on the details of our Kafka Connect deployment: how we use the system, and its properties. Also, we'll share the battle stories and lessons we learned while implementing the solution.&lt;/p&gt;

&lt;h1&gt;
  
  
  Requirements of the Data Indexing Pipeline
&lt;/h1&gt;

&lt;p&gt;We started the project with a list of requirements. The following list captures the most important ones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Capturing the ordered set of data indexing operations (to act as a source-of-truth);&lt;/li&gt;
&lt;li&gt;Scalability and high-availability;&lt;/li&gt;
&lt;li&gt;Ensuring the delivery and consistency of the data;&lt;/li&gt;
&lt;li&gt;Indexing data to multiple Elasticsearch clusters that are of different versions;&lt;/li&gt;
&lt;li&gt;A programmable interface that allows to control the indexing pipeline;&lt;/li&gt;
&lt;li&gt;Support of manual throttling and pausing of the indexing;&lt;/li&gt;
&lt;li&gt;Monitoring and alerting;&lt;/li&gt;
&lt;li&gt;Handling indexing errors and data inconsistencies;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We researched several available open-source tools for the indexing pipeline. We also considered implementing such a system ourselves from scratch. After the evaluation, Kafka Connect seemed to be the way to go.&lt;/p&gt;

&lt;h2&gt;
  
  
  Capturing the ordered set of updates to the Vinted Catalogue
&lt;/h2&gt;

&lt;p&gt;We expect to store the data indexing requests for as long as needed. To achieve that, we set &lt;a href="https://kafka.apache.org/documentation/#topicconfigs_retention.ms"&gt;&lt;code&gt;retention.ms&lt;/code&gt;&lt;/a&gt; to &lt;code&gt;-1&lt;/code&gt; for all the relevant topics, i.e. we instructed Kafka not to delete old records. But what about the storage requirements of the Kafka cluster if we don't delete anything? We leverage Kafka's &lt;em&gt;log compaction&lt;/em&gt;&lt;sup id="fnref1"&gt;1&lt;/sup&gt; feature. Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition. In other words, Kafka will delete old records with the same key in the background. This strategy allows us to "compact" all the updates of a catalog item into one update request.&lt;/p&gt;

&lt;p&gt;Kafka topics contain records where the record key is an ID of the Elasticsearch document, and the record body is the &lt;code&gt;_source&lt;/code&gt; of the Elasticsearch document. The body is stored as a JSON string. Note that the record body is always a &lt;em&gt;full document&lt;/em&gt;, meaning that we don't support partial updates to the Elasticsearch documents. There is a special case when the Kafka record body is &lt;code&gt;null&lt;/code&gt;. Such messages are called &lt;strong&gt;tombstone&lt;/strong&gt; messages. Tombstone messages are used to delete documents from the Elasticsearch indices.&lt;/p&gt;
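&lt;p&gt;The record model above can be sketched as follows: replaying a stream of (key, body) records, where the latest body for a key wins and a &lt;code&gt;null&lt;/code&gt; body (a tombstone) deletes the document. This mirrors what log compaction preserves, since only the last record per key matters (the records here are made up):&lt;/p&gt;

```ruby
# Replay a stream of (id, body) records into a document store: each body
# is a full document (no partial updates) and a nil body is a tombstone
# that deletes the document.
def apply_records(records)
  records.each_with_object({}) do |(id, body), index|
    if body.nil?
      index.delete(id) # tombstone: remove the document
    else
      index[id] = body # full overwrite: the last record per key wins
    end
  end
end

records = [
  [1, { 'title' => 'blue jacket' }],
  [2, { 'title' => 'red scarf' }],
  [1, { 'title' => 'blue denim jacket' }], # later update replaces item 1
  [2, nil]                                 # tombstone deletes item 2
]
p apply_records(records)
```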

&lt;h2&gt;
  
  
  Scalability and High-availability
&lt;/h2&gt;

&lt;p&gt;Kafka Connect is scalable. It has a concept of a &lt;code&gt;worker&lt;/code&gt;, a JVM process connected to the Kafka cluster, and a Kafka Connect cluster can have many workers. Each connector is backed by a Kafka consumer group, which can have multiple consumer instances. These consumer instances are called &lt;code&gt;tasks&lt;/code&gt;, and tasks are spread across the workers in the Kafka Connect cluster. However, concurrency is limited by the partition count of the source topics: it is not recommended to have more tasks per connector than the source topics have partitions.&lt;/p&gt;

&lt;p&gt;On top of that, the Elasticsearch connector supports multiple in-flight requests to Elasticsearch to increase concurrency. The amount of data sent can be configured with the &lt;code&gt;batch.size&lt;/code&gt; parameter. How often the data is sent can be configured with the &lt;code&gt;linger.ms&lt;/code&gt; parameter.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fJyxN9gx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://engineering.vinted.com/static/2021/01/12/kc-scalability.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fJyxN9gx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://engineering.vinted.com/static/2021/01/12/kc-scalability.png" alt="scalibility of kafka connect"&gt;&lt;/a&gt;&lt;br&gt;Figure 2. Scalability schema of Kafka Connect
  &lt;/p&gt;



&lt;p&gt;It is worth mentioning that a single connector can only index data into one Elasticsearch cluster. To support multiple Elasticsearch clusters, you simply create multiple connectors.&lt;/p&gt;

&lt;p&gt;High availability is ensured by deploying Kafka Connect workers over multiple servers, so that each task of the connector fails independently. An advantage of this is that failed tasks can be restarted one-by-one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Elasticsearch Client Node Bug
&lt;/h3&gt;

&lt;p&gt;We used Elasticsearch client nodes to load-balance indexing requests into the Elasticsearch clusters. However, we occasionally had to kill and restart client nodes because they failed with an error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"root_cause"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"circuit_breaking_exception"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"[parent] Data too large, data for [&amp;lt;http_request&amp;gt;] would be [33132540088/30.8gb], which is larger than the limit of [31653573427/29.4gb], real usage: [33132540088/30.8gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=739583758/705.3mb, accounting=0/0b]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"bytes_wanted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;33132540088&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"bytes_limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;31653573427&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"durability"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TRANSIENT"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"circuit_breaking_exception"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"[parent] Data too large, data for [&amp;lt;http_request&amp;gt;] would be [33132540088/30.8gb], which is larger than the limit of [31653573427/29.4gb], real usage: [33132540088/30.8gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=739583758/705.3mb, accounting=0/0b]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bytes_wanted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;33132540088&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bytes_limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;31653573427&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"durability"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TRANSIENT"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The error above looks as though we simply sent too much data to Elasticsearch at once. Normally, this can be easily mitigated, for example by reducing the &lt;code&gt;batch.size&lt;/code&gt; parameter in the connector or decreasing indexing concurrency. However, the error still didn't disappear. After some investigation we discovered that the problem was being caused by a &lt;a href="https://github.com/elastic/elasticsearch/pull/55404"&gt;bug in Elasticsearch client nodes&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our workaround was to send indexing requests directly to the data nodes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling Data Consistency
&lt;/h2&gt;

&lt;p&gt;To ensure that Elasticsearch contains correct data, we leveraged Elasticsearch's &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/optimistic-concurrency-control.html"&gt;optimistic concurrency control mechanism&lt;/a&gt; combined with the &lt;a href="https://docs.confluent.io/kafka-connect-elasticsearch/current/index.html#features"&gt;at-least-once&lt;/a&gt; delivery guarantees of Kafka Connect and the fact that &lt;a href="https://kafka.apache.org/documentation/#intro_concepts_and_terms"&gt;Kafka ensures message ordering per topic partition&lt;/a&gt; when records have an ID. The trick is to use the Kafka record offset as the &lt;code&gt;_version&lt;/code&gt; of the Elasticsearch document and set the &lt;code&gt;version_type&lt;/code&gt; parameter to &lt;code&gt;external&lt;/code&gt; for indexing operations. Handling of the details falls on Kafka Connect and Elasticsearch.&lt;/p&gt;
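&lt;p&gt;A minimal sketch of the idea, in plain Python rather than the connector's actual code: treating the Kafka offset as an external document version makes redelivered or out-of-order writes idempotent, because a write carrying a version lower than or equal to the stored one is rejected.&lt;/p&gt;

```python
# Illustrative simulation of Elasticsearch's external versioning check.
# Each write carries the Kafka record offset as the document _version;
# a write is accepted only if its version is higher than the stored one.

def apply_write(index, doc_id, doc, offset):
    """Apply a write only if `offset` is newer than the stored version."""
    current = index.get(doc_id)
    if current is not None and offset <= current["version"]:
        # Stale write (e.g. an at-least-once redelivery) is rejected.
        return False
    index[doc_id] = {"doc": doc, "version": offset}
    return True

index = {}
assert apply_write(index, "item-1", {"title": "coat"}, offset=10)
# A redelivered duplicate with the same offset is a no-op:
assert not apply_write(index, "item-1", {"title": "coat"}, offset=10)
# A newer record wins:
assert apply_write(index, "item-1", {"title": "wool coat"}, offset=11)
```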

&lt;h3&gt;
  
  
  Elasticsearch Connector Bug
&lt;/h3&gt;

&lt;p&gt;While testing Kafka Connect we discovered that it is still possible to end up with inconsistent data in Elasticsearch. The problem was that the connector did not use the record offset as the &lt;code&gt;_version&lt;/code&gt; for delete operations. We reported the bug in the &lt;a href="https://github.com/confluentinc/kafka-connect-elasticsearch/pull/422"&gt;Elasticsearch sink connector repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The bug affected setups where Kafka records have non-null keys and multiple indexing requests with data from the same partition are sent in parallel, each representing either an index or a delete operation. In such cases, Elasticsearch can end up either having data that it should not have (e.g. an item is sold but still discoverable) or missing data that it should have (e.g. an item is not in the catalogue when it should be).&lt;/p&gt;

&lt;p&gt;We patched the connector and contributed the fix upstream. As of this writing, we are maintaining and deploying a fork of the connector ourselves while we wait for the Elasticsearch connector developers to accept our pull request and release a new version with the fix.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: the issue seems to be fixed, but the fix is not yet released. However, it was done not by accepting our pull request, but by switching to a different library to handle the &lt;a href="https://github.com/confluentinc/kafka-connect-elasticsearch/pull/468"&gt;Elasticsearch indexing&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Corner Case Adventure
&lt;/h3&gt;

&lt;p&gt;On a regular day, the described setup is worry-free from the data consistency point of view. But what happens when an engineer increases the number of partitions of Kafka topics for the sake of better throughput in the Kafka cluster? Data inconsistencies happen. The reason is that the routing of records to partitions changes: records (most likely) end up in a different partition and, even more importantly, with smaller offsets. From the Elasticsearch perspective the new records are "older" than the actually older records, i.e. updates to the index are not applied.&lt;/p&gt;
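&lt;p&gt;An illustrative sketch of why this happens (using a toy hash, not Kafka's real murmur2 partitioner): keyed records are routed with &lt;code&gt;hash(key) % num_partitions&lt;/code&gt;, so changing the partition count changes the routing, and the record's offsets restart in the new partition.&lt;/p&gt;

```python
# Toy stand-in for Kafka's keyed partitioner (the default uses murmur2,
# but any deterministic hash shows the effect).

def partition_for(key, num_partitions):
    return sum(key.encode()) % num_partitions

key = "item-42"
before = partition_for(key, 6)   # partition with 6 partitions
after = partition_for(key, 10)   # partition after scaling to 10

# With a different partition count the record can land on a different
# partition, whose offsets start again from a lower number, so the new
# "version" may be smaller than the one already stored in Elasticsearch
# and the update is silently rejected.
assert before != after
```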

&lt;p&gt;Unfortunately, the described situation happened in our setup. Fixing the issue required reingesting &lt;strong&gt;all&lt;/strong&gt; the data into new topics (because the current topics were "corrupted") and reindexing the data into Elasticsearch. Instead of going down this path, we used this "opportunity" to upgrade the Kafka cluster to a newer version. The benefit of this approach was that the topic names could remain unchanged, which required no changes in our indexing management code. On the flip side, we had to maintain two Kafka clusters during the transition period.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling Failures
&lt;/h2&gt;

&lt;p&gt;Out of the various &lt;a href="https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues/"&gt;options to handle errors with Kafka Connect&lt;/a&gt;, we opted to continue processing despite errors, while manually investigating them asynchronously in the background, to minimize the impact on the indexing pipeline.&lt;/p&gt;

&lt;p&gt;In case of an error, the idea is to send the faulty Kafka record to a separate &lt;a href="https://en.wikipedia.org/wiki/Dead_letter_queue"&gt;Dead Letter Queue (DLQ) topic&lt;/a&gt; with all the details in the Kafka message record&lt;sup id="fnref2"&gt;2&lt;/sup&gt; header, log the problem, and, when more than a couple of errors happen within a couple of minutes, send an error message to Slack. It is said that &lt;a href="https://en.wikipedia.org/wiki/A_picture_is_worth_a_thousand_words"&gt;a picture is worth a thousand words&lt;/a&gt;, so let's visualize that longish sentence with a neat diagram:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BKsRNRXY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://engineering.vinted.com/static/2021/01/12/kc-error-handling.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BKsRNRXY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://engineering.vinted.com/static/2021/01/12/kc-error-handling.png" alt="error handling with kafka connect"&gt;&lt;/a&gt;&lt;br&gt;Figure 2. Error handling
  &lt;/p&gt;



&lt;p&gt;Which errors are handled by the process above? Two cases: (1) when the data is in the wrong format (e.g. the record body is not valid JSON) and (2) when &lt;a href="https://www.confluent.io/blog/kafka-connect-single-message-transformation-tutorial-with-examples/"&gt;single message transforms (SMT)&lt;/a&gt; are failing. A notable exception is that Kafka Connect doesn't handle errors that happen when sending requests to the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html"&gt;Elasticsearch bulk API&lt;/a&gt;, i.e. if Elasticsearch rejects a request, the errors are not handled as described above, only logged. Kafka Connect logging is set up to send logs to the common logging infrastructure that is described in excellent detail in the post &lt;a href="https://engineering.vinted.com/2020/12/07/log-management-overview/"&gt;"One Year of Log Management at Vinted"&lt;/a&gt;.&lt;/p&gt;
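&lt;p&gt;The handled flow can be sketched roughly as follows (a simplified Python sketch with assumed names and a plain error-count threshold instead of a time window, not Kafka Connect internals):&lt;/p&gt;

```python
# Illustrative sketch of tolerate-and-route error handling: process each
# record; on failure, forward it to a DLQ with the error details attached
# as a header, and alert once the error count crosses a threshold.

def process_records(records, transform, dlq, alert, max_errors=2):
    errors = 0
    for record in records:
        try:
            transform(record)  # e.g. parsing and SMTs
        except Exception as exc:
            errors += 1
            # Keep the failure context together with the record itself.
            dlq.append({"record": record, "headers": {"error": str(exc)}})
            if errors > max_errors:
                alert(f"{errors} indexing errors, see DLQ")

def transform(record):
    if record.get("body") is None:
        raise ValueError("record body is not valid JSON")

dlq, alerts = [], []
records = [{"body": "{}"}, {"body": None}, {"body": "{}"}]
process_records(records, transform, dlq, alerts.append)
assert len(dlq) == 1 and "not valid JSON" in dlq[0]["headers"]["error"]
```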

&lt;p&gt;A snippet of relevant settings from the connector configuration&lt;sup id="fnref3"&gt;3&lt;/sup&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"errors.tolerance": "all",
"errors.log.enable": "true",
"errors.log.include.messages": "true",
"errors.deadletterqueue.topic.replication.factor": "3",
"errors.deadletterqueue.topic.name": "dlq_kafka_connect",
"errors.deadletterqueue.context.headers.enable": "true",
"behavior.on.malformed.documents": "warn",
"drop.invalid.message": "true",
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Future tasks
&lt;/h1&gt;

&lt;p&gt;The solution to synchronize data between MySQL and Kafka begs to be improved. For example, we could get rid of the Ruby scripts by leveraging Change Data Capture (CDC) and stream changes directly from MySQL to Kafka using another Kafka Connect connector called &lt;a href="https://debezium.io/"&gt;Debezium&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Running Elasticsearch, Kafka, Kafka Connect, etc. on a workstation requires a lot of CPU and RAM, and Kafka Connect is the most resource-intensive of the bunch. Resource consumption gets problematic when the components run on a MacBook in Docker containers: the battery drains fast, the machine runs hot, the fans get loud, etc. To mitigate this, we will try to run Kafka Connect compiled to a native image using &lt;a href="https://www.graalvm.org/"&gt;GraalVM&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrapping Up
&lt;/h1&gt;

&lt;p&gt;As of now, the data indexing pipeline is one of the most boring and least problematic parts of the search stack at Vinted. And it should be. The reliability and scalability of Kafka Connect allow our engineers to be agile and focus on working on Vinted's mission of making second-hand fashion the first choice.&lt;/p&gt;

&lt;h1&gt;
  
  
  Footnotes
&lt;/h1&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;&lt;a href="https://kafka.apache.org/documentation/#compaction"&gt;Kafka log compaction&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;We also use &lt;a href="https://github.com/confluentinc/kafka-rest"&gt;Kafka REST Proxy&lt;/a&gt; and, unfortunately, headers with failure details can't be fetched through it because &lt;a href="https://github.com/confluentinc/kafka-rest/issues/698"&gt;Kafka REST Proxy does not support Record headers&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;For a detailed description of the settings, consult the documentation &lt;a href="https://docs.confluent.io/current/installation/configuration/connect/sink-connect-configs.html"&gt;here&lt;/a&gt; and &lt;a href="https://docs.confluent.io/current/connect/kafka-connect-elasticsearch/configuration_options.html"&gt;here&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>vinted</category>
      <category>elasticsearch</category>
      <category>kafka</category>
      <category>kafkaconnect</category>
    </item>
    <item>
      <title>Unbearably slow API gateway calls between Azure App Services</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Tue, 17 Sep 2019 19:08:36 +0000</pubDate>
      <link>https://dev.to/buinauskas/unbearably-slow-api-gateway-calls-on-azure-app-service-3c2g</link>
      <guid>https://dev.to/buinauskas/unbearably-slow-api-gateway-calls-on-azure-app-service-3c2g</guid>
      <description>&lt;p&gt;Hello everyone,&lt;/p&gt;

&lt;p&gt;I'd like to ask for help debugging ASP.NET Core Web API.&lt;/p&gt;

&lt;p&gt;That's the situation:&lt;/p&gt;

&lt;p&gt;We've got an authorization service that exposes multiple RESTful APIs to check whether an authenticated user can access certain business objects. In my case the business object is a market, and markets are authorized by project. So an API call looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET https://authorization.service.com/markets?project=123456
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A successful response will return an array of markets; an unauthorized response will return a 403 status code.&lt;/p&gt;

&lt;p&gt;It all works great when I call it through Swagger or from Postman: responses are quick and come back in tens of milliseconds.&lt;/p&gt;

&lt;p&gt;Now my other service calls this service in an authorization filter and, based on the response, either continues or responds with Forbidden.&lt;/p&gt;

&lt;p&gt;I've implemented a reusable client for HTTP calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ApiClient&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IApiClient&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;_client&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;JsonSerializer&lt;/span&gt; &lt;span class="n"&gt;_jsonSerializer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ApiClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;JsonSerializer&lt;/span&gt; &lt;span class="n"&gt;jsonSerializer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_client&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ArgumentNullException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nameof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;_jsonSerializer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jsonSerializer&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ArgumentNullException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nameof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jsonSerializer&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;GetDataAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt;
    &lt;span class="err"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;var&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;CreateRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;SendAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HttpCompletionOption&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseHeadersRead&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsSuccessStatusCode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;responseStream&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ReadAsStreamAsync&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;streamReader&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StreamReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;responseStream&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;jsonTextReader&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;JsonTextReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;streamReader&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_jsonSerializer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Deserialize&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;jsonTextReader&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="n"&gt;HttpRequestMessage&lt;/span&gt; &lt;span class="nf"&gt;CreateRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;locationUri&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;HttpRequestMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;HttpMethod&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;locationUri&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Authorization"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;$"Bearer &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My authorization filter looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AuthorizeProjectFilter&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IAsyncActionFilter&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IApiClient&lt;/span&gt; &lt;span class="n"&gt;_apiClient&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;IMarketsContext&lt;/span&gt; &lt;span class="n"&gt;_marketsContext&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;AuthorizeProjectFilter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IApiClient&lt;/span&gt; &lt;span class="n"&gt;apiClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IMarketsContext&lt;/span&gt; &lt;span class="n"&gt;marketsContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;_apiClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;apiClient&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ArgumentNullException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nameof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apiClient&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;_marketsContext&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;marketsContext&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ArgumentNullException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nameof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marketsContext&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt; &lt;span class="nf"&gt;OnActionExecutionAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ActionExecutingContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ActionExecutionDelegate&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HttpContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryGetValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nameof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RequestBase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Project&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="k"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HttpContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;TryGetValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Authorization"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;out&lt;/span&gt; &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;authorizationHeader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;project&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;accessToken&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;authorizationHeader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;Split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;" "&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;markets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_apiClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetDataAsync&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IEnumerable&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Market&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;(&lt;/span&gt;&lt;span class="s"&gt;$"&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;_authorizationServiceOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/markets?project=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;authorized&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;markets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;authorized&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;ForbidResult&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;_marketsContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Markets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;markets&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And this works great locally: requests are quick when I call the service. However, when I deploy it to an Azure App Service, responses take 20+ seconds, sometimes more than a minute. The App Service plan also isn't the cheapest one, so it should be fast.&lt;/p&gt;

&lt;p&gt;That doesn't seem like a cold start issue because I can keep on calling the service - it still remains slow.&lt;/p&gt;

&lt;p&gt;I've run a trace on Azure diagnostics and got the following:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PJWcZefs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/5f3xh1ith5nty218am1f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PJWcZefs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/5f3xh1ith5nty218am1f.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Seems like most of the time is spent on &lt;code&gt;BLOCKED_TIME&lt;/code&gt; within &lt;code&gt;system.net.http&lt;/code&gt; and I'm literally lost here.&lt;/p&gt;

&lt;p&gt;Tracing has also caught some exceptions with the following message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;An attempt was made to access a socket in a way forbidden by its access permissions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also my &lt;code&gt;HttpClient&lt;/code&gt; is added using &lt;code&gt;IHttpClientFactory&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;services&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddHttpClient&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;IApiClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ApiClient&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultRequestHeaders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Accept"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"application/json"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ConfigurePrimaryHttpMessageHandler&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;HttpClientHandler&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;UseProxy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm literally lost now; I couldn't find anything on Google that would've helped me.&lt;/p&gt;

&lt;p&gt;Please help :)&lt;/p&gt;

</description>
      <category>help</category>
      <category>azure</category>
      <category>dotnet</category>
      <category>csharp</category>
    </item>
    <item>
      <title>What are the most common tools for data pre-calculation and aggregation?</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Sun, 21 Oct 2018 16:41:10 +0000</pubDate>
      <link>https://dev.to/buinauskas/what-are-the-most-common-tools-for-data-pre-calculation-and-aggregation-3dkj</link>
      <guid>https://dev.to/buinauskas/what-are-the-most-common-tools-for-data-pre-calculation-and-aggregation-3dkj</guid>
      <description>&lt;p&gt;Company, I work at, does data research and scraping which is later aggregated and published to our clients. We also try to denormalize data in order to provide faster data lookup in web applications.&lt;/p&gt;

&lt;p&gt;Until now, we used mechanisms within SQL Server to do these aggregations. But recently this has become a bottleneck: processes take too much time to execute and overlap into business hours.&lt;/p&gt;

&lt;p&gt;What other tools does the market use to perform aggregations and pre-calculations outside of a relational database? My discoveries include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apache Hadoop MapReduce&lt;/li&gt;
&lt;li&gt;Apache Pig&lt;/li&gt;
&lt;li&gt;Apache Spark&lt;/li&gt;
&lt;/ul&gt;
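&lt;p&gt;To make the question concrete, here's a rough sketch (hypothetical data, made-up names) of the kind of group-by pre-aggregation I mean; frameworks like the ones above distribute this same map/reduce shape across many machines:&lt;/p&gt;

```typescript
// A rough sketch, on hypothetical sales rows, of the group-by aggregation
// that tools like MapReduce or Spark run in parallel over large datasets.
interface SaleRow {
  region: string;
  amount: number;
}

const rows: SaleRow[] = [
  { region: "EU", amount: 10 },
  { region: "EU", amount: 5 },
  { region: "US", amount: 7 },
];

// "Map" each row to a (key, value) pair, then "reduce" by summing per key.
const totals: { [region: string]: number } = {};
for (const row of rows) {
  totals[row.region] = (totals[row.region] ?? 0) + row.amount;
}

console.log(totals);
```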

</description>
      <category>sql</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Hiring for the first time</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Thu, 06 Sep 2018 18:28:06 +0000</pubDate>
      <link>https://dev.to/buinauskas/hiring-for-the-first-time-19d</link>
      <guid>https://dev.to/buinauskas/hiring-for-the-first-time-19d</guid>
      <description>&lt;p&gt;With a recent promotion, one of my tasks will be filling a team with a couple of junior developers. This is going to be the first time I'll interview someone. Could anyone share posts, books, videos, personal experience, basically anything I could consume to prepare myself better and make it as stress-free as possible?&lt;/p&gt;

&lt;p&gt;I should also mention that it's going to be internal recruitment, and I'll have to interview people I work with daily.&lt;/p&gt;

&lt;p&gt;Any kind of comments and questions are welcome!&lt;/p&gt;

</description>
      <category>interview</category>
      <category>help</category>
    </item>
    <item>
      <title>Help designing web application UI</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Tue, 07 Aug 2018 18:57:16 +0000</pubDate>
      <link>https://dev.to/buinauskas/help-designing-web-application-ui-1o01</link>
      <guid>https://dev.to/buinauskas/help-designing-web-application-ui-1o01</guid>
      <description>&lt;p&gt;We're a small team of backend developers building our first web application (a really small one) and we don't really have a dedicated UI person to help us out.&lt;/p&gt;

&lt;p&gt;We're using Angular together with Angular Material and try to follow it strictly. Is there anyone who'd be willing to chat and give some basic guidance?&lt;/p&gt;

</description>
      <category>help</category>
      <category>material</category>
      <category>angular</category>
    </item>
    <item>
      <title>What kind of peripherals do you use?</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Wed, 04 Apr 2018 15:32:42 +0000</pubDate>
      <link>https://dev.to/buinauskas/what-kind-of-peripherals-do-you-use-3k43</link>
      <guid>https://dev.to/buinauskas/what-kind-of-peripherals-do-you-use-3k43</guid>
      <description>&lt;p&gt;How much attention do you pay to your input devices, typically a keyboard and a mouse?&lt;/p&gt;

&lt;p&gt;I've always loved mechanical keyboards, the clack sound they make, and the lighter keystrokes, but I never really cared about mice. However, I noticed that my wrist was starting to hurt a bit, so I started looking into more ergonomic ones.&lt;/p&gt;

&lt;p&gt;The Logitech MX Master 2S seemed like a good candidate, but there are some complaints about its build quality.&lt;/p&gt;

&lt;p&gt;The Logitech MX ERGO is praised, but I'm not really sure about the trackball.&lt;/p&gt;

&lt;p&gt;Vertical mice?&lt;/p&gt;

&lt;p&gt;What do you use in your day-to-day job? And if you're a fan of ergonomic mice built for productivity, what would you recommend?&lt;/p&gt;

</description>
      <category>discuss</category>
    </item>
    <item>
      <title>Maximum line length in your code</title>
      <dc:creator>Evaldas Buinauskas</dc:creator>
      <pubDate>Thu, 29 Mar 2018 09:02:13 +0000</pubDate>
      <link>https://dev.to/buinauskas/maximum-line-length-in-your-code-2ia4</link>
      <guid>https://dev.to/buinauskas/maximum-line-length-in-your-code-2ia4</guid>
      <description>&lt;p&gt;Until now I've always tried to stick to 80 characters per line, and it worked just fine in plain ol' JavaScript. However, after adopting TypeScript, it feels like that's not enough with all these type declarations; sometimes the code becomes harder to read. An example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;setCommunity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ActionCreator&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;SetCommunityAction&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@@community/SET_COMMUNITY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Whereas if the limit were 100, this becomes more readable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;setCommunity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ActionCreator&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;SetCommunityAction&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@@community/SET_COMMUNITY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
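&lt;p&gt;For what it's worth, a limit like this doesn't have to be enforced by hand; a formatter can do it. For example, Prettier's &lt;code&gt;printWidth&lt;/code&gt; option (a real setting; the config below is just a hypothetical project setup):&lt;/p&gt;

```json
{
  "printWidth": 100,
  "singleQuote": true
}
```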



&lt;p&gt;Of course, this is just a single example that was frustrating me.&lt;/p&gt;

&lt;p&gt;What max line length do you use and what are the reasons behind it?&lt;/p&gt;

</description>
      <category>discuss</category>
    </item>
  </channel>
</rss>
