<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Richard Zhang</title>
    <description>The latest articles on DEV Community by Richard Zhang (@hahabrother).</description>
    <link>https://dev.to/hahabrother</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2359219%2Ff0a02621-8735-431b-816b-c8067c3d645f.png</url>
      <title>DEV Community: Richard Zhang</title>
      <link>https://dev.to/hahabrother</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hahabrother"/>
    <language>en</language>
    <item>
      <title>Elasticsearch Overview（2）- Cluster &amp; Terminology</title>
      <dc:creator>Richard Zhang</dc:creator>
      <pubDate>Fri, 15 Nov 2024 08:10:44 +0000</pubDate>
      <link>https://dev.to/hahabrother/elasticsearch-overview2-cluster-terminology-2ddh</link>
      <guid>https://dev.to/hahabrother/elasticsearch-overview2-cluster-terminology-2ddh</guid>
      <description>&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://163.hashnode.dev/elasticsearch-overview1-benefits-scenarios" rel="noopener noreferrer"&gt;Elasticsearch  Overview（1）-  Benefits &amp;amp; Scenarios&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/@z1337202218/elasticsearch-overview-3-shard-indexing-and-replicas-ce296b9d6b43" rel="noopener noreferrer"&gt;Elasticsearch Overview（3）- Shard, Indexing, and Replicas&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://substack.com/home/post/p-151629641" rel="noopener noreferrer"&gt;Elasticsearch Overview（4）- Node &amp;amp; Policy Design&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.reddit.com/r/architecture/comments/1grrre4/elasticsearch_overview5_best_practices_security/" rel="noopener noreferrer"&gt;Elasticsearch Overview（5）- Best Practices (Security)&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is an Elasticsearch Cluster?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Introduction
&lt;/h3&gt;

&lt;p&gt;We discussed Elasticsearch earlier. So what is an Elasticsearch cluster? A cluster is a pool of nodes that together provide Elasticsearch functionality. The nodes may be physical machines, virtual machines, or Docker containers, and they may sit in the same or in different geographical locations.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Cluster models
&lt;/h3&gt;

&lt;p&gt;Clusters can be organized in several ways. Two popular models are the all-in-one cluster and the role-based multi-node cluster. The figure captioned Multi-Node Cluster below shows the second model: it has three master nodes, four data nodes (green), one coordinator node, and two gray nodes.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Node roles
&lt;/h3&gt;

&lt;p&gt;Master nodes have a special purpose. They manage the cluster by exchanging information about its state with the other nodes, which keeps the cluster stable. Dedicated master nodes do not do any data processing. Data nodes are where logs or other data are stored; they are the actual storage nodes. The coordinator node acts as a client: it accepts requests and answers them by gathering results from the data nodes. Ingest nodes help get data into the Elasticsearch data nodes. Dedicated coordinator and ingest nodes are optional but useful, since other nodes can also take on these roles.&lt;/p&gt;
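
&lt;p&gt;As a rough illustration of the roles above, the sketch below prints the &lt;code&gt;node.roles&lt;/code&gt; line that each kind of node might carry in its &lt;code&gt;elasticsearch.yml&lt;/code&gt;. The node names are hypothetical, and the helper function is only for this example. An empty role list is how a coordinating-only node is expressed.&lt;/p&gt;

```python
# Hypothetical node names mapped to the roles each one would be given.
# An empty list marks a coordinating-only node.
NODE_ROLES = {
    "master-1": ["master"],
    "data-1": ["data"],
    "ingest-1": ["ingest"],
    "coordinator-1": [],
}


def render_roles_setting(roles):
    """Render the node.roles line for one node's elasticsearch.yml."""
    if not roles:
        return "node.roles: [ ]"
    return "node.roles: [ " + ", ".join(roles) + " ]"


for name, roles in NODE_ROLES.items():
    print(name, "=", render_roles_setting(roles))
```

&lt;p&gt;In an all-in-one cluster, every node would list several of these roles at once instead of a single one.&lt;/p&gt;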

&lt;h3&gt;
  
  
  4. Best Practices
&lt;/h3&gt;

&lt;p&gt;In a three-node all-in-one cluster, each of the three nodes acts as master, data node, and coordinator at the same time. For easier scaling and a more stable cluster, a role-based multi-node cluster is the better choice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr4vuwinxrvp4wxm2gfl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr4vuwinxrvp4wxm2gfl.png" alt="Image description" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Multi-Node Cluster&lt;/center&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdt5msyd82roegq9pnmz8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdt5msyd82roegq9pnmz8.png" alt="3 Node Cluster" width="539" height="301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;3 Node Cluster&lt;/center&gt;

&lt;h2&gt;
  
  
  Terminology &amp;amp; How it works
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The question
&lt;/h3&gt;

&lt;p&gt;We have been discussing the Elasticsearch cluster. Now, let's look at what is inside it. Specifically, how is data stored, and how does data flow through Elasticsearch?&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Internal structure of the cluster
&lt;/h3&gt;

&lt;p&gt;The cluster we have discussed is made up of nodes. The data nodes inside the cluster are where the data is stored. Data is organized into indexes, which are logical groupings of data. An index can be split into multiple shards, and the shards hold the actual data. Documents are the individual units of data in Elasticsearch.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The relationship between document storage and indexing
&lt;/h3&gt;

&lt;p&gt;All documents are in JSON format and consist of key-value pairs. Documents are stored in shards. Inside a shard, the data is physically written to smaller units called segments, and a group of segments constitutes a shard. Combining all the shards gives an index, the logical entity that you search for data.&lt;/p&gt;
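
&lt;p&gt;The relationship between documents, shards, and an index can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: the class name, the index name &lt;code&gt;app-logs&lt;/code&gt;, and the CRC32-based routing are all made up for the example, segments are left out, and Elasticsearch itself uses a different hash function.&lt;/p&gt;

```python
import zlib


class ToyIndex:
    """A logical index whose documents are spread across a fixed number of shards."""

    def __init__(self, name, shard_count):
        self.name = name
        # Each shard is modeled as a dict of document id to JSON-like document.
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, doc_id):
        # Stand-in routing: CRC32 of the id modulo the shard count.
        # Elasticsearch uses its own hash function; this is only illustrative.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)

    def index_document(self, doc_id, document):
        self.shards[self.shard_for(doc_id)][doc_id] = document

    def search(self, field, value):
        # Searching the index means asking every shard and merging the hits.
        hits = []
        for shard in self.shards:
            for doc_id, document in shard.items():
                if document.get(field) == value:
                    hits.append(doc_id)
        return sorted(hits)


index = ToyIndex("app-logs", shard_count=3)
index.index_document("1", {"level": "error", "message": "disk full"})
index.index_document("2", {"level": "info", "message": "started"})
index.index_document("3", {"level": "error", "message": "timeout"})
print(index.search("level", "error"))  # ['1', '3']
```

&lt;p&gt;The caller only ever talks to the index; which shard holds a given document is an internal detail.&lt;/p&gt;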

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ciyqi6fzolg0xsdr67y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ciyqi6fzolg0xsdr67y.png" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Data Inflow
&lt;/h3&gt;

&lt;p&gt;The picture below uses two colors. Blue represents how data flows into Elasticsearch. Data sources can be various entities, and Elasticsearch supports different kinds of data, including unstructured data. Sources may be logs, caches, or services and infrastructure directly, such as Windows and Linux hosts and web servers such as NGINX and Apache. These sources send data to Elasticsearch, where it is processed after an initial evaluation.&lt;/p&gt;
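
&lt;p&gt;One common way for a source to send several events at once is the &lt;code&gt;_bulk&lt;/code&gt; API, which takes newline-delimited JSON. The sketch below only builds such a request body and does not contact a cluster. The index name &lt;code&gt;app-logs&lt;/code&gt; and the sample events are assumptions made for illustration.&lt;/p&gt;

```python
import json


def build_bulk_payload(index_name, documents):
    """Build a newline-delimited JSON body in the shape the _bulk API expects."""
    lines = []
    for document in documents:
        # Action line: index the following document into the given index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself as one JSON object.
        lines.append(json.dumps(document))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical events from two different sources.
events = [
    {"source": "nginx", "message": "GET /health 200"},
    {"source": "linux", "message": "sshd session opened"},
]
payload = build_bulk_payload("app-logs", events)
print(payload)
```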

&lt;h3&gt;
  
  
  5. Data Characteristics and Processing
&lt;/h3&gt;

&lt;p&gt;So, what does your data look like? What key-value pairs does it have, and what are their data types? This metadata is extracted and the data is indexed into an Elasticsearch index, the logical entity we search. On the other side of the picture, green represents the search path. Kibana, a tool provided by the Elastic Stack, is one example; you can also connect through the API directly or integrate enterprise search into a web application. Elasticsearch provides clients for multiple languages. These clients connect to the Elasticsearch cluster, the cluster queries the index, and you get the results back. This is how the data flow works.&lt;/p&gt;
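
&lt;p&gt;On the search side, a client ultimately sends an HTTP request to the cluster. The sketch below prepares, but does not send, a simple match query against an index's &lt;code&gt;_search&lt;/code&gt; endpoint using only the Python standard library. The address &lt;code&gt;http://localhost:9200&lt;/code&gt;, the index name &lt;code&gt;app-logs&lt;/code&gt;, and the field name are assumptions for illustration; an official language client would normally hide these details.&lt;/p&gt;

```python
import json
import urllib.request


def build_search_request(base_url, index_name, field, text):
    """Prepare (but do not send) a match query against an index's _search endpoint."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local cluster address and hypothetical index and field names.
request = build_search_request("http://localhost:9200", "app-logs", "message", "timeout")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

&lt;p&gt;Passing the prepared request to &lt;code&gt;urllib.request.urlopen&lt;/code&gt; would send it, provided a cluster is actually reachable at that address.&lt;/p&gt;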

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yxuqzci95e8y7jmykth.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yxuqzci95e8y7jmykth.png" alt="Image description" width="543" height="306"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Kafka Overview（1）- Topic</title>
      <dc:creator>Richard Zhang</dc:creator>
      <pubDate>Wed, 06 Nov 2024 03:05:41 +0000</pubDate>
      <link>https://dev.to/hahabrother/kafka-overview-3i60</link>
      <guid>https://dev.to/hahabrother/kafka-overview-3i60</guid>
      <description>&lt;h2&gt;
  
  
  Agenda
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Kafka Architecture

&lt;ul&gt;
&lt;li&gt;Basic concepts (topic / partition / consumer group / commit log / offset)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;b&gt;Delve into the process of sending a message to the broker&lt;/b&gt;&lt;/li&gt;

&lt;li&gt;Technical Highlights&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architecture - Overview
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxhxpn04g2t4hwukeo15f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxhxpn04g2t4hwukeo15f.png" alt="Architecture - Overview" width="800" height="563"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture - Detail
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaedjdtihjvray8qs34p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaedjdtihjvray8qs34p.png" alt="Architecture - Detail" width="800" height="618"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture - Basic Concepts
&lt;/h2&gt;

&lt;h3&gt;Topics&lt;/h3&gt;

&lt;p&gt;A &lt;a href="https://www.instaclustr.com/support/documentation/kafka/using-kafka/topic-management/" rel="noopener noreferrer"&gt;&lt;em&gt;Kafka topic&lt;/em&gt;&lt;/a&gt;  defines a channel through which data is streamed.Producers publish messages to topics,and consumers read messages from the topic they subscribe to.&lt;/p&gt;

&lt;p&gt;Topics organize and structure messages, with particular types of messages published to particular topics. Each topic is identified by a name that is unique within the Kafka cluster, and Kafka sets no fixed limit on the number of topics that can be created.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72zg7navifelv9z3eqw0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72zg7navifelv9z3eqw0.png" alt="vUTeVGo7.png" width="800" height="234"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Related
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://163.hashnode.dev/kafka-overview2-partition" rel="noopener noreferrer"&gt;Kafka Overview（2）- partition&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/@z1337202218/kafka-overview-b8b735a8e079" rel="noopener noreferrer"&gt;Kafka Overview(3)-Consumer Group&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://open.substack.com/pub/haharichard/p/kafka-overview?r=4aznkt&amp;amp;utm_campaign=post&amp;amp;utm_medium=web&amp;amp;showWelcomeOnShare=true" rel="noopener noreferrer"&gt;Kafka Overview(4) - Commit Log &amp;amp; offset&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.reddit.com/r/Rlanguage/comments/1gmcdfa/kafka_overview5_process_of_sending/?utm_source=share&amp;amp;utm_medium=web3x&amp;amp;utm_name=web3xcss&amp;amp;utm_term=1&amp;amp;utm_content=share_button" rel="noopener noreferrer"&gt;Kafka Overview(5) - Process of sending&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kafka</category>
    </item>
  </channel>
</rss>
