<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Greptime</title>
    <description>The latest articles on DEV Community by Greptime (@greptime).</description>
    <link>https://dev.to/greptime</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1073351%2Fd7e3ba64-e3a1-4080-893e-3a414a3edb61.jpg</url>
      <title>DEV Community: Greptime</title>
      <link>https://dev.to/greptime</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/greptime"/>
    <language>en</language>
    <item>
      <title>Error Handling for Large Rust Projects - A Deep Dive into GreptimeDB's Practices</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Sun, 12 May 2024 08:50:00 +0000</pubDate>
      <link>https://dev.to/greptime/error-handling-for-large-rust-projects-a-deep-dive-into-greptimedbs-practices-l9i</link>
      <guid>https://dev.to/greptime/error-handling-for-large-rust-projects-a-deep-dive-into-greptimedbs-practices-l9i</guid>
      <description>&lt;p&gt;:::tip TL;DR:&lt;br&gt;
In this article, we discuss Rust error handling practices in &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GreptimeDB&lt;/a&gt; and share possible future work at the end.&lt;/p&gt;

&lt;p&gt;Topics include: &lt;br&gt;
(1) How to build a cheaper yet more accurate error stack to replace the system backtrace; &lt;br&gt;
(2) How to organize errors in large projects; &lt;br&gt;
(3) How to print errors in different schemes for logs and end users. &lt;/p&gt;

&lt;p&gt;An error in GreptimeDB might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0: Foo error, at src/common/catalog/src/error.rs:80:10
1: Bar error, at src/common/function/src/error.rs:90:10
2: Root cause, invalid table name, at src/common/catalog/src/error.rs:100:10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;:::&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Understanding &lt;code&gt;Error&lt;/code&gt; in Rust
&lt;/h3&gt;

&lt;p&gt;Rust's error handling is centered around the &lt;a href="https://doc.rust-lang.org/std/result/enum.Result.html#variant.Err"&gt;&lt;code&gt;Result&amp;lt;T, E&amp;gt;&lt;/code&gt;&lt;/a&gt; enum, where &lt;code&gt;E&lt;/code&gt; typically (but not necessarily) implements &lt;a href="https://doc.rust-lang.org/std/error/trait.Error.html"&gt;&lt;code&gt;std::error::Error&lt;/code&gt;&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cd"&gt;/// Contains the success value&lt;/span&gt;
    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="cd"&gt;/// Contains the error value&lt;/span&gt;
    &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This blog shares our experience organizing the various &lt;code&gt;Error&lt;/code&gt; types in a complex system like GreptimeDB, from how an error is defined to how it is logged or presented to end users. Such a system is composed of multiple components, each with its own &lt;code&gt;Error&lt;/code&gt; definitions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Status Quo of Rust's Error Handling
&lt;/h3&gt;

&lt;p&gt;A few modules in Rust's standard library provide &lt;code&gt;Error&lt;/code&gt; structs that implement &lt;code&gt;std::error::Error&lt;/code&gt;, like &lt;code&gt;std::io::Error&lt;/code&gt; or &lt;code&gt;std::fmt::Error&lt;/code&gt;. But developers usually define custom errors for their projects, either to express application-specific error information or because they need to group multiple errors in an enum.&lt;/p&gt;

&lt;p&gt;Since the &lt;code&gt;std::error::Error&lt;/code&gt; trait is not very complicated, it's easy to implement manually for a single custom error type. However, you usually won't want to do so, because as error variants grow, the flood of boilerplate code becomes very hard to work with.&lt;/p&gt;
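To make the boilerplate concrete, here is a minimal, hand-rolled error type implementing `std::error::Error` with only the standard library (the `InvalidTableName` name is a hypothetical example for illustration, not a GreptimeDB type):

```rust
use std::error::Error;
use std::fmt;

// A hand-rolled leaf error: `Display` carries the message,
// `Error::source` (defaulting to `None`) carries the cause.
#[derive(Debug)]
struct InvalidTableName {
    name: String,
}

impl fmt::Display for InvalidTableName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid table name: {}", self.name)
    }
}

impl Error for InvalidTableName {}

fn main() {
    let err = InvalidTableName { name: "a/b".to_string() };
    assert_eq!(err.to_string(), "invalid table name: a/b");
    // This is a leaf error, so there is no underlying cause.
    assert!(err.source().is_none());
}
```

Multiply this by dozens of variants and the appeal of a derive macro becomes obvious.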

&lt;p&gt;Nowadays, there are some widely used utility crates that help with customized error types. For example, &lt;a href="https://docs.rs/thiserror/latest/thiserror/"&gt;&lt;code&gt;thiserror&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://docs.rs/anyhow/latest/anyhow/"&gt;&lt;code&gt;anyhow&lt;/code&gt;&lt;/a&gt; are developed by the famous Rust wizard &lt;a class="mentioned-user" href="https://dev.to/dtolnay"&gt;@dtolnay&lt;/a&gt;, with the distinction that &lt;code&gt;thiserror&lt;/code&gt; is mainly for libraries and &lt;code&gt;anyhow&lt;/code&gt; is for binaries. This rule of thumb suits most cases.&lt;/p&gt;

&lt;p&gt;But for projects like GreptimeDB, where we divide the entire workspace into several individual sub-crates, we need to define one error type for each crate while keeping a streamlined combination. Neither &lt;code&gt;thiserror&lt;/code&gt; nor &lt;code&gt;anyhow&lt;/code&gt; can achieve this easily.&lt;/p&gt;

&lt;p&gt;Hence, we chose another crate, &lt;a href="https://docs.rs/snafu/latest/snafu/"&gt;&lt;code&gt;snafu&lt;/code&gt;&lt;/a&gt;, to build our error system. It is like a combination of &lt;code&gt;thiserror&lt;/code&gt; and &lt;code&gt;anyhow&lt;/code&gt;: &lt;code&gt;thiserror&lt;/code&gt; provides a convenient macro to define custom error types with display, source, and context fields, and &lt;code&gt;anyhow&lt;/code&gt; offers a &lt;code&gt;Context&lt;/code&gt; trait that easily transforms one underlying error into another with added context.&lt;/p&gt;
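As a rough, std-only sketch of what such a `Context`-style transformation does (the `AppError` type and `Context` trait below are illustrative names of our own, not `snafu`'s or `anyhow`'s actual API):

```rust
// A wrapper error that records a new message plus the underlying cause.
#[derive(Debug)]
struct AppError {
    msg: String,
    source: Option<Box<dyn std::error::Error>>,
}

// Extension trait: turn `Result<T, E>` into `Result<T, AppError>`,
// attaching a human-readable context message along the way.
trait Context<T> {
    fn context(self, msg: &str) -> Result<T, AppError>;
}

impl<T, E: std::error::Error + 'static> Context<T> for Result<T, E> {
    fn context(self, msg: &str) -> Result<T, AppError> {
        self.map_err(|e| AppError {
            msg: msg.to_string(),
            source: Some(Box::new(e)),
        })
    }
}

fn main() {
    let err = "not-a-number".parse::<i32>().context("Failed to parse port").unwrap_err();
    assert_eq!(err.msg, "Failed to parse port");
    // The original `ParseIntError` is preserved as the cause.
    assert!(err.source.is_some());
}
```

`snafu` generates typed "context selectors" instead of a free-form message, but the shape of the transformation is the same.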

&lt;p&gt;&lt;code&gt;thiserror&lt;/code&gt; mainly implements the &lt;a href="https://doc.rust-lang.org/std/convert/trait.From.html"&gt;&lt;code&gt;std::convert::From&lt;/code&gt;&lt;/a&gt; trait for your error types, so that you can simply use &lt;code&gt;?&lt;/code&gt; to propagate the error you receive. Consequently, this also means you cannot define two error variants from the same source type. Suppose you are performing some I/O operations: you won't be able to tell whether an error was generated on the write path or the read path. This is also an important reason we don't use &lt;code&gt;thiserror&lt;/code&gt;: the context is blurred at the type level.&lt;/p&gt;
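A std-only sketch of the `From`-plus-`?` pattern (similar to what `thiserror`'s `#[from]` generates) shows the limitation; the `Error` enum and `read_config` function are hypothetical examples:

```rust
use std::io;

// Hypothetical error enum: one `Io` variant absorbs every `io::Error`.
#[derive(Debug)]
enum Error {
    Io(io::Error),
    // A second variant like `WriteIo(io::Error)` could NOT get its own
    // `From<io::Error>` impl -- it would conflict with the impl below,
    // so `?` alone cannot distinguish read failures from write failures.
}

impl From<io::Error> for Error {
    fn from(e: io::Error) -> Self {
        Error::Io(e)
    }
}

fn read_config() -> Result<String, Error> {
    // `?` silently converts `io::Error` into `Error` via `From`.
    let bytes = std::fs::read("/definitely/missing/path")?;
    Ok(String::from_utf8_lossy(&bytes).into_owned())
}

fn main() {
    let Error::Io(e) = read_config().unwrap_err();
    // All we know is "some I/O failed" -- the call site is lost.
    assert_eq!(e.kind(), io::ErrorKind::NotFound);
}
```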

&lt;h2&gt;
  
  
  Stacking the Error
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Design Goals
&lt;/h3&gt;

&lt;p&gt;In the real world, knowing only the root cause of an error is inadequate. Suppose we are building a protocol component in GreptimeDB. It reads messages from the network, decodes them, performs some operations, and then sends them. We may encounter errors at several stages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;ReadSocket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;hyper&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;DecodeMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;serde_json&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Operation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GreptimeError&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;EncodeMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;serde_json&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;WriteSocket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;hyper&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One possible error message we can get is: &lt;code&gt;DecodeMessage(serde_json: invalid character at 1)&lt;/code&gt;. However, in a specific code snippet, there can be more than 10 places that decode messages (and can thus throw this error)! How can we figure out in which step we saw the invalid character?&lt;/p&gt;

&lt;p&gt;So even though the error itself tells us what happened, if we want a clue about where the error occurred and whether we should pay attention to it, we need the error to carry more information. For comparison, here is an example of an error log you might see from GreptimeDB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Failed to handle protocol
0: Failed to handle incoming content, query: blabla, at src/protocol/handler.rs:89:22
1: Failed to read the next message at queue 5 of 10, at src/protocol/loop.rs:254:14
2: Failed to decode `01010001001010001` to ProtocolHeader, at src/protocol/codec.rs:90:14
3: serde_json(invalid character at position 1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;A good error report is not only about how it gets constructed; more importantly, it should tell a human what can be understood from its cause and trace. We call it the Stacked Error.&lt;/strong&gt; The format should be intuitive, and you have likely seen something similar elsewhere, like a backtrace.&lt;/p&gt;

&lt;p&gt;From this log, it's easy to grasp the whole story with full context, from the user-facing behavior to the root cause, plus the exact file, line, and column where each error was propagated. You can tell that this error means &lt;em&gt;"in the query "blabla", the fifth package's header is corrupted"&lt;/em&gt;. It is likely invalid user input, and we may not need to handle it on the server side.&lt;/p&gt;

&lt;p&gt;This example shows the critical information that an error should contain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The root cause&lt;/strong&gt; that tells what is happening.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The full context stack&lt;/strong&gt; used in debugging or figuring out where the error occurred.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What happened from the user's perspective,&lt;/strong&gt; to decide whether we need to expose the error to users.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The root cause is often clear, as in the &lt;code&gt;DecodeMessage&lt;/code&gt; example above, as long as the library or function we use implements its error type correctly. But the root cause alone may not be enough.&lt;/p&gt;

&lt;p&gt;Here is more &lt;a href="https://github.com/delta-incubator/delta-kernel-rs/pull/151"&gt;evidence&lt;/a&gt; from Delta Lake, developed by Databricks:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4vu65v27cmhf6ugt5648.png" alt="Databricks's example"&gt;&lt;/p&gt;

&lt;p&gt;In the following sections, we will focus on the context stack and the way errors are presented, and show how we implement them, so that you can hopefully reproduce the same practices as in GreptimeDB.&lt;/p&gt;

&lt;h3&gt;
  
  
  System Backtrace
&lt;/h3&gt;

&lt;p&gt;So, now you have the root cause (&lt;code&gt;DecodeMessage(serde_json: invalid character at 1)&lt;/code&gt;). But it's not clear at which step this error occurs: when decoding the header, or the body?&lt;/p&gt;

&lt;p&gt;An intuitive thought is to capture a backtrace. &lt;code&gt;.unwrap()&lt;/code&gt; is the first resort: the backtrace shows up when the error occurs (of course, this is bad practice). It gives you the complete call stack along with line numbers.&lt;/p&gt;

&lt;p&gt;Such a call stack contains the full trace, including lots of unrelated frames from the system, the runtime, and the standard library. To find the call in application code, you have to inspect the trace frame by frame and skip all the unrelated ones.&lt;/p&gt;

&lt;p&gt;Nowadays, many libraries also provide the ability to capture a backtrace when an &lt;code&gt;Error&lt;/code&gt; is constructed. But regardless of whether the system backtrace can provide what we truly want, it is very costly in both CPU (&lt;a href="https://github.com/GreptimeTeam/greptimedb/pull/1261"&gt;#1261&lt;/a&gt;) and memory (&lt;a href="https://github.com/GreptimeTeam/greptimedb/pull/1273"&gt;#1273&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Capturing a backtrace significantly slows down your program, as it needs to walk the call stack and translate each pointer. Translating stack pointers also requires including large &lt;code&gt;debuginfo&lt;/code&gt; in the binary: in GreptimeDB, this means increasing the binary size by &amp;gt;700MB (4x compared to 170MB without debuginfo). And the captured system backtrace is noisy, because the system can't distinguish whether the code comes from the standard library, a third-party async runtime, or the application.&lt;/p&gt;

&lt;p&gt;There is another difference between the system backtrace and the proposed Stacked Error. The system backtrace tells us how execution reached the position where the error occurred, and you cannot control it; the Stacked Error shows how the error is propagated.&lt;/p&gt;

&lt;p&gt;Take the following code snippet as an example to examine the difference between the system backtrace and the virtual stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;async fn handle_request(req: Request) -&amp;gt; Result&amp;lt;Output&amp;gt; {
    let msg = decode_msg(&amp;amp;req.msg).context(DecodeMessage)?; // propagate error with new stack and context
    verify_msg(&amp;amp;msg)?; // pass error to the caller directly
    process_msg(msg).await? // pass error to the caller directly
}

async fn decode_msg(msg: &amp;amp;RawMessage) -&amp;gt; Result&amp;lt;Message&amp;gt; {
    serde_json::from_slice(&amp;amp;msg).context(SerdeJson) // propagate error with new stack and context
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The system backtrace will print the whole call stack, like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1: &amp;lt;alloc::boxed::Box&amp;lt;F,A&amp;gt; as core::ops::function::Fn&amp;lt;Args&amp;gt;&amp;gt;::call
            at /rustc/3f28fe133475ec5faf3413b556bf3cfb0d51336c/library/alloc/src/boxed.rs:2029:9
    std::panicking::rust_panic_with_hook
            at /rustc/3f28fe133475ec5faf3413b556bf3cfb0d51336c/library/std/src/panicking.rs:783:13
... many lines for std's internal traces

22: tokio::runtime::task::raw::RawTask::poll
            at /home/wayne/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:201:18
... many lines for tokio's internal traces

32: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
            at /rustc/3f28fe133475ec5faf3413b556bf3cfb0d51336c/library/std/src/thread/mod.rs:529:17
... many lines for std's internal traces
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, it includes a lot of internal frames that you are not interested in.&lt;/p&gt;

&lt;p&gt;For other complex logic like batch processing, where errors may not be propagated immediately but held for a while, the virtual stack also helps keep things understandable. A system backtrace is captured in place when the leaf error is generated, e.g., in a middle step of map-reduce-style logic. With a virtual stack, you can postpone capturing context until at or after the reduce step, where you have more information about the overall task.&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual User Stack
&lt;/h3&gt;

&lt;p&gt;Now let's introduce the virtual user stack. The word "virtual" contrasts with the system stack: this stack is defined and constructed entirely in user code. Let's look closer at the previous example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0: Failed to handle incoming content, query: blabla, at src/protocol/handler.rs:89:22
1: Failed to read the next message at queue 5 of 10, at src/protocol/loop.rs:254:14
2: Failed to decode `01010001001010001` to ProtocolHeader, at src/protocol/codec.rs:90:14
3: serde_json(invalid character at position 1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A stack layer is composed of 3 parts: &lt;code&gt;[STACK_NUM]: [MSG], at [FILE_LOCATION]&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stack num&lt;/strong&gt; is the number of this stack layer. A smaller number means an outer error layer, starting from 0 of course.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message&lt;/strong&gt; is the message related to one layer. This is scraped from the &lt;a href="https://doc.rust-lang.org/std/fmt/trait.Display.html"&gt;&lt;code&gt;std::fmt::Display&lt;/code&gt;&lt;/a&gt; implementation of that error. Developers can attach useful context here, like the query string or loop counter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File location&lt;/strong&gt; is the location where an error is generated (or propagated, for intermediate error layers). Rust provides the &lt;a href="https://doc.rust-lang.org/std/macro.file.html"&gt;&lt;code&gt;file!&lt;/code&gt;&lt;/a&gt;, &lt;a href="https://doc.rust-lang.org/std/macro.line.html"&gt;&lt;code&gt;line!&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://doc.rust-lang.org/std/macro.column.html"&gt;&lt;code&gt;column!&lt;/code&gt;&lt;/a&gt; macros to get that information. The display format is deliberate as well: most editors can jump to that location directly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, we utilize &lt;a href="https://docs.rs/snafu/0.8.2/snafu/struct.Location.html"&gt;&lt;code&gt;snafu::Location&lt;/code&gt;&lt;/a&gt; to gather the code location, so each location points to where the error is constructed. Through this chain, we know how the error was generated and propagated to the uppermost layer.&lt;/p&gt;
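This kind of location capture can be approximated with only the standard library via `#[track_caller]` and `std::panic::Location`; the `Contextualized` struct and `with_context` helper below are hypothetical illustrations, not the real `snafu` API:

```rust
use std::panic::Location;

// One layer of a stacked error: a message plus where it was attached.
#[derive(Debug)]
struct Contextualized {
    msg: String,
    location: &'static Location<'static>,
}

#[track_caller]
fn with_context(msg: &str) -> Contextualized {
    Contextualized {
        msg: msg.to_string(),
        // Thanks to `#[track_caller]`, this is the *caller's* file,
        // line and column, not the line inside this helper.
        location: Location::caller(),
    }
}

fn main() {
    let err = with_context("Failed to decode header");
    // Renders like one layer of the stacked error: "MSG, at FILE:LINE:COLUMN"
    let line = format!(
        "{}, at {}:{}:{}",
        err.msg,
        err.location.file(),
        err.location.line(),
        err.location.column()
    );
    assert!(line.starts_with("Failed to decode header, at "));
}
```

This is also why the virtual stack is cheap: recording a static file/line/column costs almost nothing compared to walking the real call stack.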

&lt;p&gt;Here is what it looks like all together from the code side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(Snafu)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;#[snafu(display(&lt;/span&gt;&lt;span class="s"&gt;"General catalog error: "&lt;/span&gt;&lt;span class="nd"&gt;))]&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt;-- the `Display` impl derive&lt;/span&gt;
    &lt;span class="n"&gt;Catalog&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt;-- the `location`&lt;/span&gt;
        &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;catalog&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt;-- inner cause&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Besides, we implemented a proc-macro &lt;a href="https://greptimedb.rs/common_macro/attr.stack_trace_debug.html"&gt;&lt;code&gt;stack_trace_debug&lt;/code&gt;&lt;/a&gt; to scrape necessary information from the Error's definition and generate the implementation of the related trait &lt;a href="https://greptimedb.rs/common_error/ext/trait.StackError.html"&gt;&lt;code&gt;StackError&lt;/code&gt;&lt;/a&gt;, which provides useful methods to access and print the error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;trait&lt;/span&gt; &lt;span class="n"&gt;StackError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;debug_fmt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;layer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="nb"&gt;Vec&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Option&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;StackError&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;last&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="n"&gt;StackError&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="k"&gt;Self&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Sized&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This proc-macro mainly does two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Implement &lt;a href="https://greptimedb.rs/common_error/ext/trait.StackError.html"&gt;&lt;code&gt;StackError&lt;/code&gt;&lt;/a&gt; as the scaffold&lt;/li&gt;
&lt;li&gt;Implement &lt;a href="https://doc.rust-lang.org/std/fmt/trait.Debug.html"&gt;&lt;code&gt;std::fmt::Debug&lt;/code&gt;&lt;/a&gt; based on &lt;code&gt;debug_fmt()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the way, we have added &lt;code&gt;Location&lt;/code&gt; and &lt;code&gt;display&lt;/code&gt; to all errors in GreptimeDB. This is the hard work behind the methodology.&lt;/p&gt;

&lt;h3&gt;
  
  
  Macro Details
&lt;/h3&gt;

&lt;p&gt;An error chain is a singly linked list, like an onion from outer to inner, so we can capture an error at the outermost layer and walk through it.&lt;/p&gt;
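The walk can be sketched with a minimal, hypothetical trait in plain Rust (the `Stacked`/`Layer`/`render` names below are ours, not the real `StackError` API):

```rust
// Each layer exposes its own message and, optionally, the next inner layer.
trait Stacked {
    fn message(&self) -> String;
    fn next(&self) -> Option<&dyn Stacked>;
}

struct Layer {
    msg: &'static str,
    inner: Option<Box<dyn Stacked>>,
}

impl Stacked for Layer {
    fn message(&self) -> String {
        self.msg.to_string()
    }
    fn next(&self) -> Option<&dyn Stacked> {
        self.inner.as_deref()
    }
}

// Walk from the outermost layer to the root cause, numbering each layer.
fn render(err: &dyn Stacked) -> Vec<String> {
    let mut buf = Vec::new();
    let mut cur: Option<&dyn Stacked> = Some(err);
    while let Some(e) = cur {
        buf.push(format!("{}: {}", buf.len(), e.message()));
        cur = e.next();
    }
    buf
}

fn main() {
    let root = Layer { msg: "invalid character at position 1", inner: None };
    let mid = Layer { msg: "Failed to decode ProtocolHeader", inner: Some(Box::new(root)) };
    let top = Layer { msg: "Failed to handle incoming content", inner: Some(Box::new(mid)) };
    let stack = render(&top);
    assert_eq!(stack[0], "0: Failed to handle incoming content");
    assert_eq!(stack[2], "2: invalid character at position 1");
}
```

The generated `debug_fmt` does essentially this walk, with the location appended to each line.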

&lt;p&gt;One tricky thing we did here concerns distinguishing internal and external errors. Internal errors all implement the same trait, &lt;a href="https://greptimedb.rs/common_error/ext/trait.ErrorExt.html"&gt;&lt;code&gt;ErrorExt&lt;/code&gt;&lt;/a&gt;, which can be used as a marker, but depending on it requires a &lt;code&gt;downcast&lt;/code&gt; every time. We avoid this extra &lt;code&gt;downcast&lt;/code&gt; call by simply giving the two kinds different field names and detecting them in our macro.&lt;/p&gt;

&lt;p&gt;As shown below, we name all external errors &lt;code&gt;error&lt;/code&gt; and all internal errors &lt;code&gt;source&lt;/code&gt;. The generated &lt;a href="https://greptimedb.rs/common_error/ext/trait.StackError.html#tymethod.next"&gt;&lt;code&gt;StackError::next&lt;/code&gt;&lt;/a&gt; method then returns &lt;code&gt;None&lt;/code&gt; when it finds an &lt;code&gt;error&lt;/code&gt; field, or &lt;code&gt;Some(source)&lt;/code&gt; when it finds a &lt;code&gt;source&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[derive(Snafu)]&lt;/span&gt;
&lt;span class="nd"&gt;#[stack_trace_debug]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;#[snafu(display(&lt;/span&gt;&lt;span class="s"&gt;"Failed to deserialize value"&lt;/span&gt;&lt;span class="nd"&gt;))]&lt;/span&gt;
    &lt;span class="n"&gt;ValueDeserialize&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nd"&gt;#[snafu(source)]&lt;/span&gt;
        &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;serde_json&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt;-- external source&lt;/span&gt;
        &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="nd"&gt;#[snafu(display(&lt;/span&gt;&lt;span class="s"&gt;"Table engine not found: {}"&lt;/span&gt;&lt;span class="nd"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;engine_name))]&lt;/span&gt;
    &lt;span class="n"&gt;TableEngineNotFound&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;engine_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;table&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;// &amp;lt;-- internal source&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The method &lt;a href="https://greptimedb.rs/common_error/ext/trait.StackError.html#tymethod.debug_fmt"&gt;&lt;code&gt;StackError::debug_fmt&lt;/code&gt;&lt;/a&gt; is used to render the error stack. It is called recursively in the generated code: each error layer writes its own debug message to the mutable &lt;code&gt;buf&lt;/code&gt;. The content contains the error description captured from the &lt;code&gt;#[snafu(display)]&lt;/code&gt; attribute, the variant name like &lt;code&gt;TableEngineNotFound&lt;/code&gt;, and the location from the enum variant.&lt;/p&gt;

&lt;p&gt;Given that we had already defined our error types this way, adopting the stacked error didn't require much work: adding the attribute macro &lt;code&gt;#[stack_trace_debug]&lt;/code&gt; to every error type was enough.&lt;/p&gt;

&lt;h3&gt;
  
  
  Present Error to End Users
&lt;/h3&gt;

&lt;p&gt;So far, we've covered most aspects. Now, let's delve into the final piece which is how to present errors to your users.&lt;/p&gt;

&lt;p&gt;Unlike system developers, users may not care about line numbers or even the stack. What information, then, is truly beneficial to end users?&lt;/p&gt;

&lt;p&gt;This topic is very subjective. Still taking the above error as an example, let's consider which parts users would, or should, care about:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Failed to handle protocol
0: Failed to handle incoming content, query: blabla, at src/protocol/handler.rs:89:22
1: Failed to read the next message at queue 5 of 10, at src/protocol/loop.rs:254:14
2: Failed to decode `01010001001010001` to ProtocolHeader, at src/protocol/codec.rs:90:14
3: serde_json(invalid character at position 1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first line gives a brief description of the error, i.e., what users actually see from the top layer, so we keep it. Lines 2 and 3 are internal details, too verbose to include. Line 4 is the leaf internal error, the boundary between internal code and external dependencies. It sometimes contains useful information, so we count it in; however, we include only the error description, since the stack number and code location are useless to users. The last line is the external error, which is usually the root cause, and we include it as well.&lt;/p&gt;

&lt;p&gt;Let's assemble the pieces we just picked. The final error message presented to users is as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Failed to handle protocol - Failed to decode `01010001001010001` to ProtocolHeader (serde_json(invalid character at position 1))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This can be achieved easily with the aforementioned &lt;code&gt;StackError::next&lt;/code&gt; and &lt;a href="https://greptimedb.rs/common_error/ext/trait.StackError.html#method.last"&gt;&lt;code&gt;StackError::last&lt;/code&gt;&lt;/a&gt;, or you can customize the format with those methods.&lt;/p&gt;

&lt;p&gt;Our experience is that the leaf (or innermost) error's message is often useful, as it is closest to what really went wrong. The message can be further divided into two parts, internal and external, where internal errors are those defined in our codebase and external ones come from dependencies, like &lt;code&gt;serde_json&lt;/code&gt; in the previous example. The root (or outermost) error's category is more accurate, as it comes from where the error is thrown to the user.&lt;/p&gt;
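A tiny, hypothetical sketch of assembling the user-facing message from those picks (the `user_message` function is an illustration, not GreptimeDB's actual code):

```rust
// Compose the user-facing message from three picks out of the full stack:
// the top-level description, the innermost internal reason, and the
// external root cause (if any).
fn user_message(top: &str, leaf_internal: &str, external: Option<&str>) -> String {
    match external {
        Some(cause) => format!("{} - {} ({})", top, leaf_internal, cause),
        None => format!("{} - {}", top, leaf_internal),
    }
}

fn main() {
    let msg = user_message(
        "Failed to handle protocol",
        "Failed to decode `01010001001010001` to ProtocolHeader",
        Some("serde_json(invalid character at position 1)"),
    );
    assert_eq!(
        msg,
        "Failed to handle protocol - Failed to decode `01010001001010001` to ProtocolHeader (serde_json(invalid character at position 1))"
    );
}
```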

&lt;p&gt;In short, the error message scheme we proposed is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;KIND - REASON ([EXTERNAL CAUSE])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;



&lt;h2&gt;
  
  
  Cost?
&lt;/h2&gt;

&lt;p&gt;The virtual stack has been sweet so far, proving both cheaper and more accurate than the system backtrace. So what is the cost?&lt;/p&gt;

&lt;p&gt;As for runtime overhead, it only requires formatting a short string for the per-level reason and location.&lt;/p&gt;

&lt;p&gt;The binary-size overhead is even smaller. In GreptimeDB's binary, the debug symbols occupy ~700MB. As a comparison, the &lt;code&gt;strip&lt;/code&gt;-ed binary is around 170MB, with a &lt;code&gt;.rodata&lt;/code&gt; section of &lt;code&gt;0x016a2225&lt;/code&gt; bytes (~22.6MB) and a &lt;code&gt;.text&lt;/code&gt; section of &lt;code&gt;0x06ad7511&lt;/code&gt; bytes (~106.8MB).&lt;/p&gt;

&lt;p&gt;Removing all &lt;code&gt;Location&lt;/code&gt; fields reduces the &lt;code&gt;.rodata&lt;/code&gt; size to &lt;code&gt;0x0169b225&lt;/code&gt; (still ~22.6MB; the change is tiny) and leaves the overall binary size at 170MB, while removing all &lt;code&gt;#[snafu(display)]&lt;/code&gt; attributes reduces the &lt;code&gt;.rodata&lt;/code&gt; size to &lt;code&gt;0x01690225&lt;/code&gt; (~22.5MB), again leaving the overall binary size at 170MB.&lt;/p&gt;

&lt;p&gt;Hence, the Stacked Error mechanism adds very little to the binary size (~100KB).&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion and Future Works
&lt;/h2&gt;

&lt;p&gt;In this post, we presented how to implement the proc-macro &lt;a href="https://greptimedb.rs/common_macro/attr.stack_trace_debug.html"&gt;&lt;code&gt;stack_trace_debug&lt;/code&gt;&lt;/a&gt; and use it to assemble a low-overhead yet still informative stacked error message. It also provides a convenient way to walk the error chain, which helps render the error in different schemes for different purposes.&lt;/p&gt;

&lt;p&gt;This macro is only adopted in GreptimeDB for now; we are working to make it generic enough for other projects. Wider adoption of this pattern would also make it more powerful, by bridging third-party stacks and detailed reasons.&lt;/p&gt;

&lt;p&gt;Besides, &lt;code&gt;std::error::Error&lt;/code&gt; now provides an unstable API, &lt;a href="https://doc.rust-lang.org/std/error/trait.Error.html#method.provide"&gt;&lt;code&gt;provide&lt;/code&gt;&lt;/a&gt;, which allows retrieving a field from a struct. We may consider using it when refactoring our stack-trace utilities.&lt;/p&gt;




&lt;h3&gt;
  
  
  About Greptime
&lt;/h3&gt;

&lt;p&gt;We help industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real-time. &lt;/p&gt;

&lt;p&gt;Visit the &lt;a href="https://www.greptime.com/resources"&gt;latest version&lt;/a&gt; from any device to get started and get the most out of your data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GreptimeDB&lt;/a&gt;, written in Rust, is a distributed, open-source, time-series database designed for scalability, efficiency, and powerful analytics. &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.greptime.com/product/cloud"&gt;GreptimeCloud&lt;/a&gt; is a fully-managed cloud database-as-a-service (DBaaS) solution built on GreptimeDB. It efficiently supports applications in fields such as observability, IoT, and finance. The built-in observability solution, &lt;a href="https://www.greptime.com/product/ai"&gt;GreptimeAI&lt;/a&gt;, helps users comprehensively monitor the cost, performance, traffic, and security of LLM applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vehicle-Cloud Integrated TSDB&lt;/strong&gt; solution is tailored for business scenarios of automotive enterprises. It addresses the practical business pain points that arise when enterprise vehicle data grows exponentially.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If anything above draws your attention, don't hesitate to star us on &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GitHub&lt;/a&gt; or join GreptimeDB Community on &lt;a href="https://www.greptime.com/slack"&gt;Slack&lt;/a&gt;. Also, you can go to our &lt;a href="https://github.com/GreptimeTeam/greptimedb/contribute"&gt;contribution page&lt;/a&gt; to find some interesting issues to start with.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>database</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Overcoming Prometheus's Single-Value Data Model Limitations - A New Approach by GreptimeDB</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Sun, 12 May 2024 08:46:55 +0000</pubDate>
      <link>https://dev.to/greptime/overcoming-prometheuss-single-value-data-model-limitations-a-new-approach-by-greptimedb-30cg</link>
      <guid>https://dev.to/greptime/overcoming-prometheuss-single-value-data-model-limitations-a-new-approach-by-greptimedb-30cg</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Prometheus has established itself as a cornerstone in the monitoring and alerting ecosystem, favored for its straightforwardness and efficiency in handling real-time metrics. Central to its operation is a data model where each sample comprises a single value and an assortment of labels, a design that, while fostering simplicity and adaptability, also introduces several challenges. These challenges can impact data collection efficiency, analysis depth, and query capabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This article explores the limitations inherent in Prometheus's single-value data model and introduces GreptimeDB's innovative solutions that aim to address these issues, illustrated with practical examples.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges of The Single-Value Data Model
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Redundant Label Transmission in Data Collection
&lt;/h3&gt;

&lt;p&gt;Prometheus's data model necessitates the repeated transmission of labels for measurements from the same source, resulting in inefficient data collection and storage. Despite the employment of optimization techniques in Prometheus's storage engine to enhance data storage efficiency, the redundancy of label information still poses a significant overhead.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In a scenario where multiple metrics like CPU usage, memory usage, and disk I/O are collected from a server cluster, each metric carries identical labels such as &lt;code&gt;cluster_name&lt;/code&gt;, &lt;code&gt;region&lt;/code&gt;, &lt;code&gt;instance&lt;/code&gt;, and &lt;code&gt;server_type&lt;/code&gt;, leading to unnecessary duplication.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxl7ya2kqt5u8hwrg1sjx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxl7ya2kqt5u8hwrg1sjx.png" alt="Multiple Metrics" width="800" height="1058"&gt;&lt;/a&gt;&lt;/p&gt;
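&lt;p&gt;The scale of this duplication is easy to estimate. The toy Rust sketch below (hypothetical label set; it counts raw label bytes only, ignoring any compression the storage engine applies) compares the label bytes carried under the single-value model versus a grouped, multi-value sample:&lt;/p&gt;

```rust
// Sum the raw bytes of a label set (key length + value length).
fn label_bytes(labels: &[(&str, &str)]) -> usize {
    labels.iter().map(|(k, v)| k.len() + v.len()).sum()
}

fn main() {
    let labels = [
        ("cluster_name", "prod"),
        ("region", "us-west"),
        ("instance", "10.0.0.1"),
        ("server_type", "db"),
    ];
    let per_sample = label_bytes(&labels);
    let metrics = 3; // e.g. cpu_usage, mem_usage, disk_io from the same host

    // Single-value model: every metric repeats the full label set.
    println!("single-value model: {} label bytes", per_sample * metrics);
    // Multi-value model: the label set is carried once for the group.
    println!("multi-value model:  {} label bytes", per_sample);
}
```

&lt;p&gt;With &lt;em&gt;n&lt;/em&gt; metrics per source, the single-value model transmits the same label set &lt;em&gt;n&lt;/em&gt; times.&lt;/p&gt;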

&lt;h3&gt;
  
  
  2. Loss of Measurement Correlation
&lt;/h3&gt;

&lt;p&gt;The separation of related measurements into distinct metrics, without a mechanism for structured grouping or inheritance, leads to a loss of correlation among measurements. This separation makes correlated analysis and queries difficult, limiting insights into metric interactions.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When monitoring a Redis instance by tracking metrics such as memory usage, command processing rates, and active connections separately, it becomes challenging to analyze how these metrics influence each other. For example, understanding how memory usage affects command processing rates becomes difficult.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Complexity in Querying Composite Monitoring Views
&lt;/h3&gt;

&lt;p&gt;Creating comprehensive monitoring dashboards requires aggregating data from multiple, separate PromQL queries, complicating dashboard construction and increasing the query load.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To monitor a Kubernetes node effectively, a dashboard needs to aggregate metrics like CPU load, memory consumption, network I/O, and pod counts. However, each metric requires a separate PromQL query, which complicates the dashboard setup and may potentially impact performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  GreptimeDB to the Rescue
&lt;/h2&gt;

&lt;p&gt;GreptimeDB introduces innovative solutions to address the limitations of Prometheus's single-value data model:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Group related metrics and store them together
&lt;/h3&gt;

&lt;p&gt;GreptimeDB has developed a new storage engine for this monitoring scenario, called &lt;a href="https://docs.greptime.com/contributor-guide/datanode/metric-engine"&gt;Metric Engine&lt;/a&gt;. It supports storing multiple measurements together physically, cutting a huge amount of cost and accelerating the query in correlated measurements.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Multi-Value Samples and Diverse Value Types
&lt;/h3&gt;

&lt;p&gt;GreptimeDB allows each sample from a single data source to store multiple values, supporting a variety of value types beyond floats.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Monitoring data for a Redis instance can be stored in one or multiple time-series tables, with labels stored as separate tag columns and grouped measurements as separate field columns. This approach reduces label transmission redundancy, preserves data correlation, and facilitates associated analysis and querying.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foh6ik3agdoy8nu9bzhba.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foh6ik3agdoy8nu9bzhba.png" alt="Example of Monitoring Data for Redis" width="800" height="131"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Extended PromQL for Multiple Field Queries
&lt;/h3&gt;

&lt;p&gt;GreptimeDB enhances PromQL to allow queries to return multiple fields (values). To specify a particular field, an extended &lt;code&gt;__field__&lt;/code&gt; label can be used.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This extended PromQL query &lt;code&gt;memstats{ __field__="used_bytes", __field__="free_bytes"}&lt;/code&gt; fetches two time series in one query and renders them together. This extension simplifies querying for composite monitoring views, reducing the complexity and load of constructing detailed dashboards.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Support for Table Model and SQL for Advanced Association Analysis
&lt;/h3&gt;

&lt;p&gt;One of the most impactful features GreptimeDB offers is its support for a table model and the use of SQL for querying data. This capability significantly surpasses the flexibility of PromQL, especially when it comes to performing association analysis and executing complex queries. By leveraging a relational model, users can perform joins across different datasets, enabling a deeper and more nuanced analysis of the monitored systems.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In a complex monitoring scenario where one needs to correlate server performance metrics with application error logs, GreptimeDB allows for this data to be queried together using SQL. For instance, one could execute a SQL query to join CPU usage metrics with application error logs based on timestamps, providing insights into how spikes in CPU usage may correlate with increased error rates. This level of analysis would be cumbersome, if not impossible, to achieve with PromQL alone.&lt;/p&gt;
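&lt;p&gt;The mechanics of such a timestamp join can be sketched in a few lines of Rust (toy data and hypothetical function names; in GreptimeDB you would express this directly in SQL):&lt;/p&gt;

```rust
use std::collections::HashMap;

// A toy hash join on timestamps, sketching what a query like
// `SELECT ... FROM cpu_metrics JOIN error_logs USING (ts)` computes:
// build a hash table on one side, then probe it with the other.
fn join_on_ts(
    cpu: &[(i64, f64)],    // (timestamp, cpu_usage_percent)
    errors: &[(i64, u32)], // (timestamp, error_count)
) -> Vec<(i64, f64, u32)> {
    let by_ts: HashMap<i64, u32> = errors.iter().copied().collect();
    cpu.iter()
        .filter_map(|&(ts, usage)| by_ts.get(&ts).map(|&n| (ts, usage, n)))
        .collect()
}

fn main() {
    let cpu = [(1000, 35.0), (1060, 97.5), (1120, 40.2)];
    let errors = [(1060, 42), (1120, 1)];
    // Only timestamps present on both sides survive the join,
    // pairing each CPU spike with its error count.
    for (ts, usage, n) in join_on_ts(&cpu, &errors) {
        println!("ts={ts} cpu={usage}% errors={n}");
    }
}
```

&lt;p&gt;Expressed in SQL, the engine performs this correlation for you; the sketch only illustrates why joining on a shared timestamp column makes cross-dataset analysis straightforward.&lt;/p&gt;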

&lt;p&gt;P.S. GreptimeDB is actively developing the logs engine as described in the &lt;a href="https://www.greptime.com/blogs/2024-02-29-greptimedb-2024-roadmap"&gt;Roadmap&lt;/a&gt;. Stay tuned!&lt;/p&gt;

&lt;p&gt;This support for a table model and SQL not only makes GreptimeDB a powerful tool for users transitioning from traditional SQL-based systems, but also enhances its capability for in-depth analysis without the steep learning curve associated with mastering PromQL. Introducing these features marks a significant step forward in making monitoring data more accessible and actionable for a broader range of analytical tasks, from basic monitoring to complex performance analysis and troubleshooting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;While Prometheus's single-value data model has contributed to its simplicity and widespread adoption, it also poses challenges in terms of data collection efficiency, measurement correlation, and query complexity. GreptimeDB's solutions offer a promising approach to overcoming these limitations, providing more efficient data collection, enhanced correlation analysis, and simplified querying for comprehensive monitoring views.&lt;/p&gt;




&lt;h4&gt;
  
  
  About Greptime
&lt;/h4&gt;

&lt;p&gt;We help industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real-time. &lt;/p&gt;

&lt;p&gt;Visit the &lt;a href="https://www.greptime.com/resources"&gt;latest version&lt;/a&gt; from any device to get started and get the most out of your data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GreptimeDB&lt;/a&gt;, written in Rust, is a distributed, open-source, time-series database designed for scalability, efficiency, and powerful analytics. &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.greptime.com/product/cloud"&gt;GreptimeCloud&lt;/a&gt; is a fully-managed cloud database-as-a-service (DBaaS) solution built on GreptimeDB. It efficiently supports applications in fields such as observability, IoT, and finance. The built-in observability solution, &lt;a href="https://www.greptime.com/product/ai"&gt;GreptimeAI&lt;/a&gt;, helps users comprehensively monitor the cost, performance, traffic, and security of LLM applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vehicle-Cloud Integrated TSDB&lt;/strong&gt; solution is tailored for business scenarios of automotive enterprises. It addresses the practical business pain points that arise when enterprise vehicle data grows exponentially.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If anything above draws your attention, don't hesitate to star us on &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GitHub&lt;/a&gt; or join GreptimeDB Community on &lt;a href="https://www.greptime.com/slack"&gt;Slack&lt;/a&gt;. Also, you can go to our &lt;a href="https://github.com/GreptimeTeam/greptimedb/contribute"&gt;contribution page&lt;/a&gt; to find some interesting issues to start with.&lt;/p&gt;

</description>
      <category>prometheus</category>
      <category>monitoring</category>
      <category>database</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Introducing GreptimeDB v0.7 — Unlock the Future of Cloud-Native Monitoring</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Thu, 07 Mar 2024 07:56:52 +0000</pubDate>
      <link>https://dev.to/greptime/introducing-greptimedb-v07-unlock-the-future-of-cloud-native-monitoring-2623</link>
      <guid>https://dev.to/greptime/introducing-greptimedb-v07-unlock-the-future-of-cloud-native-monitoring-2623</guid>
      <description>&lt;p&gt;Last week, we unveiled the &lt;a href="https://www.greptime.com/blogs/2024-02-29-greptimedb-2024-roadmap"&gt;GreptimeDB roadmap for 2024&lt;/a&gt;, charting out several significant updates slated for this year.&lt;/p&gt;

&lt;p&gt;With the advent of spring in early March, we also welcomed the debut of the first production-grade version of GreptimeDB. v0.7 represents a crucial leap toward achieving production readiness; &lt;strong&gt;it implements production-ready features for cloud-native monitoring scenarios&lt;/strong&gt;. We eagerly invite the entire community to engage with this release and share their invaluable feedback through &lt;a href="https://www.greptime.com/slack"&gt;Slack&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;From v0.6 to v0.7, the Greptime team has made significant strides: a total of 184 commits were merged and 705 files modified, including 82 feature enhancements, 35 bug fixes, 19 code refactors, and a substantial amount of testing work. &lt;/p&gt;

&lt;p&gt;During this period, a total of 8 individual contributors participated in the code contributions. Special thanks to &lt;a href="https://github.com/etolbakov"&gt;Eugene Tolbakov&lt;/a&gt; for being continuously active in GreptimeDB's development as our first committer!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update Highlights&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Metric Engine: crafted specifically for Observability scenarios. It's adept at managing a vast array of small tables, &lt;strong&gt;making it ideal for cloud-native monitoring&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Region Migration: enhances the user experience by simplifying region migrations to straightforward SQL commands.&lt;/li&gt;
&lt;li&gt;Inverted Index: dramatically improves the efficiency of locating data segments relevant to user queries, significantly reducing the IO operations needed for scanning data files and thus accelerating the query process.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, let's dive deep into the updates in v0.7.&lt;/p&gt;

&lt;h2&gt;
  
  
  Region Migration
&lt;/h2&gt;

&lt;p&gt;Region Migration provides the capability to migrate regions of a data table between Datanodes. Leveraging this feature, we can easily implement hot data migration and horizontal scaling for load balancing. GreptimeDB shipped an initial implementation of Region Migration in v0.6. In v0.7, we have further refined the feature and enhanced the user experience. &lt;/p&gt;

&lt;p&gt;Now, we can conveniently execute region migration through SQL commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;migrate_region&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;region_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;from_dn_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;to_dn_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;replay_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx01cyrybkzdqv45hkyet.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx01cyrybkzdqv45hkyet.png" alt="Image description" width="800" height="881"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Metric Engine
&lt;/h2&gt;

&lt;p&gt;The Metric Engine is a brand-new engine designed specifically for Observability scenarios. Its primary goal is to handle a large number of small tables, making it particularly suitable for cloud-native monitoring, such as scenarios previously served by Prometheus. By leveraging synthetic wide tables, this new engine offers the capability for metric data storage and metadata reuse, making "tables" more lightweight. It can overcome some of the limitations of the current Mito engine, where tables are too heavyweight.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxr49pjwtkjwi0spv3fms.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxr49pjwtkjwi0spv3fms.png" alt="Image description" width="800" height="756"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Original Metric Data

&lt;ul&gt;
&lt;li&gt;Taking the metrics from the following six node exporters as an example. In the single-value model systems represented by Prometheus, even highly correlated metrics need to be split and stored separately.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7676z82ql2jsh6gufsln.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7676z82ql2jsh6gufsln.png" alt="Image description" width="800" height="510"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Logical Table from User's Perspective

&lt;ul&gt;
&lt;li&gt;The Metric Engine authentically reproduces the structure of Metrics, presenting users with the exact structure of the Metrics as they were written.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fglyp6svyuw3ft7f1pc1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fglyp6svyuw3ft7f1pc1r.png" alt="Image description" width="800" height="1058"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Physical Table from the Storage Perspective

&lt;ul&gt;
&lt;li&gt;At the storage layer, the Metric Engine performs mapping, using a single physical table to store related data. This approach reduces storage costs and supports the storage of Metrics at a larger scale.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66nfw3m4arvryhg442q5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66nfw3m4arvryhg442q5.png" alt="Image description" width="800" height="458"&gt;&lt;/a&gt;&lt;/p&gt;
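&lt;p&gt;Conceptually, the logical-to-physical mapping can be sketched as follows (a toy Rust illustration with hypothetical names, not the engine's actual implementation): each logical table is just a named projection over the columns of one shared physical wide table.&lt;/p&gt;

```rust
use std::collections::HashMap;

// One physical wide table shared by many lightweight logical tables.
struct PhysicalTable {
    columns: Vec<String>,                 // all tag/field columns, stored once
    logical: HashMap<String, Vec<usize>>, // logical table -> column indices
}

impl PhysicalTable {
    // Register a logical table: reuse existing physical columns where
    // possible, appending only columns not yet present.
    fn register(&mut self, name: &str, cols: &[&str]) {
        let mut idxs = Vec::new();
        for c in cols {
            let idx = match self.columns.iter().position(|x| x == c) {
                Some(i) => i,
                None => {
                    self.columns.push(c.to_string());
                    self.columns.len() - 1
                }
            };
            idxs.push(idx);
        }
        self.logical.insert(name.to_string(), idxs);
    }
}

fn main() {
    let mut phys = PhysicalTable { columns: Vec::new(), logical: HashMap::new() };
    phys.register("node_cpu_seconds_total", &["instance", "cpu", "value"]);
    phys.register("node_memory_bytes", &["instance", "value"]);
    // "instance" and "value" are stored once and shared by both logical tables.
    println!("physical columns: {:?}", phys.columns);
}
```

&lt;p&gt;Because shared columns and metadata exist only once at the physical layer, adding another logical table is cheap, which is what makes "tables" atop the Metric Engine lightweight.&lt;/p&gt;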

&lt;ul&gt;
&lt;li&gt;Upcoming Development Plan: Automatic Field Grouping

&lt;ul&gt;
&lt;li&gt;In real-world scenarios that generate Metrics, the majority of these metrics are interconnected. GreptimeDB will possess the capability to automatically identify related metrics and consolidate them. This approach will not only decrease the number of timelines across various metrics but also enhance the efficiency of handling queries across multiple metrics.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj9spoagreycm23ceaz7o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj9spoagreycm23ceaz7o.png" alt="Image description" width="800" height="131"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower Storage Cost

&lt;ul&gt;
&lt;li&gt;For cost testing on the AWS S3 storage backend, data is written for approximately thirty minutes at a total write rate of about 300k rows per second. The number of S3 operations occurring during the test is tallied to estimate the cost based on AWS's pricing. The index feature is enabled throughout the testing process. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Pricing references are taken from the Standard tier at &lt;a href="https://aws.amazon.com/s3/pricing/"&gt;https://aws.amazon.com/s3/pricing/&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ncohv9k4p2y7kwiwd4k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ncohv9k4p2y7kwiwd4k.png" alt="Image description" width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the test data provided, it is evident that the &lt;strong&gt;Metric Engine can significantly reduce storage costs by decreasing the number of physical tables.&lt;/strong&gt; The number of operations at each stage drops by an order of magnitude, which in turn leads to a &lt;strong&gt;more than eightfold overall cost reduction&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inverted Index
&lt;/h2&gt;

&lt;p&gt;Inverted Index, as a newly introduced index module, is designed to pinpoint the data segments pertinent to user queries with high efficiency, significantly reducing the I/O operations required to scan data files, thereby accelerating the query process. &lt;strong&gt;In the context of TSBS testing scenarios, we observed an average performance increase of 50%, with select scenarios experiencing boosts of up to nearly 200%.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The advantages of the Inverted Index include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ready to use&lt;/strong&gt;: The system automatically generates appropriate indexes, with no need for users to specify them manually.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Practical functionality&lt;/strong&gt;: Supports equality, range, and regular expression matches for multiple column values, ensuring rapid data location and filtering in most scenarios.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flexible adaptation&lt;/strong&gt;:  It automatically fine-tunes internal parameters to strike an optimal balance between the cost of construction and the efficiency of queries, adeptly catering to the diverse indexing requirements of different use cases.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ujm048mrp2xgphk15ux.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ujm048mrp2xgphk15ux.png" alt="Image description" width="800" height="353"&gt;&lt;/a&gt;&lt;/p&gt;
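&lt;p&gt;The core idea of an inverted index can be sketched in a few lines of Rust (a toy illustration with hypothetical names, not GreptimeDB's actual implementation): map each (column, value) pair to the list of rows containing it, so an equality query reads only the rows in that posting list instead of scanning every data file.&lt;/p&gt;

```rust
use std::collections::HashMap;

// A toy inverted index over tag columns: (column, value) -> row ids.
#[derive(Default)]
struct InvertedIndex {
    postings: HashMap<(String, String), Vec<usize>>,
}

impl InvertedIndex {
    // Record that `row` holds `value` in `column`.
    fn insert(&mut self, row: usize, column: &str, value: &str) {
        self.postings
            .entry((column.to_string(), value.to_string()))
            .or_default()
            .push(row);
    }

    // Equality lookup: the rows where `column == value`; empty if none.
    fn lookup(&self, column: &str, value: &str) -> &[usize] {
        self.postings
            .get(&(column.to_string(), value.to_string()))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}

fn main() {
    let mut idx = InvertedIndex::default();
    idx.insert(0, "region", "us-west");
    idx.insert(1, "region", "us-east");
    idx.insert(2, "region", "us-west");
    // Only rows 0 and 2 need to be read for `region = 'us-west'`.
    println!("{:?}", idx.lookup("region", "us-west"));
}
```

&lt;p&gt;Range and regular-expression matches generalize the same idea by unioning the posting lists of every value that satisfies the predicate.&lt;/p&gt;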

&lt;h2&gt;
  
  
  Other Updates
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Database management capabilities significantly enhanced&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We have substantially supplemented the &lt;code&gt;information_schema&lt;/code&gt; tables, adding information such as SCHEMATA and PARTITIONS.&lt;/p&gt;

&lt;p&gt;Besides, we also introduced many new SQL functions to facilitate management operations on GreptimeDB. For example, it is now possible to trigger Region Flush and perform Region migration through SQL, as well as to query the execution status of procedures.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Performance Improvement&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In v0.7, the Memtable was restructured and upgraded, enhancing data scan speed and reducing memory usage. At the same time, we have made numerous improvements and optimizations to the read and write performance of object storage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Upgrade Guide
&lt;/h2&gt;

&lt;p&gt;As we have many significant changes in the new version, the release of v0.7 requires a system downtime upgrade. It is recommended to use the official upgrade tool, with the general upgrade process as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a brand new v0.7 cluster&lt;/li&gt;
&lt;li&gt;Shut down the traffic entry to the old cluster (stop writing)&lt;/li&gt;
&lt;li&gt;Export the table structure and data using the GreptimeDB CLI upgrade tool&lt;/li&gt;
&lt;li&gt;Import the data into the new cluster using the GreptimeDB CLI upgrade tool&lt;/li&gt;
&lt;li&gt;Switch the traffic entry to the new cluster&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Please refer to the detailed upgrade guide &lt;a href="https://docs.greptime.com/user-guide/upgrade"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Future Plan
&lt;/h2&gt;

&lt;p&gt;Our next milestone is scheduled for April, as we anticipate the launch of v0.8. This release will mark the completion of GreptimeFlow, a streamlined stream computing solution adept at conducting continuous aggregation across GreptimeDB data streams. Designed with flexibility in mind, GreptimeFlow can either be integrated directly into the GreptimeDB Frontend or deployed as a standalone service within the GreptimeDB architecture. &lt;/p&gt;
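&lt;p&gt;The kind of continuous aggregation GreptimeFlow performs can be sketched as incremental window maintenance (a toy Rust illustration, not GreptimeFlow's API): each arriving sample updates its window's running aggregate, instead of recomputing over the full history.&lt;/p&gt;

```rust
use std::collections::BTreeMap;

// Toy continuous aggregation: maintain per-window sums incrementally
// as samples arrive on the stream.
struct TumblingSum {
    window_secs: i64,
    sums: BTreeMap<i64, f64>, // window start -> running sum
}

impl TumblingSum {
    fn new(window_secs: i64) -> Self {
        Self { window_secs, sums: BTreeMap::new() }
    }

    // Fold one sample into its tumbling window.
    fn push(&mut self, ts: i64, value: f64) {
        let window = ts - ts.rem_euclid(self.window_secs);
        *self.sums.entry(window).or_insert(0.0) += value;
    }
}

fn main() {
    let mut agg = TumblingSum::new(60);
    for (ts, v) in [(0, 1.0), (30, 2.0), (65, 4.0)] {
        agg.push(ts, v);
    }
    // Window [0,60) accumulates 1.0 + 2.0; window [60,120) holds 4.0.
    for (start, sum) in &agg.sums {
        println!("window {start}: sum {sum}");
    }
}
```

&lt;p&gt;Each update touches only one window entry, which is why this style of computation stays cheap enough to run continuously alongside ingestion.&lt;/p&gt;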

&lt;p&gt;Beyond continual functional upgrades, we are persistently optimizing GreptimeDB's performance. Although v0.7 has seen substantial enhancements in performance compared to its predecessors, there remains a gap in observability scenarios compared to some mainstream solutions. Bridging this performance gap will be our primary focus in the upcoming optimization efforts.&lt;/p&gt;

&lt;p&gt;For a comprehensive view of our planned version updates, we invite you to explore the &lt;a href="https://www.greptime.com/blogs/2024-02-29-greptimedb-2024-roadmap"&gt;GreptimeDB 2024 roadmap&lt;/a&gt;. Stay connected and journey with us as we continue to evolve GreptimeDB.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About Greptime&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We help industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real-time. &lt;/p&gt;

&lt;p&gt;Visit the &lt;a href="https://www.greptime.com/resources"&gt;latest v0.7&lt;/a&gt; from any device to get started and get the most out of your data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GreptimeDB&lt;/a&gt;, written in Rust, is a distributed, open-source, time-series database designed for scalability, efficiency, and powerful analytics. &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.greptime.com/product/cloud"&gt;GreptimeCloud&lt;/a&gt; offers a fully managed DBaaS that integrates well with observability and IoT sectors.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.greptime.com/product/ai"&gt;GreptimeAI&lt;/a&gt; is a tailored observability solution for LLM applications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If anything above draws your attention, don't hesitate to star us on &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GitHub&lt;/a&gt; or join GreptimeDB Community on &lt;a href="https://www.greptime.com/slack"&gt;Slack&lt;/a&gt;. Also, you can go to our &lt;a href="https://github.com/GreptimeTeam/greptimedb/contribute"&gt;contribution page&lt;/a&gt; to find some interesting issues to start with.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>database</category>
      <category>rust</category>
      <category>programming</category>
    </item>
    <item>
      <title>What to Expect Next? GreptimeDB Roadmap for 2024</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Fri, 01 Mar 2024 09:52:34 +0000</pubDate>
      <link>https://dev.to/greptime/what-to-expect-next-greptimedb-roadmap-for-2024-33p7</link>
      <guid>https://dev.to/greptime/what-to-expect-next-greptimedb-roadmap-for-2024-33p7</guid>
      <description>&lt;p&gt;Since GreptimeDB's open-sourcing on November 15th, 2022, we have stepped on the committed journey towards crafting a fast and efficient data infrastructure. This endeavor has been propelled by the collaborative efforts of both our dedicated team and the vibrant community that supports us.&lt;/p&gt;

&lt;p&gt;As we embark on the inaugural season of 2024, a leap year enriched by an extra day in February, this year promises to be thrilling as we anticipate numerous groundbreaking developments. These crucial updates will significantly showcase the maturity of our product within production environments, presenting practical benchmarks for users to compare with leading time-series databases in the industry.&lt;/p&gt;

&lt;p&gt;As we forge ahead with GreptimeDB 2024, it prompts the question, "What's next?" This roadmap outlines the objectives our team is pursuing and the visions we harbor for our collective community. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Providing clarity on what the community can expect from GreptimeDB for the next 10-12 months;&lt;/li&gt;
&lt;li&gt;Offering insights to those wishing to contribute to GreptimeDB on GitHub by highlighting potential starting points and the types of projects we are eager to embark on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read the updated roadmap in &lt;a href="https://github.com/GreptimeTeam/greptimedb/issues/3412"&gt;this issue&lt;/a&gt;. &lt;/p&gt;

&lt;h2&gt;
  
  
  Main Feature Updates in 2024
&lt;/h2&gt;

&lt;p&gt;The evolution of GreptimeDB in 2024 is marked by a suite of main feature updates. These enhancements are a testament to our ongoing commitment to excellence, driven by feedback from our community and the latest requirements in real-world scenarios.&lt;/p&gt;

&lt;p&gt;Our roadmap for the year includes significant advancements that promise to elevate the capabilities of GreptimeDB and enrich the user experience. &lt;/p&gt;

&lt;p&gt;Here's a glimpse into what we have in store:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://github.com/GreptimeTeam/greptimedb/blob/main/docs/rfcs/2023-07-10-metric-engine.md"&gt;Metric Engine&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tracking issue: &lt;a href="https://github.com/GreptimeTeam/greptimedb/issues/3187"&gt;https://github.com/GreptimeTeam/greptimedb/issues/3187&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;A new engine designed for observability scenarios. Its primary aim is to handle a large number of small tables, making it particularly suitable for storing Prometheus metrics. By utilizing synthetic wide tables, the engine stores metric data and reuses metadata, rendering the "tables" atop it more lightweight and overcoming some limitations of the existing Mito engine, which is too heavy for such workloads.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;&lt;a href="https://github.com/GreptimeTeam/greptimedb/blob/main/docs/rfcs/2024-01-17-dataflow-framework.md"&gt;GreptimeFlow&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tracking issue: &lt;a href="https://github.com/GreptimeTeam/greptimedb/issues/3187"&gt;https://github.com/GreptimeTeam/greptimedb/issues/3187&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;A lightweight stream computing component capable of performing continuous aggregation on GreptimeDB data streams. It can be embedded into the GreptimeDB Frontend or deployed as a separate service within the GreptimeDB cluster.&lt;/li&gt;
&lt;li&gt;A flow job can be submitted in the form of SQL:
&lt;/li&gt;
&lt;/ul&gt;

&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;TASK&lt;/span&gt; &lt;span class="n"&gt;avg_over_5m&lt;/span&gt; &lt;span class="n"&gt;WINDOW_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;"5m"&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;Index&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/GreptimeTeam/greptimedb/blob/main/docs/rfcs/2023-11-03-inverted-index.md"&gt;Inverted Index&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;Tracking issue: &lt;a href="https://github.com/GreptimeTeam/greptimedb/issues/2705"&gt;https://github.com/GreptimeTeam/greptimedb/issues/2705&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Smart Index

&lt;ul&gt;
&lt;li&gt;For instance, it automatically monitors workloads and query performance, and when necessary, it autonomously creates relevant indexes and removes unused ones.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Spatial Index

&lt;ul&gt;
&lt;li&gt;Supports storage and retrieval of geographic location information.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;Cluster Management &amp;amp; Autopilot&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/GreptimeTeam/greptimedb/blob/main/docs/rfcs/2023-11-07-region-migration.md"&gt;Region Migration&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;Tracking issue: &lt;a href="https://github.com/GreptimeTeam/greptimedb/issues/2700"&gt;https://github.com/GreptimeTeam/greptimedb/issues/2700&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;It offers the capability to migrate Regions between Datanodes, facilitating the relocation of hot data and the horizontal scaling of load balancing.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Auto Rebalance Regions

&lt;ul&gt;
&lt;li&gt;An automated load balancing scheduler built upon Region Migration.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;Logs Engine&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A storage engine designed specifically for the characteristics of log data, sharing most of GreptimeDB's architecture and capabilities, such as the SQL query layer, data sharding, distributed routing, querying, indexing, and compression. This enables GreptimeDB to become a unified system offering optimized storage and a consistent access experience for both Metrics and Logs data, based on a multi-engine architecture.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  GreptimeDB Version Plan
&lt;/h2&gt;

&lt;p&gt;With all the feature updates listed above, we've drawn up the release plan for GreptimeDB in 2024.&lt;/p&gt;

&lt;p&gt;The image below presents the GreptimeDB 2024 Roadmap, showcasing a structured release schedule and the pivotal feature enhancements planned for deployment throughout the year. Please note that these details are tentative and subject to refinement.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucycsfy5etvoja5jgovo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucycsfy5etvoja5jgovo.png" alt="Image description" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Track the progress of GreptimeDB versions &lt;a href="https://github.com/GreptimeTeam/greptimedb/milestones"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;GreptimeDB v1.0 marks a milestone as a production-ready release, boasting advanced features such as Smart Index, setting a new standard for efficiency and performance.&lt;/p&gt;

&lt;p&gt;We warmly invite you to mark your calendar and experience the robust capabilities of GreptimeDB v1.0 (scheduled for release in August) to boost your time-series data management and analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;March: GreptimeDB v0.7&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Region Migration&lt;/li&gt;
&lt;li&gt;Inverted Index&lt;/li&gt;
&lt;li&gt;Metric Engine&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;April: GreptimeDB v0.8&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GreptimeFlow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;June: GreptimeDB v0.9&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto Rebalance Regions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;August: GreptimeDB v1.0&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smart Index&lt;/li&gt;
&lt;li&gt;Spatial Index&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;December: GreptimeDB v1.1&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Logs Engine: Data ingestion from popular log collectors&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get Involved Now
&lt;/h2&gt;

&lt;p&gt;If anything above draws your attention, don't hesitate to star us on &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GitHub&lt;/a&gt; or join the GreptimeDB Community on &lt;a href="https://www.greptime.com/slack"&gt;Slack&lt;/a&gt;. Also, you can go to our &lt;a href="https://github.com/GreptimeTeam/greptimedb/contribute"&gt;contribution page&lt;/a&gt; to find some interesting issues to start with.&lt;/p&gt;

&lt;p&gt;Looking beyond the initiatives already in progress, there's plenty of room for improvement, and we welcome ideas beyond these planned updates. If you're interested in giving one a try, speak up and chat with the team; we'll be glad to help you get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  About Us
&lt;/h2&gt;

&lt;p&gt;Greptime helps industries that generate large amounts of time-series data, such as Connected Vehicles (CV), IoT, and Observability, to efficiently uncover the hidden value of data in real time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GreptimeDB, written in Rust, is a distributed, open-source, time-series database designed for scalability, efficiency, and powerful analytics. &lt;/li&gt;
&lt;li&gt;GreptimeCloud offers a fully managed DBaaS that integrates well with observability and IoT sectors.&lt;/li&gt;
&lt;li&gt;GreptimeAI is a tailored observability solution for LLM applications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As an open-source initiative, we welcome enthusiasts of relevant technologies to join our community and share their insights. Star us now on GitHub and help us strengthen our community together.&lt;/p&gt;

&lt;p&gt;Twitter: &lt;a href="https://twitter.com/Greptime"&gt;https://twitter.com/Greptime&lt;/a&gt;&lt;br&gt;
LinkedIn: &lt;a href="https://www.linkedin.com/company/gr"&gt;https://www.linkedin.com/company/gr&lt;/a&gt;&lt;br&gt;
Youtube: &lt;a href="https://www.youtube.com/@greptime"&gt;https://www.youtube.com/@greptime&lt;/a&gt;&lt;br&gt;
Slack: &lt;a href="https://www.greptime.com/slack"&gt;https://www.greptime.com/slack&lt;/a&gt;&lt;br&gt;
Contact us: &lt;a href="mailto:info@greptime.com"&gt;info@greptime.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>cloudnative</category>
      <category>rust</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Research Paper Sharing - Exploiting Cloud Object Storage for High-Performance Analytics</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Fri, 02 Feb 2024 01:41:56 +0000</pubDate>
      <link>https://dev.to/greptime/research-paper-sharing-exploiting-cloud-object-storage-for-high-performance-analytics-4g52</link>
      <guid>https://dev.to/greptime/research-paper-sharing-exploiting-cloud-object-storage-for-high-performance-analytics-4g52</guid>
      <description>&lt;p&gt;In this sharing, we discuss a paper by Dominik Durner, Viktor Leis, and Thomas Neumann from the Technical University of Munich (TUM), published in July 2023 in &lt;a href="https://dl.acm.org/toc/pvldb/2023/16/11"&gt;PVLDB (Volume 16 No.11)&lt;/a&gt;: &lt;em&gt;Exploiting Cloud Object Storage for High-Performance Analytics&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;DB: &lt;a href="https://umbra-db.com/"&gt;https://umbra-db.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Paper link: &lt;a href="https://www.vldb.org/pvldb/vol16/p2769-durner.pdf"&gt;https://www.vldb.org/pvldb/vol16/p2769-durner.pdf&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;...Our experiments demonstrate that even without caching, Umbra with integrated AnyBlob can match the performance of state-of-the-art cloud data warehouses that utilize local SSDs for caching, while also enhancing resource elasticity...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When developing our open-source cloud-native time-series analytical database &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GreptimeDB&lt;/a&gt;, we found this paper exceptionally beneficial. It primarily focuses on performing high-performance data analytics on object storage, with several conclusions providing clear direction for our engineering practices.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction to AWS S3
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;AWS S3's storage cost is $23 per TB per month, offering 99.999999999% (eleven nines) of durability. It's important to note that the final cost also depends on the number of API calls and cross-region data transfer fees.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The bandwidth for accessing S3 can reach up to 200 Gbps, depending on the instance's bandwidth. While the Introduction mentions 100 Gbps, later sections state that on AWS C7-series instances the bandwidth can reach a full 200 Gbps.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The paper identifies the following challenges with AWS S3:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Challenge 1: Underutilization of bandwidth&lt;/li&gt;
&lt;li&gt;Challenge 2: Additional network CPU overhead&lt;/li&gt;
&lt;li&gt;Challenge 3: Lack of multi-cloud support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Based on our experience, these challenges rank in importance as 1 &amp;gt; 2 &amp;gt; 3.&lt;/p&gt;

&lt;h2&gt;
  
  
  Characteristics of Cloud Storage (Object Storage)
&lt;/h2&gt;

&lt;p&gt;Cloud storage (object storage) typically offers relatively low latency (ranging from several milliseconds to a few hundred milliseconds depending on the load size) and high throughput (capped by EC2 bandwidth, which can go as high as 200 Gbps on 7th generation EC2 models), making it suitable for large-scale data read and write operations.&lt;/p&gt;

&lt;p&gt;In contrast, Amazon Elastic Block Store (EBS) usually provides lower latency (in the order of single-digit milliseconds). However, its throughput is lower than cloud storage, often by one or two orders of magnitude.&lt;/p&gt;

&lt;h3&gt;
  
  
  Latency
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg39njr1zdc1977xhnsby.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg39njr1zdc1977xhnsby.png" alt="Image description" width="800" height="810"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;For &lt;strong&gt;small requests&lt;/strong&gt;, first byte latency is a decisive factor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the case of &lt;strong&gt;large requests&lt;/strong&gt;, experiments ranging from 8 MiB to 32 MiB showed that latency increases linearly with file size, ultimately reaching the bandwidth limit of a single request.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Regarding &lt;strong&gt;hot data&lt;/strong&gt;, we use the first and the twentieth requests to represent scenarios of cold and hot data requests, respectively. In hot data request scenarios, latency is typically lower.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In GreptimeDB, the average latency data in the data file reading scenarios are as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;For operations involving reading Manifest Files averaging less than 1 KiB, the expected latency is around 30 ms (p50, Cold) / ~ 60 ms (p99, Cold).&lt;/li&gt;
&lt;li&gt;Reading an 8 MiB Parquet file would take ~ 240 ms (p50, Cold) / ~ 370 ms (p99, Cold).&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Noisy neighbors
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg6c6s85g4z3swpojqchv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg6c6s85g4z3swpojqchv.png" alt="Image description" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Experimental Method: Single request of 16 MiB&lt;br&gt;
Bandwidth Calculation Method: Total bytes / Duration&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is significant variability in object bandwidth, ranging from approximately 25 to 95 MiB/s.&lt;/li&gt;
&lt;li&gt;A considerable number of data points (15%) are at the maximum value (~95 MiB/s).&lt;/li&gt;
&lt;li&gt;The median performance is stable at 55-60 MiB/s.&lt;/li&gt;
&lt;li&gt;Performance tends to be higher on weekends.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Latency across different cloud providers
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4vsaotp8wnqypmpm2n26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4vsaotp8wnqypmpm2n26.png" alt="Image description" width="800" height="510"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Experimental Method: The test involves individual files of 16 MiB, with each request spaced 12 hours apart to reduce the influence of caching.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;S3 exhibits the highest latency among the tested services.&lt;/li&gt;
&lt;li&gt;S3 has a "minimum latency," meaning all data points exceed this value.&lt;/li&gt;
&lt;li&gt;Compared to AWS, the presence of outliers in the low latency range for other providers suggests they do not conceal the effects of caching.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The above phenomenon might be related to the hardware and implementation of S3. Overall, older hardware or different caching strategies could lead to the observed outcomes in points 2 and 3.&lt;/p&gt;

&lt;h3&gt;
  
  
  Throughput
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjpifjz6xpdjy9xwi0jwv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjpifjz6xpdjy9xwi0jwv.png" alt="Image description" width="800" height="762"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The outcomes from the above figure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single file of 16 MiB, with 256 parallel requests to achieve maximum throughput (100 Gbps).&lt;/li&gt;
&lt;li&gt;The throughput bandwidth fluctuates with the region.&lt;/li&gt;
&lt;li&gt;The median bandwidth of AWS is 75 Gbps.&lt;/li&gt;
&lt;li&gt;The median bandwidth of Cloud X is 40 Gbps.&lt;/li&gt;
&lt;li&gt;The median bandwidth of Cloud Y is 50 Gbps.&lt;/li&gt;
&lt;li&gt;The difference between cold and hot data is minimal.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Optimal Request Size
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58pn1ln0tbt1gh7wfiud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58pn1ln0tbt1gh7wfiud.png" alt="Image description" width="800" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;From the above graph, we can see that the optimal request size usually lies between 8 and 16 MiB. Although 32 MiB requests cost a bit less, each one takes twice as long to download as a 16 MiB request at the same per-request bandwidth, making the trade-off less favorable.&lt;/li&gt;
&lt;/ul&gt;
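&lt;p&gt;To make the cost side of this trade-off concrete, here is a minimal sketch of the request-cost arithmetic. The GET price used below ($0.40 per million requests, roughly the commonly quoted S3 Standard rate) is an assumption for illustration, not a figure from the paper: halving the request size doubles the number of GETs, and therefore the request cost, for the same volume of data.&lt;/p&gt;

```python
# Assumed S3 Standard GET price: $0.40 per million requests (illustrative only).
GET_COST_PER_REQUEST = 0.40 / 1_000_000

def retrieval_cost(total_gib: float, request_mib: float) -> float:
    """Request cost (in dollars) of downloading `total_gib` GiB of data
    in chunks of `request_mib` MiB each."""
    n_requests = total_gib * 1024 / request_mib
    return n_requests * GET_COST_PER_REQUEST

# Downloading 1 TiB: 32 MiB requests cost half as much as 16 MiB ones,
# but each request takes about twice as long at the same per-request bandwidth.
cost_16 = retrieval_cost(1024, 16)   # 65,536 requests
cost_32 = retrieval_cost(1024, 32)   # 32,768 requests
```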

&lt;h3&gt;
  
  
  Encryption
&lt;/h3&gt;

&lt;p&gt;So far, all experiments conducted are based on non-secure HTTP connections. In this section, the authors compare the throughput performance with AES encryption enabled and after switching to HTTPS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsxno8ly7v6c3b9d3ogto.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsxno8ly7v6c3b9d3ogto.png" alt="Image description" width="526" height="732"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTPS requires twice the CPU resources compared to HTTP.&lt;/li&gt;
&lt;li&gt;AES encryption increases CPU resources by only 30%.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In AWS, traffic between all regions, and even within availability zones, is automatically encrypted by the network infrastructure. Within the same location, due to VPC isolation, no other user can intercept the traffic between EC2 instances and the S3 gateway. &lt;strong&gt;Therefore, using HTTPS in this context is redundant.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Slow Requests
&lt;/h3&gt;

&lt;p&gt;In the experiments, the authors observed significant tail latency in some requests, with some even being lost without any notification. To address this, cloud providers recommend a request-hedging strategy: re-issuing requests that remain unresponsive past a latency threshold.&lt;/p&gt;

&lt;p&gt;The authors have gathered some empirical data on slow requests for 16 MiB files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;After 600 milliseconds, less than 5% of objects have not been successfully downloaded.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Less than 5% of objects have a first byte latency exceeding 200 milliseconds.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Based on these observations, one can consider re-downloading attempts for requests exceeding a certain latency threshold.&lt;/p&gt;
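&lt;p&gt;The hedging idea can be sketched as follows. This is a minimal illustration, not GreptimeDB's or AnyBlob's implementation: &lt;code&gt;fetch&lt;/code&gt; is a hypothetical stand-in for an object-store GET, and the 600 ms threshold comes from the observation above that fewer than 5% of 16 MiB downloads are still pending at that point.&lt;/p&gt;

```python
import time
import concurrent.futures

HEDGE_AFTER_S = 0.6  # paper: under 5% of 16 MiB GETs are still pending after 600 ms

def hedged_get(fetch, key):
    """Issue a GET; if it has not finished within HEDGE_AFTER_S, launch a
    duplicate request and return whichever attempt completes first.
    `fetch(key)` is a hypothetical stand-in for an object-store download."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(fetch, key)
        done, _ = concurrent.futures.wait([first], timeout=HEDGE_AFTER_S)
        if done:
            return first.result()
        # Hedge: duplicate the straggler instead of waiting for it.
        second = pool.submit(fetch, key)
        done, _ = concurrent.futures.wait(
            [first, second],
            return_when=concurrent.futures.FIRST_COMPLETED,
        )
        return done.pop().result()

# Simulated straggler: the first attempt stalls, the hedge succeeds quickly.
attempts = []
def flaky_fetch(key):
    attempts.append(key)
    if len(attempts) == 1:
        time.sleep(2.0)  # straggling first request
    return b"object bytes"

result = hedged_get(flaky_fetch, "part-0001.parquet")
```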

&lt;h3&gt;
  
  
  Cloud Storage Data Request Model
&lt;/h3&gt;

&lt;p&gt;In their study, the authors observed that the bandwidth of a single request is similar to that when accessing data on an HDD (Hard Disk Drive). To fully utilize network bandwidth, a large number of concurrent requests are necessary. For analytical workloads, requests in the 8-16 MiB range are cost-effective. They devised a model to predict the number of requests needed to achieve a given throughput target.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqsa3xmz0amj1vc3pm1vb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqsa3xmz0amj1vc3pm1vb.png" alt="Image description" width="690" height="748"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The experiment utilized computing instances with a total bandwidth of 100 Gbps. In the graph, "Model (Hot)" represents the 25th percentile (p25) latency observed in previous experiments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdp5x5pl5kxn8c2gmceca.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdp5x5pl5kxn8c2gmceca.png" alt="Image description" width="800" height="86"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The median base latency is approximately 30 ms, as determined from the 1 KiB trial in Figure 2.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The median data latency is around 20 ms/MiB, with Cloud X and Cloud Y exhibiting lower rates (12–15 ms/MiB), calculated from the 16 MiB median minus the base latency in Figure 2.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To achieve 100 Gbps on S3, 200-250 concurrent requests are necessary.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;With access latencies in tens of milliseconds and a bandwidth of about 50 MiB/s per object, it suggests that the object storage is likely HDD-based. This implies that reading at ∼80 Gbps from S3 is equivalent to accessing around 100 HDDs simultaneously.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
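&lt;p&gt;As a back-of-the-envelope check of that model (a sketch using the median figures quoted above, not the paper's code): with ~30 ms base latency and ~20 ms/MiB data latency, a 16 MiB request takes about 350 ms, i.e. roughly 46 MiB/s per request, so saturating 100 Gbps indeed takes on the order of 250 concurrent requests.&lt;/p&gt;

```python
def concurrent_requests(target_gbps, size_mib, base_latency_s, data_latency_s_per_mib):
    """Number of parallel requests needed to sustain `target_gbps`, using the
    paper's model: request duration = base latency + size * per-MiB data latency."""
    duration_s = base_latency_s + size_mib * data_latency_s_per_mib
    per_request_mib_s = size_mib / duration_s        # bandwidth of one request
    target_mib_s = target_gbps * 1e9 / 8 / 2**20     # Gbps -> MiB/s
    return target_mib_s / per_request_mib_s

# Median S3 figures quoted above (assumed inputs): 30 ms base latency,
# ~20 ms/MiB data latency, 16 MiB requests.
n = concurrent_requests(100, 16, 0.030, 0.020)       # ~260 concurrent requests
```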

&lt;h2&gt;
  
  
  AnyBlob
&lt;/h2&gt;

&lt;p&gt;AnyBlob is a universal object storage library created by the authors, designed to support access to object storage services from various cloud providers. &lt;/p&gt;

&lt;p&gt;Compared to existing C++ libraries for S3, AnyBlob utilizes the &lt;code&gt;io_uring&lt;/code&gt; system call and removes the limitation of one-to-one thread mapping. The final results indicate that AnyBlob achieves higher performance with reduced CPU usage. However, it's worth considering that the primary reason for this improvement might be the subpar quality of the existing C++ S3 libraries.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79soduuk07a89ofshnys.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79soduuk07a89ofshnys.png" alt="Image description" width="800" height="748"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain name resolution strategies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AnyBlob incorporates some noteworthy features. The authors noted that resolving domain names for each request introduces significant latency overhead. To address this, they implemented strategies including:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Caching Multiple Endpoint IPs&lt;/strong&gt;: Storing the IP addresses of multiple endpoints in a cache and scheduling requests across them. Endpoints whose performance noticeably deteriorates are replaced based on collected statistics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Based on MTU (Maximum Transmission Unit)&lt;/strong&gt;: Different S3 endpoints have different MTUs. Some support jumbo frames up to 9001 bytes, which can significantly reduce CPU overhead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;MTU Discovery Strategy&lt;/strong&gt;: This involves pinging the target endpoint's IP with a payload larger than 1500 bytes and the DF (Don't Fragment) flag set, to determine whether it supports larger MTUs.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Integration with Cloud Storage
&lt;/h2&gt;

&lt;p&gt;In this section, the authors discuss how they integrated cloud storage. Overall, these ideas are converging in practice, and the specific implementation details depend on the engineering practices of different teams.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcs0xsqckr27hc2zppa1x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcs0xsqckr27hc2zppa1x.png" alt="Image description" width="800" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adaptive Strategy&lt;/strong&gt;&lt;br&gt;
If the processing speed of requested data is slow, then reduce the number of download threads (and tasks) and increase the number of request threads (and tasks).&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance Evaluation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data Download Performance
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3cnlehhq4f8cub6e18m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3cnlehhq4f8cub6e18m.png" alt="Image description" width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Experimental Parameters: TPC-H scale factor 500 ( ~500 GiB of data).&lt;/p&gt;

&lt;p&gt;The authors categorized the queries into two types: retrieval-heavy and computation-heavy.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Retrieval-heavy examples: Queries 1, 6, and 19. These are characterized by a constant multiple difference in performance between In-Memory and Remote storage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Computation-heavy examples: Queries 9 and 18. These are marked by a very small performance difference between In-Memory and Remote storage.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Comparison of Different Storage Types
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmizteb7l8vyx0b3ehpbh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmizteb7l8vyx0b3ehpbh.png" alt="Image description" width="800" height="593"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EBS (Elastic Block Store) exhibits the poorest performance, likely due to the use of lower-tier options like gp2/gp3, which offer around 1 GiB/s of bandwidth.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scalability
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrjbbza6ojeankhzwob4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrjbbza6ojeankhzwob4.png" alt="Image description" width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Retrieval-Heavy (Q1): The bottleneck in this type of query lies in the network bandwidth. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Computation-Heavy (Q9): The performance improves with an increase in the number of cores. The throughput of the Remote (the Umbra) is nearly the same as that of the in-memory version.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  End-To-End Study with Compression &amp;amp; AES
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fdq58cw8f4rxtawt80z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fdq58cw8f4rxtawt80z.png" alt="Image description" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Experimental Parameters: Scale Factor (SF) of 100 (~ 100 GiB) and 1,000 (~ 1 TiB of data). &lt;/p&gt;

&lt;p&gt;The Snowflake used in the experiment is a large-size configuration, while Umbra utilized EC2 c5d.18xlarge instances, with caching disabled.&lt;/p&gt;

&lt;p&gt;Overall, this comparison might be insufficiently strict. For example, it lacks detailed information about the Snowflake setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;For the Large-size Snowflake, there might be issues with overselling and throttling.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Snowflake group may have purchased a standard, lower-tier version, which could also impact the results.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, this also highlights another point: benchmark marketing often relies on statistical sleight of hand, such as hiding the one query that missed the cache behind the p99. In other words, optimizing a benchmark that runs each query 10 times is not on the same scale of effort as optimizing one that runs it 100 times.&lt;/p&gt;
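&lt;p&gt;To make the p99 point concrete, here is a small self-contained sketch (the latency numbers are made up for illustration): with 100 runs of a query where exactly one misses the cache, a nearest-rank p99 reports only the fast runs.&lt;/p&gt;

```python
# Hypothetical latencies in ms: 99 cached runs plus one 500 ms cache miss.
latencies = [10.0] * 99 + [500.0]

def percentile(samples, p):
    # Nearest-rank percentile: value at ceil(p/100 * n) in sorted order.
    ordered = sorted(samples)
    rank = max(1, -(-p * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]

print(percentile(latencies, 99))  # the 500 ms outlier is invisible at p99
```

With 100 samples, the single slow run only surfaces at p100 (the maximum), so a report that stops at p99 never shows it.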

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Overall, this article provides substantial data support and insights in several areas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Characteristics of object storage&lt;/li&gt;
&lt;li&gt;Optimal file size for data requests&lt;/li&gt;
&lt;li&gt;The impact of enabling HTTPS&lt;/li&gt;
&lt;li&gt;Cloud storage data request model&lt;/li&gt;
&lt;li&gt;Scheduling queries and download tasks based on statistical information&lt;/li&gt;
&lt;li&gt;Empirical data on handling slow requests&lt;/li&gt;
&lt;li&gt;Utilization of MTU jumbo frames&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the upcoming GreptimeDB 0.7.0 release, we have implemented extensive optimizations in querying, including enhancements for queries on object storage. In some scenarios, query response times now approach those of local storage. &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;Star us on GitHub&lt;/a&gt; and stay tuned with GreptimeDB; we look forward to you trying it out and welcome any feedback and discussion.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
      <category>opensource</category>
      <category>database</category>
    </item>
    <item>
      <title>GreptimeAI + Xinference - Efficient Deployment and Monitoring of Your LLM Applications</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Wed, 24 Jan 2024 12:18:49 +0000</pubDate>
      <link>https://dev.to/greptime/greptimeai-xinference-efficient-deployment-and-monitoring-of-your-llm-applications-2mop</link>
      <guid>https://dev.to/greptime/greptimeai-xinference-efficient-deployment-and-monitoring-of-your-llm-applications-2mop</guid>
      <description>&lt;p&gt;With the rapid evolution of artificial intelligence technology, OpenAI has established itself as a frontrunner in the field. It demonstrates remarkable proficiency in a range of language processing tasks, including machine translation, text classification, and text generation. Parallel to OpenAI's ascent, many high-quality, open-source large language models such as Llama, ChatGLM, and Qwen have also gained prominence. These exceptional open-source models are invaluable assets for teams aiming to swiftly develop robust Large Language Model (LLM) applications.&lt;/p&gt;

&lt;p&gt;With a myriad of options at hand, the challenge becomes how to use OpenAI's interface uniformly across models while also reducing development costs. Additionally, efficiently and continuously monitoring the performance of LLM applications is crucial, but how can this be done without adding development complexity? &lt;a href="https://greptime.com/product/ai"&gt;GreptimeAI&lt;/a&gt; and &lt;a href="https://github.com/xorbitsai/inference"&gt;Xinference&lt;/a&gt; offer pragmatic solutions to these pivotal concerns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is GreptimeAI?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.greptime.com/product/ai"&gt;GreptimeAI&lt;/a&gt;, built upon the open-source time-series database &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GreptimeDB&lt;/a&gt;, offers an observability solution for Large Language Model (LLM) applications, currently supporting both &lt;a href="https://www.langchain.com/"&gt;LangChain&lt;/a&gt; and &lt;a href="https://openai.com/"&gt;OpenAI's&lt;/a&gt; ecosystem. GreptimeAI enables you to understand cost, performance, traffic and security aspects in real-time, helping teams enhance the reliability of LLM applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Xinference?
&lt;/h2&gt;

&lt;p&gt;Xorbits Inference (&lt;a href="https://github.com/xorbitsai/inference"&gt;Xinference&lt;/a&gt;) is an open-source platform to streamline the operation and integration of a wide array of AI models. With Xinference, you’re empowered to run inference using any open-source LLMs, embedding models, and multimodal models either in the cloud or on your own premises, and create robust AI-driven applications. It provides a RESTful API compatible with OpenAI API, Python SDK, CLI, and WebUI. Furthermore, it integrates third-party developer tools like LangChain, &lt;a href="https://www.llamaindex.ai/"&gt;LlamaIndex&lt;/a&gt;, and &lt;a href="https://dify.ai/"&gt;Dify&lt;/a&gt;, facilitating model integration and development. &lt;/p&gt;

&lt;p&gt;Xinference supports multiple inference engines such as Transformers, vLLM, and GGML and is suitable for various hardware environments. It also supports multiple-nodes deployment, efficiently allocating model inference tasks across multiple devices or machines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Utilize GreptimeAI + Xinference to Deploy and Monitor an LLM App
&lt;/h2&gt;

&lt;p&gt;Next, we will take the Llama 2 model as an example to demonstrate how to install and run the model locally using Xinference. This example will feature the use of an OpenAI-style function call to conduct a weather query. Additionally, we will demonstrate how GreptimeAI can be effectively utilized to monitor the usage and performance of the LLM application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Register and Get GreptimeAI Configuration Info
&lt;/h3&gt;

&lt;p&gt;Visit &lt;a href="https://console.greptime.cloud"&gt;https://console.greptime.cloud&lt;/a&gt; to register and create an AI service, then go to the Dashboard and open the Setup page to get the configuration information for OpenAI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9jbvdpxchkda460hgjq1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9jbvdpxchkda460hgjq1.png" alt="Image description" width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Start the Xinference Model Service
&lt;/h3&gt;

&lt;p&gt;Initiating the Xinference model service locally is pretty straightforward. Simply enter the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;xinference&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;local&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;H&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By default, Xinference starts the service on your local machine, typically on port 9997. Installing Xinference locally is not covered here; refer to &lt;a href="https://inference.readthedocs.io/en/latest/getting_started/installation.html"&gt;this article&lt;/a&gt; for installation instructions.&lt;/p&gt;

&lt;h4&gt;
  
  
  Launch the Model via Web UI
&lt;/h4&gt;

&lt;p&gt;After starting Xinference, you can access its Web UI by entering &lt;a href="http://localhost:9997"&gt;http://localhost:9997&lt;/a&gt; in your browser. This provides a user-friendly interface.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgdbmvbr0zc0oyl0uvnp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgdbmvbr0zc0oyl0uvnp.png" alt="Image description" width="800" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Launch the Model via Command Line Tool
&lt;/h4&gt;

&lt;p&gt;Alternatively, the model can be launched using Xinference's command-line tool. The default Model UID is set to &lt;code&gt;llama-2-chat&lt;/code&gt;, which will be used subsequently for accessing the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;xinference&lt;/span&gt; &lt;span class="n"&gt;launch&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="n"&gt;llama&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="n"&gt;pytorch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Obtain Weather Information through an OpenAI-Styled Interface
&lt;/h3&gt;

&lt;p&gt;Suppose we have the capability to fetch weather information for a specific city using the &lt;code&gt;get_current_weather&lt;/code&gt; function, with parameters &lt;code&gt;location&lt;/code&gt; and &lt;code&gt;format&lt;/code&gt;.&lt;/p&gt;
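&lt;p&gt;The article does not show the body of &lt;code&gt;get_current_weather&lt;/code&gt;; a minimal stand-in for local testing might look like this (the hard-coded readings are purely hypothetical, a real implementation would call a weather API). Values are returned as strings to match the tool-result payload used later in this article:&lt;/p&gt;

```python
# Hypothetical stand-in for get_current_weather; real code would query a weather API.
def get_current_weather(location: str, format: str = "celsius") -> dict:
    fake_readings_c = {"New York": 10, "London": 8}  # made-up data, in Celsius
    temp_c = fake_readings_c.get(location, 20)       # default for unknown cities
    if format == "fahrenheit":
        return {"temperature": str(temp_c * 9 // 5 + 32), "temperature_unit": "fahrenheit"}
    return {"temperature": str(temp_c), "temperature_unit": "celsius"}
```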

&lt;h4&gt;
  
  
  Configure and Call the OpenAI Interface
&lt;/h4&gt;

&lt;p&gt;Access the local Xinference service using OpenAI's Python SDK, with GreptimeAI collecting metrics and traces. Create dialogues using the &lt;code&gt;chat.completions&lt;/code&gt; module, and pass the list of functions defined below via &lt;code&gt;tools&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;greptimeai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai_patcher&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://127.0.0.1:9997/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;openai_patcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant. Do not make assumptions about the values in the function calls.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s the weather in New York now&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;chat_completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama-2-chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;func_name: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chat_completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;func_args: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chat_completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Details of the &lt;code&gt;tools&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;The definition of the function calling tool list is as follows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_current_weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;obtain current weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parameters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;city, such as New York&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fahrenheit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;the temperature unit used, determined by the specific city&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output is as follows, showing the result generated by the Llama 2 model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;func_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_current_weather&lt;/span&gt;
&lt;span class="n"&gt;func_args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;New York&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
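&lt;p&gt;Note that &lt;code&gt;function.arguments&lt;/code&gt; comes back as a JSON-encoded string, so it needs to be parsed before the local function can be invoked. A minimal sketch, using the sample output above as input:&lt;/p&gt;

```python
import json

# Tool-call arguments arrive as a JSON string; parse them before dispatching.
raw_args = '{"location": "New York", "format": "celsius"}'  # sample output from above
args = json.loads(raw_args)
print(args["location"], args["format"])
```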



&lt;h4&gt;
  
  
  Retrieve the Function Call Results and Make Subsequent Calls
&lt;/h4&gt;

&lt;p&gt;Let's assume that we have invoked the &lt;code&gt;get_current_weather&lt;/code&gt; function with specified parameters and obtained the results. These results, along with the context, will then be resent to the Llama 2 model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_calls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature_unit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;chat_completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama-2-chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Final Result
&lt;/h4&gt;

&lt;p&gt;The Llama 2 model ultimately generates the following response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The current temperature in New York is 10 degrees Celsius.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  GreptimeAI Dashboard
&lt;/h2&gt;

&lt;p&gt;On the GreptimeAI Dashboard, you can comprehensively monitor the behavior of LLM applications built on the OpenAI interface in real time, including key metrics such as token usage, cost, latency, and traces. &lt;br&gt;
Below is a screenshot of the Dashboard's overview page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fszj45pi7lq821wrglkop.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fszj45pi7lq821wrglkop.jpeg" alt="Image description" width="800" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;If you're developing LLM applications with open-source large language models and want to call them through an OpenAI-style API, then pairing Xinference for managing your inference models with GreptimeAI for performance monitoring is a good choice. Whether it's for complex data analysis or simple routine queries, Xinference offers robust and flexible model management capabilities, while GreptimeAI's monitoring features help you understand and optimize your model's performance and resource usage.&lt;/p&gt;

&lt;p&gt;We look forward to seeing what you achieve with these tools and are eager to hear about your insights and experiences using GreptimeAI and Xinference. If you run into any issues or have feedback, please don't hesitate to reach out to us at &lt;a href="mailto:info@greptime.com"&gt;info@greptime.com&lt;/a&gt; or via &lt;a href="https://www.greptime.com/slack"&gt;Slack&lt;/a&gt;. Together, let's delve into the vast and exciting realm of artificial intelligence!&lt;/p&gt;

</description>
      <category>monitoring</category>
      <category>openai</category>
      <category>llm</category>
      <category>greptimeai</category>
    </item>
    <item>
      <title>Memory Leak Diagnosing using Flame Graphs</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Fri, 19 Jan 2024 08:49:46 +0000</pubDate>
      <link>https://dev.to/greptime/memory-leak-diagnosing-using-flame-graphs-4bdk</link>
      <guid>https://dev.to/greptime/memory-leak-diagnosing-using-flame-graphs-4bdk</guid>
      <description>&lt;p&gt;Starting with &lt;a href="https://github.com/GreptimeTeam/greptimedb/pull/1733"&gt;greptimedb#1733&lt;/a&gt; in last June, GreptimeDB has adopted &lt;a href="https://github.com/jemalloc/jemalloc"&gt;Jemalloc&lt;/a&gt; as its default memory allocator. This change not only boosts performance and reduces memory fragmentation but also offers convenient memory analysis capabilities.&lt;/p&gt;

&lt;p&gt;In our previous article, &lt;a href="https://greptime.com/blogs/2023-06-15-rust-memory-leaks"&gt;Unraveling Rust Memory Leaks: Easy-to-Follow Techniques for Identifying and Solving Memory Issues&lt;/a&gt;, we explored several common methods for analyzing memory leaks in Rust applications. &lt;/p&gt;

&lt;p&gt;Here in this article, I will delve into detailed techniques for troubleshooting based on Jemalloc. If you encounter any unusual memory usage issues while using or developing &lt;a href="https://github.com/GreptimeTeam/greptimedb"&gt;GreptimeDB&lt;/a&gt;, refer to this article for quick diagnostics and identification of potential memory leaks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Install tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Install the &lt;code&gt;flamegraph.pl&lt;/code&gt; script
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; https://raw.githubusercontent.com/brendangregg/FlameGraph/master/flamegraph.pl &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/.local/bin/flamegraph.pl
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/.local/bin/flamegraph.pl
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOME&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;/.local/bin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;flamegraph.pl&lt;/code&gt;, authored by Brendan Gregg, is a Perl script designed for visualizing hot spots in code call stacks. &lt;a href="https://www.brendangregg.com"&gt;Brendan Gregg&lt;/a&gt; is an expert in system performance optimization. We are grateful to him for developing and open-sourcing numerous tools, including &lt;code&gt;flamegraph.pl&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;Install the &lt;code&gt;jeprof&lt;/code&gt; command
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# For Ubuntu&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; libjemalloc-dev

&lt;span class="c"&gt;# For Fedora&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf &lt;span class="nb"&gt;install &lt;/span&gt;jemalloc-devel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;For other operating systems, you can find the dependency packages for &lt;code&gt;jeprof&lt;/code&gt; through &lt;a href="https://pkgs.org"&gt;pkgs.org&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Enabling Heap Profiling in GreptimeDB
&lt;/h3&gt;

&lt;p&gt;The heap profiling feature in GreptimeDB is disabled by default. You can enable it by compiling GreptimeDB with the &lt;code&gt;mem-prof&lt;/code&gt; feature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo build &lt;span class="nt"&gt;--release&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt; mem-prof
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The discussion about whether the &lt;code&gt;mem-prof&lt;/code&gt; feature should be enabled by default is ongoing in &lt;a href="https://github.com/GreptimeTeam/greptimedb/issues/3166"&gt;greptimedb#3166&lt;/a&gt;. You are welcome to share your opinion there.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Starting GreptimeDB with &lt;code&gt;mem-prof&lt;/code&gt; Feature
&lt;/h3&gt;

&lt;p&gt;To enable the heap profiling feature, you need to set the &lt;code&gt;MALLOC_CONF&lt;/code&gt; environment variable when starting the GreptimeDB process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;MALLOC_CONF&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;prof:true &amp;lt;path_to_greptime_binary&amp;gt; standalone start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can use the &lt;code&gt;curl&lt;/code&gt; command to check whether heap profiling is enabled.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &amp;lt;greptimedb_ip&amp;gt;:4000/v1/prof/mem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the heap profiling feature is turned on, executing the &lt;code&gt;curl&lt;/code&gt; command should yield a response similar to the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;heap_v2/524288
  t*: 125: 136218 [0: 0]
  t0: 59: 31005 [0: 0]
...

MAPPED_LIBRARIES:
55aa05c66000-55aa0697a000 r--p 00000000 103:02 40748099                  /home/lei/workspace/greptimedb/target/debug/greptime
55aa0697a000-55aa11e74000 r-xp 00d14000 103:02 40748099                  /home/lei/workspace/greptimedb/target/debug/greptime
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you receive the response &lt;code&gt;{"error":"Memory profiling is not enabled"}&lt;/code&gt;, it indicates that the &lt;code&gt;MALLOC_CONF=prof:true&lt;/code&gt; environment variable has not been set correctly. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For information on the data format returned by the heap profiling API, refer to the &lt;a href="https://jemalloc.net/jemalloc.3.html#heap_profile_format"&gt;HEAP PROFILE FORMAT - jemalloc.net&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
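&lt;p&gt;To get a quick feel for the numbers in this response, the summary lines can be parsed mechanically. Below is a rough Python sketch (my own illustration, not part of GreptimeDB) that extracts the in-use object and byte counts from the per-thread lines; the field meanings follow the jemalloc heap profile format linked above.&lt;/p&gt;

```python
import re

# Matches summary lines like "t*: 125: 136218 [0: 0]", i.e.
# "<thread>: <in-use objects>: <in-use bytes> [<cumulative objects>: <cumulative bytes>]"
LINE = re.compile(r"^\s*(t\S+):\s*(\d+):\s*(\d+)\s*\[(\d+):\s*(\d+)\]$")

def parse_heap_summary(text: str) -> dict:
    """Return {thread: (in_use_objects, in_use_bytes)} from a heap_v2 dump."""
    result = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            thread, objs, nbytes = m.group(1), int(m.group(2)), int(m.group(3))
            result[thread] = (objs, nbytes)
    return result

# The sample response shown above, truncated to the summary lines.
sample = """heap_v2/524288
  t*: 125: 136218 [0: 0]
  t0: 59: 31005 [0: 0]
"""
print(parse_heap_summary(sample)["t*"])  # (125, 136218)
```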

&lt;h2&gt;
  
  
  Begin your Memory Exploration Journey
&lt;/h2&gt;

&lt;p&gt;By using the command &lt;code&gt;curl &amp;lt;greptimedb_ip&amp;gt;:4000/v1/prof/mem&lt;/code&gt;, you can quickly obtain details of the memory allocated by GreptimeDB. The tools &lt;code&gt;jeprof&lt;/code&gt; and &lt;code&gt;flamegraph.pl&lt;/code&gt; can then render these details as a flame graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# To get memory allocation details&lt;/span&gt;
curl &amp;lt;greptimedb_ip&amp;gt;:4000/v1/prof/mem &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mem.hprof

&lt;span class="c"&gt;# To generate a flame graph of memory allocation&lt;/span&gt;
jeprof &amp;lt;path_to_greptime_binary&amp;gt; ./mem.hprof &lt;span class="nt"&gt;--collapse&lt;/span&gt; | flamegraph.pl &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mem-prof.svg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After executing the above commands, a flame graph named &lt;code&gt;mem-prof.svg&lt;/code&gt; will be generated in the working directory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qjXTTrUo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zsq8t9avqq90vuhhybhn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qjXTTrUo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zsq8t9avqq90vuhhybhn.png" alt="Image description" width="800" height="1614"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How to Interpret the Flame Graph
&lt;/h3&gt;

&lt;p&gt;Created by Brendan Gregg, the flame graph is a powerful tool for analyzing CPU overhead and memory allocation details. It is generated by recording the function call stack that triggers each memory allocation event at every sampling point. &lt;/p&gt;

&lt;p&gt;After enough samples have been recorded, the call stacks of the individual allocations are merged, revealing the amount of memory allocated by each function call and its child calls.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The bottom of the flame graph represents the base of the function stack, while the top represents the stack top.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Each cell in the flame graph represents a function call; the cells below it are its callers, and the cells above it are its callees, the functions it calls.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The width of a cell indicates the total amount of memory allocated by that function and its child functions; wider cells allocate more memory. If a function allocates a lot of memory but has few child functions (appearing as a wide stack top, known as a plateau), the function itself likely performs a large number of allocation operations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The color of each cell in the flame graph is a random warm color.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Opening the flame graph's SVG file in a browser allows for interactive clicking into each function for more detailed analysis.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Ricctcxr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ue5ihv53b1xf3f9ib954.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Ricctcxr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ue5ihv53b1xf3f9ib954.png" alt="Image description" width="800" height="636"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Accelerating Flame Graph Generation
&lt;/h3&gt;

&lt;p&gt;The heap memory details returned by Jemalloc include the addresses of each function in the call stack. Generating the flame graph requires translating these addresses into file names and line numbers, which is the most time-consuming step. Typically on Linux systems, this task is accomplished by the &lt;code&gt;addr2line&lt;/code&gt; tool from &lt;a href="https://www.gnu.org/software/binutils/"&gt;GNU Binutils&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;To speed up the generation of the flame graph, we can replace the Binutils &lt;code&gt;addr2line&lt;/code&gt; tool with &lt;a href="https://github.com/gimli-rs/addr2line"&gt;&lt;code&gt;gimli-rs/addr2line&lt;/code&gt;&lt;/a&gt;, thereby achieving at least a &lt;strong&gt;2x increase&lt;/strong&gt; in speed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/gimli-rs/addr2line
&lt;span class="nb"&gt;cd &lt;/span&gt;addr2line
cargo build &lt;span class="nt"&gt;--release&lt;/span&gt;
&lt;span class="nb"&gt;sudo cp&lt;/span&gt; /usr/bin/addr2line /usr/bin/addr2line-bak
&lt;span class="nb"&gt;sudo cp &lt;/span&gt;target/release/examples/addr2line /usr/bin/addr2line 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Catching Memory Leaks through Allocation Differences
&lt;/h3&gt;

&lt;p&gt;In most memory leak cases, memory usage tends to increase slowly. Capturing memory usage at two points during this growth and analyzing the difference between them therefore often points to the source of the leak.&lt;/p&gt;

&lt;p&gt;We can collect the memory data at the initial time point to establish a baseline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &amp;lt;greptimedb_ip&amp;gt;:4000/v1/prof/mem &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; base.hprof
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once memory usage has grown in a way that suggests a possible leak, we collect the memory data again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &amp;lt;greptimedb_ip&amp;gt;:4000/v1/prof/mem &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; leak.hprof
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, using &lt;code&gt;base.hprof&lt;/code&gt; as the baseline, analyze the memory usage and generate a flame graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;jeprof &amp;lt;path_to_greptime_binary&amp;gt; &lt;span class="nt"&gt;--base&lt;/span&gt; ./base.hprof ./leak.hprof &lt;span class="nt"&gt;--collapse&lt;/span&gt; | flamegraph.pl &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; leak.svg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the flame graph generated with the &lt;code&gt;--base&lt;/code&gt; parameter specifying the baseline, only the memory allocation differences between the current memory collection and the baseline will be included. This allows for a clearer understanding of which function calls are responsible for the increase in memory usage.&lt;/p&gt;
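&lt;p&gt;Conceptually, the &lt;code&gt;--base&lt;/code&gt; option subtracts the baseline's per-stack counts from the later capture, leaving only the growth. A small Python sketch of that subtraction over collapsed-stack lines (hypothetical stacks, not real jeprof output):&lt;/p&gt;

```python
def parse(lines):
    """Collapsed-stack lines ("frames... value") -> {stack: total value}."""
    out = {}
    for line in lines:
        stack, value = line.rsplit(" ", 1)
        out[stack] = out.get(stack, 0) + int(value)
    return out

def diff(base_lines, leak_lines):
    """Keep only stacks whose allocation grew relative to the baseline."""
    base, leak = parse(base_lines), parse(leak_lines)
    grown = {}
    for stack, value in leak.items():
        delta = value - base.get(stack, 0)
        if delta > 0:
            grown[stack] = delta
    return grown

base = ["main;cache;insert 4096", "main;io;read 1024"]
leak = ["main;cache;insert 262144", "main;io;read 1024"]
print(diff(base, leak))  # {'main;cache;insert': 258048}
```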

&lt;h2&gt;
  
  
  Reference
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://rust-lang.github.io/rfcs/1974-global-allocators.html#jemalloc"&gt;https://rust-lang.github.io/rfcs/1974-global-allocators.html#jemalloc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html"&gt;https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/gimli-rs/addr2line"&gt;https://github.com/gimli-rs/addr2line&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/GreptimeTeam/greptimedb/pull/1733"&gt;https://github.com/GreptimeTeam/greptimedb/pull/1733&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/GreptimeTeam/greptimedb/pull/1124"&gt;https://github.com/GreptimeTeam/greptimedb/pull/1124&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>database</category>
      <category>memory</category>
      <category>rust</category>
      <category>greptime</category>
    </item>
    <item>
      <title>GreptimeDB v0.6 Released - Support Migrating Table's Regions between Datanodes</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Wed, 17 Jan 2024 08:30:54 +0000</pubDate>
      <link>https://dev.to/greptime/greptimedb-v06-released-support-migrating-tables-regions-between-datanodes-309k</link>
      <guid>https://dev.to/greptime/greptimedb-v06-released-support-migrating-tables-regions-between-datanodes-309k</guid>
      <description>&lt;p&gt;As 2024 dawned, the Greptime team, invigorated by the New Year's fresh momentum, continued their efforts towards innovative version iterations. Just three weeks following our &lt;a href="https://www.greptime.com/blogs/2023-12-29-greptimedbv0.5"&gt;previous update&lt;/a&gt;, we are thrilled to announce a new version to our open-source time-series database: GreptimeDB v0.6.&lt;/p&gt;

&lt;p&gt;This update marks a substantial leap from GreptimeDB v0.5, incorporating several major improvements.&lt;/p&gt;

&lt;h2&gt;
  
  
  GreptimeDB v0.6 Updates
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Region Migration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In version 0.5, we enabled support for Kafka WAL, making it possible to synchronize and migrate Region data across multiple Datanodes. In version 0.6, we initially implemented the Region Migration feature, &lt;strong&gt;providing users with the ability to migrate table Regions between Datanodes while ensuring data integrity. This lays a solid foundation for dynamically adjusting cluster load.&lt;/strong&gt; For example, as query performance requirements increase, users can easily migrate table Regions to Datanodes with lower loads or larger specifications through Region Migration, achieving better query performance. &lt;/p&gt;

&lt;p&gt;In the future, we plan to introduce dynamic Region distribution. This feature is designed to intelligently redistribute data Regions, leveraging real-time monitoring of workload conditions and business requirements while ensuring uninterrupted service. This strategic enhancement aims to optimize resource utilization. By doing so, it not only promotes more efficient and smarter data management but also ensures robust and adaptive support for the ever-evolving demands of the business environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional Updates
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Added a configuration item that allows specifying the default time zone for queries&lt;/li&gt;
&lt;li&gt;Added the &lt;code&gt;--store-key-prefix&lt;/code&gt; configuration option, which lets administrators specify the key prefix used by metasrv to avoid key name conflicts&lt;/li&gt;
&lt;li&gt;Implemented the &lt;code&gt;OR&lt;/code&gt; logical operator in PromQL

&lt;ul&gt;
&lt;li&gt;Added a special &lt;code&gt;UNION&lt;/code&gt; operator (&lt;code&gt;OR&lt;/code&gt; in PromQL) for certain PromQL query scenarios. This operator takes two input nodes. All columns from the left child node are output, and the columns specified in &lt;code&gt;compare_keys&lt;/code&gt; are used to check for duplicates. When duplicates occur, if both rows come from the right node, only the first row is retained; if one comes from the left node, the row from the right node is discarded. The output includes columns from both the left and right nodes, and the order of rows is not fixed.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
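&lt;p&gt;The deduplication rule of this &lt;code&gt;UNION&lt;/code&gt; operator can be illustrated with a short sketch. The Python below is only an illustration of the semantics described above, not GreptimeDB's actual implementation; rows are dicts, &lt;code&gt;compare_keys&lt;/code&gt; names the columns checked for duplicates, and (unlike the real operator) the sketch emits rows in a fixed order:&lt;/p&gt;

```python
def promql_or(left_rows, right_rows, compare_keys):
    """All left rows pass through; a right row is dropped when a row with
    the same compare_keys values was already emitted (from either side)."""
    seen = set()
    out = []
    for row in left_rows:
        out.append(row)
        seen.add(tuple(row[k] for k in compare_keys))
    for row in right_rows:
        key = tuple(row[k] for k in compare_keys)
        if key not in seen:  # first occurrence wins
            out.append(row)
            seen.add(key)
    return out

left = [{"host": "a", "val": 1}]
right = [{"host": "a", "val": 9}, {"host": "b", "val": 2}]
print(promql_or(left, right, ["host"]))
# [{'host': 'a', 'val': 1}, {'host': 'b', 'val': 2}]
```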

&lt;h2&gt;
  
  
  Future Plans
&lt;/h2&gt;

&lt;p&gt;Our next milestone, v0.7, promises to be even more exciting. &lt;/p&gt;

&lt;p&gt;We plan to introduce a brand new indexing module, with its first implementation being an inverted index. This module aims to significantly boost performance when filtering and querying a small subset of time-series from vast datasets, a key focus for our Metric Engine in observable scenarios. Our team is currently rigorously testing the integration of these features to ensure optimal performance and stability. Stay tuned for the much-anticipated release of GreptimeDB v0.7!&lt;/p&gt;

</description>
      <category>database</category>
      <category>version</category>
      <category>region</category>
    </item>
    <item>
      <title>Streamline your OpenAI Monitoring Experience with GreptimeAI</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Fri, 12 Jan 2024 07:12:05 +0000</pubDate>
      <link>https://dev.to/greptime/streamline-your-openai-monitoring-experience-with-greptimeai-137m</link>
      <guid>https://dev.to/greptime/streamline-your-openai-monitoring-experience-with-greptimeai-137m</guid>
      <description>&lt;p&gt;With the rapid advancement of artificial intelligence technology, &lt;a href="https://openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt; has emerged as one of the leaders in the field. It excels in various language processing tasks, including machine translation, text classification, and text generation.&lt;/p&gt;

&lt;p&gt;However, the critical role of continuous monitoring of API calls while using OpenAI should not be underestimated. This practice is crucial not only for identifying performance bottlenecks and analyzing usage patterns but also for swiftly detecting and addressing any issues that arise with the API.&lt;/p&gt;

&lt;h2&gt;
  
  
  GreptimeAI
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://greptime.com/product/ai" rel="noopener noreferrer"&gt;GreptimeAI&lt;/a&gt; offers a tailor-made observability solution specifically designed for monitoring and managing large language model (LLM) applications. This solution provides comprehensive insights into the cost, performance, traffic, and security aspects of OpenAI usage. For more details about GreptimeAI, please refer to this &lt;a href="https://greptime.com/blogs/2023-11-09-greptimeai" rel="noopener noreferrer"&gt;article&lt;/a&gt;. Notably, &lt;a href="https://greptime.com/product/ai" rel="noopener noreferrer"&gt;GreptimeAI&lt;/a&gt; is built upon the open-source time-series database, &lt;a href="https://github.com/GrepTimeTeam/greptimedb/" rel="noopener noreferrer"&gt;GreptimeDB&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI Modules being Monitored
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;chat&lt;/li&gt;
&lt;li&gt;completion&lt;/li&gt;
&lt;li&gt;audio&lt;/li&gt;
&lt;li&gt;images&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Scenarios Supported
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;async&lt;/li&gt;
&lt;li&gt;stream&lt;/li&gt;
&lt;li&gt;with_raw_response&lt;/li&gt;
&lt;li&gt;retry&lt;/li&gt;
&lt;li&gt;error&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  User Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;upgrade&lt;/span&gt; &lt;span class="n"&gt;greptimeai&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Registration
&lt;/h3&gt;

&lt;p&gt;To get started, sign up for GreptimeAI and create a service, which provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;host&lt;/li&gt;
&lt;li&gt;database&lt;/li&gt;
&lt;li&gt;token&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Setting up
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GREPTIMEAI_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'xxx'&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GREPTIMEAI_DATABASE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'xxx'&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GREPTIMEAI_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'xxx'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;

&lt;p&gt;Here is a simple example illustrating how to call the OpenAI chat completions API with GreptimeAI tracking enabled.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;greptimeai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai_patcher&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;openai_patcher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How do I output all files in a directory using Python?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;user_id_from_your_application&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
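&lt;p&gt;Because the example sets &lt;code&gt;stream=True&lt;/code&gt;, &lt;code&gt;completion&lt;/code&gt; is an iterator of chunks rather than a single response, and your code still has to consume it. The Python sketch below shows the usual accumulation loop; it uses stand-in chunk objects shaped like the OpenAI v1 streaming response so that it runs without an API key (in real code, the chunks come from the &lt;code&gt;create&lt;/code&gt; call above):&lt;/p&gt;

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Concatenate the incremental text deltas of a chat-completion stream."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # the final chunk's delta is typically empty
            parts.append(delta)
    return "".join(parts)

# Stand-in chunks shaped like the OpenAI v1 streaming response.
fake = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Use ", "os.listdir", "."]
] + [SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=None))])]

print(collect_stream(fake))  # Use os.listdir.
```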



&lt;h2&gt;
  
  
  What It Looks Like in GreptimeAI
&lt;/h2&gt;

&lt;p&gt;Dashboard overview:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo81h12smhhkr58ee45wn.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo81h12smhhkr58ee45wn.jpeg" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The following graph shows the trace detail with multiple spans.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mwhucq5t4gjt0tli6p8.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mwhucq5t4gjt0tli6p8.jpeg" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aiops</category>
      <category>monitoring</category>
      <category>database</category>
    </item>
    <item>
      <title>Practical Tips for Choosing the Right AWS EC2 for your Workload</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Thu, 11 Jan 2024 16:18:47 +0000</pubDate>
      <link>https://dev.to/greptime/practical-tips-for-choosing-the-right-aws-ec2-for-your-workload-2dk1</link>
      <guid>https://dev.to/greptime/practical-tips-for-choosing-the-right-aws-ec2-for-your-workload-2dk1</guid>
      <description>&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;AWS EC2, the Elastic Compute Cloud service from Amazon Web Services, offers developers user-friendly and flexible virtual machines. As one of the most established services within AWS, alongside S3, EC2 has a rich history dating back to its inception in 2006. Over nearly 17 years, it has continuously evolved, underlining its significance and reliability in the cloud computing space.&lt;/p&gt;

&lt;p&gt;Many people new to AWS EC2 might have similar feelings:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;There are too many types of AWS EC2 (hundreds)! Which one should I choose to meet my business needs without exceeding the budget?&lt;/li&gt;
&lt;li&gt;If the CPU and Memory configurations of EC2 are the same, does it mean their performance differences are also the same?&lt;/li&gt;
&lt;li&gt;What is the most cost-effective EC2 payment mode?&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Looking back at the initial launch of EC2, there were only &lt;a href="https://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud#:~:text=Amazon%20EC2%20was%20developed%20mostly,along%20with%20Willem%20van%20Biljon." rel="noopener noreferrer"&gt;two types of instance&lt;/a&gt; available. Fast forward to today, and the landscape has expanded to an impressive &lt;a href="https://github.com/aws/aws-sdk-go-v2/blob/main/service/ec2/types/enums.go#L3707-L4487" rel="noopener noreferrer"&gt;781 different types&lt;/a&gt;. This vast selection presents developers with a wide array of choices, potentially leading to a challenging decision-making process.&lt;/p&gt;

&lt;p&gt;This article will briefly introduce some tips for selecting EC2 instances to help readers choose the right EC2 type more smoothly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Classification and Selection
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Meet the EC2 family
&lt;/h3&gt;

&lt;p&gt;Although AWS has hundreds of EC2 types, there are only a few &lt;a href="https://aws.amazon.com/ec2/instance-types/?nc1=h_ls" rel="noopener noreferrer"&gt;major categories&lt;/a&gt;, as listed below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favaxrovpkwbdr77lgip6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favaxrovpkwbdr77lgip6.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;General Purpose, M and T series&lt;/strong&gt;: Provide a balance of CPU, memory, and network resources, sufficient for most scenarios;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Compute Optimized, C series&lt;/strong&gt;: Suitable for compute-intensive services;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Memory Optimized, R and X series&lt;/strong&gt;: Designed to provide high performance for workloads processing large data sets;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Accelerated Computing&lt;/strong&gt;: Use hardware accelerators or coprocessors to execute functions such as floating-point calculations, graphics processing, or data pattern matching more efficiently than software running on general-purpose CPUs;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Storage Optimized&lt;/strong&gt;: Designed for workloads that require high-speed, continuous read and write access to very large data sets on local storage;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;HPC Optimized&lt;/strong&gt;, HPC series: A new category by AWS mainly suitable for applications that require high-performance processing, such as large complex simulations and deep learning workloads;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typically, &lt;strong&gt;each specific EC2 type belongs to a Family with a corresponding numerical sequence&lt;/strong&gt;. For example, for the General Purpose type M series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;M7g / M7i / M7i-flex / M7a&lt;/li&gt;
&lt;li&gt;M6g / M6i / M6in / M6a&lt;/li&gt;
&lt;li&gt;M5 / M5n / M5zn / M5a&lt;/li&gt;
&lt;li&gt;M4&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The numerical sequence reveals that M7 represents the latest generation, whereas M4 is comparatively older. Generally, a higher number indicates a more recent model and CPU type, and often, the pricing is more favorable due to the natural depreciation of hardware.&lt;/p&gt;
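&lt;p&gt;This naming convention is regular enough to parse mechanically, which helps when scanning long instance lists. Below is a rough Python helper of my own (not an AWS tool); the suffix meanings noted in the comments, such as &lt;code&gt;d&lt;/code&gt; for local storage, follow the conventions discussed in this article:&lt;/p&gt;

```python
import re

# "<family letters><generation digits><suffixes>.<size>", e.g. "m7gd.large"
NAME = re.compile(r"^([a-z]+?)(\d+)([a-z-]*)\.(\w+)$")

def parse_instance_type(name: str):
    """Split an EC2 instance type like "m7gd.large" into its parts."""
    m = NAME.match(name)
    if not m:
        raise ValueError(f"unrecognized instance type: {name}")
    family, generation, suffixes, size = m.groups()
    return {
        "family": family,               # e.g. "m" = general purpose
        "generation": int(generation),  # higher usually means newer
        "suffixes": suffixes,           # e.g. "g" (Graviton), "d" (local storage)
        "size": size,
    }

print(parse_instance_type("m7gd.large"))
# {'family': 'm', 'generation': 7, 'suffixes': 'gd', 'size': 'large'}
```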

&lt;h3&gt;
  
  
  Key Parameters
&lt;/h3&gt;

&lt;p&gt;We can extract the following key parameters from the AWS EC2 model introduction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfxp7d1twn7i7dyaxnqu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfxp7d1twn7i7dyaxnqu.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Specific Model of EC2&lt;/strong&gt;: Generally named as &lt;code&gt;&amp;lt;family&amp;gt;.&amp;lt;size&amp;gt;&lt;/code&gt;, like &lt;code&gt;m7g.large&lt;/code&gt; / &lt;code&gt;m7g.xlarge&lt;/code&gt;. For EC2, a certain model is unique globally;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CPU and Memory Size&lt;/strong&gt;: The number of vCPUs and the size of memory. Most EC2 models have a 1:4 vCPU-to-memory ratio: 1 vCPU usually pairs with 4 GiB of memory, 2 vCPUs with 8 GiB, and so on.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instance Storage&lt;/strong&gt;: EC2 can generally mount different types of persistent storage disks, mainly:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;EBS&lt;/strong&gt;: Mounted AWS distributed block storage service, which is usually the default choice for most EC2 models. &lt;strong&gt;Some models only have the option to use EBS&lt;/strong&gt;, which is bound to a specific AZ. Although its read/write latency is higher than local SSD, it's acceptable in most scenarios. EBS also has &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html" rel="noopener noreferrer"&gt;different types&lt;/a&gt; based on parameters like &lt;strong&gt;IOPS and throughput&lt;/strong&gt;, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gp2/gp3&lt;/strong&gt;: Underlying general-purpose SSD, with gp3 being officially recommended for better cost-performance. Typically, the default setting is 3000 IOPS, but it also offers the flexibility to increase IOPS on demand, without any downtime—though this does come with additional costs;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;io1/io2&lt;/strong&gt;: Stronger performance and higher price, also supporting features like &lt;strong&gt;Multi Attach&lt;/strong&gt; (usually, other types of EBS can only be mounted on one EC2);&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Local Storage&lt;/strong&gt;: Some models support local storage in addition to mounting EBS, though at a higher price. Generally, these models have a &lt;code&gt;d&lt;/code&gt; in their model name. For example, &lt;code&gt;m7g.large&lt;/code&gt; is an EBS-only model, while &lt;code&gt;m7gd.large&lt;/code&gt; comes with one 118 GiB NVMe SSD of local storage. Some special models also support larger local HDDs;&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;ol start="4"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;EBS Bandwidth&lt;/strong&gt;: For some newer and specifically EBS-optimized EC2 models, AWS equips them with dedicated EBS bandwidth. This means that in high data throughput scenarios, EBS-optimized models can always enjoy better throughput without competing for network bandwidth on the local machine;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Network Bandwidth&lt;/strong&gt;: The network bandwidth corresponding to the EC2 model;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CPU Model&lt;/strong&gt;: In most scenarios, we can see CPUs from the following manufacturers:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;AWS's self-developed Graviton processor based on the ARM architecture (currently up to Graviton 3), such as the M7g series;&lt;/li&gt;
&lt;li&gt;Intel x86-64 architecture CPU;&lt;/li&gt;
&lt;li&gt;AMD x86-64 architecture CPU;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generally, for similar configurations, &lt;strong&gt;the pricing trend is Intel being the most expensive, followed by AMD, and then Graviton, with cost-effectiveness ranking in the reverse order&lt;/strong&gt;. For performance-insensitive workloads, users can consider ARM architecture models, which offer greater cost-effectiveness.&lt;/p&gt;

&lt;p&gt;AWS is one of the earliest cloud vendors to introduce ARM architecture into the server CPU field. After years of R&amp;amp;D, Graviton CPU has made significant progress and has a great competitive advantage in cost-performance. It is expected that more customers will use Graviton CPU models in the future.&lt;/p&gt;

&lt;ol start="7"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Virtualization Technology&lt;/strong&gt;: Various EC2 models employ distinct virtualization technologies, resulting in differences in their technical parameters. For example, for newer EC2 models, &lt;a href="https://aws.amazon.com/ec2/nitro/" rel="noopener noreferrer"&gt;Nitro virtualization technology&lt;/a&gt; is generally applied. Nitro is AWS's latest virtualization technology, offloading many virtualization behaviors to hardware, making the software relatively lighter and virtualization performance stronger. From the user's perspective, identical configurations will yield enhanced performance due to reduced virtualization overhead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Whether suitable for Machine Learning Scenarios&lt;/strong&gt;: With the development of LLM technology, more and more vendors will choose to train their models in the cloud. If you want to use model training on AWS EC2, &lt;strong&gt;Accelerated Computing&lt;/strong&gt; generally would be your choice, such as:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;P series and G series models&lt;/strong&gt;: They use Nvidia's GPU chips. At the re:Invent 2023 conference, Nvidia and AWS started a deeper &lt;a href="https://nvidianews.nvidia.com/news/aws-nvidia-strategic-collaboration-for-generative-ai" rel="noopener noreferrer"&gt;strategic cooperation&lt;/a&gt;. AWS plans to use Nvidia's latest and most powerful GPUs to create a computing platform specifically for generative AI;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Trn and Inf series&lt;/strong&gt;: In addition to using Nvidia GPUs, AWS also develops its own chips for machine learning, such as the &lt;a href="https://aws.amazon.com/machine-learning/trainium/" rel="noopener noreferrer"&gt;Trainium chip&lt;/a&gt;  for training and the &lt;a href="https://aws.amazon.com/machine-learning/inferentia/" rel="noopener noreferrer"&gt;Inferentia chip&lt;/a&gt; for model inference. Trn series and Inf series EC2 models correspond to these two AWS-developed machine learning chips respectively;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Building on the overview provided above (and there's much more to explore about EC2), we've compiled a few tips for users to consider when selecting EC2 instances.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Typically, for most EC2 models, a higher sequence number indicates a newer CPU model. This generally means better performance and, interestingly, a more cost-effective pricing structure – essentially, you get more bang for your buck.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Among the general-purpose EC2 models, the T series is relatively cheap and offers a &lt;strong&gt;Burstable CPU&lt;/strong&gt; feature: the instance accumulates CPU credits while running below baseline performance, and under high load it can spend those credits to run above baseline for a limited time at no extra cost. The trade-off is that the T series cannot sustain high performance, and it generally has low bandwidth and no EBS optimization. Therefore, the T series is better suited to test environments where performance is not critical;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Within the general-purpose series, if you're aiming for cost-efficiency, it's advisable to prioritize AWS ARM architecture models;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AWS's official EC2 pricing pages are very difficult to read; we recommend using &lt;a href="https://ec2instances.info/" rel="noopener noreferrer"&gt;Vantage&lt;/a&gt; (an open-source project) to check price information;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For most cloud users, the cost of EC2 is generally their major expense. Here are a few ways to reduce this cost as much as possible:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fully utilize the elasticity of the cloud&lt;/strong&gt;: &lt;br&gt;
Make your architecture as flexible as possible and consume computing power on demand. You can use AWS's &lt;a href="https://karpenter.sh/" rel="noopener noreferrer"&gt;Karpenter&lt;/a&gt; or &lt;a href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler" rel="noopener noreferrer"&gt;Cluster Autoscaler&lt;/a&gt; to make your EC2 fleet elastic and scalable;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use Spot instances&lt;/strong&gt;: &lt;br&gt;
Spot instances can be 30% to 90% cheaper than On-Demand instances, but they are subject to preemption and can't be relied on for long-term stable operation; AWS notifies you 2 minutes before reclaiming an instance. With good underlying management, Spot instances are very suitable for elastic computing and interruption-tolerant scenarios. For example, the &lt;a href="https://skypilot.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;SkyPilot&lt;/a&gt; project uses Spot instances from different clouds for machine learning training;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Optimize payment modes&lt;/strong&gt;: &lt;br&gt;
Beyond technical approaches, cost reduction can also be achieved by purchasing Saving Plans. These plans offer lower unit costs compared to On-Demand pricing, though they come with decreased flexibility. This makes them more suited for scenarios with relatively stable business architectures.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
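The burstable-credit mechanism mentioned in takeaway 2 can be approximated with a toy model (a simplification for intuition only, not AWS's actual credit accounting): below baseline the instance earns credits, above baseline it spends them, and once credits run out it is throttled back to baseline.

```rust
// Toy model (not AWS's exact accounting) of the burstable-credit idea:
// below baseline the instance earns credits, above baseline it spends them.
fn simulate(baseline: f64, usage: &[f64], start_credits: f64) -> f64 {
    let mut credits = start_credits;
    for &u in usage {
        if u <= baseline {
            credits += baseline - u; // accrue while running below baseline
        } else {
            credits -= u - baseline; // burn while bursting above baseline
            if credits < 0.0 {
                credits = 0.0; // out of credits: throttled back to baseline
            }
        }
    }
    credits
}

fn main() {
    // A mostly idle workload with one short burst: earlier idle periods
    // accumulate enough credits to cover the burst, with some left over.
    let remaining = simulate(0.2, &[0.05, 0.05, 0.9, 0.05], 0.0);
    assert!(remaining > 0.0);
}
```

The qualitative takeaway holds regardless of the exact accounting: sustained load above baseline exhausts credits, which is why T series instances suit spiky rather than steady workloads.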

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Efficient selection and utilization of EC2 should be tailored to the user's unique scenarios, requiring continuous and iterative optimization. In summary, leveraging the cloud's elasticity and understanding the key parameters of various EC2 models is essential for every AWS user.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Reducing S3 API Calls by 98% | Exploring the Secrets of OpenDAL's RangeReader</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Thu, 04 Jan 2024 07:51:41 +0000</pubDate>
      <link>https://dev.to/greptime/reducing-s3-api-calls-by-98-exploring-the-secrets-of-opendals-rangereader-2ke7</link>
      <guid>https://dev.to/greptime/reducing-s3-api-calls-by-98-exploring-the-secrets-of-opendals-rangereader-2ke7</guid>
      <description>&lt;h2&gt;
  
  
  Preface
&lt;/h2&gt;

&lt;p&gt;At &lt;a href="https://github.com/greptimeTeam/greptimedb/"&gt;GreptimeDB&lt;/a&gt;, we utilize &lt;a href="https://github.com/apache/incubator-opendal"&gt;OpenDAL&lt;/a&gt; as our unified data &lt;code&gt;access layer&lt;/code&gt;. Recently, a colleague informed me that it took 10 seconds to execute a &lt;code&gt;Copy From&lt;/code&gt; statement to import an 800 KiB Parquet file from S3. After some investigation, including reviewing OpenDAL's documentation and the implementation of the relevant &lt;code&gt;Reader&lt;/code&gt; (realizing we hadn't RTFSC 🥲), I document and briefly summarize our findings here.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Relevant OpenDAL source code Commit: &lt;a href="https://github.com/apache/incubator-opendal/tree/6980cd15007c9a2ae8422cbc0750c818e178abf2"&gt;6980cd1&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Understanding OpenDAL Source Code
&lt;/h2&gt;

&lt;p&gt;Frankly speaking, it was only recently that I fully grasped the intricacies of the OpenDAL source code and its invocation relationships, after previously having only a partial understanding of it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Starting with the &lt;code&gt;Operator&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;All our IO operations revolve around the &lt;code&gt;Operator&lt;/code&gt;. Let's see how the &lt;code&gt;Operator&lt;/code&gt; is constructed. In &lt;code&gt;main.rs&lt;/code&gt;, we first create a file-system-based &lt;code&gt;Backend Builder&lt;/code&gt;; subsequently build it into an &lt;code&gt;accessor&lt;/code&gt; (implementing the &lt;code&gt;Accessor&lt;/code&gt; trait); and then pass this &lt;code&gt;accessor&lt;/code&gt; into &lt;code&gt;OperatorBuilder::new&lt;/code&gt;, finally calling &lt;code&gt;finish&lt;/code&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;OpenDAL unifies the behavior of different storage backends through the &lt;code&gt;Accessor&lt;/code&gt; trait, exposing a unified IO interface to the upper layer, like &lt;code&gt;create_dir&lt;/code&gt;, &lt;code&gt;read&lt;/code&gt;, &lt;code&gt;write&lt;/code&gt;, etc.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;opendal&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;services&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Fs&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;opendal&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;#[tokio::main]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Create fs backend builder.&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Fs&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;default&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// Set the root for fs, all operations will happen under this root.&lt;/span&gt;
    &lt;span class="c1"&gt;//&lt;/span&gt;
    &lt;span class="c1"&gt;// NOTE: the root must be absolute path.&lt;/span&gt;
    &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="nf"&gt;.root&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/tmp"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;accessor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="nf"&gt;.build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Operator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;OperatorBuilder&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accessor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="nf"&gt;.finish&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What Happens in &lt;code&gt;OperatorBuilder::new&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;accessor&lt;/code&gt; we pass in is wrapped with two layers when invoking &lt;code&gt;new&lt;/code&gt;, and an additional internal &lt;code&gt;Layer&lt;/code&gt; is added when invoking &lt;code&gt;finish&lt;/code&gt;. With these layers added, when we invoke the interfaces exposed by &lt;code&gt;Operator&lt;/code&gt;, the invocation starts from the outermost &lt;code&gt;CompleteLayer&lt;/code&gt; and eventually reaches the innermost &lt;code&gt;FsAccessor&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;FsAccessor&lt;/span&gt;
&lt;span class="n"&gt;ErrorContextLayer&lt;/span&gt;
&lt;span class="n"&gt;CompleteLayer&lt;/span&gt;
&lt;span class="o"&gt;^&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;Invoking&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`read`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;`reader_with`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;`stat`&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Accessor&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;OperatorBuilder&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cd"&gt;/// Create a new operator builder.&lt;/span&gt;
    &lt;span class="nd"&gt;#[allow(clippy::new_ret_no_self)]&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accessor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;OperatorBuilder&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Accessor&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Make sure error context layer has been attached.&lt;/span&gt;
        &lt;span class="n"&gt;OperatorBuilder&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;accessor&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="nf"&gt;.layer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ErrorContextLayer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;.layer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CompleteLayer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="o"&gt;...&lt;/span&gt;

    &lt;span class="cd"&gt;/// Finish the building to construct an Operator.&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;finish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Operator&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;ob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.layer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypeEraseLayer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nn"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Arc&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ob&lt;/span&gt;&lt;span class="py"&gt;.accessor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;FusedAccessor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;TL;DR: I just want to emphasize that we should read the source code of OpenDAL starting from CompleteLayer (an epiphany).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Background Information
&lt;/h2&gt;

&lt;p&gt;Let me provide some necessary context here to understand the following content.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;LruCacheLayer&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Currently, in query scenarios, we add a &lt;code&gt;LruCacheLayer&lt;/code&gt; while building the &lt;code&gt;Operator&lt;/code&gt;, so our &lt;code&gt;Operator&lt;/code&gt; looks like the diagram below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="n"&gt;S3Accessor&lt;/span&gt;                &lt;span class="n"&gt;FsAccessor&lt;/span&gt;
&lt;span class="n"&gt;ErrorContextLayer&lt;/span&gt;         &lt;span class="n"&gt;ErrorContextLayer&lt;/span&gt;
&lt;span class="n"&gt;CompleteLayer&lt;/span&gt;             &lt;span class="n"&gt;CompleteLayer&lt;/span&gt;
    &lt;span class="o"&gt;^&lt;/span&gt;                         &lt;span class="o"&gt;^&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;
    &lt;span class="p"&gt;|&lt;/span&gt;                         &lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;
    &lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;inner&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;           &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;
    &lt;span class="p"&gt;|&lt;/span&gt;                         &lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;
    &lt;span class="p"&gt;|&lt;/span&gt;                         &lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;
    &lt;span class="p"&gt;|&lt;/span&gt;                         &lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;
    &lt;span class="o"&gt;+-----&lt;/span&gt; &lt;span class="n"&gt;LruCacheLayer&lt;/span&gt; &lt;span class="o"&gt;-----+&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;
                 &lt;span class="o"&gt;^&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt;
                 &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt;
                 &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt;
                 &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="n"&gt;v&lt;/span&gt;
                 &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="nn"&gt;FileReader&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;oio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;TokioReader&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;File&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                 &lt;span class="p"&gt;|&lt;/span&gt;
                 &lt;span class="nf"&gt;Invoking&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="n"&gt;reader_with&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, with the &lt;code&gt;read&lt;/code&gt; interface, &lt;code&gt;LruCacheLayer&lt;/code&gt; caches S3 files on the file system and returns the cached file-system-based &lt;code&gt;Box&amp;lt;dyn oio::Read&amp;gt;&lt;/code&gt; (&lt;code&gt;FileReader::new(oio::TokioReader&amp;lt;tokio::fs::File&amp;gt;)&lt;/code&gt;) to the upper layer; if the file to be read is not in the cache, it is first downloaded in full from S3 to the local file system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;LruCacheLayer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;inner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// S3Backend&lt;/span&gt;
  &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// FsBackend&lt;/span&gt;
  &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;CacheIndex&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;LayeredAccessor&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;LruCacheLayer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="o"&gt;...&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;OpRead&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RpRead&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;Self&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Reader&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.index&lt;/span&gt;&lt;span class="nf"&gt;.hit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="c1"&gt;// Returns `Box&amp;lt;dyn oio::Read&amp;gt;`&lt;/span&gt;
          &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.cache&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt; 
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="c1"&gt;// Fetches cache and stores...&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The &lt;code&gt;Copy From&lt;/code&gt; Scenario
&lt;/h3&gt;

&lt;p&gt;In the &lt;code&gt;Copy From&lt;/code&gt; scenario, we didn't add the &lt;code&gt;LruCacheLayer&lt;/code&gt;. Thus, our &lt;code&gt;Operator&lt;/code&gt; looks like the diagram below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;S3Accessor
ErrorContextLayer
CompleteLayer
   ▲    │
   │    │
   │    │
   │    ▼
   │    RangeReader::new(IncomingAsyncBody)
   │
   Invoking (`reader`, `reader_with`)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Issues Encountered with OpenDAL RangeReader
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Starting with the Construction of ParquetRecordBatchStream
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;Copy From&lt;/code&gt;, after obtaining the file information (i.e., the file location on S3), we first invoke &lt;code&gt;operator.reader&lt;/code&gt; to get a &lt;code&gt;reader&lt;/code&gt; implementing &lt;code&gt;AsyncRead + AsyncSeek&lt;/code&gt;, then wrap it with a &lt;code&gt;BufReader&lt;/code&gt;. Ultimately, this &lt;code&gt;reader&lt;/code&gt; is passed into &lt;code&gt;ParquetRecordBatchStreamBuilder&lt;/code&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Here, the use of &lt;code&gt;BufReader&lt;/code&gt; is superfluous because it clears its internal buffer after invoking the &lt;code&gt;seek&lt;/code&gt; method, negating any potential performance benefits.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;  ...
  let reader = operator
      .reader(path)
      .await
      .context(error::ReadObjectSnafu { path })?;

  let buf_reader = BufReader::new(reader.compat());

  let builder = ParquetRecordBatchStreamBuilder::new(buf_reader)
      .await
      .context(error::ReadParquetSnafu)?;

  let upstream = builder
      .build()
      .context(error::BuildParquetRecordBatchStreamSnafu)?;

  ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
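The `BufReader` caveat above can be reproduced outside of async code with the standard library's `std::io::BufReader`, whose `Seek` implementation likewise discards the internal buffer on every `seek` (only `seek_relative` preserves it). A self-contained sketch using an in-memory reader:

```rust
use std::io::{BufReader, Cursor, Read, Seek, SeekFrom};

fn main() -> std::io::Result<()> {
    // An in-memory "file" so the example is self-contained.
    let data = vec![42u8; 16 * 1024];
    let mut reader = BufReader::new(Cursor::new(data));

    // Reading one byte fills BufReader's internal buffer ahead of the cursor.
    let mut byte = [0u8; 1];
    reader.read_exact(&mut byte)?;
    assert!(!reader.buffer().is_empty());

    // A seek discards the whole buffer, so the next read goes back to the
    // underlying source -- the buffering bought us nothing.
    reader.seek(SeekFrom::Start(0))?;
    assert!(reader.buffer().is_empty());

    Ok(())
}
```

This is why wrapping a reader that seeks frequently (as Parquet footer and row-group reads do) in a `BufReader` adds no benefit.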

&lt;h3&gt;
  
  
  Reading Metadata in &lt;code&gt;ParquetRecordBatchStream::new&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The metadata reading logic is as follows: first, it invokes &lt;code&gt;seek(SeekFrom::End(-FOOTER_SIZE_I64))&lt;/code&gt;, reads &lt;code&gt;FOOTER_SIZE&lt;/code&gt; bytes, and parses &lt;code&gt;metadata_len&lt;/code&gt;; then it invokes &lt;code&gt;seek&lt;/code&gt; again and reads &lt;code&gt;metadata_len&lt;/code&gt; bytes to parse the metadata.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AsyncRead&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;AsyncSeek&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Unpin&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Send&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AsyncFileReader&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;BoxFuture&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nv"&gt;'_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Arc&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;ParquetMetaData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;FOOTER_SIZE_I64&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;i64&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FOOTER_SIZE&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;i64&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.seek&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SeekFrom&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;End&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;FOOTER_SIZE_I64&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0_u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;FOOTER_SIZE&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
            &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.read_exact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;metadata_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;decode_footer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.seek&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;SeekFrom&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;End&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;FOOTER_SIZE_I64&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;metadata_len&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;i64&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Vec&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;with_capacity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata_len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata_len&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.read_to_end&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Arc&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;decode_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nf"&gt;.boxed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
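&lt;p&gt;For reference, the footer layout that &lt;code&gt;decode_footer&lt;/code&gt; relies on comes from the Parquet format itself: the last 8 bytes of a file are a 4-byte little-endian metadata length followed by the 4-byte magic &lt;code&gt;PAR1&lt;/code&gt;. A minimal standalone sketch of that decoding (illustrative only, not GreptimeDB's or parquet-rs's actual implementation):&lt;/p&gt;

```rust
/// Size of the Parquet footer: 4-byte metadata length + 4-byte magic.
const FOOTER_SIZE: usize = 8;
const PARQUET_MAGIC: [u8; 4] = *b"PAR1";

/// Decode the metadata length from the 8-byte footer.
/// Returns an error if the magic number does not match.
fn decode_footer(footer: &[u8; FOOTER_SIZE]) -> Result<usize, String> {
    if footer[4..] != PARQUET_MAGIC[..] {
        return Err("invalid Parquet footer magic".to_string());
    }
    // The first 4 footer bytes are the metadata length, little-endian.
    let len = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]);
    Ok(len as usize)
}
```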



&lt;h3&gt;
  
  
  The Real Problem
&lt;/h3&gt;

&lt;p&gt;Up to this point, we've discussed some minor issues. The more challenging problem arises here, where the variable &lt;code&gt;stream&lt;/code&gt; is the &lt;code&gt;ParquetRecordBatchStream&lt;/code&gt; we've built above. When we invoke &lt;code&gt;next&lt;/code&gt;, &lt;code&gt;ParquetRecordBatchStream&lt;/code&gt; invokes &lt;code&gt;RangeReader&lt;/code&gt;'s &lt;code&gt;seek&lt;/code&gt; and &lt;code&gt;read&lt;/code&gt; multiple times. However, each call to &lt;code&gt;seek&lt;/code&gt; resets &lt;code&gt;RangeReader&lt;/code&gt;'s internal state (discarding the previous byte stream) and, on the subsequent &lt;code&gt;read&lt;/code&gt; call, initiates a new remote request (in the S3 backend scenario).&lt;/p&gt;

&lt;p&gt;You can find more details in &lt;a href="https://github.com/apache/incubator-opendal/issues/3747"&gt;this issue&lt;/a&gt; and &lt;a href="https://github.com/apache/incubator-opendal/pull/3734"&gt;the discussion here&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When using &lt;code&gt;ParquetRecordBatchStream&lt;/code&gt; to retrieve each column's data, it will first invoke &lt;code&gt;RangeReader&lt;/code&gt;'s &lt;code&gt;seek&lt;/code&gt;, then &lt;code&gt;read&lt;/code&gt; some bytes. Thus, the total number of remote calls required is the number of &lt;code&gt;RowGroups&lt;/code&gt; multiplied by the number of columns in a &lt;code&gt;RowGroup&lt;/code&gt;. Our 800KiB file contains 50 &lt;code&gt;RowGroups&lt;/code&gt; and 12 columns (per &lt;code&gt;RowGroup&lt;/code&gt;), which results in &lt;strong&gt;600 S3 get requests!&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
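&lt;p&gt;The amplification above is simple arithmetic: every column chunk in every &lt;code&gt;RowGroup&lt;/code&gt; costs one &lt;code&gt;seek&lt;/code&gt; + &lt;code&gt;read&lt;/code&gt; pair, i.e. one remote GET. A trivial sketch with the numbers from the issue:&lt;/p&gt;

```rust
/// Each column chunk in each row group triggers one `seek` + `read`
/// pair, i.e. one remote GET request when backed by S3.
fn remote_requests(row_groups: usize, columns_per_row_group: usize) -> usize {
    row_groups * columns_per_row_group
}
```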

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;        &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;copy_table_from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;...&lt;/span&gt;
            &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="nf"&gt;.next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;record_batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="nf"&gt;.context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;ReadDfRecordBatchSnafu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;vectors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                    &lt;span class="nn"&gt;Helper&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;try_into_vectors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record_batch&lt;/span&gt;&lt;span class="nf"&gt;.columns&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="nf"&gt;.context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IntoVectorsSnafu&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

                &lt;span class="n"&gt;pending_mem_size&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="nf"&gt;.iter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.map&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="nf"&gt;.memory_size&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="py"&gt;.sum&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;columns_values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;
                    &lt;span class="nf"&gt;.iter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="nf"&gt;.cloned&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="nf"&gt;.zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="py"&gt;.collect&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;HashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

                &lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="nf"&gt;.push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.inserter&lt;/span&gt;&lt;span class="nf"&gt;.handle_table_insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;InsertRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;catalog_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="py"&gt;.catalog_name&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                        &lt;span class="n"&gt;schema_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="py"&gt;.schema_name&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                        &lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="py"&gt;.table_name&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                        &lt;span class="n"&gt;columns_values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="n"&gt;query_ctx&lt;/span&gt;&lt;span class="nf"&gt;.clone&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="p"&gt;));&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pending_mem_size&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;pending_mem_threshold&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;rows_inserted&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nf"&gt;batch_insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;pending_mem_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Explore the &lt;code&gt;RangeReader&lt;/code&gt; Source Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Take a look at &lt;code&gt;self.poll_read()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;RangeReader&lt;/code&gt;, &lt;code&gt;self.state&lt;/code&gt; starts as &lt;code&gt;State::Idle&lt;/code&gt;. Assuming &lt;code&gt;self.offset&lt;/code&gt; is &lt;code&gt;Some(0)&lt;/code&gt;, &lt;code&gt;self.state&lt;/code&gt; is set to &lt;code&gt;State::SendRead(BoxFuture&amp;lt;'static, Result&amp;lt;(RpRead, R)&amp;gt;&amp;gt;)&lt;/code&gt; and &lt;code&gt;self.poll_read(cx, buf)&lt;/code&gt; is invoked again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nn"&gt;oio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Read&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;RangeReader&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Accessor&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;oio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;poll_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nv"&gt;'_&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Poll&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Idle&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.offset&lt;/span&gt;&lt;span class="nf"&gt;.is_none&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="c1"&gt;// When the offset is none, it means we are performing tailing reading.&lt;/span&gt;
                    &lt;span class="c1"&gt;// We should start by getting the correct offset through a stat operation.&lt;/span&gt;
                    &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;SendStat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.stat_future&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;SendRead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.read_future&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                &lt;span class="p"&gt;};&lt;/span&gt;

                &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.poll_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;...&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
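&lt;p&gt;Stripped of the async plumbing, the &lt;code&gt;Idle&lt;/code&gt; branch above is a small state machine. A simplified synchronous model of the transition (illustrative only, not OpenDAL's actual types):&lt;/p&gt;

```rust
/// A simplified, synchronous model of `RangeReader`'s state transitions.
enum State {
    Idle,
    SendStat, // offset unknown: stat first to learn the file size
    SendRead, // offset known: issue the ranged read
}

fn next_state(state: State, offset: Option<u64>) -> State {
    match state {
        // From Idle, the next state depends on whether the offset is known.
        State::Idle => match offset {
            None => State::SendStat,
            Some(_) => State::SendRead,
        },
        // The other transitions (consuming SendStat/SendRead) are elided.
        other => other,
    }
}
```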



&lt;h3&gt;
  
  
  What happens in &lt;code&gt;self.read_future()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Clearly, &lt;code&gt;self.read_future()&lt;/code&gt; returns a &lt;code&gt;BoxFuture&lt;/code&gt;. Within this &lt;code&gt;BoxFuture&lt;/code&gt;, the underlying &lt;code&gt;Accessor&lt;/code&gt;'s &lt;code&gt;read&lt;/code&gt; method (&lt;code&gt;acc.read(&amp;amp;path, op).await&lt;/code&gt;) is invoked. The &lt;code&gt;Accessor&lt;/code&gt; is an implementation for a particular storage backend; in our context it represents S3. When its &lt;code&gt;read&lt;/code&gt; interface is invoked, it establishes a TCP connection to retrieve the file data and returns the byte stream of S3's response to the upper layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;RangeReader&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Accessor&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;oio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;read_future&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;BoxFuture&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;'static&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RpRead&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;acc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.acc&lt;/span&gt;&lt;span class="nf"&gt;.clone&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.path&lt;/span&gt;&lt;span class="nf"&gt;.clone&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;op&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.op&lt;/span&gt;&lt;span class="nf"&gt;.clone&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.cur&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;op&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="nf"&gt;.into_deterministic&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;op&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="nf"&gt;.with_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.calculate_range&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

        &lt;span class="nn"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;pin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;move&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;acc&lt;/span&gt;&lt;span class="nf"&gt;.read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
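&lt;p&gt;The range set by &lt;code&gt;op.with_range(self.calculate_range())&lt;/code&gt; ultimately becomes an HTTP &lt;code&gt;Range&lt;/code&gt; header on the S3 GET request. A sketch of that mapping (&lt;code&gt;range_header&lt;/code&gt; is a hypothetical helper for illustration, not OpenDAL's API):&lt;/p&gt;

```rust
/// Format a byte range as an HTTP `Range` header value, as used by
/// S3 ranged GETs. `end` is exclusive here, while the header's end
/// position is inclusive, hence the `- 1`.
fn range_header(start: u64, end: Option<u64>) -> String {
    match end {
        Some(end) => format!("bytes={}-{}", start, end - 1),
        // Open-ended range: read from `start` to the end of the object.
        None => format!("bytes={}-", start),
    }
}
```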



&lt;h3&gt;
  
  
  Continuing from where we left off in &lt;code&gt;self.poll_read()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;At this point, &lt;code&gt;poll_read&lt;/code&gt; has not yet returned. In the previous section, &lt;code&gt;self.poll_read()&lt;/code&gt; was invoked again with &lt;code&gt;self.state&lt;/code&gt; being &lt;code&gt;State::SendRead(BoxFuture&amp;lt;'static, Result&amp;lt;(RpRead, R)&amp;gt;&amp;gt;)&lt;/code&gt;. The value returned by &lt;code&gt;ready!(Pin::new(fut).poll(cx))&lt;/code&gt; corresponds to the result of &lt;code&gt;acc.read(&amp;amp;path, op).await&lt;/code&gt; from the previous section. For the S3 storage backend, remote calls happen here. &lt;/p&gt;

&lt;p&gt;Afterward, the internal state of &lt;code&gt;RangeReader&lt;/code&gt; is set to &lt;code&gt;State::Read(r)&lt;/code&gt;, and &lt;code&gt;self.poll_read(cx, buf)&lt;/code&gt; is invoked once more. Upon re-entering &lt;code&gt;self.poll_read()&lt;/code&gt;, the &lt;code&gt;State::Read(r)&lt;/code&gt; arm matches, where &lt;code&gt;r&lt;/code&gt; is the byte stream of the read request's response. For the S3 storage backend, &lt;code&gt;Pin::new(r).poll_read(cx, buf)&lt;/code&gt; copies the byte data from the TCP buffer into the upper-level application's buffer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;impl&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nn"&gt;oio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Read&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;RangeReader&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt;
    &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Accessor&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;oio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;poll_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nv"&gt;'_&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Poll&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;usize&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Sanity check for normal cases.&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="nf"&gt;.is_empty&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;||&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.cur&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.size&lt;/span&gt;&lt;span class="nf"&gt;.unwrap_or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;u64&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nn"&gt;Poll&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Ready&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="o"&gt;...&lt;/span&gt;
            &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;SendRead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fut&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nd"&gt;ready!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Pin&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fut&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="nf"&gt;.map_err&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="c1"&gt;// If the read future returns an error, reset the state to Idle to retry.&lt;/span&gt;
                    &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Idle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="n"&gt;err&lt;/span&gt;
                &lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

                &lt;span class="c1"&gt;// Set the size if the read returns a size hint.&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rp&lt;/span&gt;&lt;span class="nf"&gt;.size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.size&lt;/span&gt;&lt;span class="nf"&gt;.is_none&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.cur&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.poll_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="nd"&gt;ready!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Pin&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.poll_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="c1"&gt;// Reset the state to Idle after all data has been consumed.&lt;/span&gt;
                    &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Idle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="nn"&gt;Poll&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Ready&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.cur&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="nn"&gt;Poll&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Ready&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;State&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Idle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="nn"&gt;Poll&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Ready&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Final Look at &lt;code&gt;self.poll_seek()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Remember the internal state of &lt;code&gt;RangeReader&lt;/code&gt; we discussed earlier? Yes, it was &lt;code&gt;State::Read(R)&lt;/code&gt;. When we call &lt;code&gt;seek&lt;/code&gt; after a &lt;code&gt;read&lt;/code&gt;, the byte stream inside &lt;code&gt;RangeReader&lt;/code&gt; is discarded and the state is reset to &lt;code&gt;State::Idle&lt;/code&gt;. In other words, every time &lt;code&gt;read&lt;/code&gt; is invoked after a &lt;code&gt;seek&lt;/code&gt;, &lt;code&gt;RangeReader&lt;/code&gt; must call the &lt;code&gt;read&lt;/code&gt; method of the underlying &lt;code&gt;Accessor&lt;/code&gt; (&lt;code&gt;acc.read(&amp;amp;path, op).await&lt;/code&gt;) to initiate a remote call. For the S3 storage backend, each such call incurs significant overhead (typically hundreds of milliseconds).&lt;/p&gt;
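&lt;p&gt;The reset-then-refetch behavior can be illustrated with a hypothetical, much-simplified sketch (this is not OpenDAL's actual code; the types and the simulated remote call are ours):&lt;/p&gt;

```rust
// A hypothetical, much-simplified sketch (not OpenDAL's actual code) of the
// behavior described above: `seek` discards the in-flight byte stream and
// resets the state to `Idle`, so the next `read` must open a brand-new
// ranged request against the backend.
enum State {
    Idle,
    Read(Vec<u8>), // stands in for the remote byte stream
}

struct RangeReader {
    state: State,
    remote_calls: u32, // counts simulated `acc.read(&path, op)` round trips
}

impl RangeReader {
    fn new() -> Self {
        RangeReader { state: State::Idle, remote_calls: 0 }
    }

    fn seek(&mut self, _pos: u64) {
        // Any buffered stream is dropped; the next read starts from scratch.
        self.state = State::Idle;
    }

    fn read(&mut self) -> usize {
        if matches!(self.state, State::Idle) {
            // Simulates the remote call; against S3 this typically costs
            // hundreds of milliseconds.
            self.remote_calls += 1;
            self.state = State::Read(vec![0u8; 16]);
        }
        match &self.state {
            State::Read(buf) => buf.len(),
            State::Idle => unreachable!(),
        }
    }
}
```

&lt;p&gt;Consecutive &lt;code&gt;read&lt;/code&gt;s reuse the open stream, but every intervening &lt;code&gt;seek&lt;/code&gt; forces one more remote round trip.&lt;/p&gt;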

&lt;p&gt;There is one more performance-related point worth noting. When seeking with &lt;code&gt;SeekFrom::End(amt)&lt;/code&gt; while &lt;code&gt;self.size&lt;/code&gt; is unknown, an additional &lt;code&gt;stat&lt;/code&gt; operation is performed to determine the size. After &lt;code&gt;self.poll_seek()&lt;/code&gt; completes, &lt;code&gt;self.cur&lt;/code&gt; is set to &lt;code&gt;base.checked_add(amt)&lt;/code&gt;.&lt;/p&gt;
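&lt;p&gt;The cursor arithmetic can be sketched with &lt;code&gt;std::io::SeekFrom&lt;/code&gt; (a simplified model; the function below is ours, not OpenDAL's API):&lt;/p&gt;

```rust
use std::io::SeekFrom;

// A sketch of the cursor arithmetic described above: `base` is 0 for
// `Start`, the current position for `Current`, and the total size for `End`
// (obtaining that size is what may require the extra `stat` call). The new
// cursor is `base.checked_add(amt)`: `None` signals overflow or a seek
// before the start of the object.
fn new_cursor(cur: u64, size: u64, pos: SeekFrom) -> Option<u64> {
    match pos {
        SeekFrom::Start(n) => Some(n),
        SeekFrom::Current(amt) if amt >= 0 => cur.checked_add(amt as u64),
        SeekFrom::Current(amt) => cur.checked_sub(amt.unsigned_abs()),
        SeekFrom::End(amt) if amt >= 0 => size.checked_add(amt as u64),
        SeekFrom::End(amt) => size.checked_sub(amt.unsigned_abs()),
    }
}
```

&lt;p&gt;Only the &lt;code&gt;End&lt;/code&gt; arm depends on the object size, which is why an unknown &lt;code&gt;self.size&lt;/code&gt; triggers the extra &lt;code&gt;stat&lt;/code&gt;.&lt;/p&gt;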

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We've implemented a quick fix that decreased the number of &lt;code&gt;RowGroups&lt;/code&gt; imported from 50 to just 1. However, this solution still requires 12 remote calls. Moving forward, we plan to contribute a &lt;code&gt;BufferReader&lt;/code&gt; to OpenDAL (see &lt;a href="https://github.com/apache/incubator-opendal/pull/3734"&gt;this RFC&lt;/a&gt;), which is expected to significantly reduce the number of consecutive remote calls triggered by &lt;code&gt;seek&lt;/code&gt; and &lt;code&gt;read&lt;/code&gt; operations in &lt;code&gt;RangeReader&lt;/code&gt;; in certain cases, these calls can be eliminated entirely.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When &lt;code&gt;seek&lt;/code&gt; is invoked on a &lt;code&gt;RangeReader&lt;/code&gt;, its internal state is reset, and a subsequent &lt;code&gt;read&lt;/code&gt; results in a remote call to the underlying &lt;code&gt;Accessor&lt;/code&gt; (in scenarios where the backend is S3). For related information, please refer to &lt;a href="https://github.com/apache/incubator-opendal/issues/3747"&gt;this issue&lt;/a&gt; and &lt;a href="https://github.com/apache/incubator-opendal/pull/3734"&gt;this discussion&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Both &lt;code&gt;std::io::BufReader&lt;/code&gt; and &lt;code&gt;tokio::io::BufReader&lt;/code&gt; clear their internal buffers after a &lt;code&gt;seek&lt;/code&gt;. If you wish to keep reading from the buffer, use &lt;code&gt;seek_relative&lt;/code&gt; instead.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
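&lt;p&gt;The last point is easy to verify with the standard library alone (an illustrative example, not taken from GreptimeDB):&lt;/p&gt;

```rust
use std::io::{BufReader, Cursor, Read, Seek, SeekFrom};

// Demonstrates the buffer behavior described above: `Seek::seek` on a
// `BufReader` discards the internal buffer, while `seek_relative` keeps it
// when the target position is still inside the buffer.
fn bufreader_demo() -> std::io::Result<(u8, u8, u8)> {
    let data: Vec<u8> = (0u8..=255).collect();
    let mut reader = BufReader::with_capacity(64, Cursor::new(data));
    let mut byte = [0u8; 1];

    reader.read_exact(&mut byte)?; // fills the 64-byte buffer
    let first = byte[0];

    // `seek` drops the buffer: the next read refills from the inner reader,
    // even though position 10 was already buffered.
    reader.seek(SeekFrom::Start(10))?;
    reader.read_exact(&mut byte)?;
    let second = byte[0];

    // `seek_relative` adjusts the position inside the existing buffer and
    // avoids the refill entirely.
    reader.seek_relative(5)?;
    reader.read_exact(&mut byte)?;
    let third = byte[0];

    Ok((first, second, third))
}
```

&lt;p&gt;With a real file or network-backed reader instead of &lt;code&gt;Cursor&lt;/code&gt;, the dropped buffer after &lt;code&gt;seek&lt;/code&gt; translates directly into extra I/O.&lt;/p&gt;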

</description>
      <category>programming</category>
      <category>database</category>
      <category>api</category>
      <category>greptime</category>
    </item>
    <item>
      <title>Unlock Complex Time Series Analysis in SQL with Range Queries</title>
      <dc:creator>Greptime</dc:creator>
      <pubDate>Thu, 28 Dec 2023 00:58:06 +0000</pubDate>
      <link>https://dev.to/greptime/unlock-complex-time-series-analysis-in-sql-with-range-queries-3ig9</link>
      <guid>https://dev.to/greptime/unlock-complex-time-series-analysis-in-sql-with-range-queries-3ig9</guid>
      <description>&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;Time-series data often requires querying and aggregating over specified time intervals, a pattern well-supported by &lt;code&gt;PromQL&lt;/code&gt;'s &lt;code&gt;Range selector&lt;/code&gt;. However, executing these queries in standard SQL is notably complex. To address this, GreptimeDB has introduced an enhanced SQL Range query syntax, effectively marrying SQL's robust flexibility with specialized time-series querying capabilities. This advancement ensures seamless, native handling of time-series data within SQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Explore Range Queries with SQL on GreptimePlay
&lt;/h2&gt;

&lt;p&gt;Our interactive documentation for range queries is now officially available on &lt;a href="https://www.greptime.com/playground" rel="noopener noreferrer"&gt;GreptimePlay&lt;/a&gt;! &lt;/p&gt;

&lt;p&gt;You can delve into various query techniques through a daily example using SQL and receive immediate, visualized feedback. Dive into the world of dynamic data querying on GreptimePlay today!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfxcjwsn8mjr4v56zb3q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfxcjwsn8mjr4v56zb3q.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjr8oam596ryffm9vfm19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjr8oam596ryffm9vfm19.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;

&lt;p&gt;Let's illustrate the Range query with an example. The following temperature table records the temperatures in different cities at various times:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filmp4on5u92zzxpz09by.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Filmp4on5u92zzxpz09by.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Consider the following scenario: we want to query the daily and weekly average temperatures in Beijing up to May 2, 2023 (timestamp &lt;code&gt;1682985600000&lt;/code&gt;), using linear interpolation to estimate values for any missing data points.&lt;/p&gt;

&lt;p&gt;To express these two queries in PromQL, we use one day as the step size. For the daily average temperature, we aggregate data over each day; for the weekly average, we extend the aggregation window to a week and calculate the average over each week. Additionally, to align the query with the specific timestamp &lt;code&gt;1682985600000&lt;/code&gt;, we use PromQL's &lt;code&gt;@&lt;/code&gt; operator, which pins the query evaluation time exactly to the given timestamp, ensuring accurate and relevant data retrieval for the specified period.&lt;/p&gt;

&lt;p&gt;The final query looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Daily average temperature&lt;/span&gt;
&lt;span class="n"&gt;avg_over_time&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"beijing"&lt;/span&gt;&lt;span class="p"&gt;}[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="mi"&gt;1682985600000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;

&lt;span class="c1"&gt;-- Weekly average temperature&lt;/span&gt;
&lt;span class="n"&gt;avg_over_time&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"beijing"&lt;/span&gt;&lt;span class="p"&gt;}[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="mi"&gt;1682985600000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, the above query has some issues: PromQL excels at data querying but struggles with handling missing data points, i.e., smoothing the queried data. PromQL does have a Lookback delta mechanism (see &lt;a href="https://promlabs.com/blog/2020/07/02/selecting-data-in-promql/#lookback-delta" rel="noopener noreferrer"&gt;this article&lt;/a&gt; for more details), which substitutes old data for missing data points, but this default behavior may not be desirable under certain circumstances. Because of the Lookback delta, aggregated results can carry stale values, and it is challenging for PromQL to precisely control data accuracy. Furthermore, PromQL has no effective method for data smoothing, as our requirement above demands.&lt;/p&gt;

&lt;p&gt;From a traditional SQL perspective, since there is no such Lookback delta mechanism, we can precisely control the scope of our data selection and aggregation, allowing for more accurate queries. &lt;/p&gt;

&lt;p&gt;The query here essentially aggregates data daily and weekly. For daily average temperatures, we can use the scalar function &lt;code&gt;date_trunc&lt;/code&gt;, which truncates a timestamp to a given precision. We use this function to truncate timestamps to daily units and then aggregate the data by day to get the desired results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Daily average temperature&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;temp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;
        &lt;span class="k"&gt;temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt;
        &lt;span class="n"&gt;temperature&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt;
        &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"beijing"&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1682985600000&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above query roughly meets our needs, but there are issues with this type of query:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Complicated to write with the subqueries required;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This method can only calculate daily average temperatures, not weekly averages. In SQL, aggregation demands that each piece of data belong to only one group. However, this becomes problematic in time series queries where each sampling spans a week with intervals recorded daily. In such cases, a single data point is inevitably shared across multiple groups, making traditional SQL queries unsuitable for these queries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Still doesn't address the issue of filling in blank data.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The crucial issue we now must address is that these queries are fundamentally time series in nature, yet the SQL we employ, despite its highly flexible expressive power, is not tailor-made for time series databases. This mismatch highlights the need for some new SQL extension syntax to effectively manage and query time series data. Some time series databases like InfluxDB offer &lt;code&gt;group by time&lt;/code&gt; syntax, and QuestDB offers &lt;code&gt;Sample By&lt;/code&gt; syntax. These implementations provide ideas for our Range queries. &lt;/p&gt;

&lt;p&gt;Next, we'll introduce how to utilize GreptimeDB's Range syntax for the above queries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- average daily temperature&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;temp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="s1"&gt;'1d'&lt;/span&gt; &lt;span class="n"&gt;FILL&lt;/span&gt; &lt;span class="n"&gt;LINEAR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"beijing"&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1682985600000&lt;/span&gt;
&lt;span class="n"&gt;ALIGN&lt;/span&gt; &lt;span class="s1"&gt;'1d'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- average weekly temperature&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;temp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="s1"&gt;'7d'&lt;/span&gt; &lt;span class="n"&gt;FILL&lt;/span&gt; &lt;span class="n"&gt;LINEAR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"beijing"&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1682985600000&lt;/span&gt;
&lt;span class="n"&gt;ALIGN&lt;/span&gt; &lt;span class="s1"&gt;'1d'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We have introduced a keyword, &lt;code&gt;ALIGN&lt;/code&gt;, into a &lt;code&gt;SELECT&lt;/code&gt; statement to represent the step size of each time series query, aligning the time to the calendar. Following the aggregation function, a &lt;code&gt;RANGE&lt;/code&gt; keyword is used to denote the scope of each data aggregation. &lt;code&gt;FILL LINEAR&lt;/code&gt; indicates the method of filling in when data points are missing, by using linear interpolation to fill the data. Through this approach, we can more easily fulfill the requirements mentioned earlier.&lt;/p&gt;

&lt;p&gt;The Range query allows us to elegantly express time series queries in SQL, effectively compensating for SQL's shortcomings in describing time series queries. Moreover, it enables the combination of SQL's powerful expressive capabilities to achieve more complex data querying functions.&lt;br&gt;
Range queries also offer more flexible usage options, with specific details available in &lt;a href="https://docs.greptime.com/reference/sql/range" rel="noopener noreferrer"&gt;this documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Implementation Logic
&lt;/h2&gt;

&lt;p&gt;Range query is essentially a data aggregation algorithm, but it differs from traditional SQL data aggregation in a key aspect: in Range queries, a single data point may be aggregated into multiple groups. For example, if a user wants to calculate the average weekly temperature for each day, each temperature data point will be used in the calculation for several weekly averages. &lt;/p&gt;

&lt;p&gt;The aforementioned query logic, when formulated as a Range query, can be articulated in the following manner.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;RANGE&lt;/span&gt; &lt;span class="s1"&gt;'7d'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="n"&gt;ALIGN&lt;/span&gt; &lt;span class="s1"&gt;'1d'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each Range expression, we use &lt;code&gt;align_to&lt;/code&gt; (specified by the &lt;code&gt;TO&lt;/code&gt; keyword; when omitted, as in the query above, it defaults to UTC time 0. For more usage of the &lt;code&gt;TO&lt;/code&gt; keyword, please refer to &lt;a href="https://docs.greptime.com/reference/sql/range#to-option" rel="noopener noreferrer"&gt;this documentation&lt;/a&gt;), together with the align (&lt;code&gt;1d&lt;/code&gt;) and range (&lt;code&gt;7d&lt;/code&gt;) parameters, to define time windows (each time window is called a time slot) and assign data points to time slots based on their timestamps.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The time origin on the time axis is set at &lt;code&gt;align_to&lt;/code&gt;, and we segment aligned time points both forwards and backwards using align as the step size. This collection of time points is referred to as &lt;code&gt;align_ts&lt;/code&gt;. The formula for &lt;code&gt;align_ts&lt;/code&gt; is &lt;code&gt;{ ts | ts = align_to + k * align, k is an integer }&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For each element &lt;code&gt;ts&lt;/code&gt; in the &lt;code&gt;align_ts&lt;/code&gt; set, a &lt;code&gt;time slot&lt;/code&gt; is defined. A time slot is a left-closed, right-open interval satisfying &lt;code&gt;[ts , ts + range)&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
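&lt;p&gt;The slot computation in steps 1 and 2 can be sketched as follows (a simplified model with all times as plain &lt;code&gt;i64&lt;/code&gt; offsets; the function name and signature are ours, not GreptimeDB's):&lt;/p&gt;

```rust
// A sketch of the slot computation described above: every time slot is the
// left-closed, right-open interval [t, t + range) where
// t = align_to + k * align for some integer k. A data point at `ts` belongs
// to every slot whose start satisfies ts - range < t <= ts.
fn slots_for(ts: i64, align_to: i64, align: i64, range: i64) -> Vec<i64> {
    // Largest k with align_to + k * align <= ts (div_euclid floors
    // correctly for negative offsets too).
    let mut k = (ts - align_to).div_euclid(align);
    let mut slots = Vec::new();
    loop {
        let t = align_to + k * align;
        if t + range <= ts {
            break; // this slot ends at or before ts, so do all earlier ones
        }
        slots.push(t);
        k -= 1;
    }
    slots.reverse();
    slots
}
```

&lt;p&gt;With align of one unit and range of seven, each point falls into seven consecutive slots; with align greater than range, a point falls into at most one slot and may fall into none.&lt;/p&gt;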

&lt;p&gt;When align is greater than range, the segmented time slots are as illustrated below, and in this scenario, a single data point will belong to only one time slot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdsn4l0oovd2e392mp7q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frdsn4l0oovd2e392mp7q.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When align is smaller than range, the segmented time slots appear as shown in the following illustration. In this situation, a single data point may belong to multiple time slots.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc2yz64m36mjf2d9923l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc2yz64m36mjf2d9923l.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The implementation of the Range feature utilizes the classic hash aggregation algorithm. This involves reserving a hash bucket for each time slot being sampled and placing all the data scheduled for sampling into the corresponding hash buckets.&lt;/p&gt;

&lt;p&gt;Unlike traditional aggregation algorithms, time series data aggregation may involve overlapping data points (e.g. calculating the daily average temperature for each week). In algorithmic terms, this means a single data point may belong to multiple hash buckets, which differentiates it from the conventional hash aggregation approach.&lt;/p&gt;
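&lt;p&gt;This multi-bucket variant of hash aggregation can be sketched as follows (a simplified model, not GreptimeDB's implementation; &lt;code&gt;align_to&lt;/code&gt; is fixed at 0 and all names are ours):&lt;/p&gt;

```rust
use std::collections::HashMap;

// A sketch of the modified hash aggregation described above: unlike a
// classic GROUP BY, a single point may be pushed into several hash buckets,
// one per time slot [t, t + range) that covers its timestamp.
fn range_avg(points: &[(i64, f64)], align: i64, range: i64) -> HashMap<i64, f64> {
    let mut buckets: HashMap<i64, (f64, u32)> = HashMap::new();
    for &(ts, value) in points {
        // Every slot start t (a multiple of `align`) with
        // ts - range < t <= ts receives this point.
        let mut k = ts.div_euclid(align);
        while align * k + range > ts {
            let entry = buckets.entry(align * k).or_insert((0.0, 0));
            entry.0 += value;
            entry.1 += 1;
            k -= 1;
        }
    }
    // Reduce each bucket to its average.
    buckets
        .into_iter()
        .map(|(slot, (sum, count))| (slot, sum / count as f64))
        .collect()
}
```

&lt;p&gt;With &lt;code&gt;range&lt;/code&gt; larger than &lt;code&gt;align&lt;/code&gt;, each point contributes to several buckets, which is exactly what makes this differ from conventional hash aggregation.&lt;/p&gt;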

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;By leveraging the SQL RANGE query syntax extension provided by GreptimeDB, combined with the powerful expressive capabilities of the SQL language itself, we can conduct more concise, elegant, and efficient analysis and querying of time series data within GreptimeDB. &lt;br&gt;
This approach also circumvents some of the limitations encountered in data querying with PromQL. Users can flexibly utilize RANGE queries in GreptimeDB to unlock new methods for time series data analysis and querying.&lt;/p&gt;

</description>
      <category>database</category>
      <category>sql</category>
      <category>timeseries</category>
      <category>greptime</category>
    </item>
  </channel>
</rss>
